
VERSES Genius™ Outshines OpenAI Model in Code-Breaking Challenge, "Mastermind"

Published December 17, 2024

VERSES AI Inc., a cognitive computing company based in Vancouver, has announced benchmark results for its flagship product, Genius. In a head-to-head comparison against OpenAI's generative AI model o1-preview, Genius proved far more effective at the code-breaking game Mastermind.

Performance Overview

Across 100 games of Mastermind, Genius demonstrated outstanding capabilities, completing the benchmark roughly 140 times faster than OpenAI's o1-preview and at over 5,000 times lower cost. This performance is especially notable given the nature of the task, which demands logical reasoning and adaptive strategy.

Testing Methodology

The performance tests were structured around 100 games of Mastermind, a game in which a player must deduce a hidden code through successive guesses, using feedback on each guess to guide the next. Both models were given the same secret codes, each consisting of four positions drawn from six possible colors. The evaluation criteria included success rate, time taken to complete each game, the number of guesses made, and overall computational cost.
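Mastermind's feedback rule is what turns each game into a deduction problem: every guess is scored by how many pegs match the secret code in both color and position, and how many match in color only. As a minimal illustration of that standard scoring (the color labels and helper function below are hypothetical and not drawn from either system under test), a Python sketch looks like this:

    from collections import Counter

    COLORS = "ABCDEF"   # six possible colors
    CODE_LENGTH = 4     # four positions, as in the reported setup

    def score(secret: str, guess: str) -> tuple[int, int]:
        """Return (exact, partial): right color in the right position, and right color misplaced."""
        exact = sum(s == g for s, g in zip(secret, guess))
        # Color overlap regardless of position, minus the exact matches.
        overlap = sum((Counter(secret) & Counter(guess)).values())
        return exact, overlap - exact

    print(score("ABCD", "ABDC"))  # (2, 2): two exact matches, two misplaced colors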

Results Highlights

The results revealed compelling differences between the two AI models:

Metric                      Genius™                   o1-preview
Success Rate                100%                      71%
Total Compute Time          5 minutes, 18 seconds     12.5 hours
Total Cost for 100 Games    $0.05 USD                 $263 USD
Hardware Requirements       Standard laptop (M1)      GPU-based cloud

Key Takeaways

With its exceptional accuracy and reliability, Genius outperformed o1-preview on every significant metric:

  • Accuracy: Genius successfully solved all Mastermind challenges without error.
  • Speed: Genius completed each game in a rapid 1.1 to 4.5 seconds, while o1-preview's times ranged from approximately 7.9 seconds to as long as 15 minutes.
  • Efficiency: Genius's total compute time was just over 5 minutes, contrasting sharply with the 12.5 hours o1-preview required.
  • Cost: Genius's estimated cost for processing 100 games was just $0.05 USD, compared to $263 USD for o1-preview (both headline ratios are reproduced in the quick check after this list).
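The headline multipliers follow directly from the reported totals. A quick arithmetic check, using only the figures in the table above:

    # Reported totals for 100 games
    genius_seconds = 5 * 60 + 18     # 5 minutes, 18 seconds
    o1_seconds = 12.5 * 3600         # 12.5 hours

    genius_cost = 0.05               # USD
    o1_cost = 263.0                  # USD

    print(o1_seconds / genius_seconds)  # ~141.5, i.e. roughly 140x faster
    print(o1_cost / genius_cost)        # 5260.0, i.e. over 5,000x cheaper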

Implications of the Findings

These results highlight a notable gap in current AI technology, particularly the limitations of language-focused models like OpenAI's when logical reasoning tasks must be executed with precision and reliability. Mastermind serves as a crucial benchmark because it illustrates the kind of causal reasoning required for applications across sectors such as cybersecurity, fraud detection, and financial forecasting.
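To make concrete what kind of reasoning the game demands, a classic textbook baseline (not the method used by Genius or o1-preview) keeps every code that remains consistent with all feedback received so far and guesses from that shrinking pool. A self-contained Python sketch, repeating the scoring helper from the earlier example so it runs on its own:

    from collections import Counter
    from itertools import product

    COLORS, CODE_LENGTH = "ABCDEF", 4

    def score(secret: str, guess: str) -> tuple[int, int]:
        # Scoring helper repeated from the earlier sketch.
        exact = sum(s == g for s, g in zip(secret, guess))
        overlap = sum((Counter(secret) & Counter(guess)).values())
        return exact, overlap - exact

    def solve(secret: str) -> int:
        """Consistency-based solver: keep only codes compatible with every feedback seen so far."""
        candidates = ["".join(p) for p in product(COLORS, repeat=CODE_LENGTH)]
        guesses = 0
        while True:
            guess = candidates[0]             # naive choice; Knuth's minimax guess does better
            guesses += 1
            feedback = score(secret, guess)
            if feedback == (CODE_LENGTH, 0):  # all four pegs exact: code found
                return guesses
            # Eliminate every candidate that would not have produced the same feedback.
            candidates = [c for c in candidates if score(c, guess) == feedback]

    print(solve("FBDA"))  # finds any 4-position, 6-color code in a handful of guesses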

Gabriel René, the CEO of VERSES, highlighted that Genius not only excels at logical reasoning but does so faster and more cost-effectively, positioning it as a powerful tool for tackling complex business challenges.

About VERSES AI Inc.

VERSES AI is dedicated to developing next-generation intelligent software that emulates the wisdom of nature. Their flagship tool, Genius, provides machine learning practitioners with advanced capabilities to model complex dynamic systems and create autonomous agents that can continuously reason, learn, and plan effectively.

To explore more about their innovative solutions, visit verses.ai.
