Introducing Ta-Da: The New Tokenized Way to Contribute to AI Training

Published November 9, 2023

A novel intersection of Web3 and artificial intelligence is emerging as Vivoka unveils a new approach to collecting data for AI training. Leading the project is William Simonin, who is drawing on Vivoka's strength in voice recognition to launch the private beta of 'Ta-Da', a nod to the valuable 'data' users will contribute.

With a public beta expected in the following quarter, Ta-Da is a platform where AI companies of various specialties can source the quality, affordable data they need to develop their models.

Simonin told Decrypt, 'Through Ta-Da, we are crafting a space where AI entities can procure data without the quality sacrifices or substantial costs typically associated.' Ta-Da uses blockchain to motivate users around the world to share data by completing activities such as reading a passage aloud, writing, or recognizing objects.

Contributed data could include voice recordings, images, videos, and text, all of which will be made available to businesses in the AI sector. In exchange, participants receive TADA tokens for their input.

Built on the MultiversX blockchain, Ta-Da addresses two systemic problems with AI training data: exorbitant costs and inconsistent quality. 'Blockchain providers are essential tech partners,' Simonin told Decrypt. 'Working with MultiversX offers a more focused and intimate collaboration than one might find as a project on other, larger platforms.'

By prioritizing user privacy, Ta-Da sets itself apart from the methods of firms like Meta Platforms, which has reportedly used public posts from Facebook and Instagram to train its AI virtual assistant, attracting scrutiny over privacy and consent.

Ta-Da's Focus on Diverse Voice Data

Ta-Da is, in particular, gathering a diverse range of voice data to improve AI voice recognition. Simonin's expertise comes from his tenure at Vivoka, where he has worked on technology that supports 42 languages and is designed for use in voice development kits, allowing a range of industries to integrate it into their speech interfaces.

Vivoka's tech is already embedded in over 100,000 devices globally. Through this work, Simonin has seen firsthand the challenges in collecting large volumes of quality voice data. He points out, 'A thousand hours of audio content might fetch up to $100,000—thus AI companies usually set aside $100,000 to $1 million just for data acquisition annually.'

Issues of authenticity and quality are prevalent, and only around 5-10% of datasets undergo thorough evaluation. Simonin also highlights the need for a diverse set of audio data to train AI effectively. In response, Ta-Da will pay higher rewards for scarce voice types, addressing the expensive and complex task of gathering a varied data pool.

Ta-Da will offer a tiered reward system. 'You'll find various tasks, each with its own reward level. For example, a Corsican speaker with an English accent would be a valuable contributor for Ta-Da, commanding a higher reward,' Simonin told Decrypt.
