A year later, OpenAI still hasn’t released its voice cloning tool
Late last March, OpenAI announced a “small-scale preview” of an AI service, Voice Engine, that the company claimed could clone a person’s voice with only 15 seconds of speech. About a year later, the tool remains in preview, and OpenAI has given no indication of when it might launch, or whether it will launch at all.
The company’s reluctance to roll out the service widely may point to fears of abuse, but it could also reflect an effort to avoid inviting regulatory scrutiny. OpenAI has historically been accused of prioritizing “shiny products” at the expense of safety, and of rushing releases to beat competing companies to market.
In a statement, an OpenAI spokesperson told TechCrunch that the company is continuing to test Voice Engine with a limited set of “trusted partners.”
“[We’re] learning from how [our partners] use the technology so we can improve the usefulness and safety of the model,” the spokesperson said. “We’ve been excited to see the different ways it’s being used, from speech therapy, to language learning, to customer support, to video game characters, to AI avatars.”
Pushed back
Voice Engine, which powers the voices available in OpenAI’s text-to-speech API as well as ChatGPT’s Voice Mode, generates natural-sounding speech that closely resembles the original speaker. The tool converts written characters into speech, limited only by certain content guardrails. But it was subject to delays and shifting release windows from the start.
As OpenAI explained in a June 2024 blog post, the Voice Engine model learns to predict the most likely sounds a speaker will make for a given text transcript, taking into account different voices, accents, and speaking styles. The model can then generate not only spoken versions of text, but also “spoken utterances” that reflect how different types of speakers would read text aloud.
OpenAI initially intended to bring Voice Engine, originally called Custom Voices, to its API on March 7, 2024, according to a draft blog post seen by TechCrunch. The plan was to give a group of up to 100 “trusted developers” access ahead of a broader debut, with priority given to developers building applications that provided a “social benefit” or showed “innovative and responsible” uses of the technology. OpenAI had even trademarked and priced it: $15 per million characters for “standard” voices and $30 per million characters for “HD quality” voices.
Then, at the eleventh hour, the company postponed the announcement. OpenAI ultimately unveiled Voice Engine a few weeks later without a sign-up option. Access to the tool would remain limited to a cohort of about 10 developers the company began working with in late 2023, OpenAI said.
“We hope to start a dialogue on the responsible deployment of synthetic voices and how society can adapt to these new capabilities,” OpenAI wrote in the blog post announcing Voice Engine in late March 2024. “Based on these conversations and the results of these small-scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”
Long in the works
Voice Engine has been in development since 2022, according to OpenAI. The company claims it demonstrated the tool to global policymakers at the highest levels in the summer of 2023 to show its potential, and its risks.
Several partners have access to Voice Engine today, including Livox, which builds devices that allow people with disabilities to communicate more naturally. CEO Carlos Pereira told TechCrunch that while Livox ultimately couldn’t build Voice Engine into a product because the tool requires an internet connection (many Livox customers have no internet access), he found the technology “really impressive.”
“The quality of the voice and the ability to have the voices speak in different languages is unique – especially for people with disabilities, our customers,” Pereira told TechCrunch via email. “This is really the most impressive and easy-to-use [tool for] creating voices that I have seen (…) hope OpenAI develops an offline version soon.”
Pereira says he has not received guidance from OpenAI on a Voice Engine launch, nor has he seen signs that the company plans to start charging for the service. So far, Livox has not had to pay for its usage.
In the June 2024 post mentioned above, OpenAI hinted that one of its considerations in delaying Voice Engine was the potential for abuse during last year’s U.S. election cycle. Informed by discussions with stakeholders, Voice Engine has several safety mitigations, including watermarking to trace the origin of generated audio.
Developers must obtain “explicit consent” from the original speaker before using Voice Engine, according to OpenAI, and they must make “clear disclosures” to their audience that the voices are AI-generated. However, the company has not said how it enforces these policies. Doing so at scale could be extremely challenging, even for a company with OpenAI’s resources.
In its blog posts, OpenAI also suggested that it hoped to build “voice authentication” to verify speakers and a “no-go” list that prevents the creation of voices that sound too similar to prominent figures. Both are technologically ambitious projects, and getting them wrong would reflect poorly on a company that is often accused of sidelining safety initiatives.
Effective filtering and ID verification are quickly becoming baseline requirements for responsible voice cloning releases. AI voice cloning was the third fastest-growing scam of 2024, according to one source. It has led to fraud and to bank security checks being bypassed, as privacy and copyright laws struggle to keep up. Malicious actors have used voice cloning to create incendiary deepfakes of celebrities and politicians, and these deepfakes have spread like wildfire across social media.
OpenAI could launch Voice Engine next week, or never. The company has repeatedly said that it is weighing keeping the service small in scope. But one thing is clear: for optics reasons, safety reasons, or both, Voice Engine’s limited preview has become one of the longest in OpenAI’s history.