Amazon races to transplant Alexa’s ‘brain’ with generative AI


Amazon is set to relaunch its Alexa voice-powered digital assistant as an artificial intelligence “agent” that can perform practical tasks, as the tech group struggles to address challenges that have stymied the system’s AI overhaul.

The $2.4 trillion company has spent the past two years trying to redesign Alexa, its conversational system embedded in 500 million consumer devices worldwide, replacing the software’s “brain” with generative AI.

Rohit Prasad, who leads the artificial general intelligence (AGI) team at Amazon, told the Financial Times that the voice assistant still had to overcome several technical hurdles before launch.

These include solving the problems of “hallucinations” — fabricated answers — as well as latency and reliability. “Hallucinations should be close to zero,” Prasad said.

Amazon executives’ vision is to transform Alexa, which is still used mainly for simple tasks such as playing music and setting alarms, into an “agent” product that acts as a personalized concierge. This could include anything from recommending restaurants to adjusting bedroom lights based on a person’s sleep cycles.

Alexa’s redesign began after the launch of Microsoft-backed OpenAI’s ChatGPT in late 2022. While Microsoft, Google, Meta, and others have moved quickly to embed generative AI into their computing platforms and improve their software services, critics question whether Amazon can solve its technical and organizational struggles in time to compete with rivals.

According to multiple employees who have worked on Amazon’s voice assistant teams in recent years, the effort has been fraught with complications and follows years of AI research and development.

Several former employees said the long wait for the release was largely due to unexpected difficulties in modifying and combining the simpler, predefined algorithms that Alexa was built on with more powerful but unpredictable large language models.

In response, Amazon said it was “working hard to make its voice assistant more proactive and capable.” It added that a technical overhaul of this scale, applied to a live service and a suite of devices used by customers around the world, was unprecedented and not as simple as overlaying an LLM onto the Alexa service.

Prasad, Alexa’s former chief architect, said last month’s release of the company’s in-house Amazon Nova models, led by his AGI team, was driven in part by the specific needs of AI apps like Alexa for optimal speed, cost, and reliability — “getting to the last mile, which is really hard.”

To act as an agent, Alexa’s “brain” needs to be able to call hundreds of third-party apps and services, Prasad said.

“Sometimes we underestimate how many services are integrated into Alexa, and that’s a huge number . . . you have to be able to do it in a very cost-effective way,” he added.

The difficulty comes from Alexa users expecting fast responses as well as extremely high levels of accuracy, qualities that run counter to the typical probabilistic nature of today’s generative AI, which is statistical software that predicts words based on speech and language patterns.

Some former employees also point to the struggle to maintain the assistant’s original features, including its consistency and functionality, while imbuing it with new generative qualities such as creativity and free-flowing dialogue.

Because of the more personalized, conversational nature of LLMs, the company also plans to hire specialists to shape the AI’s personality, voice and vocabulary so that it remains familiar to Alexa users, according to a person familiar with the matter.

A former senior member of the Alexa team said that while LLMs were very sophisticated, they did come with risks, such as giving answers that were “completely made up” some of the time.

“At the scale that Amazon operates, this could happen many times a day,” they said, damaging its brand and reputation.

In June, Mikhail Eric, a former machine learning scientist on Alexa and a founding member of its “conversational modeling team,” said publicly that Amazon had “dropped the ball” on making Alexa “the undisputed market leader in conversational AI.”

Eric said that despite strong scientific talent and “enormous” financial resources, the company was “bound by technical and bureaucratic problems”, suggesting that “data was poorly recorded” and “documentation was either non-existent or out of date”.

The legacy technology behind the voice assistant was inflexible and difficult to change quickly, weighed down by a clunky and disorganized codebase and an engineering team “spread too thin,” according to two former employees who worked on the AI underpinning Alexa.

The original Alexa software, built on technology acquired from British startup Evi in 2012, was a question-answering machine that worked by searching for the right answer within a defined universe of facts, such as the day’s weather or a specific song in a user’s music library.

The new Alexa uses a range of different AI models to recognize and translate voice requests and generate responses, as well as to detect policy violations such as inappropriate responses and hallucinations. Building software to translate between the legacy systems and the new AI models has been a major hurdle in the Alexa-LLM integration.

The models include Amazon’s own software, including the latest Nova models, as well as Claude, an AI model from the startup Anthropic, in which Amazon has invested more than $8 billion over the past 18 months.

“The hardest thing about AI agents is making sure they’re safe, reliable and predictable,” Anthropic CEO Dario Amodei told the FT last year.

“Agent-like AI software must reach a point where . . . people can actually have confidence in the system,” he added. “Once we get to that point, then we will release these systems.”

One current employee said more steps were still needed, such as incorporating child-safety filters and testing custom integrations with Alexa, such as smart lights and the Ring doorbell.

“The issue is reliability, so that it works close to 100 percent of the time . . . products from Apple or Google are delivered slowly and incrementally,” the employee added.

Many third parties developing “skills” or features for Alexa said they were unsure when the new generative AI-powered device would be introduced and how to build new features for it.

“We’re waiting for details and understanding,” said Thomas Lindgren, co-founder of Swedish content developer Wanderword. “When we started working with them, they were much more open . . . then over time they changed.”

Another developer said that after an initial period of “pressure” put on developers by Amazon to start preparing for the next generation of Alexa, things calmed down.

An ongoing challenge for Amazon’s Alexa team, which was hit by major layoffs in 2023, is how to make money. Figuring out how to make the assistant “cheap enough to run at scale” will be a big challenge, said Jared Roche, co-founder of generative AI group OctoAI.

Options being discussed include creating a new Alexa subscription service or taking a cut of sales of products and services, a former Alexa employee said.

Prasad said Amazon’s goal is to create different AI models that can be “building blocks” for different applications beyond Alexa.

“What we’ve always focused on is customers and practical AI. We don’t do science for science’s sake,” Prasad said. “We do this . . . to deliver customer value and impact, which in this age of generative AI is becoming more important than ever, as customers want to see a return on investment.”
