Unlocking Value from Data: How AI Agents Conquer 2024
If 2023 was the year of AI-powered generative chatbots and search, 2024 was all about AI agents. What Devin started earlier this year has grown into a full-fledged phenomenon, offering businesses and individuals a way to transform how they work at various levels, from programming and development to personal tasks such as planning and booking vacation tickets.
Among these wide-ranging applications, we’ve also seen the rise of data agents this year — AI-based agents that handle different types of tasks in the data infrastructure stack. Some did core data integration work, while others handled downstream tasks such as analytics and execution management, making things simpler and easier for enterprise users.
The benefits were improved efficiency and cost savings, leading many to wonder: How will things change for data teams in the coming years?
Gen AI Agents took over data tasks
While agent capabilities have been around for some time, allowing businesses to automate certain basic tasks, the rise of generative AI took things to the next level entirely.
With natural language processing capabilities and gen AI tools, agents can go beyond simple reasoning and responding to actually plan multi-step actions, independently interacting with digital systems to complete them while collaborating with other agents and humans. They also learn to improve their performance over time.
Cognition AI's Devin was the first major agentic offering, enabling engineering operations at scale. Bigger players then started to provide more targeted enterprise and personal agents powered by their own models.
In a conversation with VentureBeat earlier this year, Google Cloud's Gerrit Kazmaier said he heard from customers that their data practitioners constantly faced challenges, including automating manual work for data teams, reducing the cycle time of data pipelines and analytics, and simplifying data management. In essence, the teams were not short of ideas about how to create value from their data, but they lacked the time to execute those ideas.
To fix this, Kazmaier explained, Google revamped BigQuery, its core data infrastructure offering, with Gemini AI. The resulting agentic capabilities not only let enterprises discover, clean and prepare data for downstream applications (breaking down data silos and ensuring quality and consistency) but also support pipeline management and analytics, freeing teams to focus on higher-value tasks.
Many enterprises today use Gemini's agentic capabilities in BigQuery, including a fintech company that leverages Gemini's ability to understand complex data structures to automate its query generation process. Japanese IT company Honory also uses Gemini's SQL generation capabilities in BigQuery to help its data teams deliver insights faster.
But discovery, preparation and analytics support were only the beginning. As the underlying models evolved, even granular data operations, introduced by startups specializing in their respective domains, moved toward deeper, agent-driven automation.
For example, AirByte and Fastn made headlines in the data integration category. The former launched an assistant that creates data connectors from a link to API documentation in seconds. Meanwhile, the latter enhanced its broader application development offering with agents that generate enterprise-grade APIs, whether for reading or writing information on any topic, using only a natural language description.
San Francisco-based Altimate AI, for its part, targets a variety of data operations, including documentation, testing and transformations, with its new DataMates technology, which uses agentic AI to pull context from the entire data stack. A host of other startups, including Redbird and RapidCanvas, are also working in the same direction, claiming to offer AI agents that can handle up to 90% of the data tasks required in AI and analytics pipelines.
Agents feeding RAG and others
Beyond large-scale data operations, agentic capabilities have also been explored in areas such as retrieval-augmented generation (RAG) and downstream workflow automation. For example, the team behind the vector database Weaviate recently discussed the idea of agentic RAG, a process that allows AI agents to access a wide range of tools (such as web search, a calculator, or a software API like Slack, Gmail or a CRM) to retrieve and validate data from multiple sources and improve response accuracy.
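The tool-using retrieval loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tool names, the canned data and the rule-based `plan` function (a stand-in for the planning an LLM would do) are hypothetical and are not taken from Weaviate or any other product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Evidence:
    source: str  # name of the tool that produced the value
    value: str   # what the tool returned


# Hypothetical tools returning made-up, canned data. A real agent would call
# a search API, a vector database or a SaaS API at this point.
def knowledge_base(query: str) -> str:
    return {"q3 revenue": "4.2M"}.get(query.lower(), "")


def crm_api(query: str) -> str:
    return {"q3 revenue": "4.2M"}.get(query.lower(), "")


def calculator(expression: str) -> str:
    # Handles only "a + b" to keep the sketch small and safe.
    left, right = expression.split("+")
    return str(float(left) + float(right))


TOOLS: dict[str, Callable[[str], str]] = {
    "knowledge_base": knowledge_base,
    "crm_api": crm_api,
    "calculator": calculator,
}


def plan(query: str) -> list[str]:
    """Stand-in for the LLM planning step: pick tools with a simple rule."""
    return ["calculator"] if "+" in query else ["knowledge_base", "crm_api"]


def agentic_rag(query: str) -> dict:
    """Gather evidence from the planned tools and cross-check it."""
    evidence = []
    for name in plan(query):
        value = TOOLS[name](query)
        if value:
            evidence.append(Evidence(name, value))
    distinct = {e.value for e in evidence}
    answer: Optional[str] = evidence[0].value if evidence else None
    return {
        "answer": answer,
        "sources": [e.source for e in evidence],
        # Cross-checked only when two or more sources returned the same value.
        "cross_checked": len(evidence) >= 2 and len(distinct) == 1,
    }


result = agentic_rag("Q3 revenue")
```

In this sketch, an answer counts as cross-checked only when at least two independent tools return the same value, which is one simple way to express the "validate data from multiple sources" idea. A production agent would replace the rule-based planner with a model call and use a richer notion of agreement.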
In addition, towards the end of the year, Snowflake Intelligence emerged, enabling enterprises to set up data agents that can tap not only business intelligence data stored in their Snowflake instance, but also structured and unstructured data in siloed third-party tools, such as sales transactions in a database, documents in knowledge bases like SharePoint, and information in productivity tools like Slack, Salesforce and Google Workspace.
With this additional context, agents surface relevant insights in response to natural language questions and take specific actions around the generated insights. For example, a user can ask their data agent to enter the surfaced insights into an editable form and upload the file to their Google Drive. Agents can even be prompted to write to Snowflake tables and make changes to the data as needed.
Much more to come
While we may not have covered every data agent application seen or announced this year, one thing is pretty clear: the technology is here to stay. As AI models continue to evolve, the adoption of AI agents will be in full swing, with most organizations, regardless of their sector or size, choosing to delegate repetitive tasks to specialized agents. This will directly translate into efficiency.
As proof of this, in a recent survey of 1,100 technical executives conducted by Capgemini, 82% of respondents said they intend to integrate AI-based agents into their stacks in the next three years, up from the current 10%. More importantly, 70% to 75% of respondents said they would trust an AI agent to analyze and synthesize data on their behalf, as well as handle tasks such as generating and iteratively improving code.
This agent-driven shift would also mean significant changes in the way data teams operate. Agent outputs are currently not production-grade, which means a human has to take over at some point to fine-tune the work for their needs. However, with a few more improvements in the coming years, this gap will likely close, giving teams AI agents that are faster, more accurate and less prone to the mistakes humans typically make.
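The human takeover step mentioned above can be pictured as a simple approval gate between what an agent proposes and what actually runs. The sketch below is an assumption-laden illustration: it uses Python's built-in `sqlite3` as a stand-in for a data warehouse, and the `run_with_review` and `cautious_reviewer` names, the `insights` table and the sample statements are all hypothetical.

```python
import sqlite3
from typing import Callable


def run_with_review(
    conn: sqlite3.Connection,
    proposed_sql: str,
    approve: Callable[[str], bool],
) -> bool:
    """Execute agent-proposed SQL only if the reviewer approves it."""
    if not approve(proposed_sql):
        return False
    with conn:  # commits on success, rolls back if the statement fails
        conn.execute(proposed_sql)
    return True


def cautious_reviewer(sql: str) -> bool:
    # Hypothetical policy standing in for a human reviewer:
    # reject anything that drops or deletes data.
    lowered = sql.lower()
    return not any(word in lowered for word in ("drop ", "delete "))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE insights (note TEXT)")

# Two statements an agent might propose (made-up examples).
applied = run_with_review(
    conn, "INSERT INTO insights VALUES ('churn rose in Q3')", cautious_reviewer
)
blocked = run_with_review(conn, "DROP TABLE insights", cautious_reviewer)

rows = conn.execute("SELECT note FROM insights").fetchall()
```

Here the insert is applied and the destructive statement is blocked, so the table still exists with one row. In practice the `approve` callback would route to a person or a policy engine rather than a keyword check.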
To summarize, the roles of data scientists and analysts that we see today are likely to change, with practitioners moving into AI oversight (where they monitor AI actions) or into higher-value tasks that the system may struggle to perform.