GenAI in the Enterprise: Risks, Rewards, and Data Readiness
Five steps to trusted AI from Informatica World 2024
Welcome to the Cloud Database Report. I’m John Foley, a long-time tech journalist who spent 18 years at InformationWeek and then worked in strategic comms at Oracle, IBM, and MongoDB. I’m now a VP with Method Communications. Following is my report from Informatica World 2024 in Las Vegas.
Good AI requires good data.
Sounds simple, but few organizations can do it. Many don’t have all of the data-management pieces and processes in place to consistently produce the high-quality data they need to be successful with artificial intelligence. Virtually everyone is racing to get there.
That was my No. 1 takeaway from Informatica World 2024 (May 19 - 23), where the company’s leaders, customers, and partners spent every minute of every session talking about how to achieve the sought-after state of data readiness. Gartner analyst Robert Thanaraj used the term “data fitness,” which is a good way to think about it.
The entertainment in Las Vegas included jugglers throwing knives past the face of a person on stage. AI must feel like that for those getting started.
What can go wrong? Informatica CEO Amit Walia, in his opening keynote, said the risks include fragmented and unruly data that can result in AI hallucinations. Gartner added to the list of risks: data bias, out-of-scope prompts, privacy, costs, lack of skills, explainability, liability. Lots of flying knives.
But Walia also spoke about the potential for “huge rewards.” And that, my friends, is the AI carrot that business leaders are reaching for. Which is why so many Informatica World attendees were asking questions like “Is our data ready?” and “Do we have the right policies in place?”
You can see a replay of Informatica World 2024 here.
Following are the key themes that resonated with me at Informatica’s big event.
1. Data quality is foundational to trusted AI
I’ve been reporting on the challenges and opportunities of data management since the mid-1990s—going back to the early days of centralized, enterprise data warehouses—and data quality has been an issue the entire time. The fundamentals haven’t changed much: Integration, replication, metadata, extract/transform/load (ETL), master data management (MDM), and data catalogs were building blocks then, and they still are today.
Yet, the data landscape has changed dramatically. Today, there’s a million times more data and new data types coming from billions of devices and other sources, often in real time. That works best if the pipelines and platforms used to prep the petabytes of data pouring in are modern, cloud-based tools and services. Case in point: Informatica’s Intelligent Data Management Cloud (IDMC) comprises data cataloging, integration, quality, observability, MDM, governance, and security, all in an integrated, AI-powered cloud platform that is accessible to data engineers, data scientists, and business users.
The issue is that too many organizations don’t have the data-management infrastructure in place to feed quality data into their AI and generative AI initiatives. According to an Informatica survey, 72% say that data management is a major obstacle in scaling AI use cases. “Somehow, data is not ready for GenAI use cases,” observed Gartner analyst Thanaraj during a presentation on data & AI architecture. “This is where the challenge is.”
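To make "data readiness" a little more concrete, here is a minimal sketch of the kind of check a team might run before feeding records into an AI pipeline. It is not how IDMC or any Informatica product works. The function names, the sample records, and the 95% completeness threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class QualityReport:
    row_count: int
    completeness: float  # share of required fields that are filled in
    duplicate_rows: int  # rows whose key repeats an earlier row


def profile_records(records, required_fields, key_field):
    """Compute two basic data-quality measures over a list of dicts."""
    total_cells = len(records) * len(required_fields)
    filled = sum(
        1
        for row in records
        for field in required_fields
        if row.get(field) not in (None, "")
    )
    seen = set()
    duplicates = 0
    for row in records:
        key = row.get(key_field)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    completeness = filled / total_cells if total_cells else 0.0
    return QualityReport(len(records), completeness, duplicates)


def is_ai_ready(report, min_completeness=0.95):
    """Hypothetical gate: the threshold is illustrative, not a standard."""
    return report.duplicate_rows == 0 and report.completeness >= min_completeness


# Made-up sample data: one blank email and one repeated id.
customers = [
    {"id": 1, "name": "Acme", "email": "ops@acme.example"},
    {"id": 2, "name": "Globex", "email": ""},
    {"id": 2, "name": "Globex", "email": "it@globex.example"},
]
report = profile_records(customers, ["name", "email"], "id")
print(report, is_ai_ready(report))
```

Real data-quality tooling goes much further (validity rules, lineage, drift, observability), but even two measures like these show why a data set can look usable and still fail a readiness gate.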
2. Data architectures must incorporate the new AI tech stack
One thing that’s very clear is the need for updated data architectures that support AI workflows, including GenAI and retrieval-augmented generation (RAG). When asked “What’s next?” a chief data architect with a financial services company replied that a modern data architecture that addresses unstructured data for GenAI “has to be the top thing.”
What might that look like? One presenter showed an enterprise architecture for GenAI that included source and target connectors, embedding models, Q&A models, large language model (LLM) orchestration frameworks, and vector databases, as well as monitoring/caching/validation and LLM serving. And that was a “simplified” architecture!
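For readers who want to see how a few of those pieces relate, here is a toy sketch of the retrieval half of RAG. It is not the presenter's architecture. The `embed` function is a bag-of-words stand-in for a real embedding model, `VectorStore` is a plain list standing in for a vector database, `build_prompt` plays the orchestration role, the documents are made up, and the call to an actual LLM is left out.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding model': a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Toy 'vector database': keeps (vector, text) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, store, k=1):
    """Orchestration step: retrieve context, then assemble the LLM prompt."""
    context = "\n".join(store.search(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = VectorStore()
store.add("Refunds are issued within 14 days of a return request.")
store.add("Support hours are 9am to 5pm on weekdays.")
prompt = build_prompt("How many days do refunds take?", store)
print(prompt)
```

The point of the sketch is the data dependency: whatever sits in the store is what the model gets to answer from, so stale or duplicated source documents flow straight into the response.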
So, data-engineering teams need to go back to the whiteboard to draw up their AI architecture and infrastructure. “GenAI is going to create more complexity” is a phrase I heard more than once.
Informatica is working with partners to simplify how the pieces fit together. It has created solution blueprints with AWS, Databricks, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, and Snowflake.
3. GenAI use cases are emerging
ChatGPT, Cohere, Gemini, Bing—we’re familiar with these and other popular GenAI tools. But we haven’t heard as much about enterprise-specific GenAI apps because many orgs are still figuring out where and how to get started.
Make no mistake, business and IT leaders are eager to get up and running with their own GenAI. They recognize a here-and-now opportunity—for new customer-facing services, internal productivity gains, and revenue and profitability. However, given the potential risks, they’re being deliberate in how they proceed.
In an informal poll of the audience at Informatica World, 82% indicated they were in the exploration or early implementation phases of AI/GenAI adoption. “We’re in the crawl stage with GenAI,” said one panelist, a manager of enterprise architecture.
Others have established internal think tanks and innovation hubs and invited employees to contribute ideas on where to get started. Early use cases mentioned at Informatica World include content summarization/translation and AI copilots for software development. Informatica put GenAI use cases into three buckets: creative generation and optimization; employee productivity and creativity; and optimizing business processes. Informatica’s CLAIRE GPT copilot brings the benefits of GenAI to many aspects of data management.
4. AI governance is essential to avoid the pitfalls
I’ve always been a big believer in the need for robust data governance—and never more so than with AI and GenAI. Sanofi, a global healthcare and pharmaceutical company, gave a detailed overview of how it has implemented modern data governance.
Check out the job titles of the Sanofi presenters: head of data engineering, architecture, and governance; and head of R&D data strategy and governance, and data foundations. Those roles show a major commitment to data governance.
Here’s what these two experts said:
“What’s changed is we are looking at data governance from a business point of view.” That includes both risk mitigation (e.g., data integrity, IP, non-compliance) and opportunities (innovation, insights, drug discovery).
They adopted data mesh principles—data as a product; aligning to business domains; assigning data-management leads and stewards for accountability; and automated data sharing via self-service.
Learnings: Executive alignment is required. Federation of governance capabilities helps achieve scale. And it’s essential to communicate policy across the enterprise. “People may say, ‘I don’t want to do this.’ It’s not optional.”
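The data mesh principles described above (data as a product, domain alignment, named leads and stewards, self-service sharing) can be pictured as metadata attached to each data set. The following is a hypothetical sketch, not Sanofi's implementation. The `DataProduct` fields, the gap messages, and the sample catalog entries are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    domain: str                # business domain the product aligns to
    lead: str = ""             # accountable data-management lead
    stewards: list = field(default_factory=list)
    self_service: bool = False  # can consumers get access without a ticket?


def governance_gaps(product):
    """Return the accountability or sharing gaps for one data product."""
    gaps = []
    if not product.domain:
        gaps.append("no business domain")
    if not product.lead:
        gaps.append("no data-management lead")
    if not product.stewards:
        gaps.append("no steward")
    if not product.self_service:
        gaps.append("no self-service sharing")
    return gaps


# Made-up catalog entries: one fully governed product, one with gaps.
catalog = [
    DataProduct("customer_orders", "Sales", lead="a.lead", stewards=["s.one"], self_service=True),
    DataProduct("supplier_master", "Procurement"),
]
for product in catalog:
    print(product.name, governance_gaps(product) or "ok")
```

A check like this is one simple way to turn "it's not optional" into something measurable: every data product either has an owner and a sharing path, or it shows up on a list.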
5. AI strategy requires a data-driven culture
Another priority is the need to raise “AI literacy” across your enterprise. Employees must be educated not just on how to use AI, but on how it will impact roles, responsibilities, customer engagement, and business processes. One Informatica customer, the head of enterprise data management and innovation, said her organization undertook a two-day literacy campaign “on what we want to do and what it means for the business.”
Others see an opportunity to upskill employees, ease the grunt work, and provide more attractive career paths. One panelist said he foresees cost benefits eventually, but the near-term objective is to use AI for quality-of-life improvements for employees and customers.
An AI-ready corporate culture is vital. “You need a culture where data is appreciated,” said Jennifer Nacy, VP of data, analytics, automation, and AI with Jacobs.
Rafeh Masood, chief growth and digital officer with Royal Caribbean Group, described the cruise company as “data-powered, AI-forward.”
The ‘New Informatica’
Informatica sits smack dab in the middle of all of these trends. Its Intelligent Data Management Cloud (IDMC) and newly released CLAIRE GPT—an AI assistant for data management—do most or all of the things that you would need to get an ill-kept data estate into better shape for AI.
I’ve been covering Informatica since it was founded in the early ’90s, when data management was largely a back-office function supporting enterprise data warehouses and business-intelligence systems in corporate data centers. Informatica is a different company today. It transformed itself from a legacy enterprise software company into a modern AI-powered, cloud-based company. In Q1 2024, Informatica reported $1.64 billion in total ARR (annual recurring revenue) and a 35% increase in cloud subscription ARR to $653M, compared to the same period last year.
Informatica now has more than 250 customers with $1M+ subscription ARR, a 24% increase compared to a year earlier. Customers at Informatica World included Royal Caribbean Group, Takeda Pharmaceuticals, SSM Health, Chubb, Dallas Fort Worth Airport, Frost Bank, BMC, and Sanofi.
Importantly, Informatica has partnerships with the Big Four global hyperscalers (AWS, Google Cloud, Microsoft, and Oracle), and it works closely with Databricks, MongoDB, and Snowflake. Altogether, it has hundreds of active partnerships, representing many exabytes of data to be managed. IDMC processes in the neighborhood of 100 trillion cloud transactions per month.
Integration in action
You need to look under the hood to appreciate why this matters for trusted AI. For example, Informatica announced new integrations between IDMC and Microsoft’s Azure, making it simpler for Azure and Microsoft Fabric customers to deploy and manage IDMC services from the Azure portal. Separately, it introduced a solution blueprint that combines IDMC services with Snowflake’s Cortex AI service.
Two other recent Informatica advances of note: Cloud Data Access Management (CDAM), a data access and governance solution, and a Master Data Management (MDM) extension for Google Cloud BigQuery that lets customers develop enterprise-grade GenAI apps using IDMC, Google Vertex AI, BigQuery, and Gemini.
These and Informatica’s other innovations and integrations are aimed at simplifying the next big thing: creating business differentiation and advantage with GenAI.
Barbara Latulippe, chief data officer at Takeda Pharmaceuticals, told the keynote audience: “I like to say our data is GenAI-ready.”
Those words need to echo widely. But first, some serious data modernization must happen.
(Note: Informatica is a client of Method Communications, where I’m a VP and tech editor. This blog post was independently written and published.)
Connect with John Foley on LinkedIn.