Larry Ellison on Oracle Database 23ai: 6 Key Takeaways
Oracle goes all in on AI with support for vectors, LLMs, RAG, NLP queries
Welcome to the Cloud Database Report. I’m John Foley, a long-time tech journalist who worked in strategic comms at Oracle, IBM, and MongoDB. This blog and free newsletter are independent, unsponsored, and not affiliated with my current role as VP with Method Communications.
Is AI the most important technology in the history of IT? “The answer is probably yes,” Larry Ellison said in announcing general availability of Oracle Database 23ai, the latest version of the company’s flagship database management system.
That bold view explains why Oracle is going all in on artificial intelligence—or you could say, AI on AI—with the latest release of its database. Until now, Oracle has referred to this version as Oracle Database 23c, with the “c” for cloud. That minor change in suffix—from 23c to 23ai—is actually a very big deal. It signals that Oracle is 100% committed to building AI deeply into its platform.
The company laid out its far-reaching strategy in a May 2 webcast featuring Ellison and EVP Juan Loaiza. And they’re not done yet. Next on the calendar is Oracle DatabaseWorld AI Edition, a virtual event on May 14.
Oracle has been building up to this new AI era for months, through partnerships and by adding AI/ML capabilities up and down its tech stack, including OCI cloud infrastructure.
Loaiza said Oracle has incorporated more than 300 new features and thousands of enhancements in 23ai. That includes new functionality for app dev and mission-critical data. Here are my top six takeaways.
1. Vector search is the No. 1 new capability
The raging debate around the emerging AI tech stack is which kind of vector database to use: a special-purpose database such as Pinecone, Weaviate, or Qdrant, or a general-purpose database adapted for vector embeddings.
Oracle is in the latter camp. It started talking about vector management in September at Oracle CloudWorld. Now Oracle AI Vector Search comes integrated with 23ai. Loaiza called it “probably the most important” new feature in 23ai.
Ellison pooh-poohed purpose-built vector databases.
“I think the answer is always that all of your data should be in one place. It just makes life much easier to ask a question, to ask a query,” Ellison said. “So we think that the right way to solve this problem is to have a database that can manage all of your data, and do it in a highly performant and very economical way.”
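To make that concrete, here is a minimal sketch of what a similarity search against 23ai's integrated vector store can look like from Python with the python-oracledb driver. The connection details, the hypothetical docs table, its embedding column, and the query vector are placeholders invented for illustration.

```python
# A minimal sketch of similarity search with Oracle AI Vector Search in 23ai.
# The connection details, the hypothetical "docs" table, its "embedding"
# column, and the query vector are all placeholders for illustration.
import oracledb

conn = oracledb.connect(user="demo", password="demo", dsn="localhost/FREEPDB1")
cur = conn.cursor()

# A query embedding produced elsewhere (by whatever embedding model you use),
# passed in its textual form and converted server-side with TO_VECTOR.
query_vec = "[0.12, 0.48, 0.33, 0.91]"

cur.execute(
    """
    SELECT id, title
      FROM docs
     ORDER BY VECTOR_DISTANCE(embedding, TO_VECTOR(:qv), COSINE)
     FETCH FIRST 5 ROWS ONLY
    """,
    qv=query_vec,
)
for doc_id, title in cur.fetchall():
    print(doc_id, title)
```

The point Ellison is driving at is that this runs in the same database, and the same SQL statement, as the rest of your relational data, with no separate vector store to keep in sync.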
For more on Oracle’s vector database approach, see my article, “5 advantages of using an integrated vector database for AI development.”
2. JSON Relational is a paradigm shift
Relational data is structured. Object data is semi-structured or unstructured. How best to build with and manage these different data types has long been a challenge, because developers gravitate to object-oriented programming while corporate data is often organized in rows and columns.
The choice of which data-organizing schema to use—object or relational—can have implications for enterprise data management for years. CIOs and CTOs have employed many techniques (object-relational databases, object mapping, JSON APIs, etc.) to bridge these two models. “There seemed to be no right answer,” Ellison said.
Oracle’s solution, now available with 23ai, is a capability called JSON Relational Duality Views.
“What we did is let you define your JSON objects, and we will generate the relational schema from that,” Ellison said. “They coexist. You do not have to predefine the schema. …You get the best of both worlds with this unification and it’s completely seamless. Some users can think of it one way, some think of it the other way. It all works.”
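For a sense of what that looks like in practice, here is a hedged sketch of defining and querying a duality view from Python. The orders table, its columns, and the view name are invented, and the exact DDL details may differ from this rough shape; Oracle's documentation is the authoritative reference.

```python
# A rough sketch of a JSON Relational Duality View: relational rows exposed
# as JSON documents, with writes through the view flowing back to the tables.
# The "orders" table, its columns, and the view name are invented, and the
# exact DDL syntax may differ; Oracle's documentation is the reference.
import oracledb

conn = oracledb.connect(user="demo", password="demo", dsn="localhost/FREEPDB1")
cur = conn.cursor()

cur.execute("""
    CREATE OR REPLACE JSON RELATIONAL DUALITY VIEW orders_dv AS
      SELECT JSON {'_id'      : o.order_id,
                   'customer' : o.customer_name,
                   'total'    : o.order_total}
      FROM orders o WITH INSERT UPDATE DELETE
""")

# Document-style access for developers who think in JSON...
cur.execute("SELECT o.data FROM orders_dv o")
for (doc,) in cur.fetchall():
    print(doc)

# ...while the same rows remain queryable relationally, e.g.:
# SELECT customer_name, order_total FROM orders
```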
For more, see my article, “Oracle’s new JSON Relational capability helps solve a big IT challenge.”
3. Mission-critical is more critical with AI
Along with AI and developer capabilities, the other major thrust for 23ai is support for mission-critical data. It’s no coincidence that Loaiza’s job title is EVP of mission-critical database technologies. If there’s any single thing that explains and defines Oracle’s continuing leadership position in the database market, I would say it’s an uncompromising focus on mission-critical data and workloads.
Along these lines, Oracle announced True Cache (in-memory, middle-tier cache) and In-Database SQL Firewall (for warding off SQL injection attacks).
Loaiza and Ellison also talked about the challenges and complexities of transaction management and the related issue of data integrity and data consistency.
“You don't want the application developer to be responsible for security [or] data consistency. That should be the responsibility of the database,” Ellison said. “So the application developer focuses on getting the job done building that application and inheriting from below, from the database, security, consistency, reliability. All of that should be done at the database level, not the application.”
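For readers who haven't run into it, the attack class that In-Database SQL Firewall targets looks like the pattern below. This is generic application-side Python, not Oracle's firewall API; the accounts table and the connection details are made up.

```python
# Illustration only: the injection pattern that In-Database SQL Firewall is
# meant to block at the database tier. This is generic application-side code,
# not Oracle's firewall API; the table and connection details are made up.
import oracledb

conn = oracledb.connect(user="demo", password="demo", dsn="localhost/FREEPDB1")
cur = conn.cursor()

user_input = "smith' OR '1'='1"  # attacker-controlled value

# Vulnerable pattern: string concatenation lets the input rewrite the statement.
# cur.execute("SELECT * FROM accounts WHERE owner = '" + user_input + "'")

# App-side mitigation: bind variables keep data out of the SQL text.
cur.execute("SELECT * FROM accounts WHERE owner = :owner", owner=user_input)

# A SQL firewall inside the database adds a second line of defense by
# permitting only SQL shapes it has learned or been configured to allow,
# regardless of how careful (or careless) each application is.
print(cur.fetchall())
```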
4. AI requires blazing performance
The performance requirements of AI are becoming obvious to everyone, from the industry’s insatiable demand for Nvidia GPUs to the increased infrastructure spending we’re witnessing across the industry. “Every hyperscaler raised their 2024 capex numbers, and most did so meaningfully,” Doug O’Loughlin writes in Fabricated Knowledge on Substack. “Every major company is guiding to acceleration.”
A few proof points: Microsoft and OpenAI reportedly plan to build a $100 billion data center that will include an AI supercomputer, at 100 times the cost of existing data centers. Oracle, meanwhile, is building out its own data centers, partnering with Nvidia, and supporting customers that run AI training and inference on OCI.
Software optimization is the other half of the equation. Oracle is boosting AI speed with some of the new capabilities in and around 23ai, including Exadata System Software 24ai for optimized storage and the aforementioned True Cache for accelerating app performance.
5. You need data sovereignty for global AI
In March, Oracle introduced its Globally Distributed Autonomous Database, which applies database sharding (partitioning data physically while retaining a single logical database) to distribute data across data centers and geographies with location-specific governance controls.
Oracle's new Globally Distributed Database with Raft replication uses the Raft consensus protocol to replicate data between physical databases for automatic failover. Oracle explains it this way: “Integrating replication inside the database with the RAFT-based protocol simplifies the creation and administration of fault-tolerant distributed databases and reduces the need for manual processes to maintain active-active availability.”
Or, in Ellison’s own words:
“We keep the illusion of a single database—even though it is not a single database. It's partitioned geographically. From the point of view of the organization that owns the data that's running the application, it looks like a unified global database. But from the point of view of regulatory compliance, we partition it and obey all of the rules. So we are trying to make life easier for people who are developing these applications and have to build local applications yet comply with local sovereignty laws.”
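Stripped of everything Oracle-specific, the idea reads like the toy sketch below: a routing layer places each record in the region its residency rules require, while callers keep talking to one logical store. The region names and the routing rule are invented for illustration.

```python
# A toy, non-Oracle illustration of geographic sharding: one logical store in
# front of region-specific physical shards. Region names and the routing rule
# are invented; the point is only that the application never picks a region.
REGION_OF_COUNTRY = {"DE": "eu-frankfurt", "FR": "eu-paris", "US": "us-ashburn"}

class GloballyDistributedStore:
    def __init__(self):
        # One physical store (here just a dict) per region or data center.
        self.shards = {region: {} for region in set(REGION_OF_COUNTRY.values())}

    def put(self, customer_id, record):
        # The sharding key (the customer's country) decides where the data
        # physically lives, so residency rules are enforced by the store.
        region = REGION_OF_COUNTRY[record["country"]]
        self.shards[region][customer_id] = record

    def get(self, customer_id):
        # Callers still see a single logical database.
        for shard in self.shards.values():
            if customer_id in shard:
                return shard[customer_id]
        return None

store = GloballyDistributedStore()
store.put(1, {"country": "DE", "name": "Anna"})   # lands in eu-frankfurt
print(store.get(1))                               # found without naming a region
```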
For more on distributed databases, see my recent post here.
6. Custom LLMs are the holy grail
Finally, no AI strategy is complete without LLMs and Gen AI. For many enterprises, the greater value will be in custom models that combine public LLMs with their own data.
“The great thing about our vector database is that [it] allows you to supplement the training of the foundational model, whether it’s ChatGPT from OpenAI or Grok or Llama or what have you, you can [augment] the training of that model with your personal data, your proprietary corporate [data], without having to disclose that data to the people who are building those models,” Ellison said. “So you can specialize the AI for your company, for you personally, for a particular topic. So the model is smarter than it otherwise would have been because you can safely add your private data to that.”
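That augmentation pattern is commonly called retrieval-augmented generation (RAG): pull the most relevant private documents out of the vector store at question time and hand them to the model as context, rather than retraining the model on them. Here is a generic sketch of the flow; embed() and generate() are stand-ins for whatever embedding model and LLM you use, and the docs table is the same hypothetical one from the vector search example above.

```python
# A generic sketch of retrieval-augmented generation (RAG) on top of the
# vector search shown earlier. embed() and generate() are stand-ins for your
# embedding model and LLM; the "docs" table is the same hypothetical one.
import oracledb

oracledb.defaults.fetch_lobs = False  # return CLOB columns as plain strings

def embed(text: str) -> str:
    """Return the text's embedding as a vector literal, e.g. '[0.1, 0.2, ...]'."""
    raise NotImplementedError  # call your embedding model here

def generate(prompt: str) -> str:
    """Call whichever LLM you use (OpenAI, Cohere, Llama, and so on)."""
    raise NotImplementedError

def answer(question: str, cur) -> str:
    # 1. Retrieve the most relevant private documents with AI Vector Search.
    cur.execute(
        """
        SELECT content
          FROM docs
         ORDER BY VECTOR_DISTANCE(embedding, TO_VECTOR(:qv), COSINE)
         FETCH FIRST 3 ROWS ONLY
        """,
        qv=embed(question),
    )
    context = "\n".join(row[0] for row in cur.fetchall())

    # 2. Augment the prompt with that private context; the model is never
    #    retrained on, or given bulk access to, the underlying data.
    return generate(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```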
What’s more, Oracle is making it easier to find and get information from the database with models. A new tool, called Select AI, lets users query the database using natural-language questions. Select AI works with LLMs from OpenAI and Cohere, Microsoft’s Azure OpenAI, and Oracle OCI Generative AI, which today comprises the Cohere and Llama-2 models.
My next article on the Oracle Connect website will look more closely at Select AI.