7 Database Trends Driven by AWS, Google Cloud, Microsoft, Oracle, and Startups
A roundup of the latest developments in serverless, PostgreSQL, MySQL, and more.
Hello Cloud Database Report readers, and a big welcome to all of our new subscribers! I just returned to New York from a trip to the Columbia River Gorge where, amid the natural beauty, I saw this awesome rainbow with one end in Oregon and the other in Washington State! And here’s a bonus: If you look closely, there’s a Google data center situated in the middle of the image along the Columbia River. Amazon has a data center 80 miles east of this one.
A lot has happened in the database market over the past few weeks. Below I have condensed a handful of the most significant announcements and put them into the context of seven key industry trends.
1. Three new serverless offerings
The cloud database market is going serverless. On Oct. 26, AWS announced Amazon Neptune Serverless, a version of its popular graph database that automatically provisions capacity and scales it up and down with the workload. On the same day, Bit.io launched its new serverless version of PostgreSQL. And more recently, Xata, a startup that I’ve written about before, announced its own new serverless database. Keep reading below for more on each of these announcements.
2. Graph is off the charts
Amazon Neptune Serverless is notable for a second reason: it’s another development in what has been a busy year for graph databases. As I’ve discussed before, there’s been a burst of activity in graph databases (from Neo4j, Memgraph, and others) driven by use cases that are centered on data relationships, such as contact tracing or fraud detection.
To give you an idea of just how extensive these data relationships can be, Neo4j has demoed a graph database comprising more than 200 billion nodes and 1 trillion relationships. And today, Neo4j announced general availability (GA) of the latest version of its platform, Neo4j 5.
EdgeDB is joining the fray. The startup disclosed Nov. 7 that it has secured $15 million in Series A funding. EdgeDB describes its still-in-development platform as a “graph-relational database” built on Postgres. (Yes, Postgres seems to be taking over the world.) An Apache 2.0 open-source database, EdgeDB can be hosted in AWS, Azure, Google Cloud, and other environments. A fully managed EdgeDB Cloud service is due in Q1 of 2023. (See the post below for more on graph DBs.)
3. Vector — it’s more than semantics
Pinecone Systems on Oct. 31 (Halloween!) announced “keyword-aware semantic search powered by a new hybrid index.” Sounds scary, but I boil it down to two words: vector database. I wrote about Pinecone in June 2021, when the company introduced its new vector database. CEO Edo Liberty was my first guest on the Cloud Database Report podcast. Now, Pinecone has combined the two approaches in a single index, which is a more elegant solution for organizations that until now have had to choose either keyword search or semantic search.
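To make the idea of combining keyword and semantic search concrete, here is a minimal sketch in Python. It is not Pinecone's implementation or API. The function names, the `alpha` blending weight, and the two-dimensional vectors (stand-ins for real embeddings) are all assumptions made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query words that appear in the document text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_search(query, query_vec, docs, alpha=0.5):
    """Rank docs by a weighted blend of semantic and keyword scores.

    alpha=1.0 is purely semantic, alpha=0.0 is purely keyword
    (alpha is a hypothetical knob for this sketch).
    """
    scored = []
    for doc in docs:
        semantic = cosine(query_vec, doc["vec"])
        keyword = keyword_score(query, doc["text"])
        scored.append((alpha * semantic + (1 - alpha) * keyword, doc["id"]))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy documents with hand-made vectors standing in for embeddings.
docs = [
    {"id": "a", "text": "serverless graph database", "vec": [1.0, 0.0]},
    {"id": "b", "text": "managed vector search service", "vec": [0.0, 1.0]},
]

semantic_only = hybrid_search("vector search", [1.0, 0.0], docs, alpha=1.0)
keyword_only = hybrid_search("vector search", [1.0, 0.0], docs, alpha=0.0)
```

In this toy setup, the query vector is closest to document "a" while the query words match document "b", so the ranking flips as `alpha` moves from 1.0 to 0.0. That tension is what a hybrid index is meant to resolve in one query instead of two separate systems.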
Also on the vector database front, Zilliz announced general availability of its new service, Zilliz Cloud, also on Oct. 31. Zilliz is the startup behind Milvus, the open-source vector database that underpins Zilliz Cloud. The service is available initially on AWS, with plans for Azure and Google Cloud. In August, Zilliz announced a Series B funding round of $60 million, bringing its total raised to $113 million, and the opening of its headquarters in San Francisco. The latest round was led by Prosperity7 Ventures (part of Aramco Ventures) and joined by its existing investors in China.
4. Postgres everywhere
Microsoft has added PostgreSQL support to Azure Cosmos DB, resulting in a cloud database that can handle both relational (Postgres) and non-relational/NoSQL (Cosmos) workloads. Microsoft claims this makes Azure the first cloud service with this kind of dual personality in a single platform. Postgres joins a long list of interfaces supported by Cosmos DB, including MongoDB, Apache Cassandra, Apache Gremlin, and Table (a schemaless data store).
Support for PostgreSQL continues to expand in new and impressive ways. In a recent blog post titled “Google Cloud’s Strategy to Grow and Win in Cloud Databases,” I pointed out that Google Cloud now has three PostgreSQL variants—Cloud SQL, Spanner, and the new AlloyDB. And as noted in #1 above, Bit.io has jumped on the Postgres bandwagon. Carnegie Mellon’s Database of Databases now lists 30 Postgres database derivatives.
5. BigQuery keeps getting bigger
Google Cloud introduced a series of database advances in recent weeks, but none more important than the new capabilities for its BigQuery data warehouse. At Google Cloud Next, the company announced that BigQuery now supports unstructured data (in addition to structured and semi-structured data), making it a more comprehensive platform for all types of data analysis. In addition, BigQuery now supports Apache Spark for large-scale processing. And Google Cloud has introduced Datastream for BigQuery, which replicates data from operational databases (MySQL, PostgreSQL, AlloyDB, Oracle) into BigQuery. So, Google Cloud’s data warehousing workhorse continues to do ever more of the heavy lifting for big data.
6. MySQL HeatWave gains momentum
Adding to the growing drumbeat around MySQL HeatWave, Oracle introduced MySQL HeatWave Lakehouse (beta) at CloudWorld in Las Vegas. MySQL HeatWave Lakehouse can process queries in both a MySQL HeatWave database and object storage, so you can think of it as part data warehouse and part data lake. Oracle touts the performance of MySQL HeatWave Lakehouse compared to competing platforms from AWS and Snowflake.
It’s been two years since Oracle launched MySQL HeatWave in November 2020, and Oracle has been putting a lot of emphasis on this new platform, including its announcements of MySQL HeatWave on AWS and Azure. HeatWave is based on open source MySQL, supports both transactions and analytics, and is the linchpin in Oracle’s multi-cloud database strategy. (See “Larry Ellison Gets Serious About Multi-Cloud. What Is Oracle’s Next Move?”)
7. Forget the database
Some database companies don’t describe themselves as database companies. Snowflake may be the best example, but others are catching on. That’s because databases are essentially infrastructure; the greater value is in the digital transformation and new business opportunities they enable.
Xata, a startup that just announced availability of its serverless, relational database, puts it this way: “Think data, not databases.” Xata is taking the path of a better developer experience and ease of use. For more, see my interview with Xata CEO Monica Sarbu.
Final note: Substack has added a chat feature that I plan to test in the coming days. You will need to download the Substack app to access it. Stay tuned for more!