Database Innovations from A (AWS) to Z (Zilliz)
From OLAP to distributed SQL to vectors, the pace of new platform development hasn't slowed.
The database market’s cup runneth over with new platforms and cloud services—and more just keep on coming. In the past few days, ClickHouse launched a new OLAP cloud service, and Zilliz a new vector database service.
In fact, Carnegie Mellon University’s encyclopedic Database of Databases has grown to an incredible 875 different database management systems. Some of the latest are GreptimeDB, a time-series DB written in Rust; and Kuzu, a graph-oriented DB for use in embedded applications.
With so many DBMSes available, there have been rumblings that market consolidation is inevitable. So far, however, in the yin and yang of “product innovation vs. product saturation,” innovation is clearly winning.
Following is my latest roundup of recent activity, leading with Werner Vogels’ master class in how it all starts with engineering.
What is AWS’s secret to success? A big part of it is a dynamic cloud architecture that ranges from milliseconds of compute to exabytes of storage. And much of the credit for AWS’s elastic mega-cloud goes to CTO Werner Vogels, who has been engineering, building, and scaling AWS cloud infrastructure for nearly 20 years.
Attendees at re:Invent got a front-row seat to Vogels’ distributed-systems philosophy during his keynote presentation, where he shared the design principles behind S3, one of AWS’s first cloud services. They are: decentralization; asynchrony; local responsibility; decomposition into small, well-understood building blocks; autonomy; controlled concurrency; failure tolerance; controlled parallelism; symmetry; and simplicity.
There was one principle in particular that Vogels mentioned repeatedly: asynchrony. Asynchrony is the opposite of synchrony, in which things happen in a coordinated, orderly, serial fashion, with each step waiting on the one before it.
According to Vogels, the real world is by its nature asynchronous—and cloud infrastructure should be, too. “Sometimes the world looks synchronous,” he said. But he added, “Synchrony is an illusion—it is something that we built over a world that is asynchronous. Systems, as we know them, are asynchronous, as well.”
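To make the distinction concrete, here is a minimal Python sketch of my own (it is not from Vogels’ keynote). The `store_object` function and its delays are hypothetical stand-ins for slow storage or network calls. It runs the same three jobs one after another, and then all at once.

```python
import asyncio
import time


async def store_object(name: str, delay: float) -> str:
    # Hypothetical operation: asyncio.sleep stands in for slow I/O.
    await asyncio.sleep(delay)
    return name


async def run_serially(jobs):
    # Synchronous style: each job waits for the previous one to finish.
    return [await store_object(name, delay) for name, delay in jobs]


async def run_concurrently(jobs):
    # Asynchronous style: all jobs are in flight at the same time.
    return await asyncio.gather(*(store_object(n, d) for n, d in jobs))


def timed(coro_fn, jobs):
    # Run one of the strategies above and measure wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_fn(jobs))
    return list(results), time.perf_counter() - start


JOBS = [("a", 0.05), ("b", 0.05), ("c", 0.05)]
serial_results, serial_secs = timed(run_serially, JOBS)
concurrent_results, concurrent_secs = timed(run_concurrently, JOBS)
```

Both strategies return the same results, but the concurrent run should take roughly as long as the slowest single job rather than the sum of all three.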
SiliconAngle provides a good overview of Vogels’ keynote, including a few product announcements, here: “It’s evolve or die for systems architecture, says Amazon CTO Werner Vogels.”
Vogels did not talk about how the principles of distributed systems apply to database management, though both synchronous and asynchronous approaches are essential there. The key point, I think, is that asynchrony is the overarching principle to keep front of mind when it comes to distributed-systems architecture.
Vogels ended his keynote with a look ahead to quantum-based simulations 20 years from now. The image I shared at the top of this blog post helped illustrate that. From start to finish, his presentation spanned from 2006 to 2042—a remarkable 36 years of cloud development, past, present, and future!
In case you missed it, my earlier post (link below) covered AWS’s cloud database news from re:Invent.
Following a two-month beta program that involved 100+ customers, ClickHouse on December 6 announced the launch of ClickHouse Cloud, its OLAP as a service.
ClickHouse Cloud is available now on AWS or as a managed service from ClickHouse. Availability on Microsoft Azure and Google Cloud is “coming soon.”
Use cases for the company’s online analytical processing system include analytics, e-commerce, IoT, telemetry, and online gaming. eBay, Deutsche Bank, Spotify, and Uber are already using it.
With offices in San Francisco and Amsterdam, ClickHouse was spun off last year from Yandex, a search engine company based in Russia. Following a Series B round in October 2021 and a recent extension, the company now has $300 million in total funding, according to Crunchbase.
Couchbase on December 8 hosted a call with industry analysts to discuss the company’s FY23 Q3 financial results, which included revenue of $38.6 million, up 25% from a year ago, as well as other recent developments.
Key talking points included:
Customer momentum for Couchbase’s NoSQL cloud database service, Capella. New customers include Nobel Systems, a geospatial services provider; Yapstone, a digital payments company; and Mapotempo by Woop, a route-planning software developer. Couchbase pointed to a business conglomerate in India as its “largest new logo ever,” though it did not name the company.
A multi-year collaboration and joint marketing agreement with AWS that covers migrating workloads to Capella on AWS and extending Capella App Services to AWS edge services, among other things.
For developers, an updated UX and tools, expanded documentation, and other improvements. “It’s really focused on making it easier and faster for developers to accomplish the tasks they want to do,” said SVP Scott Anderson.
I asked what we should be watching for from Couchbase in 2023. They pointed to three things:
Continued push of dev tools and capabilities. “Developer, developer, developer,” said Anderson, channeling Steve Ballmer’s famous chant from years ago.
Continuing to improve total cost of ownership through options such as buying credits or making a bigger commitment to Capella.
A serverless cloud offering.
Cockroach Labs on December 6 premiered an update to its distributed SQL cloud database, CockroachDB 22.2.
One of the highlights is support for user-defined functions, which developers can now use to execute functions within the database rather than in application logic, for improved efficiency. Cockroach admitted it had resisted implementing UDFs, but wisely gave in to customer requests.
This 3-minute video explains UDFs.
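For readers who want to see the general idea in code, here is a rough sketch that uses Python’s built-in `sqlite3` module as a stand-in. It is not CockroachDB’s UDF syntax, and the `price_with_tax` function, the 10% rate, and the `orders` table are all hypothetical. It only illustrates calling a user-defined function from inside a query instead of looping over rows in application code.

```python
import sqlite3


def price_with_tax(price: float) -> float:
    # Hypothetical business rule: add 10% tax, rounded to cents.
    return round(price * 1.10, 2)


conn = sqlite3.connect(":memory:")

# Register the Python function so SQL statements can call it by name.
conn.create_function("price_with_tax", 1, price_with_tax)

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO orders (price) VALUES (?)", [(10.0,), (20.0,)])

# The function is applied per row inside the query itself.
rows = conn.execute(
    "SELECT id, price_with_tax(price) FROM orders ORDER BY id"
).fetchall()
conn.close()
```

The query result arrives with the calculation already applied, so the application does not need a second pass over the rows.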
The other thing that got my attention was the progress of CockroachDB MOLT, the recently introduced schema conversion tool for database migrations on AWS. The newly added source databases are Oracle, MySQL, and SQL Server. So it’s now somewhat easier for users to migrate existing databases from those environments to CockroachDB.
You can see all the new features in CockroachDB 22.2 on the company website here.
Yugabyte rolled out v2.17 of its distributed SQL database, YugabyteDB. There are a lot of new features packed into it, including capabilities for business continuity, failover, and backups.
Notably, Yugabyte also launched a new program, YugabyteDB Managed Quantum Leap, that offers startups $10,000 in credits towards its managed database service on AWS or Google Cloud. Separately, there’s also a 30-day free trial of YugabyteDB’s managed database service.
These are just some of the latest examples of how cloud database vendors are making it fast and easy for developer teams to get started on their newest platforms, typically for dev, test, and prototype projects.
Zilliz introduced Zilliz Cloud, its new vector database as a service. Potential use cases include image retrieval, video analysis, recommendations, chatbots, and other apps for this kind of unstructured data.
Zilliz Cloud is based on Milvus, the open-source vector database for which Zilliz is lead developer. It is a commercial, enterprise version of Milvus. Zilliz Cloud is available now on AWS, with Microsoft Azure and Google Cloud to follow.
In August, Zilliz disclosed a $60 million add-on to its Series B funding, bringing its total investments to $113 million. At the same time, Zilliz, which has its roots in China, announced it had opened its headquarters in San Francisco. I look forward to talking to Zilliz founder and CEO Charles Xie sometime soon.
Vectors are long strings of numbers representing documents, images, and other data types. In case you need a refresher, here’s a short blog post I wrote when another vendor, Pinecone Systems, launched last year.
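As a quick illustration of what a vector database does with those numbers, here is a toy sketch in plain Python. It does not use the Milvus or Zilliz Cloud API. The three-dimensional vectors and item names are invented for the example (real embeddings typically have hundreds of dimensions or more), and cosine similarity is just one common way to compare vectors.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" with invented values, for illustration only.
ITEMS = {
    "cat photo": [0.9, 0.1, 0.0],
    "dog photo": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def nearest(query, items):
    # Brute-force search: return the name of the most similar item.
    return max(items, key=lambda name: cosine_similarity(query, items[name]))


best = nearest([1.0, 0.0, 0.0], ITEMS)
```

A production system would replace the brute-force `nearest` loop with an index built for fast approximate search across millions or billions of vectors.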
No end in sight for data platform innovation
Finally, here’s some late-breaking news. Microsoft just announced a 10-year partnership with the London Stock Exchange Group to bring together their respective data infrastructure and analytics technologies to co-create solutions and services for financial institutions and markets.
Microsoft sees it as a $5 billion opportunity. The year may be winding down, but it’s not slowing down.