Google Cloud's Strategy to Grow and Win in Cloud Databases
Here's how it's keeping pressure on AWS, Microsoft Azure, and Oracle at the top of the database market.
Imagine a data platform that can manage and analyze all data, from all sources, in all formats, across all clouds.
That is Google Cloud’s vision for the future of database management and, according to CEO Thomas Kurian, it’s already here—in the form of the “open data cloud.”
Kurian, speaking at Google Cloud’s Next event in New York, shared new developments in infrastructure, security, AI/ML, collaboration tools, and more. In this blog post, I’m focused on the latest and greatest in Google Cloud databases and data management.
Google’s open data cloud is a work in progress—new databases and new ways of stitching them together are still evolving, as discussed below. Even so, the pieces are coming together around Google Cloud’s central design point of simplifying technology. As that happens, there’s reason to believe that Google Cloud will continue to grow in influence in the cloud database market—and potentially grab a bigger slice of the pie.
Google Cloud’s ascendance in the database market is noteworthy. Last year, Google Cloud broke into the top five vendors in the market, based on database revenue, according to Gartner. SAP was knocked out of the top five. Meanwhile, Oracle slipped to #3 and IBM to #5.
That’s all the more impressive when you consider that, just five years ago, Google Cloud wasn’t even in the top 10 of Gartner’s ranking. Now, #4 Google Cloud is right behind Oracle, although Oracle continues to have a substantial market share lead. The question is whether that margin will narrow again this year. Below is my earlier analysis of the changes in the database pecking order.
4 key areas of focus
During Next, I talked to Andi Gutmans, Google Cloud GM and VP of Engineering for Databases, about recent developments. Google Cloud has about nine of its own databases, the newest of them being AlloyDB, which was introduced in May and is still in preview release.
Gutmans noted that, while Google Cloud made a handful of database-related announcements at Next, there was also news a few weeks earlier that fits into the bigger picture. That included:
90-day free trials of Google’s widely used Cloud Spanner relational database
Database Migration Services (DMS) for AlloyDB, which can be used to migrate Postgres databases, including those from AWS and Microsoft Azure, to Postgres-compatible AlloyDB
Broadly speaking, Gutmans says, Google Cloud is focused on four key areas of database development:
Creating a unified and integrated data cloud for transactional and analytical data
A commitment to open standards and ecosystems and, in the process, helping customers “break free” from legacy databases
Infusing AI and ML into data-driven workflows
Empowering developers and other builders to be more productive and impactful
Gutmans’ blog post has details on each of these initiatives.
Legacy replacement
Our conversation started, as so many do these days, on the topic of legacy database conversions. Like other cloud-native database providers, Google Cloud sees a big opportunity in making it attractive and easier for IT departments to move their existing workloads from on-premises, as well as from other clouds, to its database services.
“One of the things we’ve been hearing from customers is that they want to get off of legacy, proprietary databases as they move into the cloud,” Gutmans says. “And it’s not just for cost reasons. As they’re going to the cloud, they really want to make sure that they have the flexibility to run those workloads wherever they want to, and scale them up, scale them down.”
Customers are trying to get away from the “unfriendly licensing constructs” of some traditional database vendors, he added. Google Cloud’s answer is its embrace of open standards such as PostgreSQL. In that same vein, Google Cloud announced Java/JDBC and Go/pgx drivers for Spanner, and Spark support for BigQuery.
Analytics & transactions
We also talked about the different ways that Google Cloud is increasingly blending analytical data and transactional data to give customers more ways to create applications and business insights with near real-time data.
This has been the Holy Grail going back to the universal databases of a generation ago from vendors such as Illustra and Informix, both of which now exist deep within the bowels of IBM. A few database vendors, like SingleStore and TileDB, are developing modern versions of these multi-purpose database platforms. Snowflake recently got into the action with its Unistore. Oracle’s MySQL HeatWave addresses the same trend.
Google Cloud’s AlloyDB, a PostgreSQL-compatible database service, fits into this category of transactional-plus-analytical systems. And Google Cloud is introducing new ways to cross-pollinate transactions and analytics among its various database platforms by making it easier to query and/or move data from operational systems into its analytics systems. Some of the technologies that enable this include:
Datastream for BigQuery, which replicates data from operational databases (MySQL, PostgreSQL, AlloyDB, Oracle) into BigQuery
Bigtable Change Streams, to track changes in Bigtable and integrate Bigtable data with other systems
Cloud Spanner Change Streams, to track and stream changes from Cloud Spanner
Federated queries from BigQuery to Cloud SQL, Spanner, and Bigtable, so users can query those transactional systems without having to move the data into BigQuery
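To make the federated-query item above more concrete, here is a minimal sketch of how such a query might be assembled. It relies on BigQuery's `EXTERNAL_QUERY` table function, which forwards a SQL string to a connected Cloud SQL or Spanner database. The connection ID, the `orders` table, and the helper name `build_federated_query` are assumptions made up for illustration; they do not come from Google Cloud or from Gutmans.

```python
def build_federated_query(connection_id: str, remote_sql: str) -> str:
    """Wrap a query for an operational database in BigQuery's EXTERNAL_QUERY.

    connection_id: a BigQuery connection resource (hypothetical value below).
    remote_sql: SQL written in the dialect of the remote database.
    """
    # Escape backslashes and single quotes so remote_sql stays a valid
    # single-quoted string literal inside the outer BigQuery statement.
    escaped = remote_sql.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT *\n"
        f"FROM EXTERNAL_QUERY('{connection_id}', '{escaped}')"
    )


# Hypothetical connection ID and table, for illustration only.
CONNECTION_ID = "my-project.us.orders-db"
sql = build_federated_query(
    CONNECTION_ID,
    "SELECT id, status FROM orders WHERE status = 'open'",
)
print(sql)
```

The resulting string could then be submitted like any other BigQuery statement, for example with `bigquery.Client().query(sql)` from the `google-cloud-bigquery` package. The inner query runs in the operational database, and only its result set is returned to BigQuery.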
Use cases for change streams include analytics, archiving data for governance, and event triggering to other databases.
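As a rough illustration of the idea behind change streams, the sketch below replays a sequence of change records against a downstream copy of a table. The record shape (`op`, `key`, `values`) is a simplified, hypothetical format invented for this example. It is not the actual record format of Bigtable or Cloud Spanner change streams, which carry more metadata.

```python
from typing import Any, Dict, List

# Hypothetical, simplified change-record shape used only for illustration.
ChangeRecord = Dict[str, Any]


def apply_change(replica: Dict[Any, Dict[str, Any]], change: ChangeRecord) -> None:
    """Apply one change record to an in-memory copy of a table."""
    op = change["op"]
    key = change["key"]
    if op in ("INSERT", "UPDATE"):
        # Merge the new column values into whatever row already exists.
        replica[key] = {**replica.get(key, {}), **change["values"]}
    elif op == "DELETE":
        # Deleting a row that is already absent is treated as a no-op.
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


changes: List[ChangeRecord] = [
    {"op": "INSERT", "key": 1, "values": {"status": "open", "total": 40}},
    {"op": "UPDATE", "key": 1, "values": {"status": "shipped"}},
    {"op": "INSERT", "key": 2, "values": {"status": "open", "total": 15}},
    {"op": "DELETE", "key": 2},
]

replica: Dict[Any, Dict[str, Any]] = {}
for change in changes:
    apply_change(replica, change)

print(replica)  # {1: {'status': 'shipped', 'total': 40}}
```

The same replay loop could just as easily write to an archive or trigger an event instead of updating a dictionary, which is why one stream of changes can serve analytics, governance, and event-driven use cases alike.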
Interesting note: Datastream for BigQuery is already looking like a hit. Although it was introduced only a few weeks ago, hundreds of customers are already adopting it, Gutmans says.
At Next, Google Cloud made 10 cloud predictions for tech developments that will happen by the end of 2025. One of them was Gutmans’ prediction that analytics and transactions will continue to merge together.
Google Cloud’s data trifecta
There are many components to Google’s “all data, all sources, all clouds” strategy, but I want to focus on three of them.
BigQuery - Google Cloud’s popular data warehouse needs no introduction. It’s been around for more than 10 years and is widely used by organizations such as P&G, Toyota, and UPS.
At Next, Google announced that BigQuery now supports unstructured data—i.e. documents, video, audio—in addition to structured and semi-structured data.
In addition, there’s a preview of BigQuery support for Apache Spark, the open analytics engine for large-scale data processing. That means users can create stored procedures in BigQuery using Spark.
With these updates, along with the aforementioned Datastream replication, BigQuery is now more versatile in its support for more data types and workloads.
PostgreSQL - Google Cloud has placed a big bet on PostgreSQL. The introduction of AlloyDB in the first half of this year gave Google Cloud three databases with Postgres compatibility. The other two are Cloud SQL and Spanner. “We’re doubling down on Postgres, or tripling down,” says Gutmans.
As noted, Database Migration Service can now be used to migrate other Postgres variants (including from AWS and Microsoft Azure) to AlloyDB. Earlier this year, Google Cloud announced a version of DMS for Oracle-to-Cloud SQL for PostgreSQL. In short, Google Cloud continues to make it easier for users to get started with its Postgres-compatible options.
Also, the AlloyDB partner ecosystem has grown to 30+ solutions—evidence of growing industry support for Google Cloud’s newest Postgres-compatible DB.
Data Cloud - I’ve written before about Google’s Data Cloud, which has been described as a cloud for all of a customer’s data. (See article below.) The blueprint continues to grow into something more expansive, with AI, ML, databases, data lakes, analytical data, and operational data all rolled together into an “open, extensible, unified, intelligent” data cloud.
Google Cloud isn’t the only vendor with a data cloud model. Snowflake, Oracle, and others have similar overarching architectures that tie together data from different sources, normalize data, and make it widely available for sharing.
I find data clouds a helpful construct in thinking broadly about an organization’s “data estate,” that is, the terabytes and petabytes of data from all sources, spread widely across hybrid and multi-clouds. Most businesses are still figuring this out. For a growing number of them, Google Cloud’s rapidly evolving open data cloud may be what they are looking for.