Three Trends That Will Transform the Database Market in 2023
Hyperscale data is driving the need for innovation and automation, but the industry may be entering a phase of consolidation.
A big welcome to the many new subscribers who have signed up for the Cloud Database Report. It’s great to have you with us. Our audience has grown organically by 50% in the past 12 months. If you know someone who might be interested, please share the Cloud Database Report with them. Thanks and wishing everyone a great 2023!
There’s a long list of things to watch for in the database market in the year ahead: database migration to the cloud; data clouds; governance; purpose-built vs. universal databases; data lakes, fabrics, and meshes; the rise of PostgreSQL; the long, slow demise of the legacy data warehouse.
However, there are three overarching trends in particular that I believe will have far-reaching impact. In a nutshell, they are zettabytes of data, industrywide disruption, and increased automation. Here’s my analysis.
1. Millions of petabytes of data will be created.
More businesses are entering the realm of hyperscale data.
IDC and other prognosticators estimate that the world now generates zettabytes of data each year. A single zettabyte is a million petabytes, or a thousand exabytes. And we’re talking about 10, 20, or more zettabytes being created by billions of users, devices, and applications.
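The unit arithmetic above can be sketched in a few lines of Python. This is a minimal illustration using decimal (SI) prefixes, where each step is a factor of 1,000; the per-user figure assumes roughly 5 billion connected users, which is an illustrative number, not a figure from this article.

```python
# Decimal (SI) storage units; each prefix step is a factor of 1000
PETABYTE = 1000 ** 5
EXABYTE = 1000 ** 6
ZETTABYTE = 1000 ** 7

# A single zettabyte is a million petabytes, or a thousand exabytes
assert ZETTABYTE == 1_000_000 * PETABYTE
assert ZETTABYTE == 1_000 * EXABYTE

# Illustrative only: 20 zettabytes spread across ~5 billion users
# (assumed user count, not from the article)
per_user_bytes = 20 * ZETTABYTE / 5_000_000_000
terabytes_per_user = per_user_bytes / 1000 ** 4
print(f"{terabytes_per_user:.0f} TB per user per year")
```

At 20 zettabytes a year, that back-of-the-envelope split works out to several terabytes per person annually, which gives a sense of the scale data teams are up against.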
This means there will be a mind-boggling amount of data to manage—above and beyond what is already in the cloud, corporate data centers, laptops, and other systems. That’s not exactly a new revelation, but it’s the stark reality for anyone with responsibility for data management, analytics, data science, storage & infrastructure, machine learning, or AI.
We increasingly hear about enterprise data environments that range into hundreds of petabytes, with some edging into exabytes. In this interview with theCUBE, Verizon talks about moving 2 exabytes of data to the cloud.
These massive and constantly growing data stores present a technical challenge, yet also a business opportunity. Only a small percentage (probably less than 10%) of newly generated data is captured, processed, and analyzed. Imagine what we might learn, and what new innovations could be developed, if more of that raw, unused data were actively managed.
Key takeaway: Every organization should evaluate its storage infrastructure and ability to scale as terabytes and petabytes of data flood the enterprise.
2. The competitive landscape will change.
The database market refuses to act its age. Relational databases have been around for 50 years, yet the pace of innovation is accelerating rather than slowing down. It would be a mistake to think of databases as a maturing market.
In fact, there’s been a tremendous amount of startup activity, as I recapped in the article below.
However, Gartner and others have suggested that consolidation in the database market is inevitable. Already this year, Progress disclosed plans to acquire MarkLogic; Qlik to acquire Talend; Confluent to acquire Immerok; and Snowflake to acquire Myst AI.
And startup funding slowed “big time” in the second half of 2022, according to Carnegie Mellon University professor Andy Pavlo in his year in review blog post. “The market cannot sustain so many independent software vendors (ISVs) for databases,” he writes. (Pavlo, co-founder of OtterTune, a startup that raised Series A funding last year, is in a position to know.)
With forecasts of a global recession, and CXOs keeping a watchful eye on cloud spending, the database market will likely experience considerable disruption in the months ahead.
Key takeaway: Developers and IT pros must stay in close touch with their vendors on product roadmaps, partnerships, and contingencies.
3. Increased automation will address growing complexity.
Developers, data engineers, and IT execs have never had more new capabilities to choose from in the data-management toolbox. In an earlier article, I referred to it as a Renaissance of database technologies.
Today, new product development goes on at an unrelenting pace, with Carnegie Mellon University’s Database of Databases approaching 900 database management systems.
Startups are introducing new data and analytics services, while heavyweights such as AWS, Google Cloud, Microsoft, and Oracle launched more than 100 new capabilities at their 2022 events.
All of this innovation is racing to keep up with data workloads that are growing in size and complexity. Ocient CEO Chris Gladwin, in his outlook for 2023, observes that “the nature of data is changing.”
Many IT teams are now grappling with graphs, vectors, spatial, and other abstruse data types that are fast becoming mainstream. Some are using a new breed of purpose-built databases for these unique workloads.
The only answer to this emerging complexity is automation. Last year, we saw the growing popularity of fully managed database services, serverless offerings, embedded ML, and other capabilities that ease deployment and administration. Examples include:
AWS, Bit.io, Cockroach Labs, and Xata introduced serverless cloud databases in the fall.
Google Cloud’s new AlloyDB database is fully managed and has ML-enabled autopilot for patching, backup, and replication.
Oracle’s MySQL HeatWave, newly available on AWS, includes an Autopilot feature for automatic provisioning, parallel loading, scheduling, etc.
OtterTune, a startup co-founded by CMU professor Andy Pavlo, is developing software that applies ML to make Postgres and MySQL databases self-driving.
These examples are representative and by no means exhaustive.
The key points I really want to make are: A) Myriad tools, platforms, and cloud services are bringing innovation in many forms; and B) automation and integration are necessary to capitalize on these awesome capabilities without creating Database Sprawl 2.0.
Key takeaway: IT teams need to plan for both A and B to be successful.