Ocient on AWS: Next-Generation Analytics at Hyperscale
A cloud data warehouse designed to process trillions of records.
Hello Cloud Database Report readers—and a warm welcome to new subscribers! Thanks for spending your valuable time with us. Your comments and feedback are always welcome. You can reach me at jfoley09@gmail.com.
Ocient’s super-scale data warehouse is now available on AWS. This emerging platform is something to watch as more businesses grapple with petabytes and exabytes of data.
Ocient specializes in what it calls “the world’s largest datasets,” meaning those with trillions of records. Typical workloads for this kind of extreme analysis include digital ad auctions, stock-ticker trending, and telecom network traffic.
“Our focus is on complex analysis of at least hundreds of billions of records, if not trillions or tens of trillions or hundreds of trillions,” Chris Gladwin, Ocient’s co-founder and CEO, told me in a podcast conversation late last year. “That’s territory that was previously impossible.”
Founded in 2016, Ocient is still coming up to full speed. In July, it announced the latest version of its data warehouse software, which included, among other things, native support for complex data types (such as arrays, tuples, and matrices) and ETL for building data pipelines.
Two previous announcements hint at how customers use Ocient.
Ocient partnered with Carahsoft to sell into the government market. Potential use cases include intelligence, cybersecurity anomaly detection, and sensor data analysis.
MediaMath, an ad tech company, is using Ocient for digital ad campaign forecasting. MediaMath can handle 6 million ad opportunities per second.
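To put that rate in context with the "trillions of records" framing, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the 6 million per second figure is sustained around the clock, which the source does not claim.

```python
# Rate quoted for MediaMath in the text.
AD_OPPORTUNITIES_PER_SECOND = 6_000_000

# Assumption (not from the source): the rate holds constantly, 24 hours a day.
SECONDS_PER_DAY = 24 * 60 * 60

per_day = AD_OPPORTUNITIES_PER_SECOND * SECONDS_PER_DAY
days_to_trillion = 1_000_000_000_000 / per_day

print(f"Records per day at that rate: {per_day:,}")
print(f"Days to accumulate one trillion records: {days_to_trillion:.2f}")
```

Under that assumption, a single workload of this kind would pass the trillion-record mark in roughly two days.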
Gladwin has experience with this kind of high-end computing. Previously, he founded object storage vendor Cleversafe, which IBM acquired in 2015. That experience with high-scale data storage carried over to Ocient, whose software is optimized to run on NVMe solid-state storage, industry-standard CPUs, and 100 Gbps networking.
AWS instances for data-intensive workloads
That brings me to Ocient’s latest news. The company’s Ocient Hyperscale Data Warehouse is now GA in the AWS Marketplace. List price is $625,000 per year for 900 TB of data storage and 240 processor cores.
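For a rough sense of what that list price implies, the sketch below divides it by the quoted storage and compute. These are illustrative unit costs only: the price covers storage and compute together, and reading the compute figure as 240 cores is an assumption.

```python
# Figures quoted in the AWS Marketplace listing described above.
LIST_PRICE_USD_PER_YEAR = 625_000
STORAGE_TB = 900
CORES = 240  # assumption: the quoted compute figure means 240 cores

# The price is a bundle, so these are rough ratios, not separate line items.
price_per_tb = LIST_PRICE_USD_PER_YEAR / STORAGE_TB
price_per_core = LIST_PRICE_USD_PER_YEAR / CORES

print(f"Per TB per year:   ${price_per_tb:,.2f}")
print(f"Per core per year: ${price_per_core:,.2f}")
```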
I had a few questions, and Ocient connected me with Joe Jablonski, the company’s chief product officer. Ocient on AWS uses i3en instances, which are AWS-specific cloud resources. Here’s how AWS describes them:
i3en instances are designed for data-intensive workloads such as relational and NoSQL databases, distributed file systems, search engines, and data warehouses that require high random I/O access to large amounts of data residing on instance storage. i3en instances are powered by AWS’s custom Intel Xeon Scalable (Skylake) processors with 3.1 GHz sustained all-core turbo performance and provide up to 100 Gbps of networking bandwidth and come in seven instance sizes, with storage options from 1.25 to 60 TB.
3 questions for Ocient
With that technical nitty-gritty out of the way, here’s the Q&A with Joe Jablonski.
Q: What is the scalability of Ocient on AWS?
Jablonski: As of today, Ocient can scale to 100 nodes of i3en instances (at 60 terabytes per node). This equates to up to 6 petabytes (PB) of raw disk and up to 10 PB of compressed usable data with reliability.
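The capacity figures in that answer follow from simple arithmetic. A minimal sketch, assuming decimal units (1 PB = 1,000 TB); the usable-to-raw ratio is only what the two quoted numbers imply, not a figure Ocient stated.

```python
# Figures quoted in the answer above.
NODES = 100
RAW_TB_PER_NODE = 60
USABLE_PB = 10  # "compressed usable data", as quoted

TB_PER_PB = 1_000  # assumption: decimal units

raw_pb = NODES * RAW_TB_PER_NODE / TB_PER_PB
implied_ratio = USABLE_PB / raw_pb  # implied by the quote, not stated by Ocient

print(f"Raw capacity: {raw_pb:g} PB")
print(f"Implied usable-to-raw ratio: {implied_ratio:.2f}x")
```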
Q: What other database platforms are comparable to Ocient on AWS?
Jablonski: Our Compute Adjacent Storage Architecture (CASA) is engineered for the largest-scale datasets, which, when combined with Ocient’s secondary indexing, delivers performance at the scale of tens of petabytes and beyond at an unbeatable pricing structure.
On smaller-scale workloads (less than 2 PB active data under analysis), we might see Redshift Advanced Query Accelerator (AQUA), Snowflake, and Databricks in a competitive set. In terms of competition, we’re just as focused on enabling new and previously infeasible capabilities as we are on replacing legacy or costly cloud data warehouse systems, which can’t cost-effectively compete at hyperscale.
Q: Are any customers using Ocient on AWS?
Jablonski: Yes, we have government and telecommunications customers who have leveraged Ocient in AWS, but we cannot name them publicly.
Beyond human capabilities
Ocient is looking to replace old-style data warehouses with its new cloud database platform.
In August, the company released the results of its survey of 500 data and IT professionals who manage workloads of 150 TB or more. Ocient found that 59% of respondents were looking to switch data warehouse providers. Here’s a blog post I wrote about that.
When I talked to CEO Chris Gladwin in our podcast, I asked him about extreme-scale data management. His response was fascinating, and it is relevant to data engineers who must develop strategies and architectures for these gigantic datasets.
“Billions is kind of the last scale at which humans can actually make or touch data that big. It’s very hard to do, but it’s possible,” Gladwin said. “But at trillions scale, it’s just not possible.”
Here’s a link to the full podcast.
As more organizations face petabytes and exabytes of data to manage and analyze, I expect that some will need the kind of hyperscale capabilities Ocient is developing.