As a software engineer at YouTube in 2010, Sugu Sougoumarane realized that scaling the MySQL database for the cloud was a tough challenge. His realization helped lead to the creation of the open source Vitess project, which hit a major milestone with the release of Vitess 4.0.
The Vitess project joined the Cloud Native Computing Foundation (CNCF), which is home to the Kubernetes container orchestration project, in February 2018. At the same time, Sougoumarane co-founded PlanetScale, a commercial service supporting Vitess and its deployment.
Just over a year and a half later, on Nov. 5, 2019, the Vitess project graduated from the CNCF, marking a major milestone for the project. CNCF graduation is the highest level of project status within the CNCF and is an indicator of the maturity of the project code and processes. With graduation, Vitess 4.0 became generally available, providing users with new features.
Among the criteria that are part of project graduation at the CNCF is diversity of corporate and developer participation in a project. Over the past year about 35% of contributions came from PlanetScale, with the rest coming from contributors at Slack, Square, Pinterest and Nozzle among others, said Sougoumarane, who is also CTO of PlanetScale.
The Vitess project enables a database as a service (DBaaS) that provides NoSQL-like horizontal scaling in a MySQL environment, said Gene Leganza, vice president and research director at Forrester.
"It's part of an overall trend to make big data cloud capabilities accessible to the enterprise developer and data analyst," Leganza said. "It provides the unlimited horizontal scaling that modern apps need and can't get with MySQL without introducing sharding logic into the code, within a relational MySQL environment that is familiar to enterprise developers."
How Vitess works
Vitess, including Vitess 4.0, is a storage and database layer that runs in cloud-native Kubernetes environments. Kubernetes is an increasingly popular open source technology that organizations can run on premises or in the public cloud for delivering applications and services.
Sougoumarane explained that Vitess runs as an application rather than as just a storage target for a database. Running as an application in Kubernetes gives Vitess access to the platform's built-in scalability and survivability controls.
"Vitess knows how to recover from failure and operate with high availability and without data loss," Sougoumarane said. "It can be run like a regular application, but at the same time, it's more than a regular application because it also manages your storage."
Vitess uses the open source MySQL database at its foundation, and the proxies that direct traffic also speak the MySQL protocol. As a result, applications communicate with Vitess as if it were a regular MySQL database, unaware that it is actually an entire distributed farm of MySQL databases. Scalability is a primary characteristic of Vitess, which can run tens of thousands of nodes for production workloads.
In the past, Sougoumarane said he thought of Vitess as database middleware, as it sits between a database and applications. He now refers to Vitess as a cloud-native database, because its core strengths are its ability to run in Kubernetes and its ability to expose a database API.
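The idea of a single entry point in front of many databases can be illustrated with a small sketch. This is a toy model, not Vitess code: the `ToyShardRouter` class, the shard names and the hash-and-modulo placement are all assumptions made for illustration, and in-memory dictionaries stand in for MySQL instances.

```python
import hashlib


class ToyShardRouter:
    """Hypothetical stand-in for a routing proxy: one entry point, many backends."""

    def __init__(self, shard_names):
        # Each "shard" is an in-memory dict standing in for a MySQL instance.
        self._names = list(shard_names)
        self.shards = {name: {} for name in self._names}

    def _shard_for(self, key):
        # Hash the key and pick a shard by modulo. This placement rule is an
        # assumption for the sketch, not the mapping a real system would use.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def insert(self, key, row):
        self.shards[self._shard_for(key)][key] = row

    def select(self, key):
        return self.shards[self._shard_for(key)].get(key)


router = ToyShardRouter(["shard_a", "shard_b", "shard_c"])
for user_id in range(1, 101):
    router.insert(user_id, {"user_id": user_id, "name": f"user{user_id}"})

# The caller never names a shard; it only talks to the router.
print(router.select(42))
print({name: len(rows) for name, rows in router.shards.items()})
```

The calling code reads and writes through one object and has no knowledge of how many backends exist, which is the property the article describes for applications talking to Vitess over the MySQL protocol.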
While Vitess has now reached a new stage of maturity with CNCF graduation, Sougoumarane noted that the project itself is nine years old and was already mature as a scalable technology. One area that wasn't mature, however, was ease of use: the project was complex to use and deploy.
Reducing that complexity is one of the core themes of the Vitess 4.0 release that became generally available in November. Sougoumarane said that Vitess 4.0 gives users a simpler way to get started with the technology.
Another area of improvement in Vitess 4.0 is data replication. A primary feature of Vitess is its sharding capability, which slices up data and distributes it across database nodes. It's not always easy to separate data, which has led to the new VReplication feature in Vitess 4.0, Sougoumarane said.
"If you insert data in one place, VReplication will also create a copy of that in other places where it is co-located with other [rows] that are related to that same data," he said.
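A rough way to picture the co-location Sougoumarane describes is a table that is sharded by one key while a copy of each row is also kept sharded by a second key. The sketch below is a simplified illustration under stated assumptions, not the VReplication implementation: the order fields, the `shard_of` helper, the shard count and the synchronous copy step are all hypothetical.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count for the sketch


def shard_of(value):
    """Map a key to a shard index with a hash (illustrative placement rule)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Primary copy: orders sharded by customer_id.
orders_by_customer = [[] for _ in range(NUM_SHARDS)]
# Replicated copy: the same orders, sharded by merchant_id instead.
orders_by_merchant = [[] for _ in range(NUM_SHARDS)]


def insert_order(order):
    # Write the row where the customer's other rows live.
    orders_by_customer[shard_of(order["customer_id"])].append(order)
    # Copy step: place a duplicate next to the merchant's other rows.
    # Done inline here for simplicity; this sketch does not model a
    # replication stream.
    orders_by_merchant[shard_of(order["merchant_id"])].append(dict(order))


def orders_for_merchant(merchant_id):
    # Only one shard needs to be read, because the copies are co-located.
    shard = orders_by_merchant[shard_of(merchant_id)]
    return [o for o in shard if o["merchant_id"] == merchant_id]


insert_order({"order_id": 1, "customer_id": 10, "merchant_id": 7})
insert_order({"order_id": 2, "customer_id": 11, "merchant_id": 7})
insert_order({"order_id": 3, "customer_id": 10, "merchant_id": 8})

print([o["order_id"] for o in orders_for_merchant(7)])
```

Because every order for a given merchant lands on the same shard in the second copy, a per-merchant lookup touches one shard instead of all of them, at the cost of storing each row twice.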
Vitess at PlanetScale
As an open source project, Vitess can be used by any organization for free, though organizations would need to set up and manage it on their own. As for Sougoumarane's company, PlanetScale, its goal is to provide a managed cloud DBaaS offering. He noted that the PlanetScale platform supports production customers today.
For both the Vitess project and PlanetScale, Sougoumarane said that a key focus in the months ahead will be on making sure that any organization that decides to move to Kubernetes will be able to move their data with them seamlessly.
"If you are moving to Kubernetes, you shouldn't leave your data behind," Sougoumarane said.