Kafka co-creator details event data streaming evolution

The open source Apache Kafka streaming technology and commercial vendor Confluent have grown over the last decade as organizations have increasingly turned to real-time data.

Sean Michael Kerner

Published: 09 Sep 2022

The open source Apache Kafka project got its start at LinkedIn more than a decade ago and has grown dramatically in the years since.

Apache Kafka provides an event data streaming technology that enables organizations to move data from one place to another in support of business operations and data analytics. Among the leading contributors and supporters of Kafka is Confluent, which had its IPO in June 2021. In recent years, Confluent has increasingly focused on building out its cloud platform, which provides a managed service for event streaming data operations.
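For readers new to the model, the idea of moving events from one place to several others can be illustrated with a toy example. The sketch below is not Kafka's API and does not use any Kafka client library; `EventLog`, `publish` and `poll` are hypothetical names chosen for illustration, and the whole thing runs in memory. It only shows the general pattern of an append-only log that multiple consumers read independently, each tracking its own position.

```python
from collections import defaultdict


class EventLog:
    """Toy in-memory stand-in for one stream: an append-only list of events.

    Hypothetical illustration only; this is not how Kafka is implemented.
    """

    def __init__(self):
        self._events = []
        # Next position to read, tracked separately for each consumer name.
        self._offsets = defaultdict(int)

    def publish(self, event):
        """Append an event and return its position (offset) in the log."""
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, consumer):
        """Return events this consumer has not seen yet and advance its offset."""
        start = self._offsets[consumer]
        batch = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return batch


log = EventLog()
log.publish({"type": "order_placed", "order_id": 1})
log.publish({"type": "order_shipped", "order_id": 1})

# An operational consumer reads what has happened so far.
operations_view = log.poll("operations")

log.publish({"type": "order_placed", "order_id": 2})

# An analytics consumer that starts later still sees every event from the beginning.
analytics_view = log.poll("analytics")

# The operational consumer picks up only what is new since its last read.
operations_update = log.poll("operations")
```

Because each consumer keeps its own offset, the same stream of events can feed both business operations and data analytics without either reader affecting the other.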

Confluent isn't the only vendor that supports Kafka today. Multiple services are on the market, including Amazon MSK, well-funded startup Aiven and Instaclustr, which NetApp acquired in April.

In this Q&A, Jay Kreps, co-founder and CEO of Confluent and co-creator of Kafka, provides insight about how the event streaming data technology is used and where the commercial vendor is headed.

How have things changed for you and for Confluent since you had your IPO?

Jay Kreps: Honestly, the mechanics of being a public company are not that complicated. You need to communicate your results every quarter and try to do a good job of that. A lot of the work comes ahead of time while preparing to go public and making sure that you can produce consistently good results.

There is always tension between producing short-term results while still having a long-term vision, and that is what I think technical founders often end up bringing to the table. You may have less experience running a public company, but you at least hopefully have some vision of where the company is trying to get to.

Are you at all surprised by the continued success and longevity of Kafka and Confluent?

Kreps: I guess the goal when you create something is for it to be big, but maybe that's not the expectation. I'm obviously super pleased.

The biggest question mark we had when we were thinking about starting the company was really just how relevant all this real-time streaming stuff is outside of tech. We knew that in tech companies, they're all digital, and the different parts have to come together. But does that make sense in different industries? That was probably one of the most interesting parts of starting the company -- actually starting to talk to all these different types of organizations, and figuring out things like how real-time streaming fits into banking, manufacturing and quality control, for example.

The realization we had was that this kind of data streaming technology is about how a business can come together. It's actually more valuable in traditional enterprises because they have more complexity, different environments, more geographical locations and more old legacy systems.

In the past, the investment in the world of data infrastructure was largely centered around file systems and databases. So people had large piles of data they could store, and that made sense in a world where applications didn't talk to each other much. That's not true anymore, and data in motion with streaming data is the underlying paradigm shift of how to work with data as it moves in a way that makes sense.

What's next for Confluent?

Kreps: There's a large community that's emerging around data in motion, and we're making investments in the core of our cloud service to make it more elastic and faster, and to improve the total cost of ownership for customers -- all that kind of 'better, faster, cheaper' stuff.

There's really interesting work being done on the processing of streams and the connectors to plug into sources of data. There is also work on some of the layers that help you visualize this data, integrate it into what you're doing and really build on it.

What's emerging is really a full stack around data in motion that makes it much, much easier to harness this stuff. That, I think, is the biggest change in the last five years -- when Kafka went from something that was kind of low level and powerful, but hard to really put to use in your company, to something where now there is a full offering and ecosystem around it to plug it in and build software on it.

Editor's note: This Q&A has been edited for clarity and conciseness.
