Apache Kafka is dead, just 100 years after Franz Kafka! Kafka's books lived on. Long Live the Kafka API!
The existing Apache Kafka had to die. It’s a good thing, for several reasons.
1. Kafka isn’t cloud-native.
Managing brokers and queues has been a headache in most messaging systems, and Kafka, with its brokers, topics, and partitions, is one of the best examples. Part of the problem is that storage and compute are tightly coupled. Kora is Confluent’s cloud-native replacement, but Confluent is not alone: Redpanda and WarpStream are others.
2. The innovation is happening around Kafka, not within it.
There’s a great ecosystem around the Kafka API, and it’s driving new innovation. The next generations of real-time data integration, streaming analytics, and all kinds of stream-based apps are evolving there.
Beyond working on modernizing Kafka, the Apache Kafka project needs to invest in supporting all this innovation.
3. Kafka doesn’t understand data, or data integration.
Kafka has always focused on messaging and stream processing, not data management or integration. New technologies have always gotten built on top of messaging to integrate processes and data, or process events. This has happened with TIBCO Rendezvous, JMS, Kafka, even (async) APIs.
In the case of real-time data integration and data sharing, something needs to manage data across sources and destinations. You need to manage and integrate data schemas from sources to destinations, with transformations and workflows. That tooling needs to be visual, flexible, and fast to change.
That’s not Kafka. Kafka has topics, and its ecosystem has a schema registry. The rest you code.
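To make the hand-coding point concrete, here is a minimal sketch of what mapping one source schema to one destination schema looks like when you write it yourself. The field names and the sample record are hypothetical, made up for illustration, and no Kafka client is involved; the point is only that every rename, type change, and unit conversion is code someone has to maintain.

```python
from datetime import datetime, timezone

# Hypothetical source record, shaped the way it might arrive on a topic.
SOURCE_EVENT = {"cust_id": "42", "amt_cents": 1999, "ts": 1700000000}


def to_destination(event: dict) -> dict:
    """Hand-written mapping from the source shape to a hypothetical destination shape."""
    return {
        # Rename the field and change its type from string to integer.
        "customer_id": int(event["cust_id"]),
        # Convert cents to a decimal currency amount.
        "amount": event["amt_cents"] / 100,
        # Convert epoch seconds to an ISO 8601 timestamp in UTC.
        "occurred_at": datetime.fromtimestamp(
            event["ts"], tz=timezone.utc
        ).isoformat(),
    }


mapped = to_destination(SOURCE_EVENT)
```

Multiply this by every source, every destination, and every schema change, and it becomes clear why this work tends to move into higher-level tooling.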
But the Kafka API is letting the next generation of tools enter the market faster, because anything that supports the Kafka API gives you access to so many tools, and so much data.
So what should happen to Kafka and around it over the next few years?
1. Kafka will become cloud-native.
Either Confluent contributes a cloud-native version, or one or more Kafka API-compatible technologies take off and create multiple market segments, or both. It’s probably both.
2. New messaging protocols will emerge.
Kafka is for data streaming. We need more. There’s NATS and a host of other messaging technologies for IoT. There’s Gazette for exactly-once delivery with built-in stream-store-replay that supports data integration, analytics, and data sharing better. We still need more for transactional messaging. And something will always exist outside the firewall.
Most of these technologies use the Kafka API in some form.
3. Data schemas and workflows will hide messaging namespaces.
I say this with the utmost respect: messaging namespaces have no business value. If managed directly, they will slow down change. They must be driven by higher-level tooling. This is how Estuary Flow works. You can connect to Flow and receive data using the Kafka API.
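As a rough illustration of why Kafka API compatibility matters, the sketch below builds consumer settings for two endpoints and shows that only the address differs. The hostnames and the group name are hypothetical placeholders, not real services. The property names (`bootstrap.servers`, `group.id`, `auto.offset.reset`) are standard Kafka client settings. A real service would typically also need security settings, which are left out here.

```python
def consumer_config(bootstrap_servers: str, group_id: str) -> dict:
    """Build consumer settings using standard Kafka client property names."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # Start from the oldest available record when no offset is committed.
        "auto.offset.reset": "earliest",
    }


# Hypothetical endpoints: one self-managed cluster, one Kafka API-compatible service.
ENDPOINTS = {
    "self_hosted": "kafka.internal.example.com:9092",
    "compatible_service": "kafka-api.vendor.example.com:9092",
}

configs = {
    name: consumer_config(address, "analytics-readers")
    for name, address in ENDPOINTS.items()
}


def differing_keys(a: dict, b: dict) -> list:
    """Return the setting names whose values differ between two configs."""
    return sorted(key for key in a if a[key] != b[key])


changed = differing_keys(configs["self_hosted"], configs["compatible_service"])
```

With a client library such as confluent-kafka, a dictionary like this is what you would hand to the consumer. The application code that reads and processes records would not need to change when the endpoint does, assuming the service behind it really is API-compatible.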
Is Kafka dead? Going cloud-native? What do you think will happen to Kafka?