This short article gives you an overview of Apache Kafka. After reading it, you’ll have a notion of what Kafka is, why it was created and how you can integrate it into a microservice architecture.
In the very beginning, LinkedIn was a monolithic application. As the complexity and the number of users grew, it became clear that this architecture was not ideal, so LinkedIn’s engineering team started migrating it to microservices. However, as fate would have it, as soon as you solve one problem, a new one arises. During the migration, the team ran into issues with tracking, metrics and messaging, and their analytics and search services had trouble working in real time. To overcome these obstacles, they started building custom data pipelines for these services. Rather than maintain each pipeline individually, they decided to develop a single, distributed pub-sub system, and thus Kafka was born.
Over time, stream processing was added on top of that, and it is what Kafka is best known for today.
Kafka is a platform
I recently subscribed to a culinary channel on YouTube, and now I get notified every time a new video is posted. In Kafka’s world, I’m a consumer, meaning I consume information. The cook who published the video on YouTube is a producer, meaning he produces information.
The cook publishes his videos on YouTube, and a Kafka producer publishes its content to Kafka. Simply put, Kafka is a platform that makes sure the content published by the producer reaches the consumer. Here’s a LinkedIn example: the user tracking service (producer) publishes the information that the user John Doe liked a certain post, and Kafka makes sure that all the interested services (consumers) are informed about it.
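To make the producer and consumer roles concrete, here is a minimal in-memory sketch of the pub-sub idea. It is not the Kafka API: `TinyBroker`, the `post-likes` topic and the two inbox lists are hypothetical names made up for illustration. Real Kafka also stores events durably and lets consumers read them at their own pace, whereas this sketch simply hands each event to every subscriber right away.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class TinyBroker:
    """In-memory stand-in for a pub-sub platform (hypothetical, not the Kafka API)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A consumer registers its interest in a topic.
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, event: Event) -> int:
        # A producer publishes once; every interested consumer gets the event.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


broker = TinyBroker()
feed_inbox: List[Event] = []       # hypothetical consumer 1
analytics_inbox: List[Event] = []  # hypothetical consumer 2
broker.subscribe("post-likes", feed_inbox.append)
broker.subscribe("post-likes", analytics_inbox.append)

# The user tracking service acts as the producer.
like = {"user": "John Doe", "action": "liked", "post": "post-42"}
delivered = broker.publish("post-likes", like)
print(delivered, feed_inbox, analytics_inbox)
```

Note that the producer never references the consumers directly. It only knows the topic name, which is where the decoupling comes from.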
The information about something happening is called an event. I won’t be going into the technical details of what constitutes an event, but it’s important to note that it should contain information on who, when, where and, most importantly, why the event was created (of course, this varies depending on what you need).
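As a rough illustration of such an event, the sketch below models the who, when, where and why as a small data class and serializes it to JSON bytes. The field names, the `to_message` helper and the sample values are assumptions chosen for this example; Kafka itself does not prescribe an event schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    who: str    # the actor that caused the event
    when: str   # ISO 8601 timestamp of the occurrence
    where: str  # the service that emitted the event
    why: str    # the reason the event was created


def to_message(event: Event) -> bytes:
    # Hypothetical helper: encode the event as UTF-8 JSON, ready to be published.
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


liked = Event(
    who="John Doe",
    when=datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc).isoformat(),
    where="user-tracking-service",
    why="user liked a post",
)
message = to_message(liked)
print(message)
```

Making the class frozen reflects the idea that an event describes something that already happened and should not be changed afterwards.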
Kafka is powerful because we can combine events in order to create new ones.
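One way to picture combining events: several "like" events for the same post can be turned into a new "post became popular" event. The sketch below does this in plain Python over a list. The function name, the event fields and the threshold are all hypothetical, and in a real system this kind of logic would typically live in a stream processing layer rather than in a loop over a list.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def derive_popular_posts(
    likes: Iterable[Dict[str, str]], threshold: int
) -> Iterator[Dict[str, object]]:
    """Emit one derived event the moment a post reaches `threshold` likes."""
    counts: Counter = Counter()
    for like in likes:
        post = like["post"]
        counts[post] += 1
        if counts[post] == threshold:
            yield {"type": "post-became-popular", "post": post, "likes": threshold}


likes = [
    {"user": "John Doe", "post": "post-42"},
    {"user": "Jane Roe", "post": "post-7"},
    {"user": "Jane Roe", "post": "post-42"},
    {"user": "Max Mustermann", "post": "post-42"},
]
derived = list(derive_popular_posts(likes, threshold=3))
print(derived)
```

The derived event can itself be published, so other services can react to "popular post" without ever counting likes themselves.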
In most microservice architectures, Kafka is integrated in one of two possible ways: as an Orchestrator or as a Broker.
Orchestrator
Pros: coordination and flow control all in one place; better error handling.
Cons: high coupling.
The Broker approach is used more frequently because it has less coupling and better performance than the Orchestrator approach.
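The difference in coupling can be sketched in a few lines. This is only an illustration under assumed names: the order flow, `reserve_stock`, `charge_payment`, `send_receipt` and the `Broker` class are hypothetical and are not taken from Kafka or from any particular system.

```python
from typing import Callable, Dict, List

Handler = Callable[[str], str]


def reserve_stock(order_id: str) -> str:
    return f"stock reserved for {order_id}"


def charge_payment(order_id: str) -> str:
    return f"payment charged for {order_id}"


def orchestrate_order(order_id: str) -> List[str]:
    # Orchestrator style: one coordinator knows every step and calls each
    # service itself. Flow control and error handling sit in one place,
    # but the coordinator is coupled to all the services it calls.
    return [reserve_stock(order_id), charge_payment(order_id)]


class Broker:
    # Broker style: the publisher only announces what happened;
    # services decide on their own whether to react.
    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, order_id: str) -> List[str]:
        return [handler(order_id) for handler in self._handlers.get(topic, [])]


broker = Broker()
broker.subscribe("order-placed", reserve_stock)
broker.subscribe("order-placed", charge_payment)

print(orchestrate_order("order-1"))
print(broker.publish("order-placed", "order-1"))
```

Adding a third reaction in the broker style is just another `subscribe` call, while in the orchestrator style the coordinator function itself has to change.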
You now have a good overview of Kafka to build on. I’ve left out the technical details in order to focus on the questions every beginner has: What is Kafka? Why was it developed? How do you integrate it?
Kafka is used by an increasingly large number of organizations because it simplifies working with data in real-time. We can use it to achieve a high level of decoupling between services in the ever-growing microservice architecture.