Whether we are aware of it or not, we all belong to the instant generation. We want everything fast and right now. Our demands and expectations as users of web and mobile applications grow every day. To meet them, we constantly have to search for and develop new solutions and reach for new, more suitable concepts. Today's applications are far more complex - they process much more data, solve harder problems and provide a much better UX. At the same time, the number and variety of devices we use to consume these applications have grown, and this is reason enough to change and adapt some old technologies and practices to new trends.
The aforementioned changes in complexity, technologies and user demands have caused a kind of evolutionary shift in the very architecture of applications and the way they are designed and developed. Legacy applications, for example, were mostly monolithic, were not distributed and had scalability and availability issues, while today microservices, distributed systems and cloud-based applications that are scalable and have near-zero downtime prevail. Such a systematic approach solves some issues, but not all of them. Requests can still get blocked, responses can still be slow, and there is still room to improve the UX. This is where reactive architecture appears as one of the possible solutions. Let's get to know this concept.
The term reactive itself means reacting to a stimulus. In my opinion, this is the simplest explanation. Of course, there are more complex ones, but let's try to explain this with a simple example.
So, we have two examples in which the sum of two variables, b and c, is assigned to variable a. If we subsequently change variable b, in the first case the value of variable a will remain the same because the sign “=” is not a reactive operator.
int b = 1;
int c = 2;
int a = b + c; // evaluated once, at assignment time
b = 10;
System.out.println(a); // 3 (not 12 because "=" is not a reactive operator)
Let's imagine that $= represents a special reactive operator that also updates the value of variable a whenever the value of a referenced variable, for example b, changes.
int b = 1;
int c = 2;
int a $= b + c; // imaginary operator, not valid Java
b = 10;
System.out.println(a); // 12 ($= special reactive operator)
For me personally, a very vivid example is the well-known Excel spreadsheet with a formula that adds two numbers: each time the value of cell A1 or B1 changes, the value in cell C1 changes immediately as well. This is the essence of reactive behavior.
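Since Java has no $= operator, here is a minimal sketch of how the same behavior could be approximated with plain objects. The Cell class, its derived method and the demo class are my own hypothetical names, not part of any library, and the sketch ignores thread safety and cyclic dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

final class ReactiveSumDemo {
    public static void main(String[] args) {
        Cell b = new Cell(1);
        Cell c = new Cell(2);
        // Plays the role of "a $= b + c" from the imaginary example.
        Cell a = Cell.derived(() -> b.get() + c.get(), b, c);

        System.out.println(a.get()); // 3
        b.set(10);
        System.out.println(a.get()); // 12
    }
}

// Hypothetical helper: a mutable value that notifies listeners on every change.
final class Cell {
    private int value;
    private final List<Runnable> listeners = new ArrayList<>();

    Cell(int value) { this.value = value; }

    int get() { return value; }

    void set(int newValue) {
        value = newValue;
        listeners.forEach(Runnable::run); // push the change to dependents
    }

    void onChange(Runnable listener) { listeners.add(listener); }

    // Builds a cell whose value is recomputed whenever any source changes.
    static Cell derived(IntSupplier formula, Cell... sources) {
        Cell result = new Cell(formula.getAsInt());
        for (Cell source : sources) {
            source.onChange(() -> result.set(formula.getAsInt()));
        }
        return result;
    }
}
```

The important detail is the direction of the flow: a never asks b for its value, b pushes the change to everything that depends on it, just like the spreadsheet does.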
Of course, this is a very simplified example, because when we think about reactivity, we have to keep in mind the bigger picture and the fact that the key to reactivity lies in asynchrony. Let's briefly remind ourselves what synchronous and asynchronous execution is. Synchronous execution means that processes are executed sequentially, one after the other, so that a waiting process can start only when the previous one is complete. With asynchronous execution, it is not necessary to wait for the previous process to complete; other processes can start while it is still running. The processes can therefore overlap in time.
Now we are a little bit closer, but we have not fully explained the concept of reactive. With asynchronous execution, it is important to know that the result of a process will not be available immediately, but at some point in the future. With that in mind, each processed request will "push" a message, that is, a response saying that the request has been processed. Sending information about completion, as well as other information, is very important, and message management is crucial in reactive systems and architectures.
Considering that these kinds of issues have existed for a long time and that the reactive approach proved to be a possible solution, in 2014 a group of experts published the Reactive Manifesto, which contains the basic principles of reactive systems.
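To make the difference between the two execution styles more tangible, here is a small sketch using CompletableFuture from the standard library. The slowTask method and the 200 ms delays are assumptions that only simulate slow work, such as a network call.

```java
import java.util.concurrent.CompletableFuture;

final class SyncVsAsyncDemo {
    // Hypothetical slow task: sleeps to simulate I/O, then returns its label.
    static String slowTask(String label, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return label;
    }

    // Synchronous: the second task starts only after the first one completes.
    static String runSequentially() {
        String first = slowTask("A", 200);
        String second = slowTask("B", 200);
        return first + second;
    }

    // Asynchronous: both tasks are started before either result is awaited.
    static String runConcurrently() {
        CompletableFuture<String> first = CompletableFuture.supplyAsync(() -> slowTask("A", 200));
        CompletableFuture<String> second = CompletableFuture.supplyAsync(() -> slowTask("B", 200));
        // join() blocks only so that the demo can print the combined result.
        return first.thenCombine(second, (x, y) -> x + y).join();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String sequential = runSequentially();
        System.out.println(sequential + " sequential ms: " + (System.nanoTime() - start) / 1_000_000);

        start = System.nanoTime();
        String concurrent = runConcurrently();
        System.out.println(concurrent + " concurrent ms: " + (System.nanoTime() - start) / 1_000_000);
    }
}
```

Both variants produce the same result, but in the second one the waiting times overlap instead of adding up, so on most machines it should finish noticeably sooner.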
The fundamental idea is based on the following assumption:
User demands and business needs have changed dramatically in recent years. Just a few years ago, a large application had dozens of servers, response time was measured in seconds, we had hours of offline maintenance and only gigabytes of data. Today, applications are used everywhere, from mobile devices to cloud clusters running thousands of multi-core processors. Users expect millisecond response time and 100% uptime. Data is measured in petabytes. Today's demands simply cannot be met by yesterday's software architectures.
This requires a different, coherent approach to system architecture, and we believe that all the necessary aspects are already individually recognized: we want systems that are responsive, resilient, elastic and message-driven. We call them reactive systems.
In order for an application, system, or anything else to become reactive, it must be:
Responsive: The system should respond to requests in a timely manner. Responsive systems should have fast and consistent response times. You may have seen similar figures before: a large body of research by Google, Amazon and others concludes that three seconds are enough for a user to become impatient and ultimately leave a slow website or web application. About 60% of dissatisfied users actually leave the site. Of those 60%, as many as 80% never come back, and 50% of them will share their dissatisfaction with friends, acquaintances and followers on social media.
Resilient: The system remains responsive in the event of failure or outage. This doesn't apply only to highly available, critical systems - any system that isn't resilient won't respond after a failure. Resilience is achieved by replication, containment, isolation and delegation. Failures are contained within each component, and isolating the components from each other ensures that parts of the system can fail and recover without compromising the system as a whole. The recovery of each component is delegated to another (external) component, and when needed, high availability is ensured by replication.
Elastic: An elastic system can adapt to different workloads. For example, it can increase its resources during peak load and decrease them when the load is very light. An elastic system should stay responsive even under heavy load. Put simply, depending on the workload, we can scale the system vertically, by adjusting the performance of the instances, and horizontally, by adjusting the number of instances.
Message-driven: Reactive systems rely on asynchronous messaging to establish a boundary between components that provides loose coupling, isolation and location transparency. This boundary also lets us delegate failures and outages as messages. Imagine that your system uses an unstable database that happens to be unavailable. With traditional blocking calls, every thread of your application that calls that database is blocked for a long time. It holds on to resources and occupies a slot in the thread pool, which eventually leaves the rest of your application without free threads. Asynchronous, message-driven communication avoids this problem.
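The unstable database scenario can be sketched in a few lines. Note that this is not a message broker, only an asynchronous call built on CompletableFuture, and that queryUnstableDatabase, loadGreeting and the fallback text are hypothetical names I made up for the illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class NonBlockingLookupDemo {
    // Hypothetical stand-in for an unavailable database: this future never completes on its own.
    static CompletableFuture<String> queryUnstableDatabase() {
        return new CompletableFuture<>();
    }

    // The caller gets a future back immediately. The timeout turns silence into a failure,
    // and the failure is translated into a degraded but usable reply.
    static CompletableFuture<String> loadGreeting(CompletableFuture<String> query, long timeoutMillis) {
        return query
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .exceptionally(failure -> "default greeting");
    }

    public static void main(String[] args) {
        CompletableFuture<String> reply = loadGreeting(queryUnstableDatabase(), 100);
        System.out.println("caller thread is free while the query is pending");
        reply.thenAccept(System.out::println).join(); // join() only keeps the demo alive
    }
}
```

No application thread sits blocked while the database is silent, and after the timeout the failure arrives as just another result that the caller can react to.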
It is important to note that these four principles are just that - principles. No system is immune to failure or performance degradation. However, if our goal is to build a reactive system, then our system must comply with these principles as much as possible.
(RESILIENCE + ELASTICITY) x MESSAGE-DRIVEN = RESPONSIVE
This would be the formula for achieving responsiveness based on the principles of the Reactive Manifesto. It is important to highlight that the combination of resilience and elasticity, supported by the message-driven principle, is what provides responsiveness. Now that we have familiarized ourselves with the basic traits of reactive systems, I suggest we see which principles help us in building a reactive architecture.
Stay responsive - Mark Zuckerberg got a pretty good sense of how important this principle is when Facebook, Instagram, WhatsApp and other services experienced a global outage in October 2021. During those six hours of unavailability, it is estimated that Facebook lost about $60 million in advertising revenue, its shares dropped 5% in one day, and Mark Zuckerberg himself lost $5.6 billion. The essence of responsiveness is not only low latency, but also managing change in data, the environment and design patterns. Reactive, responsive applications efficiently detect and solve problems. They are focused on a consistently fast response, and in the worst-case scenario, they respond with an error message or provide a service of lower quality that is still sufficiently usable.
Keep uncertainty in mind - I believe you've already heard or said the famous developer phrase: "But it works on my machine". Once an application leaves the safety of the local environment, it enters a space of uncertainty, especially when it comes to distributed systems. Although, in theory, those are coordinated systems where everything is defined and strives for consistency, the various compromises made for the sake of responsiveness make them less deterministic. The key to managing uncertainty lies in the application architecture itself. To design such applications, we must have protocols and functionalities that clearly define what they provide, which events and controls are acceptable and what the output will be. We also need to clearly define what kind of data models will be used. All of this should be transparent to other components in the system (where necessary, of course) through defined communication protocols, so that everyone can know the state of individual parts of the system at any moment.
Accept failure - failures, outages and errors in general are expected states in reactive applications. This means that these states must be explicitly represented and managed at some level. For example, we always respond to an HTTP request with a status, regardless of the final state of the API we contacted. This kind of management gives us the space to provide at least some kind of response instead of letting everything fall apart. Likewise, we must be aware that some outages may pass under the radar, and even in that case, we must ensure that they do not affect the operation of the system.
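A minimal sketch of what "explicitly represented" could mean in code: the failure becomes an ordinary value with a status, in the spirit of the HTTP example. The Reply class, the handle method and the chosen status codes are assumptions made for this illustration only.

```java
import java.util.function.Supplier;

final class ExplicitFailureDemo {
    // Hypothetical reply type: every call yields a status, whether it succeeded or not.
    static final class Reply {
        final int status;
        final String body;

        Reply(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Wraps a call that may throw, so the failure becomes a value instead of an escaping exception.
    static Reply handle(Supplier<String> backend) {
        try {
            return new Reply(200, backend.get());
        } catch (RuntimeException e) {
            return new Reply(503, "service temporarily unavailable");
        }
    }

    public static void main(String[] args) {
        Reply ok = handle(() -> "42 items in stock");
        Reply failed = handle(() -> { throw new IllegalStateException("backend down"); });

        System.out.println(ok.status + " " + ok.body);         // 200 42 items in stock
        System.out.println(failed.status + " " + failed.body); // 503 service temporarily unavailable
    }
}
```

The caller always receives an answer it can act on, which leaves room for a retry, a cached value or a friendly error message.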
Strive for autonomy - components, that is, parts of a system, can be responsive only to the extent that they are autonomous in relation to the rest of the system. This autonomy is achieved by clearly defining the boundaries: who owns the data and how its availability is ensured. Another aspect of autonomy is that these boundaries are crossed only through documented protocols. These protocols must be asynchronous and event-based. Useful design patterns that encourage autonomy are domain-driven design, event sourcing and CQRS.
Build consistency - Consistency means guaranteeing the correctness and integrity of your application and user data. Insisting on more consistency than is really necessary will not add any value; it can only reduce the availability, efficiency and performance of your application. It doesn't matter that everything is correct if the system is not reliably available. When possible, we should design consistent systems using asynchronous processing, systems that tolerate delays and temporary unavailability of their parts (e.g., using an event-driven architecture, using a NoSQL database, etc.).
Manage time - very often, both in real life and in IT systems, we need to communicate to coordinate activities. When performing actions synchronously, some component sends a request and tries to get a response from, for example, a component that is unavailable. During this time, the component that is calling is also unavailable since it is waiting for a response. This can be avoided by the so-called temporal decoupling. This approach helps to break time dependencies between components.
In other words, we give the calling component the option to do something else asynchronously instead of being blocked by waiting for another component to be available.
The simplest example is message queuing: component A sends a request to the queue and continues to do something else, and the receiving component B picks up the request independently of component A.
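The queue example can be sketched with an in-memory BlockingQueue standing in for a real message broker. The class name, the STOP marker and the request names are hypothetical, and in a real system the two components would usually live in different processes.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

final class TemporalDecouplingDemo {
    static final String STOP = "STOP"; // hypothetical marker that ends the demo

    // Component B: drains the queue at its own pace until it sees the stop marker.
    static Thread startConsumer(BlockingQueue<String> queue, List<String> handled) {
        Thread consumer = new Thread(() -> {
            try {
                for (String msg = queue.take(); !msg.equals(STOP); msg = queue.take()) {
                    handled.add("handled " + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        return consumer;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> handled = new CopyOnWriteArrayList<>();

        // Component A: enqueues requests and moves on, even though B has not started yet.
        queue.put("request-1");
        queue.put("request-2");
        System.out.println("A continues with other work");

        Thread consumer = startConsumer(queue, handled);
        queue.put(STOP);
        consumer.join();
        System.out.println(handled); // [handled request-1, handled request-2]
    }
}
```

Component A finishes sending before component B even exists, which is exactly the time dependency we wanted to break.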
Manage space - We can create a resilient system only by allowing it to live in multiple locations, so that it stays functional even when other components, e.g., hardware, are not functioning or are unavailable. Once distributed, the now-autonomous components collaborate, depending on the situation, making the most of their location independence. This so-called space decoupling enables replication, which ultimately increases system resilience and availability. By running multiple instances of a single component, those instances can share the workload. Thanks to location transparency, the rest of the system does not need to know where these instances are located, and system capacity can be increased transparently, on demand. If one instance crashes or is undeployed, the other replicas continue to run and share the load. This failover capability is necessary to avoid service interruption.
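Here is a rough sketch of replicas sharing the load behind a router, with failover when one of them crashes. Replica, Router and the round-robin strategy are my own assumptions for the illustration; real systems typically leave this to a load balancer or a service mesh.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class ReplicaRouterDemo {
    // Hypothetical replica: a named instance that can be marked as down.
    static final class Replica {
        final String name;
        volatile boolean up = true;

        Replica(String name) { this.name = name; }

        String handle(String request) { return name + " served " + request; }
    }

    // Callers talk to the router only; they never learn which replica answered or where it runs.
    static final class Router {
        private final List<Replica> replicas;
        private final AtomicInteger next = new AtomicInteger();

        Router(List<Replica> replicas) { this.replicas = replicas; }

        String route(String request) {
            // Try each replica at most once, starting from the round-robin position.
            for (int attempt = 0; attempt < replicas.size(); attempt++) {
                Replica candidate = replicas.get(Math.floorMod(next.getAndIncrement(), replicas.size()));
                if (candidate.up) {
                    return candidate.handle(request);
                }
            }
            throw new IllegalStateException("no replica available");
        }
    }

    public static void main(String[] args) {
        Replica r1 = new Replica("replica-1");
        Replica r2 = new Replica("replica-2");
        Router router = new Router(List.of(r1, r2));

        System.out.println(router.route("req-A")); // replica-1 served req-A
        System.out.println(router.route("req-B")); // replica-2 served req-B
        r1.up = false;                             // simulate a crash
        System.out.println(router.route("req-C")); // replica-2 served req-C
    }
}
```

From the caller's point of view nothing changes when replica-1 goes down, apart from the fact that the remaining replica now carries the whole load.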
Manage dynamics - Applications must be able to respond to workloads that can vary drastically and constantly adapt to the situation, i.e., ensure that the supply always meets the demand, avoiding resource over-allocation. This means being flexible and reacting to changes by increasing or decreasing the resources that have been allocated. After all, distributed systems go through different kinds of dynamics. In the cloud, the topology is continuously evolving, emphasizing the need for space decoupling. The availability of services is also subject to evolution: services can be available or unavailable at any time, and this is a type of dynamism that emphasizes the need for temporal decoupling.
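The idea that supply should follow demand can be reduced to a tiny calculation. This is only a sketch of a scaling decision with made-up numbers; the desiredInstances method, the capacity per instance and the limits are all assumptions, and real autoscalers also smooth the load over time.

```java
final class AutoscalerDemo {
    // Returns how many instances are needed so that supply meets demand,
    // clamped to [minInstances, maxInstances] to avoid over-allocation.
    static int desiredInstances(int requestsPerSecond, int capacityPerInstance,
                                int minInstances, int maxInstances) {
        // Ceiling division: a partially used instance still counts as a whole one.
        int needed = (requestsPerSecond + capacityPerInstance - 1) / capacityPerInstance;
        return Math.max(minInstances, Math.min(maxInstances, needed));
    }

    public static void main(String[] args) {
        System.out.println(desiredInstances(50, 100, 1, 10));   // 1
        System.out.println(desiredInstances(450, 100, 1, 10));  // 5
        System.out.println(desiredInstances(5000, 100, 1, 10)); // 10
    }
}
```

The lower bound keeps the service available when the load is light, while the upper bound protects the budget when the load explodes.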
Pros and Cons of Reactive Architecture
Pros:
Cons:
Conclusion
From the above, we have seen that reactive architecture has many pros. However, building reactive systems is not simple. Not only are the languages and technologies different from the currently most popular ones, but some components, such as JDBC, do not support non-blocking calls.
Fortunately, some open-source projects, such as R2DBC, are promoting innovation in these areas. To facilitate the building of a complete reactive system, some organizations and individuals have adapted the main technical components, such as Reactor-Core, Reactor-Netty, Reactor-RabbitMQ and Reactor-Kafka.
And finally, when planning reactive architecture, we have to think in layers, from the base layer and the use of reactive programming principles, to the higher layers of the architecture that support the principles of the reactive manifesto. When we have a system designed in this way, through all layers of the architecture, we can say that we have built a reactive architecture.