Comparison of Efficiency and Resilience to High Load of Java and NodeJs REST API Architectures.

Jayamal Jayamaha
Sysco LABS Sri Lanka
Apr 24, 2023 · 11 min read


In this article, I am going to run an experiment. I will test and compare four REST API architectural styles and give insight into the efficiency and resilience to high load of each architecture. I am going to follow the scientific method.

The Observation and the Problem We Discovered

Backend REST APIs are among the main pieces of software that business organizations use to achieve their business goals. At the end of the day, almost every organization's goal is to increase revenue by delivering well-optimized value to customers with high customer satisfaction, which leads to retaining the customer base for a long period of time.

In that scenario, technology plays a major role in achieving those business goals. A REST API is a good technological tool to reach the customer base easily and to manage other business activities such as employee details, revenue and growth statistics, inventory management tasks, and so on.

But in order to use technology in the best and most profitable way, we need to build software that is efficient, scalable, and resilient. Otherwise, the ROI of that software becomes a waste for the organization, and sometimes it harms revenue as well.

In the context of REST APIs, there are many technologies and many architectural styles. As architects or developers, we need to choose the right technology with the right architectural pattern in order to give maximum value to the business. So in this experiment, I am going to test the efficiency and resilience to high load of four architectural patterns for REST APIs.

Background Research

Architecture patterns

There are many architectural patterns that we use to develop REST APIs, but in this article I will test only four patterns.

Three from Java

  • Standard Springboot REST API application
    The standard architecture that a REST API would normally be built with using Springboot.
  • Springboot REST API application using CompletableFutures
    CompletableFuture in Java is a way to achieve concurrency. In this architecture, I will test a Springboot REST API that processes more than one data object concurrently and asynchronously using CompletableFutures.
  • Reactive Spring WebFlux application
    In this architecture, I will test a REST API built in a reactive way using Spring WebFlux. It works as an event-driven model using Flux and Mono.

One from NodeJs

  • Standard NodeJs REST API application using Express
    A standard NodeJs REST API using the Express framework, which works in a non-blocking and async way (pretty much the same idea as reactive Java).

Load Testing

In order to run a load test, JMeter will be used in this experiment. JMeter is an open-source tool for load testing any API. In this experiment, I will use JMeter DSL, a version of JMeter that we can drive from code rather than run as a desktop application. The reason to use JMeter DSL rather than JMeter itself is that we can build the HTTP POST request body programmatically rather than defining a fixed body, so we can create the request body exactly the way we need it.

This is the shape of the request body that I create:

{
  "images": [
    {
      "name": "cute dog",
      "width": 120,
      "height": 120,
      "format": "jpg",
      "url": "https://images.examples.com/cute_dog_123",
      "size": 1320000,
      "device": "DSLR",
      "number_of_pixels": 14400,
      "created_date": "2017-07-21T17:32:28Z",
      "last_modified_date": "2018-07-21T17:32:28Z",
      "captured_by": "sandun"
    }
  ]
}

It is generated with the following Java code:

private final Random random = new Random();

public String getRequestBody(PreProcessorVars preProcessorVars) {
    // build a payload of 10 image objects with slightly randomized fields
    JSONArray imagesArray = new JSONArray(IntStream.range(0, 10)
            .mapToObj(value -> new JSONObject(getImageDetailsAsMap(value)))
            .toArray());
    Map<String, JSONArray> imageDetailsMap = new HashMap<>();
    imageDetailsMap.put("images", imagesArray);
    return new JSONObject(imageDetailsMap).toString();
}

public Map<String, String> getImageDetailsAsMap(int index) {
    long next = random.nextInt();
    Map<String, String> imageDetailsMap = new HashMap<>();
    imageDetailsMap.put("name", "cute dog - " + Math.abs(next));
    imageDetailsMap.put("width", "120");
    imageDetailsMap.put("height", "120");
    imageDetailsMap.put("format", index % 3 == 0 ? "jpg" : index % 3 == 1 ? "png" : "jpeg");
    imageDetailsMap.put("url", "https://images.examples.com/cute_dog_" + Math.abs(next));
    imageDetailsMap.put("size", "1320000");
    imageDetailsMap.put("device", index % 5 == 0 ? "DSLR" : index % 5 == 1 ? "PHONE" : index % 5 == 2 ? "DIGITAL_CAMERA" : index % 5 == 3 ? "GO_PRO" : "PIXEL");
    imageDetailsMap.put("number_of_pixels", "14400");
    imageDetailsMap.put("created_date", "2017-07-21T17:32:28Z");
    imageDetailsMap.put("last_modified_date", "2018-07-21T17:32:28Z");
    imageDetailsMap.put("captured_by", "sandun - " + Math.abs(next));
    return imageDetailsMap;
}
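
For context, a body builder like the one above can be wired into a JMeter DSL test plan roughly as follows. This is only a minimal sketch assuming the jmeter-java-dsl library; the class name, the endpoint URL, and the overload of post() that accepts a body-building function are my assumptions here, not the exact code from the test repository linked at the end of the article.

import static us.abstracta.jmeter.javadsl.JmeterDsl.*;

import org.apache.http.entity.ContentType;
import us.abstracta.jmeter.javadsl.core.TestPlanStats;

public class ImagesApiLoadTest {

    public static void main(String[] args) throws Exception {
        // hypothetical class that holds the getRequestBody() method shown above
        RequestBodyBuilder bodyBuilder = new RequestBodyBuilder();

        // 100 concurrent users, 10 iterations each, posting a programmatically generated body
        TestPlanStats stats = testPlan(
                threadGroup(100, 10,
                        httpSampler("http://localhost:8080/api/images") // endpoint URL is an assumption
                                .post(bodyBuilder::getRequestBody, ContentType.APPLICATION_JSON)
                )
        ).run();

        // summary accessors may differ slightly between library versions
        System.out.println("Errors: " + stats.overall().errorsCount());
        System.out.println("99th percentile response time: " + stats.overall().sampleTimePercentile99());
    }
}

The first argument of threadGroup is where the concurrent user counts used later in this article (10, 100, 1000, and 10000) are configured.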

Defining Boundaries

We cannot practically run an experiment that covers every logical scenario at once, so we define boundaries for the experiment. The same applies here: rather than covering the entire REST API paradigm, I will test based on the boundaries below.

  • I will consider a POST endpoint only, because it writes data to the database. In order to write data, some validation and exception handling has to be done, so the POST endpoint has real work to do in addition to the IO operations.
  • I will run this on my machine and will not use any cloud service, so the latency of an HTTP request will be lower than in a production deployment on a cloud service like AWS. But this experiment is about comparison, so get an idea of efficiency by comparing rather than focusing on absolute values.
  • The applications will be containerized and run using Docker, with Linux as the host operating system.
  • I will run each application as a single instance and will not use any load balancing. That means you can handle a higher load than this in a production environment with many instances behind a load balancer. But again, this is a comparison, so we compare everything on one instance; focus on the relative numbers rather than the absolute values.
  • I won't use any authorization or authentication on the endpoint. In real-world applications this is a must, and it adds to the overall HTTP request latency. Again, don't focus on absolute values, because I am doing a comparison here.
  • I will only use a database operation as the IO operation, but this experiment could be done with any IO operation and any number of them: 3rd party API calls, Kafka or any streaming operation, OAuth authentication with social media, and so on.
  • There are no computation-heavy processes in these four APIs. They are typical business application APIs (a minimal sketch of such an endpoint follows this list), like
    getting data -> validate -> save in DB
    getting data -> validate -> send to a Kafka stream
    getting data -> validate -> call a 3rd party API -> save in DB
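
To make the first of those flows concrete, here is a minimal sketch of what a standard, blocking Springboot POST endpoint of that shape could look like. The controller, request, and repository types are illustrative names only, not the actual code from the repositories linked at the end of the article.

import java.util.List;
import javax.validation.Valid; // jakarta.validation.Valid on Spring Boot 3
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/images")
public class ImageController {

    // illustrative Spring Data JPA repository for the Image entity
    private final ImageRepository imageRepository;

    public ImageController(ImageRepository imageRepository) {
        this.imageRepository = imageRepository;
    }

    @PostMapping
    public ResponseEntity<List<Image>> saveImages(@Valid @RequestBody ImageListRequest request) {
        // getting data -> validate (via @Valid) -> save in DB; the save blocks the request thread
        List<Image> saved = imageRepository.saveAll(request.getImages());
        return ResponseEntity.ok(saved);
    }
}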

The Hypothesis

While a normal Java application handles IO operations in a blocking way, NodeJs handles them in a non-blocking and async way.

(Diagram: Typical Springboot application)

But the Java async approach uses additional worker threads, on top of the Tomcat container threads, to achieve concurrent behavior. So the async approach should work more efficiently than a normal Springboot application.

(Diagram: Java CompletableFuture async application)
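
As a rough sketch of that async variant (again with illustrative names, dropped into a controller like the blocking one sketched earlier, with an injected ExecutorService called executor), each image in the payload is saved on its own task using CompletableFuture:

@PostMapping
public ResponseEntity<List<Image>> saveImages(@Valid @RequestBody ImageListRequest request) {
    // fan out: one CompletableFuture per image; each task still blocks a worker thread on the DB call
    List<CompletableFuture<Image>> futures = request.getImages().stream()
            .map(image -> CompletableFuture.supplyAsync(() -> imageRepository.save(image), executor))
            .collect(Collectors.toList());

    // fan in: wait for every task to finish and collect the saved entities
    List<Image> saved = futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());

    return ResponseEntity.ok(saved);
}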

But Java still blocks on IO operations, while NodeJs does not. So NodeJs should work more efficiently under high load.

(Diagram: NodeJs application)

But when we come to the reactive approach in Java, we can achieve non-blocking behavior as well. NodeJs uses an event loop with one main thread to handle all HTTP requests: even though NodeJs works in a non-blocking way, it runs a single main thread with a single event loop. Spring WebFlux, on the other hand, uses the Reactor Netty server. Reactor Netty also has an event-loop mechanism to process events, but it can run multiple event loops (up to the number of cores in the CPU). Each event loop can accept requests and handle them in parallel. So the Java reactive approach should work more efficiently than NodeJs under high load.

(Diagram: Spring WebFlux application)
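
And a minimal sketch of that reactive variant, assuming Spring WebFlux with a reactive repository (for example Spring Data R2DBC, so the database driver is non-blocking as well); the type names are illustrative:

@PostMapping
public Mono<ResponseEntity<List<Image>>> saveImages(@Valid @RequestBody ImageListRequest request) {
    // a Flux/Mono pipeline: no event-loop thread is blocked while the DB calls are in flight
    return Flux.fromIterable(request.getImages())
            .flatMap(imageRepository::save) // ReactiveCrudRepository.save returns Mono<Image>
            .collectList()
            .map(ResponseEntity::ok);
}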

Remember that in Java, physical threads are limited and thread creation is expensive (for example, the default Tomcat worker pool in a Springboot application is 200 threads). So if all the threads are allocated, new HTTP requests have to wait until a thread becomes available. On top of that, in the async approach the Tomcat request threads and the worker threads we create to process each data item from the payload compete for the same limited set of physical threads.

Remember that in NodeJs, before the DB call is made, the main thread is busy validating and processing the POST payload, so new HTTP requests have to wait until the main thread becomes available again. The same happens after the IO operation completes.

Finally, the hypothesis is that efficiency and resilience to high load should rank like this: Spring WebFlux (reactive) first, then NodeJs, then the Java async approach, and the regular Springboot approach last.

But in order to be sure we need to do the experiment.

Testing

Tech stack used in the experiment

  • Springboot
  • Spring WebFlux
  • Gradle
  • ExpressJs
  • PostgreSQL
  • JMeter DSL

I will test the APIs in the following order, with a 6-core CPU and 6GB of available memory.

  • Test each API with 10 concurrent users
  • Test each API with 100 concurrent users
  • Test each API with 1000 concurrent users
  • Test each API with 10000 concurrent users

Results

As you can observe from the results:

  • The Java reactive approach is the winner, with the capability of handling 10000 concurrent users without any failures and with the lowest average load time.
  • For lower concurrent user counts, the Java async approach is the most efficient and the Java regular approach is next. The reactive approach performs at its best with more concurrent users rather than with fewer.
  • NodeJs is the least efficient of the four approaches: its average load time is higher than that of the other approaches for all concurrent user counts, but it can handle all 10000 concurrent users without errors.
  • So according to these results, Java reactive is the most prominent approach, and we can say NodeJs is the next most reliable approach: even though NodeJs takes more time to process requests, it can handle all the concurrent users without any errors.

As we can see, the above data alone is insufficient to reach a final, more accurate conclusion. To make the experiment fairer, we need to consider more situations. Also, in a business organization the aim is not only to achieve high availability and speed, but to achieve them at a lower cost. So let's repeat the experiment with different resource allocations.

We can limit the number of CPU cores allocated to a container with the docker run command, so the container will only use the defined number of cores when we call the API.
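
For example, something like the following limits a container to one CPU core and six gigabytes of memory (the image name is just a placeholder):

docker run --cpus="1" --memory="6g" -p 8080:8080 reactive-app:latest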

I will test the APIs again in the following order, limiting the Docker containers to 1 CPU core and then to 4 CPU cores.

  • Test each API with 10 concurrent users
  • Test each API with 100 concurrent users
  • Test each API with 1000 concurrent users
  • Test each API with 10000 concurrent users

Results

  • As you can see in this diagram, we cannot notice much difference between the values for six, four, and one cores.
  • This happens because NodeJs uses only one event loop, so a single core is enough to run it. But to run its other worker threads, the full NodeJs app may need additional cores, which is why we see some error percentage with one core.
  • However, it is a waste of resources to use six cores, because there is no performance gain after four cores.
  • We can see an interesting observation here: the reactive application can handle 10000 concurrent users even with one core, without any failures and with the highest efficiency.
  • The reactive app utilizes the resources in the most efficient way, which leads to high availability and speed at a low cost.
  • Increasing the number of cores gives better performance, because the reactive app can create more than one event loop based on the available cores. So resources can be scaled to the business needs.
  • In the regular app, as you can see, the error percentage increases as we decrease the resources. It cannot even handle 1000 concurrent users without failures with fewer than four cores.
  • The average load time also increases as we decrease the resources.
  • In the async app, as you can see, the error percentage also increases as we decrease the resources, but it does better than the regular app.
  • Again, the average load time increases as we decrease the resources.

Final Conclusion

After observing all the above results and thinking logically, we can come up with the below conclusions.

  • Java reactive is a very prominent architecture to use, because it can handle more concurrent users with fewer resources. So it not only works under high load with high availability, it also reduces cost for the business.
  • If an organization has a team of Java developers and wants to serve its customers with a high-availability system, Java reactive architecture is a good solution.
  • If an organization has fewer customers to serve and has a team of Java developers, then the Java async or regular approach is the most suitable, because it would be costly to hire a development team experienced in a different technology like NodeJs.
  • Also, using the Java async approach with CompletableFutures requires developers experienced in Java, so in some situations the regular Java approach is the most cost-effective one. That decision should be taken by considering many factors: the experience of the development team, the number of customers the business has, the cost constraints on the software system and the cloud, and so on.
  • NodeJs is the next most prominent architecture after the Java reactive approach, because it can handle more concurrent users with fewer resources than the Java regular and async approaches.
  • If an organization has a development team with NodeJs experience and a large customer base, then NodeJs is a suitable architecture to use.
  • If an organization has a moderately sized customer base and developers with a JavaScript background, along with some cost constraints, then NodeJs is the best choice.

So in this experiment, according to our observations and conclusions, our hypothesis holds: under high load, the Java reactive approach performs best, followed by NodeJs, and then the Java async and regular approaches.

But keep in mind that any one of the above four architectures can be the most suitable one, depending on the business requirements and the technology infrastructure of the organization. So the most suitable architecture should be decided by considering various factors.

Code Repositories

Here I am attaching all the code repositories that I used for this experiment for your reference.

JMeter App I used for testing
https://github.com/jayamaljayamaha/experiment-jmeter-dsl-app

Java Reactive REST API I used
https://github.com/jayamaljayamaha/experiment-reactive-app

NodeJS REST API I used
https://github.com/jayamaljayamaha/experiment-express-app

Java Async and Regular REST APIs I used
https://github.com/jayamaljayamaha/experiment-regular-app
