Java threads in web application

How is multi-threading different in a Java based Web Application vs Stand-alone Java Application

I am fairly new to Java and my experience is limited to Web Based Applications running on a Web Container (JBoss in my case). Am I correct in saying that for Web Applications the web container takes care of multi-threading? If so, can I introduce new threads in a Web Based Application? Is there any advantage in doing so and in what scenario one would need to do that?

3 Answers

Am I correct in saying that for Web Applications the web container takes care of multi-threading?

Most webservers (Java and otherwise, including JBoss) follow a «one thread per request» model, i.e. each HTTP request is fully processed by exactly one thread. This thread will often spend most of the time waiting for things like DB requests. The web container will create new threads as necessary.

Some servers (in the Java ecosystem primarily Netty) do asynchronous request handling, either with a «one thread does everything» model, or something more complex. The basic idea there is that having lots of waiting threads wastes resources, so working asynchronously can be more efficient.

If so, can I introduce new threads in a Web Based Application?

It’s possible, but should be done very carefully, since errors (like memory leaks or missing synchronization) can cause bugs that are very hard to reproduce, or bring down the whole server.

Is there any advantage in doing so and in what scenario one would need to do that?

Well, the advantage is that you can do stuff in parallel. Using threads to improve pure computational speed is something you should not do on a webserver, as it would slow down the handling of other requests. That kind of thing should be done on a separate server, probably using some sort of job queue.

A legitimate scenario for multithreading in the context of handling an HTTP request might be if you need to access other network resources, e.g. call several different web services. If you do it in a single thread, you have to wait for each call to finish in turn. But if you use multiple threads, the total waiting time is only the delay of the single slowest call.
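To make that last point concrete, here is a minimal sketch using plain `java.util.concurrent`. The `callService` method and the service names are hypothetical stand-ins that only sleep; real code would make HTTP calls, and inside a container you would normally use an executor the container manages rather than one you create yourself.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class ParallelCalls {

    // Hypothetical stand-in for a remote web service call: it only sleeps and returns a label.
    static String callService(String name, long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while calling " + name, e);
        }
        return name + ":ok";
    }

    // Starts every call on the given pool first, then waits for all results in order.
    static List<String> callAll(ExecutorService pool, List<String> services, long delayMillis) {
        List<CompletableFuture<String>> futures = services.stream()
                .map(s -> CompletableFuture.supplyAsync(() -> callService(s, delayMillis), pool))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            long start = System.nanoTime();
            List<String> results = callAll(pool, List.of("users", "orders", "prices"), 200);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            // With three pool threads the elapsed time should be close to one delay, not three.
            System.out.println(results + " in about " + elapsedMillis + " ms");
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures into a list before joining matters: it guarantees all calls have been started before the request thread begins to wait.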



Threads, concurrency, and synchronization are not easy concepts to understand. When an application involves concurrency, it’s hard to avoid making mistakes. Java provides mechanisms for parallel programming, but sometimes there are just too many options, and often some essential ones are missing. For web applications, Jakarta EE provides a simplified programming model for parallel tasks. To use it effectively and avoid mistakes, you need to understand the basic concepts, which I’d like to explain here.

Java provides many mechanisms for working with threads and concurrent tasks. As Java evolved, new mechanisms were added while the old ones stayed, and it’s often not clear which of them are recommended for new applications. Jakarta EE builds on these features and makes them easier to understand and use. The standard Jakarta EE API intentionally specifies only interfaces and essential conceptual behavior; much of the complexity is abstracted away and handled by Jakarta EE runtimes to keep things simple. As a result, developers get a concise set of features that is easy to learn and understand, yet sufficient to build applications.

What is a thread pool

The basic concept of threading and parallelism in Jakarta EE runtimes is a thread pool. This is connected with the request processing model: most tasks originate as a request from an external caller, are processed sequentially in a single thread, and produce a response that is usually sent back to the caller, persisted into a database, or sent as a message to another system. Many separate tasks can run in parallel, each using its own thread. So there’s usually a simple mapping: one request needs one thread. After a task is finished, its thread doesn’t have to be destroyed. It can be reused to run another task, which avoids creating and destroying threads too often.

Incoming tasks are not tied to any specific thread. A thread pool makes sure that every task eventually gets a thread. This is called “scheduling” and is often referred to as “thread scheduling”. Tasks are scheduled and processed in one of these ways:

  • by an existing thread that has finished its previous task
  • by a new thread, if no free thread is available
  • by being queued until a thread becomes available, if all threads are busy

Conversely, threads aren’t tied to tasks either. They just take a task from a queue and process it. When threads are done with their tasks, they start processing the next task from the queue, or wait for a task if the queue is empty. A group of such threads, together with the logic for managing and scheduling them, is called a thread pool.
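Thread reuse is easy to observe with a plain `java.util.concurrent` pool. The sketch below is only an illustration, not how a Jakarta EE runtime implements its pools; the class and method names are made up for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class PoolReuseDemo {

    // Runs taskCount small tasks on a pool of poolSize threads and
    // returns the names of the threads that actually executed them.
    static Set<String> runTasks(int poolSize, int taskCount) {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            // The task is handed to the pool; whichever thread is free picks it up.
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // accept no new tasks, but finish the queued ones
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return threadNames;
    }

    public static void main(String[] args) {
        // Ten tasks are executed, but never by more than two distinct threads.
        System.out.println("threads used: " + runTasks(2, 10));
    }
}
```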

Why thread pools?

  • Threads allocate memory for their stacks and keep it until they are disposed
  • The CPU can run only a small number of threads at once; having too many threads can even lead to performance degradation
  • Each task requires some heap memory; it makes sense to limit the maximum number of parallel tasks to avoid overwhelming the system

The memory argument is pretty clear, and it’s usually evident when it becomes an issue. There’s always a limited amount of memory on the system. When threads consume a big portion of that memory, it’s easy to reach the limit. The default stack size for each thread is typically 1MB on common 64-bit JVMs, which means that each thread can need up to 1MB of system memory. If there are 1000 threads in a JVM, they can need about 1GB of memory on top of the Java heap size just to exist. You can probably imagine the consequences if there are even more threads.
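The arithmetic can be written down as a tiny sketch. It assumes the 1MB stack size mentioned above and treats the result as an upper bound, since the real figure depends on the JVM, the platform, and the `-Xss` option.

```java
class StackMemoryEstimate {

    // Upper bound on the memory reserved for thread stacks, in bytes.
    static long stackBytes(int threadCount, long stackSizeBytes) {
        return threadCount * stackSizeBytes;
    }

    public static void main(String[] args) {
        long oneMiB = 1024L * 1024L;             // assumed stack size per thread
        long total = stackBytes(1000, oneMiB);   // 1000 threads
        System.out.println(total / oneMiB + " MiB reserved for stacks");
    }
}
```

Halving the stack size (for example with `-Xss512k`) halves this estimate, at the price of a lower limit on call depth.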

The CPU argument isn’t so straightforward, but it is just as valid. Only a limited number of threads can run on a CPU at the same time (usually 8 on an 8-core CPU, or 16 if each core supports two hardware threads). With more threads, it’s more likely that a running thread will be suspended so that another thread can be scheduled. Switching between threads is a relatively time-consuming operation and doesn’t contribute to the computation. Therefore, having an excessive number of threads can actually decrease performance.
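For CPU-bound work, a common rule of thumb (an assumption here, not something the Jakarta EE specification prescribes) is to size a pool by the number of processors the JVM reports. A minimal sketch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CpuBoundPool {

    // Heuristic: one thread per processor visible to the JVM, and never fewer than one.
    static int cpuBoundPoolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        int size = cpuBoundPoolSize();
        ExecutorService pool = Executors.newFixedThreadPool(size);
        System.out.println("pool sized for " + size + " processors");
        pool.shutdown();
    }
}
```

Pools that mostly wait on I/O are often sized larger than this, because waiting threads do not occupy a core.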

The last argument is that executing too many tasks in parallel costs too much memory, as each task needs to store something on the heap even while it’s waiting for an I/O operation or for the CPU. As described above, more parallel tasks don’t always lead to increased performance, but they definitely lead to increased memory usage. Therefore it’s better to limit how many tasks can run in parallel and queue the other tasks to be executed later.

For these reasons, it’s good to have a reasonable number of threads ready to handle new tasks immediately. It’s also necessary to limit the number of threads to a reasonable maximum, and threads should be disposed of after some time if they are really not needed. This keeps the system from thrashing and becoming unusable under high load. If there are too many requests to handle, some of them simply need to wait while others are processed efficiently. This keeps the system usable for at least some requests and users.
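These three knobs (threads kept ready, a hard upper limit, disposal of idle threads) map directly onto the constructor of `java.util.concurrent.ThreadPoolExecutor`. The sketch below uses illustrative numbers and a rejecting policy for overflow; Jakarta EE runtimes configure their own pools through server settings, so treat this only as a model of the idea.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPool {

    static ThreadPoolExecutor newBoundedPool(int coreThreads, int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                coreThreads,                              // threads kept around for new tasks
                maxThreads,                               // hard upper limit on threads
                30, TimeUnit.SECONDS,                     // extra threads are disposed after 30 s idle
                new ArrayBlockingQueue<>(queueCapacity),  // bounded queue: waiting tasks are limited too
                new ThreadPoolExecutor.AbortPolicy());    // beyond that, new tasks are rejected
    }

    // Submits taskCount tasks that all block until released and counts the rejections.
    static int countRejected(ThreadPoolExecutor pool, int taskCount) {
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < taskCount; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();                  // simulates a slow request
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        release.countDown();                              // let the accepted tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 4 threads + 2 queue slots can hold 6 blocked tasks, so the 7th is rejected.
        System.out.println("rejected: " + countRejected(newBoundedPool(2, 4, 2), 7));
    }
}
```

Rejecting is only one possible overflow policy; a larger queue or `ThreadPoolExecutor.CallerRunsPolicy` trades rejections for longer waits.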

Sometimes a single request needs to be processed by multiple threads in parallel. This doesn’t fit the simplified thread-per-request model, but it’s also supported by Jakarta EE runtimes. Applications can use a specialized concurrency API, which allows splitting a task and executing each part in a separate thread. This again works on top of thread pools to retain all the advantages mentioned above. But more on that later in a separate post.
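As a rough preview of the idea, here is a sketch with a plain `ExecutorService`. In a Jakarta EE runtime the pool would typically be a container-provided `ManagedExecutorService`, which extends `ExecutorService`; the summation is only a placeholder for independent parts of a request, and all names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SplitTask {

    // Sums the numbers 1..n by splitting the range into roughly `parts` chunks,
    // each of which is summed by a pool thread.
    static long parallelSum(ExecutorService pool, int n, int parts) {
        List<Callable<Long>> chunks = new ArrayList<>();
        int chunkSize = (n + parts - 1) / parts;   // round up so no number is skipped
        for (int start = 1; start <= n; start += chunkSize) {
            final int from = start;
            final int to = Math.min(n, start + chunkSize - 1);
            chunks.add(() -> {
                long sum = 0;
                for (int i = from; i <= to; i++) {
                    sum += i;
                }
                return sum;
            });
        }
        try {
            long total = 0;
            // invokeAll blocks until every chunk has completed.
            for (Future<Long> part : pool.invokeAll(chunks)) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for chunks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a chunk failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(parallelSum(pool, 100, 4));
        } finally {
            pool.shutdown();
        }
    }
}
```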

Published on Java Code Geeks with permission by Ondrej Mihalyi, partner at our JCG program. See the original article here: INTRODUCTION TO CONCURRENCY AND THREADS IN JAVA WEB APPS

