Analyze Java thread dump

Applications sometimes hang or run slowly, and identifying the root cause is not always an easy task. A thread dump provides a snapshot of the current state of a running Java program. However, the generated data includes multiple long files, so we need to analyze the Java thread dump and dig the problem out of a large amount of irrelevant information.

In this tutorial, we will see how to filter data to effectively diagnose performance issues. In addition, we will learn to detect bottlenecks and even simple errors.

2. Threads in the JVM

The JVM uses threads to perform every internal and external operation. As we know, the garbage collection process has its own threads, and the tasks inside a Java application create their own threads as well.

During its life cycle, a thread goes through various states. Each thread has an execution stack that tracks its current operation. In addition, the JVM also stores all the previous methods that were successfully called. Therefore, it is possible to analyze the whole stack to study what happened to the application when the problem occurred.
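
As a quick illustration, both the state and the execution stack of a thread are also exposed programmatically through the Thread API. The small demo below (class and thread names are made up) prints a thread's state before it starts and while it sleeps, and then dumps its current stack frames:

public class ThreadStateDemo {

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(5_000); // keep the thread alive in TIMED_WAITING
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "demo-worker");

        System.out.println(worker.getState()); // NEW
        worker.start();
        Thread.sleep(100);                      // let the worker reach the sleep call
        System.out.println(worker.getState()); // TIMED_WAITING

        // Each live thread also exposes its current execution stack
        for (StackTraceElement frame : worker.getStackTrace()) {
            System.out.println("  at " + frame);
        }
        worker.interrupt();
    }
}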

To demonstrate the topic of this tutorial, we will use a simple Sender-Receiver application (NetworkDriver) as an example. This Java program sends and receives data packets, so we will be able to analyze what is happening behind the scenes.
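
The NetworkDriver source is not shown in this tutorial, so the sketch below is only a rough guess at what such a Sender-Receiver pair might look like: the class name matches the dump, but the port, messages, and overall structure are assumptions. Its receiver thread blocks in BufferedReader.readLine, which is exactly the frame we will meet again in the dump later on:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for the tutorial's NetworkDriver:
// a receiver thread blocks reading packets from a socket while a sender thread writes them.
public class NetworkDriver {

    private static final int PORT = 5555; // assumed port

    public static void main(String[] args) throws Exception {
        Thread receiver = new Thread(NetworkDriver::receive, "receiver");
        Thread sender = new Thread(NetworkDriver::send, "sender");
        receiver.start();
        Thread.sleep(500); // give the receiver time to bind the server socket
        sender.start();
        receiver.join();
    }

    private static void receive() {
        try (ServerSocket server = new ServerSocket(PORT);
             Socket socket = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            String line;
            // readLine() blocks here waiting for data - the frame we expect to see in the dump
            while ((line = in.readLine()) != null) {
                System.out.println("received: " + line);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void send() {
        try (Socket socket = new Socket("localhost", PORT);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            for (int i = 0; i < 100; i++) {
                out.println("packet-" + i);
                Thread.sleep(1_000); // keep the application alive long enough to capture a dump
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}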

2.1. Capture Java thread dump

Once the application is running, there are multiple ways to generate a Java thread dump for diagnostics. In this tutorial, we will use two utilities included in JDK 7+ installations. First, we will execute the JVM Process Status (jps) command to discover the PID of our application process:

$ jps 
80661 NetworkDriver
33751 Launcher
80665 Jps
80664 Launcher
57113 Application

Second, we get the PID of our application; in this case, it is the one next to NetworkDriver (80661). Then, we will use jstack to capture the thread dump. Finally, we store the result in a text file:

$ jstack -l 80661 > sender-receiver-thread-dump.txt
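
Alternatively, jcmd 80661 Thread.print produces similar output, and a dump can even be captured from inside the application through the java.lang.management API. Below is a minimal sketch of that approach (the class name is made up; the output format differs slightly from jstack's):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ProgrammaticThreadDump {

    public static void main(String[] args) {
        ThreadMXBean threadMxBean = ManagementFactory.getThreadMXBean();
        // true, true -> also include locked monitors and ownable synchronizers
        for (ThreadInfo threadInfo : threadMxBean.dumpAllThreads(true, true)) {
            System.out.print(threadInfo); // each ThreadInfo prints a stack-trace-like block
        }
    }
}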

2.2. The structure of the sample dump

Let’s take a look at the thread dump that was generated. The first line shows the timestamp, and the second line shows information about the JVM:

2021-01-04 12:59:29 
Full thread dump OpenJDK 64-Bit Server VM (15.0.1+9-18 mixed mode, sharing):

The next section shows the Safe Memory Reclamation (SMR) info, i.e., the list of non-JVM-internal threads:

Threads class SMR info: 
_java_thread_list=0x00007fd7a7a12cd0, length=13, elements={
0x00007fd7aa808200, 0x00007fd7a7012c00, 0x00007fd7aa809800, 0x00007fd7a6009200,
0x00007fd7ac008200, 0x00007fd7a6830c00, 0x00007fd7ab00a400, 0x00007fd7aa847800,
0x00007fd7a6896200, 0x00007fd7a60c6800, 0x00007fd7a8858c00, 0x00007fd7ad054c00,
0x00007fd7a7018800
}

Then, the dump shows a list of threads. Each thread contains the following information:

  • Name: it can provide useful information if the developer includes a meaningful thread name
  • Priority (prio): the priority of the thread
  • Java ID (tid): the unique ID given by the JVM
  • Native ID (nid): the unique ID given by the operating system, useful to extract correlation with CPU or memory processing
  • State: the actual state of the thread
  • Stack trace: the most important source of information to decipher what is happening in our application

We can see from top to bottom what the different threads are doing at the time of the snapshot. Let’s focus only on the interesting part of the stack: the thread waiting to consume messages:

"Monitor Ctrl-Break" #12 daemon prio=5 os_prio=31 cpu=17.42ms elapsed=11.42s tid=0x00007fd7a6896200 nid=0x6603 runnable [0x000070000dcc5000] 
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.SocketDispatcher.read0(java.base@15.0.1/Native Method)
at sun.nio.ch.SocketDispatcher.read(java.base@15.0.1/SocketDispatcher.java:47)
at sun.nio.ch.NioSocketImpl.tryRead(java.base@15.0.1/NioSocketImpl.java:261)
at sun.nio.ch.NioSocketImpl.implRead(java.base@15.0.1/NioSocketImpl.java:312)
at sun.nio.ch.NioSocketImpl.read(java.base@15.0.1/NioSocketImpl.java:350)
at sun.nio.ch.NioSocketImpl$1.read(java.base@15.0.1/NioSocketImpl.java:803)
at java.net.Socket$SocketInputStream.read(java.base@15.0.1/Socket.java:981)
at sun.nio.cs.StreamDecoder.readBytes(java.base@15.0.1/StreamDecoder.java:297)
at sun.nio.cs.StreamDecoder.implRead(java.base@15.0.1/StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(java.base@15.0.1/StreamDecoder.java:188)
- locked (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(java.base@15.0.1/InputStreamReader.java:181)
at java.io.BufferedReader.fill(java.base@15.0.1/BufferedReader.java:161)
at java.io.BufferedReader.readLine(java.base@15.0.1/BufferedReader.java:326)
- locked (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(java.base@15.0.1/BufferedReader.java:392)
at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:61)
Locked ownable synchronizers:
- (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

At first glance, we see that the main stack trace is executing java.io.BufferedReader.readLine, which is the expected behavior. Looking further down, we find all the JVM methods executed by our application behind the scenes. Therefore, we are able to identify the root of the problem by looking at the source code or other internal JVM processing.

At the end of the dump, we notice that there are several additional threads performing background operations, such as garbage collection (GC) or object finalization:

"VM Thread" os_prio=31 cpu=1.85ms elapsed=11.50s tid=0x00007fd7a7a0c170 nid=0x3603 runnable 
"GC Thread#0" os_prio=31 cpu=0.21ms elapsed=11.51s tid=0x00007fd7a5d12990 nid=0x4d03 runnable
"G1 Main Marker" os_prio=31 cpu=0.06ms elapsed=11.51s tid=0x00007fd7a7a04a90 nid=0x3103 runnable
"G1 Conc#0" os_prio=31 cpu=0.05ms elapsed=11.51s tid=0x00007fd7a5c10040 nid=0x3303 runnable
"G1 Refine#0" os_prio=31 cpu=0.06ms elapsed=11.50s tid=0x00007fd7a5c2d080 nid=0x3403 runnable
"G1 Young RemSet Sampling" os_prio=31 cpu=1.23ms elapsed=11.50s tid=0x00007fd7a9804220 nid=0x4603 runnable
"VM Periodic Task Thread" os_prio=31 cpu=5.82ms elapsed=11.42s tid=0x00007fd7a5c35fd0 nid=0x9903 waiting on condition

Finally, the dump shows the Java Native Interface (JNI) references. We should pay special attention to these when memory leaks occur, because they are not automatically garbage collected:

JNI global refs: 15, weak refs: 0

Thread dumps are quite similar in structure, but we want to get rid of the data that is not important for our use case. At the same time, we need to keep and group the relevant information among the tons of logs produced by the stack traces. Let’s see how to do it!

3. Suggestions for analyzing thread dumps

To understand what is happening in our application, we need to analyze the generated snapshot effectively. At the time of the dump, we get a lot of information with the precise data of all the threads. However, we need to organize the log files and do some filtering and grouping to extract useful hints from the stack traces. Once the dump is prepared, we will be able to analyze the problem with different tools. Let’s see how to decipher the content of our sample dump.

3.1. Synchronization problem

An interesting trick for filtering the stack traces is the state of the thread. We will mainly focus on RUNNABLE or BLOCKED threads, and finally on TIMED_WAITING ones. These states will point us in the direction of a conflict between two or more threads:

  • In the case of a deadlock, multiple running threads hold a synchronized block on a shared object (a minimal sketch follows this list)
  • In the case of thread contention, a thread is blocked waiting for other threads to finish; for example, the dump generated in the previous section
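
To make the deadlock case concrete, here is a minimal, self-contained sketch (the class, lock, and thread names are made up for illustration). Its thread dump would show both worker threads as BLOCKED, each one waiting for the monitor held by the other, and jstack flags them under a "Found one Java-level deadlock" section:

public class DeadlockExample {

    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();

    public static void main(String[] args) {
        new Thread(() -> acquire(LOCK_A, LOCK_B), "worker-1").start();
        new Thread(() -> acquire(LOCK_B, LOCK_A), "worker-2").start();
        // the program never terminates: both workers end up BLOCKED on each other's monitor
    }

    // Each thread takes the locks in the opposite order, so once both hold
    // their first lock, each one blocks forever waiting for the other's monitor.
    private static void acquire(Object first, Object second) {
        synchronized (first) {
            sleep(100); // give the other thread time to grab its first lock
            synchronized (second) {
                System.out.println(Thread.currentThread().getName() + " got both locks");
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}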

3.2. Execution issues

As a rule of thumb, for abnormally high CPU usage we only need to look at RUNNABLE threads. We use thread dumps together with other commands to acquire extra information. One of these commands is top -H -p PID, which shows which threads are consuming OS resources within that particular process. Just in case, we also need to look at the internal JVM threads, such as GC. On the other hand, when processing performance is abnormally slow, we will look at the BLOCKED threads.
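
For instance, top -H -p prints per-thread CPU usage with OS thread IDs in decimal (on Linux), while the nid field in the dump is hexadecimal; converting the busy thread's ID lets us jump straight to its stack. The thread ID below is only an illustration that happens to match the nid=0x6603 seen earlier:

$ top -H -p 80661
$ printf '%x\n' 26115
6603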

In those cases, a single dump will most certainly not be enough to understand what is going on. We will need a number of dumps taken at short intervals in order to compare the stacks of the same threads at different times. On the one hand, one snapshot is not always enough to find the source of the problem. On the other hand, we need to avoid noise between snapshots (too much information).

To understand the evolution of the threads over time, a recommended best practice is to take at least 3 dumps, one every 10 seconds. Another useful tip is to split the dumps into small chunks to avoid crashes while loading the files.
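
For example, a small shell loop is enough to automate the capture; the PID and file names below just reuse our earlier example:

$ for i in 1 2 3; do jstack -l 80661 > sender-receiver-dump-$i.txt; sleep 10; done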

To identify the source of the problem efficiently, we need to organize the huge amount of information in the stack traces. Therefore, we will consider the following recommendations:

  • For execution problems, capturing several snapshots at 10-second intervals will help focus on the actual problem. It is also recommended to split the files if needed to avoid crashes while loading
  • Use meaningful names when creating new threads to better identify them in your source code (see the example after this list)
  • Depending on the issue, ignore internal JVM processing (such as GC)
  • Focus on long-running or blocked threads when abnormal CPU or memory usage occurs
  • Correlate the thread’s stack with CPU processing by using top -H -p PID
  • Most importantly, use analyzer tools
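
For instance, naming threads, as suggested above, only requires passing a name to the Thread constructor or supplying a ThreadFactory to an executor. The names and pool size below are purely illustrative:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreads {

    public static void main(String[] args) {
        // Name a manually created thread
        Thread sender = new Thread(() -> System.out.println("sending..."), "network-sender");
        sender.start();

        // Name pool threads via a custom ThreadFactory
        AtomicInteger counter = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4,
                runnable -> new Thread(runnable, "packet-worker-" + counter.incrementAndGet()));
        pool.submit(() -> System.out.println("processing..."));
        pool.shutdown();
    }
}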

Analyzing Java thread dumps manually can be a tedious job. For simple applications, it is possible to identify the threads causing the problem. On the other hand, for complex situations, we will need tools to ease this task. In the next sections, we will use the dump generated for the sample thread contention to show how to use these tools.

4. Online tools

There are several online tools available. When using this kind of software, we need to take security into account. Remember that we might be sharing our logs with third-party entities.

4.1. FastThread

FastThread may be the best online tool for analyzing thread dumps in a production environment. It provides a very beautiful graphical user interface. It also includes a variety of functions, such as thread CPU usage, stack length, and the most commonly used and most complex methods:

[Image: how-to-analyze-java-thread-dumps.png]

FastThread provides a REST API for automating the analysis of thread dumps. With a simple cURL command, the dump can be submitted and the results obtained immediately. The main drawback is security, because it stores the stack traces in the cloud.
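
A hedged sketch of such a call is shown below; the endpoint URL, parameter name, and API key are placeholders rather than the confirmed API, so the exact invocation should be checked against the FastThread documentation:

$ curl -X POST --data-binary @sender-receiver-thread-dump.txt "https://fastthread.io/fastthread-api?apiKey=YOUR_API_KEY"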

4.2. JStack Review

JStack Review is an online tool that analyzes dumps within the browser. It runs entirely on the client side, so no data is stored outside your computer. From a security point of view, this is its main advantage. It provides a graphical overview of all threads, showing the running methods and grouping them by state. JStack Review separates the threads with a stack from the rest, which is very helpful for ignoring internal processes. Finally, it also includes the synchronizers and the ignored lines:

[Image: how-to-analyze-java-thread-dumps-1.png]

4.3. Spotify Online Java Thread Dump Analyzer

Spotify Online Java Thread Dump Analyzer is an online open-source tool written in JavaScript. It displays the results in plain text, separating threads with and without a stack. It also shows the top methods among the running threads:

[Image: how-to-analyze-java-thread-dumps-2.png]

5. Standalone applications

There are also several standalone applications that we can use locally.

5.1. JProfiler

JProfiler is the most powerful tool on the market and is well known in the Java developer community. A 10-day trial license is available to test the functionality. JProfiler allows creating profiles and attaching running applications to them. It includes a variety of features for identifying problems on the spot, such as CPU and memory usage and database analysis. It also supports integration with IDEs:

[Image: how-to-analyze-java-thread-dumps-3.png]

5.2. IBM Java Thread Monitor and Dump Analyzer (TMDA)

IBM TMDA can be used to identify thread contention, deadlocks, and bottlenecks. It is freely distributed and maintained, but it comes without any guarantee or support from IBM:

[Image: how-to-analyze-java-thread-dumps-4.png]

5.3. Irockel Thread Dump Analyzer (TDA)

Irockel TDA is a standalone open-source tool licensed under LGPL v2.1. The latest version (v2.4) was released in August 2020, so it is well maintained. It displays the thread dump as a tree and also provides some statistics to ease navigation:

[Image: how-to-analyze-java-thread-dumps-5.png]

Finally, IDEs provide basic analysis of thread dumps, so it is possible to debug the application during development.

6. Conclusion

In this article, we demonstrated how Java thread dump analysis can help us pinpoint synchronization or execution issues.

Most importantly, we reviewed how to analyze them properly, including suggestions for organizing the huge amount of information embedded in the snapshots.
