Overview
Why is the code cache critical for understanding Java application performance? Before we jump into the topic, let's first understand how the code cache relates to the code written in a .java file.
As shown in the figure above, the code written in a .java file is compiled into bytecode, i.e. a .class file.
The bytecode is understood by the JVM, and this is what makes Java "write once, run anywhere": each OS (e.g. Mac/Windows/Linux) has its own JVM implementation that converts the bytecode into machine code. There are also different JVM implementations available; the default (if not specified) is the HotSpot JVM.
One of the main components of the JVM is the JIT (Just in Time) compiler, which improves performance by compiling code that executes more than a certain threshold of times into native instructions.
Note: While doing performance analysis for certain methods, it's important to know whether the analysis was done before the conversion of bytecode into native code (i.e. pure interpretation) or after it.
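The note above can be seen in practice with a rough timing sketch. This is illustrative only: the class name, method, and loop counts are made up, and a proper harness such as JMH should be used for real measurements.

```java
public class WarmupTiming {
    // A hypothetical workload method; hot enough to attract the JIT.
    static long work(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) s += i;
        return s;
    }

    public static void main(String[] args) {
        // First (cold) call: most likely still interpreted.
        long t0 = System.nanoTime();
        work(100_000);
        long cold = System.nanoTime() - t0;

        // Warm up so the JIT gets a chance to compile work() to native code.
        for (int i = 0; i < 10_000; i++) work(100_000);

        // Measure again: this call typically runs the compiled version.
        long t1 = System.nanoTime();
        work(100_000);
        long warm = System.nanoTime() - t1;
        System.out.println("cold=" + cold + "ns warm=" + warm + "ns");
    }
}
```

The two timings usually differ noticeably, which is exactly why benchmarking before warm-up measures the interpreter rather than the compiled code.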
With this understanding of the JVM and the JIT, the next step is to see how compilation happens and how it can be observed, so that we know which parts of the code are interpreted and which are converted into native code.
To see which methods are being compiled into native code, add the following VM argument when running the application: `-XX:+PrintCompilation`
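For instance, a tiny program like the sketch below (class and method names are hypothetical) gives the flag something to report: run it with `java -XX:+PrintCompilation HotLoop.java` and compilation events for hot methods such as `square` appear in the console.

```java
public class HotLoop {
    // Called enough times to cross the JIT compilation threshold,
    // so it should show up in the -XX:+PrintCompilation output.
    static long square(long n) {
        return n * n;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (long i = 0; i < 1_000_000; i++) {
            sum += square(i);
        }
        // Print the result so the loop cannot be optimized away entirely.
        System.out.println(sum);
    }
}
```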
For applications running on remote machines we may not have access to the console. In such scenarios we can log compilation events using:
`-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation`
After enabling this option, a log file appears in the working directory from which the application was launched, e.g. `hotspot_pid77745.log`
Sample output:
```
28553 1357 3 sun.nio.ch.SocketChannelImpl::ensureOpen (16 bytes)
28553 1356 1 sun.nio.ch.OptionKey::name (5 bytes)
28553 1358 1 java.net.InetAddress$InetAddressHolder::getAddress (5 bytes)
28864 1359 4 java.util.ArrayList::iterator (9 bytes)
28866 629 3 java.util.ArrayList::iterator (9 bytes) made not entrant
29682 1360 4 java.lang.StringBuilder::<init> (7 bytes)
29685 159 3 java.lang.StringBuilder::<init> (7 bytes) made not entrant
30534 1361 4 java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes)
30537 434 3 java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes) made not entrant
32075 1362 4 java.util.concurrent.locks.AbstractQueuedSynchronizer::compareAndSetState (13 bytes)
32076 1040 3 java.util.concurrent.locks.AbstractQueuedSynchronizer::compareAndSetState (13 bytes) made not entrant
```
Let’s understand the above output. The table below describes what each column and symbol stands for:

| Column / Symbol | Meaning |
| --- | --- |
| First column | Number of milliseconds since the virtual machine started |
| Second column | Compile ID, i.e. the order in which the method was compiled |
| Third column | Compilation level, 0-4. 0: no compilation, the code has only been interpreted; 1 to 4: progressively deeper levels of compilation |
| `!` | Method has exception handlers |
| `%` | On-stack replacement: the compiled native code replaced the interpreted frame while the method (typically a hot loop) was still running |
| `*` | Generating a native wrapper |
| `n` | Native method |
| `s` | Synchronized method |
| `b` | Blocking compiler (always set for client) |
| `made not entrant` | The compiled version has been invalidated or superseded (e.g. by a higher-tier compilation); no future callers will use it |
| `made zombie` | The code is no longer in use and is ready to be reclaimed |
As the table shows, the third column of the sample output is a number from 0-4. Let’s understand what this number means:
The JVM has two JIT compilers built in:
– C1: performs the first three levels of compilation, each progressively more optimized than the last
– C2: performs the 4th and most optimized level. If a method is called often enough it reaches level 4, and the JVM places the resulting native code in the **code cache**, a special area of memory from which the code can be accessed quickly
For example: `30534 1361 4 java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes)`
This method was compiled 30534 ms after the JVM started, with compile ID 1361, at level 4 by the C2 JIT compiler, i.e. it was converted into native code and stored in the code cache.
Key points to note:
- The higher the compilation level, the more optimized the code should be
- The JVM does not compile every piece of code to level 4: compilation itself has a cost, and for rarely executed code the benefit does not justify it
So far we have seen how the code written in a .java file reaches the code cache.
The next question is when to look at the code cache while analysing/debugging Java application performance.
A good indicator is the VM warning:
`CodeCache is full. Compiler has been disabled.`
This means the code cache has no room left for newly compiled code, so the JIT compiler stops compiling and new hot methods keep running interpreted, i.e. the application is no longer running optimally. At this point we can increase the size of the code cache to improve performance.
Use the following VM option to print code cache usage: `-XX:+PrintCodeCache`
By default the code cache can grow up to 240 MB. The relevant sizing options are:
- `-XX:InitialCodeCacheSize`: code cache size when the application starts (commonly around 160 KB)
- `-XX:ReservedCodeCacheSize`: maximum size of the code cache
- `-XX:CodeCacheExpansionSize`: amount by which the code cache grows each time it needs more space
Eg: setting the reserved code cache size to 28 MB: `-XX:ReservedCodeCacheSize=28m`
As of Java 9 the code cache is divided into three segments:
– The non-method segment contains JVM-internal code, sized by `-XX:NonNMethodCodeHeapSize`
– The profiled-code segment contains lightly optimized code with potentially short lifetimes, sized by `-XX:ProfiledCodeHeapSize`
– The non-profiled segment contains fully optimized code with potentially long lifetimes, sized by `-XX:NonProfiledCodeHeapSize`
Viewing CodeCache Size at Runtime
JConsole can be used to monitor the code cache of an application running on a remote machine.
Note that JConsole itself interacts with the JVM to fetch these metrics, and in doing so uses around 2 MB of code cache.
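Besides JConsole, the code cache pools can also be read programmatically through the standard `java.lang.management` API. A minimal sketch, assuming a Java 9+ segmented code cache (on Java 8 there is a single pool named "Code Cache" instead):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class CodeCacheUsage {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Java 9+ exposes three pools: "CodeHeap 'non-nmethods'",
            // "CodeHeap 'profiled nmethods'", "CodeHeap 'non-profiled nmethods'".
            // Java 8 exposes a single "Code Cache" pool.
            if (name.contains("CodeHeap") || name.contains("Code Cache")) {
                System.out.printf("%s: used=%d KB, max=%d KB%n",
                        name,
                        pool.getUsage().getUsed() / 1024,
                        pool.getUsage().getMax() / 1024);
            }
        }
    }
}
```

This is handy for exporting code cache usage to a metrics system when attaching JConsole is not practical.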