techiehub.in

Java Performance: Role of CodeCache

Overview

Why is the code cache critical for understanding Java application performance? Before we jump into the topic, let’s first understand how the code cache relates to the code written in a .java file.

As shown in the figure above, the code written in a .java file is compiled into bytecode, i.e. a .class file.

The bytecode is understood by the JVM. The JVM is what makes Java "write once, run anywhere": each OS (e.g. Mac/Windows/Linux) has its own JVM implementation, which converts the bytecode into machine code. There are also different JVM implementations available; the one shipped with OpenJDK and Oracle JDK, and therefore the one you get unless you choose otherwise, is the HotSpot JVM.

One of the main components of the JVM is the JIT (Just-In-Time) compiler, which improves performance by compiling code that has been executed more than a certain number of times into native instructions.

Note: When doing performance analysis of a particular method, it is important to know whether the measurement was taken after the bytecode was compiled into native code or before (i.e. while the method was only being interpreted).
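To make the note above concrete, here is a minimal sketch that times the same method in several consecutive batches. The class name `WarmupTiming` and the workload `sumOfSquares` are made up for illustration. On a typical HotSpot JVM the early batches tend to be slower because the method is still interpreted or only lightly compiled, but the exact numbers depend on the machine and JVM version, so treat it as an experiment rather than a benchmark.

```java
class WarmupTiming {
    // Hypothetical workload: sum of i*i for i in [0, n).
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs the workload `calls` times and returns the elapsed time in nanoseconds.
    static long timeBatch(int calls, int n) {
        long sink = 0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += sumOfSquares(n);
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the JIT cannot discard the loop as dead code.
        if (sink == 42) {
            System.out.println("unexpected sink value");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Identical work in every batch; only the compilation state of the method changes.
        for (int batch = 1; batch <= 5; batch++) {
            long micros = timeBatch(10_000, 1_000) / 1_000;
            System.out.printf("batch %d: %d us%n", batch, micros);
        }
    }
}
```

If the first batch is included in an average, the result mixes interpreted and compiled execution, which is exactly the situation the note warns about.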

With this understanding of the JVM and the JIT, the next step is to understand how code compilation is done and how it can be observed, so that we can tell which parts of the code are interpreted and which have been converted into native code.

To see how the code is running and which pieces are converted into native code, we can pass the following VM argument when running the application: `-XX:+PrintCompilation`

For applications running on remote machines we may not have access to the console. In such scenarios we can write the compilation log to a file using:

`-XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation`

Once the option is enabled, the JVM writes a log file into its working directory (the root directory of the project if the application is launched from there), e.g. `hotspot_pid77745.log`

Sample `-XX:+PrintCompilation` output:

================================================
  28553 1357       3       sun.nio.ch.SocketChannelImpl::ensureOpen (16 bytes)
  28553 1356       1       sun.nio.ch.OptionKey::name (5 bytes)
  28553 1358       1       java.net.InetAddress$InetAddressHolder::getAddress (5 bytes)
  28864 1359       4       java.util.ArrayList::iterator (9 bytes)
  28866  629       3       java.util.ArrayList::iterator (9 bytes)   made not entrant
  29682 1360       4       java.lang.StringBuilder::<init> (7 bytes)
  29685  159       3       java.lang.StringBuilder::<init> (7 bytes)   made not entrant
  30534 1361       4       java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes)
  30537  434       3       java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes)   made not entrant
  32075 1362       4       java.util.concurrent.locks.AbstractQueuedSynchronizer::compareAndSetState (13 bytes)
  32076 1040       3       java.util.concurrent.locks.AbstractQueuedSynchronizer::compareAndSetState (13 bytes)   made not entrant
================================================

Let’s understand the above output. The below table describes what each column and symbol stands for:

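| Column / symbol | Meaning |
| --- | --- |
| First column | Number of milliseconds since the virtual machine started |
| Second column | Order in which the method was compiled (compilation ID) |
| Third column | Compilation level, 0 to 4. 0: no compilation, i.e. the code has only been interpreted. 1 to 4: progressively deeper levels of compilation |
| `!` | Method has exception handlers |
| `%` | On-stack replacement (OSR): the method was compiled while it was still running, typically inside a long loop, and the running interpreted code was swapped for the native version in the code cache |
| `*` | Generating a native wrapper |
| `n` | Native method |
| `s` | Synchronized method |
| `b` | Blocking compilation, i.e. the calling thread waited for the compiler instead of the compilation running in the background |
| `made not entrant` | This compiled version has been invalidated (for example, it was replaced by a higher-level version or relied on an assumption that no longer holds); no future callers will use it |
| `made zombie` | The code is no longer in use and its space in the code cache can be reclaimed |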

As the table notes, the third column of the sample output holds a number from 0 to 4. Let’s understand what this number means:

The HotSpot JVM has two JIT compilers built in:

– C1: performs the first three levels of compilation. Each level is progressively more complex than the last one.

– C2: performs the fourth level. If the code is called often enough, it reaches level 4, the most optimized level. The compiled native code is kept by the JVM in the **code cache**, i.e. a special area of memory from which the code can be accessed quickly.

For example:

  30534 1361       4       java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes)

This method was compiled 30534 ms after the JVM started, as compilation number 1361, at level 4, i.e. by the C2 JIT compiler. In other words, it was converted into native code and stored in the code cache.
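The column layout described above can also be read programmatically. The following is a rough sketch, not an official parser: the class `CompilationLine` and its regular expression are assumptions based only on the shape of the lines in the sample output (timestamp, compilation ID, optional attribute symbols, level, method, size, optional trailing note). Other line shapes that the JVM can emit, such as native methods, are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompilationLine {
    // Groups: 1 = ms since JVM start, 2 = compilation ID, 3 = attribute symbols,
    // 4 = level, 5 = method, 6 = bytecode size, 7 = trailing note (may be empty).
    private static final Pattern LINE = Pattern.compile(
        "^\\s*(\\d+)\\s+(\\d+)\\s+([%sbn!*]*)\\s*(\\d)\\s+(\\S+)\\s+"
        + "(?:@\\s+\\d+\\s+)?\\((\\d+) bytes\\)\\s*(.*)$");

    final long millis;
    final int compileId;
    final String attributes;
    final int level;
    final String method;
    final int bytecodeSize;
    final String note;

    private CompilationLine(Matcher m) {
        millis = Long.parseLong(m.group(1));
        compileId = Integer.parseInt(m.group(2));
        attributes = m.group(3);
        level = Integer.parseInt(m.group(4));
        method = m.group(5);
        bytecodeSize = Integer.parseInt(m.group(6));
        note = m.group(7).trim();
    }

    static CompilationLine parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized line: " + line);
        }
        return new CompilationLine(m);
    }

    // Level 0 is interpreted, levels 1 to 3 come from C1, level 4 from C2.
    String compiler() {
        if (level == 0) return "interpreter";
        return level <= 3 ? "C1" : "C2";
    }

    public static void main(String[] args) {
        CompilationLine c = parse(
            "  30537  434       3       java.util.LinkedHashMap$LinkedValueIterator::next (8 bytes)   made not entrant");
        System.out.println(c.method + " level=" + c.level + " compiler=" + c.compiler()
            + " note=" + c.note);
    }
}
```

Feeding every line of a captured log through `parse` and grouping by `method` is one way to see which methods reached level 4 and which earlier versions were made not entrant.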

Key points to note:

  • The higher the level of compilation, the more optimized the code should be.
  • The JVM does not compile every piece of code to level 4, because compilation itself has a cost and the benefit is small for code that runs rarely.

So far we have understood how the code written in the .java file reaches the code cache.

The next question is when to look at the code cache while analysing or debugging Java application performance.

A good indicator that the code cache needs attention is the following VM warning:

CodeCache is full. Compiler has been disabled

This means the code cache has filled up and no part of it can easily be cleaned up, because the code in it is actively being used. The JIT compiler stops compiling, so newly hot code keeps running in the interpreter and the application no longer runs in an optimal manner. At this point we can increase the size of the code cache to improve performance.

Use the following VM option to print the code cache size and usage when the JVM exits: `-XX:+PrintCodeCache`

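By default, the code cache can grow up to 240MB on a 64-bit HotSpot JVM with tiered compilation enabled (Java 8 and later). The following options control its size:

InitialCodeCacheSize: Code cache size at the start of the application. Commonly cited as 160KB, although the actual default varies by platform and JVM version.

ReservedCodeCacheSize: Maximum size of the code cache.

CodeCacheExpansionSize: The increment by which the code cache grows, i.e. how quickly it can expand.

For example, to set the reserved code cache size to 28 MB: `-XX:ReservedCodeCacheSize=28m`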

As of Java 9 the code cache is divided into three segments:

– The non-method segment contains JVM-internal code: `-XX:NonNMethodCodeHeapSize`

– The profiled-code segment contains lightly optimized code with potentially short lifetimes: `-XX:ProfiledCodeHeapSize`

– The non-profiled segment contains fully optimized code with potentially long lifetimes: `-XX:NonProfiledCodeHeapSize`
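To check which values these options actually have in a running JVM, one possibility is to ask the HotSpot diagnostic MXBean. The sketch below assumes a HotSpot-based JVM; the class name `CodeCacheFlags` is made up for illustration. The segment-size flags do not exist before Java 9, so the code reports a flag as unavailable instead of failing when the JVM does not know it.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class CodeCacheFlags {
    // The sizing options discussed in the text.
    static final String[] FLAGS = {
        "InitialCodeCacheSize",
        "ReservedCodeCacheSize",
        "CodeCacheExpansionSize",
        "NonNMethodCodeHeapSize",
        "ProfiledCodeHeapSize",
        "NonProfiledCodeHeapSize"
    };

    // Returns the flag value as a string (sizes are in bytes),
    // or null if this JVM does not know the flag.
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return null; // e.g. segment flags on a pre-Java 9 JVM
        }
    }

    public static void main(String[] args) {
        for (String flag : FLAGS) {
            String value = flagValue(flag);
            System.out.println(flag + " = " + (value == null ? "(not available)" : value));
        }
    }
}
```

Running it once with and once without `-XX:ReservedCodeCacheSize=28m` is a quick way to confirm that the option was picked up.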

Viewing CodeCache Size at Runtime

JConsole can be used to monitor the code cache of an application running on a remote machine.

JConsole interacts with the JVM to fetch the metrics, and in doing so uses around 2MB of code cache itself.
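The same figures can be read from inside the application through the standard `java.lang.management` API. The sketch below is only an illustration; the class name `CodeCacheUsage` is made up, and the pool-name prefixes are an assumption about HotSpot: a single "Code Cache" pool on Java 8, and pools whose names start with "CodeHeap" once the code cache is segmented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class CodeCacheUsage {
    // Collects the non-heap memory pools that back the code cache.
    // The name prefixes are an assumption about HotSpot's pool naming.
    static List<MemoryPoolMXBean> codeCachePools() {
        List<MemoryPoolMXBean> pools = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            boolean looksLikeCodeCache =
                name.startsWith("Code Cache") || name.startsWith("CodeHeap");
            if (pool.getType() == MemoryType.NON_HEAP && looksLikeCodeCache) {
                pools.add(pool);
            }
        }
        return pools;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : codeCachePools()) {
            MemoryUsage usage = pool.getUsage();
            // getMax() returns -1 when the maximum is undefined.
            System.out.printf("%s: used=%d KB, committed=%d KB, max=%d bytes%n",
                pool.getName(),
                usage.getUsed() / 1024,
                usage.getCommitted() / 1024,
                usage.getMax());
        }
    }
}
```

Logging these values periodically gives a trend of code cache growth without attaching an external tool.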

Resources 

CodeCache Tuning: https://docs.oracle.com/javase/8/embedded/develop-apps-platforms/codecache.htm
