
high-load interview questions

Top high-load frequently asked interview questions

What caused socket connections to be slow after Full GC?

We have a client-server app: one server and about 10 clients. They communicate over TCP sockets using custom queries.

The system had been running smoothly for many months, but one day, after the daily scheduled full GC on the server (which took about 50 s), we noticed that the time between a query sent by a client and the response received from the server had become large: more than 10-20 s. After about 3 hours the system recovered and everything ran fine again.

While investigating the issue, we found:

  1. No garbage collection problems on either the clients or the server.
  2. Query processing time on the server was small.
  3. Load on the server was high.
  4. The network bandwidth was not saturated.
  5. Connections were not reset during the full GC (the daily full GC had been a normal event until then).
  6. The machine and OS had recently changed from CentOS 6 (kernel 2.6.32) to CentOS 7 (kernel 3.10.0), but the new configuration was extensively tested. The Oracle JDK version also changed, from 1.7.65 to 1.7.75.

We took a thread dump on the server:

java.lang.Thread.State: RUNNABLE
    at java.io.FilterInputStream.read(FilterInputStream.java:83)
    at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:293)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

The FilterInputStream.read() is the following:

public int read() throws IOException {
    return in.read();
}

In our code, in is a BufferedInputStream.

The questions are: Why did most connections slow down after the full GC pause? Why does the stack trace end in FilterInputStream.read()? Shouldn't it end somewhere in BufferedInputStream or in the socket input stream? Can this read lead to high load on the server?

The code we use for reading:

int constructLength = _socketDIS.readInt();
ByteArrayOutputStream constructBOAS = new ByteArrayOutputStream(constructLength);
for (int i = 0; i != constructLength; i++)
      constructBOAS.write(_socketDIS.read());
constructBOAS.close();
byte[] bytes = constructBOAS.toByteArray();

where:

_socketDIS = new DataInputStream(new BufferedInputStream(_socket.getInputStream()));
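
Independent of the root cause, the byte-at-a-time loop above is expensive: every _socketDIS.read() call goes through FilterInputStream.read() and takes the BufferedInputStream monitor once per byte. A minimal sketch of the bulk alternative (the improvement acknowledged as valid in the update below), using DataInputStream.readFully():

int constructLength = _socketDIS.readInt();
byte[] bytes = new byte[constructLength];
// One bulk call replaces constructLength single-byte reads (and the
// intermediate ByteArrayOutputStream copy); it throws EOFException if the
// stream ends before the full payload arrives.
_socketDIS.readFully(bytes);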

Here is the stack trace from the client connections that were working well:

java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:152)
    at java.net.SocketInputStream.read(SocketInputStream.java:122)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    - locked <0x00007f522cbebca8> (a java.io.BufferedInputStream)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:287)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

UPDATE:

Regarding the EJP answer:

  1. There was no EOS involved; the connections were up, but they were very slow.

  2. Even if there were an EOS, I can't see how the code could spin at it: the for loop is bounded by the constructLength value. Still, the suggested improvement is valid.

  3. The stack trace with the problem ends in a read done on a DataInputStream (_socketDIS.read()), which is inherited from FilterInputStream.read(); see the code above. It is DataInputStream, not BufferedInputStream, that lacks its own read(). Inside FilterInputStream.read() there is an in.read() call on a BufferedInputStream, which does define its own read() method. But the stack trace stops in the middle and never reaches BufferedInputStream.read(). Why?


Source: (StackOverflow)

C: epoll and multithreading

I need to create a specialized HTTP server; for this I plan to use the epoll syscall, but I want to utilize multiple processors/cores and I can't come up with an architecture. At the moment my idea is the following: create multiple threads, each with its own epoll descriptor; the main thread accepts connections and distributes them among the threads' epoll descriptors. But are there any better solutions? Which books/articles/guides can I read on high-load architectures? I've seen only the C10K article, but most of the links to examples are dead :( and there are still no in-depth books on this subject :(.

Thank you for your answers.

UPD: Please be more specific. I need materials and examples (nginx is not an example, because it's too complex and has multiple abstraction layers to support multiple systems).


Source: (StackOverflow)


Java High-load NIO TCP server

As a part of my research I'm writing a high-load TCP/IP echo server in Java. I want to serve about 3-4k clients and see the maximum possible number of messages per second that I can squeeze out of it. Message size is quite small: up to 100 bytes. This work doesn't have any practical purpose, only research.

According to numerous presentations that I've seen (HornetQ benchmarks, LMAX Disruptor talks, etc.), real-world high-load systems tend to serve millions of transactions per second (I believe Disruptor mentioned about 6 million and HornetQ about 8.5 million). For example, this post states that it is possible to achieve up to 40M MPS. So I took that as a rough estimate of what modern hardware should be capable of.

I wrote the simplest single-threaded NIO server and launched a load test. I was a little surprised that I could get only about 100k MPS on localhost and 25k with actual networking. The numbers look quite small. I was testing on Win7 x64, Core i7. Looking at CPU load, only one core is busy (which is expected for a single-threaded app), while the rest sit idle. However, even if I loaded all 8 cores (including virtual ones) I would have no more than 800k MPS: not even close to 40 million :)

My question is: what is a typical pattern for serving massive numbers of messages to clients? Should I distribute the networking load over several different sockets inside a single JVM and use some sort of load balancer like HAProxy to spread the load over multiple cores? Or should I look toward using multiple Selectors in my NIO code? Or maybe even distribute the load between multiple JVMs and use Chronicle to build inter-process communication between them? Will testing on a proper server-side OS like CentOS make a big difference (maybe it is Windows that slows things down)?

Below is the sample code of my server. It always answers "ok" to any incoming data. I know that in the real world I'd need to track the size of the message and be prepared for one message to be split between multiple reads; however, I'd like to keep things super-simple for now.

import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.channels.spi.SelectorProvider;
import java.util.Iterator;

public class EchoServer {

    private static final int BUFFER_SIZE = 1024;
    private final static int DEFAULT_PORT = 9090;

    // The buffer into which we'll read data when it's available
    private ByteBuffer readBuffer = ByteBuffer.allocate(BUFFER_SIZE);

    private InetAddress hostAddress = null;

    private int port;
    private Selector selector;

    private long loopTime;
    private long numMessages = 0;

    public EchoServer() throws IOException {
        this(DEFAULT_PORT);
    }

    public EchoServer(int port) throws IOException {
        this.port = port;
        selector = initSelector();
        loop();
    }

    private void loop() {
        while (true) {
            try {
                selector.select();
                Iterator<SelectionKey> selectedKeys = selector.selectedKeys().iterator();
                while (selectedKeys.hasNext()) {
                    SelectionKey key = selectedKeys.next();
                    selectedKeys.remove();

                    if (!key.isValid()) {
                        continue;
                    }

                    // Check what event is available and deal with it
                    if (key.isAcceptable()) {
                        accept(key);
                    } else if (key.isReadable()) {
                        read(key);
                    } else if (key.isWritable()) {
                        write(key);
                    }
                }

            } catch (Exception e) {
                e.printStackTrace();
                System.exit(1);
            }
        }
    }

    private void accept(SelectionKey key) throws IOException {
        ServerSocketChannel serverSocketChannel = (ServerSocketChannel) key.channel();

        SocketChannel socketChannel = serverSocketChannel.accept();
        socketChannel.configureBlocking(false);
        socketChannel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
        socketChannel.setOption(StandardSocketOptions.TCP_NODELAY, true);
        socketChannel.register(selector, SelectionKey.OP_READ);

        System.out.println("Client is connected");
    }

    private void read(SelectionKey key) throws IOException {
        SocketChannel socketChannel = (SocketChannel) key.channel();

        // Clear out our read buffer so it's ready for new data
        readBuffer.clear();

        // Attempt to read off the channel
        int numRead;
        try {
            numRead = socketChannel.read(readBuffer);
        } catch (IOException e) {
            key.cancel();
            socketChannel.close();

            System.out.println("Forceful shutdown");
            return;
        }

        if (numRead == -1) {
            System.out.println("Graceful shutdown");
            key.channel().close();
            key.cancel();

            return;
        }

        socketChannel.register(selector, SelectionKey.OP_WRITE);

        numMessages++;
        if (numMessages % 100000 == 0) {
            long elapsed = System.currentTimeMillis() - loopTime;
            loopTime = System.currentTimeMillis();
            System.out.println(elapsed);
        }
    }

    private void write(SelectionKey key) throws IOException {
        SocketChannel socketChannel = (SocketChannel) key.channel();
        ByteBuffer dummyResponse = ByteBuffer.wrap("ok".getBytes("UTF-8"));

        socketChannel.write(dummyResponse);
        if (dummyResponse.remaining() > 0) {
            System.err.print("Filled UP");
        }

        key.interestOps(SelectionKey.OP_READ);
    }

    private Selector initSelector() throws IOException {
        Selector socketSelector = SelectorProvider.provider().openSelector();

        ServerSocketChannel serverChannel = ServerSocketChannel.open();
        serverChannel.configureBlocking(false);

        InetSocketAddress isa = new InetSocketAddress(hostAddress, port);
        serverChannel.socket().bind(isa);
        serverChannel.register(socketSelector, SelectionKey.OP_ACCEPT);
        return socketSelector;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Starting echo server");
        new EchoServer();
    }
}
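
For the "multiple Selectors" direction asked about above, a common shape (the pattern frameworks such as Netty use) is one acceptor thread plus N worker reactors, each with its own Selector on its own thread. Below is a rough sketch under the same simplifications as the server above (message framing and partial writes ignored); the class name, port, and round-robin rule are illustrative only:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.ConcurrentLinkedQueue;

public class MultiReactorSketch {

    static class Worker implements Runnable {
        final Selector selector;
        final ConcurrentLinkedQueue<SocketChannel> pending =
                new ConcurrentLinkedQueue<SocketChannel>();

        Worker() throws IOException { selector = Selector.open(); }

        void register(SocketChannel ch) {
            pending.add(ch);
            selector.wakeup();   // break out of select() so the channel gets registered
        }

        public void run() {
            ByteBuffer buf = ByteBuffer.allocate(1024);
            try {
                while (true) {
                    selector.select();
                    SocketChannel ch;
                    while ((ch = pending.poll()) != null) {   // register hand-offs on this thread
                        ch.configureBlocking(false);
                        ch.register(selector, SelectionKey.OP_READ);
                    }
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (!key.isValid()) continue;
                        SocketChannel sc = (SocketChannel) key.channel();
                        try {
                            buf.clear();
                            if (sc.read(buf) == -1) { key.cancel(); sc.close(); continue; }
                            // echo-style reply; a real server must also handle partial writes
                            sc.write(ByteBuffer.wrap("ok".getBytes(StandardCharsets.UTF_8)));
                        } catch (IOException e) {
                            key.cancel();   // drop this connection only, keep the reactor alive
                            try { sc.close(); } catch (IOException ignored) {}
                        }
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();        // selector failure: this reactor dies
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int n = Runtime.getRuntime().availableProcessors();
        Worker[] workers = new Worker[n];
        for (int i = 0; i < n; i++) {
            workers[i] = new Worker();
            new Thread(workers[i], "reactor-" + i).start();
        }

        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(9090));  // same port as the example above
        for (long i = 0; ; i++) {                           // blocking accept on the main thread
            SocketChannel ch = server.accept();
            workers[(int) (i % n)].register(ch);            // round-robin hand-off
        }
    }
}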

Source: (StackOverflow)

Redis periodically stops responding under high load

I am using a simple Redis server setup to store some values in my PHP application. Yesterday I installed the phpredis module to use Redis as the PHP session backend, which increased the request rate on the Redis DB from 100 to 2000, and the DB size from 60 MB to 200 MB.

After this, Redis is unavailable on roughly every 10th request: it just does not respond. The log file does not show anything that could explain it.

I have more than 50% of memory free. Here are the resources used by Redis:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                            
31075 root      20   0  170m 161m  936 S   41  2.0  11:10.52 redis-server

What could be the cause of this? Maybe I should tweak some Redis settings for the higher load?

Here is my redis.conf:

# Redis configuration file example

# Note on units: when memory size is needed, it is possible to specify
# it in the usual form of 1k 5GB 4M and so forth:
#
# 1k => 1000 bytes
# 1kb => 1024 bytes
# 1m => 1000000 bytes
# 1mb => 1024*1024 bytes
# 1g => 1000000000 bytes
# 1gb => 1024*1024*1024 bytes
#
# units are case insensitive so 1GB 1Gb 1gB are all the same.

# By default Redis does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
daemonize no

# When running daemonized, Redis writes a pid file in /var/run/redis.pid by
# default. You can specify a custom pid file location here.
pidfile /var/run/redis.pid

# Accept connections on the specified port, default is 6379.
# If port 0 is specified Redis will not listen on a TCP socket.
port 6379

# If you want you can bind a single interface, if the bind option is not
# specified all the interfaces will listen for incoming connections.
#
# bind 127.0.0.1

# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /tmp/redis.sock

# Close the connection after a client is idle for N seconds (0 to disable)
timeout 300

# Set server verbosity to 'debug'
# it can be one of:
# debug (a lot of information, useful for development/testing)
# verbose (many rarely useful info, but not a mess like the debug level)
# notice (moderately verbose, what you want in production probably)
# warning (only very important / critical messages are logged)
loglevel debug

# Specify the log file name. Also 'stdout' can be used to force
# Redis to log on the standard output. Note that if you use standard
# output for logging but daemonize, logs will be sent to /dev/null
logfile /var/log/redis/redis.log

# To enable logging to the system logger, just set 'syslog-enabled' to yes,
# and optionally update the other syslog parameters to suit your needs.
# syslog-enabled no

# Specify the syslog identity.
# syslog-ident redis

# Specify the syslog facility.  Must be USER or between LOCAL0-LOCAL7.
# syslog-facility local0

# Set the number of databases. The default database is DB 0, you can select
# a different one on a per-connection basis using SELECT <dbid> where
# dbid is a number between 0 and 'databases'-1
databases 16

################################ SNAPSHOTTING  #################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving entirely by commenting out all the "save" lines.

save 900 1
save 300 10
save 60 10000

# Compress string objects using LZF when dumping .rdb databases?
# By default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
# 
# Also the Append Only File will be created inside this directory.
# 
# Note that you must specify a directory here, not a file name.
dir /backups/redisdumps

################################# REPLICATION #################################

# Master-Slave replication. Use slaveof to make a Redis instance a copy of
# another Redis server. Note that the configuration is local to the slave
# so for example it is possible to configure the slave to save the DB with a
# different interval, or to listen to another port, and so on.
#
# slaveof <masterip> <masterport>

# If the master is password protected (using the "requirepass" configuration
# directive below) it is possible to tell the slave to authenticate before
# starting the replication synchronization process, otherwise the master will
# refuse the slave request.
#
# masterauth <master-password>

# When a slave lost the connection with the master, or when the replication
# is still in progress, the slave can act in two different ways:
#
# 1) if slave-serve-stale-data is set to 'yes' (the default) the slave will
#    still reply to client requests, possibly with out-of-date data, or the
#    data set may just be empty if this is the first synchronization.
#
# 2) if slave-serve-stale-data is set to 'no' the slave will reply with
#    an error "SYNC with master in progress" to all kinds of commands
#    but to INFO and SLAVEOF.
#
slave-serve-stale-data yes

################################## SECURITY ###################################

# Require clients to issue AUTH <PASSWORD> before processing any other
# commands.  This might be useful in environments in which you do not trust
# others with access to the host running redis-server.
#
# This should stay commented out for backward compatibility and because most
# people do not need auth (e.g. they run their own servers).
# 
# Warning: since Redis is pretty fast an outside user can try up to
# 150k passwords per second against a good box. This means that you should
# use a very strong password otherwise it will be very easy to break.
#
# requirepass foobared

# Command renaming.
#
# It is possible to change the name of dangerous commands in a shared
# environment. For instance the CONFIG command may be renamed into something
# hard to guess so that it will still be available for internal-use
# tools but not available for general clients.
#
# Example:
#
# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
#
# It is also possible to completely kill a command by renaming it into
# an empty string:
#
# rename-command CONFIG ""

################################### LIMITS ####################################

# Set the max number of connected clients at the same time. By default there
# is no limit, and it's up to the number of file descriptors the Redis process
# is able to open. The special value '0' means no limits.
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
#
# maxclients 128

# Don't use more memory than the specified amount of bytes.
# When the memory limit is reached Redis will try to remove keys with an
# EXPIRE set. It will try to start freeing keys that are going to expire
# in little time and preserve keys with a longer time to live.
# Redis will also try to remove objects from free lists if possible.
#
# If all this fails, Redis will start to reply with errors to commands
# that will use more memory, like SET, LPUSH, and so on, and will continue
# to reply to most read-only commands like GET.
#
# WARNING: maxmemory can be a good idea mainly if you want to use Redis as a
# 'state' server or cache, not as a real DB. When Redis is used as a real
# database the memory usage will grow over the weeks, it will be obvious if
# it is going to use too much memory in the long run, and you'll have the time
# to upgrade. With maxmemory after the limit is reached you'll start to get
# errors for write operations, and this may even lead to DB inconsistency.
#
# maxmemory <bytes>

# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached. You can select among the following behaviors:
# 
# volatile-lru -> remove the key with an expire set using an LRU algorithm
# allkeys-lru -> remove any key accordingly to the LRU algorithm
# volatile-random -> remove a random key with an expire set
# allkeys->random -> remove a random key, any key
# volatile-ttl -> remove the key with the nearest expire time (minor TTL)
# noeviction -> don't expire at all, just return an error on write operations
# 
# Note: with all of these policies, Redis will return an error on write
#       operations when there are no suitable keys for eviction.
#
#       At the date of writing this commands are: set setnx setex append
#       incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
#       sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
#       zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
#       getset mset msetnx exec sort
#
# The default is:
#
# maxmemory-policy volatile-lru

# LRU and minimal TTL algorithms are not precise algorithms but approximated
# algorithms (in order to save memory), so you can select as well the sample
# size to check. For instance by default Redis will check three keys and
# pick the one that was used least recently; you can change the sample size
# using the following configuration directive.
#
# maxmemory-samples 3

############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. If you can live
# with the idea that the latest records will be lost if something like a crash
# happens this is the preferred way to run Redis. If instead you care a lot
# about your data and don't want a single record to get lost you should
# enable the append only mode: when this mode is enabled Redis will append
# every write operation received in the file appendonly.aof. This file will
# be read on startup in order to rebuild the full dataset in memory.
#
# Note that you can have both the async dumps and the append only file if you
# like (you have to comment the "save" statements above to disable the dumps).
# Still if append only mode is enabled Redis will load the data from the
# log file at startup ignoring the dump.rdb file.
#
# IMPORTANT: Check the BGREWRITEAOF to check how to rewrite the append
# log file in background when it gets too big.

appendonly no

# The name of the append only file (default: "appendonly.aof")
# appendfilename appendonly.aof

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OSes will really
# flush data on disk, while others will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log . Slow, Safest.
# everysec: fsync only if one second passed since the last fsync. Compromise.
#
# The default is "everysec" that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving the durability of Redis is
# the same as "appendfsync none", that in pratical terms means that it is
# possible to lost up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
# 
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no

################################## SLOW LOG ###################################

# The Redis Slow Log is a system to log queries that exceeded a specified
# execution time. The execution time does not include the I/O operations
# like talking with the client, sending the reply and so forth,
# but just the time needed to actually execute the command (this is the only
# stage of command execution where the thread is blocked and can not serve
# other requests in the meantime).
# 
# You can configure the slow log with two parameters: one tells Redis
# what is the execution time, in microseconds, to exceed in order for the
# command to get logged, and the other parameter is the length of the
# slow log. When a new command is logged the oldest one is removed from the
# queue of logged commands.

# The following time is expressed in microseconds, so 1000000 is equivalent
# to one second. Note that a negative number disables the slow log, while
# a value of zero forces the logging of every command.
slowlog-log-slower-than 10000

# There is no limit to this length. Just be aware that it will consume memory.
# You can reclaim memory used by the slow log with SLOWLOG RESET.
slowlog-max-len 1024

################################ VIRTUAL MEMORY ###############################

### WARNING! Virtual Memory is deprecated in Redis 2.4
### The use of Virtual Memory is strongly discouraged.

# Virtual Memory allows Redis to work with datasets bigger than the actual
# amount of RAM needed to hold the whole dataset in memory.
# In order to do so heavily used keys are kept in memory while the other keys
# are swapped into a swap file, similarly to what operating systems do
# with memory pages.
#
# To enable VM just set 'vm-enabled' to yes, and set the following three
# VM parameters accordingly to your needs.

vm-enabled no
# vm-enabled yes

# This is the path of the Redis swap file. As you can guess, swap files
# can't be shared by different Redis instances, so make sure to use a swap
# file for every redis process you are running. Redis will complain if the
# swap file is already in use.
#
# The best kind of storage for the Redis swap file (that's accessed at random) 
# is a Solid State Disk (SSD).
#
# *** WARNING *** if you are using a shared hosting the default of putting
# the swap file under /tmp is not secure. Create a dir with access granted
# only to Redis user and configure Redis to create the swap file there.
vm-swap-file /tmp/redis.swap

# vm-max-memory configures the VM to use at max the specified amount of
# RAM. Everything that does not fit will be swapped on disk *if* possible, that
# is, if there is still enough contiguous space in the swap file.
#
# With vm-max-memory 0 the system will swap everything it can. Not a good
# default, just specify the max amount of RAM you can in bytes, but it's
# better to leave some margin. For instance specify an amount of RAM
# that's more or less between 60 and 80% of your free RAM.
vm-max-memory 0

# The Redis swap file is split into pages. An object can be saved using multiple
# contiguous pages, but pages can't be shared between different objects.
# So if your page is too big, small objects swapped out on disk will waste
# a lot of space. If your page is too small, there is less space in the swap
# file (assuming you configured the same number of total swap file pages).
#
# If you use a lot of small objects, use a page size of 64 or 32 bytes.
# If you use a lot of big objects, use a bigger page size.
# If unsure, use the default :)
vm-page-size 32

# Number of total memory pages in the swap file.
# Given that the page table (a bitmap of free/used pages) is taken in memory,
# every 8 pages on disk will consume 1 byte of RAM.
#
# The total swap size is vm-page-size * vm-pages
#
# With the default of 32-bytes memory pages and 134217728 pages Redis will
# use a 4 GB swap file, that will use 16 MB of RAM for the page table.
#
# It's better to use the smallest acceptable value for your application,
# but the default is large in order to work in most conditions.
vm-pages 134217728

# Max number of VM I/O threads running at the same time.
# These threads are used to read/write data from/to the swap file; since they
# also encode and decode objects from disk to memory or the reverse, a bigger
# number of threads can help with big objects even if they can't help with
# I/O itself, as the physical device may not be able to cope with many
# read/write operations at the same time.
#
# The special value of 0 turns off threaded I/O and enables the blocking
# Virtual Memory implementation.
vm-max-threads 4

############################### ADVANCED CONFIG ###############################

# Hashes are encoded in a special way (much more memory efficient) when they
# have at most a given number of elements, and the biggest element does not
# exceed a given threshold. You can configure these limits with the following
# configuration directives.
hash-max-zipmap-entries 512
hash-max-zipmap-value 64

# Similarly to hashes, small lists are also encoded in a special way in order
# to save a lot of space. The special representation is only used when
# you are under the following limits:
list-max-ziplist-entries 512
list-max-ziplist-value 64

# Sets have a special encoding in just one case: when a set is composed
# of just strings that happens to be integers in radix 10 in the range
# of 64 bit signed integers.
# The following configuration setting sets the limit in the size of the
# set in order to use this special memory saving encoding.
set-max-intset-entries 512

# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
# order to help rehashing the main Redis hash table (the one mapping top-level
# keys to values). The hash table implementation redis uses (see dict.c)
# performs a lazy rehashing: the more operations you run against a hash table
# that is rehashing, the more rehashing "steps" are performed, so if the
# server is idle the rehashing is never complete and some more memory is used
# by the hash table.
# 
# The default is to use this millisecond 10 times every second in order to
# actively rehash the main dictionaries, freeing memory when possible.
#
# If unsure:
# use "activerehashing no" if you have hard latency requirements and it is
# not a good thing in your environment that Redis can reply from time to time
# to queries with 2 milliseconds delay.
#
# use "activerehashing yes" if you don't have such hard requirements but
# want to free memory asap when possible.
activerehashing yes

################################## INCLUDES ###################################

# Include one or more other config files here.  This is useful if you
# have a standard template that goes to all Redis servers but also need
# to customize a few per-server settings.  Include files can include
# other files, so use this wisely.
#
# include /path/to/local.conf
# include /path/to/other.conf

Source: (StackOverflow)

What would you recommend for a high traffic ajax intensive website?

For a website like reddit, with lots of up/down votes and lots of comments per topic, what should I go with?

Lighttpd/Php or Lighttpd/CherryPy/Genshi/SQLAlchemy?

And for the database, what would scale better / be fastest: MySQL (4.1 or 5?) or PostgreSQL?


Source: (StackOverflow)

How can I implement a simple WHOIS proxy in Perl?

I have several WHOIS servers for which I want to have a single proxy. The proxy should forward requests to the appropriate servers based on the data in the query. How should I approach this problem?
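
One way to approach it, sketched here in Java for concreteness (the same structure maps directly onto Perl's IO::Socket): read the single-line WHOIS query, pick an upstream server from the query text, and stream the upstream's reply back. The routing rule, upstream hostnames, and listening port below are hypothetical placeholders:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;

public class WhoisProxy {
    public static void main(String[] args) throws IOException {
        ServerSocket listener = new ServerSocket(4343); // WHOIS is port 43; unprivileged port used here
        while (true) {
            try (Socket client = listener.accept()) {
                BufferedReader fromClient = new BufferedReader(
                        new InputStreamReader(client.getInputStream(), "US-ASCII"));
                String query = fromClient.readLine();   // a WHOIS query is one CRLF-terminated line
                if (query == null) continue;

                // Hypothetical routing rule: pick the upstream registry by suffix.
                String upstream = query.trim().endsWith(".org")
                        ? "whois.pir.org" : "whois.verisign-grs.com";

                try (Socket server = new Socket(upstream, 43)) {
                    server.getOutputStream().write((query + "\r\n").getBytes("US-ASCII"));
                    server.getOutputStream().flush();
                    // Stream the registry's answer back until it closes the connection.
                    byte[] buf = new byte[4096];
                    int n;
                    InputStream resp = server.getInputStream();
                    while ((n = resp.read(buf)) != -1) client.getOutputStream().write(buf, 0, n);
                }
            } catch (IOException e) {
                e.printStackTrace();                    // keep serving other clients
            }
        }
    }
}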


Source: (StackOverflow)

High CPU Usage in php-fpm

We have a very strange problem on our web project.

We use:

  • 2 × Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
  • 12 GB memory
  • About 20 hits per second, of which 4-5 requests per second are heavy search requests
  • nginx + php-fpm (5.3.22)
  • MySQL server installed on another machine

Most of the time we have a load average below 10 and CPU usage of about 50%.
Sometimes we get CPU usage of about 95%, and after that the load average grows to 50 and more!!!
You can see Load Average and CPU Usage here (my reputation is too low to post images here):
Load Average
CPU Usage

We have to reload php-fpm (/etc/init.d/php-fpm reload) to normalize the situation.
This can happen 4-5 times per day.
I tried using strace to examine the situation.
Sorry for the long logs! Below is the output of the command strace -cp PID,
where PID is a random php-fpm process id (we start 100 php-fpm processes).
These two results were captured at a moment of high CPU usage.

Process 17272 attached - interrupt to quit
Process 17272 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 65.56    0.008817         267        33           munmap
 13.38    0.001799         900         2           clone
  9.66    0.001299           2       589           read
  7.43    0.000999         125         8           mremap
  2.84    0.000382           1       559        96 access
  0.59    0.000080          40         2           waitpid
  0.29    0.000039           0       627           gettimeofday
  0.16    0.000022           0       346           write
  0.04    0.000006           0        56           getcwd
  0.04    0.000005           0       348           poll
  0.00    0.000000           0        55           open
  0.00    0.000000           0        69           close
  0.00    0.000000           0        17           chdir
  0.00    0.000000           0       189           time
  0.00    0.000000           0        28           lseek
  0.00    0.000000           0         2           pipe
  0.00    0.000000           0        17           times
  0.00    0.000000           0         8           brk
  0.00    0.000000           0         8           getrusage
  0.00    0.000000           0        18           setitimer
  0.00    0.000000           0         8           flock
  0.00    0.000000           0         1           nanosleep
  0.00    0.000000           0        11           rt_sigaction
  0.00    0.000000           0        13           rt_sigprocmask
  0.00    0.000000           0         6           pread64
  0.00    0.000000           0         7           pwrite64
  0.00    0.000000           0        33           mmap2
  0.00    0.000000           0        18         4 stat64
  0.00    0.000000           0        34           lstat64
  0.00    0.000000           0        92           fstat64
  0.00    0.000000           0        63           fcntl64
  0.00    0.000000           0        53           clock_gettime
  0.00    0.000000           0         1           socket
  0.00    0.000000           0         1         1 connect
  0.00    0.000000           0         9           accept
  0.00    0.000000           0         1           send
  0.00    0.000000           0        21           recv
  0.00    0.000000           0         9         1 shutdown
  0.00    0.000000           0         1           getsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    0.013448                  3363       102 total



[root@hp-php ~]# strace -cp 30767
Process 30767 attached - interrupt to quit
Process 30767 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 52.88    0.016926         220        77           munmap
 29.06    0.009301           2      4343           read
  8.73    0.002794         466         6           clone
  3.59    0.001149           0      5598           time
  3.18    0.001017           0      3745           write
  1.12    0.000358           0      7316           gettimeofday
  0.64    0.000205           1       164           fcntl64
  0.39    0.000124          21         6           waitpid
  0.22    0.000070           0      1496       326 access
  0.13    0.000041           0      3769           poll
  0.03    0.000009           0       151           close
  0.02    0.000008           0       114           clock_gettime
  0.02    0.000007           0       110           getcwd
  0.00    0.000000           0       112           open
  0.00    0.000000           0        38           chdir
  0.00    0.000000           0        47           lseek
  0.00    0.000000           0         6           pipe
  0.00    0.000000           0        38           times
  0.00    0.000000           0       135           brk
  0.00    0.000000           0         3           ioctl
  0.00    0.000000           0        14           getrusage
  0.00    0.000000           0        38           setitimer
  0.00    0.000000           0        19           flock
  0.00    0.000000           0        40           mlock
  0.00    0.000000           0        40           munlock
  0.00    0.000000           0         6           nanosleep
  0.00    0.000000           0        27           rt_sigaction
  0.00    0.000000           0        31           rt_sigprocmask
  0.00    0.000000           0        13           pread64
  0.00    0.000000           0        18           pwrite64
  0.00    0.000000           0        78           mmap2
  0.00    0.000000           0       111        10 stat64
  0.00    0.000000           0        49           lstat64
  0.00    0.000000           0       182           fstat64
  0.00    0.000000           0         8           socket
  0.00    0.000000           0         8         5 connect
  0.00    0.000000           0        19           accept
  0.00    0.000000           0         7           send
  0.00    0.000000           0        66           recv
  0.00    0.000000           0         3           recvfrom
  0.00    0.000000           0        20         1 shutdown
  0.00    0.000000           0         5           setsockopt
  0.00    0.000000           0         4           getsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    0.032009                 28080       342 total

Yes, our scripts read a lot of data; this is normal.
But why does munmap take so long??!! And when we have the problem, munmap is ALWAYS at the top!
For comparison, here is the result of tracing a random php-fpm process in a regular situation:

Process 28606 attached - interrupt to quit
Process 28606 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.72    0.001816           1      2601           read
 32.88    0.001306         435         3           clone
  9.19    0.000365           0      2175           write
  6.95    0.000276           0      7521           time
  2.24    0.000089           0      4158           gettimeofday
  2.01    0.000080           1       114           brk
  0.28    0.000011           0      2166           poll
  0.20    0.000008           0       833       155 access
  0.20    0.000008           0        53           recv
  0.18    0.000007           2         3           waitpid
  0.15    0.000006           0        18           munlock
  0.00    0.000000           0        69           open
  0.00    0.000000           0        96           close
  0.00    0.000000           0        29           chdir
  0.00    0.000000           0        36           lseek
  0.00    0.000000           0         3           pipe
  0.00    0.000000           0        29           times
  0.00    0.000000           0        10           getrusage
  0.00    0.000000           0         5           munmap
  0.00    0.000000           0         1           ftruncate
  0.00    0.000000           0        29           setitimer
  0.00    0.000000           0         1           sigreturn
  0.00    0.000000           0        11           flock
  0.00    0.000000           0        18           mlock
  0.00    0.000000           0         5           nanosleep
  0.00    0.000000           0        19           rt_sigaction
  0.00    0.000000           0        24           rt_sigprocmask
  0.00    0.000000           0         6           pread64
  0.00    0.000000           0        12           pwrite64
  0.00    0.000000           0        69           getcwd
  0.00    0.000000           0         5           mmap2
  0.00    0.000000           0        35         7 stat64
  0.00    0.000000           0        41           lstat64
  0.00    0.000000           0        96           fstat64
  0.00    0.000000           0       108           fcntl64
  0.00    0.000000           0        87           clock_gettime
  0.00    0.000000           0         5           socket
  0.00    0.000000           0         4         4 connect
  0.00    0.000000           0        16         2 accept
  0.00    0.000000           0         8           send
  0.00    0.000000           0        15           shutdown
  0.00    0.000000           0         4           getsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    0.003972                 20541       168 total

Process 29168 attached - interrupt to quit
Process 29168 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.81    0.002366           1      1717           read
 26.41    0.001140           1      1696           poll
  8.29    0.000358           0      1662           write
  7.37    0.000318           2       131       121 stat64
  1.53    0.000066           0      3249           gettimeofday
  1.18    0.000051           0       746       525 access
  0.23    0.000010           0        27           fcntl64
  0.19    0.000008           0        62           brk
  0.00    0.000000           0         1           restart_syscall
  0.00    0.000000           0         7           open
  0.00    0.000000           0        16           close
  0.00    0.000000           0         3           chdir
  0.00    0.000000           0      1039           time
  0.00    0.000000           0         1           lseek
  0.00    0.000000           0         3           times
  0.00    0.000000           0         3           ioctl
  0.00    0.000000           0         1           getrusage
  0.00    0.000000           0         4           munmap
  0.00    0.000000           0         3           setitimer
  0.00    0.000000           0         1           sigreturn
  0.00    0.000000           0         1           flock
  0.00    0.000000           0         1           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         2           pwrite64
  0.00    0.000000           0         3           getcwd
  0.00    0.000000           0         4           mmap2
  0.00    0.000000           0         7           fstat64
  0.00    0.000000           0         9           clock_gettime
  0.00    0.000000           0         6           socket
  0.00    0.000000           0         5         1 connect
  0.00    0.000000           0         3         2 accept
  0.00    0.000000           0         5           send
  0.00    0.000000           0        64           recv
  0.00    0.000000           0         3           recvfrom
  0.00    0.000000           0         2           shutdown
  0.00    0.000000           0         1           getsockopt
------ ----------- ----------- --------- --------- ----------------
100.00    0.004317                 10489       649 total

And you can see that munmap is not at the top.
Now we have no ideas left for how to solve this problem :(
We examined the following potential causes, and the answer to each is "NO":

  • additional user activity
  • long script execution (several seconds)
  • swap usage

Can you help us?


Source: (StackOverflow)

Check whether an mmap'ed address is correct

I'm writing a high-load daemon that should run on FreeBSD 8.0 as well as on Linux. The daemon's main purpose is to serve files that are requested by an identifier. The identifier is converted into a local filename/file size via a request to a DB, and then I use sequential mmap() calls to send file blocks with send().

However, sometimes there is a mismatch between the file size in the DB and the file size on the filesystem (real size < size in the DB). In this situation I have already sent all the real data blocks, and when the next data block is mapped, mmap returns no error, just a usual address (I also checked the errno variable; it is zero after mmap). But when the daemon tries to send this block, it gets a segmentation fault. (This behaviour is reliably reproduced on FreeBSD 8.0 amd64.)

I was doing a safety check before open, verifying the size with a stat() call. However, real life shows that the segfault can still be raised in rare situations.

So, my question is: is there a way to check whether a pointer is accessible before dereferencing it? When I opened the core in gdb, gdb said that the given address is out of bounds. Perhaps there is another solution somebody can propose.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define FILENAME        "./datafile"

int main()
{
    unsigned long i, j;

    srand(time(NULL));
    unsigned long pagesize = sysconf(_SC_PAGESIZE);

    unsigned long basesize = 4 * pagesize;
    unsigned long cropsize = 2 * pagesize;

    // create 4*pagesize sized file
    int f = creat(FILENAME, 0644);
    for (i = 0; i < basesize; i++) {
        unsigned char c = (unsigned char)rand();
        if (write(f, &c, 1) < 1) { perror("write"); break; }
    }
    close(f);

    f = open(FILENAME, O_RDONLY);

    // walk through the file
    unsigned char xor = 0;
    unsigned long offset = 0;
    for (j = 0; j < 4; j++) {
        // trunc file to 2*pagesize
        if (j == 2) truncate(FILENAME, cropsize);

        char *data = mmap(NULL, pagesize, PROT_READ, MAP_PRIVATE, f, offset);
        if (data == MAP_FAILED) { perror("mmap"); break; }
        printf("mmap: %lu@%lu for %i\n", pagesize, offset, f);

        for (i = 0; i < pagesize; i++) xor ^= data[i];

        offset += pagesize;
    }

    printf("xor: %02x\n", (unsigned) xor);  // use the accumulated value so the reads aren't optimized away

    close(f);

    return 0;
}

Source: (StackOverflow)

Hosting a high-traffic Facebook app (game)

We are currently developing a high-traffic Facebook application. All the traffic will come within one month, in which 500,000 to 1,000,000 users are expected. After that month the game is over and we have a winner, so the app will be archived.

We are currently planning to develop the application with Ruby on Rails, and we are searching for hosting options that can deal with the traffic. The problem is not so much the number of users as the peak values: we will have around 500,000 requests coming in daily within a short timeframe (let's say within 3 minutes in the worst case).

We are expecting 500,000 to 1,000,000 users of the application, with a peak at 1:00 pm (timezone GMT+1), when most of the users (up to 80%) will send most of the requests. The requests run from the 11th of June to the 11th of July; after that, the app/game is closed/over.

We are currently developing an aggressive caching mechanism; at the moment we are thinking about 2 or 3 small apps/webservices that will handle the load.

The load is distributed as follows: a) main application, cached data (11 screens, 200k each); b) voting, every day until 1:00 pm (timezone GMT+1), where every user votes with about 10k of data sent and high concurrent peak values!

Questions:

  • is there any specific application setup that is recommendable?
  • are there any hosting partners that can be recommended?

Thanks!


Source: (StackOverflow)

Highly scalable tags on Google App Engine (Python)

I have a lot of (for example) posts that are marked with one or more tags. Posts can be created or deleted, and a user can also make a search request for one or more tags (combined with logical AND). The first idea that came to my mind was a simple model:

class Post(db.Model):
  #blahblah
  tags = db.StringListProperty()

Implementing the create and delete operations is obvious. Search is more complex: to search for N tags it would do N GQL queries like "SELECT * FROM Post WHERE tags = :1" and merge the results using cursors, which has terrible performance.

The second idea is to separate the tags into different entities:

class Post(db.Model):
    #blahblah
    tags = db.ListProperty(db.Key) # For fast access

class Tag(db.Model):
    name = db.StringProperty(name="key")
    posts = db.ListProperty(db.Key) # List of posts that marked with tag

This takes Tags from the DB by key (much faster than fetching them with GQL) and merges them in memory. I think this implementation performs better than the first one, but a very frequently used tag can exceed the maximum size allowed for a single datastore object. There is another problem as well: the datastore can modify a single object only about once per second, so for frequently used tags we also have a bottleneck in modification latency.

Any suggestions?


Source: (StackOverflow)

Patterns and technologies for a system capable of processing 40,000 messages per second

We need to build a system capable of processing 40,000 messages per second. No messages can be lost in case of any software or hardware failures.

Each message size is about 2-4Kb.

Processing of a message consists of validating the message, doing some simple arithmetical calculations, saving result to database and (sometimes) sending notifications to other systems.

Preferable software technology is .Net.

What software and hardware patterns are the most suitable for such a task?

How much hardware will it require?
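
One widely used shape for this class of requirement, sketched below in Java (the structure maps directly onto .NET's BlockingCollection): acknowledge a message only once it is enqueued, then drain the queue in batches so the database sees a few large transactions instead of 40,000 small ones. The in-memory queue and commit method here are stand-ins; a real deployment would use a durable queue (e.g. MSMQ or a replicated log) to satisfy the no-loss requirement.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BatchingPipeline {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(100000);

    // Called by receiver threads after validation and the arithmetic step.
    public void accept(String message) throws InterruptedException {
        queue.put(message);            // blocks when full: back-pressure instead of message loss
    }

    // Run on a dedicated writer thread.
    public void runWriter() throws InterruptedException {
        List<String> batch = new ArrayList<String>(1000);
        while (true) {
            batch.add(queue.take());   // wait for at least one message
            queue.drainTo(batch, 999); // then grab up to a batch's worth without blocking
            commit(batch);             // one DB transaction for the whole batch
            batch.clear();
        }
    }

    private void commit(List<String> batch) {
        // Stand-in for a single transaction inserting all rows and firing notifications.
        System.out.println("committed " + batch.size() + " messages");
    }
}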


Source: (StackOverflow)

How can I make nginx return a static response and send request headers to app?

I am making a high-load web statistics system by embedding an <img> tag in sites. What I want to happen is:

  1. nginx gets a request for an image from some host
  2. it answers the host with a little 1px static image from the filesystem
  3. at the same time, it somehow passes the request's headers to the application and closes the connection to the host

I am working with Ruby and I'm going to make a pure-Rack app to get the headers and put them into a queue for further calculations.

The problem I can't solve is: how can I configure nginx to hand the headers to the Rack app, and return the static image as the reply without waiting for a response from the Rack application?
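
One direction worth exploring (a sketch, not verified against any particular nginx version): the stock ngx_http_empty_gif_module can serve a 1x1 transparent GIF straight from memory, and the undocumented post_action directive can fire a subrequest to the backend after the main response has been sent. The backend address and header names below are placeholders:

location /track.gif {
    empty_gif;              # serve a 1x1 transparent GIF straight from memory
    post_action @notify;    # fire a subrequest after the main response completes
}

location @notify {
    proxy_pass http://127.0.0.1:9292;  # placeholder: the Rack app
    proxy_pass_request_headers on;     # forward the original request headers
    proxy_set_header X-Original-URI $request_uri;
    proxy_set_header X-Real-IP $remote_addr;
}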

Also, Rack is not required if there is a more common Ruby solution.


Source: (StackOverflow)

Is Google App Engine a good platform for a high-traffic chat website?

I'm looking to create a high-traffic chat website, possibly with video streaming and some image manipulation happening on the server.

Scanning over the Channel API (http://code.google.com/appengine/docs/python/channel/overview.html) has made me hopeful this can be done without AJAX polling, and the general opinion is that GAE is very scalable.

I still have a few concerns:

1) Can it support tens of thousands of simultaneous users that interact with each other in real time without lagging? Is there a cap on CPU usage?

2) I'll (probably) be writing it on top of the J2EE framework. Does GAE guarantee that each new request will have access to a global in-memory datastore that will be available as long as the application is running on the server ("ServletContext" in Java-speak) and will be storing possibly gigabytes of data? Is there a memory cap?

3) Will the full J2SE and J2EE stack be available? Will I be able to include just any library I wish?

4) Are there better solutions for this kind of issue than GAE? I've been thinking about renting several dedicated servers, but this will go into the thousands/month...

Thanks in advance!


Source: (StackOverflow)

Better Practice: Placing the load on SQL or Web server?

I'm the webmaster for a major US university. We have a great deal of requests on our website, which I've built and been in charge of for the last 7 years or so. I've been building ever-more-complex features into our website, and it's always been my practice to put as much of the programming burden as possible on our multi-processor Microsoft SQL server, using stored procedures, views, etc., and to fill in what can't be done there with PHP, ASP, or Perl on the IIS web server. Both servers are very powerful and capable machines. Since I've been doing this alone for so long, without anyone else to brainstorm with, I'm curious whether my approach is ideal for the even higher load situations we'll have in the future.

My question is: is it better practice to place more of the load on the SQL server, using nested SELECT statements, views, stored procedures, and aggregate functions, or should I be pulling multiple simpler queries and processing them in server-side scripts like PHP? Keep on keepin' on, or come up with a better way?

I've recently become more interested in performance after I ran some load traces and learned just how much I've been putting on the shoulders of the SQL server. Both the web server and the SQL server are fast and responsive throughout the day, almost without regard for how much I put on them, but I'd like to be ready, with myself trained and my existing code upgraded with best practices in mind, by the time it becomes important.

Thanks for your advice and input.


Source: (StackOverflow)

How do digg (or other high-load websites) store user sessions?

How does digg or any other high-traffic website store user sessions? What do they use for session storage: the file system, a DB (which one?), memcache, or both?

Let's imagine a simple situation: a logged-in user has set the "Remember me" flag during login, and we've set a session cookie with an expiration date of 1 year. Say we keep the session in memcache, but (in my version) we should also keep a record of this session in the DB; only users with the "Remember me" flag are stored in the DB. Is this a right way of storing sessions? I mean for high-traffic websites, of course (with 2 or more application servers, 2 or more databases, memcache servers, etc.). For small websites, storing sessions the default way (in the file system) is OK.
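
A minimal sketch of that look-aside scheme (in Java; plain maps stand in for the memcache client and the sessions table, and the string payload is a placeholder for real session data):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Look-aside session lookup: try the cache first, fall back to the durable
// store only for "remember me" sessions, and repopulate the cache on a hit.
public class SessionStore {
    // Stand-in for a memcache client; a real deployment would use one.
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();
    // Stand-in for the DB table holding only "remember me" sessions.
    private final Map<String, String> db = new ConcurrentHashMap<String, String>();

    public String find(String sessionId) {
        String session = cache.get(sessionId);   // fast path: memcache
        if (session == null) {
            session = db.get(sessionId);          // slow path: only long-lived sessions live here
            if (session != null) {
                cache.put(sessionId, session);    // repopulate the cache for later requests
            }
        }
        return session;                           // null means: force a fresh login
    }

    public void save(String sessionId, String data, boolean rememberMe) {
        cache.put(sessionId, data);
        if (rememberMe) {
            db.put(sessionId, data);              // survives cache eviction and restarts
        }
    }
}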

I've tried searching Google, but failed to find any information about it. I've read some solutions from the "Advanced PHP Programming" book, but the main emphasis was on customizing the session storage handler.

Really hope to hear good ideas or links!

Thank you.


Source: (StackOverflow)