Debugging jenkins

July 31st, 2019 by

For several weeks, our busy internal Jenkins instance suffered strange, near-daily outages.

To get to the root cause of the issue, we enabled remote debugging with

-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=ci.suse.de -Dcom.sun.management.jmxremote.password.file=/var/lib/jenkins/jmxremote.password

and attached VisualVM to see what it was doing.
This showed both the thread count and the memory usage following a sawtooth pattern. Every time the garbage collector ran, the thread count dropped by 500-1000.
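
The same kind of thread count can also be sampled without JMX. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a Threads: field. The function names and the sample text are illustrative and not taken from our setup:

```python
def parse_thread_count(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def read_thread_count(pid: int) -> int:
    """Read the current number of native threads of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_thread_count(status_file.read())


# Illustrative excerpt, not real output from the Jenkins host.
sample = "Name:\tjava\nState:\tS (sleeping)\nThreads:\t2018\n"
print(parse_thread_count(sample))
```

This counts all native threads of the process, including JVM-internal ones, so it will typically read somewhat higher than the live-thread figure shown in VisualVM.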

Today we noticed that every time Jenkins threw a java.lang.OutOfMemoryError: unable to create new native thread error, the peak thread count was 2018, suspiciously close to 2048. Searching the journalctl output for the same timestamps showed
kernel: cgroup: fork rejected by pids controller in /system.slice/jenkins.service

So the task limit that systemd had set on the service's cgroup made the kernel refuse Java’s requests for new threads, and Jenkins did not handle that gracefully in all cases.
That was easily avoided with a

After that, the peak rose to 4890 live threads, and Jenkins served all Geekos happily ever after.
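
The limit behind the "fork rejected by pids controller" message can be compared with the current task count by reading the pids controller files of the service's cgroup. The following is a rough sketch, assuming the cgroup directory contains the standard pids.current and pids.max files. The helper name pids_headroom is hypothetical, and the directory path in the comment is an assumption that depends on the cgroup version and mount point:

```python
from pathlib import Path
from typing import Optional


def pids_headroom(cgroup_dir: str) -> Optional[int]:
    """Return how many more tasks the cgroup may create, or None if unlimited."""
    base = Path(cgroup_dir)
    current = int((base / "pids.current").read_text().strip())
    limit_text = (base / "pids.max").read_text().strip()
    if limit_text == "max":  # the kernel reports "max" when no limit is set
        return None
    return int(limit_text) - current


# Example call; the path is an assumption (cgroup v1 layout shown):
# pids_headroom("/sys/fs/cgroup/pids/system.slice/jenkins.service")
```

Threads count as tasks for the pids controller, so every Java thread counts against this limit. On a systemd-managed service, systemctl show jenkins.service -p TasksCurrent -p TasksMax should report the same two values.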


One Response to “Debugging jenkins”

  1. Tony Su

    Thank you for posting about the problem, your steps to analyze, your conclusion and your solution.

    Your description and solution look suspiciously like a common required setting when deploying high-capacity Java apps in general, which would have nothing to do with systemd (maybe you’re only applying the new configuration in a unit file rather than in the app's native configuration file?). It’s fairly well known that the default number of supported threads is inadequate for high-capacity Java apps and needs to be increased.

    This is something that has to be set in all Java apps, including very large Logstash deployments if you’re running Elastic, and various other Java server apps.