CI – openSUSE Lizards https://lizards.opensuse.org Blogs and Ramblings of the openSUSE Members Fri, 06 Mar 2020 11:29:40 +0000 en-US hourly 1 https://wordpress.org/?v=4.7.5 Debugging jenkins https://lizards.opensuse.org/2019/07/31/debugging-jenkins/ https://lizards.opensuse.org/2019/07/31/debugging-jenkins/#comments Wed, 31 Jul 2019 08:04:23 +0000 http://lizards.opensuse.org/?p=13945 We had strange near-daily outages of our internal busy jenkins for some weeks.

To get to the root cause of the issue, we enabled remote debugging with

-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=ci.suse.de -Dcom.sun.management.jmxremote.password.file=/var/lib/jenkins/jmxremote.password

and attached visualvm to see what it was doing.
This showed the number of threads and memory usage in a sawtooth pattern. Every time the garbage collector ran, it dropped 500-1000 threads.

Today we noticed that every time it threw these java.lang.OutOfMemoryError: unable to create new native thread errors, the maximum number of threads was 2018… suspiciously close to 2048. Looking for the same time in journalctl showed
kernel: cgroup: fork rejected by pids controller in /system.slice/jenkins.service

So it was systemd refusing java’s request for a new thread and jenkins not handling that gracefully in all cases.
That was easily avoided with a
TasksMax=8192

Now the new peak was at 4890 live threads and jenkins served all Geekos happily ever after.

]]>
https://lizards.opensuse.org/2019/07/31/debugging-jenkins/feed/ 1