Weird issue with TW docker image and OpenJDK

Hi,

I'm using current TW to build a docker image available on Docker Hub, for running a software called 'BubbleUPnP Server':

https://hub.docker.com/repository/docker/bubblesoftapps/bubbleupnpserver
https://bubblesoftapps.com/bubbleupnpserver2/

This is a Java program, thus the docker image includes the current java-11-openjdk-11.0.14 package. I built this image 2 days ago to update an image built a few months ago. It is based on TW 20220206.

This image generally works fine, but I got 2 users reporting the JVM failing to start with a very unusual failure. The first user has a QNAP NAS running an unspecified Linux kernel and Docker version. The second user runs a Debian 10 system on an x64 PC, which gives this failure:

root@microserver:/srv/docker/bubbleupnp# docker run --rm --net=host bubblesoftapps/bubbleupnpserver
Unable to find image 'bubblesoftapps/bubbleupnpserver:latest' locally
latest: Pulling from bubblesoftapps/bubbleupnpserver
d7be85cf8653: Pull complete
9a5818aa0147: Pull complete
Digest: sha256:00446df066fbb3bdfe60c6f4c8536bd943c7a4f6714e1ab87f7fdc5519b2b5da
Status: Downloaded newer image for bubblesoftapps/bubbleupnpserver:latest
[0.007s][warning][os,thread] Failed to start thread - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create worker GC thread. Out of system resources.
# An error report file with more information is saved as:
# /opt/bubbleupnpserver/hs_err_pid1.log

So pthread_create is failing super early for a mysterious reason. After checking various things about his docker install in depth with that user (in particular process limits and RAM limits), everything looks fine. Moreover, the previous version of this container, based on a TW snapshot from a few months ago with glibc 2.33 and OpenJDK 11.0.11, worked just fine. He also confirmed that an alternate version of the container using Eclipse OpenJ9 as the JVM, also built on an older TW snapshot, works fine.

So I'm really puzzled about this crash. Something in this image causes the crash above, but only on some Linux/docker distros. The image works fine on my TW development machine as well as on an Ubuntu 18.04 server. I'm out of ideas as to where the problem could be: in the java-11-openjdk package? Something glibc related?

It turns out that running the problematic container with --privileged fixed it, allowing the JVM to start. Then the reporting user realized he had an old version of Docker (19.03.12~3-0~debian-buster), and updating it to 20.10.12~3-0~debian-buster fixed the issue, so --privileged is no longer required. Conclusion: on some systems running older Docker (possibly Debian-specific), running Java 11.0.14 requires --privileged for the JVM to start, while previous versions of Java (and possibly TW snapshots) did not.
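For anyone wanting to triage user reports, the Docker versions mentioned in this thread can be compared mechanically. The following is only a sketch: the function names docker_version_tuple and classify are made up for illustration, and the thresholds are just the data points reported here (18.09.1 and 19.03.12 failing, 20.10.12 working), not an official compatibility list. Versions in between are deliberately reported as unknown.

```python
import re

# Data points reported in this thread only; not an official compatibility list.
REPORTED_FAILING = [(18, 9, 1), (19, 3, 12)]
REPORTED_WORKING = [(20, 10, 12)]


def docker_version_tuple(version: str) -> tuple:
    """Extract the leading X.Y.Z part of a Docker package version string.

    An optional Debian epoch prefix such as "5:" is skipped.
    """
    match = re.match(r"(?:\d+:)?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Docker version: {version!r}")
    return tuple(int(part) for part in match.groups())


def classify(version: str) -> str:
    """Place a version relative to the versions reported in the thread."""
    parsed = docker_version_tuple(version)
    if parsed <= max(REPORTED_FAILING):
        return "likely affected"
    if parsed >= min(REPORTED_WORKING):
        return "likely fine"
    # Nothing in the thread covers the range in between.
    return "unknown"


if __name__ == "__main__":
    for candidate in ("18.09.1+dfsg1-7.1+deb10u3",
                      "19.03.12~3-0~debian-buster",
                      "20.10.12~3-0~debian-buster"):
        print(candidate, "->", classify(candidate))
```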

On Feb 10 2022, Michael Pujos wrote:
[0.007s][warning][os,thread] Failed to start thread - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
EPERM looks like a bad syscall filter.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
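One way to check the syscall-filter theory from inside a container is to look at the Seccomp field that Linux exposes in /proc/<pid>/status (0 = disabled, 1 = strict, 2 = filter). A minimal sketch follows; the helper name seccomp_mode is an assumption for illustration, and it only tells you whether a filter is installed, not which syscall it rejects.

```python
from pathlib import Path

# Values of the "Seccomp:" field in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str):
    """Return the seccomp mode named in /proc/<pid>/status text, or None.

    Only the "Seccomp:" line is read; the separate "Seccomp_filters:" line
    found on newer kernels is ignored.
    """
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, f"unknown ({value})")
    return None


if __name__ == "__main__":
    # Linux only: inspect the current process.
    status_path = Path("/proc/self/status")
    if status_path.exists():
        print("seccomp:", seccomp_mode(status_path.read_text()))
```

Run inside the container, "filter" would be consistent with a seccomp profile being applied; with "--security-opt seccomp=unconfined" one would expect "disabled".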

On 2/10/22 16:58, Andreas Schwab wrote:
Bingo! I installed Debian 10 in a VM and it has this issue. It comes with the ancient docker 18.09.1+dfsg1-7.1+deb10u3. The container launches fine with "--security-opt seccomp=unconfined", which disables seccomp and its syscall filters.

When it crashes, strace gives these weird futex failures:

futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004ac140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0004ac140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004ac140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL[0.004s][warning][os,thread] Failed to start thread - pthread_create failed (EPERM) for attributes: stacksize: 1024k, guardsize: 4k, detached.
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create worker GC thread. Out of system resources.
# An error report file with more information is saved as:
# /opt/bubbleupnpserver/hs_err_pid1.log
) = 0

Still a bit annoying to have this Docker JVM crash (without disabling seccomp) on a stock Debian 10 install, assuming it is limited to this distro...
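A long strace log like the one above is easier to read once failed calls are tallied by syscall and errno. The sketch below does that; failed_syscalls and SAMPLE are illustrative names, the sample lines are copied from the excerpt in this message, and the optional "[pid N]" prefix is what strace -f prints. Note that the excerpt only contains futex calls returning EAGAIN, which FUTEX_WAIT returns when the futex word has already changed and so is normally harmless; the call that actually returned EPERM does not appear in it.

```python
import re
from collections import Counter

# Matches completed strace lines that failed, for example:
#   futex(0x..., FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
# An optional "[pid 1234] " prefix (as printed by strace -f) is skipped.
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(.*\)\s+=\s+-1\s+([A-Z0-9]+)\b"
)


def failed_syscalls(strace_text: str) -> Counter:
    """Count failed syscalls in strace output, keyed by (syscall, errno name)."""
    counts = Counter()
    for line in strace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the strace excerpt in this message.
SAMPLE = """\
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0xc0004a2140, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x5615349a4ea0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
"""

if __name__ == "__main__":
    for (name, errno_name), count in failed_syscalls(SAMPLE).most_common():
        print(f"{name}: {errno_name} x{count}")
```

Feeding it a full log captured with strace -f on the container process would show whether any syscall other than futex fails, and with which errno.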

participants (2): Andreas Schwab, Michael Pujos