[opensuse] Fork fails with Leap 42.2 on AMD FX machines
All,

I have a couple machines running AMD FX CPUs. Both are running Gigabyte motherboards. One is a GA-990FXA-UD5 with 32G of RAM and an FX(tm)-8350 (8 core). The second is a 970A-UD3P with 32G of RAM and an FX(tm)-6300 (6 core). The 990FXA uses the 990FX northbridge, the 970A uses the 970 northbridge. Both use the SB950 southbridge.

I upgraded the 970A machine from Leap 42.1 to 42.2 several weeks ago, and immediately started having problems with Google Chrome / chromium crashes, as well as a lot of "bash: fork: retry: No child processes" from shells. I tried upgrading to several different kernels (all from OBS), with similar results. I also tried playing with the IOMMU settings, both in BIOS and in the kernel parameters. The problems persisted (and still do).

Last night, I upgraded the 990FX box, and am now having the same problems. I was hoping that the 990FX chipset would be different enough, but apparently it's not.

At work, I'm running Leap 42.2 on several Intel boxes, and I've not seen this problem, so I'm guessing that it's linked to either the CPU or, I think more likely, the chipset. I don't think that it's relevant, but both boxes are running ATI / AMD video cards: Radeon R7 370 and 360, respectively.

Various searches of the mailing list archives turned up nothing. Somebody else must have run into this; I can't believe that I'm the first.

dmesg (and /var/log/messages (I use syslog-ng)) are surprisingly quiet about it all. The most common things I see are messages like:

trap invalid opcode ip:5608d1232a92 sp:7ffe373a73d0 error:0 in chrome[5608cd3fb000+67d5000]traps:

If anybody has any suggestions, requires more information, etc., please don't hesitate.

Thanks!

-Nick

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 14/02/2017 at 20:12, Nick LeRoy wrote:
If anybody has any suggestions, requires more information, etc., please don't hesitate.
Was it a fresh install or an upgrade?

Sorry, I have no AMD proc here :-(

jdd
On Tuesday, 14 February 2017, 13:12:36, Nick LeRoy wrote:

All,

I have a couple machines running AMD FX CPUs. Both are running Gigabyte motherboards. One is a GA-990FXA-UD5 with 32G of RAM and an FX(tm)-8350 (8 core). The second is a 970A-UD3P with 32G of RAM and an FX(tm)-6300 (6 core). The 990FXA uses the 990FX northbridge, the 970A uses the 970 northbridge. Both use the SB950 southbridge.

I upgraded the 970A machine from Leap 42.1 to 42.2 several weeks ago, and immediately started having problems with Google Chrome / chromium crashes, as well as a lot of "bash: fork: retry: No child processes" from shells. I tried upgrading to several different kernels (all from OBS), with similar results. I also tried playing with the IOMMU settings, both in BIOS and in the kernel parameters. The problems persisted (and still do).

Last night, I upgraded the 990FX box, and am now having the same problems. I was hoping that the 990FX chipset would be different enough, but apparently it's not.

At work, I'm running Leap 42.2 on several Intel boxes, and I've not seen this problem, so I'm guessing that it's linked to either the CPU or, I think more likely, the chipset. I don't think that it's relevant, but both boxes are running ATI / AMD video cards: Radeon R7 370 and 360, respectively.

Various searches of the mailing list archives turned up nothing. Somebody else must have run into this; I can't believe that I'm the first.

dmesg (and /var/log/messages (I use syslog-ng)) are surprisingly quiet about it all. The most common things I see are messages like:

trap invalid opcode ip:5608d1232a92 sp:7ffe373a73d0 error:0 in chrome[5608cd3fb000+67d5000]traps:

If anybody has any suggestions, requires more information, etc., please don't hesitate.

Thanks!

-Nick

I am running 42.2 on a 970A-UD3P at my parents' place (so I do not have the machine at hand). It is a known issue that the IOMMU has to be set to on, otherwise you do not have USB 3.0 support. But there is also a kernel parameter that has to be set. At boot time, do you get a long list of error warnings that suddenly stops? This is an issue related to the chipset: you have to boot with a specific parameter (iommu=soft) to get the USB 3.0 ports running. You may wish to try it. As for the 990, I do not know.

I am running the board with AMD graphics and the open-source AMD driver. When updating from 42.1 I needed to uninstall the outdated fglrx driver, unblacklist the radeon driver and then run mkinitrd. After this the system started normally. Hope that helps in some way.

P.S. As the GUI is heavily graphics-dependent these days, you should say which card and driver you use. You may also wish to check which OpenGL version you have set it to use (provided you are running KDE).
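Before experimenting with `iommu=soft`, it can help to confirm which IOMMU setting the running kernel was actually booted with. The following is only a minimal sketch: the helper names `kernel_params` and `iommu_setting` are made up for illustration, and the parser assumes a simple space-separated command line without quoted values.

```python
def kernel_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}.

    Bare flags such as 'quiet' map to None. Quoted values containing
    spaces are NOT handled; this is a simplification.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def iommu_setting(cmdline: str):
    """Return the value given to iommu= (e.g. 'soft'), or None if absent."""
    return kernel_params(cmdline).get("iommu")
```

On a Linux system the string to pass in would normally be the contents of `/proc/cmdline`, for example `iommu_setting(open("/proc/cmdline").read())`.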
On 02/14/2017 02:12 PM, Nick LeRoy wrote:
All,
I have a couple machines running AMD FX CPUs. Both are running Gigabyte motherboards. One is a GA-990FXA-UD5 with 32G of RAM and an FX(tm)-8350 (8 core). The second is a 970A-UD3P with 32G of RAM and an FX(tm)-6300 (6 core). The 990FXA uses the 990FX northbridge, the 970A uses the 970 northbridge. Both use the SB950 southbridge.
I upgraded the 970A machine from Leap 42.1 to 42.2 several weeks ago, and immediately started having problems with Google Chrome / chromium crashes, as well as a lot of "bash: fork: retry: No child processes" from shells. I tried upgrading to several different kernels (all from OBS), with similar results. I also tried playing with the IOMMU settings, both in BIOS and in the kernel parameters. The problems persisted (and still do).
Last night, I upgraded the 990FX box, and am now having the same problems. I was hoping that the 990FX chipset would be different enough, but apparently it's not.
I am running the same GA-990FXA-UD5 with the same CPU but with 16GB of memory. I did a "trial" install of Leap 42.2; I'm still running 13.2 with vanilla kernels. Just to get it installed I had to disable the IOMMU in the BIOS and limit my memory to 4GB (kernel command line "mem=4096M"). The reason was that the kernel version in the dist package has an AMD IOMMU bug: I was getting IOMMU page faults all over the place. But with the IOMMU disabled and more than 4GB of memory, the I/O devices have to use Dual Address Cycles (DAC) to access memory above that 4GB, and I have yet to see a motherboard that supports DAC reliably. So I was having major issues just getting everything installed.

But once I got up and running with the latest kernel, I was able to turn the IOMMU back on and run with the full 16GB of memory. I'm still running 13.2 on it now though.

Regards
Mark
On Tuesday, February 14, 2017 2:50:17 PM EST Mark Hounschell wrote:
I am running the same GA-990FXA-UD5 with the same CPU but with 16GB of memory. I did a "trial" install of Leap 42.2. I'm still running 13.2 with vanilla kernels. Just to get it installed I had to disable the IOMMU in the bios and set my memory to 4gb. "kernel command line mem=4096M". The reason was that the kernel version in the dist package has an AMD IOMMU bug. I was getting IOMMU page faults all over the place. But if the IOMMU was not enabled and running more than 4GB, the I/O devices have to use Dual Address Cycles (DAC) to access memory above that 4GB. And I have yet to see a MB that supports DAC reliably. So I was having major issues just getting everything installed. But once I got up and running with the latest kernel, I was then able to turn the IOMMU back on and run with the full 16GB of memory. I'm still running 13.2 on it now though.
I am running the Asus counterpart to that Gigabyte board with the FX-6300, the 970/950 chipset, and 8GB of RAM. IOMMU is disabled in the UEFI BIOS, and I have seen no problems with that.

But just fwiw, have you checked the disk? I had a lot of weird problems after multiple attempts at clean installing 42.2. Finally I installed vanilla and applied each kernel update one at a time, checking journalctl after each reboot. The 4.4.36-8 kernel (but not the previous one; strange) threw a disk access error at each boot. The disk had been running 13.2 fine for a long time, and SMART showed the disk as healthy. But a closer look revealed a large number of bad blocks. After installing to a different disk, all the problems are gone. Might be worth a look?

--dg
Nick LeRoy wrote:
dmesg (and /var/log/messages (I use syslog-ng)) are surprisingly quiet about it all. The most common things I see are messages like:
trap invalid opcode ip:5608d1232a92 sp:7ffe373a73d0 error:0 in chrome[5608cd3fb000+67d5000]traps:
That is certainly a problem, but whether it is related to your forking issue, I cannot say. I expect there is more info - you ought to be able to tell which instruction/opcode.

--
Per Jessen, Zürich (1.6°C)
http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
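One way to start narrowing down the faulting instruction is to turn the `ip:` value into an offset inside the mapping named at the end of the trap line (`chrome[base+size]`). The sketch below does only that arithmetic. The names `TRAP_RE` and `fault_offset` are illustrative, and the regular expression assumes the field layout seen in the message quoted in this thread.

```python
import re

# Matches the "ip:... sp:... error:... in NAME[BASE+SIZE]" part of a trap line.
# The layout is assumed from the single example quoted in the thread.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r" in (?P<image>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line: str):
    """Return (image name, offset of the faulting ip within its mapping)."""
    match = TRAP_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a kernel trap message")
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return match.group("image"), offset


line = ("trap invalid opcode ip:5608d1232a92 sp:7ffe373a73d0 "
        "error:0 in chrome[5608cd3fb000+67d5000]")
image, offset = fault_offset(line)
print(image, hex(offset))  # chrome 0x3e37a92
```

The offset is relative to the start of the mapping reported by the kernel, which is not necessarily the start of the file on disk, so looking it up in a disassembly requires the exact same binary build and some care.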
On 02/14/2017 08:12 PM, Nick LeRoy wrote:
I upgraded the 970A machine from Leap 42.1 to 42.2 several weeks ago, and immediately started having problems with Google Chrome / chromium crashes, as well as a lot of "bash: fork: retry: No child processes" from shells.
I'd check "ulimit -a" ... or wait a minute: systemd also limits the number of processes per user, as far as I remember. At least I'm also seeing such "fork failed" errors on Tumbleweed when running "make -j", while the same project does not have such a problem on openSUSE 13.2. Wasn't this something like the "DefaultLimitNPROC" setting ...?

Have a nice day,
Berny
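The per-user process limit that `ulimit -u` reports can also be read programmatically. This is a small sketch using Python's standard `resource` module on Linux. The helper names are illustrative, and it covers only the classic rlimit, not the separate task limits that systemd enforces through cgroups.

```python
import resource


def describe_limit(value: int) -> str:
    """Render an rlimit value the way 'ulimit' does for unlimited limits."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def nproc_limits():
    """Return (soft, hard) RLIMIT_NPROC for the current process as strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return describe_limit(soft), describe_limit(hard)


soft, hard = nproc_limits()
print(f"max user processes: soft={soft} hard={hard}")
```

If fork failures appear while these values are far above the number of running processes, the limit being hit is probably enforced elsewhere.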
All,

First, thank you for the many replies.
I upgraded the 970A machine from Leap 42.1 to 42.2 several weeks ago, and immediately started having problems with Google Chrome / chromium crashes, as well as a lot of "bash: fork: retry: No child processes" from shells. I tried upgrading to several different kernels (all from OBS), with similar results. I also tried playing with the IOMMU settings, both in BIOS and in the kernel parameters. The problems persisted (and still do).
Well, I was apparently mistaken about the nature of the problem. It turns out not to have been related to the hardware after all. Rather, it's a new "feature" of systemd 228. The part that I don't understand is why I wasn't seeing these problems after the systemd upgrade to 228 when the systems were running Leap 42.1. I can only conclude that there is some backward compatibility built into the systemd 228 on Leap 42.1.

There appear to be several problems.

1. Basic ulimit, in particular ulimit -u. I cranked up the setting for my user in /etc/security/limits.conf (hard and soft nproc settings).
2. systemd system configuration. Edit /etc/systemd/system.conf, adjust the value for DefaultTasksMax.
3. systemd logind configuration. Edit /etc/systemd/logind.conf, adjust the value for UserTasksMax.

For my system, I set these to values that seem more reasonable to me (10000), and now the systems are running properly.

Thanks again to all who responded!

-Nick
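To check what the two systemd files mentioned above currently say, a small parser is enough. This is a minimal sketch, not a full systemd configuration reader: `read_setting` is a made-up helper, the sample texts are illustrative stand-ins using the value from the post (10000), and drop-in directories are ignored. It assumes `DefaultTasksMax` sits in the `[Manager]` section of system.conf and `UserTasksMax` in the `[Login]` section of logind.conf.

```python
def read_setting(conf_text: str, section: str, key: str):
    """Return the last uncommented value of `key` inside `[section]`, or None."""
    current = None
    found = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # entering a new section
            continue
        name, sep, value = line.partition("=")
        if sep and current == section and name.strip() == key:
            found = value.strip()  # later assignments override earlier ones
    return found


# Illustrative file contents only; real files contain many more entries.
SAMPLE_SYSTEM_CONF = """\
[Manager]
# a commented-out assignment is ignored:
#DefaultTasksMax=1
DefaultTasksMax=10000
"""

SAMPLE_LOGIND_CONF = """\
[Login]
UserTasksMax=10000
"""

print(read_setting(SAMPLE_SYSTEM_CONF, "Manager", "DefaultTasksMax"))  # 10000
print(read_setting(SAMPLE_LOGIND_CONF, "Login", "UserTasksMax"))       # 10000
```

On a real system the text would come from reading `/etc/systemd/system.conf` and `/etc/systemd/logind.conf`. A result of `None` means the key is not set explicitly in that file, so the built-in default applies.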
participants (7)

- Bernhard Voelker
- Dennis Gallien
- jdd
- Mark Hounschell
- Nick LeRoy
- Per Jessen
- stakanov