https://bugzilla.novell.com/show_bug.cgi?id=712958

https://bugzilla.novell.com/show_bug.cgi?id=712958#c0

Summary: Kernel Problems Under Multi-User Loads
Classification: openSUSE
Product: openSUSE 11.4
Version: Final
Platform: x86-64
OS/Version: openSUSE 11.4
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: drichard@largo.com
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20100101 Firefox/6.0

I've tried changing and testing every known bottleneck that I can find, and so far nothing has improved our openSUSE 11.4 server. My belief now is that this is a kernel/scheduling problem of some type. All of my research seems to come back to the kernel.

After about 10-15 concurrent users the server gets extraordinarily slow. On earlier versions of openSUSE, we were able to get 100-200 users onto hardware that was far less robust.

The first visible sign of a problem is the network. Here are the network stats on an openSUSE 11.3 server which is being hammered with networking and working well. Note the low number of dropped RX packets.

eth0      Link encap:Ethernet  HWaddr F4:CE:46:C0:EA:A8
          inet addr:128.222.233.237  Bcast:128.222.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1970250698 errors:0 dropped:175 overruns:0 frame:0

On this openSUSE 11.4 server (which is used to deploy GNOME to thin clients), here is the same information:

eth0      Link encap:Ethernet  HWaddr 00:1C:C4:93:DF:72
          inet addr:128.222.99.243  Bcast:128.222.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:67783895 errors:0 dropped:175438 overruns:0 frame:0

The other very visible issue is disk performance. I've noticed:

- If you type in 'vi /etc/hosts', it sits for 3-4 seconds with a blinking cursor before the file appears. I tried it from the console and it does the same thing.
- If you change a password for a user, it sits for 3-4 seconds before the prompt comes back.
- If you try to install any software with YaST2 while the server is under a user load, the whole server basically crawls. All X events freeze during the install and you have to wait for the packages to finish installing.

The server is acting like it would if we were swapping, but that's not the case. top shows barely any load; we have 64GB on the server and only 12GB is in use. I installed iotop and it's barely showing a load, and even when iotop is at zero, it still takes multiple seconds for a file to open in vi.

When we get back below 10 users, everything immediately gets faster and things work as expected. Something is happening at a low level, and no tools seem to report the failure.

Things we have tried:

- Copied it to a VM instance, and once it was under load the same issues appeared. This more or less rules out the hardware.
- I turned off nscd with the idea that it was slowing down file access; no change in speed.
- I turned off barriers on the ext4 file system; no change.
- The VM instance actually downgrades it to ext3, with the same results, so it does not seem related to the physical file system.
- Various sysctl.conf settings that people have mentioned, none of which seem to affect this issue.

Kernel is currently: kernel-desktop-2.6.37.6-0.5.1.x86_64

Any tips or ideas are appreciated. Whatever this problem is, it seems like it will prohibit enterprise use and could cause problems in future SLED releases.

Reproducible: Always

Steps to Reproduce:
1.
2.
3.
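
To put numbers on the difference between the two ifconfig outputs quoted in this report, here is a minimal Python sketch of the drop-rate arithmetic. The helper name rx_drop_ratio is hypothetical (not part of any tool), it assumes the older net-tools "RX packets:N ... dropped:N" layout shown in the report, and treating dropped/packets as the rate is itself an assumption, since drivers differ in whether dropped frames are included in the packet counter.

```python
import re

# Counter lines copied from the two ifconfig outputs in this report.
RX_113 = "RX packets:1970250698 errors:0 dropped:175 overruns:0 frame:0"
RX_114 = "RX packets:67783895 errors:0 dropped:175438 overruns:0 frame:0"


def rx_drop_ratio(rx_line):
    """Return dropped / packets for one net-tools style 'RX packets:' line.

    Hypothetical helper: assumes every counter appears as 'name:digits'.
    """
    fields = {key: int(value) for key, value in re.findall(r"(\w+):(\d+)", rx_line)}
    if fields["packets"] == 0:
        return 0.0  # avoid dividing by zero on an idle interface
    return fields["dropped"] / fields["packets"]


if __name__ == "__main__":
    for release, line in (("11.3", RX_113), ("11.4", RX_114)):
        print(f"openSUSE {release}: {rx_drop_ratio(line):.6%} of RX packets dropped")
```

On the figures above, the 11.4 server works out to roughly 0.26% of RX packets dropped, versus well under 0.0001% on the 11.3 server.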