Mailinglist Archive: opensuse-bugs (4067 mails)

[Bug 712958] New: Kernel Problems Under Multi-User Loads
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Thu, 18 Aug 2011 18:34:28 +0000
  • Message-id: <bug-712958-21960@http.bugzilla.novell.com/>

https://bugzilla.novell.com/show_bug.cgi?id=712958

https://bugzilla.novell.com/show_bug.cgi?id=712958#c0


Summary: Kernel Problems Under Multi-User Loads
Classification: openSUSE
Product: openSUSE 11.4
Version: Final
Platform: x86-64
OS/Version: openSUSE 11.4
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@xxxxxxxxxxxxxxxxxxxxxx
ReportedBy: drichard@xxxxxxxxx
QAContact: qa@xxxxxxx
Found By: ---
Blocker: ---


User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20100101 Firefox/6.0

I've tried changing and testing every known bottleneck that I can find, and so
far nothing has improved our openSUSE 11.4 server. My belief now is that this
is a kernel/scheduling problem of some type. All of my research seems to come
back to the kernel.

After about 10-15 concurrent users the server gets extraordinarily slow. On
earlier versions of openSUSE, we were able to get 100-200 users onto hardware
that was far less robust.

The first visible sign of a problem is the network. Here are the network stats
on an openSUSE 11.3 server that is being hammered with network traffic and
working well. Note the low number of dropped RX packets.

eth0 Link encap:Ethernet HWaddr F4:CE:46:C0:EA:A8
inet addr:128.222.233.237 Bcast:128.222.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1970250698 errors:0 dropped:175 overruns:0 frame:0

On this openSUSE 11.4 server (which is used to deploy GNOME to thin clients),
here is the same information, with roughly 1,000 times as many dropped packets
on far less traffic:

eth0 Link encap:Ethernet HWaddr 00:1C:C4:93:DF:72
inet addr:128.222.99.243 Bcast:128.222.255.255 Mask:255.255.0.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:67783895 errors:0 dropped:175438 overruns:0 frame:0
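
To put the two RX lines above on the same scale, here is a minimal sketch that
turns each one into a drop rate. The function name rx_drop_rate and the two
constants are made up for illustration; the input strings are the RX lines
exactly as pasted above, and the parsing assumes the old ifconfig output format
shown in this report.

```python
import re


def rx_drop_rate(rx_line):
    """Parse an old-style ifconfig RX line.

    Returns (packets, dropped, dropped / packets).
    """
    m = re.search(r"RX packets:(\d+).*?dropped:(\d+)", rx_line)
    if m is None:
        raise ValueError("not an ifconfig RX line: %r" % rx_line)
    packets, dropped = int(m.group(1)), int(m.group(2))
    # Guard against a freshly reset counter to avoid dividing by zero.
    rate = dropped / packets if packets else 0.0
    return packets, dropped, rate


# RX lines copied from the two servers in this report.
LINE_11_3 = "RX packets:1970250698 errors:0 dropped:175 overruns:0 frame:0"
LINE_11_4 = "RX packets:67783895 errors:0 dropped:175438 overruns:0 frame:0"

for label, line in (("11.3", LINE_11_3), ("11.4", LINE_11_4)):
    packets, dropped, rate = rx_drop_rate(line)
    print("openSUSE %s: %d of %d RX packets dropped (%.6f%%)"
          % (label, dropped, packets, rate * 100))
```

By this measure the 11.4 server drops about 0.26% of received packets, while
the 11.3 server drops well under one packet in a million.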

The other very visible issue is disk performance. I've noticed:
- If you type in 'vi /etc/hosts' it sits for 3-4 seconds with a blinking cursor
before the file appears. I tried it from the console and it's doing the same
thing.
- If you change a password for a user, it sits for 3-4 seconds before the
prompt comes back.
- If you try to install any software with YaST2 under a user load, the whole
server basically crawls. All X events freeze during the install and you have
to wait for it to finish.
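
One way to put numbers on the delays listed above is to time a plain file read
outside of vi. This is only a sketch: the helper name time_open_read and the
repeat count are assumptions, and /etc/hosts is used because it is the file
from the example above. After the first pass the file is normally served from
the page cache, so if later passes are still slow the delay is probably not
coming from the disk itself.

```python
import time


def time_open_read(path, repeats=5):
    """Return the wall-clock seconds taken by each open+read of path."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as fh:
            fh.read()
        samples.append(time.perf_counter() - start)
    return samples


if __name__ == "__main__":
    # /etc/hosts is the file from the vi example in this report.
    for seconds in time_open_read("/etc/hosts"):
        print("%.4f s" % seconds)
```

Running it once with fewer than 10 users logged in and once under load would
show whether the multi-second stall is in the read itself or somewhere else.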

The server is acting like it would if we were swapping, but that's not the
case.

top shows barely any load. We have 64GB on the server and only 12GB is in use.

I installed iotop and it's barely showing a load, and even when iotop is at
zero, it still takes multiple seconds for a file to open in vi.

When we get back below 10 users, everything immediately gets faster and things
work as expected.

Something is happening at a low level, and no tools seem to report the failure.

Things we have tried:
- Copied it to a VM instance; once it was under load, the same issues appeared.
This largely rules out the hardware.
- I turned off nscd with the idea that it was slowing down file access; no
change in speed.
- I turned off barriers on the ext4 file system; no change.
- The VM instance actually downgrades it to ext3, with the same results, so it
seems unrelated to the physical file system.
- Tried various sysctl.conf settings that people have mentioned; none of them
seem to affect this issue.

Kernel is currently:
kernel-desktop-2.6.37.6-0.5.1.x86_64

Any tips or ideas are appreciated. Whatever this problem is, it seems like it
will prohibit Enterprise use and could cause future SLED problems.


Reproducible: Always

