Hello, I am running SuSE 8.2 (fully patched) on a Dell PowerEdge 2650 with an external PowerVault RAID array on an LSI-based PERC2 card. The server shares files via Samba and NFS, and also hosts several moderately intensive applications for gene sequencing and mapping. The box is a dual Xeon with 4GB of RAM, with root and swap on an internal HW mirror and 400GB of data on the external RAID.

Users have been complaining for some time now that the box hesitates in some situations, particularly when using "tab" to autocomplete commands and/or file names, and when entering any command for the first time in a while. That is, cd'ing to and ls'ing a directory for the first time might cause the machine to pause for anywhere from 2 to 10 seconds. Oddly, the second time there's no hesitation. This pause occurs when executing commands at the shell, when connecting to files via samba, and when accessing server applications via an X-based client.

Everything I can find for performance stats suggests that the box is not hitting even a significant portion of its resources. Top, sar and iostat all show that the CPUs have never been more than 30% occupied, "free" shows that a gig of physical memory is available, and the swap partition (2GB) has only used about 40MB of its space.

I've checked to see if any of the problem users have folders in their path that include a large number of files. None are more than 1200 or so and most are much less (20-200). I've confirmed that this occurs when ls'ing local volumes from the console (suggesting it's not a network problem?). This problem has persisted across several kernel updates.

"iostat -x -d" did show some heavier stats for the root drive, especially the swap partition, and the fact that this appears to be related to caching of commands and filenames made me wonder if caching/paging/swapping was bogging down somewhere. As a shot in the dark I added another physical drive, configured it as a single (18GB) swap partition, and moved all swapping to it. No difference.

The problem still occurs with all scientific applications stopped, all file sharing disabled, and no users logged in except for one (root) session.

I'm aware that this may not be enough information to make any real guesses (I've been googling on this for over a week and still haven't found even any similar problems, let alone a solution). I welcome any suggestions or feedback, however, and am especially hopeful that someone else may have seen this and found its cause.

Randall Rue, postmaster
Fred Hutchinson Cancer Research Center
Seattle, WA USA
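One way to put numbers on the "slow the first time, fast the second time" symptom is to time two back-to-back passes over the same directory. The following is only a minimal sketch: it assumes a Python 3 interpreter is available on the box (the post does not say so), and the function names `time_listing` and `compare_cold_and_warm` are made up for illustration.

```python
import os
import sys
import time


def time_listing(path):
    """Return (seconds, entry_count) for one pass over a directory.

    Every entry is stat()ed, which is roughly the work `ls -l` does.
    """
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            entry.stat(follow_symlinks=False)
            count += 1
    return time.perf_counter() - start, count


def compare_cold_and_warm(path):
    """Time the same directory twice in a row.

    If the pause is a caching effect, the second pass should be much faster.
    """
    first, count = time_listing(path)
    second, _ = time_listing(path)
    return {"path": path, "entries": count, "first_s": first, "second_s": second}


if __name__ == "__main__":
    # Directories to test are given on the command line; default is ".".
    for target in sys.argv[1:] or ["."]:
        result = compare_cold_and_warm(target)
        print("%(path)s: %(entries)d entries, "
              "first %(first_s).3fs, second %(second_s).3fs" % result)
```

Running it against a directory that has not been touched for a while, on both the internal mirror and the external RAID, would show whether the delay follows a particular volume.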
FWIW, I have seen nothing like this over the last year or so on several 4GB PE2650 systems using several versions of SuSE and with hyperthreading enabled and disabled (the latter is our default). However, no network file systems are used, and the systems have few users logged in simultaneously.
Randy Rue wrote:
Users have been complaining for some time now that the box hesitates in some situations, particularly when using "tab" to autocomplete commands and/or file names, and when entering any command for the first time in a while.
Does your box have NIS/YP processes (such as ypbind) running when you're not using NIS? If so, kill the processes and disable the service.

--
Patrick Greenwell, Support Account Manager, Fortune 500
SUSE LINUX, 1100 Sansome St., San Francisco, CA, 94111
T: +1 415 591 6607 - Cell: +1 510 499 7896
F: +1 510 591 6619 - patrick@suse.com
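The check suggested above can be sketched as a scan of `/proc` for processes with NIS-related names. This is an illustrative sketch, not a SuSE tool: `find_processes` is a hypothetical helper, and the default name list (`ypbind`, `ypserv`) is an assumption about which daemons matter here. It assumes Python 3 and a Linux `/proc` whose `status` files begin with a `Name:` line.

```python
import os


def find_processes(names=("ypbind", "ypserv"), proc_root="/proc"):
    """Return sorted (pid, name) pairs for processes whose name is in `names`."""
    wanted = set(names)
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        status_path = os.path.join(proc_root, entry, "status")
        try:
            with open(status_path) as handle:
                first_line = handle.readline()
        except OSError:
            continue  # process exited or is not readable
        # The first line looks like "Name:\typbind".
        key, _, value = first_line.partition(":")
        name = value.strip()
        if key == "Name" and name in wanted:
            found.append((int(entry), name))
    return sorted(found)


if __name__ == "__main__":
    matches = find_processes()
    if matches:
        for pid, name in matches:
            print("%s is running (pid %d)" % (name, pid))
    else:
        print("no NIS/YP processes found")
```

An empty result would rule this suggestion out quickly; a hit only shows that the daemon is running, not that it is the cause of the pause.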
On Monday 09 February 2004 10:45, Randy Rue wrote:
"iostat -x -d" did show some heavier stats for the root drive, especially the swap partition, and the fact that this appears to be related to caching of commands and filenames made me wonder if caching/paging/swapping was bogging down somewhere.
So is it paging? Do you need more RAM? If swap isn't being actively used, then regardless of how fast your swap drive is, it would make no difference. Your numbers do not suggest any significant paging at all. The 18 GB swap space may actually slow it down if it were to be used, but as you pointed out, it didn't help at all.

So what does jump to the top of top when you enter a directory?

Are you 100% positive your RAID is not running with a failed drive? Your RAID card might be working its little chips to the bone while your main CPU is sitting around idle waiting for data.

--
John Andersen
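Whether swap is actually in use during a pause can be checked by sampling the kernel's cumulative swap-in and swap-out counters before and after reproducing the hesitation. The sketch below assumes Python 3 and a kernel that exposes `pswpin` and `pswpout` in `/proc/vmstat`; older kernels may report these counters elsewhere, in which case the path and parser would need adapting. The function names are hypothetical.

```python
import time


def read_swap_counters(text):
    """Extract cumulative pswpin/pswpout page counts from /proc/vmstat text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in and out between two samples."""
    return {key: after[key] - before[key] for key in before if key in after}


def sample_swap_activity(interval=5.0, path="/proc/vmstat"):
    """Read the counters, wait `interval` seconds, read again, return the delta."""
    with open(path) as handle:
        before = read_swap_counters(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = read_swap_counters(handle.read())
    return swap_delta(before, after)


if __name__ == "__main__":
    # Trigger the slow `ls` in another shell while this is sleeping.
    print(sample_swap_activity(interval=10.0))
```

If both deltas stay at or near zero while the pause is happening, swap speed is unlikely to be the bottleneck, which matches the observation that the extra swap drive changed nothing.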
On Tuesday 10 February 2004 06:45, Randy Rue wrote:
Users have been complaining for some time now that the box hesitates in some situations, particularly when using "tab" to autocomplete commands and/or file names, and when entering any command for the first time in a while. That is, cd'ing to and ls'ing a directory for the first time might cause the machine to pause for anywhere from 2 to 10 seconds. Oddly, the second time there's no hesitation. This pause occurs when executing commands at the shell, when connecting to files via samba, and when accessing server applications via an X-based client.
I had a similar problem with a 2650 "hesitating" and got the clue from a post on the linux-poweredge@dell.com list. The box was trying to increase throughput by caching disk writes ??? and the hesitation was while it flushed its buffers. Decreasing the buffer size made the pause acceptably short.
participants (5)
- Gary Gapinski
- John Andersen
- Michael James
- Patrick Greenwell
- Randy Rue