On Tue, 11 Jun 2013 13:16:59 +0200
Roger Oberholtzer wrote:
Despite being quiet on this, we have not solved the problem. We have:
* Tried other file systems (e.g., ext4).
* Tried faster "server-grade" SATA disks.
* Tried a SATA3 interface as well as SATA2.
The same thing happens. Periodically, write calls are blocking for 4-5 seconds instead of the usual 20-30 msecs.
I have seen one unexpected thing: when running xosview during all this, the MEM usage shows the cache use slowly growing. The machine has 32 GB of RAM. The cache use just grows and grows as the file system is written to. Here is the part I don't get:
* If I close all apps that have a file open on the file system, the cache use remains.
* If I run the 'sync(1)' command, the cache use remains. I would have thought that the cache would be freed as there is nothing left to cache. If not immediately, over a decent amount of time. But this is not the case.
* Only when I unmount the file system does the cache get freed. Immediately.
Why would the cache grow and grow?
Because unused memory is wasted memory. It is better to use it as cache than to not use it at all. Data in cache has low priority and RAM consumed by filesystem cache can be considered "free" for all practical purposes.
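To make the "cache counts as free" point concrete, here is a minimal Python sketch that adds up the fields of /proc/meminfo that the kernel can hand back on demand. The helper names and the sample numbers are made up for illustration, and the sum is only an estimate: Cached also includes dirty pages, which must be written out before they can be dropped.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value_in_kB} dict."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def reclaimable_estimate_kb(info):
    """Rough estimate of memory available on demand: free + buffers + page cache."""
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


# Invented sample values, loosely shaped like a 32 GB machine after a long write run.
SAMPLE = """MemTotal:       32000000 kB
MemFree:         1000000 kB
Buffers:          200000 kB
Cached:         28000000 kB
Dirty:            500000 kB
"""

info = parse_meminfo(SAMPLE)
print(reclaimable_estimate_kb(info), "kB could be reclaimed if applications needed it")
```

On a live system you would feed it open("/proc/meminfo").read() instead of the sample text. A large Cached value next to a small MemFree value is the normal state of a machine that has been doing file I/O.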
Since the delay, when it happens, grows and grows, I get the feeling that this file system cache in RAM is slowly getting bigger and bigger, and each time it needs to be flushed, it takes longer and longer.
This is probably a misinterpretation. What more likely happens is this: your program writes to memory, which is very fast, until the dirty memory threshold kicks in, at which point the system forces writeback to disk.
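For a feel of where that threshold sits, here is a rough Python sketch. It assumes the two ratio sysctls are in effect (vm.dirty_background_ratio and vm.dirty_ratio) with the commonly cited values of 10 and 20; check /proc/sys/vm on the actual machine. It also assumes that roughly all 32 GiB count as "dirtyable" memory, whereas the kernel applies the percentages to free plus reclaimable memory, so real limits will be somewhat lower. The function name is made up for the example.

```python
GIB = 1024 ** 3


def dirty_limits(dirtyable_bytes, background_ratio=10, ratio=20):
    """Return (background_limit, hard_limit) in bytes.

    background_limit: amount of dirty data at which background writeback starts.
    hard_limit: amount of dirty data at which writing processes get blocked.
    The default ratios are assumptions, not values read from this system.
    """
    background = dirtyable_bytes * background_ratio // 100
    hard = dirtyable_bytes * ratio // 100
    return background, hard


background, hard = dirty_limits(32 * GIB)
print("background writeback starts near %.1f GiB of dirty data" % (background / GIB))
print("writers are blocked near %.1f GiB of dirty data" % (hard / GIB))
```

With numbers of that size, a writer can run for a long time at memory speed before it is ever made to wait, which fits stalls that show up only periodically.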
If the cache is being emptied at some reasonable point, why would it continue to grow? Remember that for each mounted file system there is one process writing to a single file. The disk usage remains 100% constant in terms of what is sent to be written.
It has nothing really to do with the cache growing. When you write to a file, the data goes to the memory cache. If your program writes very fast, faster than the data can be written to disk in the background, at some point your program will be suspended until there is enough space.
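The same argument as back-of-the-envelope arithmetic, sketched in Python. All rates and the headroom figure are invented for illustration, not measurements from this system, and the function name is hypothetical.

```python
import math


def seconds_until_throttle(headroom_bytes, produce_bps, drain_bps):
    """Time until dirty data fills the given headroom.

    produce_bps: rate at which the program writes into the cache (bytes/s).
    drain_bps:   rate at which background writeback reaches the disk (bytes/s).
    Returns math.inf when the disk keeps up with the writer.
    """
    surplus = produce_bps - drain_bps
    if surplus <= 0:
        return math.inf
    return headroom_bytes / surplus


MB = 1000 ** 2
# Assumed: 3000 MB of headroom, a 110 MB/s writer, a disk sustaining 100 MB/s.
print(seconds_until_throttle(3000 * MB, 110 * MB, 100 * MB), "seconds until the writer is blocked")
```

Even a small surplus eventually exhausts the headroom. A bigger cache or a lower threshold only changes when the first stall happens, not whether it happens.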
Is there some policy or setting that controls how the file system deals with file system cache in RAM? More specifically, is there any way to limit its size for a file system?
Not really. You can try lowering /proc/sys/vm/dirty_background_ratio; that should make the kernel start writeback earlier. But in the end, if you generate data faster than it can be written to disk, you hit the same issue, only later. Or use O_DIRECT as already suggested. Solaris throttles programs writing to a UFS file when they are "too fast" ...
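For reference, a minimal sketch of what an O_DIRECT write could look like, in Python for brevity (the same flag is passed to open(2) from C). O_DIRECT bypasses the page cache but requires the buffer address, length, and file offset to be suitably aligned. The 4096-byte block size below is an assumption, the helper names are made up, and not every file system accepts the flag (tmpfs, for instance, may reject it).

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and file system


def padded_length(n, block=BLOCK):
    """Round n up to a multiple of block (at least one block)."""
    return max(block, -(-n // block) * block)


def write_direct(path, payload):
    """Write payload to path with O_DIRECT, zero-padded to a whole number of blocks.

    Linux only: os.O_DIRECT is not defined on other platforms.
    Returns the number of bytes written, including the padding.
    """
    length = padded_length(len(payload))
    buf = mmap.mmap(-1, length)      # anonymous mapping: page-aligned and zero-filled
    buf[:len(payload)] = payload
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_DIRECT, 0o644)
    try:
        return os.write(fd, buf)
    finally:
        os.close(fd)
        buf.close()
```

Something like write_direct("/data/out.bin", b"sample") (the path is hypothetical) would then hit the disk without passing through the cache. The trade-off is that each write now runs at disk speed, so a slow disk shows up as steadily slower writes rather than occasional multi-second stalls.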