Roger Oberholtzer wrote:
On Mon, 2013-06-03 at 11:57 -0700, Linda Walsh wrote:
Roger Oberholtzer wrote:
I am using XFS on a 12.1 system. The system records JPEG data to large files in real time. We have used XFS for this for quite a while, since openSUSE 11.2, because one of its listed features is being well suited to writing streaming media data.
We have developed a new version of this system that collects more data. What I have found is that the jpeg data is typically written at the speed I expect. Every once in a while, the write takes 100x longer. Instead of the expected 80 msecs or so to do the compress and write, it takes, say, 4 or 5 seconds.
1) Have you tried using an XFS real-time section? It was designed to prevent exactly this type of lag.
I will have to explore this. I am not familiar with it.
It's basically a way to get guaranteed I/O speeds, but I think it sacrifices some flexibility -- like maybe requiring pre-allocation of files (that's a pure guess at the requirements, as I haven't used it either).
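That said, from skimming the docs the rough shape of it seems to be: the realtime section lives on a separate device named at mkfs/mount time (mkfs.xfs -r rtdev=..., mount -o rtdev=...), and you steer a file into it by setting the realtime flag on the file while it is still empty. Something like the untested sketch below (older kernels spell the constants XFS_IOC_FSGETXATTR / XFS_XFLAG_REALTIME in <xfs/xfs_fs.h>):

    /* Hedged sketch: mark a newly created (still empty) file as an XFS
     * realtime file.  Assumes the filesystem was made with a realtime
     * subvolume (mkfs.xfs -r rtdev=...) and mounted with -o rtdev=... */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>   /* struct fsxattr, FS_IOC_FS{GET,SET}XATTR, FS_XFLAG_REALTIME */

    int open_realtime(const char *path)
    {
        int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd < 0) { perror("open"); return -1; }

        struct fsxattr fsx;
        if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) { perror("FSGETXATTR"); return -1; }

        fsx.fsx_xflags |= FS_XFLAG_REALTIME;   /* must be set before any data is written */
        if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0) { perror("FSSETXATTR"); return -1; }

        return fd;   /* writes to this fd now land in the realtime section */
    }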
It is a binary file that grows and grows (up to 2 GB, which is the max file size we allow). The file contains a stream of JPEG images, one after another. Each image is 1920 x 450, and there are 50 of these per second at max speed.
----
What I'm not clear on is your earlier statement that you increased the size per image and are now re-writing them. Is there a 'rewrite' involved, or are you simply dumping data to disk as fast as you can? If it is the latter, pre-allocate your space and you will save yourself tons of perf issues -- "xfs_mkfile" (or equivalent calls such as fallocate()/posix_fallocate()). If you have a secondary process allocate the next file when the current one gets to 75% full, you shouldn't notice any hiccups.

Second thing -- someone else mentioned it, and it applies whether you are writing or rewriting -- is to do your writes with O_DIRECT and do your own buffering, to at least 1M, better 16M boundaries. If you use O_DIRECT you will want to be page- and sector-aligned (I think the kernel changed, and you now HAVE to be) or you will get an error indication. You will get about a 30% or greater increase in write throughput. This assumes your app doesn't immediately turn around and need to read the data again, in which case you'd be penalized by not using the buffer cache.

Do you watch your free memory? I have an "xosview" window open with LOAD/CPU/MEM/DISK (and an outside net), so I can see when used memory or cache memory becomes tight. Attached is a sample of what you can see; I did a kernel build (make -j) so you could see how it empties out the cache, for example.
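In case it is useful, here is a minimal sketch of what the pre-allocate + O_DIRECT combination looks like in C. This is my own illustration, not code from your system: the file name, the 2 GB cap and the 16 MB chunk size are placeholders for whatever your recorder really uses, and the single write() stands in for your loop of compressed frames.

    /* Sketch: pre-allocate the output file, then write it with O_DIRECT
     * from a page-aligned buffer in large (16 MB) chunks. */
    #define _GNU_SOURCE            /* for O_DIRECT */
    #define _FILE_OFFSET_BITS 64   /* 2 GB files on 32-bit builds */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define FILE_SIZE  (2LL * 1024 * 1024 * 1024)   /* the 2 GB cap you mention */
    #define CHUNK_SIZE (16 * 1024 * 1024)           /* flush in 16 MB pieces */

    int main(void)
    {
        int fd = open("stream.bin", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Reserve the full file up front so the allocator can lay it out
         * contiguously instead of hunting for space mid-recording. */
        int err = posix_fallocate(fd, 0, FILE_SIZE);
        if (err) { fprintf(stderr, "posix_fallocate: %s\n", strerror(err)); return 1; }

        /* O_DIRECT needs the buffer, offset and length aligned; page
         * alignment and a power-of-two chunk size keep it happy. */
        void *buf;
        err = posix_memalign(&buf, sysconf(_SC_PAGESIZE), CHUNK_SIZE);
        if (err) { fprintf(stderr, "posix_memalign: %s\n", strerror(err)); return 1; }
        memset(buf, 0, CHUNK_SIZE);   /* real code: accumulate compressed frames here */

        if (write(fd, buf, CHUNK_SIZE) != CHUNK_SIZE) { perror("write"); return 1; }

        free(buf);
        close(fd);
        return 0;
    }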
The system has no problem doing this. It can work fine for 30 minutes. Then a single compress suddenly takes 4 or 5 seconds.
4-5 seconds after 30 minutes?... Geez, even I have to catch my breath now and then!

If I write to /dev/null instead of a physical file, the compress per image stays a constant 10 milliseconds. It is only when I fopen/fwrite a real file on an XFS disk that this happens.
If #3 is true, you might get better long-term performance by restructuring your database: copy the files to another partition, and in your fstab set allocsize= for that new partition to the largest size your files will become. That spreads the data out when the allocator first allocates the files, so later updates won't have to hunt for space that is far from the file.
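For example, a line like this in /etc/fstab (the device and mount point are placeholders; 1g is the largest value allocsize accepts):

    /dev/sdb1  /data  xfs  defaults,allocsize=1g  0 0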