Anton Aylward wrote:
Linda Walsh said the following on 03/10/2013 02:41 PM:
----- Sorry to bring this up so long after the fact, but I wanted to add some *counter* information to the idea of *decreasing* a reserve factor.
Instead of decreasing it, there are times when you might want to increase it -- it depends on *your priorities*.
If you need space and don't care so much about speed, shrinking is likely the best way to go. But if you want to keep your disks running in their maximum speed range: with 1-1.5TB partitions on a 24TB (12x2TB) RAID, I've noticed that free space should be kept around 20-25%, and that's on a B-tree-based file system (XFS).
Context please!
You were part of the original discussion -- I remember your name. In that discussion, the general consensus was that decreasing it to 3% free space on larger disks was probably fine. I was giving one specific (i.e. context provided) example, yet you go off as if I was saying "always do XXYZ".... ?!?!!? Did you not read the whole note before thinking of your counterpoints?
I can think of specifics:
* a file system devoted to things that don't change (except for updates), such as the hierarchies /usr/lib, /usr/share/man, /usr/share/icons and many others, so there is no churn, and in the limiting case these might be treated as 'read only' file systems. Certainly in a shared environment -- such as thin workstations that PXE boot[1], or that have only a minimal local file system (such as /etc) and NFS mount everything else -- administration would have it that everything except the home directory/roving share is effectively read-only.
---- How is that a file system that has been up to 90% full and back down again?
* a file system that contains structured data, such as database files, which are very large and whose internals change but whose size does not, so the issue of free space and allocation and the 'churn' of the free list does not matter. (Some FS based databases on RAID systems used by OLTP applications can be affected by the 'small write' problem.)
---- Not my scenario.
* a file system that has a very high 'churn' such as one supporting a development project - editing of source, rcs/subversion, compiling and testing. (Which may also run into the 'small write' problem with some RAID set-ups.)
---- Much closer.
In a high-churn situation such as the third case, or users' home directories -- files being created, edited and deleted at a fair clip -- the space allocator is going to try to 'optimise' for something. Some cases might want a complete rewrite of the files (think: using vi), so having ample free space so that a file's blocks can be 'grouped' is going to improve the performance not only of the allocator[2] but also of file access.
Right, and it sounded like a home user was asking this question earlier, no? That's not the advice he went off with.
Of course that makes little sense with small files such as the config files that litter /etc -- under 4K or whatever your allocator block size is set to. Oh, right, some B-tree FSs can pack small files into the 'tails' of other files.
AFAIK, only ReiserFS has that, and development on that FS doesn't seem to be progressing. EXT4 just added packing content into inodes, which XFS has had for 20 years... (The first white paper I read about its design, on replacing the EFS file system, talked about the decisions that were made -- and the features included -- and was dated 1993.)
Now, given all that, there is a big rider. Some disk managers such as LVM mean that while a file system is optimized, the blocks that make up the partition (Logical Volume in LVM parlance) may be subject to a scatter-gather. HO HO HO!
---- ?!? I'm pretty sure you are being vague and confusing. If your disks are allocated contiguously and you aren't trying to use any 'thin provisioning' (which, AFAIK, SuSE doesn't support anyway), the file system itself isn't moved around on disk. But how does that tie into file system optimization -- and what do you mean by that? I don't know of many file systems that support free-space defragmenting. Until the past few years, XFS was the only one that supported any type of defragmenting utility at all (other than Windows)... but it doesn't handle defragging free space.
That's because my max read/write rate is ~1GB/s, so even the smallest delays create a much more noticeable performance hit than they would on one disk -- i.e. the computer can do many computations while the I/O controller is doing a 64K read/write on one disk, but the time it has for calculations on a 12-spindle RAID is a decimal order of magnitude less. So the CPU needs to search through 12x the data in 1/10th the time to maintain max I/O rates. The block allocator starts having problems finding large contiguous free space on file systems over 75% full, and if (like me) you've gone up into the low 90s percent usage and back down, you've got files spread throughout the free area. (I could rebuild the FS from scratch and that would give some more speed, until I let it get too full again.)
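(Aside: the 20-25% headroom rule above is easy to watch for. A minimal Python sketch, assuming a mount point of "/" and a 20% floor purely for illustration -- pick your own numbers:)

```python
import os

# Warn when a filesystem drops below a chosen free-space floor, per the
# observation above that the allocator struggles past ~75-80% full.
# FLOOR and the mount point are illustrative assumptions, not advice.
FLOOR = 0.20

st = os.statvfs("/")
free_frac = st.f_bavail / st.f_blocks   # space a non-root user can still use
if free_frac < FLOOR:
    print(f"warning: only {free_frac:.1%} free; allocator may start fragmenting")
else:
    print(f"ok: {free_frac:.1%} free")
```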
Apart from that, your reasoning is correct, so long as we are considering the 'high churn rate' type of file systems.
If your HD were the speed of a floppy, you wouldn't notice a speed hit even at 99% usage, because the I/O was so slow; but the faster your HD subsystem, the more you'll notice the lag caused by the block allocator and the buffering system. Many people with RAIDs have noticed a 30% or greater speed boost by avoiding the buffer cache. As long as CPU and memory speeds stayed 100-1000 times faster than disk, a lot could be done to optimize I/O in memory, but with CPU speeds going down (to conserve power) and disk speeds going up with the moves to SSDs and RAIDs, those same algorithms won't work as well at the limits...
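(For what it's worth, "avoiding the buffer cache" usually means O_DIRECT. A hedged Python sketch -- the 4K alignment is an assumption, and it falls back to buffered I/O because not every filesystem accepts O_DIRECT:)

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; the real requirement depends on fs/device

def direct_write(path, buf):
    """Write buf bypassing the page cache when O_DIRECT is available."""
    direct = getattr(os, "O_DIRECT", 0)       # 0 where unsupported (e.g. macOS)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | direct, 0o600)
    try:
        return os.write(fd, buf)              # O_DIRECT needs an aligned buffer
    except OSError:
        os.close(fd)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)  # buffered fallback
        return os.write(fd, buf)
    finally:
        os.close(fd)

aligned = mmap.mmap(-1, BLOCK)                # anonymous maps are page-aligned
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
n = direct_write(tmp.name, aligned)
os.unlink(tmp.name)
```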
Again, there are a lot of unstated assumptions. There are various types of caching to do with a file system. Oh, right, you can say "it's all data", but that's unhelpful.
---- There are no unstated assumptions. Those are statistical trends. Bring the number of CPUs up to 256 and the GHz down to 1, as on some new Intel machines, then toss in a stack of RAID10 with 30 spindles, and you are easily going to be pushing the limits of the machine's allocators... With single SSDs *boasting* hundreds of MB/s, how do they hold up in RAIDs? Some of the newer enterprise SSDs maintain rated speeds without TRIM (though the rated speeds are not in the highest categories).
Yes, "a lot can be done to optimise I/O in memory", and many of the algorithms are independent of the file system. Whether things like inode caching and name-segment caching are used is one such factor. See http://makarevitch.org/rant/bufchint.html, and note the part on how an application can advise the kernel about its expected FS use.
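(For example, a rough Python sketch of that kind of advice via posix_fadvise(2) -- Linux-specific, so it's guarded; the file and chunk size are just illustrative:)

```python
import os
import tempfile

# An application hinting the kernel about its access pattern, as the
# linked rant describes.  Hypothetical scratch file, sequential scan.
path = tempfile.mktemp()
with open(path, "wb") as f:
    f.write(b"x" * 1_000_000)

fd = os.open(path, os.O_RDONLY)
try:
    if hasattr(os, "posix_fadvise"):
        # We will read sequentially: the kernel can read ahead aggressively.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    total = 0
    while chunk := os.read(fd, 64 * 1024):    # 64K reads, as discussed above
        total += len(chunk)
    if hasattr(os, "posix_fadvise"):
        # Done with it: let the kernel drop these pages from the cache.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
finally:
    os.close(fd)
    os.unlink(path)
```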
The day you can get all the apps to cooperate and tell the OS about their projected I/O usage is the day I'll never need to work again (or hell freezes over, something like that... ;-))... App writers are hard pressed to know themselves how their app will behave in the field under customer load -- let alone predict it well enough to tell the OS about it...
I note, also, that you've omitted a lot of IO tuning, such as the disk elevator algorithm, queue depth and other matters.
Um... The point was how much free space to leave... not a full dissertation on all factors affecting disk speed and optimization.
The [home] user was given information that you now say is wrong; i.e. they
had their free space down to 3%, and you said they had plenty of space left...
-------- Original Message --------
Subject: Question re filesystem reserved block percent
Date: Sun, 10 Feb 2013 17:02:44 -0500
From: Anton Aylward
Advice, please . . .
The default filesystem reserved blocks is 5%. IIRC that goes back a long time to when much smaller drives were in use. Is there a formula or rule-of-thumb now for today's large drives/partitions? I have quite a few 100-300GB partitions which I have tuned down to 3%, but it still seems I'm wasting a lot of space. Suggestions?
Run 'df' on the partition(s) that/those filesystem(s) is/are on. Unless you're really s[h]ort of space, then you're not wasting space -- you've still got plenty to spare.
----------------------------------------
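(For the record, the reserved pool that question is about is visible from userland. A small Python sketch, with "/" as a purely illustrative mount point:)

```python
import os

# statvfs() exposes the reserved-blocks pool: f_bfree counts all free
# blocks, f_bavail only those a non-root user may allocate; the gap is
# (roughly) the tune2fs -m percentage of an ext* filesystem.
st = os.statvfs("/")
reserved = st.f_bfree - st.f_bavail
reserved_pct = 100.0 * reserved / st.f_blocks
print(f"reserved for root: {reserved} blocks ({reserved_pct:.1f}%)")
```

On ext2/3/4, lowering the pool is `tune2fs -m 3 /dev/sdXN` (it can be changed while mounted); on filesystems without a reserved pool the gap simply reads as zero.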
I keep saying Context is Everything and there is no 'one size fits all' solution.
Oh, and "time changes everything". Who knows what BtrFS will be delivering a year from now ...
Yeah... who knows? The claims sound a bit too good to be true... and usually when they sound that way, they are, but occasionally there are diamonds in the rough...
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org