[opensuse] Sizing ext4 partitions
One of the major reasons I use ReiserFS rather than the ext series for my file systems is the problem that the ext series inherited from the 1970s-era UNIX V6 and V7 file systems: pre-allocating a fixed amount of space for inodes and a fixed amount of space for data. The ratio can be varied, but it's set at mkfs time.

Now there are, I realise, options. The file /etc/mke2fs.conf has entries for common ratios.

What I'm wondering is this: If I know the overall characteristics of the files in the file system, can I set the ratios sensibly?

For example, if my camera is producing 24MB RAW images accompanied by 5K .xmp files from Darktable and 5MB JPEGs, say 30MB per image, what should my ratio be?

Another example might be music files, .ogg files of between 3MB and 5MB. Though realistically music is going to vary a lot more than photographs.

Or, to ask this another way, is there a utility that will walk through a file system and tell me things like average file size as well as, perhaps, graph the curve of sizes, and recommend the best setting for an ext4 FS?

--
A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting frowned upon?

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
On 02/17/2015 08:35 PM, Anton Aylward wrote:
What I'm wondering is this: If I know the overall characteristics of the files in the file system can I set the ratios sensibly?
I don't know of such a tool; however, I've never run into problems because of too few free inodes on normal-sized ext3/4 file systems. Of course, the defaults may be odd for extremely small file systems like 100MB (e.g. for testing), but unless you have a very special use case with an extraordinary number of small files, you won't have a problem.

For example: the 20G "/" file system was automatically created with 1.3M inodes, while an 800G data partition would take 51M files; look at the inode percentage:

  df -h --output / /media/sdb5
  Filesystem  Type  Inodes  IUsed  IFree  IUse%  Size  Used  Avail  Use%  File         Mounted on
  /dev/sda2   ext4    1.3M   296K   985K    24%   20G  9.9G   8.7G   54%  /            /
  /dev/sdb5   ext4     51M   1.8M    49M     4%  797G  343G   455G   43%  /media/sdb5  /media/sdb5

With more sound, photo and video files, the IUse% is even smaller.

If you fear that you'd waste space on never-to-be-allocated inodes, then you could move away from the default. I personally don't think it's worth the trouble - if some basic math is trouble at all. ;-)

Have a nice day,
Berny
On Tuesday 17 of February 2015 14:35:27 Anton Aylward wrote:
What I'm wondering is this: If I know the overall characteristics of the files in the file system can I set the ratios sensibly?
Or, to ask this another way, is there a utility that will walk through a file system and tell me things like average file size as well as, perhaps, graph the curve of sizes, and recommend the best setting for an ext4 FS?
About the statistics of file sizes, you could run:

  as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes
  barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))

in R (package R-base), to display a bar plot.

This is essentially the same answer as the one I had given you 5 years ago.

--
Regards,
Peter
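For anyone who would rather not install R, roughly the same statistics can be sketched in Python. This is a minimal sketch and not something posted in the thread: the function names (`file_sizes`, `bucket`, `summarize`, `print_report`) are made up for illustration, the buckets are "smallest power of two at or above the file size" rather than the exact intervals the R code uses, and the "median" is the upper median. It prints text bars instead of drawing a plot.

```python
import os
import stat
import sys
from collections import Counter


def file_sizes(root):
    """Yield the size in bytes of every regular file below root (symlinks skipped)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; ignore it
            if stat.S_ISREG(st.st_mode):
                yield st.st_size


def bucket(size):
    """Smallest power of two >= size; empty files get their own bucket 0."""
    return 0 if size == 0 else 1 << (size - 1).bit_length()


def summarize(sizes):
    """Return count, mean, upper median and a power-of-two histogram."""
    sizes = sorted(sizes)
    if not sizes:
        return {"count": 0, "mean": 0.0, "median": 0, "histogram": {}}
    hist = Counter(bucket(s) for s in sizes)
    return {
        "count": len(sizes),
        "mean": sum(sizes) / len(sizes),
        "median": sizes[len(sizes) // 2],
        "histogram": dict(sorted(hist.items())),
    }


def print_report(summary, width=50):
    """Print the summary with one text bar per size bucket."""
    print(f"files: {summary['count']}  mean: {summary['mean']:.0f} B  "
          f"median: {summary['median']} B")
    peak = max(summary["histogram"].values(), default=1)
    for upper, n in summary["histogram"].items():
        bar = "#" * max(1, n * width // peak)
        print(f"<= {upper:>12} B  {n:>8}  {bar}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # The directory to scan is taken from the first command-line argument.
    print_report(summarize(file_sizes(sys.argv[1])))
```

Saved under a name of your choice, it would be run with the directory as its only argument, e.g. `python3 sizes.py ~/Photography`. It only reports the distribution; it does not recommend mke2fs settings.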
On 02/17/2015 02:43 PM, auxsvr@gmail.com wrote:
On Tuesday 17 of February 2015 14:35:27 Anton Aylward wrote:
What I'm wondering is this: If I know the overall characteristics of the files in the file system can I set the ratios sensibly?
Or, to ask this another way, is there a utility that will walk through a file system and tell me things like average file size as well as, perhaps, graph the curve of sizes, and recommend the best setting for an ext4 FS?
About the statistics of file sizes, you could run:
as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes
barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))
in R (package R-base), to display a bar plot.
This is essentially the same answer as the one I had given you 5 years ago.
Hmmmmm,
as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))
Error: unexpected symbol in "as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes barplot"
Maybe that's why he's asking again?

--
After all is said and done, more is said than done.
On 02/17/2015 06:08 PM, John Andersen wrote:
On 02/17/2015 02:43 PM, auxsvr@gmail.com wrote:
On Tuesday 17 of February 2015 14:35:27 Anton Aylward wrote:
What I'm wondering is this: If I know the overall characteristics of the files in the file system can I set the ratios sensibly?
Or, to ask this another way, is there a utility that will walk through a file system and tell me things like average file size as well as, perhaps, graph the curve of sizes, and recommend the best setting for an ext4 FS?
About the statistics of file sizes, you could run:
as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes
barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))
in R (package R-base), to display a bar plot.
This is essentially the same answer as the one I had given you 5 years ago.
Hmmmmm,
as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))
Error: unexpected symbol in "as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes barplot"
Maybe that's why he's asking again?
Maybe I can handle, and debug, it in a scripting language I know, like Perl, awk, Ruby or something like that.

Five years? What's the URL?
On Tuesday 17 of February 2015 21:52:50 Anton Aylward wrote:
Maybe I can handle, and debug, it in a scripting language I know, like Perl, awk, Ruby or something like that.
It should be split into two lines; the code is syntactically correct. I've tested it many times here.
Five years? What's the URL?
http://lists.opensuse.org/opensuse/2010-07/msg00405.html

--
Regards,
Peter
On Tuesday 17 of February 2015 15:08:45 John Andersen wrote:
as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))
Error: unexpected symbol in "as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes barplot"
Maybe that's why he's asking again?
It's supposed to be split into two lines:

  as.numeric(system("find DIR -type f -printf '%s\n'", intern=T)) -> fs_sizes
  barplot(table(cut(fs_sizes, breaks=c(0,2^(1:ceiling(log2(max(fs_sizes))))) )))

This has been working here for the past 5 years.

--
Regards,
Peter
On 02/18/2015 02:35 AM, auxsvr@gmail.com wrote:
This has been working here for the past 5 years.
That's fine, Peter, but I really don't want to load R and all its dependencies just to run this once to see what parameters I need to give when creating an ext4 FS to move my photos there.

So far on this thread I'm disappointed that no-one is really trying to answer the question I'm asking.

No, I don't want to use XFS. It solves the inode problem the same way that ReiserFS does.

No, I don't want to 'fergetaboutit'; I've been bitten by inode/data imbalance/exhaustion. It's why I use ReiserFS and BtrFS. If I were so inclined I might have gone the XFS path instead of ReiserFS, since both are B-tree based, both 'create' inodes dynamically and both are easily resizeable. But I went the ReiserFS route and that has proven stable and reliable.

Part of the reason I'm looking at ext4 is that the size of the files under ~/Photography is pretty predictable. RAW files are *always* about 24M. The .xml files are *always* just a few k. The JPG files are *always* a few meg, always less than 10M, usually only 2-3M.

Things in ~/Documents, ~/Downloads and ~/Music are much more variable.

So there's a lot of predictability to the files in ~/Photography, both to the overall FS and to the actual layout of each file on the disk allocation. This sounds like something ext4 could take advantage of. What I'm surprised at is that none of the ext4 boosters have said "yes, this is a good match for ext4". Perhaps it isn't, or perhaps they are just rah-rah-rah fanbois who don't actually understand the technology :-O

Seriously: is there someone who understands ext4 well enough to say "yes, this is a good match, here's how" or "no, it isn't, just, as Berny says, make the FS so over-provisioned that it doesn't matter"? If the latter, then I know that it's not worth bothering with changing from ReiserFS to ext4.

This is all just a query, not a raging argument. If the ext4 people can give me adequate reasons to change then I'm interested, but if not I'm happy using ReiserFS into the foreseeable future.

I've never had any unrecoverable problems with ReiserFS.
On Wed, 18 Feb 2015 15:07, Anton Aylward wrote:
On 02/18/2015 02:35 AM, auxsvr@gmail.com wrote:
This has been working here for the past 5 years.
That's fine, Peter, but I really don't want to load R and all its dependencies just to run this once to see what parameters I need to give when creating an ext4 FS to move my photos there.
So far on this thread I'm disappointed that no-one is really trying to answer the question I'm asking.
No, I don't want to use XFS. It solves the inode problem the same way that ReiserFS does.
No, I don't want to 'fergetaboutit'; I've been bitten by inode/data imbalance/exhaustion. It's why I use ReiserFS and BtrFS. If I were so inclined I might have gone the XFS path instead of ReiserFS, since both are B-tree based, both 'create' inodes dynamically and both are easily resizeable. But I went the ReiserFS route and that has proven stable and reliable.
Part of the reason I'm looking at ext4 is that the size of the files under ~/Photography is pretty predictable. RAW files are *always* about 24M. The .xml files are *always* just a few k. The JPG files are *always* a few meg, always less than 10M, usually only 2-3M.
Things in ~/Documents, ~/Downloads and ~/Music are much more variable.
So there's a lot of predictability to the files in ~/Photography, both to the overall FS and to the actual layout of each file on the disk allocation. This sounds like something ext4 could take advantage of. What I'm surprised at is that none of the ext4 boosters have said "yes, this is a good match for ext4". Perhaps it isn't, or perhaps they are just rah-rah-rah fanbois who don't actually understand the technology :-O [snip]
A few thoughts beforehand:
- In the last five years, the ext4 guys learned about their failures, and adapted, too. Just not as loudly as the BTRFS guys.
- The defaults have gotten much better.
- Still, for /home at least I'm going to overprovision the inodes.

How do I calc my numbers for mkfs.ext4 / mke2fs?

- To know beforehand:
  1. ca. number of files and dirs (see df -i /home)
  2. ca. number of used 1k blocks (see df -k /home)
  3. Partition size.

- Now I calc my median bytes per inode ratio:
  inode density = used blocks / used inodes
  (in kB per inode, since df -k counts 1k blocks)

- Next I test what mke2fs would use as defaults. (If /home is unmounted, try it on that partition with -n; otherwise try a loopback file of the same size.)
  "mke2fs -n [partition]"
  (In this example just an 8GB partition gives:)
[code]
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
524288 inodes, 2097152 blocks
104857 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2147483648
64 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
[/code]

- Interesting for us here are the lines:
[code]
Block size=4096 ...
524288 inodes, 2097152 blocks
[/code]

- How many of our files would fall below "Block size", in percent?
- How many directories would be oversized (default inode size is 256)? ("Block size"/256)
- What bytes per inode ratio would be the default?
  (here 2097152 blocks / 524288 inodes = 4 blocks per inode; with 4k blocks that is 4 x 4k = 16kB per inode)

- For most cases a block size of 4k is OK, but the default bytes per inode ratio is too big for a /home partition, IMHO. I prefer a higher density for home: BS=2k, BpI=4k.

- Combined into the mke2fs command:
  "mke2fs -b 2048 -i 4096 [partition]"
  This would give me (for the same 8GB partition as before)
[code]
Filesystem label=
OS type: Linux
Block size=2048 (log=1)
Fragment size=2048 (log=1)
Stride=0 blocks, Stripe width=0 blocks
2097152 inodes, 4194304 blocks
209715 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=541065216
256 block groups
16384 blocks per group, 16384 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        16384, 49152, 81920, 114688, 147456, 409600, 442368, 802816, 1327104,
        2048000, 3981312
[/code]

- Inode ratio now: 4194304 blocks / 2097152 inodes = 2 blocks per inode; with 2k blocks -> 4kB per inode

For me as a programmer that is fine: 2 million inodes in an 8 GB space.

Shorthand:
Absolute bare minimum:
  minimum_inodes = (number of files + number of dirs) / space_used * space_avail
Two times that to be on the safe side for a minimum.

Mixed environments (here big photo files + normal $HOME cruft) are not trivial; ca. 10000 inodes just for the normal $HOME cruft is a bare minimum to be safe in the long term.

Does that help to answer your questions?

- Yamaban.
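The arithmetic in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, "space_avail" in the shorthand formula is read as the size of the new file system (`target_bytes`), and the factor of two is the "two times that" safety margin. `os.statvfs` is a real call but is only available on Unix-like systems.

```python
import math
import os


def used_space_and_inodes(path):
    """Return (used_bytes, used_inodes) for the mounted file system holding path."""
    st = os.statvfs(path)  # Unix only
    used_bytes = (st.f_blocks - st.f_bfree) * st.f_frsize
    used_inodes = st.f_files - st.f_ffree
    return used_bytes, used_inodes


def bytes_per_inode(size_bytes, inodes):
    """Bytes of file system per inode, i.e. the value passed to 'mke2fs -i'."""
    return size_bytes // inodes


def minimum_inodes(used_inodes, used_bytes, target_bytes, safety=2):
    """Scale the current inode usage to the target size, times a safety factor.

    Assumption: the new file system will hold the same mix of files as the old one.
    """
    return math.ceil(used_inodes * target_bytes / used_bytes * safety)


if __name__ == "__main__":
    # The two 8GB layouts from the mke2fs output quoted above.
    print(bytes_per_inode(2097152 * 4096, 524288))    # default layout
    print(bytes_per_inode(4194304 * 2048, 2097152))   # -b 2048 -i 4096 layout
```

On a live system, `used_space_and_inodes("/home")` would supply the first two arguments of `minimum_inodes`; on file systems that allocate inodes dynamically the inode figures reported by statvfs may not be meaningful.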
On 02/18/2015 11:30 AM, Yamaban wrote:
A few thoughts beforehand: - In the last five years, the ext4 guys learned about their failures, and adapted, too. Just not as loudly as the BTRFS guys.
That's what I was banking on :-)
- The defaults have gotten much better.
- Still, for /home at least I'm going to overprovision the inodes.
OK but by what margin and how are you going to rationalise that?
How do I calc my numbers for mkfs.ext4 / mke2fs?
- To know beforehand:
  1. ca. number of files and dirs (see df -i /home)
  2. ca. number of used 1k blocks (see df -k /home)
  3. Partition size.
Bully for you.

As for #3, I'm banking on using LVM and if necessary growing the partition size. I do know that with ReiserFS when I do that it grows both data and inodes. I understand ext4 is growable. I'd like to know if and how the data/inode availability/ratio grows when that happens.

As for #1 and #2, I don't know. I only know what I have for the photographs I took last year and how I chose to organise them based on last year's events and volume. I don't know what next year will hold. Right now I don't even know if I'm going to organise by year, by event category or what. I don't know if the FS will contain last year's photos as well as this year's photos.

I do know that the result of each photo is a RAW file, a darktable XML file and a JPG produced by darktable, all of which add up to about 30 megabytes. I do know that the RAW files are all 24 megabytes of continuous data with no holes.
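On the question of what happens to the inode count when an ext4 file system is grown, here is a small sketch. It rests on an assumption that is not stated anywhere in this thread: that growing the file system adds whole block groups, and that every block group carries the same number of inodes that was fixed at mkfs time, so the inode total grows in step with the size and the bytes-per-inode ratio stays roughly where it was set. The function name is hypothetical, and the numbers are the "blocks per group" and "inodes per group" values from Yamaban's mke2fs output.

```python
def inode_count(total_blocks, blocks_per_group, inodes_per_group):
    """Inodes available if every (possibly partial) block group has a full inode table.

    Assumption: inodes_per_group is fixed at mkfs time and reused for new groups.
    """
    groups = -(-total_blocks // blocks_per_group)  # ceiling division
    return groups * inodes_per_group


if __name__ == "__main__":
    # Default 8GB layout from the thread: 4k blocks, 32768 blocks and 8192 inodes per group.
    print(inode_count(2097152, 32768, 8192))
    # The same layout after doubling the volume to 16GB.
    print(inode_count(2 * 2097152, 32768, 8192))
```

If the assumption holds, a ratio that is too sparse at creation time stays too sparse after growing, which is why the mkfs-time choice matters.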
- Now I calc my median bytes per inode ratio: inodes density = used blocks / used inodes
"used blocks" is an interesting concept. I know the size of the files, but that may or may not have an easy correspondence to used blocks. This is ReiserFS and it does its best to fit stuff into blocks. So a block might be one byte full with the tail end of a file and then nothing, or that nothing might be stuffed with an inode, another file, some structural data ... Who knows. So it's not easy to determine in a way that maps to ext4 concepts. We don't have a strong separation.
- Next I test what mke2fs would use as defaults.
Oh dear. This reminds me of the mid 1980s and SCO UNIX. Because of the lack of answers to these kinds of questions I would install over and over, doing step-and-repeat to find out how much space was needed by the system and application code when locked down, then step-and-repeat doing shrink or stretch to optimise.

It's to avoid this kind of 'test' that I decided to go for LVM and growable/shrinkable ReiserFS.

The more I read, the more I think that I'm foolish to consider ext4 as an alternative to ReiserFS. ReiserFS, and I gather XFS, and I note BtrFS, avoid all this craziness. The only reason I even looked at ext4 was the predictability of file sizes in ~/Photography because of my use of RAW.
[snip]
- For most cases a block size of 4k is OK, but the default bytes per inode ratio is too big for a /home partition, IMHO,
Indeed, but I'm not looking at /home with files of all sizes, I'm looking at ~/Photography where there are files of only 3 sizes: the RAW files at 24M, the XML files at a couple of K and the JPG files at a few M. All in all about 30M per "image".
I prefer a higher density for home, BS=2k BpI=4k.
So I get something close to 10M per inode.

The issue now comes down to the block size. If I have the RAW files doing N blocks + 1 byte, the wasted ratio with 4K blocks isn't much. That's because the files are so big. Heck, I could go to 8K or 16K blocks and it won't make much difference. Almost the same for the JPG files at a few M each.

But 30% of the files are 2K or less. That means at 4K I'm guaranteed to be wasting space, and it gets worse for larger block sizes. Unless ext4 does 'stuffing' like ReiserFS and BtrFS do.

Well, does it? What does it 'stuff' into spare space? How aggressive is it about that?
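The block-size trade-off described above can be put into numbers with a short sketch. The assumptions are loud ones: the per-image file mix (24 MiB RAW and 5 MiB JPEG, each one byte over a block boundary as the worst case, plus a 2 KiB sidecar) is illustrative, the function names are made up, and only the unused tail of each file's last block is counted. It ignores metadata and makes no claim about which block sizes a given kernel will actually accept.

```python
KIB = 1024
MIB = 1024 * KIB


def allocated(size, block):
    """Bytes occupied when size is rounded up to whole blocks."""
    return -(-size // block) * block


def slack(size, block):
    """Unused bytes in the last block of a file of the given size."""
    return allocated(size, block) - size


def slack_fraction(sizes, block):
    """Share of the allocated space that is tail slack for this set of files."""
    total = sum(allocated(s, block) for s in sizes)
    return sum(slack(s, block) for s in sizes) / total


if __name__ == "__main__":
    # Hypothetical files for one photo: RAW, darktable sidecar, JPEG.
    per_image = [24 * MIB + 1, 2 * KIB, 5 * MIB + 1]
    for block in (1 * KIB, 2 * KIB, 4 * KIB, 8 * KIB, 16 * KIB):
        print(f"{block // KIB:>2}K blocks: {slack_fraction(per_image, block):.4%} slack")
```

The design point is that slack is bounded by one block per file, so for a set dominated by multi-megabyte files the small sidecars contribute little to the total even though each one wastes a large share of its own block.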
This would give me (for the same 8GB partition as before)
FYI I took about 300GB worth of photos last year. I expect to match or exceed that this year.
Two times that to be on the safe side for a minimum.
Ah. Fifty percent redundancy. Let me look in my DatabaseOfDotSigQuotes ... Ah yes:

Optimist: The glass is half full
Pessimist: The glass is half empty
Cost Accountant: The vessel is too large for its purpose
Engineer: The glass has a 100% safety margin
Financier: What you're looking at is a half pint of depreciable assets sitting in a pint of capital infrastructure that can be amortized over two accounting periods.
XXX: There is a glass with a certain volume of liquid in it. From there, it's up to you!
Bartender: Half empty or Half full depends on whether you are drinking or pouring!
YYY: It doesn't matter -- Whatever's inside it is evaporating either way.
Mixed environments (here big photo files + normal $HOME cruft) are not trivial; ca. 10000 inodes just for the normal $HOME cruft is a bare minimum to be safe in the long term.
Whereas I'm talking about only ~/Photography and there is a very, very deterministic file set there. Quite different from the stuff in $HOME, ~/Documents, ~/Downloads or even ~/Music
Does that help to answer your questions?
Not really. All you've done is convinced me my decision to avoid ext4 and go with ReiserFS was right in the first place.
On 02/18/2015 06:52 PM, Anton Aylward wrote:
I'm looking at ~/Photography where there are files of only 3 sizes: the RAW files at 24M, the XML files at a couple of K and the JPG files at a few M. All in all about 30M per "image"
[...]
FYI I took about 300GB worth of photos last year.
Ah, now let's do the math based on your numbers.

300GB of photos = (roughly) 10000 RAW + 10000 XML + 10000 JPG = 30000 inodes

That means you'd fill a 2TB disk with about 200000 inodes (originating from about 66666 photos).

Assuming now that you have another 300000 other files on that file system, that makes 500K inodes when you run out of disk (block!) space.

As written earlier, mkfs.ext4 created my 797GB file system with 51M inodes.

That said, IMO it's not worth thinking about the ratio of inodes vs. block size and stuff.

BTW: with the assumed 300GB per year, you'd fill up the above 2TB disk in ~6 years ... given the disk is still alive. ;-)

Have a nice day,
Berny
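Berny's estimate can be reproduced with a few lines of Python. This is a sketch of the arithmetic only: the constants are the figures quoted in the thread (30MB and three files per image, 300GB per year, 300000 other files, a 2TB disk, and the 16kB-per-inode default derived from Yamaban's mke2fs output), decimal units are assumed, and the names are made up for illustration.

```python
GB = 10**9
TB = 10**12
MB = 10**6


def photos_that_fit(disk_bytes, bytes_per_image=30 * MB):
    """Number of complete images (RAW + XML + JPG) that fit on the disk."""
    return disk_bytes // bytes_per_image


def inodes_for_photos(disk_bytes, files_per_image=3, bytes_per_image=30 * MB):
    """Inodes consumed by the photo files alone when the disk is full."""
    return photos_that_fit(disk_bytes, bytes_per_image) * files_per_image


def default_inodes(disk_bytes, bytes_per_inode=16384):
    """Inodes mke2fs would create at the given bytes-per-inode ratio."""
    return disk_bytes // bytes_per_inode


if __name__ == "__main__":
    disk = 2 * TB
    needed = inodes_for_photos(disk) + 300_000   # photos plus the assumed other files
    print("photos on a full disk:", photos_that_fit(disk))
    print("inodes needed:", needed)
    print("inodes at 16kB per inode:", default_inodes(disk))
    print("years to fill at 300GB/year:", round(disk / (300 * GB), 1))
```

With these inputs the default ratio provides a couple of hundred times more inodes than the file set would use, which is the point Berny is making; the flip side is that a far sparser ratio would still be sufficient for this particular file mix.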
On 02/18/2015 05:21 PM, Bernhard Voelker wrote:
BTW: with the assumed 300GB per year, you'd fill up the above 2TB disk in ~6 years ... given the disk is still alive.
More likely he will figure out that RAW is good for maybe .009% of his photos, and he will find he never diddles with the RAW, and he will forget the whole concept of keeping RAW well before he ever needs the space. In which case his inode calc will be off by a huge factor.

--
After all is said and done, more is said than done.
On 02/18/2015 08:26 PM, John Andersen wrote:
On 02/18/2015 05:21 PM, Bernhard Voelker wrote:
BTW: with the assumed 300GB per year, you'd fill up the above 2TB disk in ~6 years ... given the disk is still alive.
More likely he will figure out that RAW is good for maybe .009% of his photos, and he will find he never diddles with the RAW, and he will forget the whole concept of keeping RAW well before he ever needs the space. In which case his inode calc will be off by a huge factor.
NOT!

I may produce more than one JPG from the RAW, or I may not even produce one. The RAW is the archival copy and contains the most information. The transformations that darktable makes to produce the JPG are described in the XML files. I can delete the JPG and get it back from the RAW+XML.

I use RAW as a generic term for the camera's native-mode dump of the sensor. That's why it contains the most information. Right now each camera vendor has their own RAW format, though there is pressure for a camera-independent standard, DNG:
http://en.wikipedia.org/wiki/Digital_Negative

Low-end cameras, the point-and-shoot type, produce JPG directly and have a number of 'scene' settings which process the sensor information according to one of a number of formulas. JPGs are lossy and the transformation loses information that is available at the sensor level.

The RAW file is not lossy. It leaves the choice of transformation, perhaps multiple different transformations, to the "digital darkroom" software such as darktable. This is a lot more capable than the software in the camera. There are many web sites that illustrate how different results can be produced from the same raw image, as well as all the traditional 'darkroom' techniques such as cropping, dodging and spotting, and many more, such as noise reduction, colour standard translation, tone mapping and others, that the camera cannot perform.
Although I started this thread raising the spectre of past experience with inode exhaustion, no-one has addressed the 'advantages' of ext4.

The advice I've been given boils down to:

* use XFS
  Well, since I use ReiserFS that would not be a great change for the better. Both are B-tree based file systems. I've also discussed the advantages and problems of BtrFS.

* over-provision the inodes
  The calculations were no different from those I faced with the UNIX V6/V7 file system back in the late 1970s.

What's been absent from this thread are the supposed advantages of ext4 for space management. Googling, I find mention of things like multiblock allocation, delayed allocation, journal checksums, fast fsck, and extent-mapped files for more efficient storage of file metadata.

It was that last item that caught my attention. But the thread has avoided all these.

So far no-one, and nothing I've read, has given me a compelling reason to move from ReiserFS. It's certainly efficient and functional for general and smaller files.

--
Editing is a rewording activity.
participants (5)
- Anton Aylward
- auxsvr@gmail.com
- Bernhard Voelker
- John Andersen
- Yamaban