On Monday 09 August 2004 23:16, Randall R Schulz wrote:
Anders,
On Monday 09 August 2004 13:58, Anders Johansson wrote:
On Monday 09 August 2004 22:36, Randall R Schulz wrote:
I use mbox format, not maildir, which strikes me as an insanely bloated way to store email
This statement requires elucidation
mbox and maildir store the exact same information, except that mbox *adds* an additional "From" header to signify envelope sender. So mbox actually stores more data. Assuming you are using a sane file system - like reiserfs for example - that is capable of filling up the disk completely rather than leaving little empty "tails" in half-filled blocks à la ext2/3, maildir will consume *less* disk space. So how is it bloated?
I may be behind the times as regards Linux / Unix file systems, but I am under the impression that mass storage is allocated with granularities not less than a sector size and almost always a multiple thereof.
Under such allocation schemes and a typical allocation unit size of 4096 bytes (eight 512-byte sectors), a message whose entire contents (body, headers and mail client overhead / added headers) came to, say, 2000 bytes would waste 2096 bytes if it occupied a file of its own.
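The arithmetic above can be checked with a short sketch. It assumes a file system that hands out whole allocation units and never shares one between files; the name `slack_bytes` and the 4096-byte default are illustrative only and do not come from any real tool.

```python
def slack_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes left unused in a file's last block under whole-block allocation.

    Assumption: each file gets its blocks to itself (no tail packing).
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    remainder = file_size % block_size
    # A file that ends exactly on a block boundary wastes nothing.
    return 0 if remainder == 0 else block_size - remainder


if __name__ == "__main__":
    # The 2000-byte message from the text, stored in a file of its own.
    print(slack_bytes(2000))
```

For many small messages the per-file slack adds up, which is the concern being raised here about one-file-per-message storage.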
Are you telling me that there are file systems under Linux that eliminate internal fragmentation?
Yep
If so, all I can say is that I'm impressed. But I also have to wonder what happens under such a scheme when a file is to grow and it ends mid-sector (or mid-allocation unit) and another file shares that sector or allocation unit. It seems that a hell of a lot of shuffling would ensue.
Why? It would require at most blocksize-1 bytes to be moved someplace else, and then the file can grow as under any other file system, and that is assuming that the tail of another file has had time to move in there. Have a look at the whitepapers at www.namesys.com for the technical specifics (or the reiserfs source, if you are that way inclined :)
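The blocksize-1 bound follows from a tail being whatever is left over after the last full block. A minimal sketch, assuming a 4096-byte block; `tail_bytes` is a hypothetical helper for illustration and is not taken from the reiserfs source.

```python
def tail_bytes(file_size: int, block_size: int = 4096) -> int:
    """Size of the partial last block ("tail") of a file, 0 if there is none."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    return file_size % block_size


if __name__ == "__main__":
    block_size = 4096
    # Check every file size across a few blocks: the tail never reaches a full block.
    worst = max(tail_bytes(n, block_size) for n in range(3 * block_size))
    print(worst == block_size - 1)
```

So however the neighbouring file's tail got there, relocating it to make room for growth copies less than one block of data.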