Re: [opensuse] Re: Transparent content compression for space savings on linux like on NTFS?

----- Original Message -----
From: "Randall R Schulz" <rschulz@xxxxxxxxx>
To: <opensuse@xxxxxxxxxxxx>
Sent: Sunday, August 31, 2008 10:23 PM
Subject: Re: [opensuse] Re: Transparent content compression for space savings
on linux like on NTFS?


On Sunday 31 August 2008 18:22, Brian K. White wrote:

Although I don't think I have much use for mounting an archive as a
filesystem, it's interesting to know about nonetheless. Maybe it's
slightly fewer keystrokes for working on initrds, which are now cpio
files instead of ext2 filesystems.

In my case, the need arose when I devised a set of scripts (additions to
existing ones, actually) to facilitate large-scale experimentation
[...]
So I augmented the invocation scripts to maintain a log of compressed
test results. Each run of the prover generally produces many files,

Wow interesting.

So, is there a special reason why you want these in individual archives instead
of simply in directories?
Sounds to me like a compressed filesystem is your better answer after all.

I have something similar: a lot of different automated EDI transaction scripts
spawned many times a day by cron, by apache or postfix, or by local users in my
application. Sometimes a script involves importing or exporting, or a
several-step process of both. Sometimes the other side is a big company with
not-so-great IT staff, sometimes the transactions involve big consequences, and
sometimes disputes arise where the other side claims we never sent them a file,
or that we sent them a file with bad data, or that they sent us a file with
such & such data in it that we failed to react to...
Of course sometimes it is us, sometimes it is them, sometimes it is a weakness
in the procedure they specified we use, sometimes the programs and scripts all
worked perfectly but the customer put in bad data, and sometimes it is the
simple fact that the internet isn't perfect.

So over time by now I have most of these scripts maintaining logs that consist
of the entire working directory and all work files for the transaction.
The script starts by creating a new unique directory and then does all work
within it, never deleting any temp or data import/export files, capturing
stdout and stderr and sometimes other diagnostic info from each command into
separate files, and then just leaves the whole directory there when done. So
when such disputes arise now, there is a lot more evidence to look at.
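
Roughly like this (a stripped-down sketch; the base directory, variable names,
and the fetch/translate/send steps are made up for illustration, not my actual
scripts):

  #!/bin/sh
  # Each run gets its own unique working directory that is never cleaned up.
  LOGBASE=/var/spool/edi/history              # hypothetical base dir
  WORKDIR=$LOGBASE/$(date +%Y%m%d-%H%M%S).$$  # timestamp + pid keeps it unique
  mkdir -p "$WORKDIR" && cd "$WORKDIR" || exit 1

  # Each step captures its stdout/stderr into separate files; all temp and
  # import/export files are left in place as part of the transaction log.
  fetch_orders   > fetch.out  2> fetch.err    # hypothetical step
  translate_850  > xlate.out  2> xlate.err    # hypothetical step
  send_response  > send.out   2> send.err     # hypothetical step

  # No cleanup on exit: the whole directory stays behind as evidence.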

I have a cron job that just runs a simple find command to delete all files/dirs
older than X days, or sometimes I have the find command right in the script so
that the script maintains its own history and trims off any old data every time
the script runs.
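
For example, assuming the same hypothetical history directory and a 90-day
retention:

  # Nightly cron job: remove whole transaction dirs older than 90 days.
  find /var/spool/edi/history -mindepth 1 -maxdepth 1 -type d \
      -mtime +90 -exec rm -rf {} +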

The total data set is lots of files, but it's all organized into trees and dirs,
it's no hassle to go look at any particular transaction or all the
transactions for a given time frame, and the total data size isn't
straining the disk space. But the point of all this was: if I had more data, or
wanted to keep it longer or forever such that I wanted to compress it all, and
if I really needed immediate convenient random access to all of it (not just
the most recent X days'/months' worth), then I think I would still want this to
be a compressed filesystem instead of individual archive files. If I ever wanted
one or some logical grouping of them in archive form to mail to someone, that's
easy enough to do in one command (example below).
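
e.g. something like this, with made-up paths:

  # Bundle one month's worth of transaction dirs into a single archive to send.
  tar -cjf edi-200806.tar.bz2 /var/spool/edi/history/200806*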
About the only reason I can see for really wanting them maintained in the
form of archives is if you had, say, an internal web/ftp/samba share and you
needed to provide a simple browseable list for other people to grab an archive
file instead of browseable access to the equivalent directories. (Maybe some
are using interfaces that don't let them download a group of files or a
directory all at once easily.)

Another approach, if you don't need immediate/convenient random access to old
files but don't want to delete them either, is to just leave the dirs on the
regular filesystem and have a find command that turns old dirs into
tar.bz2's instead of deleting them (see the sketch below). That's still a simple
command and is maybe a simpler system than having a separate compressed
filesystem. Then you could use fuse, or just traditional unpacking, or a
front-end like kio to access the old ones, since it would be uncommon enough
not to matter much, and the ones you access more often are just plain files in
plain dirs, no special access hoops to jump through and full normal filesystem
functionality, a la the df hitch you discovered.
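
A rough sketch of that kind of find job (paths and the 30-day cutoff are just
examples; a real one would want a bit more error handling):

  # Turn transaction dirs older than 30 days into tar.bz2 archives,
  # deleting the original dir only if the tar succeeded.
  cd /var/spool/edi/history &&
  find . -mindepth 1 -maxdepth 1 -type d -mtime +30 |
  while read -r d; do
      tar -cjf "$d.tar.bz2" "$d" && rm -rf "$d"
  done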

--
Brian K. White brian@xxxxxxxxx http://www.myspace.com/KEYofR
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!

--
To unsubscribe, e-mail: opensuse+unsubscribe@xxxxxxxxxxxx
For additional commands, e-mail: opensuse+help@xxxxxxxxxxxx
