[opensuse-packaging] %fdupes
Hi!

We did some analysis on how much space is wasted by packages storing the same file twice (or more). While few packages waste megabytes (only 88 waste more than 1000Mib), 657 waste more than 20K - which sums up to 703MiB in total.

Impressed? Consider using fdupes in your package. It's pretty simple: add "BuildRequires: fdupes" and then use "%fdupes $RPM_BUILD_ROOT" in your %install section. This will check for duplicated files and turn them into hardlinks. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case. But you can also use %fdupes -s, which will create symlinks, which are easier for rpm to grasp :)

So you can also combine the two like this:

    # create symlinks for my man pages
    %fdupes -s $RPM_BUILD_ROOT%_mandir
    # create hardlinks for the rest
    %fdupes $RPM_BUILD_ROOT

I also added an rpmlint check that will give an error for the package if it's wasting more than 20KB (which is basically a random number).

Greetings, Stephan

--
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
---------------------------------------------------------------------
To unsubscribe, e-mail: opensuse-packaging+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-packaging+help@opensuse.org
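The kind of measurement described above can be sketched roughly as follows. This is only an illustration, not the script used for the analysis and not the rpmlint check itself: the function names wasted_bytes and exceeds_threshold are made up here, and reading "20KB" as 20 * 1024 bytes is an assumption.

```python
import hashlib
import os
import tempfile
from collections import defaultdict

# Assumption: "20KB" is read as 20 * 1024 bytes; the thread calls the
# limit "basically a random number".
THRESHOLD_BYTES = 20 * 1024


def wasted_bytes(root):
    """Return the bytes taken up by redundant copies of identical files."""
    by_content = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Symlinks and special files hold no duplicate data of their own.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            by_content[digest].append(path)

    wasted = 0
    for paths in by_content.values():
        # Names that already share an inode are hardlinks, not extra copies.
        inodes = set()
        for path in paths:
            st = os.stat(path)
            inodes.add((st.st_dev, st.st_ino))
        wasted += os.path.getsize(paths[0]) * (len(inodes) - 1)
    return wasted


def exceeds_threshold(root, limit=THRESHOLD_BYTES):
    """Hypothetical stand-in for the lint rule: True if too much is wasted."""
    return wasted_bytes(root) > limit
```

Calling wasted_bytes() on an unpacked build root gives the number that would be compared against the limit; files that are already hardlinked to each other are counted once.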
Stephan Kulow wrote:
Hi!
We did some analysis on how much space is wasted by packages storing the same file twice (or more). While few packages waste megabytes (only 88 waste more than 1000Mib), 657 waste more than 20K - which sums up to 703MiB in total.
Interesting... that is one CD less, wow. ;-) It would be nice if the list of offending packages could be published in order to fix them ;)
On Wednesday, 16 May 2007, Cristian Rodriguez R. wrote:
Stephan Kulow wrote:
Hi!
We did some analysis on how much space is wasted by packages storing the same file twice (or more). While few packages waste megabytes (only 88 waste more than 1000Mib), 657 waste more than 20K - which sums up to 703MiB in total.
Interesting.. that is one CD less ,, wow.;-)
Most of the packages wasting a lot are also big enough to not be on our CDs.
Will be nice if the list of offending packages can be published in order to fix them ;)
I'd prefer if every packager checked his own rpmlint reports instead of us putting out a list of blame [1].

Greetings, Stephan

[1] And yes, that means one or two KDE packages score pretty well ;)
On Wed, 16 May 2007 at 04:21, Cristian Rodriguez R. wrote:
Stephan Kulow wrote:
[...] which sums up to 703MiB in total.
Interesting.. that is one CD less ,, wow.;-)
I guess the 703MB are the size of these files when installed, not the size they add to the (compressed) RPM files that go to the CDs.

cu
Reinhard
On Wednesday, 16. May 2007, Stephan Kulow wrote:
I also added an rpmlint check that will give an error for the package if it's wasting more than 20KB (which is basically a random number).
This has been copied to http://en.opensuse.org/Packaging/SUSE_Macros

Greetings, Dirk
On Wednesday, 16 May 2007, Stephan Kulow wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end up on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167

Using something like

    %fdupes $RPM_BUILD_ROOT/usr
    %fdupes $RPM_BUILD_ROOT/srv
    ...

fixes the problem.

Do you think that the %fdupes macro should be changed to do this automatically?

Vladimir
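What "doing this automatically" could mean is sketched below: deduplicate each top-level directory of the build root separately, so that no hardlink ever spans, say, /usr and /srv. This is a sketch under assumptions, not how the %fdupes macro or the fdupes tool is implemented; the function names are invented for the illustration, and file modes and ownership are ignored.

```python
import hashlib
import os
import tempfile


def hardlink_duplicates(root):
    """Turn identical regular files below root into hardlinks; return the count."""
    first_seen = {}
    linked = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            original = first_seen.setdefault(digest, path)
            if original == path or os.path.samefile(original, path):
                continue  # first copy, or already linked to it
            # Link under a temporary name first so the file is never missing.
            tmp = path + ".tmp-link"
            os.link(original, tmp)
            os.replace(tmp, path)
            linked += 1
    return linked


def hardlink_per_toplevel(buildroot):
    """Run the deduplication once per top-level directory of buildroot."""
    total = 0
    for entry in sorted(os.listdir(buildroot)):
        top = os.path.join(buildroot, entry)
        if os.path.isdir(top) and not os.path.islink(top):
            total += hardlink_duplicates(top)
    return total
```

Duplicates that live under different top-level directories stay separate copies. Mount points below the top level (such as separate /srv/ftp and /srv/www) are not covered by this split.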
On Aug 24 2007 17:52, Vladimir Nadvornik wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
What if /srv/ftp and /srv/www were separate mounts?
Do you think that the %fdupes macro should be changed to do this automatically?
Jan
On Wednesday, 29 August 2007, Jan Engelhardt wrote:
On Aug 24 2007 17:52, Vladimir Nadvornik wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
What if /srv/ftp and /srv/www were separate mounts?
Then you would still have to find a package that puts files in both?

Greetings, Stephan
On Sep 2 2007 16:23, Stephan Kulow wrote:
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
What if /srv/ftp and /srv/www were separate mounts?
Then you still had to find a package that puts files in both?
I mean, I have not seen %fdupes yet, or what it does. The fact is, I think that the rpm archive should be created as if the whole tree were one filesystem, and hardlinks should be broken no earlier than rpm -Uhv.

Jan
On Friday, 24 August 2007, Vladimir Nadvornik wrote:
On Wednesday, 16 May 2007, Stephan Kulow wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
Do you think that the %fdupes macro should be changed to do this automatically?
I think it would be logical to make this automatic.

Greetings, Stephan
On 2007-09-02 16:22:58 +0200, Stephan Kulow wrote:
On Friday, 24 August 2007, Vladimir Nadvornik wrote:
On Wednesday, 16 May 2007, Stephan Kulow wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
Do you think that the %fdupes macro should be changed to do this automatically?
I think it would be logical to make this automatic.
and it would still be broken. you cannot assume that hardlinks between different directories will _always_ work. the only place where you can say "it won't break anything" is hardlinks in the same directory. anything else can be on a different partition.

that said, i think the best would be to patch fdupes and let it use hardlinks for any duplicates in the same directory, but symlinks for anything else.

darix

--
openSUSE - SUSE Linux is my linux
openSUSE is good for you
www.opensuse.org
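The policy proposed here (hardlink inside one directory, symlink everywhere else) comes down to a small decision function. The sketch below is hypothetical: plan_link is a name made up for the illustration, fdupes has no such function, and the use of relative symlink targets is an assumption on top of what the mail says.

```python
import posixpath


def plan_link(original, duplicate):
    """Decide how an installed duplicate should refer to the original file.

    Returns ("hardlink", original) when both names sit in the same
    directory, otherwise ("symlink", target) with a target relative to
    the directory of the duplicate.
    """
    original_dir = posixpath.dirname(original)
    duplicate_dir = posixpath.dirname(duplicate)
    if original_dir == duplicate_dir:
        # Same directory means same filesystem, so a hardlink cannot
        # end up spanning two partitions.
        return ("hardlink", original)
    return ("symlink", posixpath.relpath(original, duplicate_dir))
```

For example, two identical files in /usr/share/doc/a would be hardlinked, while a copy of /usr/share/a/logo.png under /srv/www would become a symlink pointing at ../../usr/share/a/logo.png.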
* Marcus Rueckert <darix@web.de> [2007-09-02 18:00]:
but symlinks for anything else.
But using any automatism like %fdupes for symlinks is also a bad idea IMO, since the semantics of two files (or two hardlinks to the same file) are different from the semantics of a file and a symlink. Consider for example the difference when you delete the file and not the symlink, or chmod, or something else.

Also (for hardlinks _and_ symlinks), what if a program installs the same configuration file in /etc and as documentation in /usr/share/doc/packages? Initially, the content is the same, but if you modify the configuration in /etc, the sample configuration in /usr/share/doc/packages should stay the same.

Thanks, Bernhard
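The /etc versus /usr/share/doc concern can be reproduced in a few lines. The sketch below uses made-up file names in a temporary directory and an in-place rewrite to stand in for "modifying the configuration"; an editor that saves by writing a new file and renaming it over the old one would behave differently for hardlinks.

```python
import os
import shutil
import tempfile


def edit_in_place(path, text):
    """Rewrite a file through its existing inode, as a simple edit would."""
    with open(path, "w") as fh:
        fh.write(text)


def sample_after_edit(link_fn):
    """Create a config file, derive the doc sample with link_fn(src, dst),
    edit the config, and return what the doc sample contains afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        config = os.path.join(tmp, "etc-example.conf")      # stands in for /etc
        sample = os.path.join(tmp, "doc-example.conf")      # stands in for the doc copy
        with open(config, "w") as fh:
            fh.write("mode=default\n")
        link_fn(config, sample)
        edit_in_place(config, "mode=custom\n")
        with open(sample) as fh:
            return fh.read()
```

With os.link or os.symlink as link_fn the sample follows the edit and reads "mode=custom"; only with shutil.copy does it keep "mode=default", which is the behaviour the mail asks for.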
On Sunday 02 September 2007 18:00:21, Marcus Rueckert wrote:
On 2007-09-02 16:22:58 +0200, Stephan Kulow wrote:
On Friday, 24 August 2007, Vladimir Nadvornik wrote:
On Wednesday, 16 May 2007, Stephan Kulow wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
Do you think that the %fdupes macro should be changed to do this automatically?
I think it would be logical to make this automatic.
and it would be still broken. you can not assume that hardlinks between different directories will _always_ work. the only place where you can say "it wont break anything" are hardlinks in the same directory. anything else can be on a different partition. that said i think the best would be to patch fdupes and let it use hardlinks for any duplicates in the same directory, but symlinks for anything else.
That is right, but what actually happens when you have different partitions? Does rpm fail to install the package, or does it create a full copy of the file on the other partition?

If it is the latter, I think hardlinks are okay to use ..

bye
adrian

--
Adrian Schroeter
email: adrian@suse.de
On 2007-09-03 09:12:18 +0200, Adrian Schröter wrote:
and it would be still broken. you can not assume that hardlinks between different directories will _always_ work. the only place where you can say "it wont break anything" are hardlinks in the same directory. anything else can be on a different partition. that said i think the best would be to patch fdupes and let it use hardlinks for any duplicates in the same directory, but symlinks for anything else.
That is right, but what happens acctually when you have different partitions ?
Does rpm fail to install the package or does it create a full copy of the file on the other partition ?
If it is the later, I think hardlinks are okay to use ..
it fails horribly.

taking into account the comment from bwalle about different meanings of files in different subdirectories, i think the only valid thing is that fdupes should only hardlink files in the same directory.

darix
Hi, On Mon, 3 Sep 2007, Marcus Rueckert wrote:
Does rpm fail to install the package or does it create a full copy of the file on the other partition ?
If it is the later, I think hardlinks are okay to use ..
it fails horribly. taking into account the comment from bwalle about different meanings of files in different subdirectories, i think the only valid thing is that fdupes should only hardlink files in the same directory.
Fix rpm. _That's_ the only valid thing. If not possible for 10.3, make %fdupes a noop for now.

Ciao, Michael.
On 2007-09-03 15:41:45 +0200, Michael Matz wrote:
On Mon, 3 Sep 2007, Marcus Rueckert wrote:
Does rpm fail to install the package or does it create a full copy of the file on the other partition ?
If it is the later, I think hardlinks are okay to use ..
it fails horribly. taking into account the comment from bwalle about different meanings of files in different subdirectories, i think the only valid thing is that fdupes should only hardlink files in the same directory.
Fix rpm. _That's_ the only valid thing. If not possible for 10.3, make %fdupes a noop for now.
mls mentioned offline that none of the tools handles that case nicely. rsync fails with that too, for example. and he declined to fix that in rpm.

darix
Hi, On Mon, 3 Sep 2007, Marcus Rueckert wrote:
it fails horribly. taking into account the comment from bwalle about different meanings of files in different subdirectories, i think the only valid thing is that fdupes should only hardlink files in the same directory.
Fix rpm. _That's_ the only valid thing. If not possible for 10.3, make %fdupes a noop for now.
as mls mentioned offline that none of the tools is handling that case nicely.
Invalid reasoning. There needs to be just one tool handling it correctly, namely rpm, perhaps cpio. If other programs don't handle this correctly doesn't matter for installation of rpms.
rsync fails with that too for example. and he declined to fix that in rpm.
Not sure what rsync has to do with the problem at hand.

Ciao, Michael.
Hi, On Mon, 3 Sep 2007, Michael Matz wrote:
rsync fails with that too for example. and he declined to fix that in rpm.
Not sure what rsync has to do with the problem at hand.
Especially because it seems to handle copying hardlinks across directories just fine when the target directories are on different filesystems. Just tested.

Ciao, Michael.
On Monday, 3. September 2007, Michael Matz wrote:
Not sure what rsync has to do with the problem at hand. Especially because it seems to handle copying hardlinks across directories, when the target directories are on different filesystems just fine. Just tested.
Would you please discuss this in the appropriate bug report (bug 304167) instead of on the list here, where it is likely to get forgotten again?

Thanks a lot, Dirk

--
RPMLINT information under http://en.opensuse.org/Packaging/RpmLint
Marcus Rueckert wrote:
and he declined to fix that in rpm.
Sure, because RPM is not broken, what seems to be broken is the idea of using this %fdupes thingy, as AFAICS it will cause more harm than good.

--
"You don't have to burn books to destroy a culture. Just get people to stop reading them." --Ray Bradbury
Cristian Rodríguez R.
SUSE LINUX Products GmbH
Research & Development
On Tuesday, 04 September 2007, Cristian Rodriguez wrote:
Marcus Rueckert wrote:
and he declined to fix that in rpm.
Sure, because RPM is not broken, what seems to be broken is the idea of using this %fdupes thingy, as AFAICS it will cause more harm than good.
Thanks for your warm words.

Fact 1: Hard links are a normal part of the UNIX world; not handling them can be considered a bug (aka being broken). Whether it's an important bug is another issue.
Fact 2: The good that the %fdupes thingy does is making it possible to have a 700MB ISO.
Fact 3: Many packages are broken in installing a massive overlap of files.
Fact 4: Yes, running fdupes in hardlink mode without thinking twice might not be the best idea. But as a matter of fact, I consider no tool good or bad per se. It always depends on how the tool is used.
Fact 5: Your communication style is broken; you should check the facts before calling other people's ideas broken.

Greetings, Stephan
Stephan Kulow wrote:
Fact 3: Many packages are broken in installing massive overlap of files
That's the real problem, but while fixing the broken packages may be a long-term goal, currently such a task seems to be overkill.
Fact 5: Your communication style is broken, you should check the facts before calling other people's ideas broken.
Don't take it personally; my intention was never to offend people.
Hi, On Tue, 4 Sep 2007, Cristian Rodriguez wrote:
Sure, because RPM is not broken, what seems to be broken is the idea of using this %fdupes thingy, as AFAICS it will cause more harm than good.
Then you can't see very far.

Ciao, Michael.
On Monday, 03 September 2007, Adrian Schröter wrote:
On Sunday 02 September 2007 18:00:21, Marcus Rueckert wrote:
On 2007-09-02 16:22:58 +0200, Stephan Kulow wrote:
On Friday, 24 August 2007, Vladimir Nadvornik wrote:
On Wednesday, 16 May 2007, Stephan Kulow wrote:
It's pretty simple: BuildRequire fdupes and then use "%fdupes $RPM_BUILD_ROOT" in your install section. This will check for duplicated files and make them hardlink. Just be careful that these duplicated files do not end up in different subpackages - I haven't tried what rpm does in that case.
There seems to be another problem. %fdupes can create hardlinks between files that would finally end on different partitions. See https://bugzilla.novell.com/show_bug.cgi?id=304167
Using something like %fdupes $RPM_BUILD_ROOT/usr %fdupes $RPM_BUILD_ROOT/srv ...
fixes the problem.
Do you think that the %fdupes macro should be changed to do this automatically?
I think it would be logical to make this automatic.
and it would be still broken. you can not assume that hardlinks between different directories will _always_ work. the only place where you can say "it wont break anything" are hardlinks in the same directory. anything else can be on a different partition. that said i think the best would be to patch fdupes and let it use hardlinks for any duplicates in the same directory, but symlinks for anything else.
IMHO the best approach is to identify hardlinks between directories with rpmlint and let the maintainer decide whether they are dangerous or not.
Does rpm fail to install the package or does it create a full copy of the file on the other partition ?
RPM fails, see the bug report above.

Vladimir
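A check that identifies hardlinks between directories, as suggested in this mail, could look roughly like the sketch below. It is not the rpmlint implementation (rpmlint inspects package metadata rather than walking a directory tree); cross_directory_hardlinks is a made-up name, and the function only reports the candidates so that a maintainer can judge them.

```python
import os
import stat
import tempfile
from collections import defaultdict


def cross_directory_hardlinks(root):
    """Return groups of hardlinked names under root that span several directories."""
    groups = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with more than one name are of interest.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)

    suspicious = []
    for paths in groups.values():
        directories = {os.path.dirname(p) for p in paths}
        if len(directories) > 1:
            suspicious.append(sorted(paths))
    return sorted(suspicious)
```

Hardlinks whose names all sit in one directory are left out of the report, since those cannot end up on different partitions.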
Vladimir Nadvornik wrote:
On Monday, 03 September 2007, Adrian Schröter wrote:
On Sunday 02 September 2007 18:00:21, Marcus Rueckert wrote:
On 2007-09-02 16:22:58 +0200, Stephan Kulow wrote:
On Friday, 24 August 2007, Vladimir Nadvornik wrote:
On Wednesday, 16 May 2007, Stephan Kulow wrote:
Do you think that the %fdupes macro should be changed to do this automatically?
I think it would be logical to make this automatic.
and it would be still broken. you can not assume that hardlinks between different directories will _always_ work. the only place where you can say "it wont break anything" are hardlinks in the same directory. anything else can be on a different partition. that said i think the best would be to patch fdupes and let it use hardlinks for any duplicates in the same directory, but symlinks for anything else.
IMHO the best approach is to identify hardlinks between directories with rpmlint and let the maintainer decide whether they are dangerous or not.
Or patch rpm to check that hardlinks are not created across different partitions/volumes?

Best regards
Petr
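The kind of test such a patch would need is whether two install targets live on the same filesystem. A minimal sketch, assuming that comparing st_dev of the nearest existing parent directories is good enough; this is not rpm code, and both function names are invented here.

```python
import os
import tempfile


def nearest_existing(path):
    """Walk upwards from path until something that exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            break
        path = parent
    return path


def same_filesystem(path_a, path_b):
    """True if both paths would be created on the same device.

    A hardlink between two names is only possible when this holds;
    otherwise the installer would have to fall back to a full copy.
    """
    dev_a = os.stat(nearest_existing(path_a)).st_dev
    dev_b = os.stat(nearest_existing(path_b)).st_dev
    return dev_a == dev_b
```

The targets do not have to exist yet, which matches the situation before a package is unpacked.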
participants (12)
- Adrian Schröter
- Bernhard Walle
- Cristian Rodriguez
- Cristian Rodriguez R.
- Dirk Mueller
- Jan Engelhardt
- Marcus Rueckert
- Michael Matz
- Petr Cerny
- Reinhard Max
- Stephan Kulow
- Vladimir Nadvornik