![](https://seccdn.libravatar.org/avatar/9eaec8ef6e1bb27da5ae2c86af89e1d9.jpg?s=120&d=mm&r=g)
Hi, I was wondering if anyone can recommend a utility for finding duplicate files within a filesystem. It needs to operate relatively fast, as I have to search through about 30Gig of files. I have found a couple of Java apps on Freshmeat, but most do not appear to work on SuSE 9.2, and I have very limited knowledge of using Java. -- Thanks, Graham Smith
![](https://seccdn.libravatar.org/avatar/d48e0fab41b188849be0dfd65aaa07a2.jpg?s=120&d=mm&r=g)
On Friday 18 February 2005 11:13 pm, Graham Smith wrote:
Hi,
I was wondering if anyone can recommend a utility for finding duplicate files within a filesystem. It needs to operate relatively fast, as I have to search through about 30Gig of files.
I have found a couple of Java apps on Freshmeat, but most do not appear to work on SuSE 9.2, and I have very limited knowledge of using Java.
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a command-line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you don't want. I found it useful to clean some crap off my drives. It's only about 300 bytes small. If you want, I'll send it to you as an attachment. Richard -- Old age ain't for Sissies!
![](https://seccdn.libravatar.org/avatar/861b5545c111d2257fa12e533e723110.jpg?s=120&d=mm&r=g)
The Saturday 2005-02-19 at 00:01 -0600, Richard wrote:
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a command-line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you don't want. I found it useful to clean some crap off my drives.
It's only about 300 bytes small. If you want, I'll send it to you as an attachment.
If it is a script, put it here, and we can all comment on it. -- Cheers, Carlos Robinson
![](https://seccdn.libravatar.org/avatar/9eaec8ef6e1bb27da5ae2c86af89e1d9.jpg?s=120&d=mm&r=g)
On Sun, 20 Feb 2005 11:42, Carlos E. R. wrote:
The Saturday 2005-02-19 at 00:01 -0600, Richard wrote:
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a command-line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you don't want. I found it useful to clean some crap off my drives.
It's only about 300 bytes small. If you want, I'll send it to you as an attachment.
If it is a script, put it here, and we can all comment on it.
Well, it looks like a utility called fslint works on SuSE 9.2 and does a good job of finding duplicates, as well as other problems, in an acceptable amount of time. I installed the rpm package. http://www.iol.ie/~padraiga/fslint/ Richard, thank you for sending the file, but fslint is a much better utility for my needs as it picks up renamed files. -- Regards, Graham Smith
![](https://seccdn.libravatar.org/avatar/6d198f8c8f1c94ccef873cebcf4f5dfa.jpg?s=120&d=mm&r=g)
Graham Smith wrote:
On Sun, 20 Feb 2005 11:42, Carlos E. R. wrote:
The Saturday 2005-02-19 at 00:01 -0600, Richard wrote:
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a command-line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you don't want. I found it useful to clean some crap off my drives.
It's only about 300 bytes small. If you want, I'll send it to you as an attachment.
If it is a script, put it here, and we can all comment on it.
Well, it looks like a utility called fslint works on SuSE 9.2 and does a good job of finding duplicates, as well as other problems, in an acceptable amount of time. I installed the rpm package.
http://www.iol.ie/~padraiga/fslint/
Richard, thank you for sending the file, but fslint is a much better utility for my needs as it picks up renamed files.
For a Java util, see http://midori.shacknet.nu/dff/DuplicateFileFinder_v1.1.1.tgz Regards, Sid. -- Sid Boyce .... Hamradio G3VBV and Keen Flyer =====ALMOST ALL LINUX USED HERE, Solaris 10 SPARC is just for play=====
![](https://seccdn.libravatar.org/avatar/d48e0fab41b188849be0dfd65aaa07a2.jpg?s=120&d=mm&r=g)
On Saturday 19 February 2005 06:42 pm, Carlos E. R. wrote:
The Saturday 2005-02-19 at 00:01 -0600, Richard wrote:
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a command-line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you don't want. I found it useful to clean some crap off my drives.
It's only about 300 bytes small. If you want, I'll send it to you as an attachment.
If it is a script, put it here, and we can all comment on it.
-- Cheers, Carlos Robinson
Ok Carlos, here it is.

```sh
#! /bin/sh
# Generate rem-duplicates.sh: a script listing every group of duplicate
# files as commented-out "rm" lines, so you choose what to delete.
OUTF=rem-duplicates.sh; echo "#! /bin/sh" > $OUTF;
find "$@" -type f -print0 | xargs -0 -n1 md5sum \
  | sort --key=1,32 | uniq -w 32 -d --all-repeated=separate \
  | sed -r 's/^[0-9a-f]*( )*//;s/([^a-zA-Z0-9./_-])/\\\1/g;s/(.+)/#rm \1/' >> $OUTF;
chmod a+x $OUTF; ls -l $OUTF
```
I have it in the /usr/local/bin directory as ckdups. To use it, I go to a terminal window, su, then cd to the directory I want to check out, then type ckdups and wait. Eventually a list of duplicate files will appear, grouped together with a blank line between the groups. From the list you can delete the files you don't want to keep. Worked for me; hope it helps you. BTW, I should give the author proper credit, but I can't remember where it came from and his name is not in the file. It's a neat thing and he should get credit, but... sorry for not being able to give credit. Richard -- Old age ain't for Sissies!
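For anyone curious what the core pipeline actually emits, here is a minimal, self-contained sketch; the scratch directory and file names are made up for the demonstration:

```sh
#!/bin/sh
# Demonstrate the heart of the duplicate finder: hash every file, sort by
# the 32-character md5 checksum, and keep only lines whose checksum
# repeats. Groups of duplicates are separated by a blank line.
dir=$(mktemp -d)
echo "same content" > "$dir/a.txt"
echo "same content" > "$dir/copy-of-a.txt"
echo "different"    > "$dir/b.txt"

groups=$(find "$dir" -type f -print0 \
  | xargs -0 -n1 md5sum \
  | sort --key=1,32 \
  | uniq -w 32 -d --all-repeated=separate)

echo "$groups"
rm -r "$dir"
```

Only a.txt and copy-of-a.txt share a checksum, so only those two lines survive the `uniq -w 32 -d` filter; the full script above then turns each surviving line into a commented-out `rm` command.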
![](https://seccdn.libravatar.org/avatar/b12cfb65ca4faebc3e3aac17838e8f8d.jpg?s=120&d=mm&r=g)
Richard, On Saturday 19 February 2005 16:42, Carlos E. R. wrote:
The Saturday 2005-02-19 at 00:01 -0600, Richard wrote:
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a command-line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you don't want. I found it useful to clean some crap off my drives.
It's only about 300 bytes small. If you want, I'll send it to you as an attachment.
If it is a script, put it here, and we can all comment on it.
Yes. Please let us see this concise script. I live to learn.
Carlos Robinson
RRS
![](https://seccdn.libravatar.org/avatar/9eaec8ef6e1bb27da5ae2c86af89e1d9.jpg?s=120&d=mm&r=g)
On Sun, 20 Feb 2005 13:13, Randall R Schulz wrote:
Richard,
On Saturday 19 February 2005 16:42, Carlos E. R. wrote:
The Saturday 2005-02-19 at 00:01 -0600, Richard wrote:
Graham, I found a small utility on another list some time ago and it seems to do a pretty good job. It's a comman line thingie about 8 lines long. It'll do recursive checking and then list all the dups by groups so you can delete the ones you dont want. I found it useful to clean some crap off my drives.
It's only about 300 bytes small. If you want, I'lll send it to you as an attachment.
If it is an script, put it here, and we can all coment on it.
Yes. Please let us see this concise script. I live to learn.
As previously mentioned, I have found fslint. On examination, it uses a number of scripts to achieve its functionality and has a GTK GUI wrapper to make it easier to use. If you are interested in bash scripts and how to do a number of checks on files, you can download it from here: http://www.iol.ie/~padraiga/fslint/ I just ran the program on a directory containing about 3Gig of files and it took about 1.5 minutes to come back with a list of dups. -- Regards, Graham Smith
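Part of why fslint-style tools stay fast on large trees is that they avoid hashing everything: files are first bucketed by byte size, and checksums are computed only where sizes collide. Here is a rough sketch of that idea (not fslint's actual code; the directory and file names are invented, and GNU find is assumed):

```sh
#!/bin/sh
# Size-then-checksum prefilter: only files sharing a byte size can
# possibly be duplicates, so restrict md5sum to those files.
dir=$(mktemp -d)
printf 'duplicate' > "$dir/one"
printf 'duplicate' > "$dir/two"
printf 'unique!'   > "$dir/three"

# Sizes that occur more than once in the tree (GNU find's -printf).
dupsizes=$(find "$dir" -type f -printf '%s\n' | sort -n | uniq -d)

# Hash only files whose size is in that set, then group by checksum.
candidates=$(for s in $dupsizes; do
  find "$dir" -type f -size "${s}c" -exec md5sum {} +
done | sort | uniq -w 32 --all-repeated=separate)

echo "$candidates"
rm -r "$dir"
```

In this example, "three" has a unique size and is never hashed at all; on a 30Gig tree that saved work is what turns an overnight scan into minutes.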
![](https://seccdn.libravatar.org/avatar/d48e0fab41b188849be0dfd65aaa07a2.jpg?s=120&d=mm&r=g)
On Saturday 19 February 2005 10:30 pm, Graham Smith wrote:
As previously mentioned, I have found fslint. On examination, it uses a number of scripts to achieve its functionality and has a GTK GUI wrapper to make it easier to use.
If you are interested in bash scripts and how to do a number of checks on files, you can download it from here: http://www.iol.ie/~padraiga/fslint/
I just ran the program on a directory containing about 3Gig of files and it took about 1.5 minutes to come back with a list of dups.
Thanks Graham, I downloaded the thing but ran into some dependency problems. Will come back to it a little later. It sure sounds a lot faster than the one I found. Richard -- Old age ain't for Sissies!
participants (5)
- Carlos E. R.
- Graham Smith
- Randall R Schulz
- Richard
- Sid Boyce