David Haller wrote:
Hello,
On Tue, 30 Oct 2018, Dave Howorth wrote:
On Tue, 30 Oct 2018 17:36:09 +0100 Per Jessen wrote:
This might be file system dependent, I'm not sure. I've been doing some tidying up and got stuck on a few directories with millions of files in them. 3+ million per directory. Doing a 'find' takes a very long time and also essentially chokes the system. I ended up writing a small utility using getdents() instead, much faster and the system remains operational.
Try:
ionice -c 3 nice find ...
That way, your system should stay nicely responsive while your disks/SSDs are exercised ;) Just did a find ... over a couple of TB on 7 disks (no dirs with tons of files though), and the system was as smooth as ever while I'm typing this ;)
Generally the system responds fine, and the volume is only about 600GB. It's 94% full, which is why I'm tidying up :-) I tried your 'ionice -c 3 nice find ...' - it appears to take the same time. No output for the first 10 min. An strace shows it doing the same thing: running getdents() one call after another.
That find generates a plain-text index ('name\tsize') over currently ~900k files ;) Accordingly, 'free' shows ~2.3GB cached and 682MB buffers ATM, just after the find has finished ;) And that's with just 4GiB RAM ...
The killer seems to be the 'find' on a directory with millions of files.
But: on whatever FS, you'll have to get the info _somehow_ from the VFS, so it depends on what your program wants... For plain names, well, a dir is just a (special) file which you could mmap and parse (or just use readdir which does just that ;)
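As a rough illustration of the readdir route mentioned above, here is a minimal Python sketch (not Per's utility, which isn't shown in this thread). It assumes CPython on Linux, where os.scandir() sits on top of opendir()/readdir(), and glibc's readdir() in turn fetches entries in batches via getdents64(). The function name iter_name_mtime is made up for this example. The point is only that entries are handed out one at a time instead of being collected into one big list first.

```python
import os
from typing import Iterator, Tuple


def iter_name_mtime(directory: str) -> Iterator[Tuple[str, float]]:
    """Yield (name, mtime) for each entry in `directory`, one at a time.

    Hypothetical helper for illustration; not taken from the thread.
    """
    # os.scandir() returns a lazy iterator, so the entries are never
    # gathered or sorted into a single in-memory list.
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                # follow_symlinks=False behaves like lstat(): the entry
                # itself is examined, not the target of a symlink.
                st = entry.stat(follow_symlinks=False)
            except FileNotFoundError:
                # The entry vanished between readdir and stat, which can
                # happen while a cleanup job is deleting files.
                continue
            yield entry.name, st.st_mtime
```

Looping over iter_name_mtime("/some/huge/dir") and printing or deleting as you go should start producing results right away, even with millions of entries. Whether it stays as gentle on the system as a hand-written getdents() loop would need measuring.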
I need name and mtime, but the stat() that I added to the getdents() loop did not seem to cause any problems or delays. I mean, it will be noticeable over 3 million files, but that's okay as long as the system remains responsive. Seems to me 'find' ought to have an option to say "one getdents() at a time" or some such.
--
Per Jessen, Zürich (5.8°C)
http://www.cloudsuisse.com/ - your owncloud, hosted in Switzerland.