TransWikia.com

Ext4 and Linux - very large number of files in one directory - operations

Unix & Linux Asked on November 9, 2021

I have a problem with a very large number of files in one directory.

The file system is ext4.

I reached the 2**32 file limit and can no longer create any files on this partition.

The problem is very big.

I don't know how to move some of the files somewhere else.

The classic "ls" and "mv" are not working.
There are too many files…

Is there any way to quickly print the name of a single file in bash?

Any one arbitrary file from the directory would do; it holds almost 2**32 files.

If I can get at one file quickly, I can write a script.

Any ideas?

2 Answers

Using ls to operate on a large directory is very inefficient, since GNU ls will read all of the entries in the directory before returning any of them, even with --sort=none, because it wants the output to be "pretty". This is both slow and uses a lot of RAM, since an ext4 directory can have many millions of files in it.

Instead, you should use find to list files in the directory, which will print out the filenames as soon as they are read from the directory. If you want to find particular files (e.g. all the "*.jpg" files smaller than 1MB), you could run e.g. the command below. The size is given in bytes because find rounds sizes up to whole units, so -size -1M would match only empty files.

find /my/directory -type f -name "*.jpg" -size -1048576c

See the find(1) man page for full details on how to use it.
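
For the narrower goal in the question (getting one arbitrary file without scanning the whole directory), find can be told to stop after the first match. The following is a minimal sketch, assuming GNU find (-maxdepth and -quit are not part of POSIX); the function name first_file is made up for illustration.

```shell
# Print the path of one arbitrary regular file directly inside "$1".
# Assumption: GNU find (-maxdepth and -quit are extensions to POSIX).
first_file() {
    # -maxdepth 1 keeps find from descending into subdirectories;
    # -print -quit prints the first match and exits immediately.
    find "$1" -maxdepth 1 -type f -print -quit
}

# Usage: first_file /my/directory
```

Which file comes back depends on the order in which the file system returns directory entries, so it should be treated as arbitrary.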

Once you have found a bunch of files you want to do something with, you can use xargs to run a command on them. For example, to delete temporary files:

find /my/directory -name "*.tmp" -type f -print0 | xargs -0 rm

or to move them into a different directory like:

find /my/directory -name "*.jpg" -print0 | xargs -0 -I '{}' mv '{}' /my/otherdirectory

or any number of other things. The xargs program builds command lines from the items it reads on standard input; see the xargs(1) man page for details. By default it appends as many items as will fit to the end of each command, which suits rm. The mv command is a bit more complex because mv expects the target directory at the end of the command, which is why the example above uses -I '{}' to place each filename explicitly, at the cost of running mv once per file.
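
With GNU mv, the -t option names the target directory before the files, so xargs can append filenames in large batches instead of starting one mv per file. A sketch under that assumption (GNU find, xargs and mv; the function name move_matching is hypothetical):

```shell
# Move files matching a name pattern from one directory into another,
# letting xargs pass many files to each mv invocation.
# Assumptions: GNU find, xargs and mv (-print0, -0, -r and -t are extensions).
move_matching() {
    mm_src=$1      # directory to search (not recursive)
    mm_pattern=$2  # shell pattern for -name, e.g. "*.jpg"
    mm_dest=$3     # existing target directory

    find "$mm_src" -maxdepth 1 -type f -name "$mm_pattern" -print0 |
        # -r: do not run mv at all if nothing matched.
        # --: treat everything after it as a filename, even if it starts with "-".
        xargs -0 -r mv -t "$mm_dest" --
}

# Usage: move_matching /my/directory "*.jpg" /my/otherdirectory
```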

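You could instead save the list of files to an output file like find ... > /tmp/file_list, edit file_list so that it contains only the files you want to delete/move, and then feed it to xargs separately. The -a option makes xargs read its items from a file instead of standard input, and -d '\n' treats each line as one filename (both are GNU xargs options, and this assumes no filename contains a newline):

xargs -a /tmp/file_list -d '\n' -I '{}' mv '{}' /my/otherdir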

Answered by LustreOne on November 9, 2021

Please try running:

ls --sort=none --no-group

or limit to some number of files, e.g.

ls --sort=none --no-group | head -500
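
The same idea of limiting the number of entries can also be applied to moving files, so that a script can free up the directory a batch at a time. This is only a sketch, assuming GNU find, head, xargs and mv (head -z needs a reasonably recent coreutils); the function name move_some is made up for illustration.

```shell
# Move at most "$3" regular files from directory "$1" into directory "$2".
# Assumptions: GNU find, head, xargs and mv (-print0, -z, -0, -r, -t).
move_some() {
    find "$1" -maxdepth 1 -type f -print0 |
        # Keep only the first "$3" NUL-terminated names, then stop reading.
        head -z -n "$3" |
        xargs -0 -r mv -t "$2" --
}

# Usage: move_some /my/directory /my/otherdirectory 500
```

Once head has its quota it exits, and find stops the next time it tries to write to the closed pipe, so the whole directory does not need to be read.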

Answered by Artem S. Tashkinov on November 9, 2021
