annotate modules/fts @ 10481:47fe4e48e158

fts: sort dirent entries on inode number before traversing This avoids a quadratic, seek-related performance penalty when operating on a directory containing many entries (measurable at 10k; 3.5 hours at 2 million entries with a cold cache) on certain types of file systems, including ext3 and ext4, but not tmpfs. * lib/fts.c (DT_MUST_BE, NOT_AN_INODE_NUMBER, D_INO): Define. (FTS_INODE_SORT_DIR_ENTRIES_THRESHOLD): Define if not defined. (S_MAGIC_TMPFS, S_MAGIC_NFS): Define. (fs_handles_readdir_ordered_dirents_efficiently): New function. (dirent_inode_sort_may_be_useful, fts_compare_ino): Likewise. (fts_build): Set the stat.st_ino member from D_INO. If it is likely to be useful, sort dirent entries on inode number. * m4/fts.m4 (gl_FUNC_FTS_CORE): Check for fstatfs, sys/vfs.h, and the struct statfs.f_type member. * modules/fts (Depends-on): Add d-ino.
author Jim Meyering <meyering@redhat.com>
date Tue, 16 Sep 2008 10:05:47 +0200
parents 25ee90a28a16
children 3cb22d0bfd0b
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
5872
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
1 Description:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
2 Traverse a file hierarchy.
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
3
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
4 Files:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
5 lib/fts_.h
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
6 lib/fts.c
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
7 lib/fts-cycle.c
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
8 m4/fts.m4
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
9
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
10 Depends-on:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
11 cycle-check
10481
47fe4e48e158 fts: sort dirent entries on inode number before traversing
Jim Meyering <meyering@redhat.com>
parents: 8873
diff changeset
12 d-ino
7482
26914593f079 Big performance improvement for fts-based tools that use FTS_NOSTAT.
Jim Meyering <jim@meyering.net>
parents: 7225
diff changeset
13 d-type
5874
20361980c152 (Files): Add m4/inttypes-pri.m4.
Jim Meyering <jim@meyering.net>
parents: 5872
diff changeset
14 dirfd
8873
25ee90a28a16 2007-05-26 Bruno Haible <bruno@clisp.org>
Bruno Haible <bruno@clisp.org>
parents: 7797
diff changeset
15 fchdir
7225
3307ae6ea2e5 * lib/fcntl_.h: New file.
Paul Eggert <eggert@cs.ucla.edu>
parents: 7172
diff changeset
16 fcntl
7482
26914593f079 Big performance improvement for fts-based tools that use FTS_NOSTAT.
Jim Meyering <jim@meyering.net>
parents: 7225
diff changeset
17 fcntl-safer
5872
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
18 hash
7639
1670d42131d7 Make fts (in FTS_CWDFD mode) more efficient by caching a few open
Jim Meyering <jim@meyering.net>
parents: 7497
diff changeset
19 i-ring
5874
20361980c152 (Files): Add m4/inttypes-pri.m4.
Jim Meyering <jim@meyering.net>
parents: 5872
diff changeset
20 lstat
7151
43e3888c56c4 Update from coreutils.
Paul Eggert <eggert@cs.ucla.edu>
parents: 6097
diff changeset
21 openat
5874
20361980c152 (Files): Add m4/inttypes-pri.m4.
Jim Meyering <jim@meyering.net>
parents: 5872
diff changeset
22 stdbool
7151
43e3888c56c4 Update from coreutils.
Paul Eggert <eggert@cs.ucla.edu>
parents: 6097
diff changeset
23 unistd-safer
5872
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
24
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
25 configure.ac:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
26 gl_FUNC_FTS
7797
10432bdf90d3 2007-01-08 Bruno Haible <bruno@clisp.org>
Bruno Haible <bruno@clisp.org>
parents: 7639
diff changeset
27 gl_MODULE_INDICATOR([fts])
5872
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
28
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
29 Makefile.am:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
30
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
31 Include:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
32 "fts_.h"
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
33
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
34 License:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
35 GPL
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
36
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
37 Maintainer:
fab6701e5cb2 New fts module.
Paul Eggert <eggert@cs.ucla.edu>
parents:
diff changeset
38 Jim Meyering