I initially wanted to keep a directory's block count and size as
separate fields so that exporting an in-memory tree to a JSON dump
would be easier, but that doesn't seem like a common operation to
optimize for. We'll probably need the algorithms to subtract sub-items
from directory counts anyway, so such an export can still be
implemented, albeit slower.
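For illustration, with a hypothetical node type (made-up names, not
the actual layout) where a directory stores only cumulative totals,
an exporter could recover the entry's own size like this:

    const std = @import("std");

    const Entry = struct {
        name: []const u8,
        size: u64,          // for directories: cumulative total
        sub: []const Entry, // direct children; empty for files
    };

    // The directory's own size is no longer stored separately;
    // subtract the direct sub-items' totals to get it back.
    fn ownSize(e: Entry) u64 {
        var subtotal: u64 = 0;
        for (e.sub) |s| subtotal += s.size;
        return e.size - subtotal;
    }

    test "ownSize" {
        const file = Entry{ .name = "f", .size = 40, .sub = &[_]Entry{} };
        const dir = Entry{ .name = "d", .size = 44, .sub = &[_]Entry{file} };
        try std.testing.expectEqual(@as(u64, 4), ownSize(dir));
    }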
Exclude patterns are easier to implement now that we're linking
against libc.
But exclude pattern matching is extremely slow, so that should really
be rewritten with a custom fnmatch implementation. It's exactly as slow
as in ncdu 1.x as well; I'm surprised nobody's complained about it yet.
And while I'm at it, supporting .gitignore-style patterns would be
pretty neat, too.
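The libc route is just POSIX fnmatch(3). A minimal sketch of the
wrapper, assuming the caller already has sentinel-terminated strings
(link with -lc):

    const std = @import("std");
    const c = @cImport(@cInclude("fnmatch.h"));

    // One fnmatch() call per pattern per file, which is
    // presumably where all the time goes.
    fn excluded(pattern: [*:0]const u8, name: [*:0]const u8) bool {
        return c.fnmatch(pattern, name, 0) == 0;
    }

    test "excluded" {
        // e.g.: zig test exclude.zig -lc
        try std.testing.expect(excluded("*.o", "main.o"));
        try std.testing.expect(!excluded("*.o", "main.c"));
    }

A .gitignore-style matcher would have to go beyond this interface
anyway, since fnmatch() has no notion of directory-anchored patterns.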
The new data model is supposed to solve a few problems with ncdu 1.x's
'struct dir' (rough sketch after the list):
- Reduce memory overhead
- Fix extremely slow counting of hard links in some scenarios
  (issue #121)
- Add support for counting 'shared' data with other directories
  (issue #36)
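Nothing here is final, but the hard-link side boils down to one global
hash lookup per file, keyed on (dev, ino), so each inode is counted
once no matter how many names it has. A rough sketch with made-up
names:

    const std = @import("std");

    const InodeKey = struct { dev: u64, ino: u64 };

    // Count `size` towards `total`, but count a hard-linked inode
    // only the first time we see its (dev, ino) pair.
    fn addFile(
        links: *std.AutoHashMap(InodeKey, void),
        dev: u64, ino: u64, nlink: u32, size: u64,
        total: *u64,
    ) !void {
        if (nlink > 1) {
            const gop = try links.getOrPut(.{ .dev = dev, .ino = ino });
            if (gop.found_existing) return; // already counted once
        }
        total.* += size;
    }

This only covers a global total; the per-directory 'shared' counts for
issue #36 will need more bookkeeping on top of it.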
Quick memory usage comparison of my root directory with ~3.5 million
files (normal / extended mode):

    ncdu 1.15.1:      379M / 451M
    new (unaligned):  145M / 178M
    new (aligned):    155M / 200M
There's still a /lot/ of to-dos left before this is usable, however,
and there are a bunch of issues I haven't really decided on yet, such
as which TUI library to use.
Backporting this data model to the C version of ncdu is also possible,
but somewhat painful. Let's first see how far I get with Zig.