Commit graph

8 commits

Author SHA1 Message Date
Yorhel
705bd8907d Move nlink count from inode map into Link node
This adds another +4 bytes* to Link nodes, but allows for the in-memory
tree to be properly exported to JSON, which we'll need for multithreaded
export. It's also slightly nicer conceptually, as we can now detect
inconsistencies without throwing away the actual data, so have a better
chance of recovering on partial refresh. Still unlikely, anyway, but
whatever.

(* but saves 4+ bytes per unique inode in the inode map, so the memory
increase is only noticeable when links are repeated in the scanned tree.
Admittedly, that may be the common case)
2024-07-17 14:15:53 +02:00
Yorhel
6bb31a4653 More consistent handling of directory read errors
These are now always added as a separate dir followed by setReadError().
JSON export can catch these cases when the error happens before any
entries are read, which is the common error scenario.
2024-07-17 09:09:04 +02:00
Yorhel
7558fd7f8e Re-add single-threaded JSON export
That was the easy part, next up is fixing multi-threaded JSON export.
2024-07-17 07:05:18 +02:00
Yorhel
d2e8dd8a90 Reimplement JSON import + minor fixes
Previous import code did not correctly handle a non-empty directory with
the "read_error" flag set. I have no clue if that can ever happen in
practice, but at least ncdu 1.x can theoretically emit such JSON so we
handle it now.

Also fixes mtime display of "special" files. i.e. don't display the
mtime of the parent directory - that's confusing.

Split a generic-ish JSON parser out of the import code for easier
reasoning and implemented a few more performance improvements as well.
New code is ~30% faster in both ReleaseSafe and ReleaseFast.
2024-07-16 14:20:30 +02:00
Yorhel
ddbed8b07f Some fixes in mtime propagation and hardlink refresh counting 2024-07-15 11:00:14 +02:00
Yorhel
db51987446 Re-add hard link counting + parent suberror & stats propagation
Ended up turning the Links into a doubly-linked list, because the
current approach of refreshing a subdirectory makes it more likely to
run into problems with the O(n) removal behavior of singly-linked lists.

Also found a bug that was present in the old scanning code as well;
fixed here and in c41467f240.
2024-07-14 20:17:34 +02:00
Yorhel
cc12c90dbc Re-add scan progress UI + directory refreshing 2024-07-14 20:17:19 +02:00
Yorhel
f2541d42ba Rewrite scan/import code, experiment with multithreaded scanning (again)
Benchmarks are looking very promising this time. This commit breaks a
lot, though:
- Hard link counting
- Refreshing
- JSON import
- JSON export
- Progress UI
- OOM handling is not thread-safe

All of which needs to be reimplemented and fixed again. Also haven't
really tested this code very well yet so there's likely to be bugs.

There's also a behavioral change: --exclude-kernfs is not checked on the
given root directory anymore, meaning that the filesystem the user asked
to scan is being scanned even if that's a 'kernfs'. I suspect that's
more sensible behavior.

The old scan.zig was quite messy and hard for me to reason about and
extend, this new sink API is looking to be less confusing. I hope it
stays that way as more features are added.
2024-07-14 20:17:18 +02:00