POSIX defines ino_t to be of an unsigned integer type, and searching
around the net didn't turn up any definitions that conflict with that. So
every ino_t can be represented in a uint64_t. (Assuming that is the
largest integer type in use for an inode number, but I'm sure that
assumption will hold for a while.)
(dev_t, on the other hand, is a bit messier. Still figuring out what to
do with that.)
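As a minimal sketch of the ino_t reasoning above: the size assumption can be
checked at compile time, after which widening to uint64_t is a plain cast.
The helper name ino_to_u64 is made up for illustration and is not taken from
the ncdu sources.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdint.h>
#include <sys/types.h>  /* ino_t */

/* Fail the build if the "ino_t fits in 64 bits" assumption is ever wrong. */
static_assert(sizeof(ino_t) <= sizeof(uint64_t),
              "ino_t is wider than 64 bits");

/* Hypothetical helper: widen an inode number for storage. */
static uint64_t ino_to_u64(ino_t ino) {
    return (uint64_t)ino;
}
```

If the assumption ever stops holding, the build breaks instead of inode
numbers being silently truncated.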
2 billion files should be enough for everyone. You probably won't have
enough memory to scan such a filesystem. int is a better choice than
long, as sizeof(int) is 4 on pretty much any system where ncdu runs.
The architecture is explained in dir.h. The reason for these changes is
two-fold:
- calc.c was too complex; it simply did too many things. 399ccdeb is a
nice example of that: Should have been an easy fix, but it introduced
a segfault (fixed in 0b49021a), and added a small memory leak.
- This architecture features a pluggable input/output system, which
should make a file export/import feature relatively simple.
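To illustrate what "pluggable" could mean here, the sketch below separates an
input (something that produces items) from an output (something that consumes
them) through a struct of function pointers. All names in it (struct output,
struct sample, feed, sum_item) are hypothetical; the actual interface is the
one described in dir.h and may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output interface: receives one item at a time. */
struct output {
    void (*item)(void *ctx, const char *name, long long size);
    void *ctx;  /* private state of the output */
};

/* Hypothetical input data: a fixed list standing in for a directory scan
 * or an imported file. */
struct sample {
    const char *name;
    long long size;
};

/* An input only talks to the output through the callback, so either side
 * can be swapped without touching the other. */
static void feed(const struct sample *s, size_t n, const struct output *out) {
    assert(out != NULL && out->item != NULL);
    for (size_t i = 0; i < n; i++)
        out->item(out->ctx, s[i].name, s[i].size);
}

/* One possible output: add up all sizes into a long long. */
static void sum_item(void *ctx, const char *name, long long size) {
    (void)name;  /* unused by this output */
    *(long long *)ctx += size;
}
```

With this shape, a file export would be just another output and a file import
just another input.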
The current commit does not feature any user interface, so there's no
feedback yet when scanning a directory. I'll get to that in a bit.
I've also not tested the new scanning code very well yet, so I might
have introduced some bugs.
Rather than storing a pointer to another memory allocation in the
struct. This saves some memory and improves performance by significantly
decreasing the number of calls to [c|m]alloc() and free().
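A minimal sketch of the idea, assuming the data stored inline is a name
string and that a C99 flexible array member is used for it. struct item and
item_create are made-up names, not the actual ncdu definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical item: the name lives inside the same allocation as the
 * struct instead of behind a separately allocated pointer. */
struct item {
    long long size;
    char name[];  /* C99 flexible array member, must be the last field */
};

/* Allocate the struct and its name with a single calloc() call.
 * Returns NULL if the allocation fails. */
static struct item *item_create(const char *name) {
    assert(name != NULL);
    size_t len = strlen(name) + 1;  /* include the terminating NUL */
    struct item *it = calloc(1, sizeof(struct item) + len);
    if (it == NULL)
        return NULL;
    memcpy(it->name, name, len);
    return it;
}
```

Each item now costs one allocation and one free() instead of two, and the
struct no longer needs a pointer field for the name.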
This optimizes a few actions (though not all), and makes the code easier
to understand and expand.
The behaviour of the browser has changed a bit with regard to
multi-page listings. Personally I don't like this change much, so I'll
probably fix that later on.
The displayed directory sizes are now fully correct, although in their
current state they're not all that intuitive because:
directory size != sum(size of all files and subdirectories)
This should probably be fixed later on by splitting the sizes into a
shared and non-shared part.
Also, the sizes displayed after a recalculation or deletion are
incorrect; I'll fix this later on.
The directory sizes are now incorrect as hard links will be counted
twice again (as if there wasn't any detection in the first place), but
this will get fixed by adding a shared size field.
This method of keeping track of hard links is a lot faster and allows
adding an interface which lists the found links.
Hard link detection is now done in a separate pass on the in-memory tree,
and duplicates can be 'removed' and 're-added' on the fly. When making any
changes in the tree, all hard links are re-added before the operation and
removed again afterwards.
While this guarantees that all hard link information is correct, it does
have a few drawbacks. I can currently think of two:
1. It's not the most efficient way to do it, and may be quite slow on
large trees. Will have to do some benchmarks later to see whether
it is anything to be concerned about.
2. The first encountered item is considered as 'counted' and all items
encountered after that are considered as 'duplicate'. Because the
order in which we traverse the tree doesn't always have to be the
same, the items that will be considered as 'duplicate' can vary with
each deletion or re-calculation. This might cause confusion for
people who aren't aware of how hard links work.
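The 'first encountered is counted, later ones are duplicates' rule can be
sketched as follows. This is a simplification under stated assumptions: it
works on a flat array in traversal order rather than on the in-memory tree,
and it uses a quadratic scan where a real implementation would more likely
use a hash table. struct entry, mark_duplicates and clear_duplicates are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of the tree, in traversal order. */
struct entry {
    uint64_t dev, ino;  /* together these identify a file */
    int nlink;          /* link count as reported by stat() */
    int duplicate;      /* 1 if an earlier entry is the same file */
};

/* 'Remove' duplicates: the first entry with a given (dev, ino) stays
 * counted, every later one is flagged. Returns the number flagged.
 * O(n^2) for brevity. */
static size_t mark_duplicates(struct entry *e, size_t n) {
    size_t flagged = 0;
    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        e[i].duplicate = 0;
        if (e[i].nlink < 2)
            continue;  /* only one link, so it cannot be a duplicate */
        for (size_t j = 0; j < i; j++) {
            if (e[j].nlink >= 2 && e[j].dev == e[i].dev
                    && e[j].ino == e[i].ino) {
                e[i].duplicate = 1;
                flagged++;
                break;
            }
        }
    }
    return flagged;
}

/* 'Re-add' duplicates: clear all flags so every entry counts again. */
static void clear_duplicates(struct entry *e, size_t n) {
    for (size_t i = 0; i < n; i++)
        e[i].duplicate = 0;
}
```

Because the result depends on the order of the array, reordering the entries
changes which of two links gets flagged. That is the effect described in the
second drawback above.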