This mostly avoids the issue of getting negative sizes. It's still
possible to get a negative size after a refresh or deletion; I'll get
to that in a bit.
There used to be four bytes of padding in the struct on systems with
32-bit pointers. Moving the pointers down so that they sit between the
64-bit and 32-bit fields means there will never be any padding between
the fields.
There may, however, be some extra padding at the end of the struct to
make the size a multiple of 8. Since we use the name field as a sort of
"flexible array member", we don't particularly care about that padding
and just want to allocate enough memory to hold the struct and the name
field. offsetof() allows us to do that without relying on compiler
support for flexible array members.
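
Roughly, the layout and the allocation trick look like this (a
simplified sketch with made-up field names, not the actual struct in
dir.h):

    #include <stddef.h>   /* offsetof */
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* 64-bit fields first, then pointers, then 32-bit fields: no padding
     * in between, whether pointers are 32 or 64 bits wide. */
    struct item {
      uint64_t size;
      uint64_t ino;
      struct item *parent;
      struct item *next;
      int32_t items;
      int32_t flags;
      char name[3];   /* the old "struct hack", no C99 flexible array member */
    };

    /* Allocate only up to 'name' plus the string itself, ignoring any
     * trailing padding the compiler may have added after 'name'. */
    static struct item *item_new(const char *name) {
      struct item *it = calloc(1, offsetof(struct item, name) + strlen(name) + 1);
      if (it)
        strcpy(it->name, name);
      return it;
    }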
POSIX defines ino_t to be an unsigned integer type, and searching
around the net didn't turn up any definitions that conflict with that.
So every ino_t can be represented in a uint64_t. (Assuming uint64_t is
the largest integer type in use for inode numbers, but I'm sure that
assumption will hold for a while.)
(dev_t, on the other hand, is a bit messier. Still figuring out what to
do with that.)
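
For the ino_t side, the whole assumption boils down to something like
this (illustrative only, not actual code in the tree):

    #include <stdint.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    /* POSIX makes ino_t an unsigned integer type, so as long as it is at
     * most 64 bits wide it fits losslessly in a uint64_t.  Poor man's
     * compile-time check: */
    typedef char ino_t_fits_in_uint64[sizeof(ino_t) <= sizeof(uint64_t) ? 1 : -1];

    static uint64_t item_ino(const struct stat *st) {
      return (uint64_t)st->st_ino;  /* plain widening, no information lost */
    }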
2 billion files should be enough for everyone. You probably won't have
enough memory to scan such a filesystem. int is a better choice than
long, as sizeof(int) is 4 on pretty much any system where ncdu runs.
The architecture is explained in dir.h. The reasons for these changes
are two-fold:
- calc.c was too complex; it simply did too many things. 399ccdeb is a
nice example of that: it should have been an easy fix, but it introduced
a segfault (fixed in 0b49021a) and added a small memory leak.
- This architecture features a pluggable input/output system, which
should make a file export/import feature relatively simple.
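
As a rough sketch of what such a pluggable interface could look like
(the names here are made up for illustration, not the actual
declarations in dir.h):

    /* An "input" walks a source (the filesystem, or later an export file)
     * and feeds every item it finds to an "output", which either builds
     * the in-memory tree or writes an export file. */

    struct dir;  /* one scanned item */

    struct dir_output {
      int  (*item)(struct dir *item);  /* called once per item */
      void (*final)(int fail);         /* called when the input is done */
    };

    /* inputs: each one drives whatever output is plugged in */
    int dir_scan_process(const char *path, struct dir_output *out);
    int dir_import_process(const char *file, struct dir_output *out);

    /* outputs */
    extern struct dir_output dir_mem_output;     /* build the in-memory tree */
    extern struct dir_output dir_export_output;  /* write an export file */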
The current commit does not feature any user interface, so there's no
feedback yet when scanning a directory. I'll get to that in a bit.
I've also not tested the new scanning code very well yet, so I might
have introduced some bugs.
The name is now stored inside the struct itself, rather than as a
pointer to a separate memory allocation. This saves some memory and
improves performance by significantly decreasing the number of calls to
[c|m]alloc() and free().
This optimizes a few actions (though not all), and makes the code easier
to understand and expand.
The behaviour of the browser has changed a bit with regard to
multi-page listings. Personally I don't like this change much, so I'll
probably fix that later on.
The displayed directory sizes are now fully correct, although in its
current state it's not all that intuitive because:
directory size != sum(size of all files and subdirectories)
This should probably be fixed later on by splitting the sizes into a
shared and non-shared part.
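
To illustrate why it's unintuitive (made-up numbers): if a 1 MiB file is
hard-linked into both subdirectory A and subdirectory B, then A and B
each report 1 MiB, but their parent counts the file only once, so the
parent shows 1 MiB while the sum of A and B is 2 MiB.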
Also, the sizes displayed after a recalculation or deletion are
incorrect; I'll fix this later on.
The directory sizes are now incorrect as hard links will be counted
twice again (as if there wasn't any detection in the first place), but
this will get fixed by adding a shared size field.
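
A minimal sketch of what such a field could look like (illustrative
only, not the actual struct):

    struct dir {
      uint64_t size;         /* size with each hard-linked file counted once */
      uint64_t shared_size;  /* part of 'size' that is also counted elsewhere
                              * through hard links */
      /* ... */
    };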
This method of keeping track of hard links is a lot faster and allows
adding an interface which lists the found links.
Hard link detection is now done in a separate pass on the in-memory tree,
and duplicates can be 'removed' and 're-added' on the fly. When making any
changes in the tree, all hard links are re-added before the operation and
removed again afterwards.
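
Roughly, such a pass could look like this (assumed struct and function
names, not the actual code; the real thing would want a hash table
rather than a linear list to stay fast, and error handling is omitted):

    #include <stdint.h>
    #include <stdlib.h>

    /* Simplified item struct for illustration. */
    struct item {
      struct item *parent, *next, *sub;
      uint64_t size, dev, ino;
      int nlink;
      int is_dup;   /* set when this item is counted elsewhere */
    };

    /* Minimal "seen" set of (dev, ino) pairs. */
    struct seen { uint64_t dev, ino; struct seen *next; };

    static int seen_add(struct seen **set, uint64_t dev, uint64_t ino) {
      struct seen *s;
      for (s = *set; s; s = s->next)
        if (s->dev == dev && s->ino == ino)
          return 1;                     /* this link has been counted already */
      s = malloc(sizeof *s);
      s->dev = dev; s->ino = ino; s->next = *set; *set = s;
      return 0;
    }

    /* "Remove" a duplicate: subtract its size from all ancestors so the
     * linked data is only counted once.  "Re-adding" is the inverse. */
    static void dup_remove(struct item *it) {
      struct item *p;
      it->is_dup = 1;
      for (p = it->parent; p; p = p->parent)
        p->size -= it->size;
    }

    /* The separate pass: depth-first over the whole in-memory tree.  The
     * first item seen for each (dev, ino) pair stays counted, every later
     * one is treated as a duplicate. */
    static void link_pass(struct item *it, struct seen **set) {
      for (; it; it = it->next) {
        if (it->nlink > 1 && seen_add(set, it->dev, it->ino))
          dup_remove(it);
        if (it->sub)
          link_pass(it->sub, set);
      }
    }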
While this guarantees that all hard link information is correct, it does
have a few drawbacks. I can currently think of two:
1. It's not the most efficient way to do it, and may be quite slow on
large trees. Will have to do some benchmarks later to see whether
it is anything to be concerned about.
2. The first encountered item is considered 'counted' and all items
encountered after that are considered 'duplicate'. Because the
order in which we traverse the tree doesn't always have to be the
same, the items that will be considered as 'duplicate' can vary with
each deletion or re-calculation. This might cause confusion for
people who aren't aware of how hard links work.