Which is, AFAIK, a recommended practice. It reduces the number of times
translate-c is run and (most likely) simplifies a possible future
transition if/when @cImport is thrown out of the language.
Also uses zstd.h instead of my own definitions, mainly because I plan to
use the streaming API as well and that needs more definitions.
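
A rough sketch of what the consolidated import could look like (the file
name and exact layout here are my own illustration, not the actual ncdu
source):

    // src/c.zig (hypothetical): the one place where @cImport / translate-c runs.
    pub const c = @cImport({
        @cInclude("zstd.h");
    });

Every other file then does a plain @import of that file instead of running
its own @cImport. And a minimal sketch of the streaming API those extra
definitions are for, assuming the import above; real code has to loop until
ZSTD_compressStream2 reports that everything has been flushed:

    const c = @import("c.zig").c;

    fn compressChunk(cctx: ?*c.ZSTD_CCtx, in: []const u8, out: []u8) !usize {
        var input = c.ZSTD_inBuffer{ .src = in.ptr, .size = in.len, .pos = 0 };
        var output = c.ZSTD_outBuffer{ .dst = out.ptr, .size = out.len, .pos = 0 };
        // Simplified: finish the frame in one call and assume `out` is large enough.
        const rem = c.ZSTD_compressStream2(cctx, &output, &input, c.ZSTD_e_end);
        if (c.ZSTD_isError(rem) != 0) return error.ZstdCompress;
        return output.pos;
    }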
This prevents displaying invalid zero values or writing such values out
in JSON/bin exports. Very old issue, actually, but with the new binfmt
experiments it's finally started annoying me.
I realized that the 16 MiB limitation implied that the index block could
only hold ((2^24)-16)/8 =~ 2 mil data block pointers. At the default
64k data block size that means an export can only reference up to
~128 GiB of uncompressed data. That's pretty limiting.
This change increases the maximum size of the index block to 256 MiB,
supporting ~33 mil data block pointers and ~2 TiB of uncompressed data
with the default data block size.
Code is more hacky than I prefer, but this approach does work and isn't
even as involved as I had anticipated.
Still a few known bugs and limitations left to resolve.
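
For reference, the capacity math above as a quick sketch (the 16-byte
header and 8-byte pointer size are taken from the numbers mentioned above,
not from the format spec):

    const std = @import("std");

    pub fn main() void {
        const index_sizes = [_]u64{ 1 << 24, 1 << 28 }; // 16 MiB old, 256 MiB new
        for (index_sizes) |index_max| {
            const ptrs = (index_max - 16) / 8;   // 8-byte data block pointers
            const data = ptrs * (64 * 1024);     // default 64k data blocks
            std.debug.print("{d} pointers -> {d} GiB\n", .{ ptrs, data >> 30 });
        }
    }

Which prints roughly 2 million pointers / ~128 GiB for the old limit and
~33.5 million pointers / ~2 TiB for the new one.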
And do dynamic buffer allocation for bin_export, removing 128k of
.rodata that I accidentally introduced earlier and reducing memory use
for parallel scans.
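
Roughly the pattern this refers to, as a hypothetical sketch (not the
actual bin_export code): allocate the buffer on first use instead of baking
a comptime-initialized array into the binary.

    const std = @import("std");

    // A global like `const buf = [_]u8{0} ** (128 * 1024);` is comptime-initialized
    // and ends up in .rodata; allocating on demand keeps it out of the binary and
    // out of scans that never export anything.
    var export_buf: ?[]u8 = null;

    fn getExportBuf(alloc: std.mem.Allocator) ![]u8 {
        if (export_buf == null) export_buf = try alloc.alloc(u8, 128 * 1024);
        return export_buf.?;
    }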
Static binaries now also include the minimal version of zstd; current
sizes for x86_64 are:
582k ncdu-2.5
601k ncdu-new-nocompress
765k ncdu-new-zstd
That's not great, but also not awful. Even zlib or LZ4 would've resulted
in a 700k binary.
By calling die() instead of propagating error unions. Not surprising
that error propagation has a performance impact, but I was hoping it
wasn't this bad.
Import performance was already quite good, but now it's even better!
With the one test case I have it's faster than JSON import, but I expect
that some dir structures will be much slower.
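
Illustrative sketch of the pattern (hypothetical helpers, not the real ncdu
code): the hot path aborts via die() instead of returning an error union
that every caller has to propagate.

    const std = @import("std");

    fn die(msg: []const u8) noreturn {
        std.debug.print("ncdu: {s}\n", .{msg});
        std.process.exit(1);
    }

    // Error-union version: callers carry the propagation cost.
    fn readByte(rd: anytype) !u8 {
        return rd.readByte();
    }

    // die() version: plain u8 return, no error union on the hot path.
    fn readByteOrDie(rd: anytype) u8 {
        return rd.readByte() catch die("unexpected end of file");
    }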
This isn't yet the low-memory browsing experience I was hoping to
implement, but it serves as a good way to test the new format, and such a
sink-based import is useful to have anyway.
Performance is much better than I had expected, and I haven't even
profiled anything yet.
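
Very rough sketch of what the sink-style import mentioned above might look
like (the names and shape are made up for illustration): the importer
pushes entries into a sink as it reads, so nothing forces the whole tree to
be held in memory.

    // Hypothetical sink interface: the importer calls these while walking the input.
    const Sink = struct {
        ctx: *anyopaque,
        enterDir: *const fn (ctx: *anyopaque, name: []const u8) void,
        addFile: *const fn (ctx: *anyopaque, name: []const u8, size: u64) void,
        leaveDir: *const fn (ctx: *anyopaque) void,
    };

    // Trivial example sink that only sums file sizes, keeping no tree at all.
    const TotalSize = struct {
        total: u64 = 0,

        fn addFile(ctx: *anyopaque, _: []const u8, size: u64) void {
            const self: *TotalSize = @ptrCast(@alignCast(ctx));
            self.total += size;
        }
        fn enterDir(_: *anyopaque, _: []const u8) void {}
        fn leaveDir(_: *anyopaque) void {}

        fn sink(self: *TotalSize) Sink {
            return .{ .ctx = self, .enterDir = enterDir, .addFile = addFile, .leaveDir = leaveDir };
        }
    };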