It was deprecated before, and has now become a hard compile error.
See https://github.com/ziglang/zig/pull/19847 .
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
I was trying to remove the need for that strip command with some
-fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables,
but passing even one of those to the ncurses build would result in an
ncdu binary that's twice as large and fails to run. I don't get it.
I had a feeling my last workaround wasn't correct; turns out my basic
assumption about ZSTD_decompressStream() was wrong: rather than
guaranteeing some output when there's enough input, it always guarantees
to consume input when there's space in the output.
Fixed the code and adjusted the buffers again.
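For reference, a minimal sketch (not ncdu's actual code) of the kind of
decompression loop that guarantee implies, roughly following zstd's own
streaming example, with error handling trimmed:

    #include <stdio.h>
    #include <stdlib.h>
    #include <zstd.h>

    static void decompress_file(FILE *in, FILE *out) {
        ZSTD_DStream *ds = ZSTD_createDStream();
        size_t const in_cap  = ZSTD_DStreamInSize();
        size_t const out_cap = ZSTD_DStreamOutSize();
        char *in_buf  = malloc(in_cap);
        char *out_buf = malloc(out_cap);

        size_t rd;
        while ((rd = fread(in_buf, 1, in_cap, in)) > 0) {
            ZSTD_inBuffer input = { in_buf, rd, 0 };
            /* Keep going until this chunk of input is fully consumed.
             * Individual calls may produce no output at all; the data is
             * simply buffered inside the decompressor. Progress is still
             * guaranteed because the output buffer always has free space. */
            while (input.pos < input.size) {
                ZSTD_outBuffer output = { out_buf, out_cap, 0 };
                size_t const ret = ZSTD_decompressStream(ds, &output, &input);
                if (ZSTD_isError(ret)) {
                    fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(ret));
                    exit(1);
                }
                fwrite(out_buf, 1, output.pos, out);
            }
        }
        free(in_buf);
        free(out_buf);
        ZSTD_freeDStream(ds);
    }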
Turns out that zstd can consume compressed data without returning any
decompressed data when the input buffer isn't full enough. I just
increased the input buffer as a workaround.
Fixes #245
json/scanner.zig in std notes inconsistencies in the standard as to
whether unpaired surrogate halves are allowed. That implementation
disallows them and so does this commit.
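Concretely, the rule is the usual UTF-16 pairing one: a high surrogate
(U+D800..U+DBFF) must be immediately followed by a low surrogate
(U+DC00..U+DFFF), and a lone low surrogate is never valid. An
illustrative check (not the actual scanner code), operating on the
16-bit code units decoded from consecutive \uXXXX escapes:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    static bool is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
    static bool is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

    /* Returns false if 'units' contains an unpaired surrogate half. */
    static bool surrogates_paired(const uint16_t *units, size_t n) {
        for (size_t i = 0; i < n; i++) {
            if (is_high_surrogate(units[i])) {
                if (i + 1 >= n || !is_low_surrogate(units[i + 1]))
                    return false;   /* high half without a low half */
                i++;                /* skip the low half of the pair */
            } else if (is_low_surrogate(units[i])) {
                return false;       /* low half on its own */
            }
        }
        return true;
    }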
Which is, AFAIK, a recommended practice. Reduces the number of times
translate-c is being run and (most likely) simplifies a possible future
transition if/when @cImport is thrown out of the language.
Also uses zstd.h instead of my own definitions, mainly because I plan to
use the streaming API as well and those need more definitions.
Haven't mentioned the new -O flag in the examples section yet. Let's
first keep it as a slightly lower-profile feature while the format gains
wider testing and adoption.
Kernfs checking was previously done for every directory scanned, but the
new parallel scanning code only performs the check when the dev id
differs from the parent's, which isn't nearly as common.
(In fact, in typical scenarios this only ever happens once per dev id,
rendering the cache completely useless. But even people with 10k bind
mounts are unlikely to notice a performance impact)
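A sketch of the idea (not the actual implementation, and the filesystem
magic list is abbreviated): only when a directory's st_dev differs from
its parent's does the scanner run the statfs() check, and the result can
be cached per dev id.

    #include <stdbool.h>
    #include <sys/types.h>
    #include <sys/statfs.h>
    #include <linux/magic.h>

    /* Is 'path' on a kernel filesystem like /proc or /sys? */
    static bool is_kernfs(const char *path) {
        struct statfs fs;
        if (statfs(path, &fs) != 0)
            return false;
        switch (fs.f_type) {
        case PROC_SUPER_MAGIC:
        case SYSFS_MAGIC:
            return true;   /* the real list is longer */
        default:
            return false;
        }
    }

    /* Called per directory; 'parent_dev' is the parent's st_dev. */
    static bool check_kernfs(const char *path, dev_t dev,
                             dev_t parent_dev, bool parent_is_kernfs) {
        if (dev == parent_dev)
            return parent_is_kernfs;   /* same fs, nothing to re-check */
        return is_kernfs(path);        /* crossed a mount: check (cacheable by dev) */
    }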
Saves another 70k on the x86_64 binary, more on x86.
None of the included C or Zig code will unwind the stack at any point,
so these sections seem pretty useless.
This prevents displaying invalid zero values or writing such values out
in JSON/bin exports. Very old issue, actually, but with the new binfmt
experiments it's finally started annoying me.
I realized that the 16 MiB limitation implied that the index block could
only hold ((2^24)-16)/8 =~ 2 mil data block pointers. At the default
64k data block size that means an export can only reference up to
~128 GiB of uncompressed data. That's pretty limiting.
This change increases the maximum size of the index block to 256 MiB,
supporting ~33 mil data block pointers and ~2 TiB of uncompressed data
with the default data block size.
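Spelled out, ignoring the small header subtracted above (illustrative
only):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        const uint64_t ptr_size   = 8;            /* bytes per data block pointer */
        const uint64_t data_block = 64 * 1024;    /* default data block size */
        const uint64_t old_index  = 16ull << 20;  /* 16 MiB index block */
        const uint64_t new_index  = 256ull << 20; /* 256 MiB index block */

        /* old: 2097152 pointers -> 128 GiB; new: 33554432 pointers -> 2 TiB */
        printf("old: %llu pointers, %llu GiB\n",
               (unsigned long long)(old_index / ptr_size),
               (unsigned long long)(old_index / ptr_size * data_block >> 30));
        printf("new: %llu pointers, %llu TiB\n",
               (unsigned long long)(new_index / ptr_size),
               (unsigned long long)(new_index / ptr_size * data_block >> 40));
        return 0;
    }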
I find it hard to imagine that this will happen on a real filesystem,
but it can be triggered by a malicious export file. Better protect
against that than invoke undefined behavior.
Sadly, it doesn't seem to be called on segfaults, which means those will
still output garbage. I could install a custom segfault handler, but I'm
not sure that's worth the effort.
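If it ever does become worth it, a minimal sketch of what such a handler
could look like, assuming the only goal is to restore the terminal
before crashing (endwin() isn't async-signal-safe, so this is
best-effort at most):

    #include <curses.h>
    #include <signal.h>
    #include <string.h>

    static void segv_handler(int sig) {
        endwin();    /* best effort: leave the terminal in a sane state */
        raise(sig);  /* re-delivered with the default action on return */
    }

    static void install_segv_handler(void) {
        struct sigaction sa;
        memset(&sa, 0, sizeof(sa));
        sigemptyset(&sa.sa_mask);
        sa.sa_handler = segv_handler;
        sa.sa_flags = SA_RESETHAND;  /* one-shot: default action takes over */
        sigaction(SIGSEGV, &sa, NULL);
    }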
Code is more hacky than I prefer, but this approach does work and isn't
even as involved as I had anticipated.
Still a few known bugs and limitations left to resolve.
This avoids embedding Zig's floating point formatting tables and
ancillary code, shaving 17k off the final static binary for x86_64.
Also adjusted the cut-off points for the units to be more precise.
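For illustration: size formatting with one decimal digit can be done
with integer math only, so no float-formatting code gets pulled in. The
cut-offs and rounding here are made up, not necessarily what ncdu does.

    #include <stdint.h>
    #include <stdio.h>

    static void format_size(char *buf, size_t buflen, uint64_t bytes) {
        static const char *units[] = { "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB" };
        int u = 0;
        uint64_t whole = bytes, frac = 0;

        while (whole >= 1024 && u < 6) {
            frac = (whole % 1024) * 10 / 1024;  /* one decimal digit, truncated */
            whole /= 1024;
            u++;
        }
        if (u == 0)
            snprintf(buf, buflen, "%llu %s", (unsigned long long)whole, units[u]);
        else
            snprintf(buf, buflen, "%llu.%llu %s",
                     (unsigned long long)whole, (unsigned long long)frac, units[u]);
    }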
And do dynamic buffer allocation for bin_export, removing 128k of
.rodata that I accidentally introduced earlier and reducing memory use
for parallel scans.
Static binaries now also include the minimal version of zstd, current
sizes for x86_64 are:
582k ncdu-2.5
601k ncdu-new-nocompress
765k ncdu-new-zstd
That's not great, but also not awful. Even zlib or LZ4 would've resulted
in a 700k binary.
By calling die() instead of propagating error unions. Not surprising
that error propagation has a performance impact, but I was hoping it
wasn't this bad.
Import performance was already quite good, but now it's even better!
With the one test case I have it's faster than JSON import, but I expect
that some dir structures will be much slower.
This isn't yet the low-memory browsing experience I was hoping to
implement, but it serves as a good way to test the new format, and such
a sink-based import is useful to have anyway.
Performance is much better than I had expected, and I haven't even
profiled anything yet.
This ended up a little different than I had originally planned.
The bad part is that my idea for the 'prevlnk' references wasn't going
to work out. For one, the reader has no efficient way to determine the
head reference of this list, and implementing a lookup table would be
pretty costly and complex. Second, even with those references working,
they'd be pretty useless because there's no way to go from an itemref to
a full path. I don't see an easy way to
solve these problems, so I'm afraid the efficient hardlink list feature
will have to be disabled when reading from this new format. :(
The good news is that removing these references simplifies the hardlink
counting implementation and removes the requirement for a global inode
map and associated mutex. \o/
Performance is looking really good so far, too.
The goals of this format being:
- Streaming parallel export with minimal mandatory buffering.
- Exported data includes cumulative directory stats, so reader doesn't
have to go through the entire tree to calculate these.
- Fast-ish directory listings without reading the entire file.
- Built-in compression.
Current implementation is missing compression, hardlink counting and
actually reading the file. Also need to tune and measure stuff.