On 22 Sep 2017, at 21:24, Scott Bennett bennett@sdf.org wrote:
nusenu nusenu-lists@riseup.net wrote:
Thanks for the pointer. I'm glad it has been reported, but I still
have no sense of what in tor is malfunctioning because the compression has failed.
It's ok, we're also somewhat clueless. But we're good at tracking down weird bugs. Just give us some time.
Are user cells lost? Do user connections get broken when this happens?
This error occurs during directory document decompression using zstd.
There's a loop that calls out to a function that decompresses some data. The loop is meant to terminate when decompression finishes. But if decompression doesn't make any progress, it terminates with an error. (Otherwise, we'd just keep spinning endlessly, wondering when the decompressor was going to cough up something useful.)
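To make the shape of that loop concrete, here is a rough sketch. It is not tor's code: it uses Python's standard zlib module as a stand-in for zstd, and the names decompress_all and NoProgressError are made up for illustration.

```python
import zlib


class NoProgressError(Exception):
    """Hypothetical error: a step neither consumed input nor produced output."""


def decompress_all(data: bytes, chunk_size: int = 64) -> bytes:
    """Decompress `data` chunk by chunk, failing instead of spinning forever."""
    decomp = zlib.decompressobj()
    out = bytearray()
    pos = 0
    while not decomp.eof:  # the loop is meant to end when the stream completes
        chunk = data[pos:pos + chunk_size]
        pos += len(chunk)
        produced = decomp.decompress(chunk)
        out += produced
        if not chunk and not produced:
            # No input left to feed and nothing came back: no progress.
            raise NoProgressError("decompression stalled before end of stream")
    return bytes(out)
```

The progress check is the only thing standing between an incomplete stream and an endless loop, which is why the failure shows up as an error rather than a hang.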
The relay discards the received document, and tries another mirror. At the moment, most mirrors don't support zstd, so the relay will likely get a zlib-compressed document, decompress it successfully, and continue merrily on its way. And the users don't even notice.
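The fallback could be sketched roughly like this. Again, this is an illustration under assumptions, not what tor does internally: fetch_document and the list of (mirror, body) pairs are hypothetical, and zlib stands in for every compression method.

```python
import zlib


def fetch_document(responses):
    """Return (mirror, document) for the first response that decompresses cleanly.

    `responses` is a hypothetical list of (mirror, compressed_bytes) pairs
    standing in for answers from different directory mirrors.
    """
    for mirror, body in responses:
        decomp = zlib.decompressobj()
        try:
            document = decomp.decompress(body)
        except zlib.error:
            continue  # corrupt response: discard it and try the next mirror
        if not decomp.eof:
            continue  # incomplete stream: discard it and try the next mirror
        return mirror, document
    return None  # every mirror failed


good = zlib.compress(b"consensus")
responses = [("mirror-a", good[:-3]), ("mirror-b", b"garbage"), ("mirror-c", good)]
result = fetch_document(responses)
```

Here the first two responses are thrown away and the third one is used, so the caller never sees the failures.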
Although, I wonder if we should do more to guard against network-wide breakage like this. It would be awkward if all relays ran into an error like this at the same time.
Fortunately, zstd isn't available for all platforms and architectures. So as long as we steer clear of a platform monoculture, we'll be fine. (That said, this particular bug is on BSD. But people should still run relays on BSD, so we get a good OS mix.)
Has the error been reported to the project(s) that develop the compression libraries?
Well, it's kinda rude to go straight to blaming zstd. First we have to work out what the issue is, and whose code it's in.
Although, I have to say that it wouldn't surprise me if we have a zstd version incompatibility here. Or maybe truncated input data?
Truncated input data could definitely explain why zstd can't make any progress, but isn't returning an error code. In that case, it's our fault for not giving zstd enough data to work with. (And not failing the loop quietly?)
Because if it were corrupt data, it really should return an error: http://facebook.github.io/zstd/zstd_manual.html#Chapter9
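The difference between the two cases is easy to see with a streaming decompressor. This small demonstration uses Python's zlib rather than zstd, so it only shows the general pattern and says nothing about how zstd itself behaves on this input.

```python
import zlib

original = b"router descriptor " * 40
stream = zlib.compress(original)

# Truncated input: no exception, but the stream never reports completion.
truncated = zlib.decompressobj()
partial = truncated.decompress(stream[:len(stream) // 2])
truncated_finished = truncated.eof          # stays False
stalled_output = truncated.decompress(b"")  # nothing more comes out

# Corrupt input: the library reports an error outright.
try:
    zlib.decompressobj().decompress(b"this is not a zlib stream")
    corrupt_raised = False
except zlib.error:
    corrupt_raised = True
```

With truncated data the caller has to notice the stall itself; with corrupt data the library says so.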
T
-- Tim Wilson-Brown (teor)
teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org