teor teor2345@gmail.com wrote:
On 22 Sep 2017, at 21:24, Scott Bennett bennett@sdf.org wrote:
nusenu nusenu-lists@riseup.net wrote:
Thanks for the pointer. I'm glad it has been reported, but I still
have no sense of what in tor is malfunctioning because the compression has failed.
It's ok, we're also somewhat clueless. But we're good at tracking down weird bugs. Just give us some time.
Are user cells lost? Do user connections get broken when this happens?
This error occurs during directory document decompression using zstd.
There's a loop that calls out to a function that decompresses some data. The loop is meant to terminate when decompression finishes. But if decompression doesn't make any progress, it terminates with an error. (Otherwise, we'd just keep spinning endlessly, wondering when the decompressor was going to cough up something useful.)
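For illustration only, here is a minimal Python sketch of that kind of loop. It is not tor's actual C code, and it uses the standard-library zlib module as a stand-in for zstd. The names decompress_all, NoProgressError and chunk_size are made up for this example. The point it shows is that the loop stops normally when the stream ends, and gives up with an error when a pass consumes no input and produces no output.

```python
import zlib


class NoProgressError(Exception):
    """Hypothetical error: a pass consumed no input and produced no output."""


def decompress_all(data: bytes, chunk_size: int = 64) -> bytes:
    # zlib is a stand-in here; tor's real loop wraps several codecs in C.
    decompressor = zlib.decompressobj()
    output = bytearray()
    pending = data

    # The loop is meant to end when the decompressor reports end-of-stream.
    while not decompressor.eof:
        chunk, pending = pending[:chunk_size], pending[chunk_size:]
        produced = decompressor.decompress(chunk)
        output += produced

        # No input left to feed and nothing came out: we would spin forever,
        # so bail out with an error instead.
        if not chunk and not produced:
            raise NoProgressError("decompression stalled before end of stream")

    return bytes(output)
```

Feeding it a complete stream returns the original bytes. Feeding it a stream that was cut off partway raises NoProgressError once the input runs out, which is roughly the situation the log message describes.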
The relay discards the received document, and tries another mirror. At the moment, most mirrors don't support zstd, so the relay will likely get a zlib-compressed document, decompress it successfully, and continue merrily on its way. And the users don't even notice.
Oh. Okay.
Although, I wonder if we should do more to guard against network-wide breakage like this. It would be awkward if all relays ran into an error like this at the same time.
Fortunately, zstd isn't available for all platforms and architectures. So as long as we steer clear of a platform monoculture, we'll be fine. (That said, this particular bug is on BSD. But people should still run relays on BSD, so we get a good OS mix.)
Are the BSDs the only systems that have zstd available?
Has the error been reported to the project(s) that develop the compression libraries?
Well, it's kinda rude to go straight to blaming zstd. First we have to work out what the issue is, and whose code it's in.
I guess I misunderstood the traceback. Apologies.
Although, I have to say that it wouldn't surprise me if we have a zstd version incompatibility here. Or maybe truncated input data?
Here's what I have.
Sep 23 00:36:33.957 [notice] Tor 0.3.1.6-rc (git-efc306c59aa9ee1a) running on FreeBSD with Libevent 2.1.8-stable, OpenSSL 1.0.2l, Zlib 1.2.8, Liblzma 5.2.2, and Libzstd 1.3.1.
Truncated input data could definitely explain why zstd can't make any progress, but isn't returning an error code. In that case, it's our fault for not giving zstd enough data to work with. (And not failing the loop quietly?)
Now, that's interesting because last night my Comcast connection was down and back up a few times beginning around 11:30 p.m. until it went down just before 1:30 a.m. and stayed down for about an hour and a half. During that time, of course, I called Comcast to get whatever information (as usual, not much) I could about the outage. They said their crew was out working on it and that it affected 327(?) customers in the area, so I surmised something was gradually failing (such nighttime down and back up sequences have been becoming more frequent recently) and after my complaints a couple of days earlier, they figured out what it was and were replacing it. The error messages started last night after they were done and the line was working again. If there's some leftover file that's dirty from having been cut off during transmission that continues to trigger the error messages from time to time, I'll get rid of it if someone can tell me what to look for.
Because if it were corrupt data, it really should return an error: http://facebook.github.io/zstd/zstd_manual.html#Chapter9
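As a rough analogy (again using Python's standard-library zlib, not zstd, so this says nothing definite about libzstd itself), the sketch below shows the difference between the two cases. The helper name classify and the sample payload are made up for the example. A truncated stream just stops short with no error, while a stream with a damaged checksum raises one.

```python
import zlib


def classify(blob: bytes) -> str:
    """Hypothetical helper: report how a streaming decompressor treats blob."""
    decompressor = zlib.decompressobj()
    try:
        decompressor.decompress(blob)
    except zlib.error:
        # The library detected invalid data and said so.
        return "error"
    # No error: either the stream ended cleanly, or it is waiting for more input.
    return "complete" if decompressor.eof else "stalled"


payload = b"consensus " * 200
good = zlib.compress(payload)

# Truncated: a valid prefix of the stream, cut off partway through.
truncated = good[: len(good) // 2]

# Corrupt: flip the last byte, which sits in the trailing Adler-32 checksum.
corrupt = good[:-1] + bytes([good[-1] ^ 0xFF])

for name, blob in [("good", good), ("truncated", truncated), ("corrupt", corrupt)]:
    print(name, classify(blob))
```

If zstd behaves the same way, a "no progress, but no error" result would point at truncated input rather than corrupted input.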
Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************