Lately my relay hasn't been seeing much traffic, which I didn't notice for a while, but now I'm turning my attention to it. I just updated to 0.4.8.9 and see these notices (with some lines cut out):
[all the startup stuff looks normal, ending with success]
Nov 09 13:07:31.000 [notice] External address seen and suggested by a directory authority: 24.212.137.85
Nov 09 13:08:29.000 [notice] Self-testing indicates your ORPort 24.212.137.85:9001 is reachable from the outside. Excellent. Publishing server descriptor.
Nov 09 13:10:39.000 [notice] Performing bandwidth self-test...done.
[but then]
Nov 09 13:36:42.000 [notice] Tor has not observed any network activity for the past 521 seconds. Disabling circuit build timeout recording.
Nov 09 13:38:03.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 09 13:38:03.000 [notice] Our circuit 0 (id: 151) died due to an invalid selected path, purpose Hidden service: Pre-built vanguard circuit. This may be a torrc configuration issue, or a bug.
Nov 09 13:39:24.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 09 13:40:10.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
[later]
Nov 09 14:01:27.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 09 14:01:34.000 [notice] Tor now sees network activity. Restoring circuit build timeout recording. Network was down for 2013 seconds during 43 circuit attempts.
Nov 09 14:02:35.000 [notice] No circuits are opened. Relaxed timeout for circuit 2627 (a Measuring circuit timeout 3-hop circuit in state doing handshakes with channel state open) to 60000ms. However, it appears the circuit has timed out anyway.
Nov 09 14:02:44.000 [notice] Tor has not observed any network activity for the past 66 seconds. Disabling circuit build timeout recording.
Nov 09 14:04:03.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 09 14:06:24.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
There are hundreds of those notices about failing to find a node for hop #1. (I don't know why it complains about the network being down for 2013 seconds (over half an hour), because I didn't notice anything, but there were scores of the same notices before that.)
What would cause this, or what could I do to identify the problem?
Bill Denton
--
William Denton
https://www.miskatonic.org/
Librarian, artist and licensed private investigator.
Toronto, Canada
CO₂: 419.18 ppm (Mauna Loa Observatory, 2023-11-07)
On Thu, Nov 09, 2023 at 06:33:08PM -0500, William Denton wrote:
Lately my relay hasn't been seeing much traffic, which I didn't notice for a while, but now I'm turning my attention to it. I just updated to 0.4.8.9 and see these notices (with some lines cut out):
Thanks for running a relay!
Do you know if you were seeing those messages on earlier Tor versions too?
Nov 09 13:36:42.000 [notice] Tor has not observed any network activity for the past 521 seconds. Disabling circuit build timeout recording.
Nov 09 13:38:03.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
These are client-side messages. That is, your Tor is acting as both a relay (because you configured it that way) and a client (in case you try to use it that way), and as a client it is not finding its network connection to be stable.
These particular circuits are probably the exploratory testing circuits that Tor clients make at first, to understand how fast their network is in order to avoid using the bottom 20% ("long tail") of circuits that take the longest to make.
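To make the "bottom 20%" idea concrete, here is a minimal sketch that picks a cutoff from a list of circuit build times using a plain nearest-rank percentile. It is only an illustration: the sample times are made up, the function name `empirical_cutoff` is hypothetical, and as far as I know Tor itself estimates this cutoff by fitting a distribution to observed build times rather than taking a raw percentile like this.

```python
import math


def empirical_cutoff(build_times_ms, percent=80):
    """Return the build time at or below which `percent`% of samples fall.

    Uses the nearest-rank method on the sorted samples. This is a
    simplification for illustration, not Tor's actual estimator.
    """
    if not build_times_ms:
        raise ValueError("need at least one build time")
    ordered = sorted(build_times_ms)
    # Nearest-rank: smallest rank covering `percent`% of the samples.
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up build times in milliseconds, for illustration only.
samples = [310, 420, 450, 500, 520, 610, 700, 900, 1500, 4000]

cutoff = empirical_cutoff(samples)
# Circuits slower than the cutoff are the "long tail" to avoid.
long_tail = [t for t in samples if t > cutoff]

print("cutoff (ms):", cutoff)
print("long tail:", long_tail)
```

With these ten samples, the two slowest build times land in the long tail and the rest are treated as acceptable.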
There are hundreds of those notices about failing to find a node for hop #1. (I don't know why it complains about the network being down for 2013 seconds (over half an hour), because I didn't notice anything, but there were scores of the same notices before that.)
What would cause this, or what could I do to identify the problem?
Well, the first question is, are you sure your network connection was stable and working this whole time? An easy explanation would be that you had connectivity problems during that time. From a relay perspective, Tor just sat there patiently, not minding that nobody was arriving, but from a client perspective it noticed that something was wrong and said so.
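One way to check this against other evidence (router logs, ISP outage reports) is to pull the outage windows out of the Tor log. The following is a rough sketch, assuming the notice format shown in the excerpt above; the sample lines are copied from this thread, and the function name `summarize` is just illustrative.

```python
import re
from collections import Counter

# Sample notices copied from the log excerpt in this thread.
SAMPLE_LOG = """\
Nov 09 13:36:42.000 [notice] Tor has not observed any network activity for the past 521 seconds. Disabling circuit build timeout recording.
Nov 09 13:38:03.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 09 13:39:24.000 [notice] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 09 14:01:34.000 [notice] Tor now sees network activity. Restoring circuit build timeout recording. Network was down for 2013 seconds during 43 circuit attempts.
Nov 09 14:02:44.000 [notice] Tor has not observed any network activity for the past 66 seconds. Disabling circuit build timeout recording.
"""

# "Mon DD HH:MM:SS.mmm [level] message"
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2})\.\d{3} \[(\w+)\] (.*)$")
IDLE_RE = re.compile(r"not observed any network activity for the past (\d+) seconds")
DOWN_RE = re.compile(r"Network was down for (\d+) seconds during (\d+) circuit attempts")


def summarize(log_text):
    """Count the client-side notices and collect reported outage windows."""
    counts = Counter()
    outages = []  # (timestamp when activity resumed, seconds down, attempts)
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not a timestamped log line
        stamp, _level, message = match.groups()
        if message.startswith("Failed to find node for hop #1"):
            counts["hop1_failures"] += 1
        elif IDLE_RE.search(message):
            counts["idle_notices"] += 1
        down = DOWN_RE.search(message)
        if down:
            outages.append((stamp, int(down.group(1)), int(down.group(2))))
    return counts, outages


counts, outages = summarize(SAMPLE_LOG)
print(dict(counts))
for stamp, seconds, attempts in outages:
    print(f"activity resumed {stamp}: down {seconds}s over {attempts} attempts")
```

To run it on a real log, read the file that your torrc `Log` line points at and pass its contents to `summarize`. If the reported outage windows line up with gaps seen elsewhere on the machine or the router, that points to connectivity rather than Tor.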
I am guessing that this is your relay: https://metrics.torproject.org/rs.html#details/2A6E7ABF43F9796AD4A13DF2B2047...
Another possibility, which I don't think applies in your case, is that your relay is so overwhelmed with traffic, and/or is rate limiting all of its traffic so heavily, that the client-side testing circuits are squeezed out. But based on the bandwidth graphs I don't think that's happening here.
--Roger
tor-relays@lists.torproject.org