All nodes bootstrap properly and reach 100%; both
authorities manage to vote and exchange information.
The relays and the client also bootstrap to 100%.
When are these messages logged?
Sorry, I must update this: the authorities bootstrap to 100%, but the
relays and the client are stuck at 80% (sometimes reaching 85%).
Nevertheless, the consensus seems to lack
relays with the Guard flag:
Feb 12 10:35:56.000 [notice] I learned some more
directory information, but not enough to build a circuit:
We need more microdescriptors: we have 2/2,
This log message says that there are only 2 nodes
in the consensus at that time.
and can only build 0% of likely paths.
(We have 0% of guards bw, 100% of midpoint bw, and 100%
of end bw (no exits in consensus,
This log message says that there are no exits in the
consensus at that time.
Right now there are even fewer available nodes and less bandwidth
showing up in the logs. This changes between runs, but never to more
promising numbers.
using mid) = 0% of path bw.)
Because of this, no default
circuits can be built in the client or the relays.
When there are only 2 nodes in the network, you
can't build a 3-hop path.
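(Tor multiplies the bandwidth fractions along a path, so the log
line above works out as 0% guard bw × 100% midpoint bw × 100% end
bw = 0% of path bw: with no usable guards, every candidate path
fails at hop #1.)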
There should be 8 nodes in total, so it's kind of strange that only 2
seem to be available to this relay.
In all logs, the following message
appears every second:
[warn] Failed to find node for hop #1 of our path.
Discarding this circuit.
…
In the data_dir/state file I see several guard entries:
Guard in=default rsa_id=[...] nickname=auth01
sampled_on=2019-01-17T18:33:12 sampled_by=0.3.5.7 listed=1
Guard in=default rsa_id=[...] nickname=relay03
sampled_on=2019-01-22T17:17:10 sampled_by=0.3.5.7
unlisted_since=2019-01-27T11:00:36 listed=0
Guard in=default rsa_id=[...] nickname=relay02
sampled_on=2019-01-24T22:19:10 sampled_by=0.3.5.7
unlisted_since=2019-01-29T09:08:59 listed=0
Guard in=default rsa_id=[...] nickname=relay03
sampled_on=2019-02-06T21:07:36 sampled_by=0.3.5.7
listed=1
Guard in=default rsa_id=[...] nickname=relay05
sampled_on=2019-01-27T16:37:38 sampled_by=0.3.5.7 listed=1
The state file says that there were some nodes
in some previous consensuses. None of these nodes come from
the current consensus at the time of your log messages.
I use a bash script that manages all the VMs. It kills Tor on all
machines, then waits 5 seconds just to be sure
(ShutdownWaitLength 0), then removes all cached files, old logs, the
state file, ... and some more things on the authorities (see below).
ssh auth01 rm /var/lib/tor/cached*
ssh auth01 rm /var/lib/tor/*.log
ssh auth01 rm /var/lib/tor/state
ssh auth01 rm -r /var/lib/tor/router-stability
ssh auth01 rm -r /var/lib/tor/sr-state
ssh auth01 rm -r /var/lib/tor/v3-status-votes
ssh auth01 rm -r /var/lib/tor/diff-cache
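In sketch form, the generic cleanup that runs on every machine looks
like this (hostnames are examples; the real list covers all 8 VMs):

# Stop Tor everywhere, wait, then wipe caches, logs and state.
HOSTS="auth01 auth02 relay01 relay02 relay03 client01"  # adjust to your setup
for host in $HOSTS; do ssh "$host" "pkill -x tor || true"; done
sleep 5   # ShutdownWaitLength is 0, so 5 seconds is plenty
for host in $HOSTS; do
  ssh "$host" "rm -f /var/lib/tor/cached* /var/lib/tor/*.log /var/lib/tor/state"
done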
The client also seems to receive
a complete consensus; at least, all fingerprints from my
setup show up when I fetch the file manually.
How do you fetch the file manually, and from where?
wget http://authip:7000/tor/server/all
which should be the cached-descriptors.new file on the authority
(which also means it gets deleted on each new startup and must be
fresh).
In this file I see all the fingerprints that are supposed to be
there. It's also possible to connect to the client's control port
and manually build circuits through all the relays that should be
there. This indicates that the client knows the relays (using a
fingerprint that is not in the consensus would not work).
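Note that /tor/server/all returns the router descriptors the
authority knows about, not the consensus itself, so the two can
disagree. To see what the current consensus actually contains
(including the Guard and Exit flags), you can fetch it from the same
DirPort, for example:

# Fetch the current consensus; each relay's "r" line is followed by
# an "s" line listing its flags (Guard, Exit, Stable, ...).
wget -qO- http://authip:7000/tor/status-vote/current/consensus | grep -E '^(r|s) '

Since your client is using microdescriptors, the microdesc consensus
at /tor/status-vote/current/consensus-microdesc is the one to check.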
Again, guards also show up in the relays' state files:
Guard in=default rsa_id=C122CBB79DC660621E352D401AD7F781F8F6D62D
nickname=relay03 sampled_on=2019-02-07T16:24:21 sampled_by=0.3.5.7
listed=1
Guard in=default rsa_id=2B74825BE33752B21D17713F88D101F3BADC79BC
nickname=relay06 sampled_on=2019-02-03T22:16:29 sampled_by=0.3.5.7
listed=1
Guard in=default rsa_id=E4B1152CDF0E5FE697A3E916716FC363A2A0ACF3
nickname=relay07 sampled_on=2019-02-12T18:51:00 sampled_by=0.3.5.7
listed=1
Guard in=default rsa_id=911EDA6CB639AAE955517F02AA4D651E0F7F6EFD
nickname=relay02 sampled_on=2019-02-11T22:58:28 sampled_by=0.3.5.7
listed=1
Guard in=default rsa_id=8E574F0C428D235782061F44B2D20A66E4336993
nickname=relay05 sampled_on=2019-02-01T17:46:05 sampled_by=0.3.5.7
listed=1
The dates are still old, even though I delete all state in the big
cleanup procedure. Are there more old caches I need to remove?
Where does the date information come from?
I'm not sure what is happening here. It looks like
some consensuses only have 2 nodes. But other consensuses have
most of the nodes.
You might have a bug in your network setup, or you
may have found a bug in Tor.
I think it's a bug somewhere in the setup but I just can't find it
:(
The most likely explanation is that you had a
working network at some time, which gave you the state file. And
you had a failed network at some time, which gave you the log
messages.
I suggest that you start again with the same
config, but remove all previous state.
(Move the cached state, consensuses, descriptors,
and log files somewhere else. Do not remove the keys.)
Then you'll know if your current network actually
works.
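For example, on each machine (relay01 stands in for each of your
hosts; the keys directory is deliberately left in place):

# Move the old state aside rather than deleting it, keeping the keys.
ssh relay01 "mkdir -p /var/lib/tor-old &&
  mv /var/lib/tor/cached* /var/lib/tor/state /var/lib/tor/*.log /var/lib/tor-old/"
# /var/lib/tor/keys is untouched, so identities and fingerprints survive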
My questions are: why does the client know all the relays'
fingerprints while the network still fails to finish bootstrapping
and build a complete circuit? Are there any other things I should
look into and check to understand the problem?
T