On Thu, Dec 26, 2013 at 6:05 AM, Karsten Loesing karsten@torproject.org wrote:
On 12/17/13 10:31 PM, Nick Mathewson wrote:
164 Reporting the status of server votes
This proposal explains a way for authorities to provide a slightly more verbose document that relay operators can use to diagnose reasons that their router was or was not listed in the consensus. These documents would be like slightly more verbose versions of the authorities' votes, and would explain *why* the authority voted as it did. It wouldn't be too hard to implement, and would be a fine project for somebody who wants to get to know the directory code. (5/2011)
Hi Nick,
I very much like this proposal! I want to help move it forward and integrate the additional information into Onionoo, so that people can diagnose the network better. Knowing why an authority rejected a descriptor, when it last performed a successful and an unsuccessful reachability test, why it didn't include a relay in its vote, why it assigned which relay flags, etc. can be very helpful information.
Here's some feedback:
- The URL /tor/status-vote-info/current/authority[.z] doesn't really fit into the schema that prefixes everything related to the voting process with /tor/status-vote/(next|current)/. A more consistent choice would be /tor/status-vote/current/vote-info[.z].
Sounds okay to me.
- The WFU and MTBF thresholds have been contained in votes in "flag-thresholds" lines since February 2013 (admittedly, four years after the proposal was written). We should either use the same line format in vote-info documents, or leave out flag thresholds here.
If they're already in votes, then we should just include them verbatim in vote-info. vote-info should not diverge from vote needlessly.
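To illustrate what copying the line verbatim buys a consumer such as Onionoo, here is a minimal sketch of a parser that would work unchanged on a vote or on a vote-info document. It assumes the line is the keyword followed by space-separated key=value pairs, with some values given as percentages; the sample values are made up and not taken from a real vote, and the function name is hypothetical.

```python
def parse_flag_thresholds(line):
    """Parse a "flag-thresholds" line into a dict of key -> number.

    Assumes space-separated key=value pairs; percentages become fractions.
    """
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "flag-thresholds":
        raise ValueError("not a flag-thresholds line: %r" % line)
    thresholds = {}
    for item in rest.split():
        key, sep, value = item.partition("=")
        if not sep:
            continue  # ignore anything that is not key=value
        if value.endswith("%"):
            thresholds[key] = float(value[:-1]) / 100.0
        else:
            try:
                thresholds[key] = int(value)
            except ValueError:
                thresholds[key] = float(value)
    return thresholds


# Illustrative values only.
SAMPLE = ("flag-thresholds stable-uptime=693369 stable-mtbf=153249 "
          "fast-speed=40960 guard-wfu=94.669% guard-tk=691200 enough-mtbf=1")

if __name__ == "__main__":
    for key, value in sorted(parse_flag_thresholds(SAMPLE).items()):
        print(key, value)
```

Unknown keys are kept rather than rejected, so the same code keeps working if authorities add thresholds later.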
- The proposal says in two places that explanations should be given in English. The better approach, IMO, would be to enumerate all possible explanations in dir-spec.txt and assign error codes to them. Reasons include: a) requires fewer bytes in an authority's memory; b) requires fewer bytes in vote-info documents; c) easier for applications to process vote-info documents; d) forces us to enumerate reasons for rejecting a router descriptor or not including a router in a vote and explicitly specify them in dir-spec.txt. (Happy to help enumerate reasons.)
There will always be more explanations we didn't think of; what if we do something that uses enumerated error codes *and* explanatory messages?
The total size of vote-info documents won't actually be affected if we store them compressed. Perhaps we should make compression mandatory.
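As a rough sketch of how codes and messages could coexist, consider a line that carries a numeric code followed by optional free text. Everything here is hypothetical: the "reason" keyword, the numeric codes, the line layout, and the function names are assumptions for illustration, and none of them appears in the proposal or in dir-spec.txt.

```python
from typing import NamedTuple


class Reason(NamedTuple):
    code: int       # machine-readable; would be enumerated in the spec
    message: str    # free-form text for humans; may be empty


def format_reason(code, message=""):
    """Render a hypothetical 'reason <code> [<message>]' line."""
    if "\n" in message:
        raise ValueError("message must fit on one line")
    return ("reason %d %s" % (code, message)).rstrip()


def parse_reason(line):
    """Parse a hypothetical reason line; the message is optional."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2 or parts[0] != "reason":
        raise ValueError("not a reason line: %r" % line)
    message = parts[2] if len(parts) == 3 else ""
    return Reason(int(parts[1]), message)


if __name__ == "__main__":
    line = format_reason(7, "descriptor published too far in the past")
    print(line)
    print(parse_reason(line))
```

Applications would switch on the code and ignore the text, while a code they have never seen still arrives with a readable explanation.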
- I'd want to make the format of vote-info documents more compact, though I don't have a good suggestion yet. (But I also didn't want to delay sending this email, so here's my half-baked thought.) Ideally, every status entry has one "r" line to identify the relay and then one line per noteworthy event. Noteworthy events are: a) the authority receives a router descriptor and decides whether to accept or reject it; b) the authority performs a reachability test that is either successful or not; c) the authority produces a vote document and decides whether to include a relay and what flags to assign. (Are there more events that are worth including?) I'm aware that you mentioned the same information in the proposal; I'm just wondering about better ways to represent it. As I said, this thought is only half-baked and will hopefully become clearer when going through the code.
- The proposal says under "Risks" that it doesn't make provisions for caching these documents. But authorities have to cache these documents! An authority can only generate a vote-info document at the same time as it generates a vote document. Any later attempt to say why it voted the way it did would require the authority to keep state that it doesn't need for anything else. The authority should simply write its vote-info document to disk and serve it whenever somebody asks for it. (To be extra precise, it should only serve a vote-info document once the consensus has become valid.)
I meant that these new documents aren't cached at directory caches. (And they possibly shouldn't be.) But this opens a risk of creating more traffic at authorities, which wouldn't be good.
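The authority-side behaviour described above (write the document to disk when the vote is generated, serve it only once the consensus is valid) can be sketched roughly as follows. This is an illustration under assumptions, not how tor does or would implement it: the class name, the file path, and the use of plain integer timestamps are all hypothetical.

```python
import os


class VoteInfoCache:
    """Keep one vote-info document on disk and gate serving on valid-after."""

    def __init__(self, path):
        self.path = path
        self.valid_after = None  # timestamp at which the consensus becomes valid

    def store(self, document, valid_after):
        # Written at the same moment the vote is generated, so no extra
        # state has to be kept around afterwards.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            f.write(document)
        os.replace(tmp, self.path)  # atomic rename
        self.valid_after = valid_after

    def serve(self, now):
        # Return None (think "404") until the consensus has become valid.
        if self.valid_after is None or now < self.valid_after:
            return None
        with open(self.path) as f:
            return f.read()
```

Because every request is answered from a file the authority already has, the per-request cost is low, but the number of requests still lands on the authorities if directory caches don't mirror the document.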
There, that concludes my review of directory-protocol related proposals. Looking forward to what you think. Please let me know how I can best move these proposals forward, e.g., by writing patches to the proposals or dir-spec.txt, by writing code, etc.
For proposal 164, I'd love it if you can patch the proposal to make the uncontroversial changes above. (That is, the changes that we can agree on how to do.)
Writing patches to merge stuff into dir-spec isn't useful without code; we don't change dir-spec until after the code changes, per 001-process.txt.
As for writing code, that sounds fine! Let me know some time which ones you're most interested in working on, and we can figure out how to prioritize.
Most of these are currently in "Lorax" status[*], meaning that I agree that they'd be a good idea, but nobody is currently writing code for them or making a plan to do so. (I think I got a partial implementation of 185 at some point.) In some cases (like 212) the biggest issue is testing.
[*] "Unless someone like you cares a whole awful lot, / Nothing is going to get better. It's not!" --Dr. Seuss, _The Lorax_