[Declassifying this discussion and posting on [tor-dev]]
David Goulet dgoulet@ev0ke.net writes:
Hello HS elves!
I wrote a document to organize my thoughts and also list what we currently have in the bug tracker about HS behaviours that we want to understand/measure/assess/track.
It's a bit long, but you can skip the first section describing the tickets and go right to the How and the Work to be done.
Nick, you will see there is a SponsorS component, but I didn't go into hard details there. We all know we need a testing network, but for now I'm more focused on making sure we can collect the right data (for HS).
A very important part I would like feedback on is the "HS health service", for which I would like us all to agree on its usefulness and on the way to do it properly.
Cheers! David
This document describes the methodology and technical details of a hidden service measurement framework/tool/<insert what this is>.
NOTE: This is NOT intended to be run in the real Tor network. ONLY testing.
Why and What
The goal is to answer some questions we have regarding HS behaviours. Most of them have a ticket assigned to them but need an experiment and/or added feature(s) so we can measure what we need.
Is rend_cache_clean_v2_descs_as_dir cutoff crazy high? https://trac.torproject.org/projects/tor/ticket/13207
In order to address this, it seems we need a way to measure all the interactions that an HSDir and a client have with the descriptor cache. We need to assess the rend cache cleanup timing values, which will also help with the upload and refetch timings.
What's the average number of hsdir fetches before we get the hsdesc? https://trac.torproject.org/projects/tor/ticket/13208
Using the control port for that is trivial, but it needs a testing network that is set up and has actual load on it.
It could also be implemented as a feature of an "HS health measurement tool", with a client repeatedly fetching the same .onion address at random times.
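Since this is control-port territory, here is a rough sketch of what such a client-side counter could look like with stem's HS_DESC events. The event and attribute names are from stem's API; treat the exact action values as an assumption to double-check against control-spec.txt.

    import collections, time
    from stem.control import Controller, EventType

    counts = collections.Counter()          # HS_DESC action -> number of events

    def on_hs_desc(event):
        # event.directory is the HSDir the request went to, event.action is
        # e.g. REQUESTED, RECEIVED or FAILED.
        counts[event.action] += 1
        print('%s %s via %s' % (event.action, event.address, event.directory))

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        controller.add_event_listener(on_hs_desc, EventType.HS_DESC)
        # ...trigger a fetch of the .onion we control (e.g. over the SOCKS
        # port), then wait for the events to trickle in...
        time.sleep(60)

    print('%d HSDirs probed, %d failures, before getting the descriptor' %
          (counts['REQUESTED'], counts['FAILED']))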
Write a hidden service hsdir health measurer https://trac.torproject.org/projects/tor/ticket/13209
This is a useful one: being able to correlate relay churn with HS descriptor fetches. It needs more brainstorming on how we could set up some sort of client or service that reports/logs the results of crunching the consensus for the HSDirs of a specific .onion address that we know and control.
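On the relay churn side, here is a rough sketch of a churn logger that polls the control port once per consensus period (assuming stem; correlating the timestamps with descriptor fetch results would be a post-processing step):

    import time
    from stem.control import Controller

    def hsdir_fingerprints(controller):
        # fingerprints of all relays carrying the HSDir flag in the consensus
        return set(r.fingerprint for r in controller.get_network_statuses()
                   if 'HSDir' in r.flags)

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        previous = hsdir_fingerprints(controller)
        while True:
            time.sleep(3600)                 # one consensus period
            current = hsdir_fingerprints(controller)
            gone, new = previous - current, current - previous
            print('%s: %d HSDirs left, %d joined (out of %d)' %
                  (time.ctime(), len(gone), len(new), len(current)))
            previous = current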
Refactor rend_client_refetch_v2_renddesc() https://trac.torproject.org/projects/tor/ticket/13223
Ensure the correctness of this very important function, which does the fetches for the client. It is where the HSDirs (with replicas) are looped over so the descriptor can be fetched.
Maybe we want three preemptive internal circs for hidden services? https://trac.torproject.org/projects/tor/ticket/13239
That's pretty trivial to measure and quantify with the tracing instrumentation added in Tor. No new feature is needed, but an experiment has to be designed to compare 2 internal circuits versus 3.
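As a cheap cross-check of the tracing numbers, the control port can at least tell us how many hidden-service-purpose circuits are open at any time. A rough stem sketch (the circuit purpose names are as exposed by stem and are an assumption to verify; this is not the tracing approach itself):

    import collections, time
    from stem.control import Controller

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        for _ in range(60):                  # one sample per minute for an hour
            purposes = collections.Counter(c.purpose for c in controller.get_circuits())
            # e.g. GENERAL, HS_CLIENT_INTRO, HS_CLIENT_REND, HS_SERVICE_REND, ...
            print(time.ctime(), dict(purposes))
            time.sleep(60)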
rend_consider_services_upload() sets initial next_upload_time which is clobbered when first intro point established? https://trac.torproject.org/projects/tor/ticket/13483
Is the RendPostPeriod option working correctly? What's the exact time relation between service->desc_is_dirty being set and the upload time of a new descriptor?
Do we have edge cases with rend_consider_descriptor_republication()? Can we refactor it to be cleaner? https://trac.torproject.org/projects/tor/ticket/13484
This is a core function that is called every second, so we should make sure it is behaving as expected and not trying to do unneeded uploads.
Hello,
nice list of tickets. Here are some more ideas if you are looking for more brainstorming action.
There is #3733 which is about a behavior that affects performance and could benefit from a testing network.
And there is #8950 which is about the number of IPs per hidden service. It's very unclear whether this functionality works as intended or whether it's a good privacy idea.
And there is also #13222, but it's probably easier to hack up the solution here than to measure its severity.
How
Here are some steps I think are needed to be able to measure and answer the questions in the Why section.
1) Dump the uploaded/fetched HS descriptors in a human readable way.
   * Allows us to track descriptors over time while testing and analyse
     them afterwards by correlating events with a readable descriptor.
     This kind of feature will also be useful for people crawling HS on
     SponsorR.
   * Should be a control event, for instance (ONLY client side):
     > setconf HSDESC_DUMP /tmp/my/dir
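Until something like HSDESC_DUMP exists, a rough client-side approximation is possible with stem's get_hidden_service_descriptor(), which fetches and parses a v2 descriptor over the control port. Whether the tor under test supports the underlying fetch command is an assumption here; DUMP_DIR and ONION below are placeholders.

    import os, time
    from stem.control import Controller

    DUMP_DIR = '/tmp/my/dir'   # stands in for the proposed HSDESC_DUMP directory
    ONION = 'someaddress'      # 16-char v2 address we control, without ".onion"

    os.makedirs(DUMP_DIR, exist_ok=True)

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        desc = controller.get_hidden_service_descriptor(ONION, None)
        if desc:
            path = os.path.join(DUMP_DIR, '%s-%d.desc' % (ONION, int(time.time())))
            with open(path, 'w') as f:
                f.write(str(desc))   # descriptors stringify to their raw text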
- On how many HSDirs (including replicas) have been probed for one
  single .onion request. (Which should be repeated a lot for
  significant results.)
  * Why have we probed 1 or 5?
  * What made us retry? A failure code?
  * Was the descriptor actually alive on the HSDir? If not, when did it
    move? (Correlate timings between HSDir and client in a testing
    network.)
- HS desc cache tracker. We want to know, very precisely, how things
  are moving in the cache, especially on the HSDir side.
  * When and why is an HS desc removed?
  * Why hasn't it been stored in the cache?
  * Count how often and when a descriptor is requested.
- Track HS descriptor uploads. Log at what time each one was done. Use
  this to correlate with RendPostPeriod or with when desc_is_dirty is
  set. It should also be correlated with the actual state of the HSDir:
  did it already have the descriptor? Is the HSDir gone?
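For the upload timing part, one cheap way to get timestamps without touching the C code is to log the existing HS_DESC control events on the service side. A rough sketch with stem; the UPLOAD/UPLOADED action names come from control-spec/stem, and their availability in the tor under test is an assumption.

    import time
    from stem.control import Controller, EventType

    def log_uploads(event):
        # UPLOAD fires when tor starts pushing a descriptor to an HSDir,
        # UPLOADED when that HSDir has acknowledged it.
        if event.action in ('UPLOAD', 'UPLOADED'):
            print('%.3f %s %s -> %s' % (time.time(), event.action,
                                        event.address, event.directory))

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        controller.add_event_listener(log_uploads, EventType.HS_DESC)
        time.sleep(3600)   # run alongside the service for a RendPostPeriod or two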
The work to be done
- Collect data
"Collect it all" --> https://i.imgur.com/tVXAcGGl.jpg
It's clear that we have to collect more data from the HS subsystem. Most of it can be collected through the control port, but some of it is missing. Measuring precise timing of HS actions (for instance, descriptor storage) is not possible with the control port right now, and might not be that relevant there anyway, since the job of that interface is to report high level events and push commands to the tor daemon.
Tracing should be used here, with a set of events added to the HS subsystem to collect the information we need so it can be analyzed after the experiment is run. This is only for performance measurement; the rest should, as much as possible, use the control port.
- Testing network (much SponsorS)
Once we are able to extract all the data we need, it will be time to design experiments that allow us to run scenarios and collect/analyze what we want. A scenario could be this example, with a set of questions we want to answer alongside it:
- 50 clients randomly accessing an HS in a busy tor network.
- What is the failure rate of desc. fetch, RP establishment, ...?
- What are the timings of each component of the HS subsystem?
- What are the outliers of the whole process of establishing a connection to the HS?
- How much does relay churn affect HS reachability?
And dump a human readable report/graphs, whatever is useful for us to investigate or assess the HS functionality.
- HS health service
ref: https://trac.torproject.org/projects/tor/ticket/13209
What about a web page that prints the result of:
1) Fetch the last 3 consensuses (thus 3 hours).
2) Find the union of all HSDirs responsible for a.onion (we control that
   HS service and it should be up at all times, else the results are
   meaningless).
3) Fetch the descriptor from each of them.
4) Graph/log how many of them had it, thus giving us a probability of
   reaching the HS within a time period.
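For 1) and 2), here is a rough sketch of the HSDir crunching with stem, assuming v2 descriptor IDs as described in rend-spec (no descriptor cookie) and, for brevity, only the current consensus instead of the union over the last 3:

    import base64, hashlib, struct, time

    REPLICAS = 2
    HSDIRS_PER_REPLICA = 3

    def descriptor_ids(onion, when=None):
        # rend-spec v2: descriptor-id = H(permanent-id | H(time-period | replica))
        permanent_id = base64.b32decode(onion.upper())            # 10 bytes
        now = int(when or time.time())
        time_period = (now + (permanent_id[0] * 86400) // 256) // 86400
        ids = []
        for replica in range(REPLICAS):
            secret_id_part = hashlib.sha1(struct.pack('>IB', time_period, replica)).digest()
            ids.append(hashlib.sha1(permanent_id + secret_id_part).digest())
        return ids

    def responsible_hsdirs(controller, onion):
        # hash ring of HSDir identity digests; the 3 HSDirs following each
        # descriptor id on the ring are deemed responsible for it
        ring = sorted((bytes.fromhex(r.fingerprint), r.fingerprint)
                      for r in controller.get_network_statuses()
                      if 'HSDir' in r.flags)
        picked = []
        for desc_id in descriptor_ids(onion):
            start = next((i for i, (digest, _) in enumerate(ring)
                          if digest > desc_id), 0)
            for offset in range(HSDIRS_PER_REPLICA):
                picked.append(ring[(start + offset) % len(ring)][1])
        return picked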
So 3) is the tricky one. There are possibly multiple ways of achieving it:
i) A new SOCKS command to tor that a client could use.
   - The command would carry an onion address, and the reply should be 0
     or 1 (successful attempt or not) together with the HSDir fingerprint.
ii) A control event:
    > setconf HSDESC_FETCH_ALL <this_is_a.onion> [...]
    Prints out the results as they come in, with the HSDir information.
iii) A weird way of doing it with an option, "tor --fetch-on-all-hs-dir this_address.onion", that prints out the results and quits.
I much prefer i) and ii) here. Not sure which one is best though.
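For what it's worth, 3) and 4) can already be sketched from the controller side on top of the responsible_hsdirs() helper above, assuming a fetch command that can be pointed at a single HSDir (stem exposes this as the servers= argument of get_hidden_service_descriptor(); support in the tor being tested is an assumption):

    from stem.control import Controller

    ONION = 'someaddress'   # the .onion we control, without the ".onion" suffix

    with Controller.from_port(port=9051) as controller:
        controller.authenticate()
        hits, dirs = 0, responsible_hsdirs(controller, ONION)
        for fingerprint in dirs:
            desc = controller.get_hidden_service_descriptor(ONION, None,
                                                            servers=[fingerprint])
            print('%s %s' % (fingerprint, 'HIT' if desc else 'MISS'))
            hits += bool(desc)
        print('descriptor reachable on %d/%d responsible HSDirs' % (hits, len(dirs)))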
Hm, I think I like (ii) here. It doesn't seem to be much more work than (i) and a few researchers have been asking for such functionality for years.