Hello,
during the past months we have been working on evaluating and confirming the stability and correctness of Tor hidden services. The hidden services protocol has multiple steps, and its soundness depends on various components of the Tor network. Throughout this document, we assume that the reader is familiar with the hidden services protocol. If you are new to it, we suggest you first read the hidden services protocol description on our website [0].
During our evaluation, we continuously fetched the descriptors of a few hidden services, and collected statistics on the introduction points they picked. The hidden services we tested were selected arbitrarily, with the only requirements being that they have decent uptime and are moderately used. We have labeled them alphabetically from (a) to (f). The data was collected over a period of 90 days using tor-hs-descriptor-fetcher [1].
With our experiment we were trying to answer questions like:
Q: Do hidden services publish descriptors correctly and in the intended time schedule?
Q: Is the introduction point codebase working properly? Are hidden service introduction points as stable as we want them to be?
Q: How many introduction points do hidden services expose themselves to?
Q: Normally, hidden services will use 3 intro points. However, if they think they are getting too much traffic, they will dynamically adjust the number of their introduction points according to a self-evaluation of their popularity. In this case, very popular hidden services may use up to 10 introduction points. Does this algorithm work as intended?
Let's start our analysis by looking at some graphs:
-----
[*] https://people.torproject.org/~asn/desc_stats/lifetimes-2015-07-14-b.png
This graph shows the number of descriptors published per hour by each hidden service. Including descriptor replicas, a normal hidden service would publish 2 descriptors per hour, which indeed seems to be the case most of the time according to the graphs. This is good since it shows that our system works properly.
However, we also observe that every HS publishes more than 2 descriptors in an hour at least 15% of the time. We believe this occurs when they republish their descriptor because of an expired or dead intro point.
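As a rough illustration of how such a per-hour statistic can be derived, here is a minimal Python sketch. It is not the code we used for the graphs; the function names and the input format (a plain list of publication timestamps for one hidden service) are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime


def descriptors_per_hour(publish_times):
    """Group descriptor publication times into wall-clock hour buckets."""
    return Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in publish_times
    )


def fraction_above(buckets, expected=2):
    """Fraction of observed hours with more than `expected` descriptors.

    Hours in which no descriptor was observed are not counted at all.
    """
    if not buckets:
        return 0.0
    over = sum(1 for count in buckets.values() if count > expected)
    return over / len(buckets)


# Hypothetical sample: two descriptors in the first hour, three in the second.
sample = [
    datetime(2015, 5, 1, 0, 5),
    datetime(2015, 5, 1, 0, 6),
    datetime(2015, 5, 1, 1, 5),
    datetime(2015, 5, 1, 1, 6),
    datetime(2015, 5, 1, 1, 40),
]
print(fraction_above(descriptors_per_hour(sample)))
```

A value around 0.15 or higher for a real service would match the republishing behaviour described above.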
----
[*] https://people.torproject.org/~asn/desc_stats/lifetimes-2015-07-14-a.png
Normally hidden services will keep their introduction points for a random lifetime between 18 and 24 hours. This graph shows how long the measured hidden services kept their introduction points for.
Looking at the three hidden services (a), (e) and (f) it seems that introduction points indeed rotate as intended most of the time. Specifically, we see that about 75% of the intro points of those three hidden services indeed stay up for more than 18 hours. This reassured us that the introduction point rotation code works well.
However, we can see that hidden services (b), (c) and (d) have lower intro point lifetimes. We believe that this is caused by the dynamic intro point formula, which adjusts the number of introduction points for popular hidden services. Consider a hidden service with 9 intro points that starts getting less traffic and needs to go down to 5 intro points; in that case, the HS will discard 4 intro points before their planned expiry, reducing the average intro circuit lifetime. If this happens multiple times, it will drastically reduce the average lifetime of intro circuits.
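For readers who want to reproduce this kind of lifetime measurement, the following Python sketch shows one simple way to do it. It is an illustration only: the snapshot format (a timestamp plus the set of intro point identifiers seen in the descriptor fetched at that time) and the function names are assumptions, and a relay that is picked again later would be counted as one long-lived intro point.

```python
from datetime import datetime, timedelta


def intro_point_lifetimes(snapshots):
    """Map each intro point to (last sighting - first sighting).

    `snapshots` is an iterable of (timestamp, set_of_intro_point_ids).
    """
    first_seen, last_seen = {}, {}
    for when, intro_points in sorted(snapshots, key=lambda s: s[0]):
        for ip in intro_points:
            first_seen.setdefault(ip, when)
            last_seen[ip] = when
    return {ip: last_seen[ip] - first_seen[ip] for ip in first_seen}


def fraction_at_least(lifetimes, threshold=timedelta(hours=18)):
    """Fraction of intro points observed for at least `threshold`."""
    if not lifetimes:
        return 0.0
    kept = sum(1 for d in lifetimes.values() if d >= threshold)
    return kept / len(lifetimes)


# Hypothetical observations: "A" survives 20 hours, "B" and "C" are short-lived.
t0 = datetime(2015, 5, 1)
snapshots = [
    (t0, {"A", "B"}),
    (t0 + timedelta(hours=20), {"A", "C"}),
    (t0 + timedelta(hours=21), {"C"}),
]
print(fraction_at_least(intro_point_lifetimes(snapshots)))
```

With real data, a service whose intro points rotate as intended should give a fraction close to the 75% mentioned above, while a service that keeps discarding intro points early will score much lower.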
----
[*] https://people.torproject.org/~asn/desc_stats/lifetimes-2015-07-14-g.png
In this graph we present the number of introduction points of the hidden services over time. As mentioned previously, hidden services normally use 3 intro points, but they may increase that number if they believe they are too popular.
Although this self-evaluation sounds like a neat feature, it also means that an attacker can estimate the popularity of a hidden service just by looking at the number of its introduction points. While this might not sound like a dangerous information leak, we think that there are legitimate use cases where the popularity of a hidden service should be hidden [2].
Because we want to avoid this popularity leak and also because we think that the number of introduction points is not that important for scalability, we have disabled the dynamic intro point formula completely (#4862). Now, hidden services establish 3 intro points by default but operators have the option to tune that number using a torrc parameter.
----
[*] https://people.torproject.org/~asn/desc_stats/lifetimes-2015-07-14-h.png
In this graph, we see the total number of relays that were used as the introduction points of each hidden service over the whole measurement period.
This data is interesting because introduction points can estimate the popularity of their hidden services, so a service should ideally not expose itself to too many of them.
Looking at the last graph we see that the three normal hidden services have used approximately 250 distinct relays as intro points. In other words, about 250 relays had the chance to measure the popularity of those hidden services. This seems reasonable given the 90-day measurement period, an average lifetime of about 20 hours per introduction point and about 3 intro points per HS. It gives us some confidence in the correctness of the whole system.
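The back-of-the-envelope estimate behind that statement can be written down as a short Python sketch. The function name is made up for this example, and the model is deliberately crude: it counts how many intro point slots get filled over the period, so it ignores that the same relay can be picked more than once (which lowers the number of distinct relays) and that intro points can die early (which raises it).

```python
def expected_intro_points(days, intro_points, lifetime_hours):
    """Rough count of intro points used over `days`, assuming each of the
    `intro_points` slots is refilled every `lifetime_hours` hours."""
    rotations = days * 24 / lifetime_hours
    return intro_points * rotations


# 18 and 24 hours are the bounds of the intended lifetime; 20 is the
# average assumed in the text above.
for lifetime in (18, 20, 24):
    print(lifetime, round(expected_intro_points(90, 3, lifetime)))
```

This gives roughly 270 to 360 intro points over 90 days, which is in the same range as the approximately 250 distinct relays we observed.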
Meanwhile, hidden service (b) used about 2000 relays as intro points! This was again caused by the dynamic intro point formula, which forced it to rotate introduction points continuously. We have also heard rumors that (b) was hit by a DoS attack at the beginning of our measurement period, which caused it to rotate introduction points even more.
We also see that hidden service (e) starts using more introduction points from the 15th of May onwards. This seems to be because that hidden service started using two hidden service instances for load balancing, and each instance advertises a different introduction point set.
With regard to load balancing, it's worth mentioning that we are currently developing a tool called 'onionbalance', which we expect to become a better way of load-balancing hidden services. Donncha, the author, released an alpha version just a few weeks ago which is worth trying [3].
----
And this sums up our short analysis for today.
All in all, it seems that the system works properly most of the time. Descriptors get published at the intended frequency and introduction points get rotated as they should. The popularity leak we found for popular hidden services was also interesting, and we are happy that it has since been fixed.
We hope you had fun reading our analysis, and please let us know if there are any other privacy-preserving experiments you would like to see in the hidden services world.
Footnotes:
[0]: https://www.torproject.org/docs/hidden-services.html.en
[1]: https://github.com/DonnchaC/tor-hs-descriptor-fetcher
[2]: https://lists.torproject.org/pipermail/tor-dev/2015-April/008597.html
[3]: https://github.com/DonnchaC/onionbalance/ https://lists.torproject.org/pipermail/tor-talk/2015-July/038312.html