One of the aims of proposal 236 is to reduce the period of inactivity when a relay becomes a guard (see 'Phase three' of [0]). This phenomenon will become worse when the lifetime of the guard gets increased to 9 months, so we need to find a good fix.
Proposal 236 tries to make young guards more likely to be picked as middle nodes by clients: this way their guard inactivity will be compensated by working as middle nodes. This is specified in section 1.3 of proposal 236 [1] and you can read a discussion about this in [2].
I quote here the most relevant part of that section:
A guard N that has been visible for V out of NNN*30*24 consensuses has had the opportunity to be chosen as a guard by approximately F = V/NNN*30*24 of the clients in the network, and the remaining 1-F fraction of the clients have not noticed this change. So when being chosen for middle or exit positions on a circuit, clients should treat N as if F fraction of its bandwidth is a guard (respectively, dual) node and (1-F) is a middle (resp, exit) node. Let Wpf denote the weight from the 'bandwidth-weights' line a client would apply to N for position p if it had the guard flag, Wpn the weight if it did not have the guard flag, and B the measured bandwidth of N in the consensus. Then instead of choosing N for position p proportionally to Wpf*B or Wpn*B, clients should choose N proportionally to F*Wpf*B + (1-F)*Wpn*B.
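As a rough illustration of the quoted rule, here is a minimal Python sketch. It is not the script discussed in this thread; the function names and the bandwidth value are made up for the example, while the consensus counts and the Wmg/Wmm values are taken from [4] and [6].

```python
def guardiness_fraction(visible, total):
    """F: fraction of the considered consensuses in which the relay had the Guard flag."""
    if total <= 0:
        raise ValueError("total must be positive")
    return visible / total


def blended_weight(f, w_with_guard, w_without_guard, bandwidth):
    """F*Wpf*B + (1-F)*Wpn*B for one relay and one circuit position."""
    return f * w_with_guard * bandwidth + (1.0 - f) * w_without_guard * bandwidth


# freefrcv2 was a guard in 474 of 1966 consensuses (see [4]).
f = guardiness_fraction(474, 1966)
# Middle position for a Guard-only relay: Wpf = Wmg = 4063, Wpn = Wmm = 10000 (see [6]).
# The bandwidth value 1000 is hypothetical.
weight = blended_weight(f, 4063, 10000, 1000)
print(f"F = {f:.3f}, middle weight = {weight:.1f}")
```

With F = 0 the relay is weighted as a plain middle node, with F = 1 as a full guard, and anything in between is a linear mix of the two.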
The relevant trac ticket is #9321 and I asked myself various engineering questions in [3]. These engineering questions can be ignored for now but they will need to be resolved eventually.
This email is more concerned with the research side of this task.
I recently wrote a Python script that counts in how many consensuses each guard appears and then calculates the guard/middle weights and probabilities according to section 1.3 of proposal 236 [1].
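The counting step could look roughly like the sketch below. This is an assumption about the general approach, not the actual script: it takes already-parsed consensuses as plain sets of fingerprints that carried the Guard flag, and the function name and input format are hypothetical.

```python
from collections import Counter


def guard_fractions(consensuses):
    """Map each fingerprint to the fraction of consensuses in which it was a guard.

    consensuses: list of sets, one per consensus, each holding the
    fingerprints that had the Guard flag in that consensus.
    """
    total = len(consensuses)
    if total == 0:
        return {}
    seen = Counter()
    for guards in consensuses:
        seen.update(set(guards))  # count each relay at most once per consensus
    return {fp: count / total for fp, count in seen.items()}


# Four toy consensuses with made-up fingerprints.
example = [{"A", "B"}, {"A"}, {"A"}, {"A", "C"}]
fractions = guard_fractions(example)
print(fractions)
```

A real run would first have to parse the consensus documents and extract the flags, which is left out here.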
The results were mostly as expected: new guards got a boost in their middle probabilities. You can see the output of my script here: https://people.torproject.org/~asn/guards3/output and you can see boxplots of the guard and middle probabilities here: https://people.torproject.org/~asn/guards3/guard_boxplot.png https://people.torproject.org/~asn/guards3/middle_boxplot.png I mainly uploaded the boxplots so that the outlier probabilities are easier to see.
I also highlighted some snippets of the output at [4].
Here are some notes:
- You can see that old guards (like RichardFeynman) see a shrinkage both on their guard and on their middle probabilities. This happens because both the total guard weight and the total middle weight get bigger [5], so their weight percentage gets smaller.
- You can see that weird things happen to relays that are *both* a Guard and an Exit (like thevillage1 and TorLand1). Specifically, even for young guards, their middle probability decreases and their guard probability increases. This is especially visible for thevillage1, whose guard probability increases by 0.008, making it the most probable guard of all, with probability 0.0101.
This happens because of the Guard+Exit bandwidth weights [6]. Specifically, it happens because Wgd << Wgm.
We should decide whether this behavior is a feature or a bug (it might be a feature, since we don't really want to overload exit nodes with middle traffic if they also happen to be young guards).
- You can also see that for young guards (like freefrcv2), the feature works as intended: pumping up their middle probability.
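The drop in middle probability for Guard+Exit relays can be checked by hand. The sketch below plugs the thevillage1 numbers from [4], [5] and [6] into the section 1.3 formula, assuming that a Guard+Exit relay without its Guard flag counts as a plain Exit (weight Wme) in the middle position. That assumption is not stated in the output itself, but the result comes out close to the 0.0000397 reported in [4].

```python
import math

# Numbers copied from [4], [5] and [6].
F = 46 / 1966                      # thevillage1: Guard in 46 of 1966 consensuses
WMD, WME = 867, 0                  # middle-position weights: Guard+Exit, Exit-only
OLD_TOTAL_MIDDLE = 7945992.6321
NEW_TOTAL_MIDDLE = 11851828.99651449643947100714
OLD_MIDDLE_PROB = 0.002531389208535432867885203535

# Before: weight proportional to Wmd*B. After: F*Wmd*B + (1-F)*Wme*B.
# The bandwidth B cancels in the ratio, so it is not needed here.
weight_ratio = (F * WMD + (1 - F) * WME) / WMD
new_middle_prob = OLD_MIDDLE_PROB * weight_ratio * OLD_TOTAL_MIDDLE / NEW_TOTAL_MIDDLE
print(new_middle_prob)
```

Because Wme is 0, only the F fraction of the relay's bandwidth contributes to the middle position, so a young Guard+Exit relay loses almost all of its middle weight.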
This is just an update on this task and any comments are welcome :)
[0]: https://blog.torproject.org/blog/lifecycle-of-a-new-relay
[1]: https://gitweb.torproject.org/torspec.git/blob/7bd906b6ecef7a0dcf3b420944da0...
[2]: https://lists.torproject.org/pipermail/tor-dev/2014-March/006571.html https://lists.torproject.org/pipermail/tor-dev/2014-April/006654.html
[3]: https://trac.torproject.org/projects/tor/ticket/9321#comment:13
[4]: A685A493342B15476A8AA35CE2868B62C0331D6B in 22 out of 1966 consensuses (1%) (tingPLrice5)
A685A493342B15476A8AA35CE2868B62C0331D6B: guard prob 0.00001466095664981207628234955862 -> 0.00001236359278611040906223472603 (delta: -0.00000229736386370166722011483259)
A685A493342B15476A8AA35CE2868B62C0331D6B: middle prob 0.00001411262320417776315899083553 -> 0.00002313283079875486344342985726 (delta: 0.00000902020759457710028443902173)

2009F87590E626C98E1A5C8D08C23366C58B7951 in 46 out of 1966 consensuses (2%) (thevillage1)
2009F87590E626C98E1A5C8D08C23366C58B7951: guard prob 0.001799671590571654712486494434 -> 0.01018493194688813591277789142 (delta: 0.008385260356316481200291396986) (exit)
2009F87590E626C98E1A5C8D08C23366C58B7951: middle prob 0.002531389208535432867885203535 -> 0.00003970964677012672298440629596 (delta: -0.002491679561765306144900797239) (exit)

B1A88CFED023588C713E42B9ABA0AD2A294BECCF in 470 out of 1966 consensuses (23%) (ChronosDaKnObNET)
B1A88CFED023588C713E42B9ABA0AD2A294BECCF: guard prob 0.00002846894283361195170183376972 -> 0.0001308372445794862073501048856 (delta: 0.0001023683017458742556482711159) (exit)
B1A88CFED023588C713E42B9ABA0AD2A294BECCF: middle prob 0.00004004395860053895959111507317 -> 0.000006418213054140890599306823457 (delta: -0.00003362574554639806899180824971) (exit)

2D958EED2BC8EB672187C99CF6A4D8D8EBDBE412 in 474 out of 1966 consensuses (24%) (freefrcv2)
2D958EED2BC8EB672187C99CF6A4D8D8EBDBE412: guard prob 0.00004462030284725414520715083057 -> 0.00003762832587077081018941003575 (delta: -0.00000699197697648333501774079482)
2D958EED2BC8EB672187C99CF6A4D8D8EBDBE412: middle prob 0.00004295146192575840961431993422 -> 0.00006073004942478721908966913122 (delta: 0.00001777858749902880947534919700)

4A5ADDBAC82BC071D9516FB01429A1CAD493D36C in 865 out of 1966 consensuses (43%) (monoversum)
4A5ADDBAC82BC071D9516FB01429A1CAD493D36C: guard prob 0.008658463528693363891387601646 -> 0.007301687043971002453421233128 (delta: -0.001356776484722361437966368518)
4A5ADDBAC82BC071D9516FB01429A1CAD493D36C: middle prob 0.008334628921307881865635891998 -> 0.01016060938520887622674424842 (delta: 0.001825980463900994361108356422)

B7A4718F146139B8137BBD7CF2890AFA61C2BAB7 in 1065 out of 1966 consensuses (54%) (PPTOR0001)
B7A4718F146139B8137BBD7CF2890AFA61C2BAB7: guard prob 0.00001880427048562853262301356431 -> 0.00001585765161696769857982280078 (delta: -0.00000294661886866083404319076353)
B7A4718F146139B8137BBD7CF2890AFA61C2BAB7: middle prob 0.00001810097324014104405174911514 -> 0.00002026262077783844197435510639 (delta: 0.00000216164753769739792260599125)

E1E922A20AF608728824A620BADC6EFC8CB8C2B8 in 1616 out of 1966 consensuses (82%) (TorLand1)
E1E922A20AF608728824A620BADC6EFC8CB8C2B8: guard prob 0.001442840154510033519493482607 -> 0.002483448551109987826331785865 (delta: 0.001040608396599954306838303258) (exit)
E1E922A20AF608728824A620BADC6EFC8CB8C2B8: middle prob 0.002029475830980993592356240765 -> 0.001118418926871964944505332348 (delta: -0.000911056904109028647850908417) (exit)

8699371415DC052A60BCA3AAD68E55D9B759CAE0 in 1858 out of 1966 consensuses (94%) (torrorist)
8699371415DC052A60BCA3AAD68E55D9B759CAE0: guard prob 0.000009508374059117252371523807943 -> 0.000008018417060557113123695710000 (delta: -0.000001489956998560139247828097943)
8699371415DC052A60BCA3AAD68E55D9B759CAE0: middle prob 0.000009152752005608042048765795507 -> 0.000006628989446427573975351367609 (delta: -0.000002523762559180468073414427898)

3FE4CF4366E128154E951FEBB3B448CC2F3E1EBA in 1966 out of 1966 consensuses (100%) (RichardFeynman)
3FE4CF4366E128154E951FEBB3B448CC2F3E1EBA: guard prob 0.0003304027187022866466529501978 -> 0.0002786287939478505230692028838 (delta: -0.0000517739247544361235837473140)
3FE4CF4366E128154E951FEBB3B448CC2F3E1EBA: middle prob 0.0003180453490216872711917499891 -> 0.0002132317299501384937260708373 (delta: -0.0001048136190715487774656791518)

7E43F05A970431239898FDDCE014B1BA5134703F in 1966 out of 1966 consensuses (100%) (PrivacyRepublic0003)
7E43F05A970431239898FDDCE014B1BA5134703F: guard prob 0.0002102202590710855289154482723 -> 0.0001772788598061129408325756509 (delta: -0.0000329413992649725880828726214) (exit)
7E43F05A970431239898FDDCE014B1BA5134703F: middle prob 0.0002956924463418544427572802405 -> 0.0001982453510501193425034660120 (delta: -0.0000974470952917351002538142285) (exit)
[5]: total guard weight: 11176705.8531
new total guard weight: 13253526.12584313326551373350
total middle weight: 7945992.6321
new total middle weight: 11851828.99651449643947100714
[6]: Wme=0 # Weight for Exit-flagged nodes in the middle Position
Wmd=867 # Weight for Guard+Exit flagged nodes in the middle Position
Wmg=4063 # Weight for Guard-flagged nodes in the middle Position
Wmm=10000 # Weight for non-flagged nodes in the middle Position
Wgd=867 # Weight for Guard+Exit-flagged nodes in the guard Position
Wgg=5937 # Weight for Guard-flagged nodes in the guard position
Wgm=5937 # Weight for non-flagged nodes in the guard Position
Nicholas Hopper hopper@cs.umn.edu writes:
On Thu, Jul 31, 2014 at 11:24 AM, George Kadianakis desnacked@riseup.net wrote:
- You can see that old guards (like RichardFeynman) see a shrinkage both on their guard and on their middle probabilities. This happens because both the total guard weight and the total middle weight get bigger [5], so their weight percentage gets smaller.
This doesn't sound right - total guard weight shouldn't change. All the proposal does is re-allocate some fraction of the weight of a guard back to the middle (M) category.
Oops, you are right!
In my previous post, please disregard everything that had to do with guard probabilities.
In this light, the behavior with Guard+Exit is good. That is, even for young guards (that are also exits), their middle probability is decreased, hence prioritizing exit traffic.
Also thanks for the comments on the #9321 questions.
Moving forward!
On Fri, Aug 1, 2014 at 7:48 AM, Zack Weinberg zackw@panix.com wrote:
If a node is an exit, maybe it shouldn't *ever* be used as a guard? This is just off the top of my head, but it seems like there might be some abuse possibilities in a node that sees both entering and exiting traffic, even if they're never for the same circuit (which I believe is the current behavior).
I think if someone is interested in observing some fraction of entering and exiting traffic, they could probably just run two nodes. The one advantage I can see is that a Sybil attack aiming to catch both ends of a circuit would be about half as effective, as you would have to split your resources between guards and exits, rather than "playing both sides."
On the downside, you might create congestion by reducing the number of guards (or exits) as existing guard+exit nodes get pushed into one of the two categories; my feeling is that this would be a significant performance hit.
- Nikita
OK, I have some new data.
I plotted the (vanilla) probability distribution of all possible middle nodes (so not only guard nodes like the previous graphs) for the consensus with valid-after '2014-08-01 13:00:00'.
I also parsed 2.5 months' worth of consensuses to do the guardiness trick on the guard nodes, and then plotted the probability distribution of all possible middle nodes with the guardiness trick applied.
You can find a boxplot here (to see how outliers are distributed): https://people.torproject.org/~asn/guards4/middle_boxplot.png
You can read the output of the script here: https://people.torproject.org/~asn/guards4/results_sort_guardiness.txt (It's sorted based on the guardiness probability distribution. You can find one sorted based on the vanilla probability distribution here: https://people.torproject.org/~asn/guards4/results_sort_vanilla.txt)
Some notes:
- You can see that young guards like 'monoversum' and 'yolocaust' got a decent weight upgrade because of their age, putting them first in middle probability.
- If you scroll to the bottom you see some nodes with probability 0.0.
These are plain Exit nodes (no Guard flag) and they get nulled because the bandwidth weight Wme is 0. OTOH, Guard+Exit nodes don't get nulled since Wmd=867, which is quite low and explains why no Guard+Exit nodes are at the top of the middle probability distribution.
This is related to what Zack asked.
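To make the effect of the flags concrete, here is a small sketch of a middle-position probability distribution using the weights from [6]. The relays, their bandwidths and their guardiness fractions are hypothetical, and the mapping of a de-flagged Guard+Exit relay to Wme is an assumption based on section 1.3 of the proposal; this is not the script that produced the results above.

```python
# Middle-position bandwidth weights from [6].
W = {"Wme": 0, "Wmd": 867, "Wmg": 4063, "Wmm": 10000}


def middle_weight(bandwidth, is_guard, is_exit, f=1.0):
    """Middle-position weight; f is the guardiness fraction (only used for guards)."""
    without_guard = W["Wme"] if is_exit else W["Wmm"]
    if not is_guard:
        return without_guard * bandwidth
    with_guard = W["Wmd"] if is_exit else W["Wmg"]
    return (f * with_guard + (1.0 - f) * without_guard) * bandwidth


def middle_probabilities(relays):
    """Normalize the per-relay middle weights into a probability distribution."""
    weights = {name: middle_weight(*args) for name, args in relays.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical relays: (bandwidth, is_guard, is_exit, guardiness fraction).
relays = {
    "young_guard": (1000, True, False, 0.1),
    "old_guard": (1000, True, False, 1.0),
    "guard_exit": (1000, True, True, 0.5),
    "exit_only": (1000, False, True, 0.0),
    "plain_middle": (1000, False, False, 0.0),
}
probs = middle_probabilities(relays)
for name, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:13s} {p:.4f}")
```

With equal bandwidths, the exit-only relay ends up at probability 0, the Guard+Exit relay stays near the bottom, and the young guard ranks just below the unflagged middle node.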
My plan going forward is to start working on the output file format (which means that I will have to decide whether the script will be run repeatedly or use some sort of state file) and also to gain more confidence that my script does not have any stupid bugs. I will publish my script RSN.
Cheers!