On Fri, Jan 31, 2020 at 07:24:48PM -0700, David Fifield wrote:
https://gitweb.torproject.org/user/dcf/snowflake.git/tree/server/server.go?h...

The branch currently lacks client geoip lookup (ExtORPort USERADDR), because of the difficulty I have talked about before of providing an IP address for a virtual session that is not inherently tied to any single network connection or address. I have a plan for solving it, though; it requires a slight breaking of abstractions. In the server, after reading the ClientID, we can peek at the first 4 bytes of the first packet. These 4 bytes are the KCP conversation ID (https://github.com/xtaci/kcp-go/blob/v5.5.5/kcp.go#L120), a random number chosen by the client, serving roughly the same purpose in KCP as our ClientID. We store a temporary mapping from the conversation ID to the IP address of the client making the WebSocket connection. kcp-go provides a GetConv function that we can call in handleStream, just as we're about to connect to the ORPort, to look up the client's IP address in the mapping. The possibility of doing this is one reason I decided to go with KCP for this implementation rather than QUIC as I did in the meek implementation: the quic-go package doesn't expose an accessor for the QUIC connection ID.
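
For illustration only, the "peek at the first 4 bytes" idea could look roughly like the sketch below. It is not code from the branch: peekConv is a hypothetical helper, and it assumes the packet is a bare KCP segment (no FEC or encryption header in front of it) with the conversation ID stored little-endian, as kcp-go's encoding helpers do.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// peekConv is a hypothetical helper: it reads the KCP conversation ID from
// the first 4 bytes of a packet without consuming or modifying the packet.
// Assumption: bare KCP segment, conversation ID encoded little-endian.
func peekConv(packet []byte) (uint32, error) {
	if len(packet) < 4 {
		return 0, errors.New("packet too short to contain a conversation ID")
	}
	return binary.LittleEndian.Uint32(packet[:4]), nil
}

func main() {
	// A made-up packet prefix; only the first 4 bytes matter here.
	packet := []byte{0x78, 0x56, 0x34, 0x12, 0x51}
	conv, err := peekConv(packet)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("conv = %#x\n", conv) // prints "conv = 0x12345678"
}
```

The value returned here is what one would later compare against the session's GetConv result when looking up the stored IP address.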
https://gitweb.torproject.org/user/dcf/snowflake.git/commit/?h=turbotunnel&a...

This commit adds USERADDR support for turbotunnel sessions. I found a nicer way to do it than what I proposed above, one that doesn't require peeking into the packet structure. Instead of using the KCP conversation ID as the common element linking an IP address and a client session, we can use the ClientID (the artificial 8-byte value that we tack on at the beginning of every WebSocket connection). The ServeHTTP function has access to the ClientID because it's what parses it out, and once you have a session you can recover the ClientID by calling the RemoteAddr method. This works because kcp-go uses the address returned from its ReadFrom calls as the remote address of the session, and we use the ClientID as the address in those ReadFrom calls.
To summarize:
 * ServeHTTP has an IP address and a ClientID but not a session.
 * acceptStreams has a session and a ClientID but not an IP address.
 * We bridge the gap using a data structure that maps a ClientID to an IP address. ServeHTTP stores an entry in the structure, and acceptStreams looks it up.
I designed a simple data structure, clientIDMap, to serve as the lookup table. In spirit it is a map[ClientID]string: you Set(clientID, addr) to store a mapping, and Get(clientID) to retrieve it. It differs from a plain map in that it expires old entries when storing new ones: it's a fixed-size circular buffer. I designed it to be proof against memory leaks. With a plain map, it would be possible for a client to get as far as sending a ClientID (storing an entry in the map), but not ultimately establish a session (leaving the entry in the map forever).

The tradeoff is that the buffer has to be large enough not to expire entries before they are needed. How large depends on the rate of new session creation and on the delay between when a WebSocket connection starts and when a session is established. Even if an entry is expired before it is used, the worst thing that happens is that one session gets attributed to ?? rather than a certain country. I added a log message for when this happens, and if it turns out to be a problem, we can design a more complicated, dynamically sized data structure.
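
To make the idea concrete, here is a minimal sketch of a fixed-size, circular-buffer lookup table with the Set/Get interface described above. It is not the code from the commit: the field names, the newClientIDMap constructor, the mutex, and the boolean returned by Get are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// ClientID is the 8-byte value sent at the start of every WebSocket connection.
type ClientID [8]byte

// entry is one slot of the circular buffer.
type entry struct {
	id   ClientID
	addr string
	used bool // false until the slot has been written once
}

// clientIDMap behaves like a map[ClientID]string with bounded memory:
// storing a new entry overwrites the oldest one.
type clientIDMap struct {
	mu      sync.Mutex
	entries []entry          // fixed-size circular buffer
	next    int              // slot the next Set will overwrite
	index   map[ClientID]int // ClientID -> slot holding its newest entry
}

// newClientIDMap is a hypothetical constructor; capacity is the buffer size.
func newClientIDMap(capacity int) *clientIDMap {
	return &clientIDMap{
		entries: make([]entry, capacity),
		index:   make(map[ClientID]int, capacity),
	}
}

// Set stores id -> addr, expiring the oldest entry if the buffer is full.
func (m *clientIDMap) Set(id ClientID, addr string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if len(m.entries) == 0 {
		return
	}
	// Drop the index entry for the slot being overwritten, unless the same
	// ClientID has since been stored again in a newer slot.
	old := m.entries[m.next]
	if old.used && m.index[old.id] == m.next {
		delete(m.index, old.id)
	}
	m.entries[m.next] = entry{id: id, addr: addr, used: true}
	m.index[id] = m.next
	m.next = (m.next + 1) % len(m.entries)
}

// Get returns the address stored for id, or false if there is none
// (never stored, or already expired).
func (m *clientIDMap) Get(id ClientID) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	i, ok := m.index[id]
	if !ok {
		return "", false
	}
	return m.entries[i].addr, true
}

func main() {
	m := newClientIDMap(2) // tiny capacity, to show expiry

	// What ServeHTTP would do: it knows the ClientID and the IP address.
	m.Set(ClientID{1}, "192.0.2.1")
	m.Set(ClientID{2}, "192.0.2.2")
	m.Set(ClientID{3}, "192.0.2.3") // overwrites the entry for ClientID{1}

	// What acceptStreams would do: it knows the ClientID from the session.
	for _, id := range []ClientID{{1}, {2}, {3}} {
		if addr, ok := m.Get(id); ok {
			fmt.Printf("%x -> %s\n", id, addr)
		} else {
			fmt.Printf("%x -> no entry (session would be attributed to ??)\n", id)
		}
	}
}
```

With a capacity of 2, the third Set expires the first entry, so the lookup for the first ClientID fails while the other two succeed. The memory used never grows beyond the chosen capacity, no matter how many clients send a ClientID without establishing a session.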