We have been struggling with maintaining GetTor and BridgeDB because both projects have accumulated a considerable amount of technical debt. Besides, they implement similar tasks (generally speaking, they distribute resources to users), which means that we are maintaining redundant code.
Given these issues, we have been thinking about generalising and merging GetTor's and BridgeDB's architectures. That could mean 1) turning GetTor into a BridgeDB distributor or 2) building a new system. The rest of this email discusses the latter option.

Below is a diagram that proposes a design. On the left side, we have scripts and tools that take resources as input (e.g., bridge descriptors, Tor Browser download links, or even HTTPS or Snowflake proxy information) and write them to our resource DB. On the right side, we have several distributors (e.g., our existing Email/HTTPS/Moat distributors, but possibly also a Salmon distributor, a GetTor distributor, or wolpertinger, our "hand bridges to OONI" distributor). In the middle sits the Matchmaker, which receives requests from distributors and answers them with data from our resource DB.
                                                        ┌───────┐
                                                        │User DB│
┌───────────┐                                           └───┬───┘
│Bridge desc├──┐                                      ┌─────┴─────┐
│  parser   │  │                                    ┌─┤Salmon dist│
└───────────┘  │                                    │ └───────────┘
┌────────────┐ │┌───────────┐      ┌──────────┐     │ ┌─────────┐
│Tor Browser ├─┼┤Resource DB├─ SQL─┤Matchmaker├ TCP ┼─┤Moat dist│
│link scraper│ │└───────────┘      └──────────┘     │ └─────────┘
└────────────┘ │                                    │ ┌───────────┐
┌────────────┐ │                                    └─┤GetTor dist│
│Snowflake or├─┘                                      └───────────┘
│https proxy?│
└────────────┘
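To make the left half of the diagram a bit more concrete, here is a minimal Python sketch of a resource DB that a parser or scraper writes to and that the Matchmaker queries over SQL. The table layout, the column names, the function names, and the use of SQLite are all assumptions made for illustration; nothing above commits us to a schema or a database engine.

```python
import sqlite3

# Hypothetical schema: a single table that holds every kind of resource.
SCHEMA = """
CREATE TABLE IF NOT EXISTS resources (
    id      INTEGER PRIMARY KEY,
    type    TEXT NOT NULL,              -- e.g. 'obfs4' or 'tb-link'
    payload TEXT NOT NULL,              -- bridge line, download URL, ...
    online  INTEGER NOT NULL DEFAULT 1  -- 0 once the resource is gone
)
"""


def open_db(path=":memory:"):
    """Open (or create) the resource DB and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_resource(conn, rtype, payload):
    """Called by a parser or scraper (left side of the diagram)."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO resources (type, payload) VALUES (?, ?)",
            (rtype, payload),
        )
    return cur.lastrowid


def fetch_resources(conn, rtype, limit=3):
    """Called by the Matchmaker to answer a distributor's request."""
    rows = conn.execute(
        "SELECT payload FROM resources "
        "WHERE type = ? AND online = 1 ORDER BY id LIMIT ?",
        (rtype, limit),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_db()
    add_resource(db, "obfs4", "obfs4 192.0.2.1:443 <fingerprint>")
    print(fetch_resources(db, "obfs4"))
```

Because only the writers and the Matchmaker touch this table, the distributors on the right never need database credentials.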
Cecylia suggested that all (or some) of these components can live in separate processes. That has the following advantages:
* The system becomes easier to maintain and test.
* We can build distributors in different languages.
* A failing distributor doesn't affect the rest of the system.
* A distributor doesn't have direct access to our database.
For now, let's assume that distributors talk to the Matchmaker over TCP and exchange messages in a serialisation format like JSON, gob, or protobufs. When a distributor asks the Matchmaker for a resource, it provides the following information:
* The resource type (e.g., an obfs4 bridge)
* A client-specific ID (e.g., its IP address)
* The client's location (e.g., Russia)
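As a rough illustration of such a request, here is a Python sketch that encodes the three fields as one JSON object per line. The class name, the field names, and the newline-delimited framing are assumptions for the sake of the example; the same fields could just as well be sent as gob or protobufs.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ResourceRequest:
    """Hypothetical request a distributor sends to the Matchmaker."""
    resource_type: str  # e.g. "obfs4"
    client_id: str      # e.g. the client's IP address
    location: str       # e.g. a country code such as "ru"


def encode_request(req):
    """Serialise a request as one newline-terminated JSON object."""
    return (json.dumps(asdict(req)) + "\n").encode("utf-8")


def decode_request(line):
    """Parse one line on the Matchmaker side.

    Raises ValueError on malformed JSON and TypeError on missing or
    unexpected fields.
    """
    fields = json.loads(line.decode("utf-8"))
    return ResourceRequest(**fields)


if __name__ == "__main__":
    wire = encode_request(ResourceRequest("obfs4", "198.51.100.7", "ru"))
    print(wire)
    print(decode_request(wire))
```

One JSON object per line keeps the framing trivial for distributors written in other languages, at the cost of being more verbose than a binary format.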
The Matchmaker then responds with one or more resources, similar to how BridgeDB currently does it. If any of these resources go offline at some point, the Matchmaker should notify the distributor so it can react accordingly.
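The request/notify behaviour described above could look roughly like the following in-memory Python sketch. The class, its method names, and the callback mechanism are hypothetical. The resource selection here simply hashes the client ID into the pool, which is only loosely inspired by BridgeDB and is not its actual algorithm; the sketch also ignores the client's location.

```python
import hashlib
from collections import defaultdict


class Matchmaker:
    """Toy Matchmaker: hands out resources and reports ones that vanish."""

    def __init__(self, resources):
        # resources: dict mapping resource type -> list of resource strings
        self.resources = {t: list(rs) for t, rs in resources.items()}
        # resource -> names of the distributors that were handed it
        self.handed_out = defaultdict(set)
        # distributor name -> callback(resource) for offline notifications
        self.listeners = {}

    def register(self, distributor, on_offline):
        self.listeners[distributor] = on_offline

    def request(self, distributor, rtype, client_id, count=1):
        pool = self.resources.get(rtype, [])
        if not pool:
            return []
        # Hash the client ID into a starting index, so the same client gets
        # the same answer for as long as the pool does not change.
        digest = hashlib.sha256(client_id.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(pool)
        n = min(count, len(pool))
        picked = [pool[(start + i) % len(pool)] for i in range(n)]
        for resource in picked:
            self.handed_out[resource].add(distributor)
        return picked

    def mark_offline(self, rtype, resource):
        """Drop a resource and notify every distributor that received it."""
        if resource in self.resources.get(rtype, []):
            self.resources[rtype].remove(resource)
        for distributor in self.handed_out.pop(resource, set()):
            callback = self.listeners.get(distributor)
            if callback is not None:
                callback(resource)


if __name__ == "__main__":
    mm = Matchmaker({"obfs4": ["bridge-a", "bridge-b", "bridge-c"]})
    mm.register("moat", lambda r: print("moat: resource went offline:", r))
    handed = mm.request("moat", "obfs4", "198.51.100.7", count=2)
    mm.mark_offline("obfs4", handed[0])
```

In a real deployment the callback would turn into a message on the distributor's TCP connection, and each distributor would decide for itself how to react.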
I would love to hear your thoughts on the above, even if it's far from comprehensive. Do you think that the above design sketch accommodates our present and foreseeable future needs?
Cheers,
Philipp
anti-censorship-team@lists.torproject.org