Proposal: Check Maxmind GeoIP DB before distributing - tor-dev

30 Jun 2018


Hi List,
Please have a look at this proposal.
Filename: Check-Maxmind-GeoIP-DB-before-distributing.txt
Title: Check Maxmind GeoIP-DB before distributing
Ticket(s): #26240
Author: Jaskaran Singh
Created: June 2018
Status: Open
0. Motivation and Overview
We currently use the GeoIP database from Maxmind, a company registered
in the US. Relying entirely on a single service or piece of software
runs against sound design philosophy, and it also has serious security
repercussions.
Trusting Maxmind's GeoIP database blindly is dangerous, as it opens the
network to the attacks described in Section 2. We propose that the
database be checked for integrity before it is distributed to users.
The integrity check can be assigned to the Directory Authorities (or
any trusted systems), which would carry it out using a script.
We should also let the user choose whether to use Maxmind's DB, another
DB of her choice, or no GeoIP DB at all.
1. Threat Model
We assume an adversary that is capable of introducing false information
into the Maxmind GeoIP database, either through its influence over the
company or otherwise. The adversary also has enough resources to
perform a Sybil attack on the network.
2. Attacks on the Network
2.1 Sybil attack under the Radar
The Tor Network is constantly monitored for any suspicious spike in
nodes, as it may be an indication of an upcoming or ongoing Sybil
attack. A powerful adversary can coerce Maxmind to map some specific IP
address blocks to different countries, so that a batch of new nodes
appears geographically dispersed. The people and scripts monitoring the
network may then see nothing suspicious in the event, and the adversary
stays under the radar.
2.2 False Location indication for a shady node
Many users don't want the exit of their circuits to be located in
certain countries where communication is under surveillance. The
powerful adversary knows this as well. Users generally add a line to
their config (such as an ExcludeExitNodes line listing country codes)
that prevents circuits from being built through nodes in those
locations. To get around this, the adversary can coerce Maxmind to
alter its database so that particular IP addresses map to locations
which the user believes are havens of free speech.
3. Design of the Solution
We should check the Maxmind database against its own previous versions.
Additionally, we should stop relying on a built-in GeoIP database for
every purpose, while still allowing users to plug in their own
databases through an interface we implement. Perhaps the latter can be
introduced as a ./configure option for when the user is highly
distrustful of Maxmind and wants to use a service she trusts, or
doesn't want to use one at all. The two solutions are explained below.
3.1 Checking for integrity
Step 1: The Dir Authorities (or any trusted computers) fetch the latest
Maxmind GeoIP DB along with its previous versions.
Step 2: The locations of Tor nodes in the latest version are checked
against the previous versions for any changes.
Step 3: All the Dir Authorities perform the above two steps
independently of each other. A count of the number of changes in node
locations is maintained. If the number of changes is significant, they
are viewed with suspicion, since this may be preparation for a Sybil
attack by the adversary. In such a case, the new changes to the
database can be discarded. A change in even a single node's location is
concerning, but it is not easy to attribute that change to malice;
sometimes there are genuine reasons for a location to change.
Step 4: This database is then distributed to the users.
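As a rough illustration of Steps 2 and 3, the comparison could be
sketched in Python as below. This is not part of the proposal: the
GeoDB class, the (first_ip, last_ip, country) row layout, the function
names and the 1% threshold are all assumptions made for the example.
The proposal itself does not define what counts as a "significant"
number of changes. The sketch handles IPv4 only and assumes the ranges
do not overlap.

```python
from bisect import bisect_right
from ipaddress import IPv4Address


class GeoDB:
    """Toy IPv4 range table; rows are (first_ip, last_ip, country_code).

    Hypothetical layout for illustration, not Maxmind's actual format.
    """

    def __init__(self, rows):
        self._ranges = sorted(
            (int(IPv4Address(first)), int(IPv4Address(last)), cc)
            for first, last, cc in rows
        )
        self._starts = [r[0] for r in self._ranges]

    def country(self, ip):
        """Return the country code for ip, or None if no range covers it."""
        n = int(IPv4Address(ip))
        i = bisect_right(self._starts, n) - 1
        if i >= 0 and n <= self._ranges[i][1]:
            return self._ranges[i][2]
        return None


def changed_relays(old_db, new_db, relay_ips):
    """Step 2: map each relay whose country differs to (before, after)."""
    changes = {}
    for ip in relay_ips:
        before, after = old_db.country(ip), new_db.country(ip)
        if before != after:
            changes[ip] = (before, after)
    return changes


def accept_new_db(old_db, new_db, relay_ips, max_changed_fraction=0.01):
    """Step 3: accept the new DB only if few relay locations changed.

    max_changed_fraction is an assumed placeholder threshold.
    """
    changes = changed_relays(old_db, new_db, relay_ips)
    fraction = len(changes) / len(relay_ips) if relay_ips else 0.0
    return fraction <= max_changed_fraction, changes
```

Each authority would run such a check on its own copy of the database
and keep the returned changes for manual review, whether or not the
new version is accepted.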
3.2 Doing away with GeoIP location altogether
GeoIP databases are occasionally unreliable, and we can safely do
without a built-in one. We can provide a ./configure option that
enables users to plug in their own trusted service. If the user doesn't
have access to a database of her own choice, she can simply choose
Maxmind, or not use any database at all. This would remove our
dependence on a single database and diversify our usage.
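To make the idea of a pluggable lookup more concrete, here is a small
Python sketch of what such an interface might look like at run time.
It is only an illustration: the proposal talks about a ./configure
option and does not specify an interface, so CountryLookup, no_geoip,
table_lookup, exit_allowed and the allow_unknown flag are all
hypothetical names. How to treat nodes with an unknown country is a
policy decision that the proposal leaves open.

```python
from typing import Callable, Dict, Optional, Set

# A lookup backend is any function from an IP string to a country code,
# or None when the backend has no answer. (Hypothetical interface.)
CountryLookup = Callable[[str], Optional[str]]


def no_geoip(ip: str) -> Optional[str]:
    """Backend for users who opt out of GeoIP: knows nothing."""
    return None


def table_lookup(table: Dict[str, str]) -> CountryLookup:
    """Wrap a user-supplied IP -> country mapping as a backend."""
    def lookup(ip: str) -> Optional[str]:
        return table.get(ip)
    return lookup


def exit_allowed(
    ip: str,
    lookup: CountryLookup,
    excluded_countries: Set[str],
    allow_unknown: bool = True,
) -> bool:
    """Decide whether a node may be used as an exit.

    allow_unknown is an assumed policy knob for nodes the backend
    cannot place in any country.
    """
    cc = lookup(ip)
    if cc is None:
        return allow_unknown
    return cc not in excluded_countries
```

With this shape, switching from Maxmind to another database, or to no
database at all, only changes which lookup function is passed in; the
code that builds circuits stays the same.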
4. Licensing issues
Maxmind has a fairly liberal license for its database, as summarized
below:
Maxmind - CC BY-SA 4.0
    * Copy and redistribute the material in any medium or format
    * Remix, transform, and build upon the material
      for any purpose, even commercially
5. Dealing with false positives
Maxmind calculates the geolocation of an IP address using WHOIS
records, reverse DNS, etc. It claims an accuracy of 99.5% at the
country level. The other 0.5% is most likely made up of IP addresses
for which neither a WHOIS record nor reverse DNS is set up.
A very large percentage of Tor nodes are run from datacenters, which
usually have all their records set up. It is highly unlikely for an IP
address belonging to a datacenter to be mapped to a wrong location.
Hence, false positives should be very few, and can be dismissed after a
simple manual or scripted investigation.
-- 
Jaskaran Veer Singh (jvsg)
jvsg1303 at gmail dot com
PGP 2814 3FB7 A32D 429B 092E 27F0 8AA3 C532 9E1A 6AD8