-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
G'morning all!
Last weekend's hackfest inspired me to attempt to run some numbers on the EFF SSL Observatory data[1], in particular looking at two things: the commonality of the "Internet Widgits Pty" organization string (a default in OpenSSL's CSR generator) in self-signed certificate subjects, and, further to that, the most commonly seen self-signed certificates in the data.
Using the CSV data from the observatory, I found that out of 5,618,558 total certs in the all-certs file, only 8,891 contained subjects that matched the regex /Internet Widg[ie]ts Pty/ (a quick search turned up both spellings; my OpenSSL used Widgits, FWIW). I further recorded the subjects of all certs where subject == issuer, and counted each, with the following results for the top 10:
' C=--, ST=SomeState, L=SomeCity, O=SomeOrganization, OU=SomeOrganizationalUnit, CN=localhost.localdomain/emailAddress=root@localhost.localdomain': 154867
' C=US, ST=California, L=Sunnyvale, O=HTTPS Management Certificate for SonicWALL (self-signed), OU=HTTPS Management Certificate for SonicWALL (self-signed), CN=192.168.168.168': 111453
' C=US, ST=Virginia, L=Herndon, O=Parallels, OU=Plesk, CN=plesk/emailAddress=info@plesk.com': 82689
' CN=Fortinet, O=Fortinet Ltd.': 54975
' C=US, ST=Virginia, L=Herndon, O=SWsoft, Inc., OU=Plesk, CN=plesk/emailAddress=info@plesk.com': 53184
' C=USA, ST=California, L=Sunnyvale, O=HTTPS Management Certificate for SonicWALL (self-signed), OU=HTTPS Management Certificate for SonicWALL (self-signed), CN=192.168.168.168': 22987
' CN=SpeedTouch 605, O=THOMSON, OU=DSL Internet Gateway Device': 16790
' C=US, ST=Someprovince, L=Sometown, O=none, OU=none, CN=localhost/emailAddress=webaster@localhost': 16638
' CN=SpeedTouch 5x6, O=THOMSON, OU=DSL Internet Gateway Device': 15286
' C=US, ST=CA, L=Sunnyvale, O=SonicWALL, Inc., OU=SSL-VPN, CN=192.168.200.1': 11760
(Apologies for the lousy wrapping; initial spaces are present in the source data.) The numbers after the colons are, as you may guess, the total number of self-signed certificates with that subject.
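For anyone who wants to repeat the counting, here is a minimal Python sketch of the approach described above. It is not the script that produced these numbers. The column names "Subject" and "Issuer" are assumptions, as are the helper names summarize and summarize_file; adjust them to whatever the real CSV header uses.

```python
import csv
import re
from collections import Counter

# Matches both spellings seen in the data ("Widgits" and "Widgets").
WIDGITS_RE = re.compile(r"Internet Widg[ie]ts Pty")


def summarize(rows, subject_col="Subject", issuer_col="Issuer"):
    """Return (total certs, Widgits matches, Counter of self-signed subjects).

    `rows` is any iterable of dict-like rows. The default column names are
    assumptions, not taken from the observatory CSV itself.
    """
    total = 0
    widgits = 0
    self_signed = Counter()
    for row in rows:
        total += 1
        subject = row.get(subject_col) or ""
        issuer = row.get(issuer_col) or ""
        # Count every cert whose subject carries the OpenSSL default string.
        if WIDGITS_RE.search(subject):
            widgits += 1
        # Treat "subject == issuer" as self-signed, as in the text above.
        if subject and subject == issuer:
            self_signed[subject] += 1
    return total, widgits, self_signed


def summarize_file(path, **kwargs):
    """Stream a CSV file through summarize() without loading it all at once."""
    with open(path, newline="", encoding="utf-8", errors="replace") as fh:
        return summarize(csv.DictReader(fh), **kwargs)
```

Calling summarize_file on the all-certs file and then self_signed.most_common(10) on the returned Counter would give a top-10 list in the same shape as the one above. Subjects are compared as raw strings, so the leading spaces in the source data are kept.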
It may be instructive to dig into the netblocks where these subjects occur and attempt to determine the context in which they occur. For the purposes of making Tor appear unblockable due to collateral damage, my strong suspicion is that these will not be all that helpful, as they are quite likely (and in some cases, clearly) used by CPE devices and the like for internal management, and thus won't really be likely to cause much difficulty for the censors if blocked.
My personal thinking is that the methodology in Jake's TLS normalization proposal[2] makes a lot of sense. Perhaps one modification, per the discussion at the hackfest, would be to stick to making the server's presented certificates self-signed, as that is probably more normal than having random unknown issuers, and less provably false than faking a known issuer like GoDaddy.
Even ignoring the question of whether or not a censor may be willing to sustain the collateral damage of blocking one of the common self-signed subjects above, by using a commonName that varies per bridge/relay and other certificate fields that also vary widely (and in fact may be randomized to some extent), we don't present any blockable static values in our certificate subject or issuer and can focus on other fun stuff in the certs. :)
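To make the "varies per bridge/relay" idea concrete, here is a rough Python sketch of generating a randomized commonName. The www.<label>.<tld> naming scheme, the length bounds, and the function names random_label and random_subject are all hypothetical choices for illustration; they are not taken from the proposal.

```python
import secrets
import string


def random_label(min_len=8, max_len=20):
    """Return a random lowercase alphanumeric hostname label.

    The length bounds are arbitrary assumptions for this sketch.
    """
    alphabet = string.ascii_lowercase + string.digits
    length = min_len + secrets.randbelow(max_len - min_len + 1)
    return "".join(secrets.choice(alphabet) for _ in range(length))


def random_subject(tld="net"):
    """Build a subject string whose commonName differs on every call."""
    return f"CN=www.{random_label()}.{tld}"
```

Each bridge or relay would call random_subject once when creating its certificate and use the same string for both subject and issuer, so there is no fixed string for a censor to match on.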
Hopefully Jake hasn't already made this all a moot point while I was in the other room working with Andrew on Saturday and I just haven't heard about it yet. :) Further thoughts/discussions/flames welcome (but only if the flames come from Mike).
Regards, Tim
[1] https://www.eff.org/observatory
[2] https://gitweb.torproject.org/tor.git/blob/HEAD:/doc/spec/proposals/ideas/xx...
- -- Tim Wilde, Senior Software Engineer, Team Cymru, Inc. twilde@cymru.com | +1-630-230-5433 | http://www.team-cymru.org/
Hi Tim,
This is cool! Thanks for reporting what you found.
Is there any chance you could show us the code you used to noodle through the CSV data? I created it on the off chance that someone would find it more appropriate for their needs than the MySQL DB, but we have no code for coping with it. All our examples use the MySQL database, since that's what Peter and Jesse originally intended.
On Feb 21, 2011, at 6:21 AM, Tim Wilde wrote:
> Last weekend's hackfest inspired me to attempt to run some numbers on the EFF SSL Observatory data[1], in particular looking at two things: