Hi Tor developers,
I'm interested in participating in GSoC. I'm an undergrad majoring in computer science at the University of Oklahoma, and I've been a major Tor enthusiast for years.
There are two possible projects which I'm considering; I'm looking for some feedback on which you think would be better for me to apply for.
One project is allowing hidden services to have human-readable names. I think Namecoin would be an excellent backend for this. I coded a proof-of-concept of using Namecoin to point human-readable .bit domains to .onion domains; that code is available at https://github.com/JeremyRand/Convergence . For example, using this, you can visit http://federalistpapers.bit/ to get to the Federalist Papers hidden service. The proof of concept only works on Firefox right now (not TorBrowser); I would definitely be interested in porting it to TorBrowser, improving its privacy, and making it work for applications other than web browsing. Namecoin also has the useful feature of allowing HTTPS fingerprints to be embedded in the blockchain, which eliminates the need to trust certificate authorities for clearnet HTTPS websites (I understand that malicious exit nodes messing with TLS is currently a widely voiced concern for Tor). I have a strong understanding of how Namecoin's DNS works and have developed some projects using Namecoin (including a dynamic DNS client), so I think I'm a good fit for such a project if there's interest in the Tor community. I talked with Jacob Appelbaum about using Namecoin recently; he was concerned about a 51% attack. I think that could be mostly resolved via a checkpointing system; while doing so adds a small degree of centralization, Tor is already slightly centralized, and it's still less centralized than other alternative naming systems that have been proposed (e.g. having Tor Project maintain a list of names themselves). While I'm not particularly familiar (yet) with how checkpointing is done within Namecoin's block validation system, I do know how to at least verify whether the currently loaded blockchain matches a given checkpoint (which would at least alert users that an attack had taken place).
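A minimal sketch of the checkpoint check described above, assuming the client can supply a mapping from block height to block hash for the currently loaded chain. The names `CHECKPOINTS` and `chain_matches_checkpoints` and the placeholder hashes are hypothetical; none of this is taken from Namecoin's actual block validation code.

```python
from typing import Mapping

# Hypothetical checkpoints: block height -> expected block hash (hex).
# Placeholder values for illustration, NOT real Namecoin block hashes.
CHECKPOINTS = {
    1000: "aa" * 32,
    2000: "bb" * 32,
}


def chain_matches_checkpoints(chain_hashes: Mapping[int, str],
                              checkpoints: Mapping[int, str] = CHECKPOINTS) -> bool:
    """Return True if every checkpoint at or below the chain tip matches."""
    if not chain_hashes:
        return True  # nothing loaded yet, so nothing contradicts a checkpoint
    tip = max(chain_hashes)
    for height, expected in checkpoints.items():
        if height > tip:
            continue  # the loaded chain has not reached this checkpoint
        if chain_hashes.get(height, "").lower() != expected.lower():
            return False  # mismatch: the loaded chain may be an attacker's fork
    return True
```

A client could run this after loading the chain and warn the user on a `False` result. That only detects a rewritten chain; it does not by itself prevent one.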
The other project is making a search engine for hidden services (listed as Project Idea F on the Tor website). I think YaCy could be used to accomplish this in a decentralized and censorship-free way. I would suggest making a separate YaCy network for hidden services, using a regexp whitelist to only index .onion URLs (YaCy has such a network but I think it's currently inactive). YaCy doesn't have whitelist support built in, but I think the blacklist feature should be usable for simulating such a feature with some effort. YaCy's SOLR schema supports searching based on outgoing link URLs, so I think I could make a standard YaCy client search for all clearnet sites which link to a .onion/.onion.to/.tor2web.org URL, and feed those URLs to a Tor YaCy client for indexing. I've been a YaCy enthusiast for a couple of years, and I'm actually using YaCy in a grad-level CS project this semester (the course is on Artificial Neural Networks and Evolution), so while I haven't touched the YaCy source code, I think I'm a good match for this project.
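The two ideas above (simulating a whitelist with a blacklist, and turning gateway links into .onion URLs) can be sketched as follows. This is generic Python regex and URL handling, not YaCy's blacklist syntax or API. The function names are hypothetical, and the sketch assumes that gateway links carry the onion label as a subdomain (e.g. `abc.onion.to`) and that the service is reachable over plain http.

```python
import re
from urllib.parse import urlsplit

# Matches any host that does NOT end in ".onion". Blacklisting every host
# this pattern matches leaves only .onion hosts, which emulates a whitelist.
NON_ONION_HOST = re.compile(r"^(?!.*\.onion$).*$", re.IGNORECASE)

# Gateway suffixes mentioned in the message above.
GATEWAY_SUFFIXES = (".onion.to", ".tor2web.org")


def is_blacklisted(url: str) -> bool:
    """True if the URL's host is not a .onion host."""
    host = urlsplit(url).hostname or ""
    return NON_ONION_HOST.match(host) is not None


def to_onion_url(url: str):
    """Rewrite a .onion or gateway link to a direct .onion URL, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.endswith(".onion"):
        onion_host = host
    else:
        for suffix in GATEWAY_SUFFIXES:
            if host.endswith(suffix):
                onion_host = host[: -len(suffix)] + ".onion"
                break
        else:
            return None  # ordinary clearnet link, nothing to feed the crawler
    path = parts.path or "/"
    query = "?" + parts.query if parts.query else ""
    # Assumption: the hidden service is served over plain http.
    return "http://" + onion_host + path + query
```

Links found on clearnet pages would go through `to_onion_url` first, and the resulting URLs would be handed to the Tor-side crawler, where the blacklist pattern keeps it from wandering back onto clearnet hosts.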
Do either of these sound like good proposals? Is one significantly more likely to be approved than the other, so that I know which to submit?
Thanks, -Jeremy Rand
Jeremy Rand biolizard89@gmail.com writes:
> Hi Tor developers,
> I'm interested in participating in GSoC. I'm an undergrad majoring in computer science at the University of Oklahoma, and I've been a major Tor enthusiast for years.
> There are two possible projects which I'm considering; I'm looking for some feedback on which you think would be better for me to apply for.
> One project is allowing hidden services to have human-readable names. I think Namecoin would be an excellent backend for this. I coded a proof-of-concept of using Namecoin to point human-readable .bit domains to .onion domains; that code is available at https://github.com/JeremyRand/Convergence . For example, using this, you can visit http://federalistpapers.bit/ to get to the Federalist Papers hidden service. The proof of concept only works on Firefox right now (not TorBrowser); I would definitely be interested in porting it to TorBrowser, improving its privacy, and making it work for applications other than web browsing. Namecoin also has the useful feature of allowing HTTPS fingerprints to be embedded in the blockchain, which eliminates the need to trust certificate authorities for clearnet HTTPS websites (I understand that malicious exit nodes messing with TLS is currently a widely voiced concern for Tor). I have a strong understanding of how Namecoin's DNS works and have developed some projects using Namecoin (including a dynamic DNS client), so I think I'm a good fit for such a project if there's interest in the Tor community. I talked with Jacob Appelbaum about using Namecoin recently; he was concerned about a 51% attack. I think that could be mostly resolved via a checkpointing system; while doing so adds a small degree of centralization, Tor is already slightly centralized, and it's still less centralized than other alternative naming systems that have been proposed (e.g. having Tor Project maintain a list of names themselves). While I'm not particularly familiar (yet) with how checkpointing is done within Namecoin's block validation system, I do know how to at least verify whether the currently loaded blockchain matches a given checkpoint (which would at least alert users that an attack had taken place).
I'd like to see human-readable names in HSes, but I'm not very familiar with Namecoin. I don't want to discourage you from working on this, but I'm not sure if I would be a good mentor for this.
BTW, I remember watching a presentation about Namecoin, and it seemed like there are still a few serious unresolved problems (domain squatting is easy, no revocation, lightweight clients are impossible). Also, Namecoin is not anonymous, but people who get HS domain names care about anonymity.
> The other project is making a search engine for hidden services (listed as Project Idea F on the Tor website). I think YaCy could be used to accomplish this in a decentralized and censorship-free way. I would suggest making a separate YaCy network for hidden services, using a regexp whitelist to only index .onion URLs (YaCy has such a network but I think it's currently inactive). YaCy doesn't have whitelist support built in, but I think the blacklist feature should be usable for simulating such a feature with some effort. YaCy's SOLR schema supports searching based on outgoing link URLs, so I think I could make a standard YaCy client search for all clearnet sites which link to a .onion/.onion.to/.tor2web.org URL, and feed those URLs to a Tor YaCy client for indexing. I've been a YaCy enthusiast for a couple of years, and I'm actually using YaCy in a grad-level CS project this semester (the course is on Artificial Neural Networks and Evolution), so while I haven't touched the YaCy source code, I think I'm a good match for this project.
Yes, you seem like a good match for this project.
Familiarity with YaCy will be very useful indeed.
On the crawler side, may I suggest you also look into archive.org's Heritrix crawler? Someone told me that it's what the cool kids use these days for crawling the web but I haven't used it myself.
I think you would be a good candidate for this project. However, be warned that it's likely that more good candidates will apply for this project so it might be a tough competition.
Hi George, thanks for the reply.
On 03/02/2014 06:27 AM, George Kadianakis wrote:
> I'd like to see human-readable names in HSes, but I'm not very familiar with Namecoin. I don't want to discourage you from working on this, but I'm not sure if I would be a good mentor for this.
Any idea who might be a good mentor for this idea?
> BTW, I remember watching a presentation about Namecoin, and it seemed like there are still a few serious unresolved problems (domain squatting is easy, no revocation, lightweight clients are impossible).
Domain squatting is known to be an issue, and there are proposals to adjust the name pricing structure of Namecoin to disincentivise squatting. While these proposals are not implemented at the moment, I think it's likely that they will be implemented in the future.
There is a workaround (recently implemented) for a specific use case of revocation: a Namecoin name can import data from a second Namecoin name, in such a way that one name can be held in a safe location while the other name would be easier to update (but overrideable by the first name). So if the easy-to-update name has its keys compromised, the safely-stored name can recover the situation. This doesn't solve the more generic revocation problem; I will inquire with the Namecoin developers about this. (I think it's possible to add full revocation support to Namecoin in the future.)
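A minimal sketch of how such an import-based recovery could resolve, assuming name values are JSON-like dictionaries. The `import` and `tor` field names, the example names, and the in-memory `records` dictionary (standing in for the blockchain) are assumptions made for illustration, not Namecoin's specified format.

```python
def resolve(name, records, depth=4):
    """Return the effective value of `name`, following hypothetical 'import' links.

    Fields set directly on the importing (safely stored) name override fields
    pulled in from the imported (easy-to-update) name.
    """
    value = dict(records.get(name, {}))  # copy so `records` is not mutated
    imported_name = value.pop("import", None)
    if imported_name is None or depth <= 0:
        return value  # no import, or depth limit reached (guards against cycles)
    merged = resolve(imported_name, records, depth - 1)
    merged.update(value)  # the importing name's own fields win
    return merged


# Normal operation: the safe name delegates everything to the hot name.
records = {
    "d/example": {"import": "d/example-hot"},
    "d/example-hot": {"tor": "hotaddress.onion"},
}
normal = resolve("d/example", records)

# Recovery: the hot name's keys are compromised, so the safe name overrides it.
records["d/example-hot"]["tor"] = "attacker.onion"
records["d/example"]["tor"] = "recovered.onion"
recovered = resolve("d/example", records)
```

In this sketch the safely stored name only needs to be touched when something goes wrong, which matches the recovery use case but, as noted, not general revocation.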
Lite clients do not exist right now, but are definitely possible to build. The UTXO lite client being implemented for Bitcoin should be mergeable to Namecoin in the future.
> Also, Namecoin is not anonymous, but people who get HS domain names care about anonymity.
You are correct that Namecoin addresses are linkable. I think it's likely that Zerocoin or CoinJoin will be implemented for Namecoin in the future, which would solve the issue. In the meantime, I think the best way to get mostly-anonymous namecoins would be to obtain bitcoins, run them through a mixer, and use the resulting anonymized bitcoins to purchase namecoins on an exchange. (Most exchanges don't ask for identification unless you're using government-issued currency.) I think some exchanges block Tor, so it might be necessary to use a proxy or VPN between Tor and the exchange.
> Yes, you seem like a good match for this project.
> Familiarity with YaCy will be very useful indeed.
> On the crawler side, may I suggest you also look into archive.org's Heritrix crawler? Someone told me that it's what the cool kids use these days for crawling the web but I haven't used it myself.
Thanks for the tip, I will look into Heritrix.
> I think you would be a good candidate for this project. However, be warned that it's likely that more good candidates will apply for this project so it might be a tough competition.
Is there a way that I could submit two proposals (one for each of the projects I listed), so that if there's tough competition for one project I can still be considered for the other? Or does GSoC only permit one proposal per student per organization?
Thanks, -Jeremy Rand
> Is there a way that I could submit two proposals (one for each of the projects I listed), so that if there's tough competition for one project I can still be considered for the other? Or does GSoC only permit one proposal per student per organization?
Hi Jeremy. I'll leave the rest of the questions to George but as for this one, yes. It's perfectly fine to apply to multiple projects (or multiple orgs). Be wary though about spreading yourself too thin. Submitting a fistful of poor proposals wouldn't fare very well. ;)
On 03/02/2014 09:33 PM, Damian Johnson wrote:
> Hi Jeremy. I'll leave the rest of the questions to George but as for this one, yes. It's perfectly fine to apply to multiple projects (or multiple orgs). Be wary though about spreading yourself too thin. Submitting a fistful of poor proposals wouldn't fare very well. ;)
Thanks Damian.
-Jeremy
Jeremy Rand biolizard89@gmail.com writes:
> Hi George, thanks for the reply.
> On 03/02/2014 06:27 AM, George Kadianakis wrote:
> > I'd like to see human-readable names in HSes, but I'm not very familiar with Namecoin. I don't want to discourage you from working on this, but I'm not sure if I would be a good mentor for this.
> Any idea who might be a good mentor for this idea?
No idea. I don't know of any people experienced with Namecoin in Tor. Sorry.
> > BTW, I remember watching a presentation about Namecoin, and it seemed like there are still a few serious unresolved problems (domain squatting is easy, no revocation, lightweight clients are impossible).
> Domain squatting is known to be an issue, and there are proposals to adjust the name pricing structure of Namecoin to disincentivise squatting. While these proposals are not implemented at the moment, I think it's likely that they will be implemented in the future.
> There is a workaround (recently implemented) for a specific use case of revocation: a Namecoin name can import data from a second Namecoin name, in such a way that one name can be held in a safe location while the other name would be easier to update (but overrideable by the first name). So if the easy-to-update name has its keys compromised, the safely-stored name can recover the situation. This doesn't solve the more generic revocation problem; I will inquire with the Namecoin developers about this. (I think it's possible to add full revocation support to Namecoin in the future.)
> Lite clients do not exist right now, but are definitely possible to build. The UTXO lite client being implemented for Bitcoin should be mergeable to Namecoin in the future.
> > Also, Namecoin is not anonymous, but people who get HS domain names care about anonymity.
> You are correct that Namecoin addresses are linkable. I think it's likely that Zerocoin or CoinJoin will be implemented for Namecoin in the future, which would solve the issue. In the meantime, I think the best way to get mostly-anonymous namecoins would be to obtain bitcoins, run them through a mixer, and use the resulting anonymized bitcoins to purchase namecoins on an exchange. (Most exchanges don't ask for identification unless you're using government-issued currency.) I think some exchanges block Tor, so it might be necessary to use a proxy or VPN between Tor and the exchange.
Zerocoin/etc. seems like a bigger project than Namecoin. I think implementing Namecoin support now and then waiting for Zerocoin to be established and used is not going to be very efficient.
> > Yes, you seem like a good match for this project.
> > Familiarity with YaCy will be very useful indeed.
> > On the crawler side, may I suggest you also look into archive.org's Heritrix crawler? Someone told me that it's what the cool kids use these days for crawling the web but I haven't used it myself.
> Thanks for the tip, I will look into Heritrix.
> > I think you would be a good candidate for this project. However, be warned that it's likely that more good candidates will apply for this project so it might be a tough competition.
> Is there a way that I could submit two proposals (one for each of the projects I listed), so that if there's tough competition for one project I can still be considered for the other? Or does GSoC only permit one proposal per student per organization?
AFAIK, you can submit multiple proposals. Even multiple proposals through different FOSS projects. Like I suggested in my previous mail, I would even encourage you to submit multiple proposals since the HS search engine project has gotten plenty of student attention lately.
Cheers!
On 03/04/2014 11:31 AM, George Kadianakis wrote:
> AFAIK, you can submit multiple proposals. Even multiple proposals through different FOSS projects. Like I suggested in my previous mail, I would even encourage you to submit multiple proposals since the HS search engine project has gotten plenty of student attention lately.
> Cheers!
Sorry for the delayed reply, school had me busy.
What is the preferred way to get feedback on a full proposal? Is there a way to submit a draft proposal on the GSoC website so that Tor devs can read it and send me feedback, but I can revise it before the deadline? Or should I just post a link in an e-mail to the Tor-Dev list? Also, does Tor prefer proposals in plain text, PDF, or some other format?
Thanks, -Jeremy Rand
Hi,
Attached is my draft GSoC proposal, Hidden Service Naming and TLS Cert Checking with Namecoin.
Feedback prior to the Friday GSoC deadline would be greatly appreciated.
Thanks, -Jeremy Rand