Hey,
I'm Krishna Shukla, and I'm studying for a Bachelor of Computer Science at the University of Queensland. I guess the relevant subjects I've studied so far cover C and Unix programming, Computer Networks, Algorithms and Data Structures, and Programming in the Large (I got a high distinction in all of the above).
My most important question is whether I could work on a project without actually being a part of GSoC. I am unfortunately ineligible, as my brother works as a Security Engineer at Google Sydney. And if the above is okay, would it also be okay not to strictly abide by the GSoC timeline? I don't actually have holidays during this time in Australia, but I'd like to contribute in my free time nonetheless!
As for the projects themselves, I'm really interested in the relay crypto parallelism and the hidden service crypto parallelism, and I have a couple of questions regarding them.
For the relay crypto parallelism, I wanted to know what is left to be done. When I looked at ticket #1749, someone called towelenee had made a few patches that already made it multithreaded; were these changes just not accepted? I also wanted to know whether specific knowledge of circuit cryptography is required. I know of it, but I certainly cannot make my own fully homomorphic cryptosystem. Is it more that the system has already been built and just needs to be parallelised correctly?
It also states that the code is written to expect immediate responses. I'm not sure what you mean by that; after all, there is always a slight delay, and if it becomes multithreaded we can never know what is running when. So is it more that someone is waiting at the other end of a socket and needs the result ASAP, or is it that things internally want the answer quickly (in which case I don't know how to solve it other than by using mutexes, which is probably not so okay)?
I am interested in the hidden service crypto parallelism in its own right, but I was also wondering whether it would be feasible to combine the two projects and create a multithreaded decryption library that could be linked into both the Tor relay and the hidden services. (It could be released as a cryptosystem library; all the fully homomorphic cryptosystem libraries I found used GPL licenses and are thus not compatible with Tor's.) Or are their requirements too far apart?
Also, I was wondering how the Ahmia automated blacklisting was planned to work. As in, how would a list of child abuse sites be fetched? Honestly, I don't actually know Python; I've worked with it and Django once in a hackathon, but I cannot claim any real knowledge of it. At the same time, I am passionate about the topic of child abuse, and if I can help reduce its demand in any way by making it harder to find, I'd say it's some good added to the world.
Apologies about the long mail, guys. Krishna Shukla
Hi Krishna, absolutely! We love having new volunteers, be it through GSoC or not. Hell, most of us got our start outside the program. ;)
I'll leave the crypto parallelism questions to Nick, George, David, and others far more knowledgeable of the core tor codebase than me.
On Mon, Mar 27, 2017 at 6:17 PM, Krishna Shukla karatekrishna@hotmail.com wrote:
[...]
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On Mon, Mar 27, 2017 at 9:17 PM, Krishna Shukla karatekrishna@hotmail.com wrote:
Hey,
I'm Krishna Shukla, and I'm studying for a Bachelor of Computer Science at the University of Queensland. I guess the relevant subjects I've studied so far cover C and Unix programming, Computer Networks, Algorithms and Data Structures, and Programming in the Large (I got a high distinction in all of the above).
My most important question is whether I could work on a project without actually being a part of GSoC. I am unfortunately ineligible, as my brother works as a Security Engineer at Google Sydney. And if the above is okay, would it also be okay not to strictly abide by the GSoC timeline? I don't actually have holidays during this time in Australia, but I'd like to contribute in my free time nonetheless!
Sure; we are always happy to accept volunteers!
You might want to try something simpler than this for your first patch or two -- it's generally better to get practice with smaller things before you move to something big and complex. The documents in doc/HACKING inside the Tor git repository might be a good place to find starting advice.
As for the projects themselves, I'm really interested in the relay crypto parallelism and the hidden service crypto parallelism, and I have a couple of questions regarding them.
For the relay crypto parallelism, I wanted to know what is left to be done. When I looked at ticket #1749, someone called towelenee had made a few patches that already made it multithreaded; were these changes just not accepted?
So, those patches just can't work as they're written now. To begin with, they launch a new thread for every hop in an outgoing circuit, and then they wait for every such thread to finish. They also have a pretty serious race condition in their handling of the payload they're supposed to be encrypting. And finally, they only handle client-side circuit crypto -- not relay-side crypto at all.
A better approach, and the approach we were hoping for, would be to parallelize crypto by circuit, not by hop, and to use long-lived worker threads rather than one thread per circuit or (worse) one thread per cell per hop.
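One way to picture "by circuit, with long-lived workers" is the sketch below. It is only an illustration in Python, not Tor code: `toy_crypt`, `CircuitCryptoPool`, `NUM_WORKERS` and the XOR "cipher" are all hypothetical stand-ins. It assumes that a circuit's cipher state has to be advanced in cell order, so every circuit is pinned to one worker (here simply by `circ_id % num_workers`), and that worker alone owns the state.

```python
import queue
import threading

NUM_WORKERS = 4  # hypothetical pool size, fixed for the life of the process


def toy_crypt(state, payload):
    """Stand-in for a stateful per-circuit cipher: XOR with a running counter."""
    out = bytes(b ^ ((state["ctr"] + i) & 0xFF) for i, b in enumerate(payload))
    state["ctr"] += len(payload)
    return out


class CircuitCryptoPool:
    """Long-lived workers; each circuit is always handled by the same worker."""

    def __init__(self, num_workers=NUM_WORKERS):
        self.inboxes = [queue.Queue() for _ in range(num_workers)]
        self.results = queue.Queue()
        self.threads = [
            threading.Thread(target=self._run, args=(inbox,), daemon=True)
            for inbox in self.inboxes
        ]
        for thread in self.threads:
            thread.start()

    def _run(self, inbox):
        states = {}  # circuit state touched by this worker only, so no lock
        while True:
            item = inbox.get()
            if item is None:  # shutdown marker
                break
            circ_id, seq, payload = item
            state = states.setdefault(circ_id, {"ctr": 0})
            self.results.put((circ_id, seq, toy_crypt(state, payload)))

    def submit(self, circ_id, seq, payload):
        # Same circuit -> same inbox, so cells of one circuit stay in order.
        self.inboxes[circ_id % len(self.inboxes)].put((circ_id, seq, payload))

    def shutdown(self):
        for inbox in self.inboxes:
            inbox.put(None)
        for thread in self.threads:
            thread.join()


def demo():
    pool = CircuitCryptoPool()
    for seq in range(3):
        for circ_id in (1, 2, 5):
            pool.submit(circ_id, seq, b"cell")
    pool.shutdown()
    by_circ = {}
    while not pool.results.empty():
        circ_id, seq, out = pool.results.get()
        by_circ.setdefault(circ_id, []).append((seq, out))
    return by_circ
```

Because no thread is created per cell or per hop, thread start-up cost is paid once, and different circuits can proceed on different cores while each circuit still sees its cells in sequence.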
I also wanted to know whether specific knowledge of circuit cryptography is required. I know of it, but I certainly cannot make my own fully homomorphic cryptosystem. Is it more that the system has already been built and just needs to be parallelised correctly?
Right; all our crypto is implemented in Tor right now. I'm not sure why you're mentioning homomorphic cryptosystems; we don't need one of those here.
It also states that the code is written to expect immediate responses. I'm not sure what you mean by that; after all, there is always a slight delay, and if it becomes multithreaded we can never know what is running when. So is it more that someone is waiting at the other end of a socket and needs the result ASAP, or is it that things internally want the answer quickly (in which case I don't know how to solve it other than by using mutexes, which is probably not so okay)?
Maybe have a look at how we use the function relay_crypt() to see what we mean here? A more precise thing to say would have been that the calls to relay_crypt() are written to block until relay_crypt() is finished. If the work of relay_crypt() is instead to be done in another thread, then the functions that call it today need to queue the work for that thread ... and then continue safely.
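The difference between "block until finished" and "queue the work and continue" can be sketched as follows. This is a Python illustration under stated assumptions, not how Tor is structured: `toy_relay_crypt`, `handle_cell_blocking`, `handle_cell_queued` and `drain_replies` are hypothetical names, and the XOR transform merely stands in for the real work of relay_crypt(). The caller hands over the payload together with a continuation; the continuation is run later by the main thread, which is one way to avoid sprinkling mutexes over main-thread state.

```python
import queue
import threading

work_q = queue.Queue()   # main thread -> worker
reply_q = queue.Queue()  # worker -> main thread


def toy_relay_crypt(payload):
    """Stand-in for the real crypto step."""
    return bytes(b ^ 0x5A for b in payload)


def worker():
    while True:
        job = work_q.get()
        if job is None:  # shutdown marker
            break
        payload, on_done = job
        # Only compute here; the continuation is handed back, not called.
        reply_q.put((on_done, toy_relay_crypt(payload)))


def handle_cell_blocking(payload, send):
    """Today's shape: the caller waits for the result before moving on."""
    send(toy_relay_crypt(payload))


def handle_cell_queued(payload, send):
    """Queued shape: returns immediately; `send` runs once a reply arrives."""
    work_q.put((payload, send))


def drain_replies():
    """Called from the main loop; runs continuations on the main thread."""
    handled = 0
    while True:
        try:
            on_done, result = reply_q.get_nowait()
        except queue.Empty:
            return handled
        on_done(result)
        handled += 1


def demo():
    sent = []
    thread = threading.Thread(target=worker)
    thread.start()
    handle_cell_queued(b"abc", sent.append)
    handle_cell_queued(b"xyz", sent.append)
    work_q.put(None)
    thread.join()
    drain_replies()
    return sent
```

The part that takes care in real code is "continue safely": anything the caller used to do right after the blocking call has to move into the continuation, and the caller must cope with the circuit having changed or closed by the time the reply is drained.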
I am interested in the hidden service crypto parallelism in its own right, but I was also wondering whether it would be feasible to combine the two projects and create a multithreaded decryption library that could be linked into both the Tor relay and the hidden services. (It could be released as a cryptosystem library; all the fully homomorphic cryptosystem libraries I found used GPL licenses and are thus not compatible with Tor's.) Or are their requirements too far apart?
So, the "hidden service" crypto in question here is a set of public key operations. But having a separate library for this probably isn't the right design, IMO. The idea is not to split _each operation_ across multiple CPU cores, but rather to handle _multiple operations_ by doing them on multiple cores. The code for each individual operation could remain single-threaded.
If you'd like to see an example of how we do this in Tor today for our server-side circuit extension handshakes, have a look at workqueue.c and onion.c in the Tor source code, to get a sense of how they work together. You'll notice that the crypto operations themselves are handled in regular single-threaded code (e.g. in onion_ntor.c), and that the parallelism happens on a higher level than a single crypto operation.
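To make "multiple operations on multiple cores, each operation single-threaded" concrete, here is a small Python sketch. It is not a port of workqueue.c and does not mirror its API: `expensive_op` and `handle_all` are hypothetical names, and PBKDF2 from the standard library is used purely as a stand-in for an expensive public-key operation.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def expensive_op(request):
    """One whole operation, written as ordinary single-threaded code."""
    return hashlib.pbkdf2_hmac("sha256", request, b"demo-salt", 2000)


def handle_all(requests, workers=4):
    """Spread independent operations over a pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(expensive_op, requests))
```

For example, `handle_all([b"intro-0", b"intro-1"])` returns the same two digests as calling `expensive_op` on each request in turn. Nothing inside `expensive_op` knows about threads, which is the property described above: the parallelism lives in the dispatcher, not in the crypto routine.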
Also I was wondering how the Ahmia automated blacklisting was planned to work?
This isn't something I've been working on; maybe somebody else can answer this question. (Ahmia folks?)
best wishes,
On Thu, Mar 30, 2017 at 2:04 PM, Nick Mathewson nickm@alum.mit.edu wrote: [...]
I also wanted to know whether specific knowledge of circuit cryptography is required. I know of it, but I certainly cannot make my own fully homomorphic cryptosystem. Is it more that the system has already been built and just needs to be parallelised correctly?
Right; all our crypto is implemented in Tor right now. I'm not sure why you're mentioning homomorphic cryptosystems; we don't need one of those here.
Ohhhh! A colleague points out to me that the term "circuit" is also used in homomorphic encryption. I'm talking about Tor circuits here -- they are a construction used on the Tor network, and don't have anything to do with homomorphic encryption. For more information on them, see https://www.torproject.org/about/overview.html.en and tor-spec.txt