On 5/26/12 9:30 AM, Sebastian G. <bastik.tor> wrote:
Karsten Loesing, 22.05.2012 09:24:
Unless someone objects or you disagree, I'm going to upload the files I created and explain how, and maybe even why, I created them.
No objections at all. Open discussion is good.
I created a blog, just because I wanted one at some point in the past, but then found it silly. That's the channel I planned to use. Maybe it's OK to post it on a Tor list as well, but it might be considered noise.
I wonder if the Tor wiki would be a better place to collect ideas for reversing the bridge descriptor sanitizing process. Feel free to grab a new page in doc/ and start describing what you did.
I did just that.
https://trac.torproject.org/projects/tor/wiki/doc/DataExtractionForCompariso...
Thanks for creating that page. Looks like a fine start, though you'll want to automate more things when looking at 2012 tarballs.
grep and friends are fine tools to process Tor descriptors. If you can, find a Unix/Linux-like environment for Windows (Cygwin?) and combine the powers of grep with sort, uniq, and maybe sed or awk. These tools are friggin' fast!
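If a Unix-like shell isn't at hand, the same grep, sort, and uniq -c idea can be sketched in a few lines of Python. This is only an illustration and not the method used on the wiki page: the function name count_lines, the sample lines, and the choice of the "platform" keyword are all assumptions made up for this example.

```python
from collections import Counter


def count_lines(lines, keyword):
    """Count identical lines that start with `keyword` followed by a space.

    Roughly what `grep "^keyword " | sort | uniq -c` would give you.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(keyword + " "):
            counts[line] += 1
    return counts


# Hypothetical sample lines, made up for illustration only.
sample = [
    "router Unnamed 10.0.0.1 443 0 0\n",
    "platform Tor 0.2.2.35 on Linux x86_64\n",
    "router Unnamed 10.0.0.2 443 0 0\n",
    "platform Tor 0.2.2.35 on Linux x86_64\n",
    "platform Tor 0.2.3.15-alpha on Windows 7\n",
]

# Print the most frequent lines first, like `sort | uniq -c | sort -rn`.
for line, n in count_lines(sample, "platform").most_common():
    print(n, line)
```

To run it on real data, you would pass an open file object (or the concatenated lines of many descriptor files) instead of the sample list.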
If you're comfortable with Java and want to do more fancy stuff with Tor descriptors, take a look at metrics-lib:
https://gitweb.torproject.org/metrics-lib.git
If you're a Python person, you'll like stem, even though it only implements parsing of a subset of Tor descriptors. More to come soon:
https://gitweb.torproject.org/stem.git
Best, Karsten