Hello!
Just a reminder that the regular biweekly pluggable transports meeting
will take place tomorrow at 16:00 UTC, in the #tor-dev IRC channel on
the OFTC network.
Thanks for your attention!
From October 24th to 26th the OONI team gathered in Berlin for a
hackfest. Around 20 people showed up, and although most of them were
seasoned Oonitarians, some fairly new people joined us who I hope will
become part of the growing OONI community.
The scope of the hackfest was data analytics and visualization, with a
special focus on the Tor bridge reachability study we are currently
running.
# Bridge reachability study
The goal of this study [1] is to answer some questions concerning the
blocking of Tor bridges [2] and pluggable transport [3] enabled bridges
in China, Iran, Russia and Ukraine (our test vantage points).
To establish a baseline and eliminate cases in which a bridge is marked
as blocked while it is in fact just offline, we also measure from a
control vantage point located in the Netherlands.
For every test vantage point we perform two types of measurements:
* A bridge reachability measurement [4][5] that attempts to build a Tor
circuit using the bridge in question
* A TCP connect measurement [6][7] that simply does a TCP connect to the
bridge's IP and port
We run both measurements so that we can better diagnose why the
blocking is happening, whether it is due to a TCP RST, direct IP
blocking, or a tor malfunction.
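In essence, the TCP connect part boils down to something like the
following Python sketch (the bridge address here is made up, and the
real ooni-probe test [7] of course records far more detail):

    import socket

    def tcp_connect(host, port, timeout=10.0):
        """Attempt a plain TCP connection to a bridge's IP and port."""
        try:
            sock = socket.create_connection((host, port), timeout=timeout)
            sock.close()
            return {"host": host, "port": port, "status": "success"}
        except socket.error as exc:
            # A refused connection (e.g. after a TCP RST) and a timeout
            # are both failures; the error message helps tell them apart.
            return {"host": host, "port": port, "status": "failure",
                    "reason": str(exc)}

    print(tcp_connect("198.51.100.7", 443))  # made-up bridge address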
So far this study has been running for a little less than a month.
# OONI data pipeline
In order to produce the aggregate data needed to build visualizations,
we have built a data pipeline [8].
This consists of a series of operations performed on the raw reports to
strip out sensitive information and place the collected data into a
database.
The nice thing is that the data pipeline we have designed is not
specific to this study: it can, and will in the future, be expanded to
export the data needed to visualize the other types of measurements
performed by OONI.
The data pipeline is comprised of 3 steps (or states, depending on how
you want to look at it).
When data is submitted to an OONI collector it is synchronized with
the aggregator.
This is a central machine responsible for running all the data
processing tasks, storing the collected data in a database, and hosting
a public interface to the sanitised reports. Since all the steps are
independent from one another, they do not all have to run on the same
machine; the pipeline can also be distributed more widely.
Once the data is on the aggregator machine it is said to be in the RAW
state. The sanitise task is then run on the RAW data to remove sensitive
information and strip out some superfluous information. A RAW copy of
every report is also stored in a private compressed archive for future
reference.
Once the data is sanitised it is said to be in the SANITISED state. At
this point an import task is run on the data to place it inside of a
database.
The SANITISED reports are then placed in a directory that is publicly
exposed to the internet, to allow people to also download a copy of the
YAML reports.
At this point it is possible to run any export task that performs
queries on the database and produces as output documents to be used in
the data visualizations (think JSON, CSV, etc.).
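To make the three states concrete, here is a rough sketch of what the
sanitise and import tasks do (the paths, field names and database
interface here are my own illustration, not the actual pipeline code
[8]):

    import os
    import shutil

    import yaml  # the reports are YAML documents

    RAW_DIR = "/data/raw"           # reports synced from the collectors
    SANITISED_DIR = "/data/public"  # directory exposed to the internet
    ARCHIVE_DIR = "/data/archive"   # private copies of the RAW reports

    SENSITIVE_KEYS = ("probe_ip",)  # illustrative; the real list is longer

    def sanitise(report_name):
        """RAW -> SANITISED: strip sensitive fields, archive the original."""
        raw_path = os.path.join(RAW_DIR, report_name)
        with open(raw_path) as f:
            entries = list(yaml.safe_load_all(f))
        for entry in entries:
            if isinstance(entry, dict):
                for key in SENSITIVE_KEYS:
                    entry.pop(key, None)
        out_path = os.path.join(SANITISED_DIR, report_name)
        with open(out_path, "w") as f:
            yaml.safe_dump_all(entries, f)
        shutil.copy(raw_path, ARCHIVE_DIR)  # the real pipeline compresses it
        return out_path

    def import_report(sanitised_path, db):
        """SANITISED -> database, so that export tasks can query it."""
        with open(sanitised_path) as f:
            for entry in yaml.safe_load_all(f):
                db.insert("measurements", entry)  # placeholder DB interface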
# The OONI hackfest
The first day of the hackfest was spent going over the scope of the
project we would be working on in the following days, as well as
splitting into groups interested in tackling the design of one aspect
of the problem.
Sticky notes were plentiful and helped us have a clear vision of what
lay ahead of us.
By the end of the first day we had a clear picture of the set of tasks
needed to achieve our goals and of which teams would be responsible for
doing what.
The second day was almost entirely dedicated to hacking, and everybody
had a task to complete by the end of the day. Some people even
completed their initially assigned task early and came back asking for
more!
By the end of the second day we had a real data set to hand over to the
visualization team, to start producing some pretty graphs based on real
data.
We decided that the first visualization we built should be kept as
simple as possible and be something we could also use to debug the data
we had collected. It should tell us which bridges were working when,
and it should present the information in a way that highlights the
country involved and the pluggable transport type.
A prototype of it can be seen here:
http://reports.ooni.nu/analytics/bridge_reachability/timeline/
The code for this visualization can be found here:
https://github.com/Shidash/OONI-Bridge-Reachability-Timeline
# Next steps
* Write scripts for generating the bridge_db.json document based on the
data that is given to us by the BridgeDB team
https://trac.torproject.org/projects/tor/ticket/13570
* Align the dates in the visual timeline
https://trac.torproject.org/projects/tor/ticket/13639
* Better tokenising of bridges, so that bridges that have the same
fingerprint but a different transport are grouped properly
https://trac.torproject.org/projects/tor/ticket/13638
* Finish setting up the docker containers for the steps of the data
pipeline
https://trac.torproject.org/projects/tor/ticket/13568
* Set up a disaster recovery procedure and backups
https://trac.torproject.org/projects/tor/ticket/13584
* Set up monitoring of the probes
https://trac.torproject.org/projects/tor/ticket/12549
* Add support for obfs4
https://trac.torproject.org/projects/tor/ticket/13597
* Set upper bound in comparison with the control in the bridge
reachability timeline
https://trac.torproject.org/projects/tor/ticket/13640
* Make sure that the control measurement is for the specific bridge
measurement
https://trac.torproject.org/projects/tor/ticket/13655
Questions and comments should be directed to the ooni-dev mailing list
or to the #ooni channel on irc.oftc.net.
Have fun!
~ Arturo
[1] https://lists.torproject.org/pipermail/ooni-dev/2014-October/000184.html
[2] https://www.torproject.org/docs/bridges
[3] https://www.torproject.org/docs/pluggable-transports.html.en
[4] https://gitweb.torproject.org/ooni/spec.git/blob/HEAD:/test-specs/ts-011-br…
[5] https://gitweb.torproject.org/ooni-probe.git/blob/HEAD:/ooni/nettests/block…
[6] https://gitweb.torproject.org/ooni/spec.git/blob/HEAD:/test-specs/ts-008-tc…
[7] https://gitweb.torproject.org/ooni-probe.git/blob/HEAD:/ooni/nettests/block…
[8] https://github.com/TheTorProject/ooni-pipeline/blob/master/Readme.md#ooni-p…
+tor-dev. tl;dr: Would be nice if there were an HTTP response header
that allows HTTPS servers to indicate their .onion domain names so that
HTTPS Everywhere can automatically redirect to the .onion version in the
future if the user chooses a "use THS when available" preference.
I imagine the header semantics and processing would be similar to HSTS.
It would only be noted when sent over TLS, and it would have the
max-age and include-subdomains fields.
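As a strawman (the header name and onion address below are entirely
made up), it could look like:

    Tor-Onion-Service: exampleonionaddr.onion; max-age=31536000; includeSubDomains

A UA that has seen this header over a valid TLS connection would then
rewrite requests for the domain (and, if includeSubDomains is set, its
subdomains) to the .onion address until max-age seconds have elapsed.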
-yan
yan wrote:
> Hi all,
>
> Some people have requested that the "Darkweb Everywhere" extension [1]
> be integrated into HTTPS Everywhere. This is an extension for Tor
> Browser that redirects users to the Tor Hidden Service version of a
> website when possible.
>
> I'm supportive of the idea; however, I'm worried that since .onion
> domain names are usually unrelated to a site's regular domain name, a
> malicious ruleset would be hard to detect. AFAIK Darkweb Everywhere only
> defends against this by publishing a doc in their Github repo that cites
> evidence for each ruleset [2].
>
> What if, instead, we asked website owners to send an HTTP header that
> indicates the Tor Hidden Service version of their website? Then HTTPS
> Everywhere could cache the result (like HSTS) and redirect to the THS
> version automatically in the future if the user opts-in.
>
> If this is something that EFF/Tor would be willing to advocate for, I
> would be happy to draft a specification for the header syntax and
> intended UA behavior.
>
> Thanks,
> Yan
>
>
> [1] https://github.com/chris-barry/darkweb-everywhere/
> [2]
> https://github.com/chris-barry/darkweb-everywhere/blob/master/doc/EVIDENCE.…
The International Conference on Semantic Web Business and Innovation
(SWBI2015)
The University of Applied Sciences and Arts Western Switzerland (HES-SO
Valais-Wallis)
October 7-9, 2015
http://sdiwc.net/conferences/swbi2015/
All registered papers will be included in the SDIWC Digital Library
================================================================
The proposed conference on the above theme will be held at the
University of Applied Sciences and Arts Western Switzerland (HES-SO
Valais-Wallis) on October 7-9, 2015, and aims to enable researchers to
build connections between different digital applications.
The conference welcomes papers on, but not limited to, the following
research topics:
*Semantic Web and Linked Data
- Database, IR, NLP and AI technologies for the Semantic Web
- Geospatial Semantic Web
- Information Extraction from unstructured data
- Information visualization of Semantic Web data and Linked Data
- Internet of things
- Languages, tools, and methodologies for representing and managing
Semantic Web data
- Linked open data
- Management of Semantic Web data and Linked Data
- Ontology engineering and ontology patterns for the Semantic Web
- Ontology modularity, mapping, merging, and alignment
- Ontology-based data access
- Privacy and Security
- Query and inference over data streams
- Search, query, integration, and analysis on the Semantic Web
- Semantic business process management
- Semantic Sensor networks
- Semantic technologies for mobile platforms
- Semantic Web and Linked Data for Cloud Environments
- Semantic Web, Ontologies
- Social networks and processes on the Semantic Web
- Supporting multi-linguality in the Semantic Web
- User Interfaces and interacting with Semantic Web data and Linked Data
*Business and Innovation in Semantic Web
- Business Model Innovation
- Business Models and E-Commerce
- Business Technology Intelligence
- Challenges for change in semantic services
- Collaborative improvement and innovation
- E-Business Applications and Software
- E-commerce Technology Adoption
- E-commerce, E-Business Strategies
- E-tailing and Multi-Channel selling
- Evolution of Business model for Semantic Web Applications
- High-tech marketing
- Implementation strategies for responsible innovation
- Innovation for E-Business
- Innovative methods and tools for products and services
- Practices and Cases in E-Commerce
- Production of Knowledge Economy
- Technology and Business Transformation
- Technology strategies
- The Latest Trends in Linked Data
- Web Advertising and Web Publishing
- Web and Mobile Applications
Researchers are encouraged to submit their work electronically. All
papers will be fully refereed by a minimum of two specialized referees.
Before final acceptance, all referees' comments must be considered.
Important Dates
==============
Submission is open
Notification of Acceptance: 6 weeks from the submission date
Camera Ready Submission: September 14, 2015
Registration Deadline: September 14, 2015
Conference Dates: October 7-9, 2015
== What is bridge reachability data? ==
By bridge reachability data I'm referring to information about which
Tor bridges are censored in different parts of the world.
The OONI project has been developing a test that allows probes in
censored countries to test which bridges are blocked and which are
not. The test simply takes as input a list of bridges and tests
whether they work. It's also able to test obfuscated bridges with
various pluggable transports (PTs).
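If I recall correctly, the input is essentially a list of bridge lines,
much like the ones you would put in a torrc; something like this (the
addresses and fingerprints below are made up):

    obfs3 198.51.100.7:443 A09D536DD1752D542E1FBB3C9CE4449D51298239
    198.51.100.23:9001 0DB8C2EF92D93AC1EF2D4D33B220FB2D5DA49210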
== Why do we care about this bridge reachability data? ==
A few different parties care about the results of the bridge
reachability test [0]. Some examples:
Tor developers and censorship researchers can study the bridge
reachability data to learn which PTs are currently useful around the
world, by seeing which pluggable transports get blocked and where. We
can also learn which bridge distribution mechanisms are busted and
which are not.
Bridge operators, the press, funders and curious people can learn
which countries conduct censorship and how advanced the technology
they use is. They can also learn how long it takes jurisdictions to
block public bridges. And in general, they can get a better
understanding of how well Tor is doing at censorship circumvention
around the world.
Finally, censored users and world travelers can use the data to learn
which PTs are safe to use in a given jurisdiction.
== Visualizing bridge reachability data ==
So let's look at the data.
Currently, OONI bridge reachability reports look like this:
https://ooni.torproject.org/reports/0.1/CN/bridge_reachability-2014-07-02T0…
and you can retrieve them from this directory listing:
https://ooni.torproject.org/reports/0.1/
That's nice, but I doubt that many people will be able to access (let
alone understand) those reports. Hence, we need some kind of
visualization (and a better directory listing) to conveniently display
the data to human beings.
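As a first step, a script only needs to walk the published YAML reports
and tally outcomes; here is a rough sketch in Python (the field names
are assumptions on my part, the test specification has the
authoritative keys):

    import sys
    from collections import Counter

    import yaml

    def tally(report_paths):
        """Count working vs. blocked bridges per (country, transport)."""
        counts = Counter()
        for path in report_paths:
            with open(path) as f:
                for entry in yaml.safe_load_all(f):
                    # Skip the report header and anything malformed.
                    if not isinstance(entry, dict) or "bridge_address" not in entry:
                        continue
                    counts[(entry.get("probe_cc"),        # e.g. "CN"
                            entry.get("transport_name"),  # e.g. "obfs3"
                            "works" if entry.get("success")
                            else "blocked")] += 1
        return counts

    if __name__ == "__main__":
        for key, n in sorted(tally(sys.argv[1:]).items(), key=str):
            print(key, n)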
However, a simple x-to-y graph will not suffice: our problem is
multidimensional. There are many use cases for the data, and bridges
have various characteristics (obfuscation method, distribution method,
etc.), hence there is more than one useful way to visualize this
dataset.
To give you an idea, I will show you two mockups of visualizations
that I would find useful. Please don't pay attention to the data
itself, I just made some things up while on a train.
Here is one that shows which PTs are blocked in which countries:
https://people.torproject.org/~asn/bridget_vis/countries_pts.jpg
The list would only include countries that are blocking at least one
bridge. Green is "works", red is "blocked". You can also imagine the
same visualization with distribution methods ("BridgeDB HTTP
distributor", "BridgeDB mail distributor", "Private bridge", etc.) as
columns instead of PT names.
And here is another one that shows how fast jurisdictions block the
default TBB bridges:
https://people.torproject.org/~asn/bridget_vis/tbb_blocked_timeline.jpg
These visualizations could be helpful, but they are not the only ones.
What other use cases do you imagine using this dataset for?
What graphs or visualizations would you like to see?
[0]: Here are some use cases:
Tor developers / Researchers:
*** Which pluggable transports are blocked and where?
*** Do they do DPI? Or did they just block the TBB hardcoded bridges?
*** Which jurisdictions are most aggressive and what blocking technology do they use?
*** Do they block based on IP or on (IP && PORT)?
Users:
*** Which pluggable transport should I use in my jurisdiction?
Bridge operators / Press / Funders / Curious people:
*** Which jurisdictions conduct Tor censorship? (block pluggable transports/distribution methods)
*** How quickly do jurisdictions block bridges?
*** How many users/traffic (and which locations) did the blocked bridges serve?
**** Can be found out through extrainfo descriptors.
*** How well are Tor bridges doing in censorship circumvention?