Hi all,
As an action from today's call, here's a collection of the most pertinent M-Lab docs. If you're looking for something that doesn't seem to be here, or if you have questions, don't hesitate to ask Stephen and me.
Cheers, Meredith
*Documents:*
- Platform architecture doc http://measurementlab.net/sites/default/files/HowtoContributetoM-LabServerInfrastructure.pdf. The intended audience is hosting partners (who donate space/power/connectivity/hardware).
- Policies and procedures for approval http://measurementlab.net/sites/default/files/SubmissionguidelinesforM-Labexperiments.pdf of new experiments. Note that I wrote this, and that it may not be specific enough, but I can answer any questions. Note that OONI has been approved as meeting these criteria per the Steering Committee process.
- Server map http://measurementlab.net/mlab_sites. Get a look at the global M-Lab footprint.
- List of existing M-Lab http://measurementlab.net/measurement-lab-tools measurement tools. Code repos and documentation for individual tools should be linked from here. If they're not, let me know and I'll add this.
- Access existing M-Lab data http://measurementlab.net/data, tarballs and BigQuery.
- Blog post describing the process to visualize data using BigQuery http://dmadev.com/2012/11/19/visualizing-m-lab-data-with-bigquery/, a good introduction both to the tool and the data structure. If you'd like whitelisted access to M-Lab data on BigQuery, let me know and I'll make it happen.
- Founding document http://measurementlab.net/sites/default/files/mlab_intro_and_server_requirements.pdf, which will give you background on M-Lab's founding motivations. Many of the facts quoted here have changed (e.g. number of servers), but as history this is instructive.
- Existing visualizations http://measurementlab.net/visualization, which are fun and may be instructive.
Also requested was a description of typical package deployment and experiment management on M-Lab.
Typical package management follows these basic steps:
1) build rpm package from source repos in http://github.com/m-lab-tools/
2) copy and sign rpm package to M-Lab yum repository
Now per-machine, the slice (i.e. experiment VM) is instantiated:
3) upon slice VM creation on an M-Lab server, the first phase of initialization bootstraps the yum configuration of the VM's filesystem (i.e. /etc/yum.conf and the /etc/yum.slice.d/slice.repo list). These import the public signing key and point to CentOS mirrors and the M-Lab slice package repository.
4) the second phase of slice initialization then tries to install the slice package, i.e. yum install mlab_ooni
5) on success, the service starts; on failure, initialization stops.
6) M-Lab uses external monitoring to identify failed services and inform a directory service like mlab-ns of available servers.
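Steps 4 and 5 could be sketched roughly as follows. This is only an illustration, not the actual M-Lab initialization code: the function names, the injectable runner, and the service name passed in are all assumptions made for the example.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical runner type: takes an argv list and returns the exit code.
Runner = Callable[[Sequence[str]], int]


def run(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.call(list(argv))


def initialize_slice(package: str, service: str, runner: Runner = run) -> str:
    """Second phase of slice initialization: install, then start or stop."""
    # Step 4: try to install the slice package from the yum repository.
    if runner(["yum", "install", "-y", package]) != 0:
        return "stopped"  # Step 5: on failure, stop.
    # Step 5: on success, start the service (service name is an assumption).
    if runner(["service", service, "start"]) != 0:
        return "stopped"
    return "running"
```

Passing a fake runner makes it possible to exercise the logic without touching yum, e.g. `initialize_slice("mlab_ooni", "oonib", lambda argv: 0)`.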
Since the rpm packages are effectively public, they do not include any private information. Some experiments have conditional deployment, e.g. if site==nuq01, then run an additional service. The initialization scripts for a package could also generate per-machine key pairs.
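As a rough sketch of what conditional deployment and per-machine key generation might look like in an initialization script: the service names, the extra-service table, and the assumed hostname layout (site code as the second label, as in mlab1.nuq01.measurement-lab.org) are illustrative assumptions, not the real package scripts.

```python
from typing import Dict, List

# Hypothetical service names, for illustration only.
BASE_SERVICES: List[str] = ["oonib"]
EXTRA_SERVICES: Dict[str, List[str]] = {"nuq01": ["extra-helper"]}


def site_of(hostname: str) -> str:
    """Return the site code, assuming it is the second label of the hostname."""
    parts = hostname.split(".")
    return parts[1] if len(parts) > 1 else ""


def services_for(hostname: str) -> List[str]:
    """Base services everywhere, plus extras where site matches (site==nuq01)."""
    return BASE_SERVICES + EXTRA_SERVICES.get(site_of(hostname), [])


def keygen_command(key_path: str) -> List[str]:
    """Build an openssl command that generates a per-machine RSA private key."""
    return ["openssl", "genrsa", "-out", key_path, "2048"]
```

Because the key is generated on the machine at initialization time, nothing private needs to ship inside the public rpm.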
The scenario I remember us talking about in Berlin involved M-Lab harvesting the public part of the generated slice keys, or the onion URLs from the ooni slice, and publishing them in some way, through mlab-ns or another service like the ooni bouncer.
Does that sound workable for oonib?
Best, Stephen
On 08/30/2013 04:50 PM, Meredith Whittaker wrote:
--
Meredith Whittaker Program Manager, Google Research Google NYC
Thank you very much for such a detailed answer.
This is exactly what we were looking for.
More replies inline:
On Sep 3, 2013, at 8:34 PM, Stephen Soltesz soltesz@opentechinstitute.org wrote:
Also requested was a description of typical package deployment and experiment management on M-Lab.
Typical package management follows these basic steps:
- build rpm package from source repos in http://github.com/m-lab-tools/
- copy and sign rpm package to M-Lab yum repository
Now per-machine, the slice (i.e. experiment VM) is instantiated:
- upon slice vm creation on an M-lab server, the first-phase of initialization includes bootstrapping the yum configuration of the vm's filesystem (i.e. /etc/yum.conf and /etc/yum.slice.d/slice.repo list). These import the public signing key, and point to CentOS mirrors and the M-Lab slice package repository.
- the second phase of slice initialization then tries to install the slice package: i.e. yum install mlab_ooni
- on success, the service starts. on failure, stop.
- M-Lab uses external monitoring to identify failed services and inform a directory service like mlab-ns of available servers.
Can we extend this to inform a different service that is not mlab-ns? We have concluded that it's probably ideal not to rely too heavily on M-Lab for the discovery of collectors and helpers.
Ideally, we would be able to make an HTTP POST request over Tor to a separate service to update the list of currently running collectors and helpers, as detailed in this ticket.
In light of this data we have devised this possible strategy for learning about new collector and test helpers: https://github.com/TheTorProject/ooni-probe/issues/183.
Would it be a problem to install tor, torsocks and curl on these machines and add a script to push to such a service?
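A minimal sketch of such a push script, assuming tor, torsocks and curl are installed on the machine. The JSON payload layout, the function names, and the service URL are hypothetical; the real format would be whatever the ticket above settles on.

```python
import json
import subprocess
from typing import List


def build_update_command(service_url: str,
                         collectors: List[str],
                         helpers: List[str]) -> List[str]:
    """Build a torsocks-wrapped curl command that POSTs the current lists."""
    # Hypothetical payload layout: two lists of addresses.
    payload = json.dumps(
        {"collectors": collectors, "test_helpers": helpers}, sort_keys=True
    )
    return [
        "torsocks", "curl", "--silent", "--fail",
        "--request", "POST",
        "--header", "Content-Type: application/json",
        "--data", payload,
        service_url,
    ]


def push_update(service_url: str,
                collectors: List[str],
                helpers: List[str]) -> bool:
    """Run the command; True if curl exited successfully."""
    cmd = build_update_command(service_url, collectors, helpers)
    return subprocess.call(cmd) == 0
```

Keeping the command construction separate from its execution means the monitoring scripts could log or dry-run the request before anything is sent over Tor.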
Does this monitoring service also routinely check the health of the nodes?
Since the rpm packages are effectively public, they do not include any private information. Some experiments have conditional deployment. i.e. if site==nuq01, then run an additional service. As well, the initialize scripts for a package could generate per-machine key pairs.
The reason for this is now clear to me. If we can add some extra commands to the monitoring scripts, that would be sufficient for us.
Would it be possible to provision the monitoring infrastructure with private key material?
Thanks again for your thorough reply.
~ Art.