Re: [ooni-dev] Some ideas on the visualization of OONI data

18 Oct 2014


      ...
To be honest, at the time I wrote the email I knew that MongoDB
provided sharding, which should provide horizontal scaling, but I
didn't know how it worked in MongoDB because I hadn't had time to
dig through the docs. Now, after learning a bit more about MongoDB, I
still don't know if I know :) but I agree with you, this is not
distributed.
Ah, interesting: MongoDB has built-in sharding:
http://docs.mongodb.org/manual/core/sharding-introduction/
Perhaps you are correct that MongoDB would scale well.
However, we have to carefully evaluate several more criteria before
choosing a data store. For instance, operational costs should always be evaluated:
Is it a pain to set up? (Sharded MongoDB seems heavyweight!)
Is it a pain to add a new replica to a replica set?
How are additional shards added?
Does rebalancing the cluster after adding shards kill performance and take a long time? (Most likely yes.)
...
So, I think we should index the reports to provide a query API. This
still applies, but should we build a distributed datastore that will
fit with every deployed collector, or a central repository that grabs
the reports from the collectors and indexes them? Should we care at all?
Yes... "indexed" reports sound much easier to work with than just the reports...
However, it is not yet clear that we really need the datastore to be distributed.
High, or mostly high, availability might be a requirement for this project.
That is much easier to accomplish!
OK... so if we go with one of these CF (column family) data stores, then we must keep in mind
the types of queries we will need when creating the schema. Another possibility would be Redis.
It supports set-theoretic operations (union, intersection, difference) on its set type... Also, I've heard good things about CouchDB.
I think we should look at these different datastore possibilities and discuss potential schema
and query design for our project. I suspect a discussion of schema and query patterns will be more useful
than discussing operational properties, especially if a centralized datastore is good enough.
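
To make the schema and query discussion a bit more concrete, here is a minimal sketch of what a set-based index over reports might look like. It uses plain in-memory Python sets rather than any particular datastore, and the field names (probe_cc, test_name) and report ids are assumptions picked only for illustration, not a proposed schema. The idea is that each (field, value) pair maps to the set of report ids carrying it, so a query becomes a set intersection, much like intersecting sets in Redis.

```python
from collections import defaultdict


class ReportIndex:
    """In-memory inverted index: (field, value) -> set of report ids."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, report_id, fields):
        # Record the report id under every (field, value) pair it carries.
        for field, value in fields.items():
            self._index[(field, value)].add(report_id)

    def query(self, **criteria):
        """Return the ids of reports matching every field=value criterion."""
        sets = [self._index.get((f, v), set()) for f, v in criteria.items()]
        if not sets:
            return set()
        return set.intersection(*sets)


# Hypothetical reports; ids and field values are made up for the example.
index = ReportIndex()
index.add("r1", {"probe_cc": "IR", "test_name": "http_requests"})
index.add("r2", {"probe_cc": "IR", "test_name": "dns_consistency"})
index.add("r3", {"probe_cc": "CN", "test_name": "http_requests"})

matches = index.query(probe_cc="IR", test_name="http_requests")
```

Whatever datastore we pick, the schema would need to answer the same kind of question: which fields do we filter on, and can the store combine those filters cheaply?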
cheers,
david
