Hello devs,
I'm seeking advice from people with experience in writing server-side Java applications.
Some background on this request: for the past five years, I have been developing server-side Java applications that all process large amounts of Tor directory data and provide their output via a web interface.
Examples:
- The metrics data processor (metrics-db) fetches Tor descriptors from the Tor directory authorities, the bridge authority, etc., performs some sanity checks, and provides descriptors by type as tarballs. We're talking about roughly 7 GiB of new bzip2-compressed data per month.
- The metrics website (metrics-web) uses the output from the metrics data processor, stuffs everything into a database, computes aggregates, and presents results in graphs and .csv files.
- The Onionoo service processes the same data from the metrics data processor, but provides statistics per Tor relay, not for the Tor network as a whole. The processing is done every two hours and may take 30 minutes to 1.5 hours, depending on how overloaded the server is.
- The ExoneraTor service, again, uses the same data and puts it in a database to answer whether a certain IP address has been a Tor relay at some point in the past.
That's what is done. And here's how it's done under the surface:
- There are one or more cron jobs, each of which starts an Ant task to process data. Some of these tasks import data into the database, others store results in the file system.
- Each application uses a web application deployed in Tomcat to provide results to web users. Most things are written in servlets, some use JSPs.
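To make the file-system variant a bit more concrete, here is a minimal sketch of the general shape of such a batch step. It is not code from any of the applications above: the class name DescriptorBatchStep, the one-subdirectory-per-descriptor-type input layout, and the CSV output are all assumptions made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical batch step of the kind a cron job could start via Ant. */
public class DescriptorBatchStep {

    /**
     * Counts regular files in each immediate subdirectory of inDir
     * (assumption: one subdirectory per descriptor type) and writes
     * the counts to outFile as CSV for a web application to serve.
     */
    public static Map<String, Long> run(Path inDir, Path outFile)
            throws IOException {
        // Collect the subdirectories first so the directory stream is closed early.
        List<Path> typeDirs;
        try (Stream<Path> entries = Files.list(inDir)) {
            typeDirs = entries.filter(Files::isDirectory)
                    .collect(Collectors.toList());
        }

        // TreeMap keeps the output sorted by type name.
        Map<String, Long> countsByType = new TreeMap<>();
        for (Path typeDir : typeDirs) {
            try (Stream<Path> files = Files.list(typeDir)) {
                long count = files.filter(Files::isRegularFile).count();
                countsByType.put(typeDir.getFileName().toString(), count);
            }
        }

        List<String> lines = new ArrayList<>();
        lines.add("type,descriptors");
        countsByType.forEach((type, count) -> lines.add(type + "," + count));

        // Write to a temporary sibling file, then move it into place, so a
        // reader is less likely to pick up a half-written result.
        Path tmpFile = outFile.resolveSibling(outFile.getFileName() + ".tmp");
        Files.write(tmpFile, lines, StandardCharsets.UTF_8);
        Files.move(tmpFile, outFile, StandardCopyOption.REPLACE_EXISTING);
        return countsByType;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DescriptorBatchStep <input-dir> <output-csv>
        run(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

In this sketch the only contract between the batch side and the web side is the output file, which is roughly how the file-system-based tasks hand their results over to the servlets.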
My problem is that this approach is rather fragile and difficult to set up for new volunteers. I'm aware of that, and I'd like to improve it.
My question is: what Java frameworks should I be looking at for the applications described above? Bonus points if something is in Debian stable.
Note that "switch to $some_other_programming_language" is not a very useful answer to me, at least not for the larger applications. There's just too much existing code and not enough developer time to port it.
Thanks in advance!
All the best, Karsten