Hello devs,
I'm seeking advice from people with experience in writing server-side
Java applications.
Let me give you some background about this request: for the past five
years, I have been developing server-side Java applications which all
process large amounts of Tor directory data and provide their output via
a web interface.
Examples:
- The metrics data processor (metrics-db) fetches Tor descriptors from
the Tor directory authorities, the bridge authority, etc., performs some
sanity checks, and provides descriptors by type as tarballs. We're
talking about roughly 7 GiB of new bzip2-compressed data per month.
- The metrics website (metrics-web) uses the output from the metrics
data processor, stuffs everything into a database, computes aggregates,
and presents results in graphs and .csv files.
- The Onionoo service processes the same data from the metrics data
processor, but provides statistics per Tor relay, not for the Tor
network as a whole. The processing is done every two hours and may take
30 minutes to 1.5 hours, depending on how overloaded the server is.
- The ExoneraTor service, again, uses the same data and puts it in a
database to answer whether a certain IP address has been a Tor relay at
some point in the past.
That's what is done. And here's how it's done under the surface:
- There are one or more cron jobs, each of which starts an Ant task to
process data. Some of these tasks import data into the database, others
store results in the file system.
- Each application uses a web application deployed in Tomcat to provide
results to web users. Most things are written in servlets, some use JSPs.
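For illustration only, and not taken from any of the applications above:
a minimal sketch of how one periodic processing run could be scheduled
inside a long-running Java process rather than by cron starting an ant
task. The class name PeriodicProcessor and the job body are hypothetical;
the only API used is java.util.concurrent from the JDK.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: runs a processing job periodically in-process. */
public class PeriodicProcessor {

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final AtomicInteger completedRuns = new AtomicInteger();
  private final Runnable job;

  public PeriodicProcessor(Runnable job) {
    this.job = job;
  }

  /** Runs the job right away, then waits `delay` after each run ends. */
  public void start(long delay, TimeUnit unit) {
    scheduler.scheduleWithFixedDelay(() -> {
      try {
        job.run();
        completedRuns.incrementAndGet();
      } catch (RuntimeException e) {
        // An uncaught exception would cancel all future runs, so log it
        // here and let the next scheduled run proceed.
        System.err.println("Processing run failed: " + e);
      }
    }, 0, delay, unit);
  }

  /** Number of runs that finished without throwing. */
  public int completedRuns() {
    return completedRuns.get();
  }

  /** Stops scheduling new runs and waits for a run in progress to end. */
  public void stop() throws InterruptedException {
    scheduler.shutdown();
    scheduler.awaitTermination(1, TimeUnit.MINUTES);
  }

  public static void main(String[] args) throws InterruptedException {
    // Demo only: a short delay and a placeholder job. A real deployment
    // would use something like start(2, TimeUnit.HOURS).
    CountDownLatch threeRuns = new CountDownLatch(3);
    PeriodicProcessor processor = new PeriodicProcessor(() -> {
      System.out.println("Processing descriptors (placeholder)");
      threeRuns.countDown();
    });
    processor.start(100, TimeUnit.MILLISECONDS);
    threeRuns.await();
    processor.stop();
    System.out.println("Completed runs: " + processor.completedRuns());
  }
}
```

The fixed delay is measured from the end of one run to the start of the
next, so a run that takes longer than expected can never overlap with the
following one.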
My problem is that this approach is rather fragile and difficult to
set up for new volunteers. I'm aware of that, and I'd like to improve it.
My question is: what Java frameworks should I be looking at for the
applications described above? Bonus points if something is in Debian
stable.
Note that "switch to $some_other_programming_language" is not a very
useful answer to me, at least not for the larger applications. There's
just too much existing code and not enough developer time to port it.
Thanks in advance!
All the best,
Karsten