Hi!
It feels so strange to say that this year around, but... happy new year everyone! Let's hope we can do better this time around. ;)
Here's your first sysadmin report for 2021; hopefully we'll keep you steadily informed of our progress in the coming year. Right now we're working on the roadmap and, even though we asked you for feedback in the user survey, there's still time to steer us in the right direction. We have a meeting coming up where we're likely to set that more in stone, so now is a good time to speak up if you forgot to respond to the survey...
Now onto the minutes.
Agenda:
- Roll call: who's there and emergencies
- Dashboard review
- Roadmap 2021 proposal
  - 2020 retrospective
  - Services survey
  - Goals for 2021
- Other discussions
- Next meeting
- Metrics of the month
# Roll call: who's there and emergencies
present: hiro, gaba, anarcat
[GitLab backups are broken][]: the backups might need more disk space than we have. just bump disk space in the short term, and consider changing the backup system in the long term.
[GitLab backups are broken]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40143
# Dashboard review
We [reviewed the dashboard][]: there is too much stuff in January, but we'll review it again in February.
[reviewed the dashboard]: https://gitlab.torproject.org/tpo/tpa/team/-/boards
# Roadmap 2021 proposal
We discussed the [roadmap project][] anarcat worked on. We reviewed the 2020 retrospective, talked about the services survey, and discussed goals for 2021.
[roadmap project]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/roadmap/2021
## 2020 retrospective
We reviewed and discussed the [2020 roadmap evaluation][] that anarcat prepared:
[2020 roadmap evaluation]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/roadmap/2021#2020-roadmap...
* **what worked?** we did the "need to have" even through the apocalypse, staff reduction and all the craziness of 2020! success!
* **what was a challenge?**
  * monthly tracking was not practical, and hard to do in Trac. things are a lot easier with GitLab's dashboard.
  * it was hard to work through the pandemic.
* **what can we change?**
  * do quarterly-based planning
  * estimates were off because so many things happened that we did not expect. reserve time for the unexpected, reduce expectations.
  * ticket triage is rotated now.
## Services survey
We discussed the [survey results analysis][] briefly, and how it is used as a basis for the roadmap brainstorm. The two major services people use are GitLab and email, and those will be the focus of the roadmap for the coming year.
[survey results analysis]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/roadmap/2021#survey-resul...
## Goals for 2021
* email services stabilisation ("submission server", "my email ends up in spam", CiviCRM bounce handling, etc) - consider [outsourcing email services][]
* gitlab migration continues (Jenkins, gitolite)
* simplify / improve puppet code base
* stabilise services (e.g. gitlab, schleuder)
[outsourcing email services]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/submission#cost
Next steps for the roadmap:
* try to make estimates
* add need to have, nice to have
* anarcat will work on a draft based on the brainstorm
* we meet again in one week to discuss it
# Other discussions
Postponed: the discussion of which metrics services to maintain, until we hire a new person.
# Next meeting
Same time, next week.
# Metrics of the month
Fun fact: we crossed the 2TiB mark for total available memory back in November 2020, almost double the figure from the previous report (in July), even though the number of hosts in Puppet remained mostly constant (78 vs 72). This is due (among other things) to the new Cymru Ganeti cluster, which added a whopping 1.2TiB of memory to our infrastructure!
* hosts in Puppet: 82, LDAP: 85, Prometheus exporters: 134
* number of Apache servers monitored: 27, hits per second: 198
* number of Nginx servers: 2, hits per second: 3, hit ratio: 0.86
* number of self-hosted nameservers: 6, mail servers: 12
* pending upgrades: 3, reboots: 0
* average load: 0.29, memory available: 2.00 TiB/2.61 TiB, running processes: 512
* bytes sent: 265.07 MB/s, received: 155.20 MB/s
* [GitLab tickets][]: 113 tickets including...
  * open: 0
  * icebox: 91
  * backlog: 20
  * next: 12
  * doing: 10
  * (closed: 2165)
[GitLab tickets]: https://gitlab.torproject.org/tpo/tpa/team/-/boards
These metrics are now also available as the main Grafana dashboard. Head to https://grafana.torproject.org/, change the time period to 30 days, and wait a while for results to render.