Hi!
I forgot to add a fancy header like this last month, but I want to say "hi!" to everyone, and "welcome back to our monthly reports from the sysadmin team"! :)
Hopefully everyone can manage to stay safe in this crazier-than-usual holiday season!
**Agenda**
- Roll call: who's there and emergencies
- Roadmap review
- Triage rotation
- Holiday planning
- TPA survey review
- Other discussions
- New intern
- Next meeting
- Metrics of the month
# Roll call: who's there and emergencies
anarcat, hiro, gaba, no emergencies
The meeting took place on IRC because anarcat had too much noise.
# Roadmap review
Did a lot of cleanup in the dashboard:
https://gitlab.torproject.org/tpo/tpa/team/-/boards
In general, the following items were prioritized:
* [GitLab CI][]
* finish setting up the Cymru network, especially the [VPN][]
* [BTCpayserver][]
* [tor browser build boxes][]
* small tickets like the [git stuff][] and triage (see below)
[git stuff]: https://gitlab.torproject.org/tpo/tpa/team/-/boards?&label_name%5B%5D=Gi...
[tor browser build boxes]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/34122
[BTCpayserver]: https://bugs.torproject.org/tpo/tpa/team/33750
[VPN]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40097
[GitLab CI]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40095
The following items were punted to the future:
* SVN retirement (to January)
* password management (specs in January?)
* Puppet role account and verifications
We briefly discussed Grafana authentication, because of a request to [create a new account on grafana2][]. anarcat said the current model of managing the htpasswd file in Puppet doesn't scale so well because we need to go through this process every time we need to grant access (or do a password reset) and identified 3 alternative authentication mechanisms:
[create a new account on grafana2]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40102
1. htpasswd managed in Puppet (status quo)
2. Grafana users (disabling the htpasswd, basically)
3. LDAP authentication
The current authentication model was picked because we wanted to automate user creation in Puppet, and because it's hard to create users in Grafana from Puppet. When a new Grafana server is set up, there's a small window during which an attacker could create an admin account, which we were trying to counter. But maybe those concerns are moot now.
We also discussed password management but that will be worked on in January. We'll try to set a roadmap for 2021 in January, after the results of the survey have come in.
# Triage rotation
Hiro brought up the idea of rotating the triage work instead of always having the same person do it. Right now, anarcat looks at the board at the beginning of every week and deals with tickets in the "Open" column. Often, he just takes the easy tickets, drops them in ~Next, and does them. Other times, they end up in ~Backlog, get closed, or at least get a response of some sort.
We agreed to switch that responsibility every two weeks.
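For illustration only, here is a minimal sketch of how a two-week rotation could be computed. The member list (anarcat and hiro) and the start date are assumptions: the meeting did not decide who goes first or when the rotation begins.

```python
from datetime import date

# Assumption: only the two people named in the discussion take turns.
ROTATION = ["anarcat", "hiro"]
# Hypothetical start date (a Monday); the minutes do not specify one.
ROTATION_START = date(2020, 12, 7)
# "Every two weeks", as agreed in the meeting.
SHIFT_DAYS = 14


def triage_owner(day: date) -> str:
    """Return who is on triage duty on the given day."""
    if day < ROTATION_START:
        raise ValueError("rotation has not started yet")
    # Count how many full two-week shifts have elapsed since the start.
    shift_index = (day - ROTATION_START).days // SHIFT_DAYS
    return ROTATION[shift_index % len(ROTATION)]
```

Calling `triage_owner(date.today())` would then tell whoever opens the board whether the "Open" column is theirs to deal with this week.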
# Holiday planning
anarcat is off from the 14th to the 26th, hiro from the 30th to January 14th.
# TPA survey review
anarcat is [working on a survey][] to get information from our users to plan the 2021 roadmap.
[working on a survey]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40061
People like the survey in general, but the "services" questions were just too long. It was suggested to remove services TPA has nothing to do with (like websites or metrics stuff like check.tpo). But anarcat pointed out that we need to know which of those services are important: for example, right now we "just know" that check.tpo is important, but it would be nice to have hard data that confirms it.
Anarcat agreed to separate the table into teams so that it doesn't look that long and will submit the survey back for review again by the end of the week.
# Other discussions
## New intern
[MariaV][] just started as an Outreachy intern to work on the Anonymous Ticket System. She may be joining the `#tpo-admin` channel and may join the gitlab/tooling meetings.
Welcome MariaV!
[MariaV]: https://mviolante.com/
# Next meeting
Quick check-in on December 29th, same time.
# Metrics of the month
* hosts in Puppet: 79, LDAP: 82, Prometheus exporters: 133
* number of apache servers monitored: 28, hits per second: 205
* number of nginx servers: 2, hits per second: 3, hit ratio: 0.86
* number of self-hosted nameservers: 6, mail servers: 12
* pending upgrades: 1, reboots: 0
* average load: 0.34, memory available: 1.80 TiB/2.39 TiB, running processes: 481
* bytes sent: 245.34 MB/s, received: 139.99 MB/s
* [GitLab tickets][]: 129 issues including...
  * open: 0
  * icebox: 92
  * backlog: 20
  * next: 9
  * doing: 8
  * (closed: 2130)
[Gitlab tickets]: https://gitlab.torproject.org/tpo/tpa/team/-/boards
The upgrade prediction graph has been retired since it keeps predicting the upgrades will be finished in the past, which no one seems to have noticed from the last report (including me).
Metrics also available as the main Grafana dashboard. Head to https://grafana.torproject.org/, change the time period to 30 days, and wait a while for results to render.