[heroes] Short status update
Hi,

just in case you wonder what happened during the last weeks...

1) DNS

Our FreeIPA instance is finally the one and only official instance for
* opensuse.org
* opensuse.de
* opensuse.fr

The connection to the MF-IT network and servers, as mentioned at
https://progress.opensuse.org/projects/opensuse-admin-wiki/wiki/DNS
is history. Only the DNS servers inside the heroes network are left and do what good DNS servers do: just work. Each server is answering ~15 requests per second, according to the statistics.

==> I just want to wait until next month before I create an announcement that "openSUSE is now in Heroes' hands" - it might be that I need some help with the text and the merge request... ;-)

2) Monitoring

I started with
https://monitor.opensuse.org/icingaweb2/
and connected it directly to the LDAP server. This means: people in the "monitoring-admins" group have full access, people in "monitoring-user" have normal rights.

But please note that there are currently just 2 hosts monitored. The reason: the old icinga setup uses check_mk for autodiscovery and autogeneration of most checks. This is no longer possible with icinga2. I want to use Salt as a replacement.

Instead of "pnp4nagios", the new icinga2 uses Grafana:
https://monitor.opensuse.org/grafana/d/YyV2BduWk/base-metrics?orgId=1&refresh=30s

It's the same instance which also provides some graphs for:
* our elasticsearch cluster for the wikis
* the Galera cluster
* the PostgreSQL cluster
=> https://monitor.opensuse.org/grafana/

This instance is also connected to our LDAP. Data sources are currently InfluxDB (for icinga2), PostgreSQL and Prometheus (for Galera).
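On the plan to replace the check_mk autogeneration: the following is only a minimal sketch of what "generate the icinga2 host definitions from an inventory" could look like. It is plain Python, not Salt, and not the setup used on monitor.opensuse.org. The host names, the addresses (taken from the 192.0.2.0/24 documentation range), the custom variables and the function names are all hypothetical. The only thing borrowed from icinga2 itself is the "object Host" syntax and the "generic-host" template from its sample configuration.

```python
from typing import Dict, List


def render_host(name: str, address: str, host_vars: Dict[str, str]) -> str:
    """Render one icinga2 Host object as configuration text.

    Assumption: names and values contain no double quotes, so no
    escaping is done here.
    """
    lines = [
        f'object Host "{name}" {{',
        '  import "generic-host"',
        f'  address = "{address}"',
    ]
    # Sort the custom variables so the generated file is stable between runs.
    for key in sorted(host_vars):
        lines.append(f'  vars.{key} = "{host_vars[key]}"')
    lines.append("}")
    return "\n".join(lines)


def render_inventory(inventory: List[dict]) -> str:
    """Render all hosts of a (hypothetical) inventory, separated by blank lines."""
    return "\n\n".join(
        render_host(host["name"], host["address"], host.get("vars", {}))
        for host in inventory
    )


if __name__ == "__main__":
    # Hypothetical inventory; in a Salt based setup this data would
    # come from grains or pillar instead of a hard-coded list.
    example_inventory = [
        {"name": "host1.example.org", "address": "192.0.2.10",
         "vars": {"os": "Linux", "role": "dns"}},
        {"name": "host2.example.org", "address": "192.0.2.11"},
    ]
    print(render_inventory(example_inventory))
```

The output could be written to a file below the icinga2 configuration directory and checked with the usual config validation before a reload. A real Salt state would more likely use a Jinja template, but the data flow (inventory in, Host objects out) stays the same.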

Instead of just storing the logs of our hosts on a hard drive and waiting to see whether someone wants to have a look at them (which honestly more or less never happens), I decided to move forward and installed Graylog here:
https://graylog.opensuse.org/

As - for example -
https://graylog.opensuse.org/dashboards/5e6ea77657c155111a8fbd37
shows, this makes it much easier to get an overview of "what's currently going on" on our machines. The filtering and search functionality is IMHO way easier to use than the alternative ELK stack (which is also harder to maintain as a package). I did not set up big dashboards yet, in the hope that some other volunteer steps in :-)

Ah: it goes without saying that Graylog also uses our LDAP, and normally all monitoring-admins are also Graylog admins. But - as with Grafana - if you want more than your original rights, just ping me.

3) Support

I hope I did not forget or overlook too many requests for additional resources or machines. Open (incl. "feedback") progress tickets are down to 142[1].

4) Mirrors

Over the last weeks, I tried to reduce the amount of messages sent to the admin-auto mailing list from provo-mirror.opensuse.org, olaf (our scanner) and pontifex (aka download.o.o). This included some (small) fixes in some packages (did I tell anyone that I don't know python?). Overall just minor stuff. But if you wondered, you now know whom to ask.

5) CaaSP

As nobody had been actively maintaining the CaaSP cluster any longer (for about half a year), I asked in
https://progress.opensuse.org/issues/54977
for someone who wants to step in - but got no feedback. As nobody was logging in to the machines either, and their content (with one exception, see below) had already been migrated to other machines, I shut down the machines on the 1st of March.

The only real issue since then is that we lost our docker containers with the gitlab runners. I'm sorry for that, but these runners were outdated (42.3, anyone?) anyway.
I discussed this with Ricardo (our Gitlab Guru) and we decided to set up two independent machines just to host the runners (via docker containers) in the future (gitlab-runner{1,2}.infra.opensuse.org). This is the next TODO on the list - I'm currently just waiting for a time slot from Ricardo to finalize the setup.

---

There is probably a bit more - but my brain is currently blocking my memory. I just wanted to give you a short status summary of the topics I still remember.

Stay healthy!

Lars

--
[1]: https://monitor.opensuse.org/pnp4nagios/graph?host=redmine.infra.opensuse.org&srv=Heroes_tickets&start=1573631689&end=1584559999