-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi heroes,

Since I typed much of the below during our meeting in Nuremberg,
I'm also sending it here for the larger group.

Attendees:
  Lars Vogdt
  Markus 'darix' Rueckert
  Martin Caj
  Karol Babioch
  Ricardo Klein
  Christian Boltz
  Per Jessen
  Oliver Kurz
  Olaf Reinert
  Adam Majer
  Theo Chatzimichos
  Michael Stroeder
  Bernhard M. Wiedemann

Proposed Topics:
  salt
  widehat, download & co
  relation between heroes and SUSE IT - where is the boundary
  how to better structure IRC meetings?
  How to better use CaaSP and/or AWS for flexible infrastructure?
  priorities + ownership + leadership
  write more documentation

budget discussion by LVogdt+GeraldPfeiffer:
  wants plans in addition to ideas
  3 budget proposals wanted for widehat/rsync.o.o

Lars moved to Build Service Operators (Buildops)

download.o.o - darix working on mirrorbrain v3 ;
  There will be a prototype in 1-6 months

Lars proposed to review machines to find out if one has trouble
list from something like:
  host -t axfr opensuse.org 192.168.47.101

  101.o.o dead - was on cloudfoundry
  docs.o.o = activedoc - serving static files
  admin.o.o = progress
  amqp.o.o = publish messages from OBS - allows to react to events
  anna.o.o + elsa = proxy host
  api.o.o = OBS
  apparmor.o.o = redirect to gitlab page
  auth = login proxy for OBS
  beans = piwik analytics - enginfra
  bugfilesstage = bugzilla parts - enginfra
  bugs
  bugzilla
  carnia = unknown/down
  cdn.o.o = sponsored
  ci
  cn
  commons
  community
  conference ; events
  conncheck +-test = used by networkmanager to check for WLAN captive
    portal ; handled directly by haproxy
  connect = kill it!
    need something for membership management
  countdown
  counter
  discuss = discourse for testing ; to replace vbulletin
  download
  downloadcontent
  drpmsync = unused ; former part of download infra
  duke = NFS server for CaaSP
  education = dead
  elections = Helios voting system ; needs update
  etherpad
  events + -test
  fate + features = openFATE dead/zombie
  files = hosting wiki files
  fontinfo = from pgajdos ; microsite
  foreman
  forums +stage = vbulletin
  ftp = download.o.o
  ...
  influx
  irc = chat
  kernel

Areas of topics:
  download infrastructure (widehat, download, mirrorbrain)
  Mail (mailing lists, opensuse.org forwarding)
  machine management (salt)
  general infrastructure: DNS provided by freeIPA ; VPN : proxies
  different networks: one accessible by heroes = *.infra.o.o ;
    one net for OBS ; accessed through proxy forwarder
  Wiki - green ; hosted in Nuremberg ; cboltz cares for updates

On Generic Infrastructure
  Theo gave introduction on plans with CaaSP/Kubernetes
  some services (wordpress, wiki, redmine) could be hosted + managed
    externally. Need to ensure it does not affect SUSE CC/FIPS
    certifications
  containers can have state, if needed (e.g. postfix spool dir)
  we prefer self-managed for our services
  katacontainers can be used for more isolation
  form working groups so that people can continue whenever they want

darix introduction to download infra:
  OBS pushes to pontifex ; notifies repopusher about some dirs via
    touching files in a dir ; only works for /repositories
  users first hit downloadcontent.o.o , then widehat
  mirrorbrain database has too many write locks ; darix reworked the DB
    schema to be lock free - needs adaptation in apache mod_mirrorbrain
  https://github.com/openSUSE/mirrorbrain/wiki/Roadmap has more details
  postgresql DB force-vacuum is troublesome atm because of table-locks
    during updates.
  scanners contact mirrors via rsync or ftp to find what files are
    there and add this info to the database.
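To illustrate the scanner bookkeeping described above, here is a minimal sketch in plain Python. It is not mirrorbrain code: the input format (a mapping of mirror name to the file paths a scan reported), the mirror names and both function names are assumptions made up for this example. It only shows the idea of recording which mirror carries which file, so that a request can later be answered with a mirror list or fall back when no mirror is known.

```python
from collections import defaultdict


def build_file_index(listings):
    """Map each file path to the sorted list of mirrors that carry it.

    listings: dict of mirror name -> iterable of file paths, i.e. what a
    scan of that mirror reported (hypothetical input format).
    """
    index = defaultdict(set)
    for mirror, paths in listings.items():
        for path in paths:
            index[path].add(mirror)
    return {path: sorted(mirrors) for path, mirrors in index.items()}


def mirrors_for(index, path):
    """Return the mirrors known to have the file; [] means fall back."""
    return index.get(path, [])


# Hypothetical scan results for two mirrors.
listings = {
    "mirror-a.example": ["/distribution/x.iso", "/update/y.rpm"],
    "mirror-b.example": ["/distribution/x.iso"],
}
index = build_file_index(listings)
```

With this toy index, a file present on both mirrors yields both names, while a file no scan has seen yields an empty list, which is the case where a fallback such as downloadcontent would have to serve the request.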
more on download infra:
  new scanner code with sha256 or blake (sha3-candidate) for filenames
  darix will test-deploy a copy for download.suse.de
  to reduce repo publish delays: want more repo-push via rsync ;
    still need 2 stage publishing ; 1st push rpms etc without --delete ;
    2nd push repomd.xml ; avoid 404 for deleted files before scanner
    found out about the deletion
  mbscan -a blocks until the slowest mirror finished ; needs to be
    decoupled
  /repositories has many files that get written more often than read
  /distribution and /update are easy
  AWS traffic was expensive at $13000/month
  could ask cdn77 or others if they are interested
  Hetzner sponsoring a widehat for 6 months as stop-gap (jdsn will
    contact)
  30M files x 200 mirrors (103 for Leap 15.1 ; 88 for Leap 15.1
    updates ; 58 for tumbleweed)
  squid-like caching proxies are troublesome because of
    cache-invalidation and debian package rebuild counters not working
    yet.
  AI: bmwiedemann+darix : file dpkg bug + work with its maintainer ;
    when that is solved they could relieve downloadcontent
    fallback-mirror ; can also help against the "only mirror is in
    Kazakhstan"-problem where downloadcontent is dropped from served
    mirrorlist.

=====
containers topic:
We should evaluate / try to run as many services as possible with
kubernetes, either as VMs (via kubevirt, virtlet or
firecracker-containerd) or as containers (with KVM-level isolation
using katacontainers (already in Factory), gvisor, firecracker, runq,
qemu-microvm, or rancherVM)
  - use OBS to build openSUSE containers auto-published to
    registry.opensuse.org
  - listen to rabbitmq bus for publish events and then use kubernetes'
    rolling update. It will automatically roll back if the
    readiness-check fails (e.g. daemon does not start up)

Goal: easier management of services (console access) ; get enforced
network-isolation between services, so that hacking service1 does not
give higher access towards service2 than from outside. This would be
automatic with OpenStack security-groups and kubernetes.
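The rolling-update item in the containers notes above can be pictured with a small toy model. This is a sketch in plain Python and does not use or imitate the Kubernetes API: the function name, the list-of-versions representation and the readiness callable are all assumptions made for illustration. It replaces instances one at a time and restores the previous state as soon as one replacement fails its readiness check.

```python
def rolling_update(instances, new_version, is_ready):
    """Replace versions one instance at a time; roll back on failure.

    instances: list of version strings, one per running instance.
    is_ready:  callable(index, version) -> bool, a stand-in for a
               readiness check (hypothetical, e.g. "daemon started up").
    Returns (resulting_instances, succeeded).
    """
    previous = list(instances)   # snapshot used for the rollback
    current = list(instances)
    for i in range(len(current)):
        current[i] = new_version
        if not is_ready(i, new_version):
            # One replacement is not ready: restore the old state.
            return previous, False
    return current, True


# All replacements become ready: every instance ends up on "v2".
ok_result, ok = rolling_update(["v1", "v1", "v1"], "v2", lambda i, v: True)

# The second replacement never becomes ready: all instances stay on "v1".
bad_result, bad = rolling_update(["v1", "v1", "v1"], "v3", lambda i, v: i < 1)
```

The point of the sketch is only the ordering: no more than one instance is changed between readiness checks, so a broken build never takes down the whole service.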
TODO: find some machines to run tests on. Maybe something with
fibre-channel? Then do research. Try cilium for networking.

darix is going to write a blog-post about pets vs cattle across
HW, VM, Container space.
Hint: OBS has obs-worker nodes as cattle that come up clean after reset.
Even postfix is not stateless - it needs /var/cache/postfix

====
DNS breakout:
  goals: Independence, support DNSSEC + CAA records, maybe delegation
    for sub-teams to be able to update some entries without allowing
    them to break everything
  status quo: freeIPA queried via LDAP/bind for infra.o.o and pushed to
    MF DNS for external opensuse.org
  we discussed if we want to run our own public master and most agreed.
    Reason: more control+independence ; downside: no Anycast-IP for
    low-latency ; load is expected to be low at 800 requests per second
  there should be a hidden master DNS server
  which software to run? most prefer PowerDNS ; 2nd was bind
  we might do GeoIP-based responses in a subzone, so US users can get
    faster replies from download.geo.opensuse.org that points to one of
    multiple places (needs mirrorbrain without 18TB of rpms - just with
    database)
  we do not need FreeIPA anymore for DNS ; except for infra.o.o for
    kerberos and other integrations
  there is pdnsutil edit-zone (in pdns package) to edit entries as
    zone-files
  maybe run NSD in addition on the front, in case PowerDNS has
    exploitable bugs
  there is a web-frontend and letsEncrypt integration "lexicon"
  AppArmor profiles for DNS server confinement exist or will be created

=======
- - GOAL: all core services to be in salt
- - GOAL: automating our automation
- - GOAL: better monitoring/reporting
- - why not use more formulas?
    they were very promising, but the lack of package management and
    proper versioning makes them a pain to use. Theo suggests actually
    creating a directory "formulas" and copying the code there, and to
    stop depending on third party people.
- - why do we keep simple formulas that do eg only one package
    installation?
    eg git formula
    either because they could be improved, or to make your profile code
    simpler
- - why don't we auto-deploy?
    because of lack of reporting. The following solutions were proposed:
    - trigger the state.apply via the CI and look at the output on the
      gitlab CI output
    - trigger the state.apply via the CI or via a cronjob and send the
      output to the admin-auto@ ml
    - trigger the state.apply via the CI and send the output to
      monitoring (Martin and Ricardo will take a look at this)
- - the progress will be tracked in
    https://progress.opensuse.org/issues/59918
- - why is the haproxy config not in salt?
    because Christian didn't do it ... waiting for Theo to update salt
    to match the manually done changes in the keepalived config,
    making a highstate run break it again
- - why do the login proxies have inconsistent network configuration?
- - open MRs

====
Discussion + decision on how to communicate && collaborate:
  Use progress.o.o tickets to coordinate with our customers.
    Also use it for tracking work, so others can find the status and
    continue the work.
  ML is good for notification of (breaking) changes ; can also CC
    heroes@ and admin@
  Admin wiki is for long-term and general documentation
  ML is for documenting ongoing changes - should result in updated wiki
    pages
  link to tickets and/or ML-archive in git commit messages so that
    others can find "Why?" and context.
  https://en.opensuse.org/openSUSE:Heroes has general contact info
  IRC: When discussing things in IRC, if it is important for others,
    write it up for them - send to ML
  for monthly meetings: moderator to take note of who attends, ask them
    around
  attendees should come prepared - read the agenda, know what they want
    to tell others

  TODO: document this in admin wiki : Olaf
  TODO: List set of communication channels, and how they are used.
    - IRC, mailing list, tickets, wiki.
    - not discussed: news.opensuse.org?
      If unsure about newsworthiness, ask.
      Also, others can suggest newsworthiness in response to mail/IRC
      chat.

====
Budget discussion
  own cluster machines in NUE and Provo - like current DMZ cluster ;
    hosts SUSE production machines and openSUSE. separated with VLANs.
    Uses shared storage.
    Disadvantage: no self-service for non-employees.
  hard: Storage - space needs unknown. mcaj will find out current use.
  1st alternative: use a cloud ; need to discuss with SUSE to cover
    continuous costs ;
    https://www.ovh.de/public-cloud/prices/ 26EUR ;
    https://www.hetzner.de/cloud 3EUR ;
    https://azure.microsoft.com/en-us/pricing/calculator/ 6-104 USD ;
    https://calculator.s3.amazonaws.com/index.html 9-18USD
    (Theo's comment: take a look also at digitalocean, it is generally
    cheap and it even offers managed kubernetes)
  2nd : buy hardware in 1+ locations ; make serial console available -
    similar to slimhat
  3rd : rented server/colocation ; can also be at multiple locations ;
    console-access can be harder

backup
- - use restic

====
IAM / AE-DIR by mstroeder
  Centralized IAM (heroes, members?, Tools, mail users?)
  Services: Linux-Login, VPN, DNS-Admin, gitlab, WebApps, SSO
  Infra-SSO: SAMLv2 and/or OpenIDConnect
  Audit-logs (History)
  2FA
  Authorized keys (SSH) centrally managed!
    eventually temporary OpenSSH user certs
  Multiple user accounts per person
  Open: user name assignment
  GDPR considerations
  Deactivation / deletion policy

Variant 1: FreeIPA
  - outdated OS and FreeIPA
  - no openSUSE packages, no OBS updates
  + existing setup -> works for now (VPN, gitlab, let's encrypt, ...)
  + decent WebUI and CLI tools
  + provides DNS (CAA RRs provisioned separately)
  - issues with sssd (restarts needed)
  - No audit logs

Variant 2: AE-DIR
  Current status: https://progress.opensuse.org/issues/39872
  PoC installation (since 08/2018, updated to Leap 15.?)
  ansible-based installation, site-directory see gitlab
  Services: See client-examples/ -> gitlab.com
  no WebSSO for now
  Audit-logs: OpenLDAP accesslog overlay
  2FA: OATH-LDAP preferably with yubikeys
  Multi-account: aeUser -> aePerson
  Mail accounts / forwarding (membership-mgmt ; UI needed!)
  User name: default 4-rnd-letter
  GDPR: aeStatus(1) -> (2) deactivate -> archive
  Delegated administration
  aehostd: sssd replacement (salt state needed)

=====
Mail
  MX <- inbound MX opensuse.org
  Mailing-lists incl. archiving (searchable)
  Spam Protection @opensuse.org
  Mail Forwarding @opensuse.org including SRS -> (Membership Mgmt.)
  Security: MTA-STS (TLS-Enforcement), DMARC?, DANE, SPF ; for later
    work
  Mail Relay (outbound)
  No member mailboxes for now - can recommend mailbox.org free hosting

  ---

  Own MX inbound (Project Independence)
  Defer Mailing list changes (integrate existing list server)
  postfix!
  dovecot (IMAP) for tool addresses?
  rspamd (smtpd) - optional reject immediately ; user white-listing
  honeypot@opensuse.org (linked in Wiki, Lists, etc.) to train
    spam-filter
  GDPR considerations
  start with 2 MX in NBG
  separate inbound / outbound MTAs
-----BEGIN PGP SIGNATURE-----

iF0EARECAB0WIQRk4KvQEtfG32NHprVJNgs7HfuhZAUCXdboVAAKCRBJNgs7Hfuh
ZF4QAJwPXs66Hg3+FzPtsssxIOzUeiSZIwCg+VikpvHKCeIEKtjiNSeoq3alnfU=
=H5Dw
-----END PGP SIGNATURE-----