[heroes] openSUSE server migration
Hi @ll

At the moment, most machines hosting the openSUSE infrastructure in Nuremberg are running in a co-location network together with services from other teams/companies (like SUSE, openAttic, KDE, ...).

With the new openSUSE Heroes team, SUSE-IT wants to split up the network into pieces to allow the heroes to get access to nearly everything they need for their work. The split might also be used to clarify the responsibility for some services and make the openSUSE setup a bit easier and clearer.

But this network split comes with some migration downtime. For most of the machines, a simple "shutdown", "put into the other network", "boot" should be enough (so at most around 5-10 minutes per machine). But some machines - especially those that are providing their service to others in this network - are a bit trickier...

I started http://etherpad.opensuse.org/p/Server_migration to collect the affected machines together with some notes and want to ask you:

1) Can someone from the SUSE-IT Heroes join the work?
   => All machines need a DNS change (new external IP) during that time, so we should minimize the refresh time, if not already done.
   => The haproxy setup on the old and new instances needs to be adapted during the migration.
   => We need to clarify the database usage and split up new DB servers, if needed.
2) When would be the best time to start with the migration?
3) What did I miss?

At the moment, my plan #1 is to do the following this Thursday:
* migrate freeIPA, chip, mickey and minnie to the new network
* follow with other, easier to migrate machines like keyserver, icc, hackweek, ... as time permits
* check the mysql and postgresql servers for their running databases and plan the split

Regards,
Lars

-- 
To unsubscribe, e-mail: heroes+unsubscribe@opensuse.org
To contact the owner, e-mail: heroes+owner@opensuse.org
It might be enough to just do brctl addif/delif + patch the XML file. darix -- openSUSE - SUSE Linux is my linux openSUSE is good for you www.opensuse.org
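A minimal sketch of the "patch the XML file" part, assuming the file is a libvirt-style domain definition in which a bridged interface carries a `<source bridge='...'/>` element. The thread does not say which XML format is meant, so the sample document, the bridge names `br-old`/`br-new`, the VM name and the helper `repoint_bridge` are all hypothetical.

```python
import xml.etree.ElementTree as ET


def repoint_bridge(domain_xml: str, old_bridge: str, new_bridge: str) -> str:
    """Return the XML with every bridged interface moved from old_bridge to new_bridge."""
    root = ET.fromstring(domain_xml)
    # Only touch <source> elements that sit below a bridged <interface>.
    for source in root.findall("./devices/interface[@type='bridge']/source"):
        if source.get("bridge") == old_bridge:
            source.set("bridge", new_bridge)
    return ET.tostring(root, encoding="unicode")


# Hypothetical VM definition, reduced to the part that matters here.
EXAMPLE_XML = """<domain type='kvm'>
  <name>example-vm</name>
  <devices>
    <interface type='bridge'>
      <source bridge='br-old'/>
      <target dev='vnet3'/>
    </interface>
  </devices>
</domain>"""

patched = repoint_bridge(EXAMPLE_XML, "br-old", "br-new")
print(patched)
```

Parsing the XML instead of doing a blind text substitution keeps the change limited to the interface definition, which may matter if the old bridge name also appears elsewhere in the file.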
On 3 July 2017 19:01:33 CEST, "Marcus Rückert" <mrueckert@suse.de> wrote:
it might be enough to just do brctl addif/delif + patch the XML file.
Right. This is what I call the easy hosts. But some machines (like freeipa or the db-clusters) are a bit more complicated, as other machines depend on them, especially if those other machines should stay in the old network. Regards, Lars
On Tue, 2017-07-04 at 05:19 +0000, Lars Vogdt wrote:
On 3 July 2017 19:01:33 CEST, "Marcus Rückert" <mrueckert@suse.de> wrote:
it might be enough to just do brctl addif/delif + patch the XML file.
Right. This is what I call the easy hosts.
But some machines (like freeipa or the db-clusters) are a bit more complicated: as other machines depend on them. Especially, if the other machines should stay in the old network.
For those special machines we could even remove "direct access" and move them into a special vlan only reachable via proxy. -- openSUSE - SUSE Linux is my linux openSUSE is good for you www.opensuse.org
On Mon, Jul 03, 2017 at 06:54:17PM +0200, Lars Vogdt wrote:
Hi @ll
Hello,
At the moment, most machines hosting the openSUSE infrastructure in Nuremberg are running in a co-location network together with other services from other teams/companies (like SUSE, openAttic, KDE, ...)
With the new openSUSE Heroes team, SUSE-IT wants to split up the network into pieces to allow the heroes to get access to nearly everything they need for their work. The split might also be used to clarify the responsibility for some services and make the openSUSE setup a bit easier and clearer.
There are a bunch of openSUSE machines that the heroes team has access to, and there are others where the heroes don't have access (they are managed by SUSE-IT or buildops). According to the saltmaster, these machines are:
- the atreju and cirrus hypervisors
- all the OBS related machines
- login.o.o
- the mysql and postgresql database clusters (as there are databases for SUSE related services, like fate.suse.com or hackweek.suse.com)
- smt-internal. Apart from being the SMT server, it is also the backup server, and there is something else running there that I don't recall atm.
- the NTP servers
- pontifex3 for mirrors

So my suggestion would be to take the above case by case. E.g. the db clusters shouldn't be migrated; instead I would suggest creating a new cluster in the openSUSE network and migrating only the relevant dbs.
But this network split comes with some migration downtime. For most of the machines, a simple "shutdown", "put into the other network", "boot" should be enough (so something around max. 5-10 Minutes per machine). But some machines - especially those that are providing their service to others in this network - are a bit trickier...
I started http://etherpad.opensuse.org/p/Server_migration to collect the affected machines together with some notes and want to ask you:
I added some more info there already
1) Can someone from the SUSE-IT Heroes join the work? => All machines need a DNS change (new external IP) during the time, so we should minimize the refresh time, if not already done. => the haproxy setup on the old and new instances needs to be adapted during the migration. => we need to clarify the database usage and split up new DB servers, if needed
Count me in
2) When will be the best time when we can start with the migration ?
3) What did I miss?
At the moment, my plan number #1 is to do the following on this Thursday: * migrate freeIPA, chip, mickey and minnie to the new network * follow with other, easier to migrate machines like keyserver, icc, hackweek, ... as time permits
Hackweek is a SUSE service, I don't see a reason to put it in the opensuse network.
* check the mysql and postgresql servers for their running databases and plan the split
Minnie (saltmaster) depends on mickey (gitlab), which in turn depends on postgresql. So, as mentioned before, I would suggest first to start by creating a new set of db clusters at the new network. -- Theo Chatzimichos <tampakrap@opensuse.org> <tchatzimichos@suse.com> System Administrator SUSE Operations and Services Team
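The dependency chain described above (minnie needs mickey, mickey needs postgresql) can be turned into a migration order mechanically. The following is only a sketch: it uses the three hosts named in this mail, and the real dependency map may well contain more entries than are listed here.

```python
from graphlib import TopologicalSorter

# Each key depends on the services in its value set.
# Only the dependencies mentioned in this thread are included.
depends_on = {
    "minnie": {"mickey"},        # saltmaster needs gitlab
    "mickey": {"postgresql"},    # gitlab needs the database
    "postgresql": set(),
}

# static_order() yields dependencies before the hosts that need them.
migration_order = list(TopologicalSorter(depends_on).static_order())
print(migration_order)
```

With this input the database comes first, which matches the suggestion to start by creating the new db clusters in the new network.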
Lars Vogdt wrote:
2) When will be the best time when we can start with the migration ?
Given that Leap 42.3 will be released July 26th the infrastructure needs to work in that week and the following four ones at least. So release relevant hosts shouldn't be touched before September please. cu Ludwig -- (o_ Ludwig Nussel //\ V_/_ http://www.suse.com/ SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
On Tue, 4 Jul 2017 13:59:18 +0200, Ludwig Nussel <lnussel@suse.de> wrote:
Lars Vogdt wrote:
2) When will be the best time when we can start with the migration ?
Given that Leap 42.3 will be released July 26th the infrastructure needs to work in that week and the following four ones at least. So release relevant hosts shouldn't be touched before September please.
I'm sorry to bother, but what *are* "release relevant hosts"? Could you provide a list of services or DNS names for us? Regards, Lars
Lars Vogdt wrote:
On Tue, 4 Jul 2017 13:59:18 +0200, Ludwig Nussel <lnussel@suse.de> wrote:
Lars Vogdt wrote:
2) When will be the best time when we can start with the migration ?
Given that Leap 42.3 will be released July 26th the infrastructure needs to work in that week and the following four ones at least. So release relevant hosts shouldn't be touched before September please.
I'm sorry to bother, but what *are* "release relevant hosts"?
Could you provide a list of services or DNS names for us?
Well, anything that would give a bad impression to people trying to get 42.3. You know better how the infrastructure is connected and which hosts you intend to touch. Off the bat I'd say things like obs, wiki, landing page, software.o.o, doc, countdown, download, beans, shop etc but also bugzilla shouldn't break. cu Ludwig -- (o_ Ludwig Nussel //\ V_/_ http://www.suse.com/ SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
On Tue, 4 Jul 2017 14:26:01 +0200, Ludwig Nussel <lnussel@suse.de> wrote:
Could you provide a list of services or DNS names for us?
Well, anything that would give a bad impression to people trying to get 42.3. You know better how the infrastructure is connected and which hosts you intend to touch. Off the bat I'd say things like obs, wiki, landing page, software.o.o, doc, countdown, download, beans, shop etc but also bugzilla shouldn't break.
OK, that sounds to me like either we get everything done by July 26th - or we need to wait until Christmas time (at least for me, as September and October are already full). => 3 weeks left to go (not that I want to put pressure on anybody, this is just a fact) Regards, Lars
On 3 July 2017 18:54:17 CEST, Lars Vogdt <lrupp@suse.de> wrote:
But this network split comes with some migration downtime. For most of the machines, a simple "shutdown", "put into the other network", "boot" should be enough (so something around max. 5-10 Minutes per machine). But some machines - especially those that are providing their service to others in this network - are a bit trickier...
JFYI: we (Theo, Darix, myself - those are the ones who volunteered) will start with the work during our maintenance window this Thursday. I added a note on status.opensuse.org already. I'm currently unsure if we can update status.o.o with all details during that time, but I will send a summary to this list on Thursday afternoon.
I started http://etherpad.opensuse.org/p/Server_migration to collect the affected machines together
This is our "working base" now. Please note that neither OBS, the wiki, nor anything else hosted in Provo should be affected by our work - so if you see some outages in this area, feel free to inform any of us directly, so we can investigate what happened. With kind regards, Lars
Hi Lars,
software.opensuse.org has been unavailable for 10 minutes. Are you working on the network?
Best regards, Sarah

On 05.07.2017 at 08:15, Lars Vogdt wrote:
On 3 July 2017 18:54:17 CEST, Lars Vogdt <lrupp@suse.de> wrote:
But this network split comes with some migration downtime. For most of the machines, a simple "shutdown", "put into the other network", "boot" should be enough (so something around max. 5-10 Minutes per machine). But some machines - especially those that are providing their service to others in this network - are a bit trickier...
JFYI: we (Theo, Darix, myself - those are the ones who volunteered) will start with the work during our maintenance window this Thursday. I added a note on status.opensuse.org already.
I'm currently unsure if we can update status.o.o with all details during that time, but I will send a summary to this list on Thursday afternoon.
I started http://etherpad.opensuse.org/p/Server_migration to collect the affected machines together
This is our "working base" now. Please note that neither OBS, the wiki, nor anything else hosted in Provo should be affected by our work - so if you see some outages in this area, feel free to inform any of us directly, so we can investigate what happened.
With kind regards, Lars
Hi On Thu, 6 Jul 2017 10:51:53 +0200 Sarah Julia Kriesch wrote:
software.opensuse.org has been unavailable for 10 minutes. Are you working on the network?
Short answer: yes. Long story will follow at the next beer event ;-) Regards, Lars
Hi

Before I leave after ~20h of "server migration", here is a short summary of what has been achieved so far:

* we created a new VLAN and deployed it on all hypervisors incl. routing, jalla, jalla
* created 2 new "proxy" machines that run haproxy and some special firewall rules - acting as frontend for the services behind the internal (let's call it: private) network
* created a new "login" machine that acts as LDAP proxy (authentication backend) for the VMs
* created a new openVPN server as central connection point into the private network (not finished yet)
* migrated the running VMs via
  + brctl delif $old_bridge $vm_interface
  + brctl addif $new_bridge $vm_interface
  + s/$old_bridge/$new_bridge/g $vm_config
  => that part went smoothly ;-)
* adapted the haproxy config on the old and new proxy servers to have the frontends pointing to the right backends in the right networks
* adapted the firewall settings on the proxy servers
* adapted the DNS to move together with the machines
* after that, we sometimes had a back and forth with some machines where the firewall rules were too restrictive (as we introduced two new IPs on the haproxy machines) or other stuff did not work on the first run
* adapting the NAT rules, testing new stuff (ahem: packages) on our beloved openSUSE distribution and deeply debugging keepalived, apache2, ldap, perl scripts, freeipa and other stuff took another minute or two :-)
* ... (anything I forgot)

=> 31 hosts are now in the new network, updated (sometimes a bit reconfigured ;-) and up and running

Please note: those machines are currently NOT reachable from the outside any more (the xinetd redirection of the ssh port is intentionally disabled).

So the next important step is to finalize the setup of the openVPN server, so we can distribute the openVPN credentials/certificates.

After that, I would say that we are ready to go wherever we want (especially in regard to Salt... ;-)

Thanks a lot to everyone who helped with the migration and firefighting! There are always a lot of people involved, but this time I want to thank especially *darix* and *bugfinder* for their work and debugging skills. I'm sure that the current state would not have been reached without you guys!

With kind regards,
Lars
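The three per-VM steps from the summary (brctl delif, brctl addif, substitute the bridge name in the VM config) can be written down as a small dry-run planner. This is a sketch only: nothing is executed, and the bridge names, the interface name `vnet3` and both helper functions are hypothetical placeholders, not the values used during the actual migration.

```python
def plan_bridge_move(old_bridge, new_bridge, vm_interface):
    """Return the two brctl invocations as argument lists (nothing is run)."""
    return [
        ["brctl", "delif", old_bridge, vm_interface],
        ["brctl", "addif", new_bridge, vm_interface],
    ]


def patch_vm_config(config_text, old_bridge, new_bridge):
    """Plain-text equivalent of s/$old_bridge/$new_bridge/g on the VM config."""
    return config_text.replace(old_bridge, new_bridge)


# Hypothetical example values.
commands = plan_bridge_move("br-old", "br-new", "vnet3")
for cmd in commands:
    print(" ".join(cmd))

new_config = patch_vm_config("<source bridge='br-old'/>", "br-old", "br-new")
print(new_config)
```

Like the sed-style substitution it mirrors, the plain replace also rewrites any longer name that merely contains the old bridge name, so the result is worth a quick look before the VM is restarted.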
Lars Vogdt wrote:
Please note: those machines are currently NOT reachable from the outside any more (the xinetd redirection of the ssh port is disabled by intention).
I guess this is why I cannot currently reach baloo? -- Per Jessen, Zürich (28.8°C) openSUSE mailing list admin
Hi Per On Fri, 07 Jul 2017 14:04:45 +0200 Per Jessen wrote:
Please note: those machines are currently NOT reachable from the outside any more (the xinetd redirection of the ssh port is disabled by intention).
I guess this is why I cannot currently reach baloo?
Yes, sorry. Depending on the urgency, we have a couple of options:
* using Theo as proxy and letting him do your work ;-)
* giving you access to a jump host
* implementing the same setup as before

I would say: let's wait and see how long the openVPN setup takes and make a decision after the weekend.

---
JFYI (needs to be documented):
* anna and elsa are our new (ha)proxy machines
  => my idea is to limit them to this task and not run other services on them
* daffy is our new login machine (for authentication of users against the Novell Auth Server)
  => no other services on this machine (which needs a 2nd one for HA)
* gate.opensuse.org should become our "gate" into the private network - and also the new gateway for private machines that need external access. That way we know that "private"/management traffic only goes via this machine - and the machines that handle "public"/official traffic are not involved in any way.

@Per: that would for example mean that we might need to reconfigure baloo in the next days, too.

^^^ does that make sense to anyone?

Regards,
Lars
Lars Vogdt wrote:
Hi Per
On Fri, 07 Jul 2017 14:04:45 +0200 Per Jessen wrote:
Please note: those machines are currently NOT reachable from the outside any more (the xinetd redirection of the ssh port is disabled by intention).
I guess this is why I cannot currently reach baloo?
Yes, sorry.
Depending on the urgency, we have a couple of options: * using Theo as proxy and let him do your work ;-) * giving you access to a jump host * implementing the same setup as before
I would say: let's wait how long the openVPN setup needs and make a decision after the weekend.
No problem Lars, it's nothing urgent. -- Per Jessen, Zürich (29.4°C) openSUSE mailing list admin
On Thu, Jul 06, 2017 at 10:02:08PM +0200, Lars Vogdt wrote:
Hi
...
Before I leave after ~20h of "server migration", here a short summary of what has been achieved so far:
Please note: those machines are currently NOT reachable from the outside any more (the xinetd redirection of the ssh port is disabled by intention).
So the next important step is to finalize the setup of the openVPN server, so we can distribute the openVPN credentials/certificates.
After that, I would say that we are ready to go wherever we want (especially in regard of Salt... ;-)
Thanks a lot to everyone that helped with the migration and firefighting! There are always a lot of people involved, but this time I want to thank especially *darix* and *bugfinder* for their work and debugging skills. I'm sure that the current state would not have been reached without you guys!
Thanks a lot to Lars, Rudi and Marcus for their amazing and fast work! The OpenVPN setup is an action item for me, and I am currently working on it with very high priority. So please be patient, and feel free to contact me to make any changes on your machines in the meantime. -- Theo Chatzimichos <tampakrap@opensuse.org> <tchatzimichos@suse.com> System Administrator SUSE Operations and Services Team
participants (7)
- Lars Vogdt
- Lars Vogdt
- Ludwig Nussel
- Marcus Rückert
- Per Jessen
- Sarah Julia Kriesch
- Theo Chatzimichos