[opensuse-buildservice] (hopefully) short downtime for OBS

Hi,

in order to do the final switch of the underlying storage for the backend server (an increase from 4.8T to 8T storage) there will be a (hopefully) short downtime for the OBS.

At the moment, the server still responds, but the schedulers, dispatcher and publisher are off and a sync is running (the previous sync had just completed, so the delta should already be quite small).

Once this run is done, OBS will go down into maintenance mode: the repserver will be shut down, the webservice will give out a downtime message, and the old storage will be remounted read-only for one final and hopefully quick sync before we switch to the new filesystem.

I will report back when everything is complete.

--
with kind regards (mit freundlichem Grinsen),
Ruediger Oertel (ro@novell.com,ro@suse.de,bugfinder@t-online.de)
----------------------------------------------------------------------
Linux T410Rudi 2.6.36-rc4-2-desktop #1 SMP PREEMPT 2010-09-16 20:58:38 +0200 x86_64
Key fingerprint = 17DC 6553 86A7 384B 53C5 CA5C 3CE4 F2E7 23F2 B417
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)

On 10/10/10 13:15, Ruediger Oertel wrote:
> Hi,
> in order to do the final switch of the underlying storage for the backend server (an increase from 4.8T to 8T storage) there will be a (hopefully) short downtime for the OBS.
> At the moment, the server still responds, but the schedulers, dispatcher and publisher are off and a sync is running (the previous sync had just completed, so the delta should already be quite small).
> Once this run is done, OBS will go down into maintenance mode: the repserver will be shut down, the webservice will give out a downtime message, and the old storage will be remounted read-only for one final and hopefully quick sync before we switch to the new filesystem.
> I will report back when everything is complete.

Hi Ruediger

This sounds interesting... Would you be able to describe the steps in a little more detail? E.g. which services are shut down and in what order, and which parts of the storage they touch?

I'm sure that many of us operating our own instances would be keen to understand how you approach it.

Thanks

David

--
"Don't worry, you'll be fine; I saw it work in a cartoon once..."

On Sunday, October 10, 2010 03:05:16 pm David Greaves wrote:
> On 10/10/10 13:15, Ruediger Oertel wrote:
> > Hi,
> > in order to do the final switch of the underlying storage for the backend server (an increase from 4.8T to 8T storage) there will be a (hopefully) short downtime for the OBS.
> > At the moment, the server still responds, but the schedulers, dispatcher and publisher are off and a sync is running (the previous sync had just completed, so the delta should already be quite small).

FYI: this step is completed, we're now at the real downtime and the final sync is running.

> > Once this run is done, OBS will go down into maintenance mode: the repserver will be shut down, the webservice will give out a downtime message, and the old storage will be remounted read-only for one final and hopefully quick sync before we switch to the new filesystem.
> > I will report back when everything is complete.
>
> Hi Ruediger
>
> This sounds interesting... Would you be able to describe the steps in a little more detail? E.g. which services are shut down and in what order, and which parts of the storage they touch?
>
> I'm sure that many of us operating our own instances would be keen to understand how you approach it.

Well, in short: "/etc/init.d/obsscheduler shutdown" will write out the state of the schedulers to avoid a cold-start phase after everything is finished.

So first the schedulers (one per target architecture) are stopped; they create most of the write operations below /bs/build. Now you can stop the dispatcher (which sends out the build jobs to the build clients); this one has most of its write operations below /bs/jobs and /bs/workers. Then you can stop the publisher, which basically reads from /bs/build and writes to /bs/repos. At this stage you probably have no more active build jobs running, so you can stop bs_warden and bs_signer and basically have only the bs_repserver left, so that osc operations can still work but the rest of the buildservice is mostly silent.

That was the part that is done by now (some syncs from the old to the new device were running over the last weeks while I've been pushing this step ahead of me ...). Now the almost-last sync is done, the repserver is shut down, and you do a read-only remount of the old filesystem and run the really-final sync.

Well, the rest is basically: unmount the old fs, mount the new fs in its place, possibly edit /etc/fstab so that it stays that way across a reboot, and then either reboot the machine for a clean start or just start all the services again in the reverse order from above.

> Thanks
> David
--
with kind regards (mit freundlichem Grinsen),
Ruediger Oertel
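For anyone wanting to rehearse this on their own instance, the order described above can be written down as a dry-run plan. This is only a sketch under several assumptions that are not from the thread: apart from obsscheduler, the init script names are guesses modelled on it; rsync stands in for the unspecified "sync"; /bs is assumed to be the mount point of the old filesystem; and /dev/NEWDEV and /mnt/newbs are hypothetical placeholders for the new device and its temporary mount point. The code does not run anything, it only returns the commands as strings so the order can be reviewed.

```python
# Dry-run sketch of the cut-over order from the thread. Nothing is executed.

# Services stopped first, in this order. Only "obsscheduler" is named in the
# thread; the other init script names are ASSUMPTIONS.
QUIET_PHASE = [
    "obsscheduler",   # writes mostly below /bs/build
    "obsdispatcher",  # writes mostly below /bs/jobs and /bs/workers
    "obspublisher",   # reads /bs/build, writes /bs/repos
    "obswarden",
    "obssigner",
]
LAST_SERVICE = "obsrepserver"  # kept running so osc still works for a while


def init_cmd(service, action):
    """Build an init script call such as '/etc/init.d/obsscheduler start'."""
    return f"/etc/init.d/{service} {action}"


def stop_cmd(service):
    """The thread uses 'shutdown' for the scheduler so it saves its state."""
    action = "shutdown" if service == "obsscheduler" else "stop"
    return init_cmd(service, action)


def sync_cmd(old_mount, staging):
    """rsync is an ASSUMPTION; the thread only says 'sync'."""
    return f"rsync -aH --delete {old_mount}/ {staging}/"


def cutover_plan(old_mount="/bs", new_dev="/dev/NEWDEV", staging="/mnt/newbs"):
    """Return the ordered list of commands for the storage switch."""
    plan = [stop_cmd(s) for s in QUIET_PHASE]
    plan.append(sync_cmd(old_mount, staging))        # almost-last sync
    plan.append(stop_cmd(LAST_SERVICE))              # real downtime starts
    plan.append(f"mount -o remount,ro {old_mount}")  # freeze the old fs
    plan.append(sync_cmd(old_mount, staging))        # really-final sync
    plan.append(f"umount {staging}")
    plan.append(f"umount {old_mount}")
    plan.append(f"mount {new_dev} {old_mount}")      # new fs in place
    # Editing /etc/fstab is left as a manual step.
    all_services = QUIET_PHASE + [LAST_SERVICE]
    plan += [init_cmd(s, "start") for s in reversed(all_services)]
    return plan


if __name__ == "__main__":
    for step in cutover_plan():
        print(step)
```

Printing the plan and reading it through before touching a production backend is the whole point of the sketch; a reboot, as was done here, replaces the final block of start commands.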

On Sunday, October 10, 2010 03:41:31 pm Ruediger Oertel wrote:
> On Sunday, October 10, 2010 03:05:16 pm David Greaves wrote:
> > On 10/10/10 13:15, Ruediger Oertel wrote:
> > > Hi,
> > > in order to do the final switch of the underlying storage for the backend server (an increase from 4.8T to 8T storage) there will be a (hopefully) short downtime for the OBS.
> > > At the moment, the server still responds, but the schedulers, dispatcher and publisher are off and a sync is running (the previous sync had just completed, so the delta should already be quite small).
>
> FYI: this step is completed, we're now at the real downtime and the final sync is running.
>
> > > Once this run is done, OBS will go down into maintenance mode: the repserver will be shut down, the webservice will give out a downtime message, and the old storage will be remounted read-only for one final and hopefully quick sync before we switch to the new filesystem.
> > > I will report back when everything is complete.

Okay, the new filesystem is now mounted, the machine has been rebooted and all services are running again. The webservice is back to normal, and so are osc and the webgui.
--
with kind regards (mit freundlichem Grinsen),
Ruediger Oertel