Hej,

On various occasions I have heard requests for a simple way to set up and maintain an openSUSE mirror. Since then I have had some ideas about "projects" as they are defined at mirrors.opensuse.org, and the fact that each project may need a dedicated rsync process with its own check interval, and maybe some kind of notification from the server when new content has arrived, etc.

E.g. Leap iso and repo files rarely change, so they can be checked less frequently than Leap updates. But when a big sync starts for Leap quarterly updates (.iso files), it shouldn't delay the sync of updates for long.

With these ideas I implemented the opensuse-rsync packages. Ideally users follow these steps:
* Choose a size of the mirror according to the list: https://github.com/andrii-suse/opensuse-rsync/blob/master/README.md#approxim...
* Install the corresponding package, e.g. opensuse-rsync-typical (which will require approximately 1.2TB of disk space).
* Enable the timers using the provided command.
* Optionally add custom filters to a config.
* Monitor and manage systemd services for each project.

So, the idea is that these steps might become the default (but not required) way to set up a mirror. Further, it will be easier to gather feedback and implement improvements or add functionality.

I would appreciate it if somebody had a chance to review this proposal and share whether it looks legit or why it is a bad idea. I will be glad to answer questions about the implementation or design decisions, etc.

If no feedback is provided, I will probably add it as an experimental section to the wiki and wait for somebody to try it.

Github: https://github.com/andrii-suse/opensuse-rsync
OBS: https://build.opensuse.org/project/show/home:andriinikitin:opensuse-rsync

Regards,
Andrii Nikitin
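The idea of per-project check intervals can be illustrated with a small Python sketch. It is not taken from the opensuse-rsync packages (which use systemd timers); the project names and intervals below are hypothetical and only show how a rarely changing project and a frequently changing one can be checked on independent schedules.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str              # hypothetical project label
    check_interval: int    # seconds between checks
    last_checked: float = 0.0  # Unix time of the previous check


def due_projects(projects, now):
    """Return the names of projects whose check interval has elapsed."""
    return [p.name for p in projects if now - p.last_checked >= p.check_interval]


# Hypothetical schedule: ISO images once a day, updates every 15 minutes.
EXAMPLE_PROJECTS = [
    Project("leap-iso", check_interval=24 * 3600),
    Project("leap-updates", check_interval=15 * 60),
]
```

Because each project carries its own interval and timestamp, a long-running sync of one project does not change when the others become due.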
Lowest latency would be achieved by push mirroring. Debian uses SSH keys restricted to only run the mirror command; documentation is here: https://www.debian.org/mirror/push_server

I personally would not install systemd services or timers on the server [the previous server was FreeBSD, as well :-p ], but adding another line next to the Debian push mirroring setup is a no-brainer. This approach needs no extra scripts to install, API calls to make, or cache files to keep in the local mirror, either.
LKD FTP wrote:
Debian uses SSH keys restricted to only run the mirror command,
My understanding is below; it is not scientifically based, just an impression, so correct me if I am wrong. You cannot compare OBS to Debian or literally any other distribution, because of:
- total number of projects involved: 100K+ vs 10(?)
- number of projects updated daily: 1K+ vs 0.1(?)
- how many times per day each project can be updated: unlimited vs 0.01?
- total daily volume: still unlimited vs very limited
I personally would not install systemd services or timers on the server
May I ask why? Do you generally dislike systemd, or is there some other reason? My understanding is that you either use systemd for everything or write custom reimplementations of a subset of systemd for every service you want to run. Or am I missing something?
On Mon, 21 Oct 2024, 12:02 Andrii Nikitin, <andrii.nikitin@suse.com> wrote:
LKD FTP wrote:
I personally would not install systemd services or timers on the server
May I ask why? Do you generally dislike systemd, or is there some other reason? My understanding is that you either use systemd for everything or write custom reimplementations of a subset of systemd for every service you want to run. Or am I missing something?
Happy to explain, but mainly it's due to simplicity. (Note: we're running on Debian and using systemd for serving stuff; the point was about installing custom per-distro units in systemd.)

Mirroring uses ftpsync for Debian and derivatives; all the other mirrored software is handled by an in-house script written ~20 years ago with custom config files. It has only required a handful of updates over the years. It's triggered by the local user's crontab.

Moving or reinstalling the server is very simple:
- Copy server configs and install server packages.
- Copy the entire home (includes ftpsync, custom mirroring, ssh authorized_keys for push mirroring, a few other helpers).
- Run one of the helpers to reinstall the local crontab.

Most public mirroring servers are run by volunteers, who probably have no more than a few hours per month available. This simplicity helped our server run for decades, and moving it between servers and OSes was not an issue at all. If there were many different ways to configure and run mirrors, either the number of mirrored distros would go down, or maintainers would burn out, or the mirroring would be problematic for the users.
LKD FTP wrote:
Moving or reinstalling the server is very simple:
- Copy server configs and install server packages.
- Copy the entire home (includes ftpsync, custom mirroring, ssh authorized_keys for push mirroring, a few other helpers).
- Run one of the helpers to reinstall the local crontab.
I am not sure how that is easier than installing a package and starting the provided timers. But again, the proposed steps aim to simplify setting up new mirrors rather than to immediately replace current legacy setups.
if there were many different ways to configure and run mirrors, either the number of mirrored distros would go down, or maintainers would burn out, or the mirroring would be problematic for the users.
Exactly, and creating a "unified way to manage mirrors" is tricky, especially for OBS and all the daily traffic it produces. Still, it is good to start with at least something, and 2024 is too late to resist systemd. A .deb package shouldn't be an issue either; I will be glad to try implementing it if somebody would like to give it a shot.

Regards,
Andrii Nikitin
Dear Andrii,

It's just an idea, but how about doing away with the ssh push sync solution altogether, due to recent vulnerabilities/backdoors in sshd, and also because not all mirror admins like to give even this kind of limited rights to other projects, even with IP-limited access.

Instead, there could be a much simpler and cleaner way to do it. You could create a simple nginx web server with subdirs or txt files with unique IDs, if needed, for all mirrors, so you could easily monitor who retrieved the URL and when, in case there was an outage. Each mirror would query the txt file at the URL generated for it every minute with wget or curl. The value in the txt file would be 0 or 1: if 0, no update is needed (no changes in the main repo); if 1, an update would be triggered (the script you provide, the rsync command you recommend, etc.). If there is an update to do, the value could remain 1 for about 3-5 minutes in case a network problem prevents the mirror from accessing the URL; flock will not start the rsync process in multiple instances anyway.

Essentially, the same result can be achieved as with push sync, but without installing ssh and other customized packages.

By the way, there is no silver bullet in mirror operation. There will be problems that need to be solved by the admin of the mirror, and no ready-made solution will help; if these problems cannot be solved by someone, there is nothing to talk about. Sometimes the rsync process gets stuck, sometimes systemd does not restart the web server or the rsyncd service, but it is nothing that cannot be solved with a simple if/else script.

Another thought: if you want, you can also introduce rsync over TLS as an option in addition to plain rsync, provided it does not place too much extra load on the main repo server.

Thank you for your hard work and ideas to make things better! Have a very nice day!

Cheers,
Peter
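The flag-file polling suggested in Peter's message could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the flag URL, the lock path and the sync command are hypothetical, the flag body is assumed to be a bare "0" or "1", and `fcntl.flock` (Unix only) stands in for the `flock` utility mentioned in the message.

```python
import fcntl
import subprocess
import urllib.request


def update_requested(flag_url, timeout=10.0):
    """Fetch the per-mirror flag file; "1" means an update should run."""
    with urllib.request.urlopen(flag_url, timeout=timeout) as resp:
        return resp.read().decode().strip() == "1"


def run_exclusively(lock_path, command):
    """Run command unless another instance already holds the lock.

    Returns True if the command ran, False if the lock was busy.
    """
    with open(lock_path, "w") as lock:
        try:
            # Non-blocking exclusive lock: fail at once if a sync is running.
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        subprocess.run(command, check=True)
        return True  # the lock is released when the file is closed


def poll_once(flag_url, lock_path, command, fetch=update_requested):
    """One polling round, meant to be called every minute (e.g. from cron)."""
    if not fetch(flag_url):
        return False
    return run_exclusively(lock_path, command)
```

Since the flag may stay at 1 for several minutes, consecutive polls can all see it; the lock is what keeps a second rsync from starting while the first is still running.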
Hej Peter,

Quantum Mirror wrote:
It's just an idea, but how about doing away with the ssh push sync
Unfortunately I am not ready to discuss push sync yet. In my understanding, considering the amount of traffic and the number of mirrors, we will have to stick with a mixture of push and pull mirrors. But the current proposal is solely about pull.
The mirrors would query the txt file on the URL generated for them every
The current approach in the project is that the sync scripts query the 'last_modified_time' of the 'project' and store it locally before the sync. The API looks like this:

    curl http://download.opensuse.org/rest/project_last_modified?project=15.6+update
    1729520040

The valid values for the "project" parameter are defined in the "name" column at https://download.opensuse.org/app/project

So, instead of verifying thousands or millions of files for each project over the rsync protocol, the scripts can use a single HTTP request per project. All of this logic is hidden in the scripts, but of course it can be used in other implementations.
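For anyone who wants to reuse that check outside the provided scripts, here is a minimal Python sketch. It assumes the endpoint returns a bare Unix timestamp as in the curl output above; the state file, the `run_sync` callback and the function names are hypothetical and are not how the opensuse-rsync scripts are actually written. Unlike the scripts, which store the timestamp before the sync, this sketch writes it only after the sync succeeds.

```python
import urllib.parse
import urllib.request
from pathlib import Path

API = "http://download.opensuse.org/rest/project_last_modified"


def fetch_last_modified(project, timeout=10.0):
    """Ask the server when the project last changed (Unix seconds)."""
    url = API + "?" + urllib.parse.urlencode({"project": project})
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return int(resp.read().decode().strip())


def needs_sync(remote_ts, state_file):
    """True if nothing is stored yet or the server reports a newer time."""
    try:
        local_ts = int(Path(state_file).read_text().strip())
    except (FileNotFoundError, ValueError):
        return True
    return remote_ts > local_ts


def sync_if_changed(project, state_file, run_sync, fetch=fetch_last_modified):
    """Run run_sync(project) only when the project changed on the server."""
    remote_ts = fetch(project)  # read before syncing, so later changes are not missed
    if not needs_sync(remote_ts, state_file):
        return False
    run_sync(project)  # e.g. a wrapper around the rsync call
    Path(state_file).write_text(f"{remote_ts}\n")
    return True
```

A call such as `sync_if_changed("15.6 update", "15.6-update.ts", my_rsync)` would then cost one HTTP request when nothing has changed; `my_rsync` and the file name are placeholders.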
By the way, there is no silver bullet in mirror operation, there will be problems that need to be solved by the admin of the mirror and no
Yeah, the goal is also to provide useful commands that give a comprehensive overview of the sync status and help deal with the most typical issues. I believe it is quite possible to keep everything more or less under control with the proposed tooling.

Regards,
Andrii Nikitin
participants (3)
- Andrii Nikitin
- LKD FTP
- Quantum Mirror