[opensuse-buildservice] bs_srcserver seems stuck at a specific lastnotification number

Hi, I was investigating why my OBS server has a considerable amount of CPU load all the time. It's bs_srcserver that occupies one CPU almost entirely. Investigating, it looks like it is processing the /srv/obs/events/lastnotifications over and over again. obs:~ # grep lastnotifications /srv/obs/log/src_server.log|tail 2018-06-19 15:21:17: [2007] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:20: [2011] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:23: [2014] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:27: [2018] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:30: [2022] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:33: [2024] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:36: [2030] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:39: [2040] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:43: [2048] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-06-19 15:21:46: [2052] GET (127.0.0.1) /lastnotifications?start=297709&block=1 Actually, the oldest copy I have of this logfile is from 2018-05-01, and it has the same start= number: obs:~ # xzgrep lastnotifications /srv/obs/log/src_server.log-20180501.xz|head -5 2018-04-30 08:45:09: [30828] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-04-30 08:45:11: [30833] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-04-30 08:45:14: [30837] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-04-30 08:45:17: [30841] GET (127.0.0.1) /lastnotifications?start=297709&block=1 2018-04-30 08:45:19: [30843] GET (127.0.0.1) /lastnotifications?start=297709&block=1 obs:~ # l /srv/obs/events/ total 26832 drwxr-xr-x 9 obsrun obsrun 4096 Jun 4 12:37 ./ drwxr-xr-x 23 obsrun obsrun 4096 Sep 28 2017 ../ drwxr-xr-x 2 obsrun obsrun 4096 Jun 12 08:21 deltastore/ drwxr-xr-x 2 obsrun obsrun 4096 Jun 19 11:07 dispatch/ drwxr-xr-x 2 obsrun obsrun 905216 Jun 19 11:16 i586/ -rw-r--r-- 1 obsrun obsrun 520080 Jun 19 15:11 lastevents -rw-r--r-- 1 obsrun obsrun 25804967 Jun 19 14:22 lastnotifications drwxr-xr-x 2 obsrun obsrun 4096 Jun 19 14:22 publish/ drwxr-xr-x 2 obsrun obsrun 4096 Jun 19 15:11 repository/ drwxr-xr-x 2 obsrun obsrun 8192 Jun 19 14:21 signer/ drwxr-xr-x 2 obsrun obsrun 90112 Jun 19 14:22 x86_64/ note that I have no i586 scheduler anymore (had it before, but finally got rid of it maybe a year ago). The i586 events directory is full of files named after projects obs:~ # l /srv/obs/events/i586/|wc 1557 14006 141552 Most of these are old (older than 2017-08-24, maybe the date I got rid of the i586 scheduler), but two were updated today: vsa1001:~ # ls -l /srv/obs/events/i586/ -rt|tail -2 -rw-r--r-- 1 obsrun obsrun 122 Jun 19 10:46 relsync:Images:SaaS:BM::images::i586 -rw-r--r-- 1 obsrun obsrun 122 Jun 19 11:16 relsync:Images:SaaS:HV::images::i586 I have no real idea how this all plays together, and what I have to do to get this into a sane state again. Any hints? Best regards, and thanks in advance, Stefan -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org

Hi all, JFTR: this is OBS-2.8.4 still, but I think the issue is present in 2.9 On 19.06.2018 17:31, Stefan Seyfried wrote:
Hi,
I was investigating why my OBS server has a considerable amount of CPU load all the time. It's bs_srcserver that occupies one CPU almost entirely. Investigating, it looks like it is processing the /srv/obs/events/lastnotifications over and over again.
I have no real idea how this all plays together, and what I have to do to get this into a sane state again.
(has nothing to do with schedulers, architectures etc...) I have put debugging code into src/api/app/models/update_notification_events.rb:update_events, in the ActiveRecord::StatementInvalid path. I, [2018-06-20T07:45:48.800294 #31698] INFO -- : [31698:20.91] #<ActiveRecord::StatementInvalid: Mysql2::Error: Incorrect string value: '\xC2\x9B:wq"...' for column 'payload' at row 1: INSERT INTO `events` (`eventtype`, `payload`, `created_at`, `updated_at`, `lock_version`) VALUES ('Event::Commit', '{\"project\":\"home:username:branches:Images:S4H\",\"package\":\"sles12_sp1_x86_64_s4h\",\"comment\":\"awq\",\"user\":\"username\",\"files\":\"Modified:\\n sles12_sp2_x86_64_s4h.kiwi\\n\\n\",\"rev\":\"22\"}', '2018-06-20 07:45:48', '2018-06-20 07:45:48', 0)> obs:/srv/obs/events # grep 297709 lastnotifications 297709|SRCSRV_COMMIT|1511433089|comment%3Da�%9B:wq|files%3DModified:%0A sles12_sp2_x86_64_s4h.kiwi%0A%0A|package%3Dsles12_sp1_x86_64_s4h|project%3Dhome:username:branches:Images:S4H|rev%3D22|user%3Dusername obs:/srv/obs/events # grep 297709 lastnotifications | hexdump -C 00000000 32 39 37 37 30 39 7c 53 52 43 53 52 56 5f 43 4f |297709|SRCSRV_CO| 00000010 4d 4d 49 54 7c 31 35 31 31 34 33 33 30 38 39 7c |MMIT|1511433089|| 00000020 63 6f 6d 6d 65 6e 74 25 33 44 61 c2 25 39 42 3a |comment%3Da.%9B:| 00000030 77 71 7c 66 69 6c 65 73 25 33 44 4d 6f 64 69 66 |wq|files%3DModif| 00000040 69 65 64 3a 25 30 41 20 20 73 6c 65 73 31 32 5f |ied:%0A sles12_| 00000050 73 70 32 5f 78 38 36 5f 36 34 5f 73 34 68 2e 6b |sp2_x86_64_s4h.k| 00000060 69 77 69 25 30 41 25 30 41 7c 70 61 63 6b 61 67 |iwi%0A%0A|packag| 00000070 65 25 33 44 73 6c 65 73 31 32 5f 73 70 31 5f 78 |e%3Dsles12_sp1_x| 00000080 38 36 5f 36 34 5f 73 34 68 7c 70 72 6f 6a 65 63 |86_64_s4h|projec| 00000090 74 25 33 44 68 6f 6d 65 3a 75 73 65 72 6e 61 6d |t%3Dhome:usernam| 000000a0 65 3a 62 72 61 6e 63 68 65 73 3a 49 6d 61 67 65 |e:branches:Image| 000000b0 73 3a 53 34 48 7c 72 65 76 25 33 44 32 32 7c 75 |s:S4H|rev%3D22|u| 000000c0 73 65 72 25 33 44 75 73 65 72 6e 61 6d 65 0a |ser%3Dusername.| 000000cf So a user managed to sneak in an invalid character 0xC2 into a comment and this breaks OBS. USERS! We should really forbid access for them! ;-) I removed the \xC2%9B combo, then the code took the sync lost path: if @last['sync'] == 'lost' # we're doomed, but we can't help - it's not supposed to happen BackendInfo.lastnotification_nr = Integer(@last['next']) return end and now everything seems back to normal. Maybe more input checking is needed to prevent this. -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org

On Wed, 20 Jun 2018 10:46:35 +0200 Stefan Seyfried <stefan.seyfried@googlemail.com> wrote:
Hi all,
JFTR: this is OBS-2.8.4 still, but I think the issue is present in 2.9
On 19.06.2018 17:31, Stefan Seyfried wrote:
Hi,
I was investigating why my OBS server has a considerable amount of CPU load all the time. It's bs_srcserver that occupies one CPU almost entirely. Investigating, it looks like it is processing the /srv/obs/events/lastnotifications over and over again.
I have no real idea how this all plays together, and what I have to do to get this into a sane state again.
(has nothing to do with schedulers, architectures etc...)
I seem to have similar problems with Packman's PMBS: pmbs:/srv/obs/events # egrep 3635610 lastnotifications 3635610|SRCSRV_COMMIT|1484696363|comment%3D- Update to version 17.0.1:%0A * deps/libff: Fix VP8/VP9/webm alpha support%0A * rtmp-services: Increase video bitrate limit for YouTube%0A * obs-outputs: fix build error on freebsd%0A * Update translations from Crowdin%0A * [CI] Use prebuilt deps so we can build on 10.9%0A * CI: Build more features into FFMPEG deps%0A * CI: Update browser plugin ref and build scripts.%0A * CI: Fix zip permission issue on CEF plist files%0A * obs-x264: ignore opencl param%0A * enc-amf: Update to 1.4.3.8%0A * CI: Ability to make packages on travis%0A * CI: actually call packagesbuild from the right place �%9F%98%91%0A * cmake: Remove unnecessary find_package calls%0A * libobs: Fix scale filtering bug when duplicating scenes%0A * win-capture: Don't use FindWindow for game capture keepalive%0A * CI: Install Packages and use the full version%0A * Revert "win-capture: Use FindWindowEx to traverse window list"%0A * obs-filters.c: Fix color correction filter OpenGL crash%0A * obs-filters.c: Fix color correction filter saturation%0A * Update translations from Crowdin%0A * UI: Fix bug with uncopied profile import/export files%0A * win-capture: Fix game capture size bug when rehooking%0A * libobs: Add func to enum active and inactive child tree%0A * libobs: Enumerate full tree when adding active child%0A * libobs: Add callback for enumerating all scene children%0A * obs-x264: Allow opencl through much longer alias%0A * libobs: Update to version 17.0.1|files%3DAdded:%0A obs-studio-17.0.1.tar.xz%0A%0ADeleted:%0A obs-studio-0.17.0.tar.xz%0A%0AModified:%0A _service%0A _servicedata%0A obs-studio.changes%0A obs-studio.spec%0A%0A|package%3Dobs-studio|project%3Dhome:<REDACTED>:branches:Multimedia|rev%3D2|user%3D<REDACTED> Notice the "�%9F%98%91", a hexdump excerpt is here: 00000270 69 6c 64 20 66 72 6f 6d 20 74 68 65 20 72 69 67 |ild from the rig| 00000280 68 74 20 70 6c 61 63 65 20 f0 25 39 46 25 39 38 |ht place .%9F%98| 00000290 25 39 31 25 30 41 20 20 2a 20 63 6d 61 6b 65 3a |%91%0A * cmake:|
So a user managed to sneak in an invalid character 0xC2 into a comment and this breaks OBS.
Looks like that in my case also. :
Maybe more input checking is needed to prevent this.
I second this! Thank you Stefan, for pointing this out! Greetings, Stefan -- Stefan Botter zu Hause Bremen

On 20.06.2018 13:11, Stefan Botter wrote:
I seem to have similar problems with Packman's PMBS:
Notice the "�%9F%98%91", a hexdump excerpt is here:
00000270 69 6c 64 20 66 72 6f 6d 20 74 68 65 20 72 69 67 |ild from the rig| 00000280 68 74 20 70 6c 61 63 65 20 f0 25 39 46 25 39 38 |ht place .%9F%98| 00000290 25 39 31 25 30 41 20 20 2a 20 63 6d 61 6b 65 3a |%91%0A * cmake:|
So a user managed to sneak in an invalid character 0xC2 into a comment and this breaks OBS.
Looks like that in my case also.
Yes. Note that just editing out the one invalid character was not enough, because bs_srcserver (who processes this file) then returned an empty object to obsapidelayed. I had to remove the invalid character and the following %-escaped one for everything to proceed (or better: it just jumped to the end of the queue, so just empying lastnotifications file would proabably have had the same effect ;-)
Maybe more input checking is needed to prevent this.
I second this!
Thank you Stefan, for pointing this out!
It was a hard day of debugging. If I ever meet the guy who decided to use ruby on rails for OBS, I will not buy him a beer ;-) -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
participants (2)
-
Stefan Botter
-
Stefan Seyfried