[opensuse-factory] journald configuration defaults
Some time ago I had a problem with slow boot. Once that was fixed I started to have the opposite problem, that is long delays with high disk activity during shutdown, getting much worse with time. Again this turns out to be triggered by journald: once the log file had grown past 20MiB or so there would be a noticeable delay when flushing the log to disk and it became unbearable at 80MiB. The time is superlinear with log size: a verify on the current 7MiB journal file takes roughly 1.5s (17s for a loop of 16 verifies), the 114MiB for the old file are already over 30s. The solution (for me) was to set SystemMaxFileSize=8M in /etc/systemd/journald.conf so that the active journal file gets rotated much earlier than with the built-in defaults. I have a rather small separate /var, but that file would have grown to about 300MiB. considering that most people probably have probably much larger /var partitions and the superlinear scaling characteristics, I think it would be advisable to provide more sensible defaults with the distribution. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf rackAttack: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Sunday 30 March 2014 11.21:07 Achim Gratz wrote:
Some time ago I had a problem with slow boot. Once that was fixed I started to have the opposite problem, that is long delays with high disk activity during shutdown, getting much worse with time. Again this turns out to be triggered by journald: once the log file had grown past 20MiB or so there would be a noticeable delay when flushing the log to disk and it became unbearable at 80MiB. The time is superlinear with log size: a verify on the current 7MiB journal file takes roughly 1.5s (17s for a loop of 16 verifies), the 114MiB for the old file are already over 30s. The solution (for me) was to set
SystemMaxFileSize=8M
in /etc/systemd/journald.conf so that the active journal file gets rotated much earlier than with the built-in defaults. I have a rather small separate /var, but that file would have grown to about 300MiB. considering that most people probably have probably much larger /var partitions and the superlinear scaling characteristics, I think it would be advisable to provide more sensible defaults with the distribution.
Regards, Achim.
Take into account, that some people will use journald without the rsyslog emulation on (no /var/log/message text flushing) And the default doesn't look that bad systemd-journal[419]: Runtime journal is using 2.7M (max 796.6M, leaving 1.1G of free 7.7G, current limit 796.6M). du total of /var/log/journal = 1.8GB -- Bruno Friedmann Ioda-Net Sàrl www.ioda-net.ch openSUSE Member & Board GPG KEY : D5C9B751C4653227 irc: tigerfoot ~~~~~ Don't take Life too serious. Nobody gets out alive anyway! ~~~~~ -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Bruno Friedmann <bruno <at> ioda-net.ch> writes:
Take into account, that some people will use journald without the rsyslog emulation on (no /var/log/message text flushing) And the default doesn't look that bad
systemd-journal[419]: Runtime journal is using 2.7M (max 796.6M, leaving 1.1G of free 7.7G, current limit 796.6M).
None of this is even remotely related to the problem at hand AFAICS.
du total of /var/log/journal = 1.8GB
How many files do you have in that directory and what's the result of time journalctl --verify (if you have lots of rotated backups, you can do the verify only on the "open" system and user journals, which seems to be -- at least runtime-wise -- what happens at shutdown). Regards, Achim. -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Monday 31 March 2014 19.59:33 Achim Gratz wrote:
Bruno Friedmann <bruno <at> ioda-net.ch> writes:
Take into account, that some people will use journald without the rsyslog emulation on (no /var/log/message text flushing) And the default doesn't look that bad
systemd-journal[419]: Runtime journal is using 2.7M (max 796.6M, leaving 1.1G of free 7.7G, current limit 796.6M).
None of this is even remotely related to the problem at hand AFAICS.
du total of /var/log/journal = 1.8GB
How many files do you have in that directory and what's the result of
70 files since January 28th 2013.
time journalctl --verify
real 2m20.742s user 2m6.646s sys 0m12.043s
(if you have lots of rotated backups, you can do the verify only on the "open" system and user journals, which seems to be -- at least runtime-wise -- what happens at shutdown).
-- Bruno Friedmann Ioda-Net Sàrl www.ioda-net.ch openSUSE Member & Board GPG KEY : D5C9B751C4653227 irc: tigerfoot ~~~Don't take Life too serious. Nobody gets out alive anyway!~~~ -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Bruno Friedmann writes:
How many files do you have in that directory and what's the result of 70 files since January 28th 2013.
So these files are 25MByte on average?
time journalctl --verify
real 2m20.742s user 2m6.646s sys 0m12.043s
You obviously have a much speedier system than mine (which doesn't really surprise me as my computer is 11 years old). Is this an SSD? Out of curiosity, how long does it take to just verify the active journal file? Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Waldorf MIDI Implementation & additional documentation: http://Synth.Stromeko.net/Downloads.html#WaldorfDocs -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
El 03/04/14 14:57, Achim Gratz escribió:
You obviously have a much speedier system than mine (which doesn't really surprise me as my computer is 11 years old). Is this an SSD? Out of curiosity, how long does it take to just verify the active journal file?
journalctl --verify --system 4.25s user 0.20s system 99% cpu 4.452 total -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Thursday 03 April 2014 19.57:58 Achim Gratz wrote:
Bruno Friedmann writes:
How many files do you have in that directory and what's the result of 70 files since January 28th 2013.
So these files are 25MByte on average? Nope from what I'm seeing actually ll -h total 2.9G -rw-r----- 1 root systemd-journal 66M Apr 4 13:20 system.journal -rw-r----- 1 root root 9.2M Jan 28 2013 system@0004d45c6f97979d-6d3c33838bc094eb.journal~ -rw-r----- 1 root root 20M Feb 5 2013 system@0004d4f68053e839-c00fd50d92979adb.journal~ -rw-r----- 1 root root 5.9M Feb 5 2013 system@0004d4f68e89dfb9-45d1f13b40b639ca.journal~ -rw-r----- 1 root root 8.3M Feb 5 2013 system@0004d500fec4b3d7-7e1c8dd748725f34.journal~ -rw-r----- 1 root root 8.1M Feb 6 2013 system@0004d50e672f2e02-893607a1cbea8e95.journal~ -rw-r----- 1 root root 60M Feb 18 2013 system@0004d5fbea3fe951-256d00234d3b9d7d.journal~ -rw-r----- 1 root adm 111M May 10 2013 system@0004dc57c0b51f58-3516742edcf9e0b5.journal~ -rw-r----- 1 root adm 40M May 21 2013 system@0004dd3ea6f84899-22c0f968b70fc06b.journal~ -rw-r----- 1 root adm 71M Jun 22 2013 system@0004dfc2749b1439-6b0c18865ffc3d64.journal~ -rw-r----- 1 root adm 24M Jul 5 2013 system@0004e0c0e3bce84f-e5cd923386f0d278.journal~ -rw-r----- 1 root adm 28M Sep 7 2013 system@0004e5c434473d5b-0f83b153a890c7fa.journal~ -rw-r----- 1 root adm 56M Oct 12 12:39 system@0004e888db01ff0a-01baf6f666669368.journal~ -rw-r----- 1 root adm 69M Nov 13 09:21 system@0004eb0aa5365af8-8be6bbcaff3e87cb.journal~ -rw-r----- 1 root adm 39M Dec 29 21:16 system@0004eeb2026988f7-17aa7d403fd374ef.journal~ -rw-r----- 1 root adm 45M Jan 8 09:42 system@0004ef717acf584b-59b9e5aa377b1c39.journal~ -rw-r----- 1 root systemd-journal 101M Feb 4 21:12 system@0004f19a41003956-e443095aa509e4f6.journal~ -rw-r----- 1 root systemd-journal 22M Feb 9 08:56 system@0004f1f48f5e22b8-2c527e0add7ff074.journal~ -rw-r----- 1 root systemd-journal 116M Mar 10 13:00 system@0004f43f590508f8-7e6de2b10ba821ee.journal~ -rw-r----- 1 root adm 128M Dec 7 18:38 system@6423493be90d40b49663aa64789b995c-0000000000000001-0004eb0aa4ef8d16.journal -rw-r----- 1 root adm 78M Dec 12 18:53 system@6423493be90d40b49663aa64789b995c-0000000000031fec-0004ecf53ad4d4a3.journal -rw-r----- 1 root adm 74M Dec 12 18:53 system@6423493be90d40b49663aa64789b995c-00000000000609ed-0004ed5a04714ebe.journal -rw-r----- 1 root adm 74M Dec 12 18:53 system@6423493be90d40b49663aa64789b995c-000000000008f32a-0004ed5a060f7a66.journal -rw-r----- 1 root adm 75M Dec 12 18:54 system@6423493be90d40b49663aa64789b995c-00000000000bd280-0004ed5a07a3f8b4.journal -rw-r----- 1 root adm 84M Jul 13 2013 system@82d22cae59d34f4faa2c4c466ad33d28-0000000000000001-0004e0c0e3753f1b.journal -rw-r----- 1 root adm 87M Jul 13 2013 system@82d22cae59d34f4faa2c4c466ad33d28-00000000000194c0-0004e1622c739444.journal -rw-r----- 1 root adm 86M Jul 13 2013 system@82d22cae59d34f4faa2c4c466ad33d28-000000000002ea70-0004e1628aa1a140.journal -rw-r----- 1 root adm 87M Jul 13 2013 system@82d22cae59d34f4faa2c4c466ad33d28-0000000000043ff9-0004e162e9b73f63.journal -rw-r----- 1 root adm 86M Jul 13 2013 system@82d22cae59d34f4faa2c4c466ad33d28-000000000005957b-0004e16347e94345.journal -rw-r----- 1 root adm 87M Jul 13 2013 system@82d22cae59d34f4faa2c4c466ad33d28-000000000006eb06-0004e163a6394fd5.journal -rw-r----- 1 root adm 98M Aug 31 2013 system@82d22cae59d34f4faa2c4c466ad33d28-000000000008409e-0004e1640e03a52f.journal -rw-r----- 1 root systemd-journal 97M Mar 13 20:22 system@a9f02b318c1f4d9ab2173da113c8a52c-0000000000000001-0004f43f58b582f1.journal -rw-r----- 1 root systemd-journal 109M Mar 13 21:34 system@a9f02b318c1f4d9ab2173da113c8a52c-000000000004e12d-0004f481df891bfa.journal -rw-r----- 1 root systemd-journal 108M Mar 13 23:11 system@a9f02b318c1f4d9ab2173da113c8a52c-00000000000635f9-0004f482e12cc623.journal -rw-r----- 1 root systemd-journal 107M Mar 14 01:05 system@a9f02b318c1f4d9ab2173da113c8a52c-0000000000078b2b-0004f4843b756fe8.journal -rw-r----- 1 root systemd-journal 107M Mar 14 03:45 system@a9f02b318c1f4d9ab2173da113c8a52c-000000000008e06b-0004f485d3a0be92.journal -rw-r----- 1 root systemd-journal 107M Mar 14 06:06 system@a9f02b318c1f4d9ab2173da113c8a52c-00000000000a35ab-0004f488104413c2.journal -rw-r----- 1 root systemd-journal 109M Mar 14 07:48 system@a9f02b318c1f4d9ab2173da113c8a52c-00000000000b8aed-0004f48a082c1b8d.journal -rw-r----- 1 root systemd-journal 104M Mar 27 12:13 system@a9f02b318c1f4d9ab2173da113c8a52c-00000000000ce011-0004f48b7517d07a.journal -rw-r-----+ 1 root adm 72K Apr 3 2013 user-1501.journal -rw-r-----+ 1 root systemd-journal 4.4M Apr 4 12:35 user-1502.journal -rw-r----- 1 root root 3.8M Jan 28 2013 user-1502@0004d45c7279d0c5-7d6b2f0878984657.journal~ -rw-r----- 1 root root 3.7M Feb 5 2013 user-1502@0004d500ffa45136-2a813f88b4c35eb0.journal~ -rw-r----- 1 root root 3.6M Feb 6 2013 user-1502@0004d50e68019f2c-792367421b954d3a.journal~ -rw-r----- 1 root root 3.8M Feb 18 2013 user-1502@0004d5fc00c3dd4b-9552d6e00f94c9e3.journal~ -rw-r-----+ 1 root adm 4.3M May 10 2013 user-1502@0004dc57c1291f04-c43f342a77d43ea8.journal~ -rw-r-----+ 1 root adm 3.7M May 21 2013 user-1502@0004dd3ea7df8ad1-666ffd6c48d2525b.journal~ -rw-r-----+ 1 root adm 3.9M Jun 22 2013 user-1502@0004dfc276f1acf5-fb61ee9435f0f5e0.journal~ -rw-r-----+ 1 root adm 4.2M Jul 5 2013 user-1502@0004e0c0e49c31eb-3d50e534af090374.journal~ -rw-r-----+ 1 root adm 3.6M Sep 7 2013 user-1502@0004e5c43512657f-889179bef82f4bca.journal~ -rw-r-----+ 1 root adm 3.7M Oct 12 12:40 user-1502@0004e888dbdd32eb-90bfad78126b55bc.journal~ -rw-r-----+ 1 root adm 4.3M Nov 13 09:34 user-1502@0004eb0ad6516f93-2ff823c5f53765c6.journal~ -rw-r-----+ 1 root adm 6.9M Dec 29 21:30 user-1502@0004eeb232aa6968-e7eb4d9a5a6445e8.journal~ -rw-r-----+ 1 root adm 6.1M Jan 8 09:43 user-1502@0004ef717b99e324-2307311a5c943deb.journal~ -rw-r-----+ 1 root systemd-journal 8.7M Feb 4 21:15 user-1502@0004f19a4edacc21-bd8b32cfa2466105.journal~ -rw-r-----+ 1 root systemd-journal 4.0M Feb 9 08:56 user-1502@0004f1f48f88a35a-7bf0835addc9b83a.journal~ -rw-r-----+ 1 root systemd-journal 58M Mar 13 20:22 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-0000000000000538-0004f1f48f88b351.journal -rw-r-----+ 1 root systemd-journal 3.6M Mar 13 21:34 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-00000000000500d5-0004f481fabc495c.journal -rw-r-----+ 1 root systemd-journal 3.6M Mar 13 23:11 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-000000000006ecc3-0004f4833cb0b5ef.journal -rw-r-----+ 1 root systemd-journal 3.6M Mar 14 01:05 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-0000000000083409-0004f484e9c8700f.journal -rw-r-----+ 1 root systemd-journal 3.6M Mar 14 03:45 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-0000000000095d31-0004f48696f291b5.journal -rw-r-----+ 1 root systemd-journal 3.6M Mar 14 06:06 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-00000000000a578a-0004f488441b2c01.journal -rw-r-----+ 1 root systemd-journal 3.6M Mar 14 07:48 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-00000000000bbe17-0004f48a5c81e02c.journal -rw-r-----+ 1 root systemd-journal 5.3M Mar 27 12:13 user-1502@1790e2b1060b4746a2fa3a6fed23cca8-00000000000d2f5d-0004f48b9e755711.journal -rw-r-----+ 1 root adm 4.7M Dec 7 18:38 user-1502@6423493be90d40b49663aa64789b995c-0000000000001a3b-0004eb0ad65182da.journal -rw-r-----+ 1 root adm 3.7M Dec 12 18:53 user-1502@6423493be90d40b49663aa64789b995c-0000000000032005-0004ecf545be842c.journal -rw-r-----+ 1 root adm 3.7M Jul 13 2013 user-1502@82d22cae59d34f4faa2c4c466ad33d28-0000000000000650-0004e0c0e49c47c6.journal -rw-r-----+ 1 root adm 3.8M Aug 31 2013 user-1502@82d22cae59d34f4faa2c4c466ad33d28-000000000008db91-0004e167f68d62dd.journal -rw-r-----+ 1 root systemd-journal 3.8M Mar 7 11:29 user-26.journal
What is interesting is the changes we have in the group (from 12.3 to 13.1)
time journalctl --verify
real 2m20.742s user 2m6.646s sys 0m12.043s
You obviously have a much speedier system than mine (which doesn't really surprise me as my computer is 11 years old). Is this an SSD? Out of curiosity, how long does it take to just verify the active journal file?
It's already a old computer 3 years old i7-2nd generation with 16GB Ram and a 480GB Corsair SSD GT-Force3 (6gbs sata) 11 years old ? I would not like to pay the electricity bill :-)
Regards, Achim.
active journal take the same time almost cause everything is active. On this laptop, due to some crash there one or two place where the journal is not in good shape like having tons of @@@@ in old syslog message, due to crash test (volunteer or not) On productive server, I will apply the same rule as with syslog. Export every 3 months all the log somewhere else. -- Bruno Friedmann Ioda-Net Sàrl www.ioda-net.ch openSUSE Member & Board GPG KEY : D5C9B751C4653227 irc: tigerfoot ~~~Don't take Life too serious. Nobody gets out alive anyway!~~~ -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Fri, Apr 4, 2014 at 8:38 AM, Bruno Friedmann <bruno@ioda-net.ch> wrote:
time journalctl --verify
real 2m20.742s user 2m6.646s sys 0m12.043s
You obviously have a much speedier system than mine (which doesn't really surprise me as my computer is 11 years old). Is this an SSD? Out of curiosity, how long does it take to just verify the active journal file?
It's already a old computer 3 years old i7-2nd generation with 16GB Ram and a 480GB Corsair SSD GT-Force3 (6gbs sata)
2 minutes on an SSD? That's completely unworkable for rotating media (they're *orders of magnitude* slower for random access). Can you test that on rotating media please? A great deal of us are still using spinning rust for our storage needs, testing on SSD and calling it representative is probably as wrong as you can get. -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Friday 04 April 2014 09.31:38 Claudio Freire wrote:
On Fri, Apr 4, 2014 at 8:38 AM, Bruno Friedmann <bruno@ioda-net.ch> wrote:
time journalctl --verify
real 2m20.742s user 2m6.646s sys 0m12.043s
You obviously have a much speedier system than mine (which doesn't really surprise me as my computer is 11 years old). Is this an SSD? Out of curiosity, how long does it take to just verify the active journal file?
It's already a old computer 3 years old i7-2nd generation with 16GB Ram and a 480GB Corsair SSD GT-Force3 (6gbs sata)
2 minutes on an SSD?
That's completely unworkable for rotating media (they're *orders of magnitude* slower for random access). For 2.9GB of logs ?
Can you test that on rotating media please? A great deal of us are still using spinning rust for our storage needs, testing on SSD and calling it representative is probably as wrong as you can get.
Well if only I've still rotating rust for system. I've chosen journald exclusively cause I know some report of being perhaps slow on rust and all system here are now powered on ssd. (Tired of sector defects & co) Now the concerns raised about it are to take into account http://seife.kernalert.de/blog/2013/05/31/the-systemd-journal-is-a-broken-pi... http://www.happyassassin.net/2013/06/14/fedora-1920-logfile-explosions/ (chapter four) I also have decided to export them to external storage in plain text mode for archiving purpose. Rotating the journal is also a nice way to get perf back. Afterall we also rotate text log files and compress them. But when it work, honestly I became quickly addict to the color/bold , filtering etc :-) for example I've rotated my log this morning and now du -sh /var/log/journal/32aed2336d02ef306e8c3ca25103d7ee 3.7M /var/log/journal/32aed2336d02ef306e8c3ca25103d7ee time journalctl --verify --system PASS: /var/log/journal/32aed2336d02ef306e8c3ca25103d7ee/system.journal real 0m0.041s user 0m0.035s sys 0m0.006s Look like there's some good practices to be determined. -- Bruno Friedmann Ioda-Net Sàrl www.ioda-net.ch openSUSE Member & Board GPG KEY : D5C9B751C4653227 irc: tigerfoot ~~~Don't take Life too serious. Nobody gets out alive anyway!~~~ -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Fri, Apr 4, 2014 at 2:12 PM, Bruno Friedmann <bruno@ioda-net.ch> wrote:
On Friday 04 April 2014 09.31:38 Claudio Freire wrote:
On Fri, Apr 4, 2014 at 8:38 AM, Bruno Friedmann <bruno@ioda-net.ch> wrote:
time journalctl --verify
real 2m20.742s user 2m6.646s sys 0m12.043s
You obviously have a much speedier system than mine (which doesn't really surprise me as my computer is 11 years old). Is this an SSD? Out of curiosity, how long does it take to just verify the active journal file?
It's already a old computer 3 years old i7-2nd generation with 16GB Ram and a 480GB Corsair SSD GT-Force3 (6gbs sata)
2 minutes on an SSD?
That's completely unworkable for rotating media (they're *orders of magnitude* slower for random access). For 2.9GB of logs ?
It depends on whether they're loaded sequentially or randomly, but judging from the OP, it's random, and random is on average 2MB/s on modern rotating rust (resulting in 24.7 minutes runtime for 2.9GB).
Can you test that on rotating media please? A great deal of us are still using spinning rust for our storage needs, testing on SSD and calling it representative is probably as wrong as you can get.
Well if only I've still rotating rust for system. I've chosen journald exclusively cause I know some report of being perhaps slow on rust and all system here are now powered on ssd. (Tired of sector defects & co)
Can you not copy into rotating rust and verify the copy? Flush your cache first with: sudo sync sudo sysctl -w vm.drop_caches=3 sudo sysctl -w vm.drop_caches=0
I also have decided to export them to external storage in plain text mode for archiving purpose. Rotating the journal is also a nice way to get perf back. Afterall we also rotate text log files and compress them. .. for example I've rotated my log this morning and now du -sh /var/log/journal/32aed2336d02ef306e8c3ca25103d7ee 3.7M /var/log/journal/32aed2336d02ef306e8c3ca25103d7ee
time journalctl --verify --system PASS: /var/log/journal/32aed2336d02ef306e8c3ca25103d7ee/system.journal
real 0m0.041s user 0m0.035s sys 0m0.006s
Look like there's some good practices to be determined.
Indeed -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Am 04.04.2014 19:42, schrieb Claudio Freire:
Can you not copy into rotating rust and verify the copy?
That's not a real test, since the files will be defragmented by the process. In a real setup, they are fragmented like hell, or at least they were before Werner backported the fix into systemd-208 on Feb.14 :-) -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Am 04.04.2014 19:12, schrieb Bruno Friedmann:
Well if only I've still rotating rust for system. I've chosen journald exclusively
I really would not do that, I'd always add syslog-ng in addition. I cannot count the number of times the journald crapped itself and I had files that could not be read. Even better: when the broken file was present I could not read any journal anymore. Basically journalctl was going in circles: 2013-12-21 00:00:01 .... 2013-12-21 12:34:45 .... 2013-12-21 19:20:31 .... --- reboot --- 2013-12-21 00:00:01 .... etc. And just with trial and error (removing one file after another) I was finally able to delete 3 weeks of journal logs to at least be able to read the rest. This happened so often that I finally got tired of filing bugs :-) With syslog-ng, i sometimes after a crash get 4K of \000 in the file, but that's it. And I can just skip over it.
cause I know some report of being perhaps slow on rust and all system here are now powered on ssd. (Tired of sector defects & co)
Slowness is one problem, but being unreliable is really a PITA.
for example I've rotated my log this morning and now
How do you do that? How do you determine which of the files are no longer needed? -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Friday 04 April 2014 23.25:09 Stefan Seyfried wrote:
Am 04.04.2014 19:12, schrieb Bruno Friedmann:
Well if only I've still rotating rust for system. I've chosen journald exclusively
I really would not do that, I'd always add syslog-ng in addition.
I cannot count the number of times the journald crapped itself and I had files that could not be read. Even better: when the broken file was present I could not read any journal anymore. Basically journalctl was going in circles:
2013-12-21 00:00:01 .... 2013-12-21 12:34:45 .... 2013-12-21 19:20:31 .... --- reboot --- 2013-12-21 00:00:01 ....
etc. And just with trial and error (removing one file after another) I was finally able to delete 3 weeks of journal logs to at least be able to read the rest.
This happened so often that I finally got tired of filing bugs :-)
With syslog-ng, i sometimes after a crash get 4K of \000 in the file, but that's it. And I can just skip over it.
cause I know some report of being perhaps slow on rust and all system here are now powered on ssd. (Tired of sector defects & co)
Slowness is one problem, but being unreliable is really a PITA.
for example I've rotated my log this morning and now
How do you do that? How do you determine which of the files are no longer needed?
While I'm still experiencing the impact of the different value in journalctl.conf on running system. And also due to some of the same kind of corruption (on old journal in 2013) (I've successfully reduce the lost by playing with the --since=DATE --until=DATE to bisect the faulty. Upstream have always promise that the journal has not problem, just the parser. So I keep them ;-) I'm using a poor dirty man hack rotate way. I've no formal code but my hack follow the idea of Find journal storage D=$(journalctl --header --system | grep "Machine ID" | awk '{print $3}') R=${D}_$(date --utc "+%F_%H%M%S") $B="/var/log/journal" systemctl stop systemd-journald.service mv ${B}/${D} ${B}/${R} systemctl start systemd-journald.service Journal is recreated and old one are still accessible with journalctl --directory=PATH could be exported to another host etc... Not that ideal as you can see, but I know that I don't have 100 lines of incoming logs when the rotate happen. So just a way to keep the journal quite fresh. -- Bruno Friedmann Ioda-Net Sàrl www.ioda-net.ch openSUSE Member & Board GPG KEY : D5C9B751C4653227 irc: tigerfoot ~~~Don't take Life too serious. Nobody gets out alive anyway!~~~ -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Hi Bruno, Am 07.04.2014 09:26, schrieb Bruno Friedmann:
Upstream have always promise that the journal has not problem, just the parser. So I keep them ;-)
I keep the messages-$DATE.xz, I know that I can read them without relying on someone else :-) Journal is just for "fun" (well, strange values of "fun") for now and the foreseeable future.
I'm using a poor dirty man hack rotate way. I've no formal code but my hack follow the idea of
Find journal storage D=$(journalctl --header --system | grep "Machine ID" | awk '{print $3}') R=${D}_$(date --utc "+%F_%H%M%S") $B="/var/log/journal"
systemctl stop systemd-journald.service mv ${B}/${D} ${B}/${R} systemctl start systemd-journald.service
That's cool. I'll keep that for reference :-)
Journal is recreated and old one are still accessible with journalctl --directory=PATH could be exported to another host etc...
Not that ideal as you can see, but I know that I don't have 100 lines of incoming logs when the rotate happen.
IMO you should be able to just move the directory and then restart journald, since the FDs will stay open during move, so you'll only have a very short outage? -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Tuesday 08 April 2014 22.07:21 Stefan Seyfried wrote:
Hi Bruno,
Am 07.04.2014 09:26, schrieb Bruno Friedmann:
Upstream have always promise that the journal has not problem, just the parser. So I keep them ;-)
I keep the messages-$DATE.xz, I know that I can read them without relying on someone else :-) Journal is just for "fun" (well, strange values of "fun") for now and the foreseeable future.
I'm using a poor dirty man hack rotate way. I've no formal code but my hack follow the idea of
Find journal storage D=$(journalctl --header --system | grep "Machine ID" | awk '{print $3}') R=${D}_$(date --utc "+%F_%H%M%S") $B="/var/log/journal"
systemctl stop systemd-journald.service mv ${B}/${D} ${B}/${R} systemctl start systemd-journald.service
That's cool. I'll keep that for reference :-)
Journal is recreated and old one are still accessible with journalctl --directory=PATH could be exported to another host etc...
Not that ideal as you can see, but I know that I don't have 100 lines of incoming logs when the rotate happen.
IMO you should be able to just move the directory and then restart journald, since the FDs will stay open during move, so you'll only have a very short outage? Hey that need tests too. :-)
I'm also investigating in the different export format journalctl could do For example using it to feed sshguard daemon LANG=C /usr/bin/journalctl $jnlunits -alfb --no-pager -p info -n1 -o cat | /usr/sbin/sshguard $cmdline -- Bruno Friedmann Ioda-Net Sàrl www.ioda-net.ch openSUSE Member & Board GPG KEY : D5C9B751C4653227 irc: tigerfoot ~~~Don't take Life too serious. Nobody gets out alive anyway!~~~ -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Am 04.04.2014 14:31, schrieb Claudio Freire:
2 minutes on an SSD?
It's CPU bound, not IO bound (for me, 780MB /var/log/journal => 3 minutes to verify, just reading takes ~3 seconds with SSD)
That's completely unworkable for rotating media (they're *orders of magnitude* slower for random access).
Never ever put the persistant journal onto rotating rust. Just use syslog. Due to the genius design, the files fragment heavily and thrash the disk. Even plain "systemctl status foo.service" which looks into journal for logs will take 30+ seconds with journal on non-SSD.
Can you test that on rotating media please? A great deal of us are still using spinning rust for our storage needs, testing on SSD and calling it representative is probably as wrong as you can get.
Putting the journal on anything but SSD or RAM is as wrong as you can get :-) -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Fri, Apr 4, 2014 at 6:16 PM, Stefan Seyfried <stefan.seyfried@googlemail.com> wrote:
That's completely unworkable for rotating media (they're *orders of magnitude* slower for random access).
Never ever put the persistant journal onto rotating rust. Just use syslog. Due to the genius design, the files fragment heavily and thrash the disk. Even plain "systemctl status foo.service" which looks into journal for logs will take 30+ seconds with journal on non-SSD.
Can you test that on rotating media please? A great deal of us are still using spinning rust for our storage needs, testing on SSD and calling it representative is probably as wrong as you can get.
Putting the journal on anything but SSD or RAM is as wrong as you can get :-)
Why not preallocate the space? That would get rid of fragmentation, and it should be a mostly-oneliner patch (call to posix_fallocate). For reading, posix_fadvise (or madvise) could read it whole (to avoid reading it in random fashion). Has this been tried? -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Am 04.04.2014 23:28, schrieb Claudio Freire:
Why not preallocate the space?
Because the genius did not think of this?
That would get rid of fragmentation, and it should be a mostly-oneliner patch (call to posix_fallocate).
For reading, posix_fadvise (or madvise) could read it whole (to avoid reading it in random fashion).
Has this been tried?
a676e66535e12458ea6d366a653f8dd60f982504 seems to do this: + /* Increase the file size a bit further than this, so that we + * we can create larger memory maps to cache */ + file_size = ((new_size+FILE_SIZE_INCREASE-1) / FILE_SIZE_INCREASE) * FILE_SIZE_INCREASE; + if (file_size > (uint64_t) f->last_stat.st_size) { + if (file_size > new_size) + ftruncate(f->fd, file_size); + + if (fstat(f->fd, &f->last_stat) < 0) + return -errno; + } FILE_SIZE_INCREASE is 8MB. And since this patch is ported back to opensuse, fragmentation is much better: Files with mtime 2014-02-13: ... 1420 extents found ... 786 extents found later files: ... 12 extents found system.journal: 20 extents found So it should be better nowadays. -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
Bruno Friedmann writes:
So these files are 25MByte on average? Nope from what I'm seeing actually […]
Your active journal seems to be around 100MiB then.
It's already a old computer 3 years old i7-2nd generation with 16GB Ram and a 480GB Corsair SSD GT-Force3 (6gbs sata)
SSD, I thought so. You'd certainly not like the performance if this was on spinning disk regardless of the CPU performance. Even on my old system and only a single thread, I/O wait time already is noticeable. It doesn't help that journald hopelessly fragments its journal files.
11 years old ? I would not like to pay the electricity bill :-)
For starters, it's not 24/7 and it actually isn't all that bad except when idle (and then only if you compare it to Haswell).
active journal take the same time almost cause everything is active.
The "closed" journals actually don't seem to be delaying the shutdown of the system as far as I am able to determine. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
participants (6)
-
Achim Gratz
-
Achim Gratz
-
Bruno Friedmann
-
Claudio Freire
-
Cristian Rodríguez
-
Stefan Seyfried