[Bug 911347] New: Journal default settings use quite lot of disk store
http://bugzilla.suse.com/show_bug.cgi?id=911347

            Bug ID: 911347
           Summary: Journal default settings use quite lot of disk store
    Classification: openSUSE
           Product: openSUSE Factory
           Version: 201412*
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Enhancement
          Priority: P5 - None
         Component: Basesystem
          Assignee: bnc-team-screening@forge.provo.novell.com
          Reporter: tchvatal@suse.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

As we create a 40 GB partition for / by default, the journal is effectively allowed to log 4 GB of data to its storage before truncating.

We should set the default to something like 100 MB in /etc/journalctl.conf:

SystemMaxUse=100M

To match what rsyslog does, we should probably also set the 12th terminal to print the syslog:

ForwardToConsole=yes
TTYPath=/dev/tty12
MaxLevelConsole=info

--
You are receiving this mail because:
You are on the CC list for the bug.
--- Comment #1 from Tomáš Chvátal <tchvatal@suse.com> ---
Of course the config file is /etc/systemd/journald.conf; my brain just got ahead of me...
Chenzi Cao <chcao@suse.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
                 CC|                                          |chcao@suse.com
           Assignee|bnc-team-screening@forge.provo.novell.com |systemd-maintainers@suse.de
Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |thomas.blume@suse.com
              Flags|                            |needinfo?(werner@suse.com)
Dr. Werner Fink <werner@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Comment #2 is private|1                         |0

--- Comment #3 from Dr. Werner Fink <werner@suse.com> ---
(In reply to Thomas Blume from comment #2)

AFAICR the default log console was /dev/tty10. I'd also like to know whether we should use relative values that work out to 100 MB with a 40 GB root partition ... better still, we should check the size of /var, which could be a separate partition.
--- Comment #5 from Thomas Blume <thomas.blume@suse.com> ---
Thanks for the feedback.
Using relative values for disk space is indeed more sensible. The code checks the size of the filesystem on which the journal is stored, so this covers a separate /var partition.
Still, it is very hard to estimate an average size for /var. If it is 1 GB and we set SystemMaxUse to 1%, we would have only 10 MB for the journal.
I guess we'd need an absolute lower limit for small systems and a relative upper limit for big systems.
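The idea of combining an absolute lower limit with a relative upper limit can be sketched roughly as follows. This is only an illustration: the function name journal_budget, the 1% share and the 100 MiB floor are assumptions picked to match the figures mentioned in this thread; none of them are taken from systemd.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def journal_budget(fs_size_bytes, share_percent=1, floor_bytes=100 * MIB):
    """Hypothetical policy: a relative share of the filesystem, but never
    less than a fixed floor (and never more than the filesystem itself)."""
    relative = fs_size_bytes * share_percent // 100
    budget = max(relative, floor_bytes)
    return min(budget, fs_size_bytes)


if __name__ == "__main__":
    # A small /var falls back to the floor, a 40 GiB root gets its 1% share.
    for size in (1 * GIB, 40 * GIB):
        print(f"{size // GIB} GiB filesystem: "
              f"{journal_budget(size) // MIB} MiB for the journal")
```

With these assumed values a 1 GiB /var is no longer squeezed down to about 10 MiB, while large filesystems still scale with their size.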
--- Comment #7 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #6)
> (In reply to Thomas Blume from comment #5)
> > I guess we'd need an absolute lower limit for small systems and a relative upper limit for big systems.
> Maybe there is a simpler solution. What really hurts slow disks is max file size. So something like "SystemMaxUse=400M" (or 1% of 40GB) and "SystemMaxFileSize=26M" (or three 8.4 MB log chunks as seen on my 64-bit systems) is likely to make everybody happy. Anyway that sounds more reasonable than the current 4GB and 480MB, respectively, with the suggested 40 GB root /. If anybody really needs more, or less, he/she is not an "average" user IMHO and should know how to tweak system defaults. I'll try that on my slow test disk and report back anything of interest.

Thanks, it would be good if you could test this.
By the way, the code includes some general limits on the journal size:

-->--
/* This is the minimum journal file size */
#define JOURNAL_FILE_SIZE_MIN (4ULL*1024ULL*1024ULL) /* 4 MiB */

/* These are the lower and upper bounds if we deduce the max_use value
 * from the file system size */
#define DEFAULT_MAX_USE_LOWER (1ULL*1024ULL*1024ULL) /* 1 MiB */
#define DEFAULT_MAX_USE_UPPER (4ULL*1024ULL*1024ULL*1024ULL) /* 4 GiB */

/* This is the upper bound if we deduce max_size from max_use */
#define DEFAULT_MAX_SIZE_UPPER (128ULL*1024ULL*1024ULL) /* 128 MiB */
--<--

but they seem to be too generous for old machines.
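As a rough illustration of how these bounds interact, here is a minimal sketch. It assumes, as the journald.conf man page describes, that the default SystemMaxUse is 10% of the filesystem size and that the default SystemMaxFileSize is one eighth of SystemMaxUse. The function name default_limits is hypothetical, and page alignment, SystemKeepFree and the exact order of the checks in the real code are ignored.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Constants mirrored from the #defines quoted above.
JOURNAL_FILE_SIZE_MIN = 4 * MIB
DEFAULT_MAX_USE_LOWER = 1 * MIB
DEFAULT_MAX_USE_UPPER = 4 * GIB
DEFAULT_MAX_SIZE_UPPER = 128 * MIB


def clamp(value, lower, upper):
    """Restrict value to the range [lower, upper]."""
    return min(max(value, lower), upper)


def default_limits(fs_size_bytes):
    """Simplified sketch: derive (max_use, max_size) from the filesystem size."""
    # Assumption: 10% of the filesystem, bounded by the LOWER/UPPER constants.
    max_use = clamp(fs_size_bytes // 10,
                    DEFAULT_MAX_USE_LOWER, DEFAULT_MAX_USE_UPPER)
    # Assumption: one eighth of max_use, bounded by the file size constants.
    max_size = clamp(max_use // 8,
                     JOURNAL_FILE_SIZE_MIN, DEFAULT_MAX_SIZE_UPPER)
    return max_use, max_size


if __name__ == "__main__":
    for size in (1 * GIB, 40 * GIB):
        max_use, max_size = default_limits(size)
        print(f"{size // GIB} GiB filesystem: max_use={max_use // MIB} MiB, "
              f"max_size={max_size // MIB} MiB")
```

Under these assumptions a 40 GiB root filesystem ends up at the 4 GiB upper bound for total use and at the 128 MiB cap per journal file.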
Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(bpesavento@infinito.it)

--- Comment #9 from Thomas Blume <thomas.blume@suse.com> ---
Thanks for the feedback.
I'm still searching for alternatives to generally reducing the journal size, because we might need a full journal log for debugging.
It seems that a similar issue was reported here:

https://bugzilla.redhat.com/show_bug.cgi?id=1006386

Can you please take a look at comment#95 and following in that bug and check whether journal fragmentation has an influence on your machine?
Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tchvatal@suse.com
              Flags|                            |needinfo?(bpesavento@infinito.it),
                   |                            |needinfo?(tchvatal@suse.com)

--- Comment #12 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #11)
> I still think that limiting SystemMaxFileSize= well under DEFAULT_MAX_SIZE_UPPER (I would set it to 32 MiB on my test system) prevents most collateral damage on slow systems and has no adverse effects, to my understanding. Then the defaults will take care of small systems, trying to leave some free space on /var anyway.

Hm, https://bugzilla.redhat.com/show_bug.cgi?id=1006386 comment#84 indicates that a small MaxFileSize will hurt journal efficiency. However, it doesn't seem that we would lose logs with it. And 32 MB is 3 times more than suggested in the RH bug, so maybe the negative effects won't be too bad.
Does SystemMaxFileSize=32M also help if you leave SystemMaxUse unset?

> Fragmentation is almost nonexistent on my "productive" laptop so far (Tumbleweed, SSD, EXT4, 30GB root /). So, waiting for the designers to solve the root problem, I'm not complaining: the test disk is going to be wiped by the next RC anyway. But even halving the times I'm witnessing, I think that many laptops more than 5 years old are going to run into trouble with the current defaults.

Sure, we should continue to pursue this, independently of the SystemMaxFileSize settings. Can you confirm that you see the fragmentation and the long journal flushes only on btrfs? If so, I should probably open a separate btrfs bug.

> Feel free to ask for more testing if needed.

Thanks, it would be good if you could also test the other fix: edit /etc/sysctl.d/99-sysctl.conf and put:

net.unix.max_dgram_qlen=1000

there. Does this have any further influence on the journal flush time?

Tomas, could you also test all of the above? I really need testing on more hardware here.
--- Comment #16 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #15)
> Created attachment 619596 [details]
> Flushing times with SystemMaxFileSize=34M
>
> Here are the results with SystemMaxFileSize=34M and no explicit limit to SystemMaxUse. Flush time grows to 25s, then cycles back to 57ms. No side effect visible to the user. Adding net.unix.max_dgram_qlen=1000 has no effect, apparently.

Hm, still 25s, not really good. I'm currently looking at the upstream commits to improve journal flush performance on btrfs (see comment#14). Let's check whether they give better results.
I will let you know when test builds are available.
Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(bpesavento@infinito.it)

--- Comment #18 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #17)
> (In reply to Thomas Blume from comment #16)
> Agree, this is just a workaround to keep old systems rolling, waiting for the designers to solve the root cause. Journal logs are highly sparse, a 33.6MB file compresses to just 4MB in tar.gz, as a rough measure of info content...

I would be completely fine with the workaround if only we could limit it to systems that really need it. But I don't know how we should distinguish old from recent machines.
Setting SystemMaxFileSize unconditionally gives me some headaches. Let's keep it as a last resort for now and first try some other things.

A systemd build including the btrfs patches is now available at:

https://build.opensuse.org/package/binaries/home:tsaupe:branches:Base:System...

Can you give it a try?
--- Comment #20 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #17)
> (In reply to Thomas Blume from comment #16)
> > Hm, still 25s, not really good. I'm currently looking at the upstream commits to improve journal flush performance on btrfs (see comment#14).
> Agree, this is just a workaround to keep old systems rolling, waiting for the designers to solve the root cause. Journal logs are highly sparse, a 33.6MB file compresses to just 4MB in tar.gz, as a rough measure of info content...

Ok, I'm still investigating more options. A first and quick one is using the autodefrag mount option for /var/log. For details, see: http://www.spinics.net/lists/linux-btrfs/msg41015.html.
Can you please check whether mounting /var/log with autodefrag has an impact?
Secondly, I've found another commit that could have an influence:

http://cgit.freedesktop.org/systemd/systemd/commit/src/journal/journal-file....

I will provide new test packages, but first please try the mount option.
Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(bpesavento@infinito.it)
Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(bpesavento@infinito.it)

--- Comment #22 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #21)
> Created attachment 621355 [details]
> Flushing times with /var/log mounted with autodefrag option
>
> Mounting /var/log autodefrag is definitely a better workaround: flushing time is still less than 4 s with a 41.9 MB system journal. Now removed the 34MB limit, to be able to test upcoming patches with larger files.

Ok, thanks for the feedback. We might still set an upper limit for the journal files, because the manpage states:

-->--
autodefrag
    Disable/enable auto defragmentation. Auto defragmentation detects small random writes into files and queue them up for the defrag process. Works best for small files; Not well suited for large database workloads.
--<--

But this rather refers to files of gigabyte size. Can you test with SystemMaxFileSize up to 1 GB and check whether you see any negative impact?
--- Comment #24 from Thomas Blume <thomas.blume@suse.com> ---
(In reply to Bruno Pesavento from comment #23)
> I'll let the file grow unconstrained (unless I hit a wall...), but I think the default max is 128MiB (see comment #7).

Indeed, please forget my comment and test without a limit.
Tomáš Chvátal <tchvatal@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(tchvatal@suse.com)|

--- Comment #26 from Tomáš Chvátal <tchvatal@suse.com> ---
For now, at least on Factory, it behaves quite decently. My journal folder is ~100 MB and journalctl commands respond quite quickly.
http://bugzilla.suse.com/show_bug.cgi?id=911347#c27

Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gnyers@suse.com

--- Comment #27 from Thomas Blume <thomas.blume@suse.com> ---
*** Bug 838475 has been marked as a duplicate of this bug. ***
http://bugzilla.suse.com/show_bug.cgi?id=911347#c30

Thomas Blume <thomas.blume@suse.com> changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------
             Status|NEW                               |RESOLVED
         Resolution|---                               |FIXED
              Flags|needinfo?(werner@suse.com),       |
                   |needinfo?(bpesavento@infinito.it) |

--- Comment #30 from Thomas Blume <thomas.blume@suse.com> ---
closing