[Bug 657093] New: git uncovers possible hardware or kernel disk cache vm corruption
https://bugzilla.novell.com/show_bug.cgi?id=657093
https://bugzilla.novell.com/show_bug.cgi?id=657093#c0

           Summary: git uncovers possible hardware or kernel disk cache vm
                    corruption
    Classification: openSUSE
           Product: openSUSE 11.1
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 11.1
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: d.a.van.delft@gmail.com
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.1.11) Gecko/20100714 SUSE/3.5.11-0.1.1 Firefox/3.5.11

An error occurred during git rebasing (git version 1.7.2.1):

  port@tilia:/home/port/devel 1039 # ../rebase_tmp_diversen
  No local changes to save
  Switched to a new branch 'rebasing'
  Switched to branch 'exit_handling'
  Current branch exit_handling is up to date.
  Switched to branch 'Link-utf8'
  Current branch Link-utf8 is up to date.
  error: inflate: data stream error (incorrect data check)
  fatal: object 3dba60b2e86c69451c357f2c5e94a9c1e871bc8e is corrupted

To find out which object this is, run

  git log --raw --all --full-history

and search for 3dba60b (the first 7 characters of the SHA1 of the corrupt object). That pointed to

  :100755 100755 3dba60b... f5e4bac... M siteroot/.site_local_profile

which had disappeared from the filesystem! A subsequent login gave an error message, but that is incidental: site_local_profile gets sourced during login.
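For readers unfamiliar with the two checks involved, here is a minimal Python sketch of how a git loose object can be verified. It rests on two facts about git's loose object format: the file is a zlib stream of "<type> <size>\0<content>", and the object name is the SHA1 of that uncompressed data. The "incorrect data check" message above is zlib's wording for a failed checksum at the end of the stream, so in this report the first check already failed. The function names make_loose_object and check_loose_object are made up for this illustration and are not part of git.

```python
import hashlib
import zlib
from typing import Tuple


def make_loose_object(content: bytes, obj_type: str = "blob") -> Tuple[str, bytes]:
    """Build a git-style loose object; return (object id, compressed bytes)."""
    store = obj_type.encode() + b" " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(store).hexdigest(), zlib.compress(store)


def check_loose_object(expected_id: str, compressed: bytes) -> str:
    """Return 'ok', 'inflate error' or 'hash mismatch' for one object."""
    try:
        # zlib verifies its own trailing checksum while inflating.
        store = zlib.decompress(compressed)
    except zlib.error:
        return "inflate error"
    # Second layer: the object name must equal the SHA1 of the inflated data.
    if hashlib.sha1(store).hexdigest() != expected_id:
        return "hash mismatch"
    return "ok"


if __name__ == "__main__":
    # Hypothetical file content, only for the demonstration.
    oid, data = make_loose_object(b"export SITE=example\n")
    print(oid, check_loose_object(oid, data))

    # Flip one bit in the last byte, which belongs to zlib's trailing checksum.
    damaged = bytearray(data)
    damaged[-1] ^= 0x01
    print(check_loose_object(oid, bytes(damaged)))
```

On a real repository the compressed bytes would be read from .git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e and the expected id would be the directory name plus the file name. The sketch only covers loose objects, not pack files.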
A comparison of the object with a backup:

  port@tilia:/home/port/devel 1008 # cmp /backups/tilia/Versioned/Versions/home.20101202091706/port/devel/.git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e
  /backups/tilia/Versioned/Versions/home.20101202091706/port/devel/.git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e differ: byte 375, line 2

Okay, they differ (and, not shown here, the other available backups differ at the same position as well).

Try a git fsck:

  port@tilia:/home/port/devel 1027 # git fsck
  error: inflate: data stream error (incorrect data check)
  fatal: object 3dba60b2e86c69451c357f2c5e94a9c1e871bc8e is corrupted

Hmm. A Google search on "error: inflate: data stream error (incorrect data check) fatal: object is corrupted" came up with

  http://lists-archives.org/git/709953-git-fsck-uncovers-hardware-kernel-probl...

So, let's flush the kernel memory caches:

  port@tilia:/home/port/devel 1028 # echo 3 | sudo tee /proc/sys/vm/drop_caches
  root's password:
  3
  port@tilia:/home/port/devel 1029 # git fsck
  <snip dangling commits and blobs, no error>

So, okay now!!! Let's try the compare again:

  port@tilia:/home/port/devel 1032 # cmp /backups/tilia/Versioned/Versions/home.20101202091706/port/devel/.git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e git/objects/3d/ba60b2e86c69451c357f2c5e94a9c1e871bc8e
  port@tilia:/home/port/devel 1033 #

So, the git repo is now all okay, BUT THIS IS SCARY !!!!!!!!!!!!!!!! The first cmp reported a difference, but after a cache flush the difference is gone: the cache was corrupt.
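The before/after comparison around the cache flush could be repeated for a single file without root rights. The following Python sketch is one possible way, not what was actually run here: it hashes a file, asks the kernel to evict that file's cached pages with posix_fadvise(POSIX_FADV_DONTNEED), and hashes it again. The helper names are invented for this example. The advice is only a hint to the kernel, so it is weaker than writing to /proc/sys/vm/drop_caches, and a stable result does not prove the cache was actually emptied.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def evict_from_page_cache(path):
    """Ask the kernel to drop this file's cached pages; True if the hint was sent."""
    if not hasattr(os, "posix_fadvise"):
        return False  # platform without posix_fadvise
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset 0 and length 0 mean "the whole file"; dirty pages are not dropped.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def reread_check(path):
    """Hash, evict, hash again; a differing pair hints at a corrupt cached copy."""
    before = sha1_of_file(path)
    evicted = evict_from_page_cache(path)
    after = sha1_of_file(path)
    return {"before": before, "after": after,
            "evicted": evicted, "stable": before == after}


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, reread_check(name))
```

Run over the files in .git/objects, a result with "stable" set to False would point at the same kind of discrepancy that cmp showed above, narrowed down to one file.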
What's the cause:
- Hardware corrupting something (memory, disk, disk-to-memory transfer), or
- Kernel bug

  Linux tilia 2.6.27.54-0.1-default #1 SMP 2010-10-19 18:40:07 +0200 x86_64 x86_64 x86_64 GNU/Linux

  OS: Linux 2.6.27.54-0.1-default x86_64
  System: openSUSE 11.1 (x86_64)
  KDE: 4.4.4 (KDE 4.4.4) "release 3"
  Processor (CPU): Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
  Speed: 1,600.00 MHz
  Cores: 8 (running at stock speed, NOT overclocked)
  Total memory (RAM): 11.7 GB
  Free memory: 6.3 GB (+ 3.3 GB caches)
  Free swap: 16.0 GB

Seeking suggestions on how to diagnose this. There are no relevant error messages in the system logs.

Reproducible: Didn't try

Steps to Reproduce:
1.
2.
3.

Actual Results:
Comparing two files (one original, the other a backup) with cmp first reported a difference, but after a flush of the kernel vm cache (forcing a reread from disk) they are identical.

The scary part is that this "problem" may have occurred more often than the one time I have now noticed. Git could find the corruption because it names its objects by the SHA1 hash of their contents, and they didn't match.

As a first measure I'll run memtest86.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=657093
https://bugzilla.novell.com/show_bug.cgi?id=657093#c

Xinli Niu <xlniu@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |xlniu@novell.com
         AssignedTo|bnc-team-screening@forge.pr |kernel-maintainers@forge.pr
                   |ovo.novell.com              |ovo.novell.com
https://bugzilla.novell.com/show_bug.cgi?id=657093
https://bugzilla.novell.com/show_bug.cgi?id=657093#c1

--- Comment #1 from Danny van Delft <d.a.van.delft@gmail.com> 2010-12-03 08:13:27 UTC ---
I let memtest86 run for a couple of hours; no errors were reported.
https://bugzilla.novell.com/show_bug.cgi?id=657093
https://bugzilla.novell.com/show_bug.cgi?id=657093#c2

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #2 from Jeff Mahoney <jeffm@novell.com> 2011-02-07 17:41:50 UTC ---
openSUSE 11.1 is out of maintenance. If you are able to reproduce this issue with openSUSE 11.3 or openSUSE Factory (preferred), please re-open.