[Bug 716321] New: Major crash / major data loss - urgent help please!
https://bugzilla.novell.com/show_bug.cgi?id=716321
https://bugzilla.novell.com/show_bug.cgi?id=716321#c0

Summary: Major crash / major data loss - urgent help please!
Classification: openSUSE
Product: openSUSE 11.3
Version: Final
Platform: x86-64
OS/Version: openSUSE 11.3
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: dav1dblunk3tt@hotmail.com
QAContact: qa@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-GB; rv:1.9.2.18) Gecko/20110613 SUSE/3.6.18-0.2.1 Firefox/3.6.18

This morning my SUSE 11.3 box locked up. Firefox locked up; the WM and mouse worked OK but I couldn't start any processes. I switched to a text console but got "INIT cannot execute mingetty" on login and "INIT cannot execute shutdown" on Ctrl-Alt-Del. I performed a hard reset.

On reboot the disks were checked with "filesystems have not been checked for >60 days" and the reboot appeared to proceed as normal. However, the filesystems (ext4 / and ext3 /home) have both been wound back 64 days, leading to major data loss. I am at a loss to diagnose this since /var/log/messages ends on 4/7/11 and restarts today on 7/9/11. Every single file written between the shutdown in July and the second reboot today is missing.

I have backups for critical data files, but recovering the filesystems is a much more attractive idea if possible. I realise this may not be the most appropriate place for this bug report but I have to start somewhere... Urgent help on fixing the filesystems and diagnosing the problem appreciated! In the meantime I'll quarantine this machine and start on another.

Reproducible: Didn't try

Steps to Reproduce:
Please see details

Actual Results:
Major data loss

Expected Results:
No data loss

/var/log/messages and dmesg are free of any useful information:

Jul 4 14:07:15 lunesta sshd[3275]: Received signal 15; terminating.
Jul 4 14:07:15 lunesta ntpd[3707]: ntpd exiting on signal 15
Jul 4 14:07:15 lunesta rpcbind: rpcbind terminating on signal. Restart with "rpcbind -w"
Jul 4 14:07:15 lunesta avahi-daemon[3446]: Got SIGTERM, quitting.
Jul 4 14:07:15 lunesta avahi-daemon[3446]: Leaving mDNS multicast group on interface eth1.IPv4 with address 192.168.2.2.
Jul 4 14:07:15 lunesta kernel: Kernel logging (proc) stopped.
Jul 4 14:07:15 lunesta rsyslogd: [origin software="rsyslogd" swVersion="5.4.0" x-pid="2506" x-info="http://www.rsyslog.com"] exiting on signal 15.
Sep 7 08:25:10 lunesta kernel: imklog 5.4.0, log source = /proc/kmsg started.
Sep 7 08:25:10 lunesta rsyslogd: [origin software="rsyslogd" swVersion="5.4.0" x-pid="1552" x-info="http://www.rsyslog.com"] start
Sep 7 08:25:10 lunesta kernel: [ 621.456946] type=1505 audit(1315380309.297:2): operation="profile_load" pid=1486 name=/bin/ping
Sep 7 08:25:10 lunesta kernel: [ 621.500804] type=1505 audit(1315380309.341:3): operation="profile_load" pid=1487 name=/sbin/klogd
Sep 7 08:25:10 lunesta kernel: [ 621.582038] type=1505 audit(1315380309.423:4): operation="profile_load" pid=1488 name=/sbin/syslog-ng
Sep 7 08:25:10 lunesta kernel: [ 621.664381] type=1505 audit(1315380309.505:5): operation="profile_load" pid=1489 name=/sbin/syslogd
Se

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
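The jump in the log excerpt above can be located mechanically. The following is only a minimal Python sketch, not part of the original report: the function name `largest_gap` is hypothetical, and because syslog timestamps of this style carry no year, the year 2011 is assumed from the dates given in the report. Applied to the excerpt, the largest gap it finds runs from Jul 4 14:07:15 to Sep 7 08:25:10, which is 64 whole days.

```python
import re
from datetime import datetime

# Classic syslog prefix, e.g. "Jul 4 14:07:15 lunesta sshd[3275]: ..."
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s")


def largest_gap(lines, year=2011):
    """Return (gap, before, after) for the largest jump between
    consecutive timestamped lines, or None if fewer than two are found.

    The year is an assumption: this timestamp style does not record it.
    """
    previous = None
    best = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip truncated or non-timestamped lines
        month, day, clock = match.groups()
        current = datetime.strptime(
            f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
        )
        if previous is not None:
            gap = current - previous
            if best is None or gap > best[0]:
                best = (gap, previous, current)
        previous = current
    return best
```

To try it on a real system, pass it the lines of a log file (for example an opened /var/log/messages) and print the returned tuple. Logs that span a new year would need the year handled explicitly.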
https://bugzilla.novell.com/show_bug.cgi?id=716321#c1
Jean-Daniel Dodin
https://bugzilla.novell.com/show_bug.cgi?id=716321#c2
--- Comment #2 from SA SA
https://bugzilla.novell.com/show_bug.cgi?id=716321#c3
--- Comment #3 from Jean-Daniel Dodin
https://bugzilla.novell.com/show_bug.cgi?id=716321#c4
--- Comment #4 from SA SA
https://bugzilla.novell.com/show_bug.cgi?id=716321#c
zj jia
https://bugzilla.novell.com/show_bug.cgi?id=716321#c5
Petr Uzel
I've already filed a report on the SUSE forums.
I think there are two bugs here. The first is whatever brought the system down, but unless this happens many more times I do not think there is enough information available to diagnose it.
You're right. Unless it happens again and you can provide more information, I doubt we could do anything about it. You can at least try memcheck and check out SMART data.
The second is fsck rewinding time by 64 days. Obviously this may only occur for the exceptional circumstances of the first bug but it is not good behaviour.
I really doubt that fsck did this 'undoing' of the last 64 days.
(In reply to comment #4)
The date the data has reverted to (4/7/11, 64 days ago) is beginning to look suspiciously like the date I installed SUSE 11.3 on this disk. In particular, it looks like the last log entry is the last shutdown after I finished the install and copied my home across from the old disk.
Blind guess: do you have multiple HDDs (or partitions)? Perhaps the wrong partition (the one on the old disk) is mounted?
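One way to check that guess is to look at which device currently backs each mount point. The following is a minimal Python sketch rather than anything posted in the thread: the helper name `devices_for` is hypothetical, and it assumes a Linux system where /proc/mounts lists one mount per line as whitespace-separated "device mount-point fstype options ..." fields.

```python
def devices_for(mounts_text, mount_points=("/", "/home")):
    """Map each requested mount point to (device, filesystem type).

    mounts_text is the content of /proc/mounts (or text in the same
    layout). When a mount point is listed more than once, the later
    entry wins, since it is the mount stacked on top.
    """
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        device, mount_point, fs_type = fields[:3]
        if mount_point in mount_points:
            found[mount_point] = (device, fs_type)
    return found
```

On the affected machine one would read /proc/mounts, pass its text to `devices_for`, and compare the reported devices with the partitions the new installation was expected to use. If / and /home turn out to live on the old disk, that would explain files appearing to date from the day of the install.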
https://bugzilla.novell.com/show_bug.cgi?id=716321#c6
--- Comment #6 from SA SA
https://bugzilla.novell.com/show_bug.cgi?id=716321#c7
SA SA
https://bugzilla.novell.com/show_bug.cgi?id=716321#c8
--- Comment #8 from Petr Uzel
So thank you for your patience and good ideas and apologies for presenting this as a bug and not spotting the obvious hardware failure.
No problem, I'm glad you figured it out. Good luck with getting your data from the dead disk.