I have a problem on a new server that I'm bedding in. I'm trying to copy
data onto it from an older machine and the new machine is going into a
strange 'locked-in' state. I'm looking for suggestions on the best way
to investigate the problem.
On the old server, there's some data that I want to copy to the new
server. I'm doing that by NFS-exporting the data on the old server,
NFS-mounting it on the new server, and then using cp to copy it to its
new home, something like this:

    cp -uax /nfs/old-host/data /new-directory

I actually have two of these running, for different subsets of the data.
BTW, I'm executing all these commands via ssh from my desktop machine.
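
In case the cp flags matter, here is a minimal sketch of what I
understand -uax to do, run against throwaway local directories rather
than the real NFS mount. The temporary paths are made up purely for
illustration, and I'm assuming GNU cp:

```shell
#!/bin/sh
# Hypothetical stand-ins for the NFS mount and the destination directory.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
src="$work/nfs/old-host/data"
dst="$work/new-directory"
mkdir -p "$src/sub" "$dst"
echo "one" > "$src/a.txt"
echo "two" > "$src/sub/b.txt"
# Give the source file an old timestamp so the comparison below is unambiguous.
touch -t 200001010000 "$src/a.txt"

# First pass: -a recurses and preserves metadata, -x stays on one file system.
# Because the destination exists, the tree lands in new-directory/data.
cp -uax "$src" "$dst"

# Pretend the destination copy was modified after it was copied.
echo "newer copy" > "$dst/data/a.txt"

# Second pass: -u skips a.txt because the destination is newer than the source.
cp -uax "$src" "$dst"
cat "$dst/data/a.txt"   # prints "newer copy"
```

So re-running the same command after an interruption should only copy
what is missing or out of date at the destination.
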
The copies run fine for a couple of hours, but then when I try to run
another command (ls) in a separate ssh session, I'm told that the new
server has closed the connection. The two cp commands still appear to
be running, but when I type ^Z in those ssh sessions, they are
terminated as well.
If I go to the new server's screen, it is blank and I haven't found any
key sequence that makes anything appear on it (Ctrl-Alt-F1 doesn't, for
example).
So it's beginning to sound like it's crashed, yes? But it hasn't. The
disk activity lights for the drives where /new-directory lives are still
flashing, and running top on the old server shows that a couple of nfsd
processes are busy and that the old server is spending a lot of time
accessing its disks (there's not much else running there), so it seems
that the cp processes are still running.
At this point I don't see any alternative to rebooting, but I thought I'd
ask first to see if anybody had any other ideas on ways to gather
information.
These symptoms have occurred once before. When I rebooted that time, the
system came up without one of the data disks; the system said the
interface wasn't responding. But when I unplugged the disk, plugged it
back in and rebooted, it came back. A lot else was going on at that
time, so I can't be sure it was as simple as it sounds. I'll see if the
same happens this time. I wasn't able to spot anything useful in the
logs, and SMART said the disk was OK, AFAICT.
The new data directory is in an ext4 filesystem on an LVM volume on an
mdadm RAID10, all on openSUSE 11.2, so another possibility is some kind
of problem in one of those layers. The drive is in a SATA hot swap cage
connected to a port on the motherboard; it's a WD RE4 1.5 TB.
All ideas gratefully received!