Mailinglist Archive: opensuse (2912 mails)

Re: [SLE] Cannot upgrade to 9.2
  • From: "Carl E. Hartung" <suselinux@xxxxxxxxxxxxx>
  • Date: Thu, 17 Feb 2005 12:52:57 -0500
  • Message-id: <4214D9F9.9060508@xxxxxxxxxxxxx>

Marc Chamberlin wrote:
> I'm sorry, your joke went over my head...

Sorry, I forgot to throw a :-D in there.

> As far as progress goes, no joy yet. I have disconnected/removed the
> SCSI card, modem card and one of the two CD drives from my computer in
> an attempt to isolate and remove some of the variables.

I was going to recommend a time-consuming but time-tested version of
this process, but figured you'd probably already thought about it. It
takes a lot of time and patience, but it almost always yields favorable
results -- even if it means working at it for a couple of days and you
end up changing out an uncooperative piece of hardware. (I already
mentioned that this is a time-consuming process, right? Also, it's a
really good idea to keep a fresh pot of coffee going and to have all
your hardware documentation available.) Finally, Marc, please don't
interpret this as /instructions/ on how to proceed, but go through it
and pick out what seems reasonable given all the variables you're juggling:

a.) See if Google reveals any field evidence of similar troubles /or/
successes involving a current kernel on this motherboard and/or SCSI
controller. Who knows? Maybe 100% of the installations on this
hardware combination fail unless you turn the box on its side? (You get
my drift.) It'd be nice to know this kind of thing early on, wouldn't it?

--> Here's an excerpt from my first Google 'hit' with the search string
"Adaptec 29160 Ultra160 SuSE Linux 9.2":
With the 2.6.8 kernel from SUSE I only get:
In /proc I have:
# cat /proc/scsi/aic7xxx/2 # I have 2 3ware cards in the same system
that are scsi0 and scsi1
Adaptec AIC7xxx driver version: 6.2.36
Adaptec 29160 Ultra160 SCSI adapter
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
Allocated SCBs: 4, SG List Length: 128
Serial EEPROM:
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0x08f4 0x7c5c 0x2807 0x0010 0x0300 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0250 0xe64e

There are probably other relevant experiences documented out there that
you can mine for valuable information and ideas.
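
If you end up collecting several reports like the one above, it can help to boil each one down to the same few fields so they are easy to compare. The following is only a rough sketch in Python: the function name and the dictionary keys are my own invention, and the sample text is simply the excerpt quoted above.

```python
import re

# The excerpt quoted above, used here as sample input.
SAMPLE = """\
Adaptec AIC7xxx driver version: 6.2.36
Adaptec 29160 Ultra160 SCSI adapter
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
Allocated SCBs: 4, SG List Length: 128
Serial EEPROM:
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a 0xc33a
0x08f4 0x7c5c 0x2807 0x0010 0x0300 0xffff 0xffff 0xffff
0xffff 0xffff 0xffff 0xffff 0xffff 0xffff 0x0250 0xe64e
"""


def summarize_aic7xxx(text):
    """Extract driver version, adapter name and EEPROM words from a dump.

    Hypothetical helper: the field names are illustrative only.
    """
    info = {"driver_version": None, "adapter": None, "eeprom_words": []}
    in_eeprom = False
    for raw in text.splitlines():
        line = raw.strip()
        version = re.match(r"Adaptec AIC7xxx driver version:\s*(\S+)", line)
        if version:
            info["driver_version"] = version.group(1)
        elif re.match(r"Adaptec .* SCSI adapter$", line):
            info["adapter"] = line
        elif line.startswith("Serial EEPROM"):
            in_eeprom = True
        elif in_eeprom:
            words = re.findall(r"0x[0-9a-fA-F]{4}", line)
            if words:
                info["eeprom_words"].extend(words)
            else:
                in_eeprom = False  # first non-hex line ends the EEPROM block
    return info


summary = summarize_aic7xxx(SAMPLE)
print(summary["driver_version"], "|", summary["adapter"])
print(len(summary["eeprom_words"]), "EEPROM words")
```

Two reports that show the same adapter but different driver versions are then easy to spot side by side.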

b.) If a. yields nothing that moves me forward after a reasonable effort
and I'm stuck with an unexplained endless boot/crash cycle, I get
surgical: First, is the latest BIOS version installed in the
motherboard? If not and I'm experiencing problems like this, I upgrade
it and move on.

c.) Are all the options in my CMOS setup really configured the way they
need to be? Are there any comments in the release notes or installation
manual(s) related to the items controlled here? When that all checks
out, I go to phase II, which involves gutting the system:

d.) I create the boot and module floppies and boot with those. No
CD/DVD drives, no SCSI controller or drives, no fancy graphics card, no
modem, no nada installed -- just the mainboard, CPU, RAM, floppy drive
and a keyboard. When I boot it the first time in this "bare bones"
configuration, I run memtest86. Normally, you want to perform a long
test, but this is an existing machine that has already been "burned in"
through daily use. If the memory passes muster, I reboot and spend some
time studying system messages and the results of "Hardware Information"
(hwinfo). I am most interested in understanding the "starting point"
resource allocations so I can track them through the rest of the procedure.

e.) I begin adding hardware back one piece at a time, rebooting each time
and updating my notes until I experience a "hiccup." I keep problem
pieces off to the side and continue with the rest of my hardware until
I've got everything installed that works without me intervening
manually. I'd probably save a special graphics card for next-to-last
and the SCSI controller/drives combination for the very last items to
install. The time-consuming part of this comes from rebooting using the
floppies and taking notes each time. It's a PITA, but sometimes this
'fine-tooth combing' of the system uncovers overlooked configuration
anomalies -- not necessarily the main problem you're looking for, but
factors that could be inadvertently contributing to it.

f.) At this point, you end up with your problem hardware isolated from a
stable, verified, bootable and testable base system. It
is then a matter of tackling the problem hardware pieces, one at a time,
and fully resolving any configuration issues with them (or installing
alternatives) prior to continuing the process.
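
For anyone who thinks better in code, the add-one-piece-at-a-time part of the procedure can be sketched roughly as below. This is only an illustration: the function name, the part names and the `boots_cleanly` callback are all hypothetical, and in real life that "callback" is you, rebooting from the floppies and taking notes.

```python
def isolate_problem_hardware(base, candidates, boots_cleanly):
    """Add candidate parts one at a time to a known-good base system.

    base          -- parts already verified (mainboard, CPU, RAM, ...)
    candidates    -- parts to add back, in the order you want to try them
    boots_cleanly -- hypothetical check: takes the list of installed parts
                     and returns True if the machine boots without a hiccup
    """
    installed = list(base)  # the verified configuration grows here
    set_aside = []          # problem pieces to tackle one by one later
    notes = []              # one entry per reboot, like the paper notes
    for part in candidates:
        trial = installed + [part]
        ok = boots_cleanly(trial)
        notes.append((part, "ok" if ok else "hiccup"))
        if ok:
            installed = trial
        else:
            set_aside.append(part)  # leave it out and carry on with the rest
    return installed, set_aside, notes


# Simulated run: pretend the modem is the piece that upsets the boot.
base = ["mainboard", "cpu", "ram", "floppy", "keyboard"]
candidates = ["cd-drive", "modem", "graphics-card", "scsi-controller"]
installed, set_aside, notes = isolate_problem_hardware(
    base, candidates, lambda parts: "modem" not in parts
)
print("working:", installed)
print("set aside:", set_aside)
```

Note that a failing part is left out before the next one is tried, so every later test still runs against a configuration that is known to boot.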

> At one point I saw something confusing, it looked as if the scripts are
> trying to set up a RAID set of drives. I don't understand why that was
> happening as I do not have a RAID system.

Messages like that are not uncommon. It's just the system 'waking up'
and probing to discover the parts of its environment that haven't been
explicitly defined. Take a look at these excerpts (chopped up &
commented) from my own startup log:

<4>Uncovering SIS18 that hid as a SIS503 (compatible=0)
--> a second probe was required to correctly ID the chip
<6>audit: initializing netlink socket (disabled)
--> process starts at each system boot for no purpose. It is disabled.
<6>SIS5513: not 100%% native mode: will probe irqs later
--> must be handled differently (i.e. it's been patched)
<7>Probing IDE interface ide2...
<7>ide2: Wait for ready failed before probe !
<7>Probing IDE interface ide3...
<7>ide3: Wait for ready failed before probe !
<7>Probing IDE interface ide4...
<7>ide4: Wait for ready failed before probe !
<7>Probing IDE interface ide5...
<7>ide5: Wait for ready failed before probe !
--> the system only has ide0 and ide1 for up to four devices, but the
process marches along probing higher anyway
<7>PM: Reading pmdisk image.
<7>swsusp: Resume From Partition: /dev/hdc1
<7><3>swsusp: Invalid partition type.
<7>pmdisk: Error -22 resuming
<7>PM: Resume from disk failed.
--> I'm glad it finally figured out it wasn't resuming from suspend
<6>md: Autodetecting RAID arrays.
<6>md: autorun ...
<6>md: ... autorun DONE.

This last message might be related to the one you spotted. I don't
think it's conclusive enough at this point to start focusing on it.
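
If you want to sift a long startup log for one subsystem (the "md:" lines, say), a small filter is enough. The sketch below is an assumption-laden illustration, not a standard tool: the function name is made up, the `<N>` prefix is read as the usual kernel log level (3 = err, 4 = warning, 6 = info, 7 = debug), and where a line carries two prefixes I simply take the first one.

```python
import re

# Standard kernel log level names, indexed by the digit in the <N> prefix.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# First <N> prefix is captured; any further stacked prefixes are skipped.
LINE_RE = re.compile(r"^<(\d)>(?:<\d>)*(.*)$")


def parse_boot_line(line):
    """Return (level, subsystem, message), or None for non-log lines.

    The subsystem is whatever precedes the first ': ' when that prefix is a
    single word; otherwise it is None. This is a heuristic, nothing more.
    """
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    level = LEVELS[int(match.group(1))] if int(match.group(1)) < 8 else "unknown"
    message = match.group(2)
    head, sep, rest = message.partition(": ")
    if sep and " " not in head:
        return level, head, rest
    return level, None, message


log = [
    "<7>ide2: Wait for ready failed before probe !",
    "<7><3>swsusp: Invalid partition type.",
    "<6>md: Autodetecting RAID arrays.",
    "<6>md: autorun ...",
    "<6>md: ... autorun DONE.",
]
for entry in log:
    parsed = parse_boot_line(entry)
    if parsed and parsed[1] == "md":
        print(parsed[0], "|", parsed[2])
```

Run against a full log, this makes it easy to see whether the RAID autodetection ever reports anything beyond the three routine lines shown above.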

> I do have a lot of partitions on my disk drives, mostly Fat32
> and NFS for Windows (my computer is a dual, actual three Win98,
> Win2000 and Linux) boot system and I wonder if having a lot of
> partitions might be confusing the install scripts?

You /did/ mean "NTFS," didn't you? And it's possible you've exceeded a
limitation of some kind that requires manual intervention. Since you've
brought it up, can you post details about your drives and partitioning?
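
A quick way to gather part of that information from any Linux system that does boot on the machine (a rescue system, for instance) is to tally the entries in /proc/partitions per drive. This is just a sketch: the function name is hypothetical, the sample numbers below are invented for illustration, and the trailing-digit trick only makes sense for hdX/sdX style names.

```python
import re
from collections import Counter

# Invented sample in the layout of /proc/partitions on a 2.6 kernel.
SAMPLE = """\
major minor  #blocks  name

   3     0   40146624 hda
   3     1    8193118 hda1
   3     2          1 hda2
   3     5   16386268 hda5
   8     0   17921835 sda
   8     1   17920476 sda1
"""


def partitions_per_disk(text):
    """Count partitions per whole disk from /proc/partitions-style text."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly: major, minor, #blocks (all numeric), name.
        if len(fields) != 4 or not all(f.isdigit() for f in fields[:3]):
            continue  # header or blank line
        name = fields[3]
        disk = re.sub(r"\d+$", "", name)  # hda5 -> hda (assumes hdX/sdX names)
        if name != disk:
            counts[disk] += 1  # only partitions are counted, not the disk itself
    return dict(counts)


print(partitions_per_disk(SAMPLE))

# On a live system, the real file can be fed in the same way:
# with open("/proc/partitions") as handle:
#     print(partitions_per_disk(handle.read()))
```

The counts, together with the partition types and sizes, are the kind of detail that would be useful to post.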

> Bottom line, my computer is still crashing as always and I am both
> contemplating a next step and diagnosing/researching.....

As Sid and Basil (and many others on previous threads) have stated,
installing 9.2 is -- at the very least -- much more challenging than
earlier SuSE versions. That's been my experience, as well. But don't
give up hope yet, whatever you do. It's just a matter of persistence.
When your setup is finally humming along, you'll have a great
installation woes story to tell.

- Carl

P.S. - Thanks to all who responded on and off the list regarding my
"quotes" character problem. It does appear to be fixed, finally, and
getting accurate feedback was a great part of the process. So, thanks

--
C. E. Hartung Business Development & Support Services carlh@xxxxxxxxxxxxx
Dover Foxcroft, Maine, USA Public Key #0x68396713
Reg. Linux User #350527

