I just noticed a 1.02 BIOS go up in the Tyan directory. You might want to wait a bit before trying it out, since it seems to have re-broken DMA to the 3ware card and eaten my / again...

-mcq
On Wed, 4 Feb 2004 12:41:03 -0800 mcq1@viz.cacr.caltech.edu wrote:
> I just noticed a 1.02 BIOS go up in the Tyan directory. You might want to wait a bit before trying it out, since it seems to have re-broken DMA to the 3ware card and eaten my / again...
Was this with the latest update kernel? And do you have more than 3GB of RAM?

-Andi
>> I just noticed a 1.02 BIOS go up in the Tyan directory. You might want to wait a bit before trying it out, since it seems to have re-broken DMA to the 3ware card and eaten my / again...
> Was this with the latest update kernel? And do you have more than 3GB of RAM?
This was both after boot and during POST (corrupt card ID messages and system hang), so it's definitely a hardware thing. Tyan was quite helpful and pointed me at this:

https://www.3ware.com/kbadmin/attachments/TM900-0045-00%20Rev%20A_P.pdf

which hints that all these weirdo problems might involve sketchy signalling that's just getting aggravated by some weird timing somewhere or something. Nice, eh? I hope 3ware replaces these 36 cards I just bought.

I kind of got the feeling from Tyan that they consider this to be 3ware's problem and that the issue is closed. But I'm not using risers, and my system is rock solid (up, thrashing disk for over two weeks with no problems) with the 0.01b BIOS. With 1.02 it dies about 20% of the time during POST, and blows panic or dynamic linking chunks very quickly after boot in that other 80%. And presumably a PCI parity error would be detected; I see no messages mentioning any such thing -- I just see silent data corruption across the bus just like the iommu problem (except that it happens at POST now too).

And the 3ware iommu bug also hit Qlogic cards, yes? I can't believe those have signalling problems considering they plug into every weirdo PCI-carrying not-a-PC jumbo datacenter rack monster on the planet without a problem.

But recall that the iommu bug (as far as I know) was never explained -- it was merely noted that the flush optimization triggered it, so the optimization was backed out. So it might have been the same tim'rous signalling beastie aggravated in both cases...er, I guess.

The upshot at the moment seems to be that if Tyan hears more complaints from more people for more cards than just the 3ware, perhaps we'll hear more. Until then, it feels like they're going to treat the card (rightly, perhaps) as not worth engineering thought. Bummer for us folks who bought a zillion and now have to play BIOS highwire trying to get them to work.

-mcq
On Wed, 4 Feb 2004 19:10:56 -0800 mcq1@viz.cacr.caltech.edu wrote:
>>> I just noticed a 1.02 BIOS go up in the Tyan directory. You might want to wait a bit before trying it out, since it seems to have re-broken DMA to the 3ware card and eaten my / again...
>> Was this with the latest update kernel? And do you have more than 3GB of RAM?
> This was both after boot and during POST (corrupt card ID messages and system hang), so it's definitely a hardware thing. Tyan was quite helpful and pointed me at this:
Ok thanks. Please always mention such things when you talk about corruptions.
> And the 3ware iommu bug also hit Qlogic cards, yes? I can't believe those
Yes it did. But I finally gave in and disabled the optimization to avoid corrupting people's disks. I still believe it's a bug somewhere in the hardware, not the software, that is probably just hidden by the more frequent flushes.
> have signalling problems considering they plug into every weirdo PCI-carrying not-a-PC jumbo datacenter rack monster on the planet without a problem.
I don't consider it that unlikely that two independent cards have the same issues. Maybe they used the same macro cell to interface to PCI, bought from some IP vendor or whatever. Or maybe it's a bug in the chipset. Or a combination of both.
> But recall that the iommu bug (as far as I know) was never explained -- it was merely noted that the flush optimization triggered it, so the optimization was backed out. So it might have been the same tim'rous signalling beastie aggravated in both cases...er, I guess.
Yep.
> The upshot at the moment seems to be that if Tyan hears more complaints from more people for more cards than just the 3ware, perhaps we'll hear more.
Thanks for the update. Unfortunately we hear the complaints too and people blame such things on Linux :-/ But it's good that it at least breaks at POST now too, this makes it clearer where the problems are.

-Andi
On Wed, 4 Feb 2004 mcq1@viz.cacr.caltech.edu wrote:
> I kind of got the feeling from Tyan that they consider this to be 3ware's problem and that the issue is closed.
I have no input on whether 3ware or Tyan is to blame this time (both my Opterons have Rioworks mainboards), but what you're saying there is typical of Tyan's reaction whenever there is a compatibility problem with other vendors' hardware. With various ATI graphics adaptors and various D-link NICs, Tyan usually responds very quickly, blaming the other vendor. I'm not saying they're wrong, only that they usually draw that particular conclusion much too soon.

Bjørn
--
Bjørn Tore Sund       Phone:  (+47) 555-84894     Stupidity is like a
System administrator  Fax:    (+47) 555-89672     fractal; universal and
Math. Department      Mobile: (+47) 918 68075     infinitely repetitive.
University of Bergen  VIP:    81724
teknisk@mi.uib.no     Email:  bjornts@mi.uib.no   http://www.mi.uib.no/