what has become of the latest kernel hang/freeze bug?
hi there,
the latest suse security announcement for the dhcp packages doesn't list any pending problems with the kernel packages. i was wondering if i have missed anything in the discussion regarding the freezes and hangs many people experienced when the kernel update for the floating point exception problem was released.
how did people work around the freeze when booting this new kernel, or does suse have any plans to fix or re-release a kernel that fixes this bug?
thanks already for any hints.
cheers
andy
Andreas Bittner wrote:
hi there,
the latest suse security announcement for the dhcp packages doesn't list any pending problems with the kernel packages.
The announcement for the updated kernel packages (version 2.4.1-226 for 9.0, version 2.6.5-7.75 for 9.1) was made on June 16.
--
Phil Brutsche phil@optimumdata.com
On Wednesday 23 June 2004 08:52, Phil Brutsche wrote:
Phil Brutsche wrote:
version 2.4.1-226 for
Oops, should have read "2.4.21-226"
What does it matter what it should have read? Your post was completely off target to the original question. The original question dealt with the DAMAGE done by the updated kernel packages.
--
John Andersen
And for many of us the "fix" was worse than the original problem. The unpatched kernel would at least BOOT.
JimW
And for many of us the "fix" was worse than the original problem. The unpatched kernel would at least BOOT.
this is what i was asking: will suse release new kernel images/patches to fix the boot problem? i also have some boxes that experience the boot freeze with the latest kernel releases from suse for 9.1. any hints?
On Wednesday 23 June 2004 19:14, Jim Westbrook wrote:
And for many of us the "fix" was worse than the original problem. The unpatched kernel would at least BOOT.
But this machine is safe now :-) (waiting for a new kernel too)
If I had to guess, I would say DMA is involved, because it doesn't work on a machine here where one drive has DMA problems and I therefore disabled it, while this machine has the same chipset (815i) and processor as a working machine.
Al
On Wednesday 23 June 2004 10:05, Al Bogner wrote:
On Wednesday 23 June 2004 19:14, Jim Westbrook wrote:
And for many of us the "fix" was worse than the original problem. The unpatched kernel would at least BOOT.
But this machine is safe now :-) (waiting for a new kernel too)
If I had to guess, I would say DMA is involved, because it doesn't work on a machine here where one drive has DMA problems and I therefore disabled it, while this machine has the same chipset (815i) and processor as a working machine.
Al
Many others reported this happened with older drives in the 4 to 10 gig size. That size was about the time the 80wire came into common use - (what is that ata66 or something?).
How could this patch, which was only to fix the infinite loop problem (which one had to specifically compile a program to actually demonstrate), manage to break disk access that has been working for years...
--
John Andersen
On Wednesday 23 June 2004 20:16, John Andersen wrote:
Many others reported this happened with older drives in the 4 to 10 gig size. That size was about the time the 80wire came into common use - (what is that ata66 or something?).
I don't think it is directly related to the harddrives themselves. I have a couple of 30GB drives (from the same batch) in different machines. On some machines, the kernel update freezes the machine, while on others, it works just fine. With identical harddrives and wiring. I'd say it has something to do with the chipset on the motherboard.
How could this patch, which was only to fix the infinite loop problem (which one had to specifically compile a program to actually demonstrate), manage to break disk access that has been working for years...
Because it uses a different/newer version of the kernel? Note that (contrary to the common way of backporting bugfixes) the kernel update upgraded the kernel from 2.6.4 to 2.6.5. I don't know the difference between those two, but usually upgrading to a newer version means that you'll get new bugs too. Best regards, Arjen
I don't think it is directly related to the harddrives themselves. I have a couple of 30GB drives (from the same batch) in different machines. On some machines, the kernel update freezes the machine, while on others, it works just fine. With identical harddrives and wiring. I'd say it has something to do with the chipset on the motherboard.
I have four machines running 9.1 here. Three (slightly older ones) work fine with the new kernel (one Asus Centrino notebook, one older P4 and a Duron box). With the fourth one (Athlon64, ASUS K8V) I experience the problem. My first thought was that it could have something to do with SATA, as the three working ones don't have it. On the other hand, if you have problems with older machines... But I agree that it probably has to do with the chipset.
Regards,
Christian
On Thursday 24 June 2004 11:33, Christian Richter wrote:
But I agree that it probably has to do with the chipset though.
Hmm, I suppose that only the drive where /boot and / are located is important.
Soyo SY-7IS2 (815 chipset) with 10GB IBM-DTTA-351010: hangs
MSI MS-6337 (815 chipset) with 40GB QUANTUM FIREBALLP AS40.0: works
Asustek TUSL2-C (815 chipset) with 160GB Hitachi HDS722516VLAT80: works
Asus P3B-F (440 BX chipset) with 8GB IBM-DTTA-350840: works
Elitegroup P6S5AT (SIS 635T chipset) with 8GB Seagate ST38420A: works
Epox EP-8K9A9+ (KT400A chipset) with 40GB Maxtor 5T040H4: works
and the rest of my machines are waiting for a new kernel :-)
I don't know which machines use 40-wire cables, but I am sure some do.
Al
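The six reports in Al's message can be tabulated to check whether chipset or drive size alone separates the hanging machine from the working ones. This is only a rough sketch in Python: the data is copied from the message above, while the `Report` tuple and the helper names `outcomes_by` and `mixed_groups` are made up for illustration, and six machines are far too few to prove anything.

```python
from collections import defaultdict, namedtuple

# One row per machine, copied from the message above.
Report = namedtuple("Report", "board chipset drive_gb drive outcome")

REPORTS = [
    Report("Soyo SY-7IS2", "815", 10, "IBM-DTTA-351010", "hangs"),
    Report("MSI MS-6337", "815", 40, "QUANTUM FIREBALLP AS40.0", "works"),
    Report("Asustek TUSL2-C", "815", 160, "Hitachi HDS722516VLAT80", "works"),
    Report("Asus P3B-F", "440BX", 8, "IBM-DTTA-350840", "works"),
    Report("Elitegroup P6S5AT", "SIS 635T", 8, "Seagate ST38420A", "works"),
    Report("Epox EP-8K9A9+", "KT400A", 40, "Maxtor 5T040H4", "works"),
]


def outcomes_by(reports, field):
    """Map each value of `field` to the set of outcomes seen for it."""
    grouped = defaultdict(set)
    for report in reports:
        grouped[getattr(report, field)].add(report.outcome)
    return dict(grouped)


def mixed_groups(reports, field):
    """Return the values of `field` for which both outcomes were reported."""
    grouped = outcomes_by(reports, field)
    return sorted(key for key, seen in grouped.items() if len(seen) > 1)


if __name__ == "__main__":
    for field in ("chipset", "drive_gb"):
        print(field, "-> mixed results for:", mixed_groups(REPORTS, field))
```

In this small sample the 815 chipset shows up with both outcomes, so the chipset alone does not predict the hang here, and the only hanging machine is also the only one with a 10GB drive.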
On Thursday 24 June 2004 13:44, Al Bogner wrote:
I suppose that only the drive where /boot and / are located is important.
Don't think so. Here are two of the most similar machines (with identical setup for the filesystems) that show very different behaviour:
ChainTech 5VGM0 (i430VX) with Maxtor 33073H3 revision YAH814Y0: fails
ChainTech 6BTM (i440BX) with Maxtor 33073H3 revision YAH814Y0: OK
The drivers used are identical (according to hwinfo), so it looks to me like something else is going wrong.
Best regards,
Arjen
On Thursday 24 June 2004 03:44, Al Bogner wrote:
On Thursday 24 June 2004 11:33, Christian Richter wrote:
But I agree that it probably has to do with the chipset though.
Hmm,
I suppose that only the drive where /boot and / are located is important.
Soyo SY-7IS2 (815 chipset) with 10GB IBM-DTTA-351010: hangs
MSI MS-6337 (815 chipset) with 40GB QUANTUM FIREBALLP AS40.0: works
Asustek TUSL2-C (815 chipset) with 160GB Hitachi HDS722516VLAT80: works
Asus P3B-F (440 BX chipset) with 8GB IBM-DTTA-350840: works
Elitegroup P6S5AT (SIS 635T chipset) with 8GB Seagate ST38420A: works
Epox EP-8K9A9+ (KT400A chipset) with 40GB Maxtor 5T040H4: works
and the rest of my machines are waiting for a new kernel :-)
I don't know which machines use 40-wire cables, but I am sure some do.
Al
Of those you listed, only the 10 gig and the two 8 gig drives would likely use a 40-wire cable, but the IBM may be of a vintage that can go with either a 40 or an 80 wire cable.
--
John Andersen
On Wednesday 23 June 2004 20:16, John Andersen wrote:
On Wednesday 23 June 2004 10:05, Al Bogner wrote:
On Wednesday 23 June 2004 19:14, Jim Westbrook wrote:
And for many of us the "fix" was worse than the original problem. The unpatched kernel would at least BOOT.
But this machine is safe now :-) (waiting for a new kernel too)
If I had to guess, I would say DMA is involved, because it doesn't work on a machine here where one drive has DMA problems and I therefore disabled it, while this machine has the same chipset (815i) and processor as a working machine.
Al
Many others reported this happened with older drives in the 4 to 10 gig size.
Here it is a 10GB IBM-DTTA-351010. If anyone is interested in seeing my boot.msg with the working 2.6.4-54.5-default and hwinfo, have a look at http://lists.suse.com/archive/suse-linux/2004-Jun/2602.html
But I compiled a plain vanilla 2.6.7 kernel today on another machine with a 2GB Maxtor 82100D4 and it worked without problems. I don't want to try a 2.6.5-7.75-default from SuSE on this machine, though. :-)
Al
In article <200406232005.13210.suse-linux@ml04q2.pinguin.uni.cc>, Al Bogner <suse-linux@ml04q2.pinguin.uni.cc> wrote:
On Wednesday 23 June 2004 19:14, Jim Westbrook wrote:
And for many of us the "fix" was worse than the original problem. The unpatched kernel would at least BOOT.
But this machine is safe now :-) (waiting for a new kernel too)
If I had to guess, I would say DMA is involved, because it doesn't work on a machine here where one drive has DMA problems and I therefore disabled it, while this machine has the same chipset (815i) and processor as a working machine.
I too have had a DMA problem with nfsd/filesystem - but so far not since the patched 2.6.5-7.75 for 9.1.
I do, however, have two quite bad problems:
(1) Segmentation fault in line 162 of /usr/share/YaST2/clients/sw_single.ycp, which says Pkg::SourceStartCache(true), which I presume is a call to some perl package. This prevents software update! This is new since the kernel update (obtained on-line on 22nd June!).
(2) Still (right from the first install of 9.1 over 8.2) there are intermittent total hang-ups which seem to be associated with the Touchpad Synaptic driver - only a power switch off has any effect (keyboard driver also not reacting). The kernel seems to 'freeze', as I can no longer get a response when pinging the machine once this has occurred - so it isn't really a security problem - I think(!?!).
Keith Hopper
--
City Desk Waikato University [PGP key available if desired]
The announcement for the updated kernel packages (version 2.4.1-226 for 9.0, version 2.6.5-7.75 for 9.1) was made on June 16.
and what is this information supposed to mean? i know when the latest kernel fixes from suse were released, but my question was: what about the trouble they caused, and will the bad fix be fixed sometime soon? thanks.
Andreas Bittner wrote:
i know when the latest kernel fixes from suse were released, but my question was: what about the trouble they caused, and will the bad fix be fixed sometime soon?
I misunderstood your question, then. I wasn't aware these updated kernels had problems - I have 3 9.0 and 1 9.1 systems that don't have any problems running these kernels. -- Phil Brutsche phil@optimumdata.com
I misunderstood your question, then. I wasn't aware these updated kernels had problems - I have 3 9.0 and 1 9.1 systems that don't have any problems running these kernels.
I think we've found it. If you feel like it, please try out the kernels of the day (kotd), to be found at http://ftp.suse.com/pub/projects/kernel/kotd/i386/. There is a kernel update pending to fix the hangs that you might see under some circumstances. And there are some more things in the queue for this update...
Thanks,
Roman.
I misunderstood your question, then. I wasn't aware these updated kernels had problems - I have 3 9.0 and 1 9.1 systems that don't have any problems running these kernels.
I don't have any problems on one box, because YaST fails to install the patch (maybe due to the fact that our T1 line broke during the update). Maybe good or bad or vice versa?
I think we've found it. If you feel like, please try out the kernels of the day (kotd), to be found at http://ftp.suse.com/pub/projects/kernel/kotd/i386/.
There is a kernel update pending to fix the hangs that you might see under some circumstances. And there are some more things in the queue for this update...
Ah, this time we have no announcement and officially we are testing everything we get from the list, or what? I don't want to be a beta-tester at all, or I switch to debian.
<A little bit angry about the update policy of SuSE with buggy updates lately>
Philippe
P.S.: Please fix the grub setup problem with scsi and ataraid: after installation I cannot boot correctly (I know how to set this up correctly, but it annoys me).
Philippe Vogel wrote:
Ah this time we have no announcement and officially we are testing everything we get from the list or what?
As Roman wrote, test it "if you feel like" it. If you don't, wait for the next official release.
I don't want to be a beta-tester at all or I switch to debian.
Chuckle. Who do you think cares about this kind of "threat"? :-)
--
Yours sincerely
Dipl. Inform. Ralph Seichter
HORUS-IT, Ahornweg 10, D-57635 Oberirsen
Tel +49 2686 987880, Fax +49 2686 987889
http://horus-it.de/
On Thu, 24 Jun 2004, Philippe Vogel wrote:
I misunderstood your question, then. I wasn't aware these updated kernels had problems - I have 3 9.0 and 1 9.1 systems that don't have any problems running these kernels.
I don't have any problems on one box, because YaST fails to install the patch (maybe due to the fact that our T1 line broke during the update). Maybe good or bad or vice versa?
Depends on which bugs you like and which you don't. I never hesitate to get a bugfix or security update in, including on the kernel. Since the download and installation happen sequentially, if the download breaks, the rpm should be broken enough that YOU doesn't try to use it. Something must have broken during the actual installation of the patch. I'd seriously consider downloading the full kernel rpm and updating with rpm -Uhv to make sure you have a clean kernel install, not a partial installation of one or the other. Choose the kernel you like.
I think we've found it. If you feel like, please try out the kernels of the day (kotd), to be found at http://ftp.suse.com/pub/projects/kernel/kotd/i386/.
There is a kernel update pending to fix the hangs that you might see under some circumstances. And there are some more things in the queue for this update...
Ah this time we have no announcement and officially we are testing everything we get from the list or what?
There is a broken kernel update, the kernel before that has a security bug. Roman has given you the choice of which of three evils is the lesser one for you and you're complaining because there's no longer just two evil options?
I don't want to be a beta-tester at all or I switch to debian.
And debian is bug-free, the people behind being perfect people who never make mistakes? Sheesh, get real.
<A little bit angry about the updatepolicy of SuSE with buggy updates the last time>
It's not the first mistake they've made, and it won't be the last. As soon as they get feedback that they've made a mistake, they do a quick fix and both test it themselves and ask for volunteers to do so. They're professional and serious and doing their best - and they're doing no worse than any competitors.
P.S.: Please fix the grub setup problem with scsi and ataraid: after installation I cannot boot correctly (I know how to set this up correctly, but it annoys me).
I've done quite a few scsi installations with 9.1 by now, without any problems. What's wrong?
Bjørn
--
Bjørn Tore Sund, System administrator, Math. Department, University of Bergen
Phone: (+47) 555-84894, Fax: (+47) 555-89672, Mobile: (+47) 918 68075, VIP: 81724
Support: system@mi.uib.no, Contact: teknisk@mi.uib.no, Direct: bjornts@mi.uib.no
Stupidity is like a fractal; universal and infinitely repetitive.
Does this only affect the 2.6 kernels? Did anyone have any problems with 2.4 updates?
Selcuk
Roman Drahtmueller wrote:
I misunderstood your question, then. I wasn't aware these updated kernels had problems - I have 3 9.0 and 1 9.1 systems that don't have any problems running these kernels.
I think we've found it. If you feel like, please try out the kernels of the day (kotd), to be found at http://ftp.suse.com/pub/projects/kernel/kotd/i386/.
There is a kernel update pending to fix the hangs that you might see under some circumstances. And there are some more things in the queue for this update...
Thanks, Roman.
On Thursday 24 June 2004 10:20, Selcuk Ozturk wrote:
Does this only affect the 2.6 kernels? Did anyone have any problems with 2.4 updates?
Selcuk
Roman Drahtmueller wrote:
I think we've found it. If you feel like, please try out the kernels of the day (kotd), to be found at http://ftp.suse.com/pub/projects/kernel/kotd/i386/.
2.6 only AFAIK.
--
John Andersen
Does this only affect the 2.6 kernels? Did anyone have any problems with 2.4 updates?
We have problems with the 2.4 update. We have two 8.2 systems with problems with the latest 8.2 kernel k_deflt-2.4.20-113.i586.rpm. Since the kernel update, freeswan 1.99 with NAT traversal enabled won't work anymore. A downgrade to the previous kernel k_deflt-2.4.20-111.i586.rpm solved the problem. I filed a bug report at the suse homepage some days ago, but got no feedback.
Here is an excerpt from the logfile:
Jun 17 07:38:27 x0070 pluto[845]: ERROR: "dhcp2"[1] 217.187.55.237:4500 #10: pfkey write() of SADB_ADD message 25 for Add ESP SA esp.4b3171fc@62.xxx.xxx.xx failed. Errno 22: Invalid argument
Jun 17 07:38:27 x0070 pluto[845]: | 02 03 00 03 14 00 00 00 19 00 00 00 4d 03 00 00
Jun 17 07:38:27 x0070 pluto[845]: | 02 00 01 00 4b 31 71 fc 40 01 02 03 00 00 00 00
Jun 17 07:38:27 x0070 pluto[845]: | 03 00 05 00 00 00 00 00 02 00 00 00 d9 bb 37 ed
Jun 17 07:38:27 x0070 pluto[845]: | 00 00 00 00 00 00 00 00 03 00 06 00 00 00 00 00
Jun 17 07:38:27 x0070 pluto[845]: | 02 00 00 43 3e 99 db 28 00 00 00 00 00 00 00 00
Jun 17 07:38:27 x0070 pluto[845]: | 03 00 08 00 80 00 00 00 a3 3f 08 1a 08 c9 47 7c
Jun 17 07:38:27 x0070 pluto[845]: | 47 e7 7c 84 f4 a0 99 15 04 00 09 00 c0 00 00 00
Jun 17 07:38:27 x0070 pluto[845]: | 5c d4 de ae e0 27 25 04 d9 10 48 aa ae 79 9d a7
Jun 17 07:38:27 x0070 pluto[845]: | 88 84 62 55 7a 62 b4 15 01 00 1a 00 02 00 00 00
Jun 17 07:38:27 x0070 pluto[845]: | 01 00 1b 00 94 11 00 00 01 00 1c 00 94 11 00 00
Thanks
Alexander
Selcuk
Hi! On Mon, 28 Jun 2004, Alexander Maier wrote:
We have problems with the 2.4 update. We have two 8.2 systems with problems with the latest 8.2 kernel k_deflt-2.4.20-113.i586.rpm. Since the kernel update, freeswan 1.99 with NAT traversal enabled won't work anymore. A downgrade to the previous kernel k_deflt-2.4.20-111.i586.rpm solved the problem. I filed a bug report at the suse homepage some days ago, but got no feedback.
Here is an excerpt from the logfile: Jun 17 07:38:27 x0070 pluto[845]: ERROR: "dhcp2"[1] 217.187.55.237:4500 #10: pfkey write() of SADB_ADD message 25 for Add ESP SA esp.4b3171fc@62.xxx.xxx.xx failed. Errno 22: Invalid argument
I think this message means that the ipsec.o kernel module and Pluto (the userspace daemon which is part of the freeswan RPM) are "out of sync". I got exactly this message (same SuSE, same kernel) after I upgraded the freeswan RPM from http://www.suse.de/~garloff/linux/FreeSWAN/ without upgrading the ipsec.o module; after installing the matching km_freeswan and recompiling it, everything went smoothly.
Note: I had to upgrade because NAT traversal does *not* work correctly with plain k_deflt-2.4.20-111! In particular, when connecting to a Win2000/WinXP client behind a NAT router, freeswan still uses protocol 50 for encrypted traffic, while it *should* use UDP port 4500 instead; in some situations it might still work, but I don't count on it... (Depends on the router, I think - if it does NAT on protocol 50, it will work, otherwise it won't.)
I haven't updated the box in question yet, so I can only speculate: perhaps SuSE finally updated ipsec.o, but forgot to update the userspace tools? In that case, the freeswan RPM on http://www.suse.de/~garloff/linux/FreeSWAN/ might work...
Martin
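To see what Pluto actually tried to hand to the kernel, the first 16 bytes of the hex dump in Alexander's log can be decoded as a PF_KEY v2 base message header. This is a minimal sketch in Python, assuming the standard `sadb_msg` layout from RFC 2367 and little-endian byte order (the kernel package is an i586 build); the function name `decode_sadb_msg` and the small lookup tables are made up for illustration and cover only the values that occur here.

```python
import struct

# First 16 bytes of the dump logged by pluto, copied from the log above.
HEADER_HEX = "02 03 00 03 14 00 00 00 19 00 00 00 4d 03 00 00"

# Assumed constants from RFC 2367 (only the ones that appear in this dump).
SADB_TYPES = {3: "SADB_ADD"}
SADB_SATYPES = {2: "AH", 3: "ESP"}


def decode_sadb_msg(hex_bytes):
    """Decode a 16-byte PF_KEY v2 base header (struct sadb_msg)."""
    raw = bytes.fromhex(hex_bytes)
    # uint8 version, type, errno, satype; uint16 len, reserved; uint32 seq, pid
    version, msg_type, errno, satype, length, _reserved, seq, pid = struct.unpack(
        "<BBBBHHII", raw[:16]
    )
    return {
        "version": version,
        "type": SADB_TYPES.get(msg_type, msg_type),
        "errno": errno,
        "satype": SADB_SATYPES.get(satype, satype),
        "length_bytes": length * 8,  # sadb_msg_len counts 64-bit words
        "seq": seq,
        "pid": pid,
    }


if __name__ == "__main__":
    for key, value in decode_sadb_msg(HEADER_HEX).items():
        print(f"{key}: {value}")
```

Under these assumptions the header decodes to an SADB_ADD for an ESP SA with sequence number 25 from PID 845, which matches "message 25" and "pluto[845]" in the log line, and the length of 160 bytes matches the ten 16-byte rows of the dump. It does not show which extension the kernel rejected with errno 22, so it neither confirms nor rules out the "out of sync" explanation.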
On Thursday 24 June 2004 08:11, Roman Drahtmueller wrote:
I misunderstood your question, then. I wasn't aware these updated kernels had problems - I have 3 9.0 and 1 9.1 systems that don't have any problems running these kernels.
I think we've found it.
And you think it was.....??
--
John Andersen
participants (15): Al Bogner, Alexander Maier, Andreas Bittner, Arjen de Korte, Bjorn Tore Sund, Christian Richter, Jim Westbrook, John Andersen, Keith Hopper, Martin Köhling, Phil Brutsche, Philippe Vogel, Ralph Seichter, Roman Drahtmueller, Selcuk Ozturk