[Bug 815962] New: grub2 network boot fails -> couldn't send network packet

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c0

Summary: grub2 network boot fails -> couldn't send network packet
Classification: openSUSE
Product: openSUSE Factory
Version: 13.1 Milestone 0
Platform: Other
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Bootloader
AssignedTo: jsrain@suse.com
ReportedBy: rsalevsky@suse.com
QAContact: jsrain@suse.com
Found By: ---
Blocker: ---

Created an attachment (id=535830) --> (http://bugzilla.novell.com/attachment.cgi?id=535830)
debug messages

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.63 Safari/537.31

Grub2 fails to load files from the network with the error "couldn't send network packet". When I activate debug=all, a message appears saying that it could not open tftp. For more debug information, please look at the photo in the attachment. The failure seems to happen randomly at different stages while downloading the kernel or initrd. In very rare cases the download even succeeds.

Reproducible: Always

Steps to Reproduce:
1. Set up grub2 network boot
2. Set up a config
3. Load the kernel and initrd from the network

Actual Results:
Loading the kernel fails with "error: couldn't send network packet"

Expected Results:
The kernel and initrd load and the system installs.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c

Jiri Srain <jsrain@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|jsrain@suse.com             |mchang@suse.com

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c1

--- Comment #1 from Michael Chang <mchang@suse.com> 2013-04-19 08:33:04 UTC ---
My test of grub2's PXE boot on Intel's Tunnel Mountain (IPv4) is quite stable, with no problems so far.

An interesting finding is that the log also says "Opening tftp ... failed" even though booting succeeds. So the only meaningful message in the debug log is "error, couldn't send network packet".

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c2

--- Comment #2 from Michael Chang <mchang@suse.com> 2013-04-19 09:26:35 UTC ---
I checked against the latest upstream efinet.c: we are one patch behind, and that patch looks quite promising as a fix for the problem. :D

I'm going to post that patch here and build a test package for you.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c3

--- Comment #3 from Michael Chang <mchang@suse.com> 2013-04-19 09:29:05 UTC ---
Created an attachment (id=536051) --> (http://bugzilla.novell.com/attachment.cgi?id=536051)
Resend-a-packet-if-we-got-the-wrong-buffer-in-status.patch

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c4

--- Comment #4 from Michael Chang <mchang@suse.com> 2013-04-19 10:09:43 UTC ---
The build service is too slow and my package has not started building yet. In case I have to leave the office before it finishes, here is the project for your reference:

obs://home:michael-chang:bnc815962/grub2

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c5

--- Comment #5 from Rick Salevsky <rsalevsky@suse.com> 2013-04-19 10:46:11 UTC ---
Thank you, it looks better: the kernel now downloads fine, but I get a timeout while reading the initrd.

This is the full error message: "error: timeout reading `SLES11-SP2-latest/x86_64/loader/initrd`"

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c6

--- Comment #6 from Michael Chang <mchang@suse.com> 2013-04-22 07:31:40 UTC ---
Hi Rick,

The painful 'timeout reading' error has recurred even though the supposed antidote, grub2-fix-tftp-endianness.patch, was applied. Too bad to learn this. :(

Could you please help with the following?

1. Test the most up-to-date grub2 (bzr) trunk package to see if the problem remains:
   home:arvidjaar:grub2-next

2. A packet capture on the tftp server would be helpful. You can use wireshark:
   $ tshark -i <your_network_interface>

Btw, is this issue specific to certain hardware you have tested? My Tunnel Mountain is happy with our sle11 grub2.

Thanks.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c7

--- Comment #7 from Rick Salevsky <rsalevsky@suse.com> 2013-04-22 09:20:52 UTC ---
Hey Michael,

no problem :)

Btw, I have tested the home:arvidjaar:grub2-next package and it works very well for me. It looks like this package also fixes another little bug, where the client makes many file requests to the server.

I hope it helps...

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c9

--- Comment #9 from Michael Chang <mchang@suse.com> 2013-04-23 03:44:28 UTC ---
(In reply to comment #7)
> Btw, I have tested the home:arvidjaar:grub2-next package and it works very
> well for me. It looks like this package also fixes another little bug, where
> the client makes many file requests to the server.

Nice. At least we don't have to solve the problem from scratch. :)

We have to find that patch and apply it; we can't use the (relatively) unstable latest trunk of grub2. It's not even public on openSUSE.

> I hope it helps...

Thanks.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c10

--- Comment #10 from Michael Chang <mchang@suse.com> 2013-04-23 10:16:10 UTC ---
(In reply to comment #5)
> Thank you, it looks better: the kernel now downloads fine, but I get a
> timeout while reading the initrd.
>
> This is the full error message: "error: timeout reading
> `SLES11-SP2-latest/x86_64/loader/initrd`"

Can you please provide a tftp capture of the error, for both the NG and the OK case? And your dhcp config?

I'm trying to find the upstream commit that could fix the problem, but I ended up with this one, which is not a very likely candidate (the others are either already applied or not in the code path):

revno: 4596
author: Paulo Flabiano Smorigo <pfsmorigo@br.ibm.com>
committer: Vladimir 'phcoder' Serbinenko <phcoder@gmail.com>
branch nick: grub
timestamp: Wed 2012-11-28 14:14:20 +0100
message:
  * grub-core/net/bootp.c (parse_dhcp_vendor): Fix double increment.
diff:
=== modified file 'grub-core/net/bootp.c'
--- grub-core/net/bootp.c 2012-06-21 20:20:57 +0000
+++ grub-core/net/bootp.c 2012-11-28 13:14:20 +0000
@@ -122,7 +122,7 @@
               ptr += 4;
             }
         }
-      break;
+      continue;
     case GRUB_NET_BOOTP_HOSTNAME:
       set_env_limn_ro (name, "hostname", (char *) ptr, taglength);
       break;

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c11

--- Comment #11 from Rick Salevsky <rsalevsky@suse.com> 2013-04-23 16:38:54 UTC ---
Here is the section from the log:

Apr 23 18:24:58 music in.tftpd[19339]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/grub.efi
Apr 23 18:24:58 music in.tftpd[19347]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/x86_64-efi/command.lst
Apr 23 18:24:58 music in.tftpd[19348]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/x86_64-efi/fs.lst
Apr 23 18:24:58 music in.tftpd[19351]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/x86_64-efi/crypto.lst
Apr 23 18:24:58 music in.tftpd[19353]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/x86_64-efi/terminal.lst
Apr 23 18:24:58 music in.tftpd[19355]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/grub.cfg
Apr 23 18:24:58 music in.tftpd[19356]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/grub.cfg
Apr 23 18:24:58 music in.tftpd[19359]: RRQ from ::ffff:10.121.8.152 filename uefi.cfg/grub.cfg
Apr 23 18:25:00 music in.tftpd[19383]: RRQ from ::ffff:10.121.8.152 filename SLES11-SP2-latest/x86_64/DVD1/boot/x86_64/loader/linux
Apr 23 18:25:02 music in.tftpd[19389]: RRQ from ::ffff:10.121.8.152 filename SLES11-SP2-latest/x86_64/DVD1/boot/x86_64/loader/linux
Apr 23 18:25:03 music in.tftpd[19395]: RRQ from ::ffff:10.121.8.152 filename SLES11-SP2-latest/x86_64/DVD1/boot/x86_64/loader/initrd

The dhcp config is a little complex, so I think it's better if you look at it yourself. If you need help, feel free to contact me on IRC.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c12

--- Comment #12 from Michael Chang <mchang@suse.com> 2013-04-24 06:05:08 UTC ---
(In reply to comment #10)
> (In reply to comment #5)
> I'm trying to find the upstream commit that could fix the problem, but I
> ended up with this one, which is not a very likely candidate.

FWIW, I added that bootp "Fix double increment" patch to the same project, so you can test it now.

obs://home:michael-chang:bnc815962/grub2

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c13

--- Comment #13 from Michael Chang <mchang@suse.com> 2013-04-24 06:13:51 UTC ---
Created an attachment (id=536604) --> (http://bugzilla.novell.com/attachment.cgi?id=536604)
mega diff of grub-core/net/*

For the record, this is the current mega diff of grub-core/net/* between home:michael-chang:bnc815962/grub2 and grub2-next (home:arvidjaar:grub2-next).

The rest all looks innocent .. :(

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c14

--- Comment #14 from Thomas Renninger <trenn@suse.com> 2013-04-26 09:33:38 UTC ---
Can't we just ramp up factory grub2 to the latest mainline code base and be done with it?

No need to do that immediately if there is still SP3 work, but we should/will do that anyway.

Fixing everything for 12.3 looks out of scope. If someone wants to do grub2 network booting, they could easily use the latest factory code base.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c15

--- Comment #15 from Michael Chang <mchang@suse.com> 2013-04-26 10:50:00 UTC ---
(In reply to comment #14)
> Can't we just ramp up factory grub2 to the latest mainline code base and be
> done with it?

Yes, I think we will. Practically everything is done except for extensive testing.

> No need to do that immediately if there is still SP3 work, but we should/will
> do that anyway.

Yeah, that's why we are working to find the fixing patch from upstream to cherry-pick.

> Fixing everything for 12.3 looks out of scope. If someone wants to do grub2
> network booting, they could easily use the latest factory code base.

Agreed. :)

Thanks for your advice.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c16

--- Comment #16 from Thomas Renninger <trenn@suse.com> 2013-06-03 03:41:18 UTC ---
Ping.

Rick ran into another EFI network boot issue which should be fixed in mainline. The *_net_* variables should be defined in the latest grub2 so that MAC/IP/etc. can be accessed.

It would be great if we could get an upgrade to the latest mainline code state in factory.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c17

--- Comment #17 from Michael Chang <mchang@suse.com> 2013-06-04 21:05:04 UTC ---
Thomas,

Thanks for the update. I will check what's missing.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c18

--- Comment #18 from Michael Chang <mchang@suse.com> 2013-06-25 05:00:37 UTC ---
(In reply to comment #17)
> Thanks for the update. I will check what's missing.

That is upstream rev 4978.

------------------------------------------------------------
revno: 4978
committer: Vladimir 'phcoder' Serbinenko <phcoder@gmail.com>
branch nick: grub
timestamp: Tue 2013-05-07 12:05:36 +0200
message:
  New variables 'net_default_*' to determine MAC/IP of default interface.
diff:
=== modified file 'ChangeLog'
--- ChangeLog 2013-05-07 09:47:30 +0000
+++ ChangeLog 2013-05-07 10:05:36 +0000
@@ -1,5 +1,9 @@
 2013-05-07 Vladimir Serbinenko <phcoder@gmail.com>
 
+        New variables 'net_default_*' to determine MAC/IP of default interface.
+
+2013-05-07 Vladimir Serbinenko <phcoder@gmail.com>
+
         * tests/gettext_strings_test.in: A test to check for strings not
         marked for translation.
 
=== modified file 'grub-core/net/bootp.c'
--- grub-core/net/bootp.c 2013-01-20 13:24:47 +0000
+++ grub-core/net/bootp.c 2013-05-07 10:05:36 +0000
@@ -211,6 +211,9 @@
       grub_print_error ();
     }
 
+  if (is_def)
+    grub_env_set ("net_default_interface", name);
+
   if (device && !*device && bp->server_ip)
     {
       *device = grub_xasprintf ("tftp,%d.%d.%d.%d",
=== modified file 'grub-core/net/net.c'
--- grub-core/net/net.c 2013-01-21 01:33:46 +0000
+++ grub-core/net/net.c 2013-05-07 10:05:36 +0000
@@ -1,6 +1,6 @@
 /*
  * GRUB -- GRand Unified Bootloader
- * Copyright (C) 2010,2011 Free Software Foundation, Inc.
+ * Copyright (C) 2010,2011,2012,2013 Free Software Foundation, Inc.
  *
  * GRUB is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -813,6 +813,69 @@
   return grub_net_default_server ? : "";
 }
 
+static const char *
+defip_get_env (struct grub_env_var *var __attribute__ ((unused)),
+               const char *val __attribute__ ((unused)))
+{
+  const char *intf = grub_env_get ("net_default_interface");
+  const char *ret = NULL;
+  if (intf)
+    {
+      char *buf = grub_xasprintf ("net_%s_ip", intf);
+      if (buf)
+        ret = grub_env_get (buf);
+      grub_free (buf);
+    }
+  return ret;
+}
+
+static char *
+defip_set_env (struct grub_env_var *var __attribute__ ((unused)),
+               const char *val)
+{
+  const char *intf = grub_env_get ("net_default_interface");
+  if (intf)
+    {
+      char *buf = grub_xasprintf ("net_%s_ip", intf);
+      if (buf)
+        grub_env_set (buf, val);
+      grub_free (buf);
+    }
+  return NULL;
+}
+
+
+static const char *
+defmac_get_env (struct grub_env_var *var __attribute__ ((unused)),
+                const char *val __attribute__ ((unused)))
+{
+  const char *intf = grub_env_get ("net_default_interface");
+  const char *ret = NULL;
+  if (intf)
+    {
+      char *buf = grub_xasprintf ("net_%s_mac", intf);
+      if (buf)
+        ret = grub_env_get (buf);
+      grub_free (buf);
+    }
+  return ret;
+}
+
+static char *
+defmac_set_env (struct grub_env_var *var __attribute__ ((unused)),
+                const char *val)
+{
+  const char *intf = grub_env_get ("net_default_interface");
+  if (intf)
+    {
+      char *buf = grub_xasprintf ("net_%s_mac", intf);
+      if (buf)
+        grub_env_set (buf, val);
+      grub_free (buf);
+    }
+  return NULL;
+}
+
 static void
 grub_net_network_level_interface_register (struct grub_net_network_level_interface *inter)
@@ -1560,6 +1623,10 @@
                                defserver_set_env);
   grub_register_variable_hook ("pxe_default_server", defserver_get_env,
                                defserver_set_env);
+  grub_register_variable_hook ("net_default_ip", defip_get_env,
+                               defip_set_env);
+  grub_register_variable_hook ("net_default_mac", defmac_get_env,
+                               defmac_set_env);
 
   cmd_addaddr = grub_register_command ("net_add_addr", grub_cmd_addaddr,
                                        /* TRANSLATORS: HWADDRESS stands for

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c19

--- Comment #19 from Michael Chang <mchang@suse.com> 2013-06-25 05:06:37 UTC ---
Hi Rick,

The devel repo of grub2 factory, Base:System/grub2, has been updated with upstream trunk and should fix these two issues ("couldn't send network packet" and the missing net_default_(ip|mac) environment variables).

Please test if you can.

Thanks.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c20

Thomas Renninger <trenn@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #20 from Thomas Renninger <trenn@suse.com> 2013-06-25 06:14:24 UTC ---
> The devel repo of grub2 factory, Base:System/grub2, has been updated with
> upstream trunk

Great, thanks a lot!

Rick is in school the whole week, so testing this will still take a day or two.

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c

Thomas Renninger <trenn@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
       InfoProvider|                            |rsalevsky@suse.com

https://bugzilla.novell.com/show_bug.cgi?id=815962
https://bugzilla.novell.com/show_bug.cgi?id=815962#c21

Rick Salevsky <rsalevsky@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
       InfoProvider|rsalevsky@suse.com          |

--- Comment #21 from Rick Salevsky <rsalevsky@suse.com> 2013-07-04 09:09:31 UTC ---
Hi Michael,

I have tested the environment variables and the network bug. Everything looks good to me, so I think the bug can be marked as resolved.

Thank you and bye,
Rick