[Bug 462769] New: dns resolution problems - possible getaddrinfo() bugs
https://bugzilla.novell.com/show_bug.cgi?id=462769

           Summary: dns resolution problems - possible getaddrinfo() bugs
           Product: openSUSE 11.1
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 11.1
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Other
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: aharrison@gmail.com
         QAContact: qa@suse.de
          Found By: Community User

I originally discussed my problem in this thread:

http://lists.opensuse.org/opensuse/2008-12/msg00879.html

I thought my problem might be related to bug id 441947, but was advised to create a new bug report since disabling IPv6 did not solve my problem.

In a nutshell, everything worked fine with 10.2. I performed a fresh install of 11.1rc1 (my zypper updates are current as of today), and now I have lots and lots of DNS resolution problems. I am using the same name servers I used previously. I even ended up restoring a copy of my resolv.conf from my 10.2 box; still no joy. (I use this exact same resolv.conf file on 75+ servers here at my location.)

Firefox, konq, ssh, telnet, zypper, whois, etc. all fail most of the time when trying to resolve names, always requiring lots of retries before a name resolves successfully. Yet the dig command *never* fails, even when I dig the address in question in the middle of my retries.

In particular, the whois command fails complaining about the exact system call in the title of bug 441947:

getaddrinfo(whois.crsnic.net): Name or service not known

I've rebooted several times to make sure the system was fresh after making my changes. I've combed over the logs. For troubleshooting, I changed my resolv.conf to point to a single name server instead of two. My DNS servers handle the load of tens of thousands of customers, so if there were a problem with them, believe me, a fresh Linux install on my workstation would not be how I first found out about DNS trouble.

I have disabled IPv6. I also commented out the IPv6-related entries in /etc/hosts just to make sure they weren't causing a problem. I have disabled all firewall, SELinux, and AppArmor services. I use neither DHCP nor NetworkManager. I have the problem with or without nscd running. I have pared down my nsswitch.conf so that it now contains just:

# grep '^[^#]' /etc/nsswitch.conf
passwd: files ldap
group: files ldap
hosts: files dns
networks: files
services: files
protocols: files
rpc: files
ethers: files
netmasks: files
netgroup: files
publickey: files
bootparams: files
automount: files nis
aliases: files

My host.conf file is stock:

# grep '^[^#]' /etc/host.conf
order hosts, bind
multi on

Here's some info on my network configuration:

# grep '^[^#]' /etc/sysconfig/network/config
DEFAULT_BROADCAST="+"
GLOBAL_POST_UP_EXEC="yes"
GLOBAL_PRE_DOWN_EXEC="yes"
CHECK_DUPLICATE_IP="no"
DEBUG="no"
USE_SYSLOG="yes"
CONNECTION_SHOW_WHEN_IFSTATUS="no"
CONNECTION_CHECK_BEFORE_IFDOWN="no"
CONNECTION_CLOSE_BEFORE_IFDOWN="no"
CONNECTION_UMOUNT_NFS_BEFORE_IFDOWN="no"
CONNECTION_SEND_KILL_SIGNAL="no"
MANDATORY_DEVICES=""
WAIT_FOR_INTERFACES="20"
FIREWALL="no"
LINKLOCAL_INTERFACES="eth*[0-9]|tr*[0-9]|wlan[0-9]|ath[0-9]"
IFPLUGD_OPTIONS="-f -I -b"
NETWORKMANAGER="no"
NM_ONLINE_TIMEOUT="0"
NETCONFIG_MODULES_ORDER="dns-resolver dns-bind dns-dnsmasq nis ntp-runtime"
NETCONFIG_DNS_FORWARDER="resolver"
NETCONFIG_DNS_STATIC_SEARCHLIST="example.net foo.example.net"
NETCONFIG_DNS_STATIC_SERVERS="10.10.10.181 10.10.10.240"
NETCONFIG_NTP_POLICY=""
NETCONFIG_NTP_STATIC_SERVERS=""
NETCONFIG_NIS_POLICY=""
NETCONFIG_NIS_SETDOMAINNAME="yes"
NETCONFIG_NIS_STATIC_DOMAIN=""
NETCONFIG_NIS_STATIC_SERVERS=""
NETCONFIG_DNS_POLICY=""

In desperation, I added the repository I found in another bug report

http://download.opensuse.org/repositories/home:/mtomaschewski:/Factory/openS...

to zypper and upgraded sysconfig, but that did nothing.

Please let me know what other information I can provide.
-- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User aharrison@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c1

--- Comment #1 from Andy Harrison <aharrison@gmail.com> 2008-12-30 09:01:57 MST ---
Lending further credence to the idea that this is an IPv6-related bug: I went through my /etc/ssh/ssh_config and ~/.ssh/config files and made sure that every AddressFamily keyword had an argument of "inet" instead of "any", and now the ssh command resolves names successfully 100% of the time. Other commands, such as whois, continue to fail.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User luca.gugelmann@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c2

Luca Gugelmann <luca.gugelmann@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luca.gugelmann@gmail.com

--- Comment #2 from Luca Gugelmann <luca.gugelmann@gmail.com> 2008-12-30 14:50:38 MST ---
I have the same problem, and it does indeed seem to be getaddrinfo() related. Specifically, DNS resolution fails when ai_family is set to AF_UNSPEC in the hints argument passed to getaddrinfo(). AF_INET works as intended, and my understanding is that AF_UNSPEC should at least return the IPv4 address instead of failing. Attached is a small test program, which I hope shows the problem. The output on my side:
./dnstest novell.com
AF_INET: 130.57.5.70
AF_INET6: getaddrinfo: Name or service not known
AF_UNSPEC: getaddrinfo: Name or service not known
Compare with localhost (which does not go through a DNS server):
./dnstest localhost
AF_INET: 127.0.0.1 127.0.0.1
AF_INET6: ::1
AF_UNSPEC: 127.0.0.1 ::1
I've been through much of the same troubleshooting as above, with no success. IPv6 is disabled on my system (so that at least most of the GUI internet apps work).
https://bugzilla.novell.com/show_bug.cgi?id=462769

User luca.gugelmann@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c3

--- Comment #3 from Luca Gugelmann <luca.gugelmann@gmail.com> 2008-12-30 14:52:56 MST ---
Created an attachment (id=262821)
 --> (https://bugzilla.novell.com/attachment.cgi?id=262821)
The test program discussed in the comment above.

gcc dnstest.c -o dnstest
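The attachment itself is not reproduced in this thread. For readers following along, here is a minimal sketch of the kind of per-family check the comments describe. It is an assumption about the general shape of such a test, not the attached dnstest.c: the function name try_family and its label parameter are hypothetical, and only standard getaddrinfo(), gai_strerror(), inet_ntop() and freeaddrinfo() calls are used.

```c
/* Hypothetical stand-in for a dnstest-style check; NOT the real attachment. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Resolve `host` for one address family and print every address found.
 * `label` is only used for the output line (an illustrative parameter).
 * Returns getaddrinfo()'s result code: 0 on success, non-zero on failure. */
int try_family(const char *label, const char *host, int family, int flags)
{
    struct addrinfo hints, *res, *p;
    char buf[INET6_ADDRSTRLEN];
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;        /* AF_INET, AF_INET6 or AF_UNSPEC */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */
    hints.ai_flags = flags;          /* 0 matches the failing case reported */

    printf("%s:", label);
    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        printf(" getaddrinfo: %s\n", gai_strerror(rc));
        return rc;
    }

    /* Walk the result list and print each address in text form. */
    for (p = res; p != NULL; p = p->ai_next) {
        const void *addr;

        if (p->ai_family == AF_INET)
            addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
        else if (p->ai_family == AF_INET6)
            addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
        else
            continue;

        if (inet_ntop(p->ai_family, addr, buf, sizeof buf) != NULL)
            printf(" %s", buf);
    }
    printf("\n");

    freeaddrinfo(res);
    return 0;
}
```

A main() that calls try_family() once each for AF_INET, AF_INET6 and AF_UNSPEC on argv[1], with flags set to 0, would produce output in roughly the shape quoted in comment #2.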
https://bugzilla.novell.com/show_bug.cgi?id=462769

User luca.gugelmann@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c4

--- Comment #4 from Luca Gugelmann <luca.gugelmann@gmail.com> 2008-12-30 15:15:15 MST ---
Further testing showed that once every few dozen queries the AF_UNSPEC case returns a correct answer. I tried to reproduce it and, looking at the wireshark logs, stumbled upon the following behavior:

- On an AF_INET query, a request for the A record goes out and the correct answer comes in. Everything OK.
- On an AF_INET6 query, a request for the AAAA record goes out and the router answers "not implemented" (as expected). (This is repeated 4 times.)
- On an AF_UNSPEC query, a request for the A record goes out, then a request for the AAAA record goes out, then the answer to the AAAA query comes in (not implemented) and finally the answer to the A query. Note that my router answers the queries in reverse order. In this case getaddrinfo fails.

Once in a while the answers come in in the correct order (I'm on a wireless network, so I assume the first packet is sometimes delayed). When the order of the answers is consistent with the order of the queries (that is, the answer to A first, AAAA later), getaddrinfo returns the correct IP.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User luca.gugelmann@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c5

--- Comment #5 from Luca Gugelmann <luca.gugelmann@gmail.com> 2008-12-31 08:57:48 MST ---
I'm no longer at my parents' house and the problem has disappeared. Switching to a different router apparently fixes the problem without requiring any configuration changes. This definitely points towards getaddrinfo choking on the answers from some broken(?) DNS servers.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User aharrison@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c6

--- Comment #6 from Andy Harrison <aharrison@gmail.com> 2009-01-08 13:59:18 MST ---
I tried the dnstest program attached by Luca and got the same failures. I tried installing the factory repository at http://download.opensuse.org/repositories/Base:/build/standard/ to see if those updates would help (in case they included glibc updates), but no joy.

So, as a workaround, I installed a recursion-only instance of named locally and pointed my resolv.conf to 127.0.0.1. Works well enough.

If I can help with further troubleshooting of the actual problem, I'd be happy to.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c7

Petr Baudis <pbaudis@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pbaudis@novell.com
             Status|NEW                         |NEEDINFO
      Info Provider|                            |aharrison@gmail.com

--- Comment #7 from Petr Baudis <pbaudis@novell.com> 2009-01-08 19:51:51 MST ---
The problem here is that getaddrinfo() still tries to resolve IPv6 AAAA records even though IPv6 is disabled on your system. Does ./dnstest localhost show AF_INET6 results when IPv6 is turned off? Can you paste your ip addr show output? And lsmod | grep ipv6? Either disabling IPv6 is not working properly or there is a bug in getaddrinfo()'s IPv6 auto-detection.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User pbaudis@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c8

--- Comment #8 from Petr Baudis <pbaudis@novell.com> 2009-01-08 20:20:45 MST ---
Oh, I have just noticed - your getaddrinfo() call in ./dnstest has no AI_ADDRCONFIG in the ai_flags field - could you set it there instead of zero and try again?

To clarify, we will skip AAAA queries only if the AI_ADDRCONFIG flag is used and no IPv6 interfaces are available. Not all applications use AI_ADDRCONFIG, but what is confusing is that your Firefox still does not work with IPv6 disabled, since it definitely should use AI_ADDRCONFIG.
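As an illustration of the change requested in comment #8, the sketch below builds the hints structure for an AF_UNSPEC lookup with and without AI_ADDRCONFIG. The helper names make_hints and lookup are hypothetical and are not taken from dnstest.c; the flag itself is the standard one declared in <netdb.h>.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Build hints for an AF_UNSPEC lookup (hypothetical helper).
 * With use_addrconfig != 0, AI_ADDRCONFIG is set, which asks the resolver
 * to consider only address families that are configured on this host. */
struct addrinfo make_hints(int use_addrconfig)
{
    struct addrinfo hints;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (use_addrconfig)
        hints.ai_flags |= AI_ADDRCONFIG;
    return hints;
}

/* Run one lookup with the hints above and discard the result list.
 * Returns getaddrinfo()'s result code (0 on success). */
int lookup(const char *host, int use_addrconfig)
{
    struct addrinfo hints = make_hints(use_addrconfig);
    struct addrinfo *res = NULL;
    int rc = getaddrinfo(host, NULL, &hints, &res);

    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Comparing lookup(name, 0) with lookup(name, 1) on an affected machine is one way to check whether the flag changes the outcome, which is what the later comments report.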
https://bugzilla.novell.com/show_bug.cgi?id=462769

User aharrison@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c9

--- Comment #9 from Andy Harrison <aharrison@gmail.com> 2009-01-09 08:49:27 MST ---
Apologies, I shouldn't have included firefox in this bug. I was too liberal in my cutting and pasting of previous communications. I'm not sure what I did to get firefox working correctly and even though it was showing these symptoms immediately after initial o/s installation, firefox was one of the first apps to start working smoothly for me when I started troubleshooting the problem.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User aharrison@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c10

--- Comment #10 from Andy Harrison <aharrison@gmail.com> 2009-01-09 08:56:52 MST ---
I have IPv6 disabled. Here's the proof:

# lsmod | grep -i ipv6
#
# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
    inet 127.0.0.2/8 brd 127.255.255.255 scope host secondary lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:03:ba:f0:ce:50 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.104/20 brd 192.168.15.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:50:04:d2:73:7d brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:50:04:62:0a:00 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:03:ba:f0:ce:51 brd ff:ff:ff:ff:ff:ff
    inet 172.24.1.55/23 brd 172.24.1.255 scope global eth3

As for the dnstest program, I'm barely a read-only C programmer, so hopefully I did this correctly. I changed hints.ai_flags to...

hints.ai_flags |= AI_ADDRCONFIG;

...and recompiled. AF_UNSPEC results are successful 100% of the time now.

I grabbed the glibc src rpm you attached to bug 441947 and I'm compiling it now.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User luca.gugelmann@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c11

Luca Gugelmann <luca.gugelmann@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|aharrison@gmail.com         |

--- Comment #11 from Luca Gugelmann <luca.gugelmann@gmail.com> 2009-01-09 10:27:52 MST ---
Setting AI_ADDRCONFIG produces correct results with AF_UNSPEC queries here too. I also tested the glibc from bug 441947 (of which this bug can now probably be considered a duplicate), and the problem has disappeared regardless of whether AI_ADDRCONFIG is set or not.
https://bugzilla.novell.com/show_bug.cgi?id=462769

User aharrison@gmail.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=462769#c12

Andy Harrison <aharrison@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |DUPLICATE

--- Comment #12 from Andy Harrison <aharrison@gmail.com> 2009-01-09 11:51:36 MST ---
Confirmed, glibc-2.9-5 from bug 441947 fixed the problem for me.

*** This bug has been marked as a duplicate of bug 441947 ***
https://bugzilla.novell.com/show_bug.cgi?id=441947