[Bug 1068470] New: Tumbleweed openssl 1.0.2m: osc co command hangs in libcrypto with https://api.opensuse.org
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470 Bug ID: 1068470 Summary: Tumbleweed openssl 1.0.2m: osc co command hangs in libcrypto with https://api.opensuse.org Classification: openSUSE Product: openSUSE Tumbleweed Version: Current Hardware: Other OS: Other Status: NEW Severity: Normal Priority: P5 - None Component: Security Assignee: security-team@suse.de Reporter: sebastian.parschauer@suse.com QA Contact: qa-bugs@suse.de Found By: L3 Blocker: --- I'm trying to develop my latest Base:System/parted changes in a QEMU/KVM Tumbleweed VM with Leap 42.3 on bare metal. On Leap 42.3 it is no problem to use the command:
osc co Base:System/parted
But on Tumbleweed this command hangs reliably at a random file. As a workaround I always have to use:
iosc co openSUSE.org:Base:System/parted
instead. This is annoying. Networking in the VM is fine. I've already tried all variants of network driver virtualization. I've attached gdb to the hanging process and was able to locate the hang in libcrypto (openssl packages). Top of stack:
(gdb) bt #0 0x00007f17490b4df1 in __libc_read (fd=3, buf=buf@entry=0x55f2039abc73, nbytes=nbytes@entry=5) at ../sysdeps/unix/sysv/linux/read.c:26 #1 0x00007f1745d262de in read (__nbytes=5, __buf=0x55f2039abc73, __fd=<optimized out>) at /usr/include/bits/unistd.h:44 #2 sock_read (b=0x55f203951e10, out=0x55f2039abc73 "I\027\177", outl=5) at bss_sock.c:140 #3 0x00007f1745d24273 in BIO_read (b=0x55f203951e10, out=out@entry=0x55f2039abc73, outl=outl@entry=5) at bio_lib.c:210 #4 0x00007f174609703a in ssl3_read_n (s=0x55f2038b2530, n=5, max=<optimized out>, extend=<optimized out>) at s3_pkt.c:258 #5 0x00007f174609882d in ssl3_get_record (s=0x55f2038b2530) at s3_pkt.c:342 #6 ssl3_read_bytes (s=0x55f2038b2530, type=22, buf=0x55f2039a6710 "\001", len=4, peek=0) at s3_pkt.c:1234 #7 0x00007f1746099df9 in ssl3_get_message (s=0x55f2038b2530, st1=<optimized out>, stn=4385, mt=-1, max=20000, ok=0x7ffda749b074) at s3_both.c:371 #8 0x00007f174608d673 in ssl3_get_server_hello (s=s@entry=0x55f2038b2530) at s3_clnt.c:917 #9 0x00007f1746092641 in ssl3_connect (s=0x55f2038b2530) at s3_clnt.c:313 #10 0x00007f173d7c7502 in ssl_connect (ssl=ssl@entry=0x55f2038b2530, timeout=-1) at SWIG/_m2crypto_wrap.c:8232 #11 0x00007f173d7c761b in _wrap_ssl_connect (self=<optimized out>, args=<optimized out>) at SWIG/_m2crypto_wrap.c:19578 #12 0x00007f17493d1402 in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0 #13 0x00007f17493d04e4 in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
$ rpm -q libopenssl1_0_0 libopenssl1_0_0-1.0.2m-1.1.x86_64 On Leap 42.3 there is version 1.0.2j-10.1. Please fix this! -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c1
Sebastian Parschauer
https://download.opensuse.org/repositories/security:/tls/openSUSE_Tumbleweed... and upgrading to openssl 1.1.0g from there.
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c3
--- Comment #3 from Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c5
--- Comment #5 from Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c6
Sebastian Parschauer
python -m trace --trace /usr/bin/osc up hangs in Connection.py(337): return m2.ssl_read(self.ssl, size, self._timeout)
for a couple of seconds and then enters an endless loop in the same package. Full path:
/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py
With Python tracing enabled the checkout succeeds. @Jan: How to debug your package further? Did you do any changes which could explain this behavior? TIA -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c7
Jan Matejek
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c8
--- Comment #8 from Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c9
--- Comment #9 from Sebastian Parschauer
bash -c "while [ 1 ]; do python -m trace --trace /usr/bin/osc co Base:System/parted; sleep 1; rm -Rf Base:System; done"
--- modulename: Connection, funcname: connect Connection.py(298): self.socket.connect(addr) --- modulename: socket, funcname: meth socket.py(228): return getattr(self._sock,name)(*args) Connection.py(299): self.addr = addr Connection.py(300): self.setup_ssl() --- modulename: Connection, funcname: setup_ssl Connection.py(209): self.sockbio = m2.bio_new_socket(self.socket.fileno(), 0) --- modulename: socket, funcname: meth socket.py(228): return getattr(self._sock,name)(*args) Connection.py(211): m2.ssl_set_bio(self.ssl, self.sockbio, self.sockbio) Connection.py(213): self.sslbio = m2.bio_new(m2.bio_f_ssl()) Connection.py(215): m2.bio_set_ssl(self.sslbio, self.ssl, m2.bio_noclose) Connection.py(301): self.set_connect_state() --- modulename: Connection, funcname: set_connect_state Connection.py(284): m2.ssl_set_connect_state(self.ssl) Connection.py(302): ret = self.connect_ssl() --- modulename: Connection, funcname: connect_ssl Connection.py(288): return m2.ssl_connect(self.ssl, self._timeout) <-- hangs here for several minutes -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c10
--- Comment #10 from Jan Matejek
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c11
--- Comment #11 from Sebastian Parschauer
So, are you sure that the connection to api.opensuse.org in particular is not faulty?
Yes, as wget for the same files with same user name + pw works when osc co is hanging and as it works if timing is different. Of cause it also works with different OS.
wget --user sparschauer --ask-password https://api.opensuse.org/source/Base:System/parted/0005-libparted-Remove-fda...
I've set "http_debug = 1" in .oscrc here. -- GET https://api.opensuse.org/source/Base:System/parted/0004-fdasd.c-Safeguard-ag... send: 'GET /source/Base:System/parted/0004-fdasd.c-Safeguard-against-geometry-misprobing.patch?rev=135 HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: api.opensuse.org\r\nConnection: close\r\nUser-agent: osc/0.161.1\r\n\r\n' reply: 'HTTP/1.1 200 OK\r\n' header: Date: Wed, 22 Nov 2017 15:22:21 GMT header: Server: Apache/2.4.23 (Linux/SUSE) header: Strict-Transport-Security: max-age=31536000 header: Cache-Control: no-transform header: X-XSS-Protection: 1; mode=block header: X-Opensuse-Runtimes: {"view":null,"db":6.0924879999999995,"backend":0,"xml":0} header: X-Request-Id: 7aeeeab1-b6e2-4d65-94c4-16f2a3dc985b header: X-Opensuse-APIVersion: 2.8.51.git20171106T173515.61f8121be header: X-Runtime: 0.027189 header: X-Frame-Options: SAMEORIGIN header: X-Content-Type-Options: nosniff header: X-Powered-By: Phusion Passenger 5.0.30 header: Content-Type: application/octet-stream header: Content-Length: 1568 header: Cache-Control: no-cache header: Connection: close A Base:System/parted/0004-fdasd.c-Safeguard-against-geometry-misprobing.patch -- GET https://api.opensuse.org/source/Base:System/parted/0005-libparted-Remove-fda... In this example it can get everything until patch 0004 but patch 0005 can't be loaded. It hangs here for minutes. It is random at which file it hangs. Adding "debug = 1" to .oscrc also doesn't show more information. This is why I've done Python tracing.
You could use wireshark to analyze the TLS handshake. If the hang is actually happening in ssl_connect, that should be in the part that is not yet encrypted and you could see what is going on.
Will try this next, thanks. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c12
--- Comment #12 from Sebastian Parschauer
-2017-11-22 16:52:06-- https://api.opensuse.org/source/Base:System/parted?rev=latest Resolving api.opensuse.org (api.opensuse.org)... 195.135.221.133, 2001:67c:2178:8::12 Connecting to api.opensuse.org (api.opensuse.org)|195.135.221.133|:443... connected. HTTP request sent, awaiting response... 401 Unauthorized Authentication selected: Basic realm="Use your novell account" Reusing existing connection to api.opensuse.org:443. HTTP request sent, awaiting response... 200 OK Length: 8161 (8.0K) [text/xml] Saving to: ‘parted?rev=latest’
parted?rev=latest 100%[===========================================>] 7.97K --.-KB/s in 0.001s
2017-11-22 16:52:07 (12.6 MB/s) - ‘parted?rev=latest’ saved [8161/8161]
A difference here is that wget always gets "401 Unauthorized" first. The difference when using api.suse.de is that it doesn't offer an IPv6 address. It also uses less secure encryption (TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, 128 bit keys, TLS 1.2). The rest is identical. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c13
--- Comment #13 from Sebastian Parschauer
header: Strict-Transport-Security: max-age=31536000
The max-age value seems to be extremely high to me. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c14
Sebastian Parschauer
sudo tcpdump -s 0 -i enp0s25 -w osc-co.pcap
on my Lenovo Thinkpad X250 laptop running Tumbleweed from USB. I've captured the dump on the 1 GbE Intel I218-LM interface (e1000e driver). I've run in parallel:
$ osc co Base:System/parted A Base:System A Base:System/parted ^Cinterrupted!
I've killed the dump first and then osc after 2 min no progress. In the dump there are TCP out of order and TCP retransmissions visible. @Jan: Does this help you? Is this a kernel 4.13.12-1-default bug? TIA -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c15
--- Comment #15 from Sebastian Parschauer
Application Data, Application Data, Encrypted Alert in the same TLSv1.2 packet. The client doesn't sent an ACK for this. Then api.opensuse.org complains about "TCP ACKed unseen segment".
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c16
--- Comment #16 from Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c17
--- Comment #17 from Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c18
--- Comment #18 from Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c20
Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c21
Jan Matejek
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c22
--- Comment #22 from Sebastian Parschauer
#self.set_info_callback() # debug
When uncommenting it, I see the following osc co output:
A Base:System/parted/lib-fs-resize-prevent-crash-resizing-FAT16.patch LOOP: SSL connect: before SSL initialization LOOP: SSL connect: SSLv3/TLS write client hello LOOP: SSL connect: SSLv3/TLS write client hello LOOP: SSL connect: SSLv3/TLS read server hello LOOP: SSL connect: SSLv3/TLS read change cipher spec LOOP: SSL connect: SSLv3/TLS read finished LOOP: SSL connect: SSLv3/TLS write change cipher spec LOOP: SSL connect: SSLv3/TLS write finished INFO: SSL connect: SSL negotiation finished successfully ALERT: write: warning: close notify A Base:System/parted/libparted-Add-support-for-NVDIMM-devices.patch LOOP: SSL connect: before SSL initialization LOOP: SSL connect: SSLv3/TLS write client hello
It always hangs at the first client hello. It seems to be a bug to me that there are two client hellos. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c23
Sebastian Parschauer
A Base:System/parted/libparted-Add-support-for-NVDIMM-devices.patch LOOP: SSL connect: before/connect initialization LOOP: SSL connect: SSLv3 write client hello A LOOP: SSL connect: SSLv3 read server hello A LOOP: SSL connect: SSLv3 read finished A LOOP: SSL connect: SSLv3 write change cipher spec A LOOP: SSL connect: SSLv3 write finished A LOOP: SSL connect: SSLv3 flush data INFO: SSL connect: SSL negotiation finished successfully ALERT: write: warning: close notify A Base:System/parted/libparted-Add-support-for-NVMe-devices.patch
+CC Adrian Schröter @Adrian: Can you help me to debug this further, please? TIA I've got a gdb call trace, Python tracing, a tcpdump, and the info_callback output. Looks like it is a bug in the openssl package that the client hello is sent twice on Tumbleweed and ACKs are racing. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c24
--- Comment #24 from Sebastian Parschauer
LOOP: SSL connect: SSLv3/TLS write client hello LOOP: SSL connect: SSLv3/TLS write client hello
is correct. The first one is printed before sending the client hello and the second one is printed after sending it. So the bug is located in the write_state_machine() of libcrypto in ssl/statem/statem.c. It returns before sending the client hello and waits for the server hello in the read_state_machine() too early this way. I'll try to catch it with breakpoints next. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c25
Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
Adrian Schröter
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c31
Sebastian Parschauer
headers["Connection"] = "close"
M2Crypto closes the socket too early while closing the connection. I've finally found a way how to work around this bug. I've edited:
/usr/lib64/python2.7/site-packages/M2Crypto/SSL/Connection.py
+from time import sleep Connection::__del__: if self.ssl_close_flag == m2.bio_noclose and \ getattr(self, 'ssl', None): self.m2_ssl_free(self.ssl) + sleep(0.1) self.socket.close() 50ms was too short but 100ms seems to work fine for me. But one RST per downloaded file is still in the tcpdump. No difference if putting the sleep in front of the self.m2_ssl_free() call. @Jan: Is there a more elegant way than using a sleep to detect that the socket should not be closed yet or does the kernel have to detect that? TIA -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c33
Sebastian Parschauer
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
Tomáš Chvátal
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c61
--- Comment #61 from Matej Cepl
@Sebastian: is there any chance to make bug 1113787 public?
??? It has nothing to do with this bug. It is a similar bug, true, but in C and in libcurl. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c62
--- Comment #62 from Marcus Hüwe
(In reply to Marcus Hüwe from comment #60)
@Sebastian: is there any chance to make bug 1113787 public?
??? It has nothing to do with this bug.
Ok, if it is unrelated, I don't care:) I was just curious because #1113787 was added to the "See also" list of this bug... (but I have no permission to view that bug). -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c63
--- Comment #63 from Matej Cepl
Ok, if it is unrelated, I don't care:) I was just curious because #1113787 was added to the "See also" list of this bug... (but I have no permission to view that bug).
Well, strictly speaking, it is the same bug, libcurl apparently doesn’t close properly connection, but the problem is as different as the design and implementation of libcurl is different from the design and implementation of M2Crypto. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470
http://bugzilla.opensuse.org/show_bug.cgi?id=1068470#c70
--- Comment #70 from Matej Cepl
Moving fix packages to personal OBS account "TuxCheater".
This bug won't be fixed and I have to keep a custom patch/package. So closing now.
I hoped (and still I would like) that somebody finishes https://gitlab.com/m2crypto/m2crypto/merge_requests/188 (resp. now rebased as !219), but alas it didn't happen, and I haven't found enough stamina to finish it myself. I am sorry. -- You are receiving this mail because: You are on the CC list for the bug.
participants (1)
-
bugzilla_noreply@novell.com