nvme nvme0: frozen state error detected, reset controller
Hi,

I have this annoying issue with any of the recent kernels that come with Leap 15.1 and 15.2: my SSD becomes read-only after a while. I can reproduce the behavior almost 100% of the time by pushing the same heavy I/O workload on it. I have one "golden" kernel under which the issue _never_ shows up (uptime easily over 45 days while pushing the same workload multiple times a day). This "golden" kernel is: vmlinuz-4.12.14-lp151.28.10-default.

Every kernel I have tried since then during Leap 15.1 updates, and now the latest Leap 15.2 kernel (vmlinuz-5.3.18-lp152.50-default), shows the SSD failure and system lockup described above. The last log output that goes to the screen is:

pcieport 000:00:1d.4: DPC: unmasked uncorrectable error detected
nvme nvme0: frozen state error detected, reset controller

I would just stay with the "golden" kernel if it were not for some other issues (an HDMI problem) that I have with /it/. The latest 5.3 kernel definitely has the HDMI issue fixed, and I'd love to move on, but cannot due to the SSD issue.

The drive is an Intel, model number: HBRPEKNX0202AH.

It's been years since I last built my own custom kernels, and I was really hoping not to have to do that again. Please let me know if there is any additional information that would be useful to address this issue.

Best,
-Gerhard
On Wed 02 Dec 2020 03:50:18 PM CST, Gerhard Theurich wrote:
> The last log that goes to the screen is:
> pcieport 000:00:1d.4: DPC: unmasked uncorrectable error detected
> nvme nvme0: frozen state error detected, reset controller
Hi,

On my device (Sandisk Corp WD Black 2018 / PC SN520 NVMe SSD) I set nvme_core.default_ps_max_latency_us=0. It could be the same sort of issue, since part of the error was "frozen state error detected, reset controller".

See the last comments: https://bugzilla.kernel.org/show_bug.cgi?id=195039

--
Cheers
Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
Tumbleweed 20201201 | GNOME Shell 3.38.1 | 5.9.11-1-default
Intel DQ77MK MB | Xeon E3-1245 V2 X8 @ 3.40 GHz | Intel/Nvidia
up 13:17, 2 users, load average: 0.45, 0.81, 0.59
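For reference, nvme_core.default_ps_max_latency_us sets the maximum latency the NVMe driver accepts for autonomous power state transitions (APST), and a value of 0 disables APST. A minimal sketch of how one might check whether the setting is in effect is shown below. The helper names (cmdline_value, apst_disabled) are made up for illustration, and the sketch assumes the usual /proc/cmdline and /sys/module/nvme_core/parameters locations on a running Linux system.

```python
from pathlib import Path

PARAM = "nvme_core.default_ps_max_latency_us"
# Location where the nvme_core module normally exposes the active value.
SYSFS_PATH = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")
PROC_CMDLINE = Path("/proc/cmdline")


def cmdline_value(cmdline, param=PARAM):
    """Return the value given for param on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == param:
            value = val  # keep the last occurrence if it is given twice
    return value


def apst_disabled(cmdline):
    """True if the command line sets the parameter to 0."""
    return cmdline_value(cmdline) == "0"


if __name__ == "__main__":
    if PROC_CMDLINE.exists():
        print("on command line:", cmdline_value(PROC_CMDLINE.read_text()))
    if SYSFS_PATH.exists():
        print("active value:", SYSFS_PATH.read_text().strip())
```

For a one-off test, the parameter can be appended to the kernel line from the boot loader's edit screen before booting.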
Hi Malcolm,

Thank you for the advice! I rebooted with the nvme_core.default_ps_max_latency_us=0 parameter and have run the problematic workload several times. Not a single freeze yet, so this does seem to have addressed the issue. Very happy!

On the other hand, I think I spoke too early on the HDMI issue. I will post another question to the list about that issue in a minute.

Best,
-Gerhard

On 12/2/20 5:38 PM, Malcolm wrote:
> On my device I set nvme_core.default_ps_max_latency_us=0 (Sandisk Corp WD Black 2018 / PC SN520 NVMe SSD), it could be the same sort of issue as part of the error was frozen state error detected, reset controller.
> See last comments: https://bugzilla.kernel.org/show_bug.cgi?id=195039
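To keep the parameter across reboots, it is usually added to the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, after which the boot configuration is regenerated (on openSUSE typically with grub2-mkconfig -o /boot/grub2/grub.cfg, or through the YaST boot loader module). The sketch below only shows the text edit. It assumes a GRUB2 setup with a double-quoted value on that line, and add_kernel_param is a hypothetical helper name, not part of any tool.

```python
import re

PARAM = "nvme_core.default_ps_max_latency_us=0"

# Matches the double-quoted GRUB_CMDLINE_LINUX_DEFAULT assignment.
CMDLINE_RE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def add_kernel_param(grub_text, param=PARAM):
    """Append param to GRUB_CMDLINE_LINUX_DEFAULT unless it is already present."""

    def patch(match):
        options = match.group(2).split()
        if param not in options:
            options.append(param)
        return match.group(1) + " ".join(options) + match.group(3)

    return CMDLINE_RE.sub(patch, grub_text)


if __name__ == "__main__":
    sample = 'GRUB_TIMEOUT=8\nGRUB_CMDLINE_LINUX_DEFAULT="splash=silent quiet"\n'
    print(add_kernel_param(sample))
```

The function works on the file contents as a string, so the result can be reviewed before anything is written back to /etc/default/grub.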