VirtualBox problems with kernel 5.13
I am writing this message as openSUSE's maintainer of VirtualBox.

Nearly every update of the Linux kernel to a new 5.X version breaks VirtualBox. Why do you not know of this fact? It is because I undertake a program of testing 5.X almost immediately after the release of 5.(X-1), even before the release of 5.X-rc1. On occasion, the fix is too complicated for someone with a medium knowledge of the kernel to prepare, and we have to wait for Oracle. This may be the case here.

In the case of 5.13, the strange thing is that my home-built kernel does not show the problem, thus I was not aware that this calamity was pending. I am looking at the differences between my configuration and that of openSUSE to see if that explains the difference in behavior. Of course, it is possible that openSUSE makes some patch to their system that causes the breakage. If that is the case, then neither I nor Oracle will be able to fix it!!

Until this problem is resolved, you have two options:

1. Stay with a 5.12 kernel.
2. Switch to running your VMs using KVM or QEMU.

Beware of Vagrant containers. They have some pitfalls, and you will get no support from me!

Larry
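The configuration comparison mentioned above can be sketched roughly as follows. This is only an illustration in Python: the helper names parse_kconfig and diff_kconfig are hypothetical, options missing from a file are assumed to be off, and the sample contents are invented for the example rather than taken from the actual home-built or openSUSE configuration.

```python
def parse_kconfig(text):
    """Return {option: value} for kernel .config text; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO = n
            options[line[2:-len(" is not set")]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kconfig(mine, distro):
    """Return {option: (my_value, distro_value)} for every option that differs.

    Assumption: an option absent from one file is treated as 'n'.
    """
    a, b = parse_kconfig(mine), parse_kconfig(distro)
    return {
        name: (a.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(a) | set(b))
        if a.get(name, "n") != b.get(name, "n")
    }


# Invented sample data, for illustration only.
HOME_BUILT = """\
# CONFIG_X86_5LEVEL is not set
# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set
CONFIG_HZ=250
"""
DISTRO = """\
CONFIG_X86_5LEVEL=y
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
CONFIG_HZ=250
"""

for option, (mine, theirs) in diff_kconfig(HOME_BUILT, DISTRO).items():
    print(f"{option}: home-built={mine} distro={theirs}")
```

On a real system the two inputs would be the text of each kernel's .config file; where those files live depends on how the kernels were built and packaged.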
On 11. 07. 21, 21:50, Larry Finger wrote:
In the case of 5.13, the strange thing is that my home-built kernel does not show the problem, thus I was not aware that this calamity was pending. I am looking at the differences between my configuration and that of openSUSE to see if that explains the difference in behavior.
Hi, have you considered kernel-obs-qa to check vbox against Kernel:HEAD? I don't know if the error you are writing about is a build or runtime one. But the former can be caught by building vbox against Kernel:HEAD (I assume you do so already.) The latter by doing an insmod in kernel-obs-qa against Kernel:HEAD.
Of course, it is possible that openSUSE makes some patch to their system that causes the breakage. If that is the case, then neither I nor Oracle will be able to fix it!!
We can fix this. However I believe we diverge from upstream in no significant way. thanks, -- js suse labs
On 7/12/21 1:59 AM, Jiri Slaby wrote:
We can fix this. However I believe we diverge from upstream in no significant way.
You are correct. The problem is that VirtualBox cannot handle the 5-level page tables that got turned on in openSUSE Tumbleweed's 5.13 kernel. At least I know where to look. Larry
On 7/12/21 1:59 AM, Jiri Slaby wrote:
We can fix this. However I believe we diverge from upstream in no significant way.
It is not clear how long this will take to fix. Would it be possible to have a kernel with 5-level page tables turned off? That way VB users would be able to run kernel 5.13 with VB. Thanks, Larry
On 13/7/21 7:53 am, Larry Finger wrote:
It is not clear how long this will take to fix. Would it be possible to have a kernel with 5-level page tables turned off? That way VB users would be able to run kernel 5.13 with VB.
Thanks,
Larry
G'day Larry,

It looks like this is a problem with the new stack offset randomisation built into 5.13. It is not enabled by default on a vanilla kernel, but openSUSE seems to typically enable security features. This is probably what makes it elusive: it really relies on a particular kernel option.

You can disable the offending extra security with a kernel command line option ("yast2 bootloader" is your friend). The fix is in this bug report:

https://www.virtualbox.org/ticket/20452

I have confirmed it works (for me). I can now run everything fine in VirtualBox after adding "randomize_kstack_offset=off" to the kernel command line. This also increases performance _very_ slightly, as the security feature takes a very small amount of performance away (a Phoronix benchmark showed it was there, but really not noticeable). If you are running a server, it is probably best to leave it enabled... but running VirtualBox on a server install of Tumbleweed does not sound like it would be common.

I doubt this is worth fixing using packaging or a launch script; I can't seem to find a way of turning this off using /sys or any other easy way. Maybe an RPM notification could help, but does anyone actually read those? I would assume the next VirtualBox would be fine even with this enabled, but who knows.

As for migrating to KVM, there are simply too many issues with proprietary licensing tied to the (virtual) hardware to do that easily. I started with VirtualBox due to the graphics performance about 3 laptops ago, and now I am just stuck there. I am sure there are many users in the same boat.

-- Ben
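For anyone wanting to confirm that the workaround actually took effect after a reboot, here is a minimal sketch in Python. The function names are hypothetical, and the check only recognises the exact spelling used in this thread ("randomize_kstack_offset=off"); it assumes that the last occurrence on the command line is the one that counts.

```python
def kstack_randomization_disabled(cmdline):
    """Return True if the command line ends up with randomize_kstack_offset=off.

    Assumption: if the parameter appears more than once, the last one wins.
    """
    state = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "randomize_kstack_offset" and sep:
            state = value
    return state == "off"


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


# Example with a made-up command line; on a real system one would pass
# read_running_cmdline() instead.
sample = "root=/dev/sda2 quiet randomize_kstack_offset=off"
print(kstack_randomization_disabled(sample))
```

A False result only means the parameter was not set to "off" on the command line; whether the randomisation is then active depends on the default the kernel was built with.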
On 7/14/21 11:07 AM, S. B. wrote:
Ben Holmes wrote:
I can now run everything fine in virtualbox after adding "randomize_kstack_offset=off" to the kernel command line.
Thanks very much! I also confirm that this workaround works.
That workaround has been discussed in boo#1188105. I am trying to fix the cause of breakage, which should be easy to fix, once I track my way through the spaghetti code that makes up VirtualBox. Larry
participants (4)
- Ben Holmes
- Jiri Slaby
- Larry Finger
- S. B.