[opensuse-kernel] Moblin kernel merged to FACTORY
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
It really wasn't that many changes, the real work is in the configurations.
So, here's what I plan on doing, and it would be great to get some feedback.
For Moblin, we used a PAE kernel as "kernel-default". For FACTORY, I can't do that, and as we moved away from the -legacy to -default naming scheme, I'll change the Moblin images to use kernel-pae.
In the kernel-pae config, I'd like to start changing stuff to reflect the fastboot things we did for Moblin. In the end, we were booting the kernel in less than a second on a tiny netbook, and I see no reason why we can't do the same for FACTORY and all future releases.
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
I'll also disable a few things that PAE systems should never need (like ISA), and a few other things that are in the Moblin kernel config.
In the end, this means that you can boot without an initrd at all, but we need to move our init script changes over to FACTORY as well to take full advantage of this. There's also some mkinitrd magic I need to figure out so that we don't accidentally create initrd when we don't need them (which is a bug right now.)
Any objections to any of this? Hopefully this will help with both the Moblin releases, which should be in the near future, as well as openSUSE 11.2, which will probably happen afterward.
thanks,
greg k-h
--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-kernel+help@opensuse.org
On Fri, Jun 19, 2009 at 03:00:46PM -0700, Greg KH wrote:
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
Ah, sorry for the vagueness, this means that I am now using the 2.6.30 kernel for the Moblin builds. I forward ported the 2.6.29 Moblin bits to FACTORY, which is 2.6.30.
Hope that clears up any confusion.
thanks,
greg k-h
Greg KH wrote:
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
Cool. The majority of it looks like small fixes and adding the /dev stuff.
It really wasn't that many changes, the real work is in the configurations.
So, here's what I plan on doing, and it would be great to get some feedback.
For Moblin, we used a PAE kernel as "kernel-default". For FACTORY, I can't do that, and as we moved away from the -legacy to -default naming scheme, I'll change the Moblin images to use kernel-pae.
In the kernel-pae config, I'd like to start changing stuff to reflect the fastboot things we did for Moblin. In the end, we were booting the kernel in less than a second on a tiny netbook, and I see no reason why we can't do the same for FACTORY and all future releases.
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
I'll also disable a few things that PAE systems should never need (like ISA), and a few other things that are in the Moblin kernel config.
The PAE config already has CONFIG_ISA=n.
In the end, this means that you can boot without an initrd at all, but we need to move our init script changes over to FACTORY as well to take full advantage of this. There's also some mkinitrd magic I need to figure out so that we don't accidentally create initrd when we don't need them (which is a bug right now.)
Did you figure out a way to discover when a module is built into the kernel instead of just unavailable?
Any objections to any of this? Hopefully this will help with both the Moblin releases, which should be in the near future, as well as openSUSE 11.2, which will probably happen afterward.
Outside of my usual objections, no. This looks like a good win.
-Jeff
--
Jeff Mahoney
SUSE Labs
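Jeff's question about telling a built-in driver apart from one that is simply unavailable can be illustrated with a minimal sketch. It assumes the kernel's config file is available as text and relies only on the standard .config line forms (`CONFIG_X=y`, `CONFIG_X=m`, `# CONFIG_X is not set`). The function names and the sample symbols are hypothetical, and the mapping from a module name (which is what mkinitrd works with) to its CONFIG symbol is assumed to be known, which is not generally the case.

```python
import re

# Standard .config line forms: "CONFIG_FOO=value" and "# CONFIG_FOO is not set".
_SET = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_[A-Za-z0-9_]+) is not set$")


def parse_config(text):
    """Return a dict mapping each CONFIG symbol to its value ("y", "m", "n", ...)."""
    symbols = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            symbols[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            symbols[match.group(1)] = "n"
    return symbols


def driver_state(symbols, name):
    """Classify a symbol as built-in, module, or unavailable (hypothetical helper)."""
    value = symbols.get(name, "n")
    if value == "y":
        return "built-in"
    if value == "m":
        return "module"
    return "unavailable"


# Made-up sample input, not taken from any real FACTORY config.
SAMPLE = """\
CONFIG_ATA=y
CONFIG_USB_STORAGE=m
# CONFIG_ISA is not set
"""

if __name__ == "__main__":
    parsed = parse_config(SAMPLE)
    for symbol in ("CONFIG_ATA", "CONFIG_USB_STORAGE", "CONFIG_ISA"):
        print(symbol, driver_state(parsed, symbol))
```

A tool like mkinitrd could use a check of this kind to skip a driver that is already built in rather than treating it as missing, provided the name-to-symbol mapping problem is solved separately.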
On Saturday 20 June 2009 12:00:46 am Greg KH wrote: ...
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
What if built-in drivers break on specific HW? Normal /etc/modprobe.conf blacklisting won't work. It would be great to have the linuxrc interpreted boot param brokenmodules= (to at least make sure you can install if elementary stuff breaks) taken into account by the kernel. No idea whether/how this could work out.
Thomas
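The idea of honouring brokenmodules= for built-in drivers can be sketched roughly as follows. This is only an illustration of the logic, not how the kernel or linuxrc implements anything: the comma-separated value format, the function names, and the driver names are all assumptions made for the example.

```python
def parse_brokenmodules(cmdline):
    """Collect driver names from every brokenmodules= token on a boot command line.

    Assumes a comma-separated list, e.g. "brokenmodules=foo,bar".
    """
    broken = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "brokenmodules" and sep:
            broken.update(name for name in value.split(",") if name)
    return broken


def runnable_initcalls(initcalls, cmdline):
    """Return the init calls that remain after dropping the blacklisted drivers."""
    broken = parse_brokenmodules(cmdline)
    return [name for name in initcalls if name not in broken]


if __name__ == "__main__":
    # Hypothetical command line and driver list.
    cmdline = "root=/dev/sda1 brokenmodules=ahci,ata_piix quiet"
    print(runnable_initcalls(["ahci", "usbcore", "ata_piix"], cmdline))
```

The point of the sketch is that a built-in driver would need a check like this before its init function runs, since there is no module load step left at which a blacklist could intervene.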
Hi all,
Jeff Mahoney wrote:
Greg KH wrote:
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
Cool. The majority of it looks like small fixes and adding the /dev stuff.
It really wasn't that many changes, the real work is in the configurations.
So, here's what I plan on doing, and it would be great to get some feedback.
For Moblin, we used a PAE kernel as "kernel-default". For FACTORY, I can't do that, and as we moved away from the -legacy to -default naming scheme, I'll change the Moblin images to use kernel-pae.
In the kernel-pae config, I'd like to start changing stuff to reflect the fastboot things we did for Moblin. In the end, we were booting the kernel in less than a second on a tiny netbook, and I see no reason why we can't do the same for FACTORY and all future releases.
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already given this some thought and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Seeing the amount of trouble we've been running into with built-in modules, I'd rather avoid this exercise again. Building in infrastructure modules is okay in general, and also driver modules which are not expected to change a lot (like the loopback interface or stuff like that). But everything else is bound to cause trouble.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
At Mon, 22 Jun 2009 09:12:14 +0200, Hannes Reinecke wrote:
Hi all,
Jeff Mahoney wrote:
Greg KH wrote:
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
Cool. The majority of it looks like small fixes and adding the /dev stuff.
It really wasn't that many changes, the real work is in the configurations.
So, here's what I plan on doing, and it would be great to get some feedback.
For Moblin, we used a PAE kernel as "kernel-default". For FACTORY, I can't do that, and as we moved away from the -legacy to -default naming scheme, I'll change the Moblin images to use kernel-pae.
In the kernel-pae config, I'd like to start changing stuff to reflect the fastboot things we did for Moblin. In the end, we were booting the kernel in less than a second on a tiny netbook, and I see no reason why we can't do the same for FACTORY and all future releases.
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
I thought we still have /boot/vmlinux-$VERSION.gz in each kernel package. I guess this will be kept in future, too, because it's needed for many debug tools.
Takashi
Takashi Iwai wrote:
At Mon, 22 Jun 2009 09:12:14 +0200, Hannes Reinecke wrote:
Hi all,
Jeff Mahoney wrote:
Greg KH wrote:
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
Cool. The majority of it looks like small fixes and adding the /dev stuff.
[ ... ]
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
I thought we still have /boot/vmlinux-$VERSION.gz in each kernel package. I guess this will be kept in future, too, because it's needed for many debug tools.
Yes, but when going down that route we would either
- boot from an uncompressed kernel -> longer booting time
or
- keep the bzImage header around somewhere and do the compressing ourselves.
Neither of these approaches is very appealing.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
Hannes Reinecke wrote:
Jeff Mahoney wrote:
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
This is something we discussed briefly a few months ago and the consensus was that there just wasn't enough information in the installation to properly link and assemble the new image. The idea just sort of petered out. I was thinking, though, that with the addition of a few more files, we might be able to make it work. The helpers in .../tools/, setup.bin, and a bit of scripting might be enough, but I haven't looked into it deeply enough to back that up with solid data.
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Wouldn't this also require a build environment? If not, doesn't it run into the same problem that we have now with serially loading the modules?
-Jeff
--
Jeff Mahoney
SUSE Labs
On Mon, Jun 22, 2009 at 09:12:14AM +0200, Hannes Reinecke wrote:
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Big problem is that you need the .c files because you can have different code paths built in the file depending on if you are built to be a module or built into the kernel due to #ifdefs :(
thanks,
greg k-h
On Sun, Jun 21, 2009 at 09:08:22PM +0200, Thomas Renninger wrote:
On Saturday 20 June 2009 12:00:46 am Greg KH wrote: ...
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
What if built-in drivers break on specific HW?
Then we fix the problem :)
Normal /etc/modprobe.conf blacklisting won't work.
I agree, but if you look at the modules we are building in, they are all so far "common" modules that I do not think have ever been blacklisted.
thanks,
greg k-h
On Monday 22 June 2009, Greg KH wrote:
On Sun, Jun 21, 2009 at 09:08:22PM +0200, Thomas Renninger wrote:
On Saturday 20 June 2009 12:00:46 am Greg KH wrote: ...
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
What if built-in drivers break on specific HW?
Then we fix the problem :)
Normal /etc/modprobe.conf blacklisting won't work.
I agree, but if you look at the modules we are building in, they are all so far "common" modules that I do not think have ever been blacklisted.
You'd be surprised. Please don't underestimate the problem Thomas is pointing you at, it's very real. It doesn't mean we don't want to build these drivers in, but this means that if we do, we need a way to disable them.
If you decide to ignore this problem today, L3 and R&D will remind you about it on a weekly basis for the next 7 years ;)
--
Jean Delvare
Suse L3
On Mon, Jun 22, 2009 at 06:55:51PM +0200, Jean Delvare wrote:
On Monday 22 June 2009, Greg KH wrote:
On Sun, Jun 21, 2009 at 09:08:22PM +0200, Thomas Renninger wrote:
On Saturday 20 June 2009 12:00:46 am Greg KH wrote: ...
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
What if built-in drivers break on specific HW?
Then we fix the problem :)
Normal /etc/modprobe.conf blacklisting won't work.
I agree, but if you look at the modules we are building in, they are all so far "common" modules that I do not think have ever been blacklisted.
You'd be surprised. Please don't underestimate the problem Thomas is pointing you at, it's very real. It doesn't mean we don't want to build these drivers in, but this means that if we do, we need a way to disable them.
Fair enough, I'll work on that.
If you decide to ignore this problem today, L3 and R&D will remind you about it on a weekly basis for the next 7 years ;)
Heh. But note, that this is not being done (yet) for a product that we provide L3 support for :)
thanks,
greg k-h
Greg KH wrote:
On Mon, Jun 22, 2009 at 09:12:14AM +0200, Hannes Reinecke wrote:
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Big problem is that you need the .c files because you can have different code paths built in the file depending on if you are built to be a module or built into the kernel due to #ifdefs :(
I know they exist, but what are the valid use cases for doing that and do we need to worry about a lot of them? It seems like the cases can be broken down into a few categories:
* print something
* change a description string
* optimize away things that aren't required when statically linked
A lot of the stupid things are in ISA drivers.
I do see one case in usbcore, but even that seems like it should always allow usbcore.nousb and enable nousb for ifndef MODULE.
I do see your point that making assumptions like this could be fragile.
-Jeff
--
Jeff Mahoney
SUSE Labs
On Mon, Jun 22, 2009 at 01:38:01PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
On Mon, Jun 22, 2009 at 09:12:14AM +0200, Hannes Reinecke wrote:
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Big problem is that you need the .c files because you can have different code paths built in the file depending on if you are built to be a module or built into the kernel due to #ifdefs :(
I know they exist, but what are the valid use cases for doing that and do we need to worry about a lot of them? It seems like the cases can be broken down into a few categories:
* print something
* change a description string
* optimize away things that aren't required when statically linked
Also:
- initialize something at a different run level
Now that should be fixed up properly by doing the correct macro, but I have seen it enough that it is common.
A lot of the stupid things are in ISA drivers.
Agreed, and we aren't building ISA drivers for "real" systems anymore, thankfully :)
I do see one case in usbcore, but even that seems like it should always allow usbcore.nousb and enable nousb for ifndef MODULE.
I do see your point that making assumptions like this could be fragile.
Yeah, it's the odd-cases that I worry about here.
thanks,
greg k-h
On Sat, Jun 20, 2009 at 05:07:05PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
Hi all,
I just now got the Moblin (2.6.29) kernel merged into the FACTORY kernel, so it should start showing up in the next few builds.
Cool. The majority of it looks like small fixes and adding the /dev stuff.
Yes. There is also some weird init call ordering that I'm not quite sure why it's needed, but it speeds boot up, so I'm not complaining.
It really wasn't that many changes, the real work is in the configurations.
So, here's what I plan on doing, and it would be great to get some feedback.
For Moblin, we used a PAE kernel as "kernel-default". For FACTORY, I can't do that, and as we moved away from the -legacy to -default naming scheme, I'll change the Moblin images to use kernel-pae.
In the kernel-pae config, I'd like to start changing stuff to reflect the fastboot things we did for Moblin. In the end, we were booting the kernel in less than a second on a tiny netbook, and I see no reason why we can't do the same for FACTORY and all future releases.
To achieve this, I'll start to change the i386/pae and x86-64/default configurations to build a whole raft of drivers into the kernel, which speeds up booting a _lot_ due to the async probing that it allows the kernel to do.
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
I'll also disable a few things that PAE systems should never need (like ISA), and a few other things that are in the Moblin kernel config.
The PAE config already has CONFIG_ISA=n.
Ah, you're right, no wonder my diff didn't show it :)
In the end, this means that you can boot without an initrd at all, but we need to move our init script changes over to FACTORY as well to take full advantage of this. There's also some mkinitrd magic I need to figure out so that we don't accidentally create initrd when we don't need them (which is a bug right now.)
Did you figure out a way to discover when a module is built into the kernel instead of just unavailable?
No.
thanks,
greg k-h
Greg KH wrote:
On Mon, Jun 22, 2009 at 01:38:01PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
On Mon, Jun 22, 2009 at 09:12:14AM +0200, Hannes Reinecke wrote:
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Big problem is that you need the .c files because you can have different code paths built in the file depending on if you are built to be a module or built into the kernel due to #ifdefs :(
I know they exist, but what are the valid use cases for doing that and do we need to worry about a lot of them? It seems like the cases can be broken down into a few categories:
* print something
* change a description string
* optimize away things that aren't required when statically linked
Also:
- initialize something at a different run level
But that's really just to address dependencies, right? I don't intend to load the modules at the same runlevel where they would have run if normally compiled statically. If we load the linked module after the usual static parts have initialized, then we'll still observe the dependencies.
Now that should be fixed up properly by doing the correct macro, but I have seen it enough that it is common.
A lot of the stupid things are in ISA drivers.
Agreed, and we aren't building ISA drivers for "real" systems anymore, thankfully :)
I do see one case in usbcore, but even that seems like it should always allow usbcore.nousb and enable nousb for ifndef MODULE.
I do see your point that making assumptions like this could be fragile.
Yeah, it's the odd-cases that I worry about here.
Or perhaps a different solution would be to whitelist modules which are known to be safe. Given the number of modules we want to typically link in, this shouldn't be a long list.
-Jeff
--
Jeff Mahoney
SUSE Labs
On Mon, Jun 22, 2009 at 02:15:13PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
On Mon, Jun 22, 2009 at 01:38:01PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
On Mon, Jun 22, 2009 at 09:12:14AM +0200, Hannes Reinecke wrote:
I'm still not a fan of this, but in the absence of the ability to link in modules at install time, I guess the gains outweigh the drawbacks.
Why don't we do something about it?
I've already spent some thoughts about it, and come up with two possibilities:
- Link in modules during initrd run. Shouldn't be too hard, after all that's what the kernel does nowadays during building anyway. So just some linker magic and you're done. Drawback is that you'd need an uncompressed kernel to start with, so I'm not sure it's the right way to go
- Implement something like the 'kexec-cache' from Mac OS X. OS X has a 'kexec-cache', which allows preloading some kernel modules during boot. Implementing a similar thing on Linux, we could just stuff the preloaded modules into a blob and load this as an additional initrd image. Then we could just call the ->init calls and everything would be dandy. Or that's the hope.
Big problem is that you need the .c files because you can have different code paths built in the file depending on if you are built to be a module or built into the kernel due to #ifdefs :(
I know they exist, but what are the valid use cases for doing that and do we need to worry about a lot of them? It seems like the cases can be broken down into a few categories:
* print something
* change a description string
* optimize away things that aren't required when statically linked
Also:
- initialize something at a different run level
But that's really just to address dependencies, right? I don't intend to load the modules at the same runlevel where they would have run if normally compiled statically. If we load the linked module after the usual static parts have initialized, then we'll still observe the dependencies.
No, it's to tell the kernel exactly when to initialize the code at what part during the init level processing, and link order matters.
Actually, in thinking about it some more, I don't think this is going to work properly for the "fastboot" stuff that we really need. Here's why:
- When drivers are built into the kernel, they are initialized in the order in which the Makefile places them, and we build them in pretty early. This allows the drivers to start up, and do some async stuff while the rest of the kernel initializes.
- If you somehow "link" the modules into the built kernel, you will have to set up a mechanism to call the module_init() calls. The only safe way to do that is at the end of the init cycle. So any async processing that could have happened, will not, as these drivers will be the last things in the boot process now, instead of very early like they used to be.
Now if you could figure out how to insert them into the link order in the boot process in the same sequence as if they were built in, that would be very nice, but I don't see how that would be possible.
So, by adding the modules on to the kernel image, all we would save would be the module load time, which while not insignificant, is not sufficient for the boot times we are needing to achieve here.
thanks,
greg k-h
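The timing argument above can be made concrete with a toy model. This is a rough sketch under strong assumptions: the init steps run strictly one after another, the driver's init call itself costs nothing, and its probe then runs fully in parallel. All durations are made up, and `boot_time` is a hypothetical helper, not a kernel interface.

```python
def boot_time(steps_ms, probe_ms, driver_slot):
    """Time (ms) at which both the sequential init steps and the async probe are done.

    steps_ms:    durations of the sequential init steps
    probe_ms:    duration of the background probe the driver kicks off
    driver_slot: index in steps_ms before which the driver's init is called
    """
    probe_start = sum(steps_ms[:driver_slot])
    sequential_end = sum(steps_ms)
    return max(sequential_end, probe_start + probe_ms)


if __name__ == "__main__":
    steps = [200, 300, 400]  # made-up step durations in milliseconds
    # Built in: the driver's init runs first, so the probe overlaps everything else.
    early = boot_time(steps, probe_ms=600, driver_slot=0)
    # Appended module: the init call only runs at the end of the init cycle.
    late = boot_time(steps, probe_ms=600, driver_slot=len(steps))
    print("built in:", early, "ms; appended at the end:", late, "ms")
```

In this model the probe is hidden entirely behind the other init work when the driver starts early, and adds its full duration when the driver starts last, which is the difference the message describes.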
Hi,
On Mon, 22 Jun 2009, Greg KH wrote:
But that's really just to address dependencies, right? I don't intend to load the modules at the same runlevel where they would have run if normally compiled statically. If we load the linked module after the usual static parts have initialized, then we'll still observe the dependencies.
No, it's to tell the kernel exactly when to initialize the code at what part during the init level processing, and link order matters.
"exactly when to initialize the code" == "addresses dependencies", isn't it?
Actually, in thinking about it some more, I don't think this is going to work properly for the "fastboot" stuff that we really need. Here's why:
- When drivers are built into the kernel, they are initialized in the order in which the Makefile places them, and we build them in pretty early. This allows the drivers to start up, and do some async stuff while the rest of the kernel initializes.
Excuse me for not being up-to-date wrt. the kernel anymore, but isn't this done via the .init sections?
- If you somehow "link" the modules into the built kernel, you will have to set up a mechanism to call the module_init() calls. The only safe way to do that is at the end of the init cycle. So any async processing that could have happened, will not, as these drivers will be the last things in the boot process now, instead of very early like they used to be.
... Because if it is, then linking modules into the built kernel after the fact isn't going to change this principle. You still have a .initcall section (well, two of them, one for the built kernel, one for the module lump) which the kernel proper would iterate over very early (after determining existence of the second initcall table).
Ciao, Michael.
Michael Matz wrote:
Hi,
On Mon, 22 Jun 2009, Greg KH wrote:
[ .. ]
Actually, in thinking about it some more, I don't think this is going to work properly for the "fastboot" stuff that we really need. Here's why:
- When drivers are built into the kernel, they are initialized in the order in which the Makefile places them, and we build them in pretty early. This allows the drivers to start up, and do some async stuff while the rest of the kernel initializes.
Excuse me for not being up-to-date wrt. the kernel anymore, but isn't this done via the .init sections?
- If you somehow "link" the modules into the built kernel, you will have to set up a mechanism to call the module_init() calls. The only safe way to do that is at the end of the init cycle. So any async processing that could have happened, will not, as these drivers will be the last things in the boot process now, instead of very early like they used to be.
... Because if it is, then linking modules into the built kernel after the fact isn't going to change this principle. You still have a .initcall section (well, two of them, one for the built kernel, one for the module lump) which the kernel proper would iterate over very early (after determining existence of the second initcall table).
Which is exactly my thoughts. The only valid argument currently against this is the #ifdef MODULE case. One would have to look at the individual cases, but I suspect most of these are leftovers and should be cleaned up anyway.
Cheers, Hannes
Hannes Reinecke wrote:
Michael Matz wrote: [ .. ]
fact isn't going to change this principle. You still have a .initcall section (well, two of them, one for the built kernel, one for the module lump) which the kernel proper would iterate over very early (after determining existence of the second initcall table).
Which is exactly my thoughts.
The only valid argument currently against this is the #ifdef MODULE case. One would have to look at the individual cases, but I suspect most of these are leftovers and should be cleaned up anyway.
As suspected. A quick glance at drivers/scsi revealed things like:
drivers/scsi/gdth.c:
    #ifndef MODULE
    __setup("gdth=", option_setup);
    #endif
drivers/scsi/gvp11.c:
    int gvp11_release(struct Scsi_Host *instance)
    {
    #ifdef MODULE
        DMA(instance)->CNTR = 0;
        release_mem_region(ZTWO_PADDR(instance->base), 256);
        free_irq(IRQ_AMIGA_PORTS, instance);
        wd33c93_release();
    #endif
        return 1;
    }
drivers/scsi/ibmmca.c:
    #if defined(MODULE)
    static char *boot_options = NULL;
    module_param(boot_options, charp, 0);
    module_param_array(io_port, int, NULL, 0);
    module_param_array(scsi_id, int, NULL, 0);
    MODULE_LICENSE("GPL");
    #endif
and my all-time favourite:
drivers/scsi/BusLogic.c:
    #ifdef MODULE
    static struct pci_device_id BusLogic_pci_tbl[] __devinitdata = {
        { PCI_VENDOR_ID_BUSLOGIC, PCI_DEVICE_ID_BUSLOGIC_MULTIMASTER, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
        { PCI_VENDOR_ID_BUSLOGIC, PCI_DEVICE_ID_BUSLOGIC_MULTIMASTER_NC, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
        { PCI_VENDOR_ID_BUSLOGIC, PCI_DEVICE_ID_BUSLOGIC_FLASHPOINT, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
        { }
    };
    #endif
    MODULE_DEVICE_TABLE(pci, BusLogic_pci_tbl);
So it's about time to have that cleaned up anyway.
Cheers, Hannes
On Tue, Jun 23, 2009 at 01:03:48PM +0200, Michael Matz wrote:
Hi,
On Mon, 22 Jun 2009, Greg KH wrote:
But that's really just to address dependencies, right? I don't intend to load the modules at the same runlevel where they would have run if normally compiled statically. If we load the linked module after the usual static parts have initialized, then we'll still observe the dependencies.
No, it's to tell the kernel exactly when to initialize the code at what part during the init level processing, and link order matters.
"exactly when to initialize the code" == "addresses dependencies", isn't it?
No, see below for details.
Actually, in thinking about it some more, I don't think this is going to work properly for the "fastboot" stuff that we really need. Here's why:
- When drivers are built into the kernel, they are initialized in the order in which the Makefile places them, and we build them in pretty early. This allows the drivers to start up, and do some async stuff while the rest of the kernel initializes.
Excuse me for not being up-to-date wrt. the kernel anymore, but isn't this done via the .init sections?
Yes it is, but the order within the .init sections matters.
- If you somehow "link" the modules into the built kernel, you will have to set up a mechanism to call the module_init() calls. The only safe way to do that is at the end of the init cycle. So any async processing that could have happened, will not, as these drivers will be the last things in the boot process now, instead of very early like they used to be.
... Because if it is, then linking modules into the built kernel after the fact isn't going to change this principle. You still have a .initcall section (well, two of them, one for the built kernel, one for the module lump) which the kernel proper would iterate over very early (after determining existence of the second initcall table).
We really have 8 different levels of init calls in the kernel these days:
    #define pure_initcall(fn)            __define_initcall("0",fn,0)
    #define core_initcall(fn)            __define_initcall("1",fn,1)
    #define core_initcall_sync(fn)       __define_initcall("1s",fn,1s)
    #define postcore_initcall(fn)        __define_initcall("2",fn,2)
    #define postcore_initcall_sync(fn)   __define_initcall("2s",fn,2s)
    #define arch_initcall(fn)            __define_initcall("3",fn,3)
    #define arch_initcall_sync(fn)       __define_initcall("3s",fn,3s)
    #define subsys_initcall(fn)          __define_initcall("4",fn,4)
    #define subsys_initcall_sync(fn)     __define_initcall("4s",fn,4s)
    #define fs_initcall(fn)              __define_initcall("5",fn,5)
    #define fs_initcall_sync(fn)         __define_initcall("5s",fn,5s)
    #define rootfs_initcall(fn)          __define_initcall("rootfs",fn,rootfs)
    #define device_initcall(fn)          __define_initcall("6",fn,6)
    #define device_initcall_sync(fn)     __define_initcall("6s",fn,6s)
    #define late_initcall(fn)            __define_initcall("7",fn,7)
    #define late_initcall_sync(fn)       __define_initcall("7s",fn,7s)
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all 8 of these levels have run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile. If you look at some of the recent changes that were made for "fastboot", we reorder the Makefiles to allow some things to run in parallel before others do (like ata drivers very early, before other drivers in the same run level, to take advantage of the slowness of those initialization sequences).
So if you just take the module init sections, and run them some time after all of the above sections run, then you don't get the same speedups that we need.
If you look at the startup boot graphs, this is seen quite well, with the ATA drives taking a long time to start up, all the while the rest of the kernel is running along, initializing other things. If you move the ata drivers to the end of the init sequence, then the whole kernel waits for that hardware to start up, wasting almost a full second.
Remember, we are talking about a whole kernel boot time of less than a second right now, so optimizations like this are essential to get there.
hope this helps explain things a bit better,
greg k-h
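To make the ordering rule concrete, here is a minimal user-space sketch, not kernel code. It models built-in calls running level by level and, within a level, in table order (the table stands in for link order), with init calls from code built as modules running only after every level has finished. The names boot_order, position_of and struct initcall are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; these names are not kernel APIs. */
enum { NUM_LEVELS = 8 };

struct initcall {
    int level;         /* 0 (pure) .. 7 (late), mirroring the macro list */
    const char *name;  /* label appended to the log when the call "runs" */
};

/*
 * Record the order in which calls would run: built-in calls level by level,
 * in table order within a level (the table stands in for link order), then
 * module init calls after every level has finished.
 * Returns the number of entries written to log (at most cap).
 */
static size_t boot_order(const struct initcall *builtin, size_t nbuiltin,
                         const char *const *modules, size_t nmodules,
                         const char **log, size_t cap)
{
    size_t n = 0;

    for (int level = 0; level < NUM_LEVELS; level++)
        for (size_t i = 0; i < nbuiltin; i++) {
            assert(builtin[i].level >= 0 && builtin[i].level < NUM_LEVELS);
            if (builtin[i].level == level && n < cap)
                log[n++] = builtin[i].name;
        }
    for (size_t i = 0; i < nmodules && n < cap; i++)
        log[n++] = modules[i];
    return n;
}

/* Position of name in log, or n if it never ran. */
static size_t position_of(const char *const *log, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(log[i], name) == 0)
            return i;
    return n;
}
```

In this model, a table listing a level-6 "usb" entry, a level-4 "pci" entry and a level-6 "net" entry runs pci first, then usb and net in table order, and any module name passed in runs last, regardless of which level it would have used as built-in code.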
Greg KH wrote:
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all 8 of these levels have run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile. If you look at some of the recent changes that were made for "fastboot", we reorder the Makefiles to allow some things to run in parallel before others do (like ata drivers very early, before other drivers in the same run level, to take advantage of the slowness of those initialization sequences).
So if you just take the module init sections, and run them some time after all of the above sections run, then you don't get the same speedups that we need.
Not having that information in the module is a file size optimization, not a real requirement. We could easily link in the initcall*.init sections into the modules and then use them for proper ordering when we relink the kernel. Since the initcall sections are consolidated into one during the original link, we would have to add a trampoline initcall to the end of each section to call out to the linked in ones. Since they'd be at the end of the runlevel, the ordering would be preserved.
-Jeff
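The trampoline idea can be sketched in user space as follows. This is only an illustration under assumptions: the tables, the record() helper and the function names are hypothetical, and the real kernel collects initcalls through linker sections rather than C arrays. One level's built-in table ends with a trampoline entry that walks a second table holding the calls that came from the linked-in modules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; not the kernel's real initcall tables. */
typedef int (*initcall_t)(void);

enum { MAX_TRACE = 16 };
static int trace[MAX_TRACE];   /* ids of the calls, in the order they ran */
static size_t trace_len;

static int record(int id)
{
    assert(trace_len < MAX_TRACE);
    trace[trace_len++] = id;
    return 0;
}

/* Two built-in device-level calls and two calls that came from modules. */
static int builtin_a(void) { return record(1); }
static int builtin_b(void) { return record(2); }
static int module_x(void)  { return record(10); }
static int module_y(void)  { return record(11); }

/* Second table: initcalls collected from the linked-in modules. */
static const initcall_t module_level6[] = { module_x, module_y };

/* Trampoline: an ordinary initcall whose only job is to walk that table. */
static int level6_trampoline(void)
{
    for (size_t i = 0; i < sizeof module_level6 / sizeof module_level6[0]; i++)
        module_level6[i]();
    return 0;
}

/* Built-in table for the level, with the trampoline as its last entry. */
static const initcall_t builtin_level6[] = {
    builtin_a, builtin_b, level6_trampoline,
};

static void run_level6(void)
{
    trace_len = 0;
    for (size_t i = 0; i < sizeof builtin_level6 / sizeof builtin_level6[0]; i++)
        builtin_level6[i]();
}
```

After run_level6(), the trace holds the two built-in ids followed by the two module ids. The level is preserved, but every module call lands behind every built-in call of that level.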
On Tue, Jun 23, 2009 at 12:17:17PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all 8 of these levels have run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile. If you look at some of the recent changes that were made for "fastboot", we reorder the Makefiles to allow some things to run in parallel before others do (like ata drivers very early, before other drivers in the same run level, to take advantage of the slowness of those initialization sequences).
So if you just take the module init sections, and run them some time after all of the above sections run, then you don't get the same speedups that we need.
Not having that information in the module is a file size optimization, not a real requirement. We could easily link in the initcall*.init sections into the modules and then use them for proper ordering when we relink the kernel. Since the initcall sections are consolidated into one during the original link, we would have to add a trampoline initcall to the end of each section to call out to the linked in ones. Since they'd be at the end of the runlevel, the ordering would be preserved.
No, the main issue is the order _within_ the runlevel. Almost every driver is in the "main" device_initcall level. But the ordering within that level matters, and that is only defined by the Makefile order, no symbol resolution or anything else specific we can determine later.
thanks, greg k-h
On Tue, Jun 23, 2009 at 12:30:41PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
On Tue, Jun 23, 2009 at 12:17:17PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all 8 of these levels have run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile. If you look at some of the recent changes that were made for "fastboot", we reorder the Makefiles to allow some things to run in parallel before others do (like ata drivers very early, before other drivers in the same run level, to take advantage of the slowness of those initialization sequences).
So if you just take the module init sections, and run them some time after all of the above sections run, then you don't get the same speedups that we need.
Not having that information in the module is a file size optimization, not a real requirement. We could easily link in the initcall*.init sections into the modules and then use them for proper ordering when we relink the kernel. Since the initcall sections are consolidated into one during the original link, we would have to add a trampoline initcall to the end of each section to call out to the linked in ones. Since they'd be at the end of the runlevel, the ordering would be preserved.
No, the main issue is the order _within_ the runlevel. Almost every driver is in the "main" device_initcall level. But the ordering within that level matters, and that is only defined by the Makefile order, no symbol resolution or anything else specific we can determine later.
I get that, but if it's safe to run it as a module later why would it matter if it's run at the end of a runlevel instead of the middle of it?
It is "safe", it just isn't "fast", which is one of the main goals here.
thanks, greg k-h
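The "safe" versus "fast" distinction can be put into a small arithmetic model. This is a sketch under assumptions, not a measurement: it supposes a single slow device whose probe proceeds in the background once its driver has started it, and it treats all other initcalls as one serial block. The function name boot_done_ms and its parameters are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical model, all times in milliseconds.
 *
 * other_init_ms: total time of all other initcalls, run back to back.
 * probe_ms:      time the slow hardware needs once its driver has started
 *                it; other initcalls keep running in the meantime.
 * start_ms:      point in the init sequence at which the driver is started.
 *
 * Returns the time at which both the other initcalls and the probe are done.
 */
static int boot_done_ms(int other_init_ms, int probe_ms, int start_ms)
{
    assert(other_init_ms >= 0 && probe_ms >= 0);
    assert(start_ms >= 0 && start_ms <= other_init_ms);

    int probe_done = start_ms + probe_ms;
    return probe_done > other_init_ms ? probe_done : other_init_ms;
}
```

With made-up inputs of 900 ms for the other initcalls and 800 ms for the probe, starting the driver at time 0 gives 900, while starting it at the very end (900) gives 1700. Both orders are correct. Only the first hides the probe time behind other work.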
Greg KH wrote:
On Tue, Jun 23, 2009 at 12:17:17PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all 8 of these levels have run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile. If you look at some of the recent changes that were made for "fastboot", we reorder the Makefiles to allow some things to run in parallel before others do (like ata drivers very early, before other drivers in the same run level, to take advantage of the slowness of those initialization sequences).
So if you just take the module init sections, and run them some time after all of the above sections run, then you don't get the same speedups that we need.
Not having that information in the module is a file size optimization, not a real requirement. We could easily link in the initcall*.init sections into the modules and then use them for proper ordering when we relink the kernel. Since the initcall sections are consolidated into one during the original link, we would have to add a trampoline initcall to the end of each section to call out to the linked in ones. Since they'd be at the end of the runlevel, the ordering would be preserved.
No, the main issue is the order _within_ the runlevel. Almost every driver is in the "main" device_initcall level. But the ordering within that level matters, and that is only defined by the Makefile order, no symbol resolution or anything else specific we can determine later.
I get that, but if it's safe to run it as a module later why would it matter if it's run at the end of a runlevel instead of the middle of it?
-Jeff
Greg KH wrote:
On Tue, Jun 23, 2009 at 12:30:41PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
On Tue, Jun 23, 2009 at 12:17:17PM -0400, Jeff Mahoney wrote:
Greg KH wrote:
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all 8 of these levels have run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile. If you look at some of the recent changes that were made for "fastboot", we reorder the Makefiles to allow some things to run in parallel before others do (like ata drivers very early, before other drivers in the same run level, to take advantage of the slowness of those initialization sequences).
So if you just take the module init sections, and run them some time after all of the above sections run, then you don't get the same speedups that we need.
Not having that information in the module is a file size optimization, not a real requirement. We could easily link in the initcall*.init sections into the modules and then use them for proper ordering when we relink the kernel. Since the initcall sections are consolidated into one during the original link, we would have to add a trampoline initcall to the end of each section to call out to the linked in ones. Since they'd be at the end of the runlevel, the ordering would be preserved.
No, the main issue is the order _within_ the runlevel. Almost every driver is in the "main" device_initcall level. But the ordering within that level matters, and that is only defined by the Makefile order, no symbol resolution or anything else specific we can determine later.
I get that, but if it's safe to run it as a module later why would it matter if it's run at the end of a runlevel instead of the middle of it?
It is "safe", it just isn't "fast", which is one of the main goals here.
Oh ok, I see what you're getting at, but I think the order in the runlevel really only matters if it's slow vs fast. Granularity can't be that important. Runlevels are essentially *free* so we can do things like slow_device_init() if we want. Depending on link order works, but we end up with this mess now where a monolithic beast of a kernel is the "best" option. I was pretty happy leaving the 1.2 kernel behind. :)
-Jeff
Hello, Greg KH wrote:
No, the main issue is the order _within_ the runlevel. Almost every driver is in the "main" device_initcall level. But the ordering within that level matters, and that is only defined by the Makefile order, no symbol resolution or anything else specific we can determine later.
Recent kernels produce modules.order which lists the makefile order of modules and is currently used by modprobe to determine which module to load when multiple modules match a device alias. It can take some work but I don't think it'll be too hard to make the fast loaded modules to observe link order.
It would be nice to ship with a generic kernel and have some userland scripts which can link the dynamic image on the first boot (or when the list of loaded modules changes) and use it for subsequent boots while leaving the generic kernel as the 'safe' boot option.
Thanks.
-- tejun
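As a rough sketch of how a relinking tool could reuse modules.order, the helper below sorts a chosen set of modules by the line on which each one appears in that file. It assumes the lines have already been read into memory, and the names struct ranked, rank_of and sort_by_modules_order are hypothetical. Nothing here is taken from modprobe or Kbuild.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper; assumes modules.order was already read into memory. */
struct ranked {
    const char *path;   /* module path as it appears in modules.order */
    size_t rank;        /* line index in modules.order; norder if absent */
};

/* Line index of path in order[], or norder if the module is not listed. */
static size_t rank_of(const char *const *order, size_t norder, const char *path)
{
    for (size_t i = 0; i < norder; i++)
        if (strcmp(order[i], path) == 0)
            return i;
    return norder;   /* unlisted modules sort last */
}

/* qsort comparator: lower rank first, path as a deterministic tie-break. */
static int by_rank(const void *a, const void *b)
{
    const struct ranked *x = a, *y = b;

    if (x->rank != y->rank)
        return x->rank < y->rank ? -1 : 1;
    return strcmp(x->path, y->path);
}

/* Reorder mods[0..nmods) in place so that it follows the order[] lines. */
static void sort_by_modules_order(const char *const *order, size_t norder,
                                  struct ranked *mods, size_t nmods)
{
    assert(mods != NULL || nmods == 0);
    for (size_t i = 0; i < nmods; i++)
        mods[i].rank = rank_of(order, norder, mods[i].path);
    qsort(mods, nmods, sizeof mods[0], by_rank);
}
```

Given an order list in which an ATA core module precedes an AHCI module and a USB core module, a selection passed in any order comes back as ATA core, AHCI, USB core, which is the sequence a relinked image would want for its init calls.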
Tejun Heo wrote:
Hello,
Greg KH wrote:
No, the main issue is the order _within_ the runlevel. Almost every driver is in the "main" device_initcall level. But the ordering within that level matters, and that is only defined by the Makefile order, no symbol resolution or anything else specific we can determine later.
Recent kernels produce modules.order which lists the makefile order of modules and is currently used by modprobe to determine which module to load when multiple modules match a device alias. It can take some work but I don't think it'll be too hard to make the fast loaded modules to observe link order.
Now this would be really interesting. I was concerned about having multiple .init sections and started doing some work in that area, but I do like the idea of generating the link order via that mechanism instead of just guessing using modprobe.
Have you played around with this at all yet? I realized this morning that my linking attempts were actually done on the compressed image, not with the real vmlinux. The differences are huge: check out arch/x86/kernel/vmlinux.lds - I'm not sure how to reliably generate a good replacement since it folds sections, etc.
It would be nice to ship with a generic kernel and have some userland scripts which can link the dynamic image on the first boot (or when the list of loaded modules changes) and use it for subsequent boots while leaving the generic kernel as the 'safe' boot option.
Agreed.
-Jeff
Hello, Jeff Mahoney wrote:
Tejun Heo wrote:
Recent kernels produce modules.order which lists the makefile order of modules and is currently used by modprobe to determine which module to load when multiple modules match a device alias. It can take some work but I don't think it'll be too hard to make the fast loaded modules to observe link order.
Now this would be really interesting. I was concerned about having multiple .init sections and started doing some work in that area, but I do like the idea of generating the link order via that mechanism instead of just guessing using modprobe.
Have you played around with this at all yet? I realized this morning that my linking attempts were actually done on the compressed image, not with the real vmlinux. The differences are huge: check out arch/x86/kernel/vmlinux.lds - I'm not sure how to reliably generate a good replacement since it folds sections, etc.
Nope, not yet. I think there can be two different approaches here.
1. Try to create a `static' image from vmlinux and target modules. I doubt this would work very well. There are subtle module specific things which are setup by the module load code and I don't think this will be too easy.
2. Create a very early initrd thingie which gets loaded by the boot loader and gets unpacked and loaded early during boot. This should be easier but I'm curious how much difference it would make compared to normal initrd if we can make it go faster.
Maybe it would be better to think about how to make initrd go faster. If the dynamic relocation time is a problem, we can reserve an area in the FIXMAP area and pre-relocate modules and make them a bundle and suck it up into the kernel at once and run initialization for them as usual. It'll be slower than monolithic kernel but hopefully not by too much.
Thanks.
-- tejun
Tejun Heo wrote:
Hello,
Jeff Mahoney wrote:
Tejun Heo wrote:
Recent kernels produce modules.order which lists the makefile order of modules and is currently used by modprobe to determine which module to load when multiple modules match a device alias. It can take some work but I don't think it'll be too hard to make the fast loaded modules to observe link order.
Now this would be really interesting. I was concerned about having multiple .init sections and started doing some work in that area, but I do like the idea of generating the link order via that mechanism instead of just guessing using modprobe.
Have you played around with this at all yet? I realized this morning that my linking attempts were actually done on the compressed image, not with the real vmlinux. The differences are huge: check out arch/x86/kernel/vmlinux.lds - I'm not sure how to reliably generate a good replacement since it folds sections, etc.
Nope, not yet. I think there can be two different approaches here.
1. Try to create a `static' image from vmlinux and target modules. I doubt this would work very well. There are subtle module specific things which are setup by the module load code and I don't think this will be too easy.
Like what? AFAIK the only difference between a built-in.o and a <module>.ko is that the module has the <module>.mod.o added in which only contains a few sections. The conflicting sections can be renamed or dropped with objcopy before actually linking the static image. I had planned on ignoring the fact that it was a module and adding the static .initcall.init section to each module just like what would happen if it were compiled as a static object. That way, we can just link it in and the initcalls will be in the right section. Since it becomes statically linked, there's no need for the module loading code to be invoked. That's where the fun in ordering comes in, especially because there aren't separate initcall sections in vmlinux.
2. Create a very early initrd thingie which gets loaded by the boot loader and gets unpacked and loaded early during boot. This should be easier but I'm curious how much difference it would make compared to normal initrd if we can make it go faster.
Maybe it would be better to think about how to make initrd go faster. If the dynamic relocation time is a problem, we can reserve an area in the FIXMAP area and pre-relocate modules and make them a bundle and suck it up into the kernel at once and run initialization for them as usual. It'll be slower than monolithic kernel but hopefully not by too much.
This runs into the same problem Greg keeps underscoring. The issue is that we want to initialize some things earlier in the boot cycle. In reality, there aren't many of them right now where this will happen - but it's possible to add more runlevels to allow finer granularity when defining the module init.
-Jeff
Hello, Jeff Mahoney wrote:
1. Try to create a `static' image from vmlinux and target modules. I doubt this would work very well. There are subtle module specific things which are setup by the module load code and I don't think this will be too easy.
Like what? AFAIK the only difference between a built-in.o and a <module>.ko is that the module has the <module>.mod.o added in which only contains a few sections. The conflicting sections can be renamed or dropped with objcopy before actually linking the static image.
I was primarily thinking about percpu areas - how percpu areas are setup and accessed differently on some archs (s390 specifically) and the vmlinux defined symbols which need to be changed if the percpu sections are merged and so on. My worries could be bogus but things are subtle. I think it would be better if we can somehow share the usual module loading code path while skipping time consuming stages.
2. Create a very early initrd thingie which gets loaded by the boot loader and gets unpacked and loaded early during boot. This should be easier but I'm curious how much difference it would make compared to normal initrd if we can make it go faster.
Maybe it would be better to think about how to make initrd go faster. If the dynamic relocation time is a problem, we can reserve an area in the FIXMAP area and pre-relocate modules and make them a bundle and suck it up into the kernel at once and run initialization for them as usual. It'll be slower than monolithic kernel but hopefully not by too much.
This runs into the same problem Greg keeps underscoring. The issue is that we want to initialize some things earlier in the boot cycle. In reality, there aren't many of them right now where this will happen - but it's possible to add more runlevels to allow finer granularity when defining the module init.
Yeap, we can surely try to load and initialize earlier than the current initrd. My primary concern is having two separate mechanisms for linking. Module related code including the linker script keeps going through subtle changes, so it's a tad bit scary. Hey, but I have to admit I get scared easily. :-)
Thanks.
-- tejun
Hi, On Tue, 23 Jun 2009, Greg KH wrote:
No, it's to tell the kernel exactly when to initialize the code at what part during the init level processing, and link order matters.
"exactly when to initialize the code" == "addresses dependencies", isn't it?
No, see below for details.
I stand by the above claim after having read your mail. You essentially want to start some things early, before other things, hence there's a dependency from those other things to the first thing. Not out of correctness needs but out of speed needs. But reasons for dependencies don't matter for making them dependencies :) In any case, it's just idle terminology and a minor point.
Excuse me for not being up-to-date wrt. the kernel anymore, but isn't this done via the .init sections?
Yes it is, but order within the .init sections matters.
Yes, understood. This is determined by the link order. There is no reason that the hypothetical lump-modules-together-and-attach-to-vmlinux program cannot also observe some ordering given from the outside. In fact part of this program will be calls to the normal linker to join together the individual .o files of modules (possibly after stripping away some uninteresting sections), and that again establishes a certain order in the newly created initcall section. That's the point where you turn knobs to start some things earlier than other things, much like you right now had changed the order in the Makefile.
fact isn't going to change this principle. You still have a .initcall section (well, two of them, one for the built kernel, one for the module lump) which the kernel proper would iterate over very early (after determining existence of the second initcall table).
We really have 8 different levels of init calls in the kernel these days:
#define pure_initcall(fn) __define_initcall("0",fn,0)
#define core_initcall(fn) __define_initcall("1",fn,1)
#define core_initcall_sync(fn) __define_initcall("1s",fn,1s)
#define postcore_initcall(fn) __define_initcall("2",fn,2)
#define postcore_initcall_sync(fn) __define_initcall("2s",fn,2s)
#define arch_initcall(fn) __define_initcall("3",fn,3)
#define arch_initcall_sync(fn) __define_initcall("3s",fn,3s)
#define subsys_initcall(fn) __define_initcall("4",fn,4)
#define subsys_initcall_sync(fn) __define_initcall("4s",fn,4s)
#define fs_initcall(fn) __define_initcall("5",fn,5)
#define fs_initcall_sync(fn) __define_initcall("5s",fn,5s)
#define rootfs_initcall(fn) __define_initcall("rootfs",fn,rootfs)
#define device_initcall(fn) __define_initcall("6",fn,6)
#define device_initcall_sync(fn) __define_initcall("6s",fn,6s)
#define late_initcall(fn) __define_initcall("7",fn,7)
#define late_initcall_sync(fn) __define_initcall("7s",fn,7s)
This doesn't change the picture, as long as this information is preserved in the module .o files ...
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all of these 8 levels run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
... maybe you can't currently, but we are talking about ways to improve the situation. One of the necessary things would be to _not_ throw away this information. You would then end up with multiple .initcall sections also in modules. That's perfectly fine if the module loader simply iterates over all of them in order. The linker script can make sure that all these .initcall sections are lying next to each other (i.e. just remain separate in the ELF section list), so that the current code doesn't even need to be changed, if I'm reading it right.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile.
Yes, understood. This will still work, except for one detail: you wouldn't have just one chunk of such .initcall sections (the one in vmlinux proper), but two. So before iterating the next level you have to iterate the current level twice: once the list in vmlinux, once the list in module-lump. Then go to next level, do the same. It is true then that all things in module-lump will run after vmlinux things of the same level L. If that isn't wanted you need to introduce a new level L-0.5 for the module-lump things, et voilà, they are run before the vmlinux things at level L. Inside one level again the link order specifies the order, the link order of vmlinux (as specified in the Makefile) for vmlinux things, the link order when creating module-lump for those things.
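The two-table walk described above can be sketched as a toy model in Python. This is not kernel code: the function names in the sample tables are made up, the _sync variants of each level are left out for brevity, and the `lump_first` parameter is a hypothetical way to express the "L-0.5" idea.

```python
# Toy model of walking two initcall tables per level (not kernel code).
LEVELS = ["0", "1", "2", "3", "4", "5", "rootfs", "6", "7"]


def walk_initcalls(vmlinux, lump, lump_first=frozenset()):
    """Return the order in which initcalls would run.

    vmlinux and lump map a level name to a list of function names kept in
    link order. Levels listed in lump_first model the "L-0.5" idea: the
    lump's entries for that level run before the vmlinux entries.
    """
    order = []
    for level in LEVELS:
        tables = [vmlinux.get(level, []), lump.get(level, [])]
        if level in lump_first:
            tables.reverse()
        for table in tables:
            order.extend(table)  # link order is preserved inside a table
    return order


# Hypothetical sample tables; the names are illustrative only.
vmlinux = {"4": ["pci_subsys_init"], "6": ["serial_init"]}
lump = {"4": ["usb_init"], "6": ["ahci_init", "ext3_init"]}

default_order = walk_initcalls(vmlinux, lump)
early_order = walk_initcalls(vmlinux, lump, lump_first={"6"})
print(default_order)
print(early_order)
```

In the default walk the lump's level-6 entries follow the vmlinux ones; with level 6 in `lump_first` they run ahead of them, while the order inside each table still follows its own link order.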
Remember, we are talking about a whole boot time of the kernel to be less than a second right now, so optimizations like this are essential to get there.
Yes, understood. I don't see why that wouldn't be possible with the lumping together anymore, under the condition that modules will contain the original initcall sections in the future. I think now is the first time it actually could pay back that modules are relocatable objects instead of DSOs. The latter would be considerably more difficult to join together into one bucket, with relocatable objects it's trivial. Ciao, Michael.
Jeff Mahoney wrote:
Did you figure out a way to discover when a module is built into the kernel instead of just unavailable?
I wrote a patch for kbuild to generate a modules.builtin file and install it below /lib/modules/, the plan is to make modprobe look into this file instead of complaining about missing modules. Unfortunately the kbuild maintainer hasn't responded yet, I guess I'll add it to our tree nevertheless as we really need it now. Michal
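A minimal sketch of the lookup such a file would allow, in Python. The thread does not give the file format, so the one-path-per-line layout used here is an assumption, and `load_builtin` and `classify` are hypothetical helper names, not part of modprobe.

```python
import os


def load_builtin(lines):
    """Collect module names from lines such as 'kernel/fs/ext3/ext3.ko'.

    The one-path-per-line format is an assumption for this sketch.
    """
    names = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        base = os.path.basename(line)
        if base.endswith(".ko"):
            base = base[:-3]
        # Module names treat '-' and '_' as equivalent, so normalise them.
        names.add(base.replace("-", "_"))
    return names


def classify(module, builtin, loadable):
    """Tell a built-in module apart from a loadable or a missing one."""
    name = module.replace("-", "_")
    if name in builtin:
        return "builtin"
    if name in loadable:
        return "loadable"
    return "missing"


# Hypothetical sample data.
builtin = load_builtin(["kernel/fs/ext3/ext3.ko\n", "kernel/drivers/ata/ahci.ko\n"])
loadable = {"usb_storage"}
for mod in ("ext3", "usb-storage", "nosuch"):
    print(mod, classify(mod, builtin, loadable))
```

With a check like this, a request for a built-in module can succeed quietly instead of being reported as a missing module.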
On Wednesday, 24 June 2009 14:05:24 Michal Marek wrote:
Unfortunately the kbuild maintainer hasn't responded yet, I guess I'll add it to our tree nevertheless as we really need it now.
Sam is often quite busy, but he is very receptive to reasonable ideas and patches. I don't see adding this before it has officially gone upstream as a big risk. Thanks, Andreas
Michael Matz wrote:
Hi,
On Tue, 23 Jun 2009, Greg KH wrote:
Excuse me for not being up-to-date wrt. the kernel anymore, but isn't this done via the .init sections? Yes it is, but order within the .init sections matters.
Yes, understood. This is determined by the link order. There is no reason that the hypothetical lump-modules-together-and-attach-to-vmlinux program cannot also observe some ordering given from the outside. In fact part of this program will be calls to the normal linker to join together the individual .o files of modules (possibly after stripping away some uninteresting sections), and that again establishes a certain order in the newly created initcall section. That's the point where you turn knobs to start some things earlier than other things, much like you right now had changed the order in the Makefile.
This is an interesting point. I've, so far, just been using the linker directly and accepting that the new objects would probably just end up getting appended in the order provided on the command line. I suppose it would be possible to interleave them.
If you build any code as a module, any of these different levels all change to be the "generic" module_init() call, which runs after all of these 8 levels run. So you can't work backwards and figure out what level of init call the module really wanted to be run at if you only have a .o file.
... maybe you can't currently, but we are talking about ways to improve the situation. One of the necessary things would be to _not_ throw away this information. You would then end up with multiple .initcall sections also in modules. That's perfectly fine if the module loader simply iterates over all of them in order. The linker script can make sure that all these .initcall sections are lying next to each other (i.e. just remain separate in the ELF section list), so that the current code doesn't even need to be changed, if I'm reading it right.
The linker script does this now, but it also consolidates them into one .initcall section. AFAIK the only down side to not consolidating it anymore is using extra ELF header space. I've only looked at a few arches, but it looks like we don't use the section headers for freeing the initmem.
And then, within the different init call levels, we call the functions in the order in which they are linked into the kernel, which is driven by the Makefile.
Yes, understood. This will still work, except for one detail: you wouldn't have just one chunk of such .initcall sections (the one in vmlinux proper), but two. So before iterating the next level you have to iterate the current level twice: once the list in vmlinux, once the list in module-lump. Then go to next level, do the same. It is true then that all things in module-lump will run after vmlinux things of the same level L. If that isn't wanted you need to introduce a new level L-0.5 for the module-lump things, et voilà, they are run before the vmlinux things at level L.
How much trouble would we run into if we kept the original initcall sections in vmlinux? Would the initcall sections in the module properly merge into them? Or are we already too late and the relocation records are lost?
Inside one level again the link order specifies the order, the link order of vmlinux (as specified in the Makefile) for vmlinux things, the link order when creating module-lump for those things.
Remember, we are talking about a whole boot time of the kernel to be less than a second right now, so optimizations like this are essential to get there.
Yes, understood. I don't see why that wouldn't be possible with the lumping together anymore, under the condition that modules will contain the original initcall sections in the future.
And this bit is trivial. I already have a patch doing that on my development node. The big problem I've been running into is wrangling the complex linker script into something we can use for the secondary linking. -Jeff -- Jeff Mahoney SUSE Labs
Tejun Heo wrote:
Hello,
Jeff Mahoney wrote:
1. Try to create a `static' image from vmlinux and target modules. I doubt this would work very well. There are subtle module specific things which are setup by the module load code and I don't think this will be too easy. Like what? AFAIK the only difference between a built-in.o and a <module>.ko is that the module has the <module>.mod.o added in which only contains a few sections. The conflicting sections can be renamed or dropped with objcopy before actually linking the static image.
I was primarily thinking about percpu areas - how percpu areas are setup and accessed differently on some archs (s390 specifically) and the vmlinux defined symbols which need to be changed if the percpu sections are merged and so on. My worries could be bogus but things are subtle. I think it would be better if we can somehow share the usual module loading code path while skipping time consuming stages.
If this were a problem, wouldn't we be running into subtle bugs already? The .o files already have the .data.percpu sections regardless of whether it's a module or not. The only special case seems to be that the module code has to parse the percpu section itself and the static kernel uses global variables to delimit the sections. Regardless, I don't think we're really targeting s/390 with this. Really, we're only interested in the x86-based arches.
2. Create a very early initrd thingie which gets loaded by the boot loader and gets unpacked and loaded early during boot. This should be easier but I'm curious how much difference it would make compared to normal initrd if we can make it go faster. Maybe it would be better to think about how to make initrd go faster. If the dynamic relocation time is a problem, we can reserve an area in the FIXMAP area and pre-relocate modules and make them a bundle and suck it up into the kernel at once and run initialization for them as usual. It'll be slower than a monolithic kernel but hopefully not by too much. This runs into the same problem Greg keeps underscoring. The issue is that we want to initialize some things earlier in the boot cycle. In reality, there aren't many of them right now where this will happen - but it's possible to add more runlevels to allow finer granularity when defining the module init.
Yeap, we can surely try to load and initialize earlier than the current initrd. My primary concern is having two separate mechanisms for linking. Module related code including the linker script keeps going through subtle changes, so it's a tad bit scary. Hey, but I have to admit I get scared easily. :-)
Indeed. This is something I'm concerned about as well. I haven't yet gotten that part working. I do have a new linker script that doesn't consolidate the initcall sections so we can reuse them and insert new calls. I need to do some more research on the details of relocation. -Jeff -- Jeff Mahoney SUSE Labs
Hello, Jeff Mahoney wrote:
I was primarily thinking about percpu areas - how percpu areas are setup and accessed differently on some archs (s390 specifically) and the vmlinux defined symbols which need to be changed if the percpu sections are merged and so on. My worries could be bogus but things are subtle. I think it would be better if we can somehow share the usual module loading code path while skipping time consuming stages.
If this were a problem, wouldn't we be running into subtle bugs already? The .o files already have the .data.percpu sections regardless of whether it's a module or not. The only special case seems to be that the module code has to parse the percpu section itself and the static kernel uses global variables to delimit the sections.
Regardless, I don't think we're really targeting s/390 with this. Really, we're only interested in the x86-based arches.
Hmmm... yeah, now that I think more about it, it should be okay for x86 and for other archs too if the symbols are adjusted properly and all. Thanks. -- tejun
Hi, On Wed, 24 Jun 2009, Jeff Mahoney wrote:
iterates over all of them in order. The linker script can make sure that all these .initcall sections are lying next to each other (i.e. just remain separate in the ELF section list), so that the current code doesn't even need to be changed, if I'm reading it right.
The linker script does this now, but it also consolidates them into one .initcall section. AFAIK the only down side to not consolidating it anymore is using extra ELF header space.
Exactly.
I've only looked at a few arches, but it looks like we don't use the section headers for freeing the initmem.
Right, AFAICS (and I have looked at even less arches :) ) the initcall walker is driven just by the _initcall_start and _initcall_end symbols, not by section headers.
all things in module-lump will run after vmlinux things of the same level L. If that isn't wanted you need to introduce a new level L-0.5 for the module-lump things, et voilà, they are run before the vmlinux things at level L.
How much trouble would we run into if we kept the original initcall sections in vmlinux?
Into none I would think ...
Would the initcall sections in the module properly merge into them? Or are we already too late and the relocation records are lost?
... because this is not the case, the modules are .o files, not fully linked, so all relocations are still there.
Yes, understood. I don't see why that wouldn't be possible with the lumping together anymore, under the condition that modules will contain the original initcall sections in the future.
And this bit is trivial. I already have a patch doing that on my development node.
The big problem I've been running into is wrangling the complex linker script into something we can use for the secondary linking.
So, what's the nature of the problem? Just wrapping your head around the many sections and ALIGN directives, or something principle as in "result doesn't work and I don't know why"? Theoretically (as the sections of .ko have all the right attributes already) all that should be needed is "ld -r -o lump.ko input1.ko input2.ko ...". This currently fails because of multiply defined symbols on at least __mod_pci_device_table, init_module and cleanup_module. That must be solved in module build time or by renaming symbols. Probably making them static is the right way, a pointer to those functions is already embedded in the this_module section.
Another problem will be the .gnu.linkonce.this_module sections. ld -r will simply select one of the input ones, and not merge them. That requires probably renaming this section (I have no idea why it is linkonce anyway, kernel modules are .o files, not fully linked, so there is only ever one copy of each section, with or without linkonce marking).
It seems that currently __mod_pci_device_table is not referred to via some pointer in some module descriptor, but rather explicitly by symbol name in the loader. That's not going to work as when you lump multiple modules together you would have multiple of those. I would embed a pointer to that table into the module descriptor.
For loading of the lump it's probably useful if you mark the start and end of all collected module descriptors (the this_module sections) in the usual way, by this_module_start/_end symbols (or strip them away, no idea what would be better as I don't know what vital information is contained therein). Ciao, Michael.
Michael Matz wrote:
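The multiply-defined-symbol problem can be illustrated with a small Python model. The symbol sets below are made-up samples rather than real readelf output, and the per-module prefix used for renaming is only one hypothetical scheme (making the symbols static, as suggested above, would be another).

```python
from collections import defaultdict


def find_clashes(modules):
    """Map each global symbol defined by more than one module to its owners.

    modules: dict of module name -> set of global symbols it defines.
    """
    owners = defaultdict(list)
    for mod in sorted(modules):
        for sym in modules[mod]:
            owners[sym].append(mod)
    return {sym: mods for sym, mods in owners.items() if len(mods) > 1}


def rename_clashing(modules):
    """Give every clashing symbol a per-module prefix (hypothetical scheme)."""
    clashes = find_clashes(modules)
    return {
        mod: {f"{mod}__{sym}" if sym in clashes else sym for sym in syms}
        for mod, syms in modules.items()
    }


# Illustrative symbol sets, not taken from real modules.
modules = {
    "ahci": {"init_module", "cleanup_module", "__mod_pci_device_table"},
    "e1000": {"init_module", "cleanup_module", "__mod_pci_device_table", "e1000_up"},
}

clashes = find_clashes(modules)
fixed = rename_clashing(modules)
print(sorted(clashes))
print(find_clashes(fixed))
```

After renaming, no symbol is defined twice, which is the precondition for a plain "ld -r" of several modules to succeed in this toy setting.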
Hi,
On Wed, 24 Jun 2009, Jeff Mahoney wrote:
iterates over all of them in order. The linker script can make sure that all these .initcall sections are lying next to each other (i.e. just remain separate in the ELF section list), so that the current code doesn't even need to be changed, if I'm reading it right. The linker script does this now, but it also consolidates them into one .initcall section. AFAIK the only down side to not consolidating it anymore is using extra ELF header space.
Exactly.
I've only looked at a few arches, but it looks like we don't use the section headers for freeing the initmem.
Right, AFAICS (and I have looked at even less arches :) ) the initcall walker is driven just by the _initcall_start and _initcall_end symbols, not by section headers.
Yep.
all things in module-lump will run after vmlinux things of the same level L. If that isn't wanted you need to introduce a new level L-0.5 for the module-lump things, et voilà, they are run before the vmlinux things at level L. How much trouble would we run into if we kept the original initcall sections in vmlinux?
Into none I would think ...
I tested it by changing the INITCALLS macro in include/asm-generic/vmlinux.lds.h to issue individual sections. I had to add initcalls.{start,end} sections to contain the pointers describing the boundaries of the initcalls too.
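Why keeping the per-level sections separate need not change the walker can be shown with a toy layout model in Python. It assumes the sections are placed back to back and that the walker only looks at a start and an end boundary; the section contents are hypothetical one-letter names.

```python
def layout(sections):
    """Place per-level sections back to back; return (flat, bounds).

    sections: list of (section name, list of entries) in link order.
    bounds records where each section landed in the flat array.
    """
    flat, bounds = [], {}
    for name, entries in sections:
        start = len(flat)
        flat.extend(entries)
        bounds[name] = (start, len(flat))
    return flat, bounds


def walk(flat, start, end):
    """Walk entries between two boundary markers, ignoring section names."""
    return [flat[i] for i in range(start, end)]


# Hypothetical contents for three of the per-level sections.
sections = [
    (".initcall4.init", ["a", "b"]),
    (".initcall6.init", ["c"]),
    (".initcall7.init", ["d"]),
]
flat, bounds = layout(sections)
initcall_start, initcall_end = 0, len(flat)
print(walk(flat, initcall_start, initcall_end))
print(bounds)
```

The walker sees the same sequence whether the entries live in one consolidated section or in several adjacent ones; only the boundary markers matter to it.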
Would the initcall sections in the module properly merge into them? Or are we already too late and the relocation records are lost?
... because this is not the case, the modules are .o files, not fully linked, so all relocations are still there.
In the modules, yes. I was wondering about the static image.
Yes, understood. I don't see why that wouldn't be possible with the lumping together anymore, under the condition that modules will contain the original initcall sections in the future. And this bit is trivial. I already have a patch doing that on my development node.
The big problem I've been running into is wrangling the complex linker script into something we can use for the secondary linking.
So, what's the nature of the problem? Just wrapping your head around the many sections and ALIGN directives, or something principle as in "result doesn't work and I don't know why"? Theoretically (as the sections
Initially I was just going to reuse the original linker scripts, but since there is consolidation of sections, it ends up not working as well as expected. I have a script that parses the existing section headers and generates a script with the proper alignment (I hope). Whether it actually works or not, I can't tell yet, since I run into this:
ld: vmlinux(.text.head+0xc020000c): reloc against `boot_gdt_descr': error 2
I'm not sure what's causing that yet.
of .ko have all the right attributes already) all that should be needed is "ld -r -o lump.ko input1.ko input2.ko ...". This currently fails because of multiply defined symbols on at least __mod_pci_device_table, init_module and cleanup_module. That must be solved in module build time or by renaming symbols. Probably making them static is the right way, a pointer to those functions is already embedded in the this_module section.
Another problem will be the .gnu.linkonce.this_module sections. ld -r will simply select one of the input ones, and not merge them. That requires probably renaming this section (I have no idea why it is linkonce anyway, kernel modules are .o files, not fully linked, so there is only ever one copy of each section, with or without linkonce marking).
I've been planning on unconditionally generating the .initcall*.init section for every module (static or not) and keeping the .gnu.linkonce.this_module section only until we link with the static image.
It seems that currently __mod_pci_device_table is not referred to via some pointer in some module descriptor, but rather explicitly by symbol name in the loader. That's not going to work as when you lump multiple modules together you would have multiple of those. I would embed a pointer to that table into the module descriptor.
We don't need that either. The device table is pointed at by the driver struct, which is registered in the initcall. The __mod_pci_device_table, I think, is only used to generate the aliases for automatic loading. -Jeff -- Jeff Mahoney SUSE Labs
Hi, On Thu, 25 Jun 2009, Jeff Mahoney wrote:
I tested it by changing the INITCALLS macro in include/asm-generic/vmlinux.lds.h to issue individual sections. I had to add initcalls.{start,end} sections to contain the pointers describing the boundaries of the initcalls too.
Yes.
Would the initcall sections in the module properly merge into them? Or are we already too late and the relocation records are lost?
... because this is not the case, the modules are .o files, not fully linked, so all relocations are still there.
In the modules, yes. I was wondering about the static image.
Um, wait. You really want to link the module lump together with the static image? That's not going to work easily (or at all) as the static image is fully linked and doesn't contain all necessary relocations anymore to move around its sections (which happens if you insert stuff into the existing ones). So, instead of amending the existing .text section in vmlinux (to contain the merged .text's from the modules) you would have to put this blob into an extra .text2 (for instance) behind all other current sections of vmlinux. You'd have to do that for all sections, essentially being back at attaching the blob simply to the end of vmlinux (with the change that you only have one set of ELF section headers, instead of two, but that's minor).
In my mind that module lump would not be linked with the static image, but would rather be sort of a mega-module that is loaded implicitly. It would "somehow" be tacked at the end of vmlinux as blob and then loaded with an only mildly extended module loader (basically everything as now, just that you have multiple this_module sections, and with the difference that the initcall order is split and changed).
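The difference between growing existing sections and appending extra ones can be shown with a toy address layout in Python. The section sizes are invented, alignment is ignored, and the "2" suffix simply follows the .text2 naming used above.

```python
def place(sections, base=0):
    """Assign consecutive addresses to (name, size) pairs starting at base."""
    placed, addr = {}, base
    for name, size in sections:
        placed[name] = addr
        addr += size
    return placed


# Invented sizes for a fully linked image and for the module lump.
image = [(".text", 0x300), (".data", 0x100), (".bss", 0x80)]
lump = {".text": 0x40, ".data": 0x10}

before = place(image)

# Option 1: grow the existing sections in place (what merging would do).
merged = place([(name, size + lump.get(name, 0)) for name, size in image])

# Option 2: leave the image alone and append the lump as extra sections.
appended = place(image + [(name + "2", size) for name, size in lump.items()])

moved_by_merge = [name for name in before if merged[name] != before[name]]
moved_by_append = [name for name in before if appended[name] != before[name]]
print(moved_by_merge)
print(moved_by_append)
```

Growing .text shifts every later section, which is what needs the relocation records a fully linked image no longer carries by default; appending leaves every existing address where it was.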
I have a script that parses the existing section headers and generates a script with the proper alignment (I hope). Whether it actually works or not, I can't tell yet, since I run into this: ld: vmlinux(.text.head+0xc020000c): reloc against `boot_gdt_descr': error 2
Maybe that's a result of vmlinux already fully linked. Hmm... wasn't it once the case that during building the kernel initially only a merged .o file for everything was generated and that one then again got fully linked? vmlinux.o or something... you could use that.
I'm not sure what's causing that yet.
Seems to be an internal linker error. What's output of 'readelf -rsW vmlinux | grep boot_gdt_descr' (i.e. what type of reloc and symbol is this)? Ciao, Michael.
Michael Matz wrote:
Hi,
On Thu, 25 Jun 2009, Jeff Mahoney wrote:
I tested it by changing the INITCALLS macro in include/asm-generic/vmlinux.lds.h to issue individual sections. I had to add initcalls.{start,end} sections to contain the pointers describing the boundaries of the initcalls too.
Yes.
Would the initcall sections in the module properly merge into them? Or are we already too late and the relocation records are lost? ... because this is not the case, the modules are .o files, not fully linked, so all relocations are still there. In the modules, yes. I was wondering about the static image.
Um, wait. You really want to link the module lump together with the static image? That's not going to work easily (or at all) as the static image is fully linked and doesn't contain all necessary relocations anymore to move around its sections (which happens if you insert stuff into the existing ones). So, instead of amending the existing .text section in vmlinux (to contain the merged .text's from the modules) you would have to put this blob into an extra .text2 (for instance) behind all other current sections of vmlinux. You'd have to do that for all sections, essentially being back at attaching the blob simply to the end of vmlinux (with the change that you only have one set of ELF section headers, instead of two, but that's minor).
In my mind that module lump would not be linked with the static image, but would rather be sort of a mega-module that is loaded implicitly. It would "somehow" be tacked at the end of vmlinux as blob and then loaded with an only mildly extended module loader (basically everything as now, just that you have multiple this_module sections, and with the difference that the initcall order is split and changed)
Yeah, I'd like to be able to relink the static image. It is built with --emit-relocs, so I was hoping that would be enough. This isn't exactly my area of expertise. :) Is there a way to keep this information in the final vmlinux? I realize it will waste space, but it can be stripped out again when the vmlinuz is made, and the vmlinux image is gzipped when we ship it. My concern with making it a module lump is that we lose many of the advantages we gain by doing this. Yes, we could make it work so that each section is executed in the proper place, but we end up having to do all the back end work for modules anyway, which isn't free. The solution I'm targeting isn't a "make module loading faster" solution, it's a "make it faster than a static image" solution. If that's possible, it's win-win. We get the advantages of a static image and the advantages of a modular kernel.
I have a script that parses the existing section headers and generates a script with the proper alignment (I hope). Whether it actually works or not, I can't tell yet, since I run into this: ld: vmlinux(.text.head+0xc020000c): reloc against `boot_gdt_descr': error 2
Maybe that's a result of vmlinux already fully linked. Hmm... wasn't it once the case that during building the kernel initially only a merged .o file for everything was generated and that one then again got fully linked? vmlinux.o or something... you could use that.
I'm not sure what's causing that yet.
Seems to be an internal linker error. What's output of 'readelf -rsW vmlinux | grep boot_gdt_descr' (i.e. what type of reloc and symbol is this)?
c020000c 00a7fb01 R_386_32 c082e116 boot_gdt_descr
1317: c0572676 0 NOTYPE LOCAL DEFAULT 8 boot_gdt_descr
43003: c082e116 0 NOTYPE GLOBAL DEFAULT 38 boot_gdt_descr
-Jeff -- Jeff Mahoney SUSE Labs
Hi, On Thu, 25 Jun 2009, Jeff Mahoney wrote:
In my mind that module lump would not be linked with the static image, but would rather be sort of a mega-module that is loaded implicitly. It would "somehow" be tacked at the end of vmlinux as blob and then loaded with an only mildly extended module loader (basically everything as now, just that you have multiple this_module sections, and with the difference that the initcall order is split and changed)
Yeah, I'd like to be able to relink the static image. It is built with --emit-relocs, so I was hoping that would be enough.
It should be, but nobody is using that option in the hope to be able to link a fully linked executable again, hence there might be interesting bugs lurking. Basically what the linker needs to do is to undo the relocations first (should be possible for most relocs), do the section merging (the linking step) and then apply the relocs again. I wouldn't have expected it to work :-)
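The undo, merge, reapply sequence can be worked through for one absolute 32-bit relocation (R_386_32, computed as S + A) in Python. Only the old symbol value comes from the readelf output quoted in the thread; the addend and the new address are hypothetical, and a real linker has to handle many more relocation types than this.

```python
MASK32 = 0xFFFFFFFF


def undo_abs32(field, old_sym):
    """Recover the addend A from an applied absolute reloc (field = S + A)."""
    return (field - old_sym) & MASK32


def apply_abs32(addend, new_sym):
    """Apply the absolute reloc again for the symbol's new address."""
    return (new_sym + addend) & MASK32


old_sym = 0xC082E116   # symbol value as shown in the thread's readelf output
field = old_sym + 2    # hypothetical stored word: symbol plus an addend of 2
new_sym = 0xC0830000   # hypothetical address after the sections were merged

addend = undo_abs32(field, old_sym)
relinked = apply_abs32(addend, new_sym)
print(hex(addend), hex(relinked))
```

The step only works if the relocation record and the old symbol value are still available, which is what --emit-relocs is meant to preserve.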
This isn't exactly my area of expertise. :) Is there a way to keep this information in the final vmlinux?
The above option is the only way.
I realize it will waste space, but it can be stripped out again when the vmlinuz is made, and the vmlinux image is gzipped when we ship it.
My concern with making it a module lump is that we lose many of the advantages we gain by doing this.
I never bought this argument. I can see how the initcall ordering is an issue, I can see how first having to load an initrd in order to load further modules is an issue. I can't for the life of me figure out why relocation processing of the module loader should in any way be an issue. It's fast if implemented correctly, and I assume the kernel does implement it correctly. All the fancy user-space stuff regarding pre-linking in order to speed up pure app loading with hot caches helps mostly the case where there are many shared libraries to look up symbols in. But here we lookup only in vmlinux and the hypothetical lump.
Yes, we could make it work so that each section is executed in the proper place, but we end up having to do all the back end work for modules anyway, which isn't free.
The solution I'm targeting isn't a "make module loading faster" solution,
I thought the goal of the whole exercise is to have the interesting modules available already long before initrd and in addition that the init order be optimized by interleaving it with the init of vmlinux proper. The idea to do that was to somehow merge these interesting modules and vmlinux. There is more than one way to implement this merge, at least your current approach, and the module-lump-tacked-to-vmlinux approach.
it's a "make it faster than a static image" solution. If that's possible, it's win-win. We get the advantages of a static image and the advantages of a modular kernel.
Yes. I see my approach still as static image. In any case if you get your approach working, all the better.
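To make the "interleave the lump's init order with vmlinux proper" idea a bit more concrete, here is a small sketch. It assumes each initcall is tagged with a numeric level (lower runs first) and that, within a level, the vmlinux entries run before the lump's; the level numbers and all names are hypothetical and the real kernel mechanism (linker sections per initcall level) is not modelled.

```python
from itertools import chain


def interleave_initcalls(vmlinux_calls, lump_calls):
    """Merge two (level, name) lists; within a level, vmlinux entries run first."""
    tagged = chain(
        ((level, 0, name) for level, name in vmlinux_calls),
        ((level, 1, name) for level, name in lump_calls),
    )
    # sorted() is stable, so the original order inside each list is preserved.
    ordered = sorted(tagged, key=lambda t: (t[0], t[1]))
    return [name for _, _, name in ordered]


# Hypothetical initcalls: (level, name)
vmlinux_calls = [(1, "core_a"), (4, "subsys_a"), (6, "device_a")]
lump_calls = [(4, "lump_subsys"), (6, "lump_driver")]

order = interleave_initcalls(vmlinux_calls, lump_calls)
print(order)
```

Running everything from the lump only after all vmlinux initcalls would be the "split" that the meta-module approach needs to avoid; the merge above is one way to picture the alternative.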
Seems to be an internal linker error. What's output of 'readelf -rsW vmlinux | grep boot_gdt_descr' (i.e. what type of reloc and symbol is this)?
c020000c  00a7fb01 R_386_32  c082e116  boot_gdt_descr
  1317: c0572676     0 NOTYPE  LOCAL  DEFAULT    8 boot_gdt_descr
 43003: c082e116     0 NOTYPE  GLOBAL DEFAULT   38 boot_gdt_descr
That's the problem. Two definitions of the same symbol, once local once global and one relocation which then can't decide anymore which one to choose. Do this:
% sed -ie 's/boot_gdt_descr/myown_boot_gdt_descr/g' \
    arch/x86/kernel/trampoline_32.S
That renames the local version, which can't be referenced from other places anyway.
Ciao, Michael.
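The local-versus-global clash diagnosed above can be checked for mechanically. The following is a minimal sketch that scans symbol-table rows in the column layout shown in this thread (Num, Value, Size, Type, Bind, Vis, Ndx, Name) and reports names that occur with both LOCAL and GLOBAL binding. The first two sample rows are copied from the thread; the third row and the function name are hypothetical additions for the example.

```python
from collections import defaultdict

# Rows 1 and 2 are from this thread; row 3 is a made-up, unambiguous symbol.
SAMPLE = """\
  1317: c0572676     0 NOTYPE  LOCAL  DEFAULT    8 boot_gdt_descr
 43003: c082e116     0 NOTYPE  GLOBAL DEFAULT   38 boot_gdt_descr
 43004: c0100000     0 NOTYPE  GLOBAL DEFAULT    1 only_global_example
"""


def ambiguous_symbols(symtab_text):
    """Return sorted names that appear with both LOCAL and GLOBAL binding."""
    bindings = defaultdict(set)
    for line in symtab_text.splitlines():
        fields = line.split()
        # Symbol rows look like: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) == 8 and fields[0].endswith(":"):
            bindings[fields[7]].add(fields[4])
    return sorted(
        name for name, seen in bindings.items() if {"LOCAL", "GLOBAL"} <= seen
    )


print(ambiguous_symbols(SAMPLE))
```

In practice the input would be the symbol-table part of the readelf output rather than a hard-coded string; relocation rows have a different shape and are skipped by the field-count check.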
Michael Matz wrote:
Hi,
On Thu, 25 Jun 2009, Jeff Mahoney wrote:
The above option is the only way.
Hrm. Well using vmlinux.o means shipping another large file and needing the full linker script.
Yes. I see my approach still as static image. In any case if you get your approach working, all the better.
Ok, your solution seems doable. I'll probably end up taking that route if I can't get mine working properly.
Seems to be an internal linker error. What's output of 'readelf -rsW vmlinux | grep boot_gdt_descr' (i.e. what type of reloc and symbol is this)?
c020000c  00a7fb01 R_386_32  c082e116  boot_gdt_descr
  1317: c0572676     0 NOTYPE  LOCAL  DEFAULT    8 boot_gdt_descr
 43003: c082e116     0 NOTYPE  GLOBAL DEFAULT   38 boot_gdt_descr
That's the problem. Two definitions of the same symbol, once local once global and one relocation which then can't decide anymore which one to choose. Do this:
% sed -ie 's/boot_gdt_descr/myown_boot_gdt_descr/g' \
    arch/x86/kernel/trampoline_32.S
That renames the local version, which can't be referenced from other places anyway.
Unfortunately it didn't fix the problem for me. Now I get:
readelf -rsW vmlinux|grep boot_gdt_desc
c020000c  00a7fb01 R_386_32  c082e116  boot_gdt_descr
  1317: c0572676     0 NOTYPE  LOCAL  DEFAULT    8 t_boot_gdt_descr
 43003: c082e116     0 NOTYPE  GLOBAL DEFAULT   38 boot_gdt_descr
.. but still get the
ld: vmlinux(.text.head+0xc020000c): reloc against `boot_gdt_descr': error 2
-Jeff
--
Jeff Mahoney SUSE Labs
On Thu, Jun 25, 2009 at 10:30:17AM -0400, Jeff Mahoney wrote:
It seems that currently __mod_pci_device_table is not referred to via some pointer in some module descriptor, but rather explicitly by symbol name in the loader. That's not going to work, as when you lump multiple modules together you would have several of those. I would embed a pointer to that table into the module descriptor.
We don't need that either. The device table is pointed at by the driver struct, which is registered in the initcall. The __mod_pci_device_table, I think, is only used to generate the aliases for automatic loading.
Yes, you should be able to throw those away. You will get __mod_BUS_device_table entries all over the place, where BUS is any one of a variety of different types (pci, usb, pnp, etc.)
good luck,
greg k-h
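A minimal sketch of "throwing those away" when lumping modules together: given a list of symbol names, separate the per-bus __mod_BUS_device_table symbols from everything else. The pattern simply follows the naming described above; the function name and the sample symbol list are hypothetical.

```python
import re

# Matches __mod_<bus>_device_table for any bus name (pci, usb, pnp, ...).
DEVICE_TABLE = re.compile(r"^__mod_\w+_device_table$")


def split_device_tables(symbols):
    """Return (keep, drop): drop holds the __mod_BUS_device_table symbols."""
    keep, drop = [], []
    for name in symbols:
        (drop if DEVICE_TABLE.match(name) else keep).append(name)
    return keep, drop


# Hypothetical symbol names from a few modules lumped together.
symbols = [
    "__mod_pci_device_table",
    "example_probe",
    "__mod_usb_device_table",
    "__mod_pnp_device_table",
    "init_module",
]
keep, drop = split_device_tables(symbols)
print(keep)
print(drop)
```

Dropping them this way relies on the point made earlier in the thread: the driver struct registered in the initcall already points at the device table, and the __mod_ symbols are believed to serve only alias generation.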
Hi Michal,
On Wednesday 24 June 2009, Michal Marek wrote:
Jeff Mahoney wrote:
Did you figure out a way to discover when a module is built into the kernel instead of just unavailable?
I wrote a patch for kbuild to generate a modules.builtin file and install it below /lib/modules/; the plan is to make modprobe look into this file instead of complaining about missing modules. Unfortunately the kbuild maintainer hasn't responded yet, so I guess I'll add it to our tree nevertheless, as we really need it now.
Sounds like a really good idea for a variety of purposes, thanks for working on this!
--
Jean Delvare
Suse L3
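For illustration, here is roughly how a tool could consult such a modules.builtin file before complaining about a missing module. The thread does not specify the file format, so this sketch assumes one module path per line (for example kernel/fs/ext3/ext3.ko) and treats "-" and "_" in module names as equivalent; the helper names and the sample contents are hypothetical.

```python
import posixpath


def builtin_names(modules_builtin_text):
    """Collect module names from an assumed one-path-per-line listing."""
    names = set()
    for line in modules_builtin_text.splitlines():
        line = line.strip()
        if not line:
            continue
        base = posixpath.basename(line)
        if base.endswith(".ko"):
            base = base[: -len(".ko")]
        names.add(base.replace("-", "_"))
    return names


def is_builtin(module, names):
    """True if the module is compiled into the kernel rather than missing."""
    return module.replace("-", "_") in names


# Hypothetical file contents.
SAMPLE = "kernel/fs/ext3/ext3.ko\nkernel/drivers/ata/ata_piix.ko\n"
names = builtin_names(SAMPLE)
print(is_builtin("ext3", names), is_builtin("nfs", names))
```

A caller would read the text from the file installed below /lib/modules/ for the running kernel version and fall back to the usual "module not found" error only when is_builtin() is false.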
[Sorry for the double post to CC-list members, I originally sent from @suse.de]
I've abandoned the idea of linking against the static vmlinux. Michael is right, the information we need just isn't there. The good news is that on x86, I think we can keep vmlinux relocatable and then just link it statically to boot it as vmlinuz.
I've altered the build process to leave a vmlinux.reloc around before doing the final link. That's enough to leave the information we need to link the modules later.
One thing we haven't discussed yet is the supportability of this. If we start relinking the kernel on installed systems, -debuginfo kernel packages are useless.
-Jeff
--
Jeff Mahoney SUSE Labs
On Wed, Jul 08, 2009 at 10:26:27AM -0400, Jeff Mahoney wrote:
I've abandoned the idea of linking against the static vmlinux. Michael is right, the information we need just isn't there. The good news is that on x86, I think we can keep vmlinux relocatable and then just link it statically to boot it as vmlinuz.
I've altered the build process to leave a vmlinux.reloc around before doing the final link. That's enough to leave the information we need to link the modules later.
One thing we haven't discussed yet is the supportability of this. If we start relinking the kernel on installed systems, -debuginfo kernel packages are useless.
Would that imply that tools like kdb and systemtap would not work with this kind of implementation as they need that debug package information?
thanks,
greg k-h
Greg KH wrote:
On Wed, Jul 08, 2009 at 10:26:27AM -0400, Jeff Mahoney wrote:
One thing we haven't discussed yet is the supportability of this. If we start relinking the kernel on installed systems, -debuginfo kernel packages are useless.
Would that imply that tools like kdb and systemtap would not work with this kind of implementation as they need that debug package information?
I need to do some research on what would be involved to regenerate the debuginfo as well.
Michael - Is it possible to pull the debuginfo back in during the link for re-export after the link? Is there a better way to do this?
-Jeff
--
Jeff Mahoney SUSE Labs
Hi, On Wed, 8 Jul 2009, Jeff Mahoney wrote:
I've abandoned the idea of linking against the static vmlinux. Michael is right, the information we need just isn't there. The good news is that on x86, I think we can keep vmlinux relocatable and then just link it statically to boot it as vmlinuz.
I guess this means an additional file in /boot/ ? Or perhaps in /lib/modules/$VERSION/ ?
I need to do some research on what would be involved to regenerate the debuginfo as well.
Hmm. Debuginfo refers to entities by address/offset. So if addresses change due to relinking (e.g. by enlarging .text with the code from modules) the debuginfo becomes incorrect. We have no way to correct this anymore, except if vmlinux.reloc would contain unrelocated debuginfo (a hypothetical DWARF rewriter handling this doesn't exist and would be very complicated to create).
It's not easy to separate unrelocated debug info from unrelocated .o files, they contain relocations against file sections which depend on those sections being in the same file (e.g. against .text). So there would have to be two vmlinux.reloc, one without debug sections and one with debug-info _and_ the code/data.
In relinking you would choose the latter if installed and separate debuginfo afterwards (or not, doesn't matter).
This all seems quite unappealing :-)
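The address problem described above can be shown with a toy model. The sketch below keeps a "debuginfo" map of function name to address range as recorded at build time, then simulates a relink that inserts module code into .text and moves a later function. All names, addresses and sizes are hypothetical; real DWARF is far richer than a name-to-range table.

```python
def lookup(debug_map, addr):
    """Return the function whose [start, end) range covers addr, if any."""
    for name, (start, end) in debug_map.items():
        if start <= addr < end:
            return name
    return None


# Hypothetical layout recorded in the -debuginfo package at build time.
debuginfo = {"func_a": (0x1000, 0x1100), "func_b": (0x1100, 0x1200)}

# Hypothetical relink: 0x80 bytes of module code land in front of func_b,
# so func_b moves while the shipped debuginfo stays as it was.
shift = 0x80
running = {
    "func_a": (0x1000, 0x1100),
    "func_b": (0x1100 + shift, 0x1200 + shift),
}

pc_in_module_code = 0x1110                  # now belongs to the inserted code
pc_in_func_b = running["func_b"][0] + 0x90  # near the end of the moved func_b

print(lookup(debuginfo, pc_in_module_code), lookup(running, pc_in_module_code))
print(lookup(debuginfo, pc_in_func_b), lookup(running, pc_in_func_b))
```

The stale map attributes an address inside the inserted code to func_b and fails to resolve an address that really is in func_b, which is the kind of mismatch that would mislead any tool relying on the shipped debuginfo.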
Michael - Is it possible to pull the debuginfo back in during the link for re-export after the link? Is there a better way to do this?
Right now I can't think of any. The situation would be different if the relinking above would _not_ change the layout of the pre-existing vmlinux sections, but it always will if starting from unrelocated files.
Of course, with just a little bit of code in the kernel (I believe :) ) it would be possible to load a meta-module tacked somewhere into the vmlinux file, and all the relinking problems vanish.
Ciao, Michael.
Michael Matz wrote:
Hi,
On Wed, 8 Jul 2009, Jeff Mahoney wrote:
I've abandoned the idea of linking against the static vmlinux. Michael is right, the information we need just isn't there. The good news is that on x86, I think we can keep vmlinux relocatable and then just link it statically to boot it as vmlinuz.
I guess this means an additional file in /boot/ ? Or perhaps in /lib/modules/$VERSION/ ?
Yes, but we already had another file anyway, so that's not much of a difference.
I need to do some research on what would be involved to regenerate the debuginfo as well.
Hmm. Debuginfo refers to entities by address/offset. So if addresses change due to relinking (e.g. by enlarging .text with the code from modules) the debuginfo becomes incorrect. We have no way to correct this anymore, except if vmlinux.reloc would contain unrelocated debuginfo (a hypothetical DWARF rewriter handling this doesn't exist and would be very complicated to create).
It's not easy to separate unrelocated debug info from unrelocated .o files, they contain relocations against file sections which depend on those sections being in the same file (e.g. against .text). So there would have to be two vmlinux.reloc, one without debug sections and one with debug-info _and_ the code/data.
In relinking you would choose the latter if installed and separate debuginfo afterwards (or not, doesn't matter).
This all seems quite unappealing :-)
Indeed. Well, it was worth a try. I was hoping to do as much of the processing as possible during installation rather than during the boot.
Michael - Is it possible to pull the debuginfo back in during the link for re-export after the link? Is there a better way to do this?
Right now I can't think of any. The situation would be different if the relinking above would _not_ change the layout of the pre-existing vmlinux sections, but it always will if starting from unrelocated files.
Of course, with just a little bit of code in the kernel (I believe :) ) it would be possible to load a meta-module tacked somewhere into the vmlinux file, and all the relinking problems vanish.
I guess this is where the next step leads. You told me so. ;)
-Jeff
--
Jeff Mahoney SUSE Labs
participants (10)
- Andreas Gruenbacher
- Greg KH
- Hannes Reinecke
- Jean Delvare
- Jeff Mahoney
- Michael Matz
- Michal Marek
- Takashi Iwai
- Tejun Heo
- Thomas Renninger