https://bugzilla.novell.com/show_bug.cgi?id=642078
https://bugzilla.novell.com/show_bug.cgi?id=642078#c1
--- Comment #1 from Brian Gilbert
It seems there are two bdrvs that reference and write to the same qcow2 file, but the code in block-qcow2.c doesn't properly account for that....
[KEVIN] Right, having the same image file opened twice is asking for trouble. The real problem is how Xen handles PV on HVM: you start the VM using the IDE emulation, and at some point during the boot process the OS starts to refer to the same disk using the PV drivers. So you basically have to have it opened twice, unless you manage to have one BlockDriverState that is used for both virtual devices. I'm not sure, though, why qemu opens the same file twice. I think some patches went in that made it use qemu instead of an external tapdisk backend to actually access the image. I don't have that code around (I'm working on KVM nowadays), but you could check whether the code that receives the file open request from blktapctrl requires this blktap0 drive. If so, it should probably be rewritten to associate hda with the blktap interface. That doesn't consider hot-unplug yet, though... To sum it up, the idea of accessing the same image file through two different devices is just insane for anything but raw images. [BRIAN]
So do you have any thoughts on how to best handle this?
Option 1 - Remove your patch. I took out just the #ifndef'd block that adds the new bdrv to drives_table, and that seems to get past my problem, but I assume it will bring back the problem you were solving, i.e. make it more likely that qemu gets confused.
[KEVIN] Before the patch, the file was actually opened twice, but the blktap0 drive wasn't properly registered with qemu. The result of this is that the savevm handler didn't see the second instance of the drive when iterating over all drives to snapshot. I hate myself for the bad commit message now, because I can't tell what actually was broken, but not registering the drive properly is probably a bad idea in more than one place. And even without that patch, you still open the image file twice, which is not helpful in any case. [BRIAN]
2 - In savevm.c, save_disk_snapshots(), I added code that checks whether the current bs->filename was already saved during this call, and if so skips it; that seems to work too. I worry that there are probably some other places where we need to apply similar logic.
3 - Start with a .qcow2 file that already has at least one snapshot, which presumably forces refcounts >= 2 and seems to prevent the caching that causes this problem. Needing no code change is good, but snapshotting twice seems wasteful and still potentially dangerous.
4 - Either replace the hda drives_table entry with the blktap0 entry, or somehow explicitly associate/link the two entries, and then update the code that uses drives_table to recognize and deal with this special case. Not tested, just an idea, but it seems overly complicated.
5 - Have the blktap0 bdrv use the same BlockDriverState as the hda bdrv? Not tested, just an idea. Seems easy enough but could have problems....
[KEVIN] Yes, that's the only one that I think really solves the problem. To clarify this, what I meant here is not having two DriveInfos referring to the same BlockDriverState but rather eliminate the blktap0 DriveInfo and access the hda one where currently the blktap0 one is used. [BRIAN]
6 - Update block-qcow2 to never cache, or to validate caches somehow before using them. Seems like a big task.
[KEVIN] And it is certainly going to hurt performance to the point where it's not usable any more (though I think Xen uses an outdated copy of the qcow2 code anyway, so it's slow already even without doing such things). [BRIAN]
BTW - I am open to using the vhd disk format instead (or something else if it does snapshots), and have started some tests, but I haven't been able to complete an xpsp3 install into a new vhd on OpenSuSE 11.3 yet...
[KEVIN] Unless they have changed their model of doing PV on HVM, I think it's going to have the same problems.