From Dom0 there is no way to guess the number and, or the complete
Hello community,
here is the log from the commit of package xen.1954 for openSUSE:12.3:Update checked in at 2013-09-04 14:12:49
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:12.3:Update/xen.1954 (Old)
and /work/SRC/openSUSE:12.3:Update/.xen.1954.new (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "xen.1954"
Changes:
--------
New Changes file:
--- /dev/null 2013-07-23 23:44:04.804033756 +0200
+++ /work/SRC/openSUSE:12.3:Update/.xen.1954.new/xen.changes 2013-09-04 14:12:57.000000000 +0200
@@ -0,0 +1,7313 @@
+-------------------------------------------------------------------
+Thu Aug 8 08:26:07 MDT 2013 - carnold@suse.com
+
+- bnc#824676 - L3: Failed to setup devices for vm instance when
+ start multiple vms simultaneously
+ blktap-close-fifos.patch
+
+-------------------------------------------------------------------
+Wed Aug 7 09:11:21 MDT 2013 - carnold@suse.com
+
+- bnc#817799 - sles9sp4 guest fails to start after upgrading to
+ sles11 sp3
+ 27230-Revert-hvmloader-always-include-HPET-table.patch
+- Upstream patches from Jan
+ 27197-AMD-intremap-Prevent-use-of-per-device-vector-maps.patch
+ 27204-libxl-suppress-device-assignment-to-HVM-guest-without-IOMMU.patch
+ 27212-x86-don-t-pass-negative-time-to-gtime_to_gtsc-try-2.patch
+ 27213-iommu-amd-Fix-logic-for-clearing-the-IOMMU-interrupt-bits.patch
+ 27214-iommu-amd-Workaround-for-erratum-787.patch
+ 27221-x86-mm-Ensure-useful-progress-in-alloc_l2_table.patch
+ 27231-adjust-x86-EFI-build.patch
+ 27248-x86-cpuidle-Change-logging-for-unknown-APIC-IDs.patch
+ 27263-x86-time-Update-wallclock-in-shared-info-when-altering-domain-time-offset.patch
+ 27321-x86-refine-FPU-selector-handling-code-for-XSAVEOPT.patch
+
+-------------------------------------------------------------------
+Wed Jun 27 10:32:43 MDT 2013 - carnold@suse.com
+
+- bnc#826882 - VUL-0: xen: CVE-2013-1432: XSA-58: Page reference
+ counting error due to XSA-45/CVE-2013-1918 fixes
+ 27181-x86-fix-page-refcount-handling-in-page-table-pin-error-path.patch
+
+-------------------------------------------------------------------
+Wed Jun 26 12:22:43 MDT 2013 - jfehlig@suse.com
+
+- Add upstream patch to fix devid assignment in libxl
+ 27184-libxl-devid-fix.patch
+
+-------------------------------------------------------------------
+Thu Jun 6 09:46:41 MDT 2013 - carnold@suse.com
+
+- bnc#823608 - VUL-0: xen: XSA-57: libxl allows guest write access to
+ sensitive console related xenstore keys
+ 27178-libxl-Restrict-permissions-on-PV-console-device-xenstore-nodes.patch
+- bnc#823011 - VUL-0: xen: XSA-55: Multiple vulnerabilities in
+ libelf PV kernel handling
+ 27141-libelf-abolish-libelf-relocate.c.patch
+ 27142-libxc-introduce-xc_dom_seg_to_ptr_pages.patch
+ 27143-libxc-Fix-range-checking-in-xc_dom_pfn_to_ptr-etc.patch
+ 27144-libelf-add-struct-elf_binary-parameter-to-elf_load_i.patch
+ 27145-libelf-abolish-elf_sval-and-elf_access_signed.patch
+ 27146-libelf-move-include-of-asm-guest_access.h-to-top-of-.patch
+ 27147-libelf-xc_dom_load_elf_symtab-Do-not-use-syms-uninit.patch
+ 27148-libelf-introduce-macros-for-memory-access-and-pointe.patch
+ 27149-tools-xcutils-readnotes-adjust-print_l1_mfn_valid_no.patch
+ 27150-libelf-check-nul-terminated-strings-properly.patch
+ 27151-libelf-check-all-pointer-accesses.patch
+ 27152-libelf-Check-pointer-references-in-elf_is_elfbinary.patch
+ 27153-libelf-Make-all-callers-call-elf_check_broken.patch
+ 27154-libelf-use-C99-bool-for-booleans.patch
+ 27155-libelf-use-only-unsigned-integers.patch
+ 27156-libelf-check-loops-for-running-away.patch
+ 27157-libelf-abolish-obsolete-macros.patch
+ 27158-libxc-Add-range-checking-to-xc_dom_binloader.patch
+ 27159-libxc-check-failure-of-xc_dom_-_to_ptr-xc_map_foreig.patch
+ 27160-libxc-check-return-values-from-malloc.patch
+ 27161-libxc-range-checks-in-xc_dom_p2m_host-and-_guest.patch
+ 27162-libxc-check-blob-size-before-proceeding-in-xc_dom_ch.patch
+ 27163-libxc-Better-range-check-in-xc_dom_alloc_segment.patch
+- bnc#808269 - Fully Virtualized Windows VM install is failed on
+ Ivy Bridge platforms with Xen kernel
+ 26878-VMX-Detect-posted-interrupt-capability.patch
+ 26879-VMX-Turn-on-posted-interrupt-bit-in-vmcs.patch
+ 26880-VMX-Add-posted-interrupt-supporting.patch
+ 26881-x86-HVM-Call-vlapic_set_irq-to-delivery-virtual-interrupt.patch
+ 26882-VMX-Use-posted-interrupt-to-deliver-virtual-interrupt.patch
+ x86-VMX-Viridian-vINTR.patch
+- Upstream patches from Jan
+ 27093-libxc-limit-cpu-values-when-setting-vcpu-affinity.patch (Replaces CVE-2013-2072-xsa56.patch)
+ 27112-x86-fix-ordering-of-operations-in-destroy_irq.patch
+ 27117-x86-MCE-disable-if-MCE-banks-are-not-present.patch
+ 27118-x86-xsave-fix-information-leak-on-AMD-CPUs.patch (Replaces CVE-2013-2076-xsa52.patch)
+ 27119-x86-xsave-recover-from-faults-on-XRSTOR.patch (Replaces CVE-2013-2077-xsa53.patch)
+ 27120-x86-xsave-properly-check-guest-input-to-XSETBV.patch (Replaces CVE-2013-2078-xsa54.patch)
+ 27121-x86-preserve-FPU-selectors-for-32-bit-guest-code.patch
+ 27122-x86-fix-XCR0-handling.patch
+ 27123-x86-vtsc-update-vcpu_time-in-hvm_set_guest_time.patch
+- Dropped 27083-AMD-iommu-SR56x0-Erratum-64-Reset-all-head-tail-pointers.patch
+
+-------------------------------------------------------------------
+Fri May 31 09:40:59 MDT 2013 - carnold@suse.com
+
+- bnc#801663 - performance of mirror lvm unsuitable for production
+ block-dmmd
+
+-------------------------------------------------------------------
+Fri May 24 08:49:11 MDT 2013 - carnold@suse.com
+
+- bnc#817904 - [SLES11SP3 BCS Bug] Crashkernel fails to boot after
+ panic on XEN kernel SP3 Beta 4 and RC1
+ 27085-x86-fix-boot-time-APIC-mode-detection.patch
+- Upstream AMD Erratum patch from Jan
+ 27083-AMD-iommu-SR56x0-Erratum-64-Reset-all-head-tail-pointers.patch
+
+-------------------------------------------------------------------
+Tue May 21 14:30:50 MDT 2013 - carnold@suse.com
+
+- bnc#813675 - - VUL-0: xen: CVE-2013-1919: XSA-46: Several access
+ permission issues with IRQs for unprivileged guests
+ 27079-fix-XSA-46-regression-with-xend-xm.patch
+
+-------------------------------------------------------------------
+Fri May 17 14:44:25 MDT 2013 - carnold@suse.com
+
+- bnc#820917 - VUL-1: CVE-2013-2076: xen: Information leak on
+ XSAVE/XRSTOR capable AMD CPUs (XSA-52)
+ CVE-2013-2076-xsa52.patch
+- bnc#820919 - VUL-0: CVE-2013-2077: xen: Hypervisor crash due to
+ missing exception recovery on XRSTOR (XSA-53)
+ CVE-2013-2077-xsa53.patch
+- bnc#820920 - VUL-1: CVE-2013-2078: xen: Hypervisor crash due to
+ missing exception recovery on XSETBV (XSA-54)
+ CVE-2013-2078-xsa54.patch
+- bnc#808085 - aacraid driver panics mapping INT A when booting
+ kernel-xen
+ 26953-x86-allow-Dom0-read-only-access-to-IO-APICs.patch
+- bnc#817210 - openSUSE 12.3 Domain 0 doesn't boot with i915
+ graphics controller under Xen with VT-d enabled
+ 27071-x86-IO-APIC-fix-guest-RTE-write-corner-cases.patch
+- Upstream patches from Jan
+ 26860-x86-allow-VCPUOP_register_vcpu_info-to-work-again-on-PVHVM-guests.patch
+ 26910-ns16550-delay-resume-until-dom0-ACPI-has-a-chance-to-run.patch
+ 26946-x86-make-vcpu_destroy_pagetables-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26947-x86-make-new_guest_cr3-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26948-x86-make-MMUEXT_NEW_USER_BASEPTR-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26949-x86-make-vcpu_reset-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26950-x86-make-arch_set_info_guest-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26951-x86-make-page-table-unpinning-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26952-x86-make-page-table-handling-error-paths-preemptible.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26956-x86-mm-preemptible-cleanup.patch (Replaces CVE-2013-1918-xsa45*.patch)
+ 26958-VT-d-don-t-permit-SVT_NO_VERIFY-entries-for-known-device-types.patch (Replaces CVE-2013-1952-xsa49.patch)
+ 26966-x86-handle-paged-gfn-in-wrmsr_hypervisor_regs.patch
+ 27072-x86-shadow-fix-off-by-one-in-MMIO-permission-check.patch
+ 27074-x86-hvm-avoid-p2m-lookups-for-vlapic-accesses.patch
+
+-------------------------------------------------------------------
+Tue May 14 08:32:55 MDT 2013 - carnold@suse.com
+
+- bnc#819416 - VUL-0: xen: CVE-2013-2072: XSA-56: Buffer overflow
+ in xencontrol Python bindings affecting xend
+ CVE-2013-2072-xsa56.patch
+
+-------------------------------------------------------------------
+Tue May 7 11:46:29 MDT 2013 - carnold@suse.com
+
+- bnc#818183 - VUL-0: xen: CVE-2013-2007: XSA-51: qga set umask
+ 0077 when daemonizing
+ CVE-2013-2007-xsa51-1.patch
+ CVE-2013-2007-xsa51-2.patch
+
+-------------------------------------------------------------------
+Mon May 6 15:52:03 CEST 2013 - ohering@suse.de
+
+- add lndir to BuildRequires
+
+-------------------------------------------------------------------
+Mon May 6 11:45:03 CEST 2013 - ohering@suse.de
+
+- remove xen.migrate.tools_notify_restore_to_hangup_during_migration_--abort_if_busy.patch
+ It changed migration protocol and upstream wants a different solution
+
+-------------------------------------------------------------------
+Sun May 5 16:20:30 CEST 2013 - ohering@suse.de
+
+- bnc#802221 - fix xenpaging
+ readd xenpaging.qemu.flush-cache.patch
+
+-------------------------------------------------------------------
+Thu May 2 09:11:33 MDT 2013 - carnold@suse.com
+
+- bnc#808269 - Fully Virtualized Windows VM install is failed on
+ Ivy Bridge platforms with Xen kernel
+ 26754-hvm-Improve-APIC-INIT-SIPI-emulation.patch
+
+-------------------------------------------------------------------
+Tue Apr 30 09:15:26 MDT 2013 - carnold@suse.com
+
+- Upstream patches from Jan
+ 26891-x86-S3-Fix-cpu-pool-scheduling-after-suspend-resume.patch
+ 26930-x86-EFI-fix-runtime-call-status-for-compat-mode-Dom0.patch
+- Additional fix for bnc#816159
+ CVE-2013-1918-xsa45-followup.patch
+
+-------------------------------------------------------------------
+Mon Apr 29 15:40:35 MDT 2013 - cyliu@suse.com
+
+- bnc#817068 - Xen guest with >1 sr-iov vf won't start
++++ 7116 more lines (skipped)
++++ between /dev/null
++++ and /work/SRC/openSUSE:12.3:Update/.xen.1954.new/xen.changes
New:
----
25861-x86-early-fixmap.patch
25862-sercon-non-com.patch
25863-sercon-ehci-dbgp.patch
25864-sercon-unused.patch
25866-sercon-ns16550-pci-irq.patch
25867-sercon-ns16550-parse.patch
25874-x86-EFI-chain-cfg.patch
25909-xenpm-consistent.patch
25920-x86-APICV-enable.patch
25921-x86-APICV-delivery.patch
25922-x86-APICV-x2APIC.patch
25952-x86-MMIO-remap-permissions.patch
25957-x86-TSC-adjust-HVM.patch
25958-x86-TSC-adjust-sr.patch
25959-x86-TSC-adjust-expose.patch
25975-x86-IvyBridge.patch
26077-stubdom_fix_compile_errors_in_grub.patch
26078-hotplug-Linux_remove_hotplug_support_rely_on_udev_instead.patch
26079-hotplug-Linux_close_lockfd_after_lock_attempt.patch
26081-stubdom_fix_rpmlint_warning_spurious-executable-perm.patch
26082-blktap2-libvhd_fix_rpmlint_warning_spurious-executable-perm.patch
26083-blktap_fix_rpmlint_warning_spurious-executable-perm.patch
26084-hotplug_install_hotplugpath.sh_as_data_file.patch
26085-stubdom_install_stubdompath.sh_as_data_file.patch
26086-hotplug-Linux_correct_sysconfig_tag_in_xendomains.patch
26087-hotplug-Linux_install_sysconfig_files_as_data_files.patch
26114-pygrub-list-entries.patch
26129-ACPI-BGRT-invalidate.patch
26133-IOMMU-defer-BM-disable.patch
26189-xenstore-chmod.patch
26262-x86-EFI-secure-shim.patch
26324-IOMMU-assign-params.patch
26325-IOMMU-add-remove-params.patch
26326-VT-d-context-map-params.patch
26327-AMD-IOMMU-flush-params.patch
26328-IOMMU-pdev-type.patch
26329-IOMMU-phantom-dev.patch
26330-VT-d-phantom-MSI.patch
26331-IOMMU-phantom-dev-quirk.patch
26341-hvm-firmware-passthrough.patch
26342-hvm-firmware-passthrough.patch
26343-hvm-firmware-passthrough.patch
26344-hvm-firmware-passthrough.patch
26369-libxl-devid.patch
26370-libxc-x86-initial-mapping-fit.patch
26372-tools-paths.patch
26404-x86-forward-both-NMI-kinds.patch
26418-x86-trampoline-consider-multiboot.patch
26532-AMD-IOMMU-phantom-MSI.patch
26547-tools-xc_fix_logic_error_in_stdiostream_progress.patch
26548-tools-xc_handle_tty_output_differently_in_stdiostream_progress.patch
26549-tools-xc_turn_XCFLAGS__into_shifts.patch
26550-tools-xc_restore_logging_in_xc_save.patch
26551-tools-xc_log_pid_in_xc_save-xc_restore_output.patch
26554-hvm-firmware-passthrough.patch
26555-hvm-firmware-passthrough.patch
26556-hvm-firmware-passthrough.patch
26576-x86-APICV-migration.patch
26577-x86-APICV-x2APIC.patch
26675-tools-xentoollog_update_tty_detection_in_stdiostream_progress.patch
26754-hvm-Improve-APIC-INIT-SIPI-emulation.patch
26860-x86-allow-VCPUOP_register_vcpu_info-to-work-again-on-PVHVM-guests.patch
26878-VMX-Detect-posted-interrupt-capability.patch
26879-VMX-Turn-on-posted-interrupt-bit-in-vmcs.patch
26880-VMX-Add-posted-interrupt-supporting.patch
26881-x86-HVM-Call-vlapic_set_irq-to-delivery-virtual-interrupt.patch
26882-VMX-Use-posted-interrupt-to-deliver-virtual-interrupt.patch
26891-x86-S3-Fix-cpu-pool-scheduling-after-suspend-resume.patch
26902-x86-EFI-pass-boot-services-variable-info-to-runtime-code.patch
26910-ns16550-delay-resume-until-dom0-ACPI-has-a-chance-to-run.patch
26930-x86-EFI-fix-runtime-call-status-for-compat-mode-Dom0.patch
26946-x86-make-vcpu_destroy_pagetables-preemptible.patch
26947-x86-make-new_guest_cr3-preemptible.patch
26948-x86-make-MMUEXT_NEW_USER_BASEPTR-preemptible.patch
26949-x86-make-vcpu_reset-preemptible.patch
26950-x86-make-arch_set_info_guest-preemptible.patch
26951-x86-make-page-table-unpinning-preemptible.patch
26952-x86-make-page-table-handling-error-paths-preemptible.patch
26953-x86-allow-Dom0-read-only-access-to-IO-APICs.patch
26956-x86-mm-preemptible-cleanup.patch
26958-VT-d-don-t-permit-SVT_NO_VERIFY-entries-for-known-device-types.patch
26966-x86-handle-paged-gfn-in-wrmsr_hypervisor_regs.patch
27071-x86-IO-APIC-fix-guest-RTE-write-corner-cases.patch
27072-x86-shadow-fix-off-by-one-in-MMIO-permission-check.patch
27074-x86-hvm-avoid-p2m-lookups-for-vlapic-accesses.patch
27079-fix-XSA-46-regression-with-xend-xm.patch
27085-x86-fix-boot-time-APIC-mode-detection.patch
27093-libxc-limit-cpu-values-when-setting-vcpu-affinity.patch
27112-x86-fix-ordering-of-operations-in-destroy_irq.patch
27117-x86-MCE-disable-if-MCE-banks-are-not-present.patch
27118-x86-xsave-fix-information-leak-on-AMD-CPUs.patch
27119-x86-xsave-recover-from-faults-on-XRSTOR.patch
27120-x86-xsave-properly-check-guest-input-to-XSETBV.patch
27121-x86-preserve-FPU-selectors-for-32-bit-guest-code.patch
27122-x86-fix-XCR0-handling.patch
27123-x86-vtsc-update-vcpu_time-in-hvm_set_guest_time.patch
27125-Revert-irq-Add-extra-debugging.patch
27127-x86-HVM-fix-x2APIC-APIC_ID-read-emulation.patch
27128-x86-HVM-fix-initialization-of-wallclock-time-for-PVHVM-on-migration.patch
27141-libelf-abolish-libelf-relocate.c.patch
27142-libxc-introduce-xc_dom_seg_to_ptr_pages.patch
27143-libxc-Fix-range-checking-in-xc_dom_pfn_to_ptr-etc.patch
27144-libelf-add-struct-elf_binary-parameter-to-elf_load_i.patch
27145-libelf-abolish-elf_sval-and-elf_access_signed.patch
27146-libelf-move-include-of-asm-guest_access.h-to-top-of-.patch
27147-libelf-xc_dom_load_elf_symtab-Do-not-use-syms-uninit.patch
27148-libelf-introduce-macros-for-memory-access-and-pointe.patch
27149-tools-xcutils-readnotes-adjust-print_l1_mfn_valid_no.patch
27150-libelf-check-nul-terminated-strings-properly.patch
27151-libelf-check-all-pointer-accesses.patch
27152-libelf-Check-pointer-references-in-elf_is_elfbinary.patch
27153-libelf-Make-all-callers-call-elf_check_broken.patch
27154-libelf-use-C99-bool-for-booleans.patch
27155-libelf-use-only-unsigned-integers.patch
27156-libelf-check-loops-for-running-away.patch
27157-libelf-abolish-obsolete-macros.patch
27158-libxc-Add-range-checking-to-xc_dom_binloader.patch
27159-libxc-check-failure-of-xc_dom_-_to_ptr-xc_map_foreig.patch
27160-libxc-check-return-values-from-malloc.patch
27161-libxc-range-checks-in-xc_dom_p2m_host-and-_guest.patch
27162-libxc-check-blob-size-before-proceeding-in-xc_dom_ch.patch
27163-libxc-Better-range-check-in-xc_dom_alloc_segment.patch
27168-x86-hvm-fix-HVMOP_inject_trap-return-value-on-success.patch
27178-libxl-Restrict-permissions-on-PV-console-device-xenstore-nodes.patch
27179-VMX-Viridian-suppress-MSR-based-APIC-suggestion-when-having-APIC-V.patch
27181-x86-fix-page-refcount-handling-in-page-table-pin-error-path.patch
27184-libxl-devid-fix.patch
27197-AMD-intremap-Prevent-use-of-per-device-vector-maps.patch
27204-libxl-suppress-device-assignment-to-HVM-guest-without-IOMMU.patch
27212-x86-don-t-pass-negative-time-to-gtime_to_gtsc-try-2.patch
27213-iommu-amd-Fix-logic-for-clearing-the-IOMMU-interrupt-bits.patch
27214-iommu-amd-Workaround-for-erratum-787.patch
27221-x86-mm-Ensure-useful-progress-in-alloc_l2_table.patch
27230-Revert-hvmloader-always-include-HPET-table.patch
27231-adjust-x86-EFI-build.patch
27248-x86-cpuidle-Change-logging-for-unknown-APIC-IDs.patch
27263-x86-time-Update-wallclock-in-shared-info-when-altering-domain-time-offset.patch
27321-x86-refine-FPU-selector-handling-code-for-XSAVEOPT.patch
32on64-extra-mem.patch
CVE-2013-1922-xsa48.patch
CVE-2013-2007-xsa51-1.patch
CVE-2013-2007-xsa51-2.patch
README.SuSE
VNC-Support-for-ExtendedKeyEvent-client-message.patch
altgr_2.patch
baselibs.conf
bdrv_default_rwflag.patch
bdrv_open2_fix_flags.patch
bdrv_open2_flags_2.patch
blktap-close-fifos.patch
blktap-disable-debug-printf.patch
blktap-pv-cdrom.patch
blktap.patch
blktapctrl-default-to-ioemu.patch
block-dmmd
block-iscsi
block-nbd
block-npiv
block-npiv-common.sh
block-npiv-vport
boot.local.xenU
boot.xen
bridge-bonding.diff
bridge-opensuse.patch
bridge-record-creation.patch
bridge-vlan.diff
build-tapdisk-ioemu.patch
capslock_enable.patch
cdrom-removable.patch
change-vnc-passwd.patch
change_home_server.patch
check_device_status.patch
checkpoint-rename.patch
del_usb_xend_entry.patch
disable_emulated_device.diff
domUloader.py
domu-usb-controller.patch
etc_pam.d_xen-api
hibernate.patch
hv_extid_compatibility.patch
init.pciback
init.xen_loop
init.xend
init.xendomains
ioemu-7615-qcow2-fix-alloc_cluster_link_l2.patch
ioemu-bdrv-open-CACHE_WB.patch
ioemu-blktap-barriers.patch
ioemu-blktap-fv-init.patch
ioemu-blktap-image-format.patch
ioemu-blktap-zero-size.patch
ioemu-debuginfo.patch
ioemu-disable-emulated-ide-if-pv.patch
ioemu-disable-scsi.patch
ioemu-vnc-resize.patch
ioemu-watchdog-ib700-timer.patch
ioemu-watchdog-linkage.patch
ioemu-watchdog-support.patch
ipxe-enable-nics.patch
ipxe.tar.bz2
kernel-boot-hvm.patch
kmp_filelist
libxen_permissive.patch
log-guest-console.patch
logrotate.conf
magic_ioport_compat.patch
minios-fixups.patch
multi-xvdp.patch
network-nat-open-SuSEfirewall2-FORWARD.patch
pvdrv-import-shared-info.patch
pvdrv_emulation_control.patch
pygrub-netware-xnloader.patch
qemu-dm-segfault.patch
qemu-security-etch1.diff
qemu-xen-dir-remote.tar.bz2
qemu-xen-traditional-dir-remote.tar.bz2
seabios-dir-remote.tar.bz2
serial-split.patch
stdvga-cache.patch
stubdom.tar.bz2
supported_module.diff
suspend_evtchn_lock.patch
sysconfig.pciback
tapdisk-ioemu-logfile.patch
tapdisk-ioemu-shutdown-fix.patch
tmp-initscript-modprobe.patch
tmp_build.patch
tools-watchdog-support.patch
udev-rules.patch
usb-list.patch
vif-bridge-no-iptables.patch
vif-bridge-tap-fix.patch
vif-route-ifup.patch
x86-VMX-Viridian-vINTR.patch
x86-cpufreq-report.patch
x86-dom-print.patch
x86-extra-trap-info.patch
x86-ioapic-ack-default.patch
xen-4.2.2-testing-src.tar.bz2
xen-api-auth.patch
xen-changeset.diff
xen-cpupool-xl-config-format.patch
xen-destdir.diff
xen-disable-qemu-monitor.diff
xen-domUloader.diff
xen-fixme-doc.diff
xen-glibc217.patch
xen-hvm-default-bridge.diff
xen-hvm-default-pae.diff
xen-ioemu-hvm-pv-support.diff
xen-managed-pci-device.patch
xen-max-free-mem.diff
xen-migration-bridge-check.patch
xen-minimum-restart-time.patch
xen-no-dummy-nfs-ip.diff
xen-paths.diff
xen-qemu-iscsi-fix.patch
xen-updown.sh
xen-utils-0.1.tar.bz2
xen-xm-top-needs-root.diff
xen-xmexample-vti.diff
xen-xmexample.diff
xen.changes
xen.migrate.tools-libxc_print_stats_if_migration_is_aborted.patch
xen.migrate.tools-xc_document_printf_calls_in_xc_restore.patch
xen.migrate.tools-xc_print_messages_from_xc_save_with_xc_report.patch
xen.migrate.tools-xc_rework_xc_save.cswitch_qemu_logdirty.patch
xen.migrate.tools-xend_move_assert_to_exception_block.patch
xen.migrate.tools_add_xm_migrate_--log_progress_option.patch
xen.migrate.tools_set_migration_constraints_from_cmdline.patch
xen.migrate.tools_set_number_of_dirty_pages_during_migration.patch
xen.sles11sp1.fate311487.xen_platform_pci.dmistring.patch
xen.spec
xen_pvdrivers.conf
xenalyze.hg.tar.bz2
xenapi-console-protocol.patch
xenapiusers
xenconsole-no-multiple-connections.patch
xend-config-enable-dump-comment.patch
xend-config.diff
xend-console-port-restore.patch
xend-core-dump-loc.diff
xend-cpuid.patch
xend-cpuinfo-model-name.patch
xend-devid-or-name.patch
xend-disable-internal-logrotate.patch
xend-domain-lock-sfex.patch
xend-domain-lock.patch
xend-hvm-firmware-passthrough.patch
xend-migration-domname-fix.patch
xend-relocation-server.fw
xend-relocation.sh
xend-sysconfig.patch
xend-vcpu-affinity-fix.patch
xenpaging.autostart.patch
xenpaging.doc.patch
xenpaging.qemu.flush-cache.patch
xm-create-maxmem.patch
xm-create-xflag.patch
xm-save-check-file.patch
xmclone.sh
xmexample.disks
xmexample.domUloader
xnloader.py
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Other differences:
------------------
++++++ xen.spec ++++++
++++ 1560 lines (skipped)
++++++ 25861-x86-early-fixmap.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1347371120 -7200
# Node ID 51c2d7c83cbc2a0357ce112a463f91d354dcdba9
# Parent e4cb8411161043c726f699252cc761e77853e820
x86: allow early use of fixmaps
As a prerequisite for adding an EHCI debug port based console
implementation, set up the page tables needed for (a sub-portion of)
the fixmaps together with other boot time page table construction.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
Index: xen-4.2.2-testing/xen/arch/x86/boot/head.S
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/boot/head.S
+++ xen-4.2.2-testing/xen/arch/x86/boot/head.S
@@ -3,6 +3,7 @@
#include
#include
#include
+#include
#include
#include
@@ -136,6 +137,9 @@ __start:
add $8,%edx
add $(1<
#include
#include
+#define __ASSEMBLY__ /* avoid pulling in ACPI stuff (conflicts with EFI) */
+#include
+#undef __ASSEMBLY__
#include
#include
#include
@@ -1123,14 +1126,19 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
slot &= L2_PAGETABLE_ENTRIES - 1;
l2_bootmap[slot] = l2e_from_paddr(addr, __PAGE_HYPERVISOR|_PAGE_PSE);
}
+ /* Initialise L2 fixmap page directory entry. */
+ l2_fixmap[l2_table_offset(FIXADDR_TOP - 1)] =
+ l2e_from_paddr((UINTN)l1_fixmap, __PAGE_HYPERVISOR);
/* Initialise L3 identity-map page directory entries. */
for ( i = 0; i < ARRAY_SIZE(l2_identmap) / L2_PAGETABLE_ENTRIES; ++i )
l3_identmap[i] = l3e_from_paddr((UINTN)(l2_identmap +
i * L2_PAGETABLE_ENTRIES),
__PAGE_HYPERVISOR);
- /* Initialise L3 xen-map page directory entry. */
+ /* Initialise L3 xen-map and fixmap page directory entries. */
l3_xenmap[l3_table_offset(XEN_VIRT_START)] =
l3e_from_paddr((UINTN)l2_xenmap, __PAGE_HYPERVISOR);
+ l3_xenmap[l3_table_offset(FIXADDR_TOP - 1)] =
+ l3e_from_paddr((UINTN)l2_fixmap, __PAGE_HYPERVISOR);
/* Initialise L3 boot-map page directory entries. */
l3_bootmap[l3_table_offset(xen_phys_start)] =
l3e_from_paddr((UINTN)l2_bootmap, __PAGE_HYPERVISOR);
Index: xen-4.2.2-testing/xen/arch/x86/mm.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/mm.c
+++ xen-4.2.2-testing/xen/arch/x86/mm.c
@@ -131,6 +131,10 @@
l1_pgentry_t __attribute__ ((__section__ (".bss.page_aligned")))
l1_identmap[L1_PAGETABLE_ENTRIES];
+/* Mapping of the fixmap space needed early. */
+l1_pgentry_t __attribute__ ((__section__ (".bss.page_aligned")))
+ l1_fixmap[L1_PAGETABLE_ENTRIES];
+
#define MEM_LOG(_f, _a...) gdprintk(XENLOG_WARNING , _f "\n" , ## _a)
/*
Index: xen-4.2.2-testing/xen/arch/x86/x86_64/mm.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/x86_64/mm.c
+++ xen-4.2.2-testing/xen/arch/x86/x86_64/mm.c
@@ -65,6 +65,10 @@ l3_pgentry_t __attribute__ ((__section__
l2_pgentry_t __attribute__ ((__section__ (".bss.page_aligned")))
l2_xenmap[L2_PAGETABLE_ENTRIES];
+/* Enough page directories to map the early fixmap space. */
+l2_pgentry_t __attribute__ ((__section__ (".bss.page_aligned")))
+ l2_fixmap[L2_PAGETABLE_ENTRIES];
+
/* Enough page directories to map into the bottom 1GB. */
l3_pgentry_t __attribute__ ((__section__ (".bss.page_aligned")))
l3_bootmap[L3_PAGETABLE_ENTRIES];
Index: xen-4.2.2-testing/xen/include/asm-x86/config.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/config.h
+++ xen-4.2.2-testing/xen/include/asm-x86/config.h
@@ -317,7 +317,7 @@ extern unsigned char boot_edid_info[128]
#define MACHPHYS_MBYTES 16 /* 1 MB needed per 1 GB memory */
#define FRAMETABLE_MBYTES (MACHPHYS_MBYTES * 6)
-#define IOREMAP_VIRT_END 0UL
+#define IOREMAP_VIRT_END _AC(0,UL)
#define IOREMAP_VIRT_START (IOREMAP_VIRT_END - (IOREMAP_MBYTES<<20))
#define DIRECTMAP_VIRT_END IOREMAP_VIRT_START
#define DIRECTMAP_VIRT_START (DIRECTMAP_VIRT_END - (DIRECTMAP_MBYTES<<20))
Index: xen-4.2.2-testing/xen/include/asm-x86/fixmap.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/fixmap.h
+++ xen-4.2.2-testing/xen/include/asm-x86/fixmap.h
@@ -13,12 +13,17 @@
#define _ASM_FIXMAP_H
#include
+#include
+
+#define FIXADDR_TOP (IOREMAP_VIRT_END - PAGE_SIZE)
+
+#ifndef __ASSEMBLY__
+
#include
#include
#include
#include
#include
-#include
#include
#include
#include
@@ -68,7 +73,6 @@ enum fixed_addresses {
__end_of_fixed_addresses
};
-#define FIXADDR_TOP (IOREMAP_VIRT_END - PAGE_SIZE)
#define FIXADDR_SIZE (__end_of_fixed_addresses << PAGE_SHIFT)
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
@@ -92,4 +96,6 @@ static inline unsigned long virt_to_fix(
return __virt_to_fix(vaddr);
}
+#endif /* __ASSEMBLY__ */
+
#endif
Index: xen-4.2.2-testing/xen/include/asm-x86/page.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/page.h
+++ xen-4.2.2-testing/xen/include/asm-x86/page.h
@@ -1,6 +1,8 @@
#ifndef __X86_PAGE_H__
#define __X86_PAGE_H__
+#include
+
/*
* It is important that the masks are signed quantities. This ensures that
* the compiler sign-extends a 32-bit mask to 64 bits if that is required.
@@ -306,13 +308,15 @@ extern l2_pgentry_t idle_pg_table_l2[
extern l2_pgentry_t *compat_idle_pg_table_l2;
extern unsigned int m2p_compat_vstart;
extern l2_pgentry_t l2_xenmap[L2_PAGETABLE_ENTRIES],
+ l2_fixmap[L2_PAGETABLE_ENTRIES],
l2_bootmap[L2_PAGETABLE_ENTRIES];
extern l3_pgentry_t l3_xenmap[L3_PAGETABLE_ENTRIES],
l3_identmap[L3_PAGETABLE_ENTRIES],
l3_bootmap[L3_PAGETABLE_ENTRIES];
#endif
extern l2_pgentry_t l2_identmap[4*L2_PAGETABLE_ENTRIES];
-extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES];
+extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES],
+ l1_fixmap[L1_PAGETABLE_ENTRIES];
void paging_init(void);
void setup_idle_pagetable(void);
#endif /* !defined(__ASSEMBLY__) */
Index: xen-4.2.2-testing/xen/include/xen/const.h
===================================================================
--- /dev/null
+++ xen-4.2.2-testing/xen/include/xen/const.h
@@ -0,0 +1,24 @@
+/* const.h: Macros for dealing with constants. */
+
+#ifndef __XEN_CONST_H__
+#define __XEN_CONST_H__
+
+/* Some constant macros are used in both assembler and
+ * C code. Therefore we cannot annotate them always with
+ * 'UL' and other type specifiers unilaterally. We
+ * use the following macros to deal with this.
+ *
+ * Similarly, _AT() will cast an expression with a type in C, but
+ * leave it unchanged in asm.
+ */
+
+#ifdef __ASSEMBLY__
+#define _AC(X,Y) X
+#define _AT(T,X) X
+#else
+#define __AC(X,Y) (X##Y)
+#define _AC(X,Y) __AC(X,Y)
+#define _AT(T,X) ((T)(X))
+#endif
+
+#endif /* __XEN_CONST_H__ */
++++++ 25862-sercon-non-com.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1347371236 -7200
# Node ID 776a23fa0e938e4cf3307fc2e3b3f1a9488a5927
# Parent 51c2d7c83cbc2a0357ce112a463f91d354dcdba9
console: prepare for non-COMn port support
Widen SERHND_IDX (and use it where needed), introduce a flush low level
driver method, and remove unnecessary peeking of the common code at the
(driver specific) serial port identification string in the "console="
command line option value.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1017,7 +1017,7 @@ void __init smp_intr_init(void)
* Also ensure serial interrupts are high priority. We do not
* want them to be blocked by unacknowledged guest-bound interrupts.
*/
- for ( seridx = 0; seridx < 2; seridx++ )
+ for ( seridx = 0; seridx <= SERHND_IDX; seridx++ )
{
if ( (irq = serial_irq(seridx)) < 0 )
continue;
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -539,6 +539,7 @@ void printk(const char *fmt, ...)
void __init console_init_preirq(void)
{
char *p;
+ int sh;
serial_init_preirq();
@@ -551,8 +552,9 @@ void __init console_init_preirq(void)
vga_init();
else if ( !strncmp(p, "none", 4) )
continue;
- else if ( strncmp(p, "com", 3) ||
- (sercon_handle = serial_parse_handle(p)) == -1 )
+ else if ( (sh = serial_parse_handle(p)) >= 0 )
+ sercon_handle = sh;
+ else
{
char *q = strchr(p, ',');
if ( q != NULL )
--- a/xen/drivers/char/serial.c
+++ b/xen/drivers/char/serial.c
@@ -22,9 +22,11 @@ size_param("serial_tx_buffer", serial_tx
#define mask_serial_rxbuf_idx(_i) ((_i)&(serial_rxbufsz-1))
#define mask_serial_txbuf_idx(_i) ((_i)&(serial_txbufsz-1))
-static struct serial_port com[2] = {
- { .rx_lock = SPIN_LOCK_UNLOCKED, .tx_lock = SPIN_LOCK_UNLOCKED },
- { .rx_lock = SPIN_LOCK_UNLOCKED, .tx_lock = SPIN_LOCK_UNLOCKED }
+static struct serial_port com[SERHND_IDX + 1] = {
+ [0 ... SERHND_IDX] = {
+ .rx_lock = SPIN_LOCK_UNLOCKED,
+ .tx_lock = SPIN_LOCK_UNLOCKED
+ }
};
void serial_rx_interrupt(struct serial_port *port, struct cpu_user_regs *regs)
@@ -81,6 +83,8 @@ void serial_tx_interrupt(struct serial_p
port->driver->putc(
port, port->txbuf[mask_serial_txbuf_idx(port->txbufc++)]);
}
+ if ( i && port->driver->flush )
+ port->driver->flush(port);
}
spin_unlock(&port->tx_lock);
@@ -175,6 +179,9 @@ void serial_putc(int handle, char c)
__serial_putc(port, c);
+ if ( port->driver->flush )
+ port->driver->flush(port);
+
spin_unlock_irqrestore(&port->tx_lock, flags);
}
@@ -206,6 +213,9 @@ void serial_puts(int handle, const char
__serial_putc(port, c);
}
+ if ( port->driver->flush )
+ port->driver->flush(port);
+
spin_unlock_irqrestore(&port->tx_lock, flags);
}
@@ -261,10 +271,10 @@ int __init serial_parse_handle(char *con
switch ( conf[3] )
{
case '1':
- handle = 0;
+ handle = SERHND_COM1;
break;
case '2':
- handle = 1;
+ handle = SERHND_COM2;
break;
default:
goto fail;
@@ -365,6 +375,8 @@ void serial_start_sync(int handle)
port->driver->putc(
port, port->txbuf[mask_serial_txbuf_idx(port->txbufc++)]);
}
+ if ( port->driver->flush )
+ port->driver->flush(port);
}
spin_unlock_irqrestore(&port->tx_lock, flags);
--- a/xen/include/xen/serial.h
+++ b/xen/include/xen/serial.h
@@ -60,6 +60,8 @@ struct uart_driver {
int (*tx_empty)(struct serial_port *);
/* Put a character onto the serial line. */
void (*putc)(struct serial_port *, char);
+ /* Flush accumulated characters. */
+ void (*flush)(struct serial_port *);
/* Get a character from the serial line: returns 0 if none available. */
int (*getc)(struct serial_port *, char *);
/* Get IRQ number for this port's serial line: returns -1 if none. */
@@ -67,10 +69,12 @@ struct uart_driver {
};
/* 'Serial handles' are composed from the following fields. */
-#define SERHND_IDX (1<<0) /* COM1 or COM2? */
-#define SERHND_HI (1<<1) /* Mux/demux each transferred char by MSB. */
-#define SERHND_LO (1<<2) /* Ditto, except that the MSB is cleared. */
-#define SERHND_COOKED (1<<3) /* Newline/carriage-return translation? */
+#define SERHND_IDX (3<<0) /* COM1 or COM2? */
+# define SERHND_COM1 (0<<0)
+# define SERHND_COM2 (1<<0)
+#define SERHND_HI (1<<2) /* Mux/demux each transferred char by MSB. */
+#define SERHND_LO (1<<3) /* Ditto, except that the MSB is cleared. */
+#define SERHND_COOKED (1<<4) /* Newline/carriage-return translation? */
/* Two-stage initialisation (before/after IRQ-subsystem initialisation). */
void serial_init_preirq(void);
++++++ 25863-sercon-ehci-dbgp.patch ++++++
++++ 1798 lines (skipped)
++++++ 25864-sercon-unused.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1347371512 -7200
# Node ID e1380b5311ccee14eb47d7badb75339933d42249
# Parent 0d0c55a1975db9c6cac2e9259b5ebea7a7bdbaec
serial: avoid fully initializing unused consoles
Defer calling the drivers' post-IRQ initialization functions (generally
doing allocation of transmit buffers) until it is known that the
respective console is actually going to be used.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/drivers/char/ehci-dbgp.c
+++ b/xen/drivers/char/ehci-dbgp.c
@@ -1391,7 +1391,8 @@ static int ehci_dbgp_check_release(struc
printk(XENLOG_INFO "Releasing EHCI debug port at %02x:%02x.%u\n",
dbgp->bus, dbgp->slot, dbgp->func);
- kill_timer(&dbgp->timer);
+ if ( dbgp->timer.function )
+ kill_timer(&dbgp->timer);
dbgp->ehci_debug = NULL;
ctrl = readl(&ehci_debug->control);
--- a/xen/drivers/char/serial.c
+++ b/xen/drivers/char/serial.c
@@ -29,6 +29,8 @@ static struct serial_port com[SERHND_IDX
}
};
+static bool_t __read_mostly post_irq;
+
void serial_rx_interrupt(struct serial_port *port, struct cpu_user_regs *regs)
{
char c;
@@ -263,14 +265,12 @@ char serial_getc(int handle)
int __init serial_parse_handle(char *conf)
{
- int handle;
+ int handle, flags = 0;
if ( !strncmp(conf, "dbgp", 4) && (!conf[4] || conf[4] == ',') )
{
- if ( !com[SERHND_DBGP].driver )
- goto fail;
-
- return SERHND_DBGP | SERHND_COOKED;
+ handle = SERHND_DBGP;
+ goto common;
}
if ( strncmp(conf, "com", 3) )
@@ -288,17 +288,25 @@ int __init serial_parse_handle(char *con
goto fail;
}
- if ( !com[handle].driver )
- goto fail;
-
if ( conf[4] == 'H' )
- handle |= SERHND_HI;
+ flags |= SERHND_HI;
else if ( conf[4] == 'L' )
- handle |= SERHND_LO;
+ flags |= SERHND_LO;
- handle |= SERHND_COOKED;
+ common:
+ if ( !com[handle].driver )
+ goto fail;
+
+ if ( !post_irq )
+ com[handle].state = serial_parsed;
+ else if ( com[handle].state != serial_initialized )
+ {
+ if ( com[handle].driver->init_postirq )
+ com[handle].driver->init_postirq(&com[handle]);
+ com[handle].state = serial_initialized;
+ }
- return handle;
+ return handle | flags | SERHND_COOKED;
fail:
return -1;
@@ -450,8 +458,13 @@ void __init serial_init_postirq(void)
{
int i;
for ( i = 0; i < ARRAY_SIZE(com); i++ )
- if ( com[i].driver && com[i].driver->init_postirq )
- com[i].driver->init_postirq(&com[i]);
+ if ( com[i].state == serial_parsed )
+ {
+ if ( com[i].driver->init_postirq )
+ com[i].driver->init_postirq(&com[i]);
+ com[i].state = serial_initialized;
+ }
+ post_irq = 1;
}
void __init serial_endboot(void)
@@ -475,7 +488,7 @@ void serial_suspend(void)
{
int i;
for ( i = 0; i < ARRAY_SIZE(com); i++ )
- if ( com[i].driver && com[i].driver->suspend )
+ if ( com[i].state == serial_initialized && com[i].driver->suspend )
com[i].driver->suspend(&com[i]);
}
@@ -483,7 +496,7 @@ void serial_resume(void)
{
int i;
for ( i = 0; i < ARRAY_SIZE(com); i++ )
- if ( com[i].driver && com[i].driver->resume )
+ if ( com[i].state == serial_initialized && com[i].driver->resume )
com[i].driver->resume(&com[i]);
}
--- a/xen/include/xen/serial.h
+++ b/xen/include/xen/serial.h
@@ -25,10 +25,17 @@ extern unsigned int serial_txbufsz;
struct uart_driver;
+enum serial_port_state {
+ serial_unused,
+ serial_parsed,
+ serial_initialized
+};
+
struct serial_port {
/* Uart-driver parameters. */
struct uart_driver *driver;
void *uart;
+ enum serial_port_state state;
/* Number of characters the port can hold for transmit. */
int tx_fifo_size;
/* Transmit data buffer (interrupt-driven uart). */
++++++ 25866-sercon-ns16550-pci-irq.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1347371733 -7200
# Node ID ee12dc357fbecbb0517798f395d14bf1764c6766
# Parent 5fb5b3b70e34ef278d06aff27878b4b8e6d9145f
ns16550: PCI initialization adjustments
Besides single-port serial cards, also accept multi-port ones and such
providing mixed functionality (e.g. also having a parallel port).
Reading PCI_INTERRUPT_PIN before ACPI gets enabled generally produces
an incorrect IRQ (below 16, whereas after enabling ACPI it frequently
would end up at a higher one), so this is useful (almost) only when a
system already boots in ACPI mode.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -449,7 +449,6 @@ static int __init check_existence(struct
static int
pci_uart_config (struct ns16550 *uart, int skip_amt, int bar_idx)
{
- uint16_t class;
uint32_t bar, len;
int b, d, f;
@@ -460,9 +459,15 @@ pci_uart_config (struct ns16550 *uart, i
{
for ( f = 0; f < 0x8; f++ )
{
- class = pci_conf_read16(0, b, d, f, PCI_CLASS_DEVICE);
- if ( class != 0x700 )
+ switch ( pci_conf_read16(0, b, d, f, PCI_CLASS_DEVICE) )
+ {
+ case 0x0700: /* single port serial */
+ case 0x0702: /* multi port serial */
+ case 0x0780: /* other (e.g serial+parallel) */
+ break;
+ default:
continue;
+ }
bar = pci_conf_read32(0, b, d, f,
PCI_BASE_ADDRESS_0 + bar_idx*4);
@@ -485,7 +490,8 @@ pci_uart_config (struct ns16550 *uart, i
uart->bar = bar;
uart->bar_idx = bar_idx;
uart->io_base = bar & 0xfffe;
- uart->irq = 0;
+ uart->irq = pci_conf_read8(0, b, d, f, PCI_INTERRUPT_PIN) ?
+ pci_conf_read8(0, b, d, f, PCI_INTERRUPT_LINE) : 0;
return 0;
}
++++++ 25867-sercon-ns16550-parse.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1347371805 -7200
# Node ID b22f184e1a3cac03abeed92ec4b74235fd0881f4
# Parent ee12dc357fbecbb0517798f395d14bf1764c6766
ns16550: command line parsing adjustments
Allow intermediate parts of the command line options to be absent
(expressed by two immediately succeeding commas).
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -199,7 +199,7 @@ If set, override Xen's calculation of th
If set, override Xen's default choice for the platform timer.
### com1,com2
-> `= <baud>[/][,DPS[,<io-base>[,<irq>[,<port-bdf>[,<bridge-bdf>]]]] | pci | amt ] `
+> `= <baud>[/][,[DPS][,[<io-base>|pci|amt][,[<irq>][,[<port-bdf>][,[<bridge-bdf>]]]]]]`
Both option `com1` and `com2` follow the same format.
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -536,26 +536,23 @@ static void __init ns16550_parse_port_co
else if ( (baud = simple_strtoul(conf, &conf, 10)) != 0 )
uart->baud = baud;
- if ( *conf == '/')
+ if ( *conf == '/' )
{
conf++;
uart->clock_hz = simple_strtoul(conf, &conf, 0) << 4;
}
- if ( *conf != ',' )
- goto config_parsed;
- conf++;
-
- uart->data_bits = simple_strtoul(conf, &conf, 10);
+ if ( *conf == ',' && *++conf != ',' )
+ {
+ uart->data_bits = simple_strtoul(conf, &conf, 10);
- uart->parity = parse_parity_char(*conf);
- conf++;
+ uart->parity = parse_parity_char(*conf);
- uart->stop_bits = simple_strtoul(conf, &conf, 10);
+ uart->stop_bits = simple_strtoul(conf + 1, &conf, 10);
+ }
- if ( *conf == ',' )
+ if ( *conf == ',' && *++conf != ',' )
{
- conf++;
if ( strncmp(conf, "pci", 3) == 0 )
{
if ( pci_uart_config(uart, 1/* skip AMT */, uart - ns16550_com) )
@@ -572,24 +569,21 @@ static void __init ns16550_parse_port_co
{
uart->io_base = simple_strtoul(conf, &conf, 0);
}
+ }
- if ( *conf == ',' )
- {
- conf++;
- uart->irq = simple_strtoul(conf, &conf, 10);
- if ( *conf == ',' )
- {
- conf++;
- uart->ps_bdf_enable = 1;
- parse_pci_bdf(&conf, &uart->ps_bdf[0]);
- if ( *conf == ',' )
- {
- conf++;
- uart->pb_bdf_enable = 1;
- parse_pci_bdf(&conf, &uart->pb_bdf[0]);
- }
- }
- }
+ if ( *conf == ',' && *++conf != ',' )
+ uart->irq = simple_strtol(conf, &conf, 10);
+
+ if ( *conf == ',' && *++conf != ',' )
+ {
+ uart->ps_bdf_enable = 1;
+ parse_pci_bdf(&conf, &uart->ps_bdf[0]);
+ }
+
+ if ( *conf == ',' && *++conf != ',' )
+ {
+ uart->pb_bdf_enable = 1;
+ parse_pci_bdf(&conf, &uart->pb_bdf[0]);
}
config_parsed:
++++++ 25874-x86-EFI-chain-cfg.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1347437974 -7200
# Node ID 8c0aa97d529a55de2ab96be1a5a6e9ed6a9c6bf0
# Parent ac8f4afccd6c6786a3fd5691e8b0c9b38c47e994
x86-64/EFI: allow chaining of config files
Namely when making use the CONFIG_XEN_COMPAT_* options in the legacy
Linux kernels, newer kernels may not be compatible with older
hypervisors, so trying to boot such a combination makes little sense.
Booting older kernels on newer hypervisors, however, has to always
work.
With the way xen.efi looks for its configuration file, allowing
individual configuration files to refer only to compatible kernels,
and referring from an older- to a newer-hypervisor one (the kernels
of which will, as said, necessarily be compatible with the older
hypervisor) allows to greatly reduce redundancy at least in
development environments where one frequently wants multiple
hypervisors and kernles to be installed in parallel.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/docs/misc/efi.markdown
+++ b/docs/misc/efi.markdown
@@ -75,6 +75,13 @@ Specifies an XSM module to load.
Specifies a CPU microcode blob to load.
+###`chain=<filename>`
+
+Specifies an alternate configuration file to use in case the specified section
+(and in particular its `kernel=` setting) can't be found in the default (or
+specified) configuration file. This is only meaningful in the [global] section
+and really not meant to be used together with the `-cfg=` command line option.
+
Filenames must be specified relative to the location of the EFI binary.
Extra options to be passed to Xen can also be specified on the command line,
--- a/xen/arch/x86/efi/boot.c
+++ b/xen/arch/x86/efi/boot.c
@@ -797,7 +797,26 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
else
section.s = get_value(&cfg, "global", "default");
- name.s = get_value(&cfg, section.s, "kernel");
+ for ( ; ; )
+ {
+ name.s = get_value(&cfg, section.s, "kernel");
+ if ( name.s )
+ break;
+ name.s = get_value(&cfg, "global", "chain");
+ if ( !name.s )
+ break;
+ efi_bs->FreePages(cfg.addr, PFN_UP(cfg.size));
+ cfg.addr = 0;
+ if ( !read_file(dir_handle, s2w(&name), &cfg) )
+ {
+ PrintStr(L"Chained configuration file '");
+ PrintStr(name.w);
+ efi_bs->FreePool(name.w);
+ blexit(L"'not found\r\n");
+ }
+ pre_parse(&cfg);
+ efi_bs->FreePool(name.w);
+ }
if ( !name.s )
blexit(L"No Dom0 kernel image specified\r\n");
split_value(name.s);
++++++ 25909-xenpm-consistent.patch ++++++
++++ 630 lines (skipped)
++++++ 25920-x86-APICV-enable.patch ++++++
References: FATE#313605
# HG changeset patch
# User Jiongxi Li
# Date 1347912248 -3600
# Node ID ec60de627945f17ec2ce5c14e1224b59403875f7
# Parent 62de66cec48a1716bb700912da451a26296b8d1e
xen: enable APIC-Register Virtualization
Add APIC register virtualization support
- APIC read doesn't cause VM-Exit
- APIC write becomes trap-like
Signed-off-by: Gang Wei
Signed-off-by: Yang Zhang
Signed-off-by: Jiongxi Li
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -822,6 +822,12 @@ static int vlapic_write(struct vcpu *v,
return rc;
}
+int vlapic_apicv_write(struct vcpu *v, unsigned int offset)
+{
+ uint32_t val = vlapic_get_reg(vcpu_vlapic(v), offset);
+ return vlapic_reg_write(v, offset, val);
+}
+
int hvm_x2apic_msr_write(struct vcpu *v, unsigned int msr, uint64_t msr_content)
{
struct vlapic *vlapic = vcpu_vlapic(v);
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -89,6 +89,7 @@ static void __init vmx_display_features(
P(cpu_has_vmx_vnmi, "Virtual NMI");
P(cpu_has_vmx_msr_bitmap, "MSR direct-access bitmap");
P(cpu_has_vmx_unrestricted_guest, "Unrestricted Guest");
+ P(cpu_has_vmx_apic_reg_virt, "APIC Register Virtualization");
#undef P
if ( !printed )
@@ -186,6 +187,14 @@ static int vmx_init_vmcs_config(void)
if ( opt_unrestricted_guest_enabled )
opt |= SECONDARY_EXEC_UNRESTRICTED_GUEST;
+ /*
+ * "APIC Register Virtualization"
+ * can be set only when "use TPR shadow" is set
+ */
+ if ( _vmx_cpu_based_exec_control & CPU_BASED_TPR_SHADOW )
+ opt |= SECONDARY_EXEC_APIC_REGISTER_VIRT;
+
+
_vmx_secondary_exec_control = adjust_vmx_controls(
"Secondary Exec Control", min, opt,
MSR_IA32_VMX_PROCBASED_CTLS2, &mismatch);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2279,6 +2279,16 @@ static void vmx_idtv_reinject(unsigned l
}
}
+static int vmx_handle_apic_write(void)
+{
+ unsigned long exit_qualification = __vmread(EXIT_QUALIFICATION);
+ unsigned int offset = exit_qualification & 0xfff;
+
+ ASSERT(cpu_has_vmx_apic_reg_virt);
+
+ return vlapic_apicv_write(current, offset);
+}
+
void vmx_vmexit_handler(struct cpu_user_regs *regs)
{
unsigned int exit_reason, idtv_info, intr_info = 0, vector = 0;
@@ -2741,6 +2751,11 @@ void vmx_vmexit_handler(struct cpu_user_
break;
}
+ case EXIT_REASON_APIC_WRITE:
+ if ( vmx_handle_apic_write() )
+ hvm_inject_hw_exception(TRAP_gp_fault, 0);
+ break;
+
case EXIT_REASON_ACCESS_GDTR_OR_IDTR:
case EXIT_REASON_ACCESS_LDTR_OR_TR:
case EXIT_REASON_VMX_PREEMPTION_TIMER_EXPIRED:
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -103,6 +103,8 @@ void vlapic_EOI_set(struct vlapic *vlapi
int vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high);
+int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
+
struct vlapic *vlapic_lowest_prio(
struct domain *d, struct vlapic *source,
int short_hand, uint8_t dest, uint8_t dest_mode);
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -182,6 +182,7 @@ extern u32 vmx_vmentry_control;
#define SECONDARY_EXEC_ENABLE_VPID 0x00000020
#define SECONDARY_EXEC_WBINVD_EXITING 0x00000040
#define SECONDARY_EXEC_UNRESTRICTED_GUEST 0x00000080
+#define SECONDARY_EXEC_APIC_REGISTER_VIRT 0x00000100
#define SECONDARY_EXEC_PAUSE_LOOP_EXITING 0x00000400
#define SECONDARY_EXEC_ENABLE_INVPCID 0x00001000
extern u32 vmx_secondary_exec_control;
@@ -230,6 +231,8 @@ extern bool_t cpu_has_vmx_ins_outs_instr
SECONDARY_EXEC_UNRESTRICTED_GUEST)
#define cpu_has_vmx_ple \
(vmx_secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING)
+#define cpu_has_vmx_apic_reg_virt \
+ (vmx_secondary_exec_control & SECONDARY_EXEC_APIC_REGISTER_VIRT)
/* GUEST_INTERRUPTIBILITY_INFO flags. */
#define VMX_INTR_SHADOW_STI 0x00000001
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -129,6 +129,7 @@ void vmx_update_cpu_exec_control(struct
#define EXIT_REASON_INVVPID 53
#define EXIT_REASON_WBINVD 54
#define EXIT_REASON_XSETBV 55
+#define EXIT_REASON_APIC_WRITE 56
#define EXIT_REASON_INVPCID 58
/*
++++++ 25921-x86-APICV-delivery.patch ++++++
References: FATE#313605
# HG changeset patch
# User Jiongxi Li
# Date 1347912311 -3600
# Node ID 713b8849b11afa05f1dde157a3f5086fa3aaad08
# Parent ec60de627945f17ec2ce5c14e1224b59403875f7
xen: enable Virtual-interrupt delivery
Virtual interrupt delivery avoids Xen to inject vAPIC interrupts
manually, which is fully taken care of by the hardware. This needs
some special awareness into existing interrupr injection path:
For pending interrupt from vLAPIC, instead of direct injection, we may
need update architecture specific indicators before resuming to guest.
Before returning to guest, RVI should be updated if any pending IRRs
EOI exit bitmap controls whether an EOI write should cause VM-Exit. If
set, a trap-like induced EOI VM-Exit is triggered. The approach here
is to manipulate EOI exit bitmap based on value of TMR. Level
triggered irq requires a hook in vLAPIC EOI write, so that vIOAPIC EOI
is triggered and emulated
Signed-off-by: Gang Wei
Signed-off-by: Yang Zhang
Signed-off-by: Jiongxi Li
Committed-by: Keir Fraser
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -145,6 +145,9 @@ int vlapic_set_irq(struct vlapic *vlapic
if ( trig )
vlapic_set_vector(vec, &vlapic->regs->data[APIC_TMR]);
+ if ( hvm_funcs.update_eoi_exit_bitmap )
+ hvm_funcs.update_eoi_exit_bitmap(vlapic_vcpu(vlapic), vec ,trig);
+
/* We may need to wake up target vcpu, besides set pending bit here */
return !vlapic_test_and_set_irr(vec, vlapic);
}
@@ -410,6 +413,14 @@ void vlapic_EOI_set(struct vlapic *vlapi
hvm_dpci_msi_eoi(current->domain, vector);
}
+void vlapic_handle_EOI_induced_exit(struct vlapic *vlapic, int vector)
+{
+ if ( vlapic_test_and_clear_vector(vector, &vlapic->regs->data[APIC_TMR]) )
+ vioapic_update_EOI(vlapic_domain(vlapic), vector);
+
+ hvm_dpci_msi_eoi(current->domain, vector);
+}
+
int vlapic_ipi(
struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high)
{
@@ -996,6 +1007,14 @@ void vlapic_adjust_i8259_target(struct d
pt_adjust_global_vcpu_target(v);
}
+int vlapic_virtual_intr_delivery_enabled(void)
+{
+ if ( hvm_funcs.virtual_intr_delivery_enabled )
+ return hvm_funcs.virtual_intr_delivery_enabled();
+ else
+ return 0;
+}
+
int vlapic_has_pending_irq(struct vcpu *v)
{
struct vlapic *vlapic = vcpu_vlapic(v);
@@ -1008,6 +1027,9 @@ int vlapic_has_pending_irq(struct vcpu *
if ( irr == -1 )
return -1;
+ if ( vlapic_virtual_intr_delivery_enabled() )
+ return irr;
+
isr = vlapic_find_highest_isr(vlapic);
isr = (isr != -1) ? isr : 0;
if ( (isr & 0xf0) >= (irr & 0xf0) )
@@ -1020,6 +1042,9 @@ int vlapic_ack_pending_irq(struct vcpu *
{
struct vlapic *vlapic = vcpu_vlapic(v);
+ if ( vlapic_virtual_intr_delivery_enabled() )
+ return 1;
+
vlapic_set_vector(vector, &vlapic->regs->data[APIC_ISR]);
vlapic_clear_irr(vector, vlapic);
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -209,6 +209,7 @@ void vmx_intr_assist(void)
struct vcpu *v = current;
unsigned int tpr_threshold = 0;
enum hvm_intblk intblk;
+ int pt_vector = -1;
/* Block event injection when single step with MTF. */
if ( unlikely(v->arch.hvm_vcpu.single_step) )
@@ -219,7 +220,7 @@ void vmx_intr_assist(void)
}
/* Crank the handle on interrupt state. */
- pt_update_irq(v);
+ pt_vector = pt_update_irq(v);
do {
intack = hvm_vcpu_has_pending_irq(v);
@@ -230,16 +231,34 @@ void vmx_intr_assist(void)
goto out;
intblk = hvm_interrupt_blocked(v, intack);
- if ( intblk == hvm_intblk_tpr )
+ if ( cpu_has_vmx_virtual_intr_delivery )
+ {
+ /* Set "Interrupt-window exiting" for ExtINT */
+ if ( (intblk != hvm_intblk_none) &&
+ ( (intack.source == hvm_intsrc_pic) ||
+ ( intack.source == hvm_intsrc_vector) ) )
+ {
+ enable_intr_window(v, intack);
+ goto out;
+ }
+
+ if ( __vmread(VM_ENTRY_INTR_INFO) & INTR_INFO_VALID_MASK )
+ {
+ if ( (intack.source == hvm_intsrc_pic) ||
+ (intack.source == hvm_intsrc_nmi) ||
+ (intack.source == hvm_intsrc_mce) )
+ enable_intr_window(v, intack);
+
+ goto out;
+ }
+ } else if ( intblk == hvm_intblk_tpr )
{
ASSERT(vlapic_enabled(vcpu_vlapic(v)));
ASSERT(intack.source == hvm_intsrc_lapic);
tpr_threshold = intack.vector >> 4;
goto out;
- }
-
- if ( (intblk != hvm_intblk_none) ||
- (__vmread(VM_ENTRY_INTR_INFO) & INTR_INFO_VALID_MASK) )
+ } else if ( (intblk != hvm_intblk_none) ||
+ (__vmread(VM_ENTRY_INTR_INFO) & INTR_INFO_VALID_MASK) )
{
enable_intr_window(v, intack);
goto out;
@@ -256,6 +275,44 @@ void vmx_intr_assist(void)
{
hvm_inject_hw_exception(TRAP_machine_check, HVM_DELIVER_NO_ERROR_CODE);
}
+ else if ( cpu_has_vmx_virtual_intr_delivery &&
+ intack.source != hvm_intsrc_pic &&
+ intack.source != hvm_intsrc_vector )
+ {
+ unsigned long status = __vmread(GUEST_INTR_STATUS);
+
+ /*
+ * Set eoi_exit_bitmap for periodic timer interrup to cause EOI-induced VM
+ * exit, then pending periodic time interrups have the chance to be injected
+ * for compensation
+ */
+ if (pt_vector != -1)
+ vmx_set_eoi_exit_bitmap(v, pt_vector);
+
+ /* we need update the RVI field */
+ status &= ~(unsigned long)0x0FF;
+ status |= (unsigned long)0x0FF &
+ intack.vector;
+ __vmwrite(GUEST_INTR_STATUS, status);
+ if (v->arch.hvm_vmx.eoi_exitmap_changed) {
+#ifdef __i386__
+#define UPDATE_EOI_EXITMAP(v, e) { \
+ if (test_and_clear_bit(e, &v->arch.hvm_vmx.eoi_exitmap_changed)) { \
+ __vmwrite(EOI_EXIT_BITMAP##e, v->arch.hvm_vmx.eoi_exit_bitmap[e]); \
+ __vmwrite(EOI_EXIT_BITMAP##e##_HIGH, v->arch.hvm_vmx.eoi_exit_bitmap[e] >> 32);}}
+#else
+#define UPDATE_EOI_EXITMAP(v, e) { \
+ if (test_and_clear_bit(e, &v->arch.hvm_vmx.eoi_exitmap_changed)) { \
+ __vmwrite(EOI_EXIT_BITMAP##e, v->arch.hvm_vmx.eoi_exit_bitmap[e]);}}
+#endif
+ UPDATE_EOI_EXITMAP(v, 0);
+ UPDATE_EOI_EXITMAP(v, 1);
+ UPDATE_EOI_EXITMAP(v, 2);
+ UPDATE_EOI_EXITMAP(v, 3);
+ }
+
+ pt_intr_post(v, intack);
+ }
else
{
HVMTRACE_2D(INJ_VIRQ, intack.vector, /*fake=*/ 0);
@@ -265,11 +322,16 @@ void vmx_intr_assist(void)
/* Is there another IRQ to queue up behind this one? */
intack = hvm_vcpu_has_pending_irq(v);
- if ( unlikely(intack.source != hvm_intsrc_none) )
- enable_intr_window(v, intack);
+ if ( !cpu_has_vmx_virtual_intr_delivery ||
+ intack.source == hvm_intsrc_pic ||
+ intack.source == hvm_intsrc_vector )
+ {
+ if ( unlikely(intack.source != hvm_intsrc_none) )
+ enable_intr_window(v, intack);
+ }
out:
- if ( cpu_has_vmx_tpr_shadow )
+ if ( !cpu_has_vmx_virtual_intr_delivery && cpu_has_vmx_tpr_shadow )
__vmwrite(TPR_THRESHOLD, tpr_threshold);
}
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -90,6 +90,7 @@ static void __init vmx_display_features(
P(cpu_has_vmx_msr_bitmap, "MSR direct-access bitmap");
P(cpu_has_vmx_unrestricted_guest, "Unrestricted Guest");
P(cpu_has_vmx_apic_reg_virt, "APIC Register Virtualization");
+ P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
#undef P
if ( !printed )
@@ -188,11 +189,12 @@ static int vmx_init_vmcs_config(void)
opt |= SECONDARY_EXEC_UNRESTRICTED_GUEST;
/*
- * "APIC Register Virtualization"
+ * "APIC Register Virtualization" and "Virtual Interrupt Delivery"
* can be set only when "use TPR shadow" is set
*/
if ( _vmx_cpu_based_exec_control & CPU_BASED_TPR_SHADOW )
- opt |= SECONDARY_EXEC_APIC_REGISTER_VIRT;
+ opt |= SECONDARY_EXEC_APIC_REGISTER_VIRT |
+ SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY;
_vmx_secondary_exec_control = adjust_vmx_controls(
@@ -787,6 +789,22 @@ static int construct_vmcs(struct vcpu *v
__vmwrite(IO_BITMAP_A, virt_to_maddr((char *)hvm_io_bitmap + 0));
__vmwrite(IO_BITMAP_B, virt_to_maddr((char *)hvm_io_bitmap + PAGE_SIZE));
+ if ( cpu_has_vmx_virtual_intr_delivery )
+ {
+ /* EOI-exit bitmap */
+ v->arch.hvm_vmx.eoi_exit_bitmap[0] = (uint64_t)0;
+ v->arch.hvm_vmx.eoi_exit_bitmap[1] = (uint64_t)0;
+ v->arch.hvm_vmx.eoi_exit_bitmap[2] = (uint64_t)0;
+ v->arch.hvm_vmx.eoi_exit_bitmap[3] = (uint64_t)0;
+ __vmwrite(EOI_EXIT_BITMAP0, 0);
+ __vmwrite(EOI_EXIT_BITMAP1, 0);
+ __vmwrite(EOI_EXIT_BITMAP2, 0);
+ __vmwrite(EOI_EXIT_BITMAP3, 0);
+
+ /* Initialise Guest Interrupt Status (RVI and SVI) to 0 */
+ __vmwrite(GUEST_INTR_STATUS, 0);
+ }
+
/* Host data selectors. */
__vmwrite(HOST_SS_SELECTOR, __HYPERVISOR_DS);
__vmwrite(HOST_DS_SELECTOR, __HYPERVISOR_DS);
@@ -1028,6 +1046,30 @@ int vmx_add_host_load_msr(u32 msr)
return 0;
}
+void vmx_set_eoi_exit_bitmap(struct vcpu *v, u8 vector)
+{
+ int index, offset, changed;
+
+ index = vector >> 6;
+ offset = vector & 63;
+ changed = !test_and_set_bit(offset,
+ (uint64_t *)&v->arch.hvm_vmx.eoi_exit_bitmap[index]);
+ if (changed)
+ set_bit(index, &v->arch.hvm_vmx.eoi_exitmap_changed);
+}
+
+void vmx_clear_eoi_exit_bitmap(struct vcpu *v, u8 vector)
+{
+ int index, offset, changed;
+
+ index = vector >> 6;
+ offset = vector & 63;
+ changed = test_and_clear_bit(offset,
+ (uint64_t *)&v->arch.hvm_vmx.eoi_exit_bitmap[index]);
+ if (changed)
+ set_bit(index, &v->arch.hvm_vmx.eoi_exitmap_changed);
+}
+
int vmx_create_vmcs(struct vcpu *v)
{
struct arch_vmx_struct *arch_vmx = &v->arch.hvm_vmx;
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1507,6 +1507,22 @@ static void vmx_set_info_guest(struct vc
vmx_vmcs_exit(v);
}
+static void vmx_update_eoi_exit_bitmap(struct vcpu *v, u8 vector, u8 trig)
+{
+ if ( cpu_has_vmx_virtual_intr_delivery )
+ {
+ if (trig)
+ vmx_set_eoi_exit_bitmap(v, vector);
+ else
+ vmx_clear_eoi_exit_bitmap(v, vector);
+ }
+}
+
+static int vmx_virtual_intr_delivery_enabled(void)
+{
+ return cpu_has_vmx_virtual_intr_delivery;
+}
+
static struct hvm_function_table __read_mostly vmx_function_table = {
.name = "VMX",
.cpu_up_prepare = vmx_cpu_up_prepare,
@@ -1553,7 +1569,9 @@ static struct hvm_function_table __read_
.nhvm_vmcx_guest_intercepts_trap = nvmx_intercepts_exception,
.nhvm_vcpu_vmexit_trap = nvmx_vmexit_trap,
.nhvm_intr_blocked = nvmx_intr_blocked,
- .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources
+ .nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
+ .update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
+ .virtual_intr_delivery_enabled = vmx_virtual_intr_delivery_enabled
};
struct hvm_function_table * __init start_vmx(void)
@@ -2289,6 +2307,17 @@ static int vmx_handle_apic_write(void)
return vlapic_apicv_write(current, offset);
}
+/*
+ * When "Virtual Interrupt Delivery" is enabled, this function is used
+ * to handle EOI-induced VM exit
+ */
+void vmx_handle_EOI_induced_exit(struct vlapic *vlapic, int vector)
+{
+ ASSERT(cpu_has_vmx_virtual_intr_delivery);
+
+ vlapic_handle_EOI_induced_exit(vlapic, vector);
+}
+
void vmx_vmexit_handler(struct cpu_user_regs *regs)
{
unsigned int exit_reason, idtv_info, intr_info = 0, vector = 0;
@@ -2689,6 +2718,16 @@ void vmx_vmexit_handler(struct cpu_user_
hvm_inject_hw_exception(TRAP_gp_fault, 0);
break;
+ case EXIT_REASON_EOI_INDUCED:
+ {
+ int vector;
+ exit_qualification = __vmread(EXIT_QUALIFICATION);
+ vector = exit_qualification & 0xff;
+
+ vmx_handle_EOI_induced_exit(vcpu_vlapic(current), vector);
+ break;
+ }
+
case EXIT_REASON_IO_INSTRUCTION:
exit_qualification = __vmread(EXIT_QUALIFICATION);
if ( exit_qualification & 0x10 )
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -212,7 +212,7 @@ static void pt_timer_fn(void *data)
pt_unlock(pt);
}
-void pt_update_irq(struct vcpu *v)
+int pt_update_irq(struct vcpu *v)
{
struct list_head *head = &v->arch.hvm_vcpu.tm_list;
struct periodic_time *pt, *temp, *earliest_pt = NULL;
@@ -245,7 +245,7 @@ void pt_update_irq(struct vcpu *v)
if ( earliest_pt == NULL )
{
spin_unlock(&v->arch.hvm_vcpu.tm_lock);
- return;
+ return -1;
}
earliest_pt->irq_issued = 1;
@@ -263,6 +263,17 @@ void pt_update_irq(struct vcpu *v)
hvm_isa_irq_deassert(v->domain, irq);
hvm_isa_irq_assert(v->domain, irq);
}
+
+ /*
+ * If periodic timer interrut is handled by lapic, its vector in
+ * IRR is returned and used to set eoi_exit_bitmap for virtual
+ * interrupt delivery case. Otherwise return -1 to do nothing.
+ */
+ if ( vlapic_accept_pic_intr(v) &&
+ (&v->domain->arch.hvm_domain)->vpic[0].int_output )
+ return -1;
+ else
+ return pt_irq_vector(earliest_pt, hvm_intsrc_lapic);
}
static struct periodic_time *is_pt_irq(
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -180,6 +180,10 @@ struct hvm_function_table {
enum hvm_intblk (*nhvm_intr_blocked)(struct vcpu *v);
void (*nhvm_domain_relinquish_resources)(struct domain *d);
+
+ /* Virtual interrupt delivery */
+ void (*update_eoi_exit_bitmap)(struct vcpu *v, u8 vector, u8 trig);
+ int (*virtual_intr_delivery_enabled)(void);
};
extern struct hvm_function_table hvm_funcs;
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -100,6 +100,7 @@ int vlapic_accept_pic_intr(struct vcpu *
void vlapic_adjust_i8259_target(struct domain *d);
void vlapic_EOI_set(struct vlapic *vlapic);
+void vlapic_handle_EOI_induced_exit(struct vlapic *vlapic, int vector);
int vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high);
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -110,6 +110,9 @@ struct arch_vmx_struct {
unsigned int host_msr_count;
struct vmx_msr_entry *host_msr_area;
+ uint32_t eoi_exitmap_changed;
+ uint64_t eoi_exit_bitmap[4];
+
unsigned long host_cr0;
/* Is the guest in real mode? */
@@ -183,6 +186,7 @@ extern u32 vmx_vmentry_control;
#define SECONDARY_EXEC_WBINVD_EXITING 0x00000040
#define SECONDARY_EXEC_UNRESTRICTED_GUEST 0x00000080
#define SECONDARY_EXEC_APIC_REGISTER_VIRT 0x00000100
+#define SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY 0x00000200
#define SECONDARY_EXEC_PAUSE_LOOP_EXITING 0x00000400
#define SECONDARY_EXEC_ENABLE_INVPCID 0x00001000
extern u32 vmx_secondary_exec_control;
@@ -233,6 +237,8 @@ extern bool_t cpu_has_vmx_ins_outs_instr
(vmx_secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING)
#define cpu_has_vmx_apic_reg_virt \
(vmx_secondary_exec_control & SECONDARY_EXEC_APIC_REGISTER_VIRT)
+#define cpu_has_vmx_virtual_intr_delivery \
+ (vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
/* GUEST_INTERRUPTIBILITY_INFO flags. */
#define VMX_INTR_SHADOW_STI 0x00000001
@@ -251,6 +257,7 @@ enum vmcs_field {
GUEST_GS_SELECTOR = 0x0000080a,
GUEST_LDTR_SELECTOR = 0x0000080c,
GUEST_TR_SELECTOR = 0x0000080e,
+ GUEST_INTR_STATUS = 0x00000810,
HOST_ES_SELECTOR = 0x00000c00,
HOST_CS_SELECTOR = 0x00000c02,
HOST_SS_SELECTOR = 0x00000c04,
@@ -278,6 +285,14 @@ enum vmcs_field {
APIC_ACCESS_ADDR_HIGH = 0x00002015,
EPT_POINTER = 0x0000201a,
EPT_POINTER_HIGH = 0x0000201b,
+ EOI_EXIT_BITMAP0 = 0x0000201c,
+ EOI_EXIT_BITMAP0_HIGH = 0x0000201d,
+ EOI_EXIT_BITMAP1 = 0x0000201e,
+ EOI_EXIT_BITMAP1_HIGH = 0x0000201f,
+ EOI_EXIT_BITMAP2 = 0x00002020,
+ EOI_EXIT_BITMAP2_HIGH = 0x00002021,
+ EOI_EXIT_BITMAP3 = 0x00002022,
+ EOI_EXIT_BITMAP3_HIGH = 0x00002023,
GUEST_PHYSICAL_ADDRESS = 0x00002400,
GUEST_PHYSICAL_ADDRESS_HIGH = 0x00002401,
VMCS_LINK_POINTER = 0x00002800,
@@ -398,6 +413,8 @@ int vmx_write_guest_msr(u32 msr, u64 val
int vmx_add_guest_msr(u32 msr);
int vmx_add_host_load_msr(u32 msr);
void vmx_vmcs_switch(struct vmcs_struct *from, struct vmcs_struct *to);
+void vmx_set_eoi_exit_bitmap(struct vcpu *v, u8 vector);
+void vmx_clear_eoi_exit_bitmap(struct vcpu *v, u8 vector);
#endif /* ASM_X86_HVM_VMX_VMCS_H__ */
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -119,6 +119,7 @@ void vmx_update_cpu_exec_control(struct
#define EXIT_REASON_MCE_DURING_VMENTRY 41
#define EXIT_REASON_TPR_BELOW_THRESHOLD 43
#define EXIT_REASON_APIC_ACCESS 44
+#define EXIT_REASON_EOI_INDUCED 45
#define EXIT_REASON_ACCESS_GDTR_OR_IDTR 46
#define EXIT_REASON_ACCESS_LDTR_OR_TR 47
#define EXIT_REASON_EPT_VIOLATION 48
--- a/xen/include/asm-x86/hvm/vpt.h
+++ b/xen/include/asm-x86/hvm/vpt.h
@@ -141,7 +141,7 @@ struct pl_time { /* platform time */
void pt_save_timer(struct vcpu *v);
void pt_restore_timer(struct vcpu *v);
-void pt_update_irq(struct vcpu *v);
+int pt_update_irq(struct vcpu *v);
void pt_intr_post(struct vcpu *v, struct hvm_intack intack);
void pt_migrate(struct vcpu *v);
++++++ 25922-x86-APICV-x2APIC.patch ++++++
References: FATE#313605
# HG changeset patch
# User Jiongxi Li
# Date 1347912362 -3600
# Node ID c2578dd96b8318e108fff0f340411135dedaa47d
# Parent 713b8849b11afa05f1dde157a3f5086fa3aaad08
xen: add virtual x2apic support for apicv
basically to benefit from apicv, we need clear MSR bitmap for
corresponding x2apic MSRs:
0x800 - 0x8ff: no read intercept for apicv register virtualization
TPR,EOI,SELF-IPI: no write intercept for virtual interrupt
delivery
Signed-off-by: Jiongxi Li
Committed-by: Keir Fraser
Index: xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmcs.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/vmx/vmcs.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmcs.c
@@ -658,7 +658,7 @@ static void vmx_set_host_env(struct vcpu
(unsigned long)&get_cpu_info()->guest_cpu_user_regs.error_code);
}
-void vmx_disable_intercept_for_msr(struct vcpu *v, u32 msr)
+void vmx_disable_intercept_for_msr(struct vcpu *v, u32 msr, int type)
{
unsigned long *msr_bitmap = v->arch.hvm_vmx.msr_bitmap;
@@ -673,14 +673,18 @@ void vmx_disable_intercept_for_msr(struc
*/
if ( msr <= 0x1fff )
{
- __clear_bit(msr, msr_bitmap + 0x000/BYTES_PER_LONG); /* read-low */
- __clear_bit(msr, msr_bitmap + 0x800/BYTES_PER_LONG); /* write-low */
+ if (type & MSR_TYPE_R)
+ __clear_bit(msr, msr_bitmap + 0x000/BYTES_PER_LONG); /* read-low */
+ if (type & MSR_TYPE_W)
+ __clear_bit(msr, msr_bitmap + 0x800/BYTES_PER_LONG); /* write-low */
}
else if ( (msr >= 0xc0000000) && (msr <= 0xc0001fff) )
{
msr &= 0x1fff;
- __clear_bit(msr, msr_bitmap + 0x400/BYTES_PER_LONG); /* read-high */
- __clear_bit(msr, msr_bitmap + 0xc00/BYTES_PER_LONG); /* write-high */
+ if (type & MSR_TYPE_R)
+ __clear_bit(msr, msr_bitmap + 0x400/BYTES_PER_LONG); /* read-high */
+ if (type & MSR_TYPE_W)
+ __clear_bit(msr, msr_bitmap + 0xc00/BYTES_PER_LONG); /* write-high */
}
}
@@ -776,13 +780,25 @@ static int construct_vmcs(struct vcpu *v
v->arch.hvm_vmx.msr_bitmap = msr_bitmap;
__vmwrite(MSR_BITMAP, virt_to_maddr(msr_bitmap));
- vmx_disable_intercept_for_msr(v, MSR_FS_BASE);
- vmx_disable_intercept_for_msr(v, MSR_GS_BASE);
- vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_CS);
- vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_ESP);
- vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP);
+ vmx_disable_intercept_for_msr(v, MSR_FS_BASE, MSR_TYPE_R | MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_GS_BASE, MSR_TYPE_R | MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_CS, MSR_TYPE_R | MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_ESP, MSR_TYPE_R | MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP, MSR_TYPE_R | MSR_TYPE_W);
if ( cpu_has_vmx_pat && paging_mode_hap(d) )
- vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, MSR_TYPE_R | MSR_TYPE_W);
+ if ( cpu_has_vmx_apic_reg_virt )
+ {
+ int msr;
+ for (msr = MSR_IA32_APICBASE_MSR; msr <= MSR_IA32_APICBASE_MSR + 0xff; msr++)
+ vmx_disable_intercept_for_msr(v, msr, MSR_TYPE_R);
+ }
+ if ( cpu_has_vmx_virtual_intr_delivery )
+ {
+ vmx_disable_intercept_for_msr(v, MSR_IA32_APICTPR_MSR, MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_APICEOI_MSR, MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_APICSELF_MSR, MSR_TYPE_W);
+ }
}
/* I/O access bitmap. */
Index: xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmx.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/vmx/vmx.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmx.c
@@ -2041,7 +2041,7 @@ static int vmx_msr_write_intercept(unsig
for ( ; (rc == 0) && lbr->count; lbr++ )
for ( i = 0; (rc == 0) && (i < lbr->count); i++ )
if ( (rc = vmx_add_guest_msr(lbr->base + i)) == 0 )
- vmx_disable_intercept_for_msr(v, lbr->base + i);
+ vmx_disable_intercept_for_msr(v, lbr->base + i, MSR_TYPE_R | MSR_TYPE_W);
}
if ( (rc < 0) ||
Index: xen-4.2.2-testing/xen/include/asm-x86/hvm/vmx/vmcs.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ xen-4.2.2-testing/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -407,7 +407,9 @@ enum vmcs_field {
#define VMCS_VPID_WIDTH 16
-void vmx_disable_intercept_for_msr(struct vcpu *v, u32 msr);
+#define MSR_TYPE_R 1
+#define MSR_TYPE_W 2
+void vmx_disable_intercept_for_msr(struct vcpu *v, u32 msr, int type);
int vmx_read_guest_msr(u32 msr, u64 *val);
int vmx_write_guest_msr(u32 msr, u64 val);
int vmx_add_guest_msr(u32 msr);
Index: xen-4.2.2-testing/xen/include/asm-x86/msr-index.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/msr-index.h
+++ xen-4.2.2-testing/xen/include/asm-x86/msr-index.h
@@ -293,6 +293,9 @@
#define MSR_IA32_APICBASE_ENABLE (1<<11)
#define MSR_IA32_APICBASE_BASE (0xfffff<<12)
#define MSR_IA32_APICBASE_MSR 0x800
+#define MSR_IA32_APICTPR_MSR 0x808
+#define MSR_IA32_APICEOI_MSR 0x80b
+#define MSR_IA32_APICSELF_MSR 0x83f
#define MSR_IA32_UCODE_WRITE 0x00000079
#define MSR_IA32_UCODE_REV 0x0000008b
++++++ 25952-x86-MMIO-remap-permissions.patch ++++++
# HG changeset patch
# User Daniel De Graaf
# Date 1348653367 -7200
# Node ID 8278d7d8fa485996f51134c5265fceaf239adf6a
# Parent b83f414ccf7a6e4e077a10bc422cf3f6c7d30566
x86: check remote MMIO remap permissions
When a domain is mapping pages from a different pg_owner domain, the
iomem_access checks are currently only applied to the pg_owner domain,
potentially allowing a domain with a more restrictive iomem_access
policy to have the pages mapped into its page tables. To catch this,
also check the owner of the page tables. The current domain does not
need to be checked because the ability to manipulate a domain's page
tables implies full access to the target domain, so checking that
domain's permission is sufficient.
Signed-off-by: Daniel De Graaf
Committed-by: Jan Beulich
Index: xen-4.2.2-testing/xen/arch/x86/mm.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/mm.c
+++ xen-4.2.2-testing/xen/arch/x86/mm.c
@@ -884,6 +884,19 @@ get_page_from_l1e(
return -EINVAL;
}
+ if ( pg_owner != l1e_owner &&
+ !iomem_access_permitted(l1e_owner, mfn, mfn) )
+ {
+ if ( mfn != (PADDR_MASK >> PAGE_SHIFT) ) /* INVALID_MFN? */
+ {
+ MEM_LOG("Dom%u attempted to map I/O space %08lx in dom%u to dom%u",
+ curr->domain->domain_id, mfn, pg_owner->domain_id,
+ l1e_owner->domain_id);
+ return -EPERM;
+ }
+ return -EINVAL;
+ }
+
if ( !(l1f & _PAGE_RW) ||
!rangeset_contains_singleton(mmio_ro_ranges, mfn) )
return 0;
++++++ 25957-x86-TSC-adjust-HVM.patch ++++++
References: FATE#313633
# HG changeset patch
# User Liu, Jinsong
# Date 1348654362 -7200
# Node ID c47ef9592fb39325e33f8406b4bd736cc84482e5
# Parent 5d63c633a60b9a1d695594f9c17cf933240bec81
x86: Implement TSC adjust feature for HVM guest
IA32_TSC_ADJUST MSR is maintained separately for each logical
processor. A logical processor maintains and uses the IA32_TSC_ADJUST
MSR as follows:
1). On RESET, the value of the IA32_TSC_ADJUST MSR is 0;
2). If an execution of WRMSR to the IA32_TIME_STAMP_COUNTER MSR adds
(or subtracts) value X from the TSC, the logical processor also
adds (or subtracts) value X from the IA32_TSC_ADJUST MSR;
3). If an execution of WRMSR to the IA32_TSC_ADJUST MSR adds (or
subtracts) value X from that MSR, the logical processor also adds
(or subtracts) value X from the TSC.
This patch provides tsc adjust support for hvm guest, with it guest OS
would be happy when sync tsc.
Signed-off-by: Liu, Jinsong
Committed-by: Jan Beulich
Index: xen-4.2.2-testing/xen/arch/x86/hvm/hvm.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/hvm.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/hvm.c
@@ -244,6 +244,7 @@ int hvm_set_guest_pat(struct vcpu *v, u6
void hvm_set_guest_tsc(struct vcpu *v, u64 guest_tsc)
{
uint64_t tsc;
+ uint64_t delta_tsc;
if ( v->domain->arch.vtsc )
{
@@ -255,10 +256,22 @@ void hvm_set_guest_tsc(struct vcpu *v, u
rdtscll(tsc);
}
- v->arch.hvm_vcpu.cache_tsc_offset = guest_tsc - tsc;
+ delta_tsc = guest_tsc - tsc;
+ v->arch.hvm_vcpu.msr_tsc_adjust += delta_tsc
+ - v->arch.hvm_vcpu.cache_tsc_offset;
+ v->arch.hvm_vcpu.cache_tsc_offset = delta_tsc;
+
hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
}
+void hvm_set_guest_tsc_adjust(struct vcpu *v, u64 tsc_adjust)
+{
+ v->arch.hvm_vcpu.cache_tsc_offset += tsc_adjust
+ - v->arch.hvm_vcpu.msr_tsc_adjust;
+ hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
+ v->arch.hvm_vcpu.msr_tsc_adjust = tsc_adjust;
+}
+
u64 hvm_get_guest_tsc(struct vcpu *v)
{
uint64_t tsc;
@@ -277,6 +290,11 @@ u64 hvm_get_guest_tsc(struct vcpu *v)
return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
}
+u64 hvm_get_guest_tsc_adjust(struct vcpu *v)
+{
+ return v->arch.hvm_vcpu.msr_tsc_adjust;
+}
+
void hvm_migrate_timers(struct vcpu *v)
{
rtc_migrate_timers(v);
@@ -2798,6 +2816,10 @@ int hvm_msr_read_intercept(unsigned int
*msr_content = hvm_get_guest_tsc(v);
break;
+ case MSR_IA32_TSC_ADJUST:
+ *msr_content = hvm_get_guest_tsc_adjust(v);
+ break;
+
case MSR_TSC_AUX:
*msr_content = hvm_msr_tsc_aux(v);
break;
@@ -2911,6 +2933,10 @@ int hvm_msr_write_intercept(unsigned int
hvm_set_guest_tsc(v, msr_content);
break;
+ case MSR_IA32_TSC_ADJUST:
+ hvm_set_guest_tsc_adjust(v, msr_content);
+ break;
+
case MSR_TSC_AUX:
v->arch.hvm_vcpu.msr_tsc_aux = (uint32_t)msr_content;
if ( cpu_has_rdtscp
@@ -3482,6 +3508,8 @@ void hvm_vcpu_reset_state(struct vcpu *v
v->domain->vcpu[0]->arch.hvm_vcpu.cache_tsc_offset;
hvm_funcs.set_tsc_offset(v, v->arch.hvm_vcpu.cache_tsc_offset);
+ v->arch.hvm_vcpu.msr_tsc_adjust = 0;
+
paging_update_paging_modes(v);
v->arch.flags |= TF_kernel_mode;
Index: xen-4.2.2-testing/xen/include/asm-x86/hvm/vcpu.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/hvm/vcpu.h
+++ xen-4.2.2-testing/xen/include/asm-x86/hvm/vcpu.h
@@ -137,6 +137,7 @@ struct hvm_vcpu {
struct hvm_vcpu_asid n1asid;
u32 msr_tsc_aux;
+ u64 msr_tsc_adjust;
/* VPMU */
struct vpmu_struct vpmu;
Index: xen-4.2.2-testing/xen/include/asm-x86/msr-index.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/msr-index.h
+++ xen-4.2.2-testing/xen/include/asm-x86/msr-index.h
@@ -286,6 +286,7 @@
#define MSR_IA32_PLATFORM_ID 0x00000017
#define MSR_IA32_EBL_CR_POWERON 0x0000002a
#define MSR_IA32_EBC_FREQUENCY_ID 0x0000002c
+#define MSR_IA32_TSC_ADJUST 0x0000003b
#define MSR_IA32_APICBASE 0x0000001b
#define MSR_IA32_APICBASE_BSP (1<<8)
++++++ 25958-x86-TSC-adjust-sr.patch ++++++
References: FATE#313633
# HG changeset patch
# User Liu, Jinsong
# Date 1348654418 -7200
# Node ID 56fb977ce6eb4626a02d4a7a34e85009bb8ee3e0
# Parent c47ef9592fb39325e33f8406b4bd736cc84482e5
x86: Save/restore TSC adjust during HVM guest migration
Signed-off-by: Liu, Jinsong
Committed-by: Jan Beulich
--- a/tools/misc/xen-hvmctx.c
+++ b/tools/misc/xen-hvmctx.c
@@ -390,6 +390,13 @@ static void dump_vmce_vcpu(void)
printf(" VMCE_VCPU: caps %" PRIx64 "\n", p.caps);
}
+static void dump_tsc_adjust(void)
+{
+ HVM_SAVE_TYPE(TSC_ADJUST) p;
+ READ(p);
+ printf(" TSC_ADJUST: tsc_adjust %" PRIx64 "\n", p.tsc_adjust);
+}
+
int main(int argc, char **argv)
{
int entry, domid;
@@ -457,6 +464,7 @@ int main(int argc, char **argv)
case HVM_SAVE_CODE(VIRIDIAN_DOMAIN): dump_viridian_domain(); break;
case HVM_SAVE_CODE(VIRIDIAN_VCPU): dump_viridian_vcpu(); break;
case HVM_SAVE_CODE(VMCE_VCPU): dump_vmce_vcpu(); break;
+ case HVM_SAVE_CODE(TSC_ADJUST): dump_tsc_adjust(); break;
case HVM_SAVE_CODE(END): break;
default:
printf(" ** Don't understand type %u: skipping\n",
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -610,6 +610,46 @@ void hvm_domain_destroy(struct domain *d
hvm_destroy_cacheattr_region_list(d);
}
+static int hvm_save_tsc_adjust(struct domain *d, hvm_domain_context_t *h)
+{
+ struct vcpu *v;
+ struct hvm_tsc_adjust ctxt;
+ int err = 0;
+
+ for_each_vcpu ( d, v )
+ {
+ ctxt.tsc_adjust = v->arch.hvm_vcpu.msr_tsc_adjust;
+ err = hvm_save_entry(TSC_ADJUST, v->vcpu_id, h, &ctxt);
+ if ( err )
+ break;
+ }
+
+ return err;
+}
+
+static int hvm_load_tsc_adjust(struct domain *d, hvm_domain_context_t *h)
+{
+ unsigned int vcpuid = hvm_load_instance(h);
+ struct vcpu *v;
+ struct hvm_tsc_adjust ctxt;
+
+ if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
+ {
+ dprintk(XENLOG_G_ERR, "HVM restore: dom%d has no vcpu%u\n",
+ d->domain_id, vcpuid);
+ return -EINVAL;
+ }
+
+ if ( hvm_load_entry(TSC_ADJUST, h, &ctxt) != 0 )
+ return -EINVAL;
+
+ v->arch.hvm_vcpu.msr_tsc_adjust = ctxt.tsc_adjust;
+ return 0;
+}
+
+HVM_REGISTER_SAVE_RESTORE(TSC_ADJUST, hvm_save_tsc_adjust,
+ hvm_load_tsc_adjust, 1, HVMSR_PER_VCPU);
+
static int hvm_save_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
{
struct vcpu *v;
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -581,9 +581,15 @@ struct hvm_vmce_vcpu {
DECLARE_HVM_SAVE_TYPE(VMCE_VCPU, 18, struct hvm_vmce_vcpu);
+struct hvm_tsc_adjust {
+ uint64_t tsc_adjust;
+};
+
+DECLARE_HVM_SAVE_TYPE(TSC_ADJUST, 19, struct hvm_tsc_adjust);
+
/*
* Largest type-code in use
*/
-#define HVM_SAVE_CODE_MAX 18
+#define HVM_SAVE_CODE_MAX 19
#endif /* __XEN_PUBLIC_HVM_SAVE_X86_H__ */
++++++ 25959-x86-TSC-adjust-expose.patch ++++++
References: FATE#313633
# HG changeset patch
# User Liu, Jinsong
# Date 1348654470 -7200
# Node ID 3aa66543a51ba77cb73e8c874e2416d065426a22
# Parent 56fb977ce6eb4626a02d4a7a34e85009bb8ee3e0
x86: Expose TSC adjust to HVM guest
Intel latest SDM (17.13.3) release a new MSR CPUID.7.0.EBX[1]=1
indicates TSC_ADJUST MSR 0x3b is supported.
This patch expose it to hvm guest.
Signed-off-by: Liu, Jinsong
Committed-by: Jan Beulich
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -128,6 +128,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx) */
#define X86_FEATURE_FSGSBASE 0 /* {RD,WR}{FS,GS}BASE instructions */
+#define X86_FEATURE_TSC_ADJUST 1 /* Tsc thread offset */
#define X86_FEATURE_BMI1 3 /* 1st group bit manipulation extensions */
#define X86_FEATURE_HLE 4 /* Hardware Lock Elision */
#define X86_FEATURE_AVX2 5 /* AVX2 instructions */
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -362,7 +362,8 @@ static void xc_cpuid_hvm_policy(
case 0x00000007: /* Intel-defined CPU features */
if ( input[1] == 0 ) {
- regs[1] &= (bitmaskof(X86_FEATURE_BMI1) |
+ regs[1] &= (bitmaskof(X86_FEATURE_TSC_ADJUST) |
+ bitmaskof(X86_FEATURE_BMI1) |
bitmaskof(X86_FEATURE_HLE) |
bitmaskof(X86_FEATURE_AVX2) |
bitmaskof(X86_FEATURE_SMEP) |
++++++ 25975-x86-IvyBridge.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1349172840 -7200
# Node ID 87bf99fad7a9f018530d13213f57610621838085
# Parent 5fbdbf585f5f2ee9a3e3c75a8a9f9f2cc6eda65c
x86/Intel: add further support for Ivy Bridge CPU models
And some initial Haswell ones at once.
Signed-off-by: Jan Beulich
Acked-by: "Nakajima, Jun"
Index: xen-4.2.2-testing/xen/arch/x86/acpi/cpu_idle.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/acpi/cpu_idle.c
+++ xen-4.2.2-testing/xen/arch/x86/acpi/cpu_idle.c
@@ -105,11 +105,15 @@ static void do_get_hw_residencies(void *
switch ( c->x86_model )
{
- /* Ivy bridge */
- case 0x3A:
/* Sandy bridge */
case 0x2A:
case 0x2D:
+ /* Ivy bridge */
+ case 0x3A:
+ case 0x3E:
+ /* Haswell */
+ case 0x3C:
+ case 0x45:
GET_PC2_RES(hw_res->pc2);
GET_CC7_RES(hw_res->cc7);
/* fall through */
Index: xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmx.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/vmx/vmx.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmx.c
@@ -1825,7 +1825,9 @@ static const struct lbr_info *last_branc
/* Sandy Bridge */
case 42: case 45:
/* Ivy Bridge */
- case 58:
+ case 58: case 62:
+ /* Haswell */
+ case 60: case 69:
return nh_lbr;
break;
/* Atom */
Index: xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vpmu_core2.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/vmx/vpmu_core2.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vpmu_core2.c
@@ -747,6 +747,7 @@ int vmx_vpmu_initialise(struct vcpu *v,
case 46:
case 47:
case 58:
+ case 62:
ret = core2_vpmu_initialise(v, vpmu_flags);
if ( !ret )
vpmu->arch_vpmu_ops = &core2_vpmu_ops;
++++++ 26077-stubdom_fix_compile_errors_in_grub.patch ++++++
changeset: 26077:33348baecf37
user: Olaf Hering
date: Thu Oct 18 09:34:59 2012 +0100
files: stubdom/grub.patches/70compiler_warnings.diff
description:
stubdom: fix compile errors in grub
Building xen.rpm in SLES11 started to fail due to these compiler
warnings:
[ 1436s] ../grub-upstream/netboot/fsys_tftp.c:213: warning: operation on 'block' may be undefined
[ 1437s] ../grub-upstream/netboot/main.c:444: warning: operation on 'block' may be undefined
[ 1234s] E: xen sequence-point ../grub-upstream/netboot/fsys_tftp.c:213
[ 1234s] E: xen sequence-point ../grub-upstream/netboot/main.c:444
The reason for this is that the assignment is done twice:
tp.u.ack.block = ((uint16_t)( (((uint16_t)((block = prevblock)) & (uint16_t)0x00ffU) << 8) | (((uint16_t)((block = prevblock)) & (uint16_t)0xff00U) >> 8)));
Fix this package build error by adding another patch for grub, which
moves the assignment out of the macro usage.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 8dcab28b8081 -r 33348baecf37 stubdom/grub.patches/70compiler_warnings.diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/stubdom/grub.patches/70compiler_warnings.diff Thu Oct 18 09:34:59 2012 +0100
@@ -0,0 +1,45 @@
+[ 1436s] ../grub-upstream/netboot/fsys_tftp.c:213: warning: operation on 'block' may be undefined
+[ 1437s] ../grub-upstream/netboot/main.c:444: warning: operation on 'block' may be undefined
+
+[ 1234s] E: xen sequence-point ../grub-upstream/netboot/fsys_tftp.c:213
+[ 1234s] E: xen sequence-point ../grub-upstream/netboot/main.c:444
+
+---
+ netboot/fsys_tftp.c | 5 ++++-
+ netboot/main.c | 5 ++++-
+ 2 files changed, 8 insertions(+), 2 deletions(-)
+
+Index: grub-0.97/netboot/fsys_tftp.c
+===================================================================
+--- grub-0.97.orig/netboot/fsys_tftp.c
++++ grub-0.97/netboot/fsys_tftp.c
+@@ -209,8 +209,11 @@ buf_fill (int abort)
+ break;
+
+ if ((block || bcounter) && (block != prevblock + (unsigned short) 1))
++ {
++ block = prevblock;
+ /* Block order should be continuous */
+- tp.u.ack.block = htons (block = prevblock);
++ tp.u.ack.block = htons (block);
++ }
+
+ /* Should be continuous. */
+ tp.opcode = abort ? htons (TFTP_ERROR) : htons (TFTP_ACK);
+Index: grub-0.97/netboot/main.c
+===================================================================
+--- grub-0.97.orig/netboot/main.c
++++ grub-0.97/netboot/main.c
+@@ -440,8 +440,11 @@ tftp (const char *name, int (*fnc) (unsi
+ break;
+
+ if ((block || bcounter) && (block != prevblock + 1))
++ {
++ block = prevblock;
+ /* Block order should be continuous */
+- tp.u.ack.block = htons (block = prevblock);
++ tp.u.ack.block = htons (block);
++ }
+
+ /* Should be continuous. */
+ tp.opcode = htons (TFTP_ACK);
++++++ 26078-hotplug-Linux_remove_hotplug_support_rely_on_udev_instead.patch ++++++
changeset: 26078:019ca95dfa34
user: Olaf Hering
date: Thu Oct 18 09:35:00 2012 +0100
files: Makefile README install.sh tools/hotplug/Linux/Makefile tools/hotplug/Linux/xen-backend.agent
description:
hotplug/Linux: remove hotplug support, rely on udev instead
Hotplug has been replaced by udev since several years. Remove the
hotplug related files and install udev unconditionally.
This makes it possible to remove udev from rpm BuildRequires which
reduces the buildtime dependency chain. For openSuSE:Factory it was
done just now:
http://lists.opensuse.org/opensuse-buildservice/2012-10/msg00085.html
The patch by itself will have no practical impact unless someone
attempts to build and run a Xen dom0 on a really old base system. e.g.
circa SLES9/2007 or earlier
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 33348baecf37 -r 019ca95dfa34 Makefile
--- a/Makefile Thu Oct 18 09:34:59 2012 +0100
+++ b/Makefile Thu Oct 18 09:35:00 2012 +0100
@@ -223,7 +223,6 @@ uninstall:
$(MAKE) -C xen uninstall
rm -rf $(D)$(CONFIG_DIR)/init.d/xendomains $(D)$(CONFIG_DIR)/init.d/xend
rm -rf $(D)$(CONFIG_DIR)/init.d/xencommons $(D)$(CONFIG_DIR)/init.d/xen-watchdog
- rm -rf $(D)$(CONFIG_DIR)/hotplug/xen-backend.agent
rm -f $(D)$(CONFIG_DIR)/udev/rules.d/xen-backend.rules
rm -f $(D)$(CONFIG_DIR)/udev/rules.d/xend.rules
rm -f $(D)$(SYSCONFIG_DIR)/xendomains
diff -r 33348baecf37 -r 019ca95dfa34 README
--- a/README Thu Oct 18 09:34:59 2012 +0100
+++ b/README Thu Oct 18 09:35:00 2012 +0100
@@ -54,7 +54,7 @@ provided by your OS distributor:
* pkg-config
* bridge-utils package (/sbin/brctl)
* iproute package (/sbin/ip)
- * hotplug or udev
+ * udev
* GNU bison and GNU flex
* GNU gettext
* 16-bit x86 assembler, loader and compiler (dev86 rpm or bin86 & bcc debs)
@@ -120,9 +120,9 @@ 4. To rebuild an existing tree without m
make install and make dist differ in that make install does the
right things for your local machine (installing the appropriate
- version of hotplug or udev scripts, for example), but make dist
- includes all versions of those scripts, so that you can copy the dist
- directory to another machine and install from that distribution.
+ version of udev scripts, for example), but make dist includes all
+ versions of those scripts, so that you can copy the dist directory
+ to another machine and install from that distribution.
Python Runtime Libraries
========================
diff -r 33348baecf37 -r 019ca95dfa34 install.sh
--- a/install.sh Thu Oct 18 09:34:59 2012 +0100
+++ b/install.sh Thu Oct 18 09:35:00 2012 +0100
@@ -27,20 +27,6 @@ echo "Installing Xen from '$src' to '$ds
echo "Installing Xen from '$src' to '$dst'..."
(cd $src; tar -cf - * ) | tar -C "$tmp" -xf -
-[ -x "$(which udevinfo)" ] && \
- UDEV_VERSION=$(udevinfo -V | sed -e 's/^[^0-9]* \([0-9]\{1,\}\)[^0-9]\{0,\}/\1/')
-
-[ -z "$UDEV_VERSION" -a -x /sbin/udevadm ] && \
- UDEV_VERSION=$(/sbin/udevadm info -V | awk '{print $NF}')
-
-if [ -n "$UDEV_VERSION" ] && [ $UDEV_VERSION -ge 059 ]; then
- echo " - installing for udev-based system"
- rm -rf "$tmp/etc/hotplug"
-else
- echo " - installing for hotplug-based system"
- rm -rf "$tmp/etc/udev"
-fi
-
echo " - modifying permissions"
chmod -R a+rX "$tmp"
diff -r 33348baecf37 -r 019ca95dfa34 tools/hotplug/Linux/Makefile
--- a/tools/hotplug/Linux/Makefile Thu Oct 18 09:34:59 2012 +0100
+++ b/tools/hotplug/Linux/Makefile Thu Oct 18 09:35:00 2012 +0100
@@ -27,31 +27,8 @@ XEN_SCRIPT_DATA += block-common.sh vtpm-
XEN_SCRIPT_DATA += block-common.sh vtpm-common.sh vtpm-hotplug-common.sh
XEN_SCRIPT_DATA += vtpm-migration.sh vtpm-impl
-XEN_HOTPLUG_DIR = $(CONFIG_DIR)/hotplug
-XEN_HOTPLUG_SCRIPTS = xen-backend.agent
-
-UDEVVER = 0
-ifeq ($(shell [ -x /sbin/udevadm ] && echo 1),1)
-UDEVVER = $(shell /sbin/udevadm info -V | sed -e 's/^[^0-9]* \([0-9]\{1,\}\)[^0-9]\{0,\}/\1/' )
-endif
-ifeq ($(shell [ -x /usr/bin/udevinfo ] && echo 1),1)
-UDEVVER = $(shell /usr/bin/udevinfo -V | sed -e 's/^[^0-9]* \([0-9]\{1,\}\)[^0-9]\{0,\}/\1/' )
-endif
-
UDEV_RULES_DIR = $(CONFIG_DIR)/udev
UDEV_RULES = xen-backend.rules xend.rules
-
-DI = $(if $(DISTDIR),$(shell readlink -f $(DISTDIR)),)
-DE = $(if $(DESTDIR),$(shell readlink -f $(DESTDIR)),)
-ifeq ($(findstring $(DI),$(DE)),$(DI))
-HOTPLUGS=install-hotplug install-udev
-else
-ifeq ($(shell [ $(UDEVVER) -ge 059 ] && echo 1),1)
-HOTPLUGS=install-udev
-else
-HOTPLUGS=install-hotplug
-endif
-endif
.PHONY: all
all:
@@ -60,7 +37,7 @@ build:
build:
.PHONY: install
-install: all install-initd install-scripts $(HOTPLUGS)
+install: all install-initd install-scripts install-udev
# See docs/misc/distro_mapping.txt for INITD_DIR location
.PHONY: install-initd
@@ -87,15 +64,6 @@ install-scripts:
$(INSTALL_DATA) $$i $(DESTDIR)$(XEN_SCRIPT_DIR); \
done
-.PHONY: install-hotplug
-install-hotplug:
- [ -d $(DESTDIR)$(XEN_HOTPLUG_DIR) ] || \
- $(INSTALL_DIR) $(DESTDIR)$(XEN_HOTPLUG_DIR)
- set -e; for i in $(XEN_HOTPLUG_SCRIPTS); \
- do \
- $(INSTALL_PROG) $$i $(DESTDIR)$(XEN_HOTPLUG_DIR); \
- done
-
.PHONY: install-udev
install-udev:
[ -d $(DESTDIR)$(UDEV_RULES_DIR) ] || \
diff -r 33348baecf37 -r 019ca95dfa34 tools/hotplug/Linux/xen-backend.agent
--- a/tools/hotplug/Linux/xen-backend.agent Thu Oct 18 09:34:59 2012 +0100
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,39 +0,0 @@
-#! /bin/bash
-
-PATH=/etc/xen/scripts:$PATH
-
-. /etc/xen/scripts/locking.sh
-
-claim_lock xenbus_hotplug_global
-
-case "$XENBUS_TYPE" in
- tap)
- /etc/xen/scripts/blktap "$ACTION"
- ;;
- vbd)
- /etc/xen/scripts/block "$ACTION"
- ;;
- vtpm)
- /etc/xen/scripts/vtpm "$ACTION"
- ;;
- vif)
- [ -n "$script" ] && $script "$ACTION"
- ;;
- vscsi)
- /etc/xen/scripts/vscsi "$ACTION"
- ;;
-esac
-
-case "$ACTION" in
- add)
- ;;
- remove)
- /etc/xen/scripts/xen-hotplug-cleanup
- ;;
- online)
- ;;
- offline)
- ;;
-esac
-
-release_lock xenbus_hotplug_global
++++++ 26079-hotplug-Linux_close_lockfd_after_lock_attempt.patch ++++++
changeset: 26079:b3b03536789a
user: Olaf Hering
date: Thu Oct 18 09:35:01 2012 +0100
files: tools/hotplug/Linux/locking.sh
description:
hotplug/Linux: close lockfd after lock attempt
When a HVM guest is shutdown some of the 'remove' events can not claim
the lock for some reason. Instead they try to grab the lock in a busy
loop, until udev reaps the xen-hotplug-cleanup helper.
After analyzing the resulting logfile its not obvious what the cause is.
The only explanation is that bash (?) gets confused if the same lockfd
is opened again and again. Closing it in each iteration seem to fix the
issue.
This was observed with sles11sp2 (bash 3.2) and 4.2 xend.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
[ ijc -- added the comment ]
Committed-by: Ian Campbell
diff -r 019ca95dfa34 -r b3b03536789a tools/hotplug/Linux/locking.sh
--- a/tools/hotplug/Linux/locking.sh Thu Oct 18 09:35:00 2012 +0100
+++ b/tools/hotplug/Linux/locking.sh Thu Oct 18 09:35:01 2012 +0100
@@ -59,6 +59,9 @@ claim_lock()
print "y\n" if $fd_inum eq $file_inum;
' "$_lockfile" )
if [ x$rightfile = xy ]; then break; fi
+ # Some versions of bash appear to be buggy if the same
+ # $_lockfile is opened repeatedly. Close the current fd here.
+ eval "exec $_lockfd<&-"
done
}
++++++ 26081-stubdom_fix_rpmlint_warning_spurious-executable-perm.patch ++++++
changeset: 26081:02064298ebcb
user: Olaf Hering
date: Thu Oct 18 09:35:03 2012 +0100
files: stubdom/Makefile
description:
stubdom: fix rpmlint warning spurious-executable-perm
[ 1758s] xen-tools.x86_64: E: spurious-executable-perm (Badness: 50) /usr/lib/xen/boot/xenstore-stubdom.gz
[ 1758s] The file is installed with executable permissions, but was identified as one
[ 1758s] that probably should not be executable. Verify if the executable bits are
[ 1758s] desired, and remove if not. NOTE: example scripts should be packaged under
[ 1758s] %docdir/examples, which will avoid this warning.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 25b2f53d2583 -r 02064298ebcb stubdom/Makefile
--- a/stubdom/Makefile Thu Oct 18 09:35:02 2012 +0100
+++ b/stubdom/Makefile Thu Oct 18 09:35:03 2012 +0100
@@ -396,7 +396,7 @@ install-grub: pv-grub
install-xenstore: xenstore-stubdom
$(INSTALL_DIR) "$(DESTDIR)/usr/lib/xen/boot"
- $(INSTALL_PROG) mini-os-$(XEN_TARGET_ARCH)-xenstore/mini-os.gz "$(DESTDIR)/usr/lib/xen/boot/xenstore-stubdom.gz"
+ $(INSTALL_DATA) mini-os-$(XEN_TARGET_ARCH)-xenstore/mini-os.gz "$(DESTDIR)/usr/lib/xen/boot/xenstore-stubdom.gz"
#######
# clean
++++++ 26082-blktap2-libvhd_fix_rpmlint_warning_spurious-executable-perm.patch ++++++
changeset: 26082:8cf26ace9ca0
user: Olaf Hering
date: Thu Oct 18 09:35:03 2012 +0100
files: tools/blktap2/vhd/lib/Makefile
description:
blktap2/libvhd: fix rpmlint warning spurious-executable-perm
[ 1758s] xen-devel.x86_64: E: spurious-executable-perm (Badness: 50) /usr/lib64/libvhd.a
[ 1758s] The file is installed with executable permissions, but was identified as one
[ 1758s] that probably should not be executable. Verify if the executable bits are
[ 1758s] desired, and remove if not. NOTE: example scripts should be packaged under
[ 1758s] %docdir/examples, which will avoid this warning.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 02064298ebcb -r 8cf26ace9ca0 tools/blktap2/vhd/lib/Makefile
--- a/tools/blktap2/vhd/lib/Makefile Thu Oct 18 09:35:03 2012 +0100
+++ b/tools/blktap2/vhd/lib/Makefile Thu Oct 18 09:35:03 2012 +0100
@@ -68,7 +68,7 @@ libvhd.so.$(LIBVHD-MAJOR).$(LIBVHD-MINOR
install: all
$(INSTALL_DIR) -p $(DESTDIR)$(INST-DIR)
- $(INSTALL_PROG) libvhd.a $(DESTDIR)$(INST-DIR)
+ $(INSTALL_DATA) libvhd.a $(DESTDIR)$(INST-DIR)
$(INSTALL_PROG) libvhd.so.$(LIBVHD-MAJOR).$(LIBVHD-MINOR) $(DESTDIR)$(INST-DIR)
ln -sf libvhd.so.$(LIBVHD-MAJOR).$(LIBVHD-MINOR) $(DESTDIR)$(INST-DIR)/libvhd.so.$(LIBVHD-MAJOR)
ln -sf libvhd.so.$(LIBVHD-MAJOR) $(DESTDIR)$(INST-DIR)/libvhd.so
++++++ 26083-blktap_fix_rpmlint_warning_spurious-executable-perm.patch ++++++
changeset: 26083:3fbeb019d522
user: Olaf Hering
date: Thu Oct 18 09:35:04 2012 +0100
files: tools/blktap/lib/Makefile
description:
blktap: fix rpmlint warning spurious-executable-perm
[ 1758s] xen-devel.x86_64: E: spurious-executable-perm (Badness: 50) /usr/lib64/libblktap.a
[ 1758s] The file is installed with executable permissions, but was identified as one
[ 1758s] that probably should not be executable. Verify if the executable bits are
[ 1758s] desired, and remove if not. NOTE: example scripts should be packaged under
[ 1758s] %docdir/examples, which will avoid this warning.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 8cf26ace9ca0 -r 3fbeb019d522 tools/blktap/lib/Makefile
--- a/tools/blktap/lib/Makefile Thu Oct 18 09:35:03 2012 +0100
+++ b/tools/blktap/lib/Makefile Thu Oct 18 09:35:04 2012 +0100
@@ -23,23 +23,25 @@ OBJS_PIC = $(SRCS:.c=.opic)
OBJS_PIC = $(SRCS:.c=.opic)
IBINS :=
-LIB = libblktap.a libblktap.so.$(MAJOR).$(MINOR)
+LIB = libblktap.a
+LIB_SO = libblktap.so.$(MAJOR).$(MINOR)
.PHONY: all
-all: $(LIB)
+all: $(LIB) $(LIB_SO)
.PHONY: install
install: all
$(INSTALL_DIR) $(DESTDIR)$(LIBDIR)
$(INSTALL_DIR) $(DESTDIR)$(INCLUDEDIR)
- $(INSTALL_PROG) $(LIB) $(DESTDIR)$(LIBDIR)
+ $(INSTALL_PROG) $(LIB_SO) $(DESTDIR)$(LIBDIR)
+ $(INSTALL_DATA) $(LIB) $(DESTDIR)$(LIBDIR)
ln -sf libblktap.so.$(MAJOR).$(MINOR) $(DESTDIR)$(LIBDIR)/libblktap.so.$(MAJOR)
ln -sf libblktap.so.$(MAJOR) $(DESTDIR)$(LIBDIR)/libblktap.so
$(INSTALL_DATA) blktaplib.h $(DESTDIR)$(INCLUDEDIR)
.PHONY: clean
clean:
- rm -rf *.a *.so* *.o *.opic *.rpm $(LIB) *~ $(DEPS) xen TAGS
+ rm -rf *.a *.so* *.o *.opic *.rpm $(LIB) $(LIB_SO) *~ $(DEPS) xen TAGS
libblktap.so.$(MAJOR).$(MINOR): $(OBJS_PIC)
$(CC) $(LDFLAGS) -Wl,$(SONAME_LDFLAG) -Wl,$(SONAME) $(SHLIB_LDFLAGS) \
++++++ 26084-hotplug_install_hotplugpath.sh_as_data_file.patch ++++++
changeset: 26084:fe9a0eb9aaaa
user: Olaf Hering
date: Thu Oct 18 09:35:05 2012 +0100
files: tools/hotplug/common/Makefile
description:
hotplug: install hotplugpath.sh as data file
rpmlint complains a script helper which is only sourced:
[ 1875s] xen-tools.i586: W: script-without-shebang /etc/xen/scripts/hotplugpath.sh
[ 1875s] This text file has executable bits set or is located in a path dedicated for
[ 1875s] executables, but lacks a shebang and cannot thus be executed. If the file is
[ 1875s] meant to be an executable script, add the shebang, otherwise remove the
[ 1875s] executable bits or move the file elsewhere.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 3fbeb019d522 -r fe9a0eb9aaaa tools/hotplug/common/Makefile
--- a/tools/hotplug/common/Makefile Thu Oct 18 09:35:04 2012 +0100
+++ b/tools/hotplug/common/Makefile Thu Oct 18 09:35:05 2012 +0100
@@ -6,8 +6,8 @@ HOTPLUGPATH="hotplugpath.sh"
# OS-independent hotplug scripts go in this directory
# Xen scripts to go there.
-XEN_SCRIPTS = $(HOTPLUGPATH)
-XEN_SCRIPT_DATA =
+XEN_SCRIPTS =
+XEN_SCRIPT_DATA = $(HOTPLUGPATH)
genpath-target = $(call buildmakevars2file,$(HOTPLUGPATH))
$(eval $(genpath-target))
++++++ 26085-stubdom_install_stubdompath.sh_as_data_file.patch ++++++
changeset: 26085:e32f4301f384
user: Olaf Hering
date: Thu Oct 18 09:35:06 2012 +0100
files: stubdom/Makefile
description:
stubdom: install stubdompath.sh as data file
rpmlint complains a script helper which is only sourced:
[ 1875s] xen-tools.i586: W: script-without-shebang /usr/lib/xen/bin/stubdompath.sh
[ 1875s] This text file has executable bits set or is located in a path dedicated for
[ 1875s] executables, but lacks a shebang and cannot thus be executed. If the file is
[ 1875s] meant to be an executable script, add the shebang, otherwise remove the
[ 1875s] executable bits or move the file elsewhere.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r fe9a0eb9aaaa -r e32f4301f384 stubdom/Makefile
--- a/stubdom/Makefile Thu Oct 18 09:35:05 2012 +0100
+++ b/stubdom/Makefile Thu Oct 18 09:35:06 2012 +0100
@@ -386,7 +386,8 @@ install-readme:
install-ioemu: ioemu-stubdom
$(INSTALL_DIR) "$(DESTDIR)$(LIBEXEC)"
- $(INSTALL_PROG) stubdompath.sh stubdom-dm "$(DESTDIR)$(LIBEXEC)"
+ $(INSTALL_PROG) stubdom-dm "$(DESTDIR)$(LIBEXEC)"
+ $(INSTALL_DATA) stubdompath.sh "$(DESTDIR)$(LIBEXEC)"
$(INSTALL_DIR) "$(DESTDIR)$(XENFIRMWAREDIR)"
$(INSTALL_DATA) mini-os-$(XEN_TARGET_ARCH)-ioemu/mini-os.gz "$(DESTDIR)$(XENFIRMWAREDIR)/ioemu-stubdom.gz"
++++++ 26086-hotplug-Linux_correct_sysconfig_tag_in_xendomains.patch ++++++
changeset: 26086:ba6b1db89ec8
user: Olaf Hering
date: Thu Oct 18 09:35:07 2012 +0100
files: tools/hotplug/Linux/init.d/sysconfig.xendomains
description:
hotplug/Linux: correct sysconfig tag in xendomains
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r e32f4301f384 -r ba6b1db89ec8 tools/hotplug/Linux/init.d/sysconfig.xendomains
--- a/tools/hotplug/Linux/init.d/sysconfig.xendomains Thu Oct 18 09:35:06 2012 +0100
+++ b/tools/hotplug/Linux/init.d/sysconfig.xendomains Thu Oct 18 09:35:07 2012 +0100
@@ -1,4 +1,4 @@
-## Path: System/xen
+## Path: System/Virtualization
## Description: xen domain start/stop on boot
## Type: string
## Default:
++++++ 26087-hotplug-Linux_install_sysconfig_files_as_data_files.patch ++++++
changeset: 26087:6239ace16749
user: Olaf Hering
date: Thu Oct 18 09:35:07 2012 +0100
files: tools/hotplug/Linux/Makefile
description:
hotplug/Linux: install sysconfig files as data files
rpmlint complains about wrong permissions of config files:
[ 455s] xen-tools.i586: W: script-without-shebang /var/adm/fillup-templates/sysconfig.xendomains
[ 455s] xen-tools.i586: W: script-without-shebang /var/adm/fillup-templates/sysconfig.xencommons
[ 455s] This text file has executable bits set or is located in a path dedicated for
[ 455s] executables, but lacks a shebang and cannot thus be executed. If the file is
[ 455s] meant to be an executable script, add the shebang, otherwise remove the
[ 455s] executable bits or move the file elsewhere.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r ba6b1db89ec8 -r 6239ace16749 tools/hotplug/Linux/Makefile
--- a/tools/hotplug/Linux/Makefile Thu Oct 18 09:35:07 2012 +0100
+++ b/tools/hotplug/Linux/Makefile Thu Oct 18 09:35:07 2012 +0100
@@ -46,9 +46,9 @@ install-initd:
[ -d $(DESTDIR)$(SYSCONFIG_DIR) ] || $(INSTALL_DIR) $(DESTDIR)$(SYSCONFIG_DIR)
$(INSTALL_PROG) $(XEND_INITD) $(DESTDIR)$(INITD_DIR)
$(INSTALL_PROG) $(XENDOMAINS_INITD) $(DESTDIR)$(INITD_DIR)
- $(INSTALL_PROG) $(XENDOMAINS_SYSCONFIG) $(DESTDIR)$(SYSCONFIG_DIR)/xendomains
+ $(INSTALL_DATA) $(XENDOMAINS_SYSCONFIG) $(DESTDIR)$(SYSCONFIG_DIR)/xendomains
$(INSTALL_PROG) $(XENCOMMONS_INITD) $(DESTDIR)$(INITD_DIR)
- $(INSTALL_PROG) $(XENCOMMONS_SYSCONFIG) $(DESTDIR)$(SYSCONFIG_DIR)/xencommons
+ $(INSTALL_DATA) $(XENCOMMONS_SYSCONFIG) $(DESTDIR)$(SYSCONFIG_DIR)/xencommons
$(INSTALL_PROG) init.d/xen-watchdog $(DESTDIR)$(INITD_DIR)
.PHONY: install-scripts
++++++ 26114-pygrub-list-entries.patch ++++++
# HG changeset patch
# User Charles Arnold
# Date 1351249508 -3600
# Node ID 6f9e46917eb8771914041b98f714e8f485fca5ef
# Parent 03af0abd2b72dfab3f2e50dd502108de8603f741
pygrub: Add option to list grub entries
The argument to "--entry" allows 2 syntaxes, either directly the entry
number in menu.lst, or the whole string behind the "title" key word.
This poses the following issue:
title string because this string contains the kernel version, which
will change with a kernel update.
This patch adds [-l|--list-entries] as an argument to pygrub.
Signed-off-by: Charles Arnold
Acked-by: Ian Jackson
Committed-by: Ian Jackson
diff -r 03af0abd2b72 -r 6f9e46917eb8 tools/pygrub/src/pygrub
--- a/tools/pygrub/src/pygrub Fri Oct 26 12:03:12 2012 +0100
+++ b/tools/pygrub/src/pygrub Fri Oct 26 12:05:08 2012 +0100
@@ -595,7 +595,17 @@ def run_grub(file, entry, fs, cfg_args):
sel = g.run()
g = Grub(file, fs)
- if interactive:
+
+ if list_entries:
+ for i in range(len(g.cf.images)):
+ img = g.cf.images[i]
+ print "title: %s" % img.title
+ print " root: %s" % img.root
+ print " kernel: %s" % img.kernel[1]
+ print " args: %s" % img.args
+ print " initrd: %s" % img.initrd[1]
+
+ if interactive and not list_entries:
curses.wrapper(run_main)
else:
sel = g.cf.default
@@ -702,7 +712,7 @@ if __name__ == "__main__":
sel = None
def usage():
- print >> sys.stderr, "Usage: %s [-q|--quiet] [-i|--interactive] [-n|--not-really] [--output=] [--kernel=] [--ramdisk=] [--args=] [--entry=] [--output-directory=] [--output-format=sxp|simple|simple0] <image>" %(sys.argv[0],)
+ print >> sys.stderr, "Usage: %s [-q|--quiet] [-i|--interactive] [-l|--list-entries] [-n|--not-really] [--output=] [--kernel=] [--ramdisk=] [--args=] [--entry=] [--output-directory=] [--output-format=sxp|simple|simple0] <image>" %(sys.argv[0],)
def copy_from_image(fs, file_to_read, file_type, output_directory,
not_really):
@@ -736,8 +746,8 @@ if __name__ == "__main__":
dataoff += len(data)
try:
- opts, args = getopt.gnu_getopt(sys.argv[1:], 'qinh::',
- ["quiet", "interactive", "not-really", "help",
+ opts, args = getopt.gnu_getopt(sys.argv[1:], 'qilnh::',
+ ["quiet", "interactive", "list-entries", "not-really", "help",
"output=", "output-format=", "output-directory=",
"entry=", "kernel=",
"ramdisk=", "args=", "isconfig", "debug"])
@@ -753,6 +763,7 @@ if __name__ == "__main__":
output = None
entry = None
interactive = True
+ list_entries = False
isconfig = False
debug = False
not_really = False
@@ -771,6 +782,8 @@ if __name__ == "__main__":
interactive = False
elif o in ("-i", "--interactive"):
interactive = True
+ elif o in ("-l", "--list-entries"):
+ list_entries = True
elif o in ("-n", "--not-really"):
not_really = True
elif o in ("-h", "--help"):
@@ -855,6 +868,9 @@ if __name__ == "__main__":
fs = None
continue
+ if list_entries:
+ sys.exit(0)
+
# Did looping through partitions find us a kernel?
if not fs:
raise RuntimeError, "Unable to find partition containing kernel"
++++++ 26129-ACPI-BGRT-invalidate.patch ++++++
++++ 643 lines (skipped)
++++++ 26133-IOMMU-defer-BM-disable.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1352709367 -3600
# Node ID fdb69dd527cd01a46f87efb380050559dcf12d37
# Parent 286ef4ced2164f4e9bf52fd0c52248182e69a6e6
IOMMU: don't immediately disable bus mastering on faults
Instead, give the owning domain at least a small opportunity of fixing
things up, and allow for rare faults to not bring down the device at
all.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
Acked-by: Dario Faggioli
Index: xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_init.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/amd/iommu_init.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_init.c
@@ -564,7 +564,7 @@ static hw_irq_controller iommu_msi_type
static void parse_event_log_entry(struct amd_iommu *iommu, u32 entry[])
{
- u16 domain_id, device_id, bdf, cword;
+ u16 domain_id, device_id, bdf;
u32 code;
u64 *addr;
int count = 0;
@@ -615,18 +615,10 @@ static void parse_event_log_entry(struct
"fault address = 0x%"PRIx64"\n",
event_str[code-1], domain_id, device_id, *addr);
- /* Tell the device to stop DMAing; we can't rely on the guest to
- * control it for us. */
for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ )
if ( get_dma_requestor_id(iommu->seg, bdf) == device_id )
- {
- cword = pci_conf_read16(iommu->seg, PCI_BUS(bdf),
- PCI_SLOT(bdf), PCI_FUNC(bdf),
- PCI_COMMAND);
- pci_conf_write16(iommu->seg, PCI_BUS(bdf), PCI_SLOT(bdf),
- PCI_FUNC(bdf), PCI_COMMAND,
- cword & ~PCI_COMMAND_MASTER);
- }
+ pci_check_disable_device(iommu->seg, PCI_BUS(bdf),
+ PCI_DEVFN2(bdf));
}
else
{
Index: xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
@@ -218,6 +218,7 @@ static int device_assigned(u16 seg, u8 b
static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
{
struct hvm_iommu *hd = domain_hvm_iommu(d);
+ struct pci_dev *pdev;
int rc = 0;
if ( !iommu_enabled || !hd->platform_ops )
@@ -231,6 +232,10 @@ static int assign_device(struct domain *
return -EXDEV;
spin_lock(&pcidevs_lock);
+ pdev = pci_get_pdev(seg, bus, devfn);
+ if ( pdev )
+ pdev->fault.count = 0;
+
if ( (rc = hd->platform_ops->assign_device(d, seg, bus, devfn)) )
goto done;
@@ -382,6 +387,8 @@ int deassign_device(struct domain *d, u1
return ret;
}
+ pdev->fault.count = 0;
+
if ( !has_arch_pdevs(d) && need_iommu(d) )
{
d->need_iommu = 0;
Index: xen-4.2.2-testing/xen/drivers/passthrough/pci.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/pci.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/pci.c
@@ -637,6 +637,36 @@ int __init pci_device_detect(u16 seg, u8
return 1;
}
+void pci_check_disable_device(u16 seg, u8 bus, u8 devfn)
+{
+ struct pci_dev *pdev;
+ s_time_t now = NOW();
+ u16 cword;
+
+ spin_lock(&pcidevs_lock);
+ pdev = pci_get_pdev(seg, bus, devfn);
+ if ( pdev )
+ {
+ if ( now < pdev->fault.time ||
+ now - pdev->fault.time > MILLISECS(10) )
+ pdev->fault.count >>= 1;
+ pdev->fault.time = now;
+ if ( ++pdev->fault.count < PT_FAULT_THRESHOLD )
+ pdev = NULL;
+ }
+ spin_unlock(&pcidevs_lock);
+
+ if ( !pdev )
+ return;
+
+ /* Tell the device to stop DMAing; we can't rely on the guest to
+ * control it for us. */
+ cword = pci_conf_read16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ PCI_COMMAND);
+ pci_conf_write16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ PCI_COMMAND, cword & ~PCI_COMMAND_MASTER);
+}
+
/*
* scan pci devices to add all existed PCI devices to alldevs_list,
* and setup pci hierarchy in array bus2bridge.
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
@@ -936,7 +936,7 @@ static void __do_iommu_page_fault(struct
while (1)
{
u8 fault_reason;
- u16 source_id, cword;
+ u16 source_id;
u32 data;
u64 guest_addr;
int type;
@@ -969,14 +969,8 @@ static void __do_iommu_page_fault(struct
iommu_page_fault_do_one(iommu, type, fault_reason,
source_id, guest_addr);
- /* Tell the device to stop DMAing; we can't rely on the guest to
- * control it for us. */
- cword = pci_conf_read16(iommu->intel->drhd->segment,
- PCI_BUS(source_id), PCI_SLOT(source_id),
- PCI_FUNC(source_id), PCI_COMMAND);
- pci_conf_write16(iommu->intel->drhd->segment, PCI_BUS(source_id),
- PCI_SLOT(source_id), PCI_FUNC(source_id),
- PCI_COMMAND, cword & ~PCI_COMMAND_MASTER);
+ pci_check_disable_device(iommu->intel->drhd->segment,
+ PCI_BUS(source_id), PCI_DEVFN2(source_id));
fault_index++;
if ( fault_index > cap_num_fault_regs(iommu->cap) )
Index: xen-4.2.2-testing/xen/include/xen/pci.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/pci.h
+++ xen-4.2.2-testing/xen/include/xen/pci.h
@@ -65,6 +65,11 @@ struct pci_dev {
const u8 devfn;
struct pci_dev_info info;
struct arch_pci_dev arch;
+ struct {
+ s_time_t time;
+ unsigned int count;
+#define PT_FAULT_THRESHOLD 10
+ } fault;
u64 vf_rlen[6];
};
@@ -107,6 +112,7 @@ void arch_pci_ro_device(int seg, int bdf
struct pci_dev *pci_get_pdev(int seg, int bus, int devfn);
struct pci_dev *pci_get_pdev_by_domain(
struct domain *, int seg, int bus, int devfn);
+void pci_check_disable_device(u16 seg, u8 bus, u8 devfn);
uint8_t pci_conf_read8(
unsigned int seg, unsigned int bus, unsigned int dev, unsigned int func,
++++++ 26189-xenstore-chmod.patch ++++++
# HG changeset patch
# Parent 8b93ac0c93f3fb8a140b4688ba71841ac927d4e3
xenstore-chmod: handle arbitrary number of perms rather than MAX_PERMS constant
Constant MAX_PERMS 16 is too small to use in some occasions, e.g. if
there are more than 16 domU(s) on one hypervisor (it's easy to
achieve) and one wants to do xenstore-chmod PATH to all domU(s). So,
remove MAX_PERMS limitation and make it as arbitrary number of perms.
Signed-off-by: Chunyan Liu
Acked-by: Ian Campbell
diff -r 8b93ac0c93f3 tools/xenstore/xenstore_client.c
--- a/tools/xenstore/xenstore_client.c Tue Nov 13 11:19:17 2012 +0000
+++ b/tools/xenstore/xenstore_client.c Mon Nov 26 11:33:38 2012 +0800
@@ -25,7 +25,6 @@
#define PATH_SEP '/'
#define MAX_PATH_LEN 256
-#define MAX_PERMS 16
enum mode {
MODE_unknown,
@@ -407,44 +406,41 @@ perform(enum mode mode, int optind, int
output("%s\n", list[i]);
}
free(list);
- optind++;
- break;
- }
- case MODE_ls: {
- do_ls(xsh, argv[optind], 0, prefix);
- optind++;
- break;
+ optind++;
+ break;
+ }
+ case MODE_ls: {
+ do_ls(xsh, argv[optind], 0, prefix);
+ optind++;
+ break;
}
case MODE_chmod: {
- struct xs_permissions perms[MAX_PERMS];
- int nperms = 0;
/* save path pointer: */
char *path = argv[optind++];
- for (; argv[optind]; optind++, nperms++)
+ int nperms = argc - optind;
+ struct xs_permissions perms[nperms];
+ int i;
+ for (i = 0; argv[optind]; optind++, i++)
{
- if (MAX_PERMS <= nperms)
- errx(1, "Too many permissions specified. "
- "Maximum per invocation is %d.", MAX_PERMS);
-
- perms[nperms].id = atoi(argv[optind]+1);
+ perms[i].id = atoi(argv[optind]+1);
switch (argv[optind][0])
{
case 'n':
- perms[nperms].perms = XS_PERM_NONE;
+ perms[i].perms = XS_PERM_NONE;
break;
case 'r':
- perms[nperms].perms = XS_PERM_READ;
+ perms[i].perms = XS_PERM_READ;
break;
case 'w':
- perms[nperms].perms = XS_PERM_WRITE;
+ perms[i].perms = XS_PERM_WRITE;
break;
case 'b':
- perms[nperms].perms = XS_PERM_READ | XS_PERM_WRITE;
+ perms[i].perms = XS_PERM_READ | XS_PERM_WRITE;
break;
default:
errx(1, "Invalid permission specification: '%c'",
- argv[optind][0]);
+ argv[optind][0]);
}
}
++++++ 26262-x86-EFI-secure-shim.patch ++++++
# HG changeset patch
# User Jan Beulich
# Date 1354884272 -3600
# Node ID b62bd62b26836fafe19cf41fec194bcf33e2ead6
# Parent cb542e58da25211843eb79998ea8568ebe9c8056
x86/EFI: add code interfacing with the secure boot shim
... to validate the kernel image (which is required to be in PE
format, as is e.g. the case for the Linux bzImage when built with
CONFIG_EFI_STUB).
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/arch/x86/efi/boot.c
+++ b/xen/arch/x86/efi/boot.c
@@ -24,6 +24,18 @@
#include
#include
+#define SHIM_LOCK_PROTOCOL_GUID \
+ { 0x605dab50, 0xe046, 0x4300, {0xab, 0xb6, 0x3d, 0xd8, 0x10, 0xdd, 0x8b, 0x23} }
+
+typedef EFI_STATUS
+(/* _not_ EFIAPI */ *EFI_SHIM_LOCK_VERIFY) (
+ IN VOID *Buffer,
+ IN UINT32 Size);
+
+typedef struct {
+ EFI_SHIM_LOCK_VERIFY Verify;
+} EFI_SHIM_LOCK_PROTOCOL;
+
extern char start[];
extern u32 cpuid_ext_features;
@@ -628,12 +640,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
static EFI_GUID __initdata gop_guid = EFI_GRAPHICS_OUTPUT_PROTOCOL_GUID;
static EFI_GUID __initdata bio_guid = BLOCK_IO_PROTOCOL;
static EFI_GUID __initdata devp_guid = DEVICE_PATH_PROTOCOL;
+ static EFI_GUID __initdata shim_lock_guid = SHIM_LOCK_PROTOCOL_GUID;
EFI_LOADED_IMAGE *loaded_image;
EFI_STATUS status;
unsigned int i, argc;
CHAR16 **argv, *file_name, *cfg_file_name = NULL;
UINTN cols, rows, depth, size, map_key, info_size, gop_mode = ~0;
EFI_HANDLE *handles = NULL;
+ EFI_SHIM_LOCK_PROTOCOL *shim_lock;
EFI_GRAPHICS_OUTPUT_PROTOCOL *gop = NULL;
EFI_GRAPHICS_OUTPUT_MODE_INFORMATION *mode_info;
EFI_FILE_HANDLE dir_handle;
@@ -823,6 +837,11 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
read_file(dir_handle, s2w(&name), &kernel);
efi_bs->FreePool(name.w);
+ if ( !EFI_ERROR(efi_bs->LocateProtocol(&shim_lock_guid, NULL,
+ (void **)&shim_lock)) &&
+ shim_lock->Verify(kernel.ptr, kernel.size) != EFI_SUCCESS )
+ blexit(L"Dom0 kernel image could not be verified\r\n");
+
name.s = get_value(&cfg, section.s, "ramdisk");
if ( name.s )
{
++++++ 26324-IOMMU-assign-params.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559364 -3600
# Node ID 62dd78a4e3fc9d190840549f13b4d613f2d19c41
# Parent 64b36dde26bc3c4fc80312cc9eeb0e511f0cf94b
IOMMU: adjust (re)assign operation parameters
... to use a (struct pci_dev *, devfn) pair.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
Index: xen-4.2.2-testing/xen/drivers/passthrough/amd/pci_amd_iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -333,34 +333,31 @@ void amd_iommu_disable_domain_device(str
disable_ats_device(iommu->seg, bus, devfn);
}
-static int reassign_device( struct domain *source, struct domain *target,
- u16 seg, u8 bus, u8 devfn)
+static int reassign_device(struct domain *source, struct domain *target,
+ u8 devfn, struct pci_dev *pdev)
{
- struct pci_dev *pdev;
struct amd_iommu *iommu;
int bdf;
struct hvm_iommu *t = domain_hvm_iommu(target);
- ASSERT(spin_is_locked(&pcidevs_lock));
- pdev = pci_get_pdev_by_domain(source, seg, bus, devfn);
- if ( !pdev )
- return -ENODEV;
-
- bdf = PCI_BDF2(bus, devfn);
- iommu = find_iommu_for_device(seg, bdf);
+ bdf = PCI_BDF2(pdev->bus, pdev->devfn);
+ iommu = find_iommu_for_device(pdev->seg, bdf);
if ( !iommu )
{
AMD_IOMMU_DEBUG("Fail to find iommu."
" %04x:%02x:%x02.%x cannot be assigned to dom%d\n",
- seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
target->domain_id);
return -ENODEV;
}
amd_iommu_disable_domain_device(source, iommu, bdf);
- list_move(&pdev->domain_list, &target->arch.pdev_list);
- pdev->domain = target;
+ if ( devfn == pdev->devfn )
+ {
+ list_move(&pdev->domain_list, &target->arch.pdev_list);
+ pdev->domain = target;
+ }
/* IO page tables might be destroyed after pci-detach the last device
* In this case, we have to re-allocate root table for next pci-attach.*/
@@ -369,17 +366,18 @@ static int reassign_device( struct domai
amd_iommu_setup_domain_device(target, iommu, bdf);
AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to dom%d\n",
- seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
source->domain_id, target->domain_id);
return 0;
}
-static int amd_iommu_assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
+static int amd_iommu_assign_device(struct domain *d, u8 devfn,
+ struct pci_dev *pdev)
{
- struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(seg);
- int bdf = (bus << 8) | devfn;
- int req_id = get_dma_requestor_id(seg, bdf);
+ struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(pdev->seg);
+ int bdf = PCI_BDF2(pdev->bus, devfn);
+ int req_id = get_dma_requestor_id(pdev->seg, bdf);
if ( ivrs_mappings[req_id].unity_map_enable )
{
@@ -391,7 +389,7 @@ static int amd_iommu_assign_device(struc
ivrs_mappings[req_id].read_permission);
}
- return reassign_device(dom0, d, seg, bus, devfn);
+ return reassign_device(dom0, d, devfn, pdev);
}
static void deallocate_next_page_table(struct page_info* pg, int level)
@@ -456,12 +454,6 @@ static void amd_iommu_domain_destroy(str
amd_iommu_flush_all_pages(d);
}
-static int amd_iommu_return_device(
- struct domain *s, struct domain *t, u16 seg, u8 bus, u8 devfn)
-{
- return reassign_device(s, t, seg, bus, devfn);
-}
-
static int amd_iommu_add_device(struct pci_dev *pdev)
{
struct amd_iommu *iommu;
@@ -601,7 +593,7 @@ const struct iommu_ops amd_iommu_ops = {
.teardown = amd_iommu_domain_destroy,
.map_page = amd_iommu_map_page,
.unmap_page = amd_iommu_unmap_page,
- .reassign_device = amd_iommu_return_device,
+ .reassign_device = reassign_device,
.get_device_group_id = amd_iommu_group_id,
.update_ire_from_apic = amd_iommu_ioapic_update_ire,
.update_ire_from_msi = amd_iommu_msi_msg_update_ire,
Index: xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
@@ -232,11 +232,16 @@ static int assign_device(struct domain *
return -EXDEV;
spin_lock(&pcidevs_lock);
- pdev = pci_get_pdev(seg, bus, devfn);
- if ( pdev )
- pdev->fault.count = 0;
+ pdev = pci_get_pdev_by_domain(dom0, seg, bus, devfn);
+ if ( !pdev )
+ {
+ rc = pci_get_pdev(seg, bus, devfn) ? -EBUSY : -ENODEV;
+ goto done;
+ }
+
+ pdev->fault.count = 0;
- if ( (rc = hd->platform_ops->assign_device(d, seg, bus, devfn)) )
+ if ( (rc = hd->platform_ops->assign_device(d, devfn, pdev)) )
goto done;
if ( has_arch_pdevs(d) && !need_iommu(d) )
@@ -367,18 +372,11 @@ int deassign_device(struct domain *d, u1
return -EINVAL;
ASSERT(spin_is_locked(&pcidevs_lock));
- pdev = pci_get_pdev(seg, bus, devfn);
+ pdev = pci_get_pdev_by_domain(d, seg, bus, devfn);
if ( !pdev )
return -ENODEV;
- if ( pdev->domain != d )
- {
- dprintk(XENLOG_ERR VTDPREFIX,
- "d%d: deassign a device not owned\n", d->domain_id);
- return -EINVAL;
- }
-
- ret = hd->platform_ops->reassign_device(d, dom0, seg, bus, devfn);
+ ret = hd->platform_ops->reassign_device(d, dom0, devfn, pdev);
if ( ret )
{
dprintk(XENLOG_ERR VTDPREFIX,
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
@@ -1689,17 +1689,10 @@ out:
static int reassign_device_ownership(
struct domain *source,
struct domain *target,
- u16 seg, u8 bus, u8 devfn)
+ u8 devfn, struct pci_dev *pdev)
{
- struct pci_dev *pdev;
int ret;
- ASSERT(spin_is_locked(&pcidevs_lock));
- pdev = pci_get_pdev_by_domain(source, seg, bus, devfn);
-
- if (!pdev)
- return -ENODEV;
-
/*
* Devices assigned to untrusted domains (here assumed to be any domU)
* can attempt to send arbitrary LAPIC/MSI messages. We are unprotected
@@ -1708,16 +1701,19 @@ static int reassign_device_ownership(
if ( (target != dom0) && !iommu_intremap )
untrusted_msi = 1;
- ret = domain_context_unmap(source, seg, bus, devfn);
+ ret = domain_context_unmap(source, pdev->seg, pdev->bus, devfn);
if ( ret )
return ret;
- ret = domain_context_mapping(target, seg, bus, devfn);
+ ret = domain_context_mapping(target, pdev->seg, pdev->bus, devfn);
if ( ret )
return ret;
- list_move(&pdev->domain_list, &target->arch.pdev_list);
- pdev->domain = target;
+ if ( devfn == pdev->devfn )
+ {
+ list_move(&pdev->domain_list, &target->arch.pdev_list);
+ pdev->domain = target;
+ }
return ret;
}
@@ -2222,36 +2218,26 @@ int __init intel_vtd_setup(void)
}
static int intel_iommu_assign_device(
- struct domain *d, u16 seg, u8 bus, u8 devfn)
+ struct domain *d, u8 devfn, struct pci_dev *pdev)
{
struct acpi_rmrr_unit *rmrr;
int ret = 0, i;
- struct pci_dev *pdev;
- u16 bdf;
+ u16 bdf, seg;
+ u8 bus;
if ( list_empty(&acpi_drhd_units) )
return -ENODEV;
- ASSERT(spin_is_locked(&pcidevs_lock));
- pdev = pci_get_pdev(seg, bus, devfn);
- if (!pdev)
- return -ENODEV;
-
- if (pdev->domain != dom0)
- {
- dprintk(XENLOG_ERR VTDPREFIX,
- "IOMMU: assign a assigned device\n");
- return -EBUSY;
- }
-
- ret = reassign_device_ownership(dom0, d, seg, bus, devfn);
+ ret = reassign_device_ownership(dom0, d, devfn, pdev);
if ( ret )
goto done;
/* FIXME: Because USB RMRR conflicts with guest bios region,
* ignore USB RMRR temporarily.
*/
- if ( is_usb_device(seg, bus, devfn) )
+ seg = pdev->seg;
+ bus = pdev->bus;
+ if ( is_usb_device(seg, bus, pdev->devfn) )
{
ret = 0;
goto done;
Index: xen-4.2.2-testing/xen/include/xen/iommu.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/iommu.h
+++ xen-4.2.2-testing/xen/include/xen/iommu.h
@@ -123,13 +123,13 @@ struct iommu_ops {
int (*add_device)(struct pci_dev *pdev);
int (*enable_device)(struct pci_dev *pdev);
int (*remove_device)(struct pci_dev *pdev);
- int (*assign_device)(struct domain *d, u16 seg, u8 bus, u8 devfn);
+ int (*assign_device)(struct domain *, u8 devfn, struct pci_dev *);
void (*teardown)(struct domain *d);
int (*map_page)(struct domain *d, unsigned long gfn, unsigned long mfn,
unsigned int flags);
int (*unmap_page)(struct domain *d, unsigned long gfn);
int (*reassign_device)(struct domain *s, struct domain *t,
- u16 seg, u8 bus, u8 devfn);
+ u8 devfn, struct pci_dev *);
int (*get_device_group_id)(u16 seg, u8 bus, u8 devfn);
void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
void (*update_ire_from_msi)(struct msi_desc *msi_desc, struct msi_msg *msg);
++++++ 26325-IOMMU-add-remove-params.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559482 -3600
# Node ID 75cc4943b1ff509c4074800a23ff51d773233b8a
# Parent 62dd78a4e3fc9d190840549f13b4d613f2d19c41
IOMMU: adjust add/remove operation parameters
... to use a (struct pci_dev *, devfn) pair.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
Index: xen-4.2.2-testing/xen/drivers/passthrough/amd/pci_amd_iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -83,14 +83,14 @@ static void disable_translation(u32 *dte
}
static void amd_iommu_setup_domain_device(
- struct domain *domain, struct amd_iommu *iommu, int bdf)
+ struct domain *domain, struct amd_iommu *iommu,
+ u8 devfn, struct pci_dev *pdev)
{
void *dte;
unsigned long flags;
int req_id, valid = 1;
int dte_i = 0;
- u8 bus = PCI_BUS(bdf);
- u8 devfn = PCI_DEVFN2(bdf);
+ u8 bus = pdev->bus;
struct hvm_iommu *hd = domain_hvm_iommu(domain);
@@ -103,7 +103,7 @@ static void amd_iommu_setup_domain_devic
dte_i = 1;
/* get device-table entry */
- req_id = get_dma_requestor_id(iommu->seg, bdf);
+ req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
dte = iommu->dev_table.buffer + (req_id * IOMMU_DEV_TABLE_ENTRY_SIZE);
spin_lock_irqsave(&iommu->lock, flags);
@@ -115,7 +115,7 @@ static void amd_iommu_setup_domain_devic
(u32 *)dte, page_to_maddr(hd->root_table), hd->domain_id,
hd->paging_mode, valid);
- if ( pci_ats_device(iommu->seg, bus, devfn) &&
+ if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
iommu_dte_set_iotlb((u32 *)dte, dte_i);
@@ -132,32 +132,31 @@ static void amd_iommu_setup_domain_devic
ASSERT(spin_is_locked(&pcidevs_lock));
- if ( pci_ats_device(iommu->seg, bus, devfn) &&
- !pci_ats_enabled(iommu->seg, bus, devfn) )
+ if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
+ !pci_ats_enabled(iommu->seg, bus, pdev->devfn) )
{
- struct pci_dev *pdev;
+ if ( devfn == pdev->devfn )
+ enable_ats_device(iommu->seg, bus, devfn);
- enable_ats_device(iommu->seg, bus, devfn);
-
- ASSERT(spin_is_locked(&pcidevs_lock));
- pdev = pci_get_pdev(iommu->seg, bus, devfn);
-
- ASSERT( pdev != NULL );
amd_iommu_flush_iotlb(pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
}
}
-static void __init amd_iommu_setup_dom0_device(struct pci_dev *pdev)
+static int __init amd_iommu_setup_dom0_device(u8 devfn, struct pci_dev *pdev)
{
int bdf = PCI_BDF2(pdev->bus, pdev->devfn);
struct amd_iommu *iommu = find_iommu_for_device(pdev->seg, bdf);
- if ( likely(iommu != NULL) )
- amd_iommu_setup_domain_device(pdev->domain, iommu, bdf);
- else
+ if ( unlikely(!iommu) )
+ {
AMD_IOMMU_DEBUG("No iommu for device %04x:%02x:%02x.%u\n",
pdev->seg, pdev->bus,
- PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+ PCI_SLOT(devfn), PCI_FUNC(devfn));
+ return -ENODEV;
+ }
+
+ amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
+ return 0;
}
int __init amd_iov_detect(void)
@@ -296,16 +295,16 @@ static void __init amd_iommu_dom0_init(s
}
void amd_iommu_disable_domain_device(struct domain *domain,
- struct amd_iommu *iommu, int bdf)
+ struct amd_iommu *iommu,
+ u8 devfn, struct pci_dev *pdev)
{
void *dte;
unsigned long flags;
int req_id;
- u8 bus = PCI_BUS(bdf);
- u8 devfn = PCI_DEVFN2(bdf);
+ u8 bus = pdev->bus;
BUG_ON ( iommu->dev_table.buffer == NULL );
- req_id = get_dma_requestor_id(iommu->seg, bdf);
+ req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
dte = iommu->dev_table.buffer + (req_id * IOMMU_DEV_TABLE_ENTRY_SIZE);
spin_lock_irqsave(&iommu->lock, flags);
@@ -313,7 +312,7 @@ void amd_iommu_disable_domain_device(str
{
disable_translation((u32 *)dte);
- if ( pci_ats_device(iommu->seg, bus, devfn) &&
+ if ( pci_ats_device(iommu->seg, bus, pdev->devfn) &&
iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
iommu_dte_set_iotlb((u32 *)dte, 0);
@@ -328,7 +327,8 @@ void amd_iommu_disable_domain_device(str
ASSERT(spin_is_locked(&pcidevs_lock));
- if ( pci_ats_device(iommu->seg, bus, devfn) &&
+ if ( devfn == pdev->devfn &&
+ pci_ats_device(iommu->seg, bus, devfn) &&
pci_ats_enabled(iommu->seg, bus, devfn) )
disable_ats_device(iommu->seg, bus, devfn);
}
@@ -351,7 +351,7 @@ static int reassign_device(struct domain
return -ENODEV;
}
- amd_iommu_disable_domain_device(source, iommu, bdf);
+ amd_iommu_disable_domain_device(source, iommu, devfn, pdev);
if ( devfn == pdev->devfn )
{
@@ -364,7 +364,7 @@ static int reassign_device(struct domain
if ( t->root_table == NULL )
allocate_domain_resources(t);
- amd_iommu_setup_domain_device(target, iommu, bdf);
+ amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to dom%d\n",
pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
source->domain_id, target->domain_id);
@@ -454,7 +454,7 @@ static void amd_iommu_domain_destroy(str
amd_iommu_flush_all_pages(d);
}
-static int amd_iommu_add_device(struct pci_dev *pdev)
+static int amd_iommu_add_device(u8 devfn, struct pci_dev *pdev)
{
struct amd_iommu *iommu;
u16 bdf;
@@ -467,16 +467,16 @@ static int amd_iommu_add_device(struct p
{
AMD_IOMMU_DEBUG("Fail to find iommu."
" %04x:%02x:%02x.%u cannot be assigned to dom%d\n",
- pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
- PCI_FUNC(pdev->devfn), pdev->domain->domain_id);
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ pdev->domain->domain_id);
return -ENODEV;
}
- amd_iommu_setup_domain_device(pdev->domain, iommu, bdf);
+ amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
return 0;
}
-static int amd_iommu_remove_device(struct pci_dev *pdev)
+static int amd_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
{
struct amd_iommu *iommu;
u16 bdf;
@@ -489,12 +489,12 @@ static int amd_iommu_remove_device(struc
{
AMD_IOMMU_DEBUG("Fail to find iommu."
" %04x:%02x:%02x.%u cannot be removed from dom%d\n",
- pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
- PCI_FUNC(pdev->devfn), pdev->domain->domain_id);
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ pdev->domain->domain_id);
return -ENODEV;
}
- amd_iommu_disable_domain_device(pdev->domain, iommu, bdf);
+ amd_iommu_disable_domain_device(pdev->domain, iommu, devfn, pdev);
return 0;
}
Index: xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
@@ -167,7 +167,7 @@ int iommu_add_device(struct pci_dev *pde
if ( !iommu_enabled || !hd->platform_ops )
return 0;
- return hd->platform_ops->add_device(pdev);
+ return hd->platform_ops->add_device(pdev->devfn, pdev);
}
int iommu_enable_device(struct pci_dev *pdev)
@@ -197,7 +197,7 @@ int iommu_remove_device(struct pci_dev *
if ( !iommu_enabled || !hd->platform_ops )
return 0;
- return hd->platform_ops->remove_device(pdev);
+ return hd->platform_ops->remove_device(pdev->devfn, pdev);
}
/*
Index: xen-4.2.2-testing/xen/drivers/passthrough/pci.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/pci.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/pci.c
@@ -715,7 +715,7 @@ int __init scan_pci_devices(void)
struct setup_dom0 {
struct domain *d;
- void (*handler)(struct pci_dev *);
+ int (*handler)(u8 devfn, struct pci_dev *);
};
static int __init _setup_dom0_pci_devices(struct pci_seg *pseg, void *arg)
@@ -734,7 +734,7 @@ static int __init _setup_dom0_pci_device
pdev->domain = ctxt->d;
list_add(&pdev->domain_list, &ctxt->d->arch.pdev_list);
- ctxt->handler(pdev);
+ ctxt->handler(devfn, pdev);
}
}
@@ -742,7 +742,7 @@ static int __init _setup_dom0_pci_device
}
void __init setup_dom0_pci_devices(
- struct domain *d, void (*handler)(struct pci_dev *))
+ struct domain *d, int (*handler)(u8 devfn, struct pci_dev *))
{
struct setup_dom0 ctxt = { .d = d, .handler = handler };
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
@@ -52,7 +52,7 @@ int nr_iommus;
static struct tasklet vtd_fault_tasklet;
-static void setup_dom0_device(struct pci_dev *);
+static int setup_dom0_device(u8 devfn, struct pci_dev *);
static void setup_dom0_rmrr(struct domain *d);
static int domain_iommu_domid(struct domain *d,
@@ -1904,7 +1904,7 @@ static int rmrr_identity_mapping(struct
return 0;
}
-static int intel_iommu_add_device(struct pci_dev *pdev)
+static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)
{
struct acpi_rmrr_unit *rmrr;
u16 bdf;
@@ -1915,8 +1915,7 @@ static int intel_iommu_add_device(struct
if ( !pdev->domain )
return -EINVAL;
- ret = domain_context_mapping(pdev->domain, pdev->seg, pdev->bus,
- pdev->devfn);
+ ret = domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, devfn);
if ( ret )
{
dprintk(XENLOG_ERR VTDPREFIX, "d%d: context mapping failed\n",
@@ -1928,7 +1927,7 @@ static int intel_iommu_add_device(struct
{
if ( rmrr->segment == pdev->seg &&
PCI_BUS(bdf) == pdev->bus &&
- PCI_DEVFN2(bdf) == pdev->devfn )
+ PCI_DEVFN2(bdf) == devfn )
{
ret = rmrr_identity_mapping(pdev->domain, rmrr);
if ( ret )
@@ -1953,7 +1952,7 @@ static int intel_iommu_enable_device(str
return ret >= 0 ? 0 : ret;
}
-static int intel_iommu_remove_device(struct pci_dev *pdev)
+static int intel_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
{
struct acpi_rmrr_unit *rmrr;
u16 bdf;
@@ -1971,19 +1970,22 @@ static int intel_iommu_remove_device(str
{
if ( rmrr->segment == pdev->seg &&
PCI_BUS(bdf) == pdev->bus &&
- PCI_DEVFN2(bdf) == pdev->devfn )
+ PCI_DEVFN2(bdf) == devfn )
return 0;
}
}
- return domain_context_unmap(pdev->domain, pdev->seg, pdev->bus,
- pdev->devfn);
+ return domain_context_unmap(pdev->domain, pdev->seg, pdev->bus, devfn);
}
-static void __init setup_dom0_device(struct pci_dev *pdev)
+static int __init setup_dom0_device(u8 devfn, struct pci_dev *pdev)
{
- domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, pdev->devfn);
- pci_vtd_quirk(pdev);
+ int err;
+
+ err = domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, devfn);
+ if ( !err && devfn == pdev->devfn )
+ pci_vtd_quirk(pdev);
+ return err;
}
void clear_fault_bits(struct iommu *iommu)
Index: xen-4.2.2-testing/xen/include/xen/iommu.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/iommu.h
+++ xen-4.2.2-testing/xen/include/xen/iommu.h
@@ -120,9 +120,9 @@ bool_t pt_irq_need_timer(uint32_t flags)
struct iommu_ops {
int (*init)(struct domain *d);
void (*dom0_init)(struct domain *d);
- int (*add_device)(struct pci_dev *pdev);
+ int (*add_device)(u8 devfn, struct pci_dev *);
int (*enable_device)(struct pci_dev *pdev);
- int (*remove_device)(struct pci_dev *pdev);
+ int (*remove_device)(u8 devfn, struct pci_dev *);
int (*assign_device)(struct domain *, u8 devfn, struct pci_dev *);
void (*teardown)(struct domain *d);
int (*map_page)(struct domain *d, unsigned long gfn, unsigned long mfn,
Index: xen-4.2.2-testing/xen/include/xen/pci.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/pci.h
+++ xen-4.2.2-testing/xen/include/xen/pci.h
@@ -101,7 +101,8 @@ struct pci_dev *pci_lock_pdev(int seg, i
struct pci_dev *pci_lock_domain_pdev(
struct domain *, int seg, int bus, int devfn);
-void setup_dom0_pci_devices(struct domain *, void (*)(struct pci_dev *));
+void setup_dom0_pci_devices(struct domain *,
+ int (*)(u8 devfn, struct pci_dev *));
void pci_release_devices(struct domain *d);
int pci_add_segment(u16 seg);
const unsigned long *pci_get_ro_map(u16 seg);
++++++ 26326-VT-d-context-map-params.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559549 -3600
# Node ID afb598bd0f5436bea15b7ef842e8ad5c6adefa1a
# Parent 75cc4943b1ff509c4074800a23ff51d773233b8a
VT-d: adjust context map/unmap parameters
... to use a (struct pci_dev *, devfn) pair.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/extern.h
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/extern.h
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/extern.h
@@ -95,7 +95,7 @@ void free_pgtable_maddr(u64 maddr);
void *map_vtd_domain_page(u64 maddr);
void unmap_vtd_domain_page(void *va);
int domain_context_mapping_one(struct domain *domain, struct iommu *iommu,
- u8 bus, u8 devfn);
+ u8 bus, u8 devfn, const struct pci_dev *);
int domain_context_unmap_one(struct domain *domain, struct iommu *iommu,
u8 bus, u8 devfn);
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
@@ -1308,7 +1308,7 @@ static void __init intel_iommu_dom0_init
int domain_context_mapping_one(
struct domain *domain,
struct iommu *iommu,
- u8 bus, u8 devfn)
+ u8 bus, u8 devfn, const struct pci_dev *pdev)
{
struct hvm_iommu *hd = domain_hvm_iommu(domain);
struct context_entry *context, *context_entries;
@@ -1325,11 +1325,9 @@ int domain_context_mapping_one(
if ( context_present(*context) )
{
int res = 0;
- struct pci_dev *pdev = NULL;
- /* First try to get domain ownership from device structure. If that's
+ /* Try to get domain ownership from device structure. If that's
* not available, try to read it from the context itself. */
- pdev = pci_get_pdev(seg, bus, devfn);
if ( pdev )
{
if ( pdev->domain != domain )
@@ -1448,13 +1446,12 @@ int domain_context_mapping_one(
}
static int domain_context_mapping(
- struct domain *domain, u16 seg, u8 bus, u8 devfn)
+ struct domain *domain, u8 devfn, const struct pci_dev *pdev)
{
struct acpi_drhd_unit *drhd;
int ret = 0;
u32 type;
- u8 secbus;
- struct pci_dev *pdev = pci_get_pdev(seg, bus, devfn);
+ u8 seg = pdev->seg, bus = pdev->bus, secbus;
drhd = acpi_find_matched_drhd_unit(pdev);
if ( !drhd )
@@ -1475,8 +1472,9 @@ static int domain_context_mapping(
dprintk(VTDPREFIX, "d%d:PCIe: map %04x:%02x:%02x.%u\n",
domain->domain_id, seg, bus,
PCI_SLOT(devfn), PCI_FUNC(devfn));
- ret = domain_context_mapping_one(domain, drhd->iommu, bus, devfn);
- if ( !ret && ats_device(pdev, drhd) > 0 )
+ ret = domain_context_mapping_one(domain, drhd->iommu, bus, devfn,
+ pdev);
+ if ( !ret && devfn == pdev->devfn && ats_device(pdev, drhd) > 0 )
enable_ats_device(seg, bus, devfn);
break;
@@ -1487,14 +1485,16 @@ static int domain_context_mapping(
domain->domain_id, seg, bus,
PCI_SLOT(devfn), PCI_FUNC(devfn));
- ret = domain_context_mapping_one(domain, drhd->iommu, bus, devfn);
+ ret = domain_context_mapping_one(domain, drhd->iommu, bus, devfn,
+ pdev);
if ( ret )
break;
if ( find_upstream_bridge(seg, &bus, &devfn, &secbus) < 1 )
break;
- ret = domain_context_mapping_one(domain, drhd->iommu, bus, devfn);
+ ret = domain_context_mapping_one(domain, drhd->iommu, bus, devfn,
+ pci_get_pdev(seg, bus, devfn));
/*
* Devices behind PCIe-to-PCI/PCIx bridge may generate different
@@ -1503,7 +1503,8 @@ static int domain_context_mapping(
*/
if ( !ret && pdev_type(seg, bus, devfn) == DEV_TYPE_PCIe2PCI_BRIDGE &&
(secbus != pdev->bus || pdev->devfn != 0) )
- ret = domain_context_mapping_one(domain, drhd->iommu, secbus, 0);
+ ret = domain_context_mapping_one(domain, drhd->iommu, secbus, 0,
+ pci_get_pdev(seg, secbus, 0));
break;
@@ -1576,18 +1577,15 @@ int domain_context_unmap_one(
}
static int domain_context_unmap(
- struct domain *domain, u16 seg, u8 bus, u8 devfn)
+ struct domain *domain, u8 devfn, const struct pci_dev *pdev)
{
struct acpi_drhd_unit *drhd;
struct iommu *iommu;
int ret = 0;
u32 type;
- u8 tmp_bus, tmp_devfn, secbus;
- struct pci_dev *pdev = pci_get_pdev(seg, bus, devfn);
+ u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
int found = 0;
- BUG_ON(!pdev);
-
drhd = acpi_find_matched_drhd_unit(pdev);
if ( !drhd )
return -ENODEV;
@@ -1607,7 +1605,7 @@ static int domain_context_unmap(
domain->domain_id, seg, bus,
PCI_SLOT(devfn), PCI_FUNC(devfn));
ret = domain_context_unmap_one(domain, iommu, bus, devfn);
- if ( !ret && ats_device(pdev, drhd) > 0 )
+ if ( !ret && devfn == pdev->devfn && ats_device(pdev, drhd) > 0 )
disable_ats_device(seg, bus, devfn);
break;
@@ -1701,11 +1699,11 @@ static int reassign_device_ownership(
if ( (target != dom0) && !iommu_intremap )
untrusted_msi = 1;
- ret = domain_context_unmap(source, pdev->seg, pdev->bus, devfn);
+ ret = domain_context_unmap(source, devfn, pdev);
if ( ret )
return ret;
- ret = domain_context_mapping(target, pdev->seg, pdev->bus, devfn);
+ ret = domain_context_mapping(target, devfn, pdev);
if ( ret )
return ret;
@@ -1915,7 +1913,7 @@ static int intel_iommu_add_device(u8 dev
if ( !pdev->domain )
return -EINVAL;
- ret = domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, devfn);
+ ret = domain_context_mapping(pdev->domain, devfn, pdev);
if ( ret )
{
dprintk(XENLOG_ERR VTDPREFIX, "d%d: context mapping failed\n",
@@ -1975,14 +1973,14 @@ static int intel_iommu_remove_device(u8
}
}
- return domain_context_unmap(pdev->domain, pdev->seg, pdev->bus, devfn);
+ return domain_context_unmap(pdev->domain, devfn, pdev);
}
static int __init setup_dom0_device(u8 devfn, struct pci_dev *pdev)
{
int err;
- err = domain_context_mapping(pdev->domain, pdev->seg, pdev->bus, devfn);
+ err = domain_context_mapping(pdev->domain, devfn, pdev);
if ( !err && devfn == pdev->devfn )
pci_vtd_quirk(pdev);
return err;
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/quirks.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/quirks.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/quirks.c
@@ -319,7 +319,7 @@ static void map_me_phantom_function(stru
/* map or unmap ME phantom function */
if ( map )
domain_context_mapping_one(domain, drhd->iommu, 0,
- PCI_DEVFN(dev, 7));
+ PCI_DEVFN(dev, 7), NULL);
else
domain_context_unmap_one(domain, drhd->iommu, 0,
PCI_DEVFN(dev, 7));
++++++ 26327-AMD-IOMMU-flush-params.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559599 -3600
# Node ID 2a2c63f641ee3bda4ad552eb0b3ea479d37590cc
# Parent afb598bd0f5436bea15b7ef842e8ad5c6adefa1a
AMD IOMMU: adjust flush function parameters
... to use a (struct pci_dev *, devfn) pair.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -287,12 +287,12 @@ void invalidate_iommu_all(struct amd_iom
send_iommu_command(iommu, cmd);
}
-void amd_iommu_flush_iotlb(struct pci_dev *pdev,
+void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
uint64_t gaddr, unsigned int order)
{
unsigned long flags;
struct amd_iommu *iommu;
- unsigned int bdf, req_id, queueid, maxpend;
+ unsigned int req_id, queueid, maxpend;
struct pci_ats_dev *ats_pdev;
if ( !ats_enabled )
@@ -305,8 +305,8 @@ void amd_iommu_flush_iotlb(struct pci_de
if ( !pci_ats_enabled(ats_pdev->seg, ats_pdev->bus, ats_pdev->devfn) )
return;
- bdf = PCI_BDF2(ats_pdev->bus, ats_pdev->devfn);
- iommu = find_iommu_for_device(ats_pdev->seg, bdf);
+ iommu = find_iommu_for_device(ats_pdev->seg,
+ PCI_BDF2(ats_pdev->bus, ats_pdev->devfn));
if ( !iommu )
{
@@ -319,7 +319,7 @@ void amd_iommu_flush_iotlb(struct pci_de
if ( !iommu_has_cap(iommu, PCI_CAP_IOTLB_SHIFT) )
return;
- req_id = get_dma_requestor_id(iommu->seg, bdf);
+ req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(ats_pdev->bus, devfn));
queueid = req_id;
maxpend = ats_pdev->ats_queue_depth & 0xff;
@@ -339,7 +339,7 @@ static void amd_iommu_flush_all_iotlbs(s
return;
for_each_pdev( d, pdev )
- amd_iommu_flush_iotlb(pdev, gaddr, order);
+ amd_iommu_flush_iotlb(pdev->devfn, pdev, gaddr, order);
}
/* Flush iommu cache after p2m changes. */
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -138,7 +138,7 @@ static void amd_iommu_setup_domain_devic
if ( devfn == pdev->devfn )
enable_ats_device(iommu->seg, bus, devfn);
- amd_iommu_flush_iotlb(pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
+ amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
}
}
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -78,8 +78,8 @@ void iommu_dte_set_guest_cr3(u32 *dte, u
void amd_iommu_flush_all_pages(struct domain *d);
void amd_iommu_flush_pages(struct domain *d, unsigned long gfn,
unsigned int order);
-void amd_iommu_flush_iotlb(struct pci_dev *pdev, uint64_t gaddr,
- unsigned int order);
+void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
+ uint64_t gaddr, unsigned int order);
void amd_iommu_flush_device(struct amd_iommu *iommu, uint16_t bdf);
void amd_iommu_flush_intremap(struct amd_iommu *iommu, uint16_t bdf);
void amd_iommu_flush_all_caches(struct amd_iommu *iommu);
++++++ 26328-IOMMU-pdev-type.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559679 -3600
# Node ID 11fa145c880ee814aaf56a7f47f47ee3e5560c7c
# Parent 2a2c63f641ee3bda4ad552eb0b3ea479d37590cc
IOMMU/PCI: consolidate pdev_type() and cache its result for a given device
Add an "unknown" device types as well as one for PCI-to-PCIe bridges
(the latter of which other IOMMU code with or without this patch
doesn't appear to handle properly).
Make sure we don't mistake a device for which we can't access its
config space as a legacy PCI device (after all we in fact don't know
how to deal with such a device, and hence shouldn't try to).
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
Index: xen-4.2.2-testing/xen/drivers/passthrough/pci.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/pci.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/pci.c
@@ -144,7 +144,7 @@ static struct pci_dev *alloc_pdev(struct
spin_lock_init(&pdev->msix_table_lock);
/* update bus2bridge */
- switch ( pdev_type(pseg->nr, bus, devfn) )
+ switch ( pdev->type = pdev_type(pseg->nr, bus, devfn) )
{
u8 sec_bus, sub_bus;
@@ -184,7 +184,7 @@ static struct pci_dev *alloc_pdev(struct
static void free_pdev(struct pci_seg *pseg, struct pci_dev *pdev)
{
/* update bus2bridge */
- switch ( pdev_type(pseg->nr, pdev->bus, pdev->devfn) )
+ switch ( pdev->type )
{
u8 dev, func, sec_bus, sub_bus;
@@ -202,6 +202,9 @@ static void free_pdev(struct pci_seg *ps
pseg->bus2bridge[sec_bus] = pseg->bus2bridge[pdev->bus];
spin_unlock(&pseg->bus2bridge_lock);
break;
+
+ default:
+ break;
}
list_del(&pdev->alldevs_list);
@@ -563,20 +566,30 @@ void pci_release_devices(struct domain *
#define PCI_CLASS_BRIDGE_PCI 0x0604
-int pdev_type(u16 seg, u8 bus, u8 devfn)
+enum pdev_type pdev_type(u16 seg, u8 bus, u8 devfn)
{
u16 class_device, creg;
u8 d = PCI_SLOT(devfn), f = PCI_FUNC(devfn);
int pos = pci_find_cap_offset(seg, bus, d, f, PCI_CAP_ID_EXP);
class_device = pci_conf_read16(seg, bus, d, f, PCI_CLASS_DEVICE);
- if ( class_device == PCI_CLASS_BRIDGE_PCI )
+ switch ( class_device )
{
+ case PCI_CLASS_BRIDGE_PCI:
if ( !pos )
return DEV_TYPE_LEGACY_PCI_BRIDGE;
creg = pci_conf_read16(seg, bus, d, f, pos + PCI_EXP_FLAGS);
- return ((creg & PCI_EXP_FLAGS_TYPE) >> 4) == PCI_EXP_TYPE_PCI_BRIDGE ?
- DEV_TYPE_PCIe2PCI_BRIDGE : DEV_TYPE_PCIe_BRIDGE;
+ switch ( (creg & PCI_EXP_FLAGS_TYPE) >> 4 )
+ {
+ case PCI_EXP_TYPE_PCI_BRIDGE:
+ return DEV_TYPE_PCIe2PCI_BRIDGE;
+ case PCI_EXP_TYPE_PCIE_BRIDGE:
+ return DEV_TYPE_PCI2PCIe_BRIDGE;
+ }
+ return DEV_TYPE_PCIe_BRIDGE;
+
+ case 0x0000: case 0xffff:
+ return DEV_TYPE_PCI_UNKNOWN;
}
return pos ? DEV_TYPE_PCIe_ENDPOINT : DEV_TYPE_PCI;
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/intremap.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/intremap.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/intremap.c
@@ -426,7 +426,6 @@ void io_apic_write_remap_rte(
static void set_msi_source_id(struct pci_dev *pdev, struct iremap_entry *ire)
{
- int type;
u16 seg;
u8 bus, devfn, secbus;
int ret;
@@ -437,8 +436,7 @@ static void set_msi_source_id(struct pci
seg = pdev->seg;
bus = pdev->bus;
devfn = pdev->devfn;
- type = pdev_type(seg, bus, devfn);
- switch ( type )
+ switch ( pdev->type )
{
case DEV_TYPE_PCIe_BRIDGE:
case DEV_TYPE_PCIe2PCI_BRIDGE:
@@ -470,7 +468,7 @@ static void set_msi_source_id(struct pci
default:
dprintk(XENLOG_WARNING VTDPREFIX,
"d%d: unknown(%u): %04x:%02x:%02x.%u\n",
- pdev->domain->domain_id, type,
+ pdev->domain->domain_id, pdev->type,
seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
break;
}
Index: xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/vtd/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/vtd/iommu.c
@@ -1450,7 +1450,6 @@ static int domain_context_mapping(
{
struct acpi_drhd_unit *drhd;
int ret = 0;
- u32 type;
u8 seg = pdev->seg, bus = pdev->bus, secbus;
drhd = acpi_find_matched_drhd_unit(pdev);
@@ -1459,8 +1458,7 @@ static int domain_context_mapping(
ASSERT(spin_is_locked(&pcidevs_lock));
- type = pdev_type(seg, bus, devfn);
- switch ( type )
+ switch ( pdev->type )
{
case DEV_TYPE_PCIe_BRIDGE:
case DEV_TYPE_PCIe2PCI_BRIDGE:
@@ -1510,7 +1508,7 @@ static int domain_context_mapping(
default:
dprintk(XENLOG_ERR VTDPREFIX, "d%d:unknown(%u): %04x:%02x:%02x.%u\n",
- domain->domain_id, type,
+ domain->domain_id, pdev->type,
seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
ret = -EINVAL;
break;
@@ -1582,7 +1580,6 @@ static int domain_context_unmap(
struct acpi_drhd_unit *drhd;
struct iommu *iommu;
int ret = 0;
- u32 type;
u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
int found = 0;
@@ -1591,8 +1588,7 @@ static int domain_context_unmap(
return -ENODEV;
iommu = drhd->iommu;
- type = pdev_type(seg, bus, devfn);
- switch ( type )
+ switch ( pdev->type )
{
case DEV_TYPE_PCIe_BRIDGE:
case DEV_TYPE_PCIe2PCI_BRIDGE:
@@ -1639,7 +1635,7 @@ static int domain_context_unmap(
default:
dprintk(XENLOG_ERR VTDPREFIX, "d%d:unknown(%u): %04x:%02x:%02x.%u\n",
- domain->domain_id, type,
+ domain->domain_id, pdev->type,
seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
ret = -EINVAL;
goto out;
Index: xen-4.2.2-testing/xen/include/xen/pci.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/pci.h
+++ xen-4.2.2-testing/xen/include/xen/pci.h
@@ -63,6 +63,17 @@ struct pci_dev {
const u16 seg;
const u8 bus;
const u8 devfn;
+
+ enum pdev_type {
+ DEV_TYPE_PCI_UNKNOWN,
+ DEV_TYPE_PCIe_ENDPOINT,
+ DEV_TYPE_PCIe_BRIDGE, // PCIe root port, switch
+ DEV_TYPE_PCIe2PCI_BRIDGE, // PCIe-to-PCI/PCIx bridge
+ DEV_TYPE_PCI2PCIe_BRIDGE, // PCI/PCIx-to-PCIe bridge
+ DEV_TYPE_LEGACY_PCI_BRIDGE, // Legacy PCI bridge
+ DEV_TYPE_PCI,
+ } type;
+
struct pci_dev_info info;
struct arch_pci_dev arch;
struct {
@@ -84,18 +95,10 @@ struct pci_dev {
extern spinlock_t pcidevs_lock;
-enum {
- DEV_TYPE_PCIe_ENDPOINT,
- DEV_TYPE_PCIe_BRIDGE, // PCIe root port, switch
- DEV_TYPE_PCIe2PCI_BRIDGE, // PCIe-to-PCI/PCIx bridge
- DEV_TYPE_LEGACY_PCI_BRIDGE, // Legacy PCI bridge
- DEV_TYPE_PCI,
-};
-
bool_t pci_known_segment(u16 seg);
int pci_device_detect(u16 seg, u8 bus, u8 dev, u8 func);
int scan_pci_devices(void);
-int pdev_type(u16 seg, u8 bus, u8 devfn);
+enum pdev_type pdev_type(u16 seg, u8 bus, u8 devfn);
int find_upstream_bridge(u16 seg, u8 *bus, u8 *devfn, u8 *secbus);
struct pci_dev *pci_lock_pdev(int seg, int bus, int devfn);
struct pci_dev *pci_lock_domain_pdev(
Index: xen-4.2.2-testing/xen/include/xen/pci_regs.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/pci_regs.h
+++ xen-4.2.2-testing/xen/include/xen/pci_regs.h
@@ -371,6 +371,9 @@
#define PCI_EXP_TYPE_UPSTREAM 0x5 /* Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM 0x6 /* Downstream Port */
#define PCI_EXP_TYPE_PCI_BRIDGE 0x7 /* PCI/PCI-X Bridge */
+#define PCI_EXP_TYPE_PCIE_BRIDGE 0x8 /* PCI/PCI-X to PCIE Bridge */
+#define PCI_EXP_TYPE_RC_END 0x9 /* Root Complex Integrated Endpoint */
+#define PCI_EXP_TYPE_RC_EC 0xa /* Root Complex Event Collector */
#define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
#define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
#define PCI_EXP_DEVCAP 4 /* Device capabilities */
++++++ 26329-IOMMU-phantom-dev.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559742 -3600
# Node ID c9a01b396cb4eaedef30e9a6ed615115a9f8bfc5
# Parent 11fa145c880ee814aaf56a7f47f47ee3e5560c7c
IOMMU: add phantom function support
Apart from generating device context entries for the base function,
all phantom functions also need context entries to be generated for
them.
In order to distinguish different use cases, a variant of
pci_get_pdev() is being introduced that, even when passed a phantom
function number, would return the underlying actual device.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
Index: xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_cmd.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/amd/iommu_cmd.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -339,7 +339,15 @@ static void amd_iommu_flush_all_iotlbs(s
return;
for_each_pdev( d, pdev )
- amd_iommu_flush_iotlb(pdev->devfn, pdev, gaddr, order);
+ {
+ u8 devfn = pdev->devfn;
+
+ do {
+ amd_iommu_flush_iotlb(devfn, pdev, gaddr, order);
+ devfn += pdev->phantom_stride;
+ } while ( devfn != pdev->devfn &&
+ PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
+ }
}
/* Flush iommu cache after p2m changes. */
Index: xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_init.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/amd/iommu_init.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_init.c
@@ -692,7 +692,7 @@ void parse_ppr_log_entry(struct amd_iomm
devfn = PCI_DEVFN2(device_id);
spin_lock(&pcidevs_lock);
- pdev = pci_get_pdev(iommu->seg, bus, devfn);
+ pdev = pci_get_real_pdev(iommu->seg, bus, devfn);
spin_unlock(&pcidevs_lock);
if ( pdev )
Index: xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_map.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/amd/iommu_map.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/amd/iommu_map.c
@@ -612,7 +612,6 @@ static int update_paging_mode(struct dom
for_each_pdev( d, pdev )
{
bdf = (pdev->bus << 8) | pdev->devfn;
- req_id = get_dma_requestor_id(pdev->seg, bdf);
iommu = find_iommu_for_device(pdev->seg, bdf);
if ( !iommu )
{
@@ -621,16 +620,21 @@ static int update_paging_mode(struct dom
}
spin_lock_irqsave(&iommu->lock, flags);
- device_entry = iommu->dev_table.buffer +
- (req_id * IOMMU_DEV_TABLE_ENTRY_SIZE);
+ do {
+ req_id = get_dma_requestor_id(pdev->seg, bdf);
+ device_entry = iommu->dev_table.buffer +
+ (req_id * IOMMU_DEV_TABLE_ENTRY_SIZE);
- /* valid = 0 only works for dom0 passthrough mode */
- amd_iommu_set_root_page_table((u32 *)device_entry,
- page_to_maddr(hd->root_table),
- hd->domain_id,
- hd->paging_mode, 1);
+ /* valid = 0 only works for dom0 passthrough mode */
+ amd_iommu_set_root_page_table((u32 *)device_entry,
+ page_to_maddr(hd->root_table),
+ hd->domain_id,
+ hd->paging_mode, 1);
- amd_iommu_flush_device(iommu, req_id);
+ amd_iommu_flush_device(iommu, req_id);
+ bdf += pdev->phantom_stride;
+ } while ( PCI_DEVFN2(bdf) != pdev->devfn &&
+ PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );
spin_unlock_irqrestore(&iommu->lock, flags);
}
Index: xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/iommu.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/iommu.c
@@ -157,6 +157,8 @@ void __init iommu_dom0_init(struct domai
int iommu_add_device(struct pci_dev *pdev)
{
struct hvm_iommu *hd;
+ int rc;
+ u8 devfn;
if ( !pdev->domain )
return -EINVAL;
@@ -167,7 +169,20 @@ int iommu_add_device(struct pci_dev *pde
if ( !iommu_enabled || !hd->platform_ops )
return 0;
- return hd->platform_ops->add_device(pdev->devfn, pdev);
+ rc = hd->platform_ops->add_device(pdev->devfn, pdev);
+ if ( rc || !pdev->phantom_stride )
+ return rc;
+
+ for ( devfn = pdev->devfn ; ; )
+ {
+ devfn += pdev->phantom_stride;
+ if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
+ return 0;
+ rc = hd->platform_ops->add_device(devfn, pdev);
+ if ( rc )
+ printk(XENLOG_WARNING "IOMMU: add %04x:%02x:%02x.%u failed (%d)\n",
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn), rc);
+ }
}
int iommu_enable_device(struct pci_dev *pdev)
@@ -190,6 +205,8 @@ int iommu_enable_device(struct pci_dev *
int iommu_remove_device(struct pci_dev *pdev)
{
struct hvm_iommu *hd;
+ u8 devfn;
+
if ( !pdev->domain )
return -EINVAL;
@@ -197,6 +214,22 @@ int iommu_remove_device(struct pci_dev *
if ( !iommu_enabled || !hd->platform_ops )
return 0;
+ for ( devfn = pdev->devfn ; pdev->phantom_stride; )
+ {
+ int rc;
+
+ devfn += pdev->phantom_stride;
+ if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
+ break;
+ rc = hd->platform_ops->remove_device(devfn, pdev);
+ if ( !rc )
+ continue;
+
+ printk(XENLOG_ERR "IOMMU: remove %04x:%02x:%02x.%u failed (%d)\n",
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn), rc);
+ return rc;
+ }
+
return hd->platform_ops->remove_device(pdev->devfn, pdev);
}
@@ -244,6 +277,18 @@ static int assign_device(struct domain *
if ( (rc = hd->platform_ops->assign_device(d, devfn, pdev)) )
goto done;
+ for ( ; pdev->phantom_stride; rc = 0 )
+ {
+ devfn += pdev->phantom_stride;
+ if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
+ break;
+ rc = hd->platform_ops->assign_device(d, devfn, pdev);
+ if ( rc )
+ printk(XENLOG_G_WARNING "d%d: assign %04x:%02x:%02x.%u failed (%d)\n",
+ d->domain_id, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ rc);
+ }
+
if ( has_arch_pdevs(d) && !need_iommu(d) )
{
d->need_iommu = 1;
@@ -376,6 +421,21 @@ int deassign_device(struct domain *d, u1
if ( !pdev )
return -ENODEV;
+ while ( pdev->phantom_stride )
+ {
+ devfn += pdev->phantom_stride;
+ if ( PCI_SLOT(devfn) != PCI_SLOT(pdev->devfn) )
+ break;
+ ret = hd->platform_ops->reassign_device(d, dom0, devfn, pdev);
+ if ( !ret )
+ continue;
+
+ printk(XENLOG_G_ERR "d%d: deassign %04x:%02x:%02x.%u failed (%d)\n",
+ d->domain_id, seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), ret);
+ return ret;
+ }
+
+ devfn = pdev->devfn;
ret = hd->platform_ops->reassign_device(d, dom0, devfn, pdev);
if ( ret )
{
Index: xen-4.2.2-testing/xen/drivers/passthrough/pci.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/pci.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/pci.c
@@ -146,6 +146,8 @@ static struct pci_dev *alloc_pdev(struct
/* update bus2bridge */
switch ( pdev->type = pdev_type(pseg->nr, bus, devfn) )
{
+ int pos;
+ u16 cap;
u8 sec_bus, sub_bus;
case DEV_TYPE_PCIe_BRIDGE:
@@ -169,6 +171,20 @@ static struct pci_dev *alloc_pdev(struct
break;
case DEV_TYPE_PCIe_ENDPOINT:
+ pos = pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn),
+ PCI_FUNC(devfn), PCI_CAP_ID_EXP);
+ BUG_ON(!pos);
+ cap = pci_conf_read16(pseg->nr, bus, PCI_SLOT(devfn),
+ PCI_FUNC(devfn), pos + PCI_EXP_DEVCAP);
+ if ( cap & PCI_EXP_DEVCAP_PHANTOM )
+ {
+ pdev->phantom_stride = 8 >> MASK_EXTR(cap,
+ PCI_EXP_DEVCAP_PHANTOM);
+ if ( PCI_FUNC(devfn) >= pdev->phantom_stride )
+ pdev->phantom_stride = 0;
+ }
+ break;
+
case DEV_TYPE_PCI:
break;
@@ -266,6 +282,27 @@ struct pci_dev *pci_get_pdev(int seg, in
return NULL;
}
+struct pci_dev *pci_get_real_pdev(int seg, int bus, int devfn)
+{
+ struct pci_dev *pdev;
+ int stride;
+
+ if ( seg < 0 || bus < 0 || devfn < 0 )
+ return NULL;
+
+ for ( pdev = pci_get_pdev(seg, bus, devfn), stride = 4;
+ !pdev && stride; stride >>= 1 )
+ {
+ if ( !(devfn & (8 - stride)) )
+ continue;
+ pdev = pci_get_pdev(seg, bus, devfn & ~(8 - stride));
+ if ( pdev && stride != pdev->phantom_stride )
+ pdev = NULL;
+ }
+
+ return pdev;
+}
+
struct pci_dev *pci_get_pdev_by_domain(
struct domain *d, int seg, int bus, int devfn)
{
@@ -464,8 +501,19 @@ int pci_add_device(u16 seg, u8 bus, u8 d
out:
spin_unlock(&pcidevs_lock);
- printk(XENLOG_DEBUG "PCI add %s %04x:%02x:%02x.%u\n", pdev_type,
- seg, bus, slot, func);
+ if ( !ret )
+ {
+ printk(XENLOG_DEBUG "PCI add %s %04x:%02x:%02x.%u\n", pdev_type,
+ seg, bus, slot, func);
+ while ( pdev->phantom_stride )
+ {
+ func += pdev->phantom_stride;
+ if ( PCI_SLOT(func) )
+ break;
+ printk(XENLOG_DEBUG "PCI phantom %04x:%02x:%02x.%u\n",
+ seg, bus, slot, func);
+ }
+ }
return ret;
}
@@ -657,7 +705,7 @@ void pci_check_disable_device(u16 seg, u
u16 cword;
spin_lock(&pcidevs_lock);
- pdev = pci_get_pdev(seg, bus, devfn);
+ pdev = pci_get_real_pdev(seg, bus, devfn);
if ( pdev )
{
if ( now < pdev->fault.time ||
@@ -674,6 +722,7 @@ void pci_check_disable_device(u16 seg, u
/* Tell the device to stop DMAing; we can't rely on the guest to
* control it for us. */
+ devfn = pdev->devfn;
cword = pci_conf_read16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
PCI_COMMAND);
pci_conf_write16(seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
@@ -731,6 +780,27 @@ struct setup_dom0 {
int (*handler)(u8 devfn, struct pci_dev *);
};
+static void setup_one_dom0_device(const struct setup_dom0 *ctxt,
+ struct pci_dev *pdev)
+{
+ u8 devfn = pdev->devfn;
+
+ do {
+ int err = ctxt->handler(devfn, pdev);
+
+ if ( err )
+ {
+ printk(XENLOG_ERR "setup %04x:%02x:%02x.%u for d%d failed (%d)\n",
+ pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+ ctxt->d->domain_id, err);
+ if ( devfn == pdev->devfn )
+ return;
+ }
+ devfn += pdev->phantom_stride;
+ } while ( devfn != pdev->devfn &&
+ PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
+}
+
static int __init _setup_dom0_pci_devices(struct pci_seg *pseg, void *arg)
{
struct setup_dom0 *ctxt = arg;
@@ -747,7 +817,7 @@ static int __init _setup_dom0_pci_device
pdev->domain = ctxt->d;
list_add(&pdev->domain_list, &ctxt->d->arch.pdev_list);
- ctxt->handler(devfn, pdev);
+ setup_one_dom0_device(ctxt, pdev);
}
}
Index: xen-4.2.2-testing/xen/include/xen/lib.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/lib.h
+++ xen-4.2.2-testing/xen/include/xen/lib.h
@@ -58,6 +58,9 @@ do {
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]) + __must_be_array(x))
+#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
+#define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))
+
#define reserve_bootmem(_p,_l) ((void)0)
struct domain;
Index: xen-4.2.2-testing/xen/include/xen/pci.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/xen/pci.h
+++ xen-4.2.2-testing/xen/include/xen/pci.h
@@ -64,6 +64,8 @@ struct pci_dev {
const u8 bus;
const u8 devfn;
+ u8 phantom_stride;
+
enum pdev_type {
DEV_TYPE_PCI_UNKNOWN,
DEV_TYPE_PCIe_ENDPOINT,
@@ -114,6 +116,7 @@ int pci_remove_device(u16 seg, u8 bus, u
int pci_ro_device(int seg, int bus, int devfn);
void arch_pci_ro_device(int seg, int bdf);
struct pci_dev *pci_get_pdev(int seg, int bus, int devfn);
+struct pci_dev *pci_get_real_pdev(int seg, int bus, int devfn);
struct pci_dev *pci_get_pdev_by_domain(
struct domain *, int seg, int bus, int devfn);
void pci_check_disable_device(u16 seg, u8 bus, u8 devfn);
++++++ 26330-VT-d-phantom-MSI.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559812 -3600
# Node ID b514b7118958327605e33dd387944832bc8d734a
# Parent c9a01b396cb4eaedef30e9a6ed615115a9f8bfc5
VT-d: relax source qualifier for MSI of phantom functions
With ordinary requests allowed to come from phantom functions, the
remapping tables ought to be set up to allow for MSI triggers to come
from other than the "real" device too.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -438,13 +438,22 @@ static void set_msi_source_id(struct pci
devfn = pdev->devfn;
switch ( pdev->type )
{
+ unsigned int sq;
+
case DEV_TYPE_PCIe_BRIDGE:
case DEV_TYPE_PCIe2PCI_BRIDGE:
case DEV_TYPE_LEGACY_PCI_BRIDGE:
break;
case DEV_TYPE_PCIe_ENDPOINT:
- set_ire_sid(ire, SVT_VERIFY_SID_SQ, SQ_ALL_16, PCI_BDF2(bus, devfn));
+ switch ( pdev->phantom_stride )
+ {
+ case 1: sq = SQ_13_IGNORE_3; break;
+ case 2: sq = SQ_13_IGNORE_2; break;
+ case 4: sq = SQ_13_IGNORE_1; break;
+ default: sq = SQ_ALL_16; break;
+ }
+ set_ire_sid(ire, SVT_VERIFY_SID_SQ, sq, PCI_BDF2(bus, devfn));
break;
case DEV_TYPE_PCI:
++++++ 26331-IOMMU-phantom-dev-quirk.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1357559889 -3600
# Node ID 23c4bbc0111dd807561b2c62cbc5798220943a0d
# Parent b514b7118958327605e33dd387944832bc8d734a
IOMMU: add option to specify devices behaving like ones using phantom functions
At least certain Marvell SATA controllers are known to issue bus master
requests with a non-zero function as origin, despite themselves being
single function devices.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
Index: xen-4.2.2-testing/docs/misc/xen-command-line.markdown
===================================================================
--- xen-4.2.2-testing.orig/docs/misc/xen-command-line.markdown
+++ xen-4.2.2-testing/docs/misc/xen-command-line.markdown
@@ -679,6 +679,16 @@ Defaults to booting secondary processors
Default: `on`
+### pci-phantom
+> `=[<seg>:]<bus>:<device>,<stride>`
+
+Mark a group of PCI devices as using phantom functions without actually
+advertising so, so the IOMMU can create translation contexts for them.
+
+All numbers specified must be hexadecimal ones.
+
+This option can be specified more than once (up to 8 times at present).
+
### ple\_gap
`= <integer>`
Index: xen-4.2.2-testing/xen/drivers/passthrough/pci.c
===================================================================
--- xen-4.2.2-testing.orig/xen/drivers/passthrough/pci.c
+++ xen-4.2.2-testing/xen/drivers/passthrough/pci.c
@@ -123,6 +123,49 @@ const unsigned long *pci_get_ro_map(u16
return pseg ? pseg->ro_map : NULL;
}
+static struct phantom_dev {
+ u16 seg;
+ u8 bus, slot, stride;
+} phantom_devs[8];
+static unsigned int nr_phantom_devs;
+
+static void __init parse_phantom_dev(char *str) {
+ const char *s = str;
+ struct phantom_dev phantom;
+
+ if ( !s || !*s || nr_phantom_devs >= ARRAY_SIZE(phantom_devs) )
+ return;
+
+ phantom.seg = simple_strtol(s, &s, 16);
+ if ( *s != ':' )
+ return;
+
+ phantom.bus = simple_strtol(s + 1, &s, 16);
+ if ( *s == ',' )
+ {
+ phantom.slot = phantom.bus;
+ phantom.bus = phantom.seg;
+ phantom.seg = 0;
+ }
+ else if ( *s == ':' )
+ phantom.slot = simple_strtol(s + 1, &s, 16);
+ else
+ return;
+
+ if ( *s != ',' )
+ return;
+ switch ( phantom.stride = simple_strtol(s + 1, &s, 0) )
+ {
+ case 1: case 2: case 4:
+ if ( *s )
+ default:
+ return;
+ }
+
+ phantom_devs[nr_phantom_devs++] = phantom;
+}
+custom_param("pci-phantom", parse_phantom_dev);
+
static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
{
struct pci_dev *pdev;
@@ -183,6 +226,20 @@ static struct pci_dev *alloc_pdev(struct
if ( PCI_FUNC(devfn) >= pdev->phantom_stride )
pdev->phantom_stride = 0;
}
+ else
+ {
+ unsigned int i;
+
+ for ( i = 0; i < nr_phantom_devs; ++i )
+ if ( phantom_devs[i].seg == pseg->nr &&
+ phantom_devs[i].bus == bus &&
+ phantom_devs[i].slot == PCI_SLOT(devfn) &&
+ phantom_devs[i].stride > PCI_FUNC(devfn) )
+ {
+ pdev->phantom_stride = phantom_devs[i].stride;
+ break;
+ }
+ }
break;
case DEV_TYPE_PCI:
++++++ 26341-hvm-firmware-passthrough.patch ++++++
fate#313584: pass bios information to XEN HVM guest
# HG changeset patch
# User Ross Philipson
# Date 1357838188 0
# Node ID 07bf59a7ce837bd795e2df2f28166cfe41990d3d
# Parent 19fd1237ff0dfa3d97a896d6ed6fbbd33f816a9f
HVM xenstore strings and firmware passthrough header
Add public HVM definitions header for xenstore strings used in
HVMLOADER. In addition this header describes the use of the firmware
passthrough values set using xenstore.
Signed-off-by: Ross Philipson
Committed-by: Keir Fraser
diff -r 19fd1237ff0d -r 07bf59a7ce83 xen/include/public/hvm/hvm_xs_strings.h
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/xen/include/public/hvm/hvm_xs_strings.h Thu Jan 10 17:16:28 2013 +0000
@@ -0,0 +1,79 @@
+/******************************************************************************
+ * hvm/hvm_xs_strings.h
+ *
+ * HVM xenstore strings used in HVMLOADER.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_HVM_XS_STRINGS_H__
+#define __XEN_PUBLIC_HVM_HVM_XS_STRINGS_H__
+
+#define HVM_XS_HVMLOADER "hvmloader"
+#define HVM_XS_BIOS "hvmloader/bios"
+#define HVM_XS_GENERATION_ID_ADDRESS "hvmloader/generation-id-address"
+
+/* The following values allow additional ACPI tables to be added to the
+ * virtual ACPI BIOS that hvmloader constructs. The values specify the guest
+ * physical address and length of a block of ACPI tables to add. The format of
+ * the block is simply concatenated raw tables (which specify their own length
+ * in the ACPI header).
+ */
+#define HVM_XS_ACPI_PT_ADDRESS "hvmloader/acpi/address"
+#define HVM_XS_ACPI_PT_LENGTH "hvmloader/acpi/length"
+
+/* Any number of SMBIOS types can be passed through to an HVM guest using
+ * the following xenstore values. The values specify the guest physical
+ * address and length of a block of SMBIOS structures for hvmloader to use.
+ * The block is formatted in the following way:
+ *
+ * <length><struct><length><struct>...
+ *
+ * Each length separator is a 32b integer indicating the length of the next
+ * SMBIOS structure. For DMTF defined types (0 - 121), the passed in struct
+ * will replace the default structure in hvmloader. In addition, any
+ * OEM/vendortypes (128 - 255) will all be added.
+ */
+#define HVM_XS_SMBIOS_PT_ADDRESS "hvmloader/smbios/address"
+#define HVM_XS_SMBIOS_PT_LENGTH "hvmloader/smbios/length"
+
+/* Set to 1 to enable SMBIOS default portable battery (type 22) values. */
+#define HVM_XS_SMBIOS_DEFAULT_BATTERY "hvmloader/smbios/default_battery"
+
+/* The following xenstore values are used to override some of the default
+ * string values in the SMBIOS table constructed in hvmloader.
+ */
+#define HVM_XS_BIOS_STRINGS "bios-strings"
+#define HVM_XS_BIOS_VENDOR "bios-strings/bios-vendor"
+#define HVM_XS_BIOS_VERSION "bios-strings/bios-version"
+#define HVM_XS_SYSTEM_MANUFACTURER "bios-strings/system-manufacturer"
+#define HVM_XS_SYSTEM_PRODUCT_NAME "bios-strings/system-product-name"
+#define HVM_XS_SYSTEM_VERSION "bios-strings/system-version"
+#define HVM_XS_SYSTEM_SERIAL_NUMBER "bios-strings/system-serial-number"
+#define HVM_XS_ENCLOSURE_MANUFACTURER "bios-strings/enclosure-manufacturer"
+#define HVM_XS_ENCLOSURE_SERIAL_NUMBER "bios-strings/enclosure-serial-number"
+#define HVM_XS_BATTERY_MANUFACTURER "bios-strings/battery-manufacturer"
+#define HVM_XS_BATTERY_DEVICE_NAME "bios-strings/battery-device-name"
+
+/* 1 to 99 OEM strings can be set in xenstore using values of the form
+ * below. These strings will be loaded into the SMBIOS type 11 structure.
+ */
+#define HVM_XS_OEM_STRINGS "bios-strings/oem-%02d"
+
+#endif /* __XEN_PUBLIC_HVM_HVM_XS_STRINGS_H__ */
++++++ 26342-hvm-firmware-passthrough.patch ++++++
fate#313584: pass bios information to XEN HVM guest
# HG changeset patch
# User Ross Philipson
# Date 1357838241 0
# Node ID cabf395a6c849cc65e56f1640b18db0c3e0faf5d
# Parent 07bf59a7ce837bd795e2df2f28166cfe41990d3d
HVM firmware passthrough control tools support
Xen control tools support for loading the firmware passthrough blocks
during domain construction. SMBIOS and ACPI blocks are passed in using
the new xc_hvm_build_args structure. Each block is read and loaded
into the new domain address space behind the HVMLOADER image. The base
address for the two blocks is returned as an out parameter to the
caller via the args structure.
Signed-off-by: Ross Philipson
Committed-by: Keir Fraser
diff -r 07bf59a7ce83 -r cabf395a6c84 tools/libxc/xc_hvm_build_arm.c
--- a/tools/libxc/xc_hvm_build_arm.c Thu Jan 10 17:16:28 2013 +0000
+++ b/tools/libxc/xc_hvm_build_arm.c Thu Jan 10 17:17:21 2013 +0000
@@ -22,7 +22,7 @@
#include
int xc_hvm_build(xc_interface *xch, uint32_t domid,
- const struct xc_hvm_build_args *hvm_args)
+ struct xc_hvm_build_args *hvm_args)
{
errno = ENOSYS;
return -1;
diff -r 07bf59a7ce83 -r cabf395a6c84 tools/libxc/xc_hvm_build_x86.c
--- a/tools/libxc/xc_hvm_build_x86.c Thu Jan 10 17:16:28 2013 +0000
+++ b/tools/libxc/xc_hvm_build_x86.c Thu Jan 10 17:17:21 2013 +0000
@@ -49,6 +49,40 @@
#define NR_SPECIAL_PAGES 8
#define special_pfn(x) (0xff000u - NR_SPECIAL_PAGES + (x))
+static int modules_init(struct xc_hvm_build_args *args,
+ uint64_t vend, struct elf_binary *elf,
+ uint64_t *mstart_out, uint64_t *mend_out)
+{
+#define MODULE_ALIGN 1UL << 7
+#define MB_ALIGN 1UL << 20
+#define MKALIGN(x, a) (((uint64_t)(x) + (a) - 1) & ~(uint64_t)((a) - 1))
+ uint64_t total_len = 0, offset1 = 0;
+
+ if ( (args->acpi_module.length == 0)&&(args->smbios_module.length == 0) )
+ return 0;
+
+ /* Find the total length for the firmware modules with a reasonable large
+ * alignment size to align each the modules.
+ */
+ total_len = MKALIGN(args->acpi_module.length, MODULE_ALIGN);
+ offset1 = total_len;
+ total_len += MKALIGN(args->smbios_module.length, MODULE_ALIGN);
+
+ /* Want to place the modules 1Mb+change behind the loader image. */
+ *mstart_out = MKALIGN(elf->pend, MB_ALIGN) + (MB_ALIGN);
+ *mend_out = *mstart_out + total_len;
+
+ if ( *mend_out > vend )
+ return -1;
+
+ if ( args->acpi_module.length != 0 )
+ args->acpi_module.guest_addr_out = *mstart_out;
+ if ( args->smbios_module.length != 0 )
+ args->smbios_module.guest_addr_out = *mstart_out + offset1;
+
+ return 0;
+}
+
static void build_hvm_info(void *hvm_info_page, uint64_t mem_size,
uint64_t mmio_start, uint64_t mmio_size)
{
@@ -86,9 +120,8 @@ static void build_hvm_info(void *hvm_inf
hvm_info->checksum = -sum;
}
-static int loadelfimage(
- xc_interface *xch,
- struct elf_binary *elf, uint32_t dom, unsigned long *parray)
+static int loadelfimage(xc_interface *xch, struct elf_binary *elf,
+ uint32_t dom, unsigned long *parray)
{
privcmd_mmap_entry_t *entries = NULL;
unsigned long pfn_start = elf->pstart >> PAGE_SHIFT;
@@ -126,6 +159,66 @@ static int loadelfimage(
return rc;
}
+static int loadmodules(xc_interface *xch,
+ struct xc_hvm_build_args *args,
+ uint64_t mstart, uint64_t mend,
+ uint32_t dom, unsigned long *parray)
+{
+ privcmd_mmap_entry_t *entries = NULL;
+ unsigned long pfn_start;
+ unsigned long pfn_end;
+ size_t pages;
+ uint32_t i;
+ uint8_t *dest;
+ int rc = -1;
+
+ if ( (mstart == 0)||(mend == 0) )
+ return 0;
+
+ pfn_start = (unsigned long)(mstart >> PAGE_SHIFT);
+ pfn_end = (unsigned long)((mend + PAGE_SIZE - 1) >> PAGE_SHIFT);
+ pages = pfn_end - pfn_start;
+
+ /* Map address space for module list. */
+ entries = calloc(pages, sizeof(privcmd_mmap_entry_t));
+ if ( entries == NULL )
+ goto error_out;
+
+ for ( i = 0; i < pages; i++ )
+ entries[i].mfn = parray[(mstart >> PAGE_SHIFT) + i];
+
+ dest = xc_map_foreign_ranges(
+ xch, dom, pages << PAGE_SHIFT, PROT_READ | PROT_WRITE, 1 << PAGE_SHIFT,
+ entries, pages);
+ if ( dest == NULL )
+ goto error_out;
+
+ /* Zero the range so padding is clear between modules */
+ memset(dest, 0, pages << PAGE_SHIFT);
+
+ /* Load modules into range */
+ if ( args->acpi_module.length != 0 )
+ {
+ memcpy(dest,
+ args->acpi_module.data,
+ args->acpi_module.length);
+ }
+ if ( args->smbios_module.length != 0 )
+ {
+ memcpy(dest + (args->smbios_module.guest_addr_out - mstart),
+ args->smbios_module.data,
+ args->smbios_module.length);
+ }
+
+ munmap(dest, pages << PAGE_SHIFT);
+ rc = 0;
+
+ error_out:
+ free(entries);
+
+ return rc;
+}
+
/*
* Check whether there exists mmio hole in the specified memory range.
* Returns 1 if exists, else returns 0.
@@ -140,7 +233,7 @@ static int check_mmio_hole(uint64_t star
}
static int setup_guest(xc_interface *xch,
- uint32_t dom, const struct xc_hvm_build_args *args,
+ uint32_t dom, struct xc_hvm_build_args *args,
char *image, unsigned long image_size)
{
xen_pfn_t *page_array = NULL;
@@ -153,6 +246,7 @@ static int setup_guest(xc_interface *xch
uint32_t *ident_pt;
struct elf_binary elf;
uint64_t v_start, v_end;
+ uint64_t m_start = 0, m_end = 0;
int rc;
xen_capabilities_info_t caps;
unsigned long stat_normal_pages = 0, stat_2mb_pages = 0,
@@ -178,11 +272,19 @@ static int setup_guest(xc_interface *xch
goto error_out;
}
+ if ( modules_init(args, v_end, &elf, &m_start, &m_end) != 0 )
+ {
+ ERROR("Insufficient space to load modules.");
+ goto error_out;
+ }
+
IPRINTF("VIRTUAL MEMORY ARRANGEMENT:\n"
" Loader: %016"PRIx64"->%016"PRIx64"\n"
+ " Modules: %016"PRIx64"->%016"PRIx64"\n"
" TOTAL: %016"PRIx64"->%016"PRIx64"\n"
" ENTRY ADDRESS: %016"PRIx64"\n",
elf.pstart, elf.pend,
+ m_start, m_end,
v_start, v_end,
elf_uval(&elf, elf.ehdr, e_entry));
@@ -337,6 +439,9 @@ static int setup_guest(xc_interface *xch
if ( loadelfimage(xch, &elf, dom, page_array) != 0 )
goto error_out;
+ if ( loadmodules(xch, args, m_start, m_end, dom, page_array) != 0 )
+ goto error_out;
+
if ( (hvm_info_page = xc_map_foreign_range(
xch, dom, PAGE_SIZE, PROT_READ | PROT_WRITE,
HVM_INFO_PFN)) == NULL )
@@ -413,7 +518,7 @@ static int setup_guest(xc_interface *xch
* Create a domain for a virtualized Linux, using files/filenames.
*/
int xc_hvm_build(xc_interface *xch, uint32_t domid,
- const struct xc_hvm_build_args *hvm_args)
+ struct xc_hvm_build_args *hvm_args)
{
struct xc_hvm_build_args args = *hvm_args;
void *image;
@@ -441,6 +546,15 @@ int xc_hvm_build(xc_interface *xch, uint
sts = setup_guest(xch, domid, &args, image, image_size);
+ if (!sts)
+ {
+ /* Return module load addresses to caller */
+ hvm_args->acpi_module.guest_addr_out =
+ args.acpi_module.guest_addr_out;
+ hvm_args->smbios_module.guest_addr_out =
+ args.smbios_module.guest_addr_out;
+ }
+
free(image);
return sts;
@@ -461,6 +575,7 @@ int xc_hvm_build_target_mem(xc_interface
{
struct xc_hvm_build_args args = {};
+ memset(&args, 0, sizeof(struct xc_hvm_build_args));
args.mem_size = (uint64_t)memsize << 20;
args.mem_target = (uint64_t)target << 20;
args.image_file_name = image_name;
diff -r 07bf59a7ce83 -r cabf395a6c84 tools/libxc/xenguest.h
--- a/tools/libxc/xenguest.h Thu Jan 10 17:16:28 2013 +0000
+++ b/tools/libxc/xenguest.h Thu Jan 10 17:17:21 2013 +0000
@@ -211,11 +211,23 @@ int xc_linux_build_mem(xc_interface *xch
unsigned int console_evtchn,
unsigned long *console_mfn);
+struct xc_hvm_firmware_module {
+ uint8_t *data;
+ uint32_t length;
+ uint64_t guest_addr_out;
+};
+
struct xc_hvm_build_args {
uint64_t mem_size; /* Memory size in bytes. */
uint64_t mem_target; /* Memory target in bytes. */
uint64_t mmio_size; /* Size of the MMIO hole in bytes. */
const char *image_file_name; /* File name of the image to load. */
+
+ /* Extra ACPI tables passed to HVMLOADER */
+ struct xc_hvm_firmware_module acpi_module;
+
+ /* Extra SMBIOS structures passed to HVMLOADER */
+ struct xc_hvm_firmware_module smbios_module;
};
/**
@@ -228,7 +240,7 @@ struct xc_hvm_build_args {
* are optional.
*/
int xc_hvm_build(xc_interface *xch, uint32_t domid,
- const struct xc_hvm_build_args *hvm_args);
+ struct xc_hvm_build_args *hvm_args);
int xc_hvm_build_target_mem(xc_interface *xch,
uint32_t domid,
diff -r 07bf59a7ce83 -r cabf395a6c84 tools/libxc/xg_private.c
--- a/tools/libxc/xg_private.c Thu Jan 10 17:16:28 2013 +0000
+++ b/tools/libxc/xg_private.c Thu Jan 10 17:17:21 2013 +0000
@@ -192,7 +192,7 @@ unsigned long csum_page(void *page)
__attribute__((weak))
int xc_hvm_build(xc_interface *xch,
uint32_t domid,
- const struct xc_hvm_build_args *hvm_args)
+ struct xc_hvm_build_args *hvm_args)
{
errno = ENOSYS;
return -1;
++++++ 26343-hvm-firmware-passthrough.patch ++++++
++++ 645 lines (skipped)
++++++ 26344-hvm-firmware-passthrough.patch ++++++
fate#313584: pass bios information to XEN HVM guest
# HG changeset patch
# User Ross Philipson
# Date 1357838323 0
# Node ID b9c38bea15b117552ecb51809779c7cfef82dd44
# Parent a7ce196f40444fafbe8f13b2d80e4885d4321806
HVM firmware passthrough ACPI processing
ACPI table passthrough support allowing additional static tables and
SSDTs (AML code) to be loaded. These additional tables are added at
the end of the secondary table list in the RSDT/XSDT tables.
Signed-off-by: Ross Philipson
Committed-by: Keir Fraser
diff -r a7ce196f4044 -r b9c38bea15b1 tools/firmware/hvmloader/acpi/build.c
--- a/tools/firmware/hvmloader/acpi/build.c Thu Jan 10 17:18:10 2013 +0000
+++ b/tools/firmware/hvmloader/acpi/build.c Thu Jan 10 17:18:43 2013 +0000
@@ -23,6 +23,9 @@
#include "ssdt_pm.h"
#include "../config.h"
#include "../util.h"
+#include
+
+#define ACPI_MAX_SECONDARY_TABLES 16
#define align16(sz) (((sz) + 15) & ~15)
#define fixed_strcpy(d, s) strncpy((d), (s), sizeof(d))
@@ -198,6 +201,52 @@ static struct acpi_20_waet *construct_wa
return waet;
}
+static int construct_passthrough_tables(unsigned long *table_ptrs,
+ int nr_tables)
+{
+ const char *s;
+ uint8_t *acpi_pt_addr;
+ uint32_t acpi_pt_length;
+ struct acpi_header *header;
+ int nr_added;
+ int nr_max = (ACPI_MAX_SECONDARY_TABLES - nr_tables - 1);
+ uint32_t total = 0;
+ uint8_t *buffer;
+
+ s = xenstore_read(HVM_XS_ACPI_PT_ADDRESS, NULL);
+ if ( s == NULL )
+ return 0;
+
+ acpi_pt_addr = (uint8_t*)(uint32_t)strtoll(s, NULL, 0);
+ if ( acpi_pt_addr == NULL )
+ return 0;
+
+ s = xenstore_read(HVM_XS_ACPI_PT_LENGTH, NULL);
+ if ( s == NULL )
+ return 0;
+
+ acpi_pt_length = (uint32_t)strtoll(s, NULL, 0);
+
+ for ( nr_added = 0; nr_added < nr_max; nr_added++ )
+ {
+ if ( (acpi_pt_length - total) < sizeof(struct acpi_header) )
+ break;
+
+ header = (struct acpi_header*)acpi_pt_addr;
+
+ buffer = mem_alloc(header->length, 16);
+ if ( buffer == NULL )
+ break;
+ memcpy(buffer, header, header->length);
+
+ table_ptrs[nr_tables++] = (unsigned long)buffer;
+ total += header->length;
+ acpi_pt_addr += header->length;
+ }
+
+ return nr_added;
+}
+
static int construct_secondary_tables(unsigned long *table_ptrs,
struct acpi_info *info)
{
@@ -293,6 +342,9 @@ static int construct_secondary_tables(un
}
}
+ /* Load any additional tables passed through. */
+ nr_tables += construct_passthrough_tables(table_ptrs, nr_tables);
+
table_ptrs[nr_tables] = 0;
return nr_tables;
}
@@ -327,7 +379,7 @@ void acpi_build_tables(struct acpi_confi
struct acpi_10_fadt *fadt_10;
struct acpi_20_facs *facs;
unsigned char *dsdt;
- unsigned long secondary_tables[16];
+ unsigned long secondary_tables[ACPI_MAX_SECONDARY_TABLES];
int nr_secondaries, i;
unsigned long vm_gid_addr;
++++++ 26369-libxl-devid.patch ++++++
commit 5420f26507fc5c9853eb1076401a8658d72669da
Author: Jim Fehlig
Date: Fri Jan 11 12:22:26 2013 +0000
libxl: Set vfb and vkb devid if not done so by the caller
Other devices set a sensible devid if the caller has not done so.
Do the same for vfb and vkb. While at it, factor out the common code
used to determine a sensible devid, so it can be used by other
libxl__device_*_add functions.
Signed-off-by: Jim Fehlig
Acked-by: Ian Campbell
Committed-by: Ian Campbell
Index: xen-4.2.2-testing/tools/libxl/libxl.c
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/libxl.c
+++ xen-4.2.2-testing/tools/libxl/libxl.c
@@ -1710,6 +1710,26 @@ out:
return;
}
+/* common function to get next device id */
+static int libxl__device_nextid(libxl__gc *gc, uint32_t domid, char *device)
+{
+ char *dompath, **l;
+ unsigned int nb;
+ int nextid = -1;
+
+ if (!(dompath = libxl__xs_get_dompath(gc, domid)))
+ return nextid;
+
+ l = libxl__xs_directory(gc, XBT_NULL,
+ GCSPRINTF("%s/device/%s", dompath, device), &nb);
+ if (l == NULL || nb == 0)
+ nextid = 0;
+ else
+ nextid = strtoul(l[nb - 1], NULL, 10) + 1;
+
+ return nextid;
+}
+
/******************************************************************************/
int libxl__device_disk_setdefault(libxl__gc *gc, libxl_device_disk *disk)
@@ -2549,8 +2569,7 @@ void libxl__device_nic_add(libxl__egc *e
flexarray_t *front;
flexarray_t *back;
libxl__device *device;
- char *dompath, **l;
- unsigned int nb, rc;
+ unsigned int rc;
rc = libxl__device_nic_setdefault(gc, nic, domid);
if (rc) goto out;
@@ -2567,17 +2586,10 @@ void libxl__device_nic_add(libxl__egc *e
}
if (nic->devid == -1) {
- if (!(dompath = libxl__xs_get_dompath(gc, domid))) {
+ if ((nic->devid = libxl__device_nextid(gc, domid, "vif") < 0)) {
rc = ERROR_FAIL;
goto out_free;
}
- if (!(l = libxl__xs_directory(gc, XBT_NULL,
- libxl__sprintf(gc, "%s/device/vif", dompath), &nb)) ||
- nb == 0) {
- nic->devid = 0;
- } else {
- nic->devid = strtoul(l[nb - 1], NULL, 10) + 1;
- }
}
GCNEW(device);
@@ -2964,6 +2976,13 @@ int libxl__device_vkb_add(libxl__gc *gc,
goto out_free;
}
+ if (vkb->devid == -1) {
+ if ((vkb->devid = libxl__device_nextid(gc, domid, "vkb") < 0)) {
+ rc = ERROR_FAIL;
+ goto out_free;
+ }
+ }
+
rc = libxl__device_from_vkb(gc, domid, vkb, &device);
if (rc != 0) goto out_free;
@@ -3065,6 +3084,13 @@ int libxl__device_vfb_add(libxl__gc *gc,
goto out_free;
}
+ if (vfb->devid == -1) {
+ if ((vfb->devid = libxl__device_nextid(gc, domid, "vfb") < 0)) {
+ rc = ERROR_FAIL;
+ goto out_free;
+ }
+ }
+
rc = libxl__device_from_vfb(gc, domid, vfb, &device);
if (rc != 0) goto out_free;
++++++ 26370-libxc-x86-initial-mapping-fit.patch ++++++
# HG changeset patch
# User Ian Campbell
# Date 1357906947 0
# Node ID ba2d73234d73fc0faa027cd9bdfd3ac90642733c
# Parent 84d87ca765be81c215ef3b67d2ed71acfba73553
libxc: x86: ensure that the initial mapping fits into the guest's memory
In particular we need to check that adding 512KB of slack and
rounding up to a 4MB boundary do not overflow the guest's memory
allocation. Otherwise we run off the end of the p2m when building the
guest's initial page tables and populate them with garbage.
Wei noticed this when build tiny (2MB) mini-os domains.
Reported-by: Wei Liu
Signed-off-by: Ian Campbell
Acked-by: Jan Beulich
Committed-by: Ian Campbell
--- a/tools/libxc/xc_dom_core.c
+++ b/tools/libxc/xc_dom_core.c
@@ -871,7 +871,8 @@ int xc_dom_build_image(struct xc_dom_ima
goto err;
if ( dom->arch_hooks->count_pgtables )
{
- dom->arch_hooks->count_pgtables(dom);
+ if ( dom->arch_hooks->count_pgtables(dom) != 0 )
+ goto err;
if ( (dom->pgtables > 0) &&
(xc_dom_alloc_segment(dom, &dom->pgtables_seg, "page tables", 0,
dom->pgtables * page_size) != 0) )
--- a/tools/libxc/xc_dom_x86.c
+++ b/tools/libxc/xc_dom_x86.c
@@ -82,6 +82,7 @@ static int count_pgtables(struct xc_dom_
{
int pages, extra_pages;
xen_vaddr_t try_virt_end;
+ xen_pfn_t try_pfn_end;
extra_pages = dom->alloc_bootstack ? 1 : 0;
extra_pages += dom->extra_pages;
@@ -91,6 +92,17 @@ static int count_pgtables(struct xc_dom_
{
try_virt_end = round_up(dom->virt_alloc_end + pages * PAGE_SIZE_X86,
bits_to_mask(22)); /* 4MB alignment */
+
+ try_pfn_end = (try_virt_end - dom->parms.virt_base) >> PAGE_SHIFT_X86;
+
+ if ( try_pfn_end > dom->total_pages )
+ {
+ xc_dom_panic(dom->xch, XC_OUT_OF_MEMORY,
+ "%s: not enough memory for initial mapping (%#"PRIpfn" > %#"PRIpfn")",
+ __FUNCTION__, try_pfn_end, dom->total_pages);
+ return -ENOMEM;
+ }
+
dom->pg_l4 =
nr_page_tables(dom, dom->parms.virt_base, try_virt_end, l4_bits);
dom->pg_l3 =
++++++ 26372-tools-paths.patch ++++++
# HG changeset patch
# User Bamvor Jian Zhang
# Date 1357906948 0
# Node ID 2ad5792b4274d76ced39515cbd3f84898b181768
# Parent ba2d73234d73fc0faa027cd9bdfd3ac90642733c
fix wrong path while calling pygrub and libxl-save-helper
in current xen x86_64, the default libexec directory is /usr/lib/xen/bin,
while the private binder is /usr/lib64/xen/bin. but some commands(pygrub,
libxl-save-helper) located in private binder directory is called from
libexec directory which lead to the following error:
1, for pygrub bootloader:
libxl: debug: libxl_bootloader.c:429:bootloader_disk_attached_cb: /usr/lib/xen/bin/pygrub doesn't exist, falling back to config path
2, for libxl-save-helper:
libxl: cannot execute /usr/lib/xen/bin/libxl-save-helper: No such file or directory
libxl: error: libxl_utils.c:363:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 3 save/restore helper stdout pipe
libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: domain 3 save/restore helper [10222] exited with error status 255
there are two ways to fix above error. the first one is make such command
store in the /usr/lib/xen/bin and /usr/lib64/xen/bin(symbol link to
previous), e.g. qemu-dm. The second way is using private binder dir
instead of libexec dir. e.g. xenconsole.
For these cases, the latter one is suitable.
Signed-off-by: Bamvor Jian Zhang
Committed-by: Ian Campbell
Index: xen-4.2.1-testing/tools/libxl/libxl_bootloader.c
===================================================================
--- xen-4.2.1-testing.orig/tools/libxl/libxl_bootloader.c
+++ xen-4.2.1-testing/tools/libxl/libxl_bootloader.c
@@ -419,7 +419,7 @@ static void bootloader_disk_attached_cb(
const char *bltmp;
struct stat st;
- bltmp = libxl__abs_path(gc, bootloader, libxl__libexec_path());
+ bltmp = libxl__abs_path(gc, bootloader, libxl__private_bindir_path());
/* Check to see if the file exists in this location; if not,
* fall back to checking the path */
LOG(DEBUG, "Checking for bootloader in libexec path: %s", bltmp);
Index: xen-4.2.1-testing/tools/libxl/libxl_save_callout.c
===================================================================
--- xen-4.2.1-testing.orig/tools/libxl/libxl_save_callout.c
+++ xen-4.2.1-testing/tools/libxl/libxl_save_callout.c
@@ -172,7 +172,7 @@ static void run_helper(libxl__egc *egc,
shs->stdout_what = GCSPRINTF("domain %"PRIu32" save/restore helper"
" stdout pipe", domid);
- *arg++ = getenv("LIBXL_SAVE_HELPER") ?: LIBEXEC "/" "libxl-save-helper";
+ *arg++ = getenv("LIBXL_SAVE_HELPER") ?: PRIVATE_BINDIR "/" "libxl-save-helper";
*arg++ = mode_arg;
const char **stream_fd_arg = arg++;
for (i=0; i
# Date 1358427591 -3600
# Node ID 76598d4bf61ef0c575deba539ff99078c80e651e
# Parent 0dee85c061addb7124d77c5f6cfe2ea7bc03b760
x86: handle both NMI kinds if they occur simultaneously
We shouldn't assume PCI SERR excludes IOCHK.
Once at it, also remove the doubly redundant range restriction on
"reason" - the variable already is "unsigned char".
Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper
Acked-by: Keir Fraser
Index: xen-4.2.2-testing/xen/arch/x86/traps.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/traps.c
+++ xen-4.2.2-testing/xen/arch/x86/traps.c
@@ -3369,10 +3369,10 @@ void do_nmi(struct cpu_user_regs *regs)
reason = inb(0x61);
if ( reason & 0x80 )
pci_serr_error(regs);
- else if ( reason & 0x40 )
+ if ( reason & 0x40 )
io_check_error(regs);
- else if ( !nmi_watchdog )
- unknown_nmi_error(regs, (unsigned char)(reason&0xff));
+ if ( !(reason & 0xc0) && !nmi_watchdog )
+ unknown_nmi_error(regs, reason);
}
}
++++++ 26418-x86-trampoline-consider-multiboot.patch ++++++
# HG changeset patch
# User Paolo Bonzini
# Date 1358505311 -3600
# Node ID 3b59a6c3e9b0fb5009bdfff97c8493bb9f0bec54
# Parent 025f202f3022c30d1ec3b6ffcb72861c43a32cf7
x86: find a better location for the real-mode trampoline
On some machines, the location at 0x40e does not point to the beginning
of the EBDA. Rather, it points to the beginning of the BIOS-reserved
area of the EBDA, while the option ROMs place their data below that
segment.
For this reason, 0x413 is actually a better source than 0x40e to get
the location of the real-mode trampoline. Xen was already using it
as a second source, and this patch keeps that working. However, just
in case, let's also fetch the information from the multiboot structure,
where the boot loader should have placed it. This way we don't
necessarily trust one of the BIOS or the multiboot loader more than
the other.
Signed-off-by: Paolo Bonzini
Retain the previous code, thus using the multiboot value only if it's
sane but lower than the BDA computed one. Also use the full 32-bit
mem_lower value and prefer MBI_MEMLIMITS over open coding it (requiring
a slight adjustment to multiboot.h to make its constants actually
usable in assembly code, which previously they were only meant to be).
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
Committed-by: Jan Beulich
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -88,6 +88,20 @@ __start:
movzwl 0x413,%eax /* use base memory size on failure */
shl $10-4,%eax
1:
+ /*
+ * Compare the value in the BDA with the information from the
+ * multiboot structure (if available) and use the smallest.
+ */
+ testb $MBI_MEMLIMITS,(%ebx)
+ jz 2f /* not available? BDA value will be fine */
+ mov 4(%ebx),%edx
+ cmp $0x100,%edx /* is the multiboot value too small? */
+ jb 2f /* if so, do not use it */
+ shl $10-4,%edx
+ cmp %eax,%edx /* compare with BDA value */
+ cmovb %edx,%eax /* and use the smaller */
+
+2: /* Reserve 64kb for the trampoline */
sub $0x1000,%eax
/* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
--- a/xen/include/xen/multiboot.h
+++ b/xen/include/xen/multiboot.h
@@ -18,6 +18,7 @@
#ifndef __MULTIBOOT_H__
#define __MULTIBOOT_H__
+#include "const.h"
/*
* Multiboot header structure.
@@ -31,17 +32,17 @@
/* The magic number passed by a Multiboot-compliant boot loader. */
#define MULTIBOOT_BOOTLOADER_MAGIC 0x2BADB002
-#define MBI_MEMLIMITS (1u<< 0)
-#define MBI_BOOTDEV (1u<< 1)
-#define MBI_CMDLINE (1u<< 2)
-#define MBI_MODULES (1u<< 3)
-#define MBI_AOUT_SYMS (1u<< 4)
-#define MBI_ELF_SYMS (1u<< 5)
-#define MBI_MEMMAP (1u<< 6)
-#define MBI_DRIVES (1u<< 7)
-#define MBI_BIOSCONFIG (1u<< 8)
-#define MBI_LOADERNAME (1u<< 9)
-#define MBI_APM (1u<<10)
+#define MBI_MEMLIMITS (_AC(1,u) << 0)
+#define MBI_BOOTDEV (_AC(1,u) << 1)
+#define MBI_CMDLINE (_AC(1,u) << 2)
+#define MBI_MODULES (_AC(1,u) << 3)
+#define MBI_AOUT_SYMS (_AC(1,u) << 4)
+#define MBI_ELF_SYMS (_AC(1,u) << 5)
+#define MBI_MEMMAP (_AC(1,u) << 6)
+#define MBI_DRIVES (_AC(1,u) << 7)
+#define MBI_BIOSCONFIG (_AC(1,u) << 8)
+#define MBI_LOADERNAME (_AC(1,u) << 9)
+#define MBI_APM (_AC(1,u) << 10)
#ifndef __ASSEMBLY__
++++++ 26532-AMD-IOMMU-phantom-MSI.patch ++++++
References: bnc#787169
# HG changeset patch
# User Jan Beulich
# Date 1360831377 -3600
# Node ID 788f4551580d476e13ea907e373e58806a32179e
# Parent e68f14b9e73925e9d404e517ba510f73fe472e4e
AMD IOMMU: handle MSI for phantom functions
With ordinary requests allowed to come from phantom functions, the
remapping tables ought to be set up to also allow for MSI triggers to
come from other than the "real" device too.
It is not clear to me whether the alias-ID handling also needs
adjustment for this to work properly, or whether firmware can be
expected to properly express this through a device alias range
descriptor (or multiple device alias ones).
Signed-off-by: Jan Beulich
Acked-by: Ian Campbell
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -284,33 +284,32 @@ void amd_iommu_ioapic_update_ire(
}
static void update_intremap_entry_from_msi_msg(
- struct amd_iommu *iommu, struct pci_dev *pdev,
- struct msi_desc *msi_desc, struct msi_msg *msg)
+ struct amd_iommu *iommu, u16 bdf,
+ int *remap_index, const struct msi_msg *msg)
{
unsigned long flags;
u32* entry;
- u16 bdf, req_id, alias_id;
+ u16 req_id, alias_id;
u8 delivery_mode, dest, vector, dest_mode;
spinlock_t *lock;
int offset;
- bdf = (pdev->bus << 8) | pdev->devfn;
- req_id = get_dma_requestor_id(pdev->seg, bdf);
- alias_id = get_intremap_requestor_id(pdev->seg, bdf);
+ req_id = get_dma_requestor_id(iommu->seg, bdf);
+ alias_id = get_intremap_requestor_id(iommu->seg, bdf);
if ( msg == NULL )
{
lock = get_intremap_lock(iommu->seg, req_id);
spin_lock_irqsave(lock, flags);
- free_intremap_entry(iommu->seg, req_id, msi_desc->remap_index);
+ free_intremap_entry(iommu->seg, req_id, *remap_index);
spin_unlock_irqrestore(lock, flags);
if ( ( req_id != alias_id ) &&
- get_ivrs_mappings(pdev->seg)[alias_id].intremap_table != NULL )
+ get_ivrs_mappings(iommu->seg)[alias_id].intremap_table != NULL )
{
lock = get_intremap_lock(iommu->seg, alias_id);
spin_lock_irqsave(lock, flags);
- free_intremap_entry(iommu->seg, alias_id, msi_desc->remap_index);
+ free_intremap_entry(iommu->seg, alias_id, *remap_index);
spin_unlock_irqrestore(lock, flags);
}
goto done;
@@ -324,7 +323,10 @@ static void update_intremap_entry_from_m
vector = (msg->data >> MSI_DATA_VECTOR_SHIFT) & MSI_DATA_VECTOR_MASK;
dest = (msg->address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff;
offset = get_intremap_offset(vector, delivery_mode);
- msi_desc->remap_index = offset;
+ if ( *remap_index < 0)
+ *remap_index = offset;
+ else
+ BUG_ON(*remap_index != offset);
entry = (u32*)get_intremap_entry(iommu->seg, req_id, offset);
update_intremap_entry(entry, vector, delivery_mode, dest_mode, dest);
@@ -339,7 +341,7 @@ static void update_intremap_entry_from_m
lock = get_intremap_lock(iommu->seg, alias_id);
if ( ( req_id != alias_id ) &&
- get_ivrs_mappings(pdev->seg)[alias_id].intremap_table != NULL )
+ get_ivrs_mappings(iommu->seg)[alias_id].intremap_table != NULL )
{
spin_lock_irqsave(lock, flags);
entry = (u32*)get_intremap_entry(iommu->seg, alias_id, offset);
@@ -362,27 +364,44 @@ void amd_iommu_msi_msg_update_ire(
struct msi_desc *msi_desc, struct msi_msg *msg)
{
struct pci_dev *pdev = msi_desc->dev;
+ int bdf = PCI_BDF2(pdev->bus, pdev->devfn);
struct amd_iommu *iommu = NULL;
if ( !iommu_intremap )
return;
- iommu = find_iommu_for_device(pdev->seg, (pdev->bus << 8) | pdev->devfn);
-
+ iommu = find_iommu_for_device(pdev->seg, bdf);
if ( !iommu )
{
- AMD_IOMMU_DEBUG("Fail to find iommu for MSI device id = 0x%x\n",
- (pdev->bus << 8) | pdev->devfn);
+ AMD_IOMMU_DEBUG("Fail to find iommu for MSI device id = 0x%x\n", bdf);
return;
}
if ( msi_desc->remap_index >= 0 )
- update_intremap_entry_from_msi_msg(iommu, pdev, msi_desc, NULL);
+ {
+ do {
+ update_intremap_entry_from_msi_msg(iommu, bdf,
+ &msi_desc->remap_index, NULL);
+ if ( !pdev || !pdev->phantom_stride )
+ break;
+ bdf += pdev->phantom_stride;
+ } while ( PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );
+
+ msi_desc->remap_index = -1;
+ if ( pdev )
+ bdf = PCI_BDF2(pdev->bus, pdev->devfn);
+ }
if ( !msg )
return;
- update_intremap_entry_from_msi_msg(iommu, pdev, msi_desc, msg);
+ do {
+ update_intremap_entry_from_msi_msg(iommu, bdf, &msi_desc->remap_index,
+ msg);
+ if ( !pdev || !pdev->phantom_stride )
+ break;
+ bdf += pdev->phantom_stride;
+ } while ( PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );
}
void amd_iommu_read_msi_from_ire(
++++++ 26547-tools-xc_fix_logic_error_in_stdiostream_progress.patch ++++++
changeset: 26547:8285d20a6f5b
user: Olaf Hering
date: Fri Feb 15 13:32:11 2013 +0000
files: tools/libxc/xtl_logger_stdio.c
description:
tools/xc: fix logic error in stdiostream_progress
Setting XTL_STDIOSTREAM_HIDE_PROGRESS should disable progress reporting.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 0141aeb86b79 -r 8285d20a6f5b tools/libxc/xtl_logger_stdio.c
--- a/tools/libxc/xtl_logger_stdio.c Fri Feb 15 13:32:10 2013 +0000
+++ b/tools/libxc/xtl_logger_stdio.c Fri Feb 15 13:32:11 2013 +0000
@@ -89,7 +89,7 @@ static void stdiostream_progress(struct
int newpel, extra_erase;
xentoollog_level this_level;
- if (!(lg->flags & XTL_STDIOSTREAM_HIDE_PROGRESS))
+ if (lg->flags & XTL_STDIOSTREAM_HIDE_PROGRESS)
return;
if (percent < lg->progress_last_percent) {
++++++ 26548-tools-xc_handle_tty_output_differently_in_stdiostream_progress.patch ++++++
changeset: 26548:e7d9bac5c11d
user: Olaf Hering
date: Fri Feb 15 13:32:11 2013 +0000
files: tools/libxc/xtl_logger_stdio.c
description:
tools/xc: handle tty output differently in stdiostream_progress
If the output goes to a tty, rewind the cursor and print everything in a
single line as it was done up to now. If the output goes to a file or
pipe print a newline after each progress output. This will fix logging
of progress messages from xc_save to xend.log.
To support XTL_STDIOSTREAM_SHOW_PID or XTL_STDIOSTREAM_SHOW_DATE print
the output via vmessage if the output is not a tty.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r 8285d20a6f5b -r e7d9bac5c11d tools/libxc/xtl_logger_stdio.c
--- a/tools/libxc/xtl_logger_stdio.c Fri Feb 15 13:32:11 2013 +0000
+++ b/tools/libxc/xtl_logger_stdio.c Fri Feb 15 13:32:11 2013 +0000
@@ -81,6 +81,17 @@ static void stdiostream_vmessage(xentool
fflush(lg->f);
}
+static void stdiostream_message(struct xentoollog_logger *logger_in,
+ xentoollog_level level,
+ const char *context,
+ const char *format, ...)
+{
+ va_list al;
+ va_start(al,format);
+ stdiostream_vmessage(logger_in, level, -1, context, format, al);
+ va_end(al);
+}
+
static void stdiostream_progress(struct xentoollog_logger *logger_in,
const char *context,
const char *doing_what, int percent,
@@ -105,11 +116,18 @@ static void stdiostream_progress(struct
if (this_level < lg->min_level)
return;
+ lg->progress_last_percent = percent;
+
+ if (isatty(fileno(lg->f)) <= 0) {
+ stdiostream_message(logger_in, this_level, context,
+ "%s: %lu/%lu %3d%%",
+ doing_what, done, total, percent);
+ return;
+ }
+
if (lg->progress_erase_len)
putc('\r', lg->f);
- lg->progress_last_percent = percent;
-
newpel = fprintf(lg->f, "%s%s" "%s: %lu/%lu %3d%%%s",
context?context:"", context?": ":"",
doing_what, done, total, percent,
++++++ 26549-tools-xc_turn_XCFLAGS__into_shifts.patch ++++++
changeset: 26549:d2991367ecd2
user: Olaf Hering
date: Fri Feb 15 13:32:12 2013 +0000
files: tools/libxc/xenguest.h
description:
tools/xc: turn XCFLAGS_* into shifts
to make it clear that these are bits and to make it easier to use in
xend code.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r e7d9bac5c11d -r d2991367ecd2 tools/libxc/xenguest.h
--- a/tools/libxc/xenguest.h Fri Feb 15 13:32:11 2013 +0000
+++ b/tools/libxc/xenguest.h Fri Feb 15 13:32:12 2013 +0000
@@ -23,11 +23,12 @@
#ifndef XENGUEST_H
#define XENGUEST_H
-#define XCFLAGS_LIVE 1
-#define XCFLAGS_DEBUG 2
-#define XCFLAGS_HVM 4
-#define XCFLAGS_STDVGA 8
-#define XCFLAGS_CHECKPOINT_COMPRESS 16
+#define XCFLAGS_LIVE (1 << 0)
+#define XCFLAGS_DEBUG (1 << 1)
+#define XCFLAGS_HVM (1 << 2)
+#define XCFLAGS_STDVGA (1 << 3)
+#define XCFLAGS_CHECKPOINT_COMPRESS (1 << 4)
+
#define X86_64_B_SIZE 64
#define X86_32_B_SIZE 32
++++++ 26550-tools-xc_restore_logging_in_xc_save.patch ++++++
changeset: 26550:e6c373fcb73e
user: Olaf Hering
date: Fri Feb 15 13:32:13 2013 +0000
files: tools/xcutils/xc_save.c
description:
tools/xc: restore logging in xc_save
Prior to xen-4.1 the helper xc_save would print some progress during
migration. With the new xc_interface_open API no more messages were
printed because no logger was configured.
Restore previous behaviour by providing a logger. The progress in
xc_domain_save will be disabled because it generates alot of output and
fills up xend.log quickly.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
diff -r d2991367ecd2 -r e6c373fcb73e tools/xcutils/xc_save.c
--- a/tools/xcutils/xc_save.c Fri Feb 15 13:32:12 2013 +0000
+++ b/tools/xcutils/xc_save.c Fri Feb 15 13:32:13 2013 +0000
@@ -166,17 +166,15 @@ static int switch_qemu_logdirty(int domi
int
main(int argc, char **argv)
{
- unsigned int maxit, max_f;
+ unsigned int maxit, max_f, lflags;
int io_fd, ret, port;
struct save_callbacks callbacks;
+ xentoollog_level lvl;
+ xentoollog_logger *l;
if (argc != 6)
errx(1, "usage: %s iofd domid maxit maxf flags", argv[0]);
- si.xch = xc_interface_open(0,0,0);
- if (!si.xch)
- errx(1, "failed to open control interface");
-
io_fd = atoi(argv[1]);
si.domid = atoi(argv[2]);
maxit = atoi(argv[3]);
@@ -185,6 +183,13 @@ main(int argc, char **argv)
si.suspend_evtchn = -1;
+ lvl = si.flags & XCFLAGS_DEBUG ? XTL_DEBUG: XTL_DETAIL;
+ lflags = XTL_STDIOSTREAM_HIDE_PROGRESS;
+ l = (xentoollog_logger *)xtl_createlogger_stdiostream(stderr, lvl, lflags);
+ si.xch = xc_interface_open(l, 0, 0);
+ if (!si.xch)
+ errx(1, "failed to open control interface");
+
si.xce = xc_evtchn_open(NULL, 0);
if (si.xce == NULL)
warnx("failed to open event channel handle");
++++++ 26551-tools-xc_log_pid_in_xc_save-xc_restore_output.patch ++++++
changeset: 26551:48f9436959dd
user: Olaf Hering
date: Fri Feb 15 13:32:13 2013 +0000
files: tools/libxc/xc_domain_restore.c tools/libxc/xc_domain_save.c tools/xcutils/xc_restore.c tools/xcutils/xc_save.c
description:
tools/xc: log pid in xc_save/xc_restore output
If several migrations log their output to xend.log its not clear which
line belongs to a which guest. Print entry/exit of xc_save and
xc_restore and also request to print pid with each log call.
Signed-off-by: Olaf Hering
Acked-by: Ian Campbell
Committed-by: Ian Campbell
Index: xen-4.2.1-testing/tools/libxc/xc_domain_restore.c
===================================================================
--- xen-4.2.1-testing.orig/tools/libxc/xc_domain_restore.c
+++ xen-4.2.1-testing/tools/libxc/xc_domain_restore.c
@@ -1382,6 +1382,8 @@ int xc_domain_restore(xc_interface *xch,
struct restore_ctx *ctx = &_ctx;
struct domain_info_context *dinfo = &ctx->dinfo;
+ DPRINTF("%s: starting restore of new domid %u", __func__, dom);
+
pagebuf_init(&pagebuf);
memset(&tailbuf, 0, sizeof(tailbuf));
tailbuf.ishvm = hvm;
@@ -1408,7 +1410,7 @@ int xc_domain_restore(xc_interface *xch,
PERROR("read: p2m_size");
goto out;
}
- DPRINTF("xc_domain_restore start: p2m_size = %lx\n", dinfo->p2m_size);
+ DPRINTF("%s: p2m_size = %lx\n", __func__, dinfo->p2m_size);
if ( !get_platform_info(xch, dom,
&ctx->max_mfn, &ctx->hvirt_start, &ctx->pt_levels, &dinfo->guest_width) )
@@ -2215,7 +2217,7 @@ int xc_domain_restore(xc_interface *xch,
fcntl(io_fd, F_SETFL, orig_io_fd_flags);
- DPRINTF("Restore exit with rc=%d\n", rc);
+ DPRINTF("Restore exit of domid %u with rc=%d\n", dom, rc);
return rc;
}
Index: xen-4.2.1-testing/tools/libxc/xc_domain_save.c
===================================================================
--- xen-4.2.1-testing.orig/tools/libxc/xc_domain_save.c
+++ xen-4.2.1-testing/tools/libxc/xc_domain_save.c
@@ -897,6 +897,8 @@ int xc_domain_save(xc_interface *xch, in
int completed = 0;
+ DPRINTF("%s: starting save of domid %u", __func__, dom);
+
if ( hvm && !callbacks->switch_qemu_logdirty )
{
ERROR("No switch_qemu_logdirty callback provided.");
@@ -2112,7 +2114,7 @@ int xc_domain_save(xc_interface *xch, in
free(pfn_err);
free(to_fix);
- DPRINTF("Save exit rc=%d\n",rc);
+ DPRINTF("Save exit of domid %u with rc=%d\n", dom, rc);
return !!rc;
}
Index: xen-4.2.1-testing/tools/xcutils/xc_restore.c
===================================================================
--- xen-4.2.1-testing.orig/tools/xcutils/xc_restore.c
+++ xen-4.2.1-testing/tools/xcutils/xc_restore.c
@@ -19,17 +19,22 @@ int
main(int argc, char **argv)
{
unsigned int domid, store_evtchn, console_evtchn;
- unsigned int hvm, pae, apic;
+ unsigned int hvm, pae, apic, lflags;
xc_interface *xch;
int io_fd, ret;
int superpages;
unsigned long store_mfn, console_mfn;
+ xentoollog_level lvl;
+ xentoollog_logger *l;
if ( (argc != 8) && (argc != 9) )
errx(1, "usage: %s iofd domid store_evtchn "
"console_evtchn hvm pae apic [superpages]", argv[0]);
- xch = xc_interface_open(0,0,0);
+ lvl = XTL_DETAIL;
+ lflags = XTL_STDIOSTREAM_SHOW_PID | XTL_STDIOSTREAM_HIDE_PROGRESS;
+ l = (xentoollog_logger *)xtl_createlogger_stdiostream(stderr, lvl, lflags);
+ xch = xc_interface_open(l, 0, 0);
if ( !xch )
errx(1, "failed to open control interface");
Index: xen-4.2.1-testing/tools/xcutils/xc_save.c
===================================================================
--- xen-4.2.1-testing.orig/tools/xcutils/xc_save.c
+++ xen-4.2.1-testing/tools/xcutils/xc_save.c
@@ -184,7 +184,7 @@ main(int argc, char **argv)
si.suspend_evtchn = -1;
lvl = si.flags & XCFLAGS_DEBUG ? XTL_DEBUG: XTL_DETAIL;
- lflags = XTL_STDIOSTREAM_HIDE_PROGRESS;
+ lflags = XTL_STDIOSTREAM_SHOW_PID | XTL_STDIOSTREAM_HIDE_PROGRESS;
l = (xentoollog_logger *)xtl_createlogger_stdiostream(stderr, lvl, lflags);
si.xch = xc_interface_open(l, 0, 0);
if (!si.xch)
++++++ 26554-hvm-firmware-passthrough.patch ++++++
# HG changeset patch
# User Ross Philipson
# Date 1360935136 0
# Node ID 3124ab7855fd7d4e0f3ea125cb21b60d693e8800
# Parent 71c15ae0998378b5c117bbd27a48015757685706
libxl: switch to using the new xc_hvm_build() libxc API.
Signed-off-by: Ross Philipson
Acked-by: Ian Campbell
Committed-by: Ian Campbell
Index: xen-4.2.2-testing/tools/libxl/libxl_dom.c
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/libxl_dom.c
+++ xen-4.2.2-testing/tools/libxl/libxl_dom.c
@@ -546,17 +546,24 @@ int libxl__build_hvm(libxl__gc *gc, uint
libxl__domain_build_state *state)
{
libxl_ctx *ctx = libxl__gc_owner(gc);
+ struct xc_hvm_build_args args = {};
int ret, rc = ERROR_FAIL;
const char *firmware = libxl__domain_firmware(gc, info);
if (!firmware)
goto out;
- ret = xc_hvm_build_target_mem(
- ctx->xch,
- domid,
- (info->max_memkb - info->video_memkb) / 1024,
- (info->target_memkb - info->video_memkb) / 1024,
- firmware);
+
+ memset(&args, 0, sizeof(struct xc_hvm_build_args));
+ /* The params from the configuration file are in Mb, which are then
+ * multiplied by 1 Kb. This was then divided off when calling
+ * the old xc_hvm_build_target_mem() which then turned them to bytes.
+ * Do all this in one step here...
+ */
+ args.mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
+ args.mem_target = (uint64_t)(info->target_memkb - info->video_memkb) << 10;
+ args.image_file_name = firmware;
+
+ ret = xc_hvm_build(ctx->xch, domid, &args);
if (ret) {
LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, ret, "hvm building failed");
goto out;
++++++ 26555-hvm-firmware-passthrough.patch ++++++
# HG changeset patch
# User Ross Philipson
# Date 1360935136 0
# Node ID 17a228e37ec0913ff86b8b5f2d88f1b8e92146f1
# Parent 3124ab7855fd7d4e0f3ea125cb21b60d693e8800
libxl: HVM firmware passthrough support
This patch introduces support for two new parameters in libxl:
smbios_firmware=
acpi_firmware=
The changes are primarily in the domain building code where the firmware files
are read and passed to libxc for loading into the new guest. After the domain
building call to libxc, the addresses for the loaded blobs are returned and
written to xenstore.
LIBXL_HAVE_FIRMWARE_PASSTHROUGH is defined in libxl.h to allow users to
determine if the feature is present.
This patch also updates the xl.cfg man page with descriptions of the two new
parameters for firmware passthrough.
Signed-off-by: Ross Philipson
Acked-by: Ian Campbell
Committed-by: Ian Campbell
Index: xen-4.2.2-testing/docs/man/xl.cfg.pod.5
===================================================================
--- xen-4.2.2-testing.orig/docs/man/xl.cfg.pod.5
+++ xen-4.2.2-testing/docs/man/xl.cfg.pod.5
@@ -637,6 +637,25 @@ of Xen) within a Xen guest or to support
which uses hardware virtualisation extensions (e.g. Windows XP
compatibility mode on more modern Windows OS).
+=item B
+
+Specify a path to a file that contains extra ACPI firmware tables to pass in to
+a guest. The file can contain several tables in their binary AML form
+concatenated together. Each table self describes its length so no additional
+information is needed. These tables will be added to the ACPI table set in the
+guest. Note that existing tables cannot be overridden by this feature. For
+example this cannot be used to override tables like DSDT, FADT, etc.
+
+=item B
+
+Specify a path to a file that contains extra SMBIOS firmware structures to pass
+in to a guest. The file can contain a set DMTF predefined structures which will
+override the internal defaults. Not all predefined structures can be overridden,
+only the following types: 0, 1, 2, 3, 11, 22, 39. The file can also contain any
+number of vendor defined SMBIOS structures (type 128 - 255). Since SMBIOS
+structures do not present their overall size, each entry in the file must be
+preceded by a 32b integer indicating the size of the next structure.
+
=back
=head3 Guest Virtual Time Controls
Index: xen-4.2.2-testing/tools/libxl/libxl.h
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/libxl.h
+++ xen-4.2.2-testing/tools/libxl/libxl.h
@@ -68,6 +68,13 @@
*/
/*
+ * LIBXL_HAVE_FIRMWARE_PASSTHROUGH indicates the feature for
+ * passing in SMBIOS and ACPI firmware to HVM guests is present
+ * in the library.
+ */
+#define LIBXL_HAVE_FIRMWARE_PASSTHROUGH 1
+
+/*
* libxl ABI compatibility
*
* The only guarantee which libxl makes regarding ABI compatibility
Index: xen-4.2.2-testing/tools/libxl/libxl_dom.c
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/libxl_dom.c
+++ xen-4.2.2-testing/tools/libxl/libxl_dom.c
@@ -22,6 +22,7 @@
#include
#include
+#include
libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid)
{
@@ -514,11 +515,61 @@ static int hvm_build_set_params(xc_inter
return 0;
}
-static const char *libxl__domain_firmware(libxl__gc *gc,
- libxl_domain_build_info *info)
+static int hvm_build_set_xs_values(libxl__gc *gc,
+ uint32_t domid,
+ struct xc_hvm_build_args *args)
+{
+ char *path = NULL;
+ int ret = 0;
+
+ if (args->smbios_module.guest_addr_out) {
+ path = GCSPRINTF("/local/domain/%d/"HVM_XS_SMBIOS_PT_ADDRESS, domid);
+
+ ret = libxl__xs_write(gc, XBT_NULL, path, "0x%"PRIx64,
+ args->smbios_module.guest_addr_out);
+ if (ret)
+ goto err;
+
+ path = GCSPRINTF("/local/domain/%d/"HVM_XS_SMBIOS_PT_LENGTH, domid);
+
+ ret = libxl__xs_write(gc, XBT_NULL, path, "0x%x",
+ args->smbios_module.length);
+ if (ret)
+ goto err;
+ }
+
+ if (args->acpi_module.guest_addr_out) {
+ path = GCSPRINTF("/local/domain/%d/"HVM_XS_ACPI_PT_ADDRESS, domid);
+
+ ret = libxl__xs_write(gc, XBT_NULL, path, "0x%"PRIx64,
+ args->acpi_module.guest_addr_out);
+ if (ret)
+ goto err;
+
+ path = GCSPRINTF("/local/domain/%d/"HVM_XS_ACPI_PT_LENGTH, domid);
+
+ ret = libxl__xs_write(gc, XBT_NULL, path, "0x%x",
+ args->acpi_module.length);
+ if (ret)
+ goto err;
+ }
+
+ return 0;
+
+err:
+ LOG(ERROR, "failed to write firmware xenstore value, err: %d", ret);
+ return ret;
+}
+
+static int libxl__domain_firmware(libxl__gc *gc,
+ libxl_domain_build_info *info,
+ struct xc_hvm_build_args *args)
{
libxl_ctx *ctx = libxl__gc_owner(gc);
const char *firmware;
+ int e, rc = ERROR_FAIL;
+ int datalen = 0;
+ void *data;
if (info->u.hvm.firmware)
firmware = info->u.hvm.firmware;
@@ -532,13 +583,52 @@ static const char *libxl__domain_firmwar
firmware = "hvmloader";
break;
default:
- LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "invalid device model version %d",
- info->device_model_version);
- return NULL;
+ LOG(ERROR, "invalid device model version %d",
+ info->device_model_version);
+ return ERROR_FAIL;
break;
}
}
- return libxl__abs_path(gc, firmware, libxl__xenfirmwaredir_path());
+ args->image_file_name = libxl__abs_path(gc, firmware,
+ libxl__xenfirmwaredir_path());
+
+ if (info->u.hvm.smbios_firmware) {
+ data = NULL;
+ e = libxl_read_file_contents(ctx, info->u.hvm.smbios_firmware,
+ &data, &datalen);
+ if (e) {
+ LOGEV(ERROR, e, "failed to read SMBIOS firmware file %s",
+ info->u.hvm.smbios_firmware);
+ goto out;
+ }
+ libxl__ptr_add(gc, data);
+ if (datalen) {
+ /* Only accept non-empty files */
+ args->smbios_module.data = data;
+ args->smbios_module.length = (uint32_t)datalen;
+ }
+ }
+
+ if (info->u.hvm.acpi_firmware) {
+ data = NULL;
+ e = libxl_read_file_contents(ctx, info->u.hvm.acpi_firmware,
+ &data, &datalen);
+ if (e) {
+ LOGEV(ERROR, e, "failed to read ACPI firmware file %s",
+ info->u.hvm.acpi_firmware);
+ goto out;
+ }
+ libxl__ptr_add(gc, data);
+ if (datalen) {
+ /* Only accept non-empty files */
+ args->acpi_module.data = data;
+ args->acpi_module.length = (uint32_t)datalen;
+ }
+ }
+
+ return 0;
+out:
+ return rc;
}
int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
@@ -548,10 +638,6 @@ int libxl__build_hvm(libxl__gc *gc, uint
libxl_ctx *ctx = libxl__gc_owner(gc);
struct xc_hvm_build_args args = {};
int ret, rc = ERROR_FAIL;
- const char *firmware = libxl__domain_firmware(gc, info);
-
- if (!firmware)
- goto out;
memset(&args, 0, sizeof(struct xc_hvm_build_args));
/* The params from the configuration file are in Mb, which are then
@@ -561,22 +647,34 @@ int libxl__build_hvm(libxl__gc *gc, uint
*/
args.mem_size = (uint64_t)(info->max_memkb - info->video_memkb) << 10;
args.mem_target = (uint64_t)(info->target_memkb - info->video_memkb) << 10;
- args.image_file_name = firmware;
+
+ if (libxl__domain_firmware(gc, info, &args)) {
+ LOG(ERROR, "initializing domain firmware failed");
+ goto out;
+ }
ret = xc_hvm_build(ctx->xch, domid, &args);
if (ret) {
- LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, ret, "hvm building failed");
+ LOGEV(ERROR, ret, "hvm building failed");
goto out;
}
+
ret = hvm_build_set_params(ctx->xch, domid, info, state->store_port,
&state->store_mfn, state->console_port,
&state->console_mfn, state->store_domid,
state->console_domid);
if (ret) {
- LIBXL__LOG_ERRNOVAL(ctx, LIBXL__LOG_ERROR, ret, "hvm build set params failed");
+ LOGEV(ERROR, ret, "hvm build set params failed");
goto out;
}
- rc = 0;
+
+ ret = hvm_build_set_xs_values(gc, domid, &args);
+ if (ret) {
+ LOG(ERROR, "hvm build set xenstore values failed (ret=%d)", ret);
+ goto out;
+ }
+
+ return 0;
out:
return rc;
}
@@ -638,7 +736,7 @@ int libxl__toolstack_restore(uint32_t do
memcpy(&count, ptr, sizeof(count));
ptr += sizeof(count);
-
+
if (size < sizeof(version) + sizeof(count) +
count * (sizeof(struct libxl__physmap_info))) {
LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "wrong size");
@@ -852,7 +950,7 @@ static void switch_logdirty_xswatch(libx
rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
if (rc) goto out;
- rc = libxl__xs_transaction_commit(gc, &t);
+ rc = libxl__xs_transaction_commit(gc, &t);
if (!rc) break;
if (rc<0) goto out;
}
@@ -1324,7 +1422,7 @@ void libxl__xc_domain_save_done(libxl__e
if (type == LIBXL_DOMAIN_TYPE_HVM) {
rc = libxl__domain_suspend_device_model(gc, dss);
if (rc) goto out;
-
+
libxl__domain_save_device_model(egc, dss, domain_suspend_done);
return;
}
Index: xen-4.2.2-testing/tools/libxl/libxl_types.idl
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/libxl_types.idl
+++ xen-4.2.2-testing/tools/libxl/libxl_types.idl
@@ -301,6 +301,8 @@ libxl_domain_build_info = Struct("domain
("vpt_align", libxl_defbool),
("timer_mode", libxl_timer_mode),
("nested_hvm", libxl_defbool),
+ ("smbios_firmware", string),
+ ("acpi_firmware", string),
("nographic", libxl_defbool),
("vga", libxl_vga_interface_info),
("vnc", libxl_vnc_info),
Index: xen-4.2.2-testing/tools/libxl/xl_cmdimpl.c
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/xl_cmdimpl.c
+++ xen-4.2.2-testing/tools/libxl/xl_cmdimpl.c
@@ -863,6 +863,11 @@ static void parse_config_data(const char
}
xlu_cfg_get_defbool(config, "nestedhvm", &b_info->u.hvm.nested_hvm, 0);
+
+ xlu_cfg_replace_string(config, "smbios_firmware",
+ &b_info->u.hvm.smbios_firmware, 0);
+ xlu_cfg_replace_string(config, "acpi_firmware",
+ &b_info->u.hvm.acpi_firmware, 0);
break;
case LIBXL_DOMAIN_TYPE_PV:
{
++++++ 26556-hvm-firmware-passthrough.patch ++++++
# HG changeset patch
# User Ross Philipson
# Date 1360935137 0
# Node ID 6a9549a15108669408123e5e39f52ad09dea1c10
# Parent 17a228e37ec0913ff86b8b5f2d88f1b8e92146f1
libxl: Cleanup, use LOG* and GCSPRINTF macro in libxl_dom.c
Signed-off-by: Ross Philipson
Acked-by: Ian Campbell
Committed-by: Ian Campbell
Index: xen-4.2.2-testing/tools/libxl/libxl_dom.c
===================================================================
--- xen-4.2.2-testing.orig/tools/libxl/libxl_dom.c
+++ xen-4.2.2-testing/tools/libxl/libxl_dom.c
@@ -32,8 +32,7 @@ libxl_domain_type libxl__domain_type(lib
ret = xc_domain_getinfolist(ctx->xch, domid, 1, &info);
if (ret != 1 || info.domain != domid) {
- LIBXL__LOG(CTX, LIBXL__LOG_ERROR,
- "unable to get domain type for domid=%"PRIu32, domid);
+ LOG(ERROR, "unable to get domain type for domid=%"PRIu32, domid);
return LIBXL_DOMAIN_TYPE_INVALID;
}
if (info.flags & XEN_DOMINF_hvm_guest)
@@ -317,20 +316,19 @@ int libxl__build_post(libxl__gc *gc, uin
ents = libxl__calloc(gc, 12 + (info->max_vcpus * 2) + 2, sizeof(char *));
ents[0] = "memory/static-max";
- ents[1] = libxl__sprintf(gc, "%"PRId64, info->max_memkb);
+ ents[1] = GCSPRINTF("%"PRId64, info->max_memkb);
ents[2] = "memory/target";
- ents[3] = libxl__sprintf(gc, "%"PRId64,
- info->target_memkb - info->video_memkb);
+ ents[3] = GCSPRINTF("%"PRId64, info->target_memkb - info->video_memkb);
ents[4] = "memory/videoram";
- ents[5] = libxl__sprintf(gc, "%"PRId64, info->video_memkb);
+ ents[5] = GCSPRINTF("%"PRId64, info->video_memkb);
ents[6] = "domid";
- ents[7] = libxl__sprintf(gc, "%d", domid);
+ ents[7] = GCSPRINTF("%d", domid);
ents[8] = "store/port";
- ents[9] = libxl__sprintf(gc, "%"PRIu32, state->store_port);
+ ents[9] = GCSPRINTF("%"PRIu32, state->store_port);
ents[10] = "store/ring-ref";
- ents[11] = libxl__sprintf(gc, "%lu", state->store_mfn);
+ ents[11] = GCSPRINTF("%lu", state->store_mfn);
for (i = 0; i < info->max_vcpus; i++) {
- ents[12+(i*2)] = libxl__sprintf(gc, "cpu/%d/availability", i);
+ ents[12+(i*2)] = GCSPRINTF("cpu/%d/availability", i);
ents[12+(i*2)+1] = libxl_bitmap_test(&info->avail_vcpus, i)
? "online" : "offline";
}
@@ -339,7 +337,7 @@ int libxl__build_post(libxl__gc *gc, uin
if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
hvm_ents = libxl__calloc(gc, 3, sizeof(char *));
hvm_ents[0] = "hvmloader/generation-id-address";
- hvm_ents[1] = libxl__sprintf(gc, "0x%lx", state->vm_generationid_addr);
+ hvm_ents[1] = GCSPRINTF("0x%lx", state->vm_generationid_addr);
}
dom_path = libxl__xs_get_dompath(gc, domid);
@@ -347,7 +345,7 @@ int libxl__build_post(libxl__gc *gc, uin
return ERROR_FAIL;
}
- vm_path = xs_read(ctx->xsh, XBT_NULL, libxl__sprintf(gc, "%s/vm", dom_path), NULL);
+ vm_path = xs_read(ctx->xsh, XBT_NULL, GCSPRINTF("%s/vm", dom_path), NULL);
retry_transaction:
t = xs_transaction_start(ctx->xsh);
@@ -378,7 +376,7 @@ int libxl__build_pv(libxl__gc *gc, uint3
dom = xc_dom_allocate(ctx->xch, state->pv_cmdline, info->u.pv.features);
if (!dom) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_allocate failed");
+ LOGE(ERROR, "xc_dom_allocate failed");
return ERROR_FAIL;
}
@@ -388,13 +386,13 @@ int libxl__build_pv(libxl__gc *gc, uint3
state->pv_kernel.data,
state->pv_kernel.size);
if ( ret != 0) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_kernel_mem failed");
+ LOGE(ERROR, "xc_dom_kernel_mem failed");
goto out;
}
} else {
ret = xc_dom_kernel_file(dom, state->pv_kernel.path);
if ( ret != 0) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_kernel_file failed");
+ LOGE(ERROR, "xc_dom_kernel_file failed");
goto out;
}
}
@@ -402,12 +400,12 @@ int libxl__build_pv(libxl__gc *gc, uint3
if ( state->pv_ramdisk.path && strlen(state->pv_ramdisk.path) ) {
if (state->pv_ramdisk.mapped) {
if ( (ret = xc_dom_ramdisk_mem(dom, state->pv_ramdisk.data, state->pv_ramdisk.size)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_ramdisk_mem failed");
+ LOGE(ERROR, "xc_dom_ramdisk_mem failed");
goto out;
}
} else {
if ( (ret = xc_dom_ramdisk_file(dom, state->pv_ramdisk.path)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_ramdisk_file failed");
+ LOGE(ERROR, "xc_dom_ramdisk_file failed");
goto out;
}
}
@@ -420,31 +418,31 @@ int libxl__build_pv(libxl__gc *gc, uint3
dom->xenstore_domid = state->store_domid;
if ( (ret = xc_dom_boot_xen_init(dom, ctx->xch, domid)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_boot_xen_init failed");
+ LOGE(ERROR, "xc_dom_boot_xen_init failed");
goto out;
}
if ( (ret = xc_dom_parse_image(dom)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_parse_image failed");
+ LOGE(ERROR, "xc_dom_parse_image failed");
goto out;
}
if ( (ret = xc_dom_mem_init(dom, info->target_memkb / 1024)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_mem_init failed");
+ LOGE(ERROR, "xc_dom_mem_init failed");
goto out;
}
if ( (ret = xc_dom_boot_mem_init(dom)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_boot_mem_init failed");
+ LOGE(ERROR, "xc_dom_boot_mem_init failed");
goto out;
}
if ( (ret = xc_dom_build_image(dom)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_build_image failed");
+ LOGE(ERROR, "xc_dom_build_image failed");
goto out;
}
if ( (ret = xc_dom_boot_image(dom)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_boot_image failed");
+ LOGE(ERROR, "xc_dom_boot_image failed");
goto out;
}
if ( (ret = xc_dom_gnttab_init(dom)) != 0 ) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "xc_dom_gnttab_init failed");
+ LOGE(ERROR, "xc_dom_gnttab_init failed");
goto out;
}
@@ -683,8 +681,7 @@ int libxl__qemu_traditional_cmd(libxl__g
const char *cmd)
{
char *path = NULL;
- path = libxl__sprintf(gc, "/local/domain/0/device-model/%d/command",
- domid);
+ path = GCSPRINTF("/local/domain/0/device-model/%d/command", domid);
return libxl__xs_write(gc, XBT_NULL, path, "%s", cmd);
}
@@ -701,8 +698,7 @@ struct libxl__physmap_info {
static inline char *restore_helper(libxl__gc *gc, uint32_t domid,
uint64_t phys_offset, char *node)
{
- return libxl__sprintf(gc,
- "/local/domain/0/device-model/%d/physmap/%"PRIx64"/%s",
+ return GCSPRINTF("/local/domain/0/device-model/%d/physmap/%"PRIx64"/%s",
domid, phys_offset, node);
}
@@ -712,7 +708,6 @@ int libxl__toolstack_restore(uint32_t do
libxl__save_helper_state *shs = user;
libxl__domain_create_state *dcs = CONTAINER_OF(shs, *dcs, shs);
STATE_AO_GC(dcs->ao);
- libxl_ctx *ctx = CTX;
int i, ret;
const uint8_t *ptr = buf;
uint32_t count = 0, version = 0;
@@ -722,7 +717,7 @@ int libxl__toolstack_restore(uint32_t do
LOG(DEBUG,"domain=%"PRIu32" toolstack data size=%"PRIu32, domid, size);
if (size < sizeof(version) + sizeof(count)) {
- LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "wrong size");
+ LOG(ERROR, "wrong size");
return -1;
}
@@ -730,7 +725,7 @@ int libxl__toolstack_restore(uint32_t do
ptr += sizeof(version);
if (version != TOOLSTACK_SAVE_VERSION) {
- LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "wrong version");
+ LOG(ERROR, "wrong version");
return -1;
}
@@ -739,7 +734,7 @@ int libxl__toolstack_restore(uint32_t do
if (size < sizeof(version) + sizeof(count) +
count * (sizeof(struct libxl__physmap_info))) {
- LIBXL__LOG(ctx, LIBXL__LOG_ERROR, "wrong size");
+ LOG(ERROR, "wrong size");
return -1;
}
@@ -988,15 +983,13 @@ static void switch_logdirty_done(libxl__
int libxl__domain_suspend_device_model(libxl__gc *gc,
libxl__domain_suspend_state *dss)
{
- libxl_ctx *ctx = libxl__gc_owner(gc);
int ret = 0;
uint32_t const domid = dss->domid;
const char *const filename = dss->dm_savefile;
switch (libxl__device_model_version_running(gc, domid)) {
case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL: {
- LIBXL__LOG(ctx, LIBXL__LOG_DEBUG,
- "Saving device model state to %s", filename);
+ LOG(DEBUG, "Saving device model state to %s", filename);
libxl__qemu_traditional_cmd(gc, domid, "save");
libxl__wait_for_device_model(gc, domid, "paused", NULL, NULL, NULL);
break;
@@ -1172,8 +1165,7 @@ int libxl__domain_suspend_common_callbac
static inline char *physmap_path(libxl__gc *gc, uint32_t domid,
char *phys_offset, char *node)
{
- return libxl__sprintf(gc,
- "/local/domain/0/device-model/%d/physmap/%s/%s",
+ return GCSPRINTF("/local/domain/0/device-model/%d/physmap/%s/%s",
domid, phys_offset, node);
}
@@ -1190,7 +1182,7 @@ int libxl__toolstack_save(uint32_t domid
char **entries = NULL;
struct libxl__physmap_info *pi;
- entries = libxl__xs_directory(gc, 0, libxl__sprintf(gc,
+ entries = libxl__xs_directory(gc, 0, GCSPRINTF(
"/local/domain/0/device-model/%d/physmap", domid), &num);
count = num;
@@ -1331,7 +1323,7 @@ void libxl__domain_suspend(libxl__egc *e
char *path;
char *addr;
- path = libxl__sprintf(gc, "%s/hvmloader/generation-id-address",
+ path = GCSPRINTF("%s/hvmloader/generation-id-address",
libxl__xs_get_dompath(gc, domid));
addr = libxl__xs_read(gc, XBT_NULL, path);
@@ -1545,10 +1537,7 @@ static void domain_suspend_done(libxl__e
char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
{
- char *s = libxl__sprintf(gc, LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
- if (!s)
- LIBXL__LOG(libxl__gc_owner(gc), LIBXL__LOG_ERROR, "cannot allocate for uuid");
- return s;
+ return GCSPRINTF(LIBXL_UUID_FMT, LIBXL_UUID_BYTES(uuid));
}
static const char *userdata_path(libxl__gc *gc, uint32_t domid,
@@ -1556,34 +1545,27 @@ static const char *userdata_path(libxl__
const char *wh)
{
libxl_ctx *ctx = libxl__gc_owner(gc);
- char *path, *uuid_string;
+ char *uuid_string;
libxl_dominfo info;
int rc;
rc = libxl_domain_info(ctx, &info, domid);
if (rc) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "unable to find domain info"
- " for domain %"PRIu32, domid);
+ LOGE(ERROR, "unable to find domain info for domain %"PRIu32, domid);
return NULL;
}
- uuid_string = libxl__sprintf(gc, LIBXL_UUID_FMT, LIBXL_UUID_BYTES(info.uuid));
+ uuid_string = GCSPRINTF(LIBXL_UUID_FMT, LIBXL_UUID_BYTES(info.uuid));
- path = libxl__sprintf(gc, "/var/lib/xen/"
- "userdata-%s.%u.%s.%s",
- wh, domid, uuid_string, userdata_userid);
- if (!path)
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "unable to allocate for"
- " userdata path");
- return path;
+ return GCSPRINTF("/var/lib/xen/userdata-%s.%u.%s.%s",
+ wh, domid, uuid_string, userdata_userid);
}
static int userdata_delete(libxl__gc *gc, const char *path)
{
- libxl_ctx *ctx = libxl__gc_owner(gc);
int r;
r = unlink(path);
if (r) {
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "remove failed for %s", path);
+ LOGE(ERROR, "remove failed for %s", path);
return errno;
}
return 0;
@@ -1591,7 +1573,6 @@ static int userdata_delete(libxl__gc *gc
void libxl__userdata_destroyall(libxl__gc *gc, uint32_t domid)
{
- libxl_ctx *ctx = libxl__gc_owner(gc);
const char *pattern;
glob_t gl;
int r, i;
@@ -1607,7 +1588,7 @@ void libxl__userdata_destroyall(libxl__g
if (r == GLOB_NOMATCH)
goto out;
if (r)
- LIBXL__LOG_ERRNO(ctx, LIBXL__LOG_ERROR, "glob failed for %s", pattern);
+ LOGE(ERROR, "glob failed for %s", pattern);
for (i=0; i
# Date 1361176078 -3600
# Node ID 4c3355d776e115f979fd2abc135bb77ba710f0d4
# Parent 217a4fc4cd46e8de06f2f43eed727838891e9398
x86/VMX: fix live migration while enabling APICV
SVI should be restored in case guest is processing virtual interrupt
while saveing a domain state. Otherwise SVI would be missed when
virtual interrupt delivery is enabled.
Signed-off-by: Jiongxi Li
Acked-by: Eddie Dong
Acked-by: Jun Nakajima
Committed-by: Jan Beulich
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -1194,6 +1194,9 @@ static int lapic_load_regs(struct domain
if ( hvm_load_entry(LAPIC_REGS, h, s->regs) != 0 )
return -EINVAL;
+ if ( hvm_funcs.process_isr )
+ hvm_funcs.process_isr(vlapic_find_highest_isr(s), v);
+
vlapic_adjust_i8259_target(d);
lapic_rearm(s);
return 0;
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -290,8 +290,8 @@ void vmx_intr_assist(void)
vmx_set_eoi_exit_bitmap(v, pt_vector);
/* we need update the RVI field */
- status &= ~(unsigned long)0x0FF;
- status |= (unsigned long)0x0FF &
+ status &= ~VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK;
+ status |= VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK &
intack.vector;
__vmwrite(GUEST_INTR_STATUS, status);
if (v->arch.hvm_vmx.eoi_exitmap_changed) {
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1523,6 +1523,29 @@ static int vmx_virtual_intr_delivery_ena
return cpu_has_vmx_virtual_intr_delivery;
}
+static void vmx_process_isr(int isr, struct vcpu *v)
+{
+ unsigned long status;
+ u8 old;
+
+ if ( !cpu_has_vmx_virtual_intr_delivery )
+ return;
+
+ if ( isr < 0 )
+ isr = 0;
+
+ vmx_vmcs_enter(v);
+ status = __vmread(GUEST_INTR_STATUS);
+ old = status >> VMX_GUEST_INTR_STATUS_SVI_OFFSET;
+ if ( isr != old )
+ {
+ status &= VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK;
+ status |= isr << VMX_GUEST_INTR_STATUS_SVI_OFFSET;
+ __vmwrite(GUEST_INTR_STATUS, status);
+ }
+ vmx_vmcs_exit(v);
+}
+
static struct hvm_function_table __read_mostly vmx_function_table = {
.name = "VMX",
.cpu_up_prepare = vmx_cpu_up_prepare,
@@ -1571,7 +1594,8 @@ static struct hvm_function_table __read_
.nhvm_intr_blocked = nvmx_intr_blocked,
.nhvm_domain_relinquish_resources = nvmx_domain_relinquish_resources,
.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
- .virtual_intr_delivery_enabled = vmx_virtual_intr_delivery_enabled
+ .virtual_intr_delivery_enabled = vmx_virtual_intr_delivery_enabled,
+ .process_isr = vmx_process_isr,
};
struct hvm_function_table * __init start_vmx(void)
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -184,6 +184,7 @@ struct hvm_function_table {
/* Virtual interrupt delivery */
void (*update_eoi_exit_bitmap)(struct vcpu *v, u8 vector, u8 trig);
int (*virtual_intr_delivery_enabled)(void);
+ void (*process_isr)(int isr, struct vcpu *v);
};
extern struct hvm_function_table hvm_funcs;
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -246,6 +246,10 @@ extern bool_t cpu_has_vmx_ins_outs_instr
#define VMX_INTR_SHADOW_SMI 0x00000004
#define VMX_INTR_SHADOW_NMI 0x00000008
+/* Guest interrupt status */
+#define VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK 0x0FF
+#define VMX_GUEST_INTR_STATUS_SVI_OFFSET 8
+
/* VMCS field encodings. */
enum vmcs_field {
VIRTUAL_PROCESSOR_ID = 0x00000000,
++++++ 26577-x86-APICV-x2APIC.patch ++++++
References: FATE#313605
# HG changeset patch
# User Jiongxi Li
# Date 1361176458 -3600
# Node ID 45d59b822ed187c535b127679e32853b148ed411
# Parent 4c3355d776e115f979fd2abc135bb77ba710f0d4
x86/VMX: fix VMCS setting for x2APIC mode guest while enabling APICV
The "APIC-register virtualization" and "virtual-interrupt deliver"
VM-execution control has no effect on the behavior of RDMSR/WRMSR if
the "virtualize x2APIC mode" VM-execution control is 0.
When guest uses x2APIC mode, we should enable "virtualize x2APIC mode"
for APICV first.
Signed-off-by: Jiongxi Li
Acked-by: Eddie Dong
Acked-by: Jun Nakajima
Committed-by: Jan Beulich
Index: xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmcs.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/vmx/vmcs.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmcs.c
@@ -194,7 +194,8 @@ static int vmx_init_vmcs_config(void)
*/
if ( _vmx_cpu_based_exec_control & CPU_BASED_TPR_SHADOW )
opt |= SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY;
+ SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
+ SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
_vmx_secondary_exec_control = adjust_vmx_controls(
@@ -673,19 +674,59 @@ void vmx_disable_intercept_for_msr(struc
*/
if ( msr <= 0x1fff )
{
- if (type & MSR_TYPE_R)
- __clear_bit(msr, msr_bitmap + 0x000/BYTES_PER_LONG); /* read-low */
- if (type & MSR_TYPE_W)
- __clear_bit(msr, msr_bitmap + 0x800/BYTES_PER_LONG); /* write-low */
+ if ( type & MSR_TYPE_R )
+ clear_bit(msr, msr_bitmap + 0x000/BYTES_PER_LONG); /* read-low */
+ if ( type & MSR_TYPE_W )
+ clear_bit(msr, msr_bitmap + 0x800/BYTES_PER_LONG); /* write-low */
}
else if ( (msr >= 0xc0000000) && (msr <= 0xc0001fff) )
{
msr &= 0x1fff;
- if (type & MSR_TYPE_R)
- __clear_bit(msr, msr_bitmap + 0x400/BYTES_PER_LONG); /* read-high */
- if (type & MSR_TYPE_W)
- __clear_bit(msr, msr_bitmap + 0xc00/BYTES_PER_LONG); /* write-high */
+ if ( type & MSR_TYPE_R )
+ clear_bit(msr, msr_bitmap + 0x400/BYTES_PER_LONG); /* read-high */
+ if ( type & MSR_TYPE_W )
+ clear_bit(msr, msr_bitmap + 0xc00/BYTES_PER_LONG); /* write-high */
}
+ else
+ HVM_DBG_LOG(DBG_LEVEL_0,
+ "msr %x is out of the control range"
+ "0x00000000-0x00001fff and 0xc0000000-0xc0001fff"
+ "RDMSR or WRMSR will cause a VM exit", msr);
+}
+
+void vmx_enable_intercept_for_msr(struct vcpu *v, u32 msr, int type)
+{
+ unsigned long *msr_bitmap = v->arch.hvm_vmx.msr_bitmap;
+
+ /* VMX MSR bitmap supported? */
+ if ( msr_bitmap == NULL )
+ return;
+
+ /*
+ * See Intel PRM Vol. 3, 20.6.9 (MSR-Bitmap Address). Early manuals
+ * have the write-low and read-high bitmap offsets the wrong way round.
+ * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff.
+ */
+ if ( msr <= 0x1fff )
+ {
+ if ( type & MSR_TYPE_R )
+ set_bit(msr, msr_bitmap + 0x000/BYTES_PER_LONG); /* read-low */
+ if ( type & MSR_TYPE_W )
+ set_bit(msr, msr_bitmap + 0x800/BYTES_PER_LONG); /* write-low */
+ }
+ else if ( (msr >= 0xc0000000) && (msr <= 0xc0001fff) )
+ {
+ msr &= 0x1fff;
+ if ( type & MSR_TYPE_R )
+ set_bit(msr, msr_bitmap + 0x400/BYTES_PER_LONG); /* read-high */
+ if ( type & MSR_TYPE_W )
+ set_bit(msr, msr_bitmap + 0xc00/BYTES_PER_LONG); /* write-high */
+ }
+ else
+ HVM_DBG_LOG(DBG_LEVEL_0,
+ "msr %x is out of the control range"
+ "0x00000000-0x00001fff and 0xc0000000-0xc0001fff"
+ "RDMSR or WRMSR will cause a VM exit", msr);
}
/*
@@ -751,6 +792,10 @@ static int construct_vmcs(struct vcpu *v
vmentry_ctl &= ~VM_ENTRY_LOAD_GUEST_PAT;
}
+ /* Disable Virtualize x2APIC mode by default. */
+ v->arch.hvm_vmx.secondary_exec_control &=
+ ~SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
+
/* Do not enable Monitor Trap Flag unless start single step debug */
v->arch.hvm_vmx.exec_control &= ~CPU_BASED_MONITOR_TRAP_FLAG;
@@ -787,18 +832,6 @@ static int construct_vmcs(struct vcpu *v
vmx_disable_intercept_for_msr(v, MSR_IA32_SYSENTER_EIP, MSR_TYPE_R | MSR_TYPE_W);
if ( cpu_has_vmx_pat && paging_mode_hap(d) )
vmx_disable_intercept_for_msr(v, MSR_IA32_CR_PAT, MSR_TYPE_R | MSR_TYPE_W);
- if ( cpu_has_vmx_apic_reg_virt )
- {
- int msr;
- for (msr = MSR_IA32_APICBASE_MSR; msr <= MSR_IA32_APICBASE_MSR + 0xff; msr++)
- vmx_disable_intercept_for_msr(v, msr, MSR_TYPE_R);
- }
- if ( cpu_has_vmx_virtual_intr_delivery )
- {
- vmx_disable_intercept_for_msr(v, MSR_IA32_APICTPR_MSR, MSR_TYPE_W);
- vmx_disable_intercept_for_msr(v, MSR_IA32_APICEOI_MSR, MSR_TYPE_W);
- vmx_disable_intercept_for_msr(v, MSR_IA32_APICSELF_MSR, MSR_TYPE_W);
- }
}
/* I/O access bitmap. */
Index: xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmx.c
===================================================================
--- xen-4.2.2-testing.orig/xen/arch/x86/hvm/vmx/vmx.c
+++ xen-4.2.2-testing/xen/arch/x86/hvm/vmx/vmx.c
@@ -2012,18 +2012,63 @@ static void vmx_install_vlapic_mapping(s
void vmx_vlapic_msr_changed(struct vcpu *v)
{
+ int virtualize_x2apic_mode;
struct vlapic *vlapic = vcpu_vlapic(v);
- if ( !cpu_has_vmx_virtualize_apic_accesses )
+ virtualize_x2apic_mode = ( (cpu_has_vmx_apic_reg_virt ||
+ cpu_has_vmx_virtual_intr_delivery) &&
+ cpu_has_vmx_virtualize_x2apic_mode );
+
+ if ( !cpu_has_vmx_virtualize_apic_accesses &&
+ !virtualize_x2apic_mode )
return;
vmx_vmcs_enter(v);
v->arch.hvm_vmx.secondary_exec_control &=
- ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+ ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+ SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
if ( !vlapic_hw_disabled(vlapic) &&
(vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
- v->arch.hvm_vmx.secondary_exec_control |=
- SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+ {
+ unsigned int msr;
+
+ if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
+ {
+ v->arch.hvm_vmx.secondary_exec_control |=
+ SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
+ if ( cpu_has_vmx_apic_reg_virt )
+ {
+ for ( msr = MSR_IA32_APICBASE_MSR;
+ msr <= MSR_IA32_APICBASE_MSR + 0xff; msr++ )
+ vmx_disable_intercept_for_msr(v, msr, MSR_TYPE_R);
+
+ vmx_enable_intercept_for_msr(v, MSR_IA32_APICPPR_MSR,
+ MSR_TYPE_R);
+ vmx_enable_intercept_for_msr(v, MSR_IA32_APICTMICT_MSR,
+ MSR_TYPE_R);
+ vmx_enable_intercept_for_msr(v, MSR_IA32_APICTMCCT_MSR,
+ MSR_TYPE_R);
+ }
+ if ( cpu_has_vmx_virtual_intr_delivery )
+ {
+ vmx_disable_intercept_for_msr(v, MSR_IA32_APICTPR_MSR,
+ MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_APICEOI_MSR,
+ MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(v, MSR_IA32_APICSELF_MSR,
+ MSR_TYPE_W);
+ }
+ }
+ else
+ {
+ v->arch.hvm_vmx.secondary_exec_control |=
+ SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+ for ( msr = MSR_IA32_APICBASE_MSR;
+ msr <= MSR_IA32_APICBASE_MSR + 0xff; msr++ )
+ vmx_enable_intercept_for_msr(v, msr,
+ MSR_TYPE_R | MSR_TYPE_W);
+ }
+ }
vmx_update_secondary_exec_control(v);
vmx_vmcs_exit(v);
}
Index: xen-4.2.2-testing/xen/include/asm-x86/hvm/vmx/vmcs.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ xen-4.2.2-testing/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -182,6 +182,7 @@ extern u32 vmx_vmentry_control;
#define SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES 0x00000001
#define SECONDARY_EXEC_ENABLE_EPT 0x00000002
#define SECONDARY_EXEC_ENABLE_RDTSCP 0x00000008
+#define SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE 0x00000010
#define SECONDARY_EXEC_ENABLE_VPID 0x00000020
#define SECONDARY_EXEC_WBINVD_EXITING 0x00000040
#define SECONDARY_EXEC_UNRESTRICTED_GUEST 0x00000080
@@ -239,6 +240,8 @@ extern bool_t cpu_has_vmx_ins_outs_instr
(vmx_secondary_exec_control & SECONDARY_EXEC_APIC_REGISTER_VIRT)
#define cpu_has_vmx_virtual_intr_delivery \
(vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
+#define cpu_has_vmx_virtualize_x2apic_mode \
+ (vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)
/* GUEST_INTERRUPTIBILITY_INFO flags. */
#define VMX_INTR_SHADOW_STI 0x00000001
@@ -414,6 +417,7 @@ enum vmcs_field {
#define MSR_TYPE_R 1
#define MSR_TYPE_W 2
void vmx_disable_intercept_for_msr(struct vcpu *v, u32 msr, int type);
+void vmx_enable_intercept_for_msr(struct vcpu *v, u32 msr, int type);
int vmx_read_guest_msr(u32 msr, u64 *val);
int vmx_write_guest_msr(u32 msr, u64 val);
int vmx_add_guest_msr(u32 msr);
Index: xen-4.2.2-testing/xen/include/asm-x86/msr-index.h
===================================================================
--- xen-4.2.2-testing.orig/xen/include/asm-x86/msr-index.h
+++ xen-4.2.2-testing/xen/include/asm-x86/msr-index.h
@@ -295,7 +295,10 @@
#define MSR_IA32_APICBASE_BASE (0xfffff<<12)
#define MSR_IA32_APICBASE_MSR 0x800
#define MSR_IA32_APICTPR_MSR 0x808
+#define MSR_IA32_APICPPR_MSR 0x80a
#define MSR_IA32_APICEOI_MSR 0x80b
+#define MSR_IA32_APICTMICT_MSR 0x838
+#define MSR_IA32_APICTMCCT_MSR 0x839
#define MSR_IA32_APICSELF_MSR 0x83f
#define MSR_IA32_UCODE_WRITE 0x00000079
++++++ 26675-tools-xentoollog_update_tty_detection_in_stdiostream_progress.patch ++++++
changeset: 26675:3eb62c576a1a
user: Olaf Hering
date: Wed Feb 27 14:16:36 2013 +0000
files: tools/libxc/xtl_logger_stdio.c
description:
tools/xentoollog: update tty detection in stdiostream_progress
As suggested by IanJ:
Check isatty only once to preserve the errno of ->progress users, and to
reduce the noice in strace output.
Signed-off-by: Olaf Hering
Acked-by: Ian Jackson
diff -r 4b25c1e6cfbb -r 3eb62c576a1a tools/libxc/xtl_logger_stdio.c
--- a/tools/libxc/xtl_logger_stdio.c Wed Feb 27 11:16:47 2013 +0000
+++ b/tools/libxc/xtl_logger_stdio.c Wed Feb 27 14:16:36 2013 +0000
@@ -35,6 +35,7 @@ struct xentoollog_logger_stdiostream {
xentoollog_level min_level;
unsigned flags;
int progress_erase_len, progress_last_percent;
+ int tty;
};
static void progress_erase(xentoollog_logger_stdiostream *lg) {
@@ -118,7 +119,7 @@ static void stdiostream_progress(struct
lg->progress_last_percent = percent;
- if (isatty(fileno(lg->f)) <= 0) {
+ if (!lg->tty) {
stdiostream_message(logger_in, this_level, context,
"%s: %lu/%lu %3d%%",
doing_what, done, total, percent);
@@ -166,6 +167,7 @@ xentoollog_logger_stdiostream *xtl_creat
newlogger.f = f;
newlogger.min_level = min_level;
newlogger.flags = flags;
+ newlogger.tty = isatty(fileno(newlogger.f)) > 0;
if (newlogger.flags & XTL_STDIOSTREAM_SHOW_DATE) tzset();
++++++ 26754-hvm-Improve-APIC-INIT-SIPI-emulation.patch ++++++
References: FATE#313605
# Commit 7242e0dc2c6c083ded570de159007d112ee34e88
# Date 2013-03-28 11:44:11 +0000
# Author Keir Fraser
# Committer Keir Fraser
hvm: Improve APIC INIT/SIPI emulation, fixing it for call paths other than x86_emulate().
In particular, on broadcast/multicast INIT/SIPI, we handle all target
APICs at once in a single invocation of the init/sipi tasklet. This
avoids needing to return an X86EMUL_RETRY error code to the caller,
which was being ignored by all except x86_emulate().
The original bug, and the general approach in this fix, pointed out by
Intel (yang.z.zhang@intel.com).
Signed-off-by: Keir Fraser
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3481,8 +3481,6 @@ void hvm_vcpu_reset_state(struct vcpu *v
struct domain *d = v->domain;
struct segment_register reg;
- BUG_ON(vcpu_runnable(v));
-
domain_lock(d);
if ( v->is_initialised )
--- a/xen/arch/x86/hvm/viridian.c
+++ b/xen/arch/x86/hvm/viridian.c
@@ -241,8 +241,8 @@ int wrmsr_viridian_regs(uint32_t idx, ui
eax &= ~(1 << 12);
edx &= 0xff000000;
vlapic_set_reg(vlapic, APIC_ICR2, edx);
- if ( vlapic_ipi(vlapic, eax, edx) == X86EMUL_OKAY )
- vlapic_set_reg(vlapic, APIC_ICR, eax);
+ vlapic_ipi(vlapic, eax, edx);
+ vlapic_set_reg(vlapic, APIC_ICR, eax);
break;
}
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -243,18 +243,22 @@ bool_t vlapic_match_dest(
return 0;
}
-static void vlapic_init_sipi_action(unsigned long _vcpu)
+static void vlapic_init_sipi_one(struct vcpu *target, uint32_t icr)
{
- struct vcpu *origin = (struct vcpu *)_vcpu;
- struct vcpu *target = vcpu_vlapic(origin)->init_sipi.target;
- uint32_t icr = vcpu_vlapic(origin)->init_sipi.icr;
-
vcpu_pause(target);
switch ( icr & APIC_MODE_MASK )
{
case APIC_DM_INIT: {
bool_t fpu_initialised;
+ /* No work on INIT de-assert for P4-type APIC. */
+ if ( (icr & (APIC_INT_LEVELTRIG | APIC_INT_ASSERT)) ==
+ APIC_INT_LEVELTRIG )
+ break;
+ /* Nothing to do if the VCPU is already reset. */
+ if ( !target->is_initialised )
+ break;
+ hvm_vcpu_down(target);
domain_lock(target->domain);
/* Reset necessary VCPU state. This does not include FPU state. */
fpu_initialised = target->fpu_initialised;
@@ -276,36 +280,36 @@ static void vlapic_init_sipi_action(unsi
}
vcpu_unpause(target);
-
- vcpu_vlapic(origin)->init_sipi.target = NULL;
- vcpu_unpause(origin);
}
-static int vlapic_schedule_init_sipi_tasklet(struct vcpu *target, uint32_t icr)
+static void vlapic_init_sipi_action(unsigned long _vcpu)
{
- struct vcpu *origin = current;
+ struct vcpu *origin = (struct vcpu *)_vcpu;
+ uint32_t icr = vcpu_vlapic(origin)->init_sipi.icr;
+ uint32_t dest = vcpu_vlapic(origin)->init_sipi.dest;
+ uint32_t short_hand = icr & APIC_SHORT_MASK;
+ uint32_t dest_mode = !!(icr & APIC_DEST_MASK);
+ struct vcpu *v;
+
+ if ( icr == 0 )
+ return;
- if ( vcpu_vlapic(origin)->init_sipi.target != NULL )
+ for_each_vcpu ( origin->domain, v )
{
- WARN(); /* should be impossible but don't BUG, just in case */
- return X86EMUL_UNHANDLEABLE;
+ if ( vlapic_match_dest(vcpu_vlapic(v), vcpu_vlapic(origin),
+ short_hand, dest, dest_mode) )
+ vlapic_init_sipi_one(v, icr);
}
- vcpu_pause_nosync(origin);
-
- vcpu_vlapic(origin)->init_sipi.target = target;
- vcpu_vlapic(origin)->init_sipi.icr = icr;
- tasklet_schedule(&vcpu_vlapic(origin)->init_sipi.tasklet);
-
- return X86EMUL_RETRY;
+ vcpu_vlapic(origin)->init_sipi.icr = 0;
+ vcpu_unpause(origin);
}
/* Add a pending IRQ into lapic. */
-static int vlapic_accept_irq(struct vcpu *v, uint32_t icr_low)
+static void vlapic_accept_irq(struct vcpu *v, uint32_t icr_low)
{
struct vlapic *vlapic = vcpu_vlapic(v);
uint8_t vector = (uint8_t)icr_low;
- int rc = X86EMUL_OKAY;
switch ( icr_low & APIC_MODE_MASK )
{
@@ -339,31 +343,15 @@ static int vlapic_accept_irq(struct vcpu
break;
case APIC_DM_INIT:
- /* No work on INIT de-assert for P4-type APIC. */
- if ( (icr_low & (APIC_INT_LEVELTRIG | APIC_INT_ASSERT)) ==
- APIC_INT_LEVELTRIG )
- break;
- /* Nothing to do if the VCPU is already reset. */
- if ( !v->is_initialised )
- break;
- hvm_vcpu_down(v);
- rc = vlapic_schedule_init_sipi_tasklet(v, icr_low);
- break;
-
case APIC_DM_STARTUP:
- /* Nothing to do if the VCPU is already initialised. */
- if ( v->is_initialised )
- break;
- rc = vlapic_schedule_init_sipi_tasklet(v, icr_low);
- break;
+ /* Handled in vlapic_ipi(). */
+ BUG();
default:
gdprintk(XENLOG_ERR, "TODO: unsupported delivery mode in ICR %x\n",
icr_low);
domain_crash(v->domain);
}
-
- return rc;
}
struct vlapic *vlapic_lowest_prio(
@@ -421,15 +409,12 @@ void vlapic_handle_EOI_induced_exit(stru
hvm_dpci_msi_eoi(current->domain, vector);
}
-int vlapic_ipi(
+void vlapic_ipi(
struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high)
{
unsigned int dest;
unsigned int short_hand = icr_low & APIC_SHORT_MASK;
unsigned int dest_mode = !!(icr_low & APIC_DEST_MASK);
- struct vlapic *target;
- struct vcpu *v;
- int rc = X86EMUL_OKAY;
HVM_DBG_LOG(DBG_LEVEL_VLAPIC, "icr = 0x%08x:%08x", icr_high, icr_low);
@@ -437,25 +422,40 @@ int vlapic_ipi(
? icr_high
: GET_xAPIC_DEST_FIELD(icr_high));
- if ( (icr_low & APIC_MODE_MASK) == APIC_DM_LOWEST )
+ switch ( icr_low & APIC_MODE_MASK )
{
- target = vlapic_lowest_prio(vlapic_domain(vlapic), vlapic,
- short_hand, dest, dest_mode);
+ case APIC_DM_INIT:
+ case APIC_DM_STARTUP:
+ if ( vlapic->init_sipi.icr != 0 )
+ {
+ WARN(); /* should be impossible but don't BUG, just in case */
+ break;
+ }
+ vcpu_pause_nosync(vlapic_vcpu(vlapic));
+ vlapic->init_sipi.icr = icr_low;
+ vlapic->init_sipi.dest = dest;
+ tasklet_schedule(&vlapic->init_sipi.tasklet);
+ break;
+
+ case APIC_DM_LOWEST: {
+ struct vlapic *target = vlapic_lowest_prio(
+ vlapic_domain(vlapic), vlapic, short_hand, dest, dest_mode);
if ( target != NULL )
- rc = vlapic_accept_irq(vlapic_vcpu(target), icr_low);
- return rc;
+ vlapic_accept_irq(vlapic_vcpu(target), icr_low);
+ break;
}
- for_each_vcpu ( vlapic_domain(vlapic), v )
- {
- if ( vlapic_match_dest(vcpu_vlapic(v), vlapic,
- short_hand, dest, dest_mode) )
- rc = vlapic_accept_irq(v, icr_low);
- if ( rc != X86EMUL_OKAY )
- break;
+ default: {
+ struct vcpu *v;
+ for_each_vcpu ( vlapic_domain(vlapic), v )
+ {
+ if ( vlapic_match_dest(vcpu_vlapic(v), vlapic,
+ short_hand, dest, dest_mode) )
+ vlapic_accept_irq(v, icr_low);
+ }
+ break;
+ }
}
-
- return rc;
}
static uint32_t vlapic_get_tmcct(struct vlapic *vlapic)
@@ -687,9 +687,8 @@ static int vlapic_reg_write(struct vcpu
case APIC_ICR:
val &= ~(1 << 12); /* always clear the pending bit */
- rc = vlapic_ipi(vlapic, val, vlapic_get_reg(vlapic, APIC_ICR2));
- if ( rc == X86EMUL_OKAY )
- vlapic_set_reg(vlapic, APIC_ICR, val);
+ vlapic_ipi(vlapic, val, vlapic_get_reg(vlapic, APIC_ICR2));
+ vlapic_set_reg(vlapic, APIC_ICR, val);
break;
case APIC_ICR2:
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -62,8 +62,7 @@ struct vlapic {
struct page_info *regs_page;
/* INIT-SIPI-SIPI work gets deferred to a tasklet. */
struct {
- struct vcpu *target;
- uint32_t icr;
+ uint32_t icr, dest;
struct tasklet tasklet;
} init_sipi;
};
@@ -102,7 +101,7 @@ void vlapic_adjust_i8259_target(struct d
void vlapic_EOI_set(struct vlapic *vlapic);
void vlapic_handle_EOI_induced_exit(struct vlapic *vlapic, int vector);
-int vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high);
+void vlapic_ipi(struct vlapic *vlapic, uint32_t icr_low, uint32_t icr_high);
int vlapic_apicv_write(struct vcpu *v, unsigned int offset);
++++++ 26860-x86-allow-VCPUOP_register_vcpu_info-to-work-again-on-PVHVM-guests.patch ++++++
# Commit 63753b3e0dc56efb1acf94fa46f3fee7bc59281c
# Date 2013-04-17 11:35:38 +0200
# Author Konrad Rzeszutek Wilk
# Committer Jan Beulich
x86: allow VCPUOP_register_vcpu_info to work again on PVHVM guests
For details on the hypercall please see commit
c58ae69360ccf2495a19bf4ca107e21cf873c75b (VCPUOP_register_vcpu_info) and
the c/s 23143 (git commit 6b063a4a6f44245a727aa04ef76408b2e00af9c7)
(x86: move pv-only members of struct vcpu to struct pv_vcpu)
that introduced the regression.
The current code allows the PVHVM guest to make this hypercall.
But for PVHVM guest it always returns -EINVAL (-22) for Xen 4.2
and above. Xen 4.1 and earlier worked.
The reason is that the check in map_vcpu_info would fail
at:
if ( v->arch.vcpu_info_mfn != INVALID_MFN )
The reason is that the vcpu_info_mfn for PVHVM guests ends up by
defualt with the value of zero (introduced by c/s 23143).
The code in vcpu_initialise which initialized vcpu_info_mfn to a
valid value (INVALID_MFN), would never be called for PVHVM:
if ( is_hvm_domain(d) )
{
rc = hvm_vcpu_initialise(v);
goto done;
}
v->arch.pv_vcpu.vcpu_info_mfn = INVALID_MFN;
while previously it would be:
v->arch.vcpu_info_mfn = INVALID_MFN;
[right at the start of the function in Xen 4.1]
This fixes the problem with Linux advertising this error:
register_vcpu_info failed: err=-22
Signed-off-by: Konrad Rzeszutek Wilk
# Commit 9626d1c1fafe2da5af6e59478c9e9db6d03144df
# Date 2013-05-02 09:29:36 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: call unmap_vcpu_info() regardless of guest type
This fixes a regression from 63753b3e ("x86: allow
VCPUOP_register_vcpu_info to work again on PVHVM guests").
Signed-off-by: Jan Beulich
Tested-by: Sander Eikelenboom
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -431,13 +431,14 @@ int vcpu_initialise(struct vcpu *v)
vmce_init_vcpu(v);
+ v->arch.vcpu_info_mfn = INVALID_MFN;
+
if ( is_hvm_domain(d) )
{
rc = hvm_vcpu_initialise(v);
goto done;
}
- v->arch.pv_vcpu.vcpu_info_mfn = INVALID_MFN;
spin_lock_init(&v->arch.pv_vcpu.shadow_ldt_lock);
@@ -1076,14 +1077,14 @@ unmap_vcpu_info(struct vcpu *v)
{
unsigned long mfn;
- if ( v->arch.pv_vcpu.vcpu_info_mfn == INVALID_MFN )
+ if ( v->arch.vcpu_info_mfn == INVALID_MFN )
return;
- mfn = v->arch.pv_vcpu.vcpu_info_mfn;
+ mfn = v->arch.vcpu_info_mfn;
unmap_domain_page_global(v->vcpu_info);
v->vcpu_info = &dummy_vcpu_info;
- v->arch.pv_vcpu.vcpu_info_mfn = INVALID_MFN;
+ v->arch.vcpu_info_mfn = INVALID_MFN;
put_page_and_type(mfn_to_page(mfn));
}
@@ -1106,7 +1107,7 @@ map_vcpu_info(struct vcpu *v, unsigned l
if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) )
return -EINVAL;
- if ( v->arch.pv_vcpu.vcpu_info_mfn != INVALID_MFN )
+ if ( v->arch.vcpu_info_mfn != INVALID_MFN )
return -EINVAL;
/* Run this command on yourself or on other offline VCPUS. */
@@ -1143,7 +1144,7 @@ map_vcpu_info(struct vcpu *v, unsigned l
}
v->vcpu_info = new_info;
- v->arch.pv_vcpu.vcpu_info_mfn = page_to_mfn(page);
+ v->arch.vcpu_info_mfn = page_to_mfn(page);
/* Set new vcpu_info pointer /before/ setting pending flags. */
wmb();
@@ -2143,8 +2144,12 @@ int domain_relinquish_resources(struct d
/* Drop the in-use references to page-table bases. */
for_each_vcpu ( d, v )
+ {
vcpu_destroy_pagetables(v);
+ unmap_vcpu_info(v);
+ }
+
if ( !is_hvm_domain(d) )
{
for_each_vcpu ( d, v )
@@ -2155,8 +2160,6 @@ int domain_relinquish_resources(struct d
* mappings.
*/
destroy_gdt(v);
-
- unmap_vcpu_info(v);
}
if ( d->arch.pv_domain.pirq_eoi_map != NULL )
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -420,9 +420,6 @@ struct pv_vcpu
/* Current LDT details. */
unsigned long shadow_ldt_mapcnt;
spinlock_t shadow_ldt_lock;
-
- /* Guest-specified relocation of vcpu_info. */
- unsigned long vcpu_info_mfn;
};
struct arch_vcpu
@@ -494,6 +491,9 @@ struct arch_vcpu
struct paging_vcpu paging;
+ /* Guest-specified relocation of vcpu_info. */
+ unsigned long vcpu_info_mfn;
+
#ifdef CONFIG_X86_32
/* map_domain_page() mapping cache. */
struct mapcache_vcpu mapcache;
++++++ 26878-VMX-Detect-posted-interrupt-capability.patch ++++++
References: bnc#808269 FATE#313605
# Commit b266924990af96ee47ee299e1b6bb87fac2e2548
# Date 2013-04-18 11:32:02 +0200
# Author Yang Zhang
# Committer Jan Beulich
VMX: Detect posted interrupt capability
Check whether the Hardware supports posted interrupt capability.
Signed-off-by: Yang Zhang
Reviewed-by: Jun Nakajima
Acked-by: Keir Fraser
Acked-by: George Dunlap (from a release perspective)
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -91,6 +91,7 @@ static void __init vmx_display_features(
P(cpu_has_vmx_unrestricted_guest, "Unrestricted Guest");
P(cpu_has_vmx_apic_reg_virt, "APIC Register Virtualization");
P(cpu_has_vmx_virtual_intr_delivery, "Virtual Interrupt Delivery");
+ P(cpu_has_vmx_posted_intr_processing, "Posted Interrupt Processing");
#undef P
if ( !printed )
@@ -140,7 +141,8 @@ static int vmx_init_vmcs_config(void)
min = (PIN_BASED_EXT_INTR_MASK |
PIN_BASED_NMI_EXITING);
- opt = PIN_BASED_VIRTUAL_NMIS;
+ opt = (PIN_BASED_VIRTUAL_NMIS |
+ PIN_BASED_POSTED_INTERRUPT);
_vmx_pin_based_exec_control = adjust_vmx_controls(
"Pin-Based Exec Control", min, opt,
MSR_IA32_VMX_PINBASED_CTLS, &mismatch);
@@ -276,6 +278,14 @@ static int vmx_init_vmcs_config(void)
_vmx_vmexit_control = adjust_vmx_controls(
"VMExit Control", min, opt, MSR_IA32_VMX_EXIT_CTLS, &mismatch);
+ /*
+ * "Process posted interrupt" can be set only when "virtual-interrupt
+ * delivery" and "acknowledge interrupt on exit" is set
+ */
+ if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
+ || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
+ _vmx_pin_based_exec_control &= ~ PIN_BASED_POSTED_INTERRUPT;
+
min = 0;
opt = VM_ENTRY_LOAD_GUEST_PAT;
_vmx_vmentry_control = adjust_vmx_controls(
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -160,6 +160,7 @@ extern u32 vmx_cpu_based_exec_control;
#define PIN_BASED_NMI_EXITING 0x00000008
#define PIN_BASED_VIRTUAL_NMIS 0x00000020
#define PIN_BASED_PREEMPT_TIMER 0x00000040
+#define PIN_BASED_POSTED_INTERRUPT 0x00000080
extern u32 vmx_pin_based_exec_control;
#define VM_EXIT_SAVE_DEBUG_CNTRLS 0x00000004
@@ -242,6 +243,8 @@ extern bool_t cpu_has_vmx_ins_outs_instr
(vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
#define cpu_has_vmx_virtualize_x2apic_mode \
(vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)
+#define cpu_has_vmx_posted_intr_processing \
+ (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT)
/* GUEST_INTERRUPTIBILITY_INFO flags. */
#define VMX_INTR_SHADOW_STI 0x00000001
@@ -256,6 +259,7 @@ extern bool_t cpu_has_vmx_ins_outs_instr
/* VMCS field encodings. */
enum vmcs_field {
VIRTUAL_PROCESSOR_ID = 0x00000000,
+ POSTED_INTR_NOTIFICATION_VECTOR = 0x00000002,
GUEST_ES_SELECTOR = 0x00000800,
GUEST_CS_SELECTOR = 0x00000802,
GUEST_SS_SELECTOR = 0x00000804,
@@ -290,6 +294,8 @@ enum vmcs_field {
VIRTUAL_APIC_PAGE_ADDR_HIGH = 0x00002013,
APIC_ACCESS_ADDR = 0x00002014,
APIC_ACCESS_ADDR_HIGH = 0x00002015,
+ PI_DESC_ADDR = 0x00002016,
+ PI_DESC_ADDR_HIGH = 0x00002017,
EPT_POINTER = 0x0000201a,
EPT_POINTER_HIGH = 0x0000201b,
EOI_EXIT_BITMAP0 = 0x0000201c,
++++++ 26879-VMX-Turn-on-posted-interrupt-bit-in-vmcs.patch ++++++
References: bnc#808269 FATE#313605
# Commit 1c0ac49b1d6c3d54fc1f75661742a988ca7cf255
# Date 2013-04-18 11:34:04 +0200
# Author Yang Zhang
# Committer Jan Beulich
VMX: Turn on posted interrupt bit in vmcs
Turn on posted interrupt for vcpu if posted interrupt is avaliable.
Signed-off-by: Yang Zhang
Reviewed-by: Jun Nakajima
Acked-by: Keir Fraser
Acked-by: George Dunlap (from a release perspective)
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -864,6 +864,12 @@ static int construct_vmcs(struct vcpu *v
__vmwrite(GUEST_INTR_STATUS, 0);
}
+ if ( cpu_has_vmx_posted_intr_processing )
+ {
+ __vmwrite(PI_DESC_ADDR, virt_to_maddr(&v->arch.hvm_vmx.pi_desc));
+ __vmwrite(POSTED_INTR_NOTIFICATION_VECTOR, posted_intr_vector);
+ }
+
/* Host data selectors. */
__vmwrite(HOST_SS_SELECTOR, __HYPERVISOR_DS);
__vmwrite(HOST_DS_SELECTOR, __HYPERVISOR_DS);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -76,6 +76,8 @@ static int vmx_msr_write_intercept(unsig
static void vmx_invlpg_intercept(unsigned long vaddr);
static void __ept_sync_domain(void *info);
+uint8_t __read_mostly posted_intr_vector;
+
static int vmx_domain_initialise(struct domain *d)
{
int rc;
@@ -1625,6 +1627,9 @@ struct hvm_function_table * __init start
setup_ept_dump();
}
+ if ( cpu_has_vmx_posted_intr_processing )
+ alloc_direct_apic_vector(&posted_intr_vector, event_check_interrupt);
+
setup_vmcs_dump();
return &vmx_function_table;
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -21,6 +21,7 @@
#include
#include
+#include
extern void vmcs_dump_vcpu(struct vcpu *v);
extern void setup_vmcs_dump(void);
@@ -70,6 +71,12 @@ struct vmx_domain {
cpumask_var_t ept_synced;
};
+struct pi_desc {
+ DECLARE_BITMAP(pir, NR_VECTORS);
+ u32 control;
+ u32 rsvd[7];
+} __attribute__ ((aligned (64)));
+
#define ept_get_wl(d) \
((d)->arch.hvm_domain.vmx.ept_control.ept_wl)
#define ept_get_asr(d) \
@@ -112,6 +119,7 @@ struct arch_vmx_struct {
uint32_t eoi_exitmap_changed;
uint64_t eoi_exit_bitmap[4];
+ struct pi_desc pi_desc;
unsigned long host_cr0;
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -189,6 +189,7 @@ void vmx_update_cpu_exec_control(struct
#define MODRM_EAX_ECX ".byte 0xc1\n" /* EAX, ECX */
extern u64 vmx_ept_vpid_cap;
+extern uint8_t posted_intr_vector;
#define cpu_has_vmx_ept_wl4_supported \
(vmx_ept_vpid_cap & VMX_EPT_WALK_LENGTH_4_SUPPORTED)
++++++ 26880-VMX-Add-posted-interrupt-supporting.patch ++++++
References: bnc#808269 FATE#313605
# Commit d7dafa375bc13772e2e3274d975d544af4208939
# Date 2013-04-18 11:34:49 +0200
# Author Yang Zhang
# Committer Jan Beulich
VMX: Add posted interrupt supporting
Add the supporting of using posted interrupt to deliver interrupt.
Signed-off-by: Yang Zhang
Reviewed-by: Jun Nakajima
Acked-by: Keir Fraser
Acked-by: George Dunlap (from a release perspective)
# Commit e4e1e0ecc023caf4cf3b87a4aa4d5e210760f4e8
# Date 2013-04-19 12:31:18 +0200
# Author Jan Beulich
# Committer Jan Beulich
VMX: adjust correct table when there's no posted interrupt support
The caller of start_vmx() will overwrite hvm_funcs, so there's no point
in adjusting that table in start_vmx().
Signed-off-by: Jan Beulich
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -90,24 +90,6 @@ static unsigned int vlapic_lvt_mask[VLAP
((vlapic_get_reg(vlapic, APIC_LVTT) & APIC_TIMER_MODE_MASK) \
== APIC_TIMER_MODE_TSC_DEADLINE)
-
-/*
- * Generic APIC bitmap vector update & search routines.
- */
-
-#define VEC_POS(v) ((v)%32)
-#define REG_POS(v) (((v)/32) * 0x10)
-#define vlapic_test_and_set_vector(vec, bitmap) \
- test_and_set_bit(VEC_POS(vec), \
- (unsigned long *)((bitmap) + REG_POS(vec)))
-#define vlapic_test_and_clear_vector(vec, bitmap) \
- test_and_clear_bit(VEC_POS(vec), \
- (unsigned long *)((bitmap) + REG_POS(vec)))
-#define vlapic_set_vector(vec, bitmap) \
- set_bit(VEC_POS(vec), (unsigned long *)((bitmap) + REG_POS(vec)))
-#define vlapic_clear_vector(vec, bitmap) \
- clear_bit(VEC_POS(vec), (unsigned long *)((bitmap) + REG_POS(vec)))
-
static int vlapic_find_highest_vector(void *bitmap)
{
uint32_t *word = bitmap;
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -55,6 +55,7 @@
#include
#include
#include
+#include
enum handler_return { HNDL_done, HNDL_unhandled, HNDL_exception_raised };
@@ -1548,6 +1549,60 @@ static void vmx_process_isr(int isr, str
vmx_vmcs_exit(v);
}
+static void __vmx_deliver_posted_interrupt(struct vcpu *v)
+{
+ bool_t running = v->is_running;
+
+ vcpu_unblock(v);
+ if ( running && (in_irq() || (v != current)) )
+ {
+ unsigned int cpu = v->processor;
+
+ if ( !test_and_set_bit(VCPU_KICK_SOFTIRQ, &softirq_pending(cpu))
+ && (cpu != smp_processor_id()) )
+ send_IPI_mask(cpumask_of(cpu), posted_intr_vector);
+ }
+}
+
+static void vmx_deliver_posted_intr(struct vcpu *v, u8 vector)
+{
+ if ( pi_test_and_set_pir(vector, &v->arch.hvm_vmx.pi_desc) )
+ return;
+
+ if ( unlikely(v->arch.hvm_vmx.eoi_exitmap_changed) )
+ {
+ /*
+ * If EOI exitbitmap needs to changed or notification vector
+ * can't be allocated, interrupt will not be injected till
+ * VMEntry as it used to be.
+ */
+ pi_set_on(&v->arch.hvm_vmx.pi_desc);
+ }
+ else if ( !pi_test_and_set_on(&v->arch.hvm_vmx.pi_desc) )
+ {
+ __vmx_deliver_posted_interrupt(v);
+ return;
+ }
+
+ vcpu_kick(v);
+}
+
+static void vmx_sync_pir_to_irr(struct vcpu *v)
+{
+ struct vlapic *vlapic = vcpu_vlapic(v);
+ unsigned int group, i;
+ DECLARE_BITMAP(pending_intr, NR_VECTORS);
+
+ if ( !pi_test_and_clear_on(&v->arch.hvm_vmx.pi_desc) )
+ return;
+
+ for ( group = 0; group < ARRAY_SIZE(pending_intr); group++ )
+ pending_intr[group] = pi_get_pir(&v->arch.hvm_vmx.pi_desc, group);
+
+ for_each_set_bit(i, pending_intr, NR_VECTORS)
+ vlapic_set_vector(i, &vlapic->regs->data[APIC_IRR]);
+}
+
static struct hvm_function_table __read_mostly vmx_function_table = {
.name = "VMX",
.cpu_up_prepare = vmx_cpu_up_prepare,
@@ -1598,6 +1653,8 @@ static struct hvm_function_table __read_
.update_eoi_exit_bitmap = vmx_update_eoi_exit_bitmap,
.virtual_intr_delivery_enabled = vmx_virtual_intr_delivery_enabled,
.process_isr = vmx_process_isr,
+ .deliver_posted_intr = vmx_deliver_posted_intr,
+ .sync_pir_to_irr = vmx_sync_pir_to_irr,
};
struct hvm_function_table * __init start_vmx(void)
@@ -1629,6 +1686,11 @@ struct hvm_function_table * __init start
if ( cpu_has_vmx_posted_intr_processing )
alloc_direct_apic_vector(&posted_intr_vector, event_check_interrupt);
+ else
+ {
+ vmx_function_table.deliver_posted_intr = NULL;
+ vmx_function_table.sync_pir_to_irr = NULL;
+ }
setup_vmcs_dump();
--- a/xen/include/asm-x86/bitops.h
+++ b/xen/include/asm-x86/bitops.h
@@ -367,6 +367,16 @@ static inline unsigned int __scanbit(uns
((off)+(__scanbit(~(((*(const unsigned long *)addr)) >> (off)), size))) : \
__find_next_zero_bit(addr,size,off)))
+/**
+ * for_each_set_bit - iterate over every set bit in a memory region
+ * @bit: The integer iterator
+ * @addr: The address to base the search on
+ * @size: The maximum size to search
+ */
+#define for_each_set_bit(bit, addr, size) \
+ for ( (bit) = find_first_bit(addr, size); \
+ (bit) < (size); \
+ (bit) = find_next_bit(addr, size, (bit) + 1) )
/**
* find_first_set_bit - find the first set bit in @word
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -185,6 +185,8 @@ struct hvm_function_table {
void (*update_eoi_exit_bitmap)(struct vcpu *v, u8 vector, u8 trig);
int (*virtual_intr_delivery_enabled)(void);
void (*process_isr)(int isr, struct vcpu *v);
+ void (*deliver_posted_intr)(struct vcpu *v, u8 vector);
+ void (*sync_pir_to_irr)(struct vcpu *v);
};
extern struct hvm_function_table hvm_funcs;
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -54,6 +54,21 @@
#define vlapic_x2apic_mode(vlapic) \
((vlapic)->hw.apic_base_msr & MSR_IA32_APICBASE_EXTD)
+/*
+ * Generic APIC bitmap vector update & search routines.
+ */
+
+#define VEC_POS(v) ((v) % 32)
+#define REG_POS(v) (((v) / 32) * 0x10)
+#define vlapic_test_and_set_vector(vec, bitmap) \
+ test_and_set_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
+#define vlapic_test_and_clear_vector(vec, bitmap) \
+ test_and_clear_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
+#define vlapic_set_vector(vec, bitmap) \
+ set_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
+#define vlapic_clear_vector(vec, bitmap) \
+ clear_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))
+
struct vlapic {
struct hvm_hw_lapic hw;
struct hvm_hw_lapic_regs *regs;
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -71,6 +71,31 @@ void vmx_update_debug_state(struct vcpu
void vmx_update_exception_bitmap(struct vcpu *v);
void vmx_update_cpu_exec_control(struct vcpu *v);
+#define POSTED_INTR_ON 0
+static inline int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
+{
+ return test_and_set_bit(vector, pi_desc->pir);
+}
+
+static inline int pi_test_and_set_on(struct pi_desc *pi_desc)
+{
+ return test_and_set_bit(POSTED_INTR_ON, &pi_desc->control);
+}
+
+static inline void pi_set_on(struct pi_desc *pi_desc)
+{
+ set_bit(POSTED_INTR_ON, &pi_desc->control);
+}
+
+static inline int pi_test_and_clear_on(struct pi_desc *pi_desc)
+{
+ return test_and_clear_bit(POSTED_INTR_ON, &pi_desc->control);
+}
+
+static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
+{
+ return xchg(&pi_desc->pir[group], 0);
+}
/*
* Exit Reasons
++++++ 26881-x86-HVM-Call-vlapic_set_irq-to-delivery-virtual-interrupt.patch ++++++
References: bnc#808269 FATE#313605
# Commit 04015d6326f17e4aafb32593b94dac44b72ef4c1
# Date 2013-04-18 11:35:43 +0200
# Author Yang Zhang
# Committer Jan Beulich
x86/HVM: Call vlapic_set_irq() to delivery virtual interrupt
Move kick_vcpu into vlapic_set_irq. And call it to deliver virtual interrupt
instead set vIRR directly.
Signed-off-by: Yang Zhang
Acked-by: Keir Fraser
Acked-by: George Dunlap (from a release perspective)
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -263,8 +263,7 @@ static void ioapic_inj_irq(
ASSERT((delivery_mode == dest_Fixed) ||
(delivery_mode == dest_LowestPrio));
- if ( vlapic_set_irq(target, vector, trig_mode) )
- vcpu_kick(vlapic_vcpu(target));
+ vlapic_set_irq(target, vector, trig_mode);
}
static inline int pit_channel0_enabled(void)
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -122,16 +122,19 @@ static int vlapic_find_highest_irr(struc
return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
}
-int vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
+void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig)
{
+ struct vcpu *target = vlapic_vcpu(vlapic);
+
if ( trig )
vlapic_set_vector(vec, &vlapic->regs->data[APIC_TMR]);
if ( hvm_funcs.update_eoi_exit_bitmap )
- hvm_funcs.update_eoi_exit_bitmap(vlapic_vcpu(vlapic), vec ,trig);
+ hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
/* We may need to wake up target vcpu, besides set pending bit here */
- return !vlapic_test_and_set_irr(vec, vlapic);
+ if ( !vlapic_test_and_set_irr(vec, vlapic) )
+ vcpu_kick(target);
}
static int vlapic_find_highest_isr(struct vlapic *vlapic)
@@ -297,9 +300,8 @@ static void vlapic_accept_irq(struct vcp
{
case APIC_DM_FIXED:
case APIC_DM_LOWEST:
- if ( vlapic_enabled(vlapic) &&
- !vlapic_test_and_set_irr(vector, vlapic) )
- vcpu_kick(v);
+ if ( vlapic_enabled(vlapic) )
+ vlapic_set_irq(vlapic, vector, 0);
break;
case APIC_DM_REMRD:
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -57,8 +57,7 @@ static void vmsi_inj_irq(
{
case dest_Fixed:
case dest_LowestPrio:
- if ( vlapic_set_irq(target, vector, trig_mode) )
- vcpu_kick(vlapic_vcpu(target));
+ vlapic_set_irq(target, vector, trig_mode);
break;
default:
gdprintk(XENLOG_WARNING, "error delivery mode %d\n", delivery_mode);
--- a/xen/include/asm-x86/hvm/vlapic.h
+++ b/xen/include/asm-x86/hvm/vlapic.h
@@ -95,7 +95,7 @@ static inline void vlapic_set_reg(
bool_t is_vlapic_lvtpc_enabled(struct vlapic *vlapic);
-int vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig);
+void vlapic_set_irq(struct vlapic *vlapic, uint8_t vec, uint8_t trig);
int vlapic_has_pending_irq(struct vcpu *v);
int vlapic_ack_pending_irq(struct vcpu *v, int vector);
++++++ 26882-VMX-Use-posted-interrupt-to-deliver-virtual-interrupt.patch ++++++
References: bnc#808269 FATE#313605
# Commit 8d266f6b2c5c3d2d0da42f1e40ba1fb2ac8fdf1a
# Date 2013-04-18 11:36:28 +0200
# Author Yang Zhang
# Committer Jan Beulich
VMX: Use posted interrupt to deliver virutal interrupt
Deliver virtual interrupt through posted way if posted interrupt
is enabled.
Signed-off-by: Yang Zhang
Reviewed-by: Jun Nakajima
Acked-by: Keir Fraser
Acked-by: George Dunlap (from a release perspective)
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -119,6 +119,9 @@ static void vlapic_clear_irr(int vector,
static int vlapic_find_highest_irr(struct vlapic *vlapic)
{
+ if ( hvm_funcs.sync_pir_to_irr )
+ hvm_funcs.sync_pir_to_irr(vlapic_vcpu(vlapic));
+
return vlapic_find_highest_vector(&vlapic->regs->data[APIC_IRR]);
}
@@ -132,8 +135,9 @@ void vlapic_set_irq(struct vlapic *vlapi
if ( hvm_funcs.update_eoi_exit_bitmap )
hvm_funcs.update_eoi_exit_bitmap(target, vec, trig);
- /* We may need to wake up target vcpu, besides set pending bit here */
- if ( !vlapic_test_and_set_irr(vec, vlapic) )
+ if ( hvm_funcs.deliver_posted_intr )
+ hvm_funcs.deliver_posted_intr(target, vec);
+ else if ( !vlapic_test_and_set_irr(vec, vlapic) )
vcpu_kick(target);
}
++++++ 26891-x86-S3-Fix-cpu-pool-scheduling-after-suspend-resume.patch ++++++
# Commit 9aa356bc9f7533c3cb7f02c823f532532876d444
# Date 2013-04-19 12:29:01 +0200
# Author Ben Guthro
# Committer Jan Beulich
x86/S3: Fix cpu pool scheduling after suspend/resume
This review is another S3 scheduler problem with the system_state
variable introduced with the following changeset:
http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=269f543ea750ed567d18f2e8...
Specifically, the cpu_callback function that takes the CPU down during
suspend, and back up during resume. We were seeing situations where,
after S3, only CPU0 was in cpupool0. Guest performance suffered
greatly, since all vcpus were only on a single pcpu. Guests under high
CPU load showed the problem much more quickly than an idle guest.
Removing this if condition forces the CPUs to go through the expected
online/offline state, and be properly scheduled after S3.
This also includes a necessary partial change proposed earlier by
Tomasz Wroblewski here:
http://lists.xen.org/archives/html/xen-devel/2013-01/msg02206.html
It should also resolve the issues discussed in this thread:
http://lists.xen.org/archives/html/xen-devel/2012-11/msg01801.html
Signed-off-by: Ben Guthro
Acked-by: Juergen Gross
--- a/xen/common/cpupool.c
+++ b/xen/common/cpupool.c
@@ -41,16 +41,28 @@ static struct cpupool *alloc_cpupool_str
{
struct cpupool *c = xzalloc(struct cpupool);
- if ( c && zalloc_cpumask_var(&c->cpu_valid) )
- return c;
- xfree(c);
- return NULL;
+ if ( !c || !zalloc_cpumask_var(&c->cpu_valid) )
+ {
+ xfree(c);
+ c = NULL;
+ }
+ else if ( !zalloc_cpumask_var(&c->cpu_suspended) )
+ {
+ free_cpumask_var(c->cpu_valid);
+ xfree(c);
+ c = NULL;
+ }
+
+ return c;
}
static void free_cpupool_struct(struct cpupool *c)
{
if ( c )
+ {
+ free_cpumask_var(c->cpu_suspended);
free_cpumask_var(c->cpu_valid);
+ }
xfree(c);
}
@@ -417,14 +429,32 @@ void cpupool_rm_domain(struct domain *d)
/*
* called to add a new cpu to pool admin
- * we add a hotplugged cpu to the cpupool0 to be able to add it to dom0
+ * we add a hotplugged cpu to the cpupool0 to be able to add it to dom0,
+ * unless we are resuming from S3, in which case we put the cpu back
+ * in the cpupool it was in prior to suspend.
*/
static void cpupool_cpu_add(unsigned int cpu)
{
spin_lock(&cpupool_lock);
cpumask_clear_cpu(cpu, &cpupool_locked_cpus);
cpumask_set_cpu(cpu, &cpupool_free_cpus);
- cpupool_assign_cpu_locked(cpupool0, cpu);
+
+ if ( system_state == SYS_STATE_resume )
+ {
+ struct cpupool **c;
+
+ for_each_cpupool(c)
+ {
+ if ( cpumask_test_cpu(cpu, (*c)->cpu_suspended ) )
+ {
+ cpupool_assign_cpu_locked(*c, cpu);
+ cpumask_clear_cpu(cpu, (*c)->cpu_suspended);
+ }
+ }
+ }
+
+ if ( cpumask_test_cpu(cpu, &cpupool_free_cpus) )
+ cpupool_assign_cpu_locked(cpupool0, cpu);
spin_unlock(&cpupool_lock);
}
@@ -436,7 +466,7 @@ static void cpupool_cpu_add(unsigned int
static int cpupool_cpu_remove(unsigned int cpu)
{
int ret = 0;
-
+
spin_lock(&cpupool_lock);
if ( !cpumask_test_cpu(cpu, cpupool0->cpu_valid))
ret = -EBUSY;
@@ -633,9 +663,14 @@ static int cpu_callback(
unsigned int cpu = (unsigned long)hcpu;
int rc = 0;
- if ( (system_state == SYS_STATE_suspend) ||
- (system_state == SYS_STATE_resume) )
- goto out;
+ if ( system_state == SYS_STATE_suspend )
+ {
+ struct cpupool **c;
+
+ for_each_cpupool(c)
+ if ( cpumask_test_cpu(cpu, (*c)->cpu_valid ) )
+ cpumask_set_cpu(cpu, (*c)->cpu_suspended);
+ }
switch ( action )
{
@@ -650,7 +685,6 @@ static int cpu_callback(
break;
}
-out:
return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
}
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -199,6 +199,7 @@ struct cpupool
{
int cpupool_id;
cpumask_var_t cpu_valid; /* all cpus assigned to pool */
+ cpumask_var_t cpu_suspended; /* cpus in S3 that should be in this pool */
struct cpupool *next;
unsigned int n_dom;
struct scheduler *sched;
++++++ 26902-x86-EFI-pass-boot-services-variable-info-to-runtime-code.patch ++++++
References: FATE#314499, FATE#314509
# Commit 9be8a4447103d92843fcfeaad8be42408c90e9a9
# Date 2013-04-22 13:58:01 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86/EFI: pass boot services variable info to runtime code
EFI variables can be flagged as being accessible only within boot services.
This makes it awkward for us to figure out how much space they use at
runtime. In theory we could figure this out by simply comparing the results
from QueryVariableInfo() to the space used by all of our variables, but
that fails if the platform doesn't garbage collect on every boot. Thankfully,
calling QueryVariableInfo() while still inside boot services gives a more
reliable answer. This patch passes that information from the EFI boot stub
up to the efi platform code.
Based on a similarly named Linux patch by Matthew Garrett .
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
Acked-by: George Dunlap
# Commit 47f71a8ccb0c881cf3d9e0b917ef5f0dc084b062
# Date 2013-05-23 13:08:51 +0200
# Author Eric Shelton
# Committer Jan Beulich
x86/EFI: fix boot for pre-UEFI systems
efi/boot.c makes a call to QueryVariableInfo on all systems. However,
as it is only promised for UEFI 2.0+ systems, pre-UEFI systems can
hang or crash on this call. The below patch adds a version check, a
technique used in other parts of the Xen EFI codebase.
Signed-off-by: Eric Shelton
Check runtime services version instead of EFI version (while generally
they would be in sync, nothing requires them to be).
Signed-off-by: Jan Beulich
--- a/xen/arch/x86/efi/boot.c
+++ b/xen/arch/x86/efi/boot.c
@@ -1128,6 +1128,25 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SY
if (efi.smbios != EFI_INVALID_TABLE_ADDR)
dmi_efi_get_table((void *)(long)efi.smbios);
+ /* Get snapshot of variable store parameters. */
+ status = (efi_rs->Hdr.Revision >> 16) >= 2 ?
+ efi_rs->QueryVariableInfo(EFI_VARIABLE_NON_VOLATILE |
+ EFI_VARIABLE_BOOTSERVICE_ACCESS |
+ EFI_VARIABLE_RUNTIME_ACCESS,
+ &efi_boot_max_var_store_size,
+ &efi_boot_remain_var_store_size,
+ &efi_boot_max_var_size) :
+ EFI_INCOMPATIBLE_VERSION;
+ if ( EFI_ERROR(status) )
+ {
+ efi_boot_max_var_store_size = 0;
+ efi_boot_remain_var_store_size = 0;
+ efi_boot_max_var_size = status;
+ PrintStr(L"Warning: Could not query variable store: ");
+ DisplayUint(status, 0);
+ PrintStr(newline);
+ }
+
/* Allocate space for trampoline (in first Mb). */
cfg.addr = 0x100000;
cfg.size = trampoline_end - trampoline_start;
--- a/xen/arch/x86/efi/efi.h
+++ b/xen/arch/x86/efi/efi.h
@@ -22,5 +22,8 @@ extern void *efi_memmap;
extern l4_pgentry_t *efi_l4_pgtable;
+extern UINT64 efi_boot_max_var_store_size, efi_boot_remain_var_store_size,
+ efi_boot_max_var_size;
+
unsigned long efi_rs_enter(void);
void efi_rs_leave(unsigned long);
--- a/xen/arch/x86/efi/runtime.c
+++ b/xen/arch/x86/efi/runtime.c
@@ -28,6 +28,10 @@ UINTN __read_mostly efi_memmap_size;
UINTN __read_mostly efi_mdesc_size;
void *__read_mostly efi_memmap;
+UINT64 __read_mostly efi_boot_max_var_store_size;
+UINT64 __read_mostly efi_boot_remain_var_store_size;
+UINT64 __read_mostly efi_boot_max_var_size;
+
struct efi __read_mostly efi = {
.acpi = EFI_INVALID_TABLE_ADDR,
.acpi20 = EFI_INVALID_TABLE_ADDR,
@@ -446,6 +450,35 @@ int efi_runtime_call(struct xenpf_efi_ru
break;
case XEN_EFI_query_variable_info:
+ if ( op->misc & ~XEN_EFI_VARINFO_BOOT_SNAPSHOT )
+ return -EINVAL;
+
+ if ( op->misc & XEN_EFI_VARINFO_BOOT_SNAPSHOT )
+ {
+ if ( (op->u.query_variable_info.attr
+ & ~EFI_VARIABLE_APPEND_WRITE) !=
+ (EFI_VARIABLE_NON_VOLATILE |
+ EFI_VARIABLE_BOOTSERVICE_ACCESS |
+ EFI_VARIABLE_RUNTIME_ACCESS) )
+ return -EINVAL;
+
+ op->u.query_variable_info.max_store_size =
+ efi_boot_max_var_store_size;
+ op->u.query_variable_info.remain_store_size =
+ efi_boot_remain_var_store_size;
+ if ( efi_boot_max_var_store_size )
+ {
+ op->u.query_variable_info.max_size = efi_boot_max_var_size;
+ status = EFI_SUCCESS;
+ }
+ else
+ {
+ op->u.query_variable_info.max_size = 0;
+ status = efi_boot_max_var_size;
+ }
+ break;
+ }
+
cr3 = efi_rs_enter();
if ( (efi_rs->Hdr.Revision >> 16) < 2 )
{
@@ -462,6 +495,9 @@ int efi_runtime_call(struct xenpf_efi_ru
case XEN_EFI_query_capsule_capabilities:
case XEN_EFI_update_capsule:
+ if ( op->misc )
+ return -EINVAL;
+
cr3 = efi_rs_enter();
if ( (efi_rs->Hdr.Revision >> 16) < 2 )
{
--- a/xen/include/efi/efiapi.h
+++ b/xen/include/efi/efiapi.h
@@ -213,6 +213,10 @@ VOID
#define EFI_VARIABLE_NON_VOLATILE 0x00000001
#define EFI_VARIABLE_BOOTSERVICE_ACCESS 0x00000002
#define EFI_VARIABLE_RUNTIME_ACCESS 0x00000004
+#define EFI_VARIABLE_HARDWARE_ERROR_RECORD 0x00000008
+#define EFI_VARIABLE_AUTHENTICATED_WRITE_ACCESS 0x00000010
+#define EFI_VARIABLE_TIME_BASED_AUTHENTICATED_WRITE_ACCESS 0x00000020
+#define EFI_VARIABLE_APPEND_WRITE 0x00000040
// Variable size limitation
#define EFI_MAXIMUM_VARIABLE_SIZE 1024
--- a/xen/include/efi/efierr.h
+++ b/xen/include/efi/efierr.h
@@ -50,6 +50,7 @@ Revision History
#define EFI_ICMP_ERROR EFIERR(22)
#define EFI_TFTP_ERROR EFIERR(23)
#define EFI_PROTOCOL_ERROR EFIERR(24)
+#define EFI_INCOMPATIBLE_VERSION EFIERR(25)
#define EFI_WARN_UNKOWN_GLYPH EFIWARN(1)
#define EFI_WARN_DELETE_FAILURE EFIWARN(2)
--- a/xen/include/public/platform.h
+++ b/xen/include/public/platform.h
@@ -184,6 +184,7 @@ struct xenpf_efi_runtime_call {
struct xenpf_efi_guid vendor_guid;
} get_next_variable_name;
+#define XEN_EFI_VARINFO_BOOT_SNAPSHOT 0x00000001
struct {
uint32_t attr;
uint64_t max_store_size;
++++++ 26910-ns16550-delay-resume-until-dom0-ACPI-has-a-chance-to-run.patch ++++++
# Commit 6e96c186d23873597896051b043cfeb119c4a7d5
# Date 2013-04-24 11:41:53 +0200
# Author Ben Guthro
# Committer Jan Beulich
ns16550: delay resume until dom0 ACPI has a chance to run
Check for ioport access, before fully resuming operation, to avoid
spinning in __ns16550_poll when reading the LSR register returns 0xFF
on failing ioport access.
On some systems (like Lenovo T410, and some HP machines of similar vintage)
there is a SuperIO card that provides this legacy ioport on the LPC bus.
In this case, we need to wait for dom0's ACPI processing to run the proper
AML to re-initialize the chip, before we can use the card again.
This may cause a small amount of garbage to be written to the serial log
while we wait patiently for that AML to be executed.
This implementation limits the number of retries, to avoid a situation
where we keep trying over and over again, in the case of some other failure
on the ioport.
Signed-Off-By: Ben Guthro
Acked-by: Keir Fraser
--- a/xen/drivers/char/ns16550.c
+++ b/xen/drivers/char/ns16550.c
@@ -42,6 +42,7 @@ static struct ns16550 {
struct irqaction irqaction;
/* UART with no IRQ line: periodically-polled I/O. */
struct timer timer;
+ struct timer resume_timer;
unsigned int timeout_ms;
bool_t intr_works;
/* PCI card parameters. */
@@ -120,6 +121,10 @@ static struct ns16550 {
/* Frequency of external clock source. This definition assumes PC platform. */
#define UART_CLOCK_HZ 1843200
+/* Resume retry settings */
+#define RESUME_DELAY MILLISECS(10)
+#define RESUME_RETRIES 100
+
static char ns_read_reg(struct ns16550 *uart, int reg)
{
if ( uart->remapped_io_base == NULL )
@@ -330,7 +335,7 @@ static void ns16550_suspend(struct seria
uart->ps_bdf[2], PCI_COMMAND);
}
-static void ns16550_resume(struct serial_port *port)
+static void _ns16550_resume(struct serial_port *port)
{
struct ns16550 *uart = port->uart;
@@ -346,6 +351,54 @@ static void ns16550_resume(struct serial
ns16550_setup_postirq(port->uart);
}
+static int ns16550_ioport_invalid(struct ns16550 *uart)
+{
+ return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) &&
+ (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) &&
+ (((unsigned char)ns_read_reg(uart, IER)) == 0xff) &&
+ (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) &&
+ (((unsigned char)ns_read_reg(uart, LCR)) == 0xff));
+}
+
+static int delayed_resume_tries;
+static void ns16550_delayed_resume(void *data)
+{
+ struct serial_port *port = data;
+ struct ns16550 *uart = port->uart;
+
+ if ( ns16550_ioport_invalid(port->uart) && delayed_resume_tries-- )
+ set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
+ else
+ _ns16550_resume(port);
+}
+
+static void ns16550_resume(struct serial_port *port)
+{
+ struct ns16550 *uart = port->uart;
+
+ /*
+ * Check for ioport access, before fully resuming operation.
+ * On some systems, there is a SuperIO card that provides
+ * this legacy ioport on the LPC bus.
+ *
+ * We need to wait for dom0's ACPI processing to run the proper
+ * AML to re-initialize the chip, before we can use the card again.
+ *
+ * This may cause a small amount of garbage to be written
+ * to the serial log while we wait patiently for that AML to
+ * be executed. However, this is preferable to spinning in an
+ * infinite loop, as seen on a Lenovo T430, when serial was enabled.
+ */
+ if ( ns16550_ioport_invalid(uart) )
+ {
+ delayed_resume_tries = RESUME_RETRIES;
+ init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0);
+ set_timer(&uart->resume_timer, NOW() + RESUME_DELAY);
+ }
+ else
+ _ns16550_resume(port);
+}
+
#ifdef CONFIG_X86
static void __init ns16550_endboot(struct serial_port *port)
{
++++++ 26930-x86-EFI-fix-runtime-call-status-for-compat-mode-Dom0.patch ++++++
# Commit a7ac9597a7fc6ca934957eb78b41e26638281953
# Date 2013-04-29 11:27:54 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86/EFI: fix runtime call status for compat mode Dom0
The top two bits (indicating error/warning classification) need to
remain the top two bits.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/arch/x86/efi/runtime.c
+++ b/xen/arch/x86/efi/runtime.c
@@ -513,7 +513,7 @@ int efi_runtime_call(struct xenpf_efi_ru
#ifndef COMPAT
op->status = status;
#else
- op->status = (status & 0x3fffffff) | (status >> 62);
+ op->status = (status & 0x3fffffff) | ((status >> 32) & 0xc0000000);
#endif
return rc;
++++++ 26946-x86-make-vcpu_destroy_pagetables-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit 6cdc9be2a5f2a87b4504404fbf648d16d9503c19
# Date 2013-05-02 16:34:21 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make vcpu_destroy_pagetables() preemptible
... as it may take significant amounts of time.
The function, being moved to mm.c as the better home for it anyway, and
to avoid having to make a new helper function there non-static, is
given a "preemptible" parameter temporarily (until, in a subsequent
patch, its other caller is also being made capable of dealing with
preemption).
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -73,8 +73,6 @@ void (*dead_idle) (void) __read_mostly =
static void paravirt_ctxt_switch_from(struct vcpu *v);
static void paravirt_ctxt_switch_to(struct vcpu *v);
-static void vcpu_destroy_pagetables(struct vcpu *v);
-
static void default_idle(void)
{
local_irq_disable();
@@ -1059,7 +1057,7 @@ void arch_vcpu_reset(struct vcpu *v)
if ( !is_hvm_vcpu(v) )
{
destroy_gdt(v);
- vcpu_destroy_pagetables(v);
+ vcpu_destroy_pagetables(v, 0);
}
else
{
@@ -2070,63 +2068,6 @@ static int relinquish_memory(
return ret;
}
-static void vcpu_destroy_pagetables(struct vcpu *v)
-{
- struct domain *d = v->domain;
- unsigned long pfn;
-
-#ifdef __x86_64__
- if ( is_pv_32on64_vcpu(v) )
- {
- pfn = l4e_get_pfn(*(l4_pgentry_t *)
- __va(pagetable_get_paddr(v->arch.guest_table)));
-
- if ( pfn != 0 )
- {
- if ( paging_mode_refcounts(d) )
- put_page(mfn_to_page(pfn));
- else
- put_page_and_type(mfn_to_page(pfn));
- }
-
- l4e_write(
- (l4_pgentry_t *)__va(pagetable_get_paddr(v->arch.guest_table)),
- l4e_empty());
-
- v->arch.cr3 = 0;
- return;
- }
-#endif
-
- pfn = pagetable_get_pfn(v->arch.guest_table);
- if ( pfn != 0 )
- {
- if ( paging_mode_refcounts(d) )
- put_page(mfn_to_page(pfn));
- else
- put_page_and_type(mfn_to_page(pfn));
- v->arch.guest_table = pagetable_null();
- }
-
-#ifdef __x86_64__
- /* Drop ref to guest_table_user (from MMUEXT_NEW_USER_BASEPTR) */
- pfn = pagetable_get_pfn(v->arch.guest_table_user);
- if ( pfn != 0 )
- {
- if ( !is_pv_32bit_vcpu(v) )
- {
- if ( paging_mode_refcounts(d) )
- put_page(mfn_to_page(pfn));
- else
- put_page_and_type(mfn_to_page(pfn));
- }
- v->arch.guest_table_user = pagetable_null();
- }
-#endif
-
- v->arch.cr3 = 0;
-}
-
int domain_relinquish_resources(struct domain *d)
{
int ret;
@@ -2145,7 +2086,9 @@ int domain_relinquish_resources(struct d
/* Drop the in-use references to page-table bases. */
for_each_vcpu ( d, v )
{
- vcpu_destroy_pagetables(v);
+ ret = vcpu_destroy_pagetables(v, 1);
+ if ( ret )
+ return ret;
unmap_vcpu_info(v);
}
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2825,6 +2825,82 @@ static void put_superpage(unsigned long
#endif
+static int put_old_guest_table(struct vcpu *v)
+{
+ int rc;
+
+ if ( !v->arch.old_guest_table )
+ return 0;
+
+ switch ( rc = put_page_and_type_preemptible(v->arch.old_guest_table, 1) )
+ {
+ case -EINTR:
+ case -EAGAIN:
+ return -EAGAIN;
+ }
+
+ v->arch.old_guest_table = NULL;
+
+ return rc;
+}
+
+int vcpu_destroy_pagetables(struct vcpu *v, bool_t preemptible)
+{
+ unsigned long mfn = pagetable_get_pfn(v->arch.guest_table);
+ struct page_info *page;
+ int rc = put_old_guest_table(v);
+
+ if ( rc )
+ return rc;
+
+#ifdef __x86_64__
+ if ( is_pv_32on64_vcpu(v) )
+ mfn = l4e_get_pfn(*(l4_pgentry_t *)mfn_to_virt(mfn));
+#endif
+
+ if ( mfn )
+ {
+ page = mfn_to_page(mfn);
+ if ( paging_mode_refcounts(v->domain) )
+ put_page(page);
+ else
+ rc = put_page_and_type_preemptible(page, preemptible);
+ }
+
+#ifdef __x86_64__
+ if ( is_pv_32on64_vcpu(v) )
+ {
+ if ( !rc )
+ l4e_write(
+ (l4_pgentry_t *)__va(pagetable_get_paddr(v->arch.guest_table)),
+ l4e_empty());
+ }
+ else
+#endif
+ if ( !rc )
+ {
+ v->arch.guest_table = pagetable_null();
+
+#ifdef __x86_64__
+ /* Drop ref to guest_table_user (from MMUEXT_NEW_USER_BASEPTR) */
+ mfn = pagetable_get_pfn(v->arch.guest_table_user);
+ if ( mfn )
+ {
+ page = mfn_to_page(mfn);
+ if ( paging_mode_refcounts(v->domain) )
+ put_page(page);
+ else
+ rc = put_page_and_type_preemptible(page, preemptible);
+ }
+ if ( !rc )
+ v->arch.guest_table_user = pagetable_null();
+#endif
+ }
+
+ v->arch.cr3 = 0;
+
+ return rc;
+}
int new_guest_cr3(unsigned long mfn)
{
@@ -3011,12 +3087,21 @@ long do_mmuext_op(
unsigned int foreigndom)
{
struct mmuext_op op;
- int rc = 0, i = 0, okay;
unsigned long type;
- unsigned int done = 0;
+ unsigned int i = 0, done = 0;
struct vcpu *curr = current;
struct domain *d = curr->domain;
struct domain *pg_owner;
+ int okay, rc = put_old_guest_table(curr);
+
+ if ( unlikely(rc) )
+ {
+ if ( likely(rc == -EAGAIN) )
+ rc = hypercall_create_continuation(
+ __HYPERVISOR_mmuext_op, "hihi", uops, count, pdone,
+ foreigndom);
+ return rc;
+ }
if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
{
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -365,7 +365,7 @@ int compat_mmuext_op(XEN_GUEST_HANDLE(mm
: mcs->call.args[1];
unsigned int left = arg1 & ~MMU_UPDATE_PREEMPTED;
- BUG_ON(left == arg1);
+ BUG_ON(left == arg1 && left != i);
BUG_ON(left > count);
guest_handle_add_offset(nat_ops, i - left);
guest_handle_subtract_offset(cmp_uops, left);
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -461,6 +461,7 @@ struct arch_vcpu
pagetable_t guest_table_user; /* (MFN) x86/64 user-space pagetable */
#endif
pagetable_t guest_table; /* (MFN) guest notion of cr3 */
+ struct page_info *old_guest_table; /* partially destructed pagetable */
/* guest_table holds a ref to the page, and also a type-count unless
* shadow refcounts are in use */
pagetable_t shadow_table[4]; /* (MFN) shadow(s) of guest */
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -605,6 +605,7 @@ void audit_domains(void);
int new_guest_cr3(unsigned long pfn);
void make_cr3(struct vcpu *v, unsigned long mfn);
void update_cr3(struct vcpu *v);
+int vcpu_destroy_pagetables(struct vcpu *, bool_t preemptible);
void propagate_page_fault(unsigned long addr, u16 error_code);
void *do_page_walk(struct vcpu *v, unsigned long addr);
++++++ 26947-x86-make-new_guest_cr3-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit e2e6b7b627fec0d7a769ab46441f2985ebccbf04
# Date 2013-05-02 16:35:50 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make new_guest_cr3() preemptible
... as it may take significant amounts of time.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2906,44 +2906,69 @@ int new_guest_cr3(unsigned long mfn)
{
struct vcpu *curr = current;
struct domain *d = curr->domain;
- int okay;
+ int rc;
unsigned long old_base_mfn;
#ifdef __x86_64__
if ( is_pv_32on64_domain(d) )
{
- okay = paging_mode_refcounts(d)
- ? 0 /* Old code was broken, but what should it be? */
- : mod_l4_entry(
+ rc = paging_mode_refcounts(d)
+ ? -EINVAL /* Old code was broken, but what should it be? */
+ : mod_l4_entry(
__va(pagetable_get_paddr(curr->arch.guest_table)),
l4e_from_pfn(
mfn,
(_PAGE_PRESENT|_PAGE_RW|_PAGE_USER|_PAGE_ACCESSED)),
- pagetable_get_pfn(curr->arch.guest_table), 0, 0, curr) == 0;
- if ( unlikely(!okay) )
+ pagetable_get_pfn(curr->arch.guest_table), 0, 1, curr);
+ switch ( rc )
{
+ case 0:
+ break;
+ case -EINTR:
+ case -EAGAIN:
+ return -EAGAIN;
+ default:
MEM_LOG("Error while installing new compat baseptr %lx", mfn);
- return 0;
+ return rc;
}
invalidate_shadow_ldt(curr, 0);
write_ptbase(curr);
- return 1;
+ return 0;
}
#endif
- okay = paging_mode_refcounts(d)
- ? get_page_from_pagenr(mfn, d)
- : !get_page_and_type_from_pagenr(mfn, PGT_root_page_table, d, 0, 0);
- if ( unlikely(!okay) )
+ rc = put_old_guest_table(curr);
+ if ( unlikely(rc) )
+ return rc;
+
+ old_base_mfn = pagetable_get_pfn(curr->arch.guest_table);
+ /*
+ * This is particularly important when getting restarted after the
+ * previous attempt got preempted in the put-old-MFN phase.
+ */
+ if ( old_base_mfn == mfn )
{
- MEM_LOG("Error while installing new baseptr %lx", mfn);
+ write_ptbase(curr);
return 0;
}
- invalidate_shadow_ldt(curr, 0);
+ rc = paging_mode_refcounts(d)
+ ? (get_page_from_pagenr(mfn, d) ? 0 : -EINVAL)
+ : get_page_and_type_from_pagenr(mfn, PGT_root_page_table, d, 0, 1);
+ switch ( rc )
+ {
+ case 0:
+ break;
+ case -EINTR:
+ case -EAGAIN:
+ return -EAGAIN;
+ default:
+ MEM_LOG("Error while installing new baseptr %lx", mfn);
+ return rc;
+ }
- old_base_mfn = pagetable_get_pfn(curr->arch.guest_table);
+ invalidate_shadow_ldt(curr, 0);
curr->arch.guest_table = pagetable_from_pfn(mfn);
update_cr3(curr);
@@ -2952,13 +2977,25 @@ int new_guest_cr3(unsigned long mfn)
if ( likely(old_base_mfn != 0) )
{
+ struct page_info *page = mfn_to_page(old_base_mfn);
+
if ( paging_mode_refcounts(d) )
- put_page(mfn_to_page(old_base_mfn));
+ put_page(page);
else
- put_page_and_type(mfn_to_page(old_base_mfn));
+ switch ( rc = put_page_and_type_preemptible(page, 1) )
+ {
+ case -EINTR:
+ rc = -EAGAIN;
+ case -EAGAIN:
+ curr->arch.old_guest_table = page;
+ break;
+ default:
+ BUG_ON(rc);
+ break;
+ }
}
- return 1;
+ return rc;
}
static struct domain *get_pg_owner(domid_t domid)
@@ -3256,8 +3293,13 @@ long do_mmuext_op(
}
case MMUEXT_NEW_BASEPTR:
- okay = (!paging_mode_translate(d)
- && new_guest_cr3(op.arg1.mfn));
+ if ( paging_mode_translate(d) )
+ okay = 0;
+ else
+ {
+ rc = new_guest_cr3(op.arg1.mfn);
+ okay = !rc;
+ }
break;
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2407,12 +2407,23 @@ static int emulate_privileged_op(struct
#endif
}
page = get_page_from_gfn(v->domain, gfn, NULL, P2M_ALLOC);
- rc = page ? new_guest_cr3(page_to_mfn(page)) : 0;
if ( page )
+ {
+ rc = new_guest_cr3(page_to_mfn(page));
put_page(page);
+ }
+ else
+ rc = -EINVAL;
domain_unlock(v->domain);
- if ( rc == 0 ) /* not okay */
+ switch ( rc )
+ {
+ case 0:
+ break;
+ case -EAGAIN: /* retry after preemption */
+ goto skip;
+ default: /* not okay */
goto fail;
+ }
break;
}
++++++ 26948-x86-make-MMUEXT_NEW_USER_BASEPTR-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit 918a5f17b447072b40780f4d03a3adc99ff0073b
# Date 2013-05-02 16:36:44 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make MMUEXT_NEW_USER_BASEPTR preemptible
... as it may take significant amounts of time.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3313,29 +3313,56 @@ long do_mmuext_op(
break;
}
+ old_mfn = pagetable_get_pfn(curr->arch.guest_table_user);
+ /*
+ * This is particularly important when getting restarted after the
+ * previous attempt got preempted in the put-old-MFN phase.
+ */
+ if ( old_mfn == op.arg1.mfn )
+ break;
+
if ( op.arg1.mfn != 0 )
{
if ( paging_mode_refcounts(d) )
okay = get_page_from_pagenr(op.arg1.mfn, d);
else
- okay = !get_page_and_type_from_pagenr(
- op.arg1.mfn, PGT_root_page_table, d, 0, 0);
+ {
+ rc = get_page_and_type_from_pagenr(
+ op.arg1.mfn, PGT_root_page_table, d, 0, 1);
+ okay = !rc;
+ }
if ( unlikely(!okay) )
{
- MEM_LOG("Error while installing new mfn %lx", op.arg1.mfn);
+ if ( rc == -EINTR )
+ rc = -EAGAIN;
+ else if ( rc != -EAGAIN )
+ MEM_LOG("Error while installing new mfn %lx",
+ op.arg1.mfn);
break;
}
}
- old_mfn = pagetable_get_pfn(curr->arch.guest_table_user);
curr->arch.guest_table_user = pagetable_from_pfn(op.arg1.mfn);
if ( old_mfn != 0 )
{
+ struct page_info *page = mfn_to_page(old_mfn);
+
if ( paging_mode_refcounts(d) )
- put_page(mfn_to_page(old_mfn));
+ put_page(page);
else
- put_page_and_type(mfn_to_page(old_mfn));
+ switch ( rc = put_page_and_type_preemptible(page, 1) )
+ {
+ case -EINTR:
+ rc = -EAGAIN;
+ case -EAGAIN:
+ curr->arch.old_guest_table = page;
+ okay = 0;
+ break;
+ default:
+ BUG_ON(rc);
+ break;
+ }
}
break;
++++++ 26949-x86-make-vcpu_reset-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit 4939f9a6dee4280f38730fd3066e5dce353112f6
# Date 2013-05-02 16:37:24 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make vcpu_reset() preemptible
... as dropping the old page tables may take significant amounts of
time.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1052,17 +1052,16 @@ int arch_set_info_guest(
#undef c
}
-void arch_vcpu_reset(struct vcpu *v)
+int arch_vcpu_reset(struct vcpu *v)
{
if ( !is_hvm_vcpu(v) )
{
destroy_gdt(v);
- vcpu_destroy_pagetables(v, 0);
- }
- else
- {
- vcpu_end_shutdown_deferral(v);
+ return vcpu_destroy_pagetables(v);
}
+
+ vcpu_end_shutdown_deferral(v);
+ return 0;
}
/*
@@ -2086,7 +2085,7 @@ int domain_relinquish_resources(struct d
/* Drop the in-use references to page-table bases. */
for_each_vcpu ( d, v )
{
- ret = vcpu_destroy_pagetables(v, 1);
+ ret = vcpu_destroy_pagetables(v);
if ( ret )
return ret;
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3575,8 +3575,11 @@ static void hvm_s3_suspend(struct domain
for_each_vcpu ( d, v )
{
+ int rc;
+
vlapic_reset(vcpu_vlapic(v));
- vcpu_reset(v);
+ rc = vcpu_reset(v);
+ ASSERT(!rc);
}
vpic_reset(d);
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -240,6 +240,8 @@ static void vlapic_init_sipi_one(struct
{
case APIC_DM_INIT: {
bool_t fpu_initialised;
+ int rc;
+
/* No work on INIT de-assert for P4-type APIC. */
if ( (icr & (APIC_INT_LEVELTRIG | APIC_INT_ASSERT)) ==
APIC_INT_LEVELTRIG )
@@ -251,7 +253,8 @@ static void vlapic_init_sipi_one(struct
domain_lock(target->domain);
/* Reset necessary VCPU state. This does not include FPU state. */
fpu_initialised = target->fpu_initialised;
- vcpu_reset(target);
+ rc = vcpu_reset(target);
+ ASSERT(!rc);
target->fpu_initialised = fpu_initialised;
vlapic_reset(vcpu_vlapic(target));
domain_unlock(target->domain);
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2844,7 +2844,7 @@ static int put_old_guest_table(struct vc
return rc;
}
-int vcpu_destroy_pagetables(struct vcpu *v, bool_t preemptible)
+int vcpu_destroy_pagetables(struct vcpu *v)
{
unsigned long mfn = pagetable_get_pfn(v->arch.guest_table);
struct page_info *page;
@@ -2864,7 +2864,7 @@ int vcpu_destroy_pagetables(struct vcpu
if ( paging_mode_refcounts(v->domain) )
put_page(page);
else
- rc = put_page_and_type_preemptible(page, preemptible);
+ rc = put_page_and_type_preemptible(page, 1);
}
#ifdef __x86_64__
@@ -2890,7 +2890,7 @@ int vcpu_destroy_pagetables(struct vcpu
if ( paging_mode_refcounts(v->domain) )
put_page(page);
else
- rc = put_page_and_type_preemptible(page, preemptible);
+ rc = put_page_and_type_preemptible(page, 1);
}
if ( !rc )
v->arch.guest_table_user = pagetable_null();
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -779,14 +779,18 @@ void domain_unpause_by_systemcontroller(
domain_unpause(d);
}
-void vcpu_reset(struct vcpu *v)
+int vcpu_reset(struct vcpu *v)
{
struct domain *d = v->domain;
+ int rc;
vcpu_pause(v);
domain_lock(d);
- arch_vcpu_reset(v);
+ set_bit(_VPF_in_reset, &v->pause_flags);
+ rc = arch_vcpu_reset(v);
+ if ( rc )
+ goto out_unlock;
set_bit(_VPF_down, &v->pause_flags);
@@ -802,9 +806,13 @@ void vcpu_reset(struct vcpu *v)
#endif
cpumask_clear(v->cpu_affinity_tmp);
clear_bit(_VPF_blocked, &v->pause_flags);
+ clear_bit(_VPF_in_reset, &v->pause_flags);
+ out_unlock:
domain_unlock(v->domain);
vcpu_unpause(v);
+
+ return rc;
}
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -307,8 +307,10 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domc
if ( guest_handle_is_null(op->u.vcpucontext.ctxt) )
{
- vcpu_reset(v);
- ret = 0;
+ ret = vcpu_reset(v);
+ if ( ret == -EAGAIN )
+ ret = hypercall_create_continuation(
+ __HYPERVISOR_domctl, "h", u_domctl);
goto svc_out;
}
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -605,7 +605,7 @@ void audit_domains(void);
int new_guest_cr3(unsigned long pfn);
void make_cr3(struct vcpu *v, unsigned long mfn);
void update_cr3(struct vcpu *v);
-int vcpu_destroy_pagetables(struct vcpu *, bool_t preemptible);
+int vcpu_destroy_pagetables(struct vcpu *);
void propagate_page_fault(unsigned long addr, u16 error_code);
void *do_page_walk(struct vcpu *v, unsigned long addr);
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -13,7 +13,7 @@ typedef union {
struct vcpu *alloc_vcpu(
struct domain *d, unsigned int vcpu_id, unsigned int cpu_id);
struct vcpu *alloc_dom0_vcpu0(void);
-void vcpu_reset(struct vcpu *v);
+int vcpu_reset(struct vcpu *);
struct xen_domctl_getdomaininfo;
void getdomaininfo(struct domain *d, struct xen_domctl_getdomaininfo *info);
@@ -67,7 +67,7 @@ void arch_dump_vcpu_info(struct vcpu *v)
void arch_dump_domain_info(struct domain *d);
-void arch_vcpu_reset(struct vcpu *v);
+int arch_vcpu_reset(struct vcpu *);
extern spinlock_t vcpu_alloc_lock;
bool_t domctl_lock_acquire(void);
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -644,6 +644,9 @@ static inline struct domain *next_domain
/* VCPU is blocked due to missing mem_sharing ring. */
#define _VPF_mem_sharing 6
#define VPF_mem_sharing (1UL<<_VPF_mem_sharing)
+ /* VCPU is being reset. */
+#define _VPF_in_reset 7
+#define VPF_in_reset (1UL<<_VPF_in_reset)
static inline int vcpu_runnable(struct vcpu *v)
{
++++++ 26950-x86-make-arch_set_info_guest-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit 99d2b149915010e986f4d8778708c5891e7b4635
# Date 2013-05-02 16:38:30 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make arch_set_info_guest() preemptible
.. as the root page table validation (and the dropping of an eventual
old one) can require meaningful amounts of time.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -859,6 +859,9 @@ int arch_set_info_guest(
if ( !v->is_initialised )
{
+ if ( !compat && !(flags & VGCF_in_kernel) && !c.nat->ctrlreg[1] )
+ return -EINVAL;
+
v->arch.pv_vcpu.ldt_base = c(ldt_base);
v->arch.pv_vcpu.ldt_ents = c(ldt_ents);
}
@@ -956,24 +959,44 @@ int arch_set_info_guest(
if ( rc != 0 )
return rc;
+ set_bit(_VPF_in_reset, &v->pause_flags);
+
if ( !compat )
- {
cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[3]);
- cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
-
- if ( !cr3_page )
- {
- destroy_gdt(v);
- return -EINVAL;
- }
- if ( !paging_mode_refcounts(d)
- && !get_page_type(cr3_page, PGT_base_page_table) )
- {
- put_page(cr3_page);
- destroy_gdt(v);
- return -EINVAL;
- }
+#ifdef CONFIG_COMPAT
+ else
+ cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
+#endif
+ cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
+ if ( !cr3_page )
+ rc = -EINVAL;
+ else if ( paging_mode_refcounts(d) )
+ /* nothing */;
+ else if ( cr3_page == v->arch.old_guest_table )
+ {
+ v->arch.old_guest_table = NULL;
+ put_page(cr3_page);
+ }
+ else
+ {
+ /*
+ * Since v->arch.guest_table{,_user} are both NULL, this effectively
+ * is just a call to put_old_guest_table().
+ */
+ if ( !compat )
+ rc = vcpu_destroy_pagetables(v);
+ if ( !rc )
+ rc = get_page_type_preemptible(cr3_page,
+ !compat ? PGT_root_page_table
+ : PGT_l3_page_table);
+ if ( rc == -EINTR )
+ rc = -EAGAIN;
+ }
+ if ( rc )
+ /* handled below */;
+ else if ( !compat )
+ {
v->arch.guest_table = pagetable_from_page(cr3_page);
#ifdef __x86_64__
if ( c.nat->ctrlreg[1] )
@@ -981,56 +1004,44 @@ int arch_set_info_guest(
cr3_gfn = xen_cr3_to_pfn(c.nat->ctrlreg[1]);
cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
- if ( !cr3_page ||
- (!paging_mode_refcounts(d)
- && !get_page_type(cr3_page, PGT_base_page_table)) )
+ if ( !cr3_page )
+ rc = -EINVAL;
+ else if ( !paging_mode_refcounts(d) )
{
- if (cr3_page)
- put_page(cr3_page);
- cr3_page = pagetable_get_page(v->arch.guest_table);
- v->arch.guest_table = pagetable_null();
- if ( paging_mode_refcounts(d) )
- put_page(cr3_page);
- else
- put_page_and_type(cr3_page);
- destroy_gdt(v);
- return -EINVAL;
+ rc = get_page_type_preemptible(cr3_page, PGT_root_page_table);
+ switch ( rc )
+ {
+ case -EINTR:
+ rc = -EAGAIN;
+ case -EAGAIN:
+ v->arch.old_guest_table =
+ pagetable_get_page(v->arch.guest_table);
+ v->arch.guest_table = pagetable_null();
+ break;
+ }
}
-
- v->arch.guest_table_user = pagetable_from_page(cr3_page);
- }
- else if ( !(flags & VGCF_in_kernel) )
- {
- destroy_gdt(v);
- return -EINVAL;
+ if ( !rc )
+ v->arch.guest_table_user = pagetable_from_page(cr3_page);
}
}
else
{
l4_pgentry_t *l4tab;
- cr3_gfn = compat_cr3_to_pfn(c.cmp->ctrlreg[3]);
- cr3_page = get_page_from_gfn(d, cr3_gfn, NULL, P2M_ALLOC);
-
- if ( !cr3_page)
- {
- destroy_gdt(v);
- return -EINVAL;
- }
-
- if (!paging_mode_refcounts(d)
- && !get_page_type(cr3_page, PGT_l3_page_table) )
- {
- put_page(cr3_page);
- destroy_gdt(v);
- return -EINVAL;
- }
-
l4tab = __va(pagetable_get_paddr(v->arch.guest_table));
*l4tab = l4e_from_pfn(page_to_mfn(cr3_page),
_PAGE_PRESENT|_PAGE_RW|_PAGE_USER|_PAGE_ACCESSED);
#endif
}
+ if ( rc )
+ {
+ if ( cr3_page )
+ put_page(cr3_page);
+ destroy_gdt(v);
+ return rc;
+ }
+
+ clear_bit(_VPF_in_reset, &v->pause_flags);
if ( v->vcpu_id == 0 )
update_domain_wallclock_time(d);
--- a/xen/common/compat/domain.c
+++ b/xen/common/compat/domain.c
@@ -50,6 +50,10 @@ int compat_vcpu_op(int cmd, int vcpuid,
rc = v->is_initialised ? -EEXIST : arch_set_info_guest(v, cmp_ctxt);
domain_unlock(d);
+ if ( rc == -EAGAIN )
+ rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iih",
+ cmd, vcpuid, arg);
+
xfree(cmp_ctxt);
break;
}
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -849,6 +849,11 @@ long do_vcpu_op(int cmd, int vcpuid, XEN
domain_unlock(d);
free_vcpu_guest_context(ctxt);
+
+ if ( rc == -EAGAIN )
+ rc = hypercall_create_continuation(__HYPERVISOR_vcpu_op, "iih",
+ cmd, vcpuid, arg);
+
break;
case VCPUOP_up: {
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -339,6 +339,10 @@ long do_domctl(XEN_GUEST_HANDLE(xen_domc
domain_pause(d);
ret = arch_set_info_guest(v, c);
domain_unpause(d);
+
+ if ( ret == -EAGAIN )
+ ret = hypercall_create_continuation(
+ __HYPERVISOR_domctl, "h", u_domctl);
}
svc_out:
++++++ 26951-x86-make-page-table-unpinning-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit a3e049f8e86fe18e3b87f18dc0c7be665026fd9f
# Date 2013-05-02 16:39:06 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make page table unpinning preemptible
... as it may take significant amounts of time.
Since we can't re-invoke the operation in a second attempt, the
continuation logic must be slightly tweaked so that we make sure
do_mmuext_op() gets run one more time even when the preempted unpin
operation was the last one in a batch.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3140,6 +3140,14 @@ long do_mmuext_op(
return rc;
}
+ if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
+ likely(guest_handle_is_null(uops)) )
+ {
+ /* See the curr->arch.old_guest_table related
+ * hypercall_create_continuation() below. */
+ return (int)foreigndom;
+ }
+
if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
{
count &= ~MMU_UPDATE_PREEMPTED;
@@ -3163,7 +3171,7 @@ long do_mmuext_op(
for ( i = 0; i < count; i++ )
{
- if ( hypercall_preempt_check() )
+ if ( curr->arch.old_guest_table || hypercall_preempt_check() )
{
rc = -EAGAIN;
break;
@@ -3283,7 +3291,17 @@ long do_mmuext_op(
break;
}
- put_page_and_type(page);
+ switch ( rc = put_page_and_type_preemptible(page, 1) )
+ {
+ case -EINTR:
+ case -EAGAIN:
+ curr->arch.old_guest_table = page;
+ rc = 0;
+ break;
+ default:
+ BUG_ON(rc);
+ break;
+ }
put_page(page);
/* A page is dirtied when its pin status is cleared. */
@@ -3604,9 +3622,27 @@ long do_mmuext_op(
}
if ( rc == -EAGAIN )
+ {
+ ASSERT(i < count);
rc = hypercall_create_continuation(
__HYPERVISOR_mmuext_op, "hihi",
uops, (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
+ }
+ else if ( curr->arch.old_guest_table )
+ {
+ XEN_GUEST_HANDLE(void) null;
+
+ ASSERT(rc || i == count);
+ set_xen_guest_handle(null, NULL);
+ /*
+ * In order to have a way to communicate the final return value to
+ * our continuation, we pass this in place of "foreigndom", building
+ * on the fact that this argument isn't needed anymore.
+ */
+ rc = hypercall_create_continuation(
+ __HYPERVISOR_mmuext_op, "hihi", null,
+ MMU_UPDATE_PREEMPTED, null, rc);
+ }
put_pg_owner(pg_owner);
--- a/xen/arch/x86/x86_64/compat/mm.c
+++ b/xen/arch/x86/x86_64/compat/mm.c
@@ -268,6 +268,13 @@ int compat_mmuext_op(XEN_GUEST_HANDLE(mm
int rc = 0;
XEN_GUEST_HANDLE(mmuext_op_t) nat_ops;
+ if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
+ likely(guest_handle_is_null(cmp_uops)) )
+ {
+ set_xen_guest_handle(nat_ops, NULL);
+ return do_mmuext_op(nat_ops, count, pdone, foreigndom);
+ }
+
preempt_mask = count & MMU_UPDATE_PREEMPTED;
count ^= preempt_mask;
@@ -370,12 +377,18 @@ int compat_mmuext_op(XEN_GUEST_HANDLE(mm
guest_handle_add_offset(nat_ops, i - left);
guest_handle_subtract_offset(cmp_uops, left);
left = 1;
- BUG_ON(!hypercall_xlat_continuation(&left, 0x01, nat_ops, cmp_uops));
- BUG_ON(left != arg1);
- if (!test_bit(_MCSF_in_multicall, &mcs->flags))
- regs->_ecx += count - i;
+ if ( arg1 != MMU_UPDATE_PREEMPTED )
+ {
+ BUG_ON(!hypercall_xlat_continuation(&left, 0x01, nat_ops,
+ cmp_uops));
+ if ( !test_bit(_MCSF_in_multicall, &mcs->flags) )
+ regs->_ecx += count - i;
+ else
+ mcs->compat_call.args[1] += count - i;
+ }
else
- mcs->compat_call.args[1] += count - i;
+ BUG_ON(hypercall_xlat_continuation(&left, 0));
+ BUG_ON(left != arg1);
}
else
BUG_ON(err > 0);
++++++ 26952-x86-make-page-table-handling-error-paths-preemptible.patch ++++++
References: bnc#816159 CVE-2013-1918 XSA-45
# Commit b8efae696c9a2d46e91fa0eda739427efc16c250
# Date 2013-05-02 16:39:37 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: make page table handling error paths preemptible
... as they may take significant amounts of time.
This requires cloning the tweaked continuation logic from
do_mmuext_op() to do_mmu_update().
Note that in mod_l[34]_entry() a negative "preemptible" value gets
passed to put_page_from_l[34]e() now, telling the callee to store the
respective page in current->arch.old_guest_table (for a hypercall
continuation to pick up), rather than carrying out the put right away.
This is going to be made a little more explicit by a subsequent cleanup
patch.
This is part of CVE-2013-1918 / XSA-45.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1258,7 +1258,16 @@ static int put_page_from_l3e(l3_pgentry_
#endif
if ( unlikely(partial > 0) )
+ {
+ ASSERT(preemptible >= 0);
return __put_page_type(l3e_get_page(l3e), preemptible);
+ }
+
+ if ( preemptible < 0 )
+ {
+ current->arch.old_guest_table = l3e_get_page(l3e);
+ return 0;
+ }
return put_page_and_type_preemptible(l3e_get_page(l3e), preemptible);
}
@@ -1271,7 +1280,17 @@ static int put_page_from_l4e(l4_pgentry_
(l4e_get_pfn(l4e) != pfn) )
{
if ( unlikely(partial > 0) )
+ {
+ ASSERT(preemptible >= 0);
return __put_page_type(l4e_get_page(l4e), preemptible);
+ }
+
+ if ( preemptible < 0 )
+ {
+ current->arch.old_guest_table = l4e_get_page(l4e);
+ return 0;
+ }
+
return put_page_and_type_preemptible(l4e_get_page(l4e), preemptible);
}
return 1;
@@ -1566,12 +1585,17 @@ static int alloc_l3_table(struct page_in
if ( rc < 0 && rc != -EAGAIN && rc != -EINTR )
{
MEM_LOG("Failure in alloc_l3_table: entry %d", i);
+ if ( i )
+ {
+ page->nr_validated_ptes = i;
+ page->partial_pte = 0;
+ current->arch.old_guest_table = page;
+ }
while ( i-- > 0 )
{
if ( !is_guest_l3_slot(i) )
continue;
unadjust_guest_l3e(pl3e[i], d);
- put_page_from_l3e(pl3e[i], pfn, 0, 0);
}
}
@@ -1601,22 +1625,24 @@ static int alloc_l4_table(struct page_in
page->nr_validated_ptes = i;
page->partial_pte = partial ?: 1;
}
- else if ( rc == -EINTR )
+ else if ( rc < 0 )
{
+ if ( rc != -EINTR )
+ MEM_LOG("Failure in alloc_l4_table: entry %d", i);
if ( i )
{
page->nr_validated_ptes = i;
page->partial_pte = 0;
- rc = -EAGAIN;
+ if ( rc == -EINTR )
+ rc = -EAGAIN;
+ else
+ {
+ if ( current->arch.old_guest_table )
+ page->nr_validated_ptes++;
+ current->arch.old_guest_table = page;
+ }
}
}
- else if ( rc < 0 )
- {
- MEM_LOG("Failure in alloc_l4_table: entry %d", i);
- while ( i-- > 0 )
- if ( is_guest_l4_slot(d, i) )
- put_page_from_l4e(pl4e[i], pfn, 0, 0);
- }
if ( rc < 0 )
return rc;
@@ -2064,7 +2090,7 @@ static int mod_l3_entry(l3_pgentry_t *pl
pae_flush_pgd(pfn, pgentry_ptr_to_slot(pl3e), nl3e);
}
- put_page_from_l3e(ol3e, pfn, 0, 0);
+ put_page_from_l3e(ol3e, pfn, 0, -preemptible);
return rc;
}
@@ -2127,7 +2153,7 @@ static int mod_l4_entry(l4_pgentry_t *pl
return -EFAULT;
}
- put_page_from_l4e(ol4e, pfn, 0, 0);
+ put_page_from_l4e(ol4e, pfn, 0, -preemptible);
return rc;
}
@@ -2285,7 +2311,15 @@ static int alloc_page_type(struct page_i
PRtype_info ": caf=%08lx taf=%" PRtype_info,
page_to_mfn(page), get_gpfn_from_mfn(page_to_mfn(page)),
type, page->count_info, page->u.inuse.type_info);
- page->u.inuse.type_info = 0;
+ if ( page != current->arch.old_guest_table )
+ page->u.inuse.type_info = 0;
+ else
+ {
+ ASSERT((page->u.inuse.type_info &
+ (PGT_count_mask | PGT_validated)) == 1);
+ get_page_light(page);
+ page->u.inuse.type_info |= PGT_partial;
+ }
}
else
{
@@ -3235,21 +3269,17 @@ long do_mmuext_op(
}
if ( (rc = xsm_memory_pin_page(d, pg_owner, page)) != 0 )
- {
- put_page_and_type(page);
okay = 0;
- break;
- }
-
- if ( unlikely(test_and_set_bit(_PGT_pinned,
- &page->u.inuse.type_info)) )
+ else if ( unlikely(test_and_set_bit(_PGT_pinned,
+ &page->u.inuse.type_info)) )
{
MEM_LOG("Mfn %lx already pinned", page_to_mfn(page));
- put_page_and_type(page);
okay = 0;
- break;
}
+ if ( unlikely(!okay) )
+ goto pin_drop;
+
/* A page is dirtied when its pin status is set. */
paging_mark_dirty(pg_owner, page_to_mfn(page));
@@ -3263,7 +3293,13 @@ long do_mmuext_op(
&page->u.inuse.type_info));
spin_unlock(&pg_owner->page_alloc_lock);
if ( drop_ref )
- put_page_and_type(page);
+ {
+ pin_drop:
+ if ( type == PGT_l1_page_table )
+ put_page_and_type(page);
+ else
+ curr->arch.old_guest_table = page;
+ }
}
break;
@@ -3669,11 +3705,28 @@ long do_mmu_update(
void *va;
unsigned long gpfn, gmfn, mfn;
struct page_info *page;
- int rc = 0, i = 0;
- unsigned int cmd, done = 0, pt_dom;
- struct vcpu *v = current;
+ unsigned int cmd, i = 0, done = 0, pt_dom;
+ struct vcpu *curr = current, *v = curr;
struct domain *d = v->domain, *pt_owner = d, *pg_owner;
struct domain_mmap_cache mapcache;
+ int rc = put_old_guest_table(curr);
+
+ if ( unlikely(rc) )
+ {
+ if ( likely(rc == -EAGAIN) )
+ rc = hypercall_create_continuation(
+ __HYPERVISOR_mmu_update, "hihi", ureqs, count, pdone,
+ foreigndom);
+ return rc;
+ }
+
+ if ( unlikely(count == MMU_UPDATE_PREEMPTED) &&
+ likely(guest_handle_is_null(ureqs)) )
+ {
+ /* See the curr->arch.old_guest_table related
+ * hypercall_create_continuation() below. */
+ return (int)foreigndom;
+ }
if ( unlikely(count & MMU_UPDATE_PREEMPTED) )
{
@@ -3722,7 +3775,7 @@ long do_mmu_update(
for ( i = 0; i < count; i++ )
{
- if ( hypercall_preempt_check() )
+ if ( curr->arch.old_guest_table || hypercall_preempt_check() )
{
rc = -EAGAIN;
break;
@@ -3903,9 +3956,27 @@ long do_mmu_update(
}
if ( rc == -EAGAIN )
+ {
+ ASSERT(i < count);
rc = hypercall_create_continuation(
__HYPERVISOR_mmu_update, "hihi",
ureqs, (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
+ }
+ else if ( curr->arch.old_guest_table )
+ {
+ XEN_GUEST_HANDLE(void) null;
+
+ ASSERT(rc || i == count);
+ set_xen_guest_handle(null, NULL);
+ /*
+ * In order to have a way to communicate the final return value to
+ * our continuation, we pass this in place of "foreigndom", building
+ * on the fact that this argument isn't needed anymore.
+ */
+ rc = hypercall_create_continuation(
+ __HYPERVISOR_mmu_update, "hihi", null,
+ MMU_UPDATE_PREEMPTED, null, rc);
+ }
put_pg_owner(pg_owner);
++++++ 26953-x86-allow-Dom0-read-only-access-to-IO-APICs.patch ++++++
References: bnc#808085
# Commit d1222afda4d27916580c28533762e362d64ace22
# Date 2013-05-02 16:46:02 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: allow Dom0 read-only access to IO-APICs
There are BIOSes that want to map the IO-APIC MMIO region from some
ACPI method(s), and there is at least one BIOS flavor that wants to
use this mapping to clear an RTE's mask bit. While we can't allow the
latter, we can permit reads and simply drop write attempts, leveraging
the already existing infrastructure introduced for dealing with AMD
IOMMUs' representation as PCI devices.
This fixes an interrupt setup problem on a system where _CRS evaluation
involved the above described BIOS/ACPI behavior, and is expected to
also deal with a boot time crash of pv-ops Linux upon encountering the
same kind of system.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -1253,7 +1253,7 @@ int __init construct_dom0(
for ( i = 0; i < nr_ioapics; i++ )
{
mfn = paddr_to_pfn(mp_ioapics[i].mpc_apicaddr);
- if ( smp_found_config )
+ if ( !rangeset_contains_singleton(mmio_ro_ranges, mfn) )
rc |= iomem_deny_access(dom0, mfn, mfn);
}
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -2524,6 +2524,11 @@ void __init init_ioapic_mappings(void)
reg_01.raw = io_apic_read(i, 1);
nr_ioapic_entries[i] = reg_01.bits.entries + 1;
nr_irqs_gsi += nr_ioapic_entries[i];
+
+ if ( rangeset_add_singleton(mmio_ro_ranges,
+ ioapic_phys >> PAGE_SHIFT) )
+ printk(KERN_ERR "Failed to mark IO-APIC page %lx read-only\n",
+ ioapic_phys);
}
}
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -161,6 +161,8 @@ unsigned long __read_mostly pdx_group_va
bool_t __read_mostly machine_to_phys_mapping_valid = 0;
+struct rangeset *__read_mostly mmio_ro_ranges;
+
#define PAGE_CACHE_ATTRS (_PAGE_PAT|_PAGE_PCD|_PAGE_PWT)
bool_t __read_mostly opt_allow_superpage;
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1220,6 +1220,9 @@ void __init __start_xen(unsigned long mb
zap_low_mappings();
#endif
+ mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
+ RANGESETF_prettyprint_hex);
+
init_apic_mappings();
normalise_cpu_order();
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -27,8 +27,6 @@
#include
#include
-struct rangeset *__read_mostly mmio_ro_ranges;
-
static void hvm_dirq_assist(unsigned long _d);
bool_t pt_irq_need_timer(uint32_t flags)
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -107,8 +107,6 @@ void __init pt_pci_init(void)
radix_tree_init(&pci_segments);
if ( !alloc_pseg(0) )
panic("Could not initialize PCI segment 0\n");
- mmio_ro_ranges = rangeset_new(NULL, "r/o mmio ranges",
- RANGESETF_prettyprint_hex);
}
int __init pci_add_segment(u16 seg)
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -535,6 +535,8 @@ extern bool_t machine_to_phys_mapping_va
_set_gpfn_from_mfn(mfn, pfn); \
} while (0)
+extern struct rangeset *mmio_ro_ranges;
+
#define get_gpfn_from_mfn(mfn) (machine_to_phys_mapping[(mfn)])
#define mfn_to_gmfn(_d, mfn) \
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -37,8 +37,6 @@ extern bool_t amd_iommu_perdev_intremap;
/* Does this domain have a P2M table we can use as its IOMMU pagetable? */
#define iommu_use_hap_pt(d) (hap_enabled(d) && iommu_hap_pt_share)
-extern struct rangeset *mmio_ro_ranges;
-
#define domain_hvm_iommu(d) (&d->arch.hvm_domain.hvm_iommu)
#define MAX_IOMMUS 32
++++++ 26956-x86-mm-preemptible-cleanup.patch ++++++
References: bnc#816159
# Commit b965b31a6bce8c37e67e525fae6da0e2f26d6b2e
# Date 2013-05-02 17:04:14 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: cleanup after making various page table manipulation operations preemptible
This drops the "preemptible" parameters from various functions where
now they can't (or shouldn't, validated by assertions) be run in non-
preemptible mode anymore, to prove that manipulations of at least L3
and L4 page tables and page table entries are now always preemptible,
i.e. the earlier patches actually fulfill their purpose of fixing the
resulting security issue.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1987,7 +1987,7 @@ static int relinquish_memory(
}
if ( test_and_clear_bit(_PGT_pinned, &page->u.inuse.type_info) )
- ret = put_page_and_type_preemptible(page, 1);
+ ret = put_page_and_type_preemptible(page);
switch ( ret )
{
case 0:
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1046,7 +1046,7 @@ get_page_from_l2e(
define_get_linear_pagetable(l3);
static int
get_page_from_l3e(
- l3_pgentry_t l3e, unsigned long pfn, struct domain *d, int partial, int preemptible)
+ l3_pgentry_t l3e, unsigned long pfn, struct domain *d, int partial)
{
int rc;
@@ -1060,7 +1060,7 @@ get_page_from_l3e(
}
rc = get_page_and_type_from_pagenr(
- l3e_get_pfn(l3e), PGT_l2_page_table, d, partial, preemptible);
+ l3e_get_pfn(l3e), PGT_l2_page_table, d, partial, 1);
if ( unlikely(rc == -EINVAL) && get_l3_linear_pagetable(l3e, pfn, d) )
rc = 0;
@@ -1071,7 +1071,7 @@ get_page_from_l3e(
define_get_linear_pagetable(l4);
static int
get_page_from_l4e(
- l4_pgentry_t l4e, unsigned long pfn, struct domain *d, int partial, int preemptible)
+ l4_pgentry_t l4e, unsigned long pfn, struct domain *d, int partial)
{
int rc;
@@ -1085,7 +1085,7 @@ get_page_from_l4e(
}
rc = get_page_and_type_from_pagenr(
- l4e_get_pfn(l4e), PGT_l3_page_table, d, partial, preemptible);
+ l4e_get_pfn(l4e), PGT_l3_page_table, d, partial, 1);
if ( unlikely(rc == -EINVAL) && get_l4_linear_pagetable(l4e, pfn, d) )
rc = 0;
@@ -1239,8 +1239,10 @@ static int put_page_from_l2e(l2_pgentry_
static int __put_page_type(struct page_info *, int preemptible);
static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
- int partial, int preemptible)
+ int partial, bool_t defer)
{
+ struct page_info *pg;
+
if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) || (l3e_get_pfn(l3e) == pfn) )
return 1;
@@ -1259,41 +1261,45 @@ static int put_page_from_l3e(l3_pgentry_
}
#endif
+ pg = l3e_get_page(l3e);
+
if ( unlikely(partial > 0) )
{
- ASSERT(preemptible >= 0);
- return __put_page_type(l3e_get_page(l3e), preemptible);
+ ASSERT(!defer);
+ return __put_page_type(pg, 1);
}
- if ( preemptible < 0 )
+ if ( defer )
{
- current->arch.old_guest_table = l3e_get_page(l3e);
+ current->arch.old_guest_table = pg;
return 0;
}
- return put_page_and_type_preemptible(l3e_get_page(l3e), preemptible);
+ return put_page_and_type_preemptible(pg);
}
#if CONFIG_PAGING_LEVELS >= 4
static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
- int partial, int preemptible)
+ int partial, bool_t defer)
{
if ( (l4e_get_flags(l4e) & _PAGE_PRESENT) &&
(l4e_get_pfn(l4e) != pfn) )
{
+ struct page_info *pg = l4e_get_page(l4e);
+
if ( unlikely(partial > 0) )
{
- ASSERT(preemptible >= 0);
- return __put_page_type(l4e_get_page(l4e), preemptible);
+ ASSERT(!defer);
+ return __put_page_type(pg, 1);
}
- if ( preemptible < 0 )
+ if ( defer )
{
- current->arch.old_guest_table = l4e_get_page(l4e);
+ current->arch.old_guest_table = pg;
return 0;
}
- return put_page_and_type_preemptible(l4e_get_page(l4e), preemptible);
+ return put_page_and_type_preemptible(pg);
}
return 1;
}
@@ -1511,7 +1517,7 @@ static int alloc_l2_table(struct page_in
return rc > 0 ? 0 : rc;
}
-static int alloc_l3_table(struct page_info *page, int preemptible)
+static int alloc_l3_table(struct page_info *page)
{
struct domain *d = page_get_owner(page);
unsigned long pfn = page_to_mfn(page);
@@ -1558,11 +1564,10 @@ static int alloc_l3_table(struct page_in
rc = get_page_and_type_from_pagenr(l3e_get_pfn(pl3e[i]),
PGT_l2_page_table |
PGT_pae_xen_l2,
- d, partial, preemptible);
+ d, partial, 1);
}
else if ( !is_guest_l3_slot(i) ||
- (rc = get_page_from_l3e(pl3e[i], pfn, d,
- partial, preemptible)) > 0 )
+ (rc = get_page_from_l3e(pl3e[i], pfn, d, partial)) > 0 )
continue;
if ( rc == -EAGAIN )
@@ -1606,7 +1611,7 @@ static int alloc_l3_table(struct page_in
}
#if CONFIG_PAGING_LEVELS >= 4
-static int alloc_l4_table(struct page_info *page, int preemptible)
+static int alloc_l4_table(struct page_info *page)
{
struct domain *d = page_get_owner(page);
unsigned long pfn = page_to_mfn(page);
@@ -1618,8 +1623,7 @@ static int alloc_l4_table(struct page_in
i++, partial = 0 )
{
if ( !is_guest_l4_slot(d, i) ||
- (rc = get_page_from_l4e(pl4e[i], pfn, d,
- partial, preemptible)) > 0 )
+ (rc = get_page_from_l4e(pl4e[i], pfn, d, partial)) > 0 )
continue;
if ( rc == -EAGAIN )
@@ -1664,7 +1668,7 @@ static int alloc_l4_table(struct page_in
return rc > 0 ? 0 : rc;
}
#else
-#define alloc_l4_table(page, preemptible) (-EINVAL)
+#define alloc_l4_table(page) (-EINVAL)
#endif
@@ -1716,7 +1720,7 @@ static int free_l2_table(struct page_inf
return err;
}
-static int free_l3_table(struct page_info *page, int preemptible)
+static int free_l3_table(struct page_info *page)
{
struct domain *d = page_get_owner(page);
unsigned long pfn = page_to_mfn(page);
@@ -1729,7 +1733,7 @@ static int free_l3_table(struct page_inf
do {
if ( is_guest_l3_slot(i) )
{
- rc = put_page_from_l3e(pl3e[i], pfn, partial, preemptible);
+ rc = put_page_from_l3e(pl3e[i], pfn, partial, 0);
if ( rc < 0 )
break;
partial = 0;
@@ -1756,7 +1760,7 @@ static int free_l3_table(struct page_inf
}
#if CONFIG_PAGING_LEVELS >= 4
-static int free_l4_table(struct page_info *page, int preemptible)
+static int free_l4_table(struct page_info *page)
{
struct domain *d = page_get_owner(page);
unsigned long pfn = page_to_mfn(page);
@@ -1766,7 +1770,7 @@ static int free_l4_table(struct page_inf
do {
if ( is_guest_l4_slot(d, i) )
- rc = put_page_from_l4e(pl4e[i], pfn, partial, preemptible);
+ rc = put_page_from_l4e(pl4e[i], pfn, partial, 0);
if ( rc < 0 )
break;
partial = 0;
@@ -1786,7 +1790,7 @@ static int free_l4_table(struct page_inf
return rc > 0 ? 0 : rc;
}
#else
-#define free_l4_table(page, preemptible) (-EINVAL)
+#define free_l4_table(page) (-EINVAL)
#endif
int page_lock(struct page_info *page)
@@ -2025,7 +2029,6 @@ static int mod_l3_entry(l3_pgentry_t *pl
l3_pgentry_t nl3e,
unsigned long pfn,
int preserve_ad,
- int preemptible,
struct vcpu *vcpu)
{
l3_pgentry_t ol3e;
@@ -2065,7 +2068,7 @@ static int mod_l3_entry(l3_pgentry_t *pl
return rc ? 0 : -EFAULT;
}
- rc = get_page_from_l3e(nl3e, pfn, d, 0, preemptible);
+ rc = get_page_from_l3e(nl3e, pfn, d, 0);
if ( unlikely(rc < 0) )
return rc;
rc = 0;
@@ -2092,7 +2095,7 @@ static int mod_l3_entry(l3_pgentry_t *pl
pae_flush_pgd(pfn, pgentry_ptr_to_slot(pl3e), nl3e);
}
- put_page_from_l3e(ol3e, pfn, 0, -preemptible);
+ put_page_from_l3e(ol3e, pfn, 0, 1);
return rc;
}
@@ -2103,7 +2106,6 @@ static int mod_l4_entry(l4_pgentry_t *pl
l4_pgentry_t nl4e,
unsigned long pfn,
int preserve_ad,
- int preemptible,
struct vcpu *vcpu)
{
struct domain *d = vcpu->domain;
@@ -2136,7 +2138,7 @@ static int mod_l4_entry(l4_pgentry_t *pl
return rc ? 0 : -EFAULT;
}
- rc = get_page_from_l4e(nl4e, pfn, d, 0, preemptible);
+ rc = get_page_from_l4e(nl4e, pfn, d, 0);
if ( unlikely(rc < 0) )
return rc;
rc = 0;
@@ -2155,7 +2157,7 @@ static int mod_l4_entry(l4_pgentry_t *pl
return -EFAULT;
}
- put_page_from_l4e(ol4e, pfn, 0, -preemptible);
+ put_page_from_l4e(ol4e, pfn, 0, 1);
return rc;
}
@@ -2277,10 +2279,12 @@ static int alloc_page_type(struct page_i
rc = alloc_l2_table(page, type, preemptible);
break;
case PGT_l3_page_table:
- rc = alloc_l3_table(page, preemptible);
+ ASSERT(preemptible);
+ rc = alloc_l3_table(page);
break;
case PGT_l4_page_table:
- rc = alloc_l4_table(page, preemptible);
+ ASSERT(preemptible);
+ rc = alloc_l4_table(page);
break;
case PGT_seg_desc_page:
rc = alloc_segdesc_page(page);
@@ -2374,10 +2378,12 @@ int free_page_type(struct page_info *pag
if ( !(type & PGT_partial) )
page->nr_validated_ptes = L3_PAGETABLE_ENTRIES;
#endif
- rc = free_l3_table(page, preemptible);
+ ASSERT(preemptible);
+ rc = free_l3_table(page);
break;
case PGT_l4_page_table:
- rc = free_l4_table(page, preemptible);
+ ASSERT(preemptible);
+ rc = free_l4_table(page);
break;
default:
MEM_LOG("type %lx pfn %lx\n", type, page_to_mfn(page));
@@ -2868,7 +2874,7 @@ static int put_old_guest_table(struct vc
if ( !v->arch.old_guest_table )
return 0;
- switch ( rc = put_page_and_type_preemptible(v->arch.old_guest_table, 1) )
+ switch ( rc = put_page_and_type_preemptible(v->arch.old_guest_table) )
{
case -EINTR:
case -EAGAIN:
@@ -2900,7 +2906,7 @@ int vcpu_destroy_pagetables(struct vcpu
if ( paging_mode_refcounts(v->domain) )
put_page(page);
else
- rc = put_page_and_type_preemptible(page, 1);
+ rc = put_page_and_type_preemptible(page);
}
#ifdef __x86_64__
@@ -2926,7 +2932,7 @@ int vcpu_destroy_pagetables(struct vcpu
if ( paging_mode_refcounts(v->domain) )
put_page(page);
else
- rc = put_page_and_type_preemptible(page, 1);
+ rc = put_page_and_type_preemptible(page);
}
if ( !rc )
v->arch.guest_table_user = pagetable_null();
@@ -2955,7 +2961,7 @@ int new_guest_cr3(unsigned long mfn)
l4e_from_pfn(
mfn,
(_PAGE_PRESENT|_PAGE_RW|_PAGE_USER|_PAGE_ACCESSED)),
- pagetable_get_pfn(curr->arch.guest_table), 0, 1, curr);
+ pagetable_get_pfn(curr->arch.guest_table), 0, curr);
switch ( rc )
{
case 0:
@@ -3018,7 +3024,7 @@ int new_guest_cr3(unsigned long mfn)
if ( paging_mode_refcounts(d) )
put_page(page);
else
- switch ( rc = put_page_and_type_preemptible(page, 1) )
+ switch ( rc = put_page_and_type_preemptible(page) )
{
case -EINTR:
rc = -EAGAIN;
@@ -3329,7 +3335,7 @@ long do_mmuext_op(
break;
}
- switch ( rc = put_page_and_type_preemptible(page, 1) )
+ switch ( rc = put_page_and_type_preemptible(page) )
{
case -EINTR:
case -EAGAIN:
@@ -3407,7 +3413,7 @@ long do_mmuext_op(
if ( paging_mode_refcounts(d) )
put_page(page);
else
- switch ( rc = put_page_and_type_preemptible(page, 1) )
+ switch ( rc = put_page_and_type_preemptible(page) )
{
case -EINTR:
rc = -EAGAIN;
@@ -3884,12 +3890,12 @@ long do_mmu_update(
break;
case PGT_l3_page_table:
rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn,
- cmd == MMU_PT_UPDATE_PRESERVE_AD, 1, v);
+ cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
break;
#if CONFIG_PAGING_LEVELS >= 4
case PGT_l4_page_table:
rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
- cmd == MMU_PT_UPDATE_PRESERVE_AD, 1, v);
+ cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
break;
#endif
case PGT_writable_page:
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -384,15 +384,10 @@ static inline void put_page_and_type(str
put_page(page);
}
-static inline int put_page_and_type_preemptible(struct page_info *page,
- int preemptible)
+static inline int put_page_and_type_preemptible(struct page_info *page)
{
- int rc = 0;
+ int rc = put_page_type_preemptible(page);
- if ( preemptible )
- rc = put_page_type_preemptible(page);
- else
- put_page_type(page);
if ( likely(rc == 0) )
put_page(page);
return rc;
++++++ 26958-VT-d-don-t-permit-SVT_NO_VERIFY-entries-for-known-device-types.patch ++++++
References: bnc#8161663 CVE-2013-1952 XSA-49
# Commit 63cec00679cc65ab5d5a9447a62d5202f155b78c
# Date 2013-05-02 17:08:58 +0200
# Author Jan Beulich
# Committer Jan Beulich
VT-d: don't permit SVT_NO_VERIFY entries for known device types
Only in cases where we don't know what to do we should leave the IRTE
blank (suppressing all validation), but we should always log a warning
in those cases (as being insecure).
This is CVE-2013-1952 / XSA-49.
Signed-off-by: Jan Beulich
Acked-by: "Zhang, Xiantao"
--- a/xen/drivers/passthrough/vtd/intremap.c
+++ b/xen/drivers/passthrough/vtd/intremap.c
@@ -440,12 +440,9 @@ static void set_msi_source_id(struct pci
{
unsigned int sq;
+ case DEV_TYPE_PCIe_ENDPOINT:
case DEV_TYPE_PCIe_BRIDGE:
case DEV_TYPE_PCIe2PCI_BRIDGE:
- case DEV_TYPE_LEGACY_PCI_BRIDGE:
- break;
-
- case DEV_TYPE_PCIe_ENDPOINT:
switch ( pdev->phantom_stride )
{
case 1: sq = SQ_13_IGNORE_3; break;
@@ -457,6 +454,8 @@ static void set_msi_source_id(struct pci
break;
case DEV_TYPE_PCI:
+ case DEV_TYPE_LEGACY_PCI_BRIDGE:
+ case DEV_TYPE_PCI2PCIe_BRIDGE:
ret = find_upstream_bridge(seg, &bus, &devfn, &secbus);
if ( ret == 0 ) /* integrated PCI device */
{
@@ -468,10 +467,15 @@ static void set_msi_source_id(struct pci
if ( pdev_type(seg, bus, devfn) == DEV_TYPE_PCIe2PCI_BRIDGE )
set_ire_sid(ire, SVT_VERIFY_BUS, SQ_ALL_16,
(bus << 8) | pdev->bus);
- else if ( pdev_type(seg, bus, devfn) == DEV_TYPE_LEGACY_PCI_BRIDGE )
+ else
set_ire_sid(ire, SVT_VERIFY_SID_SQ, SQ_ALL_16,
PCI_BDF2(bus, devfn));
}
+ else
+ dprintk(XENLOG_WARNING VTDPREFIX,
+ "d%d: no upstream bridge for %04x:%02x:%02x.%u\n",
+ pdev->domain->domain_id,
+ seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
break;
default:
++++++ 26966-x86-handle-paged-gfn-in-wrmsr_hypervisor_regs.patch ++++++
# Commit 013e34f5a61725012467f17650597d351fc0ca99
# Date 2013-05-07 16:41:24 +0200
# Author Olaf Hering
# Committer Jan Beulich
x86: handle paged gfn in wrmsr_hypervisor_regs
If xenpaging is started very early for a guest the gfn for the hypercall
page may be paged-out already. This leads to a guest crash:
...
(XEN) HVM10: Allocated Xen hypercall page at 169ff000
(XEN) traps.c:654:d10 Bad GMFN 169ff (MFN 3e900000000) to MSR 40000000
(XEN) HVM10: Detected Xen v4.3
(XEN) io.c:201:d10 MMIO emulation failed @ 0008:c2c2c2c2: 18 7c 55 6d 03 83 ff ff 10 7c
(XEN) hvm.c:1253:d10 Triple fault on VCPU0 - invoking HVM shutdown action 1.
(XEN) HVM11: HVM Loader
...
Update return codes of wrmsr_hypervisor_regs, update callers to deal
with the new return codes:
0: not handled
1: handled
-EAGAIN: retry
Currently wrmsr_hypervisor_regs will not return the following error, it
will be added in a separate patch:
-EINVAL: error during handling
Also update the gdprintk to handle a page value of NULL to avoid
printing a bogus MFN value. Update also computing of MSR value in
gdprintk, the idx was always zero.
Signed-off-by: Olaf Hering
Acked-by: Keir Fraser
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1590,7 +1590,7 @@ static int svm_msr_read_intercept(unsign
static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
{
- int ret;
+ int ret, result = X86EMUL_OKAY;
struct vcpu *v = current;
struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
int sync = 0;
@@ -1703,14 +1703,24 @@ static int svm_msr_write_intercept(unsig
if ( wrmsr_viridian_regs(msr, msr_content) )
break;
- wrmsr_hypervisor_regs(msr, msr_content);
+ switch ( wrmsr_hypervisor_regs(msr, msr_content) )
+ {
+ case -EAGAIN:
+ result = X86EMUL_RETRY;
+ break;
+ case 0:
+ case 1:
+ break;
+ default:
+ goto gpf;
+ }
break;
}
if ( sync )
svm_vmload(vmcb);
- return X86EMUL_OKAY;
+ return result;
gpf:
hvm_inject_hw_exception(TRAP_gp_fault, 0);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2214,7 +2214,16 @@ static int vmx_msr_write_intercept(unsig
case HNDL_unhandled:
if ( (vmx_write_guest_msr(msr, msr_content) != 0) &&
!is_last_branch_msr(msr) )
- wrmsr_hypervisor_regs(msr, msr_content);
+ switch ( wrmsr_hypervisor_regs(msr, msr_content) )
+ {
+ case -EAGAIN:
+ return X86EMUL_RETRY;
+ case 0:
+ case 1:
+ break;
+ default:
+ goto gp_fault;
+ }
break;
case HNDL_exception_raised:
return X86EMUL_EXCEPTION;
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -665,6 +665,7 @@ int wrmsr_hypervisor_regs(uint32_t idx,
unsigned long gmfn = val >> 12;
unsigned int idx = val & 0xfff;
struct page_info *page;
+ p2m_type_t t;
if ( idx > 0 )
{
@@ -674,15 +675,22 @@ int wrmsr_hypervisor_regs(uint32_t idx,
return 0;
}
- page = get_page_from_gfn(d, gmfn, NULL, P2M_ALLOC);
+ page = get_page_from_gfn(d, gmfn, &t, P2M_ALLOC);
if ( !page || !get_page_type(page, PGT_writable_page) )
{
if ( page )
put_page(page);
+
+ if ( p2m_is_paging(t) )
+ {
+ p2m_mem_paging_populate(d, gmfn);
+ return -EAGAIN;
+ }
+
gdprintk(XENLOG_WARNING,
"Bad GMFN %lx (MFN %lx) to MSR %08x\n",
- gmfn, page_to_mfn(page), base + idx);
+ gmfn, page ? page_to_mfn(page) : -1UL, base);
return 0;
}
@@ -2579,7 +2587,7 @@ static int emulate_privileged_op(struct
goto fail;
break;
default:
- if ( wrmsr_hypervisor_regs(regs->ecx, msr_content) )
+ if ( wrmsr_hypervisor_regs(regs->ecx, msr_content) == 1 )
break;
rc = vmce_wrmsr(regs->ecx, msr_content);
++++++ 27071-x86-IO-APIC-fix-guest-RTE-write-corner-cases.patch ++++++
References: bnc#817210
# Commit 30256a0ff17f6f3b1278b85103187341d5b0ac42
# Date 2013-05-15 10:52:02 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86/IO-APIC: fix guest RTE write corner cases
This fixes two regressions from c/s 20143:a7de5bd776ca ("x86: Make the
hypercall PHYSDEVOP_alloc_irq_vector hypercall dummy"):
For one, IRQs that had their vector set up by Xen internally without a
handler ever having got set (e.g. via "com<n>=..." without a matching
consumer option like "console=com<n>") would wrongly call
add_pin_to_irq() here, triggering the BUG_ON() in that function.
Second, when assign_irq_vector() fails this addition to irq_2_pin[]
needs to be undone.
In the context of this I'm also surprised that the irq_2_pin[]
manipulations here occur without any lock, i.e. rely on Dom0 to do
some sort of serialization.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -141,6 +141,37 @@ static void add_pin_to_irq(unsigned int
share_vector_maps(irq_2_pin[irq].apic, apic);
}
+static void remove_pin_from_irq(unsigned int irq, int apic, int pin)
+{
+ struct irq_pin_list *entry, *prev;
+
+ for (entry = &irq_2_pin[irq]; ; entry = &irq_2_pin[entry->next]) {
+ if ((entry->apic == apic) && (entry->pin == pin))
+ break;
+ BUG_ON(!entry->next);
+ }
+
+ entry->pin = entry->apic = -1;
+
+ if (entry != &irq_2_pin[irq]) {
+ /* Removed entry is not at head of list. */
+ prev = &irq_2_pin[irq];
+ while (&irq_2_pin[prev->next] != entry)
+ prev = &irq_2_pin[prev->next];
+ prev->next = entry->next;
+ } else if (entry->next) {
+ /* Removed entry is at head of multi-item list. */
+ prev = entry;
+ entry = &irq_2_pin[entry->next];
+ *prev = *entry;
+ entry->pin = entry->apic = -1;
+ } else
+ return;
+
+ entry->next = irq_2_pin_free_entry;
+ irq_2_pin_free_entry = entry - irq_2_pin;
+}
+
/*
* Reroute an IRQ to a different pin.
*/
@@ -2292,7 +2323,7 @@ int ioapic_guest_read(unsigned long phys
int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val)
{
- int apic, pin, irq, ret, vector, pirq;
+ int apic, pin, irq, ret, pirq;
struct IO_APIC_route_entry rte = { 0 };
unsigned long flags;
struct irq_desc *desc;
@@ -2360,13 +2391,25 @@ int ioapic_guest_write(unsigned long phy
return 0;
}
- if ( desc->arch.vector <= 0 || desc->arch.vector > LAST_DYNAMIC_VECTOR ) {
- add_pin_to_irq(irq, apic, pin);
- vector = assign_irq_vector(irq, NULL);
- if ( vector < 0 )
- return vector;
+ if ( desc->arch.vector <= 0 || desc->arch.vector > LAST_DYNAMIC_VECTOR )
+ {
+ int vector = desc->arch.vector;
+
+ if ( vector < FIRST_HIPRIORITY_VECTOR )
+ add_pin_to_irq(irq, apic, pin);
+ else
+ desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
+ ret = assign_irq_vector(irq, NULL);
+ if ( ret < 0 )
+ {
+ if ( vector < FIRST_HIPRIORITY_VECTOR )
+ remove_pin_from_irq(irq, apic, pin);
+ else
+ desc->arch.vector = vector;
+ return ret;
+ }
- printk(XENLOG_INFO "allocated vector %02x for irq %d\n", vector, irq);
+ printk(XENLOG_INFO "allocated vector %02x for irq %d\n", ret, irq);
}
spin_lock(&dom0->event_lock);
ret = map_domain_pirq(dom0, pirq, irq,
++++++ 27072-x86-shadow-fix-off-by-one-in-MMIO-permission-check.patch ++++++
# Commit afa65ddfd88184a894d9364bec587554c28c20e0
# Date 2013-05-15 14:34:05 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86/shadow: fix off-by-one in MMIO permission check
iomem_access_permitted() wants an inclusive range as input.
Also use pfn_to_paddr() in nearby code instead of open coding it.
Signed-off-by: Jan Beulich
Acked-by: Tim Deegan
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -603,13 +603,13 @@ _sh_propagate(struct vcpu *v,
else if ( d->arch.hvm_domain.is_in_uc_mode )
sflags |= pat_type_2_pte_flags(PAT_TYPE_UNCACHABLE);
else
- if ( iomem_access_permitted(d, mfn_x(target_mfn), mfn_x(target_mfn) + 1) )
+ if ( iomem_access_permitted(d, mfn_x(target_mfn), mfn_x(target_mfn)) )
{
if ( p2mt == p2m_mmio_direct )
sflags |= get_pat_flags(v,
gflags,
gfn_to_paddr(target_gfn),
- ((paddr_t)mfn_x(target_mfn)) << PAGE_SHIFT,
+ pfn_to_paddr(mfn_x(target_mfn)),
MTRR_TYPE_UNCACHABLE);
else if ( iommu_snoop )
sflags |= pat_type_2_pte_flags(PAT_TYPE_WRBACK);
@@ -617,7 +617,7 @@ _sh_propagate(struct vcpu *v,
sflags |= get_pat_flags(v,
gflags,
gfn_to_paddr(target_gfn),
- ((paddr_t)mfn_x(target_mfn)) << PAGE_SHIFT,
+ pfn_to_paddr(mfn_x(target_mfn)),
NO_HARDCODE_MEM_TYPE);
}
}
++++++ 27074-x86-hvm-avoid-p2m-lookups-for-vlapic-accesses.patch ++++++
# Commit 5d43891bf4002b754cd90d83e91d9190e8c8b9d0
# Date 2013-05-16 12:05:25 +0100
# Author Tim Deegan
# Committer Tim Deegan
x86/hvm: avoid p2m lookups for vlapic accesses.
The LAPIC base address is a known GFN, so we can skip looking up the
p2m: we know it should be handled as emulated MMIO. That helps
performance in older Windows OSes, which make a _lot_ of TPR accesses.
This will change the behaviour of any OS that maps other
memory/devices at its LAPIC address; the new behaviour (the LAPIC
mapping always wins) is closer to actual hardware behaviour.
Signed-off-by: Tim Deegan
Acked-by: Jan Beulich
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1361,6 +1361,17 @@ int hvm_hap_nested_page_fault(paddr_t gp
}
}
+ /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
+ * a fast path for LAPIC accesses, skipping the p2m lookup. */
+ if ( !nestedhvm_vcpu_in_guestmode(v)
+ && gfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
+ {
+ if ( !handle_mmio() )
+ hvm_inject_hw_exception(TRAP_gp_fault, 0);
+ rc = 1;
+ goto out;
+ }
+
p2m = p2m_get_hostp2m(v->domain);
mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL);
@@ -2461,6 +2472,12 @@ static enum hvm_copy_result __hvm_copy(
gfn = addr >> PAGE_SHIFT;
}
+ /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
+ * a fast path for LAPIC accesses, skipping the p2m lookup. */
+ if ( !nestedhvm_vcpu_in_guestmode(curr)
+ && gfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(curr))) )
+ return HVMCOPY_bad_gfn_to_mfn;
+
page = get_page_from_gfn(curr->domain, gfn, &p2mt, P2M_UNSHARE);
if ( p2m_is_paging(p2mt) )
++++++ 27079-fix-XSA-46-regression-with-xend-xm.patch ++++++
References: bnc#813675, bnc#821111
# Commit 934a5253d932b6f67fe40fc48975a2b0117e4cce
# Date 2013-05-21 11:32:34 +0200
# Author Jan Beulich
# Committer Jan Beulich
fix XSA-46 regression with xend/xm
The hypervisor side changes for XSA-46 require the tool stack to now
always map the guest pIRQ before granting access permission to the
underlying host IRQ (GSI). This in particular requires that pciif.py
no longer can skip this step (assuming qemu would do it) for HVM
guests.
This in turn exposes, however, an inconsistency between xend and qemu:
The former wants to always establish 1:1 mappings between pIRQ and host
IRQ (for non-MSI only of course), while the latter always wants to
allocate an arbitrary mapping. Since the whole tool stack obviously
should always agree on the mapping model, make libxc enforce the 1:1
mapping as the more natural one (as well as being the one that allows
for easier debugging, since there no need to find out the extra
mapping). Users of libxc that want to establish a particular (rather
than an allocated) mapping are still free to do so, as well as tool
stacks not based on libxc wanting to implement an allocation based
model (which is why it's not the hypervisor that's being changed to
enforce either model).
Since libxl, like xend, already uses a 1:1 model, it's unaffected by
the libxc change (and it being unaffected by the original hypervisor
side changes is - afaict - simply due to qemu getting spawned at a
later point in time compared to the xend event flow).
Signed-off-by: Jan Beulich
Tested-by: Andreas Falck (on 4.1)
Tested-by: Gordan Bobic (on 4.2)
Acked-by: Ian Campbell
Reviewed-by: Andrew Cooper
--- a/tools/libxc/xc_physdev.c
+++ b/tools/libxc/xc_physdev.c
@@ -49,7 +49,7 @@ int xc_physdev_map_pirq(xc_interface *xc
map.domid = domid;
map.type = MAP_PIRQ_TYPE_GSI;
map.index = index;
- map.pirq = *pirq;
+ map.pirq = *pirq < 0 ? index : *pirq;
rc = do_physdev_op(xch, PHYSDEVOP_map_pirq, &map, sizeof(map));
--- a/tools/python/xen/xend/server/pciif.py
+++ b/tools/python/xen/xend/server/pciif.py
@@ -340,7 +340,7 @@ class PciController(DevController):
raise VmError(('pci: failed to configure I/O memory on device '+
'%s - errno=%d')%(dev.name,rc))
- if not self.vm.info.is_hvm() and dev.irq:
+ if dev.irq > 0:
rc = xc.physdev_map_pirq(domid = fe_domid,
index = dev.irq,
pirq = dev.irq)
++++++ 27085-x86-fix-boot-time-APIC-mode-detection.patch ++++++
References: bnc#817904
# Commit 234c4dde2fd4f1182fe1a6bea6bced83fe363007
# Date 2013-05-23 13:08:32 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: fix boot time APIC mode detection
current_cpu_data becomes valid only relatively late in the boot
process, so looking there for a particular feature early in the game
would generally give the appearance of the feature being unavailable.
Getting this wrong means that at kexec time the system would get
returned to xAPIC mode, causing disconnect_bsp_APIC() to try to access
the APIC page, which on systems with x2APIC pre-enabled will never get
set up.
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
Reviewed-by: Andrew Cooper
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1473,7 +1473,7 @@ enum apic_mode current_local_apic_mode(v
/* Reading EXTD bit from the MSR is only valid if CPUID
* says so, else reserved */
- if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC)
+ if ( boot_cpu_has(X86_FEATURE_X2APIC)
&& (msr_contents & MSR_IA32_APICBASE_EXTD) )
return APIC_MODE_X2APIC;
++++++ 27093-libxc-limit-cpu-values-when-setting-vcpu-affinity.patch ++++++
References: bnc#819416 CVE-2013-2072 XSA-56
# Commit 41abbadef60e5fccdfd688579dd458f7f7887cf5
# Date 2013-05-29 15:49:22 +0100
# Author Ian Jackson
# Committer Ian Jackson
libxc: limit cpu values when setting vcpu affinity
When support for pinning more than 64 cpus was added, check for cpu
out-of-range values was removed. This can lead to subsequent
out-of-bounds cpumap array accesses in case the cpu number is higher
than the actual count.
This patch returns the check.
This is CVE-2013-2072 / XSA-56
Signed-off-by: Petr Matousek
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -228,6 +228,7 @@ static PyObject *pyxc_vcpu_setaffinity(X
int vcpu = 0, i;
xc_cpumap_t cpumap;
PyObject *cpulist = NULL;
+ int nr_cpus;
static char *kwd_list[] = { "domid", "vcpu", "cpumap", NULL };
@@ -235,6 +236,10 @@ static PyObject *pyxc_vcpu_setaffinity(X
&dom, &vcpu, &cpulist) )
return NULL;
+ nr_cpus = xc_get_max_cpus(self->xc_handle);
+ if ( nr_cpus == 0 )
+ return pyxc_error_to_exception(self->xc_handle);
+
cpumap = xc_cpumap_alloc(self->xc_handle);
if(cpumap == NULL)
return pyxc_error_to_exception(self->xc_handle);
@@ -244,6 +249,13 @@ static PyObject *pyxc_vcpu_setaffinity(X
for ( i = 0; i < PyList_Size(cpulist); i++ )
{
long cpu = PyInt_AsLong(PyList_GetItem(cpulist, i));
+ if ( cpu < 0 || cpu >= nr_cpus )
+ {
+ free(cpumap);
+ errno = EINVAL;
+ PyErr_SetFromErrno(xc_error_obj);
+ return NULL;
+ }
cpumap[cpu / 8] |= 1 << (cpu % 8);
}
}
++++++ 27112-x86-fix-ordering-of-operations-in-destroy_irq.patch ++++++
# Commit 43427e65ccfea0c6dc0232f358287e6cc616b507
# Date 2013-05-31 08:35:22 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86: fix ordering of operations in destroy_irq()
The fix for XSA-36, switching the default of vector map management to
be per-device, exposed more readily a problem with the cleanup of these
vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
keeps the subsequently invoked clear_irq_vector() from clearing the
bits for both the in-use and a possibly still outstanding old vector.
Fix this by folding dynamic_irq_cleanup() into destroy_irq(), which was
its only caller, deferring the clearing of the vector map pointer until
after clear_irq_vector().
Once at it, also defer resetting of desc->handler until after the loop
around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
(mostly theoretical) issue with the intercation with do_IRQ(): If we
don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
->ack() and ->end() with different ->handler pointers, potentially
leading to an IRQ remaining un-acked. The issue is mostly theoretical
because non-guest IRQs are subject to destroy_irq() only on (boot time)
error paths.
As to the changed locking: Invoking clear_irq_vector() with desc->lock
held is okay because vector_lock already nests inside desc->lock (proven
by set_desc_affinity(), which takes vector_lock and gets called from
various desc->handler->ack implementations, getting invoked with
desc->lock held).
Reported-by: Andrew Cooper
Signed-off-by: Jan Beulich
Acked-by: Keir Fraser
Reviewed-by: Andrew Cooper
Acked-by: George Dunlap
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -196,12 +196,24 @@ int create_irq(int node)
return irq;
}
-static void dynamic_irq_cleanup(unsigned int irq)
+void destroy_irq(unsigned int irq)
{
struct irq_desc *desc = irq_to_desc(irq);
unsigned long flags;
struct irqaction *action;
+ BUG_ON(!MSI_IRQ(irq));
+
+ if ( dom0 )
+ {
+ int err = irq_deny_access(dom0, irq);
+
+ if ( err )
+ printk(XENLOG_G_ERR
+ "Could not revoke Dom0 access to IRQ%u (error %d)\n",
+ irq, err);
+ }
+
spin_lock_irqsave(&desc->lock, flags);
desc->status |= IRQ_DISABLED;
desc->status &= ~IRQ_GUEST;
@@ -209,16 +221,19 @@ static void dynamic_irq_cleanup(unsigned
action = desc->action;
desc->action = NULL;
desc->msi_desc = NULL;
- desc->handler = &no_irq_type;
- desc->arch.used_vectors = NULL;
cpumask_setall(desc->affinity);
spin_unlock_irqrestore(&desc->lock, flags);
/* Wait to make sure it's not being used on another CPU */
do { smp_mb(); } while ( desc->status & IRQ_INPROGRESS );
- if (action)
- xfree(action);
+ spin_lock_irqsave(&desc->lock, flags);
+ desc->handler = &no_irq_type;
+ clear_irq_vector(irq);
+ desc->arch.used_vectors = NULL;
+ spin_unlock_irqrestore(&desc->lock, flags);
+
+ xfree(action);
}
static void __clear_irq_vector(int irq)
@@ -285,24 +300,6 @@ void clear_irq_vector(int irq)
spin_unlock_irqrestore(&vector_lock, flags);
}
-void destroy_irq(unsigned int irq)
-{
- BUG_ON(!MSI_IRQ(irq));
-
- if ( dom0 )
- {
- int err = irq_deny_access(dom0, irq);
-
- if ( err )
- printk(XENLOG_G_ERR
- "Could not revoke Dom0 access to IRQ%u (error %d)\n",
- irq, err);
- }
-
- dynamic_irq_cleanup(irq);
- clear_irq_vector(irq);
-}
-
int irq_to_vector(int irq)
{
int vector = -1;
++++++ 27117-x86-MCE-disable-if-MCE-banks-are-not-present.patch ++++++
# Commit 5cffb77c4072fa5b46700a2dbb3e46c5a54eba6d
# Date 2013-06-03 15:42:46 +0200
# Author Aravindh Puthiyaparambil
# Committer Jan Beulich
x86/MCE: disable if MCE banks are not present
When booting Xen on VMware ESX 5.1 and Workstation 9, you hit a GPF
during MCE initialization. The culprit is line 631 in
set_poll_bankmask():
bitmap_copy(mb->bank_map, mca_allbanks->bank_map, nr_mce_banks);
What is happening is that in mca_cap_init(), nr_mce_banks is being set
to 0. This causes the allocation of bank_map to be set to
ZERO_BLOCK_PTR which is the return value for zero-size allocation by
xzalloc_array()/_xmalloc(). This results in the bitmap_copy() to fail
disastrously. The following patch fixes this issue.
Signed-off-by: Aravindh Puthiyaparambil
Reviewed-by: Andrew Cooper
Acked-by: Christoph Egger
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -739,7 +739,14 @@ int mca_cap_init(void)
}
nr_mce_banks = msr_content & MCG_CAP_COUNT;
- /* mcabanks_alloc depends on nr_mcebanks */
+ if (!nr_mce_banks)
+ {
+ printk(XENLOG_INFO "CPU%u: No MCE banks present. "
+ "Machine check support disabled\n", smp_processor_id());
+ return -ENODEV;
+ }
+
+ /* mcabanks_alloc depends on nr_mce_banks */
if (!mca_allbanks)
{
int i;
++++++ 27118-x86-xsave-fix-information-leak-on-AMD-CPUs.patch ++++++
References: bnc#820917 CVE-2013-2076 XSA-52.
# Commit 8dcf9f0113454f233089e8e5bb3970d891928410
# Date 2013-06-04 09:26:54 +0200
# Author Jan Beulich
# Committer Jan Beulich
x86/xsave: fix information leak on AMD CPUs
Just like for FXSAVE/FXRSTOR, XSAVE/XRSTOR also don't save/restore the
last instruction and operand pointers as well as the last opcode if
there's no pending unmasked exception (see CVE-2006-1056 and commit
9747:4d667a139318).
While the FXSR solution sits in the save path, I prefer to have this in
the restore path because there the handling is simpler (namely in the
context of the pending changes to properly save the selector values for
32-bit guest code).
Also this is using FFREE instead of EMMS, as it doesn't seem unlikely
that in the future we may see CPUs with x87 and SSE/AVX but no MMX
support. The goal here anyway is just to avoid an FPU stack overflow.
I would have preferred to use FFREEP instead of FFREE (freeing two
stack slots at once), but AMD doesn't document that instruction.
This is CVE-2013-2076 / XSA-52.
Signed-off-by: Jan Beulich
Tested-by: Suravee Suthikulpanit