Hello community, here is the log from the commit of package xen for openSUSE:Factory checked in at 2019-08-07 13:54:53 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Comparing /work/SRC/openSUSE:Factory/xen (Old) and /work/SRC/openSUSE:Factory/.xen.new.9556 (New) ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Package is "xen" Wed Aug 7 13:54:53 2019 rev:268 rq:720120 version:4.12.0_16 Changes: -------- --- /work/SRC/openSUSE:Factory/xen/xen.changes 2019-06-22 11:04:36.251856662 +0200 +++ /work/SRC/openSUSE:Factory/.xen.new.9556/xen.changes 2019-08-07 13:54:57.680857127 +0200 @@ -1,0 +2,95 @@ +Wed Jul 17 13:56:46 UTC 2019 - ohering@suse.de + +- Update xen-dom0-modules.service (bsc#1137251) + Map backend module names from pvops and xenlinux kernels to a + module alias. This avoids errors from modprobe about unknown + modules. Ignore a few xenlinux modules that lack aliases. + +------------------------------------------------------------------- +Mon Jul 15 07:56:56 MDT 2019 - carnold@suse.com + +- Gcc9 warnings seem to be cleared up with upstream fixes. + Drop gcc9-ignore-warnings.patch + +------------------------------------------------------------------- +Tue Jun 25 09:29:05 MDT 2019 - carnold@suse.com + +- bsc#1138563 - L3: xenpvnetboot improperly ported to Python 3 + fix-xenpvnetboot.patch + +------------------------------------------------------------------- +Mon Jun 24 08:02:57 UTC 2019 - ohering@suse.de + +- Move /etc/modprobe.d/xen_loop.conf to /lib/modprobe.d/xen_loop.conf + +------------------------------------------------------------------- +Mon Jun 24 08:00:10 UTC 2019 - ohering@suse.de + +- Remove /etc/xen/xenapiusers and /etc/pam.d/xen-api + +------------------------------------------------------------------- +Fri Jun 21 12:25:55 UTC 2019 - ohering@suse.de + +- Remove all upstream provided files in /etc/xen + They are not required at runtime. 
The host admin is now + responsible if he really needs anything in this subdirectory. + +------------------------------------------------------------------- +Fri Jun 21 12:07:45 UTC 2019 - ohering@suse.de + +- In our effort to make /etc fully admin controlled, move /etc/xen/scripts + to libexec/xen/scripts with xen-tools.etc_pollution.patch + +------------------------------------------------------------------- +Wed Jun 19 13:20:39 UTC 2019 - ohering@suse.de + +- Move /etc/bash_completion.d/xl.sh to %{_datadir}/bash-completion/completions + +------------------------------------------------------------------- +Mon Jun 17 09:08:33 MDT 2019 - carnold@suse.com + +- bsc#1138294 - VUL-0: XSA-295: Unlimited Arm Atomics Operations + 5d03a0c4-1-Arm-add-an-isb-before-reading-CNTPCT_EL0.patch + 5d03a0c4-2-gnttab-rework-prototype-of-set_status.patch + 5d03a0c4-3-Arm64-rewrite-bitops-in-C.patch + 5d03a0c4-4-Arm32-rewrite-bitops-in-C.patch + 5d03a0c4-5-Arm-bitops-consolidate-prototypes.patch + 5d03a0c4-6-Arm64-cmpxchg-simplify.patch + 5d03a0c4-7-Arm32-cmpxchg-simplify.patch + 5d03a0c4-8-Arm-bitops-helpers-with-timeout.patch + 5d03a0c4-9-Arm-cmpxchg-helper-with-timeout.patch + 5d03a0c4-A-Arm-turn-on-SILO-mode-by-default.patch + 5d03a0c4-B-bitops-guest-helpers.patch + 5d03a0c4-C-cmpxchg-guest-helpers.patch + 5d03a0c4-D-use-guest-atomics-helpers.patch + 5d03a0c4-E-Arm-add-perf-counters-in-guest-atomic-helpers.patch + 5d03a0c4-F-Arm-protect-gnttab_clear_flag.patch +- Upstream bug fixes (bsc#1027519) + 5c87b6c8-drop-arch_evtchn_inject.patch + 5c87b6e8-avoid-atomic-rmw-accesses-in-map_vcpu_info.patch + 5cd921fb-trace-fix-build-with-gcc9.patch + 5cd9224b-AMD-IOMMU-disable-upon-init-fail.patch + 5cd922c5-x86-MTRR-recalc-p2mt-when-iocaps.patch + 5cd9230f-VMX-correctly-get-GS_SHADOW-for-current.patch + 5cd926d0-bitmap_fill-zero-sized.patch + 5cd92724-drivers-video-drop-constraints.patch + 5cd93a69-x86-spec-ctrl-reposition-XPTI-parsing.patch (Replaces xsa297-0a.patch) + 
5cd93a69-x86-MSR_INTEL_CORE_THREAD_COUNT.patch (Replaces xsa297-0b.patch) + 5cd93a69-x86-boot-detect-Intel-SMT-correctly.patch (Replaces xsa297-0c.patch) + 5cdad090-x86-spec-ctrl-misc-non-functional-cleanup.patch (Replaces xsa297-0d.patch) + 5cdad090-x86-spec-ctrl-CPUID-MSR-definitions-for-MDS.patch (Replaces xsa297-1.patch) + 5cdad090-x86-spec-ctrl-infrastructure-for-VERW-flush.patch (Replaces xsa297-2.patch) + 5cdad090-x86-spec-ctrl-opts-to-control-VERW-flush.patch (Replaces xsa297-3.patch) + 5cd981ff-x86-IRQ-tracing-avoid-UB-or-worse.patch + 5cdeb9fd-sched-fix-csched2_deinit_pdata.patch + 5ce7a92f-x86-IO-APIC-fix-build-with-gcc9.patch + 5cf0f6a4-x86-vhpet-resume-avoid-small-diff.patch + 5cf16e51-x86-spec-ctrl-Knights-retpoline-safe.patch + +------------------------------------------------------------------- +Fri Jun 14 15:35:28 MDT 2019 - carnold@suse.com + +- Fix some outdated information in the readme + README.SUSE + +------------------------------------------------------------------- Old: ---- etc_pam.d_xen-api gcc9-ignore-warnings.patch xenapiusers xsa297-0a.patch xsa297-0b.patch xsa297-0c.patch xsa297-0d.patch xsa297-1.patch xsa297-2.patch xsa297-3.patch New: ---- 5c87b6c8-drop-arch_evtchn_inject.patch 5c87b6e8-avoid-atomic-rmw-accesses-in-map_vcpu_info.patch 5cd921fb-trace-fix-build-with-gcc9.patch 5cd9224b-AMD-IOMMU-disable-upon-init-fail.patch 5cd922c5-x86-MTRR-recalc-p2mt-when-iocaps.patch 5cd9230f-VMX-correctly-get-GS_SHADOW-for-current.patch 5cd926d0-bitmap_fill-zero-sized.patch 5cd92724-drivers-video-drop-constraints.patch 5cd93a69-x86-MSR_INTEL_CORE_THREAD_COUNT.patch 5cd93a69-x86-boot-detect-Intel-SMT-correctly.patch 5cd93a69-x86-spec-ctrl-reposition-XPTI-parsing.patch 5cd981ff-x86-IRQ-tracing-avoid-UB-or-worse.patch 5cdad090-x86-spec-ctrl-CPUID-MSR-definitions-for-MDS.patch 5cdad090-x86-spec-ctrl-infrastructure-for-VERW-flush.patch 5cdad090-x86-spec-ctrl-misc-non-functional-cleanup.patch 5cdad090-x86-spec-ctrl-opts-to-control-VERW-flush.patch 
5cdeb9fd-sched-fix-csched2_deinit_pdata.patch 5ce7a92f-x86-IO-APIC-fix-build-with-gcc9.patch 5cf0f6a4-x86-vhpet-resume-avoid-small-diff.patch 5cf16e51-x86-spec-ctrl-Knights-retpoline-safe.patch 5d03a0c4-1-Arm-add-an-isb-before-reading-CNTPCT_EL0.patch 5d03a0c4-2-gnttab-rework-prototype-of-set_status.patch 5d03a0c4-3-Arm64-rewrite-bitops-in-C.patch 5d03a0c4-4-Arm32-rewrite-bitops-in-C.patch 5d03a0c4-5-Arm-bitops-consolidate-prototypes.patch 5d03a0c4-6-Arm64-cmpxchg-simplify.patch 5d03a0c4-7-Arm32-cmpxchg-simplify.patch 5d03a0c4-8-Arm-bitops-helpers-with-timeout.patch 5d03a0c4-9-Arm-cmpxchg-helper-with-timeout.patch 5d03a0c4-A-Arm-turn-on-SILO-mode-by-default.patch 5d03a0c4-B-bitops-guest-helpers.patch 5d03a0c4-C-cmpxchg-guest-helpers.patch 5d03a0c4-D-use-guest-atomics-helpers.patch 5d03a0c4-E-Arm-add-perf-counters-in-guest-atomic-helpers.patch 5d03a0c4-F-Arm-protect-gnttab_clear_flag.patch fix-xenpvnetboot.patch xen-tools.etc_pollution.patch ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ xen.spec ++++++ --- /var/tmp/diff_new_pack.GwU86x/_old 2019-08-07 13:55:00.516857096 +0200 +++ /var/tmp/diff_new_pack.GwU86x/_new 2019-08-07 13:55:00.520857096 +0200 @@ -127,7 +127,7 @@ BuildRequires: pesign-obs-integration %endif -Version: 4.12.0_12 +Version: 4.12.0_16 Release: 0 Summary: Xen Virtualization: Hypervisor (aka VMM aka Microkernel) License: GPL-2.0-only @@ -147,9 +147,6 @@ Source23: block-npiv-vport Source26: init.xen_loop Source29: block-dmmd -# Xen API remote authentication sources -Source30: etc_pam.d_xen-api -Source31: xenapiusers # Init script and sysconf file for pciback Source34: init.pciback Source35: sysconfig.pciback @@ -164,24 +161,52 @@ # Upstream patches Patch1: 5c87b644-IOMMU-leave-enabled-for-kexec-crash.patch Patch2: 5c87b6a2-x86-HVM-dont-crash-guest-in-find_mmio_cache.patch -Patch3: 5c87e6d1-x86-TSX-controls-for-RTM-force-abort-mode.patch -Patch4: 
5c8f752c-x86-e820-build-with-gcc9.patch -Patch5: 5c8fb92d-x86-HVM-split-linear-reads-and-writes.patch -Patch6: 5c8fb951-x86-HVM-finish-IOREQs-correctly-on-completion.patch -Patch7: 5c8fc6c0-x86-MSR-shorten-ARCH_CAPABILITIES.patch -Patch8: 5c8fc6c0-x86-SC-retpoline-safety-calculations-for-eIBRS.patch -Patch9: 5c9e63c5-credit2-SMT-idle-handling.patch -Patch10: 5ca46b68-x86emul-no-GPR-update-upon-AVX-gather-failures.patch -Patch11: 5ca773d1-x86emul-dont-read-mask-reg-without-AVX512F.patch -Patch12: 5cab1f66-timers-fix-memory-leak-with-cpu-plug.patch -Patch13: 5cac6cba-vmx-Fixup-removals-of-MSR-load-save-list-entries.patch -Patch29701: xsa297-0a.patch -Patch29702: xsa297-0b.patch -Patch29703: xsa297-0c.patch -Patch29704: xsa297-0d.patch -Patch29711: xsa297-1.patch -Patch29712: xsa297-2.patch -Patch29713: xsa297-3.patch +Patch3: 5c87b6c8-drop-arch_evtchn_inject.patch +Patch4: 5c87b6e8-avoid-atomic-rmw-accesses-in-map_vcpu_info.patch +Patch5: 5c87e6d1-x86-TSX-controls-for-RTM-force-abort-mode.patch +Patch6: 5c8f752c-x86-e820-build-with-gcc9.patch +Patch7: 5c8fb92d-x86-HVM-split-linear-reads-and-writes.patch +Patch8: 5c8fb951-x86-HVM-finish-IOREQs-correctly-on-completion.patch +Patch9: 5c8fc6c0-x86-MSR-shorten-ARCH_CAPABILITIES.patch +Patch10: 5c8fc6c0-x86-SC-retpoline-safety-calculations-for-eIBRS.patch +Patch11: 5c9e63c5-credit2-SMT-idle-handling.patch +Patch12: 5ca46b68-x86emul-no-GPR-update-upon-AVX-gather-failures.patch +Patch13: 5ca773d1-x86emul-dont-read-mask-reg-without-AVX512F.patch +Patch14: 5cab1f66-timers-fix-memory-leak-with-cpu-plug.patch +Patch15: 5cac6cba-vmx-Fixup-removals-of-MSR-load-save-list-entries.patch +Patch16: 5cd921fb-trace-fix-build-with-gcc9.patch +Patch17: 5cd9224b-AMD-IOMMU-disable-upon-init-fail.patch +Patch18: 5cd922c5-x86-MTRR-recalc-p2mt-when-iocaps.patch +Patch19: 5cd9230f-VMX-correctly-get-GS_SHADOW-for-current.patch +Patch20: 5cd926d0-bitmap_fill-zero-sized.patch +Patch21: 5cd92724-drivers-video-drop-constraints.patch +Patch22: 
5cd93a69-x86-spec-ctrl-reposition-XPTI-parsing.patch +Patch23: 5cd93a69-x86-MSR_INTEL_CORE_THREAD_COUNT.patch +Patch24: 5cd93a69-x86-boot-detect-Intel-SMT-correctly.patch +Patch25: 5cd981ff-x86-IRQ-tracing-avoid-UB-or-worse.patch +Patch26: 5cdad090-x86-spec-ctrl-misc-non-functional-cleanup.patch +Patch27: 5cdad090-x86-spec-ctrl-CPUID-MSR-definitions-for-MDS.patch +Patch28: 5cdad090-x86-spec-ctrl-infrastructure-for-VERW-flush.patch +Patch29: 5cdad090-x86-spec-ctrl-opts-to-control-VERW-flush.patch +Patch30: 5cdeb9fd-sched-fix-csched2_deinit_pdata.patch +Patch31: 5ce7a92f-x86-IO-APIC-fix-build-with-gcc9.patch +Patch32: 5cf0f6a4-x86-vhpet-resume-avoid-small-diff.patch +Patch33: 5cf16e51-x86-spec-ctrl-Knights-retpoline-safe.patch +Patch34: 5d03a0c4-1-Arm-add-an-isb-before-reading-CNTPCT_EL0.patch +Patch35: 5d03a0c4-2-gnttab-rework-prototype-of-set_status.patch +Patch36: 5d03a0c4-3-Arm64-rewrite-bitops-in-C.patch +Patch37: 5d03a0c4-4-Arm32-rewrite-bitops-in-C.patch +Patch38: 5d03a0c4-5-Arm-bitops-consolidate-prototypes.patch +Patch39: 5d03a0c4-6-Arm64-cmpxchg-simplify.patch +Patch40: 5d03a0c4-7-Arm32-cmpxchg-simplify.patch +Patch41: 5d03a0c4-8-Arm-bitops-helpers-with-timeout.patch +Patch42: 5d03a0c4-9-Arm-cmpxchg-helper-with-timeout.patch +Patch43: 5d03a0c4-A-Arm-turn-on-SILO-mode-by-default.patch +Patch44: 5d03a0c4-B-bitops-guest-helpers.patch +Patch45: 5d03a0c4-C-cmpxchg-guest-helpers.patch +Patch46: 5d03a0c4-D-use-guest-atomics-helpers.patch +Patch47: 5d03a0c4-E-Arm-add-perf-counters-in-guest-atomic-helpers.patch +Patch48: 5d03a0c4-F-Arm-protect-gnttab_clear_flag.patch # Our platform specific patches Patch400: xen-destdir.patch Patch401: vif-bridge-no-iptables.patch @@ -195,9 +220,9 @@ Patch409: xenstore-launch.patch # Needs to go upstream Patch420: suspend_evtchn_lock.patch +Patch421: xen-tools.etc_pollution.patch Patch422: stubdom-have-iovec.patch Patch423: vif-route.patch -Patch424: gcc9-ignore-warnings.patch # Other bug fixes or features Patch451: 
xenconsole-no-multiple-connections.patch Patch452: hibernate.patch @@ -222,6 +247,7 @@ Patch501: pygrub-python3-conversion.patch Patch502: migration-python3-conversion.patch Patch503: bin-python3-conversion.patch +Patch504: fix-xenpvnetboot.patch # Hypervisor and PV driver Patches Patch600: xen.bug1026236.suse_vtsc_tolerance.patch Patch601: x86-ioapic-ack-default.patch @@ -392,13 +418,41 @@ %patch11 -p1 %patch12 -p1 %patch13 -p1 -%patch29701 -p1 -%patch29702 -p1 -%patch29703 -p1 -%patch29704 -p1 -%patch29711 -p1 -%patch29712 -p1 -%patch29713 -p1 +%patch14 -p1 +%patch15 -p1 +%patch16 -p1 +%patch17 -p1 +%patch18 -p1 +%patch19 -p1 +%patch20 -p1 +%patch21 -p1 +%patch22 -p1 +%patch23 -p1 +%patch24 -p1 +%patch25 -p1 +%patch26 -p1 +%patch27 -p1 +%patch28 -p1 +%patch29 -p1 +%patch30 -p1 +%patch31 -p1 +%patch32 -p1 +%patch33 -p1 +%patch34 -p1 +%patch35 -p1 +%patch36 -p1 +%patch37 -p1 +%patch38 -p1 +%patch39 -p1 +%patch40 -p1 +%patch41 -p1 +%patch42 -p1 +%patch43 -p1 +%patch44 -p1 +%patch45 -p1 +%patch46 -p1 +%patch47 -p1 +%patch48 -p1 # Our platform specific patches %patch400 -p1 %patch401 -p1 @@ -412,9 +466,9 @@ %patch409 -p1 # Needs to go upstream %patch420 -p1 +%patch421 -p1 %patch422 -p1 %patch423 -p1 -%patch424 -p1 # Other bug fixes or features %patch451 -p1 %patch452 -p1 @@ -439,6 +493,7 @@ %patch501 -p1 %patch502 -p1 %patch503 -p1 +%patch504 -p1 # Hypervisor and PV driver Patches %patch600 -p1 %patch601 -p1 @@ -561,8 +616,10 @@ DESTDIR=%{buildroot} \ SYSCONFIG_DIR=%{_fillupdir} \ PKG_INSTALLDIR=%{_libdir}/pkgconfig \ + BASH_COMPLETION_DIR=%{_datadir}/bash-completion/completions \ %{?_smp_mflags} \ install +rm -rfv %{buildroot}/etc/xen find %{buildroot} -ls for i in %{buildroot}/%{_fillupdir}/* do @@ -824,31 +881,21 @@ install -m 644 docs/misc/$name %{buildroot}/%{_defaultdocdir}/xen/misc/ done -mkdir -p %{buildroot}/etc/modprobe.d -install -m644 %SOURCE26 %{buildroot}/etc/modprobe.d/xen_loop.conf +mkdir -p %{buildroot}/lib/modprobe.d +install -m644 %SOURCE26 
%{buildroot}/lib/modprobe.d/xen_loop.conf # xen-utils make -C tools/xen-utils-0.1 install DESTDIR=%{buildroot} XEN_INTREE_BUILD=yes XEN_ROOT=$PWD install -m755 %SOURCE37 %{buildroot}/usr/sbin/xen2libvirt -rm -f %{buildroot}/etc/xen/README* -# Example config -mkdir -p %{buildroot}/etc/xen/{vm,examples,scripts} -mv %{buildroot}/etc/xen/xlexample* %{buildroot}/etc/xen/examples -rm -f %{buildroot}/etc/xen/examples/*nbd -install -m644 tools/xentrace/formats %{buildroot}/etc/xen/examples/xentrace_formats.txt +install -D -m644 tools/xentrace/formats %{buildroot}%{_datadir}/xen/xentrace_formats.txt # Scripts -rm -f %{buildroot}/etc/xen/scripts/block-*nbd -install -m755 %SOURCE21 %SOURCE22 %SOURCE23 %SOURCE29 %{buildroot}/etc/xen/scripts/ +rm -f %{buildroot}%{_libexecdir}/xen/scripts/block-*nbd +install -m755 %SOURCE21 %SOURCE22 %SOURCE23 %SOURCE29 %{buildroot}%{_libexecdir}/xen/scripts/ mkdir -p %{buildroot}/usr/lib/supportconfig/plugins install -m 755 %SOURCE13 %{buildroot}/usr/lib/supportconfig/plugins/xen -# Xen API remote authentication files -install -d %{buildroot}/etc/pam.d -install -m644 %SOURCE30 %{buildroot}/etc/pam.d/xen-api -install -m644 %SOURCE31 %{buildroot}/etc/xen/ - # Logrotate install -m644 -D %SOURCE15 %{buildroot}/etc/logrotate.d/xen @@ -875,10 +922,35 @@ echo -n > $conf done `" +> mods for mod in $mods do - echo "ExecStart=-/bin/sh -c 'modprobe $mod || :'" >> %{buildroot}/%{_unitdir}/${bn} + # load by alias, if possible, to handle pvops and xenlinux + alias="$mod" + case "$mod" in + xen-evtchn) ;; + xen-gntdev) ;; + xen-gntalloc) ;; + xen-blkback) alias='xen-backend:vbd' ;; + xen-netback) alias='xen-backend:vif' ;; + xen-pciback) alias='xen-backend:pci' ;; + evtchn) unset alias ;; + gntdev) unset alias ;; + netbk) alias='xen-backend:vif' ;; + blkbk) alias='xen-backend:vbd' ;; + xen-scsibk) unset alias ;; + usbbk) unset alias ;; + pciback) alias='xen-backend:pci' ;; + xen-acpi-processor) ;; + blktap2) unset alias ;; + *) ;; + esac + if test -n 
"${alias}" + then + echo "ExecStart=-/bin/sh -c 'modprobe $alias || :'" >> mods + fi done +sort -u mods | tee -a %{buildroot}/%{_unitdir}/${bn} rm -rfv %{buildroot}/%{_initddir} install -m644 %SOURCE35 %{buildroot}/%{_fillupdir}/sysconfig.pciback @@ -923,8 +995,7 @@ # !with_dom0_support # 32 bit hypervisor no longer supported. Remove dom0 tools. -rm -rf %{buildroot}/%{_datadir}/doc -rm -rf %{buildroot}/%{_datadir}/man +rm -rf %{buildroot}/%{_datadir} rm -rf %{buildroot}/%{_libdir}/xen rm -rf %{buildroot}/%{_libdir}/python* rm -rf %{buildroot}/%{_libdir}/ocaml* @@ -1012,22 +1083,21 @@ /usr/sbin/xen-destroy /usr/sbin/xen-livepatch /usr/sbin/xen-diag -%dir %attr(700,root,root) /etc/xen -%dir /etc/xen/scripts -/etc/xen/scripts/block* -/etc/xen/scripts/external-device-migrate -/etc/xen/scripts/hotplugpath.sh -/etc/xen/scripts/launch-xenstore -/etc/xen/scripts/locking.sh -/etc/xen/scripts/logging.sh -/etc/xen/scripts/vif2 -/etc/xen/scripts/vif-* -/etc/xen/scripts/vscsi -/etc/xen/scripts/xen-hotplug-* -/etc/xen/scripts/xen-network-common.sh -/etc/xen/scripts/xen-script-common.sh -/etc/xen/scripts/colo-proxy-setup -/etc/xen/scripts/remus-netbuf-setup +%dir %{_libexecdir}/xen/scripts +%{_libexecdir}/xen/scripts/block* +%{_libexecdir}/xen/scripts/external-device-migrate +%{_libexecdir}/xen/scripts/hotplugpath.sh +%{_libexecdir}/xen/scripts/launch-xenstore +%{_libexecdir}/xen/scripts/locking.sh +%{_libexecdir}/xen/scripts/logging.sh +%{_libexecdir}/xen/scripts/vif2 +%{_libexecdir}/xen/scripts/vif-* +%{_libexecdir}/xen/scripts/vscsi +%{_libexecdir}/xen/scripts/xen-hotplug-* +%{_libexecdir}/xen/scripts/xen-network-common.sh +%{_libexecdir}/xen/scripts/xen-script-common.sh +%{_libexecdir}/xen/scripts/colo-proxy-setup +%{_libexecdir}/xen/scripts/remus-netbuf-setup %dir /usr/lib/supportconfig %dir /usr/lib/supportconfig/plugins /usr/lib/supportconfig/plugins/xen @@ -1047,19 +1117,13 @@ %dir /var/log/xen %dir /var/log/xen/console %config /etc/logrotate.d/xen -/etc/xen/auto -%config 
/etc/xen/examples -%config /etc/xen/cpupool -%config /etc/xen/vm -%config(noreplace) /etc/xen/xenapiusers -%config(noreplace) /etc/xen/xl.conf -%config /etc/pam.d/xen-api -%config /etc/modprobe.d/xen_loop.conf +%dir /lib/modprobe.d +/lib/modprobe.d/xen_loop.conf %config %{_unitdir} %exclude %{_unitdir}/%{name}-vcpu-watch.service %config %{with_systemd_modules_load} -%dir /etc/modprobe.d -/etc/bash_completion.d/xl.sh +%{_datadir}/bash-completion +%{_datadir}/xen %dir %{_libdir}/python%{pyver}/site-packages/grub %dir %{_libdir}/python%{pyver}/site-packages/xen %dir %{_libdir}/python%{pyver}/site-packages/xen/lowlevel @@ -1079,7 +1143,6 @@ %if %{with xen_oxenstored} /usr/sbin/oxenstored -/etc/xen/oxenstored.conf %dir %{_libdir}/ocaml %dir %{_libdir}/ocaml/xenbus %dir %{_libdir}/ocaml/xenctrl ++++++ 5c87b6c8-drop-arch_evtchn_inject.patch ++++++ # Commit d9195962a62241f5f1b89d926eff8c063678f0c5 # Date 2019-03-12 14:40:24 +0100 # Author Jan Beulich <jbeulich@suse.com> # Committer Jan Beulich <jbeulich@suse.com> events: drop arch_evtchn_inject() Have the only user call vcpu_mark_events_pending() instead, at the same time arranging for correct ordering of the writes (evtchn_pending_sel should be written before evtchn_upcall_pending). 
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -597,11 +597,6 @@ out: return; } -void arch_evtchn_inject(struct vcpu *v) -{ - vgic_inject_irq(v->domain, v, v->domain->arch.evtchn_irq, true); -} - bool vgic_evtchn_irq_pending(struct vcpu *v) { struct pending_irq *p; --- a/xen/arch/arm/vgic/vgic.c +++ b/xen/arch/arm/vgic/vgic.c @@ -692,11 +692,6 @@ void vgic_kick_vcpus(struct domain *d) } } -void arch_evtchn_inject(struct vcpu *v) -{ - vgic_inject_irq(v->domain, v, v->domain->arch.evtchn_irq, true); -} - bool vgic_evtchn_irq_pending(struct vcpu *v) { struct vgic_irq *irq; --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -2724,9 +2724,3 @@ int allocate_and_map_msi_pirq(struct domain *d, int index, int *pirq_p, return ret; } - -void arch_evtchn_inject(struct vcpu *v) -{ - if ( is_hvm_vcpu(v) ) - hvm_assert_evtchn_irq(v); -} --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1306,10 +1306,9 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) * Mark everything as being pending just to make sure nothing gets * lost. The domain will get a spurious event, but it can cope. */ - vcpu_info(v, evtchn_upcall_pending) = 1; for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) set_bit(i, &vcpu_info(v, evtchn_pending_sel)); - arch_evtchn_inject(v); + vcpu_mark_events_pending(v); return 0; } --- a/xen/include/xen/event.h +++ b/xen/include/xen/event.h @@ -91,9 +91,6 @@ int guest_enabled_event(struct vcpu *v, uint32_t virq); /* Notify remote end of a Xen-attached event channel.*/ void notify_via_xen_event_channel(struct domain *ld, int lport); -/* Inject an event channel notification into the guest */ -void arch_evtchn_inject(struct vcpu *v); - /* * Internal event channel object storage. 
* ++++++ 5c87b6e8-avoid-atomic-rmw-accesses-in-map_vcpu_info.patch ++++++ # Commit c8cfbba625e3e74fd0152bd42449821e764cabae # Date 2019-03-12 14:40:56 +0100 # Author Jan Beulich <jbeulich@suse.com> # Committer Jan Beulich <jbeulich@suse.com> common: avoid atomic read-modify-write accesses in map_vcpu_info() There's no need to set the evtchn_pending_sel bits one by one. Simply write full words with all ones. For Arm this requires extending write_atomic() to also handle 64-bit values; for symmetry read_atomic() gets adjusted as well. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> --- a/xen/common/domain.c +++ b/xen/common/domain.c @@ -1253,7 +1253,6 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) void *mapping; vcpu_info_t *new_info; struct page_info *page; - int i; if ( offset > (PAGE_SIZE - sizeof(vcpu_info_t)) ) return -EINVAL; @@ -1306,8 +1305,12 @@ int map_vcpu_info(struct vcpu *v, unsigned long gfn, unsigned offset) * Mark everything as being pending just to make sure nothing gets * lost. The domain will get a spurious event, but it can cope. 
*/ - for ( i = 0; i < BITS_PER_EVTCHN_WORD(d); i++ ) - set_bit(i, &vcpu_info(v, evtchn_pending_sel)); +#ifdef CONFIG_COMPAT + if ( !has_32bit_shinfo(d) ) + write_atomic(&new_info->native.evtchn_pending_sel, ~0); + else +#endif + write_atomic(&vcpu_info(v, evtchn_pending_sel), ~0); vcpu_mark_events_pending(v); return 0; --- a/xen/include/asm-arm/atomic.h +++ b/xen/include/asm-arm/atomic.h @@ -55,6 +55,19 @@ build_atomic_write(write_int_atomic, "", WORD, int, "r") #if defined (CONFIG_ARM_64) build_atomic_read(read_u64_atomic, "", "", uint64_t, "=r") build_atomic_write(write_u64_atomic, "", "", uint64_t, "r") +#elif defined (CONFIG_ARM_32) +static inline uint64_t read_u64_atomic(const volatile uint64_t *addr) +{ + uint64_t val; + + asm volatile ( "ldrd %0,%H0,%1" : "=r" (val) : "m" (*addr) ); + + return val; +} +static inline void write_u64_atomic(volatile uint64_t *addr, uint64_t val) +{ + asm volatile ( "strd %1,%H1,%0" : "=m" (*addr) : "r" (val) ); +} #endif build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri") @@ -69,6 +82,7 @@ void __bad_atomic_size(void); case 1: __x = (typeof(*p))read_u8_atomic((uint8_t *)p); break; \ case 2: __x = (typeof(*p))read_u16_atomic((uint16_t *)p); break; \ case 4: __x = (typeof(*p))read_u32_atomic((uint32_t *)p); break; \ + case 8: __x = (typeof(*p))read_u64_atomic((uint64_t *)p); break; \ default: __x = 0; __bad_atomic_size(); break; \ } \ __x; \ @@ -80,6 +94,7 @@ void __bad_atomic_size(void); case 1: write_u8_atomic((uint8_t *)p, (uint8_t)__x); break; \ case 2: write_u16_atomic((uint16_t *)p, (uint16_t)__x); break; \ case 4: write_u32_atomic((uint32_t *)p, (uint32_t)__x); break; \ + case 8: write_u64_atomic((uint64_t *)p, (uint64_t)__x); break; \ default: __bad_atomic_size(); break; \ } \ __x; \ ++++++ 5cd921fb-trace-fix-build-with-gcc9.patch ++++++ # Commit 3fd3b266d4198c06e8e421ca515d9ba09ccd5155 # Date 2019-05-13 09:51:23 +0200 # Author Jan Beulich <jbeulich@suse.com> # Committer Jan Beulich <jbeulich@suse.com> trace: fix 
build with gcc9 While I've not observed this myself, gcc 9 (imo validly) reportedly may complain trace.c: In function '__trace_hypercall': trace.c:826:19: error: taking address of packed member of 'struct <anonymous>' may result in an unaligned pointer value [-Werror=address-of-packed-member] 826 | uint32_t *a = d.args; and the fix is rather simple - remove the __packed attribute. Introduce a BUILD_BUG_ON() as replacement, for the unlikely case that Xen might get ported to an architecture where array alignment higher that that of its elements. Reported-by: Martin Liška <martin.liska@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@citrix.com> --- a/xen/common/trace.c +++ b/xen/common/trace.c @@ -819,12 +819,18 @@ unlock: void __trace_hypercall(uint32_t event, unsigned long op, const xen_ulong_t *args) { - struct __packed { + struct { uint32_t op; uint32_t args[6]; } d; uint32_t *a = d.args; + /* + * In lieu of using __packed above, which gcc9 legitimately doesn't + * like in combination with the address of d.args[] taken. + */ + BUILD_BUG_ON(offsetof(typeof(d), args) != sizeof(d.op)); + #define APPEND_ARG32(i) \ do { \ unsigned i_ = (i); \ ++++++ 5cd9224b-AMD-IOMMU-disable-upon-init-fail.patch ++++++ # Commit 87a3347d476443c66c79953d77d6aef1d2bb3bbd # Date 2019-05-13 09:52:43 +0200 # Author Jan Beulich <jbeulich@suse.com> # Committer Jan Beulich <jbeulich@suse.com> AMD/IOMMU: disable previously enabled IOMMUs upon init failure If any IOMMUs were successfully initialized before encountering failure, the successfully enabled ones should be disabled again before cleaning up their resources. Move disable_iommu() next to enable_iommu() to avoid a forward declaration, and take the opportunity to remove stray blank lines ahead of both functions' final closing braces. 
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Brian Woods <brian.woods@amd.com> --- a/xen/drivers/passthrough/amd/iommu_init.c +++ b/xen/drivers/passthrough/amd/iommu_init.c @@ -909,7 +909,35 @@ static void enable_iommu(struct amd_iomm iommu->enabled = 1; spin_unlock_irqrestore(&iommu->lock, flags); +} + +static void disable_iommu(struct amd_iommu *iommu) +{ + unsigned long flags; + + spin_lock_irqsave(&iommu->lock, flags); + + if ( !iommu->enabled ) + { + spin_unlock_irqrestore(&iommu->lock, flags); + return; + } + + amd_iommu_msi_enable(iommu, IOMMU_CONTROL_DISABLED); + set_iommu_command_buffer_control(iommu, IOMMU_CONTROL_DISABLED); + set_iommu_event_log_control(iommu, IOMMU_CONTROL_DISABLED); + + if ( amd_iommu_has_feature(iommu, IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) + set_iommu_ppr_log_control(iommu, IOMMU_CONTROL_DISABLED); + + if ( amd_iommu_has_feature(iommu, IOMMU_EXT_FEATURE_GTSUP_SHIFT) ) + set_iommu_guest_translation_control(iommu, IOMMU_CONTROL_DISABLED); + + set_iommu_translation_control(iommu, IOMMU_CONTROL_DISABLED); + iommu->enabled = 0; + + spin_unlock_irqrestore(&iommu->lock, flags); } static void __init deallocate_buffer(void *buf, uint32_t sz) @@ -1046,6 +1074,7 @@ static void __init amd_iommu_init_cleanu list_del(&iommu->list); if ( iommu->enabled ) { + disable_iommu(iommu); deallocate_ring_buffer(&iommu->cmd_buffer); deallocate_ring_buffer(&iommu->event_log); deallocate_ring_buffer(&iommu->ppr_log); @@ -1297,36 +1326,6 @@ error_out: return rc; } -static void disable_iommu(struct amd_iommu *iommu) -{ - unsigned long flags; - - spin_lock_irqsave(&iommu->lock, flags); - - if ( !iommu->enabled ) - { - spin_unlock_irqrestore(&iommu->lock, flags); - return; - } - - amd_iommu_msi_enable(iommu, IOMMU_CONTROL_DISABLED); - set_iommu_command_buffer_control(iommu, IOMMU_CONTROL_DISABLED); - set_iommu_event_log_control(iommu, IOMMU_CONTROL_DISABLED); - - if ( amd_iommu_has_feature(iommu, 
IOMMU_EXT_FEATURE_PPRSUP_SHIFT) ) - set_iommu_ppr_log_control(iommu, IOMMU_CONTROL_DISABLED); - - if ( amd_iommu_has_feature(iommu, IOMMU_EXT_FEATURE_GTSUP_SHIFT) ) - set_iommu_guest_translation_control(iommu, IOMMU_CONTROL_DISABLED); - - set_iommu_translation_control(iommu, IOMMU_CONTROL_DISABLED); - - iommu->enabled = 0; - - spin_unlock_irqrestore(&iommu->lock, flags); - -} - static void invalidate_all_domain_pages(void) { struct domain *d; ++++++ 5cd922c5-x86-MTRR-recalc-p2mt-when-iocaps.patch ++++++ # Commit f3d880bf2be92534c5bacf11de2f561cbad550fb # Date 2019-05-13 09:54:45 +0200 # Author Igor Druzhinin <igor.druzhinin@citrix.com> # Committer Jan Beulich <jbeulich@suse.com> x86/mtrr: recalculate P2M type for domains with iocaps This change reflects the logic in epte_get_entry_emt() and allows changes in guest MTTRs to be reflected in EPT for domains having direct access to certain hardware memory regions but without IOMMU context assigned (e.g. XenGT). Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/mtrr.c +++ b/xen/arch/x86/hvm/mtrr.c @@ -783,7 +783,7 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save void memory_type_changed(struct domain *d) { - if ( has_iommu_pt(d) && d->vcpu && d->vcpu[0] ) + if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] ) { p2m_memory_type_changed(d); flush_all(FLUSH_CACHE); ++++++ 5cd9230f-VMX-correctly-get-GS_SHADOW-for-current.patch ++++++ # Commit f69fc1c2f36e8a74ba54c9c8fa5c904ea1ad319e # Date 2019-05-13 09:55:59 +0200 # Author Tamas K Lengyel <tamas@tklengyel.com> # Committer Jan Beulich <jbeulich@suse.com> x86/vmx: correctly gather gs_shadow value for current vCPU Currently the gs_shadow value is only cached when the vCPU is being scheduled out by Xen. Reporting this (usually) stale value through vm_event is incorrect, since it doesn't represent the actual state of the vCPU at the time the event was recorded. 
This prevents vm_event subscribers from correctly finding kernel structures in the guest when it is trapped while in ring3. Refresh shadow_gs value when the context being saved is for the current vCPU. Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com> --- a/xen/arch/x86/hvm/vmx/vmx.c +++ b/xen/arch/x86/hvm/vmx/vmx.c @@ -779,12 +779,18 @@ static void vmx_load_cpu_state(struct vc static void vmx_save_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) { + if ( v == current ) + vmx_save_guest_msrs(v); + vmx_save_cpu_state(v, ctxt); vmx_vmcs_save(v, ctxt); } static int vmx_load_vmcs_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt) { + /* Not currently safe to use in current context. */ + ASSERT(v != current); + vmx_load_cpu_state(v, ctxt); if ( vmx_vmcs_restore(v, ctxt) ) ++++++ 5cd926d0-bitmap_fill-zero-sized.patch ++++++ # Commit 93df28be2d4f620caf18109222d046355ac56327 # Date 2019-05-13 10:12:00 +0200 # Author Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> # Committer Jan Beulich <jbeulich@suse.com> bitmap: fix bitmap_fill with zero-sized bitmap When bitmap_fill(..., 0) is called, do not try to write anything. Before this patch, it tried to write almost LONG_MAX, surely overwriting something. 
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/xen/include/xen/bitmap.h +++ b/xen/include/xen/bitmap.h @@ -126,6 +126,8 @@ static inline void bitmap_fill(unsigned size_t nlongs = BITS_TO_LONGS(nbits); switch (nlongs) { + case 0: + break; default: memset(dst, -1, (nlongs - 1) * sizeof(unsigned long)); /* fall through */ ++++++ 5cd92724-drivers-video-drop-constraints.patch ++++++ # Commit 19600eb75aa9b1df3e4b0a4e55a5d08b957e1fd9 # Date 2019-05-13 10:13:24 +0200 # Author Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> # Committer Jan Beulich <jbeulich@suse.com> drivers/video: drop framebuffer size constraints The limit 1900x1200 do not match real world devices (1900 looks like a typo, should be 1920). But in practice the limits are arbitrary and do not serve any real purpose. As discussed in "Increase framebuffer size to todays standards" thread, drop them completely. This fixes graphic console on device with 3840x2160 native resolution. Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> # Commit 343459e34a6d32ba44a21f8b8fe4c1f69b1714c2 # Date 2019-05-13 10:12:56 +0200 # Author Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> # Committer Jan Beulich <jbeulich@suse.com> drivers/video: drop unused limits MAX_BPP, MAX_FONT_W, MAX_FONT_H are not used in the code at all. 
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> --- a/xen/drivers/video/lfb.c +++ b/xen/drivers/video/lfb.c @@ -10,12 +10,6 @@ #include "lfb.h" #include "font.h" -#define MAX_XRES 1900 -#define MAX_YRES 1200 -#define MAX_BPP 4 -#define MAX_FONT_W 8 -#define MAX_FONT_H 16 - struct lfb_status { struct lfb_prop lfbp; @@ -149,13 +143,6 @@ void lfb_carriage_return(void) int __init lfb_init(struct lfb_prop *lfbp) { - if ( lfbp->width > MAX_XRES || lfbp->height > MAX_YRES ) - { - printk(XENLOG_WARNING "Couldn't initialize a %ux%u framebuffer early.\n", - lfbp->width, lfbp->height); - return -EINVAL; - } - lfb.lfbp = *lfbp; lfb.lbuf = xmalloc_bytes(lfb.lfbp.bytes_per_line); ++++++ 5cd93a69-x86-MSR_INTEL_CORE_THREAD_COUNT.patch ++++++ # Commit d4120936bcd1695faf5b575f1259c58e31d2b18b # Date 2019-05-13 10:35:37 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT This is a model specific register which details the current configuration of cores and threads in the package. Because of how Hyperthread and Core configuration works in firmware, the MSR is de-facto constant and will remain unchanged until the next system reset. It is a read-only MSR (so unilaterally reject writes), but for now retain its leaky-on-read properties. Further CPUID/MSR work is required before we can start virtualising a consistent topology to the guest, and retaining the old behaviour is the safest course of action. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/msr.c +++ b/xen/arch/x86/msr.c @@ -200,6 +200,10 @@ int guest_rdmsr(const struct vcpu *v, ui ARRAY_SIZE(msrs->dr_mask))]; break; + /* + * TODO: Implement when we have better topology representation.
+ case MSR_INTEL_CORE_THREAD_COUNT: + */ default: return X86EMUL_UNHANDLEABLE; } @@ -229,6 +233,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t { uint64_t rsvd; + case MSR_INTEL_CORE_THREAD_COUNT: case MSR_INTEL_PLATFORM_INFO: case MSR_ARCH_CAPABILITIES: /* Read-only */ --- a/xen/include/asm-x86/msr-index.h +++ b/xen/include/asm-x86/msr-index.h @@ -32,6 +32,10 @@ #define EFER_KNOWN_MASK (EFER_SCE | EFER_LME | EFER_LMA | EFER_NX | \ EFER_SVME | EFER_FFXSE) +#define MSR_INTEL_CORE_THREAD_COUNT 0x00000035 +#define MSR_CTC_THREAD_MASK 0x0000ffff +#define MSR_CTC_CORE_MASK 0xffff0000 + /* Speculation Controls. */ #define MSR_SPEC_CTRL 0x00000048 #define SPEC_CTRL_IBRS (_AC(1, ULL) << 0) ++++++ 5cd93a69-x86-boot-detect-Intel-SMT-correctly.patch ++++++ # Commit b12fec4a125950240573ea32f65c61fb9afa74c3 # Date 2019-05-13 10:35:37 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/boot: Detect the firmware SMT setting correctly on Intel hardware While boot_cpu_data.x86_num_siblings is an accurate value to use on AMD hardware, it isn't on Intel when the user has disabled Hyperthreading in the firmware. As a result, a user which has chosen to disable HT still gets nagged on L1TF-vulnerable hardware when they haven't chosen an explicit smt=<bool> setting. Make use of the largely-undocumented MSR_INTEL_CORE_THREAD_COUNT which in practice exists since Nehalem, when booting on real hardware. Fall back to using the ACPI table APIC IDs. While adjusting this logic, fix a latent bug in amd_get_topology(). The thread count field in CPUID.0x8000001e.ebx is documented as 8 bits wide, rather than 2 bits wide. 
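The new MSR_CTC_* masks split MSR 0x35 into a 16-bit thread count (bits 15:0) and a 16-bit core count (bits 31:16). Xen pulls the fields out with MASK_EXTR(); a plain-C equivalent for these fixed masks (the helper names here are illustrative, not Xen's):

```c
#include <assert.h>
#include <stdint.h>

#define MSR_CTC_THREAD_MASK 0x0000ffffULL
#define MSR_CTC_CORE_MASK   0xffff0000ULL

/* MASK_EXTR(val, mask) shifts the masked field down to bit 0. */
static unsigned int ctc_thread_count(uint64_t val)
{
    return (unsigned int)(val & MSR_CTC_THREAD_MASK);
}

static unsigned int ctc_core_count(uint64_t val)
{
    return (unsigned int)((val & MSR_CTC_CORE_MASK) >> 16);
}

/* check_smt_enabled() below treats "cores != threads" as SMT active. */
static int ctc_smt_active(uint64_t val)
{
    return ctc_core_count(val) != ctc_thread_count(val);
}
```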
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/cpu/amd.c +++ b/xen/arch/x86/cpu/amd.c @@ -507,7 +507,7 @@ static void amd_get_topology(struct cpui u32 eax, ebx, ecx, edx; cpuid(0x8000001e, &eax, &ebx, &ecx, &edx); - c->x86_num_siblings = ((ebx >> 8) & 0x3) + 1; + c->x86_num_siblings = ((ebx >> 8) & 0xff) + 1; if (c->x86 < 0x17) c->compute_unit_id = ebx & 0xFF; --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -368,6 +368,45 @@ static void __init print_details(enum in #endif } +static bool __init check_smt_enabled(void) +{ + uint64_t val; + unsigned int cpu; + + /* + * x86_num_siblings defaults to 1 in the absence of other information, and + * is adjusted based on other topology information found in CPUID leaves. + * + * On AMD hardware, it will be the current SMT configuration. On Intel + * hardware, it will represent the maximum capability, rather than the + * current configuration. + */ + if ( boot_cpu_data.x86_num_siblings < 2 ) + return false; + + /* + * Intel Nehalem and later hardware does have an MSR which reports the + * current count of cores/threads in the package. + * + * At the time of writing, it is almost completely undocumented, so isn't + * virtualised reliably. + */ + if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && !cpu_has_hypervisor && + !rdmsr_safe(MSR_INTEL_CORE_THREAD_COUNT, val) ) + return (MASK_EXTR(val, MSR_CTC_CORE_MASK) != + MASK_EXTR(val, MSR_CTC_THREAD_MASK)); + + /* + * Search over the CPUs reported in the ACPI tables. Any whose APIC ID + * has a non-zero thread id component indicates that SMT is active. + */ + for_each_present_cpu ( cpu ) + if ( x86_cpu_to_apicid[cpu] & (boot_cpu_data.x86_num_siblings - 1) ) + return true; + + return false; +} + /* Calculate whether Retpoline is known-safe on this CPU. 
*/ static bool __init retpoline_safe(uint64_t caps) { @@ -697,12 +736,14 @@ static __init void l1tf_calculations(uin void __init init_speculation_mitigations(void) { enum ind_thunk thunk = THUNK_DEFAULT; - bool use_spec_ctrl = false, ibrs = false; + bool use_spec_ctrl = false, ibrs = false, hw_smt_enabled; uint64_t caps = 0; if ( boot_cpu_has(X86_FEATURE_ARCH_CAPS) ) rdmsrl(MSR_ARCH_CAPABILITIES, caps); + hw_smt_enabled = check_smt_enabled(); + /* * Has the user specified any custom BTI mitigations? If so, follow their * instructions exactly and disable all heuristics. @@ -873,8 +914,7 @@ void __init init_speculation_mitigations * However, if we are on affected hardware, with HT enabled, and the user * hasn't explicitly chosen whether to use HT or not, nag them to do so. */ - if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim && - boot_cpu_data.x86_num_siblings > 1 ) + if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim && hw_smt_enabled ) warning_add( "Booted on L1TF-vulnerable hardware with SMT/Hyperthreading\n" "enabled. Please assess your configuration and choose an\n" ++++++ 5cd93a69-x86-spec-ctrl-reposition-XPTI-parsing.patch ++++++ # Commit c2c2bb0d60c642e64a5243a79c8b1548ffb7bc5b # Date 2019-05-13 10:35:37 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/spec-ctrl: Reposition the XPTI command line parsing logic It has ended up in the middle of the mitigation calculation logic. Move it to be beside the other command line parsing. No functional change. 
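When the MSR is unavailable, check_smt_enabled() above falls back to scanning the ACPI-reported APIC IDs: x86_num_siblings is a power of two, so the low bits of each APIC ID form the thread id, and any non-zero thread id component means SMT is active. A standalone, array-based sketch of that fallback (function and parameter names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if any APIC ID carries a non-zero thread id component,
 * i.e. if a sibling hyperthread is present among the listed CPUs. */
static bool any_sibling_thread(const uint32_t *apicids, unsigned int n,
                               unsigned int num_siblings)
{
    for (unsigned int i = 0; i < n; i++)
        if (apicids[i] & (num_siblings - 1))
            return true;
    return false;
}
```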
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -167,6 +167,73 @@ static int __init parse_spec_ctrl(const } custom_param("spec-ctrl", parse_spec_ctrl); +int8_t __read_mostly opt_xpti_hwdom = -1; +int8_t __read_mostly opt_xpti_domu = -1; + +static __init void xpti_init_default(uint64_t caps) +{ + if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) + caps = ARCH_CAPS_RDCL_NO; + + if ( caps & ARCH_CAPS_RDCL_NO ) + { + if ( opt_xpti_hwdom < 0 ) + opt_xpti_hwdom = 0; + if ( opt_xpti_domu < 0 ) + opt_xpti_domu = 0; + } + else + { + if ( opt_xpti_hwdom < 0 ) + opt_xpti_hwdom = 1; + if ( opt_xpti_domu < 0 ) + opt_xpti_domu = 1; + } +} + +static __init int parse_xpti(const char *s) +{ + const char *ss; + int val, rc = 0; + + /* Interpret 'xpti' alone in its positive boolean form. */ + if ( *s == '\0' ) + opt_xpti_hwdom = opt_xpti_domu = 1; + + do { + ss = strchr(s, ','); + if ( !ss ) + ss = strchr(s, '\0'); + + switch ( parse_bool(s, ss) ) + { + case 0: + opt_xpti_hwdom = opt_xpti_domu = 0; + break; + + case 1: + opt_xpti_hwdom = opt_xpti_domu = 1; + break; + + default: + if ( !strcmp(s, "default") ) + opt_xpti_hwdom = opt_xpti_domu = -1; + else if ( (val = parse_boolean("dom0", s, ss)) >= 0 ) + opt_xpti_hwdom = val; + else if ( (val = parse_boolean("domu", s, ss)) >= 0 ) + opt_xpti_domu = val; + else if ( *s ) + rc = -EINVAL; + break; + } + + s = ss + 1; + } while ( *ss ); + + return rc; +} +custom_param("xpti", parse_xpti); + int8_t __read_mostly opt_pv_l1tf_hwdom = -1; int8_t __read_mostly opt_pv_l1tf_domu = -1; @@ -627,73 +694,6 @@ static __init void l1tf_calculations(uin : (3ul << (paddr_bits - 2)))); } -int8_t __read_mostly opt_xpti_hwdom = -1; -int8_t __read_mostly opt_xpti_domu = -1; - -static __init void xpti_init_default(uint64_t caps) -{ - if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) - caps = ARCH_CAPS_RDCL_NO; - - if ( caps & ARCH_CAPS_RDCL_NO ) 
- { - if ( opt_xpti_hwdom < 0 ) - opt_xpti_hwdom = 0; - if ( opt_xpti_domu < 0 ) - opt_xpti_domu = 0; - } - else - { - if ( opt_xpti_hwdom < 0 ) - opt_xpti_hwdom = 1; - if ( opt_xpti_domu < 0 ) - opt_xpti_domu = 1; - } -} - -static __init int parse_xpti(const char *s) -{ - const char *ss; - int val, rc = 0; - - /* Interpret 'xpti' alone in its positive boolean form. */ - if ( *s == '\0' ) - opt_xpti_hwdom = opt_xpti_domu = 1; - - do { - ss = strchr(s, ','); - if ( !ss ) - ss = strchr(s, '\0'); - - switch ( parse_bool(s, ss) ) - { - case 0: - opt_xpti_hwdom = opt_xpti_domu = 0; - break; - - case 1: - opt_xpti_hwdom = opt_xpti_domu = 1; - break; - - default: - if ( !strcmp(s, "default") ) - opt_xpti_hwdom = opt_xpti_domu = -1; - else if ( (val = parse_boolean("dom0", s, ss)) >= 0 ) - opt_xpti_hwdom = val; - else if ( (val = parse_boolean("domu", s, ss)) >= 0 ) - opt_xpti_domu = val; - else if ( *s ) - rc = -EINVAL; - break; - } - - s = ss + 1; - } while ( *ss ); - - return rc; -} -custom_param("xpti", parse_xpti); - void __init init_speculation_mitigations(void) { enum ind_thunk thunk = THUNK_DEFAULT; ++++++ 5cd981ff-x86-IRQ-tracing-avoid-UB-or-worse.patch ++++++ # Commit 6fafb8befa99620a2d7323b9eca5c387bad1f59f # Date 2019-05-13 16:41:03 +0200 # Author Jan Beulich <jbeulich@suse.com> # Committer Jan Beulich <jbeulich@suse.com> x86/IRQ: avoid UB (or worse) in trace_irq_mask() Dynamically allocated CPU mask objects may be smaller than cpumask_t, so copying has to be restricted to the actual allocation size. This is particularly important since the function doesn't bail early when tracing is not active, so even production builds would be affected by potential misbehavior here. Take the opportunity and also - use initializers instead of assignment + memset(), - constify the cpumask_t input pointer, - u32 -> uint32_t.
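The fix in this trace_irq_mask() patch bounds the copy by the real allocation size of a dynamically sized cpumask, BITS_TO_LONGS(nr_cpu_ids) * sizeof(long), rather than by sizeof(cpumask_t). The length calculation in isolation (a sketch; the constant 24 below corresponds to the 6 * 4-byte `mask[6]` field of the trace record):

```c
#include <assert.h>
#include <stddef.h>

#define BITS_PER_LONG (8 * sizeof(unsigned long))
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Bytes trace_irq_mask() may safely copy: no more than the trace
 * record holds, and no more than the allocated mask contains. */
static size_t trace_mask_copy_len(size_t record_bytes,
                                  unsigned int nr_cpu_ids)
{
    return MIN(record_bytes, BITS_TO_LONGS(nr_cpu_ids) * sizeof(long));
}
```

With a small nr_cpu_ids the allocation, not the record, is the binding limit, which is exactly the case the old min(sizeof(d.mask), sizeof(cpumask_t)) got wrong.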
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com> --- a/xen/arch/x86/irq.c +++ b/xen/arch/x86/irq.c @@ -99,16 +99,19 @@ void unlock_vector_lock(void) spin_unlock(&vector_lock); } -static void trace_irq_mask(u32 event, int irq, int vector, cpumask_t *mask) +static void trace_irq_mask(uint32_t event, int irq, int vector, + const cpumask_t *mask) { struct { unsigned int irq:16, vec:16; unsigned int mask[6]; - } d; - d.irq = irq; - d.vec = vector; - memset(d.mask, 0, sizeof(d.mask)); - memcpy(d.mask, mask, min(sizeof(d.mask), sizeof(cpumask_t))); + } d = { + .irq = irq, + .vec = vector, + }; + + memcpy(d.mask, mask, + min(sizeof(d.mask), BITS_TO_LONGS(nr_cpu_ids) * sizeof(long))); trace_var(event, 1, sizeof(d), &d); } ++++++ 5cdad090-x86-spec-ctrl-CPUID-MSR-definitions-for-MDS.patch ++++++ # Commit d4f6116c080dc013cd1204c4d8ceb95e5f278689 # Date 2019-05-14 15:28:32 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/spec-ctrl: CPUID/MSR definitions for Microarchitectural Data Sampling The MD_CLEAR feature can be automatically offered to guests. No infrastructure is needed in Xen to support the guest making use of it. This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -483,7 +483,7 @@ accounting for hardware capabilities as Currently accepted: -The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, +The Speculation Control hardware features `md-clear`, `ibrsb`, `stibp`, `ibpb`, `l1d-flush` and `ssbd` are used by default if available and applicable. They can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and won't offer them to guests. 
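Several of the command-line handlers in this series (parse_xpti() earlier, and the parse_xen_cpuid() hunk below) walk a comma-separated string and interpret each token with Xen's parse_boolean(). A simplified standalone stand-in for that token test (the real helper additionally accepts `name=<bool>` forms, which this sketch omits):

```c
#include <assert.h>
#include <string.h>

/* Within the token [s, ss): "name" yields 1, "no-name" yields 0,
 * anything else -1 (no match). */
static int parse_boolean_sketch(const char *name,
                                const char *s, const char *ss)
{
    size_t nlen = strlen(name), tlen = (size_t)(ss - s);

    if (tlen == nlen && !strncmp(s, name, nlen))
        return 1;
    if (tlen == nlen + 3 && !strncmp(s, "no-", 3) &&
        !strncmp(s + 3, name, nlen))
        return 0;
    return -1;
}
```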
--- a/tools/libxl/libxl_cpuid.c +++ b/tools/libxl/libxl_cpuid.c @@ -202,6 +202,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"avx512-4vnniw",0x00000007, 0, CPUID_REG_EDX, 2, 1}, {"avx512-4fmaps",0x00000007, 0, CPUID_REG_EDX, 3, 1}, + {"md-clear", 0x00000007, 0, CPUID_REG_EDX, 10, 1}, {"ibrsb", 0x00000007, 0, CPUID_REG_EDX, 26, 1}, {"stibp", 0x00000007, 0, CPUID_REG_EDX, 27, 1}, {"l1d-flush", 0x00000007, 0, CPUID_REG_EDX, 28, 1}, --- a/tools/misc/xen-cpuid.c +++ b/tools/misc/xen-cpuid.c @@ -146,6 +146,7 @@ static const char *str_7d0[32] = { [ 2] = "avx512_4vnniw", [ 3] = "avx512_4fmaps", + [10] = "md-clear", /* 12 */ [13] = "tsx-force-abort", [26] = "ibrsb", [27] = "stibp", --- a/xen/arch/x86/cpuid.c +++ b/xen/arch/x86/cpuid.c @@ -29,7 +29,12 @@ static int __init parse_xen_cpuid(const if ( !ss ) ss = strchr(s, '\0'); - if ( (val = parse_boolean("ibpb", s, ss)) >= 0 ) + if ( (val = parse_boolean("md-clear", s, ss)) >= 0 ) + { + if ( !val ) + setup_clear_cpu_cap(X86_FEATURE_MD_CLEAR); + } + else if ( (val = parse_boolean("ibpb", s, ss)) >= 0 ) { if ( !val ) setup_clear_cpu_cap(X86_FEATURE_IBPB); --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -291,17 +291,19 @@ static void __init print_details(enum in printk("Speculative mitigation facilities:\n"); /* Hardware features which pertain to speculative mitigations. */ - printk(" Hardware features:%s%s%s%s%s%s%s%s%s%s\n", + printk(" Hardware features:%s%s%s%s%s%s%s%s%s%s%s%s\n", (_7d0 & cpufeat_mask(X86_FEATURE_IBRSB)) ? " IBRS/IBPB" : "", (_7d0 & cpufeat_mask(X86_FEATURE_STIBP)) ? " STIBP" : "", (_7d0 & cpufeat_mask(X86_FEATURE_L1D_FLUSH)) ? " L1D_FLUSH" : "", (_7d0 & cpufeat_mask(X86_FEATURE_SSBD)) ? " SSBD" : "", + (_7d0 & cpufeat_mask(X86_FEATURE_MD_CLEAR)) ? " MD_CLEAR" : "", (e8b & cpufeat_mask(X86_FEATURE_IBPB)) ? " IBPB" : "", (caps & ARCH_CAPS_IBRS_ALL) ? " IBRS_ALL" : "", (caps & ARCH_CAPS_RDCL_NO) ? " RDCL_NO" : "", (caps & ARCH_CAPS_RSBA) ? " RSBA" : "", (caps & ARCH_CAPS_SKIP_L1DFL) ? 
" SKIP_L1DFL": "", - (caps & ARCH_CAPS_SSB_NO) ? " SSB_NO" : ""); + (caps & ARCH_CAPS_SSB_NO) ? " SSB_NO" : "", + (caps & ARCH_CAPS_MDS_NO) ? " MDS_NO" : ""); /* Compiled-in support which pertains to mitigations. */ if ( IS_ENABLED(CONFIG_INDIRECT_THUNK) || IS_ENABLED(CONFIG_SHADOW_PAGING) ) @@ -339,23 +341,25 @@ static void __init print_details(enum in * mitigation support for guests. */ #ifdef CONFIG_HVM - printk(" Support for HVM VMs:%s%s%s%s\n", + printk(" Support for HVM VMs:%s%s%s%s%s\n", (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) || boot_cpu_has(X86_FEATURE_SC_RSB_HVM) || opt_eager_fpu) ? "" : " None", boot_cpu_has(X86_FEATURE_SC_MSR_HVM) ? " MSR_SPEC_CTRL" : "", boot_cpu_has(X86_FEATURE_SC_RSB_HVM) ? " RSB" : "", - opt_eager_fpu ? " EAGER_FPU" : ""); + opt_eager_fpu ? " EAGER_FPU" : "", + boot_cpu_has(X86_FEATURE_MD_CLEAR) ? " MD_CLEAR" : ""); #endif #ifdef CONFIG_PV - printk(" Support for PV VMs:%s%s%s%s\n", + printk(" Support for PV VMs:%s%s%s%s%s\n", (boot_cpu_has(X86_FEATURE_SC_MSR_PV) || boot_cpu_has(X86_FEATURE_SC_RSB_PV) || opt_eager_fpu) ? "" : " None", boot_cpu_has(X86_FEATURE_SC_MSR_PV) ? " MSR_SPEC_CTRL" : "", boot_cpu_has(X86_FEATURE_SC_RSB_PV) ? " RSB" : "", - opt_eager_fpu ? " EAGER_FPU" : ""); + opt_eager_fpu ? " EAGER_FPU" : "", + boot_cpu_has(X86_FEATURE_MD_CLEAR) ? " MD_CLEAR" : ""); printk(" XPTI (64-bit PV only): Dom0 %s, DomU %s (with%s PCID)\n", opt_xpti_hwdom ? 
"enabled" : "disabled", --- a/xen/include/asm-x86/msr-index.h +++ b/xen/include/asm-x86/msr-index.h @@ -51,6 +51,7 @@ #define ARCH_CAPS_RSBA (_AC(1, ULL) << 2) #define ARCH_CAPS_SKIP_L1DFL (_AC(1, ULL) << 3) #define ARCH_CAPS_SSB_NO (_AC(1, ULL) << 4) +#define ARCH_CAPS_MDS_NO (_AC(1, ULL) << 5) #define MSR_FLUSH_CMD 0x0000010b #define FLUSH_CMD_L1D (_AC(1, ULL) << 0) --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -242,6 +242,7 @@ XEN_CPUFEATURE(IBPB, 8*32+12) / /* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */ XEN_CPUFEATURE(AVX512_4VNNIW, 9*32+ 2) /*A AVX512 Neural Network Instructions */ XEN_CPUFEATURE(AVX512_4FMAPS, 9*32+ 3) /*A AVX512 Multiply Accumulation Single Precision */ +XEN_CPUFEATURE(MD_CLEAR, 9*32+10) /*A VERW clears microarchitectural buffers */ XEN_CPUFEATURE(TSX_FORCE_ABORT, 9*32+13) /* MSR_TSX_FORCE_ABORT.RTM_ABORT */ XEN_CPUFEATURE(IBRSB, 9*32+26) /*A IBRS and IBPB support (used by Intel) */ XEN_CPUFEATURE(STIBP, 9*32+27) /*A STIBP */ ++++++ 5cdad090-x86-spec-ctrl-infrastructure-for-VERW-flush.patch ++++++ # Commit 548a932ac786d6bf3584e4b54f2ab993e1117710 # Date 2019-05-14 15:28:32 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/spec-ctrl: Infrastructure to use VERW to flush pipeline buffers Three synthetic features are introduced, as we need individual control of each, depending on circumstances. A later change will enable them at appropriate points. The verw_sel field doesn't strictly need to live in struct cpu_info. It lives there because there is a convenient hole it can fill, and it reduces the complexity of the SPEC_CTRL_EXIT_TO_{PV,HVM} assembly by avoiding the need for any temporary stack maintenance. This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091. 
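The commit message above notes that verw_sel fills a convenient hole in struct cpu_info, letting the exit-to-guest assembly reference it at a fixed offset from the stack without temporary stack maintenance. An illustrative miniature layout (not Xen's actual, much larger struct) showing the 4-byte padding hole after processor_id that the new field occupies:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down cpu_info: on x86-64 the pointer member forces
 * 8-byte alignment, so without verw_sel there is a 4-byte pad after
 * processor_id. The new field reuses that hole for free. */
struct cpu_info_sketch {
    unsigned int processor_id;
    unsigned int verw_sel;       /* the field added by the patch */
    void *current_vcpu;
    unsigned long per_cpu_offset;
};
```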
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/x86_64/asm-offsets.c +++ b/xen/arch/x86/x86_64/asm-offsets.c @@ -110,6 +110,7 @@ void __dummy__(void) BLANK(); OFFSET(CPUINFO_guest_cpu_user_regs, struct cpu_info, guest_cpu_user_regs); + OFFSET(CPUINFO_verw_sel, struct cpu_info, verw_sel); OFFSET(CPUINFO_current_vcpu, struct cpu_info, current_vcpu); OFFSET(CPUINFO_cr4, struct cpu_info, cr4); OFFSET(CPUINFO_xen_cr3, struct cpu_info, xen_cr3); --- a/xen/include/asm-x86/cpufeatures.h +++ b/xen/include/asm-x86/cpufeatures.h @@ -31,3 +31,6 @@ XEN_CPUFEATURE(SC_RSB_PV, (FSCAPIN XEN_CPUFEATURE(SC_RSB_HVM, (FSCAPINTS+0)*32+19) /* RSB overwrite needed for HVM */ XEN_CPUFEATURE(SC_MSR_IDLE, (FSCAPINTS+0)*32+21) /* (SC_MSR_PV || SC_MSR_HVM) && default_xen_spec_ctrl */ XEN_CPUFEATURE(XEN_LBR, (FSCAPINTS+0)*32+22) /* Xen uses MSR_DEBUGCTL.LBR */ +XEN_CPUFEATURE(SC_VERW_PV, (FSCAPINTS+0)*32+23) /* VERW used by Xen for PV */ +XEN_CPUFEATURE(SC_VERW_HVM, (FSCAPINTS+0)*32+24) /* VERW used by Xen for HVM */ +XEN_CPUFEATURE(SC_VERW_IDLE, (FSCAPINTS+0)*32+25) /* VERW used by Xen for idle */ --- a/xen/include/asm-x86/current.h +++ b/xen/include/asm-x86/current.h @@ -38,6 +38,7 @@ struct vcpu; struct cpu_info { struct cpu_user_regs guest_cpu_user_regs; unsigned int processor_id; + unsigned int verw_sel; struct vcpu *current_vcpu; unsigned long per_cpu_offset; unsigned long cr4; --- a/xen/include/asm-x86/spec_ctrl.h +++ b/xen/include/asm-x86/spec_ctrl.h @@ -60,6 +60,13 @@ static inline void init_shadow_spec_ctrl info->shadow_spec_ctrl = 0; info->xen_spec_ctrl = default_xen_spec_ctrl; info->spec_ctrl_flags = default_spec_ctrl_flags; + + /* + * For least latency, the VERW selector should be a writeable data + * descriptor resident in the cache. __HYPERVISOR_DS32 shares a cache + * line with __HYPERVISOR_CS, so is expected to be very cache-hot. + */ + info->verw_sel = __HYPERVISOR_DS32; } /* WARNING! 
`ret`, `call *`, `jmp *` not safe after this call. */ @@ -80,6 +87,22 @@ static always_inline void spec_ctrl_ente alternative_input("", "wrmsr", X86_FEATURE_SC_MSR_IDLE, "a" (val), "c" (MSR_SPEC_CTRL), "d" (0)); barrier(); + + /* + * Microarchitectural Store Buffer Data Sampling: + * + * On vulnerable systems, store buffer entries are statically partitioned + * between active threads. When entering idle, our store buffer entries + * are re-partitioned to allow the other threads to use them. + * + * Flush the buffers to ensure that no sensitive data of ours can be + * leaked by a sibling after it gets our store buffer entries. + * + * Note: VERW must be encoded with a memory operand, as it is only that + * form which causes a flush. + */ + alternative_input("", "verw %[sel]", X86_FEATURE_SC_VERW_IDLE, + [sel] "m" (info->verw_sel)); } /* WARNING! `ret`, `call *`, `jmp *` not safe before this call. */ @@ -98,6 +121,17 @@ static always_inline void spec_ctrl_exit alternative_input("", "wrmsr", X86_FEATURE_SC_MSR_IDLE, "a" (val), "c" (MSR_SPEC_CTRL), "d" (0)); barrier(); + + /* + * Microarchitectural Store Buffer Data Sampling: + * + * On vulnerable systems, store buffer entries are statically partitioned + * between active threads. When exiting idle, the other threads store + * buffer entries are re-partitioned to give us some. + * + * We now have store buffer entries with stale data from sibling threads. + * A flush if necessary will be performed on the return to guest path. + */ } #endif /* __ASSEMBLY__ */ --- a/xen/include/asm-x86/spec_ctrl_asm.h +++ b/xen/include/asm-x86/spec_ctrl_asm.h @@ -241,12 +241,16 @@ /* Use when exiting to PV guest context. */ #define SPEC_CTRL_EXIT_TO_PV \ ALTERNATIVE "", \ - DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_SC_MSR_PV + DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_SC_MSR_PV; \ + ALTERNATIVE "", __stringify(verw CPUINFO_verw_sel(%rsp)), \ + X86_FEATURE_SC_VERW_PV /* Use when exiting to HVM guest context. 
*/ #define SPEC_CTRL_EXIT_TO_HVM \ ALTERNATIVE "", \ - DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_SC_MSR_HVM + DO_SPEC_CTRL_EXIT_TO_GUEST, X86_FEATURE_SC_MSR_HVM; \ + ALTERNATIVE "", __stringify(verw CPUINFO_verw_sel(%rsp)), \ + X86_FEATURE_SC_VERW_HVM /* * Use in IST interrupt/exception context. May interrupt Xen or PV context. ++++++ 5cdad090-x86-spec-ctrl-misc-non-functional-cleanup.patch ++++++ # Commit 9b62eba6c429c327e1507816bef403ccc87357ae # Date 2019-05-14 15:28:32 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/spec-ctrl: Misc non-functional cleanup * Identify BTI in the spec_ctrl_{enter,exit}_idle() comments, as other mitigations will shortly appear. * Use alternative_input() and cover the lack of memory clobber with a further barrier. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/xen/include/asm-x86/spec_ctrl.h +++ b/xen/include/asm-x86/spec_ctrl.h @@ -68,6 +68,8 @@ static always_inline void spec_ctrl_ente uint32_t val = 0; /* + * Branch Target Injection: + * * Latch the new shadow value, then enable shadowing, then update the MSR. * There are no SMP issues here; only local processor ordering concerns. */ @@ -75,8 +77,9 @@ static always_inline void spec_ctrl_ente barrier(); info->spec_ctrl_flags |= SCF_use_shadow; barrier(); - asm volatile ( ALTERNATIVE("", "wrmsr", X86_FEATURE_SC_MSR_IDLE) - :: "a" (val), "c" (MSR_SPEC_CTRL), "d" (0) : "memory" ); + alternative_input("", "wrmsr", X86_FEATURE_SC_MSR_IDLE, + "a" (val), "c" (MSR_SPEC_CTRL), "d" (0)); + barrier(); } /* WARNING! `ret`, `call *`, `jmp *` not safe before this call. */ @@ -85,13 +88,16 @@ static always_inline void spec_ctrl_exit uint32_t val = info->xen_spec_ctrl; /* + * Branch Target Injection: + * * Disable shadowing before updating the MSR. There are no SMP issues * here; only local processor ordering concerns.
*/ info->spec_ctrl_flags &= ~SCF_use_shadow; barrier(); - asm volatile ( ALTERNATIVE("", "wrmsr", X86_FEATURE_SC_MSR_IDLE) - :: "a" (val), "c" (MSR_SPEC_CTRL), "d" (0) : "memory" ); + alternative_input("", "wrmsr", X86_FEATURE_SC_MSR_IDLE, + "a" (val), "c" (MSR_SPEC_CTRL), "d" (0)); + barrier(); } #endif /* __ASSEMBLY__ */ ++++++ 5cdad090-x86-spec-ctrl-opts-to-control-VERW-flush.patch ++++++ # Commit 3c04c258ab40405a74e194d9889a4cbc7abe94b4 # Date 2019-05-14 15:28:32 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/spec-ctrl: Introduce options to control VERW flushing The Microarchitectural Data Sampling vulnerability is split into categories with subtly different properties: MLPDS - Microarchitectural Load Port Data Sampling MSBDS - Microarchitectural Store Buffer Data Sampling MFBDS - Microarchitectural Fill Buffer Data Sampling MDSUM - Microarchitectural Data Sampling Uncacheable Memory MDSUM is a special case of the other three, and isn't distinguished further. These issues pertain to three microarchitectural buffers: the Load Ports, the Store Buffers and the Fill Buffers. Each of these structures is flushed by the new enhanced VERW functionality, but the conditions under which flushing is necessary vary. For this concise overview of the issues and default logic, the abbreviations SP (Store Port), FB (Fill Buffer), LP (Load Port) and HT (Hyperthreading) are used for brevity: * Vulnerable hardware is divided into two categories - parts which suffer from SP only, and parts with any other combination of vulnerabilities. * SP only has an HT interaction when the thread goes idle, due to the static partitioning of resources. LP and FB have HT interactions at all points, due to the competitive sharing of resources. All issues potentially leak data across the return-to-guest transition.
* The microcode which implements VERW flushing also extends MSR_FLUSH_CMD, so we don't need to do both on the HVM return-to-guest path. However, some parts are not vulnerable to L1TF (therefore have no MSR_FLUSH_CMD), but are vulnerable to MDS, so do require VERW on the HVM path. Note that we deliberately support mds=1 even without MD_CLEAR in case the microcode has been updated but the feature bit not exposed. This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -1895,7 +1895,7 @@ not be able to control the state of the By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`). ### spec-ctrl (x86) -> `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb}=<bool>, +> `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb,md-clear}=<bool>,
bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu, l1d-flush}=<bool> ]`
@@ -1919,9 +1919,10 @@ in place for guests to use. Use of a positive boolean value for either of these options is invalid. -The booleans `pv=`, `hvm=`, `msr-sc=` and `rsb=` offer fine grained control -over the alternative blocks used by Xen. These impact Xen's ability to -protect itself, and Xen's ability to virtualise support for guests to use. +The booleans `pv=`, `hvm=`, `msr-sc=`, `rsb=` and `md-clear=` offer fine +grained control over the alternative blocks used by Xen. These impact Xen's +ability to protect itself, and Xen's ability to virtualise support for guests +to use. * `pv=` and `hvm=` offer control over all suboptions for PV and HVM guests respectively. @@ -1930,6 +1931,11 @@ protect itself, and Xen's ability to vir guests and if disabled, guests will be unable to use IBRS/STIBP/SSBD/etc. * `rsb=` offers control over whether to overwrite the Return Stack Buffer / Return Address Stack on entry to Xen. +* `md-clear=` offers control over whether to use VERW to flush + microarchitectural buffers on idle and exit from Xen. *Note: For + compatibility with development versions of this fix, `mds=` is also accepted + on Xen 4.12 and earlier as an alias. Consult vendor documentation in + preference to here.* If Xen was compiled with INDIRECT_THUNK support, `bti-thunk=` can be used to select which of the thunks gets patched into the `__x86_indirect_thunk_%reg` --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -35,6 +35,8 @@ static bool __initdata opt_msr_sc_pv = t static bool __initdata opt_msr_sc_hvm = true; static bool __initdata opt_rsb_pv = true; static bool __initdata opt_rsb_hvm = true; +static int8_t __initdata opt_md_clear_pv = -1; +static int8_t __initdata opt_md_clear_hvm = -1; /* Cmdline controls for Xen's speculative settings. 
*/ static enum ind_thunk { @@ -59,6 +61,9 @@ paddr_t __read_mostly l1tf_addr_mask, __ static bool __initdata cpu_has_bug_l1tf; static unsigned int __initdata l1d_maxphysaddr; +static bool __initdata cpu_has_bug_msbds_only; /* => minimal HT impact. */ +static bool __initdata cpu_has_bug_mds; /* Any other M{LP,SB,FB}DS combination. */ + static int __init parse_spec_ctrl(const char *s) { const char *ss; @@ -94,6 +99,8 @@ static int __init parse_spec_ctrl(const disable_common: opt_rsb_pv = false; opt_rsb_hvm = false; + opt_md_clear_pv = 0; + opt_md_clear_hvm = 0; opt_thunk = THUNK_JMP; opt_ibrs = 0; @@ -116,11 +123,13 @@ static int __init parse_spec_ctrl(const { opt_msr_sc_pv = val; opt_rsb_pv = val; + opt_md_clear_pv = val; } else if ( (val = parse_boolean("hvm", s, ss)) >= 0 ) { opt_msr_sc_hvm = val; opt_rsb_hvm = val; + opt_md_clear_hvm = val; } else if ( (val = parse_boolean("msr-sc", s, ss)) >= 0 ) { @@ -132,6 +141,12 @@ static int __init parse_spec_ctrl(const opt_rsb_pv = val; opt_rsb_hvm = val; } + else if ( (val = parse_boolean("md-clear", s, ss)) >= 0 || + (val = parse_boolean("mds", s, ss)) >= 0 ) + { + opt_md_clear_pv = val; + opt_md_clear_hvm = val; + } /* Xen's speculative sidechannel mitigation settings. */ else if ( !strncmp(s, "bti-thunk=", 10) ) @@ -317,7 +332,7 @@ static void __init print_details(enum in "\n"); /* Settings for Xen's protection, irrespective of guests. */ - printk(" Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n", + printk(" Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n", thunk == THUNK_NONE ? "N/A" : thunk == THUNK_RETPOLINE ? "RETPOLINE" : thunk == THUNK_LFENCE ? "LFENCE" : @@ -327,7 +342,8 @@ static void __init print_details(enum in !boot_cpu_has(X86_FEATURE_SSBD) ? "" : (default_xen_spec_ctrl & SPEC_CTRL_SSBD) ? " SSBD+" : " SSBD-", opt_ibpb ? " IBPB" : "", - opt_l1d_flush ? " L1D_FLUSH" : ""); + opt_l1d_flush ? " L1D_FLUSH" : "", + opt_md_clear_pv || opt_md_clear_hvm ? 
" VERW" : ""); /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. */ if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu ) @@ -737,6 +753,107 @@ static __init void l1tf_calculations(uin : (3ul << (paddr_bits - 2)))); } +/* Calculate whether this CPU is vulnerable to MDS. */ +static __init void mds_calculations(uint64_t caps) +{ + /* MDS is only known to affect Intel Family 6 processors at this time. */ + if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL || + boot_cpu_data.x86 != 6 ) + return; + + /* Any processor advertising MDS_NO should be not vulnerable to MDS. */ + if ( caps & ARCH_CAPS_MDS_NO ) + return; + + switch ( boot_cpu_data.x86_model ) + { + /* + * Core processors since at least Nehalem are vulnerable. + */ + case 0x1f: /* Auburndale / Havendale */ + case 0x1e: /* Nehalem */ + case 0x1a: /* Nehalem EP */ + case 0x2e: /* Nehalem EX */ + case 0x25: /* Westmere */ + case 0x2c: /* Westmere EP */ + case 0x2f: /* Westmere EX */ + case 0x2a: /* SandyBridge */ + case 0x2d: /* SandyBridge EP/EX */ + case 0x3a: /* IvyBridge */ + case 0x3e: /* IvyBridge EP/EX */ + case 0x3c: /* Haswell */ + case 0x3f: /* Haswell EX/EP */ + case 0x45: /* Haswell D */ + case 0x46: /* Haswell H */ + case 0x3d: /* Broadwell */ + case 0x47: /* Broadwell H */ + case 0x4f: /* Broadwell EP/EX */ + case 0x56: /* Broadwell D */ + case 0x4e: /* Skylake M */ + case 0x5e: /* Skylake D */ + cpu_has_bug_mds = true; + break; + + /* + * Some Core processors have per-stepping vulnerability. + */ + case 0x55: /* Skylake-X / Cascade Lake */ + if ( boot_cpu_data.x86_mask <= 5 ) + cpu_has_bug_mds = true; + break; + + case 0x8e: /* Kaby / Coffee / Whiskey Lake M */ + if ( boot_cpu_data.x86_mask <= 0xb ) + cpu_has_bug_mds = true; + break; + + case 0x9e: /* Kaby / Coffee / Whiskey Lake D */ + if ( boot_cpu_data.x86_mask <= 0xc ) + cpu_has_bug_mds = true; + break; + + /* + * Very old and very new Atom processors are not vulnerable. 
+ */ + case 0x1c: /* Pineview */ + case 0x26: /* Lincroft */ + case 0x27: /* Penwell */ + case 0x35: /* Cloverview */ + case 0x36: /* Cedarview */ + case 0x7a: /* Goldmont */ + break; + + /* + * Middling Atom processors are vulnerable to just the Store Buffer + * aspect. + */ + case 0x37: /* Baytrail / Valleyview (Silvermont) */ + case 0x4a: /* Merrifield */ + case 0x4c: /* Cherrytrail / Brasswell */ + case 0x4d: /* Avaton / Rangely (Silvermont) */ + case 0x5a: /* Moorefield */ + case 0x5d: + case 0x65: + case 0x6e: + case 0x75: + /* + * Knights processors (which are based on the Silvermont/Airmont + * microarchitecture) are similarly only affected by the Store Buffer + * aspect. + */ + case 0x57: /* Knights Landing */ + case 0x85: /* Knights Mill */ + cpu_has_bug_msbds_only = true; + break; + + default: + printk("Unrecognised CPU model %#x - assuming vulnerable to MDS\n", + boot_cpu_data.x86_model); + cpu_has_bug_mds = true; + break; + } +} + void __init init_speculation_mitigations(void) { enum ind_thunk thunk = THUNK_DEFAULT; @@ -924,6 +1041,47 @@ void __init init_speculation_mitigations "enabled. Please assess your configuration and choose an\n" "explicit 'smt=<bool>' setting. See XSA-273.\n"); + mds_calculations(caps); + + /* + * By default, enable PV and HVM mitigations on MDS-vulnerable hardware. + * This will only be a token effort for MLPDS/MFBDS when HT is enabled, + * but it is somewhat better than nothing. + */ + if ( opt_md_clear_pv == -1 ) + opt_md_clear_pv = ((cpu_has_bug_mds || cpu_has_bug_msbds_only) && + boot_cpu_has(X86_FEATURE_MD_CLEAR)); + if ( opt_md_clear_hvm == -1 ) + opt_md_clear_hvm = ((cpu_has_bug_mds || cpu_has_bug_msbds_only) && + boot_cpu_has(X86_FEATURE_MD_CLEAR)); + + /* + * Enable MDS defences as applicable. The PV blocks need using all the + * time, and the Idle blocks need using if either PV or HVM defences are + * used. + * + * HVM is more complicated. 
The MD_CLEAR microcode extends L1D_FLUSH with + * equivelent semantics to avoid needing to perform both flushes on the + * HVM path. The HVM blocks don't need activating if our hypervisor told + * us it was handling L1D_FLUSH, or we are using L1D_FLUSH ourselves. + */ + if ( opt_md_clear_pv ) + setup_force_cpu_cap(X86_FEATURE_SC_VERW_PV); + if ( opt_md_clear_pv || opt_md_clear_hvm ) + setup_force_cpu_cap(X86_FEATURE_SC_VERW_IDLE); + if ( opt_md_clear_hvm && !(caps & ARCH_CAPS_SKIP_L1DFL) && !opt_l1d_flush ) + setup_force_cpu_cap(X86_FEATURE_SC_VERW_HVM); + + /* + * Warn the user if they are on MLPDS/MFBDS-vulnerable hardware with HT + * active and no explicit SMT choice. + */ + if ( opt_smt == -1 && cpu_has_bug_mds && hw_smt_enabled ) + warning_add( + "Booted on MLPDS/MFBDS-vulnerable hardware with SMT/Hyperthreading\n" + "enabled. Mitigations will not be fully effective. Please\n" + "choose an explicit smt=<bool> setting. See XSA-297.\n"); + print_details(thunk, caps); /* ++++++ 5cdeb9fd-sched-fix-csched2_deinit_pdata.patch ++++++ # Commit ffd3367ed682b6ac6f57fcb151921054dd4cce7e # Date 2019-05-17 15:41:17 +0200 # Author Juergen Gross <jgross@suse.com> # Committer Jan Beulich <jbeulich@suse.com> xen/sched: fix csched2_deinit_pdata() Commit 753ba43d6d16e688 ("xen/sched: fix credit2 smt idle handling") introduced a regression when switching cpus between cpupools. When assigning a cpu to a cpupool with credit2 being the default scheduler csched2_deinit_pdata() is called for the credit2 private data after the new scheduler's private data has been hooked to the per-cpu scheduler data. Unfortunately csched2_deinit_pdata() will cycle through all per-cpu scheduler areas it knows of for removing the cpu from the respective sibling masks including the area of the just moved cpu. This will (depending on the new scheduler) either clobber the data of the new scheduler or in case of sched_rt lead to a crash. 
Avoid that by removing the cpu from the list of active cpus in credit2 data first. The opposite problem is occurring when removing a cpu from a cpupool: init_pdata() of credit2 will access the per-cpu data of the old scheduler. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Dario Faggioli <dfaggioli@suse.com> --- a/xen/common/sched_credit2.c +++ b/xen/common/sched_credit2.c @@ -3813,22 +3813,21 @@ init_pdata(struct csched2_private *prv, activate_runqueue(prv, spc->runq_id); } - __cpumask_set_cpu(cpu, &rqd->idle); - __cpumask_set_cpu(cpu, &rqd->active); - __cpumask_set_cpu(cpu, &prv->initialized); - __cpumask_set_cpu(cpu, &rqd->smt_idle); + __cpumask_set_cpu(cpu, &spc->sibling_mask); - /* On the boot cpu we are called before cpu_sibling_mask has been set up. */ - if ( cpu == 0 && system_state < SYS_STATE_active ) - __cpumask_set_cpu(cpu, &csched2_pcpu(cpu)->sibling_mask); - else + if ( cpumask_weight(&rqd->active) > 0 ) for_each_cpu ( rcpu, per_cpu(cpu_sibling_mask, cpu) ) if ( cpumask_test_cpu(rcpu, &rqd->active) ) { __cpumask_set_cpu(cpu, &csched2_pcpu(rcpu)->sibling_mask); - __cpumask_set_cpu(rcpu, &csched2_pcpu(cpu)->sibling_mask); + __cpumask_set_cpu(rcpu, &spc->sibling_mask); } + __cpumask_set_cpu(cpu, &rqd->idle); + __cpumask_set_cpu(cpu, &rqd->active); + __cpumask_set_cpu(cpu, &prv->initialized); + __cpumask_set_cpu(cpu, &rqd->smt_idle); + if ( cpumask_weight(&rqd->active) == 1 ) rqd->pick_bias = cpu; @@ -3937,13 +3936,13 @@ csched2_deinit_pdata(const struct schedu printk(XENLOG_INFO "Removing cpu %d from runqueue %d\n", cpu, spc->runq_id); - for_each_cpu ( rcpu, &rqd->active ) - __cpumask_clear_cpu(cpu, &csched2_pcpu(rcpu)->sibling_mask); - __cpumask_clear_cpu(cpu, &rqd->idle); __cpumask_clear_cpu(cpu, &rqd->smt_idle); __cpumask_clear_cpu(cpu, &rqd->active); + for_each_cpu ( rcpu, &rqd->active ) + __cpumask_clear_cpu(cpu, &csched2_pcpu(rcpu)->sibling_mask); + if ( cpumask_empty(&rqd->active) ) { printk(XENLOG_INFO " No cpus left on runqueue, 
disabling\n"); ++++++ 5ce7a92f-x86-IO-APIC-fix-build-with-gcc9.patch ++++++ # Commit ca9310b24e6205de5387e5982ccd42c35caf89d4 # Date 2019-05-24 10:19:59 +0200 # Author Jan Beulich <jbeulich@suse.com> # Committer Jan Beulich <jbeulich@suse.com> x86/IO-APIC: fix build with gcc9 There are a number of pointless __packed attributes which cause gcc 9 to legitimately warn: utils.c: In function 'vtd_dump_iommu_info': utils.c:287:33: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member] 287 | remap = (struct IO_APIC_route_remap_entry *) &rte; | ^~~~~~~~~~~~~~~~~~~~~~~~~ intremap.c: In function 'ioapic_rte_to_remap_entry': intremap.c:343:25: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member] 343 | remap_rte = (struct IO_APIC_route_remap_entry *) old_rte; | ^~~~~~~~~~~~~~~~~~~~~~~~~ Simply drop these attributes. Take the liberty and also re-format the structure definitions at the same time. 
Reported-by: Charles Arnold <carnold@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> --- a/xen/include/asm-x86/io_apic.h +++ b/xen/include/asm-x86/io_apic.h @@ -32,42 +32,42 @@ * The structure of the IO-APIC: */ union IO_APIC_reg_00 { - u32 raw; - struct __packed { - u32 __reserved_2 : 14, - LTS : 1, - delivery_type : 1, - __reserved_1 : 8, - ID : 8; - } bits; + uint32_t raw; + struct { + unsigned int __reserved_2:14; + unsigned int LTS:1; + unsigned int delivery_type:1; + unsigned int __reserved_1:8; + unsigned int ID:8; + } bits; }; union IO_APIC_reg_01 { - u32 raw; - struct __packed { - u32 version : 8, - __reserved_2 : 7, - PRQ : 1, - entries : 8, - __reserved_1 : 8; - } bits; + uint32_t raw; + struct { + unsigned int version:8; + unsigned int __reserved_2:7; + unsigned int PRQ:1; + unsigned int entries:8; + unsigned int __reserved_1:8; + } bits; }; union IO_APIC_reg_02 { - u32 raw; - struct __packed { - u32 __reserved_2 : 24, - arbitration : 4, - __reserved_1 : 4; - } bits; + uint32_t raw; + struct { + unsigned int __reserved_2:24; + unsigned int arbitration:4; + unsigned int __reserved_1:4; + } bits; }; union IO_APIC_reg_03 { - u32 raw; - struct __packed { - u32 boot_DT : 1, - __reserved_1 : 31; - } bits; + uint32_t raw; + struct { + unsigned int boot_DT:1; + unsigned int __reserved_1:31; + } bits; }; /* @@ -87,35 +87,36 @@ enum ioapic_irq_destination_types { dest_ExtINT = 7 }; -struct __packed IO_APIC_route_entry { - __u32 vector : 8, - delivery_mode : 3, /* 000: FIXED - * 001: lowest prio - * 111: ExtINT - */ - dest_mode : 1, /* 0: physical, 1: logical */ - delivery_status : 1, - polarity : 1, - irr : 1, - trigger : 1, /* 0: edge, 1: level */ - mask : 1, /* 0: enabled, 1: disabled */ - __reserved_2 : 15; - - union { struct { __u32 - __reserved_1 : 24, - physical_dest : 4, - __reserved_2 : 4; - } physical; - - struct { __u32 - __reserved_1 : 24, - 
logical_dest : 8; - } logical; - - /* used when Interrupt Remapping with EIM is enabled */ - __u32 dest32; - } dest; - +struct IO_APIC_route_entry { + unsigned int vector:8; + unsigned int delivery_mode:3; /* + * 000: FIXED + * 001: lowest prio + * 111: ExtINT + */ + unsigned int dest_mode:1; /* 0: physical, 1: logical */ + unsigned int delivery_status:1; + unsigned int polarity:1; /* 0: low, 1: high */ + unsigned int irr:1; + unsigned int trigger:1; /* 0: edge, 1: level */ + unsigned int mask:1; /* 0: enabled, 1: disabled */ + unsigned int __reserved_2:15; + + union { + struct { + unsigned int __reserved_1:24; + unsigned int physical_dest:4; + unsigned int __reserved_2:4; + } physical; + + struct { + unsigned int __reserved_1:24; + unsigned int logical_dest:8; + } logical; + + /* used when Interrupt Remapping with EIM is enabled */ + unsigned int dest32; + } dest; }; /* ++++++ 5cf0f6a4-x86-vhpet-resume-avoid-small-diff.patch ++++++ # Commit b144cf45d50b603c2909fc32c6abf7359f86f1aa # Date 2019-05-31 11:40:52 +0200 # Author Paul Durrant <paul.durrant@citrix.com> # Committer Jan Beulich <jbeulich@suse.com> x86/vhpet: avoid 'small' time diff test on resume It appears that even 64-bit versions of Windows 10, when not using synthetic timers, will use 32-bit HPET non-periodic timers. There is a test in hpet_set_timer(), specific to 32-bit timers, that tries to disambiguate between a comparator value that is in the past and one that is sufficiently far in the future that it wraps. This is done by assuming that the delta between the main counter and comparator will be 'small' [1], if the comparator value is in the past. Unfortunately, more often than not, this is not the case if the timer is being re-started after a migrate and so the timer is set to fire far in the future (in excess of a minute in several observed cases) rather than set to fire immediately.
This has a rather odd symptom where the guest console is alive enough to be able to deal with mouse pointer re-rendering, but any keyboard activity or mouse clicks yield no response. This patch simply adds an extra check of 'creation_finished' into hpet_set_timer() so that the 'small' time test is omitted when the function is called to restart timers after migration, and thus any negative delta causes a timer to fire immediately. [1] The number of ticks that equate to 0.9765625 milliseconds Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hpet.c +++ b/xen/arch/x86/hvm/hpet.c @@ -273,10 +273,14 @@ static void hpet_set_timer(HPETState *h, * Detect time values set in the past. This is hard to do for 32-bit * comparators as the timer does not have to be set that far in the future * for the counter difference to wrap a 32-bit signed integer. We fudge - * by looking for a 'small' time value in the past. + * by looking for a 'small' time value in the past. However, if we + * are restoring after migrate, treat any wrap as past since the value + * is unlikely to be 'small'. */ if ( (int64_t)diff < 0 ) - diff = (timer_is_32bit(h, tn) && (-diff > HPET_TINY_TIME_SPAN)) + diff = (timer_is_32bit(h, tn) && + vhpet_domain(h)->creation_finished && + (-diff > HPET_TINY_TIME_SPAN)) ? (uint32_t)diff : 0; destroy_periodic_time(&h->pt[tn]); ++++++ 5cf16e51-x86-spec-ctrl-Knights-retpoline-safe.patch ++++++ # Commit e2105180f99d22aad47ee57113015e11d7397e54 # Date 2019-05-31 19:11:29 +0100 # Author Andrew Cooper <andrew.cooper3@citrix.com> # Committer Andrew Cooper <andrew.cooper3@citrix.com> x86/spec-ctrl: Knights Landing/Mill are retpoline-safe They are both Airmont-based and should have been included in c/s 17f74242ccf "x86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts". 
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -518,9 +518,11 @@ static bool __init retpoline_safe(uint64 case 0x4d: /* Avaton / Rangely (Silvermont) */ case 0x4c: /* Cherrytrail / Brasswell */ case 0x4a: /* Merrifield */ + case 0x57: /* Knights Landing */ case 0x5a: /* Moorefield */ case 0x5c: /* Goldmont */ case 0x5f: /* Denverton */ + case 0x85: /* Knights Mill */ return true; default: ++++++ 5d03a0c4-1-Arm-add-an-isb-before-reading-CNTPCT_EL0.patch ++++++ # Commit 5e1b9cb0f29d6b52bd603d22bca4ae4cfeef9e74 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: Add an isb() before reading CNTPCT_EL0 to prevent re-ordering Per D8.2.1 in ARM DDI 0487C.a, "a read to CNTPCT_EL0 can occur speculatively and out of order relative to other instructions executed on the same PE." Add an instruction barrier to get accurate number of cycles when requested in get_cycles(). For the other users of CNTPCT_EL0, replace by a call to get_cycles(). This is part of XSA-295.
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/arch/arm/time.c +++ b/xen/arch/arm/time.c @@ -151,7 +151,7 @@ void __init preinit_xen_time(void) if ( res ) panic("Timer: Cannot initialize platform timer\n"); - boot_count = READ_SYSREG64(CNTPCT_EL0); + boot_count = get_cycles(); } static void __init init_dt_xen_time(void) @@ -192,7 +192,7 @@ int __init init_xen_time(void) /* Return number of nanoseconds since boot */ s_time_t get_s_time(void) { - uint64_t ticks = READ_SYSREG64(CNTPCT_EL0) - boot_count; + uint64_t ticks = get_cycles() - boot_count; return ticks_to_ns(ticks); } --- a/xen/include/asm-arm/time.h +++ b/xen/include/asm-arm/time.h @@ -2,6 +2,7 @@ #define __ARM_TIME_H__ #include <asm/sysregs.h> +#include <asm/system.h> #define DT_MATCH_TIMER \ DT_MATCH_COMPATIBLE("arm,armv7-timer"), \ @@ -11,6 +12,7 @@ typedef uint64_t cycles_t; static inline cycles_t get_cycles (void) { + isb(); return READ_SYSREG64(CNTPCT_EL0); } ++++++ 5d03a0c4-2-gnttab-rework-prototype-of-set_status.patch ++++++ # Commit 863e74eb2cffb5c1a454441b3e842ac56802d2f0 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/grant_table: Rework the prototype of _set_status* for lisibility It is not clear from the parameters name whether domid and gt_version correspond to the local or remote domain. A follow-up patch will make them more confusing. So rename domid (resp. gt_version) to ldomid (resp. rgt_version). At the same time re-order the parameters to hopefully make it more readable. This is part of XSA-295. 
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -645,11 +645,11 @@ static unsigned int nr_grant_entries(str return 0; } -static int _set_status_v1(domid_t domid, +static int _set_status_v1(const grant_entry_header_t *shah, + struct active_grant_entry *act, int readonly, int mapflag, - grant_entry_header_t *shah, - struct active_grant_entry *act) + domid_t ldomid) { int rc = GNTST_okay; union grant_combo scombo, prev_scombo, new_scombo; @@ -684,11 +684,11 @@ static int _set_status_v1(domid_t domid if ( !act->pin && (((scombo.shorts.flags & mask) != GTF_permit_access) || - (scombo.shorts.domid != domid)) ) + (scombo.shorts.domid != ldomid)) ) PIN_FAIL(done, GNTST_general_error, "Bad flags (%x) or dom (%d); expected d%d\n", scombo.shorts.flags, scombo.shorts.domid, - domid); + ldomid); new_scombo = scombo; new_scombo.shorts.flags |= GTF_reading; @@ -717,12 +717,12 @@ done: return rc; } -static int _set_status_v2(domid_t domid, +static int _set_status_v2(const grant_entry_header_t *shah, + grant_status_t *status, + struct active_grant_entry *act, int readonly, int mapflag, - grant_entry_header_t *shah, - struct active_grant_entry *act, - grant_status_t *status) + domid_t ldomid) { int rc = GNTST_okay; union grant_combo scombo; @@ -748,10 +748,10 @@ static int _set_status_v2(domid_t domid if ( !act->pin && ( (((flags & mask) != GTF_permit_access) && ((flags & mask) != GTF_transitive)) || - (id != domid)) ) + (id != ldomid)) ) PIN_FAIL(done, GNTST_general_error, "Bad flags (%x) or dom (%d); expected d%d, flags %x\n", - flags, id, domid, mask); + flags, id, ldomid, mask); if ( readonly ) { @@ -778,14 +778,14 @@ static int _set_status_v2(domid_t domid { if ( (((flags & mask) != GTF_permit_access) && ((flags & mask) != GTF_transitive)) || - (id != domid) || + (id 
!= ldomid) || (!readonly && (flags & GTF_readonly)) ) { gnttab_clear_flag(_GTF_writing, status); gnttab_clear_flag(_GTF_reading, status); PIN_FAIL(done, GNTST_general_error, "Unstable flags (%x) or dom (%d); expected d%d (r/w: %d)\n", - flags, id, domid, !readonly); + flags, id, ldomid, !readonly); } } else @@ -803,19 +803,19 @@ done: } -static int _set_status(unsigned gt_version, - domid_t domid, +static int _set_status(const grant_entry_header_t *shah, + grant_status_t *status, + unsigned rgt_version, + struct active_grant_entry *act, int readonly, int mapflag, - grant_entry_header_t *shah, - struct active_grant_entry *act, - grant_status_t *status) + domid_t ldomid) { - if ( gt_version == 1 ) - return _set_status_v1(domid, readonly, mapflag, shah, act); + if ( rgt_version == 1 ) + return _set_status_v1(shah, act, readonly, mapflag, ldomid); else - return _set_status_v2(domid, readonly, mapflag, shah, act, status); + return _set_status_v2(shah, status, act, readonly, mapflag, ldomid); } static struct active_grant_entry *grant_map_exists(const struct domain *ld, @@ -980,9 +980,9 @@ map_grant_ref( (!(op->flags & GNTMAP_readonly) && !(act->pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask))) ) { - if ( (rc = _set_status(rgt->gt_version, ld->domain_id, - op->flags & GNTMAP_readonly, - 1, shah, act, status) ) != GNTST_okay ) + if ( (rc = _set_status(shah, status, rgt->gt_version, act, + op->flags & GNTMAP_readonly, 1, + ld->domain_id) != GNTST_okay) ) goto act_release_out; if ( !act->pin ) @@ -2434,8 +2434,8 @@ acquire_grant_for_copy( { if ( (!old_pin || (!readonly && !(old_pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)))) && - (rc = _set_status_v2(ldom, readonly, 0, shah, act, - status)) != GNTST_okay ) + (rc = _set_status_v2(shah, status, act, readonly, 0, + ldom)) != GNTST_okay ) goto unlock_out; if ( !allow_transitive ) @@ -2535,9 +2535,8 @@ acquire_grant_for_copy( else if ( !old_pin || (!readonly && !(old_pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask))) ) { - if ( (rc = 
_set_status(rgt->gt_version, ldom, - readonly, 0, shah, act, - status) ) != GNTST_okay ) + if ( (rc = _set_status(shah, status, rgt->gt_version, act, + readonly, 0, ldom)) != GNTST_okay ) goto unlock_out; td = rd; ++++++ 5d03a0c4-3-Arm64-rewrite-bitops-in-C.patch ++++++ # Commit bc7c2c9af89469706f8778d40eba5d4fc0094974 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm64: bitops: Rewrite bitop helpers in C This is part of XSA-295. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> --- a/xen/arch/arm/README.LinuxPrimitives +++ b/xen/arch/arm/README.LinuxPrimitives @@ -8,7 +8,6 @@ arm64: bitops: last sync @ v3.16-rc6 (last commit: 8715466b6027) -linux/arch/arm64/lib/bitops.S xen/arch/arm/arm64/lib/bitops.S linux/arch/arm64/include/asm/bitops.h xen/include/asm-arm/arm64/bitops.h --------------------------------------------------------------------- --- a/xen/arch/arm/arm64/lib/bitops.S +++ /dev/null @@ -1,67 +0,0 @@ -/* - * Based on linux/arch/arm64/lib/bitops.h which in turn is - * Based on arch/arm/lib/bitops.h - * - * Copyright (C) 2013 ARM Ltd. - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program. If not, see http://www.gnu.org/licenses/. 
- */ - -/* - * x0: bits 4:0 bit offset - * bits 31:5 word offset - * x1: address - */ - .macro bitop, name, instr -ENTRY( \name ) - and w3, w0, #31 // Get bit offset - eor w0, w0, w3 // Clear low bits - mov x2, #1 - add x1, x1, x0, lsr #3 // Get word offset - lsl x3, x2, x3 // Create mask -1: ldxr w2, [x1] - \instr w2, w2, w3 - stxr w0, w2, [x1] - cbnz w0, 1b - ret -ENDPROC(\name ) - .endm - - .macro testop, name, instr -ENTRY( \name ) - and w3, w0, #31 // Get bit offset - eor w0, w0, w3 // Clear low bits - mov x2, #1 - add x1, x1, x0, lsr #3 // Get word offset - lsl x4, x2, x3 // Create mask -1: ldxr w2, [x1] - lsr w0, w2, w3 // Save old value of bit - \instr w2, w2, w4 // toggle bit - stlxr w5, w2, [x1] - cbnz w5, 1b - dmb ish - and w0, w0, #1 -3: ret -ENDPROC(\name ) - .endm - -/* - * Atomic bit operations. - */ - bitop change_bit, eor - bitop clear_bit, bic - bitop set_bit, orr - - testop test_and_change_bit, eor - testop test_and_clear_bit, bic - testop test_and_set_bit, orr --- /dev/null +++ b/xen/arch/arm/arm64/lib/bitops.c @@ -0,0 +1,90 @@ +/* + * Copyright (C) 2018 ARM Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ + +#include <xen/bitops.h> +#include <asm/system.h> + +/* + * The atomic bit operations pass the number of bit in a signed number + * (not sure why). This has the drawback to increase the complexity of + * the resulting assembly. 
+ * + * To generate simpler code, the number of bit (nr) will be cast to + * unsigned int. + * + * XXX: Rework the interface to use unsigned int. + */ + +#define bitop(name, instr) \ +void name(int nr, volatile void *p) \ +{ \ + volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ + const uint32_t mask = BIT_MASK((unsigned int)nr); \ + unsigned long res, tmp; \ + \ + do \ + { \ + asm volatile ("// " __stringify(name) "\n" \ + " ldxr %w2, %1\n" \ + " " __stringify(instr) " %w2, %w2, %w3\n" \ + " stxr %w0, %w2, %1\n" \ + : "=&r" (res), "+Q" (*ptr), "=&r" (tmp) \ + : "r" (mask)); \ + } while ( res ); \ +} \ + +#define testop(name, instr) \ +int name(int nr, volatile void *p) \ +{ \ + volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ + unsigned int bit = (unsigned int)nr % BITS_PER_WORD; \ + const uint32_t mask = BIT_MASK(bit); \ + unsigned long res, tmp; \ + unsigned long oldbit; \ + \ + do \ + { \ + asm volatile ("// " __stringify(name) "\n" \ + " ldxr %w3, %2\n" \ + " lsr %w1, %w3, %w5 // Save old value of bit\n" \ + " " __stringify(instr) " %w3, %w3, %w4 // Toggle bit\n" \ + " stlxr %w0, %w3, %2\n" \ + : "=&r" (res), "=&r" (oldbit), "+Q" (*ptr), "=&r" (tmp) \ + : "r" (mask), "r" (bit) \ + : "memory"); \ + } while ( res ); \ + \ + dmb(ish); \ + \ + return oldbit & 1; \ +} + +bitop(change_bit, eor) +bitop(clear_bit, bic) +bitop(set_bit, orr) + +testop(test_and_change_bit, eor) +testop(test_and_clear_bit, bic) +testop(test_and_set_bit, orr) + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ ++++++ 5d03a0c4-4-Arm32-rewrite-bitops-in-C.patch ++++++ # Commit 429fb328442b1671d1679b8c95088b6cd5427fc6 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm32: bitops: Rewrite bitop helpers in C This is part of XSA-295. 
Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> --- a/xen/arch/arm/README.LinuxPrimitives +++ b/xen/arch/arm/README.LinuxPrimitives @@ -68,19 +68,9 @@ arm32 bitops: last sync @ v3.16-rc6 (last commit: c32ffce0f66e) -linux/arch/arm/lib/bitops.h xen/arch/arm/arm32/lib/bitops.h -linux/arch/arm/lib/changebit.S xen/arch/arm/arm32/lib/changebit.S -linux/arch/arm/lib/clearbit.S xen/arch/arm/arm32/lib/clearbit.S linux/arch/arm/lib/findbit.S xen/arch/arm/arm32/lib/findbit.S -linux/arch/arm/lib/setbit.S xen/arch/arm/arm32/lib/setbit.S -linux/arch/arm/lib/testchangebit.S xen/arch/arm/arm32/lib/testchangebit.S -linux/arch/arm/lib/testclearbit.S xen/arch/arm/arm32/lib/testclearbit.S -linux/arch/arm/lib/testsetbit.S xen/arch/arm/arm32/lib/testsetbit.S -for i in bitops.h changebit.S clearbit.S findbit.S setbit.S testchangebit.S \ - testclearbit.S testsetbit.S; do - diff -u ../linux/arch/arm/lib/$i xen/arch/arm/arm32/lib/$i; -done +diff -u ../linux/arch/arm/lib/findbit.S xen/arch/arm/arm32/lib/findbit.S --------------------------------------------------------------------- --- a/xen/arch/arm/arm32/lib/Makefile +++ b/xen/arch/arm/arm32/lib/Makefile @@ -1,6 +1,5 @@ obj-y += memcpy.o memmove.o memset.o memchr.o memzero.o -obj-y += findbit.o setbit.o -obj-y += setbit.o clearbit.o changebit.o -obj-y += testsetbit.o testclearbit.o testchangebit.o +obj-y += findbit.o +obj-y += bitops.o obj-y += strchr.o strrchr.o obj-y += lib1funcs.o lshrdi3.o div64.o --- /dev/null +++ b/xen/arch/arm/arm32/lib/bitops.c @@ -0,0 +1,98 @@ +/* + * Copyright (C) 2018 ARM Ltd. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/. + */ + +#include <xen/bitops.h> +#include <xen/prefetch.h> +#include <asm/system.h> + +/* + * The atomic bit operations pass the number of bit in a signed number + * (not sure why). This has the drawback to increase the complexity of + * the resulting assembly. + * + * To generate simpler code, the number of bit (nr) will be cast to + * unsigned int. + * + * XXX: Rework the interface to use unsigned int. + */ + +#define bitop(name, instr) \ +void name(int nr, volatile void *p) \ +{ \ + volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ + const uint32_t mask = BIT_MASK((unsigned int)nr); \ + unsigned long res, tmp; \ + \ + ASSERT(((vaddr_t)p & 0x3) == 0); \ + prefetchw((const void *)ptr); \ + \ + do \ + { \ + asm volatile ("// " __stringify(name) "\n" \ + " ldrex %2, %1\n" \ + " " __stringify(instr) " %2, %2, %3\n" \ + " strex %0, %2, %1\n" \ + : "=&r" (res), "+Qo" (*ptr), "=&r" (tmp) \ + : "r" (mask)); \ + } while ( res ); \ +} + +#define testop(name, instr) \ +int name(int nr, volatile void *p) \ +{ \ + volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ + unsigned int bit = (unsigned int)nr % BITS_PER_WORD; \ + const uint32_t mask = BIT_MASK(bit); \ + unsigned long res, tmp; \ + int oldbit; \ + \ + ASSERT(((vaddr_t)p & 0x3) == 0); \ + smp_mb(); \ + \ + prefetchw((const void *)ptr); \ + \ + do \ + { \ + asm volatile ("// " __stringify(name) "\n" \ + " ldrex %3, %2\n" \ + " lsr %1, %3, %5 // Save old value of bit\n" \ + " " __stringify(instr) " %3, %3, %4 // Toggle bit\n" \ + " strex %0, %3, %2\n" \ + : "=&r" (res), "=&r" (oldbit), 
"+Qo" (*ptr), "=&r" (tmp) \ + : "r" (mask), "r" (bit)); \ + } while ( res ); \ + \ + smp_mb(); \ + \ + return oldbit & 1; \ +} \ + +bitop(change_bit, eor) +bitop(clear_bit, bic) +bitop(set_bit, orr) + +testop(test_and_change_bit, eor) +testop(test_and_clear_bit, bic) +testop(test_and_set_bit, orr) + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ --- a/xen/arch/arm/arm32/lib/bitops.h +++ /dev/null @@ -1,104 +0,0 @@ - -#if __LINUX_ARM_ARCH__ >= 6 - .macro bitop, name, instr -ENTRY( \name ) -UNWIND( .fnstart ) - ands ip, r1, #3 - strneb r1, [ip] @ assert word-aligned - mov r2, #1 - and r3, r0, #31 @ Get bit offset - mov r0, r0, lsr #5 - add r1, r1, r0, lsl #2 @ Get word offset -#if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP) - .arch_extension mp - ALT_SMP(W(pldw) [r1]) - ALT_UP(W(nop)) -#endif - mov r3, r2, lsl r3 -1: ldrex r2, [r1] - \instr r2, r2, r3 - strex r0, r2, [r1] - cmp r0, #0 - bne 1b - bx lr -UNWIND( .fnend ) -ENDPROC(\name ) - .endm - - .macro testop, name, instr, store -ENTRY( \name ) -UNWIND( .fnstart ) - ands ip, r1, #3 - strneb r1, [ip] @ assert word-aligned - mov r2, #1 - and r3, r0, #31 @ Get bit offset - mov r0, r0, lsr #5 - add r1, r1, r0, lsl #2 @ Get word offset - mov r3, r2, lsl r3 @ create mask - smp_dmb -#if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP) - .arch_extension mp - ALT_SMP(W(pldw) [r1]) - ALT_UP(W(nop)) -#endif -1: ldrex r2, [r1] - ands r0, r2, r3 @ save old value of bit - \instr r2, r2, r3 @ toggle bit - strex ip, r2, [r1] - cmp ip, #0 - bne 1b - smp_dmb - cmp r0, #0 - movne r0, #1 -2: bx lr -UNWIND( .fnend ) -ENDPROC(\name ) - .endm -#else - .macro bitop, name, instr -ENTRY( \name ) -UNWIND( .fnstart ) - ands ip, r1, #3 - strneb r1, [ip] @ assert word-aligned - and r2, r0, #31 - mov r0, r0, lsr #5 - mov r3, #1 - mov r3, r3, lsl r2 - save_and_disable_irqs ip - ldr r2, [r1, r0, lsl #2] - \instr r2, r2, r3 - str r2, [r1, r0, lsl #2] - restore_irqs ip - 
mov pc, lr -UNWIND( .fnend ) -ENDPROC(\name ) - .endm - -/** - * testop - implement a test_and_xxx_bit operation. - * @instr: operational instruction - * @store: store instruction - * - * Note: we can trivially conditionalise the store instruction - * to avoid dirtying the data cache. - */ - .macro testop, name, instr, store -ENTRY( \name ) -UNWIND( .fnstart ) - ands ip, r1, #3 - strneb r1, [ip] @ assert word-aligned - and r3, r0, #31 - mov r0, r0, lsr #5 - save_and_disable_irqs ip - ldr r2, [r1, r0, lsl #2]! - mov r0, #1 - tst r2, r0, lsl r3 - \instr r2, r2, r0, lsl r3 - \store r2, [r1] - moveq r0, #0 - restore_irqs ip - mov pc, lr -UNWIND( .fnend ) -ENDPROC(\name ) - .endm -#endif --- a/xen/arch/arm/arm32/lib/changebit.S +++ /dev/null @@ -1,14 +0,0 @@ -/* - * linux/arch/arm/lib/changebit.S - * - * Copyright (C) 1995-1996 Russell King - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ -#include "assembler.h" -#include "bitops.h" - .text - -bitop _change_bit, eor --- a/xen/arch/arm/arm32/lib/clearbit.S +++ /dev/null @@ -1,14 +0,0 @@ -/* - * linux/arch/arm/lib/clearbit.S - * - * Copyright (C) 1995-1996 Russell King - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ -#include "assembler.h" -#include "bitops.h" - .text - -bitop _clear_bit, bic --- a/xen/arch/arm/arm32/lib/setbit.S +++ /dev/null @@ -1,15 +0,0 @@ -/* - * linux/arch/arm/lib/setbit.S - * - * Copyright (C) 1995-1996 Russell King - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. 
- */ - -#include "assembler.h" -#include "bitops.h" - .text - -bitop _set_bit, orr --- a/xen/arch/arm/arm32/lib/testchangebit.S +++ /dev/null @@ -1,15 +0,0 @@ -/* - * linux/arch/arm/lib/testchangebit.S - * - * Copyright (C) 1995-1996 Russell King - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -#include "assembler.h" -#include "bitops.h" - .text - -testop _test_and_change_bit, eor, str --- a/xen/arch/arm/arm32/lib/testclearbit.S +++ /dev/null @@ -1,15 +0,0 @@ -/* - * linux/arch/arm/lib/testclearbit.S - * - * Copyright (C) 1995-1996 Russell King - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -#include "assembler.h" -#include "bitops.h" - .text - -testop _test_and_clear_bit, bicne, strne --- a/xen/arch/arm/arm32/lib/testsetbit.S +++ /dev/null @@ -1,15 +0,0 @@ -/* - * linux/arch/arm/lib/testsetbit.S - * - * Copyright (C) 1995-1996 Russell King - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. 
- */ - -#include "assembler.h" -#include "bitops.h" - .text - -testop _test_and_set_bit, orreq, streq --- a/xen/include/asm-arm/arm32/bitops.h +++ b/xen/include/asm-arm/arm32/bitops.h @@ -1,19 +1,12 @@ #ifndef _ARM_ARM32_BITOPS_H #define _ARM_ARM32_BITOPS_H -extern void _set_bit(int nr, volatile void * p); -extern void _clear_bit(int nr, volatile void * p); -extern void _change_bit(int nr, volatile void * p); -extern int _test_and_set_bit(int nr, volatile void * p); -extern int _test_and_clear_bit(int nr, volatile void * p); -extern int _test_and_change_bit(int nr, volatile void * p); - -#define set_bit(n,p) _set_bit(n,p) -#define clear_bit(n,p) _clear_bit(n,p) -#define change_bit(n,p) _change_bit(n,p) -#define test_and_set_bit(n,p) _test_and_set_bit(n,p) -#define test_and_clear_bit(n,p) _test_and_clear_bit(n,p) -#define test_and_change_bit(n,p) _test_and_change_bit(n,p) +extern void set_bit(int nr, volatile void * p); +extern void clear_bit(int nr, volatile void * p); +extern void change_bit(int nr, volatile void * p); +extern int test_and_set_bit(int nr, volatile void * p); +extern int test_and_clear_bit(int nr, volatile void * p); +extern int test_and_change_bit(int nr, volatile void * p); #define flsl fls ++++++ 5d03a0c4-5-Arm-bitops-consolidate-prototypes.patch ++++++ # Commit 3a4e55e6051bb10f9debc4fb874c31081b24930b # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: bitops: Consolidate prototypes in one place The prototypes are the same between arm32 and arm64. Consolidate them in asm-arm/bitops.h. This change will help the introduction of new helpers in a follow-up patch. This is part of XSA-295. 
Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/include/asm-arm/arm32/bitops.h +++ b/xen/include/asm-arm/arm32/bitops.h @@ -1,13 +1,6 @@ #ifndef _ARM_ARM32_BITOPS_H #define _ARM_ARM32_BITOPS_H -extern void set_bit(int nr, volatile void * p); -extern void clear_bit(int nr, volatile void * p); -extern void change_bit(int nr, volatile void * p); -extern int test_and_set_bit(int nr, volatile void * p); -extern int test_and_clear_bit(int nr, volatile void * p); -extern int test_and_change_bit(int nr, volatile void * p); - #define flsl fls /* --- a/xen/include/asm-arm/arm64/bitops.h +++ b/xen/include/asm-arm/arm64/bitops.h @@ -1,16 +1,6 @@ #ifndef _ARM_ARM64_BITOPS_H #define _ARM_ARM64_BITOPS_H -/* - * Little endian assembly atomic bitops. - */ -extern void set_bit(int nr, volatile void *p); -extern void clear_bit(int nr, volatile void *p); -extern void change_bit(int nr, volatile void *p); -extern int test_and_set_bit(int nr, volatile void *p); -extern int test_and_clear_bit(int nr, volatile void *p); -extern int test_and_change_bit(int nr, volatile void *p); - /* Based on linux/include/asm-generic/bitops/builtin-__ffs.h */ /** * __ffs - find first bit in word. 
--- a/xen/include/asm-arm/bitops.h +++ b/xen/include/asm-arm/bitops.h @@ -38,6 +38,14 @@ # error "unknown ARM variant" #endif +/* Atomics bitops */ +void set_bit(int nr, volatile void *p); +void clear_bit(int nr, volatile void *p); +void change_bit(int nr, volatile void *p); +int test_and_set_bit(int nr, volatile void *p); +int test_and_clear_bit(int nr, volatile void *p); +int test_and_change_bit(int nr, volatile void *p); + /** * __test_and_set_bit - Set a bit and return its old value * @nr: Bit to set ++++++ 5d03a0c4-6-Arm64-cmpxchg-simplify.patch ++++++ # Commit 00926c2370f3c6a45a20e068f53cbe7989d180f0 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm64: cmpxchg: Simplify the cmpxchg implementation The only difference between each case of the cmpxchg is the size used. Rather than duplicating the code, provide a macro to generate each case. This makes the code easier to read and modify. This is part of XSA-295. 
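(Editor's note: the `__CMPXCHG_CASE()` rework in the patch below can be sketched in portable C — one macro generates a helper per operand size, and `__cmpxchg()` collapses into a size dispatch. The GCC `__atomic` builtin stands in for the ldxr/stxr loop; all names here are illustrative, not Xen's.)

```c
#include <assert.h>
#include <stdint.h>

/* One macro generates all four size-specific cases, mirroring
 * __CMPXCHG_CASE(w, sz, name) in the patch.  The strong builtin
 * compare-exchange stands in for the exclusive load/store loop. */
#define CMPXCHG_CASE(type, name)                                          \
static unsigned long cmpxchg_case_##name(volatile void *ptr,              \
                                         unsigned long old,               \
                                         unsigned long new)               \
{                                                                         \
    type expected = (type)old;                                            \
    __atomic_compare_exchange_n((volatile type *)ptr, &expected,          \
                                (type)new, 0 /* strong */,                \
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);      \
    return expected; /* the value actually observed, like oldval */       \
}

CMPXCHG_CASE(uint8_t,  1)
CMPXCHG_CASE(uint16_t, 2)
CMPXCHG_CASE(uint32_t, 4)
CMPXCHG_CASE(uint64_t, 8)

/* Size dispatch, as in __cmpxchg() after the rework. */
static unsigned long cmpxchg_sketch(volatile void *ptr, unsigned long old,
                                    unsigned long new, int size)
{
    switch (size) {
    case 1: return cmpxchg_case_1(ptr, old, new);
    case 2: return cmpxchg_case_2(ptr, old, new);
    case 4: return cmpxchg_case_4(ptr, old, new);
    case 8: return cmpxchg_case_8(ptr, old, new);
    default: return 0; /* the real code calls __bad_cmpxchg() here */
    }
}
```

The macro keeps the four cases textually identical, which is the point of the rework: a later change (such as the timeout variant added in a follow-up patch) only has to touch one body.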
Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> --- a/xen/include/asm-arm/arm64/cmpxchg.h +++ b/xen/include/asm-arm/arm64/cmpxchg.h @@ -61,80 +61,54 @@ static inline unsigned long __xchg(unsig __ret; \ }) -extern void __bad_cmpxchg(volatile void *ptr, int size); +extern unsigned long __bad_cmpxchg(volatile void *ptr, int size); + +#define __CMPXCHG_CASE(w, sz, name) \ +static inline unsigned long __cmpxchg_case_##name(volatile void *ptr, \ + unsigned long old, \ + unsigned long new) \ +{ \ + unsigned long res, oldval; \ + \ + do { \ + asm volatile("// __cmpxchg_case_" #name "\n" \ + " ldxr" #sz " %" #w "1, %2\n" \ + " mov %w0, #0\n" \ + " cmp %" #w "1, %" #w "3\n" \ + " b.ne 1f\n" \ + " stxr" #sz " %w0, %" #w "4, %2\n" \ + "1:\n" \ + : "=&r" (res), "=&r" (oldval), \ + "+Q" (*(unsigned long *)ptr) \ + : "Ir" (old), "r" (new) \ + : "cc"); \ + } while (res); \ + \ + return oldval; \ +} + +__CMPXCHG_CASE(w, b, 1) +__CMPXCHG_CASE(w, h, 2) +__CMPXCHG_CASE(w, , 4) +__CMPXCHG_CASE( , , 8) static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, unsigned long new, int size) { - unsigned long oldval = 0, res; - switch (size) { case 1: - do { - asm volatile("// __cmpxchg1\n" - " ldxrb %w1, %2\n" - " mov %w0, #0\n" - " cmp %w1, %w3\n" - " b.ne 1f\n" - " stxrb %w0, %w4, %2\n" - "1:\n" - : "=&r" (res), "=&r" (oldval), "+Q" (*(u8 *)ptr) - : "Ir" (old), "r" (new) - : "cc"); - } while (res); - break; - + return __cmpxchg_case_1(ptr, old, new); case 2: - do { - asm volatile("// __cmpxchg2\n" - " ldxrh %w1, %2\n" - " mov %w0, #0\n" - " cmp %w1, %w3\n" - " b.ne 1f\n" - " stxrh %w0, %w4, %2\n" - "1:\n" - : "=&r" (res), "=&r" (oldval), "+Q" (*(u16 *)ptr) - : "Ir" (old), "r" (new) - : "cc"); - } while (res); - break; - + return __cmpxchg_case_2(ptr, old, new); case 4: - do { - asm volatile("// __cmpxchg4\n" - " ldxr %w1, %2\n" - " mov %w0, #0\n" - " cmp 
%w1, %w3\n" - " b.ne 1f\n" - " stxr %w0, %w4, %2\n" - "1:\n" - : "=&r" (res), "=&r" (oldval), "+Q" (*(u32 *)ptr) - : "Ir" (old), "r" (new) - : "cc"); - } while (res); - break; - + return __cmpxchg_case_4(ptr, old, new); case 8: - do { - asm volatile("// __cmpxchg8\n" - " ldxr %1, %2\n" - " mov %w0, #0\n" - " cmp %1, %3\n" - " b.ne 1f\n" - " stxr %w0, %4, %2\n" - "1:\n" - : "=&r" (res), "=&r" (oldval), "+Q" (*(u64 *)ptr) - : "Ir" (old), "r" (new) - : "cc"); - } while (res); - break; - + return __cmpxchg_case_8(ptr, old, new); default: - __bad_cmpxchg(ptr, size); - oldval = 0; + return __bad_cmpxchg(ptr, size); } - return oldval; + ASSERT_UNREACHABLE(); } static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old, ++++++ 5d03a0c4-7-Arm32-cmpxchg-simplify.patch ++++++ # Commit 2d2ccf4355a182232b2c60a3bca4c15210e8b4b6 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm32: cmpxchg: Simplify the cmpxchg implementation The only difference between each case of the cmpxchg is the size used. Rather than duplicating the code, provide a macro to generate each case. This makes the code easier to read and modify. While doing the rework, the case for 64-bit cmpxchg is removed. This is unused today (already commented) and it would not be possible to use it directly. This is part of XSA-295. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/include/asm-arm/arm32/cmpxchg.h +++ b/xen/include/asm-arm/arm32/cmpxchg.h @@ -52,72 +52,50 @@ static inline unsigned long __xchg(unsig * indicated by comparing RETURN with OLD. 
*/ -extern void __bad_cmpxchg(volatile void *ptr, int size); +extern unsigned long __bad_cmpxchg(volatile void *ptr, int size); + +#define __CMPXCHG_CASE(sz, name) \ +static inline unsigned long __cmpxchg_case_##name(volatile void *ptr, \ + unsigned long old, \ + unsigned long new) \ +{ \ + unsigned long oldval, res; \ + \ + do { \ + asm volatile("@ __cmpxchg_case_" #name "\n" \ + " ldrex" #sz " %1, [%2]\n" \ + " mov %0, #0\n" \ + " teq %1, %3\n" \ + " strex" #sz "eq %0, %4, [%2]\n" \ + : "=&r" (res), "=&r" (oldval) \ + : "r" (ptr), "Ir" (old), "r" (new) \ + : "memory", "cc"); \ + } while (res); \ + \ + return oldval; \ +} + +__CMPXCHG_CASE(b, 1) +__CMPXCHG_CASE(h, 2) +__CMPXCHG_CASE( , 4) static always_inline unsigned long __cmpxchg( volatile void *ptr, unsigned long old, unsigned long new, int size) { - unsigned long oldval, res; - prefetchw((const void *)ptr); switch (size) { case 1: - do { - asm volatile("@ __cmpxchg1\n" - " ldrexb %1, [%2]\n" - " mov %0, #0\n" - " teq %1, %3\n" - " strexbeq %0, %4, [%2]\n" - : "=&r" (res), "=&r" (oldval) - : "r" (ptr), "Ir" (old), "r" (new) - : "memory", "cc"); - } while (res); - break; + return __cmpxchg_case_1(ptr, old, new); case 2: - do { - asm volatile("@ __cmpxchg2\n" - " ldrexh %1, [%2]\n" - " mov %0, #0\n" - " teq %1, %3\n" - " strexheq %0, %4, [%2]\n" - : "=&r" (res), "=&r" (oldval) - : "r" (ptr), "Ir" (old), "r" (new) - : "memory", "cc"); - } while (res); - break; + return __cmpxchg_case_2(ptr, old, new); case 4: - do { - asm volatile("@ __cmpxchg4\n" - " ldrex %1, [%2]\n" - " mov %0, #0\n" - " teq %1, %3\n" - " strexeq %0, %4, [%2]\n" - : "=&r" (res), "=&r" (oldval) - : "r" (ptr), "Ir" (old), "r" (new) - : "memory", "cc"); - } while (res); - break; -#if 0 - case 8: - do { - asm volatile("@ __cmpxchg8\n" - " ldrexd %1, [%2]\n" - " mov %0, #0\n" - " teq %1, %3\n" - " strexdeq %0, %4, [%2]\n" - : "=&r" (res), "=&r" (oldval) - : "r" (ptr), "Ir" (old), "r" (new) - : "memory", "cc"); - } while (res); - break; -#endif + 
return __cmpxchg_case_4(ptr, old, new); default: - __bad_cmpxchg(ptr, size); - oldval = 0; + return __bad_cmpxchg(ptr, size); } - return oldval; + ASSERT_UNREACHABLE(); } static always_inline unsigned long __cmpxchg_mb(volatile void *ptr, ++++++ 5d03a0c4-8-Arm-bitops-helpers-with-timeout.patch ++++++ # Commit 62d10134fb0611bf9d09c6a09877db013e500ea9 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: bitops: Implement a new set of helpers that can timeout Exclusive load-store atomics should only be used between trusted threads. As not all the guests are trusted, it may be possible to DoS Xen when updating shared memory with guest atomically. To prevent the infinite loop, we introduce a new set of helpers that can timeout. The timeout is based on the maximum number of iterations. They will be used in follow-up patch to make atomic operations on shared memory safe. This is part of XSA-295. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/arch/arm/arm32/lib/bitops.c +++ b/xen/arch/arm/arm32/lib/bitops.c @@ -30,7 +30,8 @@ */ #define bitop(name, instr) \ -void name(int nr, volatile void *p) \ +static always_inline bool int_##name(int nr, volatile void *p, bool timeout,\ + unsigned int max_try) \ { \ volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ const uint32_t mask = BIT_MASK((unsigned int)nr); \ @@ -47,17 +48,33 @@ void name(int nr, volatile void *p) " strex %0, %2, %1\n" \ : "=&r" (res), "+Qo" (*ptr), "=&r" (tmp) \ : "r" (mask)); \ - } while ( res ); \ + \ + if ( !res ) \ + break; \ + } while ( !timeout || ((--max_try) > 0) ); \ + \ + return !res; \ +} \ + \ +void name(int nr, volatile void *p) \ +{ \ + if ( !int_##name(nr, p, false, 0) ) \ + ASSERT_UNREACHABLE(); \ +} \ + \ +bool name##_timeout(int nr, volatile void *p, unsigned int max_try) \ +{ \ + return int_##name(nr, p, true, max_try); \ } 
#define testop(name, instr) \ -int name(int nr, volatile void *p) \ +static always_inline bool int_##name(int nr, volatile void *p, int *oldbit, \ + bool timeout, unsigned int max_try) \ { \ volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ unsigned int bit = (unsigned int)nr % BITS_PER_WORD; \ const uint32_t mask = BIT_MASK(bit); \ unsigned long res, tmp; \ - int oldbit; \ \ ASSERT(((vaddr_t)p & 0x3) == 0); \ smp_mb(); \ @@ -71,14 +88,35 @@ int name(int nr, volatile void *p) " lsr %1, %3, %5 // Save old value of bit\n" \ " " __stringify(instr) " %3, %3, %4 // Toggle bit\n" \ " strex %0, %3, %2\n" \ - : "=&r" (res), "=&r" (oldbit), "+Qo" (*ptr), "=&r" (tmp) \ + : "=&r" (res), "=&r" (*oldbit), "+Qo" (*ptr), "=&r" (tmp) \ : "r" (mask), "r" (bit)); \ - } while ( res ); \ + \ + if ( !res ) \ + break; \ + } while ( !timeout || ((--max_try) > 0) ); \ \ smp_mb(); \ \ - return oldbit & 1; \ + *oldbit &= 1; \ + \ + return !res; \ +} \ + \ +int name(int nr, volatile void *p) \ +{ \ + int oldbit; \ + \ + if ( !int_##name(nr, p, &oldbit, false, 0) ) \ + ASSERT_UNREACHABLE(); \ + \ + return oldbit; \ } \ + \ +bool name##_timeout(int nr, volatile void *p, \ + int *oldbit, unsigned int max_try) \ +{ \ + return int_##name(nr, p, oldbit, true, max_try); \ +} bitop(change_bit, eor) bitop(clear_bit, bic) --- a/xen/arch/arm/arm64/lib/bitops.c +++ b/xen/arch/arm/arm64/lib/bitops.c @@ -29,7 +29,8 @@ */ #define bitop(name, instr) \ -void name(int nr, volatile void *p) \ +static always_inline bool int_##name(int nr, volatile void *p, bool timeout,\ + unsigned int max_try) \ { \ volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ const uint32_t mask = BIT_MASK((unsigned int)nr); \ @@ -43,17 +44,33 @@ void name(int nr, volatile void *p) " stxr %w0, %w2, %1\n" \ : "=&r" (res), "+Q" (*ptr), "=&r" (tmp) \ : "r" (mask)); \ - } while ( res ); \ + \ + if ( !res ) \ + break; \ + } while ( !timeout || ((--max_try) > 0) ); \ + \ + return !res; \ } \ + \ +void 
name(int nr, volatile void *p) \ +{ \ + if ( !int_##name(nr, p, false, 0) ) \ + ASSERT_UNREACHABLE(); \ +} \ + \ +bool name##_timeout(int nr, volatile void *p, unsigned int max_try) \ +{ \ + return int_##name(nr, p, true, max_try); \ +} #define testop(name, instr) \ -int name(int nr, volatile void *p) \ +static always_inline bool int_##name(int nr, volatile void *p, int *oldbit, \ + bool timeout, unsigned int max_try) \ { \ volatile uint32_t *ptr = (uint32_t *)p + BIT_WORD((unsigned int)nr); \ unsigned int bit = (unsigned int)nr % BITS_PER_WORD; \ const uint32_t mask = BIT_MASK(bit); \ unsigned long res, tmp; \ - unsigned long oldbit; \ \ do \ { \ @@ -62,14 +79,35 @@ int name(int nr, volatile void *p) " lsr %w1, %w3, %w5 // Save old value of bit\n" \ " " __stringify(instr) " %w3, %w3, %w4 // Toggle bit\n" \ " stlxr %w0, %w3, %2\n" \ - : "=&r" (res), "=&r" (oldbit), "+Q" (*ptr), "=&r" (tmp) \ + : "=&r" (res), "=&r" (*oldbit), "+Q" (*ptr), "=&r" (tmp) \ : "r" (mask), "r" (bit) \ : "memory"); \ - } while ( res ); \ + \ + if ( !res ) \ + break; \ + } while ( !timeout || ((--max_try) > 0) ); \ \ dmb(ish); \ \ - return oldbit & 1; \ + *oldbit &= 1; \ + \ + return !res; \ +} \ + \ +int name(int nr, volatile void *p) \ +{ \ + int oldbit; \ + \ + if ( !int_##name(nr, p, &oldbit, false, 0) ) \ + ASSERT_UNREACHABLE(); \ + \ + return oldbit; \ +} \ + \ +bool name##_timeout(int nr, volatile void *p, \ + int *oldbit, unsigned int max_try) \ +{ \ + return int_##name(nr, p, oldbit, true, max_try); \ } bitop(change_bit, eor) --- a/xen/include/asm-arm/bitops.h +++ b/xen/include/asm-arm/bitops.h @@ -38,7 +38,14 @@ # error "unknown ARM variant" #endif -/* Atomics bitops */ +/* + * Atomic bitops + * + * The helpers below *should* only be used on memory shared between + * trusted threads or we know the memory cannot be accessed by another + * thread. 
+ */ + void set_bit(int nr, volatile void *p); void clear_bit(int nr, volatile void *p); void change_bit(int nr, volatile void *p); @@ -46,6 +53,25 @@ int test_and_set_bit(int nr, volatile vo int test_and_clear_bit(int nr, volatile void *p); int test_and_change_bit(int nr, volatile void *p); +/* + * The helpers below may fail to update the memory if the action takes + * too long. + * + * @max_try: Maximum number of iterations + * + * The helpers will return true when the update has succeeded (i.e no + * timeout) and false if the update has failed. + */ +bool set_bit_timeout(int nr, volatile void *p, unsigned int max_try); +bool clear_bit_timeout(int nr, volatile void *p, unsigned int max_try); +bool change_bit_timeout(int nr, volatile void *p, unsigned int max_try); +bool test_and_set_bit_timeout(int nr, volatile void *p, + int *oldbit, unsigned int max_try); +bool test_and_clear_bit_timeout(int nr, volatile void *p, + int *oldbit, unsigned int max_try); +bool test_and_change_bit_timeout(int nr, volatile void *p, + int *oldbit, unsigned int max_try); + /** * __test_and_set_bit - Set a bit and return its old value * @nr: Bit to set ++++++ 5d03a0c4-9-Arm-cmpxchg-helper-with-timeout.patch ++++++ # Commit 86b0bc958373217b986ca3fc8c46597577e83049 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: cmpxchg: Provide a new helper that can timeout Exclusive load-store atomics should only be used between trusted threads. As not all the guests are trusted, it may be possible to DoS Xen when updating shared memory with guest atomically. To prevent the infinite loop, we introduce a new helper that can timeout. The timeout is based on the maximum number of iterations. It will be used in follow-up patch to make atomic operations on shared memory safe. This is part of XSA-295. 
Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> --- a/xen/include/asm-arm/arm32/cmpxchg.h +++ b/xen/include/asm-arm/arm32/cmpxchg.h @@ -55,11 +55,14 @@ static inline unsigned long __xchg(unsig extern unsigned long __bad_cmpxchg(volatile void *ptr, int size); #define __CMPXCHG_CASE(sz, name) \ -static inline unsigned long __cmpxchg_case_##name(volatile void *ptr, \ - unsigned long old, \ - unsigned long new) \ +static inline bool __cmpxchg_case_##name(volatile void *ptr, \ + unsigned long *old, \ + unsigned long new, \ + bool timeout, \ + unsigned int max_try) \ { \ - unsigned long oldval, res; \ + unsigned long oldval; \ + unsigned long res; \ \ do { \ asm volatile("@ __cmpxchg_case_" #name "\n" \ @@ -68,29 +71,35 @@ static inline unsigned long __cmpxchg_ca " teq %1, %3\n" \ " strex" #sz "eq %0, %4, [%2]\n" \ : "=&r" (res), "=&r" (oldval) \ - : "r" (ptr), "Ir" (old), "r" (new) \ + : "r" (ptr), "Ir" (*old), "r" (new) \ : "memory", "cc"); \ - } while (res); \ \ - return oldval; \ + if (!res) \ + break; \ + } while (!timeout || ((--max_try) > 0)); \ + \ + *old = oldval; \ + \ + return !res; \ } __CMPXCHG_CASE(b, 1) __CMPXCHG_CASE(h, 2) __CMPXCHG_CASE( , 4) -static always_inline unsigned long __cmpxchg( - volatile void *ptr, unsigned long old, unsigned long new, int size) +static always_inline bool __int_cmpxchg(volatile void *ptr, unsigned long *old, + unsigned long new, int size, + bool timeout, unsigned int max_try) { prefetchw((const void *)ptr); switch (size) { case 1: - return __cmpxchg_case_1(ptr, old, new); + return __cmpxchg_case_1(ptr, old, new, timeout, max_try); case 2: - return __cmpxchg_case_2(ptr, old, new); + return __cmpxchg_case_2(ptr, old, new, timeout, max_try); case 4: - return __cmpxchg_case_4(ptr, old, new); + return __cmpxchg_case_4(ptr, old, new, timeout, max_try); default: return __bad_cmpxchg(ptr, size); } @@ -98,6 +107,17 @@ 
static always_inline unsigned long __cmp ASSERT_UNREACHABLE(); } +static always_inline unsigned long __cmpxchg(volatile void *ptr, + unsigned long old, + unsigned long new, + int size) +{ + if (!__int_cmpxchg(ptr, &old, new, size, false, 0)) + ASSERT_UNREACHABLE(); + + return old; +} + static always_inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old, unsigned long new, int size) @@ -111,6 +131,25 @@ static always_inline unsigned long __cmp return ret; } +/* + * The helper may fail to update the memory if the action takes too long. + * + * @old: On call the value pointed contains the expected old value. It will be + * updated to the actual old value. + * @max_try: Maximum number of iterations + * + * The helper will return true when the update has succeeded (i.e no + * timeout) and false if the update has failed. + */ +static always_inline bool __cmpxchg_mb_timeout(volatile void *ptr, + unsigned long *old, + unsigned long new, + int size, + unsigned int max_try) +{ + return __int_cmpxchg(ptr, old, new, size, true, max_try); +} + #define cmpxchg(ptr,o,n) \ ((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \ (unsigned long)(o), \ --- a/xen/include/asm-arm/arm64/cmpxchg.h +++ b/xen/include/asm-arm/arm64/cmpxchg.h @@ -64,11 +64,14 @@ static inline unsigned long __xchg(unsig extern unsigned long __bad_cmpxchg(volatile void *ptr, int size); #define __CMPXCHG_CASE(w, sz, name) \ -static inline unsigned long __cmpxchg_case_##name(volatile void *ptr, \ - unsigned long old, \ - unsigned long new) \ +static inline bool __cmpxchg_case_##name(volatile void *ptr, \ + unsigned long *old, \ + unsigned long new, \ + bool timeout, \ + unsigned int max_try) \ { \ - unsigned long res, oldval; \ + unsigned long oldval; \ + unsigned long res; \ \ do { \ asm volatile("// __cmpxchg_case_" #name "\n" \ @@ -80,11 +83,16 @@ static inline unsigned long __cmpxchg_ca "1:\n" \ : "=&r" (res), "=&r" (oldval), \ "+Q" (*(unsigned long *)ptr) \ - : "Ir" (old), "r" (new) \ + : "Ir" (*old), 
"r" (new) \ : "cc"); \ - } while (res); \ \ - return oldval; \ + if (!res) \ + break; \ + } while (!timeout || ((--max_try) > 0)); \ + \ + *old = oldval; \ + \ + return !res; \ } __CMPXCHG_CASE(w, b, 1) @@ -92,18 +100,19 @@ __CMPXCHG_CASE(w, h, 2) __CMPXCHG_CASE(w, , 4) __CMPXCHG_CASE( , , 8) -static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old, - unsigned long new, int size) +static always_inline bool __int_cmpxchg(volatile void *ptr, unsigned long *old, + unsigned long new, int size, + bool timeout, unsigned int max_try) { switch (size) { case 1: - return __cmpxchg_case_1(ptr, old, new); + return __cmpxchg_case_1(ptr, old, new, timeout, max_try); case 2: - return __cmpxchg_case_2(ptr, old, new); + return __cmpxchg_case_2(ptr, old, new, timeout, max_try); case 4: - return __cmpxchg_case_4(ptr, old, new); + return __cmpxchg_case_4(ptr, old, new, timeout, max_try); case 8: - return __cmpxchg_case_8(ptr, old, new); + return __cmpxchg_case_8(ptr, old, new, timeout, max_try); default: return __bad_cmpxchg(ptr, size); } @@ -111,8 +120,20 @@ static inline unsigned long __cmpxchg(vo ASSERT_UNREACHABLE(); } -static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old, - unsigned long new, int size) +static always_inline unsigned long __cmpxchg(volatile void *ptr, + unsigned long old, + unsigned long new, + int size) +{ + if (!__int_cmpxchg(ptr, &old, new, size, false, 0)) + ASSERT_UNREACHABLE(); + + return old; +} + +static always_inline unsigned long __cmpxchg_mb(volatile void *ptr, + unsigned long old, + unsigned long new, int size) { unsigned long ret; @@ -123,6 +144,25 @@ static inline unsigned long __cmpxchg_mb return ret; } +/* + * The helper may fail to update the memory if the action takes too long. + * + * @old: On call the value pointed contains the expected old value. It will be + * updated to the actual old value. 
+ * @max_try: Maximum number of iterations + * + * The helper will return true when the update has succeeded (i.e no + * timeout) and false if the update has failed. + */ +static always_inline bool __cmpxchg_mb_timeout(volatile void *ptr, + unsigned long *old, + unsigned long new, + int size, + unsigned int max_try) +{ + return __int_cmpxchg(ptr, old, new, size, true, max_try); +} + #define cmpxchg(ptr, o, n) \ ({ \ __typeof__(*(ptr)) __ret; \ ++++++ 5d03a0c4-A-Arm-turn-on-SILO-mode-by-default.patch ++++++ # Commit cb8e0d28e8df97adc4ba2721f75824acdd3702b7 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: Turn on SILO mode by default on Arm On Arm, exclusive load-store atomics should only be used between trusted threads. As not all guests are trusted, it may be possible to DoS Xen when atomically updating memory shared with a guest. Recent patches introduced new helpers to atomically update memory shared with a guest. Those helpers rely on a memory region being shared only between Xen and a single guest. At the moment, nothing prevents a guest from sharing a page with Xen as well as with another guest (e.g. via the grant table). For the scope of the XSA, the quickest way is to deny communication between unprivileged guests. So this patch enables and uses SILO mode by default on Arm. Users who want a finer-grained policy can write their own Flask policy. This is part of XSA-295. 
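(Editor's note: the diff below changes `xsm_dt_init()` to fold the SILO indication into its return value, which is what lets `start_xen()` warn when the result is not 1. A minimal sketch of that contract, with illustrative names:)

```c
#include <assert.h>

/* Sketch of the reworked xsm_dt_init() contract: a negative errno value
 * on failure, otherwise 1 when SILO mode is active and 0 when it is not.
 * The real code computes silo_active as
 * (xsm_bootparam == XSM_BOOTPARAM_SILO). */
static int xsm_dt_init_sketch(int ret, int silo_active)
{
    /* The patch spells this `ret ?: silo_active`, GCC's conditional
     * expression with the middle operand omitted. */
    return ret ? ret : silo_active;
}
```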
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -39,6 +39,7 @@ #include <xen/trace.h> #include <xen/libfdt/libfdt.h> #include <xen/acpi.h> +#include <xen/warning.h> #include <asm/alternative.h> #include <asm/page.h> #include <asm/current.h> @@ -834,8 +835,11 @@ void __init start_xen(unsigned long boot tasklet_subsys_init(); - - xsm_dt_init(); + if ( xsm_dt_init() != 1 ) + warning_add("WARNING: SILO mode is not enabled.\n" + "It has implications on the security of the system,\n" + "unless the communications have been forbidden between\n" + "untrusted domains.\n"); init_maintenance_interrupt(); init_timer_interrupt(); --- a/xen/common/Kconfig +++ b/xen/common/Kconfig @@ -106,7 +106,7 @@ config XENOPROF config XSM bool "Xen Security Modules support" - default n + default ARM ---help--- Enables the security framework known as Xen Security Modules which allows administrators fine-grained control over a Xen domain and @@ -171,6 +171,7 @@ config XSM_SILO choice prompt "Default XSM implementation" depends on XSM + default XSM_SILO_DEFAULT if XSM_SILO && ARM default XSM_FLASK_DEFAULT if XSM_FLASK default XSM_SILO_DEFAULT if XSM_SILO default XSM_DUMMY_DEFAULT --- a/xen/include/xsm/xsm.h +++ b/xen/include/xsm/xsm.h @@ -741,6 +741,11 @@ extern int xsm_multiboot_policy_init(uns #endif #ifdef CONFIG_HAS_DEVICE_TREE +/* + * Initialize XSM + * + * On success, return 1 if using SILO mode else 0. 
+ */ extern int xsm_dt_init(void); extern int xsm_dt_policy_init(void **policy_buffer, size_t *policy_size); extern bool has_xsm_magic(paddr_t); --- a/xen/xsm/xsm_core.c +++ b/xen/xsm/xsm_core.c @@ -167,7 +167,7 @@ int __init xsm_dt_init(void) xfree(policy_buffer); - return ret; + return ret ?: (xsm_bootparam == XSM_BOOTPARAM_SILO); } /** ++++++ 5d03a0c4-B-bitops-guest-helpers.patch ++++++ # Commit 6a5f01a57a662565e6aa63fc9f3081fa69e54465 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/bitops: Provide helpers to safely modify guest memory atomically On Arm, exclusive load-store atomics should only be used between trusted threads. As not all guests are trusted, it may be possible to DoS Xen when atomically updating memory shared with a guest. This patch adds a new set of helpers that update guest memory safely. For x86, it is already possible to use the current helpers safely, so just wrap them. For Arm, we first attempt to update the guest memory with a loop bounded by a maximum number of iterations. If that fails, we pause the domain and try again. Note that this heuristic assumes that a page can only be shared between Xen and one domain, not between Xen and multiple domains. The maximum number of iterations is based on how many times a simple load-store atomic operation can be executed in 1uS. The maximum value is per-CPU, to cater for big.LITTLE, and is calculated when the CPU is booting. The heuristic was chosen somewhat arbitrarily and can be modified if it impacts well-behaved guests too much. Note that while test_bit does not require an atomic operation, a wrapper for test_bit was added for completeness. In this case, the domain stays const to avoid major rework in the callers for the time being. This is part of XSA-295. 
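(Editor's note: the pause-and-retry fallback this patch describes can be sketched with stubs in place of the Xen primitives. The fast path is the bounded helper; when it times out — simulated here by a stub that always fails, as if a guest were hammering the cache line — the domain is paused so the plain helper can complete undisturbed. All names are illustrative.)

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the Xen primitives used by guest_set_bit(). */
struct domain { int pause_count; };

static void domain_pause_nosync(struct domain *d) { d->pause_count++; }
static void domain_unpause(struct domain *d)      { d->pause_count--; }

static void set_bit_stub(int nr, volatile unsigned long *p)
{
    *p |= 1UL << nr;
}

/* Stub bounded helper that always times out. */
static bool set_bit_timeout_stub(int nr, volatile unsigned long *p,
                                 unsigned int max_try)
{
    (void)nr; (void)p; (void)max_try;
    return false;
}

/* Mirrors the shape of the guest_bitop(name) wrapper in the patch:
 * bounded fast path first, pause-domain-and-retry as the slow path. */
static void guest_set_bit_sketch(struct domain *d, int nr,
                                 volatile unsigned long *p)
{
    if (set_bit_timeout_stub(nr, p, 16 /* guest_safe_atomic_max */))
        return;

    domain_pause_nosync(d);
    set_bit_stub(nr, p);
    domain_unpause(d);
}
```

Pausing works as a fallback only because a page is assumed to be shared between Xen and a single domain: once that domain's vCPUs are stopped, nothing else can keep the exclusive update from succeeding.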
Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/arch/arm/Makefile +++ b/xen/arch/arm/Makefile @@ -22,6 +22,7 @@ obj-$(CONFIG_GICV3) += gic-v3.o obj-$(CONFIG_HAS_ITS) += gic-v3-its.o obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o obj-y += guestcopy.o +obj-y += guest_atomics.o obj-y += guest_walk.o obj-y += hvm.o obj-y += io.o --- /dev/null +++ b/xen/arch/arm/guest_atomics.c @@ -0,0 +1,91 @@ +/* + * arch/arm/guest_atomics.c + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; If not, see http://www.gnu.org/licenses/. + */ +#include <xen/cpu.h> + +#include <asm/guest_atomics.h> + +DEFINE_PER_CPU_READ_MOSTLY(unsigned int, guest_safe_atomic_max); + +/* + * Heuristic to find a safe upper-limit for load-store exclusive + * operations on memory shared with guest. + * + * At the moment, we calculate the number of iterations of a simple + * load-store atomic loop in 1uS. 
+ */ +static void calibrate_safe_atomic(void) +{ + s_time_t deadline = NOW() + MICROSECS(1); + unsigned int counter = 0; + unsigned long mem = 0; + + do + { + unsigned long res, tmp; + +#ifdef CONFIG_ARM_32 + asm volatile (" ldrex %2, %1\n" + " add %2, %2, #1\n" + " strex %0, %2, %1\n" + : "=&r" (res), "+Q" (mem), "=&r" (tmp)); +#else + asm volatile (" ldxr %w2, %1\n" + " add %w2, %w2, #1\n" + " stxr %w0, %w2, %1\n" + : "=&r" (res), "+Q" (mem), "=&r" (tmp)); +#endif + counter++; + } while (NOW() < deadline); + + this_cpu(guest_safe_atomic_max) = counter; + + printk(XENLOG_DEBUG + "CPU%u: Guest atomics will try %u times before pausing the domain\n", + smp_processor_id(), counter); +} + +static int cpu_guest_safe_atomic_callback(struct notifier_block *nfb, + unsigned long action, + void *hcpu) +{ + if ( action == CPU_STARTING ) + calibrate_safe_atomic(); + + return NOTIFY_DONE; +} + +static struct notifier_block cpu_guest_safe_atomic_nfb = { + .notifier_call = cpu_guest_safe_atomic_callback, +}; + +static int __init guest_safe_atomic_init(void) +{ + register_cpu_notifier(&cpu_guest_safe_atomic_nfb); + + calibrate_safe_atomic(); + + return 0; +} +presmp_initcall(guest_safe_atomic_init); + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ --- /dev/null +++ b/xen/include/asm-arm/guest_atomics.h @@ -0,0 +1,76 @@ +#ifndef _ARM_GUEST_ATOMICS_H +#define _ARM_GUEST_ATOMICS_H + +#include <xen/bitops.h> +#include <xen/sched.h> + +/* + * The guest atomics helpers shares the same logic. We first try to use + * the *_timeout version of the operation. If it didn't timeout, then we + * successfully updated the memory. Nothing else to do. + * + * If it did timeout, then it means we didn't manage to update the + * memory. This is possibly because the guest is misbehaving (i.e tight + * store loop) but can also happen for other reasons (i.e nested Xen). 
+ * In that case pause the domain and retry the operation, this time + * without a timeout. + * + * Note, those helpers rely on other part of the code to prevent sharing + * a page between Xen and multiple domain. + */ + +DECLARE_PER_CPU(unsigned int, guest_safe_atomic_max); + +#define guest_bitop(name) \ +static inline void guest_##name(struct domain *d, int nr, volatile void *p) \ +{ \ + if ( name##_timeout(nr, p, this_cpu(guest_safe_atomic_max)) ) \ + return; \ + \ + domain_pause_nosync(d); \ + name(nr, p); \ + domain_unpause(d); \ +} + +#define guest_testop(name) \ +static inline int guest_##name(struct domain *d, int nr, volatile void *p) \ +{ \ + bool succeed; \ + int oldbit; \ + \ + succeed = name##_timeout(nr, p, &oldbit, \ + this_cpu(guest_safe_atomic_max)); \ + if ( succeed ) \ + return oldbit; \ + \ + domain_pause_nosync(d); \ + oldbit = name(nr, p); \ + domain_unpause(d); \ + \ + return oldbit; \ +} + +guest_bitop(set_bit) +guest_bitop(clear_bit) +guest_bitop(change_bit) + +#undef guest_bitop + +/* test_bit does not use load-store atomic operations */ +#define guest_test_bit(d, nr, p) ((void)(d), test_bit(nr, p)) + +guest_testop(test_and_set_bit) +guest_testop(test_and_clear_bit) +guest_testop(test_and_change_bit) + +#undef guest_testop + +#endif /* _ARM_GUEST_ATOMICS_H */ +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ --- /dev/null +++ b/xen/include/asm-x86/guest_atomics.h @@ -0,0 +1,30 @@ +#ifndef _X86_GUEST_ATOMICS_H +#define _X86_GUEST_ATOMICS_H + +#include <xen/bitops.h> + +/* + * It is safe to use the atomics helpers on x86 on memory shared with + * the guests. 
+ */ +#define guest_set_bit(d, nr, p) ((void)(d), set_bit(nr, p)) +#define guest_clear_bit(d, nr, p) ((void)(d), clear_bit(nr, p)) +#define guest_change_bit(d, nr, p) ((void)(d), change_bit(nr, p)) +#define guest_test_bit(d, nr, p) ((void)(d), test_bit(nr, p)) + +#define guest_test_and_set_bit(d, nr, p) \ + ((void)(d), test_and_set_bit(nr, p)) +#define guest_test_and_clear_bit(d, nr, p) \ + ((void)(d), test_and_clear_bit(nr, p)) +#define guest_test_and_change_bit(d, nr, p) \ + ((void)(d), test_and_change_bit(nr, p)) + +#endif /* _X86_GUEST_ATOMICS_H */ +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ ++++++ 5d03a0c4-C-cmpxchg-guest-helpers.patch ++++++ # Commit c7fd09cb491793ba0cf4c91f94ae9674e841f28a # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/cmpxchg: Provide helper to safely modify guest memory atomically On Arm, exclusive load-store atomics should only be used between trusted thread. As not all the guests are trusted, it may be possible to DoS Xen when updating shared memory with guest atomically. This patch adds a new helper that will update the guest memory safely. For x86, it is already possible to use the current helper safely. So just wrap it. For Arm, we will first attempt to update the guest memory with the loop bounded by a maximum number of iterations. If it fails, we will pause the domain and try again. Note that this heuristics assumes that a page can only be shared between Xen and one domain. Not Xen and multiple domain. The maximum number of iterations is based on how many times a simple load-store atomic operation can be executed in 1uS. 
The maximum value is per-CPU to cater big.LITTLE and calculated when the CPU is booting. The heuristic was randomly chosen and can be modified if impact too much good-behaving guest. This is part of XSA-295. Signed-of-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Jan Beulich <jbeulich@suse.com> --- a/xen/include/asm-arm/guest_atomics.h +++ b/xen/include/asm-arm/guest_atomics.h @@ -65,6 +65,31 @@ guest_testop(test_and_change_bit) #undef guest_testop +static inline unsigned long __guest_cmpxchg(struct domain *d, + volatile void *ptr, + unsigned long old, + unsigned long new, + unsigned int size) +{ + unsigned long oldval = old; + + if ( __cmpxchg_mb_timeout(ptr, &oldval, new, size, + this_cpu(guest_safe_atomic_max)) ) + return oldval; + + domain_pause_nosync(d); + oldval = __cmpxchg_mb(ptr, old, new, size); + domain_unpause(d); + + return oldval; +} + +#define guest_cmpxchg(d, ptr, o, n) \ + ((__typeof__(*(ptr)))__guest_cmpxchg(d, ptr, \ + (unsigned long)(o),\ + (unsigned long)(n),\ + sizeof (*(ptr)))) + #endif /* _ARM_GUEST_ATOMICS_H */ /* * Local variables: --- a/xen/include/asm-x86/guest_atomics.h +++ b/xen/include/asm-x86/guest_atomics.h @@ -19,6 +19,8 @@ #define guest_test_and_change_bit(d, nr, p) \ ((void)(d), test_and_change_bit(nr, p)) +#define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, o, n)) + #endif /* _X86_GUEST_ATOMICS_H */ /* * Local variables: ++++++ 5d03a0c4-D-use-guest-atomics-helpers.patch ++++++ # Commit 4b0c004beb22777572c1b8bfd1404caddfb268f0 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen: Use guest atomics helpers when modifying atomically guest memory On Arm, exclusive load-store atomics should only be used between trusted thread. As not all the guests are trusted, it may be possible to DoS Xen when updating shared memory with guest atomically. 
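The shape these patches give to every guest-facing atomic — a bounded attempt on guest-shared memory, then pause the domain and retry without a bound — can be sketched in portable C11. Everything below (`cmpxchg_timeout_sketch`, the stubbed pause/unpause calls, the retry budget of 128) is illustrative, not Xen's actual API; `atomic_compare_exchange_weak`'s permitted spurious failure stands in for a failed store-exclusive:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for Xen's domain_pause_nosync()/domain_unpause(). */
static void fake_domain_pause(void)   { }
static void fake_domain_unpause(void) { }

/*
 * Bounded attempt: *oldp holds the expected value on entry and the value
 * actually observed on exit.  Returns false if the exchange could not be
 * completed within max_try iterations (compare_exchange_weak may fail
 * spuriously, which here models a failed STXR).
 */
static bool cmpxchg_timeout_sketch(atomic_ulong *ptr, unsigned long *oldp,
                                   unsigned long newval, unsigned int max_try)
{
    for ( ; max_try; max_try-- )
    {
        unsigned long seen = *oldp;

        if ( atomic_compare_exchange_weak(ptr, &seen, newval) )
            return true;            /* exchanged */
        if ( seen != *oldp )
        {
            *oldp = seen;           /* completed: the values differed */
            return true;
        }
        /* spurious failure: retry; the guest may be hammering the word */
    }
    return false;
}

/* Bounded loop first; on timeout, "pause the domain" and redo the
 * operation without a bound -- the same shape as guest_cmpxchg(). */
static unsigned long guest_cmpxchg_sketch(atomic_ulong *ptr,
                                          unsigned long old,
                                          unsigned long newval)
{
    unsigned long oldval = old;

    if ( cmpxchg_timeout_sketch(ptr, &oldval, newval, 128 /* illustrative */) )
        return oldval;

    fake_domain_pause();
    atomic_compare_exchange_strong(ptr, &oldval, newval); /* unbounded form */
    fake_domain_unpause();

    return oldval;
}
```

With this split, the common case costs only the bounded loop; the domain is paused only when it keeps the exclusive monitor busy for longer than the calibrated budget.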
This patch replaces all the atomics operations on shared memory with a guest by the new guest atomics helpers. The x86 code was not audited to know where guest atomics helpers could be used. I will leave that to the x86 folks. Note that some rework was required in order to plumb use the new guest atomics in event channel and grant-table. Because guest_test_bit is ignoring the parameter "d" for now, it means there a lot of places do not need to drop the const. We may want to revisit this in the future if the parameter "d" becomes necessary. This is part of XSA-295. Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/arch/arm/domain.c +++ b/xen/arch/arm/domain.c @@ -27,6 +27,7 @@ #include <asm/event.h> #include <asm/gic.h> #include <asm/guest_access.h> +#include <asm/guest_atomics.h> #include <asm/irq.h> #include <asm/p2m.h> #include <asm/platform.h> @@ -1017,7 +1018,7 @@ void arch_dump_vcpu_info(struct vcpu *v) void vcpu_mark_events_pending(struct vcpu *v) { - int already_pending = test_and_set_bit( + bool already_pending = guest_test_and_set_bit(v->domain, 0, (unsigned long *)&vcpu_info(v, evtchn_upcall_pending)); if ( already_pending ) --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -40,6 +40,8 @@ #include <xen/pfn.h> #include <xen/sizes.h> #include <xen/libfdt/libfdt.h> + +#include <asm/guest_atomics.h> #include <asm/setup.h> struct domain *dom_xen, *dom_io, *dom_cow; @@ -1380,7 +1382,7 @@ void put_page_type(struct page_info *pag return; } -void gnttab_clear_flag(unsigned long nr, uint16_t *addr) +void gnttab_clear_flag(struct domain *d, unsigned long nr, uint16_t *addr) { /* * Note that this cannot be clear_bit(), as the access must be @@ -1390,7 +1392,7 @@ void gnttab_clear_flag(unsigned long nr, do { old = *addr; - } while (cmpxchg(addr, old, old & mask) != old); + } while (guest_cmpxchg(d, addr, old, old & mask) != old); } void gnttab_mark_dirty(struct domain *d, mfn_t mfn) --- 
a/xen/common/event_2l.c +++ b/xen/common/event_2l.c @@ -13,6 +13,8 @@ #include <xen/sched.h> #include <xen/event.h> +#include <asm/guest_atomics.h> + static void evtchn_2l_set_pending(struct vcpu *v, struct evtchn *evtchn) { struct domain *d = v->domain; @@ -25,12 +27,12 @@ static void evtchn_2l_set_pending(struct * others may require explicit memory barriers. */ - if ( test_and_set_bit(port, &shared_info(d, evtchn_pending)) ) + if ( guest_test_and_set_bit(d, port, &shared_info(d, evtchn_pending)) ) return; - if ( !test_bit (port, &shared_info(d, evtchn_mask)) && - !test_and_set_bit(port / BITS_PER_EVTCHN_WORD(d), - &vcpu_info(v, evtchn_pending_sel)) ) + if ( !guest_test_bit(d, port, &shared_info(d, evtchn_mask)) && + !guest_test_and_set_bit(d, port / BITS_PER_EVTCHN_WORD(d), + &vcpu_info(v, evtchn_pending_sel)) ) { vcpu_mark_events_pending(v); } @@ -40,7 +42,7 @@ static void evtchn_2l_set_pending(struct static void evtchn_2l_clear_pending(struct domain *d, struct evtchn *evtchn) { - clear_bit(evtchn->port, &shared_info(d, evtchn_pending)); + guest_clear_bit(d, evtchn->port, &shared_info(d, evtchn_pending)); } static void evtchn_2l_unmask(struct domain *d, struct evtchn *evtchn) @@ -52,10 +54,10 @@ static void evtchn_2l_unmask(struct doma * These operations must happen in strict order. Based on * evtchn_2l_set_pending() above. 
*/ - if ( test_and_clear_bit(port, &shared_info(d, evtchn_mask)) && - test_bit (port, &shared_info(d, evtchn_pending)) && - !test_and_set_bit (port / BITS_PER_EVTCHN_WORD(d), - &vcpu_info(v, evtchn_pending_sel)) ) + if ( guest_test_and_clear_bit(d, port, &shared_info(d, evtchn_mask)) && + guest_test_bit(d, port, &shared_info(d, evtchn_pending)) && + !guest_test_and_set_bit(d, port / BITS_PER_EVTCHN_WORD(d), + &vcpu_info(v, evtchn_pending_sel)) ) { vcpu_mark_events_pending(v); } @@ -66,7 +68,8 @@ static bool evtchn_2l_is_pending(const s unsigned int max_ports = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); ASSERT(port < max_ports); - return port < max_ports && test_bit(port, &shared_info(d, evtchn_pending)); + return (port < max_ports && + guest_test_bit(d, port, &shared_info(d, evtchn_pending))); } static bool evtchn_2l_is_masked(const struct domain *d, evtchn_port_t port) @@ -74,7 +77,8 @@ static bool evtchn_2l_is_masked(const st unsigned int max_ports = BITS_PER_EVTCHN_WORD(d) * BITS_PER_EVTCHN_WORD(d); ASSERT(port < max_ports); - return port >= max_ports || test_bit(port, &shared_info(d, evtchn_mask)); + return (port >= max_ports || + guest_test_bit(d, port, &shared_info(d, evtchn_mask))); } static void evtchn_2l_print_state(struct domain *d, --- a/xen/common/event_fifo.c +++ b/xen/common/event_fifo.c @@ -17,6 +17,8 @@ #include <xen/mm.h> #include <xen/domain_page.h> +#include <asm/guest_atomics.h> + #include <public/event_channel.h> static inline event_word_t *evtchn_fifo_word_from_port(const struct domain *d, @@ -51,7 +53,7 @@ static void evtchn_fifo_init(struct doma * on the wrong VCPU or with an unexpected priority. 
*/ word = evtchn_fifo_word_from_port(d, evtchn->port); - if ( word && test_bit(EVTCHN_FIFO_LINKED, word) ) + if ( word && guest_test_bit(d, EVTCHN_FIFO_LINKED, word) ) gdprintk(XENLOG_WARNING, "domain %d, port %d already on a queue\n", d->domain_id, evtchn->port); } @@ -116,7 +118,7 @@ static int try_set_link(event_word_t *wo * We block unmasking by the guest by marking the tail word as BUSY, * therefore, the cmpxchg() may fail at most 4 times. */ -static bool_t evtchn_fifo_set_link(const struct domain *d, event_word_t *word, +static bool_t evtchn_fifo_set_link(struct domain *d, event_word_t *word, uint32_t link) { event_word_t w; @@ -130,7 +132,7 @@ static bool_t evtchn_fifo_set_link(const return ret; /* Lock the word to prevent guest unmasking. */ - set_bit(EVTCHN_FIFO_BUSY, word); + guest_set_bit(d, EVTCHN_FIFO_BUSY, word); w = read_atomic(word); @@ -140,13 +142,13 @@ static bool_t evtchn_fifo_set_link(const if ( ret >= 0 ) { if ( ret == 0 ) - clear_bit(EVTCHN_FIFO_BUSY, word); + guest_clear_bit(d, EVTCHN_FIFO_BUSY, word); return ret; } } gdprintk(XENLOG_WARNING, "domain %d, port %d not linked\n", d->domain_id, link); - clear_bit(EVTCHN_FIFO_BUSY, word); + guest_clear_bit(d, EVTCHN_FIFO_BUSY, word); return 1; } @@ -171,13 +173,13 @@ static void evtchn_fifo_set_pending(stru return; } - was_pending = test_and_set_bit(EVTCHN_FIFO_PENDING, word); + was_pending = guest_test_and_set_bit(d, EVTCHN_FIFO_PENDING, word); /* * Link the event if it unmasked and not already linked. 
*/ - if ( !test_bit(EVTCHN_FIFO_MASKED, word) - && !test_bit(EVTCHN_FIFO_LINKED, word) ) + if ( !guest_test_bit(d, EVTCHN_FIFO_MASKED, word) && + !guest_test_bit(d, EVTCHN_FIFO_LINKED, word) ) { struct evtchn_fifo_queue *q, *old_q; event_word_t *tail_word; @@ -206,7 +208,7 @@ static void evtchn_fifo_set_pending(stru if ( !old_q ) goto done; - if ( test_and_set_bit(EVTCHN_FIFO_LINKED, word) ) + if ( guest_test_and_set_bit(d, EVTCHN_FIFO_LINKED, word) ) { spin_unlock_irqrestore(&old_q->lock, flags); goto done; @@ -252,8 +254,8 @@ static void evtchn_fifo_set_pending(stru spin_unlock_irqrestore(&q->lock, flags); if ( !linked - && !test_and_set_bit(q->priority, - &v->evtchn_fifo->control_block->ready) ) + && !guest_test_and_set_bit(d, q->priority, + &v->evtchn_fifo->control_block->ready) ) vcpu_mark_events_pending(v); } done: @@ -275,7 +277,7 @@ static void evtchn_fifo_clear_pending(st * No need to unlink as the guest will unlink and ignore * non-pending events. */ - clear_bit(EVTCHN_FIFO_PENDING, word); + guest_clear_bit(d, EVTCHN_FIFO_PENDING, word); } static void evtchn_fifo_unmask(struct domain *d, struct evtchn *evtchn) @@ -287,10 +289,10 @@ static void evtchn_fifo_unmask(struct do if ( unlikely(!word) ) return; - clear_bit(EVTCHN_FIFO_MASKED, word); + guest_clear_bit(d, EVTCHN_FIFO_MASKED, word); /* Relink if pending. 
*/ - if ( test_bit(EVTCHN_FIFO_PENDING, word) ) + if ( guest_test_bit(d, EVTCHN_FIFO_PENDING, word) ) evtchn_fifo_set_pending(v, evtchn); } @@ -298,21 +300,21 @@ static bool evtchn_fifo_is_pending(const { const event_word_t *word = evtchn_fifo_word_from_port(d, port); - return word && test_bit(EVTCHN_FIFO_PENDING, word); + return word && guest_test_bit(d, EVTCHN_FIFO_PENDING, word); } static bool_t evtchn_fifo_is_masked(const struct domain *d, evtchn_port_t port) { const event_word_t *word = evtchn_fifo_word_from_port(d, port); - return !word || test_bit(EVTCHN_FIFO_MASKED, word); + return !word || guest_test_bit(d, EVTCHN_FIFO_MASKED, word); } static bool_t evtchn_fifo_is_busy(const struct domain *d, evtchn_port_t port) { const event_word_t *word = evtchn_fifo_word_from_port(d, port); - return word && test_bit(EVTCHN_FIFO_LINKED, word); + return word && guest_test_bit(d, EVTCHN_FIFO_LINKED, word); } static int evtchn_fifo_set_priority(struct domain *d, struct evtchn *evtchn, @@ -339,11 +341,11 @@ static void evtchn_fifo_print_state(stru word = evtchn_fifo_word_from_port(d, evtchn->port); if ( !word ) printk("? "); - else if ( test_bit(EVTCHN_FIFO_LINKED, word) ) - printk("%c %-4u", test_bit(EVTCHN_FIFO_BUSY, word) ? 'B' : ' ', + else if ( guest_test_bit(d, EVTCHN_FIFO_LINKED, word) ) + printk("%c %-4u", guest_test_bit(d, EVTCHN_FIFO_BUSY, word) ? 'B' : ' ', *word & EVTCHN_FIFO_LINK_MASK); else - printk("%c - ", test_bit(EVTCHN_FIFO_BUSY, word) ? 'B' : ' '); + printk("%c - ", guest_test_bit(d, EVTCHN_FIFO_BUSY, word) ? 
'B' : ' '); } static const struct evtchn_port_ops evtchn_port_ops_fifo = @@ -495,7 +497,7 @@ static void setup_ports(struct domain *d evtchn = evtchn_from_port(d, port); - if ( test_bit(port, &shared_info(d, evtchn_pending)) ) + if ( guest_test_bit(d, port, &shared_info(d, evtchn_pending)) ) evtchn->pending = 1; evtchn_fifo_set_priority(d, evtchn, EVTCHN_FIFO_PRIORITY_DEFAULT); --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -39,6 +39,7 @@ #include <xen/vmap.h> #include <xsm/xsm.h> #include <asm/flushtlb.h> +#include <asm/guest_atomics.h> /* Per-domain grant information. */ struct grant_table { @@ -646,6 +647,7 @@ static unsigned int nr_grant_entries(str } static int _set_status_v1(const grant_entry_header_t *shah, + struct domain *rd, struct active_grant_entry *act, int readonly, int mapflag, @@ -701,8 +703,8 @@ static int _set_status_v1(const grant_en "Attempt to write-pin a r/o grant entry\n"); } - prev_scombo.word = cmpxchg((u32 *)shah, - scombo.word, new_scombo.word); + prev_scombo.word = guest_cmpxchg(rd, (u32 *)shah, + scombo.word, new_scombo.word); if ( likely(prev_scombo.word == scombo.word) ) break; @@ -719,6 +721,7 @@ done: static int _set_status_v2(const grant_entry_header_t *shah, grant_status_t *status, + struct domain *rd, struct active_grant_entry *act, int readonly, int mapflag, @@ -781,8 +784,8 @@ static int _set_status_v2(const grant_en (id != ldomid) || (!readonly && (flags & GTF_readonly)) ) { - gnttab_clear_flag(_GTF_writing, status); - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_writing, status); + gnttab_clear_flag(rd, _GTF_reading, status); PIN_FAIL(done, GNTST_general_error, "Unstable flags (%x) or dom (%d); expected d%d (r/w: %d)\n", flags, id, ldomid, !readonly); @@ -792,7 +795,7 @@ static int _set_status_v2(const grant_en { if ( unlikely(flags & GTF_readonly) ) { - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); PIN_FAIL(done, GNTST_general_error, 
"Unstable grant readonly flag\n"); } @@ -805,6 +808,7 @@ done: static int _set_status(const grant_entry_header_t *shah, grant_status_t *status, + struct domain *rd, unsigned rgt_version, struct active_grant_entry *act, int readonly, @@ -813,9 +817,9 @@ static int _set_status(const grant_entry { if ( rgt_version == 1 ) - return _set_status_v1(shah, act, readonly, mapflag, ldomid); + return _set_status_v1(shah, rd, act, readonly, mapflag, ldomid); else - return _set_status_v2(shah, status, act, readonly, mapflag, ldomid); + return _set_status_v2(shah, status, rd, act, readonly, mapflag, ldomid); } static struct active_grant_entry *grant_map_exists(const struct domain *ld, @@ -980,7 +984,7 @@ map_grant_ref( (!(op->flags & GNTMAP_readonly) && !(act->pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask))) ) { - if ( (rc = _set_status(shah, status, rgt->gt_version, act, + if ( (rc = _set_status(shah, status, rd, rgt->gt_version, act, op->flags & GNTMAP_readonly, 1, ld->domain_id) != GNTST_okay) ) goto act_release_out; @@ -1204,10 +1208,10 @@ map_grant_ref( unlock_out_clear: if ( !(op->flags & GNTMAP_readonly) && !(act->pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) ) - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); if ( !act->pin ) - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_reading, status); act_release_out: active_entry_release(act); @@ -1477,10 +1481,10 @@ unmap_common_complete(struct gnttab_unma if ( ((act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0) && !(op->done & GNTMAP_readonly) ) - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); if ( act->pin == 0 ) - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_reading, status); active_entry_release(act); grant_read_unlock(rgt); @@ -2045,8 +2049,8 @@ gnttab_prepare_for_transfer( new_scombo = scombo; new_scombo.shorts.flags |= GTF_transfer_committed; - prev_scombo.word = cmpxchg((u32 *)&sha->flags, - 
scombo.word, new_scombo.word); + prev_scombo.word = guest_cmpxchg(rd, (u32 *)&sha->flags, + scombo.word, new_scombo.word); if ( likely(prev_scombo.word == scombo.word) ) break; @@ -2339,11 +2343,11 @@ release_grant_for_copy( act->pin -= GNTPIN_hstw_inc; if ( !(act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) ) - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); } if ( !act->pin ) - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_reading, status); active_entry_release(act); grant_read_unlock(rgt); @@ -2365,14 +2369,15 @@ release_grant_for_copy( under the domain's grant table lock. */ /* Only safe on transitive grants. Even then, note that we don't attempt to drop any pin on the referent grant. */ -static void fixup_status_for_copy_pin(const struct active_grant_entry *act, +static void fixup_status_for_copy_pin(struct domain *rd, + const struct active_grant_entry *act, uint16_t *status) { if ( !(act->pin & (GNTPIN_hstw_mask | GNTPIN_devw_mask)) ) - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); if ( !act->pin ) - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_reading, status); } /* @@ -2434,7 +2439,7 @@ acquire_grant_for_copy( { if ( (!old_pin || (!readonly && !(old_pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)))) && - (rc = _set_status_v2(shah, status, act, readonly, 0, + (rc = _set_status_v2(shah, status, rd, act, readonly, 0, ldom)) != GNTST_okay ) goto unlock_out; @@ -2483,7 +2488,7 @@ acquire_grant_for_copy( if ( rc != GNTST_okay ) { - fixup_status_for_copy_pin(act, status); + fixup_status_for_copy_pin(rd, act, status); rcu_unlock_domain(td); active_entry_release(act); grant_read_unlock(rgt); @@ -2506,7 +2511,7 @@ acquire_grant_for_copy( !act->is_sub_page)) ) { release_grant_for_copy(td, trans_gref, readonly); - fixup_status_for_copy_pin(act, status); + fixup_status_for_copy_pin(rd, act, status); rcu_unlock_domain(td); active_entry_release(act); 
grant_read_unlock(rgt); @@ -2535,7 +2540,7 @@ acquire_grant_for_copy( else if ( !old_pin || (!readonly && !(old_pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask))) ) { - if ( (rc = _set_status(shah, status, rgt->gt_version, act, + if ( (rc = _set_status(shah, status, rd, rgt->gt_version, act, readonly, 0, ldom)) != GNTST_okay ) goto unlock_out; @@ -2623,10 +2628,10 @@ acquire_grant_for_copy( unlock_out_clear: if ( !(readonly) && !(act->pin & (GNTPIN_hstw_mask | GNTPIN_devw_mask)) ) - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); if ( !act->pin ) - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_reading, status); unlock_out: active_entry_release(act); @@ -3661,11 +3666,11 @@ gnttab_release_mappings( } if ( (act->pin & (GNTPIN_devw_mask|GNTPIN_hstw_mask)) == 0 ) - gnttab_clear_flag(_GTF_writing, status); + gnttab_clear_flag(rd, _GTF_writing, status); } if ( act->pin == 0 ) - gnttab_clear_flag(_GTF_reading, status); + gnttab_clear_flag(rd, _GTF_reading, status); active_entry_release(act); grant_read_unlock(rgt); --- a/xen/include/asm-arm/grant_table.h +++ b/xen/include/asm-arm/grant_table.h @@ -14,7 +14,7 @@ struct grant_table_arch { gfn_t *status_gfn; }; -void gnttab_clear_flag(unsigned long nr, uint16_t *addr); +void gnttab_clear_flag(struct domain *d, unsigned long nr, uint16_t *addr); int create_grant_host_mapping(unsigned long gpaddr, mfn_t mfn, unsigned int flags, unsigned int cache_flags); #define gnttab_host_mapping_get_page_type(ro, ld, rd) (0) --- a/xen/include/asm-x86/grant_table.h +++ b/xen/include/asm-x86/grant_table.h @@ -64,7 +64,8 @@ static inline int replace_grant_host_map #define gnttab_mark_dirty(d, f) paging_mark_dirty((d), f) -static inline void gnttab_clear_flag(unsigned int nr, uint16_t *st) +static inline void gnttab_clear_flag(struct domain *d, unsigned int nr, + uint16_t *st) { /* * Note that this cannot be clear_bit(), as the access must be ++++++ 
5d03a0c4-E-Arm-add-perf-counters-in-guest-atomic-helpers.patch ++++++ # Commit 48584b4b90a9d4ff3fd2545822d487544b7d0718 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: Add performance counters in guest atomic helpers Add performance counters in guest atomic helpers to be able to detect whether a guest is often paused during the operations. This is part of XSA-295. Signed-off-by: Julien Grall <julien.grall@arm.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/include/asm-arm/guest_atomics.h +++ b/xen/include/asm-arm/guest_atomics.h @@ -24,9 +24,13 @@ DECLARE_PER_CPU(unsigned int, guest_safe #define guest_bitop(name) \ static inline void guest_##name(struct domain *d, int nr, volatile void *p) \ { \ + perfc_incr(atomics_guest); \ + \ if ( name##_timeout(nr, p, this_cpu(guest_safe_atomic_max)) ) \ return; \ \ + perfc_incr(atomics_guest_paused); \ + \ domain_pause_nosync(d); \ name(nr, p); \ domain_unpause(d); \ @@ -38,11 +42,15 @@ static inline int guest_##name(struct do bool succeed; \ int oldbit; \ \ + perfc_incr(atomics_guest); \ + \ succeed = name##_timeout(nr, p, &oldbit, \ this_cpu(guest_safe_atomic_max)); \ if ( succeed ) \ return oldbit; \ \ + perfc_incr(atomics_guest_paused); \ + \ domain_pause_nosync(d); \ oldbit = name(nr, p); \ domain_unpause(d); \ @@ -73,10 +81,14 @@ static inline unsigned long __guest_cmpx { unsigned long oldval = old; + perfc_incr(atomics_guest); + if ( __cmpxchg_mb_timeout(ptr, &oldval, new, size, this_cpu(guest_safe_atomic_max)) ) return oldval; + perfc_incr(atomics_guest_paused); + domain_pause_nosync(d); oldval = __cmpxchg_mb(ptr, old, new, size); domain_unpause(d); --- a/xen/include/asm-arm/perfc_defn.h +++ b/xen/include/asm-arm/perfc_defn.h @@ -73,6 +73,9 @@ PERFCOUNTER(phys_timer_irqs, "Physical PERFCOUNTER(virt_timer_irqs, "Virtual timer interrupts") PERFCOUNTER(maintenance_irqs, "Maintenance interrupts") 
+PERFCOUNTER(atomics_guest, "atomics: guest access") +PERFCOUNTER(atomics_guest_paused, "atomics: guest paused") + /*#endif*/ /* __XEN_PERFC_DEFN_H__ */ /* ++++++ 5d03a0c4-F-Arm-protect-gnttab_clear_flag.patch ++++++ # Commit 70d2f27b592bfcf76750b9fed5906e53423eebd7 # Date 2019-06-14 14:27:32 +0100 # Author Julien Grall <julien.grall@arm.com> # Committer Julien Grall <julien.grall@arm.com> xen/arm: grant-table: Protect gnttab_clear_flag against guest misbehavior The function gnttab_clear_flag is used to clear the access flags. On Arm, it is implemented using a loop and guest_cmpxchg. It is possible that guest_cmpxchg will always return a different value than old. This can happen if the guest updated the memory before Xen has time to do the exchange. Because of that, there are no way for to promise the loop will end. It is possible to make the current code safe by re-using the same principle as applied on the guest atomic helper. However this patch takes a different approach that should lead to more efficient code in the default case. A new helper is introduced to clear a set of bits on a 16-bits word. This should avoid a an extra loop to check cmpxchg succeeded. Note that a mask is used instead of a bit, so the helper can be re-used later on for clearing multiple flags at the same time. This is part of XSA-295. 
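The helper this patch introduces replaces a cmpxchg loop with a single read-modify-write that clears every bit in a 16-bit mask. In portable C11 the same effect (minus the Arm-specific bounded retry) is one `atomic_fetch_and` — a sketch, not the Xen code:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Clear every bit set in `mask` on a 16-bit word with one atomic
 * read-modify-write, returning the previous value.  This mirrors the
 * LDXRH/BIC/STXRH sequence the patch adds: the access stays confined
 * to the 2-byte status field, as gnttab_clear_flag() requires.
 */
static uint16_t clear_mask16_sketch(_Atomic uint16_t *p, uint16_t mask)
{
    return atomic_fetch_and(p, (uint16_t)~mask);
}
```

The gain over the old `do { old = *addr; } while (cmpxchg(...) != old)` loop is that a single read-modify-write can only be retried when the exclusive access itself fails, not every time the guest changes an unrelated bit in the word — so a misbehaving guest loses its easiest way to make the loop spin forever.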
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> --- a/xen/arch/arm/arm32/lib/bitops.c +++ b/xen/arch/arm/arm32/lib/bitops.c @@ -126,6 +126,41 @@ testop(test_and_change_bit, eor) testop(test_and_clear_bit, bic) testop(test_and_set_bit, orr) +static always_inline bool int_clear_mask16(uint16_t mask, volatile uint16_t *p, + bool timeout, unsigned int max_try) +{ + unsigned long res, tmp; + + prefetchw((const uint16_t *)p); + + do + { + asm volatile ("// int_clear_mask16\n" + " ldrexh %2, %1\n" + " bic %2, %2, %3\n" + " strexh %0, %2, %1\n" + : "=&r" (res), "+Qo" (*p), "=&r" (tmp) + : "r" (mask)); + + if ( !res ) + break; + } while ( !timeout || ((--max_try) > 0) ); + + return !res; +} + +void clear_mask16(uint16_t mask, volatile void *p) +{ + if ( !int_clear_mask16(mask, p, false, 0) ) + ASSERT_UNREACHABLE(); +} + +bool clear_mask16_timeout(uint16_t mask, volatile void *p, + unsigned int max_try) +{ + return int_clear_mask16(mask, p, true, max_try); +} + /* * Local variables: * mode: C --- a/xen/arch/arm/arm64/lib/bitops.c +++ b/xen/arch/arm/arm64/lib/bitops.c @@ -118,6 +118,39 @@ testop(test_and_change_bit, eor) testop(test_and_clear_bit, bic) testop(test_and_set_bit, orr) +static always_inline bool int_clear_mask16(uint16_t mask, volatile uint16_t *p, + bool timeout, unsigned int max_try) +{ + unsigned long res, tmp; + + do + { + asm volatile ("// int_clear_mask16\n" + " ldxrh %w2, %1\n" + " bic %w2, %w2, %w3\n" + " stxrh %w0, %w2, %1\n" + : "=&r" (res), "+Q" (*p), "=&r" (tmp) + : "r" (mask)); + + if ( !res ) + break; + } while ( !timeout || ((--max_try) > 0) ); + + return !res; +} + +void clear_mask16(uint16_t mask, volatile void *p) +{ + if ( !int_clear_mask16(mask, p, false, 0) ) + ASSERT_UNREACHABLE(); +} + +bool clear_mask16_timeout(uint16_t mask, volatile void *p, + unsigned int max_try) +{ + 
return int_clear_mask16(mask, p, true, max_try); +} + /* * Local variables: * mode: C --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1384,15 +1384,7 @@ void put_page_type(struct page_info *pag void gnttab_clear_flag(struct domain *d, unsigned long nr, uint16_t *addr) { - /* - * Note that this cannot be clear_bit(), as the access must be - * confined to the specified 2 bytes. - */ - uint16_t mask = ~(1 << nr), old; - - do { - old = *addr; - } while (guest_cmpxchg(d, addr, old, old & mask) != old); + guest_clear_mask16(d, BIT(nr), addr); } void gnttab_mark_dirty(struct domain *d, mfn_t mfn) --- a/xen/include/asm-arm/bitops.h +++ b/xen/include/asm-arm/bitops.h @@ -53,6 +53,8 @@ int test_and_set_bit(int nr, volatile vo int test_and_clear_bit(int nr, volatile void *p); int test_and_change_bit(int nr, volatile void *p); +void clear_mask16(uint16_t mask, volatile void *p); + /* * The helpers below may fail to update the memory if the action takes * too long. @@ -71,6 +73,8 @@ bool test_and_clear_bit_timeout(int nr, int *oldbit, unsigned int max_try); bool test_and_change_bit_timeout(int nr, volatile void *p, int *oldbit, unsigned int max_try); +bool clear_mask16_timeout(uint16_t mask, volatile void *p, + unsigned int max_try); /** * __test_and_set_bit - Set a bit and return its old value --- a/xen/include/asm-arm/guest_atomics.h +++ b/xen/include/asm-arm/guest_atomics.h @@ -73,6 +73,19 @@ guest_testop(test_and_change_bit) #undef guest_testop +static inline void guest_clear_mask16(struct domain *d, uint16_t mask, + volatile uint16_t *p) +{ + perfc_incr(atomics_guest); + + if ( clear_mask16_timeout(mask, p, this_cpu(guest_safe_atomic_max)) ) + return; + + domain_pause_nosync(d); + clear_mask16(mask, p); + domain_unpause(d); +} + static inline unsigned long __guest_cmpxchg(struct domain *d, volatile void *ptr, unsigned long old, ++++++ README.SUSE ++++++ --- /var/tmp/diff_new_pack.GwU86x/_old 2019-08-07 13:55:00.812857093 +0200 +++ /var/tmp/diff_new_pack.GwU86x/_new 
2019-08-07 13:55:00.812857093 +0200 @@ -639,7 +639,7 @@ xen-devel@lists.xen.org If you find issues with the packaging or setup done by SUSE, please report it through bugzilla: - https://bugzilla.novell.com + https://bugzilla.suse.com ENJOY! ++++++ fix-xenpvnetboot.patch ++++++ References: bsc#1138563 --- xen-4.10.3-testing/tools/misc/xenpvnetboot.orig 2019-06-19 13:46:55.249857405 -0600 +++ xen-4.10.3-testing/tools/misc/xenpvnetboot 2019-06-19 13:57:43.148948352 -0600 @@ -89,7 +89,7 @@ class Fetcher: suffix = ''.join(random.sample(string.ascii_letters, 6)) local_name = os.path.join(self.tmpdir, 'xenpvboot.%s.%s' % (os.path.basename(filename), suffix)) try: - return request.urlretrieve(url, local_name) + return request.urlretrieve(url, local_name)[0] except Exception as err: raise RuntimeError('Cannot get file %s: %s' % (url, err)) @@ -284,7 +284,7 @@ Supported locations: sys.exit(1) sys.stdout.flush() - os.write(fd, output) + os.write(fd, output.encode('utf-8')) if __name__ == '__main__': ++++++ xen-tools.etc_pollution.patch ++++++ --- a/m4/paths.m4 +++ b/m4/paths.m4 @@ -137,7 +137,7 @@ AC_SUBST(INITD_DIR) XEN_CONFIG_DIR=$CONFIG_DIR/xen AC_SUBST(XEN_CONFIG_DIR) -XEN_SCRIPT_DIR=$XEN_CONFIG_DIR/scripts +XEN_SCRIPT_DIR=${LIBEXEC}/scripts AC_SUBST(XEN_SCRIPT_DIR) case "$host_os" in --- a/docs/man/xl-disk-configuration.5.pod +++ b/docs/man/xl-disk-configuration.5.pod @@ -257,7 +257,7 @@ automatically determine the most suitabl Specifies that B<target> is not a normal host path, but rather information to be interpreted by the executable program I<SCRIPT>, -(looked for in F</etc/xen/scripts>, if it doesn't contain a slash). +(looked for in F</usr/lib/xen/scripts>, if it doesn't contain a slash). These scripts are normally called "block-I<SCRIPT>". 
--- a/docs/man/xl.1.pod.in +++ b/docs/man/xl.1.pod.in @@ -560,7 +560,7 @@ See the corresponding option of the I<cr =item B<-N> I<netbufscript> Use <netbufscript> to setup network buffering instead of the -default script (/etc/xen/scripts/remus-netbuf-setup). +default script (/usr/lib/xen/scripts/remus-netbuf-setup). =item B<-F> --- a/docs/man/xl.conf.5.pod +++ b/docs/man/xl.conf.5.pod @@ -95,7 +95,7 @@ Configures the default hotplug script us The old B<vifscript> option is deprecated and should not be used. -Default: C</etc/xen/scripts/vif-bridge> +Default: C</usr/lib/xen/scripts/vif-bridge> =item B<vif.default.bridge="NAME"> @@ -121,13 +121,13 @@ Default: C<None> Configures the default script used by Remus to setup network buffering. -Default: C</etc/xen/scripts/remus-netbuf-setup> +Default: C</usr/lib/xen/scripts/remus-netbuf-setup> =item B<colo.default.proxyscript="PATH"> Configures the default script used by COLO to setup colo-proxy. -Default: C</etc/xen/scripts/colo-proxy-setup> +Default: C</usr/lib/xen/scripts/colo-proxy-setup> =item B<output_format="json|sxp"> --- a/docs/misc/block-scripts.txt +++ b/docs/misc/block-scripts.txt @@ -18,7 +18,7 @@ Setup It is highly recommended that custom hotplug scripts as much as possible include and use the common Xen functionality. If the script -is run from the normal block script location (/etc/xen/scripts by +is run from the normal block script location (/usr/lib/xen/scripts by default), then this can be done by adding the following to the top of the script: --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -580,7 +580,7 @@ struct cmd_spec cmd_table[] = { "-e Do not wait in the background (on <host>) for the death\n" " of the domain.\n" "-N <netbufscript> Use netbufscript to setup network buffering instead of the\n" - " default script (/etc/xen/scripts/remus-netbuf-setup).\n" + " default script (/usr/lib/xen/scripts/remus-netbuf-setup).\n" "-F Enable unsafe configurations [-b|-n|-d flags]. 
Use this option\n" " with caution as failover may not work as intended.\n" "-b Replicate memory checkpoints to /dev/null (blackhole).\n"
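Returning to the XSA-295 series above: the per-CPU limit used by the guest atomics helpers comes from calibrate_safe_atomic(), which counts how many load-store atomic iterations complete in 1uS. A portable C11 approximation of that heuristic — `clock_gettime` in place of Xen's NOW(), an atomic increment in place of the LDREX/STREX pair, names of my own choosing — looks like this:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdatomic.h>
#include <time.h>

/* Monotonic clock in nanoseconds, standing in for Xen's NOW(). */
static long long mono_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Count how many atomic increments complete before a 1uS deadline,
 * mimicking the loop shape of calibrate_safe_atomic(). */
static unsigned int calibrate_sketch(void)
{
    long long deadline = mono_ns() + 1000;   /* 1uS, as in the patch */
    atomic_ulong mem = 0;
    unsigned int counter = 0;

    do {
        atomic_fetch_add(&mem, 1);           /* one load-store atomic */
        counter++;
    } while ( mono_ns() < deadline );

    return counter;
}
```

On real hardware the result would be stored per CPU — big and LITTLE cores calibrate to different counts — which is exactly what the DEFINE_PER_CPU_READ_MOSTLY(guest_safe_atomic_max) variable in the patch provides.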