Jiri Wiesner changed bug 1220119
What Removed Added
Flags   needinfo?(claudio.fontana@suse.com)
CC   claudio.fontana@suse.com

Comment # 18 on bug 1220119 from Jiri Wiesner
Both openqaworker21 and openqaworker26 are 1-socket machines. This means that
there should not be any discrepancies between TSC reads from different CPUs
(possibly belonging to different sockets), which can be seen on 8-socket and
larger machines. So, there is no need for TSC warp corrections. The CPUs are
AMD EPYC 7773X with these flags: tsc, rdtscp, constant_tsc, nonstop_tsc,
tsc_scale. The criteria for disabling the clocksource watchdog on the TSC are
constant_tsc, nonstop_tsc and tsc_adjust, which means the TSC gets checked by
the watchdog by default on these machines.

The guest boot after which the systemd-networkd test failed happened on the
17th:
> $ grep 'PM: RTC' serial0-worker-openqaworker26.txt
> [    0.237369][    T1] PM: RTC time: 03:17:39, date: 2024-02-17
The KVM host was up since the 16th and there have been no clocksource watchdog
errors since the machine has booted up:
> $ grep 'PM: RTC\|clocksource' messages.txt
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484873504 ns
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1fb6ad8590d, max_idle_ns: 440795233723 ns
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: PM: RTC time: 03:32:35, date: 2024-02-16
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: Switched to clocksource tsc-early
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: tsc: Refined TSC clocksource calibration: 2199.998 MHz
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fb63109b96, max_idle_ns: 440795265316 ns
> 2024-02-16T03:32:56+0000 openqaworker26 kernel: clocksource: Switched to clocksource tsc
This means that the TSC counters work fine on the pCPUs. It is almost
impossible for the HPET (the watchdog) and the TSC (the clocksource) to be both
consistently wrong with the same time skew.

The log from the VM is mind-boggling. It seems as if the timer handler for the
clocksource watchdog that was supposed to expire around 315.260000 ran 4 times:
> $ grep 'PM: RTC\|clocksource' serial0-worker-openqaworker26.txt
> [    0.000001][    T0] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [    0.004859][    T0] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370452778343963 ns
> [    0.145004][    T0] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
> [    0.152187][    T0] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1fb63109b96, max_idle_ns: 440795265316 ns
> [    0.234335][    T1] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 6370867519511994 ns
> [    0.237369][    T1] PM: RTC time: 03:17:39, date: 2024-02-17
> [    0.526994][    T1] clocksource: Switched to clocksource kvm-clock
> [    0.537249][    T1] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
> [    0.647119][    T1] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fb63109b96, max_idle_ns: 440795265316 ns
The first execution of the handler:
> [  315.260520][    C0] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 44383091202 wd_nsec: 506649480
The second execution of the handler:
> [  315.260551][    C0] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
> [  315.262450][    C0] clocksource:                       'kvm-clock' wd_nsec: 506673719 wd_now: 4ae9f21fd6 wd_last: 4acbbee59f mask: ffffffffffffffff
> [  315.264225][    C0] clocksource:                       'tsc' cs_nsec: 656770889 cs_now: a5e0ab6d74 cs_last: a58a8c159a mask: ffffffffffffffff
> [  315.265736][    C0] clocksource:                       Clocksource 'tsc' skewed 150097170 ns (150 ms) over watchdog 'kvm-clock' interval of 506673719 ns (506 ms)
> [  315.267289][    C0] clocksource:                       'kvm-clock' (not 'tsc') is current clocksource.
> [  315.268301][    C0] tsc: Marking TSC unstable due to clocksource watchdog
The third execution of the handler:
> [  315.260531][    C0] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
> [  315.262425][    C0] clocksource:                       'kvm-clock' wd_nsec: 506655009 wd_now: 4ae9f1d6c0 wd_last: 4acbbee59f mask: ffffffffffffffff
> [  315.264455][    C0] clocksource:                       'tsc' cs_nsec: 748234030 cs_now: a5eca9c820 cs_last: a58a8c159a mask: ffffffffffffffff
> [  315.266134][    C0] clocksource:                       Clocksource 'tsc' skewed 241579021 ns (241 ms) over watchdog 'kvm-clock' interval of 506655009 ns (506 ms)
> [  315.268044][    C0] clocksource:                       'kvm-clock' (not 'tsc') is current clocksource.
> [  315.269278][    C0] tsc: Marking TSC unstable due to clocksource watchdog
The fourth execution of the handler:
> [  315.260529][    C0] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1438452798 wd_nsec: 506651809
I would like to investigate this later. I will need to take in account that the
guest has voluntary preemption enabled.

Cladio, could someone from your team please elaborate on whether it makes sense
at all to use the tsc clocksource in KVM guests? The tsc clocksource is not the
current clocksource (kvm-clock is) but watchdog checks are enabled for it. I
think the watchdog check errors are just noise because the tsc clocksource is
useless and the tsc clocksource or, at least, its watchdog checks should be
disabled.


You are receiving this mail because: