On Sun, 2016-09-04 at 18:26 +0200, Erwin Van de Velde wrote:
Hi all,
I appreciate the work done for testing TW and have to say that it rarely breaks, but if I look at the snapshot announcements, I can only conclude that the OpenQA report is not that useful.
E.g. last report: Passed: 54 Incomplete: 3 Soft Failure: 40 Failed: 6 -> Slightly more than half of the tests passed and still this is ok? If you start looking, that might be correct, because for some failed needles it is just the font that has changed or other minor cosmetic changes. However, this makes it hard to find real failures even with the softfailed/failed differentiation.
To get better feedback and useful tests, up-to-date needles are really required, in my opinion. How can we get there?
Kind regards, Erwin
Softfailed could as well be called 'limited pass': it means minor application tests are not passing, for various reasons (keeping needles up to date is a cat-and-mouse game, and they are being updated). But even with all needles updated, there are currently two apps causing most of those softfails:

* chrome -> new versions introduce keyring integration. The test needs to be adapted to this.
* vlc on KDE: a keyboard shortcut (alt-p) is assigned twice; most users probably won't notice, as they click there with the mouse. We could change the test to do the same, or keep working with upstream to solve the underlying issue.

Any single app causing a fail like this in a complete test run brings the overall test to 'softfail' (or 'limited pass', as a softfailed test is not considered for blocking releases of snapshots).

So in short: yes, we do try to minimize the softfail count as well, but without actually looking at the reasons for their occurrence, it's not always black and white what is wrong with them.

As to what can be done about it: work work work

Cheers,
Dominique
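
For illustration, the aggregation rule and the report numbers discussed in this thread can be sketched in a few lines of Python. This is only a sketch, not openQA's actual code: the label strings and the functions `overall_result` and `share` are hypothetical, and the rule that a hard failure takes precedence over a softfail is an assumption that the thread does not state.

```python
# Hypothetical result labels, chosen for this sketch only.
PASSED, SOFTFAILED, FAILED, INCOMPLETE = "passed", "softfailed", "failed", "incomplete"


def overall_result(app_results):
    """Combine per-application results into one result for a test run.

    Assumption: a hard failure outranks a softfail. The thread only says
    that a single softfailing app turns the whole run into 'softfail'.
    """
    if FAILED in app_results:
        return FAILED
    if SOFTFAILED in app_results:
        return SOFTFAILED
    return PASSED


def share(counts, *labels):
    """Fraction of all test runs whose result is one of the given labels."""
    total = sum(counts.values())
    return sum(counts[label] for label in labels) / total


# Numbers from the snapshot report quoted at the top of the thread.
report = {PASSED: 54, INCOMPLETE: 3, SOFTFAILED: 40, FAILED: 6}

# One softfailing app is enough to mark the whole run as softfailed.
print(overall_result([PASSED, SOFTFAILED, PASSED]))

# Strict passes only, versus passes plus 'limited passes'.
print(round(share(report, PASSED), 3))
print(round(share(report, PASSED, SOFTFAILED), 3))
```

Counting softfails as 'limited passes' moves the share from roughly 52% to roughly 91% of the 103 runs, which is why the raw "Passed" number alone looks worse than the snapshot actually is.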