[yast-devel] An idea for integration tests
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card. I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.

Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests

Cheers.
-- Ancor González Sosa openSUSE Team at SUSE Linux GmbH
-- To unsubscribe, e-mail: yast-devel+unsubscribe@opensuse.org To contact the owner, e-mail: yast-devel+owner@opensuse.org
On Fri, 26 Sep 2014 15:24:13 +0200 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card.
I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.
Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests
Cheers.
Hi, I'll answer the etherpad document here, as I have more high-level notes:

1) It does not make sense to me to separate two layers for integration testing. Integration testing tests the whole stack, i.e. whether clicking a button at the top level does what is expected. Integration testing usually has almost no mocking, or mocks only at the barriers (a barrier being a part outside of your control, which you cannot affect).

2) Your proposal with scr_server is almost exactly what the old testsuite did: testing at the SCR level. It has several significant drawbacks:

2a) It does not test the whole stack. If there is a bug in an agent, you do not catch it.

2b) It also does not test integration with the whole system. If some command stops working (e.g. an option is removed or a package is split), or if a configuration file changes its syntax, we won't notice it.

2c) It is a lot of work and hard to maintain. Consider how often we broke our old testsuite just by adding some new SCR calls.

3) Docker is not a solution for me, as systemd does not run inside Docker and almost all modules manipulate services in some way. So I think the only way is full virtualization.

Now let me comment on your requirements:
rollbacks
yes, it makes sense. When I played with cloud and KVM in the past, they had snapshotting abilities.
"inject" the yast module with its dependencies
I would be more strict here: install the YaST package. That ensures that installation works for the package, that it has all required packages, and it also tests the way a user would probably use it.
different initial systems
yes, it makes sense. The question is whether we need the systems prepared in advance, or whether we start from the same initial system and convert it to the target state, the way a customer would.
temporary system and captured output
Yes, it makes sense to me.

What other requirements should be there:

1) parallel run
Integration tests are quite slow and we need to speed them up as much as possible.

2) visible output
A common problem is that integration tests run overnight, so the next day everyone should be able to see what is broken and what we need to fix.

3) easy debugging
If something breaks in an overnight run, we need a way to find out what is broken.

4) reduced fragility
A common problem of integration testing is fragility, as tests break often. So we need an easy way to fix them or to make them more robust.

A few ideas I have about the topic:

- use a VNC library to work with the UI - https://code.google.com/p/ruby-vnc/
- use OCR to localize buttons by their text and get pointers where to click
- use the cloud for parallel runs, with its snapshotting ability
- create a tree of requirements and how to get into each state, like:

installation with partitioning A -> snapshot A -> install + run module M1 in scenario A -> install + run module M2 in scenario A
installation with partitioning B -> snapshot B -> install + run module M1 in scenario B -> install + run module M2 in scenario B
installation with partitioning C -> snapshot C -> install + run module M1 in scenario C

For a quick install we can use AutoYaST, and before we run a test we should verify that the system is really in the expected state.

- For a released product, use an already known snapshot; for a product before release, use the latest successful snapshot.

Josef
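The snapshot tree described above could be captured as plain data that a runner walks. Here is a hypothetical sketch in Ruby; the scenario names, the snapshot naming scheme and the runner loop are all invented for illustration:

```ruby
# Hypothetical sketch: the snapshot tree expressed as data, so a runner can
# restore the right snapshot before each module test and reuse base
# installations across tests. Scenario and module names are invented.
SCENARIOS = {
  "partitioning_A" => ["M1", "M2"],
  "partitioning_B" => ["M1", "M2"],
  "partitioning_C" => ["M1"]
}.freeze

# Expand the tree into a flat list of jobs; each job names the snapshot to
# restore and the module to install + run on top of it.
def build_jobs(scenarios)
  scenarios.flat_map do |scenario, modules|
    modules.map { |mod| { snapshot: "snapshot_#{scenario}", module: mod } }
  end
end

build_jobs(SCENARIOS).each do |job|
  # A real runner would restore the snapshot here (via libvirt, a cloud
  # API, etc.) and then install + run the module inside the VM.
  puts "restore #{job[:snapshot]}, then install + run #{job[:module]}"
end
```

Flattening the tree into independent jobs is also what would make the "parallel run" requirement easy: each job only needs its snapshot, so jobs can be handed to different workers.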
On 09/29/2014 09:02 AM, Josef Reidinger wrote:
On Fri, 26 Sep 2014 15:24:13 +0200 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card.
I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.
Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests
Cheers.
Hi, I answer to etherpad document here as I have more highlevel notes:
1) It does not make sense to me to separate two layers for integration testing. Integration testing tests the whole stack, i.e. whether clicking a button at the top level does what is expected. Integration testing usually has almost no mocking, or mocks only at the barriers (a barrier being a part outside of your control, which you cannot affect).
My fault. I made an error in the title of the mail. It should have been "an idea for functional tests" (I was sure I had used "functional" until I read your reply). That's why the etherpad is titled "integration and functional tests for Yast". I'm focusing first on the latter because I'm afraid there is no lightweight solution for the former.
2) Your proposal with scr_server is almost exactly what the old testsuite did: testing at the SCR level. It has several significant drawbacks:
2a) It does not test the whole stack. If there is a bug in an agent, you do not catch it.
Not all of them, but it will help to catch many of them, I think.
2b) It also does not test integration with the whole system. If some command stops working (e.g. an option is removed or a package is split), or if a configuration file changes its syntax, we won't notice it.
As I said, the solution is not really for integration tests (wrong mail subject). Even so, I'm not sure I get your point here. The goal is not to test that a given SCR command was called (that's what we already do with the current unit tests), but to check the status of the target system. Nothing stops us from easily checking whether a service is running, whether a given file is there, etc.
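To illustrate the difference between asserting on SCR calls and checking the target system's state, a minimal sketch; the `file_in_target?` helper, the path and the config content are invented, and a plain temporary directory stands in for the caged system:

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical sketch: instead of asserting that a given SCR call was made,
# run the module against a caged root and then check the resulting state of
# that system. A temporary directory stands in for the caged root here; the
# helper name file_in_target? is invented for illustration.
def file_in_target?(root, relative_path)
  File.exist?(File.join(root, relative_path))
end

Dir.mktmpdir do |root|
  # Imagine the module under test wrote its configuration through SCR,
  # with the target pointed at `root`...
  FileUtils.mkdir_p(File.join(root, "etc/sysconfig"))
  File.write(File.join(root, "etc/sysconfig/network"), "NETCONFIG_DNS_POLICY=\"auto\"\n")

  # ...then the test inspects the target system's state, not the calls:
  raise "config missing" unless file_in_target?(root, "etc/sysconfig/network")
end
```

The same style of check extends naturally to services or installed packages, as long as there is a way to query them inside the caged system.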
2c) It is a lot of work and hard to maintain. Consider how often we broke our old testsuite just by adding some new SCR calls.
I agree. Nevertheless, my impression was that agents do not change that much nowadays. In any case, there are plenty of tools out there to automate the creation of VMs/containers.
3) Docker is not a solution for me, as systemd does not run inside Docker and almost all modules manipulate services in some way. So I think the only way is full virtualization.
I was not aware of that drawback. I thought it was possible to run something "closer" to a full system inside a container. Well, time for a closer look at Pennyworth then.
Now let me comment on your requirements:
rollbacks
yes, it makes sense. When I played with cloud and KVM in the past, they had snapshotting abilities.
"inject" the yast module with its dependencies
I would be more strict here: install the YaST package. That ensures that installation works for the package, that it has all required packages, and it also tests the way a user would probably use it.
different initial systems
yes, it makes sense. The question is whether we need the systems prepared in advance, or whether we start from the same initial system and convert it to the target state, the way a customer would.
I was wondering the same. We should probably have some hybrid system: some "recipes" to create the systems, with some mechanism to reuse the results as long as they are reusable, to reduce the overhead.
temporary system and captured output
Yes, it makes sense to me.
What other requirements should be there:
1) parallel run
Integration tests are quite slow and we need to speed them up as much as possible.
2) visible output
A common problem is that integration tests run overnight, so the next day everyone should be able to see what is broken and what we need to fix.
3) easy debugging
If something breaks in an overnight run, we need a way to find out what is broken.
4) reduced fragility
A common problem of integration testing is fragility, as tests break often. So we need an easy way to fix them or to make them more robust.
A few ideas I have about the topic:
- use a VNC library to work with the UI - https://code.google.com/p/ruby-vnc/
That's what openQA does nowadays.
- use OCR to localize buttons by their text and get pointers where to click
openQA has some preliminary support for this as well. But we found out that the current image matching mechanism does a good-enough job (sometimes even better than OCR) in 99% of situations. We have not invested much effort in OCR since then.
- use the cloud for parallel runs, with its snapshotting ability
Sounds smart. openQA currently uses KVM for snapshots and we are trying to get the cloud team involved to achieve distribution capabilities (all workers must be on the same machine right now).
- create a tree of requirements and how to get into each state, like:
installation with partitioning A -> snapshot A -> install + run module M1 in scenario A -> install + run module M2 in scenario A
installation with partitioning B -> snapshot B -> install + run module M1 in scenario B -> install + run module M2 in scenario B
installation with partitioning C -> snapshot C -> install + run module M1 in scenario C
For a quick install we can use AutoYaST, and before we run a test we should verify that the system is really in the expected state.
Sounds like ideas that have been considered for openQA (not implemented because of the lack of manpower).
- For a released product, use an already known snapshot; for a product before release, use the latest successful snapshot.
+1. As said before, I was focusing more on lightweight solutions for functional tests. Your plan for comprehensive full integration tests that run overnight and test the installation process overlaps too much with openQA, IMHO. I know that openQA sucks at many things (tests written in dirty Perl not being the least of them), but a very high percentage of your idea sounds like reinventing a better openQA to me.

Cheers.
-- Ancor González Sosa openSUSE Team at SUSE Linux GmbH
On Mon, 29 Sep 2014 10:17:58 +0200 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
On 09/29/2014 09:02 AM, Josef Reidinger wrote:
On Fri, 26 Sep 2014 15:24:13 +0200 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card.
I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.
Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests
Cheers.
Hi, I answer to etherpad document here as I have more highlevel notes:
1) It does not make sense to me to separate two layers for integration testing. Integration testing tests the whole stack, i.e. whether clicking a button at the top level does what is expected. Integration testing usually has almost no mocking, or mocks only at the barriers (a barrier being a part outside of your control, which you cannot affect).
My fault. I made an error in the title of the mail. It should have been "an idea for functional tests" (I was sure I had used "functional" until I read your reply). That's why the etherpad is titled "integration and functional tests for Yast". I'm focusing first on the latter because I'm afraid there is no lightweight solution for the former.
OK, so let's say you want functional testing. The rules about barrier testing still apply, as functional testing is black-box testing. And having the barrier at the SCR level does not sound like a good idea to me, as it is still a part we control and one we more or less consider an implementation detail; what matters is which commands we run and which files we write. So for me it would be better to have something like https://github.com/defunkt/fakefs but system-wide; maybe there is something for FUSE?
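The fakefs idea can be illustrated with a hash-backed toy store. The real gem patches Ruby's File/Dir classes in-process, and a system-wide variant would indeed need something like a FUSE overlay; this class is only a sketch of the concept:

```ruby
# Minimal illustration of the fakefs idea: code under test "writes to disk",
# but everything lands in an in-memory store and the real system is never
# touched. This toy class is invented for illustration; it is not the gem.
class FakeFilesystem
  def initialize
    @files = {}
  end

  def write(path, content)
    @files[path] = content
  end

  def read(path)
    @files.fetch(path) { raise Errno::ENOENT, path }
  end

  def exist?(path)
    @files.key?(path)
  end
end

# The code under test believes it modified /etc/hosts:
fs = FakeFilesystem.new
fs.write("/etc/hosts", "127.0.0.1 localhost\n")
```

The appeal of doing this system-wide (e.g. via FUSE) is that even shelled-out commands would hit the fake layer, which an in-process patch like fakefs cannot intercept.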
2) Your proposal with scr_server is almost exactly what the old testsuite did: testing at the SCR level. It has several significant drawbacks:
2a) It does not test the whole stack. If there is a bug in an agent, you do not catch it.
Not all of them, but it will help to catch many of them, I think.
On the other hand, it also needs investment, so for me placing the barrier really outside of our code will help us achieve a better ratio of discovered bugs versus investment.
2b) It also does not test integration with the whole system. If some command stops working (e.g. an option is removed or a package is split), or if a configuration file changes its syntax, we won't notice it.
As I said, the solution is not really for integration tests (wrong mail subject). Even so, I'm not sure I get your point here. The goal is not to test that a given SCR command was called (that's what we already do with the current unit tests), but to check the status of the target system. Nothing stops us from easily checking whether a service is running, whether a given file is there, etc.
Then I probably don't get the idea with the SCR instance. How should it work?
2c) It is a lot of work and hard to maintain. Consider how often we broke our old testsuite just by adding some new SCR calls.
I agree. Nevertheless, my impression was that agents do not change that much nowadays. In any case, there are plenty of tools out there to automate the creation of VMs/containers.
yes, I agree.
3) Docker is not a solution for me, as systemd does not run inside Docker and almost all modules manipulate services in some way. So I think the only way is full virtualization.
I was not aware of that drawback. I thought it was possible to run something "closer" to a full system inside a container. Well, time for a closer look at Pennyworth then.
Well, to be more precise, it is possible, but there are some problems, at least according to http://serverfault.com/questions/607769/running-systemd-inside-a-docker-cont... Another drawback is that the hardware is the machine where it runs, so that can be a limiting factor.
different initial systems
yes, it makes sense. The question is whether we need the systems prepared in advance, or whether we start from the same initial system and convert it to the target state, the way a customer would.
I was wondering the same. We should probably have some hybrid system: some "recipes" to create the systems, with some mechanism to reuse the results as long as they are reusable, to reduce the overhead.
yes, with snapshotting it should be quite easy.
A few ideas I have about the topic:
- use a VNC library to work with the UI - https://code.google.com/p/ruby-vnc/
That's what openQA does nowadays.
Really? The last time I checked, it used the qemu console, which is horrible, so I proposed using VNC but saw not much activity there. If it is implemented using this, that's great.
- use OCR to localize buttons by their text and get pointers where to click
openQA has some preliminary support for this as well. But we found out that the current image matching mechanism does a good-enough job (sometimes even better than OCR) in 99% of situations. We have not invested much effort in OCR since then.
The problem is that this works for the installer, which is quite stable. If you play with the YaST GUI of your own module, then I think you break it with almost every commit. So in some cases OCR should bring an advantage.
- use the cloud for parallel runs, with its snapshotting ability
Sounds smart. openQA currently uses KVM for snapshots and we are trying to get the cloud team involved to achieve distribution capabilities (all workers must be on the same machine right now).
yes, I have also argued that this limitation is really limiting performance-wise.
- create a tree of requirements and how to get into each state, like:
installation with partitioning A -> snapshot A -> install + run module M1 in scenario A -> install + run module M2 in scenario A
installation with partitioning B -> snapshot B -> install + run module M1 in scenario B -> install + run module M2 in scenario B
installation with partitioning C -> snapshot C -> install + run module M1 in scenario C
For a quick install we can use AutoYaST, and before we run a test we should verify that the system is really in the expected state.
Sounds like ideas that have been considered for openQA (not implemented because of the lack of manpower).
- For a released product, use an already known snapshot; for a product before release, use the latest successful snapshot.
+1.
As said before, I was focusing more on lightweight solutions for functional tests. Your plan for comprehensive full integration tests that run overnight and test the installation process overlaps too much with openQA, IMHO. I know that openQA sucks at many things (tests written in dirty Perl not being the least of them), but a very high percentage of your idea sounds like reinventing a better openQA to me.
Maybe the solution for full integration tests is really just improving openQA. Then we can reserve some time to improve openQA to serve our needs. Maybe we could even have our own openQA server just for integration tests of YaST modules, and submit modules from the devel project to the target project only when they pass the integration tests.

What I miss about functional testing is: what benefits does it bring if we already have integration and unit testing, and do those benefits deserve the time investment? Maybe some summary of why it makes sense to add another level of tests.

Josef
On 09/29/2014 11:18 AM, Josef Reidinger wrote:
On Mon, 29 Sep 2014 10:17:58 +0200 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
On 09/29/2014 09:02 AM, Josef Reidinger wrote:
On Fri, 26 Sep 2014 15:24:13 +0200 Ancor Gonzalez Sosa <ancor@suse.de> wrote:
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card.
I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.
Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests
Cheers.
Hi, I answer to etherpad document here as I have more highlevel notes:
1) It does not make sense to me to separate two layers for integration testing. Integration testing tests the whole stack, i.e. whether clicking a button at the top level does what is expected. Integration testing usually has almost no mocking, or mocks only at the barriers (a barrier being a part outside of your control, which you cannot affect).
My fault. I made an error in the title of the mail. It should have been "an idea for functional tests" (I was sure I had used "functional" until I read your reply). That's why the etherpad is titled "integration and functional tests for Yast". I'm focusing first on the latter because I'm afraid there is no lightweight solution for the former.
OK, so let's say you want functional testing. The rules about barrier testing still apply, as functional testing is black-box testing. And having the barrier at the SCR level does not sound like a good idea to me, as it is still a part we control and one we more or less consider an implementation detail; what matters is which commands we run and which files we write. So for me it would be better to have something like https://github.com/defunkt/fakefs but system-wide; maybe there is something for FUSE?
2) Your proposal with scr_server is almost exactly what the old testsuite did: testing at the SCR level. It has several significant drawbacks:
2a) It does not test the whole stack. If there is a bug in an agent, you do not catch it.
Not all of them, but it will help to catch many of them, I think.
On the other hand, it also needs investment, so for me placing the barrier really outside of our code will help us achieve a better ratio of discovered bugs versus investment.
2b) It also does not test integration with the whole system. If some command stops working (e.g. an option is removed or a package is split), or if a configuration file changes its syntax, we won't notice it.
As I said, the solution is not really for integration tests (wrong mail subject). Even so, I'm not sure I get your point here. The goal is not to test that a given SCR command was called (that's what we already do with the current unit tests), but to check the status of the target system. Nothing stops us from easily checking whether a service is running, whether a given file is there, etc.
Then I probably don't get the idea with the SCR instance. How should it work?
The proposed solution is closer to the chroot we usually do in our tests by calling SCROpen("chroot=X") than to full integration tests like openQA. Calls to SCR will still have an effect on files, services and so on, but not on the host system running the testsuite; they act on a caged, volatile system instead. That's one step further than our current, heavily stubbed tests, without going into the load of a fully emulated system.
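A minimal sketch of such a caged volatile root: each test gets a throwaway root seeded from a template and discarded afterwards. The `with_caged_root` helper and the seeding mechanism are invented for illustration; in YaST terms the test would point SCR at the root, roughly like SCROpen("chroot=...").

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical sketch: build a throwaway root directory, seed it with a
# minimal template, run the test against it, and discard it afterwards.
# The helper name and the seeding format are invented for illustration.
def with_caged_root(template_files)
  Dir.mktmpdir("caged-root-") do |root|
    # Seed the volatile system with a minimal template.
    template_files.each do |path, content|
      full = File.join(root, path)
      FileUtils.mkdir_p(File.dirname(full))
      File.write(full, content)
    end
    yield root
  end # the caged system is removed here; the host is never touched
end

seen = nil
with_caged_root("etc/fstab" => "/dev/sda1 / ext4 defaults 0 1\n") do |root|
  # A real test would run the YaST client against this root via SCR and
  # then assert on the files it changed.
  seen = File.read(File.join(root, "etc/fstab"))
end
```

Because `Dir.mktmpdir` removes the directory when the block exits, a failing test cannot leave state behind that would poison the next run.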
2c) It is a lot of work and hard to maintain. Consider how often we broke our old testsuite just by adding some new SCR calls.
I agree. Nevertheless, my impression was that agents do not change that much nowadays. In any case, there are plenty of tools out there to automate the creation of VMs/containers.
yes, I agree.
3) Docker is not a solution for me, as systemd does not run inside Docker and almost all modules manipulate services in some way. So I think the only way is full virtualization.
I was not aware of that drawback. I thought it was possible to run something "closer" to a full system inside a container. Well, time for a closer look at Pennyworth then.
Well, to be more precise, it is possible, but there are some problems, at least according to http://serverfault.com/questions/607769/running-systemd-inside-a-docker-cont...
Another drawback is that the hardware is the machine where it runs, so that can be a limiting factor.
different initial systems
yes, it makes sense. The question is whether we need the systems prepared in advance, or whether we start from the same initial system and convert it to the target state, the way a customer would.
I was wondering the same. We should probably have some hybrid system: some "recipes" to create the systems, with some mechanism to reuse the results as long as they are reusable, to reduce the overhead.
yes, with snapshotting it should be quite easy.
A few ideas I have about the topic:
- use a VNC library to work with the UI - https://code.google.com/p/ruby-vnc/
That's what openQA does nowadays.
Really?
Really :) https://github.com/os-autoinst/os-autoinst/blob/b9f3b1a80fb1b6d967213e568bd8...
The last time I checked, it used the qemu console, which is horrible, so I proposed using VNC but saw not much activity there. If it is implemented using this, that's great.
Many tests still use outdated mechanisms, but the core even supports double click :)
- use OCR to localize buttons by their text and get pointers where to click
openQA has some preliminary support for this as well. But we found out that the current image matching mechanism does a good-enough job (sometimes even better than OCR) in 99% of situations. We have not invested much effort in OCR since then.
The problem is that this works for the installer, which is quite stable. If you play with the YaST GUI of your own module, then I think you break it with almost every commit. So in some cases OCR should bring an advantage.
As said, the preliminary support is there. I would try image matching first and, once it proves insufficient, resurrect the OCR support.
- use the cloud for parallel runs, with its snapshotting ability
Sounds smart. openQA currently uses KVM for snapshots and we are trying to get the cloud team involved to achieve distribution capabilities (all workers must be on the same machine right now).
yes, I have also argued that this limitation is really limiting performance-wise.
openQA is "almost useful" for many people in the company. With the involvement of more teams, it would be relatively easy to overcome the limitations. It should be almost a piece of cake for the cloud team to improve openQA scalability by making it "cloud enabled". With more feedback and some collaboration from the teams involved in testing, it would probably be possible, or even easy, to design a sane DSL for the tests. As long as it's just a side project of 4 people, the limitations will remain.
- create a tree of requirements and how to get into each state, like:
installation with partitioning A -> snapshot A -> install + run module M1 in scenario A -> install + run module M2 in scenario A
installation with partitioning B -> snapshot B -> install + run module M1 in scenario B -> install + run module M2 in scenario B
installation with partitioning C -> snapshot C -> install + run module M1 in scenario C
For a quick install we can use AutoYaST, and before we run a test we should verify that the system is really in the expected state.
Sounds like ideas that have been considered for openQA (not implemented because of the lack of manpower).
- For a released product, use an already known snapshot; for a product before release, use the latest successful snapshot.
+1.
As said before, I was focusing more on lightweight solutions for functional tests. Your plan for comprehensive full integration tests that run overnight and test the installation process overlaps too much with openQA, IMHO. I know that openQA sucks at many things (tests written in dirty Perl not being the least of them), but a very high percentage of your idea sounds like reinventing a better openQA to me.
Maybe the solution for full integration tests is really just improving openQA. Then we can reserve some time to improve openQA to serve our needs. Maybe we could even have our own openQA server just for integration tests of YaST modules, and submit modules from the devel project to the target project only when they pass the integration tests.
What I miss about functional testing is: what benefits does it bring if we already have integration and unit testing, and do those benefits deserve the time investment? Maybe some summary of why it makes sense to add another level of tests.
In the past, I have always used just a combination of unit tests and integration tests because, as you say, I felt that combination covered the whole spectrum and functional tests didn't add much. The problem I see in Yast is that integration tests mean very heavy processes that take ages to run, usually on an infrastructure away from the developer's workstation. And I miss the good old feeling of being able to run rake test:unit test:integration on my workstation and have results within a few minutes, with immediate feedback when something goes wrong (hitting ^C to check without even waiting for the remaining tests to finish). That's why I wanted to go one step further in testing without entering the realm of integration tests, which, for a tool like Yast, I think cannot be lightweight by definition.

Cheers.
-- Ancor González Sosa openSUSE Team at SUSE Linux GmbH
On Fri, Sep 26, 2014 at 03:24:13PM +0200, Ancor Gonzalez Sosa wrote:
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card.
I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.
Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests
Unfortunately for this approach, several components do not interact with the system via SCR, e.g. libzypp and libstorage. Also, any stock rubygem we want to use will not care about SCR.

ciao Arvin
-- Arvin Schnell, <aschnell@suse.de> Senior Software Engineer, Research & Development SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg) Maxfeldstraße 5 90409 Nürnberg Germany
On 09/29/2014 12:25 PM, Arvin Schnell wrote:
On Fri, Sep 26, 2014 at 03:24:13PM +0200, Ancor Gonzalez Sosa wrote:
Since I was tired of the Trello card I spent the whole workshop working on, and being inspired by Flavio's talk, I switched my mind this morning to the integration tests card.
I had an idea for writing tests with much less stubbing (using a virtual system) that I think is worth trying.
Explained (together with some braindumping) at https://public.pad.fsfe.org/p/yast-integration-tests
Unfortunately for this approach, several components do not interact with the system via SCR, e.g. libzypp and libstorage.
Too bad. I thought that agents were always the final interface to the system. :-/
Also any stock rubygem we want to use will not care about SCR.
ciao Arvin
-- Ancor González Sosa openSUSE Team at SUSE Linux GmbH
participants (3)
- Ancor Gonzalez Sosa
- Arvin Schnell
- Josef Reidinger