[opensuse-buildservice] Track statistics on the openSUSE staging process to gain feedback on changes
I am looking to provide a variety of statistics and metrics relating to the staging workflow on OBS in order to see the impact of automation and to tune the tools. I have spent some time setting up a local OBS, the obs_factory engine, looking through the database structure, and reviewing tools designed to present metrics. My understanding from @hennevogel is that Influx Data is planned to be used to provide this type of information, so it makes sense to get a feel for the existing plans and how this may fit into them.

The biggest hurdle seems to be creating timeseries information from event data by walking the events backwards from the known state (i.e. current) to find states of interest. At some point, whatever solution is built will need access to the relevant data to do some aggregation and store intermediate results that can then be drawn on for presentation. Alternatively, I had some success using sub-selects, but that is likely not the most performant way forward.

Given the design of the submit request/staging workflow, where no events are recorded after a request is declined, it is not possible to determine when an obsolete request was unstaged. The overall state of stagings cannot be accurately determined when such a workflow has occurred. It will likely also be difficult or impossible to determine when a build completed, re-entered building due to manual rebuilds or a re-freeze, or failed/passed testing. The dashboard [1] already provides this information, but re-creating a history in aggregate form is very difficult if not near impossible. As such, it likely makes sense to create a new polling job that stores facets of the information collected by the dashboard in a timeseries database.

For the rest of the information, tied specifically to individual requests, it seems possible to figure everything out from a handful of tables in the OBS database.
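As a rough sketch of what such a polling job might store (the payload shape, measurement name, and field names here are entirely hypothetical; the real ones would come from the dashboard and whichever timeseries database is chosen):

```python
from datetime import datetime, timezone

def snapshot_staging_states(stagings):
    """Reduce one poll of the dashboard into flat records, one point
    per staging project, suitable for a timeseries database."""
    now = datetime.now(timezone.utc).isoformat()
    return [
        {
            "measurement": "staging_state",
            "time": now,
            "tags": {"staging": staging["name"]},
            "fields": {
                "state": staging["state"],
                "request_count": len(staging.get("requests", [])),
            },
        }
        for staging in stagings
    ]

# Hypothetical poll result; the real dashboard payload would differ.
sample = [
    {"name": "openSUSE:Factory:Staging:A", "state": "building", "requests": [1, 2]},
    {"name": "openSUSE:Factory:Staging:B", "state": "empty", "requests": []},
]
points = snapshot_staging_states(sample)
```

Run on a cron-like schedule, each invocation appends one point per staging, which is enough to later graph things like the number of empty stagings over time.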
Alternatively, I can write something that polls and scrapes data via the APIs into local storage, but that seems like unnecessary extra work. Is it feasible for me to be granted read access to the few tables of interest, or to have the Influx (or similar) tool set up with access, so that I can begin setting up some metrics of interest?

A bit more on the topic of generating timeseries data. As an example, consider presenting a graph of the request backlog against Factory over time, the time until first staging, or the number of empty stagings over time. The event information is collected in the form of reviews. Any time a request is staged or re-staged, reviews for the particular staging project are added or accepted. Accepting or declining the request unfortunately stops future changes from being recorded, which means the staging tools cannot indicate when the request is removed from a staging, but for simplicity it can be assumed complete after one of those states.

Assuming one has a known state from which to start, it should be possible to walk the event history and annotate states of interest. The current state can be queried, which indicates what requests are currently in a staging; that provides a starting point. Assuming the script stops any time it encounters an existing annotation (or via some other mechanism), the job can be run on a regular basis to annotate the desired information.

The polling technique, which is simpler, could be used to avoid all this walking of the event history, but it cannot backfill the data. As such, it is preferable to walk the event tree where possible.

I look forward to your thoughts.

[1] https://build.opensuse.org/project/staging_projects/openSUSE:Factory

-- Jimmy

-- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
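The incremental backward walk Jimmy describes could be sketched roughly like this (event and field names are invented for illustration; the real records would come from the reviews/history tables):

```python
def walk_reviews_backwards(events, annotated_ids):
    """Walk review events newest-to-oldest, annotating staging reviews,
    and stop at the first event a previous run already annotated so the
    job can be re-run incrementally on a schedule."""
    annotations = []
    for event in sorted(events, key=lambda e: e["when"], reverse=True):
        if event["id"] in annotated_ids:
            break  # everything older was handled by an earlier run
        if event["by_project"].startswith("openSUSE:Factory:Staging:"):
            annotations.append({
                "event_id": event["id"],
                "request": event["request"],
                "staging": event["by_project"],
                "when": event["when"],
            })
    return annotations

# Invented events; timestamps simplified to integers for the sketch.
events = [
    {"id": 1, "request": 100, "by_project": "openSUSE:Factory:Staging:A", "when": 1},
    {"id": 2, "request": 101, "by_project": "openSUSE:Factory:Staging:B", "when": 2},
    {"id": 3, "request": 100, "by_project": "openSUSE:Factory:Staging:C", "when": 3},
]
# Event 1 was annotated by an earlier run, so the walk stops there.
new_annotations = walk_reviews_backwards(events, annotated_ids={1})
```

Unlike polling, this approach can backfill history, since the full review trail is walked from the current known state backwards.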
On Wednesday, 1 March 2017, 01:48:19 CET, Jimmy Berry wrote:
I am looking to provide a variety of statistics and metrics relating to the staging workflow on OBS in order to see the impact of automation and to tune the tools. I have spent some time setting up a local OBS, the obs_factory engine, looking through the database structure, and reviewing tools designed to present metrics. My understanding from @hennevogel is that Influx Data is planned to be used to provide this type of information, so it makes sense to get a feel for the existing plans and how this may fit into them.
The biggest hurdle seems to be creating timeseries information from event data by walking the events backwards from the known state (i.e. current) to find states of interest. At some point, whatever solution is built will need access to the relevant data to do some aggregation and store intermediate results that can then be drawn on for presentation. Alternatively, I had some success using sub-selects, but that is likely not the most performant way forward.
Given the design of the submit request/staging workflow, where no events are recorded after a request is declined, it is not possible to determine when an obsolete request was unstaged. The overall state of stagings cannot be accurately determined when such a workflow has occurred. It will likely also be difficult or impossible to determine when a build completed, re-entered building due to manual rebuilds or a re-freeze, or failed/passed testing. The dashboard [1] already provides this information, but re-creating a history in aggregate form is very difficult if not near impossible. As such, it likely makes sense to create a new polling job that stores facets of the information collected by the dashboard in a timeseries database.
For the rest of the information, tied specifically to individual requests, it seems possible to figure everything out from a handful of tables in the OBS database.
We did this for the maintenance statistics:

osc api /statistics/maintenance_statistics/openSUSE:Maintenance:6433

It also handles assignments from a group to a user to calculate how long the group took to review.
Alternatively, I can write something that polls and scrapes data via the APIs into local storage, but that seems like unnecessary extra work. Is it feasible for me to be granted read access to the few tables of interest, or to have the Influx (or similar) tool set up with access, so that I can begin setting up some metrics of interest?
Your code is very much isolated, so I can't really judge it. Just one hint: your code runs in an environment and on a host which is critical for our security, and also for all people using repositories from it. So another service means another potential weakness; it would be good if it did not need to run on our main server, at least.
A bit more on the topic of generating timeseries data. As an example, consider presenting a graph of the request backlog against Factory over time, the time until first staging, or the number of empty stagings over time. The event information is collected in the form of reviews. Anytime the request is staged or re-staged reviews for the particular staging project are added or accepted. Accepting or declining the request unfortunately stops future changes from being recorded which means the staging tools cannot indicate when the request is removed from a staging, but for simplicity it can be assumed complete after one of those states.
I think most of this is generic and not specific to staging projects. So it would be good to extend the generic request system with statistics IMHO.
Assuming one has a known state from which to start, it should be possible to walk the event history and annotate states of interest. The current state can be queried, which indicates what requests are currently in a staging; that provides a starting point. Assuming the script stops any time it encounters an existing annotation (or via some other mechanism), the job can be run on a regular basis to annotate the desired information.
The polling technique, which is simpler, could be used to avoid all this walking of the event history, but it cannot backfill the data. As such, it is preferable to walk the event tree where possible.
I look forward to your thoughts.
[1] https://build.opensuse.org/project/staging_projects/openSUSE:Factory
--
Adrian Schroeter
email: adrian@suse.de
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5, 90409 Nürnberg, Germany
On Wednesday, March 1, 2017 9:54:14 AM CST Adrian Schröter wrote:
On Wednesday, 1 March 2017, 01:48:19 CET, Jimmy Berry wrote:
I am looking to provide a variety of statistics and metrics relating to the staging workflow on OBS in order to see the impact of automation and to tune the tools. I have spent some time setting up a local OBS, the obs_factory engine, looking through the database structure, and reviewing tools designed to present metrics. My understanding from @hennevogel is that Influx Data is planned to be used to provide this type of information, so it makes sense to get a feel for the existing plans and how this may fit into them.
The biggest hurdle seems to be creating timeseries information from event data by walking the events backwards from the known state (i.e. current) to find states of interest. At some point, whatever solution is built will need access to the relevant data to do some aggregation and store intermediate results that can then be drawn on for presentation. Alternatively, I had some success using sub-selects, but that is likely not the most performant way forward.
Given the design of the submit request/staging workflow, where no events are recorded after a request is declined, it is not possible to determine when an obsolete request was unstaged. The overall state of stagings cannot be accurately determined when such a workflow has occurred. It will likely also be difficult or impossible to determine when a build completed, re-entered building due to manual rebuilds or a re-freeze, or failed/passed testing. The dashboard [1] already provides this information, but re-creating a history in aggregate form is very difficult if not near impossible. As such, it likely makes sense to create a new polling job that stores facets of the information collected by the dashboard in a timeseries database.
For the rest of the information, tied specifically to individual requests, it seems possible to figure everything out from a handful of tables in the OBS database.
We did this for the maintenance statistics:
osc api /statistics/maintenance_statistics/openSUSE:Maintenance:6433
it also handles assignments from a group to a user to calculate how long the group took to review.
This looks somewhat similar, although the example provided does not have any reviews; based on your comment I can assume what that might look like. What tool consumes this API? Presumably the tool then has to scrape all the information from this API in a similar manner to what I was trying to avoid.
Alternatively, I can write something that polls and scrapes data via the APIs into local storage, but that seems like unnecessary extra work. Is it feasible for me to be granted read access to the few tables of interest, or to have the Influx (or similar) tool set up with access, so that I can begin setting up some metrics of interest?
Your code is very much isolated, so I can't really judge it. Just one hint: your code runs in an environment and on a host which is critical for our security, and also for all people using repositories from it.
So another service means another potential weakness; it would be good if it did not need to run on our main server, at least.
Any code I mention at this point is running on my local machine. If read access to the source tables is not possible the code can be hosted entirely separate from OBS.
A bit more on the topic of generating timeseries data. As an example, consider presenting a graph of the request backlog against Factory over time, the time until first staging, or the number of empty stagings over time. The event information is collected in the form of reviews. Anytime the request is staged or re-staged reviews for the particular staging project are added or accepted. Accepting or declining the request unfortunately stops future changes from being recorded which means the staging tools cannot indicate when the request is removed from a staging, but for simplicity it can be assumed complete after one of those states.
I think most of this is generic and not specific to staging projects. So it would be good to extend the generic request system with statistics IMHO.
That was my understanding as well, which is why I am not proceeding any further until I get an understanding of any existing plans surrounding OBS metrics.
Assuming one has a known state from which to start, it should be possible to walk the event history and annotate states of interest. The current state can be queried, which indicates what requests are currently in a staging; that provides a starting point. Assuming the script stops any time it encounters an existing annotation (or via some other mechanism), the job can be run on a regular basis to annotate the desired information.
The polling technique, which is simpler, could be used to avoid all this walking of the event history, but it cannot backfill the data. As such, it is preferable to walk the event tree where possible.
I look forward to your thoughts.
[1] https://build.opensuse.org/project/staging_projects/openSUSE:Factory
-- Jimmy
Hey,

On 01.03.2017 15:43, Jimmy Berry wrote:
That was my understanding as well, which is why I am not proceeding any further until I get an understanding of any existing plans surrounding OBS metrics.
Apart from the tool we want to use to store time series data (influxdb), the tool we want to use to send data there (influxer), and the tool we want to use to show metrics (grafana), we don't have much of a plan. I guess it's up to you to figure out how you can make sense out of this for your use case :-)

If you need to record some extra time series data for your staging workflow engine you can do that, as your engine always runs in the context of the OBS instance it's mounted on top of. So it will also have access to the influxdb instance etc.

The same is BTW true for access to the SQL database: your engine has the same access as the Rails app it's mounted from.

I hope that helps,

Henne

--
Henne Vogelsang http://www.opensuse.org
Everybody has a plan, until they get hit. - Mike Tyson
On Wednesday, March 1, 2017 5:44:58 PM CST Henne Vogelsang wrote:
Hey,
On 01.03.2017 15:43, Jimmy Berry wrote:
That was my understanding as well, which is why I am not proceeding any further until I get an understanding of any existing plans surrounding OBS metrics.
Apart from the tool we want to use to store time series data (influxdb), the tool we want to use to send data there (influxer) and the tool we want to use to show metrics (grafana) we don't have much of a plan. I guess it's up to you to figure out how you can make sense out of this for your use case :-)
If you need to record some extra time series data for your staging workflow engine you can do that, as your engine always runs in the context of the OBS instance it's mounted on top of. So it will also have access to the influxdb instance etc.
Same is BTW true for access to the SQL database, your engine has the same access as the Rails app it's mounted from.
As I would expect. I was looking for access to develop against, since it is difficult to recreate an accurate facsimile of the OBS instance and near impossible to simulate the variety of workflows through which requests have gone. It would also be good to see if pulling certain metrics directly from the source tables is performant enough.

When I worked on the tooling used by the development sites of other open source projects, it was possible to get a sanitized database dump or a staging environment that had both a clone of production and read access to production. These resources were invaluable for validating data migrations and tools before deployment. Without such access it was impossible to predict all the ways in which data can be inconsistent, corrupted, or full of odd edge-cases.

Given that storing additional information will not cover all the desired metrics, it is likely more effective to just record timeseries data. I'll have to look at the tool in question, but I would expect a background job that periodically writes a record to the timeseries database. Such a background job will end up storing data outside of the scope of obs_factory. On that note, are the various Influx software pieces set up and hosted, or has nothing been done beyond selecting the desired tools?

Short of database read access, or somewhere I can run some of these tools myself and figure out how to set things up, I am not really sure how to proceed. Either I spend my time scraping the data via the APIs, or writing scripts to generate data to develop against. Both seem like unnecessary extra effort given that the real deal already exists. I am happy to put in the effort to make this happen, but I'd rather not beat around the bush recreating data that may or may not properly represent the real data. Even if I can somehow put everything in the obs_factory engine, that does not help me develop it.
I hope that helps,
Henne
Thanks,

-- Jimmy
Hey,

On 01.03.2017 22:23, Jimmy Berry wrote:
On Wednesday, March 1, 2017 5:44:58 PM CST Henne Vogelsang wrote:
If you need to record some extra time series data for your staging workflow engine you can do that, as your engine always runs in the context of the OBS instance it's mounted on top of. So it will also have access to the influxdb instance etc.
Same is BTW true for access to the SQL database, your engine has the same access as the Rails app it's mounted from.
As I would expect. I was looking for access to develop against since it is difficult to recreate an accurate facsimile of the OBS instance and near impossible to simulate the variety of workflows through which requests have gone.
I very much doubt that. We have an extensive test suite that is already 'simulating' all major workflows, including requests of the various kinds. For creating data you can use the tooling that exists, like our data factories[1]. If you need help with this do not hesitate to contact me :-)
It would also be good to see if pulling certain metrics directly from the source tables is performant enough.
Aren't you getting ahead of yourself? Why don't you first figure out what you want to do and how and then worry about performance of the production DB :-)
When I worked on the tooling used by the development site for other open source projects it was possible to get a sanitized database dump or staging environment that had access to both a clone of production and read access to production. These resources were invaluable for validating data migrations and tools before deployment.
This is a good practice that we also follow. But what has this to do with your tool? You are neither migrating nor deploying...
Without such access it was impossible to predict all the ways in which data can be inconsistent, corrupted, or full of odd edge-cases.
Again you are getting ahead of yourself I think. We have a very well documented data structure. If something is inconsistent, corrupted or an odd edge case it is by our definition broken. If you come across such a case you should tell us or better yet fix that case :-)
Given that storing additional information will not cover all the desired metrics it is likely more effective to just record timeseries data. I'll have to look at the tool in question, but I would expect a background job to run that periodically writes a record to the timeseries database.
No, the contrary. Every time something happens, a data point gets recorded into a data set in the time series DB. So let's say a request is closed. You would record the fact, the time, and add some tags describing the resolution (accepted, declined), the user who did it, etc. Once you have this data in the time series DB you can query and display it :-)
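Recording such a data point might look roughly like this (a sketch assuming the influxdb-python client; the measurement, tag, and field names are made up for illustration):

```python
from datetime import datetime, timezone

def request_closed_point(request_number, resolution, who):
    """Build one time series data point for a closed request, in the
    dict shape accepted by influxdb-python's write_points()."""
    return {
        "measurement": "request_closed",
        "time": datetime.now(timezone.utc).isoformat(),
        "tags": {"resolution": resolution, "who": who},
        "fields": {"request": request_number, "count": 1},
    }

point = request_closed_point(12345, "accepted", "some_reviewer")

# Sending it would then be (assuming a reachable InfluxDB instance):
# from influxdb import InfluxDBClient
# InfluxDBClient(host="localhost", database="obs").write_points([point])
```

Because the tags carry the resolution and user, grafana could then group and graph closures by either dimension without any further aggregation step.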
On that note, are the various influx software pieces setup and hosted or has nothing been done short of selecting the desired tool?
No, nothing is done yet. Just planned, sorry.

Henne

[1] https://github.com/openSUSE/open-build-service/tree/master/src/api/spec/fact...

--
Henne Vogelsang http://www.opensuse.org
Everybody has a plan, until they get hit. - Mike Tyson
On Tuesday, March 7, 2017 3:27:11 PM CDT Henne Vogelsang wrote:
Hey,
On 01.03.2017 22:23, Jimmy Berry wrote:
On Wednesday, March 1, 2017 5:44:58 PM CST Henne Vogelsang wrote:
If you need to record some extra time series data for your staging workflow engine you can do that, as your engine always runs in the context of the OBS instance it's mounted on top of. So it will also have access to the influxdb instance etc.
Same is BTW true for access to the SQL database, your engine has the same access as the Rails app it's mounted from.
As I would expect. I was looking for access to develop against since it is difficult to recreate an accurate facsimile of the OBS instance and near impossible to simulate the variety of workflows through which requests have gone.
I very much doubt that. We have an extensive test suite that is already 'simulating' all major workflows, including requests of the various kinds. For creating data you can use the tooling that exists, like our data factories[1]. If you need help with this do not hesitate to contact me :-)
I skimmed through the files and did not see anything similar to the Factory staging workflow managed by openSUSE/osc-plugin-factory. The components of that workflow would be covered by such tests and data creation, but that is not terribly helpful for trying to build something to extract specific statistics. The staging workflow creates reviews when requests are staged in a particular staging and records in which staging the request was placed.

The statistics of interest need to look for specific types of reviews related to the staging process and the spacing between them and other events. The data needs to be structured very specifically, like that in the real instance.

To be clear, I already wrote a few queries locally, against records I created by hand, that extract the desired information. An example of tricky data: a request can be staged, denied, unstaged, and then reinstated. During the time it was denied, no review changes will be recorded (i.e. the fact that it was unstaged). This is one of the cases the tooling has to handle, and I can recreate it locally, so I have no doubt it occurs. Making sure the statistics properly handle all the intricacies of the real data cannot easily be simulated. Having done this sort of work on other live systems, it is nearly impossible to predict the interesting edge-cases in real data, and not particularly productive to try when compared to running against the real thing.
It would also be good to see if pulling certain metrics directly from the source tables is performant enough.
Aren't you getting ahead of yourself? Why don't you first figure out what you want to do and how and then worry about performance of the production DB :-)
As noted in the original post, I have quite a bit of detail on what I want to do and a few approaches, the choice among which depends on their performance. If the simplest approach's performance is sufficient, why spend extra time on a more complex one? If others have time to get more directly involved I can document the specifics of what I have already done more publicly, but otherwise I'll save that for when I have a final solution.
When I worked on the tooling used by the development site for other open source projects it was possible to get a sanitized database dump or staging environment that had access to both a clone of production and read access to production. These resources were invaluable for validating data migrations and tools before deployment.
This is a good practice that we also follow. But what has this to do with your tool? You are neither migrating nor deploying...
Looking for the edge-cases in the data, especially when requests are operated on while in a denied state (as noted above).
Without such access it was impossible to predict all the ways in which data can be inconsistent, corrupted, or full of odd edge-cases.
Again you are getting ahead of yourself I think. We have a very well documented data structure. If something is inconsistent, corrupted or an odd edge case it is by our definition broken. If you come across such a case you should tell us or better yet fix that case :-)
I agree the data structure is documented. As noted I already wrote queries for some of the desired information. Without running queries and scripts against the real data I cannot find edge-cases.
Given that storing additional information will not cover all the desired metrics it is likely more effective to just record timeseries data. I'll have to look at the tool in question, but I would expect a background job to run that periodically writes a record to the timeseries database.
No, the contrary. Every time something happens, a data point gets recorded into a data set in the time series DB. So let's say a request is closed. You would record the fact, the time, and add some tags describing the resolution (accepted, declined), the user who did it, etc. Once you have this data in the time series DB you can query and display it :-)
I contrasted storing additional data (in the OBS structure) with storing everything of interest in a timeseries database. Indeed, having the data in a timeseries database would work, but it represents a lot of data duplication and an entire process that, as I understand it, does not currently exist. As such I was hoping to avoid it and pull at least a subset directly from the existing data structure.
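The kind of sub-select Jimmy mentions could look something like this, shown here against a toy in-memory table (the table and column names are assumed for illustration; the real OBS schema has far more columns):

```python
import sqlite3

# Tiny in-memory stand-in for the OBS requests table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bs_requests (id INTEGER, created_at TEXT, closed_at TEXT)")
db.executemany("INSERT INTO bs_requests VALUES (?, ?, ?)", [
    (1, "2017-01-01", "2017-01-10"),
    (2, "2017-01-05", None),
    (3, "2017-01-12", None),
])

def backlog_on(day):
    """Request backlog on a given day via a sub-select: requests created
    on or before that day and not yet closed by it."""
    (count,) = db.execute(
        """SELECT COUNT(*) FROM bs_requests
           WHERE created_at <= :day
             AND id NOT IN (SELECT id FROM bs_requests
                            WHERE closed_at IS NOT NULL
                              AND closed_at <= :day)""",
        {"day": day},
    ).fetchone()
    return count
```

Evaluating `backlog_on` for each day of a date range reconstructs the backlog timeseries directly from the existing tables, with no duplicated data, though as noted this may not be the most performant approach on the production database.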
On that note, are the various influx software pieces setup and hosted or has nothing been done short of selecting the desired tool?
No nothing is done yet. Just planed, sorry.
Henne
[1] https://github.com/openSUSE/open-build-service/tree/master/src/api/spec/factories
At this point, I am not sure what is desired to move this forward. I have a goal of specific metrics that I would like to extract and present, documented in the original post. I have done work on a local instance to see what metrics can be extracted from the existing data and wrote queries to do so. I have determined what information is lacking, and that it is likely best to just have a new process for writing such timeseries data, which sounds similar to what was planned.

There are certain trends in the metrics I would expect to be present in the real data that I would like to confirm. In fact, the metrics that can be extracted from the existing data may suffice if they demonstrate the things in which I am interested, but I cannot tell that from running queries against fake data.

I had hoped to avoid creating a data-scraping tool, but if it is not possible to gain some sort of access to the data I may just do that to avoid being blocked. Likely I'll write the data into the same structure used by OBS, so the tool will be compatible if ever deployed properly. Some of the data is generic to all requests, and the rest is specific to openSUSE/obs_factory and openSUSE/osc-plugin-factory.

I have considered building some additional API calls, perhaps some in obs_factory, that could expose certain aggregate query results. That may be useful, but at the moment this project is somewhat exploratory, in that it will only become clear what is interesting in the data when it is explored. As such, a more fluid setup that allows for developing queries and metrics until a full picture is clear seems to make more sense than trying to build code and have it deployed before even an initial result can be seen.

-- Jimmy
Hi Jimmy,

I have to admit that I don't understand which exact statistics you want to get. It would be important for me to have a concrete description of each measurement you want to make. We can then decide individually whether we can provide these numbers.

Please create a separate document for each of them. In case it is critical for the project, please use Fate. Otherwise some wiki page or GitHub issue might be sufficient. Please describe what these numbers should tell and what should be the basis for these numbers from your POV. We can then discuss the implementation details in a later step.

thanks
adrian

On Tuesday, 14 March 2017, 20:56:59 CET, Jimmy Berry wrote:
On Tuesday, March 7, 2017 3:27:11 PM CDT Henne Vogelsang wrote:
Hey,
On 01.03.2017 22:23, Jimmy Berry wrote:
On Wednesday, March 1, 2017 5:44:58 PM CST Henne Vogelsang wrote:
If you need to record some extra time series data for your staging workflow engine you can do that, as your engine always runs in the context of the OBS instance it's mounted on top of. So it will also have access to the influxdb instance etc.
Same is BTW true for access to the SQL database, your engine has the same access as the Rails app it's mounted from.
As I would expect. I was looking for access to develop against since it is difficult to recreate an accurate facsimile of the OBS instance and near impossible to simulate the variety of workflows through which requests have gone.
I very much doubt that. We have an extensive test suite that is already 'simulating' all major workflows, including requests of the various kinds. For creating data you can use the tooling that exists, like our data factories[1]. If you need help with this do not hesitate to contact me :-)
I skimmed through the files and did not see anything similar to the Factory staging workflow managed by openSUSE/osc-plugin-factory. The components of that workflow would be covered by such tests and data creation, but that is not terribly helpful for trying to build something to extract specific statistics. The staging workflow creates reviews when requests are staged in a particular staging and records in which staging the request was placed.
The statistics of interest need to look for specific types of reviews related to the staging process and the spacing between them and other events. The data needs to be structured very specifically, like that in the real instance.
To be clear, I already wrote a few queries locally, against records I created by hand, that extract the desired information. An example of tricky data: a request can be staged, denied, unstaged, and then reinstated. During the time it was denied, no review changes will be recorded (i.e. the fact that it was unstaged). This is one of the cases the tooling has to handle, and I can recreate it locally, so I have no doubt it occurs. Making sure the statistics properly handle all the intricacies of the real data cannot easily be simulated. Having done this sort of work on other live systems, it is nearly impossible to predict the interesting edge-cases in real data, and not particularly productive to try when compared to running against the real thing.
It would also be good to see if pulling certain metrics directly from the source tables is performant enough.
Aren't you getting ahead of yourself? Why don't you first figure out what you want to do and how and then worry about performance of the production DB :-)
As noted in the original post, I have quite a bit of detail on what I want to do and a few approaches whose viability depends on their performance. If the performance of the simplest approach is sufficient, why spend extra time on a more complex one?
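The sub-select approach mentioned in the original post can be sketched against a toy schema. This uses in-memory sqlite purely for illustration; the table and column names are made up and the real OBS schema differs.

```python
import sqlite3

# Toy schema: requests plus the reviews added when a request is staged.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE requests (id INTEGER PRIMARY KEY, created_at TEXT);
CREATE TABLE reviews  (request_id INTEGER, by_project TEXT, created_at TEXT);
INSERT INTO requests VALUES (1, '2017-05-01 08:00:00');
INSERT INTO reviews  VALUES (1, 'openSUSE:Factory:Staging:A',
                             '2017-05-01 10:30:00');
""")

# "Time until first staging": for each request, the earliest review created
# by a staging project, via a correlated sub-select.
rows = con.execute("""
SELECT r.id,
       r.created_at,
       (SELECT MIN(v.created_at) FROM reviews v
         WHERE v.request_id = r.id
           AND v.by_project LIKE 'openSUSE:Factory:Staging:%') AS first_staged
  FROM requests r
""").fetchall()
```

Whether such correlated sub-selects hold up on the production tables is exactly the open performance question.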
If others have time to get more directly involved I can document more publicly the specifics of what I have already done, but otherwise I'll save that for when I have a final solution.
When I worked on the tooling used by the development site for other open source projects it was possible to get a sanitized database dump or staging environment that had access to both a clone of production and read access to production. These resources were invaluable for validating data migrations and tools before deployment.
This is a good practice that we also follow. But what has this to do with your tool? You are neither migrating nor deploying...
Looking for the edge-cases in the data, especially where requests were operated on while in a declined state (as noted above).
Without such access it was impossible to predict all the ways in which data can be inconsistent, corrupted, or contain odd edge-cases.
Again you are getting ahead of yourself I think. We have a very well documented data structure. If something is inconsistent, corrupted or an odd edge case it is by our definition broken. If you come across such a case you should tell us or better yet fix that case :-)
I agree the data structure is documented. As noted I already wrote queries for some of the desired information. Without running queries and scripts against the real data I cannot find edge-cases.
Given that storing additional information will not cover all the desired metrics, it is likely more effective to just record timeseries data. I'll have to look at the tool in question, but I would expect a background job that runs periodically and writes a record to the timeseries database.
No, the contrary. Every time something happens a data point gets recorded into a data set in the time series DB. So let's say a request is closed. You would record the fact, the time, and add some tags describing the resolution (accepted, declined), the user who did this, etc. Once you have this data in the time series DB you can query and display it :-)
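As a rough illustration of that per-event recording, one such data point could be serialized in InfluxDB's line protocol. The measurement, tag, and field names below are invented for the example, not an agreed schema.

```python
def request_closed_point(request_id, resolution, user, when_ns):
    """Format one InfluxDB line-protocol point for a closed request.

    Layout: measurement,tags fields timestamp. Resolution and user become
    tags so the series can later be grouped and filtered in queries; the
    request id is stored as an integer field (the trailing 'i').
    """
    return ("request_state,resolution={},user={} request={}i {}"
            .format(resolution, user, request_id, when_ns))
```

For example, `request_closed_point(1234, "accepted", "factory-auto", 1500000000000000000)` yields one line ready to write to the database.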
I contrasted storing additional data (in the OBS structure) with storing everything of interest in a timeseries database. Indeed, having the data in a timeseries database would work, but it represents a lot of data duplication and an entire process that, as I understand it, does not currently exist. As such I was hoping to avoid it and pull at least a subset directly from the existing data structure.
On that note, are the various Influx software pieces set up and hosted, or has nothing been done short of selecting the desired tool?
No, nothing is done yet. Just planned, sorry.
Henne
[1] https://github.com/openSUSE/open-build-service/tree/master/src/api/spec/factories
At this point, I am not sure what is desired to move this forward. I have a goal of specific metrics that I would like to extract and present, documented in the original post. I have done work on a local instance to see what metrics can be extracted from the existing data and wrote queries to do so. I have determined what information is lacking, and that it is likely best to have a new process for writing such timeseries data, which sounds similar to what was planned.
There are certain trends in the metrics I would expect to be present in the real data that I would like to confirm. In fact, the metrics that can be extracted from the existing data may suffice if they demonstrate the things in which I am interested, but I cannot confirm that by running queries against fake data.
I had hoped to avoid creating a data scraping tool, but if it is not possible to gain some sort of access to the data I may just do that to avoid being blocked. Likely I'll write the data into the same structure used by OBS so the tool will be compatible if ever deployed properly.
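A minimal sketch of that fallback scraper: poll the dashboard on a schedule and append a snapshot per staging to local storage, so history accumulates even without database access. The JSON shape is an assumption for illustration, and the fetch callable is injected so the network layer stays swappable.

```python
import json
import sqlite3
import time

# Dashboard URL from the original post; the response format assumed below
# (a JSON list of stagings) is hypothetical.
DASHBOARD = "https://build.opensuse.org/project/staging_projects/openSUSE:Factory"

def store_snapshot(db, fetch, now=None):
    """Fetch the dashboard via the injected `fetch` callable and append one
    row per staging with its overall state and request count."""
    now = now or int(time.time())
    db.execute("CREATE TABLE IF NOT EXISTS snapshots "
               "(taken_at INTEGER, staging TEXT, state TEXT, requests INTEGER)")
    for staging in json.loads(fetch(DASHBOARD)):
        db.execute("INSERT INTO snapshots VALUES (?, ?, ?, ?)",
                   (now, staging["name"], staging["state"],
                    len(staging.get("selected_requests", []))))
    db.commit()
```

Run from cron, this would capture the facets the dashboard already computes; the obvious limitation, as said, is that polling cannot backfill anything from before it started.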
Some of the data is generic to all requests, and the rest is specific to openSUSE/obs_factory and openSUSE/osc-plugin-factory. I have considered building some additional API calls, perhaps some in obs_factory, that could expose certain aggregate query results. That may be useful, but at the moment this project is somewhat exploratory, in that what is interesting in the data will become clear as it is explored. As such, a more fluid setup that allows for developing queries and metrics until a full picture is clear makes more sense than trying to build code and have it deployed before even an initial result can be seen.
-- Adrian Schroeter email: adrian@suse.de SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) Maxfeldstraße 5 90409 Nürnberg Germany
participants (3):
- Adrian Schröter
- Henne Vogelsang
- Jimmy Berry