Re: [opensuse-buildservice] Track statistics on the openSUSE staging process to gain feedback on changes
Hi Jimmy,

I have to admit that I don't understand which exact statistics you want
to get. It would be important for me to have a concrete description of
each measurement you want to take. We can then decide individually
whether we can provide these numbers.

Please create a separate document for each of them. In case it is critical
for the project please use Fate. Otherwise a wiki page or GitHub issue
might be sufficient.

Please describe what these numbers should tell us and what, from your POV,
should be the basis for these numbers. We can discuss the implementation
details in a later step.

thanks
adrian

On Tuesday, 14 March 2017, 20:56:59 CET, Jimmy Berry wrote:
On Tuesday, March 7, 2017 3:27:11 PM CDT Henne Vogelsang wrote:
Hey,

On 01.03.2017 22:23, Jimmy Berry wrote:
On Wednesday, March 1, 2017 5:44:58 PM CST Henne Vogelsang wrote:
If you need to record some extra time series data for your staging
workflow engine you can do that, as your engine always runs in the
context of the OBS instance it's mounted on top of. So it will also have
access to the influxdb instance etc.

Same is BTW true for access to the SQL database, your engine has the
same access as the Rails app it's mounted from.

As I would expect. I was looking for access to develop against since it is
difficult to recreate an accurate facsimile of the OBS instance and near
impossible to simulate the variety of workflows through which requests have
gone.

I very much doubt that. We have an extensive test suite that is already
'simulating' all major workflows, including requests of the various
kinds. For creating data you can use the tooling that exists, like our
data factories[1]. If you need help with this do not hesitate to contact
me :-)

I skimmed through the files and I did not see anything similar to the Factory
staging workflow managed by openSUSE/osc-plugin-factory. The components of
that workflow would be covered by such tests and data creation, but that is
not terribly helpful for trying to build something to extract specific
statistics. The staging workflow creates reviews when requests are staged in
a particular staging project and records in which staging the request was
placed.

The statistics of interest need to look for specific types of reviews related
to the staging process and the spacing between them and other events. The data
needs to be structured very specifically, like the data in the real instance.
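
For illustration, the kind of query I have in mind looks roughly like the
sketch below. This is only a sketch: the connection details, the table and
column names (reviews, bs_request_id, by_project, created_at, updated_at),
and the staging project pattern are assumptions on my part, not the verified
OBS schema.

  import pymysql

  # Connect to a local copy of the OBS API database (placeholder credentials).
  conn = pymysql.connect(host="localhost", user="obs", password="obs",
                         database="api_production")

  # List staging-related reviews per request together with how long each
  # review stayed open, so the spacing between reviews and other request
  # events can be derived from it.
  SQL = """
  SELECT r.bs_request_id,
         r.by_project,
         r.state,
         r.created_at,
         r.updated_at,
         TIMESTAMPDIFF(SECOND, r.created_at, r.updated_at) AS seconds_open
  FROM reviews r
  WHERE r.by_project LIKE 'openSUSE:Factory:Staging:%'
  ORDER BY r.bs_request_id, r.created_at
  """

  with conn.cursor() as cur:
      cur.execute(SQL)
      for request_id, project, state, created, updated, seconds_open in cur.fetchall():
          print(request_id, project, state, seconds_open)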

To be clear, I already wrote a few queries locally, against records I created
by hand, that extract the desired information. As an example of tricky data, a
request can be staged, denied, unstaged, and then reinstated. During the time
it was denied no review changes will be recorded (i.e. the fact that it was
unstaged). This is one of the cases the tooling has to handle, and I can
recreate it locally, so I have no doubt it occurs. Making sure the statistics
properly handle all the intricacies of the real data is not something that can
easily be simulated. Having done this sort of work on other live systems, it
is nearly impossible to predict the interesting edge-cases in real data, and
it is not particularly productive to try when compared to running against the
real thing.


It would also be good to see if pulling certain metrics directly from
the source tables is performant enough.

Aren't you getting ahead of yourself? Why don't you first figure out
what you want to do and how and then worry about performance of the
production DB :-)

As noted in the original post, I have quite a bit of detail on what I want to
do and a few candidate approaches, the choice of which depends on their
performance. If the performance of the simplest approach is sufficient, why
spend extra time on a more complex one?

If others have time to get more directly involved I can document more
publicly the specifics of what I have already done, but otherwise I'll save
that for when I have a final solution.


When I worked on the tooling used by the development site for other open
source projects it was possible to get a sanitized database dump or staging
environment that had access to both a clone of production and read access
to production. These resources were invaluable for validating data
migrations and tools before deployment.

This is a good practice that we also follow. But what has this to do
with your tool? You are neither migrating nor deploying...

Looking for the edge-cases in the data, especially when requests are operated
on while in a denied state (as noted above).


Without such access it was impossible to predict all the ways in which data
can be either inconsistent, corrupted, or odd edge-cases.

Again you are getting ahead of yourself I think. We have a very well
documented data structure. If something is inconsistent, corrupted or an
odd edge case it is by our definition broken. If you come across such a
case you should tell us or better yet fix that case :-)

I agree the data structure is documented. As noted, I already wrote queries
for some of the desired information. Without running queries and scripts
against the real data I cannot find edge-cases.


Given that storing additional information will not cover all the desired
metrics it is likely more effective to just record timeseries data. I'll
have to look at the tool in question, but I would expect a background job
to run that periodically writes a record to the timeseries database.

No, on the contrary. Every time something happens, a data point gets
recorded into a data set in the time series DB. So let's say a request
is closed. You would record the fact, the time, add some tags describing
the resolution (accepted, declined) or the user who did this etc. Once
you have this data in the time series DB you can query and display it :-)
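
(For illustration, a minimal sketch of such an event-driven write, assuming
the influxdb Python client and invented measurement, tag, and connection
names; none of this is actually set up yet.)

  from influxdb import InfluxDBClient

  # Placeholder connection details; nothing is deployed yet.
  client = InfluxDBClient(host="localhost", port=8086, database="obs_metrics")

  def record_request_closed(request_number, resolution, who, when):
      """Write one data point when a request is closed.
      The measurement and tag names here are invented for this sketch."""
      point = {
          "measurement": "request_closed",
          "time": when,                  # e.g. "2017-03-14T20:56:59Z"
          "tags": {
              "resolution": resolution,  # accepted / declined
              "who": who,
          },
          "fields": {
              "request_number": request_number,
          },
      }
      client.write_points([point])

  # Placeholder values for a single closed request.
  record_request_closed(1, "accepted", "some_reviewer", "2017-03-14T20:56:59Z")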

I contrasted storing additional data (in the OBS structure) with storing
everything of interest in a timeseries database. Indeed, having the data in a
timeseries database would work, but it represents a lot of data duplication
and an entire process that, as I understand it, does not currently exist. As
such I was hoping to avoid it and pull at least a subset directly from the
existing data structure.


On that note, are the various influx software pieces set up and
hosted or has nothing been done short of selecting the desired tool?

No, nothing is done yet. Just planned, sorry.

Henne

[1] https://github.com/openSUSE/open-build-service/tree/master/src/api/spec/factories

At this point, I am not sure what is desired to move this forward. I have a
goal of specific metrics that I would like to extract and present, as
documented in the original post. I have done work on a local instance to see
what metrics can be extracted from the existing data and wrote queries to do
so. I have determined what information is lacking and that it is likely best
to just have a new process for writing such timeseries data, which sounds
similar to what was planned.

There are certain trends in the metrics that I would expect to be present in
the real data and that I would like to confirm. In fact, the metrics that can
be extracted from the existing data may suffice if they demonstrate the things
in which I am interested, but I cannot tell that from running queries against
fake data.

I had hoped to avoid creating a data scraping tool, but if it is not possible
to gain some sort of access to the data I may just do that to avoid being
blocked. Likely I'll write the data into the same structure used by OBS so the
tool will be compatible if ever deployed properly.
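
If it comes to that, the scraping would likely start from the public API,
roughly along the lines of the sketch below. This is a sketch only: the host,
credentials, and request number are placeholders, and the exact XML layout
(and whether request history needs an extra query parameter) would need to be
checked against the API documentation.

  import requests
  import xml.etree.ElementTree as ET

  API = "https://api.opensuse.org"
  AUTH = ("username", "password")  # placeholder credentials

  def fetch_request(number):
      """Fetch one request as XML, including its review elements."""
      resp = requests.get("%s/request/%s" % (API, number), auth=AUTH)
      resp.raise_for_status()
      return ET.fromstring(resp.content)

  root = fetch_request(1)  # placeholder request number
  # Staging reviews are the <review> elements whose by_project attribute
  # points at a staging project; dump them all for inspection.
  for review in root.findall("review"):
      print(review.attrib)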

Some of the data is generic to all requests and some is specific to
openSUSE/obs_factory and openSUSE/osc-plugin-factory. I have considered
building some additional API calls, perhaps some in obs_factory, that could
expose certain aggregate query results. That may be useful, but at the moment
this project is somewhat exploratory, in that it will become clear what is
interesting in the data when it is explored. As such, a more fluid setup that
allows for developing queries and metrics until a full picture is clear seems
to make more sense than trying to build code and have it deployed before even
an initial result can be seen.




--

Adrian Schroeter
email: adrian@xxxxxxx

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284
(AG Nürnberg)

Maxfeldstraße 5
90409 Nürnberg
Germany

