Mailinglist Archive: opensuse-packaging (250 mails)

Re: [opensuse-packaging] Communication problem within packagers
  • From: Luke Imhoff <luke@xxxxxxxx>
  • Date: Tue, 18 Aug 2009 06:36:29 -0500
  • Message-id: <4A8A923D.2080802@xxxxxxxx>
Adrian Schröter wrote:
On Monday, 17 August 2009 17:23:16, Luke Imhoff wrote:
We hit a similar issue on our internal OBS instance, so I made some mods
to the server so that a developer could grab a dependency graph and
determine all packages that depend on their OBS package (it actually
goes down to the Subversion code project itself, but that's not
required.) So the developer can be in kernel-source and run
svndownstream deps ls to get a listing of all OBS packages that depend on
kernel-source.

What happens is svndownstream gets a list of all the buildinfos (which a
mod to the server merges into one xml file at
/build/_builddependencytree) and then uses the bdep entries and some
python code in svndownstream (which is built on top of osc) to generate
a dependency graph.
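
For readers who want the gist of that last step without the attachment, here is a minimal sketch in Python 3. It is not the svndownstream code. It assumes the bdep entries have already been reduced to a plain dict of package name to build dependencies; `dependents_graph`, `what_depends_on` and the sample package names are made up for illustration.

```python
from collections import defaultdict


def dependents_graph(build_deps):
    """Invert {package: [its build dependencies]} into
    {dependency: set of packages that build-depend on it}."""
    graph = defaultdict(set)
    for package, dependencies in build_deps.items():
        for dependency in dependencies:
            graph[dependency].add(package)
    return dict(graph)


def what_depends_on(graph, package):
    """Return every direct and indirect dependent of package."""
    found = set()
    pending = [package]
    while pending:
        current = pending.pop()
        for dependent in graph.get(current, ()):
            if dependent not in found:
                found.add(dependent)
                pending.append(dependent)
    return found


# Hypothetical sample data standing in for the merged bdep entries.
build_deps = {
    "kernel-source": [],
    "kernel-default": ["kernel-source"],
    "kmod-foo": ["kernel-default"],
    "bash": [],
}
graph = dependents_graph(build_deps)
print(sorted(what_depends_on(graph, "kernel-source")))
```

The inversion is the important part: buildinfo tells you what a package needs, while the developer sitting in kernel-source wants to know who needs them.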

The problem with it is my implementation. I couldn't follow all the
code in bs_sched, so we had to switch to generating the full depgraph
only daily (since it takes 8-9 minutes). I'm sure someone that actually
wrote sub buildinfo() or bs_sched could improve on that time since in
theory bs_sched has a depgraph already and there's potential for caching
if you know you're going to ask for everyone's buildinfo. It's just
bs_sched is a big loop and I couldn't figure out how to extract any
of the information it had in it.

We actually have the "whatdependson" call as (the last?) missing feature for our 1.7 release.

We already have the :depend data on the server, so it should not be too hard to get this implemented.

However, I would be very interested in your code for creating the graphs. It would be nice if we could integrate this into our web interface (maybe even interactively, by clicking on an item to jump to that package ;)

bye
adrian

The main class doing all the calculations based on /build/_binarytree and /build/_builddependencytree is DependencyGraph in build.py. The binarytree and builddependencytree are just calls to the server that merge together all the _binary or bdep entries from the _buildinfo calls for all packages. You can infer the layout from DependencyGraph.fetch(). Creating a tree is done in DependencyGraph.tree(). It gives output like emerge --tree:

PROJECT PACKAGE REPOSITORY ARCH
  DEPENDENT_PROJECT DEPENDENT_PACKAGE DEPENDENT_REPOSITORY DEPENDENT_ARCH

The tree is just based on a topological sort (http://en.wikipedia.org/wiki/Topological_sort) generated by DependencyGraph.topologicalSort(), which includes each item's depth in the tree so that the tree can be printed with indentation for the levels.
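
As a rough illustration of that idea (a Python 3 sketch, not the attached code): a post-order depth first search, reversed, gives a topological order, and recording the depth at which each node was first reached gives the indentation. The function name, the two-space indent and the sample graph are assumptions for the example.

```python
def topological_sort_with_depth(graph, roots):
    """graph maps a dependency to the packages that depend on it.

    Returns a list of (depth, package) in which every package comes
    before the packages that depend on it.
    """
    visited = set()
    postorder = []

    def visit(node, depth):
        visited.add(node)
        for dependent in sorted(graph.get(node, ())):
            if dependent not in visited:
                visit(dependent, depth + 1)
        # A node is emitted only after everything that depends on it.
        postorder.append((depth, node))

    for root in roots:
        if root not in visited:
            visit(root, 0)

    # Reversed post-order is a topological order.
    postorder.reverse()
    return postorder


# Hypothetical graph: b and c depend on a, d depends on both b and c.
graph = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}}
order = topological_sort_with_depth(graph, ["a"])

for depth, package in order:
    print("  " * depth + package)
```

Note that a package reachable by several paths (d here) is printed once, under whichever parent the search reached it from first, so the output is a readable build order rather than a complete picture of every edge.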

DependencyGraph.dot() will generate a dot file that can be passed to graphviz to make pretty (or scary depending on your point of view and how many things are interdependent) pictures of the dependency graph.
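
A stripped-down sketch of the same idea in Python 3, assuming the graph is a plain dict of name to set of dependent names (the `to_dot` helper and the sample names are hypothetical, and names containing double quotes would need escaping):

```python
def to_dot(graph, name="dependencyGraph"):
    """Render {dependency: set of dependents} as Graphviz dot text."""
    lines = ["digraph %s {" % name]
    for dependency in sorted(graph):
        for dependent in sorted(graph[dependency]):
            # One edge per dependency -> dependent pair.
            lines.append('    "%s" -> "%s";' % (dependency, dependent))
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot({"kernel-source": {"kernel-default"},
                   "kernel-default": {"kmod-foo"}})
print(dot_text)
```

Saving the result to a file such as deps.dot and running `dot -Tpng deps.dot -o deps.png` should then produce the picture.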




"""Module for accessing /build hierarchy in API"""

# standard modules
import collections
import datetime

# cray modules
import cray.backport
import cray.obs

# plugin modules
import obs.remote.binarytree
import obs.remote.builddependencytree
import obs.remote.package
import obs.remote.project

class DependencyGraph(dict):
    """Package to Package dependency graph."""

    def __init__(self):
        super(DependencyGraph, self).__init__()
        self.root = None

    def _depthFirstSearch(self, binaryPackage, visited, preorder=True,
                          depth=None):
        """Recursive depth first search.

        @param binaryPackage current node
        @param visited set of already visited nodes
        @param preorder True to yield packages before calling
                        _depthFirstSearch on the dependent packages.
                        False to yield packages after calling
                        _depthFirstSearch on the dependent packages.
        @param depth None to not yield depth with Package
                     Otherwise depth of package
        @yield Package if depth is None
               (depth, Package) otherwise
        """
        visited.add(binaryPackage)

        formatted = depthedBinaryPackage(depth, binaryPackage)

        if preorder:
            yield formatted

        for dependent in self.get(binaryPackage, ()):
            if dependent not in visited:
                dependentDepth = depth

                if dependentDepth is not None:
                    dependentDepth += 1

                for formattedIndirectDependent in self._depthFirstSearch(
                        dependent,
                        visited,
                        preorder,
                        dependentDepth):
                    yield formattedIndirectDependent

        if not preorder:
            yield formatted

    def branch(self, name):
        """Branches all packages in dependency graph.

        Each remote package in dependency graph is branched to
        home:USER:branches:NAME:PROJECT:PACKAGE

        @param name name of branch
        @return dict of RemotePackage keys and RemotePackage values.
                keys are the original packages while values are the branched
                packages
        """
        branchedPackages = {}

        for remotePackage in self.remotePackages():
            branchPackage = remotePackage.branch(name)
            branchedPackages[remotePackage] = branchPackage

        # update the build repositories so they point back at the branch
        # projects
        branchedProjects = {}

        for remotePackage, branchPackage in branchedPackages.iteritems():
            project = obs.remote.project.Project(remotePackage.project)
            branchProject = obs.remote.project.BranchProject(
                project,
                branchPackage.project)
            branchedProjects[project] = branchProject

        for branchProject in branchedProjects.itervalues():
            branchProject.useBranchedProjects(branchedProjects)

        return branchedPackages

    def dependents(self, package):
        """Returns a DependencyGraph of direct and indirect dependents of
        package.

        @param package a package to start dependency graph
        @return a new DependencyGraph
        """
        dependencyGraph = self.__class__()
        dependencyGraph.root = package

        dependents = set((package,))

        while len(dependents) > 0:
            dependent = dependents.pop()

            # check that we haven't hit the dependent already
            if dependent not in dependencyGraph:
                try:
                    indirectDependents = self[dependent]
                except KeyError:
                    # dependent has no dependents
                    continue

                dependencyGraph[dependent] = indirectDependents
                dependents.update(indirectDependents)

        return dependencyGraph

    def depthFirstSearch(self, package, preorder=True, includeDepth=False):
        """Recursive depth first search.

        @param package package from which to start search
        @param preorder True to yield packages before calling
                        _depthFirstSearch on the dependent packages.
                        False to yield packages after calling
                        _depthFirstSearch on the dependent packages.
        @param includeDepth False to not include depth of package
                            True to include depth of package
        @yield Package if includeDepth is False
               (depth, Package) if includeDepth is True
        """
        depth = None

        if includeDepth:
            depth = 0

        return self._depthFirstSearch(package, set(), preorder, depth)

    def dot(self):
        """Returns a str of the dependency graph as a dot file."""
        lines = ["digraph dependencyGraph {"]

        for dependency, dependents in self.iteritems():
            lines.append(" ")
            dependencyStr = " ".join(dependency)

            for dependent in dependents:
                dependentStr = " ".join(dependent)
                lines.append(' "%s" -> "%s";' %
                             (dependencyStr, dependentStr))

        lines.append("}")

        return "\n".join(lines)

    @classmethod
    def fetch(cls, apiurl=None):
        """Fetches the build dependency tree and the binary tree from the
        API URL and builds a DependencyGraph from them.

        @param apiurl API URL
        @return a new DependencyGraph
        """
        dependencyGraph = cls()
        buildDependencyTree = (
            obs.remote.builddependencytree.BuildDependencyTree.fetch(apiurl))
        binaryTree = (
            obs.remote.binarytree.BinaryTree.fetch(apiurl))

        for project in buildDependencyTree.itervalues():
            for repository in project.itervalues():
                for architecture in repository.itervalues():
                    for package in architecture.itervalues():
                        dependencies = package.dependencies(binaryTree)
                        binaryPackage = obs.remote.package.BinaryPackage(
                            project.name,
                            repository.name,
                            architecture.name,
                            package.name)

                        for dependency in dependencies:
                            dependents = dependencyGraph.setdefault(
                                dependency, set())
                            dependents.add(binaryPackage)

        return dependencyGraph

    def merge(self, other):
        """Merges dependency graphs if they share common elements

        @param other another DependencyGraph
        @return a new DependencyGraph
        @raise MergeError if dependency graphs do not intersect
        """
        selfKeys = frozenset(self.iterkeys())
        otherKeys = frozenset(other.iterkeys())

        # if no common elements
        if len(selfKeys & otherKeys) < 1:
            raise MergeError((self, other))

        dependencyGraph = self.__class__()

        dependencyGraph.update(self)
        dependencyGraph.update(other)

        return dependencyGraph

    def remotePackages(self):
        """Returns set of all remote packages in dependency graph.

        @return set of RemotePackages
        """
        remotePackages = set()

        # if there is a root then use it. This ensures that if a dependents
        # dependency graph has no entries (because root has no dependents)
        # then RemotePackages are still returned.
        if self.root is not None:
            remotePackage = obs.remote.package.RemotePackage(
                self.root.project,
                self.root.package)
            remotePackages.add(remotePackage)

        for dependencyBinaryPackage, dependentBinaryPackages \
                in self.iteritems():
            remotePackage = dependencyBinaryPackage.remotePackage()
            remotePackages.add(remotePackage)

            for dependentBinaryPackage in dependentBinaryPackages:
                remotePackage = dependentBinaryPackage.remotePackage()
                remotePackages.add(remotePackage)

        return remotePackages

    def sources(self):
        """Finds all packages in dependency graph that don't depend on any
        other package.

        If a dependency graph has a single source then the graph is a
        dependency tree (the dependency graph is assumed to be cycle
        free) with the single source as the root.

        @return set of Packages
        """
        dependencies = set(self.iterkeys())

        dependents = set()
        for packageDependents in self.itervalues():
            dependents |= packageDependents

        # if a dependency is not dependent on another dependency then it is a
        # source
        sources = dependencies - dependents

        return sources

    def topologicalSort(self, includeDepth=False):
        """Return a topological sort of the dependency graph.

        A topological sort is a linear ordering of packages in which each
        package comes before all packages for which it is a dependency.

        @param includeDepth False to only return Packages in list
                            True to return Package and its depth in list
        @return list of Packages if includeDepth is False
                list of (depth, Package) tuples if includeDepth is True
        """
        order = collections.deque()

        unvisited = set(self.iterkeys())
        visited = set()

        depth = None

        if includeDepth:
            depth = 0

        while len(unvisited) > 0:
            root = unvisited.pop()
            # postorder, reversed by extendleft, gives topological order
            order.extendleft(self._depthFirstSearch(root, visited,
                                                    preorder=False,
                                                    depth=depth))
            unvisited -= visited

        return list(order)

    def tree(self):
        """Returns a str containing the dependency graph as a tree.

        @return str
        """
        lines = [" " * depth + binaryPackage.remoteLogPath()
                 for (depth, binaryPackage)
                 in self.topologicalSort(includeDepth=True)]

        return "\n".join(lines)

def depthedBinaryPackage(depth, binaryPackage):
    """Returns package with depth when appropriate.

    @return (depth, binaryPackage) if depth is not None
            BinaryPackage if depth is None
    """
    if depth is None:
        formatted = binaryPackage
    else:
        formatted = (depth, binaryPackage)

    return formatted

def disjoint(dependencyGraphs):
    """Return a list of disjoint dependency graphs.

    If dependency graphs intersect then they need to be combined so they
    produce a single topological sort and share the same
    binaries. It also implies that one dependency graph is a
    proper subgraph of the other.

    If the dependency graphs don't intersect then they need to use
    distinct binaries directories so if the packages in each
    dependency graph produce binaries with the same name they do not
    interfere.

    @param dependencyGraphs an iterable of DependencyGraphs
    @return list of DependencyGraphs
    """
    disjointDependencyGraphs = []

    # use list for queues so merged elements can be removed
    primaryQueue = list(dependencyGraphs)
    while len(primaryQueue) > 0:
        primary = primaryQueue.pop()

        for index, secondary in enumerate(primaryQueue):
            try:
                merged = primary.merge(secondary)
            except obs.remote.build.MergeError:
                continue

            # del on iterating list is safe since we're breaking
            del primaryQueue[index]
            primaryQueue.append(merged)
            break
        # if no secondary is mergeable then this dependency graph is
        # independent
        else:
            disjointDependencyGraphs.append(primary)

    return disjointDependencyGraphs

class MergeError(Exception):
    """Error raised when DependencyGraphs cannot be merged"""

    def __init__(self, dependencyGraphs):
        Exception.__init__(self, "Dependency Graphs could not be merged")
        self.dependencyGraphs = dependencyGraphs

class ProjectResultsMonitor(object):
    """Monitors project results and reports changes"""

    RECOMMENDED_ACTIONS = {
        "succeeded" : "Package(s) can be downloaded from project repository.",
        "broken" :
            "Package(s) are missing required (spec/dsc/kiwi or source) files.",
        "expansion error" :
            "Package(s) are missing dependencies. One of three solutions "
            "exists:\n"
            "1) A package needs to be linked into the project to build the "
            "dependency (using osc linkpac).\n"
            "2) A project that already builds the dependency needs to be "
            "added as a path in the project build repositories (using osc "
            "meta prj) to supply the dependency.\n"
            "3) A new package needs to be created to supply the dependency.",
        "disabled" :
            "Package(s) are disabled. If this is not desired then they should "
            "be reenabled (using the web interface, osc meta prj or osc meta "
            "pkg).",
        "failed" :
            "Package build(s) have failed. You should inspect the logs (osc "
            "rbl). If the failed packages require changes then you should "
            "check them out from OBS (using osc co) and then use the _url "
            "file to find the location to check out from Subversion (using "
            "svn co `cat OBSWORKAREA/_url`) to make the mods."}

    STABLE = frozenset(("succeeded", "broken", "expansion error", "disabled",
                        "failed"))

    def __init__(self, project, apiurl=None):
        self.project = project
        self.apiurl = apiurl
        self.previous = {}
        self.previousDateTime = None
        self.current = {}
        self.currentDateTime = None

    def changes(self):
        """Calculates changes between previous and current status. Call
        update() to generate a new set of changes.

        @return dict keyed by (project, repository, arch, package) with
                values of (previousStatus, currentStatus).
                If there is no previousStatus it will be None.
        """
        changes = {}

        for package, currentCode in self.current.iteritems():
            previousCode = self.previous.get(package)

            if previousCode != currentCode:
                changes[package] = (previousCode, currentCode)

        return changes

    def displayChanges(self):
        """Prints changes to stdout."""
        changes = self.changes()

        for binaryPackage, (previous, current) in changes.iteritems():
            remoteLogPath = binaryPackage.remoteLogPath()

            if previous is None:
                print "%s %s: %s" % (self.currentDateTime, remoteLogPath,
                                     current)
            else:
                print "%s %s: %s -> %s" % (self.currentDateTime,
                                           remoteLogPath,
                                           previous,
                                           current)

    def displaySummary(self):
        """Print summary to stdout."""
        summary = self.summary()

        for i, status in enumerate(sorted(summary.iterkeys())):
            if i > 0:
                print "-" * 79

            packages = sorted(summary[status])
            print "%s packages %s:" % (len(packages), status)

            for package in packages:
                print " %s" % package.remoteLogPath()

            print
            print self.RECOMMENDED_ACTIONS[status]

    def _statusesIn(self, statusSet):
        """Returns whether all statuses are in the set.

        @param statusSet a set of statuses
        @return True if all statuses are in statusSet
                False if any one status is not in statusSet
        """
        return cray.backport.all(status in statusSet
                                 for status in self.current.itervalues())

    def stable(self):
        """Returns true if all statuses are stable.

        Statuses in STABLE are considered stable.

        @return True if all statuses are in STABLE
                False if any one status is not in STABLE
        """
        return self._statusesIn(self.STABLE)

    def downloadable(self):
        """Returns true if all statuses are succeeded/disabled.

        @return True if all statuses are succeeded/disabled
                False if any one status is not succeeded/disabled
        """
        return self._statusesIn(frozenset(("disabled", "succeeded")))

    def summary(self):
        """Returns summary of statuses.

        @return dict keyed by status with list of packages with that status as
                values
        """
        summary = {}

        for package, status in self.current.iteritems():
            statusPackages = summary.setdefault(status, [])
            statusPackages.append(package)

        return summary

    def update(self):
        """Updates current status with project results from server."""
        self.previous = self.current
        self.previousDateTime = self.currentDateTime

        self.current = {}
        self.currentDateTime = datetime.datetime.now()

        root = cray.obs.urlXML(self.apiurl,
                               "Getting project results for %s" %
                               self.project,
                               ("build", str(self.project), "_result"))

        for result in root.findall("result"):
            project = result.get("project")
            repository = result.get("repository")
            arch = result.get("arch")

            for status in result.findall("status"):
                package = status.get("package")
                code = status.get("code")

                binaryPackage = obs.remote.package.BinaryPackage(project,
                                                                 repository,
                                                                 arch,
                                                                 package)
                self.current[binaryPackage] = code

"""Module for parsing /build/_binarytree"""

# standard modules
import re

# plugin modules
import obs.remote.core
# Architecture below uses obs.remote.package.BinaryPackage
import obs.remote.package

class BinaryTree(obs.remote.core.Collection, obs.remote.core.FetchMixin):
    """Binary tree from /build/_binarytree"""

    def __init__(self, parent, xml):
        """
        @param xml binarytree node
        """
        descendentClasses = (obs.remote.core.NamedCollection,
                             Project,
                             Architecture,
                             obs.remote.core.NamedCollection,
                             Binary)
        super(BinaryTree, self).__init__(parent, xml, descendentClasses)

    def __str__(self):
        return "_binarytree"

    @classmethod
    # reduced argument count is on purpose
    # pylint: disable-msg=W0221
    def fetch(cls, apiurl=None):
        """Fetches BinaryTree data from API URL.

        @param apiurl API URL
        @return a new BinaryTree
        """
        return super(BinaryTree, cls).fetch(apiurl,
                                            "Getting binary tree",
                                            ("build", "_binarytree",))

class Project(obs.remote.core.NamedCollection,
              obs.remote.core.FetchMixin):
    """Binary tree from /build/PROJECT/_binarytree, or an element under
    the BinaryTree from /build/_binarytree"""

    def __init__(self, parent, xml, descendentClasses=None):
        """
        @param xml project node
        """
        if descendentClasses is None:
            descendentClasses = (obs.remote.core.NamedCollection,
                                 Architecture,
                                 obs.remote.core.NamedCollection,
                                 Binary)
        super(Project, self).__init__(parent, xml, descendentClasses)

    @classmethod
    # reduced argument count is on purpose
    # pylint: disable-msg=W0221
    def fetch(cls, project, apiurl=None):
        """Fetches the binary tree data of a project from API URL.

        @param project name of project to fetch
        @param apiurl API URL
        @return a new Project
        """
        return super(Project, cls).fetch(apiurl,
                                         "Getting project binary tree",
                                         ("build", project, "_binarytree"))

class Architecture(obs.remote.core.NamedCollection):
    """Architecture in repository"""

    def __init__(self, parent, xml, descendentClasses):
        super(Architecture, self).__init__(parent, xml, descendentClasses)

        self.binaryProducers = {}

        for package in self.itervalues():
            for binary in package.itervalues():
                binaryPackage = obs.remote.package.BinaryPackage(
                    package.parent.parent.parent.name,
                    package.parent.parent.name,
                    package.parent.name,
                    package.name)
                self.binaryProducers[binary.name] = binaryPackage

class Binary(object):
    """Binary in package"""

    FILE_NAME_RE = re.compile(r"(?P<name>.+)-(?P<version>[^-]+)-"
                              r"(?P<release>[^-]+)\.(?P<extension>.+)")

    def __init__(self, parent, xml):
        """
        @param xml package node
        """
        super(Binary, self).__init__()
        self.parent = parent

        # initialize attributes used by fileName
        self.name = ""
        self.version = ""
        self.release = ""
        self.extension = ""

        self.fileName = xml.get("filename")
        self.size = int(xml.get("size"))
        self.modificationTime = int(xml.get("mtime"))

    def _getFileName(self):
        """filename of binary"""
        return "%s-%s-%s.%s" % (self.name, self.version, self.release,
                                self.extension)

    def _setFileName(self, fileName):
        """Parses fileName into name, version, release, and extension.

        @param fileName full fileName of format
                        %(name)s-%(version)s-%(release)s.%(extension)s
        """
        match = self.FILE_NAME_RE.match(fileName)

        if match is None:
            raise FormatError(self.FILE_NAME_RE, fileName)

        groupsByName = match.groupdict()

        for name, value in groupsByName.iteritems():
            setattr(self, name, value)

    fileName = property(_getFileName, _setFileName)

class FormatError(Exception):
    """Error raised when format of string does not match regular expression"""

    def __init__(self, regularExpression, string):
        # self must be passed explicitly to the unbound base-class __init__
        Exception.__init__(self,
                           "'%s' does not match '%s'" %
                           (string, regularExpression.pattern))
        self.regularExpression = regularExpression
        self.string = string

"""Module for parsing /build/_builddependencytree"""

import obs.remote.core

class BuildDependencyTree(obs.remote.core.Collection,
                          obs.remote.core.FetchMixin):
    """Build dependency tree from /build/_builddependencytree"""

    def __init__(self, parent, xml):
        """
        @param xml builddependencytree node
        """
        descendentClasses = (obs.remote.core.NamedCollection,
                             obs.remote.core.NamedCollection,
                             obs.remote.core.NamedCollection,
                             Package,
                             BuildDependency)
        super(BuildDependencyTree, self).__init__(parent,
                                                  xml,
                                                  descendentClasses)

    def __str__(self):
        return "_builddependencytree"

    @classmethod
    # reduced argument count is on purpose
    # pylint: disable-msg=W0221
    def fetch(cls, apiurl=None):
        """Fetches BuildDependencyTree from API URL.

        @param apiurl API URL
        @return a new BuildDependencyTree
        """
        return super(BuildDependencyTree, cls).fetch(
            apiurl,
            "Getting build dependency tree",
            ("build", "_builddependencytree"))

class Package(obs.remote.core.NamedCollection):
    """Package in architecture"""

    def dependencies(self, binaryTree):
        """Returns list of Packages that produce this package's build
        dependencies.

        @param binaryTree a BinaryTree
        @return list of Packages
        """
        dependencies = []

        for buildDependency in self.itervalues():
            project = binaryTree[buildDependency.project]
            repository = project[buildDependency.repository]
            try:
                architecture = repository[buildDependency.architecture]
                package = architecture.binaryProducers[buildDependency.name]
            except KeyError:
                # KeyError occurs if package not built, but supplied as
                # prebuilt for distro Build Repository. If it's prebuilt then
                # we don't care about it as a dependency
                continue

            dependencies.append(package)

        return dependencies

class BuildDependency(object):
    """BuildDependency in package"""

    def __init__(self, parent, xml):
        """
        @param xml package node
        """
        super(BuildDependency, self).__init__()
        self.parent = parent

        self.project = xml.get("project")
        self.repository = xml.get("repository")
        self.architecture = xml.get("arch")
        self.name = xml.get("name")
        self.version = xml.get("version")
        self.release = xml.get("release")
