Mailinglist Archive: opensuse-factory (454 mails)

[opensuse-factory] RFC Generic Packaging for Languages that have vendor/ Trees
  • From: Aleksa Sarai <asarai@xxxxxxx>
  • Date: Wed, 20 Dec 2017 09:32:02 +1100
  • Message-id: <20171219223200.lova6lxooitbzu5b@gordon>
Hello *,

This is a proposal for having a generic packaging system of RPMs for
languages that use "vendor/" trees. Please respond with any feedback you
have on the details of this proposal.

The main justification for the need for this proposal is that we have
seen the recent rise of languages that have an *enormous* number of
"micro-packages" (JavaScript is the most well-known offender here, where
the majority of widely used packages are only several lines long, but
Rust has a similar issue, and Go/Ruby do too). This has effectively made
it impractical (or even impossible for some languages) to create a
1-to-1 RPM mapping for each package. So while a 1-to-1 RPM mapping is
arguably ideal (both from an ideological perspective and a tooling
perspective), the maintenance burden is far too high.

Another problem is that many projects written in these sorts of
languages these days "vendor" their dependencies, usually using a
language-specific package manager to do so. (This is slightly ironic in
my opinion, because if they'd integrated more with distributions this
ideally wouldn't be necessary, but that ship has sailed.) This is a
problem that also needs to be resolved. Luckily such projects usually
have some sort of "lock file" that describes what is present inside the
"vendor/" tree -- this is something that will be useful later. Note that
a 1-to-1 RPM mapping doesn't help here either, as it would further
balloon the number of packages we would need to have (each project might
depend on different versions). Debian has been attempting to do this
with Go packages, and as far as I can see it's quite a futile effort
because of the maintenance burden that comes with it.

At the moment, most packages deal with this problem by punting
completely on reproducibility and auditability: they vendor all
dependencies in a project, tar up the vendor/ tree, and include it in
the OBS project. For a JavaScript project this would
involve just running `yarn <blah>` (or whatever the command is) and then
taking node_modules/ and creating a node_modules.tar.xz that is
included in the specfile. The main problem with this approach currently
is that it is completely unauditable and nobody knows what's inside
that magic vendor blob. *However* the core idea is not completely
insane. The Rust folks have also started doing the same thing with
cargo-vendor.

And here we come to my proposal. The idea is to take what is already
being done in these projects, and create better tooling around it to
make the work of development, maintenance, security, and legal much
easier.

First, we need to provide more metadata about these vendor blobs in the
RPM layer, so that security could at least *track* what versions of
things are used by a project. And in the worst case, it should be
possible to patch a vendor blob. This would likely best be done through
RPM macros, by creating a virtual Provides for each of the vendored
libraries. This matches what Fedora does for bundled libraries[1]. The
Provides could be as simple as:

Provides: bundled(rust:nix) = 0.8.1

Or something more involved to be extra paranoid:

Provides: bundled(rust:registry+https://github.com/rust-lang/crates.io-index:nix) = 0.8.1
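
To make the two forms above concrete, here is a minimal sketch of how a
helper could format them. The function name `bundled_provides` and its
parameters are hypothetical (nothing like this exists in RPM or OBS
today); it only illustrates the naming scheme, not how the macros would
actually be implemented.

```python
from typing import Optional


def bundled_provides(language: str, name: str, version: str,
                     source: Optional[str] = None) -> str:
    """Format one virtual Provides line for a vendored library.

    Hypothetical helper: "language", "name", "version" and "source" are
    illustrative field names, not part of any existing tool.
    """
    # Without a source we get the simple form (language:name); with one we
    # get the "extra paranoid" form (language:source:name).
    if source:
        qualifier = f"{language}:{source}:{name}"
    else:
        qualifier = f"{language}:{name}"
    return f"Provides: bundled({qualifier}) = {version}"


if __name__ == "__main__":
    print(bundled_provides("rust", "nix", "0.8.1"))
    print(bundled_provides(
        "rust", "nix", "0.8.1",
        source="registry+https://github.com/rust-lang/crates.io-index"))
```

The qualified form would let security distinguish two libraries with the
same name that come from different registries, at the cost of longer
capability strings.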

Secondly, in order to make this vendor archive reproducible, I propose
we have an OBS service that can be used to vendor a source tree (which
can obviously be run either locally or on OBS). It will produce all of
the vendor archives created by language-specific tools, and produce a
language-agnostic manifest of what was downloaded (the name, language,
version, git commit, and so on). The idea is that this manifest could be
used by the RPM macros above rather than writing language-specific
macros.
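
As a rough sketch of the manifest idea, the snippet below flattens a
lock file into language-agnostic entries. It assumes npm's
package-lock.json (lockfileVersion 1) as the input, purely as an
example of a lock file; the manifest layout (a JSON list with
"language", "name", "version" and "source" keys) is an assumption made
for illustration and not the format the service will necessarily use.
The package names in the sample are made up.

```python
import json


def manifest_from_npm_lock(lock_text: str) -> list:
    """Flatten an npm lockfileVersion 1 document into manifest entries.

    The output layout is a hypothetical manifest format, for illustration.
    """
    entries = set()

    def walk(dependencies: dict) -> None:
        for name, info in dependencies.items():
            entries.add((name, info["version"], info.get("resolved", "")))
            # npm nests conflicting versions under a package's own
            # "dependencies" key, so recurse to catch every bundled copy.
            walk(info.get("dependencies", {}))

    walk(json.loads(lock_text).get("dependencies", {}))
    return [
        {"language": "javascript", "name": n, "version": v, "source": s}
        for n, v, s in sorted(entries)
    ]


# Made-up lock file contents, only to exercise the function.
EXAMPLE_LOCK = """
{
  "name": "example-app",
  "version": "1.0.0",
  "lockfileVersion": 1,
  "dependencies": {
    "lib-a": {
      "version": "1.3.0",
      "resolved": "https://registry.example.invalid/lib-a-1.3.0.tgz"
    },
    "lib-b": {
      "version": "2.0.0",
      "dependencies": {
        "lib-a": {"version": "1.1.3"}
      }
    }
  }
}
"""

manifest = manifest_from_npm_lock(EXAMPLE_LOCK)

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

Note that the same library can appear more than once at different
versions (lib-a here), which is exactly the case a per-version virtual
Provides would need to expose.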

I have already started working on the OBS service, but I would love to
hear your feedback on this proposal.

[1]: https://fedoraproject.org/wiki/Bundled_Libraries?rd=Packaging:Bundled_Libraries

--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>