On 04.04.24 at 14:25, Atri Bhattacharya wrote:
Ben Greiner wrote:

I plead guilty as charged. There are many packages of mine where there 
is a Source without associated URL.
Ha ha, relax, no one is charging anyone! I think we are all guilty, and the faith we have had so far in trust-based models has been shaken.

Instead, let me take this chance to thank you earnestly for all the help you have accorded me with many packages, python packages in particular.

Likewise. The point is: as long as somewhat trusted and well-known contributors like you and me exhibit such practices, how would an attacker's maliciously crafted contribution stand out as suspicious in this regard?


This ranges from _source generated 
rust vendor.tar.xz (Who audits these?) to test data tarballs generated 
by manual downloads. Spot the connection to the XZ backdoor.

As far as cargo vendor.tar.xz tarballs are concerned, I may be mistaken, but it is possible to check these by regenerating them from the `Cargo.toml` file in the upstream git repository. I think this is done by the cargo_vendor.service already. The vendor.tar.xz contains sources from various upstream repositories, but all of them are traceable. So a downstream packager may find it difficult to maliciously alter this.

But is this actually executed? Does a bot, let alone a reviewer, really look into vendor.tar.xz and try to reproduce it? I am not too deep into Rust packaging, but does cargo_vendor actually create reproducible archives based on the lock files, or can vendored packages jump in version?
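
To make the question concrete, here is a rough sketch of the kind of check a bot could run: compare the shipped tarball against a freshly regenerated one by file content only, ignoring timestamps, ownership and member order. This is only an illustration under my own assumptions; the function names `tar_content_digests` and `diff_archives` are made up, and I do not know whether cargo_vendor or any existing bot does anything like this.

```python
import hashlib
import io
import tarfile


def tar_content_digests(tar_bytes: bytes) -> dict[str, str]:
    """Map every regular file in a tar archive to the SHA-256 of its content.

    Timestamps, owners, permissions and member order are deliberately ignored,
    so two archives built at different times can still compare as equal.
    """
    digests = {}
    # "r:*" lets tarfile detect the compression (xz, gz, bz2 or none) itself.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def diff_archives(shipped: bytes, regenerated: bytes) -> dict[str, list[str]]:
    """Report files that exist on one side only or whose content differs."""
    a = tar_content_digests(shipped)
    b = tar_content_digests(regenerated)
    return {
        "only_in_shipped": sorted(a.keys() - b.keys()),
        "only_in_regenerated": sorted(b.keys() - a.keys()),
        "content_differs": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }
```

If all three lists come back empty, the shipped archive carries the same file contents as the regenerated one. Note that this only covers regular files; symlinks and file modes would need separate handling.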

In theory, automake-generated source tarballs are also traceable, but in the case of xz nobody noticed that the shipped configure script was not generated by the m4 macros.


For data files that concern tests, I think we should simply not run tests that require data files not available from upstream. Sometimes upstream will recommend "run this script to download the data files" and as packagers we do run them on our machines and generate a tarball out of them. I think we should stop doing that.

Maybe, as an enhancement to the URL download and repo checkout services, we need an obs-service where we can specify a script for custom test data downloads. Such a script can then be reviewed at submission time and tested for reproducibility by bots.
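
As a sketch of what "tested for reproducibility" could mean here: the package would ship a manifest of pinned SHA-256 checksums next to the download script, and a bot would rerun the script and check the result against the manifest. Everything below is hypothetical, including the name `verify_test_data` and the manifest layout; no such obs-service exists as far as I know.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large test data does not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_test_data(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Check downloaded files against pinned digests.

    `manifest` maps relative POSIX paths to expected SHA-256 hex digests
    (a hypothetical format). Returns a list of problems; an empty list
    means the directory matches the manifest exactly.
    """
    problems = []
    for rel, expected in sorted(manifest.items()):
        target = data_dir / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(target) != expected:
            problems.append(f"checksum mismatch: {rel}")
    # Files the manifest does not know about are suspicious as well.
    present = {
        p.relative_to(data_dir).as_posix()
        for p in data_dir.rglob("*")
        if p.is_file()
    }
    for rel in sorted(present - set(manifest)):
        problems.append(f"unexpected file: {rel}")
    return problems
```

A reviewer would then only have to read the download script and the manifest, and the bot would reject a submission whenever the returned list is not empty.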


Thanks for your response.

Best wishes.
--
Atri

Cheers,
Ben