CEP 12 - Serving run_exports metadata in conda channels
Title | Serving run_exports metadata in conda channels |
Status | Accepted |
Author(s) | Jaime Rodríguez-Guerra <jrodriguez@quansight.com> |
Created | May 4, 2023 |
Updated | Jul 27, 2023 |
Discussion | https://github.com/conda-incubator/ceps/pull/51 |
Implementation | NA |
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 when, and only when, they appear in all capitals, as shown here.
Abstract
Have conda channels serve standalone run_exports metadata next to repodata.json.
Motivation
Build infrastructure (such as some conda-forge bots) uses run_exports to calculate which packages need rebuilding as part of an upgrade. Right now, conda-forge needs to maintain its own JSON database, which involves downloading and extracting the new artifacts as they become available, because the run_exports metadata lives inside the .tar.bz2 and .conda files.
Since conda-index already processes the package metadata to generate repodata.json, it would be trivial to also generate run_exports.json and serve them together.
Precedence and role in the pinning resolution process
This file is not meant to replace the run_exports metadata already present in the package archives. It merely presents that information in a more convenient way.
conda-build-like clients are free to use either the run_exports metadata in the archives or the one in run_exports.json, since they MUST be equivalent.
Special keys like pin_run_as_build MUST keep their behavior, since run_exports.json does not add a new level of precedence in the pinning resolution process. Again, it's an equivalent source for information already present in the archives.
This means that run_exports.json MUST NOT be patched (like it is done with repodata.json). It MUST always reflect the metadata present in the archives.
Note this only applies to the served run_exports.json file. It does not try to regulate what conda-build-like tools can do at environment creation time. They might need to apply modifications analogous to repodata patching to the run_exports metadata during the setup of the build, host and test environments. If patching run_exports.json is shown to be necessary for correct environment creation, it will be the subject of another CEP and could involve a change in the schema version to ensure backwards compatibility.
Specification
The schema of run_exports.json will mimic the repodata.json structure whenever possible:
- info: metadata about the platform, architecture, and version of the run_exports.json schema.
- packages: map of .tar.bz2 filenames to run_exports metadata dict.
- packages.conda: map of .conda filenames to run_exports metadata dict.
Each run_exports metadata dict can contain the following fields; each field accepts a list of strings (conda-build specs).
- weak
- strong
- weak_constrains
- strong_constrains
- noarch
{
  "info": {
    "platform": "string",
    "arch": "string",
    "subdir": "string",
    "version": 0
  },
  "packages": {
    "package-version-build.tar.bz2": {
      "run_exports": {
        "noarch": ["string"],
        "strong": ["string"],
        "strong_constrains": ["string"],
        "weak": ["string"],
        "weak_constrains": ["string"]
      }
    }
  },
  "packages.conda": {
    "package-version-build.conda": {
      "run_exports": {
        "noarch": ["string"],
        "strong": ["string"],
        "strong_constrains": ["string"],
        "weak": ["string"],
        "weak_constrains": ["string"]
      }
    }
  }
}
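As an illustration of how a client might read this structure, the following Python sketch looks up the run_exports entry for a given artifact filename, choosing the packages or packages.conda map from the file extension. The function name lookup_run_exports and the sample package names are hypothetical and not part of this CEP.

```python
import json

# Hypothetical sample document following the layout above; the package
# names and specs are made up for illustration.
SAMPLE = json.loads("""
{
  "info": {"platform": "linux", "arch": "x86_64", "subdir": "linux-64", "version": 0},
  "packages": {
    "libfoo-1.0-h0_0.tar.bz2": {"run_exports": {"weak": ["libfoo >=1.0,<2.0a0"]}}
  },
  "packages.conda": {
    "libbar-2.1-h1_0.conda": {"run_exports": {}}
  }
}
""")


def lookup_run_exports(document, filename):
    """Return the run_exports dict for filename, or None if it is not listed."""
    if filename.endswith(".conda"):
        section = "packages.conda"
    elif filename.endswith(".tar.bz2"):
        section = "packages"
    else:
        raise ValueError(f"unsupported artifact extension: {filename}")
    entry = document.get(section, {}).get(filename)
    return None if entry is None else entry["run_exports"]
```

An empty dict means the package is listed but defines no run_exports, while None means the file does not mention the artifact at all; a caller may want to treat those two cases differently.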
If a package does not define run_exports, the corresponding entry in packages or packages.conda MUST be an empty run_exports item:
{
  "info": {
    "platform": "string",
    "arch": "string",
    "subdir": "string",
    "version": 0
  },
  "packages": {
    "package-version-build.tar.bz2": {
      "run_exports": {}
    }
  },
  "packages.conda": {
    "package-version-build.conda": {
      "run_exports": {}
    }
  }
}
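A generator along the following lines could assemble such a document, including the empty run_exports item for packages that define none. This is a minimal sketch and not the conda-index implementation: build_run_exports_document and its arguments are hypothetical, and the input is assumed to be a mapping of artifact filenames to the run_exports metadata already extracted from each archive (None when the package has none).

```python
import json

SCHEMA_VERSION = 0  # the "version" value shown in the examples above


def build_run_exports_document(subdir, platform, arch, extracted):
    """Assemble a run_exports.json-style dict.

    `extracted` maps artifact filenames to the run_exports dict read from
    each archive, or to None when the package defines no run_exports.
    """
    document = {
        "info": {
            "platform": platform,
            "arch": arch,
            "subdir": subdir,
            "version": SCHEMA_VERSION,
        },
        "packages": {},
        "packages.conda": {},
    }
    for filename, run_exports in sorted(extracted.items()):
        if filename.endswith(".conda"):
            section = "packages.conda"
        elif filename.endswith(".tar.bz2"):
            section = "packages"
        else:
            continue  # not a conda artifact; skipped in this sketch
        # Packages without run_exports still get an (empty) entry.
        document[section][filename] = {"run_exports": dict(run_exports or {})}
    return document


if __name__ == "__main__":
    doc = build_run_exports_document(
        "linux-64", "linux", "x86_64",
        {
            "libfoo-1.0-h0_0.conda": {"strong": ["libfoo >=1.0"]},
            "tool-0.3-py_0.tar.bz2": None,
        },
    )
    print(json.dumps(doc, indent=2, sort_keys=True))
```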
See the validation schema draft in run_exports.schema.json.
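For quick checks without a JSON Schema validator, the structural rules described in this section can be approximated in plain Python. The sketch below is not a substitute for run_exports.schema.json: find_problems is a hypothetical helper, and it assumes that all three top-level keys are required and that no fields other than the five listed above are allowed, which the schema draft may define differently.

```python
# Field names listed in the specification above.
ALLOWED_FIELDS = {"weak", "strong", "weak_constrains", "strong_constrains", "noarch"}


def find_problems(document):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    # Assumption: all three top-level keys must be present and be mappings.
    for key in ("info", "packages", "packages.conda"):
        if not isinstance(document.get(key), dict):
            problems.append(f"{key!r} must be a mapping")
    for section in ("packages", "packages.conda"):
        entries = document.get(section)
        if not isinstance(entries, dict):
            continue
        for filename, entry in entries.items():
            run_exports = entry.get("run_exports") if isinstance(entry, dict) else None
            if not isinstance(run_exports, dict):
                problems.append(f"{filename}: missing 'run_exports' mapping")
                continue
            for field, specs in run_exports.items():
                if field not in ALLOWED_FIELDS:
                    problems.append(f"{filename}: unknown field {field!r}")
                elif not (isinstance(specs, list)
                          and all(isinstance(s, str) for s in specs)):
                    problems.append(f"{filename}: {field!r} must be a list of strings")
    return problems
```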
Backwards compatibility
This is a new feature, so there is no backwards compatibility to worry about.
Alternatives
We could maintain the status quo and ask downstream infrastructure to maintain their own database. However, this is a burden on them, and it is trivial to generate this data.
We also considered adding the run_exports metadata to repodata.json, but this has a few shortcomings:
- It would require extending the repodata schema, currently not formally standardized.
- It would increase the size of the already heavy repodata.json files.
- (Typed) repodata parsers would need to be updated to handle the new field.
Finally, we studied whether the run_exports metadata already present in the channeldata.json metadata would be enough. However, this metadata is only presented per version (not build), so it is insufficient and incomplete. Changing the channeldata.json schema to include per-build run_exports metadata would be a breaking change (e.g. conda-build's --use-channeldata option).
FAQ
Which packages should be included?
All packages present in the unpatched repodata.json (repodata_from_packages.json in some channels) documents should be included.
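One way a channel operator might check this is to compare filenames between the two documents. The sketch below is only an illustration under that reading: missing_from_run_exports is a hypothetical helper, and both inputs are assumed to be already-parsed JSON documents with packages and packages.conda maps.

```python
def missing_from_run_exports(unpatched_repodata, run_exports_document):
    """Return sorted filenames listed in the repodata but absent from run_exports."""
    missing = []
    for section in ("packages", "packages.conda"):
        expected = set(unpatched_repodata.get(section, {}))
        served = set(run_exports_document.get(section, {}))
        missing.extend(expected - served)
    return sorted(missing)
```

An empty result means every artifact in the unpatched repodata has a matching entry in the served file.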
Will this affect the performance of repodata fetching?
Only package building infrastructure (such as conda-forge bots) will need to fetch this data. The rest of the ecosystem (such as CLI conda clients) will not be affected.
References
- Initial proposal in conda-index
- write_run_exports implementation in conda-build
- run_exports schema in conda-build
- RunExports struct in rattler-build
Copyright
All CEPs are explicitly CC0 1.0 Universal.