CEP 12 - Serving run_exports metadata in conda channels
| Title | Serving run_exports metadata in conda channels |
| Status | Accepted |
| Author(s) | Jaime Rodríguez-Guerra <jrodriguez@quansight.com> |
| Created | May 4, 2023 |
| Updated | Jul 27, 2023 |
| Discussion | https://github.com/conda-incubator/ceps/pull/51 |
| Implementation | NA |
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 when, and only when, they appear in all capitals, as shown here.
Abstract
Have conda channels serve standalone run_exports metadata, next to repodata.json.
Motivation
Building infrastructure (such as some conda-forge bots) use run_exports to calculate which packages need rebuilding as part of an upgrade. Right now, conda-forge needs to maintain their own JSON database, which involves downloading and extracting the new artifacts as they become available: the run_exports metadata lives inside the .tar.bz2 and .conda files.
Since conda-index already processes the package metadata to generate repodata.json, it would be trivial to also generate run_exports.json and serve them together.
Precedence and role in the pinning resolution process
This file is not meant to replace the run_exports metadata already present in the package archives. It merely presents that information in a more convenient way.
conda-build-like clients are free to use either the run_exports metadata in the archives or the one in run_exports.json, since they MUST be equivalent.
Special keys like pin_run_as_build MUST keep their behavior, since run_exports.json does not add a new level of precedence in the pinning resolution process. Again, it's an equivalent source for information already present in the archives.
This means that run_exports.json MUST NOT be patched (like it is done with repodata.json). It MUST always reflect the metadata present in the archives.
Note this only applies to the served
run_exports.jsonfile. It does not try to regulate whatconda-build-like tools can do at environment creation time. They might need to apply modifications analogous to repodata patching to therun_exportsmetadata during thebuild,hostandtestenvironments setup. If patchingrun_exports.jsonis shown to be necessary for correct environment creation, it will be the subject of another CEP and could involve a change in the schema version to ensure backwards compatibility.
Specification
The schema of run_exports.json will mimic the repodata.json structure whenever possible:
info: metadata about the platform, architecture, and version of therun_exports.jsonschema.packages: map of.tar.bz2filenames torun_exportsmetadatadict.packages.conda: map of.condafilenames torun_exportsmetadatadict.
Each run_exports metadata dict can contain the following fields; each field accepts a list of strings (conda-build specs).
weakstrongweak_constrainsstrong_constrainsnoarch
{
"info": {
"platform": "string",
"arch": "string",
"subdir": "string",
"version": 0
},
"packages": {
"package-version-build.tar.bz2": {
"run_exports": {
"noarch": [
"string",
],
"strong": [
"string",
],
"strong_constrains": [
"string",
],
"weak": [
"string",
],
"weak_constrains": [
"string",
],
}
}
},
"packages.conda": {
"package-version-build.conda": {
"run_exports": {
"noarch": [
"string",
],
"strong": [
"string",
],
"strong_constrains": [
"string",
],
"weak": [
"string",
],
"weak_constrains": [
"string",
],
}
}
}
}
If a package does not define run_exports, the corresponding entry in packages or packages.conda MUST be an empty run_exports item:
{
"info": {
"platform": "string",
"arch": "string",
"subdir": "string",
"version": 0
},
"packages": {
"package-version-build.tar.bz2": {
"run_exports": {}
}
},
"packages.conda": {
"package-version-build.conda": {
"run_exports": {}
}
}
}
See the validation schema draft in
run_exports.schema.json.
Backwards compatibility
This is a new feature, so there is no backwards compatibility to worry about.
Alternatives
We could maintain the status quo and ask downstream infrastructure to maintain their own database. However, this is a burden on them, and it is trivial to generate this data.
We also considered adding the run_exports metadata to repodata.json, but this has a few shortcomings:
- It would require extending the
repodataschema, currently not formally standardized. - It would increase the size of the already heavy
repodata.jsonfiles. - (Typed) repodata parsers would need to be updated to handle the new field.
Finally, we studied whether the run_exports metadata already present in the channeldata.json metadata would be enough. However, this metadata is only presented per version (not build), so it is insufficient and incomplete. Changing the channeldata.json schema to include per-build run_exports metadata would be a breaking change (e.g. conda-build's --use-channeldata option).
FAQ
Which packages should be included?
All packages present in the unpatched repodata.json (repodata_from_packages.json in some channels) documents should be included.
Will this affect the performance of repodata fetching?
Only package building infrastructure (such as conda-forge bots) will need to fetch this data. The rest of the ecosystem (such as CLI conda clients) will not be affected.
References
- Initial proposal in
conda-index write_run_exportsimplementation inconda-buildrun_exportsschema inconda-buildRunExportsstruct inrattler-build
Copyright
All CEPs are explicitly CC0 1.0 Universal.