CEP 27 - Standardizing a publish attestation for the conda ecosystem

Title Standardizing a publish attestation for the conda ecosystem
Status Accepted
Author(s) Wolf Vollprecht <wolf@prefix.dev>, William Woodruff <william.woodruff@trailofbits.com>
Created Feb 18, 2025
Updated Jul 04, 2025
Discussion https://github.com/conda/ceps/pull/112
Implementation N/A

Abstract

This CEP proposes a standard attestation layout for the conda ecosystem. This attestation layout is based on the in-toto framework and will enable further integration with signing schemes like Sigstore.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 when, and only when, they appear in all capitals, as shown here.

More specifically, violations of a MUST or MUST NOT rule MUST result in an error. Violations of the rules specified by any of the other all-capital terms MAY result in a warning, at the discretion of the implementation.

Definitions and Concepts

  • An attestation is a machine-readable cryptographically signed statement. When an attestation's signature is verified against a trusted key, that verification provides integrity and authenticity guarantees about the attestation's subject. For example:

    • Alice is the maintainer of the widgets package.

    • Alice signs a machine-readable statement equivalent to the following English sentence, producing her attestation:

      > Alice published the `widgets` package at version v1.2.3 with
      > hash `sha256:abcd...` to the `conda-forge` channel.
    • Bob establishes trust in Alice's public key.

    • Bob can verify the attestation's signature against Alice's public key, giving him confidence that the statement is true.

    • Correspondingly, Bob can reject any statement for widgets that is not signed by Alice's public key.

  • in-toto is a framework and standard for defining attestations.

    • Within in-toto, an attestation's statement is composed of a subject and a predicate. The subject is the resource (or resources) being attested to, and the predicate is an arbitrary collection of metadata about the subject. The predicate is identified by a predicate type, which defines the predicate's expected schema.
  • Sigstore is a project that enables misuse-resistant software signing and verification via short-lived certificates and a tamper-evident log. Sigstore uses attestation frameworks like in-toto to provide transparency and misuse-resistance properties on top of the integrity and authenticity properties of attestations.

    One of Sigstore's major misuse-resistance contributions is the use of ephemeral keys for signing. Modifying the example above:

    • Instead of maintaining a long-lived signing key, Alice generates an ephemeral key and binds it to her identity ("alice@trustme.example.com").

      This binding is done via a certificate issued by Fulcio, which verifies a proof of possession (such as from OpenID Connect) from Alice for her identity. The certificate issued by Fulcio is, in turn, auditable via RFC 6962 Certificate Transparency (CT) logs.

    • Alice signs her attestation with her ephemeral key, and distributes a "bundle" containing both her attestation and her signing certificate.

    • Instead of establishing trust with a long-lived key from Alice, Bob establishes trust in Alice's identity.

    • Bob can verify the attestation's signature against Alice's ephemeral key, which in turn can be verified as authentically Alice's via the Fulcio-issued certificate.

    With this flow, neither Alice nor Bob needs to maintain long-lived signing or verifying keyrings, in turn reducing the attack surface for key compromise.

    Another key misuse-resistance contribution within Sigstore is machine identities. A machine identity behaves similarly to a human identity (Alice or Bob), but identifies a machine instead of a human. For example, github.com/example/example/.github/workflows/release.yml@refs/tags/v1.2.3 could be the machine identity of a GitHub Actions workflow that ran from release.yml within example/example against the v1.2.3 tag.

Motivation

The conda ecosystem contains metadata that answers the following questions, in part or in full:

  • Who (or what) published this package?
  • What is the package's hash?
  • Where was this package published from, and where to?
  • When was this package published?

However, the authenticity of this metadata is not currently cryptographically verifiable: the consuming party must either trust it as presented, or verify it manually against independent sources of truth (such as a project's release history).

Attestations that present this metadata in a cryptographically verifiable manner are desirable for a number of reasons:

  • Package maintainers wish to demonstrate the integrity and authenticity of their package uploads;
  • Individual downstream users wish to verify the integrity and authenticity of packages they consume, without placing additional trust in the channel or channel's hosting server;
  • Attestations change the sophistication and risk profile for attackers in defenders' favor: the attacker must be sufficiently sophisticated to access private key material, and have a risk tolerance profile that accepts exposure via auditable transparency logs.

More broadly, attestation schemes like the one proposed in this CEP have seen adoption in similar and related ecosystems, such as PyPI and RubyGems.

Specification

Attestation format

This CEP proposes the following attestation statement layout, using the in-toto Statement schema:

  • predicateType MUST be https://schemas.conda.org/attestations-publish-1.schema.json
  • subject MUST be a single ResourceDescriptor, with the following constraints:
    • subject[0].name MUST be the full filename of the conda package, exactly as it will be listed in repodata.json and as it will appear on the server.
    • subject[0].digest MUST be a DigestSet, and it MUST contain a single sha256 entry with the SHA256 hash of the conda package.
  • predicate MAY be present. If present and not null, it MUST be a JSON object with the following fields:
    • targetChannel MUST be a string, indicating where the package is being uploaded to. This field MUST be a valid URL with no trailing slashes.

An example of a compliant statement is provided below:

{
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{
        "name": "file-name-0.0.1-h123456_5.conda",
        "digest": {"sha256": "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b"}
    }],
    "predicateType": "https://schemas.conda.org/attestations-publish-1.schema.json",
    "predicate": {
        "targetChannel": "https://prefix.dev/conda-forge"
    }
}
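The layout above can be produced mechanically from a package file. The following is a minimal sketch using only the Python standard library; the helper name `build_publish_statement` is hypothetical and not part of any conda tooling, and the result is an unsigned statement that still has to be signed as described later in this CEP.

```python
import hashlib
from pathlib import Path

PREDICATE_TYPE = "https://schemas.conda.org/attestations-publish-1.schema.json"
STATEMENT_TYPE = "https://in-toto.io/Statement/v1"


def build_publish_statement(package_path: Path, target_channel: str) -> dict:
    """Build an unsigned publish statement for one conda package (illustrative helper)."""
    # targetChannel MUST NOT carry a trailing slash.
    if target_channel.endswith("/"):
        raise ValueError("targetChannel must not have a trailing slash")
    # The digest set MUST contain a single sha256 entry for the package file.
    digest = hashlib.sha256(package_path.read_bytes()).hexdigest()
    return {
        "_type": STATEMENT_TYPE,
        # Exactly one subject: the package filename and its SHA256 hash.
        "subject": [{"name": package_path.name, "digest": {"sha256": digest}}],
        "predicateType": PREDICATE_TYPE,
        "predicate": {"targetChannel": target_channel},
    }
```

Serializing the returned dictionary with `json.dumps` yields a document with the same shape as the example above.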
JSON schema
{
  "$defs": {
    "CondaPublishAttestationPredicate": {
      "$id": "https://schemas.conda.org/attestations-publish-1.schema.json",
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "description": "JSON Schema for the predicate field of conda publish attestations. This schema defines the structure and validation rules for the predicate portion of an in-toto statement used in conda package publish attestations, as specified in CEP 27.",
      "properties": {
        "targetChannel": {
          "description": "The channel where the package is being uploaded to. This must be a valid URL with no trailing slashes.",
          "examples": [
            "https://prefix.dev/conda-forge",
            "https://conda.anaconda.org/conda-forge"
          ],
          "format": "uri",
          "maxLength": 2083,
          "minLength": 1,
          "title": "Targetchannel",
          "type": "string"
        }
      },
      "required": [
        "targetChannel"
      ],
      "title": "Conda Publish Attestation Predicate",
      "type": "object"
    },
    "ResourceDescriptor": {
      "properties": {
        "name": {
          "description": "The full filename of the conda package",
          "examples": [
            "numpy-1.24.3-py310h5f9d8e6_0.conda"
          ],
          "title": "Name",
          "type": "string"
        },
        "digest": {
          "additionalProperties": {
            "type": "string"
          },
          "description": "Digest set containing SHA256 hash of the package",
          "examples": [
            {
              "sha256": "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b"
            }
          ],
          "title": "Digest",
          "type": "object"
        }
      },
      "required": [
        "name",
        "digest"
      ],
      "title": "ResourceDescriptor",
      "type": "object"
    }
  },
  "description": "Complete in-toto statement for conda publish attestations.\n\nThis is provided for reference and validation, showing how the predicate\nfits into the complete in-toto statement structure.",
  "properties": {
    "_type": {
      "const": "https://in-toto.io/Statement/v1",
      "description": "The in-toto statement type",
      "name": "_type",
      "title": "Type",
      "type": "string"
    },
    "subject": {
      "description": "List containing exactly one ResourceDescriptor for the conda package",
      "items": {
        "$ref": "#/$defs/ResourceDescriptor"
      },
      "maxItems": 1,
      "minItems": 1,
      "title": "Subject",
      "type": "array"
    },
    "predicateType": {
      "const": "https://schemas.conda.org/attestations-publish-1.schema.json",
      "description": "The predicate type for conda publish attestations",
      "title": "Predicatetype",
      "type": "string"
    },
    "predicate": {
      "anyOf": [
        {
          "$ref": "#/$defs/CondaPublishAttestationPredicate"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "The conda publish attestation predicate"
    }
  },
  "required": [
    "_type",
    "subject",
    "predicateType"
  ],
  "title": "CondaPublishAttestationStatement",
  "type": "object"
}
Pydantic Model
#!/usr/bin/env python
#
# /// script
# requires-python = ">=3.12"
# dependencies = [
# "pydantic>=2.11",
# ]
# ///
#
"""
Pydantic model for the conda publish attestation predicate schema.
This generates the JSON schema for https://schemas.conda.org/attestations-publish-1.schema.json
"""

from typing import Annotated, Literal, Optional
from pydantic import AfterValidator, BaseModel, Field, HttpUrl
import json


class CondaPublishAttestationPredicate(BaseModel):
    """
    Predicate for conda publish attestations.

    This represents the predicate portion of an in-toto statement for conda package
    publish attestations, as defined in CEP 27.
    """

    target_channel: HttpUrl = Field(
        ...,
        alias="targetChannel",
        description=(
            "The channel where the package is being uploaded to. "
            "This must be a valid URL with no trailing slashes."
        ),
        examples=["https://prefix.dev/conda-forge", "https://conda.anaconda.org/conda-forge"]
    )

    class Config:
        # Allow field population by both name and alias (camelCase in JSON, snake_case in Python)
        validate_by_name = True
        # Generate schema with field aliases
        json_schema_extra = {
            "$schema": "https://json-schema.org/draft/2020-12/schema",
            "$id": "https://schemas.conda.org/attestations-publish-1.schema.json",
            "title": "Conda Publish Attestation Predicate",
            "description": (
                "JSON Schema for the predicate field of conda publish attestations. "
                "This schema defines the structure and validation rules for the predicate "
                "portion of an in-toto statement used in conda package publish attestations, "
                "as specified in CEP 27."
            ),
        }


def validate_sha256_present(v):
    """Ensure that the digest dictionary contains a SHA256 hash."""
    if not isinstance(v, dict):
        raise ValueError("digest must be a dictionary")
    if 'sha256' not in v:
        raise ValueError("digest must contain a 'sha256' key")
    if not v['sha256']:
        raise ValueError("sha256 value cannot be empty")
    # Basic validation that SHA256 looks like a hex string of correct length
    sha256_value = v['sha256']
    if not isinstance(sha256_value, str):
        raise ValueError("sha256 value must be a string")
    if len(sha256_value) != 64:
        raise ValueError("sha256 value must be exactly 64 characters long")
    try:
        int(sha256_value, 16)
    except ValueError:
        raise ValueError("sha256 value must be a valid hexadecimal string")
    return v


def validate_conda_package_name(v):
    """Validate conda package filename format."""
    if not isinstance(v, str):
        raise ValueError("package name must be a string")

    # Check if filename ends with valid extensions
    if not (v.endswith('.conda') or v.endswith('.tar.bz2')):
        raise ValueError("package name must end with '.conda' or '.tar.bz2'")

    # Remove extension to check the base name
    if v.endswith('.conda'):
        base_name = v[:-6]  # Remove '.conda'
    else:  # ends with '.tar.bz2'
        base_name = v[:-8]  # Remove '.tar.bz2'

    # Check if all characters are lowercase
    if base_name != base_name.lower():
        raise ValueError("package name must be all lowercase")

    # Check for at least 2 dashes (separating name, version, and build string)
    dash_count = base_name.count('-')
    if dash_count < 2:
        raise ValueError("package name must contain at least 2 dashes separating name, version, and build string")

    # Basic format validation: should have name-version-build pattern
    parts = base_name.split('-')
    if len(parts) < 3:
        raise ValueError("package name must follow name-version-build format")

    # Check that no part is empty
    if any(not part for part in parts):
        raise ValueError("package name parts (name, version, build) cannot be empty")

    return v


class CondaPublishAttestationStatement(BaseModel):
    """
    Complete in-toto statement for conda publish attestations.

    This is provided for reference and validation, showing how the predicate
    fits into the complete in-toto statement structure.
    """

    class ResourceDescriptor(BaseModel):
        name: Annotated[str, AfterValidator(validate_conda_package_name)] = Field(
            ...,
            description="The full filename of the conda package",
            examples=["numpy-1.24.3-py310h5f9d8e6_0.conda"]
        )
        digest: Annotated[dict[str, str], AfterValidator(validate_sha256_present)] = Field(
            ...,
            description="Digest set containing SHA256 hash of the package",
            examples=[{"sha256": "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b"}]
        )

    type_: Literal["https://in-toto.io/Statement/v1"] = Field(
        name="_type",
        alias="_type",
        description="The in-toto statement type"
    )

    subject: list[ResourceDescriptor] = Field(
        ...,
        min_items=1,
        max_items=1,
        description="List containing exactly one ResourceDescriptor for the conda package"
    )

    predicate_type: Literal["https://schemas.conda.org/attestations-publish-1.schema.json"] = Field(
        alias="predicateType",
        description="The predicate type for conda publish attestations"
    )

    predicate: Optional[CondaPublishAttestationPredicate] = Field(
        None,
        description="The conda publish attestation predicate"
    )

    class Config:
        validate_by_name = True


def generate_predicate_schema() -> dict:
    """Generate the JSON schema for the conda publish attestation predicate."""
    return CondaPublishAttestationPredicate.model_json_schema()


def generate_complete_statement_schema() -> dict:
    """Generate the JSON schema for the complete in-toto statement."""
    return CondaPublishAttestationStatement.model_json_schema()


if __name__ == "__main__":
    # Generate and print the predicate schema
    predicate_schema = generate_predicate_schema()
    print("=== Conda Publish Attestation Predicate Schema ===")
    print(json.dumps(predicate_schema, indent=2))

    print("\n" + "="*60 + "\n")

    # Generate and print the complete statement schema for reference
    complete_schema = generate_complete_statement_schema()
    print("=== Complete in-toto Statement Schema (for reference) ===")
    print(json.dumps(complete_schema, indent=2))

    # Validate the example from the CEP
    print("\n" + "="*60 + "\n")
    print("=== Validating CEP Example ===")

    example_statement = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{
            "name": "file-name-0.0.1-h123456_5.conda",
            "digest": {"sha256": "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b"},
        }],
        "predicateType": "https://schemas.conda.org/attestations-publish-1.schema.json",
        "predicate": {
            "targetChannel": "https://prefix.dev/conda-forge",
        }
    }

    try:
        validated = CondaPublishAttestationStatement.model_validate(example_statement)
        print("✅ CEP example validates successfully!")
        print(f"Validated statement: {validated}")
    except Exception as e:
        print(f"❌ Validation failed: {e}")
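The targetChannel description asks for a valid URL with no trailing slashes. A stand-alone check for that rule might look like the sketch below. It uses only the standard library; the function name `check_target_channel` is hypothetical, and restricting the scheme to http or https is an assumption that mirrors the HTTP URL type used in the model above rather than something the CEP text states.

```python
from urllib.parse import urlsplit


def check_target_channel(value: str) -> str:
    """Return the value unchanged if it is an acceptable channel URL, else raise ValueError."""
    parts = urlsplit(value)
    # Assumption: channels are addressed over http(s) and must name a host.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("targetChannel must be an absolute http(s) URL")
    # The CEP requires that the URL has no trailing slashes.
    if value.endswith("/"):
        raise ValueError("targetChannel must not end with a slash")
    return value
```

A function of this shape could also be attached to the field as an additional validator, in the same way the digest and filename validators are attached above.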

Signing and distributing

This CEP recommends the following signing process:

  1. The signer (i.e. Alice or Alice's trusted machine identity) uses a Sigstore-compatible client to generate an ephemeral keypair and bind it to their identity via a public certificate.
  2. The signer generates an in-toto statement as described above, and produces an attestation by signing that statement with their ephemeral private key.
  3. The signer uploads their attestation to the Sigstore transparency log (the Public Good Instance) as a DSSE envelope.
  4. The signer produces a Sigstore bundle containing their certificate, attestation, and transparency log inclusion proof.

Each of these steps is performed transparently by a Sigstore client like sigstore-python, except for the generation of the in-toto statement in step (2), which should be handled externally and then fed into the Sigstore client for signing.

The result of this process is a single Sigstore bundle, which can be distributed alongside the conda package or otherwise made discoverable.
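For readers unfamiliar with DSSE, the sketch below illustrates how an in-toto statement is wrapped in an envelope before signing. The pre-authentication encoding (PAE) and the `application/vnd.in-toto+json` payload type follow the DSSE and in-toto specifications; the `sign` callable is a hypothetical placeholder for whatever produces the signature with the ephemeral key. In practice a Sigstore client performs all of this internally, so this is an illustration of the data layout rather than a signing implementation.

```python
import base64
import json
from typing import Callable

IN_TOTO_PAYLOAD_TYPE = "application/vnd.in-toto+json"


def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE pre-authentication encoding: the exact bytes that are signed."""
    type_bytes = payload_type.encode("utf-8")
    # "DSSEv1" SP LEN(type) SP type SP LEN(payload) SP payload
    return b"DSSEv1 %d %b %d %b" % (len(type_bytes), type_bytes, len(payload), payload)


def make_envelope(statement: dict, sign: Callable[[bytes], bytes]) -> dict:
    """Wrap a statement in a DSSE envelope; `sign` is a hypothetical signer."""
    payload = json.dumps(statement).encode("utf-8")
    signature = sign(pae(IN_TOTO_PAYLOAD_TYPE, payload))
    return {
        "payloadType": IN_TOTO_PAYLOAD_TYPE,
        "payload": base64.b64encode(payload).decode("ascii"),
        "signatures": [{"sig": base64.b64encode(signature).decode("ascii")}],
    }
```

Because the signature covers the payload type as well as the payload bytes, a verifier cannot be tricked into interpreting a signed statement as a different kind of document.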

This CEP does not define a distribution mechanism; see Future work.

Verifying

This CEP recommends the following verification process:

  1. The verifier retrieves Alice's conda package and associated Sigstore bundle.

  2. The verifier performs a standard Sigstore verification process against the bundle, using Alice's identity (or machine identity) as the signing identity. This process produces a verified in-toto statement.

    This step requires the verifier to establish trust in the identity being verified against.

    Exact mechanisms for establishing this trust are outside the scope of this CEP; see Future work.

  3. The verifier checks the in-toto statement for consistency against their ground truth:

    • The predicateType field MUST be https://schemas.conda.org/attestations-publish-1.schema.json.
    • The subject[0].name field MUST match the filename of the conda package.
    • The subject[0].digest field MUST match the SHA256 hash of the conda package.
    • The predicate.targetChannel field SHOULD match the channel that the package was retrieved from, if predicate is present. However, the verifier MAY choose to allow a channel mismatch, e.g. if the known context is a mirroring context (where the conda package was originally published to a different channel, but is now being consumed from a mirror).
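The consistency checks in step 3 can be sketched as follows. This assumes the statement has already been obtained from a successful Sigstore verification in step 2; the function name `check_statement` and its parameters are hypothetical. MUST violations raise an error and the SHOULD violation emits a warning, in line with the conventions stated at the top of this CEP.

```python
import hashlib
import warnings

PREDICATE_TYPE = "https://schemas.conda.org/attestations-publish-1.schema.json"


def check_statement(statement: dict, filename: str, package_bytes: bytes,
                    channel: str, allow_mirror: bool = False) -> None:
    """Check a verified statement against the verifier's ground truth."""
    if statement.get("predicateType") != PREDICATE_TYPE:
        raise ValueError("unexpected predicateType")
    subjects = statement.get("subject") or []
    if len(subjects) != 1:
        raise ValueError("expected exactly one subject")
    subject = subjects[0]
    if subject.get("name") != filename:
        raise ValueError("subject name does not match the package filename")
    expected = hashlib.sha256(package_bytes).hexdigest()
    claimed = str(subject.get("digest", {}).get("sha256", "")).lower()
    if claimed != expected:
        raise ValueError("subject digest does not match the package hash")
    predicate = statement.get("predicate")
    # The channel check is a SHOULD, and may be skipped in a mirroring context.
    if predicate is not None and not allow_mirror:
        if predicate.get("targetChannel") != channel:
            warnings.warn("targetChannel does not match the retrieval channel")
```

A stricter client could turn the warning into an error; the CEP leaves that choice to the implementation.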

At the end of this process, the verifier is confident in the following facts:

  • The package was published by the signer (Alice or Alice's machine identity).
    • If the publisher is a machine identity, this further establishes build provenance via the machine identity's claims. See Sigstore OID information for additional information on these claims.
  • The integrity and authenticity of the package are guaranteed by the signer.

This CEP suggests that attestation verification be performed by both clients (i.e. package installers retrieving packages from attestation-aware channels) and servers (i.e. attestation-aware channels).

Servers MUST perform the same verification process as clients, with the qualification that the server's trust in the signing identity is established latently via the server's publishing and upload mechanism. For example, if the server supports Trusted Publishing, then the package's attestation should be verified against the set of Trusted Publisher identities for that package.
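On the server side, the identity check might reduce to a lookup like the sketch below. The in-memory `trusted` mapping, the function names, and the example identity strings are all hypothetical; a real channel would derive the trusted set from its own Trusted Publishing configuration, and the signer identity would come from the verified Sigstore certificate.

```python
def package_name_from_filename(filename: str) -> str:
    """Extract the package name from a name-version-build conda filename."""
    for ext in (".conda", ".tar.bz2"):
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            break
    else:
        raise ValueError("not a conda package filename")
    # The name itself may contain dashes, so split from the right.
    name, _version, _build = stem.rsplit("-", 2)
    return name


def is_trusted_publisher(filename: str, signer_identity: str,
                         trusted: dict[str, set[str]]) -> bool:
    """Return True if the verified identity is registered for this package."""
    name = package_name_from_filename(filename)
    return signer_identity in trusted.get(name, set())
```

A package with no registered identities is rejected by this sketch, which is the conservative default for a server that requires attestations.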

Security Model

This section provides a high-level security model for the scheme proposed in this CEP.

Unforgeability of provenance

Traditional package distribution schemes depend on untrusted metadata for provenance: the package itself describes its source repository, its maintainers, etc.

This allows an attacker to forge a package's metadata such that a typical downstream unduly trusts the package. For example, a typical downstream may observe that the package lists a "trustworthy" GitHub repository or organization as its source, and therefore trust the package without confirming that its contents actually reflect those of the repository. This is compounded when the distribution format (such as a conda package or Python wheel) does not directly match the source layout (such as the contents of a GitHub repository), making direct comparison nontrivial.

This CEP's design introduces an unforgeability property: a package that claims a piece of untrusted metadata can be verified by cross-checking that metadata against the package's attestation.

An attacker cannot forge or spoof this check, modulo their ability to compromise the authentic signing identity itself.

Transparency and auditability

Traditional signing schemes provide integrity and authenticity modulo trust in a signing identity.

This is a strong property, but not a perfect one: an attacker who does manage to compromise a signing identity can mount a targeted attack, wherein the general public observes only legitimate artifacts and signatures while the victim receives a malicious artifact with a valid signature from the compromised signing identity. In effect, this makes it possible for the attacker to maintain their stealth during a targeted attack, since untargeted parties are not made aware of the malicious artifact or its signature.

This CEP's design introduces a transparency property: attestations are not considered valid unless they are included in a publicly auditable, append-only transparency log. This log is made up of entries, which bind over the attestation itself (including the subject), the attestation's signature, as well as the attestation's signing identity.

Because a valid transparency log inclusion proof is a requirement during verification, an attacker who compromises a signing identity cannot perform a targeted attack without making the attack itself publicly discoverable. This means the log can be monitored in real time for uses of identities, allowing users to proactively detect and respond to an identity being compromised, rather than having to wait for the compromise to be detected downstream and reported.
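To make the inclusion proof concrete, the sketch below checks a Merkle audit path against a tree root. It assumes RFC 6962-style hashing (a 0x00 prefix for leaves and 0x01 for interior nodes), and the function names are illustrative. A Sigstore client performs this check internally and also verifies the log's signed tree head, which is not shown here.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    """Hash of a leaf entry (0x00 domain-separation prefix)."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash of an interior node (0x01 domain-separation prefix)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, index: int, tree_size: int,
                     path: list[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf and its audit path, then compare."""
    if not 0 <= index < tree_size:
        return False
    fn, sn = index, tree_size - 1
    r = leaf_hash(leaf)
    for p in path:
        if sn == 0:
            return False  # path is longer than the tree is deep
        if fn & 1 or fn == sn:
            # Current node is a right child: the sibling goes on the left.
            r = node_hash(p, r)
            if not fn & 1:
                # Skip levels where this node has no sibling.
                while fn and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            # Current node is a left child: the sibling goes on the right.
            r = node_hash(r, p)
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

Any change to the logged entry changes the leaf hash, so the recomputed root no longer matches the one published by the log.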

Pre-established trust in signing identities

Like other signing schemes, this CEP's design does not eliminate the need for trust establishment: a verifier who blindly accepts attestations from any signing identity is effectively only protected against opportunistic network-side attackers.

Discussion

This predicate adds basic verifiable facts about the package. It will tie the producer of the package to the target channel, if the attestation's producer chooses to do so. A verifier can then use this information during the verification process, checking that the target channel matches the channel the package was retrieved from, or ignoring a mismatch between channels when the package is retrieved from a mirror.

This is similar to what PyPI has implemented with the PyPI publish attestation. Since there is no single authoritative index in the conda world, we add the targetChannel field to reach parity.

Future work

This CEP leaves three aspects of a complete attestation scheme open for future discussion and work:

  1. This CEP does not specify a mechanism for establishing trust in each conda package's signing identities. This is a non-trivial problem analogous to the key distribution problem in traditional PKI systems, albeit without the complications and operational challenges associated with long-lived key material.

    One potential mechanism for identity trust distribution is a TOFU (trust on first use) scheme with an attestation-aware conda channel, where package names are "locked" to attesting identities on first use, with subsequent updates being verified against that identity. This scheme requires a suitable lockfile format; PEP 751 and its [[package.attestation-identities]] table may serve as prior art for a similar approach in the conda ecosystem.

  2. This CEP does not specify a distribution mechanism for attestations (i.e., Sigstore bundles containing attestations).

    One potential distribution mechanism is to have attestation-aware conda channels distribute each package's attestations alongside the package, or via similarly discoverable channel-side data. Prior art for this type of distribution mechanism can be found in the PyPI and RubyGems ecosystems, e.g. PyPI's Integrity API.

  3. This CEP specifies a single initial in-toto predicate (https://schemas.conda.org/attestations-publish-1.schema.json), which conveys a binding between a signing identity and its intent to publish a particular conda package to a particular channel.

    Future iterations of conda's attestation design may wish to support and use other predicate types, such as the SLSA Provenance (Supply-chain Levels for Software Artifacts) layout. Doing so would expose additional metadata about the package's source and build provenance, giving conda package consumers greater control over their consumption and admission policies.
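As an illustration of the TOFU idea from item 1 above, the following sketch pins a package name to the first attesting identity seen and rejects later mismatches. The in-memory `lock` dictionary stands in for a lockfile, and both the function name and the one-identity-per-package simplification are assumptions made for this example, not part of this CEP.

```python
def tofu_check(lock: dict[str, str], package_name: str, signer_identity: str) -> bool:
    """Pin the identity on first use; afterwards require the same identity."""
    # setdefault stores the identity only if the package is not yet pinned,
    # and returns whichever identity is pinned after the call.
    pinned = lock.setdefault(package_name, signer_identity)
    return pinned == signer_identity


# Usage: the first call pins the identity, later calls compare against it.
example_lock: dict[str, str] = {}
first_ok = tofu_check(example_lock, "widgets", "alice@trustme.example.com")
same_ok = tofu_check(example_lock, "widgets", "alice@trustme.example.com")
other_ok = tofu_check(example_lock, "widgets", "mallory@evil.example.com")
```

The weakness of any TOFU scheme is the first use itself: if the initial retrieval is already compromised, the wrong identity gets pinned, which is why the lock contents would need to be reviewable.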