Security

This document specifies the authentication, integrity, and anti-gaming requirements for the #trstd <protocol>.

Normative language follows RFC 2119: MUST, SHOULD, MAY indicate requirement levels.

Transport Security

All communication in the #trstd <protocol> MUST use HTTPS (TLS 1.2 or later). This applies to:

Agent requests to the trust authority
Agent resolution of did:web DID Documents

HTTPS is the sole integrity protection for the link tag. If a provider's domain is compromised, an attacker can inject a malicious link tag. Response signatures mitigate this — see Discovery — Integrity.

Agent Identification

Agent identification is optional per-request. Authorities MUST NOT gate access on identification — requests without an Authorization header MUST be served. Agents MAY present a did:web JWT in the Authorization header as self-identification. See ADR-006 for the decision rationale. The full specification is in the API specification. DID resolution and JWT verification mechanics follow the TSAI protocol.

When a JWT is present, the authority MUST:

Verify the JWT signature against the agent's DID Document public key
Reject expired tokens (with a maximum clock-skew tolerance of 60 seconds)
Return 401 unauthorized if the token is invalid or if DID Document resolution fails or exceeds 5 seconds

A missing Authorization header MUST NOT trigger 401. 401 covers only the "JWT presented and invalid" case.

Authorities MAY use presented identity for logging, rate-limiting, or response tailoring. The protocol does not specify or constrain these behaviors.

The trust authority SHOULD log the agent's did:web identifier with each response issued on an identified request. For unidentified requests, the authority has no cryptographic agent identity to log.

URL Privacy

The verify request includes the full URL the agent is visiting, including query parameters. Entity matching uses only hostname and path (see Discovery specification), so query parameters are not needed for scope validation. They may contain session identifiers, tracking parameters, or user-specific context that the authority does not need for trust assessment.

Sending the full URL is a simplicity tradeoff. It avoids requiring agents to implement URL decomposition or parameter-stripping rules. The authority can also use the full URL for additional validation beyond basic scope matching.

The tradeoff has two practical consequences:

Agents should assume the authority can observe the full URL, including any embedded tokens or identifiers.
Authorities SHOULD NOT use query parameter values for purposes beyond scope validation and logging. The protocol does not enforce this — it is a trust boundary between the agent and the authority.

URL minimization — structured scope objects or explicit stripping rules to reduce the data agents expose — is tracked on the roadmap.

Response Signatures

Signing

The trust authority MUST sign every successful response (HTTP 200). The signature covers the complete response body excluding the signature field itself.

Algorithm: Ed25519 (EdDSA) as specified in RFC 8037.

Format: The response body includes two fields for signature verification:

Field	Description
`kid`	Key identifier matching a key in the authority's published JWKS
`signature`	Base64url-encoded raw Ed25519 signature (64 bytes, no padding)

The signing input is the JCS-canonicalized (RFC 8785) response body with the signature field excluded. The kid field is part of the signed payload — an attacker cannot swap key identifiers without invalidating the signature. JCS plus a raw Ed25519 signature was chosen over JWS — see ADR-015: Raw Ed25519 Signature.

Key Publication

The trust authority MUST publish its signing public keys at:

https://{authority-domain}/.well-known/jwks.json

The JWKS (JSON Web Key Set) follows RFC 7517. Each key MUST include a kid that agents use to select the correct verification key.

Key Rotation

The trust authority MAY rotate signing keys. When rotating:

The new key MUST be published in the JWKS before it is used for signing
The old key MUST remain in the JWKS until all responses signed with it have expired
The authority SHOULD publish new keys at least 24 hours before first use

JWKS Refresh and Key-Level Revocation

Agents MUST refresh the JWKS for each allowlisted authority at least every 1 hour, fetching it from the jwksUrl pinned in the allowlist (see Discovery — Authority Domain Allowlist).

A key absent from the current JWKS MUST be treated as revoked. Responses signed with a kid not present in the current JWKS MUST be rejected, regardless of the response's expires timestamp. If the response's kid is not in the cached JWKS, agents MUST refresh the JWKS and reject the response if the kid is still absent.

This gives authorities a fast, protocol-native response to key compromise: removing a key from the JWKS propagates to all agents within the refresh interval, without an agent software update.

Verification

Agents MUST perform the following verification steps on every response:

Retrieve the authority's JWKS (cached retrieval is acceptable subject to the 1-hour refresh requirement above)
Select the key matching the kid field in the response body. If no matching key is found, refresh the JWKS and re-select. If the kid is still absent after refresh, reject the response.
Remove the signature field, JCS-canonicalize the remaining body, and verify the Ed25519 signature over the resulting bytes
Check that expires is in the future
Recompute the canonical form of the URL the agent sent (see API specification — URL Canonicalization) and compare to meta.url. Mismatch MUST be rejected as signatureInvalid.
If the agent sent a context query parameter, compare it to meta.context. Mismatch — including the case where meta.context is absent — MUST be rejected as signatureInvalid.
Reject the response if any step fails

Steps 5 and 6 close a contextual-replay class of attack where a cached signed response for one URL within an entity's scope is substituted for another. See ADR-007 — Binding the Signed Response to the Request.

Agents MUST NOT process trust signals from responses with invalid or missing signatures. Implementers MUST validate their JCS and signature code against the Test Vectors before deployment.

Error responses (4xx, 5xx) are not signed because they contain no trust data. A man-in-the-middle can forge error responses to suppress valid entities or block trust evaluation. Agents MUST NOT treat a single unsigned error as a definitive trust result — they MUST retry, prefer cached signed responses over unsigned errors, and treat unresolved failures as "trust unknown" rather than "entity not found." See the API specification — Agent Error Handling for the complete requirements.

Authority Behavioral Auditability

Response signatures prove a response came from the authority and has not been tampered with. They do not prove the authority behaved consistently — the authority could issue contradictory signals to different agents without detection. A public transparency log to close this gap is tracked on the roadmap.

Anti-Gaming Summary

Threat	Protocol Mitigation
Forged trust signals	Authority-only issuance with Ed25519 signatures
Tampered responses	Application-layer signatures; agents verify before processing
Replayed stale responses	`expires` field with mandatory freshness check
Contextual replay across URLs within entity scope	`meta.url` and `meta.context` bind the signed response to the request that produced it; agents reject mismatches as `signatureInvalid`
Silent signal revocation	Tracked on the roadmap (transparency log)
Backdated signals	Tracked on the roadmap (transparency log)
Authority misbehavior	Tracked on the roadmap (transparency log)
DDoS against authority	Signed responses remain valid when cached; agents use cached data within `expires` window
Error response suppression	Mandatory retry, cache fallback, and "trust unknown" semantics for unsigned errors
Sybil agent identities	`did:web` ties identity to a controlled domain (when authentication is required)

Prompt Injection Mitigation

The protocol uses structured JSON with typed fields to minimize prompt injection risk. Implementers MUST NOT feed raw signal data into LLM prompts without sanitization. Signal data fields have defined schemas with typed values — agents SHOULD validate data types before processing.

Signal data objects SHOULD use typed values with constrained semantics (names, codes, URLs, booleans, numbers) rather than free-text strings.

Signal Size Limit

Each signal object MUST NOT exceed 4 KB when serialized as JSON. This limit applies to the complete signal structure (type, verifiedAt, and data). Reference signal types use typed, constrained fields that fit well within this limit. The cap exists for authority-defined extension types, where the open data schema could otherwise carry large injection payloads. 4 KB is generous for legitimate signal data (license details, certification metadata) while bounding the attack surface.

Assessment Fields

The optional assessment object (ADR-010) introduces bounded free-text fields: reasoning (max 500 characters) and highlights (max 200 characters each, max 10 items). These fields exist because experiments show LLM agents make better decisions with natural-language summaries. The total assessment payload MUST NOT exceed 4 KB.

The prompt injection risk is bounded by three properties:

Authority-generated. The entity being assessed does not control the assessment content. The authority produces the text from its own verified data.
Character-limited. The maximum total text (~2.5 KB) limits the space available for injection payloads.
Signed. The assessment is covered by the authority's Ed25519 signature, so intermediaries cannot modify it.

Agents SHOULD sanitize assessment.reasoning and assessment.highlights before including them in LLM prompts. Agents SHOULD treat these fields as authority-generated summaries, not as trusted instructions.

Semantic Squatting in Assessment Extensions

Authorities MAY add non-standard fields to the assessment's extensions object. Without constraints, field names like ignorePreviousInstructions or overrideAction could blur the boundary between data and instructions for LLM-based agents. The protocol mitigates this through structural containment and self-describing metadata:

Non-standard fields MUST be placed in the extensions object, structurally separated from spec-defined assessment fields. Top-level assessment keys not defined by the specification MUST NOT appear. See ADR-020.
Each extension MUST include a description that explains its meaning factually. The description MUST NOT contain instructions or directives.
LLM-based agents MUST NOT treat extension field names, values, or descriptions as instructions or action directives.
Agents SHOULD use the description to understand extension fields rather than inferring semantics from field names alone.

References

API specification — Request authentication and response format
ADR-005: Trust Authority Model
ADR-006: Agent Identity
ADR-007: Security and Anti-Gaming
ADR-010: Assessment Layer
ADR-015: Raw Ed25519 Signature
RFC 6962 — Certificate Transparency
RFC 7517 — JSON Web Key
RFC 8037 — CFRG Elliptic Curve Signatures (Ed25519)
RFC 8785 — JSON Canonicalization Scheme (JCS)
Test Vectors — Normative JCS and signature test vectors

Transport Security​

Agent Identification​

URL Privacy​

Response Signatures​

Signing​

Key Publication​

Key Rotation​

JWKS Refresh and Key-Level Revocation​

Verification​

Authority Behavioral Auditability​

Anti-Gaming Summary​

Prompt Injection Mitigation​

Signal Size Limit​

Assessment Fields​

Semantic Squatting in Assessment Extensions​

References​