Security
This document specifies the authentication, integrity, and anti-gaming requirements for the #trstd <protocol>.
Normative language follows RFC 2119: MUST, SHOULD, MAY indicate requirement levels.
Transport Security
All communication in the #trstd <protocol> MUST use HTTPS (TLS 1.2 or later). This applies to:
- Agent requests to the trust authority
- Agent resolution of did:web DID Documents
HTTPS is the sole integrity protection for the link tag. If a provider's domain is compromised, an attacker can inject a malicious link tag. Response signatures mitigate this — see Discovery — Integrity.
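A minimal sketch of how an agent might enforce the transport requirement before sending any request, using only the Python standard library. The function names require_https and make_tls_context are hypothetical and not part of the protocol.

```python
import ssl
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses HTTPS; raise ValueError otherwise."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"refusing non-HTTPS URL: {url!r}")
    return url


def make_tls_context() -> ssl.SSLContext:
    """Client context with certificate and hostname checks on, TLS 1.2 or later."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The same check would apply to both kinds of traffic listed above: requests to the trust authority and did:web DID Document resolution.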
Agent Identification
Agent identification is optional per-request. Authorities MUST NOT gate access on identification — requests without an Authorization header MUST be served. Agents MAY present a did:web JWT in the Authorization header as self-identification. See ADR-006 for the decision rationale. The full specification is in the API specification. DID resolution and JWT verification mechanics follow the TSAI protocol.
When a JWT is present, the authority MUST:
- Verify the JWT signature against the agent's DID Document public key
- Reject expired tokens (with a maximum clock-skew tolerance of 60 seconds)
- Return 401 unauthorized if the token is invalid or if DID Document resolution fails or exceeds 5 seconds
A missing Authorization header MUST NOT trigger 401. 401 covers only the "JWT presented and invalid" case.
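The decision rule above can be sketched as follows. This is an illustration under stated assumptions, not the authority's implementation: verify_jwt is a hypothetical callable that parses the header, resolves the DID Document within the 5-second limit, verifies the signature, and returns the claims or raises on any failure. The exp claim is the standard JWT expiry in seconds since the epoch.

```python
import time
from typing import Callable, Optional

MAX_CLOCK_SKEW_SECONDS = 60  # maximum tolerance stated in the requirements above


def is_expired(exp: float, now: Optional[float] = None) -> bool:
    """True if exp is in the past even after allowing 60 s of clock skew."""
    now = time.time() if now is None else now
    return now > exp + MAX_CLOCK_SKEW_SECONDS


def identification_status(
    authorization: Optional[str],
    verify_jwt: Callable[[str], dict],  # hypothetical verifier, see lead-in
    now: Optional[float] = None,
) -> int:
    """HTTP status that the identification step alone would produce."""
    if authorization is None:
        return 200  # unidentified requests MUST be served
    try:
        claims = verify_jwt(authorization)
    except Exception:
        return 401  # bad signature, DID resolution failure, or timeout
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or is_expired(exp, now):
        return 401
    return 200
```

Note that the only path to 401 starts with a header being present, which matches the rule that a missing header never triggers 401.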
Authorities MAY use presented identity for logging, rate-limiting, or response tailoring. The protocol does not specify or constrain these behaviors.
The trust authority SHOULD log the agent's did:web identifier with each response issued on an identified request. For unidentified requests, the authority has no cryptographic agent identity to log.
URL Privacy
The verify request includes the full URL the agent is visiting, including query parameters. Entity matching uses only hostname and path (see Discovery specification), so query parameters are not needed for scope validation. They may contain session identifiers, tracking parameters, or user-specific context that the authority does not need for trust assessment.
Sending the full URL is a simplicity tradeoff. It avoids requiring agents to implement URL decomposition or parameter-stripping rules. The authority can also use the full URL for additional validation beyond basic scope matching.
The tradeoff has two practical consequences:
- Agents should assume the authority can observe the full URL, including any embedded tokens or identifiers.
- Authorities SHOULD NOT use query parameter values for purposes beyond scope validation and logging. The protocol does not enforce this — it is a trust boundary between the agent and the authority.
URL minimization — structured scope objects or explicit stripping rules to reduce the data agents expose — is tracked on the roadmap.
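To illustrate why query parameters are not needed for scope validation, the sketch below extracts the two URL parts that entity matching uses. The function name scope_key is hypothetical, and the actual matching and canonicalization rules are those of the Discovery and API specifications, which this example does not reproduce.

```python
from typing import Tuple
from urllib.parse import urlsplit


def scope_key(url: str) -> Tuple[str, str]:
    """Return (hostname, path); the query string and fragment are ignored."""
    parts = urlsplit(url)
    # urlsplit lowercases the hostname; an empty path is treated as "/"
    # (an assumption made for this illustration only).
    return (parts.hostname or "", parts.path or "/")
```

For example, https://shop.example/products/42?session=abc and https://shop.example/products/42 yield the same key, so the session value adds nothing to scope matching while still being visible to the authority.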
Response Signatures
Signing
The trust authority MUST sign every successful response (HTTP 200). The signature covers the complete response body excluding the signature field itself.
Algorithm: Ed25519 (EdDSA) as specified in RFC 8032, with public keys represented in the JWKS as specified in RFC 8037.
Format: The response body includes two fields for signature verification:
| Field | Description |
|---|---|
| kid | Key identifier matching a key in the authority's published JWKS |
| signature | Base64url-encoded raw Ed25519 signature (64 bytes, no padding) |
The signing input is the JCS-canonicalized (RFC 8785) response body with the signature field excluded. The kid field is part of the signed payload — an attacker cannot swap key identifiers without invalidating the signature. JCS plus a raw Ed25519 signature was chosen over JWS — see ADR-015: Raw Ed25519 Signature.
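A rough sketch of how the signing input and the signature bytes could be derived from a response body. Two assumptions are made: the signature field sits at the top level of the body, and json.dumps with sorted keys and compact separators stands in for JCS. That stand-in is expected to agree with RFC 8785 only for simple bodies (ASCII keys, no floating-point numbers); a real implementation should use a JCS library validated against the Test Vectors.

```python
import base64
import json


def signing_input(body: dict) -> bytes:
    """Bytes covered by the signature: the body without the signature field."""
    unsigned = {k: v for k, v in body.items() if k != "signature"}
    # Approximation of JCS, see the caveat in the lead-in.
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def decode_signature(value: str) -> bytes:
    """Decode an unpadded base64url signature and check it is 64 bytes."""
    raw = base64.urlsafe_b64decode(value + "=" * (-len(value) % 4))
    if len(raw) != 64:
        raise ValueError("Ed25519 signatures are exactly 64 bytes")
    return raw
```

Because kid stays in the dictionary passed to the canonicalizer, changing it changes the signed bytes, which is the property described above.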
Key Publication
The trust authority MUST publish its signing public keys at:
https://{authority-domain}/.well-known/jwks.json
The JWKS (JSON Web Key Set) follows RFC 7517. Each key MUST include a kid that agents use to select the correct verification key.
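For orientation, a JWKS carrying one Ed25519 key has roughly the shape below, and key selection is a lookup by kid. The kid value and the x placeholder are made up for illustration; kty, crv, and x are the standard JWK members for this key type. The helper name select_key is hypothetical.

```python
from typing import Optional

# Illustrative document only; "x" would hold the base64url public key.
EXAMPLE_JWKS = {
    "keys": [
        {
            "kty": "OKP",
            "crv": "Ed25519",
            "kid": "example-key-1",
            "x": "BASE64URL_PUBLIC_KEY_PLACEHOLDER",
        }
    ]
}


def select_key(jwks: dict, kid: str) -> Optional[dict]:
    """Return the Ed25519 key with the given kid, or None if it is absent."""
    for key in jwks.get("keys", []):
        if (key.get("kid") == kid and key.get("kty") == "OKP"
                and key.get("crv") == "Ed25519"):
            return key
    return None
```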
Key Rotation
The trust authority MAY rotate signing keys. When rotating:
- The new key MUST be published in the JWKS before it is used for signing
- The old key MUST remain in the JWKS until all responses signed with it have expired
- The authority SHOULD publish new keys at least 24 hours before first use
JWKS Refresh and Key-Level Revocation
Agents MUST refresh the JWKS for each allowlisted authority at least every 1 hour, fetching it from the jwksUrl pinned in the allowlist (see Discovery — Authority Domain Allowlist).
A key absent from the current JWKS MUST be treated as revoked. Responses signed with a kid not present in the current JWKS MUST be rejected, regardless of the response's expires timestamp. If the response's kid is not in the cached JWKS, agents MUST refresh the JWKS and reject the response if the kid is still absent.
This gives authorities a fast, protocol-native response to key compromise: removing a key from the JWKS propagates to all agents within the refresh interval, without an agent software update.
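One way an agent might implement the refresh and revocation rules is a small cache like the sketch below. The class name JwksCache and its parameters are hypothetical; fetch is assumed to download and parse the JWKS from the pinned jwksUrl, and clock is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Dict, Optional

REFRESH_INTERVAL_SECONDS = 3600  # refresh at least every 1 hour


class JwksCache:
    def __init__(self, fetch: Callable[[], dict],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._clock = clock
        self._keys: Dict[str, dict] = {}
        self._fetched_at: Optional[float] = None

    def _refresh(self) -> None:
        jwks = self._fetch()
        self._keys = {k["kid"]: k for k in jwks.get("keys", []) if "kid" in k}
        self._fetched_at = self._clock()

    def key_for(self, kid: str) -> Optional[dict]:
        """Return the key for kid, or None, meaning the response is rejected."""
        stale = (self._fetched_at is None
                 or self._clock() - self._fetched_at >= REFRESH_INTERVAL_SECONDS)
        # Refresh when the cache is stale or when the kid is unknown.
        if stale or kid not in self._keys:
            self._refresh()
        return self._keys.get(kid)
```

A None result applies regardless of the response's expires value. A production agent would probably also limit how often unknown kid values can trigger a refresh, which this sketch leaves out.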
Verification
Agents MUST perform the following verification steps on every response:
1. Retrieve the authority's JWKS (cached retrieval is acceptable subject to the 1-hour refresh requirement above)
2. Select the key matching the kid field in the response body. If no matching key is found, refresh the JWKS and re-select. If the kid is still absent after refresh, reject the response.
3. Remove the signature field, JCS-canonicalize the remaining body, and verify the Ed25519 signature over the resulting bytes
4. Check that expires is in the future
5. Recompute the canonical form of the URL the agent sent (see URL Canonicalization in the API specification) and compare to meta.url. Mismatch MUST be rejected as signatureInvalid.
6. If the agent sent a context query parameter, compare it to meta.context. A mismatch, including the case where meta.context is absent, MUST be rejected as signatureInvalid.
7. Reject the response if any step fails
Steps 5 and 6 close a contextual-replay class of attack where a cached signed response for one URL within an entity's scope is substituted for another. See ADR-007 — Binding the Signed Response to the Request.
Agents MUST NOT process trust signals from responses with invalid or missing signatures. Implementers MUST validate their JCS and signature code against the Test Vectors before deployment.
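The freshness and request-binding checks can be sketched as below. Several things here are assumptions: signature_ok is a hypothetical callable wrapping key selection and Ed25519 verification, expires is taken to be an RFC 3339 timestamp string, canonical_sent_url is the already-canonicalized request URL, and the rejection reason "expired" is invented for this example (only signatureInvalid is named by the specification).

```python
from datetime import datetime, timezone
from typing import Callable, Optional


class ResponseRejected(Exception):
    """Raised when a verification step fails; the message is the reason."""


def check_response(body: dict, canonical_sent_url: str,
                   sent_context: Optional[str],
                   signature_ok: Callable[[dict], bool],
                   now: Optional[datetime] = None) -> None:
    now = now or datetime.now(timezone.utc)
    # Key selection and signature verification are delegated to signature_ok.
    if "signature" not in body or not signature_ok(body):
        raise ResponseRejected("signatureInvalid")
    # Freshness: expires must be in the future.
    try:
        expires = datetime.fromisoformat(str(body.get("expires")).replace("Z", "+00:00"))
        fresh = expires > now
    except (ValueError, TypeError):
        fresh = False
    if not fresh:
        raise ResponseRejected("expired")
    meta = body.get("meta") or {}
    # Binding to the URL the agent actually sent.
    if meta.get("url") != canonical_sent_url:
        raise ResponseRejected("signatureInvalid")
    # Binding to the context parameter, only if the agent sent one.
    if sent_context is not None and meta.get("context") != sent_context:
        raise ResponseRejected("signatureInvalid")
```

The function returns normally only when every check passes, so a caller can treat any exception as "do not process trust signals from this response".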
Error responses (4xx, 5xx) are not signed because they contain no trust data. A man-in-the-middle can forge error responses to suppress valid entities or block trust evaluation. Agents MUST NOT treat a single unsigned error as a definitive trust result — they MUST retry, prefer cached signed responses over unsigned errors, and treat unresolved failures as "trust unknown" rather than "entity not found." See the API specification — Agent Error Handling for the complete requirements.
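The fallback order described above might look like the following sketch. The names resolve_trust and fetch_verified and the attempt count of 3 are assumptions made for illustration; the API specification's Agent Error Handling section defines the actual requirements. fetch_verified is assumed to return a fully verified signed response, or None on an unsigned error or any verification failure, and cached is assumed to be a previously verified response still within its expires window.

```python
from typing import Callable, Optional, Union

TRUST_UNKNOWN = "trust unknown"


def resolve_trust(fetch_verified: Callable[[], Optional[dict]],
                  cached: Optional[dict],
                  max_attempts: int = 3) -> Union[dict, str]:
    """Retry, then fall back to the cache, then report trust unknown."""
    for _ in range(max_attempts):
        response = fetch_verified()
        if response is not None:
            return response  # a verified signed response wins
    if cached is not None:
        return cached  # prefer cached signed data over unsigned errors
    return TRUST_UNKNOWN  # never "entity not found" on an unsigned error
```

A real agent would likely add a delay between attempts; it is omitted here to keep the example short.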
Authority Behavioral Auditability
Response signatures prove a response came from the authority and has not been tampered with. They do not prove the authority behaved consistently — the authority could issue contradictory signals to different agents without detection. A public transparency log to close this gap is tracked on the roadmap.
Anti-Gaming Summary
| Threat | Protocol Mitigation |
|---|---|
| Forged trust signals | Authority-only issuance with Ed25519 signatures |
| Tampered responses | Application-layer signatures; agents verify before processing |
| Replayed stale responses | expires field with mandatory freshness check |
| Contextual replay across URLs within entity scope | meta.url and meta.context bind the signed response to the request that produced it; agents reject mismatches as signatureInvalid |
| Silent signal revocation | Tracked on the roadmap (transparency log) |
| Backdated signals | Tracked on the roadmap (transparency log) |
| Authority misbehavior | Tracked on the roadmap (transparency log) |
| DDoS against authority | Signed responses remain valid when cached; agents use cached data within expires window |
| Error response suppression | Mandatory retry, cache fallback, and "trust unknown" semantics for unsigned errors |
| Sybil agent identities | did:web ties identity to a controlled domain (when the agent identifies itself) |
Prompt Injection Mitigation
The protocol uses structured JSON with typed fields to minimize prompt injection risk. Implementers MUST NOT feed raw signal data into LLM prompts without sanitization. Signal data fields have defined schemas with typed values — agents SHOULD validate data types before processing.
Signal data objects SHOULD use typed values with constrained semantics (names, codes, URLs, booleans, numbers) rather than free-text strings.
Signal Size Limit
Each signal object MUST NOT exceed 4 KB when serialized as JSON. This limit applies to the complete signal structure (type, verifiedAt, and data). Reference signal types use typed, constrained fields that fit well within this limit. The cap exists for authority-defined extension types, where the open data schema could otherwise carry large injection payloads. 4 KB is generous for legitimate signal data (license details, certification metadata) while bounding the attack surface.
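A size check along these lines could run before any signal is processed. The sketch assumes that 4 KB means 4096 bytes and that size is measured on compact UTF-8 JSON; the specification text above does not pin down either detail. The function names are hypothetical.

```python
import json

MAX_SIGNAL_BYTES = 4096  # assumption: 4 KB read as 4096 bytes


def signal_size(signal: dict) -> int:
    """Size in bytes of the whole signal (type, verifiedAt, data) as compact JSON."""
    text = json.dumps(signal, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def signal_within_limit(signal: dict) -> bool:
    return signal_size(signal) <= MAX_SIGNAL_BYTES
```

Measuring bytes rather than characters keeps the bound meaningful for non-ASCII text, where one character can occupy several bytes.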
Assessment Fields
The optional assessment object (ADR-010) introduces bounded free-text fields: reasoning (max 500 characters) and highlights (max 200 characters each, max 10 items). These fields exist because experiments show LLM agents make better decisions with natural-language summaries. The total assessment payload MUST NOT exceed 4 KB.
The prompt injection risk is bounded by three properties:
- Authority-generated. The entity being assessed does not control the assessment content. The authority produces the text from its own verified data.
- Character-limited. The maximum total text (500 characters of reasoning plus 10 highlights of 200 characters each, about 2.5 KB) limits the space available for injection payloads.
- Signed. The assessment is covered by the authority's Ed25519 signature, so intermediaries cannot modify it.
Agents SHOULD sanitize assessment.reasoning and assessment.highlights before including them in LLM prompts. Agents SHOULD treat these fields as authority-generated summaries, not as trusted instructions.
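The field limits above are easy to check mechanically before any text reaches a prompt. In this sketch, characters are counted as Python string code points and both fields are treated as optional; both choices are assumptions, and the function name assessment_errors is hypothetical.

```python
from typing import List

MAX_REASONING_CHARS = 500
MAX_HIGHLIGHTS = 10
MAX_HIGHLIGHT_CHARS = 200
# Upper bound on free text: 500 + 10 * 200 = 2500 characters.
MAX_TEXT_CHARS = MAX_REASONING_CHARS + MAX_HIGHLIGHTS * MAX_HIGHLIGHT_CHARS


def assessment_errors(assessment: dict) -> List[str]:
    """Return the names of fields that break the stated limits."""
    errors = []
    reasoning = assessment.get("reasoning", "")
    if not isinstance(reasoning, str) or len(reasoning) > MAX_REASONING_CHARS:
        errors.append("reasoning")
    highlights = assessment.get("highlights", [])
    if (not isinstance(highlights, list)
            or len(highlights) > MAX_HIGHLIGHTS
            or any(not isinstance(h, str) or len(h) > MAX_HIGHLIGHT_CHARS
                   for h in highlights)):
        errors.append("highlights")
    return errors
```

Passing these checks bounds the size of the text only. It does not make the content safe to treat as instructions.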
Semantic Squatting in Assessment Extensions
Authorities MAY add non-standard fields to the assessment's extensions object. Without constraints, field names like ignorePreviousInstructions or overrideAction could blur the boundary between data and instructions for LLM-based agents. The protocol mitigates this through structural containment and self-describing metadata:
- Non-standard fields MUST be placed in the extensions object, structurally separated from spec-defined assessment fields. Top-level assessment keys not defined by the specification MUST NOT appear. See ADR-020.
- Each extension MUST include a description that explains its meaning factually. The description MUST NOT contain instructions or directives.
- LLM-based agents MUST NOT treat extension field names, values, or descriptions as instructions or action directives.
- Agents SHOULD use the description to understand extension fields rather than inferring semantics from field names alone.
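The structural part of these rules can be checked in code, as in the sketch below. It rests on two assumptions not confirmed by this section: that the set of spec-defined top-level keys is the one listed in SPEC_KEYS (the real set is defined by the specification and may be larger), and that each entry in extensions is an object carrying a description alongside its value. The function name containment_errors is hypothetical.

```python
from typing import List

# Assumption: only the keys named in this section; the specification is authoritative.
SPEC_KEYS = frozenset({"reasoning", "highlights", "extensions"})


def containment_errors(assessment: dict, spec_keys=SPEC_KEYS) -> List[str]:
    """Report unknown top-level keys and extensions without a description."""
    errors = [f"unexpected top-level key: {key}"
              for key in sorted(assessment) if key not in spec_keys]
    extensions = assessment.get("extensions") or {}
    for name, ext in sorted(extensions.items()):
        description = ext.get("description") if isinstance(ext, dict) else None
        if not isinstance(description, str) or not description.strip():
            errors.append(f"extension {name} lacks a description")
    return errors
```

This catches misplaced fields such as a top-level overrideAction, but it cannot tell whether a description contains directives. That judgment stays with the agent, which must not treat any extension content as instructions.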
References
- API specification — Request authentication and response format
- ADR-005: Trust Authority Model
- ADR-006: Agent Identity
- ADR-007: Security and Anti-Gaming
- ADR-010: Assessment Layer
- ADR-015: Raw Ed25519 Signature
- RFC 6962 — Certificate Transparency
- RFC 7517 — JSON Web Key
- RFC 8037 — CFRG Elliptic Curve Signatures (Ed25519)
- RFC 8785 — JSON Canonicalization Scheme (JCS)
- Test Vectors — Normative JCS and signature test vectors