Entry Points
These are the public operations every SDK MUST expose. Each entry point is described as a function signature with named parameters, return types, and behavioral contracts.
3.1 parse
Section titled “3.1 parse”parse(input: String) → Result<Document, List<ParseError>>Parses a YAML string into an unvalidated document model. This operation performs YAML deserialization and type mapping only. It does NOT validate document conformance or apply normalization.
Preconditions: input is a UTF-8 string.
Behavior:
- Deserialize
inputas YAML 1.2. - Map YAML nodes to the core types defined in §2.
- Preserve all fields, including unknown fields prefixed with
x-(extensions). - Return the typed document on success.
- On failure, return at least one
ParseErrorvalue identifying the location and nature of a deserialization problem. SDKs SHOULD attempt to report multiple errors where feasible, but MAY stop at the first fatal error.
Most language deserialization frameworks (serde in Rust, Jackson in Java, encoding/json in Go) fail fast on the first type error or syntax violation. Requiring multiple error collection would prevent SDKs from using derive-based deserialization, which is the dominant approach in most ecosystems. Multi-error reporting is deferred to validate, which operates on the successfully parsed document model and can check all semantic rules independently.
Error conditions:
- Invalid YAML syntax →
ParseErrorwithkind: syntax. - Type mismatch (for example,
severity.confidenceis a string instead of integer) →ParseErrorwithkind: type_mismatch. - Unknown enum value →
ParseErrorwithkind: unknown_variant.
parse MUST NOT reject documents based on semantic constraints (conditional field requirements, duplicate IDs, invalid cross-references). Those are validate’s responsibility. Fields marked Required: Yes in §2 produce a ParseError with kind: type_mismatch when absent, since deserialization into the target type requires their presence. Constraints that depend on document context (e.g., phase.mode required only when execution.mode is absent, first phase must include state) are validate’s responsibility. The separation allows tools to parse a partial document for editing or introspection without requiring full validity.
3.2 validate
Section titled “3.2 validate”validate(document: Document) → ValidationResultValidates a parsed document against the conformance rules of OATF format specification §11.1. Returns a ValidationResult containing all errors and warnings found.
Return type:
| Field | Type | Description |
|---|---|---|
errors | List<ValidationError> | Conformance violations. Non-empty means the document is non-conforming. |
warnings | List<Diagnostic> | Non-fatal diagnostics (e.g., unrecognized mode, oatf not first key, deprecated patterns). |
A document is conforming when errors is empty, regardless of warnings. SDKs MUST expose both lists so consuming tools can surface warnings without failing validation.
Preconditions: document is a value returned by parse.
Behavior:
The following rules are checked. Each rule references the normative requirement in the format or SDK specification. SDKs MUST check all rules and MUST return all violations found (not just the first).
| Rule | Spec Ref | Check |
|---|---|---|
| V-001 | §11.1.1 | oatf field is present and is a supported version string. |
| V-002 | §11.1.2 | oatf SHOULD be the first key in the document. This is a canonical form recommendation, not a validity requirement. SDKs that can detect key ordering SHOULD emit a warning (not an error) when oatf is not first. SDKs that cannot preserve key ordering MAY skip this check. SDKs that serialize OATF documents MUST emit oatf as the first key. |
| V-003 | §11.1.3 | Exactly one attack object is present. |
| V-004 | §11.1.4 | Required fields present: execution. |
| V-005 | §11.1.5 | All closed enumeration values are valid members of their respective types. Open enumerations (§2.20: Protocol, Mode, Surface, Framework) are validated by their pattern or format constraints only, not by membership in a fixed set. |
| V-006 | §11.1.9 | indicators, when present, contains at least one entry. |
| V-007 | §11.1.7, §11.1.8 | In multi-phase form: execution.phases contains at least one entry. In multi-actor form: each actor’s phases contains at least one entry. (Single-phase form always has exactly one implicit phase.) |
| V-008 | §11.1.7 | At most one terminal phase per actor (no trigger), and it is the last phase in the actor’s list. |
| V-009 | §11.1.7 | First phase in each actor includes state. In single-phase form, execution.state is present, which always satisfies this. In multi-phase and multi-actor forms, check phases[0].state directly. |
| V-010 | §11.1.10 | All explicitly specified indicator.id values are unique. |
| V-011 | §11.1.7 | In multi-phase form: all explicitly specified phase.name values are unique. In multi-actor form: explicitly specified phase names are unique within each actor (but MAY duplicate across actors). Omitted names (auto-generated) are guaranteed unique by their positional generation. |
| V-012 | §11.1.11 | Each indicator has exactly one detection key (pattern, expression, or semantic). |
| V-013 | §6.2 | All regular expressions are syntactically valid RE2. |
| V-014 | §6.3 | All CEL expressions are syntactically valid (parse without error). |
| V-015 | §5.5 | All JSONPath expressions are syntactically valid. |
| V-016 | §5.7 | All template references use valid syntax (no unclosed {{). Escaped sequences (\{{) are not template references and MUST NOT be flagged. |
| V-017 | §4.3 | severity.confidence is in range 0–100 when present. |
| V-018 | §7 | When indicator surface is present and the indicator’s resolved protocol is a recognized binding, the surface value SHOULD be a valid operation name for that protocol. SDKs SHOULD emit a warning for unrecognized surface values. |
| V-019 | §5.3 | Trigger count and match are only present when event is also present. |
| V-020 | §11.1.1 | Document does not contain YAML anchors, aliases, merge keys, or custom YAML tags (e.g., !include, !!python/object). SDKs that parse via a YAML library exposing anchor/alias/tag information SHOULD check this; SDKs whose parsers silently resolve aliases MAY skip this check. |
| V-021 | §6.1, §6.2, §6.4 | The indicator-level target field and all explicit target fields on PatternMatch and SemanticMatch MUST be syntactically valid wildcard dot-path expressions per the grammar in SDK spec §5.1.2. Valid paths consist of identifiers (alphanumeric, underscores, hyphens) separated by ., with optional [*] (wildcard) suffix on any segment. The empty string "" is valid (targets root). Numeric indices ([0], [1]) are not valid in target paths. Invalid examples: tools[*.description (unclosed bracket), tools..name (empty segment), tools[0] (numeric index). |
| V-022 | §6.4 | semantic.threshold, when explicitly present, is in range [0.0, 1.0] inclusive. The default threshold (0.7, applied at evaluation time per SDK spec §4.4) is not subject to this check. |
| V-023 | §4.2 | attack.id, when present, matches the pattern ^[A-Z][A-Z0-9-]*-[0-9]{3,}$. |
| V-024 | §6.1 | Each explicitly specified indicator.id, when attack.id is present, matches the pattern ^[A-Z][A-Z0-9-]*-[0-9]{3,}-[0-9]{2,}$ AND its prefix (the portion before the final -NN segment) equals attack.id. For example, indicator ACME-003-02 is valid in attack ACME-003 but invalid in attack ACME-007. When attack.id is absent, explicitly specified indicator IDs are accepted without pattern constraints but MUST still be unique (V-010). |
| V-025 | §6.1 | indicator.confidence, when explicitly present, is in range 0–100 inclusive. |
| V-026 | §6.3 | All expression.variables values are syntactically valid simple dot-path expressions per the grammar in §5.1.1. No wildcards or indices. These values are resolved via resolve_simple_path at evaluation time (§4.3) and malformed paths should be caught early. |
| V-027 | §5.4 | All dot-path keys in MatchPredicate entries are syntactically valid simple dot-path expressions per the grammar in SDK spec §5.1.1. No wildcards or indices. This applies to match predicates in trigger.match (phase advancement conditions) and in response entry when predicates within execution state (MCP responses, sampling_responses, elicitation_responses, A2A task_responses, AG-UI tool_responses). A typo in a predicate key (e.g., argumens.command instead of arguments.command) causes the predicate to silently never match; this rule catches such errors at validation time. |
| V-028 | §5.1 | When execution.mode is absent and execution.actors is absent (mode-less multi-phase form), every phase MUST specify phase.mode, and all phase modes MUST be identical. Attacks requiring different modes (including role changes within the same protocol) require the multi-actor form. When execution.mode is absent, regardless of whether execution.actors is present, every indicator (when indicators is present) MUST specify indicator.protocol. In multi-actor form, actor.mode provides phase-level inheritance (so phase.mode is typically omitted), but indicators are document-level and indicator.protocol remains required. |
| V-029 | §7 | For recognized modes (v0.1: mcp_server, mcp_client, a2a_server, a2a_client, ag_ui_client), trigger event types SHOULD be valid for the actor’s mode. SDKs SHOULD emit a warning for unrecognized event types on recognized modes. For unrecognized modes, skip event validation. |
| V-030 | §5.1 | Exactly one of execution.state, execution.phases, or execution.actors MUST be present. A document with more than one is invalid. When execution.state is present, execution.mode MUST also be present. |
| V-031 | §5.1 | In multi-actor form: all actor.name values MUST be unique. Each name MUST match [a-z][a-z0-9_]*. Each actor MUST declare mode. Each actor MUST have at least one phase. Phase names MUST be unique within each actor. |
| V-032 | §5.5 | Cross-actor extractor references ({{actor_name.extractor_name}}) MUST reference an actor.name that exists in the document. |
| V-033 | §11.1.14 | In any responses, sampling_responses, elicitation_responses, task_responses, or tool_responses list, at most one entry MAY omit when. An entry without when following another entry without when is invalid. |
| V-034 | §5.1 | All mode values (execution.mode, actor.mode, phase.mode) MUST match the pattern [a-z][a-z0-9_]*_(server|client). All indicator.protocol values MUST match [a-z][a-z0-9_]*. |
| V-035 | §4.2 | attack.version, when present, MUST be a positive integer (≥ 1). |
| V-036 | §5.2 | trigger.after, when present, MUST be a valid duration (shorthand or ISO 8601). |
| V-037 | §5.5 | Extractor names MUST match the pattern [a-z][a-z0-9_]*. |
| V-038 | §11.1.7 | phase.extractors, when present, MUST contain at least one entry. |
| V-039 | §11.1.15 | All expression.variables keys MUST be valid CEL identifiers, matching ^[_a-zA-Z][_a-zA-Z0-9]*$. Names containing hyphens, dots, or other non-identifier characters are rejected because CEL would parse them as operators rather than variable references. |
| V-040 | §5.3 | Trigger MUST specify at least one of event or after. An empty trigger object is invalid. |
| V-041 | §11.1.16 | Binding-specific action objects (those containing no known action key) MUST contain exactly one non-x- key. |
| V-042 | §5.5 | When extractor.type is regex, the extractor.selector MUST contain at least one capture group. |
| V-043 | §5.2 | phase.on_enter, when present, MUST contain at least one entry. |
| V-044 | §5.2 | In multi-actor form, phase.mode when specified MUST match actor.mode. Cross-protocol attacks use separate actors. |
| V-045 | §4.2 | attack.impact, when present, MUST NOT contain duplicate values. |
| V-046 | §4.2 | attack.grace_period, when present, MUST be a valid duration (shorthand or ISO 8601). |
| V-047 | §2.3a | correlation, when present, MUST only appear when indicators is also present. Correlation governs how indicator verdicts combine and is meaningless without indicators. |
| V-048 | §6.1 | When indicator.actor is present, it MUST reference an actor.name that exists in the document’s (normalized) execution.actors list. In single-phase and multi-phase forms, normalization creates a single actor named "default". |
| V-049 | §6.1 | When indicator.method is present, it MUST match the detection key present on the indicator (pattern, expression, or semantic). For example, an indicator with method: pattern MUST contain a pattern field, not expression or semantic. |
| V-050 | §6.5 | indicator.tier, when present, MUST be a valid Tier value (ingested, local_action, boundary_breach). |
Unrecognized binding diagnostics: SDKs SHOULD expose a known_modes() function returning the set of modes defined by included protocol bindings (v0.1: mcp_server, mcp_client, a2a_server, a2a_client, ag_ui_client) and a known_protocols() function returning the corresponding protocols (v0.1: mcp, a2a, ag_ui). When a mode or protocol passes V-034 pattern validation but is not in the known set, validate SHOULD emit a warning (not an error) indicating the value is unrecognized. This catches typos like mpc_server while allowing intentional use of custom bindings. Tools MAY provide a mechanism to suppress these warnings.
Error conditions: Each failed rule produces a ValidationError (§7.2).
3.3 normalize
Section titled “3.3 normalize”normalize(document: Document) → DocumentTransforms a validated document into its canonical fully-expanded form. All defaults are materialized, all shorthand forms are expanded, and all inferrable fields are computed.
Preconditions: document has passed validate without error.
Behavior:
The following transformations are applied in order. Each references the normative requirement in the format specification.
| Step | Spec Ref | Transformation |
|---|---|---|
| N-001 | §11.2.1 | Apply default values: name → "Untitled", version → 1, status → draft, severity.confidence → 50 (when severity is present), phase.name → "phase-{N}" (1-based index within actor, when omitted), phase.mode → execution.mode (when present); in multi-actor form (including after N-006/N-007 conversion) phase.mode → actor.mode (when phase.mode is still absent), trigger.count → 1 (when trigger.event is present and trigger.count is absent), indicator.protocol → protocol component of execution.mode (when both indicators and execution.mode are present), correlation.logic → any (when indicators is present), mapping.relationship → "primary". |
| N-002 | §11.2.2 | When severity is present, expand scalar form to object form: "high" → {level: "high", confidence: 50}. When severity is absent, leave it absent. |
| N-003 | §11.2.3 | Auto-generate indicator.id for indicators that omit it. When attack.id is present, format as {attack.id}-{NN}. When attack.id is absent, format as indicator-{NN}. NN is the 1-based zero-padded indicator index. |
| N-004 | §11.2.4 | Resolve indicator.protocol from mode when omitted. For pattern and semantic indicators, materialize the method-specific target from the indicator-level target when it is absent (so that pattern.target and semantic.target are always present after normalization). When the method-specific target is already present, it takes precedence. |
| N-005 | §11.2.5 | Expand pattern shorthand form to standard form: move condition operator into explicit condition field. |
| N-006 | §11.2.6 | Normalize single-phase form to multi-actor form: when execution.state is present (and execution.phases and execution.actors are absent), wrap it in actors: [{name: "default", mode: <execution.mode>, phases: [{name: "phase-1", state: <execution.state>}]}]. Remove the top-level mode and state from execution. |
| N-007 | §11.2.7 | Normalize multi-phase form to multi-actor form: when execution.phases is present (and execution.actors is absent), wrap it in actors: [{name: "default", mode: <execution.mode>, phases: <execution.phases>}]. When execution.mode is absent (mode-less multi-phase form), set actor.mode from phases[0].mode. Remove the top-level mode and phases from execution. All subsequent normalization steps and all runtime processing operate on the actors array. |
| N-008 | §4.5 | Normalize classification.tags: convert each tag to lowercase and replace underscores and spaces with hyphens. |
normalize MUST be idempotent: normalize(normalize(doc)) produces the same result as normalize(doc). |
The caller MUST receive a normalized document. The original document MUST NOT be observably mutated through any retained reference. SDKs MAY implement this in either of two ways:
- Copy semantics: accept a reference, return a new document (for example,
fn normalize(&Document) -> Documentin Rust, or returning a new object in Python/JS). The input remains usable after the call. - Consuming semantics: accept ownership, return a transformed document (for example,
fn normalize(Document) -> Documentin Rust). The input is consumed and unavailable after the call. This avoids unnecessary allocation in languages with move semantics.
SDKs that offer consuming semantics SHOULD also offer a copy variant (or document that callers should clone before calling) for cases where the original document is needed after normalization.
After normalization, the following guarantees hold and consuming code MAY rely on them:
-
attack.nameis present (default"Untitled"). -
attack.versionis present (default1). -
attack.statusis present (defaultdraft). -
attack.severity, when present, is always in object form withlevelandconfidencepresent. -
execution.actorsis always present (all forms have been normalized to multi-actor form with at least a"default"actor). -
Every actor has a
nameandmode. -
Every phase has a
name(auto-generated when omitted, e.g."phase-1"). -
Every phase has a resolved
mode(inherited from its actor’smode). Whenindicatorsis present: -
attack.correlation.logicis present (defaultany). -
Every indicator has an
id. -
Every indicator has a resolved
protocol. -
Every indicator has a detection method determined by which method-specific key is present.
-
Every
PatternMatchis in standard form with an explicitconditionfield. -
Every
PatternMatchandSemanticMatchhas atarget(materialized from the indicator-leveltargetby N-004 when not explicitly specified). -
Every entry in
classification.tags(when present) is lowercase and hyphen-delimited.
Always:
- Every trigger with an
eventhas acount.
3.4 serialize
Section titled “3.4 serialize”serialize(document: Document) → StringSerializes a document to its canonical YAML form. SDKs MUST emit the multi-actor normalized form per §11.2. The canonical form ensures that documents produced by different tools are structurally identical and interchangeable.
Preconditions: document is a well-formed document model (the output of normalize).
Behavior:
- Serialize all fields to YAML 1.2.
- Preserve field ordering:
oatffirst, thenattackfields in specification order. - Emit explicit values for all fields that have defaults (do not rely on consumer normalization).
- Preserve
x-extension fields in their original position. - Use block style for readability.
3.5 load
Section titled “3.5 load”load(input: String) → Result<LoadResult, List<OATFError>>Convenience entry point that composes parse, validate, and normalize into a single operation. Returns a fully-normalized, valid document (with any warnings) or the combined errors from parsing and validation.
Return type:
| Field | Type | Description |
|---|---|---|
document | Document | The normalized, valid document. |
warnings | List<Diagnostic> | Non-fatal diagnostics from validation. |
Behavior: Equivalent to:
document = parse(input)?result = validate(document)if result.errors is non-empty, return Err(result.errors)document = normalize(document)return Ok(LoadResult { document, warnings: result.warnings })If parse fails, return parse errors. If validate finds errors, return validation errors. If both succeed, return the normalized document with any warnings.
Most tool integrations will call load rather than the individual steps. The separate entry points exist for tools that need partial processing (IDE plugins that parse for syntax highlighting without requiring validity, linters that validate without normalizing).