@@ -2029,32 +2029,129 @@ SHOULD use the terms defined by this document to do so.
2029 2029
2030 2030 ## Security Considerations {#security}
2031 2031
2032- Both schemas and instances are JSON values. As such, all security considerations
2033- defined in [RFC 8259][rfc8259] apply.
2034-
2035- Instances and schemas are both frequently written by untrusted third parties, to
2036- be deployed on public Internet servers. Implementations should take care that
2037- the parsing and evaluating against schemas does not consume excessive system
2038- resources. Implementations MUST NOT fall into an infinite loop.
2039-
2040- A malicious party could cause an implementation to repeatedly collect a copy of
2041- a very large value as an annotation. Implementations SHOULD guard against
2042- excessive consumption of system resources in such a scenario.
2043-
2044- Servers MUST ensure that malicious parties cannot change the functionality of
2045- existing schemas by uploading a schema with a pre-existing or very similar
2046- `$id`.
2047-
2048- Individual JSON Schema extensions are liable to also have their own security
2049- considerations. Consult the respective specifications for more information.
2050-
2051- Schema authors should take care with `$comment` contents, as a malicious
2052- implementation can display them to end-users in violation of a spec, or fail to
2053- strip them if such behavior is expected.
2054-
2055- A malicious schema author could place executable code or other dangerous
2056- material within a `$comment`. Implementations MUST NOT parse or otherwise take
2057- action based on `$comment` contents.
2032+ While schemas and instances are not always represented as JSON text, they are
2033+ defined in terms of the JSON data model. As such, the security considerations
2034+ defined in [RFC 8259][rfc8259] may still apply in environments where text-based
2035+ representations are used, particularly those considerations related to parsing,
2036+ number precision, and structural limitations.
2037+
2038+ Schemas and instances are frequently authored by untrusted parties.
2039+ Implementations that accept or evaluate such inputs may be exposed to several
2040+ classes of attack, particularly denial-of-service (DoS) by means of resource
2041+ exhaustion.
2042+
2043+ ### Nested `anyOf`/`oneOf`
2044+
2045+ One risk for resource exhaustion in JSON Schema arises from the nested use of
2046+ `anyOf` and `oneOf`. While a single combinator keyword with multiple subschemas
2047+ is typically manageable, nesting them causes the number of evaluation paths to
2048+ grow exponentially with the nesting depth.
2049+
2050+ For example, a `oneOf` with 5 subschemas, each containing another `oneOf` with 5
2051+ options, results in 25 evaluation paths. Adding a third level increases this to
2052+ 125, and so on. Attackers can exploit this by crafting schemas that force
2053+ validators to explore a large number of branches.
2054+
2055+ This evaluation explosion is particularly dangerous when each path involves
2056+ expensive work such as collecting large annotations or evaluating complex
2057+ regular expressions. These effects multiply across paths and can result in
2058+ excessive CPU or memory consumption, leading to denial-of-service.
2059+
2060+ Implementations that evaluate untrusted schemas are encouraged to take steps to
2061+ mitigate these threats with measures such as bounding combinator keyword depth
2062+ and breadth, limiting memory used for annotation collection, and guarding
2063+ against resource-intensive validations such as pathological regexes.
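The bounding of combinator depth and breadth suggested above could be sketched as a pre-evaluation check. This is a minimal illustration, not part of the specification: the function name `check_combinator_limits` and both limit values are hypothetical, the walk does not follow `$ref`, and it conservatively treats any object member named `anyOf` or `oneOf` that holds an array as a combinator.

```python
from typing import Any

# Hypothetical limits; suitable values depend on the deployment.
MAX_COMBINATOR_DEPTH = 3
MAX_COMBINATOR_BREADTH = 16
COMBINATORS = ("anyOf", "oneOf")


def check_combinator_limits(schema: Any, depth: int = 0) -> None:
    """Raise ValueError if nested anyOf/oneOf exceed the configured bounds."""
    if isinstance(schema, list):
        for item in schema:
            check_combinator_limits(item, depth)
        return
    if not isinstance(schema, dict):
        return
    for key, value in schema.items():
        if key in COMBINATORS and isinstance(value, list):
            if len(value) > MAX_COMBINATOR_BREADTH:
                raise ValueError(f"{key} has {len(value)} subschemas")
            if depth + 1 > MAX_COMBINATOR_DEPTH:
                raise ValueError(
                    f"{key} nested deeper than {MAX_COMBINATOR_DEPTH} levels"
                )
            # Each subschema sits one combinator level deeper.
            for subschema in value:
                check_combinator_limits(subschema, depth + 1)
        else:
            check_combinator_limits(value, depth)
```

Running such a check before evaluation rejects a schema up front instead of discovering the blow-up partway through validation.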
2064+
2065+ ### Dynamic References
2066+
2067+ The paper ["The Complexity of JSON Schema: Undecidable, Expensive, Yet
2068+ Tractable" (Caroni et al., 2024)](https://doi.org/10.1145/3632891) has shown
2069+ that validation in the presence of dynamic references is PSPACE-complete. The
2070+ paper describes a method for replacing dynamic references with static ones, but
2071+ doing so can cause the size of the schema to grow exponentially. Implementations
2072+ should be aware of this risk and may wish to implement the method described in
2073+ the paper or impose limits on dynamic reference resolution.
2074+
2075+ ### Infinite Loops and Cycles
2076+
2077+ Infinite loops can occur when evaluating schemas that produce cycles during
2078+ reference resolution. These cycles may involve multiple schemas. Not all
2079+ recursive schemas create loops, but implementations are advised to detect and
2080+ break these cycles when they are encountered.
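One way to picture the difference between a looping schema and a merely recursive one: a chain of `$ref` keywords that returns to a schema it has already visited never makes progress, whereas recursion through a keyword such as `properties` descends into a finite instance and terminates. The sketch below follows only bare `$ref` chains within a single document. The helper names are hypothetical, JSON Pointer escaping is not handled, and a full implementation would also need to track references reached through other keywords that apply to the same instance location.

```python
from typing import Any


def resolve_pointer(root: dict, ref: str) -> Any:
    """Resolve a same-document reference such as '#/$defs/a' (no escaping)."""
    node: Any = root
    for token in ref.lstrip("#").split("/"):
        if token:
            node = node[token]
    return node


def has_reference_loop(root: dict, start_ref: str = "#") -> bool:
    """Return True if following `$ref` from start_ref revisits a schema."""
    seen = set()
    ref = start_ref
    while True:
        if ref in seen:
            return True  # the chain came back to a schema it already visited
        seen.add(ref)
        target = resolve_pointer(root, ref)
        if not isinstance(target, dict) or "$ref" not in target:
            return False  # the chain ends at a schema without `$ref`
        ref = target["$ref"]
```

A schema such as `{"properties": {"child": {"$ref": "#"}}}` is recursive but is not reported, because the root schema itself carries no `$ref`.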
2081+
2082+ ### Schema Identity and Collisions
2083+
2084+ Schemas may declare an `$id` to identify themselves or have embedded schemas
2085+ that declare an `$id`. An attacker may attempt to register a schema with an
2086+ `$id` that collides with a previously registered schema, or that differs only by
2087+ case, encoding, or other URI normalization quirks. Such collisions could result
2088+ in overwriting or shadowing of trusted schemas.
2089+
2090+ Implementations should consider rejecting schemas that have identifiers
2091+ (including embedded schema identifiers) that conflict with registered schemas
2092+ and should apply consistent URI normalization and comparison logic to detect and
2093+ prevent conflicts.
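As an illustration of consistent normalization and conflict rejection, the following sketch applies a few of the normalizations described in RFC 3986 (lowercase scheme and host, default port removal, uppercase percent-encoding digits, empty fragment removal) before comparing identifiers. The `SchemaRegistry` class and helper names are hypothetical. Dot-segment removal, userinfo, and IPv6 literals are deliberately left out, and the walk treats any object member named `$id` as an identifier.

```python
import re
from urllib.parse import urljoin, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_id(uri: str) -> str:
    """Apply a few RFC 3986 normalizations so equivalent URIs compare equal."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = f"{host}:{parts.port}"
    # Hex digits in percent-encodings are case-insensitive; use uppercase.
    path = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), parts.path)
    if host and not path:
        path = "/"
    # urlunsplit omits an empty fragment, so "...#" and "..." compare equal.
    return urlunsplit((scheme, host, path, parts.query, parts.fragment))


def collect_ids(schema, base=""):
    """Yield (normalized identifier, subschema) for every `$id` found."""
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            base = urljoin(base, schema["$id"])
            yield normalize_id(base), schema
        for value in schema.values():
            yield from collect_ids(value, base)
    elif isinstance(schema, list):
        for item in schema:
            yield from collect_ids(item, base)


class SchemaRegistry:
    def __init__(self) -> None:
        self.schemas = {}

    def register(self, schema: dict) -> None:
        """Register a schema, refusing any identifier that is already taken."""
        found = list(collect_ids(schema))
        ids = [identifier for identifier, _ in found]
        clashes = [i for i in ids if i in self.schemas]
        if clashes or len(set(ids)) != len(ids):
            raise ValueError(f"conflicting schema identifiers: {clashes or ids}")
        for identifier, subschema in found:
            self.schemas[identifier] = subschema
```

All identifiers are checked before any is stored, so a rejected schema leaves the registry unchanged.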
2094+
2095+ ### External Schema Resolution
2096+
2097+ JSON Schema implementations are expected to resolve external references using a
2098+ local registry. Although the specification allows for dynamic retrieval
2099+ (`https:` to fetch schemas over HTTP, or `file:` to read schemas from disk),
2100+ this behavior is discouraged unless it's intrinsic to the use case, such as with
2101+ JSON Hyper-Schema.
2102+
2103+ Resolving schemas dynamically introduces several security concerns, each of
2104+ which can be mitigated by limiting or controlling resolution behavior. A tightly
2105+ scoped schema resolution policy significantly reduces the attack surface,
2106+ especially when validating untrusted data.
2107+
2108+ Implementations are advised to disable dynamic retrieval by default and limit
2109+ external schema resolution to the local registry unless dynamic retrieval is
2110+ explicitly enabled. If enabled, they should consider limiting the number of
2111+ dynamic retrievals a validation can perform and defining timeouts on dynamic
2112+ retrievals to reduce the risk of resource exhaustion.
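The default-off policy with a retrieval budget and timeout might look like the sketch below. It is an assumption-laden illustration: `SchemaResolver`, the `fetch` callable and its `(uri, timeout_seconds)` signature, and the default limits are all hypothetical, and one resolver instance is assumed to be created per validation so that the budget applies to that validation only.

```python
from typing import Callable, Dict, Optional

# Hypothetical signature: fetch(uri, timeout_seconds) -> parsed schema.
Fetcher = Callable[[str, float], dict]


class SchemaResolver:
    """Resolve references from a local registry; dynamic retrieval is opt-in."""

    def __init__(
        self,
        registry: Dict[str, dict],
        fetch: Optional[Fetcher] = None,
        max_retrievals: int = 5,
        timeout_seconds: float = 2.0,
    ) -> None:
        self.registry = registry
        self.fetch = fetch  # None means dynamic retrieval is disabled
        self.max_retrievals = max_retrievals
        self.timeout_seconds = timeout_seconds
        self.retrievals = 0

    def resolve(self, uri: str) -> dict:
        # The local registry always takes precedence.
        if uri in self.registry:
            return self.registry[uri]
        if self.fetch is None:
            raise LookupError(f"{uri} is not registered and retrieval is disabled")
        if self.retrievals >= self.max_retrievals:
            raise RuntimeError("dynamic retrieval budget exhausted")
        self.retrievals += 1
        # The fetcher is expected to honor the timeout it is given.
        return self.fetch(uri, self.timeout_seconds)
```

Passing the fetcher in explicitly keeps network access a deliberate configuration choice instead of an implicit capability of the validator.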
2113+
2114+ #### HTTP(S) Specific Threats
2115+
2116+ Allowing schema references to resolve over HTTP or HTTPS introduces several
2117+ threats:
2118+
2119+ * **Denial of Service (DoS)**: Validation may hang or become slow if a
2120+ referenced schema URL is slow to respond or never returns.
2121+ * **Server-Side Request Forgery (SSRF)**: Malicious schemas can reference
2122+ internal-only services using hostnames like localhost or private IPs.
2123+ Implementations are advised to restrict HTTP schema retrieval to a
2124+ configurable allowlist of trusted domains.
2125+ * **Lack of Integrity Guarantees**: Retrieved schemas may be altered in transit
2126+ or change between validations. If network retrieval is allowed,
2127+ implementations are advised to only allow retrieval over HTTPS unless
2128+ specifically configured to allow unsecured transport.
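The allowlist and HTTPS-only advice above can be combined into a single check that runs before any network request. The function name `is_retrieval_allowed` and its parameters are hypothetical. The sketch compares host names only, so it does not address redirects or DNS answers that point an allowed name at an internal address.

```python
from typing import Set
from urllib.parse import urlsplit


def is_retrieval_allowed(
    url: str, allowed_hosts: Set[str], allow_insecure: bool = False
) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ("http", "https"):
        return False
    # Plain HTTP offers no integrity protection, so it must be opted into.
    if scheme == "http" and not allow_insecure:
        return False
    # `hostname` excludes userinfo and port, which defeats URLs such as
    # "https://trusted.example@attacker.example/".
    host = (parts.hostname or "").lower()
    return host in allowed_hosts
```

An allowlist is preferred here over a blocklist of private address ranges because it fails closed when a new internal host name appears.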
2129+
2130+ #### File System Specific Threats
2131+
2132+ Allowing resolution from the local filesystem (`file:` URIs) raises different
2133+ issues:
2134+
2135+ * **Information Disclosure**: Malicious schemas may access sensitive files on
2136+ the system. Implementations should consider restricting filesystem access to
2137+ a specific schema directory tree.
2138+ * **Cross-Context Access**: A schema fetched from HTTP may try to reference a
2139+ schema on the filesystem. Implementations are advised to allow resolving
2140+ `file:` references only when the referencing schema was itself loaded from the
2141+ filesystem, similar to same-origin policies in web browsers.
2142+ * **Exposing Internal Paths**: Schemas that use `file:` URIs may reveal
2143+ host-specific filesystem details in two ways: through the `$id` itself or
2144+ through schema locations in validation output. Implementations are advised to
2145+ reject `$id` values that use the `file:` scheme. If `file:` URIs are permitted
2146+ internally, implementations are advised to sanitize them (for example, by
2147+ converting them to relative URIs) to avoid exposing host filesystem structure
2148+ to users.
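Restricting `file:` resolution to a schema directory tree and reporting only relative locations could be sketched as follows. The function names are hypothetical and POSIX-style paths are assumed. Paths are resolved before the containment check so that `..` segments and symbolic links cannot escape the configured root.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_file_uri(uri: str, schema_root: Path) -> Path:
    """Map a file: URI to a path, refusing anything outside schema_root."""
    parts = urlsplit(uri)
    if parts.scheme.lower() != "file" or parts.netloc not in ("", "localhost"):
        raise ValueError(f"not a local file URI: {uri}")
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = Path(unquote(parts.path)).resolve()
    root = schema_root.resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError("path is outside the schema directory")
    return candidate


def to_relative_location(path: Path, schema_root: Path) -> str:
    """Return a root-relative location that is safe to show in output."""
    return path.resolve().relative_to(schema_root.resolve()).as_posix()
```

Validation output that uses `to_relative_location` shows `defs/a.json` instead of the absolute path on the host.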
2149+
2150+ ### Vocabulary-Specific Risks
2151+
2152+ Third-party JSON Schema vocabularies may introduce additional risks.
2153+ Implementers are advised to consult the specifications of any extensions they
2154+ support and take into account their security considerations as well.
2058 2155
2059 2156 ## IANA Considerations
2060 2157