uri resolution notes #4926
base: v3.2-dev
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,16 +10,19 @@ properties: | |
| pattern: '^3\.2\.\d+(-.+)?$' | ||
| $self: | ||
| type: string | ||
| $comment: resolved against the retrieval uri | ||
| format: uri-reference | ||
| $comment: MUST NOT contain a fragment | ||
| pattern: '^[^#]*$' | ||
| info: | ||
| $ref: '#/$defs/info' | ||
| jsonSchemaDialect: | ||
| type: string | ||
| $comment: resolved against the resolved version of $self | ||
| format: uri-reference | ||
| default: 'https://spec.openapis.org/oas/3.2/dialect/WORK-IN-PROGRESS' | ||
| servers: | ||
| $comment: server urls are resolved against the HTTP request uri itself (template matching ideally happens first, because of url encoding, but matching the template parts separately may produce ambiguous results, so in my implementation I concatenate the server url template with the path template and then match that against the entire uri) | ||
|
Member
Is this "HTTP request uri" the request URI of the OAD document? This is a place where our URI/URL distinction can get a bit muddled. In this case, I would call this a URL matters because the document's identity (URI) might be set (by The tricky part here is that if we are simulating the document's retrieval URL (e.g. because you are testing things and don't want to actually deploy to production, but do want to test as if it were deployed), then it's arguably being a bit more URI-ish than URL-ish? But I still lean towards calling that a URL because the whole point is that you are simulating the location, and want it to behave as a location. You are not simulating it to separate identity from location (which is what This simulation use case is discussed in the third paragraph of 4.1.2.2.1 Establishing the Base URI, although that's not linked from the Server Object so maybe that's a concern. Does all of this make sense, and if so do you have any ideas on how we can clarify it? I moved the "Relative References in API URLs" section under the Server Object, but might have accidentally dropped explanatory text elsewhere in the process.
Member
Author
No, whenever I refer to an HTTP request it is in the server context -- it is a request for one of the APIs described by the document (or perhaps not matching any API, but there is still an attempt to match one of the path-items to it). When I try to match server urls to the HTTP request, I resolve the url against the retrieval uri (not $self), and then again against the HTTP request's URI, if the retrieval URI was relative. (We have to have a URI with host and scheme in order to match it against the HTTP request URI, and having a relative retrieval URI would prevent that.) I believe this is correct?
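A minimal Python sketch of the two-step fallback described in this comment, for illustration only. The function name `resolve_server_url` and the example URLs are hypothetical, and the fallback to the request URI is the approach described in this comment, not behavior this thread establishes as required by the spec.

```python
from urllib.parse import urljoin, urlsplit


def resolve_server_url(server_url: str, retrieval_uri: str, request_uri: str) -> str:
    """Resolve a possibly-relative server url to something with a scheme and host."""
    # Step 1: resolve against the retrieval URI (not $self).
    resolved = urljoin(retrieval_uri, server_url)
    # Step 2: if the result is still relative (no scheme), fall back to
    # resolving against the URI of the incoming HTTP request.
    if not urlsplit(resolved).scheme:
        resolved = urljoin(request_uri, resolved)
    return resolved


# Retrieval URI known: the request URI is never consulted.
with_retrieval = resolve_server_url(
    "/v1", "https://example.com/specs/api.json", "https://api.example.com/v1/pets/1"
)
# Retrieval URI defaulted to the empty string: the request URI supplies scheme and host.
without_retrieval = resolve_server_url("/v1", "", "https://api.example.com/v1/pets/1")
```

With an empty retrieval URI, step 1 leaves `/v1` unchanged, which is why a second resolution step is needed before any comparison against a request URI can succeed.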
Member
@karenetheridge I think I am confused. How can you resolve a relative server URL against an API request, when you need the server URL to make an API request? That sounds like server-side matching which is, to me, a different concern from URL construction, although they are definitely related. I'm glad you are calling attention to server matching. It had never occurred to me to think through that process, so please bear with me as I try to wrap my head around it. By URI/URL construction, I mean "I'm parsing this OAD and I found some part of an OAD URI or API URL and need to construct the full URI/URL before I can use it." By server-side matching, I mean "I am handling an API request and I am trying to figure out which possible URL construction (as in the previous definition) matches this URL, after which I need to split it back into parts to extract server variables, path template parameters, and handle the query string." Most of your comments here seem to be about construction, which is why I am confused. I think it is important to get the construction parts clear and separate from the matching guidance. Because if we aren't constructing correctly, we'll end up attempting to match to an incorrect set of potential URLs.
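To make the construction/matching split concrete, here is a rough Python sketch under several assumptions: the server url is already non-relative, template variable names are valid Python identifiers, and a variable never spans a `/`, `?`, or `#`. The helper names `construct_template` and `template_to_regex` and the example URLs are hypothetical, not taken from the spec or from any implementation discussed here.

```python
import re


def construct_template(server_url: str, path_template: str) -> str:
    """Construction: append the path template to the server url (no resolution)."""
    return server_url + path_template


def template_to_regex(template: str) -> re.Pattern:
    """Matching: turn a {var} template into a regex with one named group per variable."""
    # re.split with a capturing group alternates literal text and variable names.
    parts = re.split(r"\{([^{}]+)\}", template)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pieces.append(re.escape(part))          # literal text
        else:
            pieces.append(f"(?P<{part}>[^/?#]+)")   # template variable
    return re.compile("^" + "".join(pieces) + "$")


full_template = construct_template("https://{env}.example.com/v1", "/pets/{petId}")
match = template_to_regex(full_template).match("https://staging.example.com/v1/pets/42")
variables = match.groupdict() if match else None
```

Matching against the concatenated template, rather than matching the server and path parts separately, follows the approach mentioned in the `servers` `$comment` in this diff.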
Member
Author
Those are on different systems though, so we get at the information in a different way.
I'm confused too, as I'm not sure what has given you that impression.
Member
I don't follow, can you elaborate on this? Regarding construction, perhaps that is my assumption. What are you trying to describe here? Construction, matching, or both? To me, successful matching requires successful construction first, otherwise you don't know what you are matching. Does that make sense? If not, how do you think of it? Again, I have not thought about this much from the matching side, so I might be getting things really wrong. I'm trying to understand the perspective so I can understand what your goals are here.
Member
Ah, now I understand the confusion. This is not the "construction" I mean. The "construction" I meant is that prior to doing anything, no matter whether it is on the client or the server, you need to parse the OAD for bits and pieces of API URLs (or URL Templates, by which I mean our weird sort-of-but-not-quite-URI-Template thing split across the Server Objects and Paths Object) and construct actual full URLs (not relative references or portions to be concatenated) from them. I agree with everything else you wrote. So to come back to the original point of confusion:
This I still don't entirely understand. In your server-side use case where you are parsing a request, I do understand that there is an HTTP request URI in the request. But my reading of the spec is that you MUST resolve a relative Server Object What part of the spec are you reading to get the "resolve against request URI" behavior? (I am really not as familiar with the Server Object and still might be missing something).
Member
Author
What happens when the retrieval URI is relative?
Nothing. But if after resolving the server url against the retrieval URI, and it's still relative, it will never match the request URI. But from a user perspective, it should match. So resolving against the request URI will allow it to do so. This is the only thing I could think of doing that would make things work, other than requiring the retrieval URI to be absolute (which didn't seem to be an option since sometimes it's not even provided at all).
Member
The retrieval URI can't be relative, by definition. It is the URI that was used, past tense, to retrieve the thing. Even if it is simulated (e.g. you told the tool what it was for an already-in-memory-document rather than the tool actually performing the I/O), then it still has to be a full URI. There is no way to have a relative retrieval URI.
Member
Author
I can assure you that if the user doesn't tell me what it is, it defaults to the empty string, which is a relative URI :) Where in the spec does it say either that providing a retrieval URI is mandatory, or that it must not be relative? I don't see that I can do anything else here, if there is to be any hope of making any matches at all. Do you have any suggestions?
Member
RFC3986 does not allow for a default retrieval URI for base URI establishment purposes (RFC 3986 §5.1.3). Defaulting is handled as an application-specific default base URI, distinct from retrieval, that is discussed in RFC 3986 §5.1.4, and such URIs MUST be non-relative (§4.1.5 uses the term "URI", which in that context means non-relative).
OAS v3.2 Section 4.1.2.2.1:
So it's not mandatory, but it's very strongly recommended.
As for why a retrieval URI cannot be relative, it is inherent in how the term is defined in RFC3986, which comes from this text in RFC3986 §1.2.2 which discusses the use of URIs for retrieval:
Note the use of "URI", as in non-relative, above. A bit further in that paragraph, we have:
Again, "URI" as in non-relative. In the next paragraph:
This is a bit more obscure, but the "late-binding"-ness here includes the late resolution against a base URI prior to use of the non-relative URI for retrieval. There are other things that fit there, like URN resolvers that first convert a URN to a URL that can be directly retrieved, but let's not worry about that 😵💫
If the document was not retrieved, and the user fails to provide a non-relative retrieval URI, and there is no usable application-specific default base URI (§5.1.4, non-relative, so the empty string cannot be used here), then the OAD is not usable. §5.1.4 addresses this directly:
So if your only relative references are fragment-only, then you're fine, because literally any base URI (even |
||
| type: array | ||
| items: | ||
| $ref: '#/$defs/server' | ||
|
|
@@ -68,6 +71,7 @@ $defs: | |
| description: | ||
| type: string | ||
| termsOfService: | ||
| $comment: resolved against the resolved version of $self | ||
|
Member
FWIW, I had to think about whether Info/License/Contact should be OAD URI-based or API URL-based. Taking a look at these fields, we do say "URI", and that's what I was expecting to find. I wonder if we should make "URI" a link to 4.1.2.2 Relative References in API Description URIs for added clarity? |
||
| type: string | ||
| format: uri-reference | ||
| contact: | ||
|
|
@@ -89,6 +93,7 @@ $defs: | |
| name: | ||
| type: string | ||
| url: | ||
| $comment: resolved against the resolved version of $self | ||
| type: string | ||
| format: uri-reference | ||
| email: | ||
|
|
@@ -106,6 +111,7 @@ $defs: | |
| identifier: | ||
| type: string | ||
| url: | ||
| $comment: resolved against the resolved version of $self | ||
| type: string | ||
| format: uri-reference | ||
| required: | ||
|
|
@@ -225,6 +231,7 @@ $defs: | |
| type: object | ||
| properties: | ||
| $ref: | ||
| $comment: resolved against the resolved version of $self | ||
| type: string | ||
| format: uri-reference | ||
| summary: | ||
|
|
@@ -327,6 +334,7 @@ $defs: | |
| description: | ||
| type: string | ||
| url: | ||
| $comment: resolved against ??? | ||
| type: string | ||
| format: uri-reference | ||
| required: | ||
|
|
@@ -682,6 +690,7 @@ $defs: | |
| type: string | ||
| value: true | ||
| externalValue: | ||
| $comment: resolved against the resolved version of $self | ||
| type: string | ||
| format: uri-reference | ||
| allOf: | ||
|
|
@@ -719,6 +728,7 @@ $defs: | |
| type: object | ||
| properties: | ||
| operationRef: | ||
| $comment: resolved against the resolved version of $self | ||
| type: string | ||
| format: uri-reference | ||
| operationId: | ||
|
|
@@ -825,6 +835,7 @@ $defs: | |
| type: object | ||
| properties: | ||
| $ref: | ||
| $comment: resolved against the resolved version of $self | ||
| type: string | ||
| format: uri-reference | ||
| summary: | ||
|
|
@@ -929,6 +940,7 @@ $defs: | |
| flows: | ||
| $ref: '#/$defs/oauth-flows' | ||
| oauth2MetadataUrl: | ||
| $comment: resolved against the matching and de-templated server url | ||
|
Member
There's an interesting wrinkle here that this is a URL (including possibly a relative URL-reference), but the resolved server URL is expected to be used as a prefix (with resolved Path Templates) rather than a normal base URL. I'm not sure how much this matters as both behaviors are well-defined, but it means that, given a server URL of Which leads to another observation: The Paths Object requires Path Templates to start with
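The prefix-versus-base difference can be seen in a short Python sketch. The URLs below are hypothetical (the example values in the comment above did not survive on this page), and treating the server url as the base for `oauth2MetadataUrl` follows the `$comment` proposed in this diff, not settled spec text.

```python
from urllib.parse import urljoin

server_url = "https://example.com/api/v1"  # hypothetical de-templated server url
path_template = "/pets"                    # Path Templates start with "/"
metadata_ref = "oauth/metadata"            # hypothetical relative oauth2MetadataUrl

# Prefix behavior: the path is appended to the server url as-is.
endpoint = server_url + path_template

# Base-URL behavior: normal RFC 3986 reference resolution, which replaces
# the last path segment ("v1") of the base.
metadata_url = urljoin(server_url, metadata_ref)

# For contrast: resolving the path template as a reference would discard
# the whole server path, because it starts with "/".
resolved_path = urljoin(server_url, path_template)
```

Here `endpoint` keeps the full `/api/v1` prefix, while `metadata_url` and `resolved_path` do not, so the same server url yields different results depending on which behavior applies to a given field.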
Member
Author
Yes, that was my observation as well. I assume that this is the best we can do, and it doesn't really make sense for the It's even more clear when we change the retrieval URL to ".../api.json". That's clearly a filename, not a directory, and we don't expect to see that in any request URLs that we're trying to match.
Yes. I played around with trying to resolve the path template against the server url, instead of appending, but the results were less favourable. We could say something like "the templated URI resulting from the concatenation of the server url and path template portions should then be normalized", which would also allow for constructs like The alternative is: you end up with a uri template that contains
Member
In my example, the resolved API endpoint does include
My sense is that URL normalization is something that does or doesn't happen independently. I would tend to normalize URLs as much as possible before attempting to compare, although as you note that could get surprising. (For those unfamiliar, normalization and comparison is a tricky topic addressed in-depth by RFC3986 §6.)
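As an illustration of one normalization step that could run after concatenation and before comparison, here is a small Python sketch of dot-segment removal in the spirit of RFC 3986 §5.2.4. The function names and the example URL are hypothetical, and the sketch deliberately ignores the other normalizations in RFC 3986 §6 (case, percent-encoding, default ports).

```python
from urllib.parse import urlsplit, urlunsplit


def remove_dot_segments(path: str) -> str:
    """Collapse "." and ".." segments in a URL path."""
    segments = path.split("/")
    output = []
    for segment in segments:
        if segment == ".":
            continue
        if segment == "..":
            # Never pop the empty first element that marks a leading "/".
            if len(output) > 1:
                output.pop()
            continue
        output.append(segment)
    # A trailing "." or ".." leaves the path ending in "/".
    if segments[-1] in (".", ".."):
        output.append("")
    return "/".join(output)


def normalize_path(url: str) -> str:
    """Apply dot-segment removal to the path component only."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc, remove_dot_segments(parts.path), parts.query, parts.fragment)
    )


normalized = normalize_path("https://example.com/api/v1/../v2/./pets")
```

Whether a tool should normalize before matching is exactly the open question in this thread; the sketch only shows what the operation would do.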
I want to say that |
||
| type: string | ||
| format: uri-reference | ||
| required: | ||
|
|
@@ -944,6 +956,7 @@ $defs: | |
| then: | ||
| properties: | ||
| openIdConnectUrl: | ||
| $comment: resolved against the matching and de-templated server url | ||
| type: string | ||
| format: uri-reference | ||
| required: | ||
|
|
@@ -977,6 +990,7 @@ $defs: | |
|
|
||
| $defs: | ||
| implicit: | ||
| $comment: references are resolved against the matching and de-templated server url | ||
| type: object | ||
| properties: | ||
| authorizationUrl: | ||
|
|
@@ -994,6 +1008,7 @@ $defs: | |
| unevaluatedProperties: false | ||
|
|
||
| password: | ||
| $comment: references are resolved against the matching and de-templated server url | ||
| type: object | ||
| properties: | ||
| tokenUrl: | ||
|
|
@@ -1011,6 +1026,7 @@ $defs: | |
| unevaluatedProperties: false | ||
|
|
||
| client-credentials: | ||
| $comment: references are resolved against the matching and de-templated server url | ||
| type: object | ||
| properties: | ||
| tokenUrl: | ||
|
|
@@ -1028,6 +1044,7 @@ $defs: | |
| unevaluatedProperties: false | ||
|
|
||
| authorization-code: | ||
| $comment: references are resolved against the matching and de-templated server url | ||
| type: object | ||
| properties: | ||
| authorizationUrl: | ||
|
|
@@ -1049,6 +1066,7 @@ $defs: | |
| unevaluatedProperties: false | ||
|
|
||
| device-authorization: | ||
| $comment: references are resolved against the matching and de-templated server url | ||
| type: object | ||
| properties: | ||
| deviceAuthorizationUrl: | ||
|
|
||
This is one of several spots where I would not use "retrieval URI." I tried to make this more clear in 3.2, so it would be good to understand what is not coming across. I would probably use a phrase like "against the document's base URI," although I don't think that that phrase is used prominently in 3.2. Maybe that is part of the lack of clarity?
Here's the wording for $self:
Perhaps we could change "See Establishing the Base URI for the base URI behavior when $self is absent or relative" to "When $self is relative, it is resolved against the document's base URI"?
In section 4.1.2.2.1 Establishing the Base URI, the third paragraph starts:
My intention with that phrasing was both to note the common behavior and to emphasize that the base URI is not always the retrieval URI. Does it seem clear enough if the $self wording is changed, or is it still unclear?
I'm not clear on the difference. Isn't the retrieval URI already defined as "the initial URI we assign to the document when we start reading from it"? I recall we used the retrieval URI phrasing even if the document is never actually reachable as a URL on the network, but just a thing in application state.
It sounds like we need to come up with a clearer name for whatever this thing is, and use it throughout.
@karenetheridge the clear name is "base URI." The less-clear part is that there are several possible base URIs that are searched in order, and if any of them are relative, then they are resolved against the next possible base URI. Since $self is the first place to search, it both is a base URI, and potentially needs to be resolved against the next base URI in the list.
But all of this is about base URIs. It is not about retrieval URLs/URIs, except by coincidence. They happen to be one of four possible base URI sources (and one of three after $self). But it's always about base URIs, and we should always talk in terms of base URIs.
This is for OAD URIs, though, not API URLs, where the situation is different.
Well yes, the key is "which base URI do we use", and that's the thing that needs to be stated clearly. :p
@karenetheridge then I do not understand your question about a clearer name. "Base URI" is the name. RFC3986 gives names for each of the four possible sources. "Retrieval URI" is incorrect except in specific circumstances, being one of the four sources. So what sort of clearer name are you looking for? It's a multi-step look-up, so there's no one name other than "Base URI".
@karenetheridge to be clear, I understand and very much sympathize with the desire for there to be a simpler name for all of this. But having spent a great deal of time with RFC3986 and its predecessors and related specs, I just don't think there's any easy term to use.
It's not unlike the annoying lack of a clear term for "URI with a scheme that MAY have a fragment." Officially, "URI" is the term for that, but it's also used generically. RFC3986 is not the greatest for terminology. 😒