I've come across this curious scenario while writing tests and documentation for a REST API I am developing. According to this REST tutorial, a key abstraction to exploit in a RESTful API is the concept of a resource, and a common pattern is to have resources which themselves contain resources of their own. Additionally, returning 404 for an ID'd resource that does not actually exist is just as much of a common pattern.
My question comes from the fact that a 404 response code can be ambiguous considering the hierarchical nature of a REST API.
For example, assume the data layer our REST API interacts with has the following data:
{
  "users": {
    "foo": {
      "notes": {
        "hello": "world"
      }
    }
  }
}
Calls to our REST API that return 200 imply that all resources in the path exist:
GET /users/foo returns 200 because the user foo exists.
GET /users/foo/notes returns 200 for the same reason.
GET /users/foo/notes/hello returns 200 because the user foo and a note named hello belonging to foo both exist.
There are even expected 404 response codes for particular paths:
GET /users/bar returns 404. That is unambiguous since the 404 only refers to one resource.
GET /users/bar/notes returns 404. This is just as unambiguous (assuming the API does not return 404 for nonexistent paths).
But consider that the following return 404 for different and ambiguous reasons:
GET /users/bar/notes/baz returns 404 because the user bar does not exist.
GET /users/foo/notes/baz returns 404 because the existing user foo does not have a baz note.
In short, the 404s returned do not inform the client what exactly failed to be found: the user or the note. So my question is as follows:
Is it the responsibility of the server to be nonambiguous with 404 response codes? And if so, how should it differentiate to the client the nonexistence of a user versus the nonexistence of a user's note?
Is it the responsibility of the server to be nonambiguous with 404 response codes? And if so, how should it differentiate to the client the nonexistence of a user versus the nonexistence of a user's note?
By providing "a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition" as described in RFC 7231.
In other words, put the explanatory details into the document that you include in the HTTP response.
It may help to think more carefully about how all this works with web pages.
The status code is metadata in the transfer of documents over a network domain. The intended audience for that information is the web browser (and other general purpose components - spiders, caches, and so on). It's provided so that your browser (and other general purpose components) can correctly interpret the semantics of the response.
The audience for the "representation of the error" is the human being using the web browser. That's the place where one would provide, for example, information about what specifically has gone wrong, or what corrective actions might be taken.
These days, it is often the case that we are expecting bespoke machine clients, rather than humans, to be looking at the "web browser". Free-form text, or free-form text marked up with hypermedia controls, isn't likely to be useful to them. So we probably want to use problem details (RFC 7807) - a standardized schema for reporting problems.
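As a rough illustration of that idea, here is a minimal Python sketch of a handler that returns 404 in both cases but uses the problem details body to say which lookup failed. The `type` URIs, the `get_note` and `not_found` helpers, and the in-memory `DATA` are all assumptions made for the example; only the member names (`type`, `title`, `status`, `detail`) and the `application/problem+json` media type come from the problem details specification.

```python
import json

# Hypothetical problem "type" URIs; each API defines its own.
USER_NOT_FOUND = "https://api.example.com/problems/user-not-found"
NOTE_NOT_FOUND = "https://api.example.com/problems/note-not-found"

# The sample data from the question, held in memory for the sketch.
DATA = {"users": {"foo": {"notes": {"hello": "world"}}}}


def not_found(problem_type, title, detail):
    """Build a 404 response whose body is a problem details document."""
    problem = {
        "type": problem_type,
        "title": title,
        "status": 404,
        "detail": detail,
    }
    return 404, "application/problem+json", json.dumps(problem)


def get_note(user_id, note_id):
    """Return (status, content_type, body) for GET /users/{user_id}/notes/{note_id}."""
    user = DATA["users"].get(user_id)
    if user is None:
        return not_found(USER_NOT_FOUND, "User not found",
                         f"No user named '{user_id}'.")
    notes = user.get("notes", {})
    if note_id not in notes:
        return not_found(NOTE_NOT_FOUND, "Note not found",
                         f"User '{user_id}' has no note named '{note_id}'.")
    return 200, "application/json", json.dumps({note_id: notes[note_id]})
```

With this shape, `get_note("bar", "baz")` and `get_note("foo", "baz")` both answer 404, so general purpose components treat them identically, while a bespoke client can branch on the `type` member of the body.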
One difficulty you may be having (not your fault; the literature sucks) is recognizing that identifiers are semantically opaque. /users/foo/notes/baz does not, generally, have any dependency on /users/foo/notes or any of the other prefixes. Nor does the identifier mean that /users/foo/notes/baz has four different parts that need to be satisfied.
Identifiers should be understood like keys into a map/dictionary - 200 means that the key exists in the map, 404 means the key doesn't exist in the map. But that doesn't actually tell you anything about the presence or absence of other keys with similar spellings!
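To make the map analogy concrete, here is a small Python sketch in which each complete identifier is a single opaque key. The `RESOURCES` table and `status_for` function are hypothetical names invented for the illustration, and the table deliberately omits the prefix paths to show that nothing in HTTP forces them to exist.

```python
# Hypothetical resource table: the whole path is one opaque key.
RESOURCES = {
    "/users/foo/notes/hello": "world",
}


def status_for(path):
    """200 if the exact identifier is a key in the map, otherwise 404."""
    return 200 if path in RESOURCES else 404
```

Here `status_for("/users/foo/notes/hello")` is 200 while `status_for("/users/foo")` is 404. An API organized like the one in the question would normally serve both, but that is a property of its resource model, not something a client can infer from the spelling of the identifier.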
Is your API, which conventionally organizes its resource model into a hierarchy, and chooses identifiers that are closely aligned with that hierarchy, "better" than an API that uses an unconventional resource model and arbitrary identifiers? Probably.
But good resource models and good identifier spelling conventions are not a REST constraint, and the HTTP and URI specifications also support designs that don't follow the current conventions (among other things, backwards compatibility is really important to REST and the web; REST and the web predate these spelling conventions by quite a bit).
(Analogy: we have coding conventions that describe "best practices" around ideas like variable naming and function naming because we use languages that don't restrict us to using "good" names. The machines don't care.)