AIP-217
Unreachable resources
Occasionally, a user may ask for a list of resources, and some set of resources in the list are temporarily unavailable. For example, a user may ask to list resources across multiple parent locations, but one of those locations is temporarily unreachable. In this situation, it is still desirable to provide the user with all the available resources, while indicating that something is missing.
Guidance
If a method to retrieve data is capable of partially failing due to one or more resources being temporarily unreachable, the response message must include a field to indicate this:
message ListBooksResponse {
  // The books matching the request.
  repeated Book books = 1;
  // The next page token, if there are more books matching the
  // request.
  string next_page_token = 2;
  // Unreachable resources.
  repeated string unreachable = 3;
}
- The field must be a repeated string, and should be named unreachable.
- The field must be set to the names of the resources which are the cause of the issue, such as the parent or individual resources that could not be reached. The objects listed as unreachable may be parents (or higher ancestors) rather than the individual resources being requested. For example, if a location is unreachable, the location is listed.
- The response must not provide any other information about the issue, such as error details or codes. To discover what the underlying issue is, users should send a more specific request.
- The service must provide a way for the user to get an error with additional information, and should allow the user to repeat the original call with more restrictive parameters in order to do so.
- The resource names provided in this field may be heterogeneous. The field should document what potential resources may be provided in this field, and note that it might expand later.
Important: If a single unreachable location or resource prevents returning any data by definition (for example, a list request for a single publisher where that publisher is unreachable), the service must fail the entire request with an error.
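The rules above can be sketched in Python as a handler that fans out over parent locations. This is a minimal sketch under stated assumptions, not a reference implementation: the fetch callable, the LocationUnavailable exception, and the dataclass mirroring ListBooksResponse are hypothetical names introduced for illustration, and pagination is left out.

```python
from dataclasses import dataclass, field


class LocationUnavailable(Exception):
    """Raised by the hypothetical backend when a location cannot be reached."""


@dataclass
class ListBooksResponse:
    books: list[str] = field(default_factory=list)
    next_page_token: str = ""
    unreachable: list[str] = field(default_factory=list)


def list_books(locations, fetch):
    """List books across `locations`, recording unreachable parents by name.

    `fetch(location)` is a hypothetical backend call that returns the book
    names under a location, or raises LocationUnavailable.
    """
    response = ListBooksResponse()
    for location in locations:
        try:
            response.books.extend(fetch(location))
        except LocationUnavailable:
            if len(locations) == 1:
                # A single unreachable parent means no data can be returned
                # by definition, so the whole request fails with the error.
                raise
            # Only the resource name is reported: no error codes or details.
            response.unreachable.append(location)
    return response
```

A caller that sees a name in unreachable can repeat the call scoped to that one location; in this sketch that narrower call raises the underlying error instead of returning an empty list.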
Pagination
When paginating over a list, it is likely that the service will not know that there are unreachable parents or resources initially. Further, parents may alternate between being available and unavailable in unpredictable ways throughout the process of listing all the requested resources.
These facts lead to the following guidance:
- The response must provide any outstanding unreachable locations or resources in the unreachable field on pages following the final page that contains a resource.
  - The response should not include both requested data and unreachable resources on the same page.
  - For example, if there are two pages of books and one unavailable publisher, there should be three pages total: first the two pages of books, and then a final page with no books and the unavailable publisher.
  - If the number of unreachable resources to list is very large, the response should honor the page_size field in the same way as for resources. In this case, all pages with requested information should precede all pages with unavailable resources or locations.
- The final page's unreachable field must only include resources or parents that were partially provided (or missing completely) across the entirety of the pagination process.
  - For example, if a parent or resource was unreachable at the beginning of pagination and it became reachable again and the entire set of previously unreachable data was provided to the user on any page, the unreachable field must not include the intermittently-unreachable parent or resource.
  - On the other hand, if only some of the resources for a given parent are provided during such an incident as described above, the parent or resource must be included in the unreachable field.
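The page ordering described in this section can be illustrated with a small Python sketch. It assumes the final set of unreachable names is already known, which a real service would only learn as pagination progresses; the paginate function, the dictionary page shape, and the index-based page tokens are hypothetical simplifications.

```python
def paginate(books, unreachable, page_size):
    """Split results into pages: all data pages first, then unreachable pages."""
    pages = []
    # Pages that carry requested data never carry unreachable names.
    for start in range(0, len(books), page_size):
        pages.append({"books": books[start:start + page_size], "unreachable": []})
    # Unreachable names follow the last data page and also honor page_size.
    for start in range(0, len(unreachable), page_size):
        pages.append({"books": [], "unreachable": unreachable[start:start + page_size]})
    # Chain the pages with placeholder tokens; the last page has an empty token.
    for index, page in enumerate(pages):
        page["next_page_token"] = str(index + 1) if index + 1 < len(pages) else ""
    return pages
```

With four books, one unreachable publisher, and a page size of two, this yields three pages: two pages of books followed by a final page with no books and the publisher's name.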
Adopting partial success
For an existing API whose default behavior differs from the aforementioned guidance (that is, the API call returns an error status instead of a partial result) to adopt the unreachable pattern, the API must do the following:
- The default behavior must be retained to avoid incompatible behavioral changes.
  - For example, if the default behavior is to return an error if any location is unreachable, that default behavior must be retained.
- The request message must have a bool return_partial_success field.
- The response message must have the standard repeated string unreachable field.
- The two aforementioned fields must be added simultaneously.
When the bool return_partial_success field is set to true in a request, the API must behave as described in the aforementioned guidance with regard to populating the repeated string unreachable response field.
message ListBooksRequest {
  // Standard List request fields...
  // Setting this field to `true` will opt the request into returning the
  // resources that are reachable, and into including the names of those that
  // were unreachable in the [ListBooksResponse.unreachable] field. This can
  // only be `true` when reading across collections e.g. when `parent` is set to
  // `"projects/example/locations/-"`.
  bool return_partial_success = 4;
}
message ListBooksResponse {
  // Standard List Response fields...
  // Unreachable resources. Populated when the request opts into
  // `return_partial_success` and reading across collections e.g. when
  // attempting to list all resources across all supported locations.
  repeated string unreachable = 3;
}
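The opt-in behavior can be sketched in Python as follows. This is an illustrative sketch only: list_books_with_opt_in, the fetch callable, and the Unavailable exception are hypothetical stand-ins, and the returned dictionary merely mirrors the shape of ListBooksResponse.

```python
class Unavailable(Exception):
    """Hypothetical error raised when a parent collection cannot be reached."""


def list_books_with_opt_in(parents, fetch, return_partial_success=False):
    """List books across `parents`, keeping the pre-existing failure default.

    `fetch(parent)` is a hypothetical backend call that returns book names
    or raises Unavailable.
    """
    books, unreachable = [], []
    for parent in parents:
        try:
            books.extend(fetch(parent))
        except Unavailable:
            if not return_partial_success:
                # Retained default: any unreachable parent fails the call.
                raise
            # Opted in: report the parent by name and keep going.
            unreachable.append(parent)
    return {"books": books, "unreachable": unreachable}
```

Because the flag defaults to false, callers written before the field existed keep seeing an error whenever a parent is unreachable, so an empty unreachable list is only meaningful to callers that opted in.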
Partial success granularity
If the bool return_partial_success field is set to true in a request that is scoped beyond the supported granularity of the API's ability to reasonably report unreachable resources, the API should return an INVALID_ARGUMENT error with details explaining the issue. For example, if the API only supports return_partial_success when reading across collections (see AIP-159), it returns an INVALID_ARGUMENT error when given a request scoped to a specific parent resource collection. The supported granularity must be documented on the return_partial_success field.
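A request-validation step for this rule might look like the following Python sketch. It assumes an API that supports partial success only when reading across collections, detected here by a "-" wildcard segment in parent; the InvalidArgument exception stands in for an INVALID_ARGUMENT status, and both names are hypothetical.

```python
class InvalidArgument(ValueError):
    """Stands in for an INVALID_ARGUMENT status in this sketch."""


def validate_partial_success(parent: str, return_partial_success: bool) -> None:
    """Reject return_partial_success unless the request reads across collections."""
    # A "-" path segment is the wildcard used when reading across collections.
    reads_across_collections = "-" in parent.split("/")
    if return_partial_success and not reads_across_collections:
        raise InvalidArgument(
            "return_partial_success is only supported when a parent collection "
            f"ID is the wildcard '-'; got parent={parent!r}"
        )
```

Running this check before any backend call means the caller learns up front that unreachable reporting is unavailable at the requested scope, and never receives a response whose empty unreachable field could be misread.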
Rationale
Using a request field to opt in
Introducing a new request field as a means of opting into the partial success behavior is the best way to communicate user intent while keeping the default behavior backwards compatible. The alternative, changing the default behavior with the introduction of the unreachable response field, presents a backwards incompatible change. Users that previously expected failure when any resource was unreachable would assume that a successful response means all resources are accounted for in the response.
Introducing fields simultaneously
The request and response fields are introduced simultaneously to prevent the invalid intermediate state that arises from adding only one or the other. If only unreachable is added, then it could be assumed that it being empty means all resources were returned when that may not be true. If only return_partial_success is added, then the user wouldn't have a means of knowing which resources were unreachable.
Partial success granularity limitations
At a certain level of request scope granularity, an API is simply unable to enumerate the resources that are unreachable. For example, global-only APIs may be unable to provide granularity at a localized collection level. In such a case, preemptively returning an error when return_partial_success=true protects the user from the risk of the alternative: expecting unreachable resources to be reported if there was an issue, not getting any, and thus falsely assuming everything was retrieved. This aligns with guidance herein that suggests preemptively failing requests that cannot be fulfilled.
Further reading
- For listing across collections, see AIP-159.
Changelog
- 2024-07-19: Add guidance for brownfield adoption of partial success.