HyperSwitch

From mediawiki.org

HyperSwitch is a framework for creating REST web services. Its defining feature is its use of modular Swagger specs for the service configuration. This avoids duplication, ensures consistency between specs & actual API operation, and automates spec-driven features like request validation, testing and monitoring.

HyperSwitch was initially developed for Wikimedia's RESTBase REST API.

API documentation

Request handler spec: x-request-handler

General model

Each Swagger end point can optionally define two declarative handlers: x-setup-handler and x-request-handler. The x-setup-handler is run during RESTBase start-up, while the x-request-handler is executed on every incoming request and can be made up of multiple sub-requests executed in a sequence of steps.

Together, these handlers can make it easy to hook up common behaviours without having to write code. If more complex functionality is needed, then this can be added with JS modules, which can either take over the handling of the entire request, or add handler-specific functionality to the handler templating environment.

Setup declaration

The x-setup-handler stanza is typically used to set up storage or to make any preparatory requests needed on startup. An example of a setup declaration:

/{module:service}/test/{title}{/revision}:
  get:
    x-setup-handler:
      - init_storage:
          uri: /{domain}/sys/key_value/testservice.test

By default the PUT method is used for a request. This example would initialize a testservice.test bucket in a key_value module.
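
To make the default visible, the same declaration can be written with the method spelled out. This is only a sketch: it assumes that a setup stanza accepts an explicit `method` property in the same way a request template does, which the text above does not confirm beyond stating that PUT is the default.

```yaml
/{module:service}/test/{title}{/revision}:
  get:
    x-setup-handler:
      - init_storage:
          # Assumption: an explicit `method` is accepted here and is
          # equivalent to omitting it, since PUT is the default.
          method: put
          uri: /{domain}/sys/key_value/testservice.test
```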

Request handlers

Request handlers are called whenever a request matches the Swagger route and validation of its parameters succeeds. Here is an example demonstrating a few handler features:

x-request-handler:
  # First step.
  - wiki_page_history: # The request name can be used to reference the response later.
      request:
        method: get
        uri: http://{domain}/wiki/{title}
        query:
          action: history

    # Second request, executed in parallel with wiki_page_history.
    summary_data:
      request:
        uri: https://wikimedia.org/api/rest_v1/page/summary/{title}

  # Second step: Only defines a `composite` response for later use.
  - composite:
      response:
        headers:
          content-type: application/json
          date: '{{wiki_page_history.headers.date}}'
        body: 
          history_page: '{{wiki_page_history.body}}'
          first_view_item: '{{summary_data.body.items[0]}}' # Only return the first entry

  # Third step: Saves the `composite` response to a bucket.
  - save_to_bucket:
      request:
        method: put
        uri: /{domain}/sys/key_value/testservice.test/{title}
        headers: '{{composite.headers}}'
        body: '{{composite.body}}'

  # Final step: Returns the `composite` response to the client.
  - return_to_client:
      return:
        status: 200
        headers: '{{composite.headers}}'
        body: '{{composite.body}}'

Steps: Sequential execution of blocks of parallel requests.

A handler template is made up of several steps, encoded as objects in an array structure. Each property within a step object describes a request and its response processing. The name of this property should be unique across the entire x-request-handler, as the responses are saved in a request-global namespace.

Each request spec can have the following properties:

  • request: a request template of the request to issue in the current block
  • catch: a conditional stanza specifying which error conditions should be ignored
  • return_if: Modifies the behavior of return to only return if the conditions in return_if evaluate to true.
  • return: Return statement, containing a response object template. Aborts the entire handler. Unconditional if no return_if is supplied. Only a single request within a step can have return or return_if set.
  • response: Defines a response template like return, but does not abort the step / handler.
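
A minimal sketch of how `catch`, `return_if` and `return` might combine in a storage lookup. The condition syntax (matching on `status`, and the `2xx` pattern) is an assumption that this section does not spell out. The request name `get_from_storage` is illustrative, and the bucket URI is reused from the setup example.

```yaml
x-request-handler:
  - get_from_storage:
      request:
        method: get
        uri: /{domain}/sys/key_value/testservice.test/{title}
      # Assumed condition syntax: treat a missing entry as a non-error
      # so that later steps can still run.
      catch:
        status: 404
      # Assumed condition syntax: only return when the lookup succeeded.
      return_if:
        status: 2xx
      # Evaluated only if the `return_if` condition holds.
      return:
        status: 200
        headers: '{{get_from_storage.headers}}'
        body: '{{get_from_storage.body}}'
```

If the lookup fails with a 404, nothing is returned and the handler moves on to the next step, which could for instance fetch the content from a backend service.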

Execution Flow

Within each step, all requests (if defined) are sent out in parallel, and all responses are awaited. If no catch property is defined, or if it does not match, errors (including 4xx and 5xx responses) will abort the entire handler, and possibly also the parallel requests. If all parallel requests succeed, each result is registered in the global namespace. If return_if conditions are supplied, they are then evaluated against the raw response value. Next, return or response statements are evaluated. These have access to all previous responses, including those in the current step. The response template replaces the original response value with its expansion, while return sends its expansion to the client if no return_if stanza was supplied, or if its condition evaluated to true against the original responses.
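
The order of evaluation can be traced through a small annotated handler. This is a sketch rather than a tested configuration: the request names are hypothetical, the URIs are borrowed from the earlier examples, and the `status: 404` condition syntax is an assumption.

```yaml
x-request-handler:
  # Step 1: both requests are sent in parallel and both responses are awaited.
  - stored_copy:
      request:
        uri: /{domain}/sys/key_value/testservice.test/{title}
      # Assumed condition syntax: a 404 matches `catch`, so it does not
      # abort the handler.
      catch:
        status: 404
    page_summary:
      request:
        uri: https://wikimedia.org/api/rest_v1/page/summary/{title}
      # Evaluated after the raw response is registered; its expansion
      # replaces the value stored under `page_summary`.
      response:
        body: '{{page_summary.body.items[0]}}'

  # Step 2: starts only after step 1 has finished, and can reference
  # both earlier responses. `return` ends the handler.
  - combined:
      return:
        status: 200
        headers:
          content-type: application/json
        body:
          stored: '{{stored_copy.body}}'
          summary: '{{page_summary.body}}'
```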

How to

Creating a spec for new API end points

You've developed a new high-performance, robust service and want to put it behind RESTBase to integrate it with the REST API, improve discoverability, add storage and get all the perks RESTBase provides, like rate limiting, page title normalisation and a lot more. Let's say your service has a hello endpoint with the following API:

GET /{domain}/v1/hello/{name} -> "Hello, {name}"

First, you need to create a public API specification for the service. All specs are created in Swagger format and live in YAML files within the /v1 directory in the RESTBase source. These files specify the public API and define what's visible in the RESTBase API documentation. To include your endpoint, create a pull request in the RESTBase GitHub repository and /cc the Wikimedia Services team to get a review. The first version of the specification will simply proxy requests to the backend service; storage can be added later.

In Swagger, each entry point is defined as a property of the `paths` object. The property name is the path, where segments in curly braces represent templated parameters, which are then available in the request.params object. For more information about path templates see the Swagger docs. For each path you define a set of request methods (GET in our case) and specify the description of the endpoint as well as the request and response content type and format. All request parameters should be specified in the parameters property, because RESTBase automatically validates every incoming request, checking that all required params are present and have the right type and schema.

paths:
  /{name}:
    get:
      description: |
        Says "Hello, {name}" for the provided name.

        Stability: [experimental](https://www.mediawiki.org/wiki/API_versioning#Experimental)
      produces:
        - text/plain
      parameters:
        - name: name
          in: path
          description: The name of the user
          type: string
          required: true
      responses:
        '200':
          description: The greeting for the given name
          schema:
            type: string

Now we're ready to set up the actual request handler. You have several options:

  1. Set up an operationId and forward the request to a JavaScript handler. This is a good fit when complex logic is needed to process the request. Our handler is not complicated, though: it just forwards the request to the backend service.
  2. Set up a spec for the handler template. This YAML config format allows you to create new end points without any JS code. It is documented under "Request handler spec: x-request-handler" above.

In the following example we are setting up the handler that forwards the request to the backend service, and adds a Cache-Control header to the response.

      x-request-handler:
        - request_backend:
            request:
              method: get
              uri: '{{options.host}}/{domain}/v1/hello/{name}'
            return:
              status: '{{request_backend.status}}'
              headers: '{{merge(request_backend.headers, {"Cache-Control": options.cache_control})}}'
              body: '{{request_backend.body}}'
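
For comparison, option 1 would replace the x-request-handler stanza with an operationId that names a JavaScript handler. A rough sketch of the spec side only: the operation name `sayHello` is hypothetical, and how the JS module exposes the matching function is not covered here.

```yaml
paths:
  /{name}:
    get:
      # Hypothetical operation name. A JS module would need to provide
      # a handler function registered under the same name.
      operationId: sayHello
      produces:
        - text/plain
```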

Last but not least, we set up monitoring specs for the endpoint. There's a checker script which runs in Wikimedia production systems and monitors the health of RESTBase and individual services. If an endpoint doesn't respond or returns incorrect data, an alert is created notifying the operations team about the problem. In the monitoring section of the spec you set up example requests and responses that are picked up and executed by the checker script.

      x-monitor: true
      x-amples:
        - title: Say hello to Peter
          request:
            params:
              domain: en.wikipedia.org
              name: Peter
          response:
            status: 200
            body: 'Hello, Peter'
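
Put together, the fragments above might form a single v1/hello.yaml along these lines. This is a sketch under a few assumptions: the handler and monitoring stanzas nest under the `get` operation, as their indentation suggests; the path parameter is called `{name}` throughout; the response type is wrapped in a `schema` object as Swagger 2.0 expects; and any top-level metadata the file may need is left out.

```yaml
# v1/hello.yaml (sketch; top-level metadata omitted)
paths:
  /{name}:
    get:
      description: |
        Says "Hello, {name}" for the provided name.
      produces:
        - text/plain
      parameters:
        - name: name
          in: path
          description: The name of the user
          type: string
          required: true
      responses:
        '200':
          description: The greeting for the given name
          schema:
            type: string
      # Forward the request to the backend and add a Cache-Control header.
      x-request-handler:
        - request_backend:
            request:
              method: get
              uri: '{{options.host}}/{domain}/v1/hello/{name}'
            return:
              status: '{{request_backend.status}}'
              headers: '{{merge(request_backend.headers, {"Cache-Control": options.cache_control})}}'
              body: '{{request_backend.body}}'
      # Example request and expected response for the checker script.
      x-monitor: true
      x-amples:
        - title: Say hello to Peter
          request:
            params:
              domain: en.wikipedia.org
              name: Peter
          response:
            status: 200
            body: 'Hello, Peter'
```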

Now that we've got the specification, it needs to be registered in RESTBase. To do that, modify one of the project files in the projects directory. Each project file is loaded for a set of domains; for example, wmf_default.yaml is responsible for all domains except wiktionary and wikimedia.org. To register your spec, add it to the x-modules array under the path prefix you need. For this example service the path to the module is /hello. It doesn't exist yet, so you would add the following code:

# This code exists in the project file
/media:
    x-modules:
        - path: v1/mathoid.yaml
          options: '{{options.mathoid}}'
# Here's the code you need to add:
/hello:
    x-modules:
        - path: v1/hello.yaml
          options: '{{options.hello}}'

The final step is to set up the options for your module in config.example.wikimedia.yaml. This file is mapped to the puppet configuration template for RESTBase, so your options should contain all the properties managed by puppet, like hosts. You can also add properties you would otherwise put in the service config; in our case that is the Cache-Control header value. Here's the code you would add to the file:

# This code exists in the repo
related:
    cache_control: s-maxage=86400, max-age=86400
# Here's what you need to add
hello:
    host: https://hello.wikimedia.org
    cache_control: s-maxage=86400, max-age=86400

That's it. Now you can start the service with npm start and check that your API endpoint works correctly and shows up on the docs page (http://localhost:7231/en.wikipedia.org/v1/?doc).

Configuring rate limits for each API end point

TODO: describe
