Backends
Fetch and store data from an API endpoint.
Overview
The ApiBackend gives developers a way to quickly build SDKs to connect a clearskies application to arbitrary API endpoints. The backend has some built-in flexibility to make it easy to connect it to most APIs, as well as behavioral hooks so that you can override small sections of the logic to accommodate APIs that don’t work in the expected way. This allows you to interact with APIs using the standard model methods, just like every other backend, and also means that you can attach such models to endpoints to quickly enable all kinds of pre-defined behaviors.
Usage
Configuring the API backend is pretty easy:
- Provide the base_url to the constructor, or extend it and set it in the __init__ for the new backend.
- Provide a clearskies.authentication.Authentication object, assuming it isn’t a public API.
- Match your model class name to the path of the API (or set model.destination_name() appropriately).
- Use the resulting model like you would any other model!
It’s important to understand how the Api Backend will map queries and saves to the API in question. The rules are fairly simple:
- The API backend only supports searching with the equals operator (e.g. models.where("column=value")).
- To specify routing parameters, use the {parameter_name} or :parameter_name syntax in either the url or in the destination name of your model. In order to query the model, you then must provide a value for any routing parameters, using a matching search condition (e.g. models.where("routing_parameter_name=value")).
- Any search clauses that don’t correspond to routing parameters will be translated into query parameters. So, if your destination_name is https://example.com/:category_id/products and you executed a model query: models.where("category_id=10").where("on_sale=1") then this would result in fetching a URL of https://example.com/10/products?on_sale=1
- When you specifically search on the id column for the model, the id will be appended to the end of the URL rather than as a query parameter. So, with a destination name of https://example.com/products, querying for models.find("id=10") will result in fetching https://example.com/products/10.
- Delete and Update operations will similarly append the id to the URL, and also set the appropriate request method (e.g. DELETE or PATCH by default).
- When processing the response, the backend will attempt to automatically discover the results by looking for dictionaries that contain the expected column names (as determined from the model schema and the mapping rules).
- The backend will check for a response header called link and parse this to find pagination information so it can iterate through records.
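The URL rules in the list above can be illustrated with a small standalone sketch. This is not the clearskies implementation: the function name build_records_url, its arguments, and the regular expression are all hypothetical, and the conditions are passed as a plain dictionary of equality matches rather than as a real query object. It only shows the order of operations: fill in routing parameters, append the id if it was searched on, and turn everything else into query parameters.

```python
import re
from urllib.parse import urlencode

# Matches {name} (group 1) or :name (group 2). Requiring a leading letter or
# underscore keeps a port such as ":8080" from being read as a parameter.
ROUTING_PARAMETER = re.compile(r"\{(\w+)\}|:([A-Za-z_]\w*)")


def build_records_url(base_url, destination_name, conditions, id_column_name="id"):
    """Map equality conditions onto a URL (hypothetical helper, illustration only)."""
    used = set()

    def fill(match):
        name = match.group(1) or match.group(2)
        if name not in conditions:
            raise ValueError(f"No search condition for routing parameter {name!r}")
        used.add(name)
        return str(conditions[name])

    # Join the base URL and destination name, then fill in routing parameters.
    template = f"{base_url.rstrip('/')}/{destination_name.strip('/')}"
    url = ROUTING_PARAMETER.sub(fill, template)

    # Conditions that were not consumed as routing parameters.
    remaining = {k: v for k, v in conditions.items() if k not in used}

    # A search on the id column is appended to the path.
    if id_column_name in remaining:
        url = f"{url}/{remaining.pop(id_column_name)}"

    # Everything else becomes a query parameter.
    if remaining:
        url = f"{url}?{urlencode(remaining)}"
    return url


print(build_records_url("https://example.com", ":category_id/products", {"category_id": 10, "on_sale": 1}))
# https://example.com/10/products?on_sale=1
print(build_records_url("https://example.com", "products", {"id": 10}))
# https://example.com/products/10
```

A missing routing parameter raises an error in this sketch, mirroring the rule that every routing parameter needs a matching search condition.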
NOTE: The API backend doesn’t support joins or group_by clauses. This limitation, as well as the fact that it only supports searching with the equals operator, isn’t a limitation in the API backend itself, but simply reflects the behavior of most API endpoints. If you want to support an API that has more flexibility (for instance, perhaps it allows for more search operations than just =), then you can extend the appropriate methods, discussed below, to map a model query to an API request.
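To make the link header rule more concrete, here is a minimal sketch of how such a header might be parsed to find the next page. It is an assumption about the general approach, not the backend's actual code: parse_link_header and next_page_from_link are hypothetical names, and the parsing is deliberately simplified (it does not handle every form the Web Linking format allows).

```python
import re
from urllib.parse import parse_qs, urlparse


def parse_link_header(header_value):
    """Return a {rel: url} dictionary from a link header (simplified parsing)."""
    links = {}
    # Each entry looks like: <url>; rel="next"
    for match in re.finditer(r"<([^>]+)>([^<]*)", header_value):
        url, params = match.group(1), match.group(2)
        rel_match = re.search(r'rel="?([^";,]+)"?', params)
        if rel_match:
            # A rel attribute may hold several space-separated values.
            for rel in rel_match.group(1).split():
                links[rel] = url
    return links


def next_page_from_link(header_value, pagination_parameter_name):
    """Pull the pagination parameter out of the rel="next" URL, if there is one."""
    next_url = parse_link_header(header_value).get("next")
    if not next_url:
        return {}
    values = parse_qs(urlparse(next_url).query).get(pagination_parameter_name)
    return {pagination_parameter_name: values[0]} if values else {}


header = '<https://api.github.com/users?since=19&per_page=10>; rel="next"'
print(next_page_from_link(header, "since"))
# {'since': '19'}
```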
Here’s an example of how to use the API Backend to integrate with the Github API:
import clearskies


class GithubPublicBackend(clearskies.backends.ApiBackend):
    def __init__(
        self,
        # This varies from endpoint to endpoint, so we want to be able to set it for each model
        pagination_parameter_name: str = "since",
    ):
        # these are fixed for all github API parameters, so there's no need to make them settable
        # from the constructor
        self.base_url = "https://api.github.com"
        self.limit_parameter_name = "per_page"
        self.pagination_parameter_name = pagination_parameter_name
        self.finalize_and_validate_configuration()


class UserRepo(clearskies.Model):
    # Corresponding API Docs: https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28#list-repositories-for-a-user
    id_column_name = "full_name"
    backend = GithubPublicBackend(pagination_parameter_name="page")

    @classmethod
    def destination_name(cls) -> str:
        return "users/:login/repos"

    id = clearskies.columns.Integer()
    full_name = clearskies.columns.String()
    type = clearskies.columns.Select(["all", "owner", "member"])
    url = clearskies.columns.String()
    html_url = clearskies.columns.String()
    created_at = clearskies.columns.Datetime()
    updated_at = clearskies.columns.Datetime()

    # The API endpoint won't return "login" (e.g. username), so it may not seem like a column, but we need to search by it
    # because it's a URL parameter for this API endpoint. Clearskies uses strict validation and won't let us search by
    # a column that doesn't exist in the model: therefore, we have to add the login column.
    login = clearskies.columns.String(is_searchable=True, is_readable=False)

    # The API endpoint lets us sort by `created`/`updated`. Note that the names of the columns (based on the data returned
    # by the API endpoint) are `created_at`/`updated_at`. As above, clearskies strictly validates data, so we need columns
    # named created/updated so that we can sort by them. We can set some flags to (hopefully) avoid confusion
    updated = clearskies.columns.Datetime(
        is_searchable=False, is_readable=False, is_writeable=False
    )
    created = clearskies.columns.Datetime(
        is_searchable=False, is_readable=False, is_writeable=False
    )


class User(clearskies.Model):
    # Corresponding API docs: https://docs.github.com/en/rest/users/users?apiVersion=2022-11-28#list-users
    # github has two columns that are both effectively id columns: id and login.
    # We use the login column for id_column_name because that is the column that gets
    # used in the API to fetch an individual record
    id_column_name = "login"
    backend = GithubPublicBackend()

    id = clearskies.columns.Integer()
    login = clearskies.columns.String()
    gravatar_id = clearskies.columns.String()
    avatar_url = clearskies.columns.String()
    html_url = clearskies.columns.String()
    repos_url = clearskies.columns.String()

    # We can hook up relationships between models just like we would if we were using an SQL-like
    # database. The whole point of the backend system is that the model queries work regardless of
    # backend, so clearskies can issue API calls to fetch related records just like it would be able
    # to fetch children from a related database table.
    repos = clearskies.columns.HasMany(
        UserRepo,
        foreign_column_name="login",
        readable_child_columns=["id", "full_name", "html_url"],
    )


def fetch_user(users: User, user_repos: UserRepo):
    # If we execute this models query:
    some_repos = (
        user_repos.where("login=cmancone")
        .sort_by("created", "desc")
        .where("type=owner")
        .pagination(page=2)
        .limit(5)
    )
    # the API backend will fetch this url:
    # https://api.github.com/users/cmancone/repos?type=owner&sort=created&direction=desc&per_page=5&page=2
    # and we can use the results like always
    repo_names = [repo.full_name for repo in some_repos]

    # For the below case, the backend will fetch this url:
    # https://api.github.com/users/cmancone
    # in addition, the readable column names on the callable endpoint includes "repos", which references our has_many
    # column. This means that when converting the user model to JSON, it will also grab a page of repositories for that user.
    # To do that, it will fetch this URL:
    # https://api.github.com/users/cmancone/repos
    return users.find("login=cmancone")


wsgi = clearskies.contexts.WsgiRef(
    clearskies.endpoints.Callable(
        fetch_user,
        model_class=User,
        readable_column_names=["id", "login", "html_url", "repos"],
    ),
    classes=[User, UserRepo],
)

if __name__ == "__main__":
    wsgi()
The following example demonstrates how models using this backend can be used in other clearskies endpoints, just like any other model. Note that it re-uses the models and backend defined above, which are omitted here for the sake of brevity:
wsgi = clearskies.contexts.WsgiRef(
    clearskies.endpoints.List(
        model_class=User,
        readable_column_names=["id", "login", "html_url"],
        sortable_column_names=["id"],
        default_sort_column_name=None,
        default_limit=10,
    ),
    classes=[User],
)

if __name__ == "__main__":
    wsgi()
And if you invoke it:
$ curl 'http://localhost:8080' | jq
{
  "status": "success",
  "error": "",
  "data": [
    {
      "id": 1,
      "login": "mojombo",
      "html_url": "https://github.com/mojombo"
    },
    {
      "id": 2,
      "login": "defunkt",
      "html_url": "https://github.com/defunkt"
    },
    {
      "id": 3,
      "login": "pjhyett",
      "html_url": "https://github.com/pjhyett"
    },
    {
      "id": 4,
      "login": "wycats",
      "html_url": "https://github.com/wycats"
    },
    {
      "id": 5,
      "login": "ezmobius",
      "html_url": "https://github.com/ezmobius"
    },
    {
      "id": 6,
      "login": "ivey",
      "html_url": "https://github.com/ivey"
    },
    {
      "id": 7,
      "login": "evanphx",
      "html_url": "https://github.com/evanphx"
    },
    {
      "id": 17,
      "login": "vanpelt",
      "html_url": "https://github.com/vanpelt"
    },
    {
      "id": 18,
      "login": "wayneeseguin",
      "html_url": "https://github.com/wayneeseguin"
    },
    {
      "id": 19,
      "login": "brynary",
      "html_url": "https://github.com/brynary"
    }
  ],
  "pagination": {
    "number_results": null,
    "limit": 10,
    "next_page": {
      "since": "19"
    }
  },
  "input_errors": {}
}
In essence, we now have an endpoint that lists results but, instead of pulling its data from a database, it makes API calls. It also tracks pagination as expected, so you can use the data in pagination.next_page to fetch the next set of results, just as you would if this were backed by a database, e.g.:
$ curl 'http://localhost:8080?since=19'
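A client can repeat that step in a loop to walk through every page. The sketch below is one way to do it with only the Python standard library. The helper names (fetch_json, iterate_records) and the max_pages safety limit are hypothetical, and it assumes that a missing or empty pagination.next_page means there are no more results, which is worth verifying against your own endpoint.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iterate_records(base_url, fetch=fetch_json, max_pages=100):
    """Yield every record from a list endpoint by following pagination.next_page."""
    params = {}
    for _ in range(max_pages):
        url = f"{base_url}?{urlencode(params)}" if params else base_url
        body = fetch(url)
        yield from body["data"]
        # Assumption: no next_page data means we have reached the last page.
        params = body.get("pagination", {}).get("next_page") or {}
        if not params:
            break


if __name__ == "__main__":
    # Requires the example server above to be running locally.
    for user in iterate_records("http://localhost:8080", max_pages=3):
        print(user["login"])
```

Passing the fetch function as an argument keeps the loop easy to test without a running server.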
Mapping from Queries to API calls
The process of mapping a model query into an API request involves a few different methods which can be overridden to fully control the process. This is necessary in cases where an API behaves differently than expected by the API backend. This table outlines the methods involved and how they are used:
| Method | Description |
|---|---|
| records_url | Return the absolute URL to fetch, as well as any columns that were used to fill in routing parameters |
| records_method | Return the HTTP request method to use for the API call |
| conditions_to_request_parameters | Translate the query conditions into URL fragments, query parameters, or JSON body parameters |
| pagination_to_request_parameters | Translate the pagination data into URL fragments, query parameters, or JSON body parameters |
| sorts_to_request_parameters | Translate the sort directive(s) into URL fragments, query parameters, or JSON body parameters |
| map_records_response | Take the response from the API and return a list of dictionaries with the resulting records |
In short, the details of the query are stored in a clearskies.query.Query object which is passed around to these various methods. They use that information to adjust the URL, add query parameters, or add parameters into the JSON body. The API Backend will then execute an API call with those final details, and use the map_records_response method to pull the returned records out of the response from the API endpoint.
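As a rough illustration of the record discovery that happens when mapping a response, the standalone sketch below searches a decoded JSON body for dictionaries that contain expected column names. It is not the logic clearskies uses internally: the function names, the min_matches threshold, and the depth-first search order are assumptions made for the example.

```python
def looks_like_record(item, expected_columns, min_matches=1):
    """True if a dictionary contains enough of the expected column names."""
    return len(set(expected_columns).intersection(item)) >= min_matches


def find_records(response_data, expected_columns, min_matches=1):
    """Depth-first search of a decoded JSON body for record dictionaries."""
    if isinstance(response_data, list):
        # A non-empty list made up entirely of record-like dictionaries.
        if response_data and all(
            isinstance(item, dict) and looks_like_record(item, expected_columns, min_matches)
            for item in response_data
        ):
            return response_data
        children = response_data
    elif isinstance(response_data, dict):
        # A single record returned on its own.
        if looks_like_record(response_data, expected_columns, min_matches):
            return [response_data]
        children = response_data.values()
    else:
        return []

    # Otherwise keep looking inside nested containers.
    for child in children:
        found = find_records(child, expected_columns, min_matches)
        if found:
            return found
    return []


body = {"meta": {"total": 2}, "items": [{"id": 1, "login": "a"}, {"id": 2, "login": "b"}]}
print(find_records(body, {"id", "login"}))
# [{'id': 1, 'login': 'a'}, {'id': 2, 'login': 'b'}]
```

When an API wraps its records in an unusual structure that a search like this cannot find, overriding map_records_response is the place to handle it.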