Querying Records

  1. where
  2. first
  3. find
  4. paginate_all
  5. sort_by
  6. limit
  7. pagination
  8. join
  9. group_by
  10. select
  11. select_all
  12. has_query
  13. as_query

where

Add a condition to a query.

The where method (in combination with the find method) is typically the starting point for query records in a model. You don’t have to add a condition to a model in order to fetch records, but of course it’s a very common use case. Conditions in clearskies can be built from the columns or can be constructed as SQL-like string conditions, e.g. model.where("name=Bob") or model.where(model.name.equals("Bob")). The latter provides strict type-checking, while the former does not. Either way they have the same result. The list of supported operators for a given column can be seen by checking the _allowed_search_operators attribute of the column class. Most columns accept all allowed operators, which are:

  • ”<=>”
  • ”!=”
  • ”<=”
  • ”>=”
  • ”>”
  • ”<”
  • ”=”
  • “in”
  • “is not null”
  • “is null”
  • “like”

When working with string conditions, it is safe to inject user input into the condition. The allowed format for conditions is very simple: f"{column_name}\s?{operator}\s?{value}". This makes it possible to unambiguously separate all three pieces from eachother. It’s not possible to inject malicious payloads into either the column names or operators because both are checked against a strict allow list (e.g. the columns declared in the model or the list of allowed operators above). The value is then extracted from the leftovers, and this is provided to the backend separately so it can use it appropriately (e.g. using prepared statements for the cursor backend). Of course, you generally shouldn’t have to inject user input into conditions very often because, most often, the various list/search endpoints do this for you, but if you have to do it there are no security concerns.

You can include a table name before the column name, with the two separated by a period. As always, if you do this, ensure that you include a supporting join statement (via the join method - see it for examples).

When you call the where method it returns a new model object with it’s query configured to include the additional condition. The original model object remains unchanged. Multiple conditions are always joined with AND. There is no explicit option for OR. The closest is using an IN condition.

To access the results you have to iterate over the resulting model. If you are only expecting one result and want to work directly with it, then you can use model.find(condition) or model.where(condition).first().

Example:

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Jane", "status": "Pending", "total": 30})

    return [order.user_id for order in orders.where("status=Pending").where(Order.total.greater_than(25))]

cli = clearskies.contexts.Cli(
    my_application,
    classes=[Order],
)
cli()

Which, if ran, returns: ["Jane"]

first

Return the first model for a given query.

The where method returns an object meant to be iterated over. If you are expecting your query to return a single record, then you can use first to turn that directly into the matching model so you don’t have to iterate over it:

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Jane", "status": "Pending", "total": 30})

    jane = orders.where("status=Pending").where(Order.total.greater_than(25)).first()
    jane.total = 35
    jane.save()

    return {
        "user_id": jane.user_id,
        "total": jane.total,
    }

cli = clearskies.contexts.Cli(
    my_application,
    classes=[Order],
)
cli()

find

Return the first model matching a given where condition.

This is just shorthand for models.where("column=value").find(). Example:

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Jane", "status": "Pending", "total": 30})

    jane = orders.find("user_id=Jane")
    jane.total = 35
    jane.save()

    return {
        "user_id": jane.user_id,
        "total": jane.total,
    }

cli = clearskies.contexts.Cli(
    my_application,
    classes=[Order],
)
cli()

paginate_all

Loop through all available pages of results and returns a list of all models that match the query.

If you don’t set a limit on a query, some backends will return all records but some backends have a default maximum number of results that they will return. In the latter case, you can use paginate_all to fetch all records by instructing clearskies to iterate over all pages. This is possible because backends are required to define how pagination works in a way that clearskies can automatically understand and use. To demonstrate this, the following example sets a limit of 1 which stops the memory backend from returning everything, and then uses paginate_all to fetch all records. The memory backend doesn’t have a default limit, so in practice the paginate_all is unnecessary here, but this is done for demonstration purposes.

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Alice", "status": "Pending", "total": 30})
    orders.create({"user_id": "Bob", "status": "Pending", "total": 26})

    return orders.limit(1).paginate_all()

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        my_application,
        model_class=Order,
        readable_column_names=["user_id", "total"],
    ),
    classes=[Order],
)
cli()

NOTE: this loads up all records in memory before returning (e.g. it isn’t using generators yet), so expect delays for large record sets.

sort_by

Add a sort by clause to the query. You can sort by up to two columns at once.

Example:

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Alice", "status": "Pending", "total": 30})
    orders.create({"user_id": "Bob", "status": "Pending", "total": 26})

    return orders.sort_by("user_id", "asc", secondary_column_name="total", secondary_direction="desc")

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        my_application,
        model_class=Order,
        readable_column_names=["user_id", "total"],
    ),
    classes=[Order],
)
cli()

limit

Set the number of records to return.

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Alice", "status": "Pending", "total": 30})
    orders.create({"user_id": "Bob", "status": "Pending", "total": 26})

    return orders.limit(2)

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        my_application,
        model_class=Order,
        readable_column_names=["user_id", "total"],
    ),
    classes=[Order],
)
cli()

pagination

Set the pagination parameter(s) for the query.

The exact details of how pagination work depend on the backend. For instance, the cursor and memory backend expect to be given a start parameter, while an API backend will vary with the API, and the dynamodb backend expects a kwarg called cursor. As a result, it’s necessary to check the backend documentation to understand how to properly set pagination. The endpoints automatically account for this because backends are required to declare pagination details via the allowed_pagination_keys method. If you attempt to set invalid pagination data via this method, clearskies will raise a ValueError.

Example:

import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.String()
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(orders):
    orders.create({"user_id": "Bob", "status": "Pending", "total": 25})
    orders.create({"user_id": "Alice", "status": "In Progress", "total": 15})
    orders.create({"user_id": "Alice", "status": "Pending", "total": 30})
    orders.create({"user_id": "Bob", "status": "Pending", "total": 26})

    return orders.sort_by("total", "asc").pagination(start=2)

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        my_application,
        model_class=Order,
        readable_column_names=["user_id", "total"],
    ),
    classes=[Order],
)
cli()

However, if the return line in my_application is switched for either of these:

return orders.sort_by("total", "asc").pagination(start="asdf")
return orders.sort_by("total", "asc").pagination(something_else=5)

Will result in an exception that explains exactly what is wrong.

join

Add a join clause to the query.

As with the where method, this expects a string which is parsed accordingly. The syntax is not as flexible as SQL and expects a format of:

[left|right|inner]? join [right_table_name] ON [right_table_name].[right_column_name]=[left_table_name].[left_column_name].

This is case insensitive. Aliases are allowed. If you don’t specify a join type it defaults to inner. Here are two examples of valid join statements:

  • join orders on orders.user_id=users.id
  • left join user_orders as orders on orders.id=users.id

Note that joins are not strictly limited to SQL-like backends, but of course no all backends will support joining.

A basic example:

import clearskies

class User(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    name = clearskies.columns.String()

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    user_id = clearskies.columns.BelongsToId(User, readable_parent_columns=["id", "name"])
    user = clearskies.columns.BelongsToModel("user_id")
    status = clearskies.columns.Select(["Pending", "In Progress"])
    total = clearskies.columns.Float()

def my_application(users, orders):
    jane = users.create({"name": "Jane"})
    another_jane = users.create({"name": "Jane"})
    bob = users.create({"name": "Bob"})

    # Jane's orders
    orders.create({"user_id": jane.id, "status": "Pending", "total": 25})
    orders.create({"user_id": jane.id, "status": "Pending", "total": 30})
    orders.create({"user_id": jane.id, "status": "In Progress", "total": 35})

    # Another Jane's orders
    orders.create({"user_id": another_jane.id, "status": "Pending", "total": 15})

    # Bob's orders
    orders.create({"user_id": bob.id, "status": "Pending", "total": 28})
    orders.create({"user_id": bob.id, "status": "In Progress", "total": 35})

    # return all orders for anyone named Jane that have a status of Pending
    return orders.join("join users on users.id=orders.user_id").where("users.name=Jane").sort_by("total", "asc").where("status=Pending")

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        my_application,
        model_class=Order,
        readable_column_names=["user", "total"],
    ),
    classes=[Order, User],
)
cli()

group_by

Add a group by clause to the query.

You just provide the name of the column to group by. Of course, not all backends support a group by clause.

select

Add some additional columns to the select part of the query.

This method returns a new object with the updated query. The original model object is unmodified. Multiple calls to this method add together. The following:

models.select("column_1 column_2").select("column_3")

will select column_1, column_2, column_3 in the final query.

select_all

Set whether or not to select all columns with the query.

This method returns a new object with the updated query. The original model object is unmodified.

has_query

Whether or not this model instance represents a query.

The model class is used for both querying records and modifying individual records. As a result, each model class instance keeps track of whether it is being used to query things, or whether it represents an individual record. This distinction is not usually very important to the developer (because there’s no good reason to use one model for both), but it may occassionaly be useful to tell how a given model is being used. Clearskies itself does use this to ensure that you can’t accidentally use a single model instance for both purposes, mostly because when this happens it’s usually a sign of a bug.

import clearskies

class User(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()

    id = clearskies.columns.Uuid()
    name = clearskies.columns.String()

def my_application(users):
    jane = users.create({"name": "Jane"})
    jane_instance_has_query = jane.has_query()

    some_search = users.where("name=Jane")
    some_search_has_query = some_search.has_query()

    invalid_request_error = ""
    try:
        some_search.save({"not": "valid"})
    except ValueError as e:
        invalid_request_error = str(e)

    return {
        "jane_instance_has_query": jane_instance_has_query,
        "some_search_has_query": some_search_has_query,
        "invalid_request_error": invalid_request_error,
    }

cli = clearskies.contexts.Cli(
    my_application,
    classes=[User],
)
cli()

Which if you run will return:

{
    "jane_instance_has_query": false,
    "some_search_has_query": true,
    "invalid_request_error": "You attempted to save/read record data for a model being used to make a query.  This is not allowed, as it is typically a sign of a bug in your application code."
}

as_query

Make the model queryable.

This is used to remove the ambiguity of attempting execute a query against a model object that stores a record.

The reason this exists is because the model class is used both to query as well as to operate on single records, which can cause subtle bugs if a developer accidentally confuses the two usages. Consider the following (partial) example:

def some_function(models):
    model = models.find("id=5")
    if model:
        models.save({"test": "example"})
    other_record = model.find("id=6")

In the above example it seems likely that the intention was to use model.save(), not models.save(). Similarly, the last line should be models.find(), not model.find(). To minimize these kinds of issues, clearskies won’t let you execute a query against an individual model record, nor will it let you execute a save against a model being used to make a query. In both cases, you’ll get an exception from clearskies, as the models track exactly how they are being used.

In some rare cases though, you may want to start a new query aginst a model that represents a single record. This is most common if you have a function that was passed an individual model, and you’d like to use it to fetch more records without having to inject the model class more generally. That’s where the as_query() method comes in. It’s basically just a way of telling clearskies “yes, I really do want to start a query using a model that represents a record”. So, for example:

def some_function(models):
    model = models.find("id=5")
    more_models = model.where("test=example")  # throws an exception.
    more_models = model.as_query().where("test=example")  # works as expected.