CategoryTree
The category tree helps you do quick lookups on a typical category tree.
- Overview
- tree_model_class
- tree_parent_id_column_name
- tree_child_id_column_name
- tree_is_parent_column_name
- tree_level_column_name
- max_iterations
- load_relatives_strategy
- readable_parent_columns
- join_type
- where
- default
- setable
- is_readable
- is_writeable
- is_searchable
- is_temporary
- validators
- on_change_pre_save
- on_change_post_save
- on_change_save_finished
- created_by_source_type
- created_by_source_key
- created_by_source_strict
Overview
It’s a very niche tool. In general, graph databases solve this problem better, but it’s not always worth the effort of spinning up a new kind of database.
This column needs a dedicated tree table, where it pre-computes and stores the information needed for quick lookups about relationships in a category tree. Imagine you have a table that represents a standard category hierarchy:
CREATE TABLE categories (
id varchar(255),
parent_id varchar(255),
name varchar(255)
)
`parent_id`, in this case, would be a reference to the `categories` table itself, hence the hierarchy. This works fine as a starting point, but it gets tricky when you want to answer questions like "what are all the parent categories of category X?" or "what are all the child categories of category Y?". This column class solves that by building a tree table that caches this data as the categories are updated.
That table should look like this:
```sql
CREATE TABLE category_tree (
id varchar(255),
parent_id varchar(255),
child_id varchar(255),
is_parent tinyint(1),
level tinyint(1)
)
```
Then you would attach this column to your category model as a replacement for a typical BelongsToId relationship:
import clearskies

# The tree model stores the pre-computed parent/child relationships.
class Tree(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend(silent_on_missing_tables=True)
    id = clearskies.columns.Uuid()
    parent_id = clearskies.columns.String()
    child_id = clearskies.columns.String()
    is_parent = clearskies.columns.Boolean()
    level = clearskies.columns.Integer()

# The category model uses CategoryTree in place of a plain BelongsToId column.
class Category(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend(silent_on_missing_tables=True)
    id = clearskies.columns.Uuid()
    name = clearskies.columns.String()
    parent_id = clearskies.columns.CategoryTree(Tree)
    parent = clearskies.columns.BelongsToModel("parent_id")
    children = clearskies.columns.CategoryTreeChildren("parent_id")
    descendants = clearskies.columns.CategoryTreeDescendants("parent_id")
    ancestors = clearskies.columns.CategoryTreeAncestors("parent_id")

def test_category_tree(category: Category):
    root_1 = category.create({"name": "Root 1"})
    root_2 = category.create({"name": "Root 2"})
    sub_1_root_1 = category.create({"name": "Sub 1 of Root 1", "parent_id": root_1.id})
    sub_2_root_1 = category.create({"name": "Sub 2 of Root 1", "parent_id": root_1.id})
    sub_sub = category.create({"name": "Sub Sub", "parent_id": sub_1_root_1.id})
    sub_1_root_2 = category.create({"name": "Sub 1 of Root 2", "parent_id": root_2.id})
    return {
        "descendants_of_root_1": [descendant.name for descendant in root_1.descendants],
        "children_of_root_1": [child.name for child in root_1.children],
        "descendants_of_root_2": [descendant.name for descendant in root_2.descendants],
        "ancestors_of_sub_sub": [ancestor.name for ancestor in sub_sub.ancestors],
    }

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(test_category_tree),
    classes=[Category, Tree],
)
cli()
And if you invoke the above you will get:
{
"status": "success",
"error": "",
"data": {
"descendants_of_root_1": ["Sub 1 of Root 1", "Sub 2 of Root 1", "Sub Sub"],
"children_of_root_1": ["Sub 1 of Root 1", "Sub 2 of Root 1"],
"descendants_of_root_2": ["Sub 1 of Root 2"],
"ancestors_of_sub_sub": ["Root 1", "Sub 1 of Root 1"]
},
"pagination": {},
"input_errors": {}
}
In case it’s not clear, these terms are defined as follows:
- Descendants: All categories under a given category (recursively).
- Children: The direct descendants of a given category.
- Ancestors: All parents of a given category, ordered starting from the root category.
- Parent: The immediate parent of the category.
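To make the role of the tree table more concrete, here is a minimal, framework-independent sketch of the kind of pre-computed rows such a table can hold, and how the four lookups read from them. It is not the clearskies implementation: the `PARENTS` data, the helper names, and the exact meaning given to `is_parent` and `level` here are assumptions made for illustration.

```python
# Hypothetical parent pointers: category id -> parent id (None for a root).
PARENTS = {
    "root_1": None,
    "sub_1": "root_1",
    "sub_2": "root_1",
    "sub_sub": "sub_1",
}


def chain_to_root(category_id, parents):
    """Follow parent pointers and return the ancestors ordered from the root down."""
    chain = []
    current = parents[category_id]
    while current is not None:
        chain.append(current)
        current = parents[current]
    return list(reversed(chain))


def build_tree_rows(parents):
    """Pre-compute one row per (ancestor, category) pair."""
    rows = []
    for child_id in parents:
        for level, parent_id in enumerate(chain_to_root(child_id, parents)):
            rows.append({
                "parent_id": parent_id,
                "child_id": child_id,
                # Assumption: is_parent flags the immediate parent.
                "is_parent": parent_id == parents[child_id],
                # Assumption: level is the ancestor's depth, with the root at 0.
                "level": level,
            })
    return rows


TREE_ROWS = build_tree_rows(PARENTS)


def descendants(category_id):
    return [r["child_id"] for r in TREE_ROWS if r["parent_id"] == category_id]


def children(category_id):
    return [r["child_id"] for r in TREE_ROWS if r["parent_id"] == category_id and r["is_parent"]]


def ancestors(category_id):
    rows = sorted((r for r in TREE_ROWS if r["child_id"] == category_id), key=lambda r: r["level"])
    return [r["parent_id"] for r in rows]


def parent(category_id):
    matches = [r["parent_id"] for r in TREE_ROWS if r["child_id"] == category_id and r["is_parent"]]
    return matches[0] if matches else None
```

With rows like these, each lookup becomes a single filter on the tree rows instead of a recursive walk through the category table. The cost is that the rows have to be rebuilt whenever a category changes its parent.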
tree_model_class
Required
The model class that will persist our tree data
tree_parent_id_column_name
Optional
The column in the tree model that references the parent in the relationship
tree_child_id_column_name
Optional
The column in the tree model that references the child in the relationship
tree_is_parent_column_name
Optional
The column in the tree model that flags whether the parent in the relationship is the child’s immediate parent
tree_level_column_name
Optional
The column in the tree model that stores the level of the relationship in the tree
max_iterations
Optional
The maximum expected depth of the tree
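The description above does not say how the limit is enforced. One common reason for a cap like this is to stop a walk up the tree from looping forever when the data accidentally contains a cycle. The sketch below illustrates that general idea only; `walk_to_root` and its error message are hypothetical and are not taken from clearskies.

```python
def walk_to_root(category_id, parents, max_iterations=100):
    """Return the ancestors from the root down, giving up after max_iterations steps."""
    chain = []
    current = parents.get(category_id)
    steps = 0
    while current is not None:
        steps += 1
        if steps > max_iterations:
            raise ValueError(
                f"More than {max_iterations} levels above '{category_id}'; "
                "the tree may contain a cycle"
            )
        chain.append(current)
        current = parents.get(current)
    return list(reversed(chain))


# A well-formed tree and a broken one where two categories point at each other.
healthy = {"root": None, "sub": "root", "sub_sub": "sub"}
broken = {"a": "b", "b": "a"}

healthy_chain = walk_to_root("sub_sub", healthy)

try:
    walk_to_root("a", broken, max_iterations=10)
    cycle_detected = False
except ValueError:
    cycle_detected = True
```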
load_relatives_strategy
Optional
The strategy for loading relatives.
Choose whichever one works for your backend:
- JOIN: Use an actual JOIN (quick and efficient, but mostly only works for SQL backends).
- WHERE IN: Use a WHERE IN condition.
- INDIVIDUAL: Load each record separately. Works for any backend but is also the slowest.
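As a rough illustration of how the three strategies differ, the sketch below loads the descendants of one category from an in-memory SQLite database in each of the three ways. The table contents and function names are made up for this example, and the queries are a guess at the general shape of each strategy rather than the SQL that clearskies generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE categories (id TEXT, name TEXT);
    CREATE TABLE category_tree (parent_id TEXT, child_id TEXT, is_parent INTEGER, level INTEGER);
    INSERT INTO categories VALUES ('r', 'Root'), ('a', 'Sub A'), ('b', 'Sub B'), ('c', 'Sub Sub');
    INSERT INTO category_tree VALUES ('r', 'a', 1, 0), ('r', 'b', 1, 0), ('r', 'c', 0, 0), ('a', 'c', 1, 1);
    """
)


def child_ids(parent_id):
    rows = conn.execute("SELECT child_id FROM category_tree WHERE parent_id = ?", (parent_id,))
    return [row[0] for row in rows]


def descendants_join(parent_id):
    # JOIN: one query, the database combines both tables.
    rows = conn.execute(
        "SELECT c.name FROM categories c "
        "JOIN category_tree t ON t.child_id = c.id "
        "WHERE t.parent_id = ? ORDER BY c.name",
        (parent_id,),
    )
    return [row[0] for row in rows]


def descendants_where_in(parent_id):
    # WHERE IN: fetch the ids first, then load all matching records in one query.
    ids = child_ids(parent_id)
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT name FROM categories WHERE id IN ({placeholders}) ORDER BY name", ids
    )
    return [row[0] for row in rows]


def descendants_individual(parent_id):
    # INDIVIDUAL: one lookup per record. Works anywhere, but costs one round trip each.
    names = []
    for category_id in child_ids(parent_id):
        row = conn.execute("SELECT name FROM categories WHERE id = ?", (category_id,)).fetchone()
        names.append(row[0])
    return sorted(names)
```

All three functions return the same records; they differ only in how many queries they issue and in what the backend has to support.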
readable_parent_columns
Optional
The list of columns from the parent that should be included when converting this column to JSON.
When configuring readable columns for an endpoint, you can specify the BelongsToModel column. If you do this, you must set readable_parent_columns on the BelongsToId column to specify which columns from the parent model should be returned in the response. See this example:
import clearskies

class Owner(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    name = clearskies.columns.String()

class Pet(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    name = clearskies.columns.String()
    # readable_parent_columns controls which Owner columns appear in the response
    owner_id = clearskies.columns.BelongsToId(
        Owner,
        readable_parent_columns=["id", "name"],
    )
    owner = clearskies.columns.BelongsToModel("owner_id")

cli = clearskies.contexts.Cli(
    clearskies.endpoints.List(
        Pet,
        sortable_column_names=["id", "name"],
        readable_column_names=["id", "name", "owner"],
        default_sort_column_name="name",
    ),
    classes=[Owner, Pet],
    bindings={
        "memory_backend_default_data": [
            {
                "model_class": Owner,
                "records": [
                    {"id": "1-2-3-4", "name": "John Doe"},
                    {"id": "5-6-7-8", "name": "Jane Doe"},
                ],
            },
            {
                "model_class": Pet,
                "records": [
                    {"id": "a-b-c-d", "name": "Fido", "owner_id": "1-2-3-4"},
                    {"id": "e-f-g-h", "name": "Spot", "owner_id": "1-2-3-4"},
                    {"id": "i-j-k-l", "name": "Puss in Boots", "owner_id": "5-6-7-8"},
                ],
            },
        ],
    },
)

if __name__ == "__main__":
    cli()
With readable_parent_columns set on the Pet.owner_id column, and owner included in the list endpoint’s readable_column_names, the owner id and name are included under the owner key of each returned Pet dictionary:
$ ./test.py | jq
{
"status": "success",
"error": "",
"data": [
{
"id": "a-b-c-d",
"name": "Fido",
"owner": {
"id": "1-2-3-4",
"name": "John Doe"
}
},
{
"id": "i-j-k-l",
"name": "Puss in Boots",
"owner": {
"id": "5-6-7-8",
"name": "Jane Doe"
}
},
{
"id": "e-f-g-h",
"name": "Spot",
"owner": {
"id": "1-2-3-4",
"name": "John Doe"
}
}
],
"pagination": {},
"input_errors": {}
}
join_type
Optional
The type of join to use when searching on the parent.
where
Optional
Any additional conditions to place on the parent table when finding related records.
where should be a list containing a combination of conditions-as-strings, queries built from the columns themselves, or callable functions which accept the model and apply filters. This is primarily used in input validation to exclude values as allowed parents.
default
Optional
A default value to set for this column.
The default is only used when creating a record for the first time, and only if a value for this column has not been set.
import clearskies

class Widget(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    name = clearskies.columns.String(default="Jane Doe")

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        # create a record without providing any data, so the default is applied
        lambda widgets: widgets.create(no_data=True),
        model_class=Widget,
        readable_column_names=["id", "name"],
    ),
    classes=[Widget],
)

if __name__ == "__main__":
    cli()
Which when invoked returns:
{
"status": "success",
"error": "",
"data": {
"id": "03806afa-b189-4729-a43c-9da5aa17bf14",
"name": "Jane Doe"
},
"pagination": {},
"input_errors": {}
}
setable
Optional
is_readable
Optional
Whether or not this column can be converted to JSON and included in an API response.
If this is set to False for a column and you attempt to set that column as a readable_column in an endpoint, clearskies will throw an exception.
is_writeable
Optional
is_searchable
Optional
is_temporary
Optional
Whether or not this column is temporary. A temporary column is not persisted to the backend.
Temporary columns are useful when you want the developer or end user to set a value, but you use that value to trigger additional behavior rather than actually recording it. Temporary columns often team up with actions or are used to calculate other values. For instance, in the setable example, we had both an age and a date of birth column, with the age calculated from the date of birth. This obviously results in two columns with similar data. One of them can be marked as temporary: it will be available during the save operation, but it will be skipped when saving data to the backend:
import clearskies
import dateparser  # third-party package used to parse the date string

class Pet(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    name = clearskies.columns.String()
    # temporary: available during the save, but never written to the backend
    date_of_birth = clearskies.columns.Date(is_temporary=True)
    age = clearskies.columns.Integer(
        setable=lambda data, model, now: (
            now - dateparser.parse(model.latest("date_of_birth", data))
        ).total_seconds() / (86400 * 365),
    )
    created = clearskies.columns.Created()

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        lambda pets: pets.create({"name": "Spot", "date_of_birth": "2020-05-03"}),
        model_class=Pet,
        readable_column_names=["id", "age", "date_of_birth"],
    ),
    classes=[Pet],
)

if __name__ == "__main__":
    cli()
Which will return:
{
"status": "success",
"error": "",
"data": {
"id": "ee532cfa-91cf-4747-b798-3c6dcd79326e",
"age": 5,
"date_of_birth": null
},
"pagination": {},
"input_errors": {}
}
In other words, the date_of_birth column is empty. To be clear though, it’s not just empty: clearskies made no attempt to set it. If you were using an SQL database, you would not have to put a date_of_birth column in your table.
validators
Optional
on_change_pre_save
Optional
Actions to take during the pre-save step of the save process if the column has changed during the active save operation.
Pre-save happens before the data is persisted to the backend. Actions/callables in this step must return a dictionary. The data in the dictionary will be included in the save operation. Since the save hasn’t completed, any data in the model itself reflects the model before the save operation started. Actions in the pre-save step must NOT make any changes directly, but should ONLY return modified data for the save operation. In addition, they must be idempotent - they should always return the same value when called with the same data. This is because clearskies can call them more than once. If a pre-save hook changes the save data, then clearskies will call all the pre-save hooks again in case this new data needs to trigger further changes. Stateful changes should be reserved for the post_save or save_finished stages.
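The reason idempotency matters is easier to see with a toy version of the "call the hooks again until nothing changes" behavior described above. The sketch below is an assumption about the general pattern, not the clearskies implementation: `run_pre_save_hooks`, `max_passes`, the two hooks, and the fixed timestamp are all hypothetical.

```python
_MISSING = object()
FIXED_NOW = "2025-05-04T02:32:56+00:00"  # fixed timestamp so the hooks stay idempotent


def stamp_fulfilled(data):
    # Returns extra save data; never modifies anything directly.
    return {"fulfilled_at": FIXED_NOW} if data.get("status") == "Fulfilled" else {}


def flag_closed(data):
    # Depends on data added by another hook, so it only reacts on a later pass.
    return {"is_closed": True} if "fulfilled_at" in data else {}


def run_pre_save_hooks(hooks, data, max_passes=10):
    """Re-run every hook until a full pass adds nothing new to the save data."""
    data = dict(data)
    for _ in range(max_passes):
        changed = False
        for hook in hooks:
            for key, value in hook(data).items():
                if data.get(key, _MISSING) != value:
                    data[key] = value
                    changed = True
        if not changed:
            return data
    raise RuntimeError("pre-save hooks never settled; is one of them not idempotent?")


final_data = run_pre_save_hooks([flag_closed, stamp_fulfilled], {"status": "Fulfilled"})
open_data = run_pre_save_hooks([flag_closed, stamp_fulfilled], {"status": "Open"})
```

Here `flag_closed` returns nothing on the first pass and only contributes once `stamp_fulfilled` has added `fulfilled_at`, which is why the hooks are run again. A hook that returned a different value on every call (for example, a fresh random id) would keep changing the data and the loop would never settle.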
Callables and actions can request any dependencies provided by the DI system. In addition, they can request two named parameters:
- model: the model involved in the save operation
- data: the new data being saved
The key here is that the defined actions will be invoked regardless of how the save happens. Whether the model.save() function is called directly or the model is created/modified via an endpoint, your business logic will always be executed. This makes for easy reusability and consistency throughout your application.
Here’s an example where we want to record a timestamp anytime an order status becomes a particular value:
import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    status = clearskies.columns.Select(
        ["Open", "On Hold", "Fulfilled"],
        on_change_pre_save=[
            # return extra save data instead of changing anything directly
            lambda data, utcnow: {"fulfilled_at": utcnow} if data["status"] == "Fulfilled" else {},
        ],
    )
    fulfilled_at = clearskies.columns.Datetime()

wsgi = clearskies.contexts.WsgiRef(
    clearskies.endpoints.Create(
        model_class=Order,
        writeable_column_names=["status"],
        readable_column_names=["id", "status", "fulfilled_at"],
    ),
)
wsgi()
You can then see the difference depending on what you set the status to:
$ curl http://localhost:8080 -d '{"status":"Open"}' | jq
{
"status": "success",
"error": "",
"data": {
"id": "a732545f-51b3-4fd0-a6cf-576cf1b2872f",
"status": "Open",
"fulfilled_at": null
},
"pagination": {},
"input_errors": {}
}
$ curl http://localhost:8080 -d '{"status":"Fulfilled"}' | jq
{
"status": "success",
"error": "",
"data": {
"id": "c288bf43-2246-48e4-b168-f40cbf5376df",
"status": "Fulfilled",
"fulfilled_at": "2025-05-04T02:32:56+00:00"
},
"pagination": {},
"input_errors": {}
}
on_change_post_save
Optional
Actions to take during the post-save step of the process if the column has changed during the active save.
Post-save happens after the data is persisted to the backend but before the full save process has finished. Since the data has been persisted to the backend, any data returned by the callables/actions will be ignored. If you need to make data changes you’ll have to execute a separate save operation. Since the save hasn’t finished, the model is not yet updated with the new data, and any data you fetch out of the model will reflect the data in the model before the save started.
Callables and actions can request any dependencies provided by the DI system. In addition, they can request three named parameters:
- model: the model involved in the save operation
- data: the new data being saved
- id: the id of the record being saved
Here’s an example of using a post-save action to record a simple audit trail when the order status changes:
import clearskies

class Order(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    status = clearskies.columns.Select(
        ["Open", "On Hold", "Fulfilled"],
        on_change_post_save=[
            lambda model, data, order_histories: order_histories.create({
                "order_id": model.latest("id", data),
                "event": "Order status changed to " + data["status"],
            }),
        ],
    )

class OrderHistory(clearskies.Model):
    id_column_name = "id"
    backend = clearskies.backends.MemoryBackend()
    id = clearskies.columns.Uuid()
    event = clearskies.columns.String()
    order_id = clearskies.columns.BelongsToId(Order)
    # include microseconds in the created_at time so that we can sort our example by created_at
    # and they come out in order (since, for our test program, they will all be created in the same second).
    created_at = clearskies.columns.Created(date_format="%Y-%m-%d %H:%M:%S.%f")

def test_post_save(orders: Order, order_histories: OrderHistory):
    my_order = orders.create({"status": "Open"})
    my_order.status = "On Hold"
    my_order.save()
    my_order.save({"status": "Open"})
    my_order.save({"status": "Fulfilled"})
    return order_histories.where(OrderHistory.order_id.equals(my_order.id)).sort_by("created_at", "asc")

cli = clearskies.contexts.Cli(
    clearskies.endpoints.Callable(
        test_post_save,
        model_class=OrderHistory,
        return_records=True,
        readable_column_names=["id", "event", "created_at"],
    ),
    classes=[Order, OrderHistory],
)
cli()
Note that in our on_change_post_save lambda function, we use model.latest("id", data). We can’t just use data["id"] because data is a dictionary containing the information present in the save. During the create operation data["id"] will be populated, but during the subsequent edit operations it won’t be, since only the status column is changing. model.latest("id", data) is basically just shorthand for data.get("id", model.id). On the other hand, we can just use data["status"] because the on_change hook is attached to the status field, so it will only fire when status is being changed, which means that the status key is guaranteed to be in the dictionary when the lambda is executed.
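The fallback behavior described above can be sketched without clearskies at all. `FakeOrder` below is a hypothetical stand-in that only implements the `data.get(column, model.column)` shorthand from the text; it is not the real model class.

```python
class FakeOrder:
    """Hypothetical stand-in for a model; only what this sketch needs."""

    def __init__(self, id=None, status=None):
        self.id = id
        self.status = status

    def latest(self, column_name, data):
        # Prefer the value in the pending save, fall back to what the model already holds.
        return data.get(column_name, getattr(self, column_name))


# During a create, the id is part of the data being saved.
new_order = FakeOrder()
create_data = {"id": "order-1", "status": "Open"}
id_during_create = new_order.latest("id", create_data)

# During an edit, only the changed column is in the data, so the id comes from the model.
existing_order = FakeOrder(id="order-1", status="Open")
edit_data = {"status": "On Hold"}
id_during_edit = existing_order.latest("id", edit_data)
status_during_edit = existing_order.latest("status", edit_data)
```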
Finally, the post-save action has a named parameter called id, so in this specific case we could use:
lambda data, id, order_histories: order_histories.create({"order_id": id, "event": data["status"]})
When we execute the above script it will return something like:
{
"status": "success",
"error": "",
"data": [
{
"id": "c550d714-839b-4f25-a9e1-bd7e977185ff",
"event": "Order status changed to Open",
"created_at": "2025-05-04T14:09:42.960119+00:00"
},
{
"id": "f393d7b0-da21-4117-a7a4-0359fab802bb",
"event": "Order status changed to On Hold",
"created_at": "2025-05-04T14:09:42.960275+00:00"
},
{
"id": "5b528a10-4a08-47ae-938c-fc7067603f8e",
"event": "Order status changed to Open",
"created_at": "2025-05-04T14:09:42.960395+00:00"
},
{
"id": "91f77a88-1c38-49f7-aa1e-7f97bd9f962f",
"event": "Order status changed to Fulfilled",
"created_at": "2025-05-04T14:09:42.960514+00:00"
}
],
"pagination": {},
"input_errors": {}
}
on_change_save_finished
Optional
Actions to take during the save-finished step of the save process if the column has changed in the save.
Save-finished happens after the save process has completely finished and the model is updated with the final data. Any data returned by these actions will be ignored, since the save has already finished. If you need to make data changes you’ll have to execute a separate save operation.
Callables and actions can request any dependencies provided by the DI system. In addition, they can request the following parameter:
- model: the model involved in the save operation
Unlike pre_save and post_save, data is not provided because this data has already been merged into the model. If you need some context from the completed save operation, use methods like was_changed and previous_value.
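To tie the three stages together, here is a toy save lifecycle showing what each kind of hook can see. It is a simplified sketch and not how clearskies is implemented: `ToyModel`, its attributes, and the way `was_changed` and `previous_value` are computed are all assumptions made for illustration.

```python
class ToyModel:
    """Hypothetical model that runs hooks in pre-save, post-save and save-finished order."""

    def __init__(self, pre_save=(), post_save=(), save_finished=()):
        self.data = {}       # what the model currently holds
        self._previous = {}  # what the model held before the last save
        self.backend = {}    # pretend persistent storage
        self.pre_save = list(pre_save)
        self.post_save = list(post_save)
        self.save_finished = list(save_finished)

    def was_changed(self, key):
        return self._previous.get(key) != self.data.get(key)

    def previous_value(self, key):
        return self._previous.get(key)

    def save(self, new_data):
        data = dict(new_data)
        # 1. pre-save: hooks return extra data that is merged into the save
        for hook in self.pre_save:
            data.update(hook(self, data))
        # 2. persist to the backend
        self.backend.update(data)
        # 3. post-save: data is stored, but the model still holds the old values
        for hook in self.post_save:
            hook(self, data)
        # 4. merge the saved data into the model
        self._previous = dict(self.data)
        self.data.update(data)
        # 5. save-finished: hooks only receive the updated model
        for hook in self.save_finished:
            hook(self)


log = []
order = ToyModel(
    pre_save=[lambda model, data: {"fulfilled_at": "now"} if data.get("status") == "Fulfilled" else {}],
    post_save=[lambda model, data: log.append(("post_save", model.data.get("status"), data["status"]))],
    save_finished=[lambda model: log.append(("save_finished", model.previous_value("status"), model.data["status"]))],
)
order.save({"status": "Open"})
order.save({"status": "Fulfilled"})
```

In the log, each post_save entry pairs the model's old status with the new status from `data`, while each save_finished entry has to ask the model for the previous value because `data` is no longer passed in.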
created_by_source_type
Optional
created_by_source_key
Optional
created_by_source_strict
Optional