SQLAlchemy¶

Models¶

For optional str, enum and dict fields, use Field(default="") or Field(default_factory=dict), not the ... | None annotation.

See also

Define tables and Django models
For nullable datetime, int and Decimal fields, use the ... | None annotation, not the Optional[...] annotation or Field(nullable=True).
For timezone-aware datetime field, use Field(sa_column=Column(DateTime(timezone=True)), to avoid the mypy error:
```
error: No overload variant of "Field" matches argument type "DateTime"  [call-overload]
```
Attention

Column() is nullable=True by default. If the field isn’t nullable, set Column(..., nullable=False).

Attention

If needed, set default= on Field(), not Column(). BaseModel.model_validate ignores default= on Column().

See also

SQLModel issue on timezone-aware datetime fields
For other fields, use sa_type and sa_column_kwargs, not sa_column, to avoid conflicts between SQLModel and SQLAlchemy.

Sessions¶

Read SQLAlchemy’s Session Basics, in particular:

Adding New or Existing Items

“For transient (i.e. brand new) instances, Session.add() will have the effect of an INSERT taking place for those instances upon the next flush. For instances which are persistent (i.e. were loaded by this session), they are already present and do not need to be added.”
Flushing

With autoflush=True, “the flush step is nearly always done transparently. Specifically, the flush occurs before any individual SQL statement is issued as a result of a Query …, as well as within the Session.commit() call before the transaction is committed.”
When do I construct a Session, when do I commit it, and when do I close it?

For web applications, “the basic pattern is create a Session at the start of a web request, call the Session.commit() method at the end of web requests that do POST, PUT, or DELETE, and then close the session at the end of web request”
Expiring / Refreshing (also under State Management, in particular, When to Expire or Refresh)

In SQLAlchemy, as SQLModel documents, if you access an instance (but not its attributes) after session.commit() – like when constructing a JSON response – then “something unexpected happens” by default. We follow the advice from the answer to the previous question:

“It’s also usually a good idea to set Session.expire_on_commit to False so that subsequent access to objects that came from a Session within the view layer do not need to emit new SQL queries to refresh the objects, if the transaction has been committed already.”

See also

I’m re-loading data with my Session but it isn’t seeing changes that I committed elsewhere
My Query does not return the same number of objects as query.count() tells me - why?

Flushing¶

Use session.add(instance) to INSERT rows.
Use instance.related = related, not instance.related_id = related.id.

Attention

Otherwise, if session.flush() is not called after session.add(related), then related.id is None!
Use the app.models.ActiveRecordMixin.create() and app.models.ActiveRecordMixin.update() methods, which call session.flush() to avoid such errors.

Committing¶

Credere is an email-centered service. Until an email is sent, processing is incomplete. Send emails after all database queries (other than Message creation, which depends on the message ID), then commit. That way, after emails are sent, only integrity errors could cause the transaction to rollback (unfortunately, sent emails can’t be undone).
Commit before adding background tasks and returning responses, to ensure changes are persisted before irreversible actions are taken.
In a for-loop, commit after sending an email, so that if a later query fails, we don’t send repeat emails on the next run. This is contrary to the advice in Session Basics:

“For a command-line script, the application would create a single, global Session that is established when the program begins to do its work, and commits it right as the program is completing its task.” (emphasis added)

Query API¶

Use the Legacy Query API. (The project started with SQLAlchemy 1.4. 2.0 syntax is more verbose.)

SELECT¶

When selecting specific columns (like session.query(models.MyModel.field)), if the query is in a…

For-loop, do, for example:

for (lender_id,) in session.query(models.Lender.id):
    print(lender_id)

or:

for name, value in session.query(...):
    print(name, value)

NOT:

for row in session.query(models.Lender.id):  # AVOID
    print(row[0])

If-statement, do, for example:

if lender_id := session.query(models.Lender.id).limit(1).scalar():
    print(lender_id)

NOT:

if row := session.query(models.Lender.id).first(): # AVOID
    print(row[0])

Tip

Maintainers can find queries for specific columns using the regular expression:

session.query\((models\.\w+\.|(?!models)\w+\.)

JOIN¶

To join the Award model, always explicitly use join(Award, Award.id == Application.award_id), because we want to count applications or borrowers only. We don’t want to count awards, like with join(Award, Award.borrower_id == Borrower.id).
To join another model, use join(model), not join(model, model.… == other.…). If an ON clause is needed, use the order join(model, model.… == other.…), not join(model, other.… == model.…).

WHERE¶

Use filter, not filter_by, to avoid ambiguity.
Use filter(a, b, c), not filter(a).filter(b).filter(c).

Chains¶

Query instance methods can be chained in any order, but typically:

distinct
join

Note

“the order in which each call to the join() method occurs is important.”
outerjoin
options, with joinedload or defaultload
filter, not where
group_by
having
order_by
limit
offset

Execution¶

Query instances must be executed with one of:

SELECT
- __iter__
- all: all rows as list
- first: at most one row
- one: exactly one row, or error
- scalar: the first column of one_or_none
- count: row count as int
update
delete

Attention

Do not use a query in a condition, without executing it! bool(query) returns True even if the result would be empty.