SQLAlchemy

Models

  • For optional str, enum and dict fields, use Field(default="") or Field(default_factory=dict), not the ... | None annotation.

  • For nullable datetime, int and Decimal fields, use the ... | None annotation, not the Optional[...] annotation or Field(nullable=True).

  • For timezone-aware datetime field, use Field(sa_column=Column(DateTime(timezone=True)), to avoid the mypy error:

    error: No overload variant of "Field" matches argument type "DateTime"  [call-overload]
    

    Attention

    Column() is nullable=True by default. If the field isn’t nullable, set Column(..., nullable=False).

    Attention

    If needed, set default= on Field(), not Column(). BaseModel.model_validate ignores default= on Column().

  • For other fields, use sa_type and sa_column_kwargs, not sa_column, to avoid conflicts between SQLModel and SQLAlchemy.

Sessions

Read SQLAlchemy’s Session Basics, in particular:

Flushing

Committing

  • Credere is an email-centered service. Until an email is sent, processing is incomplete. Send emails after all database queries (other than Message creation, which depends on the message ID), then commit. That way, after emails are sent, only integrity errors could cause the transaction to rollback (unfortunately, sent emails can’t be undone).

  • Commit before adding background tasks and returning responses, to ensure changes are persisted before irreversible actions are taken.

  • In a for-loop, commit after sending an email, so that if a later query fails, we don’t send repeat emails on the next run. This is contrary to the advice in Session Basics:

    “For a command-line script, the application would create a single, global Session that is established when the program begins to do its work, and commits it right as the program is completing its task.” (emphasis added)

Query API

Use the Legacy Query API. (The project started with SQLAlchemy 1.4. 2.0 syntax is more verbose.)

SELECT

When selecting specific columns (like session.query(models.MyModel.field)), if the query is in a…

  • For-loop, do, for example:

    for (lender_id,) in session.query(models.Lender.id):
        print(lender_id)
    

    or:

    for name, value in session.query(...):
        print(name, value)
    

    NOT:

    for row in session.query(models.Lender.id):  # AVOID
        print(row[0])
    
  • If-statement, do, for example:

    if lender_id := session.query(models.Lender.id).limit(1).scalar():
        print(lender_id)
    

    NOT:

    if row := session.query(models.Lender.id).first(): # AVOID
        print(row[0])
    

Tip

Maintainers can find queries for specific columns using the regular expression:

session.query\((models\.\w+\.|(?!models)\w+\.)

JOIN

  • To join the Award model, always explicitly use join(Award, Award.id == Application.award_id), because we want to count applications or borrowers only. We don’t want to count awards, like with join(Award, Award.borrower_id == Borrower.id).

  • To join another model, use join(model), not join(model, model.… == other.…). If an ON clause is needed, use the order join(model, model.… == other.…), not join(model, other.… == model.…).

WHERE

  • Use filter, not filter_by, to avoid ambiguity.

  • Use filter(a, b, c), not filter(a).filter(b).filter(c).

Chains

Query instance methods can be chained in any order, but typically:

Execution

Query instances must be executed with one of:

Attention

Do not use a query in a condition, without executing it! bool(query) returns True even if the result would be empty.