SQLAlchemy Hybrid Properties and Subqueries: Checking Whether a Student Passed All of Their Latest Exams

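I ran into an interesting scenario: students take exams in several subjects, and each subject may have multiple exam records, but only the most recent result per subject matters. The goal is to determine whether a student passed the most recent exam in every subject.

The idea is to implement this check as a SQLAlchemy hybrid property so that it can be used directly in a query, for example select(Student).where(Student.passed).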

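Breaking Down the Problem

The core task is to build a SQL query that expresses "the most recent exam in every subject was passed." The original SQL statement already has a clear two-level structure: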

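  1. The inner subquery finds, for each student and each subject, the most recent exam record and whether it was passed.
  2. The outer query counts each student's failed latest exams. If the count is 0, the student has passed all of them.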
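
The SQL statement itself is not reproduced in this post. As a rough reconstruction of the two steps above, the sketch below runs one possible form of such a query against an in-memory SQLite database through Python's standard sqlite3 module. The table and column names mirror the models defined later in this post; the sample rows and the QUERY string are assumptions made for illustration, not the original statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exam (
    idx INTEGER PRIMARY KEY,
    student_idx INTEGER NOT NULL,
    subject_idx INTEGER NOT NULL,
    passed INTEGER NOT NULL,
    completed_at TEXT NOT NULL
);
-- Hypothetical sample rows: student 1 retakes subject 2 and passes,
-- student 2 retakes subject 1 and fails again.
INSERT INTO exam (student_idx, subject_idx, passed, completed_at) VALUES
    (1, 1, 1, '2023-01-01'),
    (1, 2, 0, '2023-01-01'),
    (1, 2, 1, '2024-01-01'),
    (2, 1, 0, '2023-01-01'),
    (2, 1, 0, '2024-01-01'),
    (2, 2, 1, '2023-01-01');
""")

QUERY = """
SELECT e.student_idx,
       SUM(CASE WHEN e.passed = 0 THEN 1 ELSE 0 END) = 0 AS passed_all
FROM exam AS e
JOIN (
    -- Step 1: latest completion time per (student, subject)
    SELECT student_idx, subject_idx, MAX(completed_at) AS max_completed_at
    FROM exam
    GROUP BY student_idx, subject_idx
) AS latest
  ON  e.student_idx = latest.student_idx
  AND e.subject_idx = latest.subject_idx
  AND e.completed_at = latest.max_completed_at
-- Step 2: zero failed latest exams means the student passed everything
GROUP BY e.student_idx
ORDER BY e.student_idx
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [(1, 1), (2, 0)]
```

The dates are stored as ISO-8601 strings, which sort correctly as text, so MAX(completed_at) picks the latest attempt.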

The hard part is turning that SQL query into a SQLAlchemy hybrid property expression.

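Solution: Step by Step

1. Building the latest_exams Subquery

First, translate the part that finds each student's most recent exam record per subject into a SQLAlchemy query expression. The full model definitions, including the hybrid property, look like this: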

from datetime import datetime

from sqlalchemy import Column, ForeignKey, Table, func, select
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.orm import (
    DeclarativeBase,
    Mapped,
    MappedAsDataclass,
    mapped_column,
    relationship,
)


# Combining MappedAsDataclass with DeclarativeBase makes every model a dataclass.
class Base(MappedAsDataclass, DeclarativeBase):
    pass


# Association table for the many-to-many link between students and subjects.
student_subjects = Table(
    "student_subjects",
    Base.metadata,
    Column("student_idx", ForeignKey("student.idx"), primary_key=True),
    Column("subject_idx", ForeignKey("subject.idx"), primary_key=True),
)


class Student(Base):
    __tablename__ = "student"
    idx: Mapped[int] = mapped_column(primary_key=True, init=False, autoincrement=True)
    name: Mapped[str] = mapped_column()
    exams: Mapped[list["Exam"]] = relationship(back_populates="student", init=False, default_factory=list)
    subjects: Mapped[list["Subject"]] = relationship(back_populates="students", init=False, default_factory=list, secondary=student_subjects)

    @property
    def latest_exams(self):
        # Instance-only helper: the most recent exam for each enrolled subject.
        ret = []

        for subject in self.subjects:
            exams = [exam for exam in self.exams if exam.subject == subject]
            exams.sort(key=lambda x: x.completed_at, reverse=True)

            if exams:
                ret.append(exams[0])

        return ret

    @hybrid_property
    def passed(self):
        # Instance level: evaluated in Python on a single loaded Student.
        return all(exam.passed for exam in self.latest_exams)

    @passed.expression
    def passed(cls):
        # Class level: returns a SQL expression for use in queries.

        # Latest completion time for each (student, subject) pair.
        latest_exams = (
            select(
                Exam.student_idx,
                Exam.subject_idx,
                func.max(Exam.completed_at).label("max_completed_at"),
            )
            .group_by(Exam.student_idx, Exam.subject_idx)
            .subquery()
        )

        # Join back to Exam to get the pass/fail flag of each latest exam.
        latest_exam_passed = (
            select(
                Exam.student_idx,
                Exam.passed,
            )
            .join(
                latest_exams,
                (Exam.student_idx == latest_exams.c.student_idx)
                & (Exam.subject_idx == latest_exams.c.subject_idx)
                & (Exam.completed_at == latest_exams.c.max_completed_at),
            )
            .subquery()
        )

        # Count the failed latest exams of the current student. The filter on
        # cls.idx correlates this scalar subquery with the enclosing Student
        # query; a count of 0 means every latest exam was passed.
        return (
            select(func.count() == 0)
            .select_from(latest_exam_passed)
            .where(latest_exam_passed.c.student_idx == cls.idx)
            .where(latest_exam_passed.c.passed == False)  # noqa: E712 (keep failed exams only)
            .label("passed")
        )


class Subject(Base):
    __tablename__ = "subject"
    idx: Mapped[int] = mapped_column(primary_key=True, init=False, autoincrement=True)
    name: Mapped[str] = mapped_column()
    exams: Mapped[list["Exam"]] = relationship(back_populates="subject", init=False)
    students: Mapped[list["Student"]] = relationship(back_populates="subjects", init=False, secondary=student_subjects)


class Exam(Base):
    __tablename__ = "Exam"

    idx: Mapped[int] = mapped_column(primary_key=True, init=False, autoincrement=True)
    passed: Mapped[bool] = mapped_column()

    subject: Mapped["Subject"] = relationship(back_populates="exams")
    subject_idx: Mapped[int] = mapped_column(ForeignKey("subject.idx"), init=False)

    student: Mapped["Student"] = relationship(back_populates="exams")
    student_idx: Mapped[int] = mapped_column(ForeignKey("student.idx"), init=False)

    completed_at: Mapped[datetime] = mapped_column(default_factory=datetime.now)

2. The passed Hybrid Property

The passed hybrid property needs two behaviors: one at the instance level and one at the expression level.

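  • Instance level: for a Student object that is already loaded, iterate over its latest_exams attribute and check that every exam was passed.
  • Expression level: this is the important part. The SQL query is translated into a SQLAlchemy expression and registered with the @passed.expression decorator.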
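
To illustrate the split between the two levels, here is a toy descriptor in plain Python. It is only a sketch of the general idea and not SQLAlchemy's actual implementation; the SimpleHybrid class, the stand-in Student class, and the string returned at class level are all hypothetical.

```python
class SimpleHybrid:
    """Toy descriptor: one function for instances, another for the class."""

    def __init__(self, fget):
        self.fget = fget
        self.expr = None

    def expression(self, expr):
        # Register the class-level function and return the same descriptor.
        self.expr = expr
        return self

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class: build the "expression".
            fn = self.expr or self.fget
            return fn(owner)
        # Accessed on an instance: compute the value in Python.
        return self.fget(instance)


class Student:
    def __init__(self, latest_results):
        # Pass/fail flags of the latest exam in each subject.
        self.latest_results = latest_results

    @SimpleHybrid
    def passed(self):
        return all(self.latest_results)

    @passed.expression
    def passed(cls):
        # A real hybrid would return a SQL construct here.
        return f"<SQL expression for {cls.__name__}.passed>"


print(Student([True, True]).passed)   # True
print(Student([True, False]).passed)  # False
print(Student.passed)                 # <SQL expression for Student.passed>
```

The same attribute name therefore yields a plain Python value on an object and a query-building construct on the class, which is what lets Student.passed appear inside a where() clause.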

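At the expression level, a correlated subquery ties Student.idx to the student_idx column inside the subquery, so the failure count is evaluated separately for each student row.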
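
To make the correlation concrete, the sketch below runs a hand-written query of roughly the same shape through the standard sqlite3 module. It approximates the structure only; the SQL that SQLAlchemy actually emits will differ in details such as alias names. The schema mirrors the models above, and the sample rows (including Carol, who has no exams) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (idx INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE exam (
    idx INTEGER PRIMARY KEY,
    student_idx INTEGER NOT NULL REFERENCES student (idx),
    subject_idx INTEGER NOT NULL,
    passed INTEGER NOT NULL,
    completed_at TEXT NOT NULL
);
INSERT INTO student (idx, name) VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO exam (student_idx, subject_idx, passed, completed_at) VALUES
    (1, 1, 1, '2024-01-01'),
    (1, 2, 0, '2023-01-01'),
    (1, 2, 1, '2024-01-01'),
    (2, 1, 0, '2024-01-01'),
    (2, 2, 1, '2023-01-01');
""")

CORRELATED = """
SELECT s.name
FROM student AS s
WHERE (
    SELECT COUNT(*) = 0
    FROM (
        SELECT e.student_idx, e.passed
        FROM exam AS e
        JOIN (
            SELECT student_idx, subject_idx,
                   MAX(completed_at) AS max_completed_at
            FROM exam
            GROUP BY student_idx, subject_idx
        ) AS latest
          ON  e.student_idx = latest.student_idx
          AND e.subject_idx = latest.subject_idx
          AND e.completed_at = latest.max_completed_at
    ) AS latest_exam_passed
    WHERE latest_exam_passed.student_idx = s.idx  -- correlation to the outer row
      AND latest_exam_passed.passed = 0
)
ORDER BY s.name
"""

passed_names = [name for (name,) in conn.execute(CORRELATED)]
print(passed_names)  # ['Alice', 'Carol']
```

Note that a student with no exam rows has zero failed exams and therefore also counts as passing under this formulation, just as all() over an empty list returns True at the instance level.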

3. Test Code


from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite:///:memory:", echo=False)  # set echo=True when debugging
Base.metadata.create_all(engine)
SessionLocal = sessionmaker(bind=engine)

db = SessionLocal()

# Create some data
math = Subject(name="Math")
physics = Subject(name="Physics")

student1 = Student(name="Alice")
student2 = Student(name="Bob")

# Initial exam results
exam1 = Exam(subject=math, student=student1, passed=True, completed_at=datetime(2023, 1, 1))
exam2 = Exam(subject=physics, student=student1, passed=False, completed_at=datetime(2023, 1, 1))

exam3 = Exam(subject=math, student=student2, passed=False, completed_at=datetime(2023, 1, 1))
exam4 = Exam(subject=physics, student=student2, passed=True, completed_at=datetime(2023, 1, 1))

student1.subjects.append(math)
student1.subjects.append(physics)
student2.subjects.append(math)
student2.subjects.append(physics)

# Persist everything
db.add_all([math, physics, student1, student2, exam1, exam2, exam3, exam4])
db.commit()

# At this point Alice and Bob have each failed one subject.
students_failed = db.query(Student).where(Student.passed == False).all()  # noqa: E712
assert len(students_failed) == 2

# A new round of exams: Alice passes both subjects, Bob fails Math again.
exam5 = Exam(subject=math, student=student1, passed=True, completed_at=datetime(2024, 1, 1))
exam6 = Exam(subject=physics, student=student1, passed=True, completed_at=datetime(2024, 1, 1))
exam7 = Exam(subject=math, student=student2, passed=False, completed_at=datetime(2024, 1, 1))

db.add_all([exam5, exam6, exam7])
db.commit()

students_failed = db.query(Student).where(Student.passed == False).all()  # noqa: E712
students_passed = db.query(Student).where(Student.passed == True).all()  # noqa: E712

print([s.name for s in students_failed])  # ['Bob']
print([s.name for s in students_passed])  # ['Alice']
db.close()

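The test first inserts an initial set of exam results, then queries for students who have a failed subject, which verifies the filter defined by the hybrid property.

It then adds a second round in which Alice passes every subject and Bob fails Math again, and checks the filter once more.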

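Summary

This gives a complete implementation of the requirement: a SQLAlchemy hybrid property, combined with subqueries, that tells whether a student passed the most recent exam in every subject. The same attribute works on loaded objects in Python and as a filter evaluated inside the database, so no rows need to be fetched just to apply the check. The core idea is to encapsulate the complex logic inside the model and expose a simple query interface.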