ML Package¶

ml ¶

ML — F1 race prediction service.

Provides model training, feature engineering, and inference for predicting F1 race finishing positions.

Subpackages

- db: Database connection and CRUD operations.
- models: SQLAlchemy ORM models and Pydantic schemas.
- routers: FastAPI endpoint definitions.
- services: Inference service for generating predictions.
- training: Model training pipeline and feature engineering.

config ¶

ML service configuration.

All settings are loaded from environment variables with sensible defaults. Variables are prefixed with F1BOARD_ML_ for service-specific values or F1BOARD_ for project-wide values.
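
As a rough illustration of the prefix convention, the sketch below reads one project-wide and one service-specific value from the environment. The variable names `F1BOARD_DATABASE_URL` and `F1BOARD_ML_MODEL_DIR`, the defaults, and the `Settings` class are assumptions made for this example; the real config module may use different names and a different loading mechanism.

```python
import os
from dataclasses import dataclass


def _env(name: str, default: str) -> str:
    """Read an environment variable, falling back to a default."""
    return os.environ.get(name, default)


@dataclass(frozen=True)
class Settings:
    # Hypothetical setting names, chosen only to show the two prefixes.
    database_url: str  # project-wide, read from F1BOARD_DATABASE_URL
    model_dir: str     # service-specific, read from F1BOARD_ML_MODEL_DIR


def load_settings() -> Settings:
    """Build a Settings object from the current process environment."""
    return Settings(
        database_url=_env("F1BOARD_DATABASE_URL", "sqlite:///f1board.db"),
        model_dir=_env("F1BOARD_ML_MODEL_DIR", "./artifacts"),
    )
```

Calling `load_settings()` with neither variable set would return the defaults; exporting either variable before start-up overrides the matching field.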

db ¶

Database layer for the ML service.

connection ¶

Async database engine and session management.

get_db `async` ¶

get_db()

Yield an async database session for dependency injection.
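
The yield-based dependency pattern can be sketched without any database at all. In the example below, `FakeSession` is a hypothetical stand-in for an async session, and `demo` imitates what a dependency-injection framework might do around a request handler; none of this is taken from the service's actual implementation.

```python
import asyncio
from typing import AsyncIterator


class FakeSession:
    """Hypothetical stand-in for an async database session."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def get_db() -> AsyncIterator[FakeSession]:
    session = FakeSession()
    try:
        # The caller (for example a request handler) uses the session here.
        yield session
    finally:
        # Runs once the caller is finished, even if it raised.
        await session.close()


async def demo() -> FakeSession:
    gen = get_db()
    session = await gen.__anext__()  # obtain the session before the handler runs
    assert not session.closed
    await gen.aclose()               # finalise the generator after the handler
    return session
```

The `try`/`finally` around the `yield` is what guarantees the session is released regardless of how the consumer exits.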

init_db `async` ¶

init_db()

Create all tables defined in the ORM metadata.

close_db `async` ¶

close_db()

Dispose of the database engine and release connections.

crud ¶

CRUD helpers for the ML service database.

Every public function accepts an AsyncSession as its first argument and performs a single, focused database operation.

get_driver `async` ¶

get_driver(db: AsyncSession, driver_id: str) -> models.Driver | None

Fetch a driver by their string identifier.

get_driver_by_id `async` ¶

get_driver_by_id(db: AsyncSession, id: int) -> models.Driver | None

Fetch a driver by primary key.

get_all_drivers `async` ¶

get_all_drivers(db: AsyncSession) -> list[models.Driver]

Return all drivers.

create_driver `async` ¶

create_driver(db: AsyncSession, driver: DriverCreate) -> models.Driver

Insert a new driver row.

get_or_create_driver `async` ¶

get_or_create_driver(db: AsyncSession, driver: DriverCreate) -> models.Driver

Return an existing driver or create a new one.
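
The get-or-create helpers for drivers, constructors, and races presumably share one shape: look the row up, and insert it only when the lookup comes back empty. The sketch below shows that shape against a hypothetical in-memory store (`InMemoryDrivers`) rather than an `AsyncSession`, and passes plain arguments instead of a `DriverCreate` schema.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Driver:
    id: int
    driver_id: str  # string identifier, e.g. "hamilton"
    name: str


class InMemoryDrivers:
    """Hypothetical stand-in for the drivers table."""

    def __init__(self) -> None:
        self._rows = {}

    async def get(self, driver_id: str) -> Optional[Driver]:
        return self._rows.get(driver_id)

    async def create(self, driver_id: str, name: str) -> Driver:
        row = Driver(id=len(self._rows) + 1, driver_id=driver_id, name=name)
        self._rows[driver_id] = row
        return row


async def get_or_create_driver(db: InMemoryDrivers, driver_id: str, name: str) -> Driver:
    existing = await db.get(driver_id)
    if existing is not None:
        return existing  # the lookup succeeded, so nothing is written
    return await db.create(driver_id, name)
```

Note that a second call with the same identifier returns the stored row unchanged; this sketch does not update existing fields, and whether the real helper does is not stated here.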

get_constructor `async` ¶

get_constructor(db: AsyncSession, constructor_id: str) -> models.Constructor | None

Fetch a constructor by its string identifier.

create_constructor `async` ¶

create_constructor(db: AsyncSession, constructor: ConstructorCreate) -> models.Constructor

Insert a new constructor row.

get_or_create_constructor `async` ¶

get_or_create_constructor(db: AsyncSession, constructor: ConstructorCreate) -> models.Constructor

Return an existing constructor or create a new one.

get_race `async` ¶

get_race(db: AsyncSession, season: int, round: int) -> models.Race | None

Fetch a race by season and round number.

get_all_races `async` ¶

get_all_races(db: AsyncSession) -> list[models.Race]

Return all races ordered by season and round.

get_upcoming_race `async` ¶

get_upcoming_race(db: AsyncSession, target_date: date) -> models.Race | None

Return the next race after target_date.
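
One way to read "next race after target_date" is: among races dated strictly later than the target, pick the earliest. The sketch below applies that reading to a plain list; the `Race` dataclass, its `race_date` field, and the strict comparison are assumptions for illustration, and the real function queries the database instead.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Race:
    season: int
    round: int
    race_date: date  # hypothetical field name


def get_upcoming_race(races: List[Race], target_date: date) -> Optional[Race]:
    # Keep only races strictly after the target date.
    later = [r for r in races if r.race_date > target_date]
    # Return the earliest of those, or None when the calendar is exhausted.
    return min(later, key=lambda r: r.race_date, default=None)
```

With this reading, a race held on `target_date` itself is not returned; if the service should include race day, the comparison would need to be `>=`.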

create_race `async` ¶

create_race(db: AsyncSession, race: RaceCreate) -> models.Race

Insert a new race row.

get_or_create_race `async` ¶

get_or_create_race(db: AsyncSession, race: RaceCreate) -> models.Race

Return an existing race or create a new one.

create_race_result `async` ¶

create_race_result(db: AsyncSession, result: RaceResultCreate) -> models.RaceResult

Insert a new race result row.

get_race_results `async` ¶

get_race_results(db: AsyncSession, race_id: int) -> list[models.RaceResult]

Return all results for a race, eager-loading driver and constructor.

get_driver_results `async` ¶

get_driver_results(db: AsyncSession, driver_id: int) -> list[models.RaceResult]

Return all race results for a specific driver.

get_all_results `async` ¶

get_all_results(db: AsyncSession) -> list[models.RaceResult]

Return every race result with related race, driver, and constructor.

create_prediction `async` ¶

create_prediction(db: AsyncSession, prediction: PredictionCreate) -> models.Prediction

Insert a single prediction row.

get_predictions_for_race `async` ¶

get_predictions_for_race(db: AsyncSession, race_id: int) -> list[models.Prediction]

Return all predictions for a race ordered by position.

replace_predictions_for_race `async` ¶

replace_predictions_for_race(db: AsyncSession, race_id: int, predictions: list[PredictionCreate]) -> list[models.Prediction]

Delete existing predictions for a race and insert new ones.
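
Delete-then-insert is safest when both steps share one transaction, so a failed insert does not leave the race without predictions. The sketch below demonstrates the idea with the standard-library `sqlite3` module; the table layout and the tuple-based input are assumptions, and the real helper works through an `AsyncSession` with `PredictionCreate` objects.

```python
import sqlite3


def replace_predictions_for_race(conn, race_id, predictions):
    """Replace a race's predictions; `predictions` is a list of (driver_id, position)."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("DELETE FROM predictions WHERE race_id = ?", (race_id,))
        conn.executemany(
            "INSERT INTO predictions (race_id, driver_id, position) VALUES (?, ?, ?)",
            [(race_id, driver_id, position) for driver_id, position in predictions],
        )
    return conn.execute(
        "SELECT driver_id, position FROM predictions "
        "WHERE race_id = ? ORDER BY position",
        (race_id,),
    ).fetchall()


# Hypothetical schema, reduced to the columns the sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE predictions (race_id INTEGER, driver_id TEXT, position INTEGER)"
)
```

Calling the function twice for the same `race_id` leaves only the rows from the second call.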

get_stats `async` ¶

get_stats(db: AsyncSession) -> dict

Return aggregate statistics (driver/race/result counts, seasons).

main ¶

FastAPI application entry point for the ML service.

Configures the CORS middleware, registers routers, and manages the application lifespan (database initialisation, model loading).

lifespan `async` ¶

lifespan(app: FastAPI)

Manage application startup and shutdown.

On startup the database tables are created and the trained ML model is loaded into memory. On shutdown the database engine is disposed.
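
FastAPI accepts an async context manager through its `lifespan` parameter, with start-up work placed before the `yield` and shutdown work after it. The sketch below reproduces the described ordering using stub functions that only record their calls; the stubs and the `events` list are illustrative assumptions, and `app` is passed as `None` so the example runs without FastAPI installed.

```python
import asyncio
from contextlib import asynccontextmanager

events = []  # records the order of lifecycle steps for illustration


async def init_db() -> None:
    events.append("init_db")


def load_model() -> bool:
    events.append("load_model")
    return True


async def close_db() -> None:
    events.append("close_db")


@asynccontextmanager
async def lifespan(app):
    await init_db()
    load_model()
    try:
        yield  # the application serves requests while suspended here
    finally:
        await close_db()


async def run_app() -> None:
    # Imitates a server entering and leaving the lifespan context.
    async with lifespan(app=None):
        events.append("serving")
```

Wrapping the `yield` in `try`/`finally` is a choice made for this sketch so that the engine is disposed even when the serving phase ends with an error.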

root `async` ¶

root()

Return service metadata.

models ¶

ORM models and Pydantic schemas for the ML service.

database ¶

SQLAlchemy models used by the ML service.

Prediction ¶

Bases: Base

AI-generated podium prediction for a race.

schemas ¶

Pydantic schemas for request/response validation in the ML service.

DriverBase ¶

Bases: BaseModel

Shared driver fields.

ConstructorBase ¶

Bases: BaseModel

Shared constructor fields.

RaceBase ¶

Bases: BaseModel

Shared race fields.

RaceResultBase ¶

Bases: BaseModel

Shared race-result fields.

PredictionBase ¶

Bases: BaseModel

Shared prediction fields.

PredictionRequest ¶

Bases: BaseModel

Client request to generate predictions for a race.

DriverPrediction ¶

Bases: BaseModel

A single driver's predicted finish.

PredictionResponse ¶

Bases: BaseModel

Full prediction response containing multiple driver predictions.

AIPodiumEntry ¶

Bases: BaseModel

Single podium entry in an AI prediction.

AIPodiumResponse ¶

Bases: BaseModel

AI-predicted podium for a race.

HealthResponse ¶

Bases: BaseModel

Health-check response schema.

StatsResponse ¶

Bases: BaseModel

Aggregate database statistics.

SyncRequest ¶

Bases: BaseModel

Data sync request specifying season range.

SyncResponse ¶

Bases: BaseModel

Data sync result summary.

routers ¶

FastAPI routers for the ML service.

health ¶

Health and readiness endpoints for the ML service.

health_check `async` ¶

health_check(db: AsyncSession = Depends(get_db))

Return service health including DB connectivity and model status.

readiness `async` ¶

readiness()

Lightweight readiness probe.

predictions ¶

Prediction endpoints for the ML service.

Provides routes to generate, retrieve, and cache AI race-finish predictions.

predict_race `async` ¶

predict_race(request: PredictionRequest, db: AsyncSession = Depends(get_db))

Generate finish-position predictions for the requested drivers.

get_or_create_race_podium `async` ¶

get_or_create_race_podium(season: int, round: int, db: AsyncSession = Depends(get_db))

Return the AI podium for a race, generating it if not yet stored.

get_or_create_current_race_podium `async` ¶

get_or_create_current_race_podium(db: AsyncSession = Depends(get_db))

Return the AI podium for the next upcoming race.

get_or_create_current_race_podium_alias `async` ¶

get_or_create_current_race_podium_alias(db: AsyncSession = Depends(get_db))

Backward-compatible alias for current race podium predictions.

get_predictions `async` ¶

get_predictions(season: int, round: int, db: AsyncSession = Depends(get_db))

Fetch stored predictions for a specific race.

services ¶

Business-logic services for the ML service.

inference ¶

ML model loading and race prediction service.

InferenceService ¶

Loads a trained model and generates race-finish predictions.

load_model ¶

load_model() -> bool

Load the trained model and feature engineer from disk.

Returns:

Type	Description
`bool`	`True` if the model was loaded successfully, `False` otherwise.
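
Returning a boolean rather than raising lets the service start without a trained model and report the situation through its health endpoint. A minimal sketch of that contract follows, assuming the artefact is a single pickle file at a caller-supplied path; the real service may use a different serialisation format and also loads a feature engineer.

```python
import pickle
from pathlib import Path


class InferenceService:
    def __init__(self, model_path: Path) -> None:
        self.model_path = model_path  # hypothetical artefact location
        self.model = None

    def load_model(self) -> bool:
        """Try to load the model; report failure instead of raising."""
        try:
            with self.model_path.open("rb") as fh:
                self.model = pickle.load(fh)
        except (OSError, pickle.UnpicklingError, EOFError):
            # Missing, unreadable, or corrupt file: leave the service model-less.
            self.model = None
            return False
        return True
```

Only unpickle files from a trusted source, since `pickle.load` can execute arbitrary code embedded in the file.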

predict_race `async` ¶

predict_race(db: AsyncSession, season: int, round: int, driver_ids: list[str]) -> list[dict]

Predict race results for the specified list of driver IDs.

predict_and_store_race `async` ¶

predict_and_store_race(db: AsyncSession, season: int, round: int) -> list[dict]

Predict results for all known drivers and persist them.

training ¶

Model training pipeline, feature engineering, and predictor.

features ¶

Feature engineering for the F1 race prediction model.

Computes per-driver, per-constructor, and per-circuit aggregate statistics from historical race results.

FeatureEngineer ¶

Compute and store aggregate statistics for ML features.

Call `fit` on a training DataFrame, then `transform` to produce the feature matrix. At inference time, use `get_prediction_features`.
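
The fit-then-transform workflow can be illustrated with a much smaller feature set than the real one. The sketch below stores a per-driver average finishing position during `fit` and emits it alongside the grid slot in `transform`. It uses lists of dicts instead of a DataFrame, and the column names (`driver`, `position`, `grid`) are assumptions for this example only.

```python
from collections import defaultdict


class FeatureEngineer:
    def __init__(self) -> None:
        self.driver_avg = {}   # driver -> mean historical finishing position
        self.global_avg = 0.0  # fallback for drivers not seen during fit

    def fit(self, rows):
        """Compute aggregates from historical rows (assumed non-empty)."""
        positions = defaultdict(list)
        for row in rows:
            positions[row["driver"]].append(row["position"])
        self.driver_avg = {d: sum(p) / len(p) for d, p in positions.items()}
        all_positions = [row["position"] for row in rows]
        self.global_avg = sum(all_positions) / len(all_positions)
        return self

    def transform(self, rows):
        """Turn rows into feature vectors [driver average, grid slot]."""
        return [
            [self.driver_avg.get(row["driver"], self.global_avg), float(row["grid"])]
            for row in rows
        ]
```

Keeping the aggregates on the object is what allows the same statistics to be reused at inference time, when only the fitted state and the new race entry are available.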

fit ¶

fit(df: DataFrame)

Compute aggregate statistics from historical results.

transform ¶

transform(df: DataFrame) -> np.ndarray

Transform a results DataFrame into a feature matrix.

get_feature_names ¶

get_feature_names() -> list[str]

Return human-readable feature column names.

get_prediction_features `async` ¶

get_prediction_features(db: AsyncSession, driver_id: int, race_id: int) -> list | None

Build a feature vector for a single driver/race pair at inference time.

model ¶

Gradient-boosting classifier wrapper for F1 finish-position prediction.

RacePredictor ¶

Wraps a GradientBoostingClassifier to predict race finishing positions.

Positions are capped to 1-20 and encoded via LabelEncoder.
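
To make the two preprocessing steps concrete, the sketch below clamps raw positions into the 1-20 range and then maps the distinct values to consecutive integer codes. `SimpleLabelEncoder` is a pure-Python imitation written for this example so that it runs without scikit-learn; it mirrors the sorted-classes behaviour of `LabelEncoder` but is not the class the wrapper actually uses.

```python
def cap_position(position: int, lo: int = 1, hi: int = 20) -> int:
    """Clamp a finishing position into the inclusive range [lo, hi]."""
    return max(lo, min(hi, position))


class SimpleLabelEncoder:
    """Illustrative encoder: sorted distinct labels become codes 0..n-1."""

    def fit(self, labels):
        self.classes_ = sorted(set(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        return [self._index[label] for label in labels]

    def inverse_transform(self, codes):
        return [self.classes_[code] for code in codes]


raw = [1, 3, 25, 22, 3]
capped = [cap_position(p) for p in raw]
encoder = SimpleLabelEncoder().fit(capped)
codes = encoder.transform(capped)
```

Encoding matters because only positions that occur in the training data become classes, so the classifier's outputs have to be decoded back to positions with the inverse mapping.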

fit ¶

fit(X: ndarray, y: ndarray)

Train the classifier on features X and labels y.

predict ¶

predict(X: ndarray) -> np.ndarray

Predict finishing positions for feature matrix X.

predict_proba ¶

predict_proba(X: ndarray) -> np.ndarray

Return class probability estimates for X.

score ¶

score(X: ndarray, y: ndarray) -> float

Return the accuracy of the predictions for X measured against the true labels y.
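
Accuracy here means the fraction of samples whose predicted position exactly matches the true one. The helper below is a hypothetical stand-alone version of that calculation, not the wrapper's own code.

```python
def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

Because the comparison is exact, a prediction that is off by a single place counts the same as one that is off by ten, which is worth keeping in mind when interpreting the score for a 20-class problem.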

train ¶

Model training pipeline.

Loads historical race results from the database, engineers features, trains a RacePredictor model, and serialises the artefacts to disk.

load_training_data `async` ¶

load_training_data(start_year: int | None = None, end_year: int | None = None) -> pd.DataFrame

Query race results from the database and return them as a DataFrame.

train_model ¶

train_model(df: DataFrame) -> tuple[RacePredictor, FeatureEngineer]

Train a RacePredictor on the provided results DataFrame.

save_model ¶

save_model(model: RacePredictor, feature_engineer: FeatureEngineer)

Serialise the trained model and feature engineer to disk.

main `async` ¶

main(start_year: int | None = None, end_year: int | None = None)

End-to-end training entrypoint: load data, train, and save.
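
The load, train, and save steps can be sketched end to end with stand-ins for each stage. Everything below is hypothetical: the in-memory rows replace the database query, the "model" is just a per-driver mean position, `save_model` writes to a dict instead of disk, and the inclusive treatment of `start_year` and `end_year` is an assumption.

```python
import asyncio


async def load_training_data(start_year=None, end_year=None):
    # Hypothetical in-memory data in place of a database query.
    rows = [
        {"season": 2021, "driver": "a", "position": 1},
        {"season": 2022, "driver": "a", "position": 2},
        {"season": 2023, "driver": "b", "position": 3},
    ]
    return [
        r for r in rows
        if (start_year is None or r["season"] >= start_year)
        and (end_year is None or r["season"] <= end_year)
    ]


def train_model(rows):
    # Placeholder "model": mean finishing position per driver.
    grouped = {}
    for r in rows:
        grouped.setdefault(r["driver"], []).append(r["position"])
    return {driver: sum(p) / len(p) for driver, p in grouped.items()}


def save_model(model, store):
    store["model"] = model  # stands in for serialising to disk


async def main(start_year=None, end_year=None, store=None):
    store = {} if store is None else store
    rows = await load_training_data(start_year, end_year)
    if not rows:
        raise ValueError("no training data in the requested range")
    save_model(train_model(rows), store)
    return store
```

Only the data-loading step is awaited, which matches the signatures above: `load_training_data` and `main` are async, while `train_model` and `save_model` are ordinary functions.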

