ML Package
ml
ML: F1 race prediction service.
Provides model training, feature engineering, and inference for predicting F1 race finishing positions.
Subpackages
- db: Database connection and CRUD operations.
- models: SQLAlchemy ORM models and Pydantic schemas.
- routers: FastAPI endpoint definitions.
- services: Inference service for generating predictions.
- training: Model training pipeline and feature engineering.
config
ML service configuration.
All settings are loaded from environment variables with sensible defaults.
Variables are prefixed with F1BOARD_ML_ for service-specific values or
F1BOARD_ for project-wide values.
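To illustrate the prefix convention, a minimal sketch of reading settings with defaults might look like the following. The setting names (`F1BOARD_ML_MODEL_DIR`, `F1BOARD_DATABASE_URL`), the `Settings` class, and the default values are hypothetical; the actual settings are those defined in the config module.

```python
import os
from dataclasses import dataclass
from typing import Mapping

ML_PREFIX = "F1BOARD_ML_"    # service-specific values
PROJECT_PREFIX = "F1BOARD_"  # project-wide values


@dataclass(frozen=True)
class Settings:
    # Field names and defaults are hypothetical, for illustration only.
    model_dir: str
    database_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each setting from the environment, falling back to a default."""
    return Settings(
        model_dir=env.get(ML_PREFIX + "MODEL_DIR", "./artifacts"),
        database_url=env.get(PROJECT_PREFIX + "DATABASE_URL", "sqlite:///f1board.db"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.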
db
Database layer for the ML service.
connection
crud
CRUD helpers for the ML service database.
Every public function accepts an AsyncSession as its first argument and
performs a single, focused database operation.
get_driver (async)
get_driver(db: AsyncSession, driver_id: str) -> models.Driver | None
Fetch a driver by their string identifier.
get_driver_by_id (async)
get_driver_by_id(db: AsyncSession, id: int) -> models.Driver | None
Fetch a driver by primary key.
get_all_drivers (async)
get_all_drivers(db: AsyncSession) -> list[models.Driver]
Return all drivers.
create_driver (async)
create_driver(db: AsyncSession, driver: DriverCreate) -> models.Driver
Insert a new driver row.
get_or_create_driver (async)
get_or_create_driver(db: AsyncSession, driver: DriverCreate) -> models.Driver
Return an existing driver or create a new one.
get_constructor (async)
get_constructor(db: AsyncSession, constructor_id: str) -> models.Constructor | None
Fetch a constructor by its string identifier.
create_constructor (async)
create_constructor(db: AsyncSession, constructor: ConstructorCreate) -> models.Constructor
Insert a new constructor row.
get_or_create_constructor (async)
get_or_create_constructor(db: AsyncSession, constructor: ConstructorCreate) -> models.Constructor
Return an existing constructor or create a new one.
get_race (async)
get_race(db: AsyncSession, season: int, round: int) -> models.Race | None
Fetch a race by season and round number.
get_all_races (async)
get_all_races(db: AsyncSession) -> list[models.Race]
Return all races ordered by season and round.
get_upcoming_race (async)
get_upcoming_race(db: AsyncSession, target_date: date) -> models.Race | None
Return the next race after target_date.
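The selection rule can be sketched over in-memory records. This is not the helper's implementation (which queries through the AsyncSession); the `Race` stand-in, its `race_date` field, and the strict "after" comparison are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Race:
    # Simplified stand-in for models.Race; the field names are assumptions.
    season: int
    round: int
    race_date: date


def next_race_after(races: list[Race], target_date: date) -> Optional[Race]:
    """Return the earliest race dated strictly after target_date, if any."""
    upcoming = [r for r in races if r.race_date > target_date]
    return min(upcoming, key=lambda r: r.race_date, default=None)
```

A race held on `target_date` itself is excluded under this reading; an inclusive comparison would use `>=` instead.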
create_race (async)
create_race(db: AsyncSession, race: RaceCreate) -> models.Race
Insert a new race row.
get_or_create_race (async)
get_or_create_race(db: AsyncSession, race: RaceCreate) -> models.Race
Return an existing race or create a new one.
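The get_or_create_* helpers for drivers, constructors, and races share one lookup-then-insert shape. The sketch below shows that shape for drivers with an in-memory `FakeSession` standing in for AsyncSession so that it runs without a database; `FakeSession`, the simplified `Driver` class, and the string-only arguments are hypothetical.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Driver:
    # Stand-in for models.Driver; only the string identifier is modelled.
    driver_id: str


class FakeSession:
    """In-memory substitute for AsyncSession, used only to keep the sketch runnable."""

    def __init__(self) -> None:
        self.drivers: dict[str, Driver] = {}


async def get_driver(db: FakeSession, driver_id: str) -> Optional[Driver]:
    return db.drivers.get(driver_id)


async def create_driver(db: FakeSession, driver_id: str) -> Driver:
    driver = Driver(driver_id)
    db.drivers[driver_id] = driver
    return driver


async def get_or_create_driver(db: FakeSession, driver_id: str) -> Driver:
    # Look up first; insert only when no row exists yet.
    existing = await get_driver(db, driver_id)
    if existing is not None:
        return existing
    return await create_driver(db, driver_id)
```

Calling `asyncio.run(get_or_create_driver(db, "hamilton"))` twice against the same `FakeSession` returns the same object and leaves a single stored driver.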
create_race_result (async)
create_race_result(db: AsyncSession, result: RaceResultCreate) -> models.RaceResult
Insert a new race result row.
get_race_results (async)
get_race_results(db: AsyncSession, race_id: int) -> list[models.RaceResult]
Return all results for a race, eager-loading driver and constructor.
get_driver_results (async)
get_driver_results(db: AsyncSession, driver_id: int) -> list[models.RaceResult]
Return all race results for a specific driver.
get_all_results (async)
get_all_results(db: AsyncSession) -> list[models.RaceResult]
Return every race result with related race, driver, and constructor.
create_prediction (async)
create_prediction(db: AsyncSession, prediction: PredictionCreate) -> models.Prediction
Insert a single prediction row.
get_predictions_for_race (async)
get_predictions_for_race(db: AsyncSession, race_id: int) -> list[models.Prediction]
Return all predictions for a race ordered by position.
replace_predictions_for_race (async)
replace_predictions_for_race(db: AsyncSession, race_id: int, predictions: list[PredictionCreate]) -> list[models.Prediction]
Delete existing predictions for a race and insert new ones.
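The delete-then-insert step is easiest to reason about when both statements share one transaction. The sketch below shows that idea with the standard-library `sqlite3` module rather than the service's AsyncSession; the table layout and the helper names are assumptions, and the source does not state whether the real helper wraps the two steps in a single transaction.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical, simplified predictions table for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE predictions (race_id INTEGER, driver_id INTEGER, position INTEGER)"
    )
    return conn


def replace_predictions(conn: sqlite3.Connection, race_id: int,
                        rows: list[tuple[int, int]]) -> None:
    """Swap a race's predictions for `rows` of (driver_id, position)."""
    # `with conn` commits on success and rolls back on error, so the
    # delete and the inserts take effect together or not at all.
    with conn:
        conn.execute("DELETE FROM predictions WHERE race_id = ?", (race_id,))
        conn.executemany(
            "INSERT INTO predictions (race_id, driver_id, position) VALUES (?, ?, ?)",
            [(race_id, driver_id, position) for driver_id, position in rows],
        )
```

Calling `replace_predictions` twice for the same race leaves only the rows from the second call.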
get_stats (async)
get_stats(db: AsyncSession) -> dict
Return aggregate statistics (driver/race/result counts, seasons).
main
FastAPI application entry point for the ML service.
Configures the CORS middleware, registers routers, and manages the application lifespan (database initialisation, model loading).
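The lifespan part can be sketched as an async context manager: code before `yield` runs at startup and code after it runs at shutdown. FastAPI accepts such a function through the `lifespan` argument of `FastAPI(...)`; the sketch does not import FastAPI so that it runs standalone, and the `state` dictionary and its keys are hypothetical placeholders for the database and model setup.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical shared state; a real service would hold a DB engine and a model.
state: dict[str, object] = {}


@asynccontextmanager
async def lifespan(app: object):
    # Startup: runs once before the application starts serving requests.
    state["db_ready"] = True
    state["model"] = "loaded-model-placeholder"
    try:
        yield
    finally:
        # Shutdown: release whatever startup acquired.
        state.clear()
```

Entering `async with lifespan(app):` populates `state`, and leaving the block clears it again.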
models
ORM models and Pydantic schemas for the ML service.
database
SQLAlchemy models used by the ML service.
Prediction
Bases: Base
AI-generated podium prediction for a race.
schemas
Pydantic schemas for request/response validation in the ML service.
DriverBase
Bases: BaseModel
Shared driver fields.
ConstructorBase
Bases: BaseModel
Shared constructor fields.
RaceBase
Bases: BaseModel
Shared race fields.
RaceResultBase
Bases: BaseModel
Shared race-result fields.
PredictionBase
Bases: BaseModel
Shared prediction fields.
PredictionRequest
Bases: BaseModel
Client request to generate predictions for a race.
DriverPrediction
Bases: BaseModel
A single driver's predicted finish.
PredictionResponse
Bases: BaseModel
Full prediction response containing multiple driver predictions.
AIPodiumEntry
Bases: BaseModel
Single podium entry in an AI prediction.
AIPodiumResponse
Bases: BaseModel
AI-predicted podium for a race.
HealthResponse
Bases: BaseModel
Health-check response schema.
StatsResponse
Bases: BaseModel
Aggregate database statistics.
SyncRequest
Bases: BaseModel
Data sync request specifying season range.
SyncResponse
Bases: BaseModel
Data sync result summary.
routers
FastAPI routers for the ML service.
health
predictions
Prediction endpoints for the ML service.
Provides routes to generate, retrieve, and cache AI race-finish predictions.
predict_race (async)
predict_race(request: PredictionRequest, db: AsyncSession = Depends(get_db))
Generate finish-position predictions for the requested drivers.
get_or_create_race_podium (async)
get_or_create_race_podium(season: int, round: int, db: AsyncSession = Depends(get_db))
Return the AI podium for a race, generating it if not yet stored.
get_or_create_current_race_podium (async)
get_or_create_current_race_podium(db: AsyncSession = Depends(get_db))
Return the AI podium for the next upcoming race.
get_or_create_current_race_podium_alias (async)
get_or_create_current_race_podium_alias(db: AsyncSession = Depends(get_db))
Backward-compatible alias for current race podium predictions.
get_predictions (async)
get_predictions(season: int, round: int, db: AsyncSession = Depends(get_db))
Fetch stored predictions for a specific race.
services
Business-logic services for the ML service.
inference
ML model loading and race prediction service.
InferenceService
Loads a trained model and generates race-finish predictions.
load_model
load_model() -> bool
Load the trained model and feature engineer from disk.
Returns:
| Type | Description |
|---|---|
| bool | |
predict_race (async)
predict_race(db: AsyncSession, season: int, round: int, driver_ids: list[str]) -> list[dict]
Predict race results for the specified list of driver IDs.
predict_and_store_race (async)
predict_and_store_race(db: AsyncSession, season: int, round: int) -> list[dict]
Predict results for all known drivers and persist them.
training
Model training pipeline, feature engineering, and predictor.
features
Feature engineering for the F1 race prediction model.
Computes per-driver, per-constructor, and per-circuit aggregate statistics from historical race results.
FeatureEngineer
Compute and store aggregate statistics for ML features.
Call fit() on a training DataFrame, then transform() to produce the feature matrix. At inference time, use get_prediction_features().
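The fit/transform split can be illustrated with a toy, pure-Python analogue that stores one aggregate (mean finishing position per driver). It is not the real FeatureEngineer: the class name, the dict-based rows, the single feature, the fallback to the overall mean, and the `get_prediction_features` signature are all assumptions.

```python
from collections import defaultdict
from statistics import mean


class MeanFinishFeatures:
    """Toy analogue of a fit/transform feature engineer (not the real class)."""

    def fit(self, rows: list[dict]) -> "MeanFinishFeatures":
        # Aggregate historical finishing positions per driver.
        by_driver: dict[str, list[int]] = defaultdict(list)
        for row in rows:
            by_driver[row["driver_id"]].append(row["position"])
        self.driver_mean_ = {d: mean(p) for d, p in by_driver.items()}
        # Fallback for drivers with no history: the overall mean position.
        self.global_mean_ = mean(row["position"] for row in rows)
        return self

    def transform(self, rows: list[dict]) -> list[list[float]]:
        # One feature per row: the driver's stored mean finishing position.
        return [[self.driver_mean_.get(r["driver_id"], self.global_mean_)] for r in rows]

    def get_prediction_features(self, driver_id: str) -> list[float]:
        # Inference-time lookup for a driver whose result is not yet known.
        return [self.driver_mean_.get(driver_id, self.global_mean_)]
```

Because the statistics are stored by `fit`, the same object can serve inference later without access to the training data.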
model
Gradient-boosting classifier wrapper for F1 finish-position prediction.
RacePredictor
Wraps a GradientBoostingClassifier to predict race finishing positions.
Positions are capped to 1-20 and encoded via LabelEncoder.
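The two preprocessing steps can be sketched without scikit-learn. `cap_position` clamps a position into 1-20, and `SimpleLabelEncoder` is a hypothetical stand-in that mirrors how scikit-learn's LabelEncoder maps the sorted unique labels to 0..n-1. How the real RacePredictor treats positions below 1 or missing values is not stated here, so the lower clamp is an assumption.

```python
def cap_position(position: int, low: int = 1, high: int = 20) -> int:
    """Clamp a finishing position into the range [low, high]."""
    return max(low, min(high, position))


class SimpleLabelEncoder:
    """Pure-Python stand-in that maps sorted unique labels to 0..n-1."""

    def fit(self, labels: list[int]) -> "SimpleLabelEncoder":
        self.classes_ = sorted(set(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels: list[int]) -> list[int]:
        # Labels not seen during fit raise KeyError.
        return [self._index[label] for label in labels]

    def inverse_transform(self, codes: list[int]) -> list[int]:
        return [self.classes_[code] for code in codes]
```

Predicted class codes are turned back into finishing positions with `inverse_transform`.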
train
Model training pipeline.
Loads historical race results from the database, engineers features, trains a
RacePredictor model, and serialises the artefacts to disk.
load_training_data (async)
load_training_data(start_year: int | None = None, end_year: int | None = None) -> pd.DataFrame
Query race results from the database and return them as a DataFrame.
train_model
train_model(df: DataFrame) -> tuple[RacePredictor, FeatureEngineer]
Train a RacePredictor on the provided results DataFrame.
save_model
save_model(model: RacePredictor, feature_engineer: FeatureEngineer)
Serialise the trained model and feature engineer to disk.
main (async)
main(start_year: int | None = None, end_year: int | None = None)
End-to-end training entrypoint: load data, train, and save.
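The final save step, and the matching load at inference time, can be sketched as a round trip through the standard-library `pickle` module. The serialiser, the file names, and the function names below are assumptions; the documentation only states that the model and the feature engineer are both written to disk.

```python
import pickle
from pathlib import Path


def save_artifacts(model: object, feature_engineer: object, out_dir: Path) -> None:
    """Write both artefacts into out_dir (file names are hypothetical)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(out_dir / "model.pkl", "wb") as fh:
        pickle.dump(model, fh)
    with open(out_dir / "feature_engineer.pkl", "wb") as fh:
        pickle.dump(feature_engineer, fh)


def load_artifacts(out_dir: Path) -> tuple[object, object]:
    """Read back the pair written by save_artifacts."""
    with open(out_dir / "model.pkl", "rb") as fh:
        model = pickle.load(fh)
    with open(out_dir / "feature_engineer.pkl", "rb") as fh:
        feature_engineer = pickle.load(fh)
    return model, feature_engineer
```

Saving the feature engineer alongside the model keeps the stored aggregate statistics in step with the model that was trained on them.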