bakaano.runner
High-level orchestration for Bakaano-Hydro workflows.
Role: Provide a user-facing API to train, evaluate, and simulate streamflow.
- class bakaano.runner.BakaanoHydro(working_dir, study_area, climate_data_source)
Bases: object
Main user-facing interface for Bakaano-Hydro workflows.
- create_land_cover_scenario(scenario_name, geometry=None, percent_change=0, change_type='deforestation', map_obj=None, open_map_if_missing=True)
Create scenario rasters from geometry or the latest drawn polygon on map.
If geometry is None, this tries to read the last drawn geometry from map_obj (if provided) or the last map from explore_scenario_draw_map. If nothing is drawn and open_map_if_missing=True, it returns a draw map.
- evaluate_streamflow_model_interactively(model_path, val_start, val_end, grdc_netcdf, routing_method='mfd', catchment_size_threshold=1000, area_normalize=True, csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', runoff_output_dir=None)
Interactively evaluate a trained streamflow model.
- Parameters:
model_path (str) – Path to a trained Keras model.
val_start (str) – Validation start date (YYYY-MM-DD).
val_end (str) – Validation end date (YYYY-MM-DD).
grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).
routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).
catchment_size_threshold (float) – Minimum catchment size for stations.
area_normalize (bool) – Whether to area-normalize predictors/response. If False, predictions are interpreted in raw m³/s.
csv_dir (str, optional) – Directory of per-station CSVs.
lookup_csv (str, optional) – CSV lookup file with station coords.
id_col (str) – Station id column in lookup CSV.
lat_col (str) – Latitude column in lookup CSV.
lon_col (str) – Longitude column in lookup CSV.
date_col (str) – Date column in station CSVs.
discharge_col (str) – Discharge column in station CSVs.
file_pattern (str) – Filename pattern for station CSVs.
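The CSV-related parameters above suggest a lookup file with one row per station plus one discharge file per station. The following is a minimal sketch of that layout, assuming that file_pattern is filled in with str.format(id=...) and that the default column names are used. The helper station_csv_path, the directory names, and the sample values are hypothetical and not part of Bakaano-Hydro.

```python
import csv
import tempfile
from pathlib import Path


def station_csv_path(csv_dir, station_id, file_pattern="{id}.csv"):
    """Resolve a station's CSV path (assumes str.format-style substitution)."""
    return Path(csv_dir) / file_pattern.format(id=station_id)


# Build a throwaway layout: one lookup CSV plus a directory of station CSVs.
root = Path(tempfile.mkdtemp())
csv_dir = root / "stations"
csv_dir.mkdir()
lookup_csv = root / "lookup.csv"

with open(lookup_csv, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "latitude", "longitude"])  # id_col, lat_col, lon_col
    writer.writerow(["station_a", 9.45, -1.02])       # made-up coordinates

with open(station_csv_path(csv_dir, "station_a"), "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["date", "discharge"])            # date_col, discharge_col
    writer.writerow(["2001-01-01", 12.5])             # made-up discharge value

# Read the lookup back and list stations that have no matching CSV file.
with open(lookup_csv, newline="") as f:
    stations = list(csv.DictReader(f))
missing = [s["id"] for s in stations
           if not station_csv_path(csv_dir, s["id"]).exists()]
```

Checking for missing station files before a long evaluation run is cheap, and it surfaces naming mismatches between the lookup file and file_pattern early.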
- explore_data_interactively(start_date, end_date, grdc_netcdf=None)
Launch an interactive map to explore inputs and stations.
- Parameters:
start_date (str) – Start date (YYYY-MM-DD) for GRDC filtering.
end_date (str) – End date (YYYY-MM-DD) for GRDC filtering.
grdc_netcdf (str, optional) – GRDC NetCDF path for stations overlay.
- Returns:
Interactive map object.
- Return type:
leafmap.foliumap.Map
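Date arguments throughout this class are documented as YYYY-MM-DD strings. A small pre-check like the one below can catch malformed or reversed dates before they reach the runner. This is only a sketch: parse_period is a hypothetical helper, and whether Bakaano-Hydro performs similar validation internally is not stated here.

```python
from datetime import datetime


def parse_period(start_date, end_date):
    """Parse two YYYY-MM-DD strings and check that they are in order."""
    start = datetime.strptime(start_date, "%Y-%m-%d").date()
    end = datetime.strptime(end_date, "%Y-%m-%d").date()
    if end < start:
        raise ValueError(f"end_date {end_date} precedes start_date {start_date}")
    return start, end


start, end = parse_period("1990-01-01", "1999-12-31")
n_days = (end - start).days + 1  # inclusive length of the window in days
```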
- recompute_scenario_runoff(scenario_name, sim_start, sim_end, routing_method='mfd', climate_data_source=None, force=False, resume=False)
Recompute routed runoff for a scenario.
- simulate_grdc_csv_stations(model_path, sim_start, sim_end, grdc_netcdf, routing_method='mfd', csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', area_normalize=True, runoff_output_dir=None)
Simulate streamflow for GRDC or CSV stations in batch.
- Parameters:
model_path (str) – Path to a trained Keras model.
sim_start (str) – Simulation start date (YYYY-MM-DD).
sim_end (str) – Simulation end date (YYYY-MM-DD).
grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).
routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).
csv_dir (str, optional) – Directory of per-station CSVs.
lookup_csv (str, optional) – CSV lookup file with station coords.
id_col (str) – Station id column in lookup CSV.
lat_col (str) – Latitude column in lookup CSV.
lon_col (str) – Longitude column in lookup CSV.
date_col (str) – Date column in station CSVs.
discharge_col (str) – Discharge column in station CSVs.
file_pattern (str) – Filename pattern for station CSVs.
area_normalize (bool) – Whether to area-normalize predictors/response. If False, outputs are raw m³/s.
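The area_normalize flag distinguishes raw discharge in m³/s from an area-normalized quantity, but the normalized unit is not stated here. One common hydrological convention, assumed purely for illustration, expresses discharge as a runoff depth over the catchment in mm/day. The two helper functions below are hypothetical and may not match the transformation Bakaano-Hydro applies.

```python
SECONDS_PER_DAY = 86_400


def m3s_to_mm_per_day(discharge_m3s, area_km2):
    """Convert discharge (m³/s) to runoff depth (mm/day) over a catchment."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    volume_m3_per_day = discharge_m3s * SECONDS_PER_DAY
    area_m2 = area_km2 * 1e6
    return volume_m3_per_day / area_m2 * 1000.0  # metres to millimetres


def mm_per_day_to_m3s(depth_mm_per_day, area_km2):
    """Inverse conversion: runoff depth (mm/day) back to discharge (m³/s)."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    volume_m3_per_day = depth_mm_per_day / 1000.0 * area_km2 * 1e6
    return volume_m3_per_day / SECONDS_PER_DAY
```

Under this convention, 100 m³/s from an 8,640 km² catchment corresponds to a depth of 1 mm/day, which makes stations of very different size comparable on one scale.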
- simulate_scenario_grdc_csv_stations(scenario_name, model_path, sim_start, sim_end, grdc_netcdf, routing_method='mfd', csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', area_normalize=True, recompute_runoff=False)
Run station-based streamflow simulation using scenario runoff outputs.
- simulate_scenario_streamflow(scenario_name, model_path, sim_start, sim_end, latlist, lonlist, routing_method='mfd', area_normalize=True, recompute_runoff=False)
Run point-based streamflow simulation using scenario runoff outputs.
- simulate_streamflow(model_path, sim_start, sim_end, latlist, lonlist, routing_method='mfd', area_normalize=True, runoff_output_dir=None)
Simulate streamflow for given coordinates using a trained model.
- Parameters:
model_path (str) – Path to a trained Keras model.
sim_start (str) – Simulation start date (YYYY-MM-DD).
sim_end (str) – Simulation end date (YYYY-MM-DD).
latlist (list[float]) – List of latitudes.
lonlist (list[float]) – List of longitudes.
routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).
area_normalize (bool) – Whether to area-normalize predictors/response. If False, outputs are raw m³/s.
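Because latlist and lonlist are separate arguments, it is easy to pass lists that are out of step. The wrapper below is a sketch that keeps coordinates paired until the call. The helper simulate_points is hypothetical, and runner is assumed to be any object exposing simulate_streamflow with the keyword names listed above, such as a BakaanoHydro instance.

```python
def simulate_points(runner, model_path, sim_start, sim_end, points, **kwargs):
    """Call runner.simulate_streamflow for a sequence of (lat, lon) pairs.

    Extra keyword arguments (for example routing_method or area_normalize)
    are passed through unchanged.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one (lat, lon) pair is required")

    latlist = [float(lat) for lat, _ in points]
    lonlist = [float(lon) for _, lon in points]

    # Basic range check to catch swapped latitude/longitude values.
    for lat, lon in zip(latlist, lonlist):
        if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
            raise ValueError(f"coordinate out of range: ({lat}, {lon})")

    return runner.simulate_streamflow(
        model_path=model_path,
        sim_start=sim_start,
        sim_end=sim_end,
        latlist=latlist,
        lonlist=lonlist,
        **kwargs,
    )
```

The return value is whatever simulate_streamflow returns; its structure is not described here, so the wrapper hands it back untouched.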
- train_streamflow_model(train_start, train_end, grdc_netcdf, batch_size, num_epochs, learning_rate=0.0005, loss_function='asym_laplace_nll', seed=100, routing_method='mfd', catchment_size_threshold=1, area_normalize=True, lr_schedule='cosine', warmup_epochs=1, min_learning_rate=5e-05, csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', model_overwrite=True)
Train the deep learning streamflow prediction model.
- Parameters:
train_start (str) – Training start date (YYYY-MM-DD).
train_end (str) – Training end date (YYYY-MM-DD).
grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).
batch_size (int) – Training batch size.
num_epochs (int) – Number of training epochs.
learning_rate (float) – Optimizer learning rate.
loss_function (str) – Loss name (default: “asym_laplace_nll”).
seed (int) – Random seed for sampling.
routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).
catchment_size_threshold (float) – Minimum catchment size for stations.
area_normalize (bool) – Whether to area-normalize predictors/response. If False, responses are modeled as raw m³/s without area normalization.
lr_schedule (str, optional) – Learning-rate schedule (“cosine”, “exp_decay”).
warmup_epochs (int) – Number of warmup epochs before scheduling.
min_learning_rate (float) – Minimum learning rate for schedules.
csv_dir (str, optional) – Directory of per-station CSVs.
lookup_csv (str, optional) – CSV lookup file with station coords.
id_col (str) – Station id column in lookup CSV.
lat_col (str) – Latitude column in lookup CSV.
lon_col (str) – Longitude column in lookup CSV.
date_col (str) – Date column in station CSVs.
discharge_col (str) – Discharge column in station CSVs.
file_pattern (str) – Filename pattern for station CSVs.
model_overwrite (bool) – If True, start a fresh model and overwrite existing checkpoints. If False and a saved model exists, load it and continue training.
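To give a feel for how learning_rate, warmup_epochs, and min_learning_rate might interact under lr_schedule="cosine", here is a generic linear-warmup plus cosine-annealing sketch. The exact formula Bakaano-Hydro uses is not given here, so the function cosine_lr and its warmup convention are assumptions for illustration only.

```python
import math


def cosine_lr(epoch, num_epochs, learning_rate=0.0005,
              min_learning_rate=5e-05, warmup_epochs=1):
    """Learning rate for a 0-based epoch under warmup + cosine annealing."""
    if epoch < warmup_epochs:
        # Linear ramp that reaches the full rate on the first post-warmup epoch.
        return learning_rate * (epoch + 1) / (warmup_epochs + 1)

    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_epochs = max(1, num_epochs - warmup_epochs - 1)
    progress = min(1.0, (epoch - warmup_epochs) / decay_epochs)

    # Cosine curve from learning_rate (progress 0) to min_learning_rate (progress 1).
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_learning_rate + (learning_rate - min_learning_rate) * cosine


# Per-epoch rates for a hypothetical 10-epoch run with the default settings.
schedule = [cosine_lr(e, num_epochs=10) for e in range(10)]
```

With these defaults the rate starts at half of learning_rate during the single warmup epoch, peaks at learning_rate on the next epoch, and decays smoothly to min_learning_rate by the final epoch.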