bakaano.runner

High-level orchestration for Bakaano-Hydro workflows.

Role: Provide a user-facing API to train, evaluate, and simulate streamflow.

class bakaano.runner.BakaanoHydro(working_dir, study_area, climate_data_source)[source]

Bases: object

Main user-facing interface for Bakaano-Hydro workflows.
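A minimal construction sketch follows; the working directory, study-area path, and climate source name are illustrative assumptions, not values prescribed by the API.

```python
# Hypothetical setup sketch for BakaanoHydro; all paths and the climate
# source identifier are assumptions -- consult your installation for
# the values it actually accepts.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro(
    working_dir="./bakaano_work",        # where intermediate/output files go
    study_area="./data/study_area.shp",  # assumed: vector boundary of the basin
    climate_data_source="era5",          # assumed source identifier
)
```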

create_land_cover_scenario(scenario_name, geometry=None, percent_change=0, change_type='deforestation', map_obj=None, open_map_if_missing=True)[source]

Create scenario rasters from a supplied geometry or the most recently drawn polygon on the map.

If geometry is None, this method tries to read the last drawn geometry from map_obj (if provided) or from the most recent map created by explore_scenario_draw_map. If nothing has been drawn and open_map_if_missing=True, a new draw map is returned.
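The draw-then-create flow described above might look like the following sketch; the scenario name and percentage below are illustrative assumptions.

```python
# Hypothetical scenario-creation flow; names, paths, and values are
# assumptions chosen for illustration.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")

# 1) Open a draw map and sketch a polygon over the area of interest.
m = bk.explore_scenario_draw_map()

# 2) With geometry=None, the last polygon drawn on `m` is picked up.
bk.create_land_cover_scenario(
    scenario_name="defor_20pct",
    map_obj=m,
    percent_change=20,
    change_type="deforestation",
)
```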

evaluate_streamflow_model_interactively(model_path, val_start, val_end, grdc_netcdf, routing_method='mfd', catchment_size_threshold=1000, area_normalize=True, csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', runoff_output_dir=None)[source]

Interactively evaluate a trained streamflow model.

Parameters:
  • model_path (str) – Path to a trained Keras model.

  • val_start (str) – Validation start date (YYYY-MM-DD).

  • val_end (str) – Validation end date (YYYY-MM-DD).

  • grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • catchment_size_threshold (float) – Minimum catchment size for stations.

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, predictions are interpreted as raw m³/s.

  • csv_dir (str, optional) – Directory of per-station CSVs.

  • lookup_csv (str, optional) – CSV lookup file with station coords.

  • id_col (str) – Station id column in lookup CSV.

  • lat_col (str) – Latitude column in lookup CSV.

  • lon_col (str) – Longitude column in lookup CSV.

  • date_col (str) – Date column in station CSVs.

  • discharge_col (str) – Discharge column in station CSVs.

  • file_pattern (str) – Filename pattern for station CSVs.

  • runoff_output_dir (str, optional) – Directory of routed runoff outputs to use; if omitted, the default runoff directory is used.
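To illustrate how the CSV-station parameters fit together, the sketch below (plain Python with made-up station ids) shows a lookup CSV read with the default column names and per-station filenames resolved through file_pattern:

```python
import csv
import io

# A tiny in-memory lookup CSV using the default column names
# (id, latitude, longitude); the station ids are hypothetical.
lookup = io.StringIO(
    "id,latitude,longitude\n"
    "1001,6.10,-0.25\n"
    "1002,6.45,-0.90\n"
)

file_pattern = "{id}.csv"  # the default pattern
rows = list(csv.DictReader(lookup))

# Each station id resolves to one per-station CSV inside csv_dir.
paths = [file_pattern.format(id=row["id"]) for row in rows]
print(paths)  # ['1001.csv', '1002.csv']
```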

explore_data_interactively(start_date, end_date, grdc_netcdf=None)[source]

Launch an interactive map to explore inputs and stations.

Parameters:
  • start_date (str) – Start date (YYYY-MM-DD) for GRDC filtering.

  • end_date (str) – End date (YYYY-MM-DD) for GRDC filtering.

  • grdc_netcdf (str, optional) – GRDC NetCDF path for stations overlay.

Returns:

Interactive map object.

Return type:

leafmap.foliumap.Map
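A hypothetical exploration sketch; the GRDC NetCDF path and dates are assumptions.

```python
# Hypothetical data-exploration sketch; paths and dates are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
m = bk.explore_data_interactively(
    start_date="1990-01-01",
    end_date="2010-12-31",
    grdc_netcdf="./data/grdc_stations.nc",  # optional stations overlay
)
m  # in a notebook, displays the returned leafmap map
```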

explore_scenario_draw_map()[source]

Launch an interactive draw map for scenario polygon creation.

recompute_scenario_runoff(scenario_name, sim_start, sim_end, routing_method='mfd', climate_data_source=None, force=False, resume=False)[source]

Recompute routed runoff for a scenario, optionally forcing a full recomputation (force=True) or resuming an interrupted run (resume=True).
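A hypothetical recompute call; the scenario name and simulation dates are assumptions.

```python
# Hypothetical recompute sketch; scenario name and dates are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.recompute_scenario_runoff(
    scenario_name="defor_20pct",
    sim_start="2000-01-01",
    sim_end="2005-12-31",
    routing_method="mfd",
    resume=True,  # pick up a previously interrupted run
)
```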

simulate_grdc_csv_stations(model_path, sim_start, sim_end, grdc_netcdf, routing_method='mfd', csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', area_normalize=True, runoff_output_dir=None)[source]

Simulate streamflow for GRDC or CSV stations in batch.

Parameters:
  • model_path (str) – Path to a trained Keras model.

  • sim_start (str) – Simulation start date (YYYY-MM-DD).

  • sim_end (str) – Simulation end date (YYYY-MM-DD).

  • grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • csv_dir (str, optional) – Directory of per-station CSVs.

  • lookup_csv (str, optional) – CSV lookup file with station coords.

  • id_col (str) – Station id column in lookup CSV.

  • lat_col (str) – Latitude column in lookup CSV.

  • lon_col (str) – Longitude column in lookup CSV.

  • date_col (str) – Date column in station CSVs.

  • discharge_col (str) – Discharge column in station CSVs.

  • file_pattern (str) – Filename pattern for station CSVs.

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, outputs are reported in raw m³/s.

  • runoff_output_dir (str, optional) – Directory of routed runoff outputs to use; if omitted, the default runoff directory is used.
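A hypothetical batch simulation over GRDC stations; paths, dates, and the model filename are assumptions.

```python
# Hypothetical batch-simulation sketch; all paths are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.simulate_grdc_csv_stations(
    model_path="./models/streamflow_model.keras",
    sim_start="2011-01-01",
    sim_end="2015-12-31",
    grdc_netcdf="./data/grdc_stations.nc",
    routing_method="mfd",
    area_normalize=True,
)
```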

simulate_scenario_grdc_csv_stations(scenario_name, model_path, sim_start, sim_end, grdc_netcdf, routing_method='mfd', csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', area_normalize=True, recompute_runoff=False)[source]

Run station-based streamflow simulation using scenario runoff outputs.

simulate_scenario_streamflow(scenario_name, model_path, sim_start, sim_end, latlist, lonlist, routing_method='mfd', area_normalize=True, recompute_runoff=False)[source]

Run point-based streamflow simulation using scenario runoff outputs.
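A hypothetical scenario simulation at two points; the scenario name, coordinates, and model path are assumptions.

```python
# Hypothetical scenario simulation at two points; values are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.simulate_scenario_streamflow(
    scenario_name="defor_20pct",
    model_path="./models/streamflow_model.keras",
    sim_start="2000-01-01",
    sim_end="2005-12-31",
    latlist=[6.10, 6.45],
    lonlist=[-0.25, -0.90],
    recompute_runoff=True,  # regenerate scenario runoff first if needed
)
```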

simulate_streamflow(model_path, sim_start, sim_end, latlist, lonlist, routing_method='mfd', area_normalize=True, runoff_output_dir=None)[source]

Simulate streamflow for given coordinates using a trained model.

Parameters:
  • model_path (str) – Path to a trained Keras model.

  • sim_start (str) – Simulation start date (YYYY-MM-DD).

  • sim_end (str) – Simulation end date (YYYY-MM-DD).

  • latlist (list[float]) – List of latitudes.

  • lonlist (list[float]) – List of longitudes.

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, outputs are reported in raw m³/s.

  • runoff_output_dir (str, optional) – Directory of routed runoff outputs to use; if omitted, the default runoff directory is used.
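latlist and lonlist pair positionally: the i-th latitude goes with the i-th longitude. A plain-Python sketch with hypothetical coordinates:

```python
# latlist/lonlist pair by position: point i is (latlist[i], lonlist[i]).
latlist = [6.10, 6.45, 7.02]
lonlist = [-0.25, -0.90, -1.40]

assert len(latlist) == len(lonlist), "lists must be the same length"
points = list(zip(latlist, lonlist))
print(points)  # [(6.1, -0.25), (6.45, -0.9), (7.02, -1.4)]
```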

train_streamflow_model(train_start, train_end, grdc_netcdf, batch_size, num_epochs, learning_rate=0.0005, loss_function='asym_laplace_nll', seed=100, routing_method='mfd', catchment_size_threshold=1, area_normalize=True, lr_schedule='cosine', warmup_epochs=1, min_learning_rate=5e-05, csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', model_overwrite=True)[source]

Train the deep learning streamflow prediction model.

Parameters:
  • train_start (str) – Training start date (YYYY-MM-DD).

  • train_end (str) – Training end date (YYYY-MM-DD).

  • grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).

  • batch_size (int) – Training batch size.

  • num_epochs (int) – Number of training epochs.

  • learning_rate (float) – Optimizer learning rate.

  • loss_function (str) – Loss name (default: “asym_laplace_nll”).

  • seed (int) – Random seed for sampling.

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • catchment_size_threshold (float) – Minimum catchment size for stations.

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, responses are modeled as raw m³/s without area normalization.

  • lr_schedule (str, optional) – Learning-rate schedule (“cosine”, “exp_decay”).

  • warmup_epochs (int) – Number of warmup epochs before scheduling.

  • min_learning_rate (float) – Minimum learning rate for schedules.

  • csv_dir (str, optional) – Directory of per-station CSVs.

  • lookup_csv (str, optional) – CSV lookup file with station coords.

  • id_col (str) – Station id column in lookup CSV.

  • lat_col (str) – Latitude column in lookup CSV.

  • lon_col (str) – Longitude column in lookup CSV.

  • date_col (str) – Date column in station CSVs.

  • discharge_col (str) – Discharge column in station CSVs.

  • file_pattern (str) – Filename pattern for station CSVs.

  • model_overwrite (bool) – If True, start a fresh model and overwrite existing checkpoints. If False and a saved model exists, load it and continue training.
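A hypothetical training call; the paths, batch size, epoch count, and other hyperparameter choices below are illustrative assumptions, not recommendations.

```python
# Hypothetical training sketch; paths and hyperparameters are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.train_streamflow_model(
    train_start="1990-01-01",
    train_end="2010-12-31",
    grdc_netcdf="./data/grdc_stations.nc",
    batch_size=64,
    num_epochs=50,
    learning_rate=5e-4,
    lr_schedule="cosine",
    warmup_epochs=1,
    model_overwrite=False,  # continue training if a saved model exists
)
```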