bakaano.runner

High-level orchestration for Bakaano-Hydro workflows.

Role: Provide a user-facing API to train, evaluate, and simulate streamflow.

class bakaano.runner.BakaanoHydro(working_dir, study_area, climate_data_source)[source]

Bases: object

Main user-facing interface for Bakaano-Hydro workflows.
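A minimal construction sketch follows; the working directory, study-area path, and climate source name are illustrative assumptions, not values prescribed by the API.

```python
# Hypothetical setup sketch for BakaanoHydro; all paths and the climate
# source identifier are assumptions -- consult your installation for
# the values it actually accepts.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro(
    working_dir="./bakaano_work",        # where intermediate/output files go
    study_area="./data/study_area.shp",  # assumed: vector boundary of the basin
    climate_data_source="era5",          # assumed source identifier
)
```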

create_land_cover_scenario(scenario_name, geometry=None, percent_change=0, change_type='deforestation', map_obj=None, open_map_if_missing=True)[source]

Create scenario rasters from a supplied geometry or the most recently drawn polygon on the map.

If geometry is None, this method tries to read the last drawn geometry from map_obj (if provided) or from the most recent map created by explore_scenario_draw_map. If nothing has been drawn and open_map_if_missing=True, a new draw map is returned.
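The draw-then-create flow described above might look like the following sketch; the scenario name and percentage below are illustrative assumptions.

```python
# Hypothetical scenario-creation flow; names, paths, and values are
# assumptions chosen for illustration.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")

# 1) Open a draw map and sketch a polygon over the area of interest.
m = bk.explore_scenario_draw_map()

# 2) With geometry=None, the last polygon drawn on `m` is picked up.
bk.create_land_cover_scenario(
    scenario_name="defor_20pct",
    map_obj=m,
    percent_change=20,
    change_type="deforestation",
)
```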

evaluate_streamflow_model_interactively(model_path, val_start, val_end, grdc_netcdf, routing_method='mfd', catchment_size_threshold=1000, area_normalize=True, csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', runoff_output_dir=None)[source]

Interactively evaluate a trained streamflow model.

Parameters:
  • model_path (str) – Path to a trained Keras model.

  • val_start (str) – Validation start date (YYYY-MM-DD).

  • val_end (str) – Validation end date (YYYY-MM-DD).

  • grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • catchment_size_threshold (float) – Minimum catchment size for stations.

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, predictions are interpreted as raw m³/s.

  • csv_dir (str, optional) – Directory of per-station CSVs.

  • lookup_csv (str, optional) – CSV lookup file with station coords.

  • id_col (str) – Station id column in lookup CSV.

  • lat_col (str) – Latitude column in lookup CSV.

  • lon_col (str) – Longitude column in lookup CSV.

  • date_col (str) – Date column in station CSVs.

  • discharge_col (str) – Discharge column in station CSVs.

  • file_pattern (str) – Filename pattern for station CSVs.

  • runoff_output_dir (str, optional) – Directory of routed runoff outputs to use; if omitted, the default runoff directory is used.
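To illustrate how the CSV-station parameters fit together, the sketch below (plain Python with made-up station ids) shows a lookup CSV read with the default column names and per-station filenames resolved through file_pattern:

```python
import csv
import io

# A tiny in-memory lookup CSV using the default column names
# (id, latitude, longitude); the station ids are hypothetical.
lookup = io.StringIO(
    "id,latitude,longitude\n"
    "1001,6.10,-0.25\n"
    "1002,6.45,-0.90\n"
)

file_pattern = "{id}.csv"  # the default pattern
rows = list(csv.DictReader(lookup))

# Each station id resolves to one per-station CSV inside csv_dir.
paths = [file_pattern.format(id=row["id"]) for row in rows]
print(paths)  # ['1001.csv', '1002.csv']
```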

explore_data_interactively(start_date, end_date, grdc_netcdf=None)[source]

Launch an interactive map to explore inputs and stations.

Parameters:
  • start_date (str) – Start date (YYYY-MM-DD) for GRDC filtering.

  • end_date (str) – End date (YYYY-MM-DD) for GRDC filtering.

  • grdc_netcdf (str, optional) – GRDC NetCDF path for stations overlay.

Returns:

Interactive map object.

Return type:

leafmap.foliumap.Map
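A hypothetical exploration sketch; the GRDC NetCDF path and dates are assumptions.

```python
# Hypothetical data-exploration sketch; paths and dates are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
m = bk.explore_data_interactively(
    start_date="1990-01-01",
    end_date="2010-12-31",
    grdc_netcdf="./data/grdc_stations.nc",  # optional stations overlay
)
m  # in a notebook, displays the returned leafmap map
```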

explore_scenario_draw_map()[source]

Launch an interactive draw map for scenario polygon creation.

recompute_scenario_runoff(scenario_name, sim_start, sim_end, routing_method='mfd', climate_data_source=None, force=False, resume=False)[source]

Recompute routed runoff for a scenario, optionally forcing a full recomputation (force=True) or resuming an interrupted run (resume=True).
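A hypothetical recompute call; the scenario name and simulation dates are assumptions.

```python
# Hypothetical recompute sketch; scenario name and dates are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.recompute_scenario_runoff(
    scenario_name="defor_20pct",
    sim_start="2000-01-01",
    sim_end="2005-12-31",
    routing_method="mfd",
    resume=True,  # pick up a previously interrupted run
)
```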

simulate_grdc_csv_stations(model_path, sim_start, sim_end, grdc_netcdf, routing_method='mfd', csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', area_normalize=True, runoff_output_dir=None)[source]

Simulate streamflow for GRDC or CSV stations in batch.

Parameters:
  • model_path (str) – Path to a trained Keras model.

  • sim_start (str) – Simulation start date (YYYY-MM-DD).

  • sim_end (str) – Simulation end date (YYYY-MM-DD).

  • grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • csv_dir (str, optional) – Directory of per-station CSVs.

  • lookup_csv (str, optional) – CSV lookup file with station coords.

  • id_col (str) – Station id column in lookup CSV.

  • lat_col (str) – Latitude column in lookup CSV.

  • lon_col (str) – Longitude column in lookup CSV.

  • date_col (str) – Date column in station CSVs.

  • discharge_col (str) – Discharge column in station CSVs.

  • file_pattern (str) – Filename pattern for station CSVs.

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, outputs are reported in raw m³/s.

  • runoff_output_dir (str, optional) – Directory of routed runoff outputs to use; if omitted, the default runoff directory is used.
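A hypothetical batch simulation over GRDC stations; paths, dates, and the model filename are assumptions.

```python
# Hypothetical batch-simulation sketch; all paths are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.simulate_grdc_csv_stations(
    model_path="./models/streamflow_model.keras",
    sim_start="2011-01-01",
    sim_end="2015-12-31",
    grdc_netcdf="./data/grdc_stations.nc",
    routing_method="mfd",
    area_normalize=True,
)
```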

simulate_scenario_grdc_csv_stations(scenario_name, model_path, sim_start, sim_end, grdc_netcdf, routing_method='mfd', csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', area_normalize=True, recompute_runoff=False)[source]

Run station-based streamflow simulation using scenario runoff outputs.

simulate_scenario_streamflow(scenario_name, model_path, sim_start, sim_end, latlist, lonlist, routing_method='mfd', area_normalize=True, recompute_runoff=False)[source]

Run point-based streamflow simulation using scenario runoff outputs.
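A hypothetical scenario simulation at two points; the scenario name, coordinates, and model path are assumptions.

```python
# Hypothetical scenario simulation at two points; values are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.simulate_scenario_streamflow(
    scenario_name="defor_20pct",
    model_path="./models/streamflow_model.keras",
    sim_start="2000-01-01",
    sim_end="2005-12-31",
    latlist=[6.10, 6.45],
    lonlist=[-0.25, -0.90],
    recompute_runoff=True,  # regenerate scenario runoff first if needed
)
```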

simulate_streamflow(model_path, sim_start, sim_end, latlist, lonlist, routing_method='mfd', area_normalize=True, runoff_output_dir=None)[source]

Simulate streamflow for given coordinates using a trained model.

Parameters:
  • model_path (str) – Path to a trained Keras model.

  • sim_start (str) – Simulation start date (YYYY-MM-DD).

  • sim_end (str) – Simulation end date (YYYY-MM-DD).

  • latlist (list[float]) – List of latitudes.

  • lonlist (list[float]) – List of longitudes.

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, outputs are reported in raw m³/s.

  • runoff_output_dir (str, optional) – Directory of routed runoff outputs to use; if omitted, the default runoff directory is used.
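latlist and lonlist pair positionally: the i-th latitude goes with the i-th longitude. A plain-Python sketch with hypothetical coordinates:

```python
# latlist/lonlist pair by position: point i is (latlist[i], lonlist[i]).
latlist = [6.10, 6.45, 7.02]
lonlist = [-0.25, -0.90, -1.40]

assert len(latlist) == len(lonlist), "lists must be the same length"
points = list(zip(latlist, lonlist))
print(points)  # [(6.1, -0.25), (6.45, -0.9), (7.02, -1.4)]
```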

train_streamflow_model(train_start, train_end, grdc_netcdf, batch_size, num_epochs, learning_rate=0.0005, loss_function='asym_laplace_nll', seed=100, routing_method='mfd', catchment_size_threshold=1, area_normalize=True, lr_schedule='cosine', warmup_epochs=1, min_learning_rate=5e-05, csv_dir=None, lookup_csv=None, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv', model_overwrite=True)[source]

Train the deep learning streamflow prediction model.

Parameters:
  • train_start (str) – Training start date (YYYY-MM-DD).

  • train_end (str) – Training end date (YYYY-MM-DD).

  • grdc_netcdf (str) – GRDC NetCDF path (if using GRDC data).

  • batch_size (int) – Training batch size.

  • num_epochs (int) – Number of training epochs.

  • learning_rate (float) – Optimizer learning rate.

  • loss_function (str) – Loss name (default: “asym_laplace_nll”).

  • seed (int) – Random seed for sampling.

  • routing_method (str) – Routing method (“mfd”, “d8”, “dinf”).

  • catchment_size_threshold (float) – Minimum catchment size for stations.

  • area_normalize (bool) – Whether to area-normalize predictors/response. If False, responses are modeled as raw m³/s without area normalization.

  • lr_schedule (str, optional) – Learning-rate schedule (“cosine”, “exp_decay”).

  • warmup_epochs (int) – Number of warmup epochs before scheduling.

  • min_learning_rate (float) – Minimum learning rate for schedules.

  • csv_dir (str, optional) – Directory of per-station CSVs.

  • lookup_csv (str, optional) – CSV lookup file with station coords.

  • id_col (str) – Station id column in lookup CSV.

  • lat_col (str) – Latitude column in lookup CSV.

  • lon_col (str) – Longitude column in lookup CSV.

  • date_col (str) – Date column in station CSVs.

  • discharge_col (str) – Discharge column in station CSVs.

  • file_pattern (str) – Filename pattern for station CSVs.

  • model_overwrite (bool) – If True, start a fresh model and overwrite existing checkpoints. If False and a saved model exists, load it and continue training.
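A hypothetical training call; the paths, batch size, epoch count, and other hyperparameter choices below are illustrative assumptions, not recommendations.

```python
# Hypothetical training sketch; paths and hyperparameters are assumptions.
from bakaano.runner import BakaanoHydro

bk = BakaanoHydro("./bakaano_work", "./data/study_area.shp", "era5")
bk.train_streamflow_model(
    train_start="1990-01-01",
    train_end="2010-12-31",
    grdc_netcdf="./data/grdc_stations.nc",
    batch_size=64,
    num_epochs=50,
    learning_rate=5e-4,
    lr_schedule="cosine",
    warmup_epochs=1,
    model_overwrite=False,  # continue training if a saved model exists
)
```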