bakaano.streamflow_simulator
Simulation and inference utilities for streamflow prediction.
Role: Prepare simulation inputs and run trained model inference.
- bakaano.streamflow_simulator._open_dataset_with_fallback(nc_path)
Open NetCDF with backend fallback for Colab/Drive compatibility.
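A backend fallback of this kind can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name `open_with_fallback`, the engine order, and the `opener` parameter (added so the logic can be exercised without a real file) are assumptions.

```python
def open_with_fallback(nc_path, engines=("netcdf4", "h5netcdf", "scipy"), opener=None):
    """Try each backend in turn and return the first dataset that opens.

    The engine order and the `opener` hook are assumptions made for this sketch.
    """
    if opener is None:
        # Imported lazily so the sketch can be loaded without xarray installed.
        import xarray as xr
        opener = xr.open_dataset

    errors = []
    for engine in engines:
        try:
            return opener(nc_path, engine=engine)
        except Exception as exc:  # deliberately broad: backends fail in different ways
            errors.append(f"{engine}: {exc}")

    # Every backend failed, so report all collected errors together.
    raise OSError(
        f"Could not open {nc_path!r} with any backend:\n" + "\n".join(errors)
    )
```

Collecting the per-engine errors before raising makes it easier to see why each backend failed, which helps when a file sits on a mounted drive.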
- class bakaano.streamflow_simulator.PredictDataPreprocessor(working_dir, study_area, sim_start, sim_end, routing_method, grdc_streamflow_nc_file=None, catchment_size_threshold=None, runoff_output_dir=None)
Bases: object
- _extract_station_rowcol(lat, lon)
Extract the row and column indices for a given latitude and longitude from a given raster file.
- Parameters:
lat (float) – The latitude of the station.
lon (float) – The longitude of the station.
- Returns:
row (int) – The row index corresponding to the given latitude and longitude.
col (int) – The column index corresponding to the given latitude and longitude.
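The coordinate-to-index mapping can be illustrated with plain arithmetic. The sketch below assumes a north-up grid described by its top-left corner and cell size; the function name `latlon_to_rowcol` and all of its grid parameters are hypothetical and do not come from the class. With rasterio, `rasterio.transform.rowcol` does the same job from a raster's affine transform.

```python
import math

def latlon_to_rowcol(lat, lon, west, north, pixel_width, pixel_height):
    """Return (row, col) of the grid cell that contains (lat, lon).

    Assumes a north-up grid whose top-left corner is at (west, north) and
    whose cells are pixel_width x pixel_height degrees (both positive).
    """
    col = math.floor((lon - west) / pixel_width)     # columns increase eastwards
    row = math.floor((north - lat) / pixel_height)   # rows increase southwards
    return row, col

# A station at 18.75 N, 8.75 W on a 0.5-degree grid anchored at (10 W, 20 N)
row, col = latlon_to_rowcol(18.75, -8.75, west=-10.0, north=20.0,
                            pixel_width=0.5, pixel_height=0.5)
print(row, col)  # 2 2
```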
- _snap_coordinates(lat, lon)
Snap the given latitude and longitude to the nearest river segment based on a river grid.
- Parameters:
lat (float) – The latitude to be snapped.
lon (float) – The longitude to be snapped.
- Returns:
snapped_lat (float) – The latitude of the nearest river segment.
snapped_lon (float) – The longitude of the nearest river segment.
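Snapping can be pictured as a nearest-neighbour search over the river cells. The sketch below is an assumption-laden illustration: the function name `snap_to_river`, the representation of the river grid as a list of (lat, lon) cell centres, and the use of squared distance in degrees (rather than a geodesic distance) are all choices made here, not details taken from the class.

```python
def snap_to_river(lat, lon, river_cells):
    """Return the (lat, lon) of the river cell closest to the given point.

    river_cells: sequence of (lat, lon) cell centres flagged as river
    (a hypothetical stand-in for the river grid).
    Distance is squared Euclidean distance in degrees, which is adequate
    for a sketch but is not a true geodesic distance.
    """
    if not river_cells:
        raise ValueError("river_cells is empty; nothing to snap to")
    return min(
        river_cells,
        key=lambda cell: (cell[0] - lat) ** 2 + (cell[1] - lon) ** 2,
    )

river = [(5.0, 1.0), (5.5, 1.5), (6.0, 2.0)]
snapped_lat, snapped_lon = snap_to_river(5.4, 1.4, river)
print(snapped_lat, snapped_lon)  # 5.5 1.5
```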
- _check_point_in_region(olat, olon)
Check whether a single (olat, olon) point lies within a study-area shapefile.
If the point is not inside, raise SystemExit with a formatted, user-facing message.
If the point is inside, print a confirmation and take no further action.
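The exit-or-confirm behaviour can be sketched without any GIS dependency. In the example below the study area is a single polygon ring passed in as a list of (lon, lat) vertices and tested with ray casting; the real method works from a shapefile, so the helper `point_in_polygon`, the `ring` argument, and the message wording are assumptions made for illustration.

```python
def point_in_polygon(lat, lon, ring):
    """Ray-casting test: True if (lat, lon) falls inside the ring.

    ring: list of (lon, lat) vertices of a simple polygon (hypothetical
    stand-in for the study-area shapefile geometry).
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def check_point_in_region(olat, olon, ring):
    """Exit with a readable message if the point is outside, else confirm."""
    if not point_in_polygon(olat, olon, ring):
        raise SystemExit(
            f"Point ({olat}, {olon}) lies outside the study area. "
            "Check the coordinates or the study-area boundary."
        )
    print(f"Point ({olat}, {olon}) is inside the study area.")

square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
check_point_in_region(5.0, 5.0, square)
```

Raising SystemExit stops a script or notebook cell with the message alone, which suits a user-facing check, but callers that want to recover must catch it explicitly.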
- load_observed_streamflow(grdc_streamflow_nc_file)
Load and filter observed GRDC streamflow data in a schema-robust way. Works for single- and multi-station NetCDFs.
- Parameters:
grdc_streamflow_nc_file (str) – Path to GRDC NetCDF file.
- Returns:
Filtered GRDC subset for the study area.
- Return type:
xarray.Dataset
- load_observed_streamflow_from_csv_dir(csv_dir, lookup_csv, id_col='id', lat_col='latitude', lon_col='longitude', date_col='date', discharge_col='discharge', file_pattern='{id}.csv')
Load observed streamflow from per-station CSV files using a lookup table.
The lookup table must include station identifiers and coordinates. The method filters stations to the study area, then loads per-station CSVs by ID.
- Parameters:
csv_dir (str) – Directory containing per-station CSV files.
lookup_csv (str) – CSV file with station ids and coordinates.
id_col (str) – Station id column in lookup CSV.
lat_col (str) – Latitude column in lookup CSV.
lon_col (str) – Longitude column in lookup CSV.
date_col (str) – Date column in station CSVs.
discharge_col (str) – Discharge column in station CSVs.
file_pattern (str) – Pattern for station CSV filenames (e.g., "{id}.csv").
- Returns:
Mapping of station_id to observed discharge DataFrame.
- Return type:
dict
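The lookup-then-load flow can be sketched with the standard library alone. The example below makes several simplifying assumptions: the study area is reduced to a bounding box (`bbox`, a hypothetical parameter) rather than a shapefile, records are returned as lists of (date, discharge) tuples rather than DataFrames, and stations with no matching file are skipped silently. The function name `load_station_csvs` is illustrative only.

```python
import csv
import os

def load_station_csvs(csv_dir, lookup_csv, bbox,
                      id_col="id", lat_col="latitude", lon_col="longitude",
                      date_col="date", discharge_col="discharge",
                      file_pattern="{id}.csv"):
    """Return {station_id: [(date, discharge), ...]} for stations inside bbox.

    bbox = (min_lon, min_lat, max_lon, max_lat); a hypothetical stand-in
    for the study-area filter.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    observed = {}
    with open(lookup_csv, newline="") as lookup_file:
        for station in csv.DictReader(lookup_file):
            lat = float(station[lat_col])
            lon = float(station[lon_col])
            # Step 1: keep only stations that fall inside the study area.
            if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
                continue
            # Step 2: resolve the per-station filename from the pattern.
            station_id = station[id_col]
            path = os.path.join(csv_dir, file_pattern.format(id=station_id))
            if not os.path.exists(path):
                continue  # listed in the lookup table but no data file present
            # Step 3: read the station's date and discharge columns.
            with open(path, newline="") as station_file:
                observed[station_id] = [
                    (record[date_col], float(record[discharge_col]))
                    for record in csv.DictReader(station_file)
                ]
    return observed
```

Keeping the column names and filename pattern as keyword arguments mirrors the documented signature, so a dataset with different headers needs no code changes.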
- get_data()
Extract and preprocess predictor and response variables for each station based on its coordinates.
- Returns:
A list containing two elements:
  - self.data_list: A list of tuples, each containing predictors (DataFrame) and response (DataFrame).
  - self.catchment: A list of tuples, each containing catchment data (accumulation and slope values).
- Return type:
list
- class bakaano.streamflow_simulator.PredictStreamflow(working_dir, area_normalize=True)
Bases: object