Internals

For contributors, not end users

This page documents the internal modules used by GeoBayesianNetwork and the data sources. Normal users interact exclusively with the public API described in GeoBayesianNetwork, InferenceResult, and the Sources pages.

Grid module (grid.py)

GridSpec dataclass

Describes the spatial grid that all inputs are aligned to.

from_params classmethod

from_params(crs, resolution, extent)

Build a GridSpec from explicit parameters.

Parameters

crs: Target CRS as EPSG string (e.g. "EPSG:32632") or WKT.
resolution: Pixel size in the units of crs.
extent: (xmin, ymin, xmax, ymax) in the units of crs.
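As a rough illustration of how resolution and extent together determine a raster shape, the sketch below derives the number of rows and columns. The helper grid_shape is hypothetical and not part of grid.py, and rounding partial edge pixels up is an assumption rather than documented behaviour.

```python
import math


def grid_shape(resolution, extent):
    """Hypothetical helper: (H, W) of a north-up grid covering extent."""
    xmin, ymin, xmax, ymax = extent
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("extent must satisfy xmin < xmax and ymin < ymax")
    # Assumption: a partial pixel at the far edge is rounded up to a full one.
    width = math.ceil((xmax - xmin) / resolution)
    height = math.ceil((ymax - ymin) / resolution)
    return height, width


# A 250 m x 100 m box at 10 m resolution, in a projected CRS such as EPSG:32632.
shape = grid_shape(10.0, (500000.0, 6600000.0, 500250.0, 6600100.0))
```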

extent_wgs84

extent_wgs84()

Return the bounding box in WGS84 (lon_min, lat_min, lon_max, lat_max).

align_to_grid

align_to_grid(data, grid)

Reproject and resample data to match grid.

Returns a (H, W) float32 array. Pixels that fall outside data's extent are filled with NaN.

A source with crs=None (ConstantSource) is broadcast directly. A source that already matches the target grid is returned as-is.

Discretization module (discretize.py)

DiscretizationSpec dataclass

Maps continuous values to BN state labels via breakpoints.

Example

DiscretizationSpec([0, 10, 30, 90], ["flat", "moderate", "steep"]) produces:

- value < 10 → "flat" (index 0)
- 10 ≤ value < 30 → "moderate" (index 1)
- value ≥ 30 → "steep" (index 2)

The first and last breakpoints define the documented valid range but do not affect the bin boundaries used for digitization.

discretize_array

discretize_array(array, spec)

Return an integer index array (H, W) matching each pixel to a state.

NaN pixels are mapped to -1 (sentinel for NoData).
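The binning rule and the NaN sentinel can be reproduced with plain NumPy. The sketch below is an illustrative stand-in, not the library's discretize_array: the function name discretize_sketch is made up, and the use of np.digitize on the interior breakpoints is an assumption consistent with the example above.

```python
import numpy as np

breakpoints = [0, 10, 30, 90]
labels = ["flat", "moderate", "steep"]


def discretize_sketch(array, breakpoints):
    """Illustrative stand-in for discretize_array (not the library code)."""
    array = np.asarray(array, dtype=np.float32)
    # Only the interior breakpoints act as bin edges; the first and last
    # values document the valid range and are ignored here.
    edges = np.asarray(breakpoints[1:-1], dtype=np.float32)
    idx = np.digitize(array, edges).astype(np.int16)
    # NaN pixels get the NoData sentinel.
    idx[np.isnan(array)] = -1
    return idx


slope = np.array([[5.0, 10.0, 29.9], [30.0, 120.0, np.nan]])
idx = discretize_sketch(slope, breakpoints)
```

Note that 120.0 lies above the last breakpoint (90) yet still lands in "steep", which is what "do not affect the bin boundaries" implies.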

Inference module (inference.py)

run_inference

run_inference(model, evidence_state_grids, evidence_state_names, query_nodes, query_state_names, nodata_mask, ve=None)

Run batched pixel-wise inference.

Parameters

model: A fitted pgmpy BayesianNetwork.
evidence_state_grids: Mapping from evidence node name to (H, W) int16 array of state indices.
evidence_state_names: Mapping from evidence node name to its ordered list of state labels.
query_nodes: Nodes whose posterior distributions are requested.
query_state_names: Mapping from query node name to its ordered list of state labels.
nodata_mask: (H, W) boolean array; True where any input pixel is NoData.
ve: Pre-built pgmpy.inference.VariableElimination engine. If None (default), a new one is created from model. Pass a cached instance to avoid recreating it on every call when the model does not change.

Returns

Mapping from query node name to a (H, W, n_states) float32 array.
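Batching pixel-wise inference usually means querying the network once per distinct evidence combination rather than once per pixel. The sketch below shows only that grouping step with NumPy; the helper unique_evidence_combinations is hypothetical, and whether run_inference groups pixels exactly this way is an assumption.

```python
import numpy as np


def unique_evidence_combinations(evidence_state_grids, nodata_mask):
    """Hypothetical helper: collapse valid pixels to unique evidence rows."""
    names = list(evidence_state_grids)
    valid = ~nodata_mask
    # Shape (n_valid_pixels, n_evidence_nodes), one row per valid pixel.
    stacked = np.stack([evidence_state_grids[n][valid] for n in names], axis=1)
    combos, inverse = np.unique(stacked, axis=0, return_inverse=True)
    # reshape(-1) keeps the inverse 1-D across NumPy versions.
    return names, combos, inverse.reshape(-1)


grids = {
    "slope": np.array([[0, 1], [0, 1]], dtype=np.int16),
    "cover": np.array([[1, 1], [1, 0]], dtype=np.int16),
}
nodata_mask = np.array([[False, False], [True, False]])
names, combos, inverse = unique_evidence_combinations(grids, nodata_mask)
```

One posterior would then be computed per row of combos and scattered back to the valid pixels by indexing the per-combination results with inverse.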

build_conditional_table

build_conditional_table(model, evidence_nodes, query_nodes, ve=None, max_table_cells=_MAX_TABLE_CELLS)

Compute P(query | evidence) for all evidence combinations at once.

Instead of one VE query per evidence combination (prod(n_states) calls), the joint P(query, e_1, ..., e_k) is computed with a single VE query per query node and normalised along the query axis. The result is a lookup table identical in shape to the one built by GeoBayesianNetwork.precompute.

Parameters

model: A fitted pgmpy DiscreteBayesianNetwork.
evidence_nodes: Evidence node names; their order defines the table axes.
query_nodes: Nodes whose conditional distributions are tabulated.
ve: Optional pre-built VariableElimination engine.
max_table_cells: If the total number of table cells (summed over query nodes) exceeds this bound, None is returned and the caller should fall back to per-combination queries.

Returns

Mapping from query node to a float32 array of shape (n_states_0, ..., n_states_k, n_query_states), or None if the table would exceed max_table_cells (or a query node is itself an evidence node). Evidence combinations with zero prior probability yield NaN rows.
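The normalisation step can be illustrated without pgmpy. The sketch below starts from a made-up joint array with the query on the last axis and divides by the evidence marginal; the function name normalise_query_axis and the numbers are assumptions for illustration, not the library's code.

```python
import numpy as np

# Hypothetical joint P(e_1, e_2, query) with shape (2, 2, 3); it sums to 1.
joint = np.array(
    [
        [[0.10, 0.10, 0.20], [0.05, 0.15, 0.00]],
        [[0.20, 0.10, 0.10], [0.00, 0.00, 0.00]],
    ],
    dtype=np.float64,
)


def normalise_query_axis(joint):
    """Turn P(e_1..e_k, query) into P(query | e_1..e_k) along the last axis."""
    # Marginal P(e_1, ..., e_k), kept broadcastable against joint.
    evidence_prob = joint.sum(axis=-1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        cond = joint / evidence_prob
    # Evidence combinations with zero probability yield NaN rows.
    cond = np.where(evidence_prob > 0, cond, np.nan)
    return cond.astype(np.float32)


table = normalise_query_axis(joint)
```

Here table[1, 1] is all NaN because that evidence combination has zero probability, matching the behaviour described under Returns.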

run_inference_from_table

run_inference_from_table(table, node_order, evidence_state_grids, nodata_mask)

Map pixel-wise discrete evidence to precomputed probabilities via fancy indexing.

This is the zero-pgmpy fast path used after GeoBayesianNetwork.precompute. Probabilities are read from a lookup table using NumPy advanced indexing, which costs O(H×W) instead of one pgmpy query per unique evidence combination.

Parameters

table: Mapping from query node name to a numpy array of shape (n_states_0, n_states_1, ..., n_states_k, n_query_states) where the first k axes correspond to the k nodes in node_order.
node_order: Evidence node names in the order matching the table axes.
evidence_state_grids: Mapping from node name to (H, W) int array of state indices. Nodata pixels (index -1) are masked out via nodata_mask.
nodata_mask: (H, W) boolean array; True where any input pixel is NoData.

Returns

Mapping from query node name to a (H, W, n_states) float32 array. NaN where nodata_mask is True.
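A minimal sketch of the lookup, assuming the table layout described above. The function lookup_sketch and the node names are made up for illustration; replacing the -1 sentinel with 0 before indexing is one possible way to keep the index valid, not necessarily what the library does.

```python
import numpy as np


def lookup_sketch(table, node_order, evidence_state_grids, nodata_mask):
    """Illustrative stand-in for run_inference_from_table."""
    # Replace the -1 sentinel with 0 so indexing stays in bounds;
    # those pixels are overwritten with NaN afterwards.
    index = tuple(
        np.where(nodata_mask, 0, evidence_state_grids[name])
        for name in node_order
    )
    out = {}
    for query, arr in table.items():
        # k index arrays of shape (H, W) select along the first k axes,
        # leaving an (H, W, n_query_states) result.
        probs = arr[index].astype(np.float32)
        probs[nodata_mask] = np.nan
        out[query] = probs
    return out


table = {"risk": np.array([[[0.9, 0.1], [0.6, 0.4]],
                           [[0.3, 0.7], [0.2, 0.8]]])}
grids = {
    "slope": np.array([[0, 1], [-1, 1]], dtype=np.int16),
    "cover": np.array([[1, 0], [0, 1]], dtype=np.int16),
}
nodata_mask = np.array([[False, False], [True, False]])
result = lookup_sketch(table, ["slope", "cover"], grids, nodata_mask)
```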

shannon_entropy

shannon_entropy(probs)

Compute per-pixel Shannon entropy (bits) from a probability array.

Parameters

probs: (..., n_states) array of probabilities.

Returns

(...) array of entropy values.
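For reference, entropy in bits over the last axis can be written in a few lines of NumPy. The sketch below uses the name entropy_bits to avoid implying it is the library function; treating 0 · log2(0) as 0 and letting NaN pixels propagate are assumptions about the intended behaviour.

```python
import numpy as np


def entropy_bits(probs):
    """H = -sum(p * log2(p)) over the last axis, in bits."""
    probs = np.asarray(probs, dtype=np.float64)
    # Substitute 1.0 where p is not positive so log2 is defined;
    # those terms contribute p * log2(1) = 0 (or NaN if p is NaN).
    safe = np.where(probs > 0, probs, 1.0)
    return -(probs * np.log2(safe)).sum(axis=-1)


uniform = entropy_bits([0.25, 0.25, 0.25, 0.25])  # maximum for 4 states
certain = entropy_bits([1.0, 0.0])                # no uncertainty
nodata = entropy_bits([np.nan, np.nan])           # NaN in, NaN out
```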

Types (_types.py)

RasterData

Bases: NamedTuple

Internal raster representation — never exposes rasterio objects.

I/O helpers (_io.py)

write_geotiff

write_geotiff(array, crs, transform, path, nodata=float('nan'))

Write a multi-band float32 GeoTIFF.

Parameters

array: (bands, H, W) float32 array.
crs: CRS as EPSG string or WKT.
transform: Affine pixel-to-world transform.
path: Output file path.
nodata: NoData value written into the file metadata.

Disk cache utilities (sources/_cache.py)

_make_cache_path

_make_cache_path(cache_dir, key)

_load_cached

_load_cached(cache_path)

Return cached RasterData or None if absent / corrupt.

_save_cached

_save_cached(cache_path, data)