Data Model

Object relationships

%%{init: {'theme': 'neutral', 'themeVariables': {'fontSize': '15px'}, 'flowchart': {'padding': 16, 'nodeSpacing': 30, 'rankSpacing': 50, 'htmlLabels': true}}}%%
graph TD
    A["<b>aggregate(da)</b>"] --> R["<b>AggregationResult</b>"]
    R --> CR["<b>.clustering</b>"]
    R --> ACC["<b>.accuracy</b>"]
    CR -->|"to_json / from_json"| JSON["📄 clustering.json"]
    JSON -->|"load_clustering()"| CR2["<b>ClusteringResult</b>"]
    CR2 -->|"apply(new_da)"| R2["<b>AggregationResult</b>"]
    CR2 -->|"disaggregate(data)"| D["DataArray<br/><i>full time axis</i>"]
    R -->|"disaggregate(data)"| D

AggregationResult

Returned by aggregate(). Holds the data and metadata of a single aggregation run.

%%{init: {'theme': 'neutral', 'themeVariables': {'fontSize': '13px'}, 'flowchart': {'padding': 12, 'nodeSpacing': 8, 'rankSpacing': 40, 'htmlLabels': true}}}%%
graph LR
    R["<b>AggregationResult</b>"]

    R --- D["<b>Data</b>"]
    R --- Meta["<b>Metadata</b>"]

    D --- F1[".cluster_representatives<br/><i>cluster, timestep, *cluster_dims, *slice_dims</i>"]
    D --- F2[".reconstructed<br/><i>same shape as input</i>"]
    D --- F3[".cluster_assignments<br/><i>period, *slice_dims</i>"]
    D --- F4[".cluster_weights<br/><i>cluster, *slice_dims</i>"]
    D --- F5[".segment_durations<br/><i>cluster, timestep, *slice_dims | None</i>"]

    Meta --- A[".accuracy<br/><b>→ AccuracyMetrics</b>"]
    Meta --- C[".clustering<br/><b>→ ClusteringResult</b>"]
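
How the data fields relate to each other can be sketched with plain Python lists instead of DataArrays. This is only an illustration for one column with no slice_dims; the values are made up, and treating cluster_weights as simple occurrence counts is an assumption, not something this page states.

```python
# Toy stand-in for the AggregationResult data fields (plain lists, one column).
# 4 original periods of 3 timesteps each, grouped into 2 clusters.
cluster_representatives = [
    [1.0, 2.0, 3.0],     # cluster 0: one value per timestep
    [10.0, 20.0, 30.0],  # cluster 1
]
cluster_assignments = [0, 1, 0, 0]  # one cluster index per original period

# Assumption: the weight of a cluster is the number of periods assigned to it.
cluster_weights = [
    cluster_assignments.count(c) for c in range(len(cluster_representatives))
]

# reconstructed: replace every period by its cluster's representative and
# flatten back to the original time axis (same length as the input).
reconstructed = [
    value
    for c in cluster_assignments
    for value in cluster_representatives[c]
]

print(cluster_weights)
print(len(reconstructed))
```

The reconstructed series has one value per original timestep (4 periods × 3 timesteps), which is why the diagram describes it as having the same shape as the input.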

ClusteringResult

The reusable part of a result: it records how the time series was clustered, without holding the original data. Access it via result.clustering or load_clustering("clustering.json").

All DataArray properties are cached on first access.

%%{init: {'theme': 'neutral', 'themeVariables': {'fontSize': '13px'}, 'flowchart': {'padding': 12, 'nodeSpacing': 8, 'rankSpacing': 40, 'htmlLabels': true}}}%%
graph LR
    CR["<b>ClusteringResult</b>"]

    CR --- S["<b>Scalars</b>"]
    CR --- DA["<b>DataArray properties</b>"]
    CR --- M["<b>Methods</b>"]

    S --- S1[".n_clusters"]
    S --- S2[".n_original_periods"]
    S --- S3[".n_timesteps_per_period"]
    S --- S4[".n_segments"]

    DA --- DA1[".cluster_assignments<br/><i>period, *slice_dims</i>"]
    DA --- DA2[".cluster_occurrences<br/><i>cluster, *slice_dims</i>"]
    DA --- DA3[".cluster_centers<br/><i>cluster, *slice_dims</i>"]
    DA --- DA4[".segment_durations<br/><i>cluster, timestep, *slice_dims | None</i>"]
    DA --- DA5[".segment_assignments<br/><i>cluster, timestep, *slice_dims | None</i>"]
    DA --- DA6[".segment_centers<br/><i>cluster, segment, *slice_dims | None</i>"]

    M --- M1[".apply(da)"]
    M --- M2[".disaggregate(data)"]
    M --- M3[".to_json(path)"]
    M --- M4[".from_json(path)"]
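
The save, load, and disaggregate cycle can be illustrated with a small stand-in class. ToyClustering below is hypothetical and is not the library's ClusteringResult; it only mirrors the method names from the diagram, and the JSON layout it writes is an assumption made for this sketch.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class ToyClustering:
    """Hypothetical stand-in for ClusteringResult (not the library's class)."""

    n_timesteps_per_period: int
    cluster_assignments: list  # one cluster index per original period

    def to_json(self, path):
        # Persist only the clustering structure, never the original data.
        with open(path, "w") as f:
            json.dump(asdict(self), f)

    @classmethod
    def from_json(cls, path):
        with open(path) as f:
            return cls(**json.load(f))

    def disaggregate(self, data):
        """Expand per-cluster profiles (a list of lists) to the full time axis."""
        full = []
        for cluster in self.cluster_assignments:
            profile = data[cluster]
            if len(profile) != self.n_timesteps_per_period:
                raise ValueError("profile length must equal n_timesteps_per_period")
            full.extend(profile)
        return full


clustering = ToyClustering(n_timesteps_per_period=2, cluster_assignments=[0, 1, 1])

# Round-trip through a JSON file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "clustering.json")
    clustering.to_json(path)
    restored = ToyClustering.from_json(path)

# Per-cluster data (2 clusters x 2 timesteps) mapped back onto 3 periods.
full = restored.disaggregate([[1.0, 2.0], [5.0, 6.0]])
print(full)
```

Because the stored object contains only assignments and sizes, the restored copy can expand any data laid out per cluster, not just the series it was fitted on.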

AccuracyMetrics

Per-column metrics as DataArrays, plus weighted scalars.

| Field | Type | Description |
| --- | --- | --- |
| rmse | DataArray | Per-column RMSE |
| mae | DataArray | Per-column MAE |
| rmse_duration | DataArray | Per-column duration-curve RMSE |
| weighted_rmse | float | Scalar RMSE weighted by column weights |
| weighted_mae | float | Scalar MAE weighted by column weights |
| weighted_rmse_duration | float | Scalar duration RMSE weighted by column weights |
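
A minimal sketch of what these metrics measure, in plain Python. It assumes that the duration-curve RMSE compares the sorted values of both series and that each weighted scalar is a weighted average of the per-column values; both are assumptions for illustration, and the library's exact formulas may differ. The column names, values, and weights are invented.

```python
import math


def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def mae(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rmse_duration(a, b):
    # Duration curve: values sorted in descending order, timing ignored.
    return rmse(sorted(a, reverse=True), sorted(b, reverse=True))


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


original = {"load": [1.0, 3.0, 2.0, 4.0], "wind": [0.0, 2.0, 0.0, 2.0]}
reconstructed = {"load": [1.0, 3.0, 4.0, 2.0], "wind": [0.0, 0.0, 0.0, 0.0]}
column_weights = {"load": 3.0, "wind": 1.0}  # hypothetical column weights

columns = list(original)
per_column_rmse = {c: rmse(original[c], reconstructed[c]) for c in columns}
per_column_mae = {c: mae(original[c], reconstructed[c]) for c in columns}
per_column_rmse_duration = {
    c: rmse_duration(original[c], reconstructed[c]) for c in columns
}

weights = [column_weights[c] for c in columns]
weighted_rmse = weighted_mean([per_column_rmse[c] for c in columns], weights)
weighted_mae = weighted_mean([per_column_mae[c] for c in columns], weights)
weighted_rmse_duration = weighted_mean(
    [per_column_rmse_duration[c] for c in columns], weights
)
```

In this example the reconstructed "load" column contains the right values at the wrong timesteps, so its RMSE is nonzero while its duration-curve RMSE is zero. That difference is the reason to report both.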

Glossary

| Term | Meaning |
| --- | --- |
| cluster_dims | Dimensions clustered together (stacked internally) |
| slice_dims | Dimensions aggregated independently |
| period | One repeating unit of time (e.g., one day) |
| cluster | A group of similar periods |
| timestep | Position within a period (e.g., hours 0-23 of a day) |
| segment | A contiguous block of timesteps (only when segmentation is used) |
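
The terms period, timestep, and segment can be made concrete with a short sketch. It assumes hourly data and daily periods; the segment durations are invented for illustration and are not produced by any clustering here.

```python
n_timesteps_per_period = 24

# Three days of hourly data; each value is simply its position on the time axis.
hourly = list(range(72))

# Split the flat time axis into periods of 24 timesteps each.
periods = [
    hourly[start:start + n_timesteps_per_period]
    for start in range(0, len(hourly), n_timesteps_per_period)
]

# Any position on the time axis maps to a (period, timestep) pair.
period, timestep = divmod(50, n_timesteps_per_period)

# Segmentation groups contiguous timesteps of a period into blocks.
segment_durations = [6, 12, 6]  # hypothetical durations, summing to 24
segments = []
start = 0
for duration in segment_durations:
    segments.append(list(range(start, start + duration)))
    start += duration

print(len(periods), (period, timestep), [len(s) for s in segments])
```

Clustering then groups similar periods, so each cluster stands for one or more of the rows in `periods`.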