Sample data
tsam_xarray ships a sample energy dataset with realistic profiles, intended for use in documentation. The cell below loads 30 days of hourly data and plots one region and scenario.
In [1]:
import plotly.io as pio
import xarray_plotly # noqa: F401 — registers .plotly accessor
import tsam_xarray
from tsam_xarray._sample_data import sample_energy_data
pio.renderers.default = "notebook_connected"
da = sample_energy_data(n_days=30)
print(f"Shape: {dict(da.sizes)}")
da.sel(region="north", scenario="low").plotly.line(
x="time", color="variable", title="Input data (north, low)"
)
Shape: {'time': 720, 'variable': 3, 'region': 3, 'scenario': 2}
In [2]:
da.sel(region="north", scenario="low").to_dataframe("value").head()
Out[2]:
| time | variable | region | scenario | value |
|---|---|---|---|---|
| 2020-01-01 00:00:00 | solar | north | low | 0.022417 |
| 2020-01-01 00:00:00 | wind | north | low | 0.997077 |
| 2020-01-01 00:00:00 | demand | north | low | 0.187212 |
| 2020-01-01 01:00:00 | solar | north | low | 0.001815 |
| 2020-01-01 01:00:00 | wind | north | low | 0.982863 |
Aggregate
For a (time, variable) array, cluster_dim is auto-detected; the call below passes it explicitly anyway.
In [3]:
da_simple = da.sel(region="north", scenario="low")
result = tsam_xarray.aggregate(
da_simple,
time_dim="time",
cluster_dim="variable",
n_clusters=4,
)
result.cluster_representatives.to_dataframe("value").head(10)
Out[3]:
| cluster | timestep | variable | value |
|---|---|---|---|
| 0 | 0 | demand | 0.156564 |
| 0 | 0 | solar | 0.000000 |
| 0 | 0 | wind | 0.405613 |
| 0 | 1 | demand | 0.259442 |
| 0 | 1 | solar | 0.002530 |
| 0 | 1 | wind | 0.341894 |
| 0 | 2 | demand | 0.338327 |
| 0 | 2 | solar | 0.012811 |
| 0 | 2 | wind | 0.381879 |
| 0 | 3 | demand | 0.306719 |
In [4]:
result.cluster_representatives.plotly.line(
line_shape="hv",
x="timestep",
facet_col="variable",
color="cluster",
title="Cluster representatives",
)
Inspect results
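The result exposes its outputs as xarray objects (for example cluster_weights and accuracy.rmse), alongside scalar metadata such as n_clusters and n_timesteps_per_period.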
In [5]:
print(f"Clusters: {result.n_clusters}")
print(f"Timesteps per period: {result.n_timesteps_per_period}")
print("Cluster weights (days each represents):")
result.cluster_weights.to_dataframe("weight")
Clusters: 4
Timesteps per period: 24
Cluster weights (days each represents):
Out[5]:
| cluster | weight |
|---|---|
| 0 | 13 |
| 1 | 8 |
| 2 | 5 |
| 3 | 4 |
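The weights in the table above add up to the 30 input days, which is what allows a handful of representative days to stand in for the full series. The following is a minimal, library-free sketch of that bookkeeping. The weights are copied from the table; the per-cluster daily energy values are hypothetical and serve only to illustrate a weighted total.

```python
# Weights copied from the table above: cluster label -> days represented.
cluster_weights = {0: 13, 1: 8, 2: 5, 3: 4}

# Every original day belongs to exactly one cluster, so the weights
# add up to the number of days in the input (n_days=30 here).
n_days = sum(cluster_weights.values())

# Hypothetical daily energy per representative day (illustrative numbers only,
# not taken from the sample data).
daily_energy = {0: 4.2, 1: 5.1, 2: 3.7, 3: 6.0}

# Scale each representative day by the number of days it stands for.
total_energy = sum(cluster_weights[c] * daily_energy[c] for c in cluster_weights)

print(f"Days covered: {n_days}")
print(f"Weighted total: {total_energy:.1f}")
```

Any quantity computed per representative day can be scaled to the full horizon in the same way.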
In [6]:
result.accuracy.rmse.to_dataframe("RMSE")
Out[6]:
| variable | RMSE |
|---|---|
| demand | 0.073805 |
| solar | 0.112657 |
| wind | 0.156366 |
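For readers unfamiliar with the metric, here is a minimal sketch of the generic root-mean-square error between an original and a reconstructed series. It shows the textbook definition only; whether tsam_xarray normalizes the data before computing result.accuracy.rmse is not stated here, so the numbers will not necessarily match the table. The helper name rmse and the sample values are hypothetical.

```python
import math


def rmse(original, reconstructed):
    """Root-mean-square error between two equally long sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    squared_errors = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values, not taken from the sample data.
original = [0.0, 0.2, 0.6, 0.9]
reconstructed = [0.0, 0.3, 0.5, 0.7]

error = rmse(original, reconstructed)
print(f"RMSE: {error:.4f}")
```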
Reconstructed vs original
In [7]:
import xarray as xr
comparison = xr.concat(
[da_simple.sel(variable="solar"), result.reconstructed.sel(variable="solar")],
dim="source",
).assign_coords(source=["original", "reconstructed"])
comparison.plotly.line(
x="time", color="source", title="Original vs reconstructed (solar)"
)
In [8]:
result.residuals.plotly.line(x="time", facet_col="variable", title="Residuals")
Passing tsam parameters
Any keyword argument accepted by tsam.aggregate() is passed through. The example below selects k-means clustering via ClusterConfig.
In [9]:
from tsam import ClusterConfig
result_km = tsam_xarray.aggregate(
da_simple,
time_dim="time",
cluster_dim="variable",
n_clusters=4,
cluster=ClusterConfig(method="kmeans"),
)
result_km.accuracy.rmse.to_dataframe("RMSE")
Out[9]:
| variable | RMSE |
|---|---|
| demand | 0.055077 |
| solar | 0.080815 |
| wind | 0.127189 |
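To compare the two runs at a glance, the RMSE values from the two tables in this notebook can be placed side by side. The sketch below is plain arithmetic on those printed numbers. It describes this single 30-day sample only and should not be read as a general statement about either clustering method; the dictionary names are hypothetical.

```python
# RMSE values copied from the two result tables in this notebook.
rmse_default = {"demand": 0.073805, "solar": 0.112657, "wind": 0.156366}
rmse_kmeans = {"demand": 0.055077, "solar": 0.080815, "wind": 0.127189}

# Relative RMSE reduction of the k-means run versus the first run, per variable.
relative_reduction = {
    name: 1 - rmse_kmeans[name] / rmse_default[name] for name in rmse_default
}

for name, reduction in relative_reduction.items():
    print(f"{name}: RMSE lower by {reduction:.1%} with kmeans")
```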
Disaggregate
disaggregate() is the inverse of aggregate(): it maps any data defined on the (cluster, timestep) grid back to the original time axis.
In [10]:
disaggregated = result.disaggregate(result.cluster_representatives)
comparison = xr.concat(
[da_simple.sel(variable="solar"), disaggregated.sel(variable="solar")],
dim="source",
).assign_coords(source=["original", "disaggregated"])
comparison.plotly.line(
x="time", color="source", title="Disaggregated vs original (solar)"
)
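Conceptually, mapping from the (cluster, timestep) grid back to the time axis is a lookup: each original period is replaced by the profile of the cluster it was assigned to. The following library-free sketch illustrates that idea under stated assumptions. The function expand_to_time_axis, the assignment list and the profile values are all hypothetical and are not part of the tsam_xarray API; the library's own implementation may differ.

```python
def expand_to_time_axis(representatives, cluster_assignments):
    """Expand per-cluster profiles back onto the original period order.

    representatives: dict mapping cluster label -> list of values,
        one value per timestep within a period.
    cluster_assignments: list with the cluster label of each original
        period, in chronological order.
    """
    series = []
    for cluster in cluster_assignments:
        # Each original period is filled with its cluster's profile.
        series.extend(representatives[cluster])
    return series


# Hypothetical example: 2 clusters, 3 timesteps per period, 4 original periods.
representatives = {0: [0.1, 0.5, 0.2], 1: [0.0, 0.9, 0.4]}
cluster_assignments = [0, 1, 1, 0]

restored = expand_to_time_axis(representatives, cluster_assignments)
print(restored)
```

Because periods in the same cluster receive identical profiles, the restored series repeats itself wherever the assignment repeats, which matches the stepwise look of the disaggregated line in the plot above.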