Add a Data Variable#
A data variable is a thematic family loaded by Project.load_data
and stored in the input cache. Today seventeen variables ship under
hydromodpy/data/variables/: hydrometry, piezometry,
water_quality, hydrography, oceanic, geology,
dem, precipitation, temperature, etp,
soil_moisture, humidity, recharge, runoff,
radiation, wind, intermittency.
Add a brand-new variable only when the family is genuinely new (a new physical quantity with its own normalisation contract). To add a source to an existing variable, see Add a Data Source instead.
Two manager flavours#
Point variables (gauges, observations) inherit from BaseVariableManager and emit PointRecord objects.
Field variables (rasters, gridded forcing) inherit from BaseFieldManager and emit FieldRecord objects.
Both contracts live under hydromodpy/data/ (base_manager_variable.py,
base_manager_field.py, _base_manager_common.py).
Files to create#
For a new variable called newvar:
hydromodpy/data/variables/newvar/
|-- __init__.py # exports
|-- config.py # NewvarSourceConfig, NewvarConfig
|-- manager.py # NewvarManager(BaseVariableManager|BaseFieldManager)
|-- custom.py # load_custom(source_cfg, project_period)
|-- contracts.py # variable-specific record fields if any
|-- apis/
| |-- __init__.py
| `-- <source_name>.py # one file per public source
`-- README.md # short rationale
Use hydromodpy/data/variables/hydrometry/ as the canonical
template for point variables and hydromodpy/data/variables/dem/ for
field variables.
Manager skeleton#
# hydromodpy/data/variables/newvar/manager.py
from typing import ClassVar

from hydromodpy.data.base_manager_variable import BaseVariableManager
from hydromodpy.data.contracts.records import PointRecord
from hydromodpy.data.contracts.results import LoadResult


class NewvarManager(BaseVariableManager):
    VARIABLE_NAME: ClassVar[str] = "newvar"

    def _fetch_from_source(self, source_cfg, project_period):
        # dispatch on source_cfg.source ("custom", "myapi", ...)
        # return LoadResult(points=[PointRecord(...)], fields=[], warnings=[])
        ...
The base class handles cache lookups, persistence to CSV plus LOC metadata, registration in the catalog, and warning aggregation.
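The division of labour between base class and subclass can be illustrated with a minimal, self-contained sketch. It does not use the real BaseVariableManager: SketchBaseManager, SketchResult, the load method and the in-memory dictionary cache are all hypothetical stand-ins, and the real persistence and catalog steps are left out. It only shows the pattern in which the base class owns the cache lookup and the subclass supplies _fetch_from_source.

```python
from dataclasses import dataclass, field


@dataclass
class SketchResult:
    """Hypothetical stand-in for LoadResult (points and warnings only)."""
    points: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


class SketchBaseManager:
    """Hypothetical base class: owns the cache, delegates the fetch."""
    VARIABLE_NAME = "base"

    def __init__(self):
        self._cache = {}        # in-memory stand-in for the input cache
        self.fetch_calls = 0    # counts how often the subclass hook runs

    def load(self, source_cfg: dict, project_period: tuple) -> SketchResult:
        key = (self.VARIABLE_NAME, source_cfg["source"], project_period)
        if key in self._cache:  # cache hit: the subclass is not called
            return self._cache[key]
        self.fetch_calls += 1
        result = self._fetch_from_source(source_cfg, project_period)
        self._cache[key] = result
        return result

    def _fetch_from_source(self, source_cfg, project_period):
        raise NotImplementedError


class SketchNewvarManager(SketchBaseManager):
    VARIABLE_NAME = "newvar"

    def _fetch_from_source(self, source_cfg, project_period):
        # Dispatch on the source name, as the skeleton's comment suggests.
        if source_cfg["source"] == "custom":
            return SketchResult(points=[{"id": "station-1"}])
        return SketchResult(warnings=[f"unknown source: {source_cfg['source']}"])


manager = SketchNewvarManager()
period = ("2020-01-01", "2020-12-31")
first = manager.load({"source": "custom"}, period)
second = manager.load({"source": "custom"}, period)  # served from the cache
```

In this sketch the second call returns the cached object without running the subclass hook again, which is why a new manager only needs to implement the fetch.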
Config block#
Each variable owns one block under [data.<variable>] with a
sources list:
[data.newvar]
date_start = "2020-01-01"
date_end = "2020-12-31"
[[data.newvar.sources]]
source = "myapi"
product = "qmnj"
extent = "watershed"
The matching Pydantic models:
# hydromodpy/data/variables/newvar/config.py
from typing import Annotated, Literal

from pydantic import BaseModel, ConfigDict, Field

from hydromodpy.core.config_kit.profile import Profile


class NewvarApiSourceConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    source: Literal["myapi"]
    product: Annotated[str, Profile.USER]
    extent: Annotated[str, Profile.USER] = "watershed"


class NewvarCustomSourceConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    source: Literal["custom"]
    path: Annotated[str, Profile.USER]


NewvarSourceConfig = NewvarApiSourceConfig | NewvarCustomSourceConfig


class NewvarConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    date_start: Annotated[str | None, Profile.USER] = None
    date_end: Annotated[str | None, Profile.USER] = None
    sources: list[NewvarSourceConfig] = Field(default_factory=list)
Source registry#
Each public API source registers itself through
hydromodpy/data/sources.py:
# hydromodpy/data/variables/newvar/apis/myapi.py
from hydromodpy.data.sources import register_source


@register_source(variable_type="newvar", source_name="myapi")
class NewvarMyApiSource:
    def fetch(self, source_cfg, context) -> "LoadResult":
        ...
The decorator binds the (variable_type, source_name) pair so
NewvarManager._fetch_from_source can resolve it through
get_source(variable_type, source_name).
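The registration mechanism can be pictured with a small stand-alone sketch of a decorator-based registry. This is not the code in hydromodpy/data/sources.py: the _REGISTRY dictionary, the duplicate check, and the choice to have get_source return the class (rather than an instance) are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple, Type

# Hypothetical stand-in for the registry kept by hydromodpy.data.sources.
_REGISTRY: Dict[Tuple[str, str], Type] = {}


def register_source(variable_type: str, source_name: str) -> Callable[[Type], Type]:
    """Class decorator that records the class under (variable_type, source_name)."""
    def decorator(cls: Type) -> Type:
        key = (variable_type, source_name)
        if key in _REGISTRY:
            raise ValueError(f"source already registered: {key}")
        _REGISTRY[key] = cls
        return cls  # the class itself is returned unchanged
    return decorator


def get_source(variable_type: str, source_name: str) -> Type:
    """Return the class registered for the pair, or raise KeyError."""
    try:
        return _REGISTRY[(variable_type, source_name)]
    except KeyError:
        raise KeyError(
            f"no source {source_name!r} for variable {variable_type!r}"
        ) from None


@register_source(variable_type="newvar", source_name="myapi")
class NewvarMyApiSource:
    def fetch(self, source_cfg, context):
        # A plain dict stands in for LoadResult in this sketch.
        return {"points": [], "fields": [], "warnings": []}


source_cls = get_source("newvar", "myapi")
result = source_cls().fetch(source_cfg={"source": "myapi"}, context=None)
```

Because registration happens when the module is imported, a registry of this kind only knows about sources whose modules have actually been imported, which is one reason to export them from the variable's apis/__init__.py.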
Wire it into the planner#
DataPlanner (hydromodpy/data/planner.py) merges the
[data].types list with rules that infer extra families from other
sections (geology if the domain references "geology", hydrography if
the flow uses "stream" boundary conditions, etc.). If your
variable should be auto-inferred from another config section, add the
rule to the planner.
Otherwise the user activates it explicitly:
[data]
types = ["newvar"]
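The merge of explicit and inferred families can be sketched as follows. This is not the DataPlanner implementation: the function name and the config keys it inspects (flow.boundary, domain.geology) are hypothetical, chosen only to mirror the two example rules mentioned above.

```python
def plan_variable_types(config: dict) -> list[str]:
    """Merge the explicit [data].types list with inferred families."""
    explicit = list(config.get("data", {}).get("types", []))

    inferred = []
    # Hypothetical rule: "stream" boundary conditions need hydrography.
    if config.get("flow", {}).get("boundary") == "stream":
        inferred.append("hydrography")
    # Hypothetical rule: a domain that references geology needs geology data.
    if "geology" in config.get("domain", {}):
        inferred.append("geology")

    # dict.fromkeys keeps first-seen order and drops duplicates.
    return list(dict.fromkeys(explicit + inferred))


planned = plan_variable_types(
    {
        "data": {"types": ["newvar", "hydrography"]},
        "flow": {"boundary": "stream"},
    }
)
```

A rule for a new variable would be one more conditional of the same shape; without such a rule, the variable is loaded only when it appears in types.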
Wire it into HydroModPyConfig#
In hydromodpy/data/data_managers_config.py (or the equivalent
discriminated union), add NewvarConfig so [data.newvar]
parses cleanly.
Tests to add#
Unit tests under tests/unit/data/newvar/ for the config parsers, the custom loader on a fixture file, and the manager’s normalisation contract.
Integration tests under tests/integration/data/ for one Project.load_data cycle using a stub source.
Replay fixtures under hydromodpy/data/examples/ (or a dedicated fixtures folder) for any HTTP-backed source so smoke tests can run offline.
Run hmp data list after a manual cache write to confirm the
catalog row appears.
Pitfalls flagged by the layer matrix#
data may import core and schema only. The data -> spatial tolerance is strictly reserved for the geology field bridge.
Do not import simulation, solver, calibration, results, display, or analysis from a manager.
The data layer must not depend on a workspace; the optional workspace.cache lives separately and is set at runtime.
Records emitted by the manager must be timezone-aware when they carry datetimes (UTC by default).
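For the timezone requirement, a manager can normalise timestamps before building records. The helper below is a minimal sketch using only the standard library; ensure_utc is a hypothetical name, and treating naive timestamps as UTC is an assumption that should be checked against each source's documentation.

```python
from datetime import datetime, timedelta, timezone


def ensure_utc(ts: datetime) -> datetime:
    """Return a timezone-aware UTC datetime.

    Naive input is assumed to already be in UTC (an assumption of this
    sketch); aware input is converted to UTC.
    """
    if ts.tzinfo is None or ts.tzinfo.utcoffset(ts) is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


naive = datetime(2020, 1, 1, 12, 0)
aware_plus_two = datetime(2020, 1, 1, 14, 0, tzinfo=timezone(timedelta(hours=2)))

normalised_naive = ensure_utc(naive)           # tagged as UTC, same clock time
normalised_aware = ensure_utc(aware_plus_two)  # shifted from UTC+2 to UTC
```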
See also#
data for the data package map and the existing variables.
Add a Data Source for adding a new source to an existing variable.
Add a Config Field if the new variable needs a dedicated TOML knob outside the source list.
Data Loading And Retrieval for the user-facing data loading guide.