Enrichers
What is the enricher module?
The enricher module is the heart of UrbanMapper’s analysis: it takes your urban layer and transforms it into meaningful statistics, such as counting taxi pickups at each intersection or averaging building heights per neighborhood, based on your loaded urban data.
In the meantime, we recommend looking through the Enricher examples for a more hands-on introduction to the enricher module and its usage.
Documentation Under Alpha Construction
This documentation is in its early stages and still being developed. The API may therefore change, and some parts might be incomplete or inaccurate.
Use at your own risk, and please report anything you find that seems incorrect or outdated.
EnricherBase
Bases: ABC
Base class for all data enrichers in UrbanMapper
This abstract class defines the common interface that all enricher implementations must implement. Enrichers add data or derived information to urban layers, enhancing them with additional attributes, statistics, or related data.
Enrichers typically perform operations like:
- Aggregating data values (sum, mean, median, etc.)
- Counting features within areas or near points
- Computing statistics on related data
- Joining external information to the urban layer
Attributes:
Name | Type | Description |
---|---|---|
config | | Configuration object for the enricher, containing parameters that control the enrichment process. |
Source code in src/urban_mapper/modules/enricher/abc_enricher.py
_enrich(input_geodataframe, urban_layer, **kwargs)
abstractmethod
Internal method to carry out the enrichment.
This method must be fleshed out by subclasses to define the nitty-gritty of how enrichment happens.
Method Not Implemented
Subclasses must implement this. It’s where the logic of enrichment takes place.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_geodataframe | GeoDataFrame | The GeoDataFrame with data for enrichment. | required |
urban_layer | UrbanLayerBase | The urban layer to be enriched. | required |
**kwargs | | Extra parameters to tweak the enrichment. | {} |
Returns:
Type | Description |
---|---|
UrbanLayerBase | The enriched urban layer. |
Source code in src/urban_mapper/modules/enricher/abc_enricher.py
enrich(input_geodataframe, urban_layer, **kwargs)
Enrich an urban layer with data from the input GeoDataFrame.
The main public method for using enrichers. It hands off to the implementation-specific _enrich method after any needed validation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_geodataframe | Union[Dict[str, GeoDataFrame], GeoDataFrame] | one or more | required |
urban_layer | UrbanLayerBase | Urban layer to beef up with data from input_geodataframe. | required |
**kwargs | | Additional bespoke parameters to customise enrichment. | {} |
Returns:
Type | Description |
---|---|
UrbanLayerBase | The enriched urban layer sporting new columns or attributes. |
Raises:
Type | Description |
---|---|
ValueError | If the enrichment can’t be done. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
>>> taxi_trips = mapper.loader.from_file("taxi_trips.csv") \
...     .with_columns(longitude_column="pickup_lng", latitude_column="pickup_lat") \
...     .load()
>>> enricher = mapper.enricher \
...     .with_type("SingleAggregatorEnricher") \
...     .with_data(group_by="nearest_street") \
...     .count_by(output_column="trip_count") \
...     .build()
>>> enriched_streets = enricher.enrich(taxi_trips, streets)
Source code in src/urban_mapper/modules/enricher/abc_enricher.py
preview(format='ascii')
abstractmethod
Generate a preview of the enricher instance.
Produces a summary of the enricher for a quick peek during UrbanMapper’s workflow.
Method Not Implemented
Subclasses must implement this to offer a preview of the enricher’s setup and data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
format | str | Output format for the preview. Options include: | 'ascii' |
Returns:
Type | Description |
---|---|
Any | A representation of the enricher in the requested format. Type varies by format. |
Raises:
Type | Description |
---|---|
ValueError | If an unsupported format is requested. |
Source code in src/urban_mapper/modules/enricher/abc_enricher.py
SingleAggregatorEnricher
Bases: EnricherBase
Enricher Using a Single Aggregator For Urban Layers.
Uses one aggregator to enrich urban layers, adding results as a new column.
The aggregator decides how input data is processed (e.g., counted, averaged).
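To make the idea concrete, here is a minimal pandas-only sketch of what a single-aggregator enrichment amounts to: group the input records, apply one aggregation, and attach the result to the layer as one new column. It does not use UrbanMapper's classes; the `layer` and `trips` tables and the `street_id` column are made up for illustration, and the real enricher may join results differently.

```python
import pandas as pd

# Hypothetical stand-ins: a "layer" table of streets, and point records that
# have already been mapped to their nearest street.
layer = pd.DataFrame({"street_id": [1, 2, 3]})
trips = pd.DataFrame({"nearest_street": [1, 1, 3], "fare": [10.0, 12.5, 8.0]})

# One aggregator: count the records in each group.
counts = trips.groupby("nearest_street").size()

# Attach the result to the layer as a single new column.
# Streets without any trips have no group, so they are filled with 0.
enriched = layer.copy()
enriched["trip_count"] = enriched["street_id"].map(counts).fillna(0).astype(int)

print(enriched)
```

Swapping `size()` for another aggregation (for example a mean over `fare`) changes what the new column holds without changing the overall shape of the workflow.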
Attributes:
Name | Type | Description |
---|---|---|
config | | Config object for the enricher. |
aggregator | | Aggregator computing stats or counts. |
output_column | | Column name for aggregated results. |
debug | | Whether to include debug info. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
>>> trips = mapper.loader.from_file("trips.csv") \
...     .with_columns(longitude_column="lng", latitude_column="lat") \
...     .load()
>>> enricher = mapper.enricher \
...     .with_data(group_by="nearest_street") \
...     .count_by(output_column="trip_count") \
...     .build()
>>> enriched_streets = enricher.enrich(trips, streets)
Source code in src/urban_mapper/modules/enricher/enrichers/single_aggregator_enricher.py
_enrich(input_geodataframe, urban_layer, **kwargs)
Enrich an urban layer with an aggregator.
Aggregates data from the input GeoDataFrame and adds it to the urban layer.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_geodataframe | GeoDataFrame | | required |
urban_layer | UrbanLayerBase | Urban layer to enrich. | required |
**kwargs | | Extra params for customisation. | {} |
Returns:
Type | Description |
---|---|
UrbanLayerBase | Enriched urban layer with new columns. |
Raises:
Type | Description |
---|---|
ValueError | If aggregation fails. |
Source code in src/urban_mapper/modules/enricher/enrichers/single_aggregator_enricher.py
preview(format='ascii')
Generate a preview of this enricher.
Creates a summary for quick inspection.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
format | str | Output format: "ascii" (text) or "json" (dict). | 'ascii' |
Returns:
Type | Description |
---|---|
Any | Preview in the requested format. |
Source code in src/urban_mapper/modules/enricher/enrichers/single_aggregator_enricher.py
EnricherFactory
Factory Class For Creating and Configuring Data Enrichers.
This class offers a fluent, chaining-methods interface for crafting and setting up data enrichers in the UrbanMapper workflow. Enrichers empower spatial aggregation and analysis on geographic data, like counting points in polygons or tallying stats for regions.
The factory handles the nitty-gritty of enricher instantiation, configuration, and application, ensuring a uniform workflow no matter the enricher type.
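For readers unfamiliar with the fluent pattern, the sketch below shows the general mechanism in plain Python: each configuration method stores a setting and returns the factory itself, and `build()` validates once everything has been collected. `SketchFactory` and `SketchConfig` are hypothetical names invented for this illustration; they are not UrbanMapper's EnricherFactory and only mimic its method names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical settings collected by the factory."""
    group_by: Optional[str] = None
    action: Optional[str] = None
    output_column: Optional[str] = None


class SketchFactory:
    """Hypothetical fluent factory, for illustration only."""

    def __init__(self) -> None:
        self.config = SketchConfig()

    def with_data(self, group_by: str) -> "SketchFactory":
        self.config.group_by = group_by
        return self  # returning self is what makes chaining possible

    def count_by(self, output_column: str = "count") -> "SketchFactory":
        self.config.action = "count"
        self.config.output_column = output_column
        return self

    def build(self) -> SketchConfig:
        # Validate at the end, once every setting has been collected.
        if self.config.group_by is None or self.config.action is None:
            raise ValueError("group_by and an action must be set before build()")
        return self.config


config = (
    SketchFactory()
    .with_data(group_by="pickup")
    .count_by(output_column="pickup_count")
    .build()
)
print(config)
```

Deferring validation to `build()` lets the configuration calls be chained in any order, which is one common reason for choosing this design.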
Attributes:
Name | Type | Description |
---|---|---|
config | | Configuration settings steering the enricher. |
_instance | Optional[EnricherBase] | The underlying enricher instance (internal use only). |
_preview | Optional[dict] | Preview configuration (internal use only). |
Examples:
>>> import urban_mapper as um
>>> import geopandas as gpd
>>> mapper = um.UrbanMapper()
>>> hoods = mapper.urban_layer.region_neighborhoods().from_place("London, UK")
>>> points = gpd.read_file("points.geojson")
>>> # Count points per neighbourhood.
>>> # with_type is optional here: SingleAggregatorEnricher is the default and only type at the moment.
>>> enriched_hoods = mapper.enricher \
...     .with_type("SingleAggregatorEnricher") \
...     .with_data(group_by="neighbourhood") \
...     .count_by(output_column="point_count") \
...     .build() \
...     .enrich(points, hoods)
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
with_data(*args, **kwargs)
Specify columns to group by and values to aggregate.
Sets up which columns to group data by and, optionally, which to pull values from for aggregation during enrichment.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
group_by | | Column name(s) to group by. Can be a string or list of strings. | required |
values_from | | Column name(s) to aggregate. Optional; if provided, must be a string. | required |
Returns:
Type | Description |
---|---|
EnricherFactory | The EnricherFactory instance for chaining. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher.with_data(group_by="neighbourhood")
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
with_debug(debug=True)
Toggle debug mode for the enricher.
Enables or disables debug mode, which can spill extra info during enrichment.
What Extra Info?
For instance, debug mode adds an extra column for each enrichment showing which indices of the original data were used to compute it. This helps you understand how the enrichment was done and debug any issues that arise. It may also be useful for machine-learning tasks that need those indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
debug | bool | Whether to turn on debug mode (default: True). # Such a parameter might be needed when stacking | True |
Returns: The EnricherFactory instance for chaining.
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher.with_debug(True)
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
with_preview(format='ascii')
Set the factory to show a preview after building.
Configures an automatic preview post-build—handy for a quick check.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
format | str | Preview format: "ascii" (default, text) or "json" (dict). | 'ascii' |
Returns:
Type | Description |
---|---|
EnricherFactory | The EnricherFactory instance for chaining. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher \
...     .with_data(group_by="pickup") \
...     .count_by() \
...     .with_preview()
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
aggregate_by(*args, **kwargs)
Set the enricher to perform aggregation operations.
Configures the enricher to aggregate data (e.g., sum, mean) using provided args.
Available Methods
- sum
- mean
- median
- min
- max
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | | Positional args for EnricherConfig.aggregate_by. | () |
**kwargs | | Keyword args like | {} |
Returns:
Type | Description |
---|---|
EnricherFactory | The EnricherFactory instance for chaining. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher \
...     .with_data(group_by="neighbourhood", values_from="temp") \
...     .aggregate_by(method="mean", output_column="avg_temp")
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
count_by(*args, **kwargs)
Set the enricher to count features.
Configures the enricher to count items per group—great for tallying points in areas.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | | Positional args for EnricherConfig.count_by. | () |
**kwargs | | Keyword args like | {} |
Returns:
Type | Description |
---|---|
EnricherFactory | The EnricherFactory instance for chaining. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher \
...     .with_data(group_by="pickup") \
...     .count_by(output_column="pickup_count")
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
with_type(primitive_type)
Choose the enricher type to create.
Sets the type of enricher from the registry, which dictates the enrichment approach.
At the moment only one exists
- SingleAggregatorEnricher (default)
Hence, there is no need to use with_type unless you want to use a different one in the future.
Furthermore, we kept it for compatibility with other modules.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
primitive_type | str | Name of the enricher type (e.g., "SingleAggregatorEnricher"). | required |
Returns:
Type | Description |
---|---|
EnricherFactory | The EnricherFactory instance for chaining. |
Raises:
Type | Description |
---|---|
ValueError | If the type isn’t in the registry. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher.with_type("SingleAggregatorEnricher")
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
build()
Build and return the configured enricher instance.
Finalises the setup, validates it, and creates the enricher with its aggregator.
Returns:
Type | Description |
---|---|
EnricherBase | An EnricherBase-derived instance tailored to the factory’s settings. |
Raises:
Type | Description |
---|---|
ValueError | If config is invalid (e.g., missing params). |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher \
...     .with_type("SingleAggregatorEnricher") \
...     .with_data(group_by="pickup") \
...     .count_by(output_column="pickup_count") \
...     .build()
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
preview(format='ascii')
Show a preview of the configured enricher.
Displays a sneak peek of the enricher setup in the chosen format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
format | str | Preview format: "ascii" (text) or "json" (dict). | 'ascii' |
Returns:
Type | Description |
---|---|
Union[None, str, dict] | None for "ascii" (prints to console), dict for "json". |
Raises:
Type | Description |
---|---|
ValueError | If format isn’t supported. |
Examples:
>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher \
...     .with_data(group_by="pickup") \
...     .count_by() \
...     .build()
>>> enricher.preview()
Source code in src/urban_mapper/modules/enricher/enricher_factory.py
BaseAggregator
Bases: ABC
Base Class For Data Aggregators.
Where is that used?
Aggregators are used throughout the enrichers, e.g. SingleAggregatorEnricher. You are not meant to use this class directly, but to explore it when you need advanced configuration of the chosen enricher primitive.
Defines the interface for aggregator implementations, which crunch stats on grouped data. Aggregators take input data, group it by a column, apply a function, and yield the results.
To Implement
All concrete aggregators must inherit from this and implement _aggregate.
Examples:
>>> import urban_mapper as um
>>> import pandas as pd
>>> mapper = um.UrbanMapper()
>>> data = pd.DataFrame({
... "hood": ["A", "A", "B", "B"],
... "value": [10, 20, 15, 25]
... })
>>> enricher = mapper.enricher \
...     .with_data(group_by="hood", values_from="value") \
...     .aggregate_by(method="mean", output_column="avg_value") \
...     .build()
Source code in src/urban_mapper/modules/enricher/aggregator/abc_aggregator.py
_aggregate(input_dataframe)
abstractmethod
Perform the aggregation on the input DataFrame.
Core method for subclasses to override with specific aggregation logic.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dataframe | DataFrame | DataFrame to aggregate. | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with at least a 'value' column of aggregated results and an 'indices' column of original row indices per group. |
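As a rough illustration of that return shape, the pandas sketch below builds a result with a 'value' column and an 'indices' column from a small table. The sample data, the custom index labels, and the way the two columns are assembled are assumptions made for this example; a concrete aggregator may build its result differently.

```python
import pandas as pd

# Sample data with explicit index labels so the 'indices' column is easy to read.
data = pd.DataFrame(
    {"hood": ["A", "A", "B", "B"], "value": [10, 20, 15, 25]},
    index=[100, 101, 102, 103],
)

grouped = data.groupby("hood")

# 'value': one aggregated number per group (here, the mean).
values = grouped["value"].mean()

# 'indices': the original row labels that contributed to each group.
indices = pd.Series({key: list(idx) for key, idx in grouped.groups.items()})

result = pd.DataFrame({"value": values, "indices": indices})
print(result)
```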
Source code in src/urban_mapper/modules/enricher/aggregator/abc_aggregator.py
aggregate(input_dataframe)
Aggregate the input DataFrame.
Public method to kick off aggregation, validating input before delegating to _aggregate.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dataframe | DataFrame | DataFrame to aggregate. Mustn’t be None or empty. | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with aggregation results. |
Raises:
Type | Description |
---|---|
ValueError | If input_dataframe is None or empty. |
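The split between a public `aggregate` and an abstract `_aggregate` is the classic template-method pattern. A minimal sketch of that pattern follows, assuming only what is stated above (validation for None or empty input, then delegation). `SketchAggregator` and `RowCount` are hypothetical classes written for this example, not UrbanMapper code.

```python
from abc import ABC, abstractmethod

import pandas as pd


class SketchAggregator(ABC):
    """Hypothetical base class mirroring the public/private split."""

    @abstractmethod
    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        """Subclasses put their aggregation logic here."""

    def aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        # Shared validation lives in one place, so subclasses cannot forget it.
        if input_dataframe is None or input_dataframe.empty:
            raise ValueError("input_dataframe must not be None or empty")
        return self._aggregate(input_dataframe)


class RowCount(SketchAggregator):
    """Toy subclass: reports the number of rows as a single value."""

    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        return pd.DataFrame({"value": [len(input_dataframe)]})


counted = RowCount().aggregate(pd.DataFrame({"a": [1, 2]}))
print(counted)
```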
Source code in src/urban_mapper/modules/enricher/aggregator/abc_aggregator.py
Enricher Aggregator Functions For Faster Perusal
In a Nutshell, How To Read That
Each entry is a name followed by a function that takes a list of values and returns a single value.
The ones below are the common ones we deliver, relying mainly on pandas.
AGGREGATION_FUNCTIONS = {'mean': pd.Series.mean, 'sum': pd.Series.sum, 'median': pd.Series.median, 'min': pd.Series.min, 'max': pd.Series.max}
module-attribute
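A brief sketch of how such a name-to-function mapping can be used with a pandas group-by. The dictionary is copied from the definition above; the sample data and the lookup-then-`agg` pattern are illustrative assumptions, not necessarily how the aggregators apply it internally.

```python
import pandas as pd

AGGREGATION_FUNCTIONS = {
    "mean": pd.Series.mean,
    "sum": pd.Series.sum,
    "median": pd.Series.median,
    "min": pd.Series.min,
    "max": pd.Series.max,
}

heights = pd.DataFrame({"district": ["A", "A", "B"], "height": [10, 15, 20]})

# Look the function up by name, then apply it to each group's values.
func = AGGREGATION_FUNCTIONS["mean"]
mean_height = heights.groupby("district")["height"].agg(func)

print(mean_height)
```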
SimpleAggregator
Bases: BaseAggregator
Aggregator For Standard Stats On Numeric Data.
Applies stats functions (e.g., mean, sum) to values in a column, grouped by another.
Useful for
Scenarios like average height per district or total population per area.
Supports predefined functions in AGGREGATION_FUNCTIONS or custom ones.
How to Use Custom Functions
Simply pass your own function, which receives a series as its parameter, via the aggregation_function argument.
Within the factory, this is done through aggregate_by(.) and its method argument.
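As an example of what such a custom function can look like, the sketch below defines one that receives a pandas Series and returns a single number. `value_range` is a hypothetical function invented here, and a plain pandas group-by stands in for the aggregator so the snippet runs on its own.

```python
import pandas as pd


def value_range(series: pd.Series) -> float:
    """Hypothetical custom aggregation: spread between largest and smallest value."""
    return float(series.max() - series.min())


heights = pd.DataFrame({"district": ["A", "A", "B"], "height": [10, 15, 20]})

# Stand-in for the aggregator: apply the custom function to each group's values.
height_range = heights.groupby("district")["height"].agg(value_range)

print(height_range)
```

Any callable with this shape (one Series in, one scalar out) fits the description above.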
Attributes:
Name | Type | Description |
---|---|---|
group_by_column | | Column to group by. |
value_column | | Column with values to aggregate. |
aggregation_function | | Function to apply to grouped values. |
Examples:
>>> import urban_mapper as um
>>> import pandas as pd
>>> mapper = um.UrbanMapper()
>>> data = pd.DataFrame({
... "district": ["A", "A", "B"],
... "height": [10, 15, 20]
... })
>>> enricher = mapper.enricher \
...     .with_data(group_by="district", values_from="height") \
...     .aggregate_by(method="mean", output_column="avg_height") \
...     .build()
Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/simple_aggregator.py
_aggregate(input_dataframe)
Aggregate data with the aggregation function.
Groups the DataFrame, applies the function to value_column, and returns results.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dataframe | DataFrame | DataFrame with | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with 'value' (aggregated values) and 'indices' (row indices). |
Raises:
Type | Description |
---|---|
KeyError | If required columns are missing. |
Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/simple_aggregator.py
CountAggregator
Bases: BaseAggregator
Aggregator For Counting Records In Groups.
Counts records per group, with an optional custom counting function. By default, it uses len() to count all records, but you can tweak it to count specific cases (see below).
Useful for
- Counting taxi pickups per area
- Tallying incidents per junction
- Totting up points of interest per district
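The difference between the default `len` count and a custom counting function can be sketched in plain pandas as follows. It is assumed here, for illustration only, that the counting function receives each group's DataFrame; `count_major` is a hypothetical function, and the sample data reuses the junction example from this page.

```python
import pandas as pd

incidents = pd.DataFrame(
    {
        "junction": ["A", "A", "B", "B", "C"],
        "type": ["minor", "major", "minor", "major", "minor"],
    }
)


def count_major(group: pd.DataFrame) -> int:
    """Hypothetical custom counter: only 'major' incidents are counted."""
    return int((group["type"] == "major").sum())


grouped = incidents.groupby("junction")

# Default behaviour: len() counts every record in the group.
all_counts = {name: len(group) for name, group in grouped}

# Custom behaviour: count only the records matching a condition.
major_counts = {name: count_major(group) for name, group in grouped}

print(all_counts)
print(major_counts)
```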
Attributes:
Name | Type | Description |
---|---|---|
group_by_column | | Column to group data by. |
count_function | | Function to count records in each group (defaults to len). |
Examples:
>>> import urban_mapper as um
>>> import pandas as pd
>>> mapper = um.UrbanMapper()
>>> data = pd.DataFrame({
... "junction": ["A", "A", "B", "B", "C"],
... "type": ["minor", "major", "minor", "major", "minor"]
... })
>>> enricher = mapper.enricher \
...     .with_data(group_by="junction") \
...     .count_by(output_column="incident_count") \
...     .build()
Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/count_aggregator.py
_aggregate(input_dataframe)
Count records per group using the count function.
Groups the DataFrame by group_by_column, applies the count function, and returns a DataFrame with counts and indices.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_dataframe | DataFrame | DataFrame to aggregate, must have | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with 'value' (counts) and 'indices' (original row indices). |
Raises:
Type | Description |
---|---|
ValueError | If required column is missing. |