Enrichers

What is the enricher module?

The enricher module is the heart of UrbanMapper’s analysis. Enrichers take your urban layer and your loaded urban data and turn them into meaningful statistics, such as the number of taxi pickups at each intersection or the average building height per neighborhood. For a more hands-on introduction to the module and its usage, we recommend looking through the Enricher examples.

Documentation Under Alpha Construction

This documentation is in its early stages and still being developed. The API may therefore change, and some parts might be incomplete or inaccurate.

Use at your own risk, and please report anything you find that seems incorrect or outdated.

Open An Issue!

`EnricherBase`

Bases: ABC

Base class for all data enrichers in UrbanMapper

This abstract class defines the common interface that all enricher implementations must implement. Enrichers add data or derived information to urban layers, enhancing them with additional attributes, statistics, or related data.

Enrichers typically perform operations like:

Aggregating data values (sum, mean, median, etc.)
Counting features within areas or near points
Computing statistics on related data
Joining external information to the urban layer

Attributes:

Name	Type	Description
`config`		Configuration object for the enricher, containing parameters that control the enrichment process.
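
To make the interface described above concrete, here is a minimal sketch of the same pattern: a public `enrich` method that delegates to an abstract `_enrich` hook. `ToyEnricherBase` and `CountingEnricher` are hypothetical stand-ins written for illustration, and the layer is a plain dictionary rather than a real urban layer; this is not the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ToyEnricherBase(ABC):
    """Hypothetical stand-in for EnricherBase; not the library class."""

    def __init__(self, config: Optional[Dict[str, Any]] = None) -> None:
        # Fall back to an empty configuration when none is supplied.
        self.config = config or {}

    @abstractmethod
    def _enrich(self, records: list, layer: dict) -> dict:
        """Subclasses put the actual enrichment logic here."""
        raise NotImplementedError

    def enrich(self, records: list, layer: dict) -> dict:
        # Public entry point: delegate to the subclass hook.
        return self._enrich(records, layer)


class CountingEnricher(ToyEnricherBase):
    def _enrich(self, records: list, layer: dict) -> dict:
        # Attach the number of input records as a new attribute.
        return {**layer, "record_count": len(records)}


enriched = CountingEnricher().enrich(["a", "b", "c"], {"name": "streets"})
print(enriched)  # {'name': 'streets', 'record_count': 3}
```

Callers only ever use `enrich`, so each subclass can change how enrichment is computed without changing how it is invoked.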

Source code in src/urban_mapper/modules/enricher/abc_enricher.py

@beartype
class EnricherBase(ABC):
    """Base class for all data enrichers in `UrbanMapper`

    This abstract class defines the common interface that all enricher implementations
    must implement. Enrichers add data or derived information to urban layers,
    enhancing them with additional attributes, statistics, or related data.

    !!! note "Enrichers typically perform operations like:"

        - [x] Aggregating data values (sum, mean, median, etc.)
        - [x] Counting features within areas or near points
        - [x] Computing statistics on related data
        - [x] Joining external information to the urban layer

    Attributes:
        config: Configuration object for the enricher, containing parameters
            that control the enrichment process.
    """

    def __init__(self, config: Optional[Any] = None) -> None:
        from urban_mapper.modules.enricher.factory.config import EnricherConfig

        self.config = config or EnricherConfig()

    @abstractmethod
    def _enrich(
        self,
        input_geodataframe: gpd.GeoDataFrame,
        urban_layer: UrbanLayerBase,
        **kwargs,
    ) -> UrbanLayerBase:
        """Internal method to carry out the enrichment.

        This method must be fleshed out by subclasses to define the nitty-gritty
        of how enrichment happens.

        !!! warning "Method Not Implemented"
            Subclasses must implement this. It’s where the logic of enrichment takes place.

        Args:
            input_geodataframe: The GeoDataFrame with data for enrichment.
            urban_layer: The urban layer to be enriched.
            **kwargs: Extra parameters to tweak the enrichment.

        Returns:
            The enriched urban layer.
        """
        raise NotImplementedError("_enrich method not implemented.")

    @abstractmethod
    def preview(self, format: str = "ascii") -> Any:
        """Generate a preview of the enricher instance.

        Produces a summary of the enricher for a quick peek during `UrbanMapper`’s workflow.

        !!! warning "Method Not Implemented"
            Subclasses must implement this to offer a preview of the enricher’s setup and data.

        Args:
            format: Output format for the preview. Options include:

                - [x] `ascii`: Text-based format for terminal display
                - [x] `json`: JSON-formatted data for programmatic use

        Returns:
            A representation of the enricher in the requested format. Type varies by format.

        Raises:
            ValueError: If an unsupported format is requested.
        """
        raise NotImplementedError("Preview method not implemented.")

    def set_layer_data_source(
        self, urban_layer: UrbanLayerBase, index: Index
    ) -> UrbanLayerBase:
        """Initialise the urban layer's `data_id` column with the source name for the rows in `index`.

        Args:
            urban_layer: Urban layer to change.
            index: Index list of the Urban layer to change.

        Returns:
            Urban layer with new column data_id.
        """
        if self.config.data_id:
            if "data_id" not in urban_layer.layer:
                urban_layer.layer["data_id"] = pd.Series(np.nan, dtype="object")

            urban_layer.layer.loc[index, "data_id"] = self.config.data_id

        return urban_layer

    def enrich(
        self,
        input_geodataframe: Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame],
        urban_layer: UrbanLayerBase,
        **kwargs,
    ) -> UrbanLayerBase:
        """Enrich an `urban layer` with data from the input `GeoDataFrame`.

        The main public method for wielding enrichers. It hands off to the
        implementation-specific `_enrich` method after any needed validation.

        Args:
            input_geodataframe: A single `GeoDataFrame`, or a dictionary of
                `GeoDataFrame`s keyed by data ID, with data to enrich with.
            urban_layer: Urban layer to beef up with data from input_geodataframe.
            **kwargs: Additional bespoke parameters to customise enrichment.

        Returns:
            The enriched urban layer sporting new columns or attributes.

        Raises:
            ValueError: If the enrichment can’t be done.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
            >>> taxi_trips = mapper.loader.from_file("taxi_trips.csv")\
            ...     .with_columns(longitude_column="pickup_lng", latitude_column="pickup_lat")\
            ...     .load()
            >>> enricher = mapper.enricher\
            ...     .with_type("SingleAggregatorEnricher")\
            ...     .with_data(group_by="nearest_street")\
            ...     .count_by(output_column="trip_count")\
            ...     .build()
            >>> enriched_streets = enricher.enrich(taxi_trips, streets)
        """
        if isinstance(input_geodataframe, gpd.GeoDataFrame):
            return self._enrich(input_geodataframe, urban_layer, **kwargs)
        else:
            enriched_layer = urban_layer

            for key, gdf in input_geodataframe.items():
                if self.config.data_id is None or self.config.data_id == key:
                    enriched_layer = self._enrich(gdf, enriched_layer, **kwargs)

            return enriched_layer

`_enrich(input_geodataframe, urban_layer, **kwargs)` `abstractmethod`

Internal method to carry out the enrichment.

This method must be fleshed out by subclasses to define the nitty-gritty of how enrichment happens.

Method Not Implemented

Subclasses must implement this. It’s where the logic of enrichment takes place.

Parameters:

Name	Type	Description	Default
`input_geodataframe`	`GeoDataFrame`	The GeoDataFrame with data for enrichment.	required
`urban_layer`	`UrbanLayerBase`	The urban layer to be enriched.	required
`**kwargs`		Extra parameters to tweak the enrichment.	`{}`

Returns:

Type	Description
`UrbanLayerBase`	The enriched urban layer.
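
As a small illustration of what "subclasses must implement this" means in practice, the sketch below uses Python's standard `abc` module. `ToyBase`, `Incomplete`, and `Complete` are hypothetical names used only here. A subclass that leaves the abstract method unimplemented cannot be instantiated at all.

```python
from abc import ABC, abstractmethod


class ToyBase(ABC):
    @abstractmethod
    def _enrich(self, data, layer):
        raise NotImplementedError("_enrich method not implemented.")


class Incomplete(ToyBase):
    pass  # _enrich is deliberately left unimplemented


class Complete(ToyBase):
    def _enrich(self, data, layer):
        # Trivial implementation: return the layer unchanged.
        return layer


try:
    Incomplete()
    instantiated = True
except TypeError:
    # ABC refuses to instantiate classes with abstract methods left over.
    instantiated = False

print(instantiated)  # False
```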

Source code in src/urban_mapper/modules/enricher/abc_enricher.py

@abstractmethod
def _enrich(
    self,
    input_geodataframe: gpd.GeoDataFrame,
    urban_layer: UrbanLayerBase,
    **kwargs,
) -> UrbanLayerBase:
    """Internal method to carry out the enrichment.

    This method must be fleshed out by subclasses to define the nitty-gritty
    of how enrichment happens.

    !!! warning "Method Not Implemented"
        Subclasses must implement this. It’s where the logic of enrichment takes place.

    Args:
        input_geodataframe: The GeoDataFrame with data for enrichment.
        urban_layer: The urban layer to be enriched.
        **kwargs: Extra parameters to tweak the enrichment.

    Returns:
        The enriched urban layer.
    """
    raise NotImplementedError("_enrich method not implemented.")

`enrich(input_geodataframe, urban_layer, **kwargs)`

Enrich an urban layer with data from the input GeoDataFrame.

The main public method for wielding enrichers. It hands off to the implementation-specific _enrich method after any needed validation.

Parameters:

Name	Type	Description	Default
`input_geodataframe`	`Union[Dict[str, GeoDataFrame], GeoDataFrame]`	A single `GeoDataFrame`, or a dictionary of `GeoDataFrame`s keyed by data ID, with data to enrich with.	required
`urban_layer`	`UrbanLayerBase`	Urban layer to beef up with data from input_geodataframe.	required
`**kwargs`		Additional bespoke parameters to customise enrichment.	`{}`

Returns:

Type	Description
`UrbanLayerBase`	The enriched urban layer sporting new columns or attributes.

Raises:

Type	Description
`ValueError`	If the enrichment can’t be done.
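
The way `enrich` treats a single dataset versus a dictionary of datasets can be sketched without any geospatial dependencies. In the sketch below, `dispatch_enrich` and `tag` are hypothetical helpers, lists of strings stand in for `GeoDataFrame`s and the urban layer, and `data_id` plays the role of `config.data_id`. It mirrors the control flow of the method's source listing, not its real data types.

```python
from typing import Callable, Dict, List, Optional, Union

Records = List[str]


def dispatch_enrich(
    data: Union[Records, Dict[str, Records]],
    layer: List[str],
    enrich_one: Callable[[Records, List[str]], List[str]],
    data_id: Optional[str] = None,
) -> List[str]:
    # A single dataset is enriched once.
    if not isinstance(data, dict):
        return enrich_one(data, layer)
    # Several datasets are applied in turn; when data_id is set,
    # only the dataset stored under that key is used.
    enriched = layer
    for key, records in data.items():
        if data_id is None or data_id == key:
            enriched = enrich_one(records, enriched)
    return enriched


def tag(records: Records, layer: List[str]) -> List[str]:
    # Toy enrichment: note how many records were applied.
    return layer + [f"{len(records)} records"]


sources = {"taxi": ["t1", "t2"], "bike": ["b1"]}
all_sources = dispatch_enrich(sources, [], tag)
only_taxi = dispatch_enrich(sources, [], tag, data_id="taxi")
print(all_sources)  # ['2 records', '1 records']
print(only_taxi)    # ['2 records']
```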

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
>>> taxi_trips = mapper.loader.from_file("taxi_trips.csv")\
...     .with_columns(longitude_column="pickup_lng", latitude_column="pickup_lat")\
...     .load()
>>> enricher = mapper.enricher\
...     .with_type("SingleAggregatorEnricher")\
...     .with_data(group_by="nearest_street")\
...     .count_by(output_column="trip_count")\
...     .build()
>>> enriched_streets = enricher.enrich(taxi_trips, streets)

Source code in src/urban_mapper/modules/enricher/abc_enricher.py

def enrich(
    self,
    input_geodataframe: Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame],
    urban_layer: UrbanLayerBase,
    **kwargs,
) -> UrbanLayerBase:
    """Enrich an `urban layer` with data from the input `GeoDataFrame`.

    The main public method for wielding enrichers. It hands off to the
    implementation-specific `_enrich` method after any needed validation.

    Args:
        input_geodataframe: A single `GeoDataFrame`, or a dictionary of
            `GeoDataFrame`s keyed by data ID, with data to enrich with.
        urban_layer: Urban layer to beef up with data from input_geodataframe.
        **kwargs: Additional bespoke parameters to customise enrichment.

    Returns:
        The enriched urban layer sporting new columns or attributes.

    Raises:
        ValueError: If the enrichment can’t be done.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
        >>> taxi_trips = mapper.loader.from_file("taxi_trips.csv")\
        ...     .with_columns(longitude_column="pickup_lng", latitude_column="pickup_lat")\
        ...     .load()
        >>> enricher = mapper.enricher\
        ...     .with_type("SingleAggregatorEnricher")\
        ...     .with_data(group_by="nearest_street")\
        ...     .count_by(output_column="trip_count")\
        ...     .build()
        >>> enriched_streets = enricher.enrich(taxi_trips, streets)
    """
    if isinstance(input_geodataframe, gpd.GeoDataFrame):
        return self._enrich(input_geodataframe, urban_layer, **kwargs)
    else:
        enriched_layer = urban_layer

        for key, gdf in input_geodataframe.items():
            if self.config.data_id is None or self.config.data_id == key:
                enriched_layer = self._enrich(gdf, enriched_layer, **kwargs)

        return enriched_layer

`preview(format='ascii')` `abstractmethod`

Generate a preview of the enricher instance.

Produces a summary of the enricher for a quick peek during UrbanMapper’s workflow.

Method Not Implemented

Subclasses must implement this to offer a preview of the enricher’s setup and data.

Parameters:

Name	Type	Description	Default
`format`	`str`	Output format for the preview. Options include `ascii` (text-based format for terminal display) and `json` (JSON-formatted data for programmatic use).	`'ascii'`

Returns:

Type	Description
`Any`	A representation of the enricher in the requested format. Type varies by format.

Raises:

Type	Description
`ValueError`	If an unsupported format is requested.
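
A subclass might honour this contract roughly as follows. This is a hedged sketch: `build_preview` and the dictionary-based `toy_config` are assumptions made for illustration and do not reflect how UrbanMapper's own `PreviewBuilder` formats its output.

```python
from typing import Any, Dict


def build_preview(config: Dict[str, Any], format: str = "ascii") -> Any:
    if format == "ascii":
        # One "key: value" line per setting, suitable for a terminal.
        return "\n".join(f"{key}: {value}" for key, value in config.items())
    if format == "json":
        # A plain dict that can be serialised or inspected programmatically.
        return dict(config)
    raise ValueError(f"Unsupported format '{format}'.")


toy_config = {"group_by": "nearest_street", "action": "count"}
ascii_preview = build_preview(toy_config)
json_preview = build_preview(toy_config, format="json")
print(ascii_preview)
```

The return type differs by format (a string for `ascii`, a dictionary for `json`), which is why the abstract method is annotated as returning `Any`.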

Source code in src/urban_mapper/modules/enricher/abc_enricher.py

@abstractmethod
def preview(self, format: str = "ascii") -> Any:
    """Generate a preview of the enricher instance.

    Produces a summary of the enricher for a quick peek during `UrbanMapper`’s workflow.

    !!! warning "Method Not Implemented"
        Subclasses must implement this to offer a preview of the enricher’s setup and data.

    Args:
        format: Output format for the preview. Options include:

            - [x] `ascii`: Text-based format for terminal display
            - [x] `json`: JSON-formatted data for programmatic use

    Returns:
        A representation of the enricher in the requested format. Type varies by format.

    Raises:
        ValueError: If an unsupported format is requested.
    """
    raise NotImplementedError("Preview method not implemented.")

`SingleAggregatorEnricher`

Bases: EnricherBase

Enricher Using a Single Aggregator For Urban Layers.

Uses one aggregator to enrich urban layers, adding results as a new column. The aggregator decides how input data is processed (e.g., counted, averaged).

Attributes:

Name	Type	Description
`config`		Config object for the enricher.
`aggregator`		Aggregator computing stats or counts.
`output_column`		Column name for aggregated results.
`debug`		Whether to include debug info.
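
To illustrate what "one aggregator decides how input data is processed" can look like, here is a minimal counting aggregator in plain Python. `count_by_group` is a hypothetical helper and the rows are ordinary dictionaries; the library's aggregators operate on `GeoDataFrame`s instead.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def count_by_group(
    rows: Iterable[Mapping[str, object]], group_by: str
) -> Dict[object, int]:
    # Count how many rows fall into each group.
    return dict(Counter(row[group_by] for row in rows))


trips = [
    {"nearest_street": "A", "fare": 10.0},
    {"nearest_street": "B", "fare": 7.5},
    {"nearest_street": "A", "fare": 12.0},
]
trip_count = count_by_group(trips, "nearest_street")
print(trip_count)  # {'A': 2, 'B': 1}
```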

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
>>> trips = mapper.loader.from_file("trips.csv")\
...     .with_columns(longitude_column="lng", latitude_column="lat")\
...     .load()
>>> enricher = mapper.enricher\
...     .with_data(group_by="nearest_street")\
...     .count_by(output_column="trip_count")\
...     .build()
>>> enriched_streets = enricher.enrich(trips, streets)

Source code in src/urban_mapper/modules/enricher/enrichers/single_aggregator_enricher.py

@beartype
class SingleAggregatorEnricher(EnricherBase):
    """Enricher Using a `Single Aggregator` For `Urban Layers`.

    Uses one aggregator to enrich `urban layers`, adding `results as a new column`.
    The aggregator decides how input data is processed (e.g., `counted`, `averaged`).

    Attributes:
        config: Config object for the enricher.
        aggregator: Aggregator computing stats or counts.
        output_column: Column name for aggregated results.
        debug: Whether to include debug info.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> streets = mapper.urban_layer.OSMNXStreets().from_place("London, UK")
        >>> trips = mapper.loader.from_file("trips.csv")\
        ...     .with_columns(longitude_column="lng", latitude_column="lat")\
        ...     .load()
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="nearest_street")\
        ...     .count_by(output_column="trip_count")\
        ...     .build()
        >>> enriched_streets = enricher.enrich(trips, streets)
    """

    def __init__(
        self,
        aggregator: BaseAggregator,
        output_column: str = "aggregated_value",
        config: EnricherConfig = None,
    ) -> None:
        super().__init__(config)
        self.aggregator = aggregator
        self.output_column = output_column
        # Read from self.config so a missing config falls back to the default.
        self.debug = self.config.debug

    def _enrich(
        self,
        input_geodataframe: gpd.GeoDataFrame,
        urban_layer: UrbanLayerBase,
        **kwargs,
    ) -> UrbanLayerBase:
        """Enrich an `urban layer` with an `aggregator`.

        Aggregates data from the input `GeoDataFrame` and adds it to the urban layer.

        Args:
            input_geodataframe: `GeoDataFrame` with enrichment data.
            urban_layer: Urban layer to enrich.
            **kwargs: Extra params for customisation.

        Returns:
            Enriched urban layer with new columns.

        Raises:
            ValueError: If aggregation fails.
        """
        aggregated_df = self.aggregator.aggregate(input_geodataframe)
        enriched_values = (
            aggregated_df["value"].reindex(urban_layer.layer.index).fillna(0)
        )
        urban_layer = self.set_layer_data_source(urban_layer, aggregated_df.index)
        urban_layer.layer[self.output_column] = enriched_values
        if self.debug:
            indices_values = (
                aggregated_df["indices"]
                .reindex(urban_layer.layer.index)
                .apply(lambda x: x if isinstance(x, list) else [])
            )
            urban_layer.layer[f"DEBUG_{self.output_column}"] = indices_values
        return urban_layer

    def preview(self, format: str = "ascii") -> Any:
        """Generate a preview of this enricher.

        Creates a summary for quick inspection.

        Args:
            format: Output format—"ascii" (text) or "json" (dict).

        Returns:
            Preview in the requested format.
        """
        preview_builder = PreviewBuilder(self.config, ENRICHER_REGISTRY)
        return preview_builder.build_preview(format=format)

`_enrich(input_geodataframe, urban_layer, **kwargs)`

Enrich an urban layer with an aggregator.

Aggregates data from the input GeoDataFrame and adds it to the urban layer.

Parameters:

Name	Type	Description	Default
`input_geodataframe`	`GeoDataFrame`	`GeoDataFrame` with enrichment data.	required
`urban_layer`	`UrbanLayerBase`	Urban layer to enrich.	required
`**kwargs`		Extra params for customisation.	`{}`

Returns:

Type	Description
`UrbanLayerBase`	Enriched urban layer with new columns.

Raises:

Type	Description
`ValueError`	If aggregation fails.
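
The source listing for this method aligns the aggregated values to the layer's index, fills gaps with `0`, and, in debug mode, records which input rows contributed to each feature. The sketch below reproduces that logic with plain dictionaries instead of pandas objects; `align_to_layer`, `align_debug`, and the sample data are hypothetical.

```python
from typing import Dict, List, Sequence


def align_to_layer(
    layer_index: Sequence[str], aggregated: Dict[str, float], fill: float = 0
) -> Dict[str, float]:
    # Every layer feature gets a value; features with no data get `fill`.
    return {key: aggregated.get(key, fill) for key in layer_index}


def align_debug(
    layer_index: Sequence[str], indices: Dict[str, List[int]]
) -> Dict[str, List[int]]:
    # Features with no contributing rows get an empty list.
    aligned = {}
    for key in layer_index:
        value = indices.get(key)
        aligned[key] = value if isinstance(value, list) else []
    return aligned


layer_index = ["street_1", "street_2", "street_3"]
aggregated = {"street_1": 4, "street_3": 1}
source_rows = {"street_1": [0, 2, 5, 7], "street_3": [3]}

values = align_to_layer(layer_index, aggregated)
debug = align_debug(layer_index, source_rows)
print(values)  # {'street_1': 4, 'street_2': 0, 'street_3': 1}
print(debug)   # {'street_1': [0, 2, 5, 7], 'street_2': [], 'street_3': [3]}
```

Filling with `0` suits counts well; for other aggregations, a reader may want to keep in mind that a `0` in the output column can also mean "no data for this feature".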

Source code in src/urban_mapper/modules/enricher/enrichers/single_aggregator_enricher.py

def _enrich(
    self,
    input_geodataframe: gpd.GeoDataFrame,
    urban_layer: UrbanLayerBase,
    **kwargs,
) -> UrbanLayerBase:
    """Enrich an `urban layer` with an `aggregator`.

    Aggregates data from the input `GeoDataFrame` and adds it to the urban layer.

    Args:
        input_geodataframe: `GeoDataFrame` with enrichment data.
        urban_layer: Urban layer to enrich.
        **kwargs: Extra params for customisation.

    Returns:
        Enriched urban layer with new columns.

    Raises:
        ValueError: If aggregation fails.
    """
    aggregated_df = self.aggregator.aggregate(input_geodataframe)
    enriched_values = (
        aggregated_df["value"].reindex(urban_layer.layer.index).fillna(0)
    )
    urban_layer = self.set_layer_data_source(urban_layer, aggregated_df.index)
    urban_layer.layer[self.output_column] = enriched_values
    if self.debug:
        indices_values = (
            aggregated_df["indices"]
            .reindex(urban_layer.layer.index)
            .apply(lambda x: x if isinstance(x, list) else [])
        )
        urban_layer.layer[f"DEBUG_{self.output_column}"] = indices_values
    return urban_layer

`preview(format='ascii')`

Generate a preview of this enricher.

Creates a summary for quick inspection.

Parameters:

Name	Type	Description	Default
`format`	`str`	Output format—"ascii" (text) or "json" (dict).	`'ascii'`

Returns:

Type	Description
`Any`	Preview in the requested format.

Source code in src/urban_mapper/modules/enricher/enrichers/single_aggregator_enricher.py

def preview(self, format: str = "ascii") -> Any:
    """Generate a preview of this enricher.

    Creates a summary for quick inspection.

    Args:
        format: Output format—"ascii" (text) or "json" (dict).

    Returns:
        Preview in the requested format.
    """
    preview_builder = PreviewBuilder(self.config, ENRICHER_REGISTRY)
    return preview_builder.build_preview(format=format)

`EnricherFactory`

Factory Class For Creating and Configuring Data Enrichers.

This class offers a fluent, chaining-methods interface for crafting and setting up data enrichers in the UrbanMapper workflow. Enrichers empower spatial aggregation and analysis on geographic data—like counting points in polygons or tallying stats for regions.

The factory handles the nitty-gritty of enricher instantiation, configuration, and application, ensuring a uniform workflow no matter the enricher type.

Attributes:

Name	Type	Description
`config`		Configuration settings steering the enricher.
`_instance`	`Optional[EnricherBase]`	The underlying enricher instance (internal use only).
`_preview`	`Optional[dict]`	Preview configuration (internal use only).
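
The fluent, chaining style described above comes down to each configuration method returning `self`. Here is a minimal sketch under that assumption: `ToyEnricherFactory` is a hypothetical class that stores its settings in a dictionary and returns that dictionary from `build`, whereas the real factory builds an enricher instance.

```python
from typing import Any, Dict, Optional


class ToyEnricherFactory:
    """Hypothetical fluent builder; not the library's EnricherFactory."""

    def __init__(self) -> None:
        self.config: Dict[str, Any] = {}

    def with_data(
        self, group_by: str, values_from: Optional[str] = None
    ) -> "ToyEnricherFactory":
        self.config["group_by"] = group_by
        self.config["values_from"] = values_from
        return self  # returning self is what enables chaining

    def count_by(self, output_column: str = "count") -> "ToyEnricherFactory":
        self.config["action"] = "count"
        self.config["output_column"] = output_column
        return self

    def build(self) -> Dict[str, Any]:
        # Validate before handing back the finished configuration.
        if "group_by" not in self.config:
            raise ValueError("group_by must be set before build().")
        if "action" not in self.config:
            raise ValueError("An action such as count_by() must be chosen.")
        return dict(self.config)


built = (
    ToyEnricherFactory()
    .with_data(group_by="neighbourhood")
    .count_by(output_column="point_count")
    .build()
)
print(built["action"], built["output_column"])  # count point_count
```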

Examples:

>>> import urban_mapper as um
>>> import geopandas as gpd
>>> mapper = um.UrbanMapper()
>>> hoods = mapper.urban_layer.region_neighborhoods().from_place("London, UK")
>>> points = gpd.read_file("points.geojson")
>>> # Count points per neighbourhood. `with_type` is optional here, as
>>> # "SingleAggregatorEnricher" is the default (and currently only) type.
>>> enriched_hoods = mapper.enricher\
...     .with_type("SingleAggregatorEnricher")\
...     .with_data(group_by="neighbourhood")\
...     .count_by(output_column="point_count")\
...     .build()\
...     .enrich(points, hoods)

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

@beartype
class EnricherFactory:
    """Factory Class For Creating and Configuring Data `Enrichers`.

    This class offers a fluent, chaining-methods interface for crafting and setting up
    data `enrichers` in the `UrbanMapper` workflow. `Enrichers` empower spatial aggregation
    and analysis on geographic data—like counting points in polygons or tallying stats
    for regions.

    The factory handles the nitty-gritty of `enricher` instantiation, `configuration`,
    and `application`, ensuring a uniform workflow no matter the enricher type.

    Attributes:
        config: Configuration settings steering the enricher.
        _instance: The underlying enricher instance (internal use only).
        _preview: Preview configuration (internal use only).

    Examples:
        >>> import urban_mapper as um
        >>> import geopandas as gpd
        >>> mapper = um.UrbanMapper()
        >>> hoods = mapper.urban_layer.region_neighborhoods().from_place("London, UK")
        >>> points = gpd.read_file("points.geojson")
        >>> # Count points per neighbourhood. `with_type` is optional here, as
        >>> # "SingleAggregatorEnricher" is the default (and currently only) type.
        >>> enriched_hoods = mapper.enricher\
        ...     .with_type("SingleAggregatorEnricher")\
        ...     .with_data(group_by="neighbourhood")\
        ...     .count_by(output_column="point_count")\
        ...     .build()\
        ...     .enrich(points, hoods)
    """

    def __init__(self):
        self.config = EnricherConfig()
        self._instance: Optional[EnricherBase] = None
        self._preview: Optional[dict] = None

    def with_data(self, *args, **kwargs) -> "EnricherFactory":
        """Specify columns to group by and values to aggregate.

        Sets up which columns to group data by and, optionally, which to pull
        values from for aggregation during enrichment.

        Args:
            group_by: Column name(s) to group by. Can be a string or list of strings.
            values_from: Column name(s) to aggregate. Optional; if wanted, must be a string.

        Returns:
            The EnricherFactory instance for chaining.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher.with_data(group_by="neighbourhood")
        """
        self.config.with_data(*args, **kwargs)
        return self

    def with_debug(self, debug: bool = True) -> "EnricherFactory":
        """Toggle debug mode for the enricher.

        Enables or disables debug mode, which can spill extra info during enrichment.

        !!! note "What Extra Info?"
            For instance, each enrichment gains an extra column listing which indices of the
            original data were used to compute it. This helps you understand how the enrichment
            was done and debug any issues that arise. It can also serve some machine-learning
            tasks that need this information.

        Args:
            debug: Whether to turn on debug mode (default: True). Pass `False` to switch
                debug mode off in a chain that already calls `.with_debug()`, rather than
                deleting the line.

        Returns:
            The EnricherFactory instance for chaining.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher.with_debug(True)
        """
        self.config.debug = debug
        return self

    def aggregate_by(self, *args, **kwargs) -> "EnricherFactory":
        """Set the enricher to perform aggregation operations.

        Configures the enricher to aggregate data (e.g., `sum`, `mean`) using provided args.

        !!! tip "Available Methods"

            - [x] `sum`
            - [x] `mean`
            - [x] `median`
            - [x] `min`
            - [x] `max`

        Args:
            *args: Positional args for EnricherConfig.aggregate_by.
            **kwargs: Keyword args like `group_by`, `values_from`, `method` (e.g., "sum").

        Returns:
            The EnricherFactory instance for chaining.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher\
            ...     .with_data(group_by="neighbourhood", values_from="temp")\
            ...     .aggregate_by(method="mean", output_column="avg_temp")
        """
        self.config.aggregate_by(*args, **kwargs)
        return self

    def count_by(self, *args, **kwargs) -> "EnricherFactory":
        """Set the enricher to count features.

        Configures the enricher to count items per group—great for tallying points in areas.

        Args:
            *args: Positional args for EnricherConfig.count_by.
            **kwargs: Keyword args like `group_by`, `output_column`.

        Returns:
            The EnricherFactory instance for chaining.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher\
            ...     .with_data(group_by="pickup")\
            ...     .count_by(output_column="pickup_count")
        """
        self.config.count_by(*args, **kwargs)
        return self

    def with_type(self, primitive_type: str) -> "EnricherFactory":
        """Choose the enricher type to create.

        Sets the type of enricher, dictating the enrichment approach, from the registry.

        !!! note "At the moment only one exists"

            - [x] `SingleAggregatorEnricher` (default)

            Hence, there is no need to use `with_type` unless you want a different type in the future.
            It is also kept for compatibility with other modules.

        Args:
            primitive_type: Name of the enricher type (e.g., "SingleAggregatorEnricher").

        Returns:
            The EnricherFactory instance for chaining.

        Raises:
            ValueError: If the type isn’t in the registry.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher.with_type("SingleAggregatorEnricher")
        """
        if primitive_type not in ENRICHER_REGISTRY:
            available = list(ENRICHER_REGISTRY.keys())
            match, score = process.extractOne(primitive_type, available)
            if score > 80:
                suggestion = f" Maybe you meant '{match}'?"
            else:
                suggestion = ""
            raise ValueError(
                f"Unknown enricher type '{primitive_type}'. Available: {', '.join(available)}.{suggestion}"
            )
        self.config.with_type(primitive_type)
        return self

    def preview(self, format: str = "ascii") -> Union[None, str, dict]:
        """Show a preview of the configured enricher.

        Displays a sneak peek of the enricher setup in the chosen format.

        Args:
            format: Preview format—"ascii" (text) or "json" (dict).

        Returns:
            None for "ascii" (prints to console), dict for "json".

        Raises:
            ValueError: If format isn’t supported.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher\
            ...     .with_data(group_by="pickup")\
            ...     .count_by()\
            ...     .build()
            >>> enricher.preview()
        """
        if self._instance is None:
            print("No Enricher instance available to preview.")
            return None
        if hasattr(self._instance, "preview"):
            preview_data = self._instance.preview(format=format)
            if format == "ascii":
                print(preview_data)
            elif format == "json":
                return preview_data
            else:
                raise ValueError(f"Unsupported format '{format}'.")
        else:
            print("Preview not supported for this Enricher instance.")
        return None

    def with_preview(self, format: str = "ascii") -> "EnricherFactory":
        """Set the factory to show a preview after building.

        Configures an automatic preview post-build—handy for a quick check.

        Args:
            format: Preview format—"ascii" (default, text) or "json" (dict).

        Returns:
            The EnricherFactory instance for chaining.

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher\
            ...     .with_data(group_by="pickup")\
            ...     .count_by()\
            ...     .with_preview()
        """
        self._preview = {"format": format}
        return self

    def build(self) -> EnricherBase:
        """Build and return the configured enricher instance.

        Finalises the setup, validates it, and creates the enricher with its aggregator.

        Returns:
            An EnricherBase-derived instance tailored to the factory’s settings.

        Raises:
            ValueError: If config is invalid (e.g., missing params).

        Examples:
            >>> import urban_mapper as um
            >>> mapper = um.UrbanMapper()
            >>> enricher = mapper.enricher\
            ...     .with_type("SingleAggregatorEnricher")\
            ...     .with_data(group_by="pickup")\
            ...     .count_by(output_column="pickup_count")\
            ...     .build()
        """
        validate_group_by(self.config)
        validate_action(self.config)

        if self.config.action == "aggregate":
            method = self.config.aggregator_config["method"]
            if isinstance(method, str):
                if method not in AGGREGATION_FUNCTIONS:
                    raise ValueError(f"Unknown aggregation method '{method}'")
                aggregation_function = AGGREGATION_FUNCTIONS[method]
            elif callable(method):
                aggregation_function = method
            else:
                raise ValueError("Aggregation method must be a string or a callable")
            aggregator = SimpleAggregator(
                group_by_column=self.config.group_by[0],
                value_column=self.config.values_from[0],
                aggregation_function=aggregation_function,
            )
        elif self.config.action == "count":
            aggregator = CountAggregator(
                group_by_column=self.config.group_by[0],
                count_function=len,
            )
        else:
            raise ValueError(
                "Unknown action. Please open an issue on GitHub to request such feature."
            )

        enricher_class = ENRICHER_REGISTRY[self.config.enricher_type]
        self._instance = enricher_class(
            aggregator=aggregator,
            output_column=self.config.enricher_config["output_column"],
            config=self.config,
        )
        if self._preview:
            self.preview(format=self._preview["format"])
        return self._instance

`with_data(*args, **kwargs)` ¶

Specify columns to group by and values to aggregate.

Sets up which columns to group data by and, optionally, which to pull values from for aggregation during enrichment.

Parameters:

Name	Type	Description	Default
`group_by`		Column name(s) to group by. Can be a string or list of strings.	required
`values_from`		Column name to aggregate. Optional; if provided, must be a string.	required

Returns:

Type	Description
`EnricherFactory`	The EnricherFactory instance for chaining.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher.with_data(group_by="neighbourhood")

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def with_data(self, *args, **kwargs) -> "EnricherFactory":
    """Specify columns to group by and values to aggregate.

    Sets up which columns to group data by and, optionally, which to pull
    values from for aggregation during enrichment.

    Args:
        group_by: Column name(s) to group by. Can be a string or list of strings.
        values_from: Column name to aggregate. Optional; if provided, must be a string.

    Returns:
        The EnricherFactory instance for chaining.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher.with_data(group_by="neighbourhood")
    """
    self.config.with_data(*args, **kwargs)
    return self
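
`build()` reads `self.config.group_by[0]`, which suggests the config keeps column names as lists even when a single string is passed. A minimal sketch of that normalisation, assuming a hypothetical helper `as_column_list` that is not part of UrbanMapper:

```python
from typing import List, Optional, Union


def as_column_list(columns: Optional[Union[str, List[str]]]) -> Optional[List[str]]:
    """Hypothetical helper: wrap a single column name in a list, copy lists as-is."""
    if columns is None:
        return None
    if isinstance(columns, str):
        return [columns]
    return list(columns)


group_by = as_column_list("neighbourhood")   # single name becomes a one-item list
values_from = as_column_list(None)           # nothing to aggregate stays None
```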

`with_debug(debug=True)` ¶

Toggle debug mode for the enricher.

Enables or disables debug mode, which can spill extra info during enrichment.

What Extra Info?

For instance, debug mode can add an extra column per enrichment showing which indices of the original data were used to compute it. This helps to understand how the enrichment was done and to debug any issues that arise. It may also be useful for machine-learning tasks that need this information.

Parameters:

Name	Type	Description	Default
`debug`	`bool`	Whether to turn on debug mode. Passing `False` switches debug mode off in a chain without deleting the `.with_debug()` call.	`True`

Returns:

Type	Description
`EnricherFactory`	The EnricherFactory instance for chaining.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher.with_debug(True)

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def with_debug(self, debug: bool = True) -> "EnricherFactory":
    """Toggle debug mode for the enricher.

    Enables or disables debug mode, which can spill extra info during enrichment.

    !!! note "What Extra Info?"
        For instance, debug mode can add an extra column per enrichment showing which indices
        of the original data were used to compute it. This helps to understand how the
        enrichment was done and to debug any issues that arise. It may also be useful
        for machine-learning tasks that need this information.

    Args:
        debug: Whether to turn on debug mode. Passing `False` switches debug mode off
            in a chain without deleting the `.with_debug()` call.

    Returns:
        The EnricherFactory instance for chaining.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher.with_debug(True)
    """
    self.config.debug = debug
    return self
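
The aggregators documented further down return an `indices` column listing the original rows behind each group, which is the kind of information debug mode can surface. A rough pandas-only sketch of that idea, with made-up column names and data; this is not UrbanMapper's internal code:

```python
import pandas as pd

# Toy records: each row is one pickup snapped to a street intersection.
pickups = pd.DataFrame(
    {"intersection": ["A", "B", "A", "C", "A"]},
    index=[100, 101, 102, 103, 104],
)

grouped = pickups.groupby("intersection")

# Count per group, plus the original row labels that produced each count.
summary = pd.DataFrame(
    {
        "pickup_count": grouped.size(),
        "indices": pd.Series(
            {key: rows.tolist() for key, rows in grouped.groups.items()}
        ),
    }
)
```

Reading the `indices` cell for a group tells you exactly which input rows contributed to its count.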

`with_preview(format='ascii')` ¶

Set the factory to show a preview after building.

Configures an automatic preview post-build—handy for a quick check.

Parameters:

Name	Type	Description	Default
`format`	`str`	Preview format—"ascii" (default, text) or "json" (dict).	`'ascii'`

Returns:

Type	Description
`EnricherFactory`	The EnricherFactory instance for chaining.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher\
...     .with_data(group_by="pickup")\
...     .count_by()\
...     .with_preview()

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def with_preview(self, format: str = "ascii") -> "EnricherFactory":
    """Set the factory to show a preview after building.

    Configures an automatic preview post-build—handy for a quick check.

    Args:
        format: Preview format—"ascii" (default, text) or "json" (dict).

    Returns:
        The EnricherFactory instance for chaining.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="pickup")\
        ...     .count_by()\
        ...     .with_preview()
    """
    self._preview = {"format": format}
    return self

`aggregate_by(*args, **kwargs)` ¶

Set the enricher to perform aggregation operations.

Configures the enricher to aggregate data (e.g., sum, mean) using provided args.

Available Methods

sum
mean
median
min
max

Parameters:

Name	Type	Description	Default
`*args`		Positional args for EnricherConfig.aggregate_by.	`()`
`**kwargs`		Keyword args like `group_by`, `values_from`, `method` (e.g., "sum").	`{}`

Returns:

Type	Description
`EnricherFactory`	The EnricherFactory instance for chaining.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher\
...     .with_data(group_by="neighbourhood", values_from="temp")\
...     .aggregate_by(method="mean", output_column="avg_temp")

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def aggregate_by(self, *args, **kwargs) -> "EnricherFactory":
    """Set the enricher to perform aggregation operations.

    Configures the enricher to aggregate data (e.g., `sum`, `mean`) using provided args.

    !!! tip "Available Methods"

        - [x] `sum`
        - [x] `mean`
        - [x] `median`
        - [x] `min`
        - [x] `max`

    Args:
        *args: Positional args for EnricherConfig.aggregate_by.
        **kwargs: Keyword args like `group_by`, `values_from`, `method` (e.g., "sum").

    Returns:
        The EnricherFactory instance for chaining.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="neighbourhood", values_from="temp")\
        ...     .aggregate_by(method="mean", output_column="avg_temp")
    """
    self.config.aggregate_by(*args, **kwargs)
    return self

`count_by(*args, **kwargs)` ¶

Set the enricher to count features.

Configures the enricher to count items per group—great for tallying points in areas.

Parameters:

Name	Type	Description	Default
`*args`		Positional args for EnricherConfig.count_by.	`()`
`**kwargs`		Keyword args like `group_by`, `output_column`.	`{}`

Returns:

Type	Description
`EnricherFactory`	The EnricherFactory instance for chaining.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher\
...     .with_data(group_by="pickup")\
...     .count_by(output_column="pickup_count")

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def count_by(self, *args, **kwargs) -> "EnricherFactory":
    """Set the enricher to count features.

    Configures the enricher to count items per group—great for tallying points in areas.

    Args:
        *args: Positional args for EnricherConfig.count_by.
        **kwargs: Keyword args like `group_by`, `output_column`.

    Returns:
        The EnricherFactory instance for chaining.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="pickup")\
        ...     .count_by(output_column="pickup_count")
    """
    self.config.count_by(*args, **kwargs)
    return self

`with_type(primitive_type)` ¶

Choose the enricher type to create.

Sets the type of enricher, dictating the enrichment approach, from the registry.

At the moment only one exists

SingleAggregatorEnricher (default)

Hence, there is no need to use with_type unless a different type becomes available in the future. It is kept for compatibility with other modules.

Parameters:

Name	Type	Description	Default
`primitive_type`	`str`	Name of the enricher type (e.g., "SingleAggregatorEnricher").	required

Returns:

Type	Description
`EnricherFactory`	The EnricherFactory instance for chaining.

Raises:

Type	Description
`ValueError`	If the type isn’t in the registry.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher.with_type("SingleAggregatorEnricher")

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def with_type(self, primitive_type: str) -> "EnricherFactory":
    """Choose the enricher type to create.

    Sets the type of enricher, dictating the enrichment approach, from the registry.

    !!! note "At the moment only one exists"

        - [x] `SingleAggregatorEnricher` (default)

        Hence, there is no need to use `with_type` unless a different type becomes available
        in the future. It is kept for compatibility with other modules.

    Args:
        primitive_type: Name of the enricher type (e.g., "SingleAggregatorEnricher").

    Returns:
        The EnricherFactory instance for chaining.

    Raises:
        ValueError: If the type isn’t in the registry.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher.with_type("SingleAggregatorEnricher")
    """
    if primitive_type not in ENRICHER_REGISTRY:
        available = list(ENRICHER_REGISTRY.keys())
        match, score = process.extractOne(primitive_type, available)
        if score > 80:
            suggestion = f" Maybe you meant '{match}'?"
        else:
            suggestion = ""
        raise ValueError(
            f"Unknown enricher type '{primitive_type}'. Available: {', '.join(available)}.{suggestion}"
        )
    self.config.with_type(primitive_type)
    return self
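
The unknown-type branch builds its error message from a fuzzy match. The sketch below reproduces the idea with the standard library's `difflib` instead of the `process.extractOne` call used in the source, so the scoring differs; `REGISTRY` and `check_type` are illustrative names only.

```python
import difflib

# Stand-in registry; the real ENRICHER_REGISTRY maps names to enricher classes.
REGISTRY = {"SingleAggregatorEnricher": object}


def check_type(name: str) -> str:
    """Return the name if registered, else raise with a 'did you mean' hint."""
    if name in REGISTRY:
        return name
    available = list(REGISTRY)
    # cutoff=0.8 plays the role of the `score > 80` threshold in the source.
    close = difflib.get_close_matches(name, available, n=1, cutoff=0.8)
    suggestion = f" Maybe you meant '{close[0]}'?" if close else ""
    raise ValueError(
        f"Unknown enricher type '{name}'. Available: {', '.join(available)}.{suggestion}"
    )


try:
    check_type("SingleAgregatorEnricher")  # note the missing "g"
except ValueError as error:
    message = str(error)
```

A name that is far from every registered type gets the plain error without a suggestion.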

`build()` ¶

Build and return the configured enricher instance.

Finalises the setup, validates it, and creates the enricher with its aggregator.

Returns:

Type	Description
`EnricherBase`	An EnricherBase-derived instance tailored to the factory’s settings.

Raises:

Type	Description
`ValueError`	If config is invalid (e.g., missing params).

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher\
...     .with_type("SingleAggregatorEnricher")\
...     .with_data(group_by="pickup")\
...     .count_by(output_column="pickup_count")\
...     .build()

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def build(self) -> EnricherBase:
    """Build and return the configured enricher instance.

    Finalises the setup, validates it, and creates the enricher with its aggregator.

    Returns:
        An EnricherBase-derived instance tailored to the factory’s settings.

    Raises:
        ValueError: If config is invalid (e.g., missing params).

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher\
        ...     .with_type("SingleAggregatorEnricher")\
        ...     .with_data(group_by="pickup")\
        ...     .count_by(output_column="pickup_count")\
        ...     .build()
    """
    validate_group_by(self.config)
    validate_action(self.config)

    if self.config.action == "aggregate":
        method = self.config.aggregator_config["method"]
        if isinstance(method, str):
            if method not in AGGREGATION_FUNCTIONS:
                raise ValueError(f"Unknown aggregation method '{method}'")
            aggregation_function = AGGREGATION_FUNCTIONS[method]
        elif callable(method):
            aggregation_function = method
        else:
            raise ValueError("Aggregation method must be a string or a callable")
        aggregator = SimpleAggregator(
            group_by_column=self.config.group_by[0],
            value_column=self.config.values_from[0],
            aggregation_function=aggregation_function,
        )
    elif self.config.action == "count":
        aggregator = CountAggregator(
            group_by_column=self.config.group_by[0],
            count_function=len,
        )
    else:
        raise ValueError(
            "Unknown action. Please open an issue on GitHub to request such feature."
        )

    enricher_class = ENRICHER_REGISTRY[self.config.enricher_type]
    self._instance = enricher_class(
        aggregator=aggregator,
        output_column=self.config.enricher_config["output_column"],
        config=self.config,
    )
    if self._preview:
        self.preview(format=self._preview["format"])
    return self._instance
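
The `aggregate` branch accepts either a registered name or any callable. A stripped-down sketch of that dispatch, using the standard library's `statistics` functions as stand-ins for the pandas methods in `AGGREGATION_FUNCTIONS`; `FUNCTIONS` and `resolve_method` are hypothetical names.

```python
import statistics
from typing import Callable, Dict, Union

# Stand-in for AGGREGATION_FUNCTIONS, which maps names to pandas Series methods.
FUNCTIONS: Dict[str, Callable] = {
    "mean": statistics.mean,
    "median": statistics.median,
    "sum": sum,
    "min": min,
    "max": max,
}


def resolve_method(method: Union[str, Callable]) -> Callable:
    """Map a method name to its function, or pass a callable straight through."""
    if isinstance(method, str):
        if method not in FUNCTIONS:
            raise ValueError(f"Unknown aggregation method '{method}'")
        return FUNCTIONS[method]
    if callable(method):
        return method
    raise ValueError("Aggregation method must be a string or a callable")


heights = [10, 15, 20]
by_name = resolve_method("mean")(heights)
by_callable = resolve_method(lambda values: max(values) - min(values))(heights)
```

Checking for a string first matters because the lookup and the pass-through need different validation: names must exist in the table, callables are trusted as given.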

`preview(format='ascii')` ¶

Show a preview of the configured enricher.

Displays a sneak peek of the enricher setup in the chosen format.

Parameters:

Name	Type	Description	Default
`format`	`str`	Preview format—"ascii" (text) or "json" (dict).	`'ascii'`

Returns:

Type	Description
`Union[None, str, dict]`	None for "ascii" (prints to console), dict for "json".

Raises:

Type	Description
`ValueError`	If format isn’t supported.

Examples:

>>> import urban_mapper as um
>>> mapper = um.UrbanMapper()
>>> enricher = mapper.enricher\
...     .with_data(group_by="pickup")\
...     .count_by()\
...     .build()
>>> enricher.preview()

Source code in src/urban_mapper/modules/enricher/enricher_factory.py

def preview(self, format: str = "ascii") -> Union[None, str, dict]:
    """Show a preview of the configured enricher.

    Displays a sneak peek of the enricher setup in the chosen format.

    Args:
        format: Preview format—"ascii" (text) or "json" (dict).

    Returns:
        None for "ascii" (prints to console), dict for "json".

    Raises:
        ValueError: If format isn’t supported.

    Examples:
        >>> import urban_mapper as um
        >>> mapper = um.UrbanMapper()
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="pickup")\
        ...     .count_by()\
        ...     .build()
        >>> enricher.preview()
    """
    if self._instance is None:
        print("No Enricher instance available to preview.")
        return None
    if hasattr(self._instance, "preview"):
        preview_data = self._instance.preview(format=format)
        if format == "ascii":
            print(preview_data)
        elif format == "json":
            return preview_data
        else:
            raise ValueError(f"Unsupported format '{format}'.")
    else:
        print("Preview not supported for this Enricher instance.")
    return None
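
The return value depends on the format: `"ascii"` prints and returns `None`, while `"json"` hands the data back to the caller. A small sketch of that contract with a hypothetical `show_preview` function and made-up preview data:

```python
from typing import Optional


def show_preview(preview_data: dict, format: str = "ascii") -> Optional[dict]:
    """Mimic the documented dispatch: print for ascii, return for json."""
    if format == "ascii":
        # Render the dictionary as one "key: value" line per entry.
        print("\n".join(f"{key}: {value}" for key, value in preview_data.items()))
        return None
    if format == "json":
        return preview_data
    raise ValueError(f"Unsupported format '{format}'.")


settings = {"action": "count", "group_by": "pickup", "output_column": "pickup_count"}
printed = show_preview(settings)           # prints three lines, returns None
as_dict = show_preview(settings, "json")   # returns the dict unchanged
```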

`BaseAggregator` ¶

Bases: ABC

Base Class For Data Aggregators.

Where is that used?

Aggregators are used by the enrichers, e.g. SingleAggregatorEnricher. You will rarely need to use them directly; explore them when the chosen enricher primitive needs advanced configuration.

Defines the interface for aggregator implementations, which compute statistics on grouped data. Aggregators take input data, group it by a column, apply a function, and return the results.

To Implement

All concrete aggregators must inherit from this and implement _aggregate.

Examples:

>>> import urban_mapper as um
>>> import pandas as pd
>>> mapper = um.UrbanMapper()
>>> data = pd.DataFrame({
...     "hood": ["A", "A", "B", "B"],
...     "value": [10, 20, 15, 25]
... })
>>> enricher = mapper.enricher\
...     .with_data(group_by="hood", values_from="value")\
...     .aggregate_by(method="mean", output_column="avg_value")\
...     .build()

Source code in src/urban_mapper/modules/enricher/aggregator/abc_aggregator.py

@beartype
class BaseAggregator(ABC):
    """Base Class For Data Aggregators.

    !!! question "Where is that used?"
        Aggregators are used by the enrichers, e.g.
        `SingleAggregatorEnricher`. You will rarely need to use them directly;
        explore them when the chosen enricher primitive needs advanced
        configuration.

    Defines the interface for aggregator implementations, which compute statistics on
    grouped data. Aggregators take `input data`, `group it` by a `column`, `apply` a `function`,
    and `return the results`.

    !!! note "To Implement"
        All concrete aggregators must inherit from this and
        implement `_aggregate`.

    Examples:
        >>> import urban_mapper as um
        >>> import pandas as pd
        >>> mapper = um.UrbanMapper()
        >>> data = pd.DataFrame({
        ...     "hood": ["A", "A", "B", "B"],
        ...     "value": [10, 20, 15, 25]
        ... })
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="hood", values_from="value")\
        ...     .aggregate_by(method="mean", output_column="avg_value")\
        ...     .build()
    """

    @abstractmethod
    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        """Perform the aggregation on the input DataFrame.

        Core method for subclasses to override with specific aggregation logic.

        Args:
            input_dataframe: DataFrame to aggregate.

        Returns:
            DataFrame with at least a 'value' column of aggregated results and
            an 'indices' column of original row indices per group.
        """
        ...

    @require_arguments_not_none(
        "input_dataframe", error_msg="No input dataframe provided.", check_empty=True
    )
    def aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        """Aggregate the input DataFrame.

        Public method to kick off aggregation, validating input before delegating
        to `_aggregate`.

        Args:
            input_dataframe: DataFrame to aggregate. Mustn’t be None or empty.

        Returns:
            DataFrame with aggregation results.

        Raises:
            ValueError: If input_dataframe is None or empty.
        """
        return self._aggregate(input_dataframe)

`_aggregate(input_dataframe)` `abstractmethod` ¶

Perform the aggregation on the input DataFrame.

Core method for subclasses to override with specific aggregation logic.

Parameters:

Name	Type	Description	Default
`input_dataframe`	`DataFrame`	DataFrame to aggregate.	required

Returns:

Type	Description
`DataFrame`	DataFrame with at least a 'value' column of aggregated results and an 'indices' column of original row indices per group.

Source code in src/urban_mapper/modules/enricher/aggregator/abc_aggregator.py

@abstractmethod
def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
    """Perform the aggregation on the input DataFrame.

    Core method for subclasses to override with specific aggregation logic.

    Args:
        input_dataframe: DataFrame to aggregate.

    Returns:
        DataFrame with at least a 'value' column of aggregated results and
        an 'indices' column of original row indices per group.
    """
    ...

`aggregate(input_dataframe)` ¶

Aggregate the input DataFrame.

Public method to kick off aggregation, validating input before delegating to _aggregate.

Parameters:

Name	Type	Description	Default
`input_dataframe`	`DataFrame`	DataFrame to aggregate. Mustn’t be None or empty.	required

Returns:

Type	Description
`DataFrame`	DataFrame with aggregation results.

Raises:

Type	Description
`ValueError`	If input_dataframe is None or empty.

Source code in src/urban_mapper/modules/enricher/aggregator/abc_aggregator.py

@require_arguments_not_none(
    "input_dataframe", error_msg="No input dataframe provided.", check_empty=True
)
def aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
    """Aggregate the input DataFrame.

    Public method to kick off aggregation, validating input before delegating
    to `_aggregate`.

    Args:
        input_dataframe: DataFrame to aggregate. Mustn’t be None or empty.

    Returns:
        DataFrame with aggregation results.

    Raises:
        ValueError: If input_dataframe is None or empty.
    """
    return self._aggregate(input_dataframe)
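
To make the contract concrete, here is a sketch of a custom aggregator. It defines a local stand-in for `BaseAggregator` rather than importing UrbanMapper, and `RangeAggregator` is a hypothetical subclass; only the `_aggregate`/`aggregate` split and the `value` and `indices` columns come from the source above.

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseAggregatorSketch(ABC):
    """Local stand-in mirroring the documented interface."""

    @abstractmethod
    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame: ...

    def aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        # The real class validates through a decorator; this check is a simplification.
        if input_dataframe is None or input_dataframe.empty:
            raise ValueError("No input dataframe provided.")
        return self._aggregate(input_dataframe)


class RangeAggregator(BaseAggregatorSketch):
    """Hypothetical aggregator: max minus min of a value column per group."""

    def __init__(self, group_by_column: str, value_column: str) -> None:
        self.group_by_column = group_by_column
        self.value_column = value_column

    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        grouped = input_dataframe.groupby(self.group_by_column)
        values = grouped[self.value_column]
        value = values.max() - values.min()
        # Original row labels per group, as plain Python lists.
        indices = pd.Series(
            {key: rows.tolist() for key, rows in grouped.groups.items()}
        )
        return pd.DataFrame({"value": value, "indices": indices})


data = pd.DataFrame({"hood": ["A", "A", "B", "B"], "value": [10, 20, 15, 25]})
result = RangeAggregator("hood", "value").aggregate(data)
```

Callers go through the public `aggregate`, so the input check runs once in the base class and every subclass only has to supply the grouping logic.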

Enricher Aggregators Functions For Faster Perusal¶

In a Nutshell, How To Read That

Each entry maps a name to a function that takes a series of values and returns a single value. The ones below are the common ones we ship, mostly relying on pandas.

`AGGREGATION_FUNCTIONS = {'mean': pd.Series.mean, 'sum': pd.Series.sum, 'median': pd.Series.median, 'min': pd.Series.min, 'max': pd.Series.max}` `module-attribute` ¶
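
Each entry is an unbound pandas method, so calling it with a `Series` is the same as calling the method on that series. A quick illustration with made-up values; the dictionary is copied from the attribute above rather than imported:

```python
import pandas as pd

AGGREGATION_FUNCTIONS = {
    "mean": pd.Series.mean,
    "sum": pd.Series.sum,
    "median": pd.Series.median,
    "min": pd.Series.min,
    "max": pd.Series.max,
}

heights = pd.Series([10, 15, 20, 35])

# Apply every registered function to the same series.
results = {name: function(heights) for name, function in AGGREGATION_FUNCTIONS.items()}
```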

`SimpleAggregator` ¶

Bases: BaseAggregator

Aggregator For Standard Stats On Numeric Data.

Applies stats functions (e.g., mean, sum) to values in a column, grouped by another.

Useful for

Useful for scenarios like average height per district or total population per area.

Supports predefined functions in AGGREGATION_FUNCTIONS or custom ones.

How to Use Custom Functions

Simply pass your own function, which receives a series as its parameter, as the aggregation_function argument. Within the factory, this is done through aggregate_by(...) and its method argument.

Attributes:

Name	Type	Description
`group_by_column`		Column to group by.
`value_column`		Column with values to aggregate.
`aggregation_function`		Function to apply to grouped values.

Examples:

>>> import urban_mapper as um
>>> import pandas as pd
>>> mapper = um.UrbanMapper()
>>> data = pd.DataFrame({
...     "district": ["A", "A", "B"],
...     "height": [10, 15, 20]
... })
>>> enricher = mapper.enricher\
...     .with_data(group_by="district", values_from="height")\
...     .aggregate_by(method="mean", output_column="avg_height")\
...     .build()

Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/simple_aggregator.py

@beartype
class SimpleAggregator(BaseAggregator):
    """Aggregator For Standard Stats On Numeric Data.

    Applies stats functions (e.g., `mean`, `sum`) to `values` in a `column`, grouped by another.

    !!! tip "Useful for"
        Useful for scenarios like `average height` per district or `total population` per area.

    Supports predefined functions in `AGGREGATION_FUNCTIONS` or custom ones.

    !!! question "How to Use Custom Functions"
        Simply pass your own function, which receives a series as its parameter, as the
        `aggregation_function` argument. Within the factory, this is done through
        `aggregate_by(...)` and its `method` argument.

    Attributes:
        group_by_column: Column to group by.
        value_column: Column with values to aggregate.
        aggregation_function: Function to apply to grouped values.

    Examples:
        >>> import urban_mapper as um
        >>> import pandas as pd
        >>> mapper = um.UrbanMapper()
        >>> data = pd.DataFrame({
        ...     "district": ["A", "A", "B"],
        ...     "height": [10, 15, 20]
        ... })
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="district", values_from="height")\
        ...     .aggregate_by(method="mean", output_column="avg_height")\
        ...     .build()
    """

    def __init__(
        self,
        group_by_column: str,
        value_column: str,
        aggregation_function: Callable[[pd.Series], float],
    ) -> None:
        self.group_by_column = group_by_column
        self.value_column = value_column
        self.aggregation_function = aggregation_function

    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        """Aggregate data with the aggregation function.

        `Groups the DataFrame`, applies the function to `value_column`, and returns results.

        Args:
            input_dataframe: DataFrame with `group_by_column` and `value_column`.

        Returns:
            DataFrame with 'value' (aggregated values) and 'indices' (row indices).

        Raises:
            KeyError: If required columns are missing.
        """
        grouped = input_dataframe.groupby(self.group_by_column)
        aggregated = grouped[self.value_column].agg(self.aggregation_function)
        indices = grouped.apply(lambda g: list(g.index))
        return pd.DataFrame({"value": aggregated, "indices": indices})

`_aggregate(input_dataframe)` ¶

Aggregate data with the aggregation function.

Groups the DataFrame, applies the function to value_column, and returns results.

Parameters:

Name	Type	Description	Default
`input_dataframe`	`DataFrame`	DataFrame with `group_by_column` and `value_column`.	required

Returns:

Type	Description
`DataFrame`	DataFrame with 'value' (aggregated values) and 'indices' (row indices).

Raises:

Type	Description
`KeyError`	If required columns are missing.

Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/simple_aggregator.py

def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
    """Aggregate data with the aggregation function.

    `Groups the DataFrame`, applies the function to `value_column`, and returns results.

    Args:
        input_dataframe: DataFrame with `group_by_column` and `value_column`.

    Returns:
        DataFrame with 'value' (aggregated values) and 'indices' (row indices).

    Raises:
        KeyError: If required columns are missing.
    """
    grouped = input_dataframe.groupby(self.group_by_column)
    aggregated = grouped[self.value_column].agg(self.aggregation_function)
    indices = grouped.apply(lambda g: list(g.index))
    return pd.DataFrame({"value": aggregated, "indices": indices})
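
A custom function only needs to accept a `Series` and return a single value. The sketch below applies a hypothetical `value_range` function through the same `groupby(...)[...].agg(...)` call that `_aggregate` uses; the data is invented.

```python
import pandas as pd


def value_range(values: pd.Series) -> float:
    """Hypothetical custom aggregation: spread between largest and smallest value."""
    return float(values.max() - values.min())


buildings = pd.DataFrame(
    {"district": ["A", "A", "B", "B", "B"], "height": [10, 15, 20, 50, 30]}
)

# Same grouping-and-aggregating step as SimpleAggregator._aggregate.
height_range = buildings.groupby("district")["height"].agg(value_range)
```

Since `build()` accepts a callable for `method`, the factory equivalent would presumably be `.aggregate_by(method=value_range, output_column=...)`.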

`CountAggregator` ¶

Bases: BaseAggregator

Aggregator For Counting Records In Groups.

Counts records per group, with an optional custom counting function. By default, it uses len() to count all records, but you can tweak it to count specific cases, see below.

Useful for

Counting taxi pickups per area
Tallying incidents per junction
Totting up points of interest per district

Attributes:

Name	Type	Description
`group_by_column`		Column to group data by.
`count_function`		Function to count records in each group (defaults to len).

Examples:

>>> import urban_mapper as um
>>> import pandas as pd
>>> mapper = um.UrbanMapper()
>>> data = pd.DataFrame({
...     "junction": ["A", "A", "B", "B", "C"],
...     "type": ["minor", "major", "minor", "major", "minor"]
... })
>>> enricher = mapper.enricher\
...     .with_data(group_by="junction")\
...     .count_by(output_column="incident_count")\
...     .build()

Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/count_aggregator.py

@beartype
class CountAggregator(BaseAggregator):
    """Aggregator For Counting Records In Groups.

    Counts records per group, with an optional custom counting function. By default,
    it uses `len()` to count all records, but you can tweak it to count specific cases, see below.

    !!! tip "Useful for"

        - [x] Counting taxi pickups per area
        - [x] Tallying incidents per junction
        - [x] Totting up points of interest per district

    Attributes:
        group_by_column: Column to group data by.
        count_function: Function to count records in each group (defaults to len).

    Examples:
        >>> import urban_mapper as um
        >>> import pandas as pd
        >>> mapper = um.UrbanMapper()
        >>> data = pd.DataFrame({
        ...     "junction": ["A", "A", "B", "B", "C"],
        ...     "type": ["minor", "major", "minor", "major", "minor"]
        ... })
        >>> enricher = mapper.enricher\
        ...     .with_data(group_by="junction")\
        ...     .count_by(output_column="incident_count")\
        ...     .build()
    """

    def __init__(
        self,
        group_by_column: str,
        count_function: Callable[[pd.DataFrame], Any] = len,
    ) -> None:
        self.group_by_column = group_by_column
        self.count_function = count_function

    @require_attribute_columns("input_dataframe", ["group_by_column"])
    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
        """Count records per group using the count function.

        Groups the DataFrame by `group_by_column`, applies the count function,
        and returns a DataFrame with counts and indices.

        Args:
            input_dataframe: DataFrame to aggregate, must have `group_by_column`.

        Returns:
            DataFrame with 'value' (counts) and 'indices' (original row indices).

        Raises:
            ValueError: If required column is missing.
        """
        grouped = input_dataframe.groupby(self.group_by_column)
        values = grouped.apply(self.count_function)
        indices = grouped.apply(lambda g: list(g.index))
        return pd.DataFrame({"value": values, "indices": indices})

`_aggregate(input_dataframe)` ¶

Count records per group using the count function.

Groups the DataFrame by group_by_column, applies the count function, and returns a DataFrame with counts and indices.

Parameters:

Name	Type	Description	Default
`input_dataframe`	`DataFrame`	DataFrame to aggregate, must have `group_by_column`.	required

Returns:

Type	Description
`DataFrame`	DataFrame with 'value' (counts) and 'indices' (original row indices).

Raises:

Type	Description
`ValueError`	If required column is missing.

Source code in src/urban_mapper/modules/enricher/aggregator/aggregators/count_aggregator.py

@require_attribute_columns("input_dataframe", ["group_by_column"])
def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.DataFrame:
    """Count records per group using the count function.

    Groups the DataFrame by `group_by_column`, applies the count function,
    and returns a DataFrame with counts and indices.

    Args:
        input_dataframe: DataFrame to aggregate, must have `group_by_column`.

    Returns:
        DataFrame with 'value' (counts) and 'indices' (original row indices).

    Raises:
        ValueError: If required column is missing.
    """
    grouped = input_dataframe.groupby(self.group_by_column)
    values = grouped.apply(self.count_function)
    indices = grouped.apply(lambda g: list(g.index))
    return pd.DataFrame({"value": values, "indices": indices})
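
`count_function` receives each group's DataFrame, so it can count a subset of rows rather than all of them. A sketch with a hypothetical `count_major` function on invented data. Note that the factory's `build()` shown earlier always passes `len`, so a custom function would need `CountAggregator` to be constructed directly.

```python
import pandas as pd


def count_major(group: pd.DataFrame) -> int:
    """Hypothetical count function: only rows whose type is 'major'."""
    return int((group["type"] == "major").sum())


incidents = pd.DataFrame(
    {
        "junction": ["A", "A", "B", "B", "C"],
        "type": ["minor", "major", "major", "major", "minor"],
    }
)

grouped = incidents.groupby("junction")
# Select the column the function needs, then apply it group by group.
major_counts = grouped[["type"]].apply(count_major)
# Default behaviour for comparison: every row in the group is counted.
all_counts = grouped.size()
```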

2025-04-28 2025-08-28 Provost Simon
