
Filters

What is the Filter module?

The Filter module subsets geospatial datasets to the records that meet specific criteria or conditions derived from your urban layer.

In the meantime, we recommend looking through the Filter example for a more hands-on introduction to the Filter module and its usage.

Documentation Under Alpha Construction

This documentation is in its early stages and still being developed. The API may therefore change, and some parts might be incomplete or inaccurate.

Use at your own risk, and please report anything that seems incorrect or outdated.

Open An Issue!

GeoFilterBase

Bases: ABC

Base class for all spatial filters in UrbanMapper

This abstract class defines the common interface that all filter implementations must follow. Filters are used to subset or filter GeoDataFrames based on spatial criteria derived from an urban layer.

Note

This is an abstract class and cannot be instantiated directly. Use concrete implementations like BoundingBoxFilter instead.
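
The relationship between the public `transform()` and the abstract `_transform()` can be sketched without any geospatial dependency. In this minimal sketch, plain lists of `(x, y)` tuples stand in for GeoDataFrames, and `SketchFilterBase` and `KeepPositiveX` are hypothetical names used only for illustration; they are not part of UrbanMapper.

```python
from abc import ABC, abstractmethod


class SketchFilterBase(ABC):
    """Hypothetical stand-in for GeoFilterBase (not the UrbanMapper class)."""

    def transform(self, points, layer):
        # Public entry point: validate, then delegate to the subclass hook.
        if points is None or layer is None:
            raise ValueError("Inputs cannot be None.")
        return self._transform(points, layer)

    @abstractmethod
    def _transform(self, points, layer):
        """Subclasses put their filtering logic here."""


class KeepPositiveX(SketchFilterBase):
    """Hypothetical concrete filter: keeps points whose x is positive."""

    def _transform(self, points, layer):
        return [p for p in points if p[0] > 0]


try:
    SketchFilterBase()  # abstract classes cannot be instantiated
    instantiated = True
except TypeError:
    instantiated = False

kept = KeepPositiveX().transform([(-1, 0), (2, 5)], layer="unused")
```

Callers only ever use `transform()`, so validation lives in one place and each concrete filter implements nothing but its own selection rule.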

Source code in src/urban_mapper/modules/filter/abc_filter.py
@beartype
class GeoFilterBase(ABC):
    """Base class for all spatial filters in `UrbanMapper`

    This abstract class defines the common interface that all filter implementations
    must follow. Filters are used to subset or filter `GeoDataFrames` based on spatial
    criteria derived from an `urban layer`.

    !!! note
        This is an abstract class and cannot be instantiated directly. Use concrete
        implementations like `BoundingBoxFilter` instead.
    """

    def __init__(
        self,
        data_id: Optional[str] = None,
        **kwargs,
    ) -> None:
        self.data_id = data_id

    @abstractmethod
    def _transform(
        self, input_geodataframe: gpd.GeoDataFrame, urban_layer: UrbanLayerBase
    ) -> gpd.GeoDataFrame:
        """Internal implementation method for filtering a `GeoDataFrame`

        Called by `transform()` after input validation. Subclasses must override this
        method to implement specific filtering logic.

        !!! note "To be implemented by subclasses"
            This method should contain the core logic for filtering data given the
            `urban_layer`. It should be implemented in subclasses to handle the
            specific filtering task (e.g., bounding box, polygonal area) and return the
            modified `GeoDataFrame`.

        !!! question "Usefulness of Filters?"
            Filters are essential for narrowing down large datasets to only the records
            relevant to a specific analysis or study area. Imagine an analysis of
            `Downtown Brooklyn` when your dataset has data points all over `New York City` and its boroughs.
            In this case, you can use a filter to subset the data to only the points within `Downtown Brooklyn`.

        Args:
            input_geodataframe (gpd.GeoDataFrame): The `GeoDataFrame` to filter.
            urban_layer (UrbanLayerBase): The `urban layer` providing spatial filtering criteria.

        Returns:
            gpd.GeoDataFrame: A filtered `GeoDataFrame` containing only rows meeting the criteria.

        Raises:
            ValueError: If the filtering operation cannot be performed due to invalid inputs.
        """
        ...

    @require_arguments_not_none(
        "input_geodataframe", error_msg="Input GeoDataFrame cannot be None."
    )
    @require_arguments_not_none("urban_layer", error_msg="Urban layer cannot be None.")
    def transform(
        self,
        input_geodataframe: Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame],
        urban_layer: UrbanLayerBase,
    ) -> Union[
        Dict[str, gpd.GeoDataFrame],
        gpd.GeoDataFrame,
    ]:
        """Filter a `GeoDataFrame` based on spatial criteria from an `urban layer`

        The primary public method for applying filters. It validates inputs and delegates
        to the subclass-specific `_transform()` method.

        Args:
            input_geodataframe (Union[Dict[str, GeoDataFrame], GeoDataFrame]): one or more `GeoDataFrame` to filter.
            urban_layer (UrbanLayerBase): The `urban layer` providing spatial filtering criteria.

        Returns:
            Union[Dict[str, GeoDataFrame], GeoDataFrame]: one or more filtered `GeoDataFrame` containing only rows meeting the criteria.

        Raises:
            ValueError: If input_geodataframe or urban_layer is None.
            ValueError: If the filtering operation fails.

        Examples:
            >>> from urban_mapper.modules.filter import BoundingBoxFilter
            >>> from urban_mapper.modules.urban_layer import OSMNXStreets
            >>> streets_layer = OSMNXStreets().from_place("Manhattan, New York")
            >>> bbox_filter = BoundingBoxFilter()
            >>> filtered_data = bbox_filter.transform(taxi_trips, streets_layer)
            >>> filtered_data.head()
            >>> # 👆This would show only data within the bounding box of the streets layer, i.e. `Manhattan, New York`.
        """

        if isinstance(input_geodataframe, gpd.GeoDataFrame):
            return self._transform(input_geodataframe, urban_layer)
        else:
            return {
                key: self._transform(gdf, urban_layer)
                if self.data_id is None or self.data_id == key
                else gdf
                for key, gdf in input_geodataframe.items()
            }

    @abstractmethod
    def preview(self, format: str = "ascii") -> Any:
        """Generate a preview of the filter's configuration.

        Provides a summary of the filter for inspection.

        Args:
            format (str): The output format. Options are:

                - [x] "ascii": Text-based format for terminal display.
                - [x] "json": JSON-formatted data for programmatic use.

                Defaults to "ascii".

        Returns:
            Any: A representation of the filter in the requested format (e.g., str or dict).

        Raises:
            ValueError: If an unsupported format is specified.

        !!! warning "Abstract Method"
            Subclasses must implement this method to provide configuration details.
        """
        pass

_transform(input_geodataframe, urban_layer) abstractmethod

Internal implementation method for filtering a GeoDataFrame

Called by transform() after input validation. Subclasses must override this method to implement specific filtering logic.

To be implemented by subclasses

This method should contain the core logic for filtering data given the urban_layer. It should be implemented in subclasses to handle the specific filtering task (e.g., bounding box, polygonal area) and return the modified GeoDataFrame.

Usefulness of Filters?

Filters are essential for narrowing down large datasets to only the records relevant to a specific analysis or study area. Imagine an analysis of Downtown Brooklyn when your dataset has data points all over New York City and its boroughs. In this case, you can use a filter to subset the data to only the points within Downtown Brooklyn.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_geodataframe | GeoDataFrame | The GeoDataFrame to filter. | required |
| urban_layer | UrbanLayerBase | The urban layer providing spatial filtering criteria. | required |

Returns:

| Type | Description |
|---|---|
| GeoDataFrame | A filtered GeoDataFrame containing only rows meeting the criteria. |

Raises:

| Type | Description |
|---|---|
| ValueError | If the filtering operation cannot be performed due to invalid inputs. |

Source code in src/urban_mapper/modules/filter/abc_filter.py
@abstractmethod
def _transform(
    self, input_geodataframe: gpd.GeoDataFrame, urban_layer: UrbanLayerBase
) -> gpd.GeoDataFrame:
    """Internal implementation method for filtering a `GeoDataFrame`

    Called by `transform()` after input validation. Subclasses must override this
    method to implement specific filtering logic.

    !!! note "To be implemented by subclasses"
        This method should contain the core logic for filtering data given the
        `urban_layer`. It should be implemented in subclasses to handle the
        specific filtering task (e.g., bounding box, polygonal area) and return the
        modified `GeoDataFrame`.

    !!! question "Usefulness of Filters?"
        Filters are essential for narrowing down large datasets to only the records
        relevant to a specific analysis or study area. Imagine an analysis of
        `Downtown Brooklyn` when your dataset has data points all over `New York City` and its boroughs.
        In this case, you can use a filter to subset the data to only the points within `Downtown Brooklyn`.

    Args:
        input_geodataframe (gpd.GeoDataFrame): The `GeoDataFrame` to filter.
        urban_layer (UrbanLayerBase): The `urban layer` providing spatial filtering criteria.

    Returns:
        gpd.GeoDataFrame: A filtered `GeoDataFrame` containing only rows meeting the criteria.

    Raises:
        ValueError: If the filtering operation cannot be performed due to invalid inputs.
    """
    ...

transform(input_geodataframe, urban_layer)

Filter a GeoDataFrame based on spatial criteria from an urban layer

The primary public method for applying filters. It validates inputs and delegates to the subclass-specific _transform() method.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_geodataframe | Union[Dict[str, GeoDataFrame], GeoDataFrame] | One or more GeoDataFrames to filter. | required |
| urban_layer | UrbanLayerBase | The urban layer providing spatial filtering criteria. | required |

Returns:

| Type | Description |
|---|---|
| Union[Dict[str, GeoDataFrame], GeoDataFrame] | One or more filtered GeoDataFrames containing only rows meeting the criteria. |

Raises:

| Type | Description |
|---|---|
| ValueError | If input_geodataframe or urban_layer is None. |
| ValueError | If the filtering operation fails. |

Examples:

>>> from urban_mapper.modules.filter import BoundingBoxFilter
>>> from urban_mapper.modules.urban_layer import OSMNXStreets
>>> streets_layer = OSMNXStreets().from_place("Manhattan, New York")
>>> bbox_filter = BoundingBoxFilter()
>>> filtered_data = bbox_filter.transform(taxi_trips, streets_layer)
>>> filtered_data.head()
>>> # 👆This would show only data within the bounding box of the streets layer, i.e. `Manhattan, New York`.
Source code in src/urban_mapper/modules/filter/abc_filter.py
@require_arguments_not_none(
    "input_geodataframe", error_msg="Input GeoDataFrame cannot be None."
)
@require_arguments_not_none("urban_layer", error_msg="Urban layer cannot be None.")
def transform(
    self,
    input_geodataframe: Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame],
    urban_layer: UrbanLayerBase,
) -> Union[
    Dict[str, gpd.GeoDataFrame],
    gpd.GeoDataFrame,
]:
    """Filter a `GeoDataFrame` based on spatial criteria from an `urban layer`

    The primary public method for applying filters. It validates inputs and delegates
    to the subclass-specific `_transform()` method.

    Args:
        input_geodataframe (Union[Dict[str, GeoDataFrame], GeoDataFrame]): one or more `GeoDataFrame` to filter.
        urban_layer (UrbanLayerBase): The `urban layer` providing spatial filtering criteria.

    Returns:
        Union[Dict[str, GeoDataFrame], GeoDataFrame]: one or more filtered `GeoDataFrame` containing only rows meeting the criteria.

    Raises:
        ValueError: If input_geodataframe or urban_layer is None.
        ValueError: If the filtering operation fails.

    Examples:
        >>> from urban_mapper.modules.filter import BoundingBoxFilter
        >>> from urban_mapper.modules.urban_layer import OSMNXStreets
        >>> streets_layer = OSMNXStreets().from_place("Manhattan, New York")
        >>> bbox_filter = BoundingBoxFilter()
        >>> filtered_data = bbox_filter.transform(taxi_trips, streets_layer)
        >>> filtered_data.head()
        >>> # 👆This would show only data within the bounding box of the streets layer, i.e. `Manhattan, New York`.
    """

    if isinstance(input_geodataframe, gpd.GeoDataFrame):
        return self._transform(input_geodataframe, urban_layer)
    else:
        return {
            key: self._transform(gdf, urban_layer)
            if self.data_id is None or self.data_id == key
            else gdf
            for key, gdf in input_geodataframe.items()
        }
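
When `input_geodataframe` is a dictionary, the comprehension above filters only the entry whose key equals `data_id` (or every entry when `data_id` is `None`) and passes the others through unchanged. Here is a minimal sketch of that dispatch, with lists of numbers standing in for GeoDataFrames; `dispatch`, `apply_filter`, and `keep_even` are illustrative names, not UrbanMapper API.

```python
def dispatch(datasets, data_id, apply_filter):
    """Mirror the dict branch of transform() on plain Python values."""
    return {
        key: apply_filter(values) if data_id is None or data_id == key else values
        for key, values in datasets.items()
    }


def keep_even(values):
    # Stand-in for a real spatial filter.
    return [v for v in values if v % 2 == 0]


datasets = {"taxi": [1, 2, 3, 4], "crashes": [5, 6, 7]}

only_taxi = dispatch(datasets, "taxi", keep_even)      # filter one entry
everything = dispatch(datasets, None, keep_even)       # filter every entry
no_match = dispatch(datasets, "missing", keep_even)    # nothing is filtered
```

A `data_id` that matches no key leaves every dataset untouched, so a misspelled ID fails silently at this level.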

preview(format='ascii') abstractmethod

Generate a preview of the filter's configuration.

Provides a summary of the filter for inspection.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| format | str | The output format. Options are "ascii" (text-based format for terminal display) and "json" (JSON-formatted data for programmatic use). Defaults to "ascii". | 'ascii' |

Returns:

| Type | Description |
|---|---|
| Any | A representation of the filter in the requested format (e.g., str or dict). |

Raises:

| Type | Description |
|---|---|
| ValueError | If an unsupported format is specified. |

Abstract Method

Subclasses must implement this method to provide configuration details.
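
A subclass typically branches on `format` and rejects anything else. The sketch below shows one possible shape for such an implementation; `SketchFilter` is a hypothetical class, and the fields a real filter reports are up to its author.

```python
class SketchFilter:
    """Hypothetical filter used only to illustrate preview()."""

    def preview(self, format="ascii"):
        if format == "ascii":
            # Human-readable text for a terminal.
            return "Filter: SketchFilter"
        if format == "json":
            # A plain dict that callers can serialise themselves.
            return {"filter": "SketchFilter"}
        raise ValueError(f"Unsupported format '{format}'")


ascii_preview = SketchFilter().preview()
json_preview = SketchFilter().preview(format="json")

try:
    SketchFilter().preview(format="yaml")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```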

Source code in src/urban_mapper/modules/filter/abc_filter.py
@abstractmethod
def preview(self, format: str = "ascii") -> Any:
    """Generate a preview of the filter's configuration.

    Provides a summary of the filter for inspection.

    Args:
        format (str): The output format. Options are:

            - [x] "ascii": Text-based format for terminal display.
            - [x] "json": JSON-formatted data for programmatic use.

            Defaults to "ascii".

    Returns:
        Any: A representation of the filter in the requested format (e.g., str or dict).

    Raises:
        ValueError: If an unsupported format is specified.

    !!! warning "Abstract Method"
        Subclasses must implement this method to provide configuration details.
    """
    pass

BoundingBoxFilter

Bases: GeoFilterBase

Filter that limits data to the bounding box of an urban layer

Retains only data points or geometries within the urban layer’s bounding box, using geopandas’ .cx accessor for efficient spatial indexing.

See further in https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.cx.html

Note

The bounding box may include areas outside the urban layer’s actual features.
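
To make the note concrete, here is a dependency-free sketch of bounding-box selection. It uses plain `(x, y)` tuples instead of a GeoDataFrame, and `bounding_box` and `within_bbox` are hypothetical helpers, not geopandas or UrbanMapper functions; the real filter relies on the `.cx` accessor instead.

```python
def bounding_box(points):
    """Return (minx, miny, maxx, maxy) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def within_bbox(points, bbox):
    """Keep points that fall inside the box, edges included."""
    minx, miny, maxx, maxy = bbox
    return [(x, y) for x, y in points if minx <= x <= maxx and miny <= y <= maxy]


# An L-shaped "layer": features only along the left and bottom edges.
layer_points = [(0, 0), (0, 10), (10, 0)]
data = [(1, 1), (9, 9), (11, 5)]

bbox = bounding_box(layer_points)
filtered = within_bbox(data, bbox)
```

The point `(9, 9)` is kept although it is far from every feature of the L-shaped layer, because it still lies inside the layer's bounding box.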

Examples:

>>> from urban_mapper.modules.filter import BoundingBoxFilter
>>> from urban_mapper.modules.urban_layer import OSMNXStreets
>>> streets = OSMNXStreets()
>>> streets.from_place("Manhattan, New York")
>>> bbox_filter = BoundingBoxFilter()
>>> filtered_data = bbox_filter.transform(taxi_trips, streets)
Source code in src/urban_mapper/modules/filter/filters/bounding_box_filter.py
@beartype
class BoundingBoxFilter(GeoFilterBase):
    """Filter that limits data to the bounding box of an `urban layer`

    Retains only data points or geometries within the `urban layer`’s bounding box,
    using geopandas’ .cx accessor for efficient spatial indexing.

    See further in https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.cx.html

    !!! note
        The bounding box may include areas outside the `urban layer`’s actual features.

    Examples:
        >>> from urban_mapper.modules.filter import BoundingBoxFilter
        >>> from urban_mapper.modules.urban_layer import OSMNXStreets
        >>> streets = OSMNXStreets()
        >>> streets.from_place("Manhattan, New York")
        >>> bbox_filter = BoundingBoxFilter()
        >>> filtered_data = bbox_filter.transform(taxi_trips, streets)
    """

    def _transform(
        self, input_geodataframe: gpd.GeoDataFrame, urban_layer: UrbanLayerBase
    ) -> gpd.GeoDataFrame:
        """Filter data to the bounding box of the `urban layer`

        Uses the `urban layer`’s bounding box to filter the input `GeoDataFrame`.

        !!! tip
            Ensure the `urban layer` is fully loaded before applying the filter.

        Args:
            input_geodataframe (gpd.GeoDataFrame): The `GeoDataFrame` to filter.
            urban_layer (UrbanLayerBase): The `urban layer` defining the bounding box.

        Returns:
            gpd.GeoDataFrame: Filtered `GeoDataFrame` within the bounding box.

        Raises:
            AttributeError: If `urban_layer` lacks `get_layer_bounding_box` method.

        """
        if not hasattr(urban_layer, "get_layer_bounding_box"):
            raise AttributeError(
                f"Urban layer {urban_layer.__class__.__name__} does not have a method to get its bounding box."
            )
        minx, miny, maxx, maxy = urban_layer.get_layer_bounding_box()
        return input_geodataframe.cx[minx:maxx, miny:maxy]

    def preview(self, format: str = "ascii") -> Any:
        """Generate a preview of this bounding box filter.

        Provides a summary of the filter’s configuration.

        Args:
            format (str): The output format ("ascii" or "json"). Defaults to "ascii".

        Returns:
            Any: A string (for "ascii") or dict (for "json") representing the filter.

        Raises:
            ValueError: If format is unsupported.

        Examples:
            >>> bbox_filter = BoundingBoxFilter()
            >>> print(bbox_filter.preview())
            Filter: BoundingBoxFilter
              Action: Filter data to the bounding box of the urban layer
        """
        if format == "ascii":
            lines = [
                "Filter: BoundingBoxFilter",
                "  Action: Filter data to the bounding box of the urban layer",
            ]
            if self.data_id:
                lines.append(f"  Data ID: '{self.data_id}'")

            return "\n".join(lines)
        elif format == "json":
            return {
                "filter": "BoundingBoxFilter",
                "action": "Filter data to the bounding box of the urban layer",
                "data_id": self.data_id,
            }
        else:
            raise ValueError(f"Unsupported format '{format}'")

_transform(input_geodataframe, urban_layer)

Filter data to the bounding box of the urban layer

Uses the urban layer’s bounding box to filter the input GeoDataFrame.

Tip

Ensure the urban layer is fully loaded before applying the filter.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_geodataframe | GeoDataFrame | The GeoDataFrame to filter. | required |
| urban_layer | UrbanLayerBase | The urban layer defining the bounding box. | required |

Returns:

| Type | Description |
|---|---|
| GeoDataFrame | Filtered GeoDataFrame within the bounding box. |

Raises:

| Type | Description |
|---|---|
| AttributeError | If urban_layer lacks get_layer_bounding_box method. |

Source code in src/urban_mapper/modules/filter/filters/bounding_box_filter.py
def _transform(
    self, input_geodataframe: gpd.GeoDataFrame, urban_layer: UrbanLayerBase
) -> gpd.GeoDataFrame:
    """Filter data to the bounding box of the `urban layer`

    Uses the `urban layer`’s bounding box to filter the input `GeoDataFrame`.

    !!! tip
        Ensure the `urban layer` is fully loaded before applying the filter.

    Args:
        input_geodataframe (gpd.GeoDataFrame): The `GeoDataFrame` to filter.
        urban_layer (UrbanLayerBase): The `urban layer` defining the bounding box.

    Returns:
        gpd.GeoDataFrame: Filtered `GeoDataFrame` within the bounding box.

    Raises:
        AttributeError: If `urban_layer` lacks `get_layer_bounding_box` method.

    """
    if not hasattr(urban_layer, "get_layer_bounding_box"):
        raise AttributeError(
            f"Urban layer {urban_layer.__class__.__name__} does not have a method to get its bounding box."
        )
    minx, miny, maxx, maxy = urban_layer.get_layer_bounding_box()
    return input_geodataframe.cx[minx:maxx, miny:maxy]

preview(format='ascii')

Generate a preview of this bounding box filter.

Provides a summary of the filter’s configuration.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| format | str | The output format ("ascii" or "json"). Defaults to "ascii". | 'ascii' |

Returns:

| Type | Description |
|---|---|
| Any | A string (for "ascii") or dict (for "json") representing the filter. |

Raises:

| Type | Description |
|---|---|
| ValueError | If format is unsupported. |

Examples:

>>> bbox_filter = BoundingBoxFilter()
>>> print(bbox_filter.preview())
Filter: BoundingBoxFilter
  Action: Filter data to the bounding box of the urban layer
Source code in src/urban_mapper/modules/filter/filters/bounding_box_filter.py
def preview(self, format: str = "ascii") -> Any:
    """Generate a preview of this bounding box filter.

    Provides a summary of the filter’s configuration.

    Args:
        format (str): The output format ("ascii" or "json"). Defaults to "ascii".

    Returns:
        Any: A string (for "ascii") or dict (for "json") representing the filter.

    Raises:
        ValueError: If format is unsupported.

    Examples:
        >>> bbox_filter = BoundingBoxFilter()
        >>> print(bbox_filter.preview())
        Filter: BoundingBoxFilter
          Action: Filter data to the bounding box of the urban layer
    """
    if format == "ascii":
        lines = [
            "Filter: BoundingBoxFilter",
            "  Action: Filter data to the bounding box of the urban layer",
        ]
        if self.data_id:
            lines.append(f"  Data ID: '{self.data_id}'")

        return "\n".join(lines)
    elif format == "json":
        return {
            "filter": "BoundingBoxFilter",
            "action": "Filter data to the bounding box of the urban layer",
            "data_id": self.data_id,
        }
    else:
        raise ValueError(f"Unsupported format '{format}'")

FilterFactory

Factory class for creating and configuring spatial filters

Provides a fluent, method-chaining interface to instantiate filters, configure settings, and apply them to GeoDataFrames.
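
Method chaining works because each configuration method returns the factory itself. The following sketch shows the idea in isolation; `SketchFactory`, `SKETCH_REGISTRY`, and `EchoFilter` are hypothetical names and deliberately simpler than the real `FilterFactory`.

```python
class EchoFilter:
    """Hypothetical filter that records the data_id it was built with."""

    def __init__(self, data_id=None):
        self.data_id = data_id


# Hypothetical registry mapping a type name to a filter class.
SKETCH_REGISTRY = {"EchoFilter": EchoFilter}


class SketchFactory:
    def __init__(self):
        self._filter_type = None
        self._data_id = None

    def with_type(self, name):
        if name not in SKETCH_REGISTRY:
            raise ValueError(f"Unknown filter type '{name}'")
        self._filter_type = name
        return self  # returning self enables chaining

    def with_data(self, data_id):
        self._data_id = data_id
        return self

    def build(self):
        if self._filter_type is None:
            raise ValueError("Call with_type() first.")
        return SKETCH_REGISTRY[self._filter_type](data_id=self._data_id)


built = SketchFactory().with_type("EchoFilter").with_data("taxi").build()
```

The configuration is only turned into a filter instance at the end of the chain, which keeps the intermediate calls cheap and order-independent.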

Attributes:

| Name | Type | Description |
|---|---|---|
| _filter_type | Optional[str] | The type of filter to create. |
| _extra_params | Dict[str, Any] | Configuration parameters for the filter. |
| _instance | Optional[GeoFilterBase] | The filter instance (internal use). |
| _preview | Optional[dict] | Preview configuration (internal use). |

Examples:

>>> from urban_mapper import UrbanMapper
>>> import geopandas as gpd
>>> mapper = UrbanMapper()
>>> layer = mapper.urban_layer.region_neighborhoods().from_place("Brooklyn, New York")
>>> data = gpd.read_file("nyc_points.csv") # Example data
>>> filtered_data = mapper.filter.with_type("BoundingBoxFilter")\
...     .transform(data, layer)
Source code in src/urban_mapper/modules/filter/filter_factory.py
@beartype
class FilterFactory:
    """Factory class for creating and configuring spatial filters

    Provides a fluent, method-chaining interface to instantiate `filters`, `configure settings`, and `apply` them
    to `GeoDataFrames`.

    Attributes:
        _filter_type (Optional[str]): The type of filter to create.
        _extra_params (Dict[str, Any]): Configuration parameters for the filter.
        _instance (Optional[GeoFilterBase]): The filter instance (internal use).
        _preview (Optional[dict]): Preview configuration (internal use).

    Examples:
        >>> from urban_mapper import UrbanMapper
        >>> import geopandas as gpd
        >>> mapper = UrbanMapper()
        >>> layer = mapper.urban_layer.region_neighborhoods().from_place("Brooklyn, New York")
        >>> data = gpd.read_file("nyc_points.csv") # Example data
        >>> filtered_data = mapper.filter.with_type("BoundingBoxFilter")\
        ...     .transform(data, layer)
    """

    def __init__(self):
        self._filter_type: Optional[str] = None
        self._extra_params: Dict[str, Any] = {}
        self._instance: Optional[GeoFilterBase] = None
        self._preview: Optional[dict] = None
        self._data_id: Optional[str] = None

    @reset_attributes_before(["_filter_type"])
    def with_type(self, primitive_type: str) -> "FilterFactory":
        """Specify the type of filter to use.

        Configures the factory to create a specific filter type from FILTER_REGISTRY.

        !!! tip "FILTER_REGISTRY looks like this"
            Open the folder `filters` in `src/urban_mapper/modules/filter` to see the available filter types
            in FILTER_REGISTRY. Each filter class is registered under its class name.

            You can also use `list(FILTER_REGISTRY.keys())` to see available filter types.

        Args:
            primitive_type (str): The name of the filter type (e.g., "BoundingBoxFilter").

        Returns:
            FilterFactory: Self for method chaining.

        Raises:
            ValueError: If primitive_type is not in FILTER_REGISTRY.

        Examples:
            >>> filter_factory = mapper.filter.with_type("BoundingBoxFilter")

        """
        if self._filter_type is not None:
            logger.log(
                "DEBUG_MID",
                f"WARNING: Filter method already set to '{self._filter_type}'. Overwriting.",
            )
            self._filter_type = None
        if primitive_type not in FILTER_REGISTRY:
            available = list(FILTER_REGISTRY.keys())
            match, score = process.extractOne(primitive_type, available)
            if score > 80:
                suggestion = f" Maybe you meant '{match}'?"
            else:
                suggestion = ""
            raise ValueError(
                f"Unknown filter method '{primitive_type}'. Available: {', '.join(available)}.{suggestion}"
            )
        self._filter_type = primitive_type
        logger.log(
            "DEBUG_LOW",
            f"WITH_TYPE: Initialised FilterFactory with filter_type={primitive_type}",
        )
        return self

    def with_data(self, data_id: str) -> "FilterFactory":
        """Set the data ID to perform filtering.

        When `transform()` receives a dictionary of `GeoDataFrames`, only the entry
        whose key matches `data_id` is filtered; the others are returned unchanged.

        Args:
            data_id (str): ID of the dataset to be filtered.

        Returns:
            FilterFactory: Self for chaining.
        """
        if self._data_id is not None:
            logger.log(
                "DEBUG_MID",
                f"WARNING: Data ID already set to '{self._data_id}'. Overwriting.",
            )
            self._data_id = None

        self._data_id = data_id
        logger.log(
            "DEBUG_LOW",
            f"WITH_DATA: Initialised FilterFactory with data_id={data_id}",
        )
        return self

    @require_attributes_not_none("_filter_type")
    def transform(
        self,
        input_geodataframe: Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame],
        urban_layer: UrbanLayerBase,
    ) -> Union[
        Dict[str, gpd.GeoDataFrame],
        gpd.GeoDataFrame,
    ]:
        """Apply the filter to input data and return filtered results

        Creates and applies a filter instance to the input `GeoDataFrame`.

        Args:
            input_geodataframe (Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame]): one or more `GeoDataFrame` to filter.
            urban_layer (UrbanLayerBase): The urban layer for filtering criteria.

        Returns:
            Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame]: The filtered data.

        Raises:
            ValueError: If _filter_type is not set.

        Examples:
            >>> layer = mapper.urban_layer.region_neighborhoods().from_place("Brooklyn, New York")
            >>> data = gpd.read_file("nyc_points.csv") # Example data
            >>> filtered_data = mapper.filter.with_type("BoundingBoxFilter")\
            ...     .transform(data, layer)
        """
        filter_class = FILTER_REGISTRY[self._filter_type]
        self._instance = filter_class(data_id=self._data_id, **self._extra_params)

        if (
            isinstance(input_geodataframe, Dict)
            and self._data_id is not None
            and self._data_id not in input_geodataframe
        ):
            print(
                "WARNING: ",
                f"Data ID {self._data_id} was not found in the list of dataframes ",
                "No filter transformation will be executed ",
            )

        return self._instance.transform(input_geodataframe, urban_layer)

    def build(self) -> GeoFilterBase:
        """Build and return a filter instance without applying it.

        Creates a filter instance for use in pipelines or deferred execution.

        !!! note
            Prefer `transform()` for immediate filtering; use build() for pipelines.

        Returns:
            GeoFilterBase: A configured filter instance.

        Raises:
            ValueError: If _filter_type is not set.

        Examples:
            >>> filter_component = mapper.filter.with_type("BoundingBoxFilter").build()
            >>> pipeline.add_filter(filter_component)
        """
        logger.log(
            "DEBUG_MID",
            "WARNING: build() should only be used in UrbanPipeline. In other cases, "
            "using transform() is a better choice.",
        )
        if self._filter_type is None:
            raise ValueError("Filter type must be specified. Call with_type() first.")
        filter_class = FILTER_REGISTRY[self._filter_type]
        self._instance = filter_class(
            data_id=self._data_id,
            **self._extra_params,
        )
        if self._preview is not None:
            self.preview(format=self._preview["format"])
        return self._instance

    def preview(self, format: str = "ascii") -> None:
        """Display a preview of the filter configuration and settings.

        Shows the filter’s configuration in the specified format.

        !!! note
            Requires a prior call to build() or transform().

        Args:
            format (str): The format to display ("ascii" or "json"). Defaults to "ascii".

        Raises:
            ValueError: If format is unsupported.

        Examples:
            >>> factory = mapper.filter.with_type("BoundingBoxFilter")
            >>> factory.build()
            >>> factory.preview(format="json")
        """
        if self._instance is None:
            print("No filter instance available to preview. Call build() first.")
            return
        if hasattr(self._instance, "preview"):
            preview_data = self._instance.preview(format=format)
            if format == "ascii":
                print(preview_data)
            elif format == "json":
                print(json.dumps(preview_data, indent=2))
            else:
                raise ValueError(f"Unsupported format '{format}'.")
        else:
            print("Preview not supported for this filter instance.")

    def with_preview(self, format: str = "ascii") -> "FilterFactory":
        """Configure the factory to display a preview after building.

        Enables automatic preview after build().

        Args:
            format (str): The preview format ("ascii" or "json"). Defaults to "ascii".

        Returns:
            FilterFactory: Self for chaining.

        Examples:
            >>> filter_component = mapper.filter.with_type("BoundingBoxFilter")\
            ...     .with_preview(format="json")\
            ...     .build()
        """
        self._preview = {"format": format}
        return self

with_type(primitive_type)

Specify the type of filter to use.

Configures the factory to create a specific filter type from FILTER_REGISTRY.

FILTER_REGISTRY looks like this

Open the folder filters in src/urban_mapper/modules/filter to see the available filter types in FILTER_REGISTRY. Each filter class is registered under its class name.

You can also use list(FILTER_REGISTRY.keys()) to see available filter types.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| primitive_type | str | The name of the filter type (e.g., "BoundingBoxFilter"). | required |

Returns:

| Type | Description |
|---|---|
| FilterFactory | Self for method chaining. |

Raises:

| Type | Description |
|---|---|
| ValueError | If primitive_type is not in FILTER_REGISTRY. |

Examples:

>>> filter_factory = mapper.filter.with_type("BoundingBoxFilter")
Source code in src/urban_mapper/modules/filter/filter_factory.py
@reset_attributes_before(["_filter_type"])
def with_type(self, primitive_type: str) -> "FilterFactory":
    """Specify the type of filter to use.

    Configures the factory to create a specific filter type from FILTER_REGISTRY.

    !!! tip "FILTER_REGISTRY looks like this"
        Open the folder `filters` in `src/urban_mapper/modules/filter` to see the available filter types
        in FILTER_REGISTRY. Each filter class is registered under its class name.

        You can also use `list(FILTER_REGISTRY.keys())` to list the available filter types.

    Args:
        primitive_type (str): The name of the filter type (e.g., "BoundingBoxFilter").

    Returns:
        FilterFactory: Self for method chaining.

    Raises:
        ValueError: If primitive_type is not in FILTER_REGISTRY.

    Examples:
        >>> filter_factory = mapper.filter.with_type("BoundingBoxFilter")

    """
    if self._filter_type is not None:
        logger.log(
            "DEBUG_MID",
            f"WARNING: Filter method already set to '{self._filter_type}'. Overwriting.",
        )
        self._filter_type = None
    if primitive_type not in FILTER_REGISTRY:
        available = list(FILTER_REGISTRY.keys())
        match, score = process.extractOne(primitive_type, available)
        if score > 80:
            suggestion = f" Maybe you meant '{match}'?"
        else:
            suggestion = ""
        raise ValueError(
            f"Unknown filter method '{primitive_type}'. Available: {', '.join(available)}.{suggestion}"
        )
    self._filter_type = primitive_type
    logger.log(
        "DEBUG_LOW",
        f"WITH_TYPE: Initialised FilterFactory with filter_type={primitive_type}",
    )
    return self
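
The registry lookup and the "Maybe you meant" hint can be sketched in isolation. This is a minimal illustration, not UrbanMapper's implementation: `REGISTRY` and `resolve_filter_type` are hypothetical names, the `BoundingBoxFilter` class here is an empty stand-in, and the standard library's `difflib` is swapped in for the fuzzy matcher (`process.extractOne`) used in the source, so similarity scores will not match exactly.

```python
import difflib


class BoundingBoxFilter:
    """Empty stand-in for a registered filter class (hypothetical)."""


# Hypothetical stand-in for FILTER_REGISTRY: class name -> class.
REGISTRY = {"BoundingBoxFilter": BoundingBoxFilter}


def resolve_filter_type(name: str, registry: dict) -> type:
    """Return the class registered under `name`, or raise ValueError with a hint."""
    if name in registry:
        return registry[name]
    available = list(registry.keys())
    # difflib replaces the fuzzy matcher from the source; cutoff=0.8 loosely
    # mirrors the "score > 80" threshold but is not numerically equivalent.
    close = difflib.get_close_matches(name, available, n=1, cutoff=0.8)
    suggestion = f" Maybe you meant '{close[0]}'?" if close else ""
    raise ValueError(
        f"Unknown filter method '{name}'. "
        f"Available: {', '.join(available)}.{suggestion}"
    )
```

Calling `resolve_filter_type("BoundingBoxFiltr", REGISTRY)` raises a `ValueError` whose message includes the suggestion, while a name with no close match produces the same error without a hint.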

transform(input_geodataframe, urban_layer)

Apply the filter to input data and return filtered results

Creates and applies a filter instance to the input GeoDataFrame.

Parameters:

Name Type Description Default
input_geodataframe Union[Dict[str, GeoDataFrame], GeoDataFrame]

One or more GeoDataFrames to filter.

required
urban_layer UrbanLayerBase

The urban layer for filtering criteria.

required

Returns:

Type Description
Union[Dict[str, GeoDataFrame], GeoDataFrame]

The filtered data.

Raises:

Type Description
ValueError

If _filter_type is not set.
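
The guard behind this error is applied as a decorator in the source. Its implementation is not shown on this page, so the following is only a sketch of how such a guard could work: `require_not_none` and `DemoFactory` are hypothetical names, and the real decorator may differ in its message and behaviour.

```python
import functools


def require_not_none(*names):
    """Hypothetical guard: raise ValueError if any named attribute is None."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            for name in names:
                if getattr(self, name, None) is None:
                    raise ValueError(f"Attribute '{name}' must be set first.")
            return method(self, *args, **kwargs)

        return wrapper

    return decorator


class DemoFactory:
    """Tiny stand-in factory used only to exercise the guard."""

    def __init__(self):
        self._filter_type = None

    def with_type(self, primitive_type):
        self._filter_type = primitive_type
        return self

    @require_not_none("_filter_type")
    def transform(self, data):
        # A real factory would build and apply a filter here.
        return data
```

With this sketch, `DemoFactory().transform([1])` raises `ValueError`, whereas `DemoFactory().with_type("BoundingBoxFilter").transform([1])` runs normally.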

Examples:

>>> layer = mapper.urban_layer.region_neighborhoods().from_place("Brooklyn, New York")
>>> data = gpd.read_file("nyc_points.csv") # Example data
>>> filtered_data = mapper.filter.with_type("BoundingBoxFilter")\
...     .transform(data, layer)
Source code in src/urban_mapper/modules/filter/filter_factory.py
@require_attributes_not_none("_filter_type")
def transform(
    self,
    input_geodataframe: Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame],
    urban_layer: UrbanLayerBase,
) -> Union[
    Dict[str, gpd.GeoDataFrame],
    gpd.GeoDataFrame,
]:
    """Apply the filter to input data and return filtered results

    Creates and applies a filter instance to the input `GeoDataFrame`.

    Args:
        input_geodataframe (Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame]): One or more `GeoDataFrame`s to filter.
        urban_layer (UrbanLayerBase): The urban layer for filtering criteria.

    Returns:
        Union[Dict[str, gpd.GeoDataFrame], gpd.GeoDataFrame]: The filtered data.

    Raises:
        ValueError: If _filter_type is not set.

    Examples:
        >>> layer = mapper.urban_layer.region_neighborhoods().from_place("Brooklyn, New York")
        >>> data = gpd.read_file("nyc_points.csv") # Example data
        >>> filtered_data = mapper.filter.with_type("BoundingBoxFilter")\
        ...     .transform(data, layer)
    """
    filter_class = FILTER_REGISTRY[self._filter_type]
    self._instance = filter_class(data_id=self._data_id, **self._extra_params)

    if (
        isinstance(input_geodataframe, Dict)
        and self._data_id is not None
        and self._data_id not in input_geodataframe
    ):
        print(
            f"WARNING: Data ID {self._data_id} was not found in the input dataframes. "
            "No filter transformation will be executed."
        )

    return self._instance.transform(input_geodataframe, urban_layer)
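
How a single dataset and a dictionary of datasets might be handled can be illustrated without GeoPandas. The sketch below uses plain lists of `(x, y)` tuples in place of `GeoDataFrame`s. `apply_filter` and `in_unit_square` are hypothetical names, and the rule that every dataset is filtered when no `data_id` is given is an assumption, since the base-class `transform()` is not shown here. Only the "unknown `data_id` means nothing is filtered" branch is taken from the warning in the source above.

```python
from typing import Callable, Dict, List, Optional, Tuple, Union

Point = Tuple[float, float]
Dataset = List[Point]


def in_unit_square(p: Point) -> bool:
    """Hypothetical criterion standing in for an urban-layer bounding box."""
    return 0 <= p[0] <= 1 and 0 <= p[1] <= 1


def apply_filter(
    data: Union[Dict[str, Dataset], Dataset],
    keep: Callable[[Point], bool],
    data_id: Optional[str] = None,
) -> Union[Dict[str, Dataset], Dataset]:
    """Filter one dataset, or selected datasets inside a dictionary."""
    if not isinstance(data, dict):
        return [p for p in data if keep(p)]
    if data_id is None:
        # Assumption: without a data_id, every dataset is filtered.
        return {key: [p for p in pts if keep(p)] for key, pts in data.items()}
    if data_id not in data:
        # Mirrors the warning in the source: nothing is filtered.
        return dict(data)
    return {
        key: [p for p in pts if keep(p)] if key == data_id else pts
        for key, pts in data.items()
    }
```

For example, with `data = {"taxi": [(0.5, 0.5), (2, 2)], "bikes": [(3, 3)]}`, passing `data_id="taxi"` filters only the `"taxi"` entry and leaves `"bikes"` untouched.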

build()

Build and return a filter instance without applying it.

Creates a filter instance for use in pipelines or deferred execution.

Note

Prefer transform() for immediate filtering; use build() for pipelines.

Returns:

Name Type Description
GeoFilterBase GeoFilterBase

A configured filter instance.

Raises:

Type Description
ValueError

If _filter_type is not set.

Examples:

>>> filter_component = mapper.filter.with_type("BoundingBoxFilter").build()
>>> pipeline.add_filter(filter_component)
Source code in src/urban_mapper/modules/filter/filter_factory.py
def build(self) -> GeoFilterBase:
    """Build and return a filter instance without applying it.

    Creates a filter instance for use in pipelines or deferred execution.

    !!! note
        Prefer `transform()` for immediate filtering; use build() for pipelines.

    Returns:
        GeoFilterBase: A configured filter instance.

    Raises:
        ValueError: If _filter_type is not set.

    Examples:
        >>> filter_component = mapper.filter.with_type("BoundingBoxFilter").build()
        >>> pipeline.add_filter(filter_component)
    """
    logger.log(
        "DEBUG_MID",
        "WARNING: build() should only be used in UrbanPipeline. In other cases, "
        "using transform() is a better choice.",
    )
    if self._filter_type is None:
        raise ValueError("Filter type must be specified. Call with_type() first.")
    filter_class = FILTER_REGISTRY[self._filter_type]
    self._instance = filter_class(
        data_id=self._data_id,
        **self._extra_params,
    )
    if self._preview is not None:
        self.preview(format=self._preview["format"])
    return self._instance
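
The difference between `build()` (deferred use) and `transform()` (immediate use) can be shown with a toy factory. Everything in this sketch is hypothetical: `MiniFactory` and `EvenFilter` are illustrative names, and the "filter" works on a list of integers rather than a `GeoDataFrame` and an urban layer. Only the control flow follows the source: `build()` checks the type, creates the instance, optionally previews it, and returns it.

```python
class EvenFilter:
    """Hypothetical filter that keeps even numbers."""

    def transform(self, values):
        return [v for v in values if v % 2 == 0]

    def preview(self, format="ascii"):
        return "Filter: EvenFilter"


class MiniFactory:
    """Toy factory mirroring the build()/transform() split."""

    def __init__(self, registry):
        self._registry = registry
        self._filter_type = None
        self._preview = None
        self._instance = None

    def with_type(self, primitive_type):
        self._filter_type = primitive_type
        return self

    def with_preview(self, format="ascii"):
        self._preview = {"format": format}
        return self

    def build(self):
        if self._filter_type is None:
            raise ValueError("Filter type must be specified. Call with_type() first.")
        self._instance = self._registry[self._filter_type]()
        if self._preview is not None:
            print(self._instance.preview(format=self._preview["format"]))
        return self._instance

    def transform(self, values):
        # Immediate use: create the instance and apply it in one call.
        self._instance = self._registry[self._filter_type]()
        return self._instance.transform(values)
```

Here `MiniFactory({"EvenFilter": EvenFilter}).with_type("EvenFilter").build()` returns a filter object that can be stored and applied later, while calling `.transform([1, 2, 3, 4])` on the factory applies the filter straight away.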

preview(format='ascii')

Display a preview of the filter configuration and settings.

Shows the filter’s configuration in the specified format.

Note

Requires a prior call to build() or transform().

Parameters:

Name Type Description Default
format str

The format to display ("ascii" or "json"). Defaults to "ascii".

'ascii'

Raises:

Type Description
ValueError

If format is unsupported.

Examples:

>>> factory = mapper.filter.with_type("BoundingBoxFilter")
>>> factory.build()
>>> factory.preview(format="json")
Source code in src/urban_mapper/modules/filter/filter_factory.py
def preview(self, format: str = "ascii") -> None:
    """Display a preview of the filter configuration and settings.

    Shows the filter’s configuration in the specified format.

    !!! note
        Requires a prior call to build() or transform().

    Args:
        format (str): The format to display ("ascii" or "json"). Defaults to "ascii".

    Raises:
        ValueError: If format is unsupported.

    Examples:
        >>> factory = mapper.filter.with_type("BoundingBoxFilter")
        >>> factory.build()
        >>> factory.preview(format="json")
    """
    if self._instance is None:
        print("No filter instance available to preview. Call build() first.")
        return
    if hasattr(self._instance, "preview"):
        preview_data = self._instance.preview(format=format)
        if format == "ascii":
            print(preview_data)
        elif format == "json":
            print(json.dumps(preview_data, indent=2))
        else:
            raise ValueError(f"Unsupported format '{format}'.")
    else:
        print("Preview not supported for this filter instance.")
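
The format dispatch can also be written so that an unsupported format is rejected before any rendering takes place. The sketch below is an illustration under assumptions: `render_preview` is a hypothetical helper, and the flat `config` dictionary is an invented shape, since the data returned by a real filter's `preview()` is not shown on this page.

```python
import json


def render_preview(config: dict, format: str = "ascii") -> str:
    """Render a (hypothetical) filter configuration as text."""
    if format not in ("ascii", "json"):
        # Validate first, so nothing is rendered for an unknown format.
        raise ValueError(f"Unsupported format '{format}'.")
    if format == "json":
        return json.dumps(config, indent=2)
    # "ascii": one "key: value" line per setting.
    return "\n".join(f"{key}: {value}" for key, value in config.items())
```

Returning a string instead of printing keeps the helper easy to test; a caller can simply `print(render_preview(config, "json"))`.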

with_preview(format='ascii')

Configure the factory to display a preview after building.

Enables automatic preview after build().

Parameters:

Name Type Description Default
format str

The preview format ("ascii" or "json"). Defaults to "ascii".

'ascii'

Returns:

Name Type Description
FilterFactory FilterFactory

Self for chaining.

Examples:

>>> filter_component = mapper.filter.with_type("BoundingBoxFilter")\
...     .with_preview(format="json")\
...     .build()
Source code in src/urban_mapper/modules/filter/filter_factory.py
def with_preview(self, format: str = "ascii") -> "FilterFactory":
    """Configure the factory to display a preview after building.

    Enables automatic preview after build().

    Args:
        format (str): The preview format ("ascii" or "json"). Defaults to "ascii".

    Returns:
        FilterFactory: Self for chaining.

    Examples:
        >>> filter_component = mapper.filter.with_type("BoundingBoxFilter")\
        ...     .with_preview(format="json")\
        ...     .build()
    """
    self._preview = {"format": format}
    return self
Provost Simon