
Contributing to UrbanMapper

Welcome to the contributing guide for UrbanMapper! We are excited to collaborate on developing a tool for urban data analysis that is both accessible and powerful. This guide will help you set up your environment, add new components, and submit contributions. Whether you are fixing bugs, adding features, or improving documentation, your work is important!

Status of UrbanMapper

UrbanMapper is actively evolving. Expect changes, and if you hit a snag, open a GitHub Issue—we’re here to help!

New Contributors

Check out the GitHub Issues for good first tasks or reach out for guidance!


Project Setup Guide

Prerequisites

  • UrbanMapper requires Python 3.10 or higher.
  • Use uv (recommended), conda, or venv to manage your project setup. Follow the steps below to install one of them.

    • To install uv, follow the instructions on the uv documentation.
    • If you don’t have Python 3.10 or higher, or simply want to be sure, install and pin it with uv:
      uv python install 3.10
      uv python pin 3.10
      
    • To install conda, follow the instructions on the conda documentation.
    • If you don’t have Python 3.10 or higher, you can create a new conda environment with the required version:
      conda create -n urbanmapper python=3.10
      conda activate urbanmapper
      
    • Ensure you have Python 3.10 or higher installed. You can check your Python version with:
      python3 --version
      
    • To create a virtual environment, use Python's built-in venv module:
      python3 -m venv urbanmapper-env
      source urbanmapper-env/bin/activate  # On macOS/Linux
      urbanmapper-env\Scripts\activate     # On Windows
      

Clone the Repo

git clone git@github.com:VIDA-NYU/UrbanMapper.git
cd UrbanMapper

Environment Setup

Get started by setting up your development environment. We recommend uv for its speed, but pip or conda work too. Choose one of the following options:

  1. Lock and sync dependencies:

    uv lock
    uv sync
    
    Note: If you encounter errors related to 'cairo' during dependency installation, see the Troubleshooting section.

  2. (Recommended) Install Jupyter extensions for interactive visualisations requiring Jupyter widgets:

    uv run jupyter labextension install @jupyter-widgets/jupyterlab-manager
    

  3. Launch Jupyter Lab to explore UrbanMapper (faster than running Jupyter without uv):

    uv run --with jupyter jupyter lab
    

If you prefer not to use uv, you can install UrbanMapper using pip. This method is slower and requires more manual intervention.

Assumptions:

  • You have pip installed.
  • You are working within a virtual environment or a conda environment.

Note

If you are not using a virtual or conda environment, it is highly recommended to set one up to avoid conflicts. Refer to Python's venv documentation or conda's environment management guide for assistance.

  1. Install dependencies:
    pip install -r requirements.txt
    
  2. Install UrbanMapper:

    pip install -e ./UrbanMapper
    # or, from inside your active virtual environment: cd UrbanMapper && pip install -e .
    
    The -e flag installs UrbanMapper in editable mode, allowing changes to the code to be reflected immediately. If you don’t need this, use pip install ./UrbanMapper instead.

  3. (Recommended) Install Jupyter extensions for interactive visualisations:

    jupyter labextension install @jupyter-widgets/jupyterlab-manager
    

  4. Launch Jupyter Lab:

    jupyter lab
    

Config Note:

Check out config.yaml in urban_mapper/ for pipeline schemas and mixin mappings. It’s optional for basic setup but key for advanced tweaks.

Alternative Tools

Prefer pip or conda? That’s fine—just note uv is our go-to for performance.


Linting and Formatting with Ruff

We use ruff to keep the codebase clean and consistent. Run it before submitting changes.

Commands

  • Check Issues:
    uv run ruff check
    
  • Auto-fix Issues:
    uv run ruff check --fix
    

Editor Integration

Integrate ruff into your editor (e.g., VSCode) for live feedback.


Pre-Commit Hooks

Pre-commit hooks enforce standards by running checks (like ruff) before commits.

Setup

  1. Install:
    uv run pre-commit install
    
  2. Test Manually (optional):
    uv run pre-commit run --all-files
    

Automatic Execution

Hooks run automatically on git commit. Fix any failures to proceed.


How to Create New Components

UrbanMapper’s modular design makes extending it a breeze. Select the component type you want to add:

Loaders pull data (e.g., CSV, Shapefiles) into a GeoDataFrame.

  1. Subclass LoaderBase (urban_mapper/modules/loader/abc_loader.py):
    • Implement load_data_from_file. Refer to the base class for details.
  2. Register It:
    • Add to FILE_LOADER_FACTORY in urban_mapper/modules/loader/loader_factory.py.

Example (csv_loader.py):

from urban_mapper.modules.loader.abc_loader import LoaderBase
import geopandas as gpd
import pandas as pd
from beartype import beartype

@beartype
class CSVLoader(LoaderBase):
    def load_data_from_file(self) -> gpd.GeoDataFrame:
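        df = pd.read_csv(self.file_path)  #(1)
        # Build point geometries. The column and CRS attributes below are
        # assumed to be provided by LoaderBase; check the base class.
        gdf = gpd.GeoDataFrame(
            df,
            geometry=gpd.points_from_xy(df[self.longitude_column], df[self.latitude_column]),
            crs=self.coordinate_reference_system,
        )
        return gdf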

  (1) Reads the CSV file into a pandas DataFrame before geospatial conversion.

  • Place in urban_mapper/modules/loader/loaders/.
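
The registration step can be pictured as a mapping from file extension to loader class. The sketch below is hypothetical: the real FILE_LOADER_FACTORY in loader_factory.py may be structured differently, and DemoLoaderBase, DemoCSVLoader, DEMO_FILE_LOADER_FACTORY, and resolve_loader are stand-in names invented for illustration.

```python
from pathlib import Path


class DemoLoaderBase:
    """Stand-in for LoaderBase (hypothetical, for illustration only)."""

    def __init__(self, file_path: str):
        self.file_path = Path(file_path)


class DemoCSVLoader(DemoLoaderBase):
    """Stand-in for a concrete loader such as CSVLoader."""


# Assumed registry shape: file extension -> loader class.
DEMO_FILE_LOADER_FACTORY = {".csv": DemoCSVLoader}


def resolve_loader(file_path: str) -> DemoLoaderBase:
    """Pick a loader class from the file extension and instantiate it."""
    suffix = Path(file_path).suffix.lower()
    try:
        loader_class = DEMO_FILE_LOADER_FACTORY[suffix]
    except KeyError:
        raise ValueError(f"No loader registered for '{suffix}' files") from None
    return loader_class(file_path)


loader = resolve_loader("data/taxi_trips.CSV")
print(type(loader).__name__)
```

With a registry like this, supporting a new format is a one-line change to the mapping, which is why the guide asks you to register every new loader.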

Urban layers (e.g., streets) are spatial entities as GeoDataFrames.

  1. Subclass UrbanLayerBase (urban_mapper/modules/urban_layer/abc_urban_layer.py):
  2. Add methods like from_place and _map_nearest_layer. Refer to the base class for details.

Example (osmnx_streets.py):

from urban_mapper.modules.urban_layer.abc_urban_layer import UrbanLayerBase
import geopandas as gpd
import osmnx as ox
from beartype import beartype

@beartype
class OSMNXStreets(UrbanLayerBase):
    def from_place(self, place_name: str, **kwargs) -> None:
        self.network = ox.graph_from_place(place_name, network_type="all")  # (1)
        self.layer = ox.graph_to_gdfs(self.network)[1].to_crs(self.coordinate_reference_system)

  (1) Fetches street network data using OSMnx for the specified place.

  • Place in urban_mapper/modules/urban_layer/urban_layers/.
  • Auto-detected; no registration needed.
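
Auto-detection generally means the package discovers subclasses of a base class instead of reading a hand-maintained list. How UrbanMapper implements this is not shown in this guide; the sketch below is one common approach, assuming discovery through `__subclasses__()`. DemoUrbanLayerBase, DemoStreets, DemoIntersections, and discover_layers are hypothetical names.

```python
class DemoUrbanLayerBase:
    """Stand-in for UrbanLayerBase (hypothetical)."""


class DemoStreets(DemoUrbanLayerBase):
    pass


class DemoIntersections(DemoStreets):
    pass


def discover_layers(base: type) -> dict[str, type]:
    """Return every direct and indirect subclass of `base`, keyed by class name."""
    found: dict[str, type] = {}
    stack = list(base.__subclasses__())
    while stack:
        cls = stack.pop()
        found[cls.__name__] = cls
        # Walk further down the hierarchy so indirect subclasses are included.
        stack.extend(cls.__subclasses__())
    return found


print(sorted(discover_layers(DemoUrbanLayerBase)))
```

A class only shows up in `__subclasses__()` once its module has been imported, which is one reason the file location matters for auto-detected components.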

Imputers fill missing geospatial data, such as gaps in a dataset.

  1. Subclass GeoImputerBase (urban_mapper/modules/imputer/abc_imputer.py):
  2. Implement _transform and preview. Refer to the base class for details.

Example (simple_geo_imputer.py):

from urban_mapper.modules.imputer.abc_imputer import GeoImputerBase
import geopandas as gpd
from beartype import beartype

@beartype
class SimpleGeoImputer(GeoImputerBase):
    def _transform(self, input_geodataframe: gpd.GeoDataFrame, urban_layer) -> gpd.GeoDataFrame:
        # Impute logic here
        return input_geodataframe

    def preview(self, format: str = "ascii") -> str:
        # The "ascii" default is an assumption; check GeoImputerBase for the actual signature.
        return f"Imputer: SimpleGeoImputer\n  Lat: {self.latitude_column}"

  • Place in urban_mapper/modules/imputer/imputers/.
  • Auto-detected.

Filters refine datasets (e.g., by spatial bounds).

  1. Subclass GeoFilterBase (urban_mapper/modules/filter/abc_filter.py):
  2. Implement _transform. Refer to the base class for details.

Example (bounding_box_filter.py):

from urban_mapper.modules.filter.abc_filter import GeoFilterBase
import geopandas as gpd
from beartype import beartype

@beartype
class BoundingBoxFilter(GeoFilterBase):
    def _transform(self, input_geodataframe: gpd.GeoDataFrame, urban_layer) -> gpd.GeoDataFrame:
        minx, miny, maxx, maxy = urban_layer.get_layer_bounding_box()
        return input_geodataframe.cx[minx:maxx, miny:maxy]

  • Place in urban_mapper/modules/filter/filters/.
  • Auto-detected.
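
For point data, the slice in the example above amounts to an inclusive range check on each coordinate. The pure-Python sketch below illustrates that logic without GeoPandas; within_bounds and the sample coordinates are hypothetical. It does not cover lines or polygons, which GeoPandas selects by intersection with the box.

```python
Point = tuple[float, float]
Bounds = tuple[float, float, float, float]


def within_bounds(points: list[Point], bounds: Bounds) -> list[Point]:
    """Keep (x, y) points that fall inside or on the edge of the bounding box."""
    minx, miny, maxx, maxy = bounds
    return [(x, y) for (x, y) in points if minx <= x <= maxx and miny <= y <= maxy]


# Illustrative longitude/latitude pairs and a bounding box (minx, miny, maxx, maxy).
points = [(-73.99, 40.73), (-74.20, 40.60), (-73.95, 40.78)]
bounds = (-74.02, 40.70, -73.93, 40.80)
print(within_bounds(points, bounds))
```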

Enrichers enhance urban layers with insights; aggregators summarise data.

Enrichers:

  • Subclass EnricherBase (urban_mapper/modules/enricher/abc_enricher.py).
  • Place in urban_mapper/modules/enricher/enrichers/.
  • Auto-detected.

Aggregators:

  • Subclass BaseAggregator (urban_mapper/modules/enricher/aggregator/abc_aggregator.py).
  • Update EnricherFactory.build() in urban_mapper/modules/enricher/enricher_factory.py.

Example (sum_aggregator.py):

from urban_mapper.modules.enricher.aggregator.abc_aggregator import BaseAggregator
import pandas as pd
from beartype import beartype

@beartype
class SumAggregator(BaseAggregator):
    def __init__(self, group_by_column: str, value_column: str):
        self.group_by_column = group_by_column
        self.value_column = value_column
    def _aggregate(self, input_dataframe: pd.DataFrame) -> pd.Series:
        return input_dataframe.groupby(self.group_by_column)[self.value_column].sum()

  • Place in urban_mapper/modules/enricher/aggregator/aggregators/.
  • Add to EnricherFactory:
    elif self.config.action == "sum":
        aggregator = SumAggregator(self.config.group_by[0], self.config.values_from[0])
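
The dispatch above can be tried out in isolation. The following is a minimal sketch that mirrors the action-to-aggregator mapping with plain Python data instead of pandas; DemoSumAggregator and build_aggregator are hypothetical stand-ins, and the real EnricherFactory.build() may take different inputs.

```python
from collections import defaultdict


class DemoSumAggregator:
    """Pure-Python stand-in for SumAggregator (hypothetical, no pandas)."""

    def __init__(self, group_by_column: str, value_column: str):
        self.group_by_column = group_by_column
        self.value_column = value_column

    def aggregate(self, rows: list[dict]) -> dict:
        """Sum `value_column` for each distinct value of `group_by_column`."""
        totals: dict = defaultdict(float)
        for row in rows:
            totals[row[self.group_by_column]] += row[self.value_column]
        return dict(totals)


def build_aggregator(action: str, group_by: list[str], values_from: list[str]):
    """Mirror the elif chain: map an action name to an aggregator instance."""
    if action == "sum":
        return DemoSumAggregator(group_by[0], values_from[0])
    raise ValueError(f"Unsupported action: {action!r}")


rows = [
    {"street_id": "a", "trips": 2},
    {"street_id": "b", "trips": 5},
    {"street_id": "a", "trips": 3},
]
aggregator = build_aggregator("sum", ["street_id"], ["trips"])
print(aggregator.aggregate(rows))
```

Raising on an unknown action makes a forgotten factory branch fail loudly instead of silently skipping the aggregation.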
    

Visualisers render maps for analysis.

  1. Subclass VisualiserBase (urban_mapper/modules/visualiser/abc_visualiser.py):
  2. Implement _render. Refer to the base class for details.

Example (static_visualiser.py):

from urban_mapper.modules.visualiser.abc_visualiser import VisualiserBase
import geopandas as gpd
from beartype import beartype

@beartype
class StaticVisualiser(VisualiserBase):
    def _render(self, urban_layer_geodataframe: gpd.GeoDataFrame, columns: list, **kwargs):
        return urban_layer_geodataframe.plot(column=columns[0], legend=True, **kwargs).get_figure()

  • Place in urban_mapper/modules/visualiser/visualisers/.
  • Auto-detected.

Generators create pipeline steps dynamically.

  1. Subclass PipelineGeneratorBase (urban_mapper/modules/pipeline_generator/abc_pipeline_generator.py).

Example (gpt4o_pipeline_generator.py):

from urban_mapper.modules.pipeline_generator.abc_pipeline_generator import PipelineGeneratorBase
from beartype import beartype

@beartype
class GPT4OPipelineGenerator(PipelineGeneratorBase):
    def generate_pipeline(self, data_description: str) -> list:
        # AI-driven step generation
        return []

  • Place in urban_mapper/modules/pipeline_generator/generators/.
  • Auto-detected via pipeline_generator_factory.py.

Pipeline Architecture

UrbanMapper’s pipeline flows like this:

%%{init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#57068c', 'primaryTextColor': '#fff', 'primaryBorderColor': '#F49BAB', 'lineColor': '#F49BAB', 'secondaryColor': '#9B7EBD', 'tertiaryColor': '#E5D9F2' } }}%%
graph LR
    subgraph "Data Ingestion"
        A["Loader (1)"]
        B["Urban Layer (1)"]
        A -->|Raw data| B
    end
    subgraph "Data Preprocessing"
        direction TB
        C["Imputers (0..*)"]
        D["Filters (0..*)"]
        C -->|Imputed data| D
    end
    subgraph "Data Processing"
        E["Enrichers (1..*)"]
    end
    subgraph "Data Output"
        F["Visualiser (0, 1)"]
    end
    B -->|Spatial data| C
    D -->|Filtered data| E
    E -->|Enriched data| F

Notation: (1) = exactly one instance, (0..*) = zero or more instances, (1..*) = one or more instances, (0, 1) = zero or one instance

Each step processes the data sequentially, transforming it from raw input to enriched urban insights. New components should slot into this sequence (see urban_mapper/pipeline/).
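
The cardinality notation can be read as a small set of rules. The sketch below encodes those rules and checks a list of step kinds against them; it assumes the order shown in the diagram is enforced, and CARDINALITY, ORDER, and validate_steps are hypothetical names that do not come from urban_mapper/pipeline/.

```python
# (min, max) instances per step kind, taken from the notation above; None = unbounded.
CARDINALITY = {
    "loader": (1, 1),
    "urban_layer": (1, 1),
    "imputer": (0, None),
    "filter": (0, None),
    "enricher": (1, None),
    "visualiser": (0, 1),
}
ORDER = list(CARDINALITY)  # assumed execution order, as drawn in the diagram


def validate_steps(step_kinds: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the sequence is valid."""
    unknown = [kind for kind in step_kinds if kind not in CARDINALITY]
    if unknown:
        return [f"unknown step kind: {kind}" for kind in unknown]

    problems = []
    for kind, (low, high) in CARDINALITY.items():
        count = step_kinds.count(kind)
        if count < low or (high is not None and count > high):
            upper = "*" if high is None else high
            problems.append(f"{kind}: found {count}, expected {low}..{upper}")

    positions = [ORDER.index(kind) for kind in step_kinds]
    if positions != sorted(positions):
        problems.append("steps are out of order")
    return problems


print(validate_steps(["loader", "urban_layer", "filter", "enricher"]))
print(validate_steps(["loader", "urban_layer"]))
```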


Generate Documentation

First and foremost, thank you for your contribution! To generate documentation, follow these steps:

  1. Sync dev dependencies with uv:

    uv sync --dev
    

  2. Build Docs:

    ./build_docs.sh
    

  3. Serve Docs:

    uv run mkdocs serve
    

  4. Open in Browser: Localhost

Note: During documentation generation, ensure your environment is correctly set up. If you encounter issues with dependencies like 'cairo', refer to the Troubleshooting section. If you’ve added new packages, update the requirements files as described in Managing Dependencies.


Pull Requests and Rebasing

  • Branch:
    git checkout -b feat/your-feature
    
  • Commit:
    • Use Git Karma style (e.g., feat: add new loader).
  • Rebase:
    git fetch origin
    git rebase origin/main
    
  • Submit PR:
    • Push and open a PR against main.
    • Note: We highly encourage using Git Karma for commits/branches (e.g., feat/add-loader).
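
To check a commit subject line locally before pushing, a small pattern match is enough. This is a minimal sketch, assuming the usual Karma layout of `<type>(<scope>): <subject>` with an optional scope and the commonly listed type names; KARMA_TYPES, KARMA_PATTERN, and is_karma_message are hypothetical and not part of the repository's tooling.

```python
import re

# Commit types commonly listed for the Karma convention (assumed list).
KARMA_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "chore")

# <type>(<optional scope>): <subject>
KARMA_PATTERN = re.compile(
    rf"^(?P<type>{'|'.join(KARMA_TYPES)})(\((?P<scope>[^)]+)\))?: (?P<subject>\S.*)$"
)


def is_karma_message(first_line: str) -> bool:
    """Return True when the commit subject line follows the assumed Karma layout."""
    return KARMA_PATTERN.match(first_line) is not None


for message in ("feat: add new loader", "feat(loader): add parquet support", "added stuff"):
    print(f"{message!r}: {is_karma_message(message)}")
```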

For the time being, PRs are merged with merge commits, without squashing

We are currently using merge commits without squashing. This may change when UrbanMapper becomes more stable. Therefore, make sure your commit history is clean and does not turn into a spaghetti-style graph.

Interested in further readings? Look here.

Common Git Commands

Command Description
git clone <url> Clone the repository
git checkout -b <name> Create and switch to a new branch
git add <files> Stage changes for commit
git commit -m "<msg>" Commit changes with a message
git push origin <branch> Push changes to the remote repository
git fetch origin Fetch latest changes from remote
git rebase origin/main Rebase your branch on top of main

Git Hints

  • Use git rebase -i to polish commits, but don’t rewrite shared history.
  • We may request history rewriting for clarity. New to fixup commits? Read this article: https://github.com/TheAssemblyArmada/Thyme/wiki/Using-Fixup-Commits
  • Want a nicer view of your logs? Use Tig to visualise your Git history. Install it with brew install tig and run tig in your terminal.

Thank You!

Thanks for contributing to UrbanMapper! Your efforts shape urban data analysis. Questions? Open an issue—we’ve got your back.

Enjoy mapping! 🌍


Troubleshooting

Handling 'cairo' Dependency Issues

Warning

On macOS, if you encounter errors related to 'cairo', follow these steps:

Source: https://github.com/squidfunk/mkdocs-material/issues/5121.

  1. Install 'cairo' if not already installed:
    brew install cairo
    
  2. If already installed, try reinstalling:
    brew reinstall cairo
    
  3. If the symlink is broken, fix it:
    brew link cairo
    
    Or:
    brew unlink cairo && brew link cairo
    
  4. For macOS on M2 chips and earlier (Intel included), set the environment variable:
    export DYLD_FALLBACK_LIBRARY_PATH=/opt/homebrew/lib
    
  5. For macOS on M3 chips or later, create a symbolic link:
    ln -s /opt/homebrew/lib/libcairo.2.dylib .
    

Note: These steps have been tested on macOS. For Windows or Linux, adapt accordingly or open a GitHub issue.


Managing Dependencies

When you add a new package to the project, update the requirements files as follows:

  1. Update requirements.txt, mainly for people who install the dependencies with pip or conda.

    uv pip compile pyproject.toml -o requirements.txt
    
    Note that this step will go away after the first release of UrbanMapper on PyPI. In the meantime, to learn why it is needed and why uv does not do it automatically, see https://github.com/astral-sh/uv/issues/6007. Happy reading 💪!

  2. Update requirements-dev.txt, mainly for building the documentation on Read the Docs.

    uv export --dev --no-hashes --no-header --no-annotate | awk '{print $1}' FS=' ;' > requirements-dev.txt
    

    Then, manually adjust for platform-specific dependencies. Locate lines like:

    pywin32==310
    pywin32-ctypes==0.2.3
    pywinpty==2.0.15
    

    Replace them with:

    pywin32==310; platform_system=="Windows"
    pywin32-ctypes==0.2.3; platform_system=="Windows"
    pywinpty==2.0.15; platform_system=="Windows"
    

    Note that this manual adjustment is needed for Read The Docs, which does not yet support uv. See the ongoing discussion at https://github.com/astral-sh/uv/issues/10074.

  3. Commit and Push: After updating, commit and push the changes to the repository.
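
The manual marker adjustment in step 2 can be scripted. The following is a minimal sketch, assuming the Windows-only packages are exactly the three named in the example; WINDOWS_ONLY, MARKER, and add_windows_markers are hypothetical names, and any other platform-specific packages would need to be added to the tuple by hand.

```python
import re

# Package names taken from the example above; extend as needed (assumption).
WINDOWS_ONLY = ("pywin32", "pywin32-ctypes", "pywinpty")
MARKER = '; platform_system=="Windows"'


def add_windows_markers(requirements_text: str) -> str:
    """Append a Windows-only environment marker to the listed packages."""
    adjusted = []
    for line in requirements_text.splitlines():
        # The package name is everything before the first version operator or marker.
        name = re.split(r"[=<>!~; ]", line.strip(), maxsplit=1)[0]
        if name.lower() in WINDOWS_ONLY and "platform_system" not in line:
            line = f"{line.rstrip()}{MARKER}"
        adjusted.append(line)
    return "\n".join(adjusted) + "\n"


sample = "numpy==2.0.0\npywin32==310\npywinpty==2.0.15\n"
print(add_windows_markers(sample), end="")
```

To apply it to the real file, read requirements-dev.txt, pass its text through add_windows_markers, and write the result back. Lines that already carry the marker are left untouched, so running it twice is harmless.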

Provost Simon, sonia