Pipeline Generators¶
What is the Pipeline Generator module?
The Pipeline Generator module addresses the following scenario: imagine telling UrbanMapper exactly what urban analysis you want and watching it craft a pipeline for you, with no coding required. Powered by Large Language Models (LLMs), this module transforms your natural language descriptions into executable Python code for UrbanMapper pipelines.
In the meantime, we recommend looking through the Pipeline Generator example for a more hands-on introduction to the module and its usage.
Documentation Under Alpha Construction
This documentation is in its early stages and still being developed. The API may therefore change, and some parts might be incomplete or inaccurate.
Use at your own risk, and please report anything you find that seems incorrect or outdated.
PipelineGeneratorBase
¶
Bases: ABC
Abstract base class for pipeline generators.
This class defines the interface for pipeline generators.
What is a pipeline generator primitive?
Pipeline generators use large language models (LLMs) to automatically create UrbanMapper pipelines from natural language descriptions.
Implementations of this class must provide a generate_urban_pipeline method that takes a user description and returns Python code for an UrbanMapper pipeline.
Use of Short Name
The short name of the generator is used to identify the generator in the PipelineGeneratorFactory. It should be unique among all generators.
For instance, it is much easier to call GPT4 than GPT4Generator. See the PipelineGeneratorFactory section for details.
Attributes:
Name | Type | Description |
---|---|---|
instructions | | The instructions to guide the LLM in generating pipelines. |
Examples:
>>> class GPT4Generator(PipelineGeneratorBase):
...     short_name = "GPT4"
...
...     def __init__(self, instructions: str):
...         self.instructions = instructions
...
...     def generate_urban_pipeline(self, user_description: str) -> str:
...         # Implementation that uses GPT-4 to generate a pipeline
...         ...
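The contract described above can be exercised without any LLM. The following is a minimal sketch, assuming a stand-in base class named `SketchGeneratorBase` that mirrors the documented interface (it is not the real `PipelineGeneratorBase`) and a hypothetical `EchoGenerator` that returns a fixed template instead of calling a model.

```python
from abc import ABC, abstractmethod


class SketchGeneratorBase(ABC):
    """Stand-in for the documented interface; NOT the real PipelineGeneratorBase."""

    short_name: str = ""

    @abstractmethod
    def generate_urban_pipeline(self, user_description: str) -> str:
        """Return Python source code for the described pipeline."""


class EchoGenerator(SketchGeneratorBase):
    """Hypothetical generator that returns a template instead of calling an LLM."""

    short_name = "Echo"

    def __init__(self, instructions: str):
        self.instructions = instructions

    def generate_urban_pipeline(self, user_description: str) -> str:
        # A real implementation would send the instructions and the
        # description to an LLM and return the model's code output.
        return f"# Pipeline request: {user_description}\npipeline = None\n"


generator = EchoGenerator("Only emit Python code.")
code = generator.generate_urban_pipeline("Count taxi pickups per street segment")

# The abstract base cannot be instantiated because the abstract method is missing.
try:
    SketchGeneratorBase()
    base_is_abstract = False
except TypeError:
    base_is_abstract = True
```

Because `generate_urban_pipeline` is declared abstract, a subclass that forgets to implement it fails at instantiation time rather than later, when the factory first calls it.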
Source code in src/urban_mapper/modules/pipeline_generator/abc_pipeline_generator.py
generate_urban_pipeline(user_description)
abstractmethod
¶
Generate an UrbanMapper pipeline from a natural language description.
This method uses a large language model to generate Python code for an UrbanMapper pipeline based on the user's natural language description.
The generated code can then be executed to create and run the pipeline.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
user_description | str | A natural language description of the desired pipeline, such as "Load traffic data for New York and visualise accident hotspots." | required |
Returns:
Type | Description |
---|---|
str | A string containing Python code that implements the described pipeline. This code can be executed with exec() to run the pipeline. |
Examples:
>>> generator = SomeGenerator(instructions)
>>> pipeline_code = generator.generate_urban_pipeline(
... "Load taxi trip data for Manhattan and create a heatmap of pickups"
... )
>>> print(pipeline_code)  # Or use IPython.display.Code(pipeline_code, language="python") for highlighting. Running it with exec(pipeline_code) is possible but not recommended.
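Since running generated code directly is discouraged, it can help to inspect it first. The following is a minimal sketch using only the standard library; the helper `check_generated_code` and the sample string are hypothetical and not part of UrbanMapper. It parses the code without executing it and lists the modules it imports, so a reviewer can see at a glance what the pipeline depends on.

```python
import ast
from typing import List


def check_generated_code(source: str) -> List[str]:
    """Parse the code without running it and return the modules it imports."""
    # ast.parse raises SyntaxError if the text is not valid Python.
    tree = ast.parse(source)
    modules = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            modules.append(node.module or "")
    return modules


# Hypothetical stand-in for a string returned by generate_urban_pipeline.
pipeline_code = "from urban_mapper import UrbanMapper\nmapper = UrbanMapper()\n"
imported = check_generated_code(pipeline_code)
```

A syntax check of this kind does not make the code safe to run; it only catches malformed output and surfaces the imports for review.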
Source code in src/urban_mapper/modules/pipeline_generator/abc_pipeline_generator.py
GPT4PipelineGenerator
¶
Bases: PipelineGeneratorBase
Generates UrbanMapper pipelines using GPT-4.
This class uses the GPT-4 language model via the ell library to generate Python code for UrbanMapper pipelines based on user-provided instructions and descriptions.
What is ell
ell is a lightweight, functional prompt engineering framework built on a few core principles:
- Prompts are programs, not strings.
- Prompts are actually parameters of a machine learning model.
- Tools for monitoring, versioning, and visualization
- Multimodality should be first class
- ...and much more!
See the ell GitHub repository for more.
Short Name
To use this primitive, pass gpt-4 as the short name when calling with_LLM(.).
Source code in src/urban_mapper/modules/pipeline_generator/generators/gpt4_pipeline_generator.py
generate_urban_pipeline(user_description)
¶
Source code in src/urban_mapper/modules/pipeline_generator/generators/gpt4_pipeline_generator.py
GPT4OPipelineGenerator
¶
Bases: PipelineGeneratorBase
Generates UrbanMapper pipelines using GPT-4o.
This class uses the GPT-4o language model via the ell library to generate Python code for UrbanMapper pipelines based on user-provided instructions and descriptions.
What is ell
ell is a lightweight, functional prompt engineering framework built on a few core principles:
- Prompts are programs, not strings.
- Prompts are actually parameters of a machine learning model.
- Tools for monitoring, versioning, and visualization
- Multimodality should be first class
- ...and much more!
See the ell GitHub repository for more.
Short Name
To use this primitive, pass gpt-4o as the short name when calling with_LLM(.).
Source code in src/urban_mapper/modules/pipeline_generator/generators/gpt4o_pipeline_generator.py
generate_urban_pipeline(user_description)
¶
Source code in src/urban_mapper/modules/pipeline_generator/generators/gpt4o_pipeline_generator.py
GPT35TurboPipelineGenerator
¶
Bases: PipelineGeneratorBase
Generates UrbanMapper pipelines using GPT-3.5-turbo.
This class uses the GPT-3.5-turbo language model via the ell library to generate Python code for UrbanMapper pipelines based on user-provided instructions and descriptions.
What is ell
ell is a lightweight, functional prompt engineering framework built on a few core principles:
- Prompts are programs, not strings.
- Prompts are actually parameters of a machine learning model.
- Tools for monitoring, versioning, and visualization
- Multimodality should be first class
- ...and much more!
See the ell GitHub repository for more.
Short Name
To use this primitive, pass gpt-3.5-turbo as the short name when calling with_LLM(.).
Source code in src/urban_mapper/modules/pipeline_generator/generators/gpt35turbo_pipeline_generator.py
generate_urban_pipeline(user_description)
¶
Source code in src/urban_mapper/modules/pipeline_generator/generators/gpt35turbo_pipeline_generator.py
PipelineGeneratorFactory
¶
Factory class for creating and configuring pipeline generators.
This class implements a fluent, method-chaining interface for creating and configuring pipeline generators. Pipeline generators use Large Language Models (LLMs) to automatically create UrbanMapper pipelines from natural language descriptions.
The factory manages the details of generator instantiation, configuration, and execution, providing a consistent interface regardless of the underlying LLM implementation.
Attributes:
Name | Type | Description |
---|---|---|
_type | | The type of LLM-based generator to create. |
_custom_instructions | | Optional custom instructions to guide the LLM. |
Examples:
>>> from urban_mapper import UrbanMapper
>>> mapper = UrbanMapper()
>>> pipeline_code = (
...     mapper.pipeline_generator.with_LLM("GPT4")
...     .generate_urban_pipeline(
...         "Load taxi trips in Manhattan and show the count of pickups per street segments (roads)."
...     )
... )
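The chaining style works because each configuration method returns the factory itself. The following is a minimal sketch of that pattern, assuming a hypothetical `SketchFactory` that borrows only the documented method and attribute names; the `ValueError` raised when no LLM has been selected is an assumption of this sketch, and the real factory's internals may differ.

```python
class SketchFactory:
    """Hypothetical fluent factory; mirrors the documented method names only."""

    def __init__(self):
        self._type = None
        self._custom_instructions = None

    def with_LLM(self, primitive_type: str) -> "SketchFactory":
        self._type = primitive_type
        return self  # returning self is what enables chaining

    def with_custom_instructions(self, instructions: str) -> "SketchFactory":
        self._custom_instructions = instructions
        return self

    def generate_urban_pipeline(self, user_description: str) -> str:
        if self._type is None:
            # Assumption of this sketch: refuse to generate without an LLM type.
            raise ValueError("Call with_LLM(...) before generating a pipeline.")
        # A real factory would instantiate the selected generator and call the LLM.
        return f"# [{self._type}] {user_description}\n"


pipeline_code = (
    SketchFactory()
    .with_LLM("GPT4")
    .with_custom_instructions("Prefer street-level layers.")
    .generate_urban_pipeline("Show pickups per street segment")
)
```

Wrapping the chain in parentheses lets each call sit on its own line without backslash continuations.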
Source code in src/urban_mapper/modules/pipeline_generator/pipeline_generator_factory.py
with_LLM(primitive_type)
¶
Specify the LLM to use for pipeline generation.
See This As with.type(.)
In almost all other modules we use with.type(.) to specify the type of primitive for the module at hand. Here it is named with_LLM for the sake of clarity, given the narrow scope of the current module.
This method sets the type of LLM to use for generating pipelines. Available types are registered in the PIPELINE_GENERATOR_REGISTRY; you can also find them by browsing the generators folder in src/urban_mapper/modules/pipeline_generator/. They are all subclasses of PipelineGeneratorBase. The short name of the generator is used to identify the generator in the factory. It should be unique among all generators.
Naming Mistakes
If you make a mistake in the name of the generator, the factory will suggest a probable match from the available names. This is done using the fuzzywuzzy library, which uses Levenshtein distance to find the closest match.
A pretty cool addition to UrbanMapper!
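To illustrate how a closest-match suggestion can work, the following is a minimal sketch with a hand-written Levenshtein distance. It is not the factory's code: the factory relies on fuzzywuzzy, whose scoring differs in detail, and the helper `suggest` and the list of names are hypothetical ("GPT4" and "GPT35Turbo" come from this page, "GPT4O" is assumed).

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def suggest(name: str, available: list) -> str:
    """Return the available name with the smallest edit distance to `name`."""
    return min(
        available,
        key=lambda candidate: levenshtein(name.lower(), candidate.lower()),
    )


# Hypothetical registry keys; "GPT4O" is an assumed name for illustration.
names = ["GPT4", "GPT4O", "GPT35Turbo"]
suggestion = suggest("GTP4", names)
```

Here the misspelled "GTP4" is two substitutions away from "GPT4" and further from the other names, so "GPT4" is the suggestion.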
Parameters:
Name | Type | Description | Default |
---|---|---|---|
primitive_type | str | The name of the LLM type to use, such as "GPT4" or "GPT35Turbo". | required |
Returns:
Type | Description |
---|---|
PipelineGeneratorFactory | The PipelineGeneratorFactory instance for method chaining. |
Raises:
Type | Description |
---|---|
ValueError | If the specified LLM type is not found in the registry. |
Examples:
Source code in src/urban_mapper/modules/pipeline_generator/pipeline_generator_factory.py
with_custom_instructions(instructions)
¶
Set custom instructions for guiding the LLM.
This method provides custom instructions to guide the LLM in generating pipelines. This can be used to override the default instructions or to provide additional context or constraints.
What is the default instructions file?
The default instructions file is located at src/urban_mapper/modules/pipeline_generator/instructions.txt.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
instructions | str | The custom instructions to provide to the LLM. | required |
Returns:
Type | Description |
---|---|
PipelineGeneratorFactory | The PipelineGeneratorFactory instance for method chaining. |
Examples:
>>> instructions = "Generate a pipeline that focuses on urban mobility..."
>>> generator = mapper.pipeline_generator.with_custom_instructions(instructions)
>>> # Reading the instructions from a file
>>> with open("path/to/custom_instructions.txt", "r") as file:
...     instructions = file.read()
>>> generator = mapper.pipeline_generator.with_custom_instructions(instructions)
Source code in src/urban_mapper/modules/pipeline_generator/pipeline_generator_factory.py
generate_urban_pipeline(user_description)
¶
Generate an UrbanMapper pipeline suggestion from a natural language description.
This method uses the configured LLM to generate Python code for an UrbanMapper pipeline based on the user's natural language description.
The generated code can then be executed to create and run the pipeline.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
user_description | str | A natural language description of the desired pipeline, such as "Load traffic data for New York and visualise accident hotspots." | required |
Returns:
Type | Description |
---|---|
str | A string containing Python code that implements the described pipeline. This code can be executed with exec() to run the pipeline. |
Examples:
>>> pipeline_code = (
...     PipelineGeneratorFactory().with_LLM("GPT4")
...     .generate_urban_pipeline(
...         "Load taxi trip data for Manhattan and create a heatmap of pickups"
...     )
... )