Prompt Chaining in AI Development

Published on
Jun 5, 2024

Prompt chaining is a way of sequencing LLM calls (and their prompts) so that the output of one call becomes the input to the next, guiding the model toward more useful answers than if it had been prompted only once.

By treating the entire chain of calls and prompts as part of a larger request to arrive at an ultimate response, you’re able to refine and steer the intermediate calls and responses at each step to achieve a better result.

Prompt chaining allows you to manage what might otherwise be a large, unwieldy prompt, whose implicitly defined subtasks and details can throw off language models and result in unsatisfying responses. LLMs lose focus when asked to process many different ideas thrown together: they can misread the relationships between instructions and execute them incompletely.

Such misreadings can also cascade as downstream errors given that LLMs generate text token by token.

We even see this in Chain-of-Thought (CoT) prompting sometimes. Though this style of prompting does a reasonably good job of decomposing tasks into smaller ones, it nonetheless generates the entire output on the fly in a single call, with no intermediate calls you can inspect or steer.

This gives you no granular control over the flow: What if you want to prompt engineer around the attributes of a single variable in the middle of a sequence?

That’s where prompt chaining comes in. In this article, we give you a full picture of what prompt chains do. We also show ways of implementing chains using Mirascope, our own Python toolkit for building with language models (GitHub), along with do’s and don’ts for chaining LLM calls.

A Brief Use Case

Imagine asking an LLM to generate a detailed travel itinerary for a week-long trip across Europe. You supply it with details like starting date, countries of interest, and duration of visit. You want it to give details on flight suggestions, hotel recommendations, and local attractions.

Feeding all this into the language model in one go might not produce the output you’re looking for. It might struggle to prioritize and accurately fulfill each request (whether explicitly or implicitly stated), resulting in an incomplete or unsatisfying itinerary. You might end up refining the prompt and resubmitting it until you’re satisfied with the answer.

To implement prompt chaining, you could break down the main prompt into smaller, separate prompts, each with its own call:

  1. “Suggest popular travel destinations in Europe for a week-long trip:” the LLM responds with a list of destinations.
  2. “What are good flight options from [starting city] to [chosen destination]:” it responds with a list of flights at different dates and times.
  3. “Recommend highly rated hotels in [chosen destination] arriving on [date] for a week-long stay:” it responds with a list of hotel options, and so on.

Smaller prompts and calls are more manageable and allow the model to focus on the details of each individual task, resulting in responses that are likely to be more thorough and accurate than if you had just sent all the details at once.
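
To make this concrete, here’s a rough sketch of what such a chain might look like in code. The class names, prompts, and hard-coded values below are our own invented examples (written in the same Mirascope `OpenAICall` style we cover in detail later in this article), not a definitive implementation:

from mirascope.openai import OpenAICall


class DestinationSuggester(OpenAICall):
    prompt_template = "Suggest popular travel destinations in Europe for a week-long trip."


class FlightFinder(OpenAICall):
    prompt_template = "What are good flight options from {starting_city} to {destination}?"

    starting_city: str
    destination: str


class HotelRecommender(OpenAICall):
    prompt_template = "Recommend highly rated hotels in {destination}, arriving on {date} for a week-long stay."

    destination: str
    date: str


# Step 1: get a list of candidate destinations.
destinations = DestinationSuggester().call().content

# Step 2: pick a destination from that list (hard-coded here for illustration) and look up flights.
flights = FlightFinder(starting_city="New York", destination="Lisbon").call().content

# Step 3: use the chosen destination and arrival date to get hotel recommendations.
hotels = HotelRecommender(destination="Lisbon", date="2024-07-01").call().content

In practice you would parse the first response (or let the user choose from it) before passing the chosen destination along, but the shape of the chain stays the same: each step’s output feeds the next prompt.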

Types of Prompt Chains

Although most of the literature concerning chaining focuses on sequential prompting, it’s possible to work with other configurations of prompt chains. 

In this section, we briefly describe each of these configurations before returning to focus mainly on sequential prompts in the rest of this article.

  • Sequential prompts
  • Parallel prompts
  • Conditional prompts
  • Recursive prompts

Note: Although different prompt types are individually described below, a powerful technique is to combine these as needed in your code.

Sequential Prompts

These are straightforward: the chain uses the output of each step’s call as input to the next step to build a more complex and refined output.

In the sequential prompt below, `RecipeRecommender` is instantiated and calls `ChefSelector`, whose output it uses for its own response.

from functools import cached_property

from mirascope.openai import OpenAICall
from pydantic import computed_field


# Step 1: Identify a Chef
class ChefSelector(OpenAICall):
    prompt_template = "Name a chef who is really good at cooking {food_type} food"
    food_type: str


# Step 2: Recommend Recipes
class RecipeRecommender(ChefSelector):
    prompt_template = "Imagine that you are chef {chef}. Recommend a {food_type} recipe using {ingredient}."
    ingredient: str

    @computed_field
    @cached_property
    def chef(self) -> str:
        """Uses `ChefSelector` to select the chef based on the food type."""
        return ChefSelector(food_type=self.food_type).call().content


recommender = RecipeRecommender(ingredient="apples", food_type="japanese")
response = recommender.call()
print(response.content)

Parallel Prompts

With parallel prompts, multiple prompts are executed simultaneously, often using the same initial input. The outputs of these parallel prompts can be combined or processed further in subsequent steps.

In the code example below, the `chef` and `ingredients` properties are computed independently, since neither depends on the other, which makes their respective LLM calls candidates for parallel execution. Their values are also cached, so each call is made only once no matter how many times a property is accessed.

Note: We’re currently improving our support for async properties so these can be run in parallel on separate threads.

# Parallel Execution
from functools import cached_property

from mirascope.openai import OpenAICall, OpenAIExtractor
from pydantic import computed_field


class ChefSelector(OpenAICall):
    prompt_template = """
    Please identify a chef who is well known for cooking with {ingredient}.
    Respond only with the chef's name.
    """

    ingredient: str


class IngredientIdentifier(OpenAIExtractor[list]):
    extract_schema: type[list] = list[str]
    prompt_template = """
    Given a base ingredient {ingredient}, return a list of complementary ingredients.
    Make sure to exclude the original ingredient from the list.
    """

    ingredient: str


class RecipeRecommender(ChefSelector):
    prompt_template = """
    SYSTEM:
    Your task is to recommend a recipe. Pretend that you are chef {chef}.

    USER:
    Recommend recipes that use the following ingredients:
    {ingredients}
    """

    @computed_field
    @cached_property
    def chef(self) -> str:
        response = ChefSelector(ingredient=self.ingredient).call()
        return response.content

    @computed_field
    @cached_property
    def ingredients(self) -> list[str]:
        identifier = IngredientIdentifier(ingredient=self.ingredient)
        return identifier.extract() + [self.ingredient]


recommender = RecipeRecommender(ingredient="apples")
response = recommender.call()
print(response.content)

Conditional Prompts

Conditional prompts dynamically adjust their prompt queries based on specific conditions or criteria. This involves using conditional logic to modify parts of the prompt template depending on the input data or the results of intermediate steps.

Below, the `conditional_review_prompt` property checks the sentiment of the review and returns a different string depending on whether the sentiment is positive or negative, which then dynamically adjusts the prompt template used in `ReviewResponder`.

# Conditional Prompt
from enum import Enum
from functools import cached_property

from mirascope.openai import OpenAICall, OpenAIExtractor
from pydantic import computed_field


class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"


class SentimentClassifier(OpenAIExtractor[Sentiment]):
    extract_schema: type[Sentiment] = Sentiment
    prompt_template = "Is the following review positive or negative? {review}"

    review: str


class ReviewResponder(OpenAICall):
    prompt_template = """
    SYSTEM:
    Your task is to respond to a review.
    The review has been identified as {sentiment}.
    Please write a {conditional_review_prompt}.

    USER: Write a response for the following review: {review}
    """

    review: str

    @property
    def conditional_review_prompt(self) -> str:
        if self.sentiment == Sentiment.POSITIVE:
            return "thank you response for the review."
        else:
            return "response addressing the review."

    @computed_field
    @cached_property
    def sentiment(self) -> Sentiment:
        classifier = SentimentClassifier(review=self.review)
        return classifier.extract()


positive_review = "This tool is awesome because it's so flexible!"
responder = ReviewResponder(review=positive_review)
response = responder.call()
print(f"Sentiment: {responder.sentiment}")  # positive
print(f"Positive Response: {response.content}")

negative_review = "This product is terrible and too expensive!"
responder.__dict__.pop("sentiment", None)  # remove cached sentiment so it's recomputed
responder.review = negative_review
response = responder.call()
print(f"Sentiment: {responder.sentiment}")  # negative
print(f"Negative Response: {response.content}")

Recursive Prompts

Recursive prompts involve a repeating loop in which a prompt calls itself or another prompt. This is useful for tasks that require iteration, such as refining outputs or handling multi-step processes that need repeated evaluation.

In the code below, the `rewrite_iteratively` function takes the initial summary from the first prompt and refines it through iterative feedback and rewriting. Each iteration's output becomes the input for the next iteration, and the number of passes is controlled by the `depth` parameter.

# Recursive Prompt
from mirascope.openai import OpenAICall, OpenAIExtractor
from pydantic import BaseModel, Field


class SummaryFeedback(BaseModel):
    """Feedback on summary with a critique and review rewrite based on said critique."""

    critique: str = Field(..., description="The critique of the summary.")
    rewritten_summary: str = Field(
        ...,
        description="A rewritten summary that takes the critique into account.",
    )


class Summarizer(OpenAICall):
    prompt_template = "Summarize the following text into one sentence: {original_text}"

    original_text: str


class Resummarizer(OpenAIExtractor[SummaryFeedback]):
    extract_schema: type[SummaryFeedback] = SummaryFeedback
    prompt_template = """
    Original Text: {original_text}
    Summary: {summary}

    Critique the summary of the original text.
    Then rewrite the summary based on the critique. It must be one sentence.
    """

    original_text: str
    summary: str


def rewrite_iteratively(original_text: str, summary: str, depth=2):
    resummarizer = Resummarizer(original_text=original_text, summary=summary)
    for _ in range(depth):
        feedback = resummarizer.extract()
        resummarizer.summary = feedback.rewritten_summary
    # After the loop, `summary` holds the latest rewritten summary.
    return resummarizer.summary


original_text = """
In the heart of a dense forest, a boy named Timmy pitched his first tent, fumbling with the poles and pegs.
His grandfather, a seasoned camper, guided him patiently, their bond strengthening with each knot tied.
As night fell, they sat by a crackling fire, roasting marshmallows and sharing tales of old adventures.
Timmy marveled at the star-studded sky, feeling a sense of wonder he'd never known.
By morning, the forest had transformed him, instilling a love for the wild that would last a lifetime.
"""

summarizer = Summarizer(original_text=original_text)
summary = summarizer.call().content
print(f"Summary: {summary}")
# > Summary: During a camping trip in a dense forest, Timmy, guided by his grandfather, experienced a transformative bonding moment and developed a lifelong love for the wilderness.
rewritten_summary = rewrite_iteratively(original_text, summary)
print(f"Rewritten Summary: {rewritten_summary}")
# > Rewritten Summary: During a camping trip in a dense forest, Timmy, guided by his seasoned camper grandfather, fumbled with the tent, bonded over a crackling fire and marshmallows, marveled at a star-studded sky, and discovered a profound, lifelong love for the wilderness.

Getting Started with Prompt Chaining, with Examples

Many frameworks offer dedicated chaining functionality, such as LangChain Expression Language (LCEL), which is a declarative language for composing chains.

As its name suggests, LangChain was created with chaining in mind and offers specialized classes and abstractions for accomplishing sequences of LLM calls.

You typically assemble chains in LCEL using its pipe operator (`|`), along with classes like `Runnable` and `SequentialChain`. LCEL chains generally have this format:

chain = prompt | model | output_parser
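
For instance, a filled-in version of that format might look something like the following (this assumes the `langchain-core` and `langchain-openai` packages; exact imports and defaults vary by version, so treat this as illustrative):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Name a chef who is famous for {food_type} food.")
model = ChatOpenAI()
output_parser = StrOutputParser()

# The pipe operator composes prompt, model, and parser into a single runnable chain.
chain = prompt | model | output_parser
print(chain.invoke({"food_type": "japanese"}))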


LCEL works well with simpler prompts, offering a compact syntax with straightforward flows via the pipe operator.

But when building more complex chains, we found it challenging to debug errors and follow pipeline operations in detail. For example, with LCEL we always had to wrap things in objects that fit the chain, so we couldn’t just write code the way we normally would.

In particular, we found `RunnablePassthrough`, an object that forwards input data unchanged through a chain, to be unnecessary and to actually hinder building complex prompts with parallel sub-chains. It’s more of an afterthought to solve a problem that LangChain itself created with LCEL. If you do things in a Pythonic way, as Mirascope does, you don’t need additional classes or structures to pass information through a chain; you simply always have access to it, as you should.

Due to the complexity of working with such frameworks, we designed Mirascope so you can build pipelines using the Python syntax you already know, rather than having to learn new structures.

Mirascope lets you build prompt chains using either Python properties or functions, as explained below.

Chaining Prompts Using Properties in Mirascope

Chaining prompts with properties means building a sequence of method calls where each step depends on the result of the previous step. The example below leverages the `@cached_property` decorator to cache the result of the LLM call (made only once) that determines a scientist based on a field of study; this result is then available to subsequent calls.

import os
from functools import cached_property

from mirascope.openai import OpenAICall
from pydantic import computed_field

os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"


class ScientistSelector(OpenAICall):
    prompt_template = """
    Name a scientist who is known for their work in {field_of_study}.
    Give me just the name.
    """

    field_of_study: str


class TheoryExplainer(ScientistSelector):
    prompt_template = """
    SYSTEM:
    Imagine that you are {scientist}.
    Your task is to explain a theory that you, {scientist}, are famous for.

    USER:
    Explain the theory related to {topic}.
    """

    topic: str

    @computed_field
    @cached_property
    def scientist(self) -> str:
        """Uses `ScientistSelector` to select the scientist based on the field of study."""
        return ScientistSelector(field_of_study=self.field_of_study).call().content


explainer = TheoryExplainer(field_of_study="physics", topic="relativity")
theory_explanation = explainer.call()
print(theory_explanation.content)
# > The theory of relativity, developed by Albert Einstein, is a fundamental theory in physics ...

In addition, `@computed_field` includes the output at every step of the chain in the final dump:

print(explainer.dump())
#> {
#>   "tags": [],
#>   "template": "SYSTEM:\nImagine that you are {scientist}.\nYour task is to explain a theory that you,
#>     {scientist}, are famous for.\n\nUSER:\nExplain the theory related to {topic}.",
#>   "inputs": {
#>     "field_of_study": "physics",
#>     "topic": "relativity",
#>     "scientist": "Albert Einstein"
#>   }
#> }

We generally recommend building with properties:

  • Your pipelines will be more readable than with functions, as the flow of data and dependencies will be clear.
  • Once defined, properties can be reused multiple times.
  • Properties (especially cached properties) are evaluated only when accessed, which can be efficient if certain steps in the chain are conditionally required or re-used multiple times.
  • Using properties keeps the logic encapsulated within the class, making it easier to manage and debug. Each property represents a distinct step in the chain, and the dependencies are explicitly defined.

Chaining Prompts Using Functions in Mirascope

You can alternatively use functions rather than properties to chain prompts, straightforwardly passing the output from one call to the next. Functions offer a simple way to build chains, but you lose the efficiency of cached call outputs and the ability to colocate all of the chain’s inputs with a single prompt.

import os

from mirascope.openai import OpenAICall

os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"


class ScientistSelector(OpenAICall):
    prompt_template = """
    Name a scientist who is well-known for their work in {field_of_study}.
    Give me just the name.
    """

    field_of_study: str


class TheoryExplainer(OpenAICall):
    prompt_template = """
    SYSTEM:
    Imagine that you are scientist {scientist}.
    Your task is to explain a theory that you, {scientist}, are famous for.

    USER:
    Explain the theory of {theory}.
    """

    scientist: str
    theory: str


selector = ScientistSelector(field_of_study="physics")
scientist = selector.call().content
print(scientist)
# > Albert Einstein.

explainer = TheoryExplainer(scientist=scientist, theory="relativity")
theory_explanation = explainer.call().content
print(theory_explanation)
# > Certainly! Here's an explanation of the theory of relativity: ...


When using functions for prompt chaining, the results of intermediate steps may not be evident. When dumping the results of this example, we see that `field_of_study="physics"` doesn’t appear, which means you’d need to log every call along the chain to get full transparency on model outputs.

print(explainer.dump())
#> {
#>   "tags": [],
#>   "template": "SYSTEM:\nImagine that you are scientist {scientist}.\nYour task is to explain a theory that you,
#>     {scientist}, are famous for.\n\nUSER:\nExplain the theory of {theory}.",
#>   "inputs": {
#>     "scientist": "Albert Einstein",
#>     "theory": "relativity"
#>   }
#> }

Functions offer some advantages, however:

  • They offer explicit control over the flow of execution, making it easier to implement complex chains where steps might need to be skipped or conditionally executed. This can be more straightforward compared to using properties, as functions allow you to write normal Python scripts with simple conditional logic (see the sketch after this list).

  • Functions can be more easily reused across different classes and modules. This is useful if the chaining logic needs to be applied in various contexts or with different classes.
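
As an illustration of that first point, here’s a minimal sketch of a function-based chain that routes to a different call depending on an intermediate result. The classes and prompts are invented for this example, written in the same `OpenAICall` style used above:

from mirascope.openai import OpenAICall


class TopicClassifier(OpenAICall):
    prompt_template = "Is the following question technical or general? Answer with one word: {question}"

    question: str


class TechnicalAnswerer(OpenAICall):
    prompt_template = "Answer this technical question, including code examples where helpful: {question}"

    question: str


class GeneralAnswerer(OpenAICall):
    prompt_template = "Answer this question in plain, non-technical language: {question}"

    question: str


def answer(question: str) -> str:
    """Routes the question to a different call based on an intermediate classification."""
    category = TopicClassifier(question=question).call().content.strip().lower()
    if "technical" in category:
        return TechnicalAnswerer(question=question).call().content
    return GeneralAnswerer(question=question).call().content


print(answer("How do I merge two dictionaries in Python?"))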

Downsides of Prompt Chaining

Prompt chaining provides utility in situations where you want to set up pipelines for processing and transforming data step-by-step. 

However, there are downsides:

  • Each step of the chain needs to be explicitly defined ahead of time in a fixed sequence, which leaves little room for deviation based on runtime conditions or user input beyond what has been predefined. A dynamically generated agent workflow might adapt more flexibly to different conditions at runtime, but with this you also lose granular control over each piece of the chain.

  • Prompt chaining isn’t necessarily a cheaper option than alternatives; cost depends on how calls are structured. Since pricing is token based, a chain of two calls may cost about the same as a single CoT call if the total input and output tokens are the same.

  • Speed might also be a downside, as you’ll need to make several requests instead of just one and this can slow down the "real-time" feel of the application. In such cases, you might consider providing feedback to the user, like progress messages, as the chain is executing intermediate steps.

You should balance these considerations when deciding whether to leverage prompt chaining for generative AI in your application.

Best Practices for Prompt Chaining

Good prompt chaining practices share a lot with best prompt engineering practices. Here are some recommendations for developing effective prompt chains when using AI models:

Keep Your Prompts Focused and Specific

Clear and concise prompts minimize misunderstandings by the model. For example, rather than writing, "Tell me something about scientists," a more focused prompt might be, "Name a scientist known for their work in quantum physics."

Each prompt should also perform a specific function and have a single responsibility assigned. This means breaking down complex tasks into smaller, manageable steps where each prompt addresses one aspect of the overall task.

Manage Data Carefully

Ensure that the format of the desired output of one prompt matches the expected input format of the next prompt to avoid errors. Schemas can easily help you achieve this, and the Mirascope library offers its `BaseExtractor` class, which is built on top of Pydantic, to help you define schemas and validate outputs of large language models.

For example, if a prompt expects a JSON object with specific fields, ensure the preceding prompt generates data in that exact structure. As well, be prepared to transform data to meet the input requirements of subsequent prompts. This might involve parsing JSON, reformatting strings, or converting data types.
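
As a rough sketch of this idea (the model and class names below are our own, not part of Mirascope), you can define the handoff format as a Pydantic model so the first step’s extracted output is validated before it feeds the next prompt:

from mirascope.openai import OpenAICall, OpenAIExtractor
from pydantic import BaseModel, Field


class TripPlan(BaseModel):
    """Structured output of the first step, matching the inputs of the next prompt."""

    destination: str = Field(..., description="The chosen destination city.")
    arrival_date: str = Field(..., description="Arrival date in YYYY-MM-DD format.")


class TripPlanner(OpenAIExtractor[TripPlan]):
    extract_schema: type[TripPlan] = TripPlan
    prompt_template = "Plan a week-long trip based on this request: {request}"

    request: str


class HotelRecommender(OpenAICall):
    prompt_template = "Recommend hotels in {destination} for a stay starting {arrival_date}."

    destination: str
    arrival_date: str


plan = TripPlanner(request="A relaxing week in southern Europe in July").extract()
# The validated fields of `plan` line up exactly with the inputs the next call expects.
response = HotelRecommender(destination=plan.destination, arrival_date=plan.arrival_date).call()
print(response.content)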

Optimize for Performance

Use caching like `@cached_property` whenever feasible to improve performance. We also recommend finding ways to minimize the number of sequential calls by combining steps where possible, which may involve consolidating prompts or rethinking the chain structure to reduce the dependency on intermediate results.

The goal should be to minimize latency for the user experience, but there’s a tradeoff with complexity, as decreasing latency might increase the complexity of individual steps. Developers need to balance the trade-offs between maintaining a clear, manageable chain and optimizing for speed.

Harness the Efficiency of Python for Prompt Chaining

Mirascope leverages Python for chaining LLM calls, avoiding complex abstractions with steep learning curves and the need to master new frameworks.

By keeping things simple, Mirascope lets you work with your data directly, helping you keep your workflows smoother and more intuitive. 

Want to learn more? You can find more Mirascope code samples on both our documentation site and on GitHub.

Join our beta list!

Get updates and early access to try out new features as a beta tester.
