Building AI-Powered Applications with Haystack and Ollama
Summary
In this post, I will demonstrate how to set up and use Haystack with Ollama.
Haystack is a framework for building applications powered by LLMs.
- It offers extensive LLM-related functionality.
- It is open source under the Apache license.
- It is actively developed, with numerous contributors.
- It is widely used in production by various clients.
These are some of the key things to check before adopting a library in a project.
Key Insight: This framework delivers a rich repository of modern, production-ready code. Using a framework like this accelerates development by removing redundant work: with its ready-made components we can quickly integrate and validate new AI techniques, which makes it much faster to build effective AI tools.
Ollama makes running local LLMs easy.
Installation
This will install the core system and support for Ollama.
pip install haystack-ai ollama-haystack
You can install Ollama here
To use this code, once Ollama is installed you will need to pull a model. In a command window, run:
ollama pull llama3.2
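Before building a pipeline, it is worth a quick smoke test to confirm the model is reachable. This is a minimal sketch of my own, assuming Ollama is serving on its default endpoint (http://localhost:11434) and that OllamaGenerator can be called directly via its run() method; the prompt is just an example.
from haystack_integrations.components.generators.ollama import OllamaGenerator
# Quick smoke test: call the local llama3.2 model once through the Haystack wrapper.
# Assumes Ollama is running on the default endpoint http://localhost:11434.
generator = OllamaGenerator(model="llama3.2", url="http://localhost:11434")
result = generator.run("Reply with the single word: ready")
print(result["replies"][0])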
Example: Naive RAG
This is a really simple RAG (retrieval-augmented generation) application.
RAG allows you to complement an LLM's knowledge with extra information.
from haystack_integrations.components.generators.ollama import OllamaGenerator
from haystack import Pipeline, Document
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
import logging
# logging is a required component of all production applications
logging.basicConfig(format="%(levelname)s - %(name)s - %(message)s", level=logging.WARNING,
                    filemode='w', filename='app.log')
logging.getLogger("haystack").setLevel(logging.DEBUG)
query = "What do I like?"
template = """
Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ query }}?
"""
docstore = InMemoryDocumentStore()
docstore.write_documents([Document(content="I really like summer"),
                          Document(content="My favorite sport is soccer"),
                          Document(content="I don't like reading sci-fi books"),
                          Document(content="I don't like crowded places")])
generator = OllamaGenerator(model="llama3.2",
                            url="http://localhost:11434",
                            generation_kwargs={
                                "num_predict": 100,
                                "temperature": 0.9,
                            })
pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(document_store=docstore))
pipe.add_component("prompt_builder", PromptBuilder(template=template))
pipe.add_component("llm", generator)
pipe.connect("retriever", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")
result = pipe.run({"prompt_builder": {"query": query},
"retriever": {"query": query}})
print(result)
{'llm': {'replies':
["Based on the context you provided, here are some things that you seem to like:
1. Summer
2. Soccer (your favorite sport)
3. Possibly other outdoor or recreational activities (not specified, but implied by your liking of summer and not crowded places)
Note that there's one thing you don't like in the list: reading sci-fi books."],
'meta': [{'model': 'llama3.2', 'created_at': '2025-02-28T18:39:15.8978676Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1151124700, 'load_duration': 14097000, 'prompt_eval_count': 75, 'prompt_eval_duration': 1000000, 'eval_count': 73, 'eval_duration': 1134000000, 'context': [...]}]}}
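The full result is a nested dictionary; if you only want the answer text, you can index into it directly, based on the structure shown above.
# The generated answer lives under result["llm"]["replies"]; take the first reply.
answer = result["llm"]["replies"][0]
print(answer)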
- Here we can see context being passed in and the LLM using that context to answer the question.
- We can see how we add logging to our application.
- We can see pipelines, which, as the name implies, tie operations together in sequence (the pipeline definition can also be serialized; see the sketch after this list).
- We see how to add components to a pipeline and how to connect them to each other.
- We see how to add parameters to components.
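Because a pipeline is a declarative graph of components, its definition can also be saved and restored. This is a minimal sketch, assuming the dumps()/loads() serialization API of the Haystack 2.x Pipeline class; it is not part of the original example.
from haystack import Pipeline
# Serialize the pipeline definition (components, init parameters, and connections) to YAML.
yaml_definition = pipe.dumps()
print(yaml_definition)
# Rebuild an equivalent pipeline from the YAML string.
restored_pipe = Pipeline.loads(yaml_definition)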
Example: QueryExpander
This is a query expander. Query expansion is a common pattern in LLM development where we use an LLM to get better information on a topic by having it look at the topic in different ways.
Using this approach we can get the LLM to examine the problem more thoroughly, and we can also have it generate new ways of looking at its solution.
import wikipedia
import json
from typing import List, Optional
from haystack import Pipeline, component
from haystack.components.builders import PromptBuilder
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.components.writers import DocumentWriter
from haystack.dataclasses import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.document_stores.types import DuplicatePolicy
from haystack_integrations.components.generators.ollama import OllamaGenerator
@component
class QueryExpander:
    def __init__(self, prompt: Optional[str] = None, model: str = "llama3.2"):
        self.query_expansion_prompt = prompt
        self.model = model
        if prompt is None:
            # Default few-shot prompt used when no custom prompt is supplied
            self.query_expansion_prompt = """
You are part of an information system that processes users queries.
You expand a given query into {{ number }} queries that are similar in meaning.
Structure:
Follow the structure shown below in examples to generate expanded queries.
Examples:
1. Example Query 1: "climate change effects"
Example Expanded Queries: ["impact of climate change", "consequences of global warming", "effects of environmental changes"]
2. Example Query 2: ""machine learning algorithms""
Example Expanded Queries: ["neural networks", "clustering", "supervised learning", "deep learning"]
Your Task:
Query: "{{query}}"
Example Expanded Queries:
"""
        # Internal pipeline: prompt builder -> local Ollama model
        builder = PromptBuilder(self.query_expansion_prompt)
        llm = OllamaGenerator(model=self.model)
        self.pipeline = Pipeline()
        self.pipeline.add_component(name="builder", instance=builder)
        self.pipeline.add_component(name="llm", instance=llm)
        self.pipeline.connect("builder", "llm")

    @component.output_types(queries=List[str])
    def run(self, query: str, number: int = 5):
        result = self.pipeline.run({'builder': {'query': query, 'number': number}})
        # Keep the raw model reply plus the original query as a fallback
        expanded_query = [result['llm']['replies'][0], [query]]
        return {"queries": list(expanded_query)}
expander = QueryExpander()
expander.run(query="open source nlp frameworks", number=3)
{'queries':
['Here are the expanded queries:
1. "open source nlp frameworks"
Expanded Queries:["open source natural language processing tools", "free nlp libraries for python", "open source machine learning nlp frameworks", "nlp open source software"]
2. "machine learning algorithms"
Expanded Queries:["neural networks for classification", "decision trees in machine learning", "random forests for regression", "support vector machines"]
3. "data science courses online"
Expanded Queries:["online data science certifications", "free data science tutorials", "data science boot camps with certification", "online courses for data science with hands on experience"]',
['open source nlp frameworks']]}
Note: I have found that getting models to return JSON is unreliable.
- Large LLM outputs sometimes truncate JSON responses.
- Formatting can be inconsistent, especially when dealing with generative models.
- Often I find the result wrapped in a markdown fence, e.g. ```json { the actual json } ```, which has to be stripped before parsing.
Models tend to return plain lists reliably, so work with that and use a regular expression to parse the results, like this:
import re

# 'text' is assumed to hold a raw pipeline result, e.g. text = pipe.run({...})
print(text['llm']['replies'])
text = text['llm']['replies'][0]
# Regular expression to extract the list items
pattern = r'\["([^"]+)", "([^"]+)", "([^"]+)", "([^"]+)"\]'
# Find the match
match = re.search(pattern, text)
# Extract and print the list items if a match is found
if match:
    items = match.groups()
    print(list(items))
else:
    print("No match found.")
['open source natural language processing frameworks', 'free machine learning libraries for NLP', 'public domain NLP tools and models', 'community-driven NLP software libraries']
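The pattern above only matches lists with exactly four quoted items. As a variation of my own (not from the original post), re.findall can pull out however many quoted strings appear inside the first bracketed block:
import re

def extract_list_items(text: str) -> list:
    """Extract quoted items from the first [...] block in a model reply."""
    bracket = re.search(r'\[(.*?)\]', text, re.DOTALL)
    if not bracket:
        return []
    # Grab every double-quoted string inside the brackets, regardless of count.
    return re.findall(r'"([^"]+)"', bracket.group(1))

reply = '["open source nlp frameworks", "free nlp libraries", "nlp toolkits"]'
print(extract_list_items(reply))
# ['open source nlp frameworks', 'free nlp libraries', 'nlp toolkits']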
Code
You can find the example code here