Building an Agent with Ollama + LangChain

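LLMs are extremely capable today, but using them only for generative tasks such as chat completion or image generation is like tying both hands behind your back. With the rise of Agents in recent years, AI is bound to work its way into every corner of daily life.

Microsoft's Semantic Kernel documentation introduces Agents like this: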

An AI agent is a software entity designed to perform tasks autonomously or semi-autonomously by receiving input, processing information, and taking actions to achieve specific goals.

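I bolded the few keywords I personally consider important, but I have to say this description still leaves the reader scratching their head. By comparison, LangChain's definition of an Agent is very concise: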

Agent is a class that uses an LLM to choose a sequence of actions to take.

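In short, this is how I think of an Agent: it plays to the LLM's strengths by using the LLM as a reasoning engine that plans and carries out external actions.

About this article

The hottest large models right now are those from OpenAI and DeepSeek, and there are plenty of examples online of building Agents with OpenAI in particular. At this point in time, though, I think a few things are still unsatisfactory: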

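Not everyone in China can obtain an OpenAI key through regular channels, and not everyone has a foreign credit card to pay for it.
As for DeepSeek, apart from "system busy" errors and network timeouts, its well-known R1 reasoning model does not officially support function calling (see https://github.com/deepseek-ai/DeepSeek-R1/issues/9#issuecomment-2604747754), and as of today function calling in the latest version is still unstable.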

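Note: DeepSeek has officially suggested three workarounds for function calling with the R1 model:

- Use a script to parse the model output into a structured format (for example, JSON)
- Use prompt engineering to make the model produce output in a specific format
- Write a custom wrapper to simulate function calling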
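
The first and third workarounds can be sketched roughly as follows. This is only an illustration, not DeepSeek's or LangChain's implementation: the JSON shape ({"tool": ..., "args": ...}), the helper names extract_tool_call and dispatch, the tool registry, and the sample reply are all assumptions made up for the example.

```python
import json
import re

# Hypothetical tool registry; the weather string is made-up sample data.
TOOLS = {
    "get_weather": lambda city: {"北京": "晴，25°C"}.get(city, "unknown"),
}

def extract_tool_call(text: str):
    """Parse the span from the first '{' to the last '}' in the reply as JSON.

    Returns a dict like {"tool": ..., "args": {...}}, or None when the reply
    does not contain a valid tool call.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or "tool" not in call:
        return None
    return call

def dispatch(call: dict) -> str:
    """Run the requested tool with its arguments (the 'wrapper' part)."""
    func = TOOLS[call["tool"]]
    return func(**call.get("args", {}))

# Sample reply of the kind a reasoning model might give when the prompt asks for JSON.
reply = '<think>The user wants the weather.</think>\n{"tool": "get_weather", "args": {"city": "北京"}}'
call = extract_tool_call(reply)
result = dispatch(call) if call else reply
print(result)
```

The weak point of this approach is that nothing forces the model to emit valid JSON, so the None branch has to be handled (here by falling back to the raw reply).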

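So the awkward part right now is that I don't want to wait: "Teacher Gao, I really want to make progress."

This article uses Ollama + Qwen2.5 to drive the Agent.

The point of Ollama is that the model can be deployed locally, with no dependence on external network conditions, which matters a great deal in classified or offline settings.
The point of Qwen2.5 is its very good Chinese support. Unlike llama3.2, it does not keep switching to English unprompted, which makes the output messy.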

The LLM

First comes a plain LLM, which needs no further introduction:

from langchain_community.chat_models import ChatOllama

# Address of the Ollama server; change this to match your own deployment.
OLLAMA_BASE_URL = "http://192.168.20.11:11434"

llm = ChatOllama(
    model="qwen2.5",
    base_url=OLLAMA_BASE_URL,
    temperature=0.5,
)

The Tool

Next, we define a tool for looking up the weather:

from langchain.tools import tool

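# The docstring below is passed to the LLM as the tool description, so it is kept
# in Chinese to match the Chinese question asked later. In English it reads:
# "Look up the weather for the given city. Input: city name (string); output:
# weather description (string). Example: input 常州, returns 晴转多云, 23℃."
@tool
def get_weather(city: str) -> str:
    """
    查询指定城市的天气，输入城市名(字符串)，输出天气描述(字符串)。
    示例: 输入“常州”，返回“晴转多云, 23℃”
    """
    # Hard-coded sample data standing in for a real weather API.
    weather_data = {
        "北京": "晴，25°C",  # Beijing: sunny, 25°C
        "上海": "多云，28°C",  # Shanghai: cloudy, 28°C
        "广州": "雷阵雨，30°C"  # Guangzhou: thunderstorms, 30°C
    }
    # Fallback message: "no weather information found for this city".
    return weather_data.get(city, "未找到该城市天气信息")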

Note the docstring here. I wrote it in some detail, and not for human readers: it is there for the LLM to read.
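
To see why the docstring matters, here is a rough stdlib-only sketch of how a function's name, signature, and docstring can be flattened into the text that fills a prompt's tool list. describe_tool is a hypothetical helper written for this illustration, and get_weather_plain is a simplified, undecorated stand-in for the function above with an English docstring; LangChain's own rendering may differ in detail.

```python
import inspect

def get_weather_plain(city: str) -> str:
    """Look up the weather for a city. Input: city name; output: a weather description."""
    return "sunny, 25°C"  # made-up sample value

def describe_tool(func) -> str:
    """Render one tool as 'name(signature) - docstring' for use in a prompt."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return f"{func.__name__}{signature} - {doc}"

print(describe_tool(get_weather_plain))
```

With an empty or vague docstring, the model would see little more than the function name, and would have much less to go on when deciding whether and how to call the tool.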

Agent

Now we come to the key part, the Agent:

from langchain import hub
from langchain.agents import create_react_agent, AgentExecutor

# Define the available tools
tools = [get_weather]

# Create the ReAct Agent
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, tools, prompt)

The hwchase17/react prompt used here is actually:

Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
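
As a rough illustration of how this template becomes the text the model actually sees, the sketch below fills the placeholders with plain str.format. It uses a shortened copy of the template, and the tool description, the steps list, and the build_scratchpad helper are made up for the example; LangChain does the equivalent with its own prompt classes.

```python
# Shortened copy of the template above; the format instructions are abbreviated.
TEMPLATE = """Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format: (format instructions omitted in this sketch)
Action: the action to take, should be one of [{tool_names}]

Begin!

Question: {input}
Thought:{agent_scratchpad}"""

def build_scratchpad(steps) -> str:
    """Turn (model_text, observation) pairs into the running transcript."""
    text = ""
    for log, observation in steps:
        text += log
        text += f"\nObservation: {observation}\nThought: "
    return text

# One completed step: what the model wrote, and what the tool returned.
steps = [
    (" I should look up Beijing first.\nAction: get_weather\nAction Input: 北京", "晴，25°C"),
]

prompt_text = TEMPLATE.format(
    tools="get_weather(city: str) -> str - look up the weather for a city",
    tool_names="get_weather",
    input="How is the weather in Beijing today?",
    agent_scratchpad=build_scratchpad(steps),
)
print(prompt_text)
```

On the first call the scratchpad is empty, so the prompt ends with "Thought:" and the model continues from there. On later calls, every earlier action and observation is replayed before the model writes its next thought.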

Finally, run the Agent:

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

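# The question, in English: "How is the weather in Beijing and Shanghai today?
# Please compare them and recommend which is better for a trip."
result = agent_executor.invoke({
    "input": "今天北京和上海的天气分别怎么样？请对比一下，并推荐哪个更适合旅游出行"
})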
print(result["output"])

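The output looks something like this. The model reasons and answers in Chinese: it looks up Beijing, then Shanghai, and finally recommends Beijing because sunny weather suits outdoor activities better: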

> Entering new AgentExecutor chain...
我需要查询北京和上海今天的天气情况，然后进行比较并给出建议。
Action: get_weather
Action Input: "北京" 晴，25°C 我还需要查询上海的天气。
Action: get_weather
Action Input: "上海" 多云，28°C 现在我得到了北京和上海的天气信息，接下来可以进行比较并给出建议。
Final Answer: 北京今天是晴天，气温为25℃；而上海则是多云，气温为28℃。从温度上看，上海略高于北京，但考虑到北京可能有更少的湿度，空气更加清新，且晴朗的天气更适合户外活动。因此如果考虑旅游出行，北京可能是更好的选择。当然，具体选择还需根据个人喜好和旅行目的来定。

> Finished chain.
北京今天是晴天，气温为25℃；而上海则是多云，气温为28℃。从温度上看，上海略高于北京，但考虑到北京可能有更少的湿度，空气更加清新，且晴朗的天气更适合户外活动。因此如果考虑旅游出行，北京可能是更好的选择。当然，具体选择还需根据个人喜好和旅行目的来定。

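Looks like it works reasonably well?

You may wonder what create_react_agent is and whether any black magic is involved. There isn't: it simply builds an ordinary agent from the ReAct prompt. The whole function is only about 20 lines, and its core looks like this: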

    prompt = prompt.partial(
        tools=tools_renderer(list(tools)),
        tool_names=", ".join([t.name for t in tools]),
    )
    if stop_sequence:
        stop = ["\nObservation"] if stop_sequence is True else stop_sequence
        llm_with_stop = llm.bind(stop=stop)
    else:
        llm_with_stop = llm
    output_parser = output_parser or ReActSingleInputOutputParser()
    agent = (
        RunnablePassthrough.assign(
            agent_scratchpad=lambda x: format_log_to_str(x["intermediate_steps"]),
        )
        | prompt
        | llm_with_stop
        | output_parser
    )
    return agent
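
To make the loop around this chain more concrete, here is a stdlib-only sketch of a ReAct-style cycle: call the model, cut the reply at "\nObservation", parse out either an action or a final answer, run the tool, and append the result to the scratchpad. Everything in it is an assumption for illustration: fake_llm returns scripted replies instead of calling a real model, and parse_reply and run_agent are hypothetical stand-ins, not LangChain's ReActSingleInputOutputParser or AgentExecutor.

```python
import re

WEATHER = {"Beijing": "sunny, 25°C", "Shanghai": "cloudy, 28°C"}
TOOLS = {"get_weather": lambda city: WEATHER.get(city, "no data for that city")}

# Scripted replies standing in for a real model. The first one invents its own
# Observation, which is exactly what the stop sequence is meant to cut off.
SCRIPT = iter([
    " I need Beijing's weather.\nAction: get_weather\nAction Input: Beijing\nObservation: made up!",
    " Now Shanghai.\nAction: get_weather\nAction Input: Shanghai",
    " I now know the final answer\nFinal Answer: Beijing is sunny, Shanghai is cloudy.",
])

def fake_llm(prompt: str, stop: str) -> str:
    """Return the next scripted reply, truncated at the stop sequence."""
    return next(SCRIPT).split(stop)[0]

def parse_reply(reply: str):
    """Return ('final', answer) or ('action', tool_name, tool_input)."""
    if "Final Answer:" in reply:
        return ("final", reply.split("Final Answer:")[-1].strip())
    match = re.search(r"Action\s*:\s*(.*?)\s*Action\s*Input\s*:\s*(.*)", reply, re.DOTALL)
    if match is None:
        raise ValueError(f"could not parse reply: {reply!r}")
    return ("action", match.group(1).strip(), match.group(2).strip().strip('"'))

def run_agent(question: str, max_steps: int = 5) -> str:
    """Alternate between model calls and tool calls until a final answer appears."""
    scratchpad = ""
    for _ in range(max_steps):
        reply = fake_llm(f"Question: {question}\nThought:{scratchpad}", stop="\nObservation")
        parsed = parse_reply(reply)
        if parsed[0] == "final":
            return parsed[1]
        _, tool_name, tool_input = parsed
        observation = TOOLS[tool_name](tool_input)  # the real observation replaces any invented one
        scratchpad += f"{reply}\nObservation: {observation}\nThought:"
    raise RuntimeError("agent did not finish within max_steps")

answer = run_agent("Compare the weather in Beijing and Shanghai.")
print(answer)
```

The stop sequence is the design choice worth noticing: without it the model is free to write its own "Observation:" line, and the agent would then reason over weather it made up rather than over the tool's result.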

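Note: this version is based on a relatively early ReAct paper, "ReAct: Synergizing Reasoning and Acting in Language Models". Because it is fairly simple and not well suited to production use, the official recommendation is to use create_react_agent() from LangGraph instead; see the reference doc.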
