# AI model
model = Azure.chat_model
chain = LLMChain(prompt=prompt, llm=model)
output = chain.run("around when was bitcoin founded?")

# Parse
print(output_parser.parse(output))
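This is probably because AzureOpenAI is used here: the model does not reliably stick to the required output format, so the result is unstable.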
ValueError: time data 'Bitcoin was founded on January 3, 2009. The corresponding datetime string would be "2009-01-03T00:00:00.000000Z".' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'
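The error above occurs because the model wrapped the timestamp in an explanatory sentence, while the parser expects the bare string. A minimal sketch of a more tolerant approach, using only the standard library: search the reply for a substring in the expected format and parse only that. The helper name `extract_datetime` and the regular expression are illustrative assumptions, not part of LangChain.

```python
import re
from datetime import datetime

FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"
# Matches strings such as 2009-01-03T00:00:00.000000Z
PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z")


def extract_datetime(text: str) -> datetime:
    """Find the first timestamp in `text` and parse it (hypothetical helper)."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError(f"no timestamp found in: {text!r}")
    return datetime.strptime(match.group(0), FORMAT)


reply = (
    "Bitcoin was founded on January 3, 2009. The corresponding datetime "
    'string would be "2009-01-03T00:00:00.000000Z".'
)
parsed = extract_datetime(reply)
print(parsed)
```

This only works around the symptom. If the model omits the timestamp entirely, the helper still raises `ValueError`.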
Enum parser
The enum parser parses the model output into a member of an enum.
Define the enum
class Colors(Enum):
    WHITE = "白色"
    RED = "红色"
    GREEN = "绿色"
    BLUE = "蓝色"
    YELLOW = "黄色"
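Conceptually, parsing an enum amounts to looking up a member by its value. The sketch below shows the idea with the standard library only. `parse_color` is a hypothetical helper, and this illustrates the principle rather than LangChain's actual implementation.

```python
from enum import Enum


class Colors(Enum):
    WHITE = "白色"
    RED = "红色"
    GREEN = "绿色"
    BLUE = "蓝色"
    YELLOW = "黄色"


def parse_color(text: str) -> Colors:
    """Map raw model output to a Colors member (illustrative helper)."""
    try:
        # Enum lookup by value; strip whitespace the model may have added
        return Colors(text.strip())
    except ValueError:
        valid = ", ".join(c.value for c in Colors)
        raise ValueError(f"{text!r} is not one of: {valid}") from None


print(parse_color(" 红色\n"))
```

Any reply that is not exactly one of the enum values raises `ValueError`, which is why the prompt has to ask the model to answer with one of the listed values only.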
# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    # You can easily add custom validation logic with Pydantic.
    @validator("setup")
    def question_ends_with_question_mark(cls, field):
        if field[-1] != "?":
            raise ValueError("Badly formed question!")
        return field


# PydanticOutputParser
parser = PydanticOutputParser(pydantic_object=Joke)
Define the template
# And a query intended to prompt a language model to populate the data structure.
joke_query = "Tell me a joke."

# Prompt template
prompt_template = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# >> Joke(setup='Why did the chicken cross the road?', punchline='To get to the other side!')
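To make the role of the partial variable more concrete, here is a rough standard-library sketch of how such a template is filled. The format instructions are fixed once, and only the query changes per call. The `format_instructions` text and the `build_prompt` helper are stand-ins invented for illustration. The real instructions come from `parser.get_format_instructions()`.

```python
TEMPLATE = "Answer the user query.\n{format_instructions}\n{query}\n"

# Stand-in text (assumption); the real parser generates a longer description.
format_instructions = 'Return a JSON object with the keys "setup" and "punchline".'


def build_prompt(query: str) -> str:
    """Fill the template; format_instructions is already fixed (hypothetical helper)."""
    return TEMPLATE.format(format_instructions=format_instructions, query=query)


prompt_text = build_prompt("Tell me a joke.")
print(prompt_text)
```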
Example with a compound-typed field
# Here's another example, but with a compound typed field.
class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: List[str] = Field(description="list of names of films they starred in")


# Pydantic parser
parser = PydanticOutputParser(pydantic_object=Actor)
langchain.schema.output_parser.OutputParserException: Failed to parse Actor from completion {'name': '汤姆汉克斯', 'film_names': ['阿甘正传']}. Got: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
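The exception above comes from the JSON decoder. The completion uses single quotes, which is valid for a Python dict literal but not for JSON. A small standard-library reproduction follows. The `well_formatted` string is added here for contrast and does not appear in the original.

```python
import json

# Single-quoted keys and strings: a Python literal, not valid JSON
mis_formatted = "{'name': '汤姆汉克斯', 'film_names': ['阿甘正传']}"

try:
    json.loads(mis_formatted)
except json.JSONDecodeError as err:
    error_message = str(err)
    print(error_message)

# The same data with double quotes parses without complaint
well_formatted = '{"name": "汤姆汉克斯", "film_names": ["阿甘正传"]}'
actor = json.loads(well_formatted)
print(actor["film_names"])
```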
Wrap the parser and the LLM with the auto-fixing parser
new_parser = OutputFixingParser.from_llm(parser=parser, llm=Azure.chat_model, max_retries=2)
result = new_parser.parse(mis_formatted)
print(result)
# >> name='Tom Hanks' film_names=['Forrest Gump']
# Note that the AI "repaired" the values into English.
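Under the hood this is simply a retry: the parser's format instructions and the failed output are handed back to the AI, which processes them again.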
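The retry idea can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not LangChain's code. `parse_with_fixing`, the wording of the repair prompt, and `fake_llm` (a stand-in for a real model call) are all hypothetical.

```python
import json
from typing import Callable


def parse_with_fixing(
    completion: str,
    format_instructions: str,
    llm: Callable[[str], str],
    max_retries: int = 2,
) -> dict:
    """Parse JSON; on failure ask `llm` to repair the text and try again."""
    attempts = 0
    while True:
        try:
            return json.loads(completion)
        except json.JSONDecodeError as err:
            if attempts >= max_retries:
                raise  # give up after max_retries repair attempts
            attempts += 1
            # Hand the instructions, the bad output and the error to the model
            fix_prompt = (
                f"Instructions:\n{format_instructions}\n"
                f"Completion:\n{completion}\n"
                f"Error:\n{err}\n"
                "Please return a corrected completion."
            )
            completion = llm(fix_prompt)


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: always returns valid JSON
    return '{"name": "Tom Hanks", "film_names": ["Forrest Gump"]}'


result = parse_with_fixing(
    "{'name': '汤姆汉克斯', 'film_names': ['阿甘正传']}",
    "Return a JSON object with the keys name and film_names.",
    fake_llm,
)
print(result)
```

Because the repair prompt contains only the instructions, the bad output and the error, the model is free to rewrite the values, which matches the translated names seen in the result above.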
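Retry parser
In some cases a parsing error can be fixed just by looking at the output. But when the output is not only badly formatted but also incomplete, it cannot be parsed at all.
Suppose we define the following structure and bad response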
class Action(BaseModel):
    action: str = Field(description="action to take")
    action_input: str = Field(description="input to the action")


# The bad response carries no action_input at all
bad_response = '{"action": "search"}'
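Even OutputFixingParser cannot fully repair this, because the AI does not know what the data should be. (My reading is that it mostly fixes formatting errors and cannot fill in missing content.)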
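To see why a format-only fix is not enough, note that the bad response is perfectly valid JSON. The problem is a missing required field, not broken syntax. A small standard-library check follows. The `REQUIRED_FIELDS` set is an assumption that mirrors the fields of the `Action` model.

```python
import json

# Assumed to mirror the fields declared on the Action model
REQUIRED_FIELDS = {"action", "action_input"}
bad_response = '{"action": "search"}'

data = json.loads(bad_response)          # parses without any error
missing = REQUIRED_FIELDS - data.keys()  # fields the model never produced
print(missing)
```

Filling in `action_input` requires knowing what the user originally asked, and that information is not present in the bad output or in the format instructions alone.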