Best Practices for Prompt Engineering with the OpenAI API

Posted on 2023-10-14 in Development > AI

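How prompt engineering works

Because of the way instruction-following models are trained, or the data they are trained on, certain prompt formats work particularly well and align better with the task at hand.

Below we present a number of prompt formats that we find work reliably, but feel free to explore other formats that may fit your task best.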

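Rules of thumb and examples

Note: {text input here} is a placeholder for the actual text / context.

Use the latest model

For best results, we generally recommend using the latest, most capable models.

As of November 2022, the best options are the text-davinci-003 model for text generation and the code-davinci-002 model for code generation.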

Put instructions at the beginning of the prompt and use ### or """ to separate the instruction from the context

Less effective:

Summarize the text below as a bullet point list of the most important points.

{text input here}

Better:

Summarize the text below as a bullet point list of the most important points.

Text: """
{text input here}
"""
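The delimiter pattern above can be applied programmatically. The following is a minimal sketch, not an official helper: the function name build_summary_prompt and the check for a delimiter collision are assumptions added for illustration.

```python
def build_summary_prompt(text: str, delimiter: str = '"""') -> str:
    """Put the instruction first, then fence the input text with a delimiter."""
    # Assumption: refuse inputs that already contain the delimiter, since the
    # boundary between instruction and content would become ambiguous.
    if delimiter in text:
        raise ValueError("input text contains the delimiter; pick another one")
    instruction = (
        "Summarize the text below as a bullet point list "
        "of the most important points."
    )
    return f"{instruction}\n\nText: {delimiter}\n{text}\n{delimiter}"


prompt = build_summary_prompt("OpenAI released a new model today.")
print(prompt)
```

If the input may itself contain triple quotes, passing delimiter="###" is one way to keep the boundary unambiguous.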

Be specific, descriptive, and as detailed as possible about the desired context, outcome, length, format, style, etc.

Less effective:

Write a poem about OpenAI.

Better:

Write a short inspiring poem about OpenAI, focusing on the recent DALL-E product launch (DALL-E is a text to image ML model) in the style of a {famous poet}

Articulate the desired output format through examples

Less effective:

Extract the entities mentioned in the text below. Extract the following 4 entity types: company names, people names, specific topics and themes.

Text: {text}

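Show, and tell: models respond better when they are shown specific format requirements.

This also makes it easier to parse multiple outputs reliably in code.

Better: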

Extract the important entities mentioned in the text below. First extract all company names, then extract all people names, then extract specific topics which fit the content and finally extract general overarching themes

Desired format:
Company names: <comma_separated_list_of_company_names>
People names: -||-
Specific topics: -||-
General themes: -||-

Text: {text}
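To illustrate why a fixed output format helps with parsing, here is a minimal sketch that turns a response in the "Desired format" above into a dictionary. The function name parse_entities and the sample response are hypothetical; a real model response may deviate from the requested format and would need more defensive handling.

```python
def parse_entities(completion: str) -> dict[str, list[str]]:
    """Parse 'Label: a, b, c' lines into a dict mapping label to a list."""
    result = {}
    for line in completion.splitlines():
        # Split on the first colon only, so values may contain colons.
        label, sep, values = line.partition(":")
        if not sep:
            continue  # skip lines that do not follow the 'Label: ...' shape
        result[label.strip()] = [v.strip() for v in values.split(",") if v.strip()]
    return result


# Hypothetical response text, written by hand for this example.
sample = (
    "Company names: Stripe, OpenAI\n"
    "People names: \n"
    "Specific topics: payment processing, language models\n"
    "General themes: developer tools"
)
entities = parse_entities(sample)
print(entities["Company names"])
```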

Start with zero-shot, then few-shot; if neither works, fine-tune

Zero-shot:

Extract keywords from the below text.

Text: {text}

Keywords:

Few-shot - provide a couple of examples

Extract keywords from the corresponding texts below.

Text 1: Stripe provides APIs that web developers can use to integrate payment processing into their websites and mobile applications.
Keywords 1: Stripe, payment processing, APIs, web developers, websites, mobile applications
##
Text 2: OpenAI has trained cutting-edge language models that are very good at understanding and generating text. Our API provides access to these models and can be used to solve virtually any task that involves processing language.
Keywords 2: OpenAI, language models, text processing, API.
##
Text 3: {text}
Keywords 3:
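A few-shot prompt in this layout can be assembled from a list of (text, keywords) pairs. The sketch below assumes the numbering and the ## separator shown above; the function name build_few_shot_prompt is hypothetical, and the example text is abbreviated.

```python
def build_few_shot_prompt(examples, new_text):
    """Build a numbered few-shot prompt, leaving the last 'Keywords' slot empty."""
    blocks = []
    for i, (text, keywords) in enumerate(examples, start=1):
        blocks.append(f"Text {i}: {text}\nKeywords {i}: {keywords}")
    # The final block carries the new input and an open slot for the model.
    n = len(examples) + 1
    blocks.append(f"Text {n}: {new_text}\nKeywords {n}:")
    header = "Extract keywords from the corresponding texts below."
    return header + "\n\n" + "\n##\n".join(blocks)


examples = [
    ("Stripe provides APIs for payment processing.", "Stripe, payment processing, APIs"),
]
few_shot_prompt = build_few_shot_prompt(examples, "{text}")
print(few_shot_prompt)
```

Passing an empty examples list yields a numbered variant of the zero-shot prompt, which makes it easy to try zero-shot first and add examples only when needed.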

Fine-tune: see the fine-tuning best practices here.

Reduce "fluffy" and imprecise descriptions

Less effective:

The description for this product should be fairly short, a few sentences only, and not too much more.

Better:

Use a 3 to 5 sentence paragraph to describe this product.

Instead of just saying what not to do, say what to do instead

Less effective:

The following is a conversation between an Agent and a Customer. DO NOT ASK USERNAME OR PASSWORD. DO NOT REPEAT.

Customer: I can’t log in to my account.
Agent:

Better:

The following is a conversation between an Agent and a Customer. The agent will attempt to diagnose the problem and suggest a solution, whilst refraining from asking any questions related to PII. Instead of asking for PII, such as username or password, refer the user to the help article www.samplewebsite.com/help/faq

Customer: I can’t log in to my account.
Agent:

Code generation specific - use "leading words" to nudge the model toward a particular pattern

Less effective:

# Write a simple python function that
# 1. Ask me for a number in mile
# 2. It converts miles to kilometers

In the code example below, adding "import" hints to the model that it should start writing in Python (similarly, "SELECT" is a good hint for the start of a SQL statement).

Better:

# Write a simple python function that
# 1. Ask me for a number in mile
# 2. It converts miles to kilometers
 
import
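The leading-word idea can be wrapped in a small helper. This is only a sketch: the LEADING_WORDS table contains just the two hints mentioned above, and the function name add_leading_word is hypothetical.

```python
# The two leading words named in the text; other languages are not covered.
LEADING_WORDS = {"python": "import", "sql": "SELECT"}


def add_leading_word(task_comment: str, language: str) -> str:
    """Append the leading word for a language after the task description."""
    leading = LEADING_WORDS[language.lower()]
    # A blank line separates the description from the word the model continues.
    return f"{task_comment.rstrip()}\n\n{leading}"


task = (
    "# Write a simple python function that\n"
    "# 1. Ask me for a number in mile\n"
    "# 2. It converts miles to kilometers\n"
)
code_prompt = add_leading_word(task, "Python")
print(code_prompt)
```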

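Parameters

Generally, we find that model and temperature are the most commonly used parameters for altering the model output.

model - Higher-performance models are more expensive and have higher latency.
temperature - A measure of how often the model outputs a less likely token. The higher the temperature, the more random (and usually more creative) the output. This is not the same as "truthfulness", however. For most factual use cases, such as data extraction and truthful Q&A, a temperature of 0 is best.
max_tokens - Does not control the length of the output; it is a hard cutoff limit for token generation. Ideally you won't hit this limit often, because the model stops either when it thinks it is finished or when it hits a stop sequence you defined.
stop - A set of characters (tokens) that, when generated, cause text generation to stop.

For descriptions of other parameters, see the API reference.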
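As a rough illustration of how these parameters fit together, the sketch below builds a dictionary of request parameters without sending it anywhere. The function name completion_params, the "prompt" key, the default of 256 for max_tokens, and the value 0.8 for non-factual tasks are assumptions made for this example, not recommendations from the source.

```python
def completion_params(prompt, *, factual=True, max_tokens=256, stop=None):
    """Assemble request parameters; nothing is sent to any API here."""
    params = {
        "model": "text-davinci-003",  # model named in the text for text generation
        "prompt": prompt,
        # Temperature 0 for factual tasks such as data extraction;
        # 0.8 is an arbitrary illustrative value for more creative output.
        "temperature": 0 if factual else 0.8,
        # A hard cutoff for generated tokens, not a target length.
        "max_tokens": max_tokens,
    }
    if stop:
        # Sequences that end generation as soon as they are produced.
        params["stop"] = list(stop)
    return params


params = completion_params("Extract keywords from the below text.", stop=["##"])
print(params)
```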

References

Best practices for prompt engineering with OpenAI API | OpenAI Help Center