One Command, Multiple Tasks? How a Bot Can Read Between the Lines

2025-03-13 01:57:06

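When we build bots, we often run into users who pack several tasks into a single sentence. How do you tell whether the user wants one thing done or several? It's a real headache.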

I wrestled with this too and tried prompt engineering with GPT-3.5, just as you did. Sometimes it worked, sometimes it didn't. It got particularly confused by commands like this one: When a New google calendar event is created, post a message to general channel in slack plus sync it to Salesforce leads.

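So plain prompt engineering may not be reliable: users phrase things in endless ways, and a prompt can hardly cover every case. What about training a model, or fine-tuning a large language model (LLM)? It sounds good, but how big would the dataset have to be? With phrasing this varied, it also feels like a bottomless pit.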

So what now? Don't worry: below are a few approaches I think are workable, to help you sort out your options.

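I. The root of the problem: ambiguity and diversity

It comes down to two things:

Ambiguity: Natural language is inherently ambiguous, and one sentence can be read in several ways. For example, "buy an apple" could mean buying fruit or buying a phone.
Diversity: Users express themselves in many different ways. The same intent can be worded with completely different sentence structures and vocabulary.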

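II. Solutions: attack it from several angles

With a problem this messy, a single method may not be enough, so it is best to combine several.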

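1. Rules + keywords

This one is fairly direct and relatively easy to get started with.

How it works:

First define some common connectors and keywords, such as "and", "plus", "also", "同时" (at the same time), "并且" (and also), "然后" (then), and so on.
When the user enters a command, check whether it contains any of them.
If it does, split the command at those keywords, then decide for each part whether it is an independent task.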

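Code example (Python):

import re

def split_command(command):
    # English connectors are wrapped in \b so that "and" inside "standard" is not a match
    word_connectors = ["and", "plus", "also"]
    # Chinese connectors are matched as plain substrings: CJK characters count as word
    # characters and Chinese text has no spaces, so \b would not match around them
    cjk_connectors = ["同时", "并且", "然后"]
    # A bare "," is left out on purpose: r'\b,\b' never matched ", " anyway, and a comma
    # often just separates a trigger clause ("When ..., do ...")
    patterns = [r'\b' + re.escape(c) + r'\b' for c in word_connectors]
    patterns += [re.escape(c) for c in cjk_connectors]
    combined_pattern = '|'.join(patterns)

    parts = re.split(combined_pattern, command, flags=re.IGNORECASE)
    parts = [p.strip() for p in parts if p.strip()]  # drop empty pieces and surrounding whitespace
    return parts

command1 = "When a New google calendar event is created, post a message to general channel in slack plus sync it to Salesforce leads"
command2 = "Turn on the lights and play music"
command3 = 'Check the weather'

print(f"'{command1}' split result: {split_command(command1)}")
print(f"'{command2}' split result: {split_command(command2)}")
print(f"'{command3}' split result: {split_command(command3)}")

Output:

'When a New google calendar event is created, post a message to general channel in slack plus sync it to Salesforce leads' split result: ['When a New google calendar event is created, post a message to general channel in slack', 'sync it to Salesforce leads']
'Turn on the lights and play music' split result: ['Turn on the lights', 'play music']
'Check the weather' split result: ['Check the weather']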

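Going further:

Build a more complete keyword/connector list and keep refining it for your bot's use case.
Combine it with NLP techniques such as part-of-speech tagging to identify connectors and phrases more accurately.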
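
The "decide for each part whether it is an independent task" step can be approximated even without a POS tagger. The sketch below is only an illustration: ACTION_VERBS is a hypothetical, deliberately tiny verb list, and split_tasks is a made-up helper name. A fragment that does not start with a known action verb is glued back onto the previous task, so "salt and pepper" stays in one piece.

```python
import re

# Hypothetical verb list; a real bot would derive this from its own action
# catalogue or replace the lookup with a POS tagger.
ACTION_VERBS = {"post", "sync", "send", "turn", "play", "check", "add", "create", "book", "find"}

# One capturing group, so re.split also returns the connector that was matched
CONNECTOR_RE = re.compile(r"\s*\b(and|plus|also)\b\s*", flags=re.IGNORECASE)

def split_tasks(command):
    """Split on connectors, then merge fragments that do not start with an action verb."""
    # pieces looks like [part, connector, part, connector, part, ...]
    pieces = CONNECTOR_RE.split(command.strip())
    tasks = [pieces[0]]
    for connector, part in zip(pieces[1::2], pieces[2::2]):
        if not part:
            continue  # nothing between two connectors ("and also") or after the last one
        first_word = part.split()[0].lower()
        if first_word in ACTION_VERBS:
            tasks.append(part)
        else:
            # Not a new action: restore the original wording on the previous task
            tasks[-1] = f"{tasks[-1]} {connector} {part}".strip()
    return [t for t in tasks if t]

print(split_tasks("Turn on the lights and play music"))
print(split_tasks("Add salt and pepper to the list"))
```

The verb lookup is the weak point here: any action missing from the list gets merged into the previous task, which is exactly where a POS tagger would do better.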

Security note:

Validate and filter user input to guard against malicious code injection.
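
As a rough illustration of that validation step, the sketch below normalizes a raw command before it reaches the splitter. The function name sanitize_command and the 500-character limit are assumptions made for this example, not part of any library.

```python
import unicodedata

MAX_COMMAND_LENGTH = 500  # assumed limit; tune for your bot

def sanitize_command(raw):
    """Normalize a raw command and reject obviously unusable input."""
    if not isinstance(raw, str):
        raise ValueError("command must be a string")
    text = unicodedata.normalize("NFKC", raw)
    # Drop characters in the Unicode "C" categories (control, format, unassigned, ...)
    # but keep ordinary whitespace, which is collapsed on the next line
    text = "".join(ch for ch in text if ch in "\n\t " or unicodedata.category(ch)[0] != "C")
    text = " ".join(text.split())
    if not text:
        raise ValueError("command is empty")
    if len(text) > MAX_COMMAND_LENGTH:
        raise ValueError("command is too long")
    return text

print(sanitize_command("  Turn on\x00 the\u200b lights \n"))
```

Cleaning the text is only half of it. The parsed tasks should be mapped onto a fixed set of bot actions and never passed to eval, a shell, or a SQL string.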

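2. Dependency parsing

This goes a step beyond keywords: it analyzes the structure of the sentence.

How it works:

Dependency parsing finds the relations between the words of a sentence, such as subject-verb and verb-object relations.
We can use those relations to check whether the sentence contains several coordinated actions or targets.
If it does, it is very likely a multi-task command.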

Code example (Python, using spaCy):

import spacy

nlp = spacy.load("en_core_web_sm")  # or "zh_core_web_sm" for Chinese

def analyze_dependencies(command):
    doc = nlp(command)
    # Doc has no .root attribute; take the first token labeled ROOT instead
    root = next(token for token in doc if token.dep_ == "ROOT")
    # A token coordinated with the root ("conj") most likely signals another task
    conjuncts = [c for c in root.children if c.dep_ == "conj"]
    multiple_tasks = len(conjuncts) > 0

    print(f"Sentence: '{command}' multiple tasks: {multiple_tasks}")
    # Dump each token with its relation, its head, and its children for inspection
    for token in doc:
        print(token.text, token.dep_, token.head.text, token.head.pos_,
              [child for child in token.children])

analyze_dependencies("When a New google calendar event is created, post a message to general channel in slack plus sync it to Salesforce leads")
analyze_dependencies("Turn on the lights and play music")
analyze_dependencies("Check the weather")

Output (abridged; the exact parse can vary with the spaCy model version):

Sentence: 'When a New google calendar event is created, post a message to general channel in slack plus sync it to Salesforce leads' multiple tasks: True
...
post ROOT post VERB [created, to, channel, sync]
...
sync conj post VERB [leads]
...

Sentence: 'Turn on the lights and play music' multiple tasks: True
...
Turn ROOT Turn VERB [on, lights, play]
...
play conj Turn VERB [music]

Sentence: 'Check the weather' multiple tasks: False
...
Check ROOT Check VERB [weather]
...

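Explanation:

spaCy works out the dependency relations of the sentence for us.
We focus on two labels: ROOT (usually the main verb of the sentence) and conj (coordination).
Whatever conj attaches to the root is most likely a parallel task.
In the examples above, the first and second sentences are flagged as multi-task and the third as single-task.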

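Going further:

Write finer-grained rules for different sentence patterns.
Train a custom dependency parser to fit your domain and the way your users phrase things.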

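3. Fine-tune a lightweight LLM

If the two methods above still aren't good enough, consider fine-tuning a lightweight LLM.

How it works:

There is no need to train a large model from scratch; that costs far too many resources.
Pick a pretrained model with a modest parameter count, such as the base version of BERT or RoBERTa, or a distilled variant like DistilBERT.
Prepare a labeled dataset that contains both single-task and multi-task commands.
Fine-tune the model on that dataset so it learns to tell the two apart.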

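Code example (Python, using Hugging Face Transformers and Datasets):

(A condensed sketch of the key steps; the file name, output directory, and hyperparameters are placeholders. See the Transformers documentation for a full setup.)

# 1. Dataset: a CSV with two columns, "text" and "label" (0 = single task, 1 = multiple tasks)
import torch
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("csv", data_files="commands.csv")["train"].train_test_split(test_size=0.2)

# 2. Load the pretrained model and tokenizer
model_name = "bert-base-uncased"  # or another lightweight model
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# 3. Preprocess (tokenize, truncate, pad to a fixed length)
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

dataset = dataset.map(tokenize, batched=True)

# 4. Training parameters (learning rate, batch size, epochs)
args = TrainingArguments(output_dir="task-classifier", learning_rate=2e-5,
                         per_device_train_batch_size=16, num_train_epochs=3)

# 5. Train
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"], eval_dataset=dataset["test"])
trainer.train()

# 6. Evaluate on the held-out split
print(trainer.evaluate())

# 7. Predict with the fine-tuned model
def predict_task_type(command, model, tokenizer):
    model.eval()
    inputs = tokenizer(command, return_tensors="pt", truncation=True).to(model.device)
    with torch.no_grad():  # no gradients needed for inference
        outputs = model(**inputs)
    prediction = outputs.logits.argmax(dim=-1).item()
    return "single_task" if prediction == 0 else "multiple_task"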

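Going further:

When labeling, don't stop at single vs. multiple: mark the span of each sub-task as well and treat the problem as sequence labeling.
Use data augmentation to enlarge the dataset and improve generalization.
Try different model architectures and hyperparameters to find what suits your task best.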
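
To make the sequence-labeling idea concrete, here is a minimal sketch that turns annotated sub-task spans into BIO tags, one label per token. The tag names B-TASK / I-TASK / O and the helper spans_to_bio are choices made for this illustration; token-classification models can be trained on labels of this shape.

```python
def spans_to_bio(tokens, task_spans):
    """tokens: list of words; task_spans: (start, end) token indices, end exclusive."""
    labels = ["O"] * len(tokens)
    for start, end in task_spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"invalid span: {(start, end)}")
        labels[start] = "B-TASK"            # first token of a sub-task
        for i in range(start + 1, end):
            labels[i] = "I-TASK"            # remaining tokens of the same sub-task
    return labels

tokens = "Turn on the lights and play music".split()
labels = spans_to_bio(tokens, [(0, 4), (5, 7)])
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

With this scheme the number of B-TASK labels in a prediction is the number of tasks, and the spans tell you where each one starts and ends.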

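4. Prompt engineering + a stronger LLM (GPT-4, Claude 3, etc.)

I said earlier that prompt engineering has its limits for this kind of task, but it is still worth trying a few more advanced techniques together with a stronger LLM such as GPT-4 or Claude 3.

Ideas:
* Few-shot prompting: prepare a handful of high-quality examples.
* Chain-of-thought prompting: guide the model to analyze step by step and write out its reasoning.
* Use a stronger language model.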

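An improved prompt example (for GPT-4):

You are a task-analysis assistant. Carefully analyze the user's request, decide whether it is a single task or multiple tasks, and give your reasons.

Use this format:

Request: {user request}
Analysis:
1. ...
2. ...
Conclusion: single task / multiple tasks

---

Request: Find nearby restaurants and book a table
Analysis:
1. Find nearby restaurants. This is a standalone lookup action.
2. Book a table. This is a separate action that follows once a restaurant is found.
Conclusion: multiple tasks

---

Request: Turn off the bedroom light
Analysis:
1. There is only one action: "turn off the bedroom light".
Conclusion: single task

---
Request: {actual user input}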

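This approach leans more heavily on the model's own capability. If you run a self-hosted LLM, I'd suggest relying mainly on fine-tuning, or on techniques such as RAG.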
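
Wiring the few-shot prompt into a bot mostly means two small pieces: filling in the request and reading the verdict back out of the reply. The sketch below assumes the reply follows the "Conclusion:" format from the prompt; build_prompt and parse_conclusion are hypothetical helper names, and the actual model call is deliberately left out because it depends on whichever client you use.

```python
import re

FEW_SHOT_PROMPT = """You are a task-analysis assistant. Carefully analyze the user's request, decide whether it is a single task or multiple tasks, and give your reasons.

Request: Find nearby restaurants and book a table
Analysis:
1. Find nearby restaurants. This is a standalone lookup action.
2. Book a table. This is a separate action that follows once a restaurant is found.
Conclusion: multiple tasks

---

Request: Turn off the bedroom light
Analysis:
1. There is only one action: "turn off the bedroom light".
Conclusion: single task

---
Request: {request}
"""

def build_prompt(user_request):
    # str.replace rather than str.format, so braces in user input cannot break the template
    return FEW_SHOT_PROMPT.replace("{request}", user_request.strip())

def parse_conclusion(reply):
    """Return 'single', 'multiple', or None when the reply has no conclusion line."""
    matches = re.findall(r"Conclusion:\s*(single|multiple)", reply, flags=re.IGNORECASE)
    # Use the last match in case the model echoes the few-shot examples first
    return matches[-1].lower() if matches else None

prompt = build_prompt("Turn on the lights and play music")
print(prompt.splitlines()[-1])
print(parse_conclusion("Analysis:\n1. Lights.\n2. Music.\nConclusion: Multiple tasks"))
```

Returning None for an unparseable reply lets the caller retry or fall back to one of the other methods instead of guessing.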

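III. Summary

To decide reliably whether a command contains one task or several, a single method may not be enough. Combine several, depending on your situation.
Rules + keywords are simple and direct; dependency parsing looks into the sentence structure; a fine-tuned LLM can reach higher accuracy when you have enough labeled data; and prompt engineering with a strong model is worth trying when you cannot prepare much data.
Choose based on your project's timeline, the data you have, and the accuracy you need.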
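
One way to combine the methods is a cascade: cheap checks run first, and only undecided commands reach the expensive stage. The sketch below is a hypothetical arrangement, not a recommendation of specific thresholds. rule_stage is a rough heuristic, and placeholder_model_stage merely stands in for a parser, a fine-tuned classifier, or an LLM call.

```python
import re

CONNECTOR_RE = re.compile(r"\b(and|plus|also|then)\b|,", flags=re.IGNORECASE)

def rule_stage(command):
    """Cheap heuristic: with no connector and no comma, assume a single task."""
    return None if CONNECTOR_RE.search(command) else "single"

def placeholder_model_stage(command):
    """Stand-in for a parser, fine-tuned classifier, or LLM call (hypothetical)."""
    return "multiple"

def classify(command, stages=(rule_stage, placeholder_model_stage)):
    """Run the stages in order and return the first definite verdict."""
    for stage in stages:
        verdict = stage(command)
        if verdict is not None:
            return verdict
    return "unknown"

print(classify("Check the weather"))
print(classify("Turn on the lights and play music"))
```

Each stage returns None when it cannot decide, so stages can be added, removed, or reordered without touching the others.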