mirror of https://github.com/hwchase17/langchain
278 lines
7.8 KiB
Plaintext
{
 "cells": [
  {
   "cell_type": "raw",
   "id": "af408f61",
   "metadata": {},
   "source": [
    "---\n",
    "sidebar_position: 1\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a65e4c9",
   "metadata": {},
   "source": [
    "# How to use example selectors\n",
    "\n",
    "If you have a large number of examples, you may need to select which ones to include in the prompt. An example selector is the class responsible for doing so.\n",
    "\n",
    "The base interface is defined as below:\n",
    "\n",
    "```python\n",
    "from abc import ABC, abstractmethod\n",
    "from typing import Any, Dict, List\n",
    "\n",
    "\n",
    "class BaseExampleSelector(ABC):\n",
    "    \"\"\"Interface for selecting examples to include in prompts.\"\"\"\n",
    "\n",
    "    @abstractmethod\n",
    "    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:\n",
    "        \"\"\"Select which examples to use based on the inputs.\"\"\"\n",
    "\n",
    "    @abstractmethod\n",
    "    def add_example(self, example: Dict[str, str]) -> Any:\n",
    "        \"\"\"Add new example to store.\"\"\"\n",
    "```\n",
    "\n",
    "A selector must implement two methods. `select_examples` takes in the input variables and returns a list of examples; it is up to each specific implementation how those examples are selected. `add_example` adds a new example to the selector's store.\n",
    "\n",
    "LangChain has a few different types of example selectors. For an overview of all these types, see the table at the end of this guide.\n",
    "\n",
    "In this guide, we will walk through creating a custom example selector."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "638e9039",
   "metadata": {},
   "source": [
    "## Examples\n",
    "\n",
    "In order to use an example selector, we need to create a list of examples. These should generally be example inputs and outputs. For this demo, let's imagine we are selecting examples of how to translate English to Italian."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "48658d53",
   "metadata": {},
   "outputs": [],
   "source": [
    "examples = [\n",
    "    {\"input\": \"hi\", \"output\": \"ciao\"},\n",
    "    {\"input\": \"bye\", \"output\": \"arrivederci\"},\n",
    "    {\"input\": \"soccer\", \"output\": \"calcio\"},\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2830b49",
   "metadata": {},
   "source": [
    "## Custom Example Selector\n",
    "\n",
    "Let's write an example selector that picks the example whose input is closest in length to the input word."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "56b740a1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.example_selectors.base import BaseExampleSelector\n",
    "\n",
    "\n",
    "class CustomExampleSelector(BaseExampleSelector):\n",
    "    def __init__(self, examples):\n",
    "        self.examples = examples\n",
    "\n",
    "    def add_example(self, example):\n",
    "        self.examples.append(example)\n",
    "\n",
    "    def select_examples(self, input_variables):\n",
    "        # This assumes that the input variables contain an 'input' key\n",
    "        new_word = input_variables[\"input\"]\n",
    "        new_word_length = len(new_word)\n",
    "\n",
    "        # Initialize variables to store the best match and its length difference\n",
    "        best_match = None\n",
    "        smallest_diff = float(\"inf\")\n",
    "\n",
    "        # Iterate through each example\n",
    "        for example in self.examples:\n",
    "            # Calculate the length difference with the example's input\n",
    "            current_diff = abs(len(example[\"input\"]) - new_word_length)\n",
    "\n",
    "            # Update the best match if the current one is closer in length\n",
    "            if current_diff < smallest_diff:\n",
    "                smallest_diff = current_diff\n",
    "                best_match = example\n",
    "\n",
    "        return [best_match]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "ce928187",
   "metadata": {},
   "outputs": [],
   "source": [
    "example_selector = CustomExampleSelector(examples)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "37ef3149",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'input': 'bye', 'output': 'arrivederci'}]"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector.select_examples({\"input\": \"okay\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "c5ad9f35",
   "metadata": {},
   "outputs": [],
   "source": [
    "example_selector.add_example({\"input\": \"hand\", \"output\": \"mano\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "e4127fe0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'input': 'hand', 'output': 'mano'}]"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "example_selector.select_examples({\"input\": \"okay\"})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "786c920c",
   "metadata": {},
   "source": [
    "## Use in a Prompt\n",
    "\n",
    "We can now use this example selector in a prompt."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "619090e2",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_core.prompts.few_shot import FewShotPromptTemplate\n",
    "from langchain_core.prompts.prompt import PromptTemplate\n",
    "\n",
    "example_prompt = PromptTemplate.from_template(\"Input: {input} -> Output: {output}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "5934c415",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Translate the following words from English to Italian:\n",
      "\n",
      "Input: hand -> Output: mano\n",
      "\n",
      "Input: word -> Output:\n"
     ]
    }
   ],
   "source": [
    "prompt = FewShotPromptTemplate(\n",
    "    example_selector=example_selector,\n",
    "    example_prompt=example_prompt,\n",
    "    suffix=\"Input: {input} -> Output:\",\n",
    "    prefix=\"Translate the following words from English to Italian:\",\n",
    "    input_variables=[\"input\"],\n",
    ")\n",
    "\n",
    "print(prompt.format(input=\"word\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e767f69d",
   "metadata": {},
   "source": [
    "## Example Selector Types\n",
    "\n",
    "| Name | Description |\n",
    "|------------|---------------------------------------------------------------------------------------------|\n",
    "| Similarity | Uses semantic similarity between inputs and examples to decide which examples to choose. |\n",
    "| MMR | Uses Max Marginal Relevance between inputs and examples to decide which examples to choose. |\n",
    "| Length | Selects examples based on how many can fit within a certain length. |\n",
    "| Ngram | Uses ngram overlap between inputs and examples to decide which examples to choose. |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a6e0abe",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}