2 weeks ago · 8ec1581eaf
parent 58e91eaca9
commit 8ec1581eaf
1 changed files with 154 additions and 95 deletions
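Note on the recorded outputs: the "Total Cost (USD)" values in the updated cells can be sanity-checked from the token counts. The sketch below is a minimal check, not the callback's actual implementation. The per-1K-token prices and the `estimate_cost` helper are assumptions introduced for illustration; the prices are not stated anywhere in this diff and were chosen only because they reproduce both recorded totals.

```python
# Assumed per-1K-token prices (USD) for gpt-3.5-turbo-instruct.
# These are NOT taken from the diff; they are an assumption that happens
# to be consistent with the recorded notebook outputs.
PROMPT_PRICE_PER_1K = 0.0015
COMPLETION_PRICE_PER_1K = 0.002


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Hypothetical helper: estimated USD cost from token counts."""
    return (
        prompt_tokens * PROMPT_PRICE_PER_1K
        + completion_tokens * COMPLETION_PRICE_PER_1K
    ) / 1000


# Token counts copied from the "Single call" and "Multiple calls" outputs.
single = estimate_cost(prompt_tokens=4, completion_tokens=14)
multiple = estimate_cost(prompt_tokens=12, completion_tokens=37)

# Fixed-point formatting avoids float noise such as 9.200000000000001e-05.
print(f"Single call: ${single:.6f}")
print(f"Multiple calls: ${multiple:.6f}")
```

Under these assumed prices the two calls come to $0.000034 and $0.000092, matching the recorded `$3.4e-05` and `$9.200000000000001e-05`. Formatting `cb.total_cost` with a fixed precision in the notebook would hide the floating-point noise in the second value.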
--- a/docs/versioned_docs/version-0.2.x/how_to/llm_token_usage_tracking.ipynb
+++ b/docs/versioned_docs/version-0.2.x/how_to/llm_token_usage_tracking.ipynb
@@ -2,120 +2,187 @@
 "cells": [
  {
   "cell_type": "markdown",
-   "id": "e5715368",
+   "id": "90dff237-bc28-4185-a2c0-d5203bbdeacd",
   "metadata": {},
   "source": [
    "# How to track token usage for LLMs\n",
    "\n",
-    "This notebook goes over how to track your token usage for specific calls. It is currently only implemented for the OpenAI API.\n",
+    "Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
    "\n",
-    "Let's first look at an extremely simple example of tracking token usage for a single LLM call."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "9455db35",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_community.callbacks import get_openai_callback\n",
-    "from langchain_openai import OpenAI"
+    "```{=mdx}\n",
+    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
+    "\n",
+    "<PrerequisiteLinks content={`\n",
+    "- [LLMs](/docs/concepts/#llms)\n",
+    "`} />\n",
+    "```\n",
+    "\n",
+    "## Using LangSmith\n",
+    "\n",
+    "You can use [LangSmith](https://www.langchain.com/langsmith) to help track token usage in your LLM application. See the [LangSmith quick start guide](https://docs.smith.langchain.com/).\n",
+    "\n",
+    "## Using callbacks\n",
+    "\n",
+    "There are some API-specific callback context managers that allow you to track token usage across multiple calls. You'll need to check whether such an integration is available for your particular model.\n",
+    "\n",
+    "If such an integration is not available for your model, you can create a custom callback manager by adapting the implementation of the OpenAI callback manager, which you can find in the code base.\n",
+    "\n",
+    "### OpenAI\n",
+    "\n",
+    "Let's first look at an extremely simple example of tracking token usage for a single LLM call.\n",
+    "\n",
+    ":::{.callout-danger}\n",
+    "\n",
+    "The callback handler does **NOT** work when **streaming** because OpenAI does not return token information in its API response.\n",
+    "\n",
+    ":::"
   ]
  },
  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "d1c55cc9",
+   "cell_type": "markdown",
+   "id": "f790edd9-823e-4bc5-befa-e9529c7237a0",
   "metadata": {},
-   "outputs": [],
   "source": [
-    "llm = OpenAI(model_name=\"gpt-3.5-turbo-instruct\", n=2, best_of=2)"
+    "### Single call"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
-   "id": "31667d54",
+   "execution_count": 1,
+   "id": "757cbfd6-22a1-44a2-bbe3-65d85aee3fee",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "Tokens Used: 37\n",
-      "\tPrompt Tokens: 4\n",
-      "\tCompletion Tokens: 33\n",
-      "Successful Requests: 1\n",
-      "Total Cost (USD): $7.2e-05\n"
+      "\n",
+      "Why don't scientists trust atoms?\n",
+      "\n",
+      "Because they make up everything!\n",
+      "---\n",
+      "\n",
+      "Total Tokens: 18\n",
+      "Prompt Tokens: 4\n",
+      "Completion Tokens: 14\n",
+      "Total Cost (USD): $3.4e-05\n"
     ]
    }
   ],
   "source": [
+    "from langchain_community.callbacks import get_openai_callback\n",
+    "from langchain_openai import OpenAI\n",
+    "\n",
+    "llm = OpenAI(model_name=\"gpt-3.5-turbo-instruct\")\n",
+    "\n",
    "with get_openai_callback() as cb:\n",
    "    result = llm.invoke(\"Tell me a joke\")\n",
-    "    print(cb)"
+    "    print(result)\n",
+    "    print(\"---\")\n",
+    "print()\n",
+    "\n",
+    "print(f\"Total Tokens: {cb.total_tokens}\")\n",
+    "print(f\"Prompt Tokens: {cb.prompt_tokens}\")\n",
+    "print(f\"Completion Tokens: {cb.completion_tokens}\")\n",
+    "print(f\"Total Cost (USD): ${cb.total_cost}\")"
   ]
  },
  {
   "cell_type": "markdown",
-   "id": "c0ab6d27",
+   "id": "7df3be35-dd97-4e3a-bd51-52434ab2249d",
   "metadata": {},
   "source": [
-    "Anything inside the context manager will get tracked. Here's an example of using it to track multiple calls in sequence."
+    "### Multiple calls\n",
+    "\n",
+    "Anything inside the context manager will get tracked. Here's an example of using it to track multiple calls in sequence to a chain. This will also work for an agent which may use multiple steps."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
-   "id": "e09420f4",
-   "metadata": {},
+   "execution_count": 50,
+   "id": "7d7f8d26-d8f4-40f9-9539-7a6bafdc4e24",
+   "metadata": {
+    "tags": []
+   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "72\n"
+      "\n",
+      "\n",
+      "Why did the chicken go to the seance?\n",
+      "\n",
+      "To talk to the other side of the road!\n",
+      "--\n",
+      "\n",
+      "\n",
+      "Why did the fish blush?\n",
+      "\n",
+      "Because it saw the ocean's bottom!\n",
+      "\n",
+      "---\n",
+      "Total Tokens: 49\n",
+      "Prompt Tokens: 12\n",
+      "Completion Tokens: 37\n",
+      "Total Cost (USD): $9.200000000000001e-05\n"
     ]
    }
   ],
   "source": [
+    "from langchain_community.callbacks import get_openai_callback\n",
+    "from langchain_core.prompts import PromptTemplate\n",
+    "from langchain_openai import OpenAI\n",
+    "\n",
+    "llm = OpenAI(model_name=\"gpt-3.5-turbo-instruct\")\n",
+    "\n",
+    "template = PromptTemplate.from_template(\"Tell me a joke about {topic}\")\n",
+    "chain = template | llm\n",
+    "\n",
    "with get_openai_callback() as cb:\n",
-    "    result = llm.invoke(\"Tell me a joke\")\n",
-    "    result2 = llm.invoke(\"Tell me a joke\")\n",
-    "    print(cb.total_tokens)"
+    "    response = chain.invoke({\"topic\": \"birds\"})\n",
+    "    print(response)\n",
+    "    response = chain.invoke({\"topic\": \"fish\"})\n",
+    "    print(\"--\")\n",
+    "    print(response)\n",
+    "\n",
+    "\n",
+    "print()\n",
+    "print(\"---\")\n",
+    "print(f\"Total Tokens: {cb.total_tokens}\")\n",
+    "print(f\"Prompt Tokens: {cb.prompt_tokens}\")\n",
+    "print(f\"Completion Tokens: {cb.completion_tokens}\")\n",
+    "print(f\"Total Cost (USD): ${cb.total_cost}\")"
   ]
  },
  {
   "cell_type": "markdown",
-   "id": "d8186e7b",
-   "metadata": {},
-   "source": [
-    "If a chain or agent with multiple steps in it is used, it will track all those steps."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "id": "5d1125c6",
-   "metadata": {},
-   "outputs": [],
+   "id": "ad7a3fba-9fac-4222-8f87-d1d276d27d6e",
+   "metadata": {
+    "tags": []
+   },
   "source": [
-    "from langchain.agents import AgentType, initialize_agent, load_tools\n",
-    "from langchain_openai import OpenAI\n",
+    "## Streaming\n",
+    "\n",
+    ":::{.callout-danger}\n",
+    "\n",
+    "`get_openai_callback` relies on metadata information in the API response to get\n",
+    "information about the token counts; however, the `OpenAI` API does **NOT** return\n",
+    "token counts when streaming.\n",
    "\n",
-    "llm = OpenAI(temperature=0)\n",
-    "tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm)\n",
-    "agent = initialize_agent(\n",
-    "    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
-    ")"
+    "To count tokens correctly when streaming, either\n",
+    "implement a custom callback handler that uses an appropriate tokenizer to\n",
+    "count the tokens or use a monitoring platform like [LangSmith](https://www.langchain.com/langsmith).\n",
+    ":::"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
-   "id": "2f98c536",
-   "metadata": {},
+   "execution_count": 7,
+   "id": "cd61ed79-7858-49bb-afb5-d41291f597ba",
+   "metadata": {
+    "tags": []
+   },
   "outputs": [
    {
     "name": "stdout",
@@ -123,48 +190,40 @@
     "text": [
      "\n",
      "\n",
-      "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
-      "\u001b[32;1m\u001b[1;3m I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.\n",
-      "Action: Search\n",
-      "Action Input: \"Olivia Wilde boyfriend\"\u001b[0m\n",
-      "Observation: \u001b[36;1m\u001b[1;3m[\"Olivia Wilde and Harry Styles took fans by surprise with their whirlwind romance, which began when they met on the set of Don't Worry Darling.\", 'Olivia Wilde started dating Harry Styles after ending her years-long engagement to Jason Sudeikis — see their relationship timeline.', 'Olivia Wilde and Harry Styles were spotted early on in their relationship walking around London. (. Image ...', \"Looks like Olivia Wilde and Jason Sudeikis are starting 2023 on good terms. Amid their highly publicized custody battle – and the actress' ...\", 'The two started dating after Wilde split up with actor Jason Sudeikisin 2020. However, their relationship came to an end last November.', \"Olivia Wilde and Harry Styles started dating during the filming of Don't Worry Darling. While the movie got a lot of backlash because of the ...\", \"Here's what we know so far about Harry Styles and Olivia Wilde's relationship.\", 'Olivia and the Grammy winner kept their romance out of the spotlight as their relationship began just two months after her split from ex-fiancé ...', \"Harry Styles and Olivia Wilde first met on the set of Don't Worry Darling and stepped out as a couple in January 2021. Relive all their biggest relationship ...\"]\u001b[0m\n",
-      "Thought:\u001b[32;1m\u001b[1;3m Harry Styles is Olivia Wilde's boyfriend.\n",
-      "Action: Search\n",
-      "Action Input: \"Harry Styles age\"\u001b[0m\n",
-      "Observation: \u001b[36;1m\u001b[1;3m29 years\u001b[0m\n",
-      "Thought:\u001b[32;1m\u001b[1;3m I need to calculate 29 raised to the 0.23 power.\n",
-      "Action: Calculator\n",
-      "Action Input: 29^0.23\u001b[0m\n",
-      "Observation: \u001b[33;1m\u001b[1;3mAnswer: 2.169459462491557\u001b[0m\n",
-      "Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
-      "Final Answer: Harry Styles is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.169459462491557.\u001b[0m\n",
-      "\n",
-      "\u001b[1m> Finished chain.\u001b[0m\n",
-      "Total Tokens: 2205\n",
-      "Prompt Tokens: 2053\n",
-      "Completion Tokens: 152\n",
-      "Total Cost (USD): $0.0441\n"
+      "Why couldn't the bicycle stand up by itself?\n",
+      "\n",
+      "Because it was two-tired!\n",
+      "\n",
+      "Why don't scientists trust atoms?\n",
+      "\n",
+      "Because they make up everything.\n",
+      "---\n",
+      "\n",
+      "Total Tokens: 0\n",
+      "Prompt Tokens: 0\n",
+      "Completion Tokens: 0\n",
+      "Total Cost (USD): $0.0\n"
     ]
    }
   ],
   "source": [
+    "from langchain_community.callbacks import get_openai_callback\n",
+    "from langchain_openai import OpenAI\n",
+    "\n",
+    "llm = OpenAI(model_name=\"gpt-3.5-turbo-instruct\")\n",
+    "\n",
    "with get_openai_callback() as cb:\n",
-    "    response = agent.run(\n",
-    "        \"Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?\"\n",
-    "    )\n",
-    "    print(f\"Total Tokens: {cb.total_tokens}\")\n",
-    "    print(f\"Prompt Tokens: {cb.prompt_tokens}\")\n",
-    "    print(f\"Completion Tokens: {cb.completion_tokens}\")\n",
-    "    print(f\"Total Cost (USD): ${cb.total_cost}\")"
+    "    for chunk in llm.stream(\"Tell me a joke\"):\n",
+    "        print(chunk, end=\"\", flush=True)\n",
+    "    print(result)\n",
+    "    print(\"---\")\n",
+    "print()\n",
+    "\n",
+    "print(f\"Total Tokens: {cb.total_tokens}\")\n",
+    "print(f\"Prompt Tokens: {cb.prompt_tokens}\")\n",
+    "print(f\"Completion Tokens: {cb.completion_tokens}\")\n",
+    "print(f\"Total Cost (USD): ${cb.total_cost}\")"
   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "80ca77a3",
-   "metadata": {},
-   "outputs": [],
-   "source": []
  }
 ],
 "metadata": {
@@ -183,7 +242,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.11.4"
  }
 },
 "nbformat": 4,