You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
langchain/docs/docs/integrations/providers/upstage.ipynb

213 lines
6.1 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Upstage\n",
"\n",
"[Upstage](https://upstage.ai) is a leading artificial intelligence (AI) company specializing in delivering above-human-grade performance LLM components. \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Solar LLM\n",
"\n",
"**Solar Mini Chat** is a fast yet powerful advanced large language model focusing on English and Korean. It has been specifically fine-tuned for multi-turn chat purposes, showing enhanced performance across a wide range of natural language processing tasks, like multi-turn conversation or tasks that require an understanding of long contexts, such as RAG (Retrieval-Augmented Generation), compared to other models of a similar size. This fine-tuning equips it with the ability to handle longer conversations more effectively, making it particularly adept for interactive applications.\n",
"\n",
"Other than Solar, Upstage also offers features for real-world RAG (retrieval-augmented generation), such as **Groundedness Check** and **Layout Analysis**. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation and Setup\n",
"\n",
"Install `langchain-upstage` package:\n",
"\n",
"```bash\n",
"pip install -qU langchain-core langchain-upstage\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Get [API Keys](https://console.upstage.ai) and set environment variables `UPSTAGE_API_KEY` and `UPSTAGE_DOCUMENT_AI_API_KEY`.\n",
"\n",
"> As of April 2024, you need separate API Keys for Solar and Document AI(Layout Analysis). The API Keys will be consolidated soon (hopefully in May) and you'll need just one key for all features."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upstage LangChain integrations\n",
"\n",
"| API | Description | Import | Example usage |\n",
"| --- | --- | --- | --- |\n",
"| Chat | Build assistants using Solar Mini Chat | `from langchain_upstage import ChatUpstage` | [Go](../../chat/upstage) |\n",
"| Text Embedding | Embed strings to vectors | `from langchain_upstage import UpstageEmbeddings` | [Go](../../text_embedding/upstage) |\n",
"| Groundedness Check | Verify groundedness of assistant's response | `from langchain_upstage import UpstageGroundednessCheck` | [Go](../../tools/upstage_groundedness_check) |\n",
"| Layout Analysis | Serialize documents with tables and figures | `from langchain_upstage import UpstageLayoutAnalysisLoader` | [Go](../../document_loaders/upstage) |\n",
"\n",
"See [documentations](https://developers.upstage.ai/) for more details about the features."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Quick Examples\n",
"\n",
"### Environment Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"UPSTAGE_API_KEY\"] = \"YOUR_API_KEY\"\n",
"os.environ[\"UPSTAGE_DOCUMENT_AI_API_KEY\"] = \"YOUR_DOCUMENT_AI_API_KEY\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"### Chat\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_upstage import ChatUpstage\n",
"\n",
"chat = ChatUpstage()\n",
"response = chat.invoke(\"Hello, how are you?\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"### Text embedding\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_upstage import UpstageEmbeddings\n",
"\n",
"embeddings = UpstageEmbeddings()\n",
"doc_result = embeddings.embed_documents(\n",
" [\"Sam is a teacher.\", \"This is another document\"]\n",
")\n",
"print(doc_result)\n",
"\n",
"query_result = embeddings.embed_query(\"What does Sam do?\")\n",
"print(query_result)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"### Groundedness Check"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from langchain_upstage import UpstageGroundednessCheck\n",
"\n",
"groundedness_check = UpstageGroundednessCheck()\n",
"\n",
"request_input = {\n",
" \"context\": \"Mauna Kea is an inactive volcano on the island of Hawaii. Its peak is 4,207.3 m above sea level, making it the highest point in Hawaii and second-highest peak of an island on Earth.\",\n",
" \"answer\": \"Mauna Kea is 5,207.3 meters tall.\",\n",
"}\n",
"response = groundedness_check.invoke(request_input)\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"### Layout Analysis"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_upstage import UpstageLayoutAnalysisLoader\n",
"\n",
"file_path = \"/PATH/TO/YOUR/FILE.pdf\"\n",
"layzer = UpstageLayoutAnalysisLoader(file_path, split=\"page\")\n",
"\n",
"# For improved memory efficiency, consider using the lazy_load method to load documents page by page.\n",
"docs = layzer.load() # or layzer.lazy_load()\n",
"\n",
"for doc in docs[:3]:\n",
" print(doc)"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 1
}