{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Arcee\n", "\n", ">[Arcee](https://www.arcee.ai/about/about-us) helps with the development of the SLMs—small, specialized, secure, and scalable language models.\n", "\n", "This notebook demonstrates how to use the `ArceeRetriever` class to retrieve relevant document(s) for Arcee's `Domain Adapted Language Models` (`DALMs`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup\n", "\n", "Before using `ArceeRetriever`, make sure the Arcee API key is set as `ARCEE_API_KEY` environment variable. You can also pass the api key as a named parameter." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain_community.retrievers import ArceeRetriever\n", "\n", "retriever = ArceeRetriever(\n", " model=\"DALM-PubMed\",\n", " # arcee_api_key=\"ARCEE-API-KEY\" # if not already set in the environment\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Additional Configuration\n", "\n", "You can also configure `ArceeRetriever`'s parameters such as `arcee_api_url`, `arcee_app_url`, and `model_kwargs` as needed.\n", "Setting the `model_kwargs` at the object initialization uses the filters and size as default for all the subsequent retrievals." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "retriever = ArceeRetriever(\n", " model=\"DALM-PubMed\",\n", " # arcee_api_key=\"ARCEE-API-KEY\", # if not already set in the environment\n", " arcee_api_url=\"https://custom-api.arcee.ai\", # default is https://api.arcee.ai\n", " arcee_app_url=\"https://custom-app.arcee.ai\", # default is https://app.arcee.ai\n", " model_kwargs={\n", " \"size\": 5,\n", " \"filters\": [\n", " {\n", " \"field_name\": \"document\",\n", " \"filter_type\": \"fuzzy_search\",\n", " \"value\": \"Einstein\",\n", " }\n", " ],\n", " },\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retrieving documents\n", "You can retrieve relevant documents from uploaded contexts by providing a query. Here's an example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = \"Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?\"\n", "documents = retriever.invoke(query)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Additional parameters\n", "\n", "Arcee allows you to apply `filters` and set the `size` (in terms of count) of retrieved document(s). Filters help narrow down the results. Here's how to use these parameters:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define filters\n", "filters = [\n", " {\"field_name\": \"document\", \"filter_type\": \"fuzzy_search\", \"value\": \"Music\"},\n", " {\"field_name\": \"year\", \"filter_type\": \"strict_search\", \"value\": \"1905\"},\n", "]\n", "\n", "# Retrieve documents with filters and size params\n", "documents = retriever.invoke(query, size=5, filters=filters)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }