{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Arcee\n",
    "\n",
    ">[Arcee](https://www.arcee.ai/about/about-us) helps with the development of the SLMs—small, specialized, secure, and scalable language models.\n",
    "\n",
    "This notebook demonstrates how to use the `ArceeRetriever` class to retrieve relevant document(s) for Arcee's `Domain Adapted Language Models` (`DALMs`)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup\n",
    "\n",
    "Before using `ArceeRetriever`, make sure the Arcee API key is set as `ARCEE_API_KEY` environment variable. You can also pass the api key as a named parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_community.retrievers import ArceeRetriever\n",
    "\n",
    "retriever = ArceeRetriever(\n",
    "    model=\"DALM-PubMed\",\n",
    "    # arcee_api_key=\"ARCEE-API-KEY\" # if not already set in the environment\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Additional Configuration\n",
    "\n",
    "You can also configure `ArceeRetriever`'s parameters such as `arcee_api_url`, `arcee_app_url`, and `model_kwargs` as needed.\n",
    "Setting the `model_kwargs` at the object initialization uses the filters and size as default for all the subsequent retrievals."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "retriever = ArceeRetriever(\n",
    "    model=\"DALM-PubMed\",\n",
    "    # arcee_api_key=\"ARCEE-API-KEY\", # if not already set in the environment\n",
    "    arcee_api_url=\"https://custom-api.arcee.ai\",  # default is https://api.arcee.ai\n",
    "    arcee_app_url=\"https://custom-app.arcee.ai\",  # default is https://app.arcee.ai\n",
    "    model_kwargs={\n",
    "        \"size\": 5,\n",
    "        \"filters\": [\n",
    "            {\n",
    "                \"field_name\": \"document\",\n",
    "                \"filter_type\": \"fuzzy_search\",\n",
    "                \"value\": \"Einstein\",\n",
    "            }\n",
    "        ],\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Retrieving documents\n",
    "You can retrieve relevant documents from uploaded contexts by providing a query. Here's an example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "query = \"Can AI-driven music therapy contribute to the rehabilitation of patients with disorders of consciousness?\"\n",
    "documents = retriever.invoke(query)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Additional parameters\n",
    "\n",
    "Arcee allows you to apply `filters` and set the `size` (in terms of count) of retrieved document(s). Filters help narrow down the results. Here's how to use these parameters:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define filters\n",
    "filters = [\n",
    "    {\"field_name\": \"document\", \"filter_type\": \"fuzzy_search\", \"value\": \"Music\"},\n",
    "    {\"field_name\": \"year\", \"filter_type\": \"strict_search\", \"value\": \"1905\"},\n",
    "]\n",
    "\n",
    "# Retrieve documents with filters and size params\n",
    "documents = retriever.invoke(query, size=5, filters=filters)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}