Commit Graph

202 Commits (master)

Author SHA1 Message Date
Davis Chase 2b2176a3c1
tfidf retriever (#5114)
Co-authored-by: vempaliakhil96 <vempaliakhil96@gmail.com>
1 year ago
Tian Wei d7f807b71f
Add AzureCognitiveServicesToolkit to call Azure Cognitive Services API (#5012)
# Add AzureCognitiveServicesToolkit to call Azure Cognitive Services
API: achieve some multimodal capabilities

This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles
the following tools:
- AzureCogsImageAnalysisTool: calls Azure Cognitive Services image
analysis API to extract caption, objects, tags, and text from images.
- AzureCogsFormRecognizerTool: calls Azure Cognitive Services form
recognizer API to extract text, tables, and key-value pairs from
documents.
- AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to
text API to transcribe speech to text.
- AzureCogsText2SpeechTool: calls Azure Cognitive Services text to
speech API to synthesize text to speech.

This toolkit can be used to process image, document, and audio inputs.
---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Jamie Broomall d4fd589638
WhyLabs callback (#4906)
# Add a WhyLabs callback handler

* Adds a simple WhyLabsCallbackHandler
* Add required dependencies as optional
* protect against missing modules with imports
* Add docs/ecosystem basic example

based on initial prototype from @andrewelizondo

> this integration gathers privacy preserving telemetry on text with
whylogs and sends stastical profiles to WhyLabs platform to monitoring
these metrics over time. For more information on what WhyLabs is see:
https://whylabs.ai

After you run the notebook (if you have env variables set for the API
Keys, org_id and dataset_id) you get something like this in WhyLabs:
![Screenshot
(443)](https://github.com/hwchase17/langchain/assets/88007022/6bdb3e1c-4243-4ae8-b974-23a8bb12edac)

Co-authored-by: Andre Elizondo <andre@whylabs.ai>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Matt Rickard de6a401a22
Add OpenLM LLM multi-provider (#4993)
OpenLM is a zero-dependency OpenAI-compatible LLM provider that can call
different inference endpoints directly via HTTP. It implements the
OpenAI Completion class so that it can be used as a drop-in replacement
for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added
code.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Gergely Imreh 69de33e024
Add Mastodon toots loader (#5036)
# Add Mastodon toots loader.

Loader works either with public toots, or Mastodon app credentials. Toot
text and user info is loaded.

I've also added integration test for this new loader as it works with
public data, and a notebook with example output run now.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Michael Landis 6eacd88ae7
fix: revert docarray explicit transitive dependencies and use extras instead (#5015)
tldr: The docarray [integration
PR](https://github.com/hwchase17/langchain/pull/4483) introduced a
pinned dependency to protobuf. This is a docarray dependency, not a
langchain dependency. Since this is handled by the docarray
dependencies, it is unnecessary here.

Further, as a pinned dependency, this quickly leads to incompatibilities
with application code that consumes the library. Much less with a
heavily used library like protobuf.

Detail: as we see in the [docarray

integration](https://github.com/hwchase17/langchain/pull/4483/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711R81-R83),
the transitive dependencies of docarray were also listed as langchain
dependencies. This is unnecessary as the docarray project has an
appropriate
[extras](a01a05542d/pyproject.toml (L70)).
The docarray project also does not require this _pinned_ version of
protobuf, rather [a minimum
version](a01a05542d/pyproject.toml (L41)).
So this pinned version was likely in error.

To fix this, this PR reverts the explicit hnswlib and protobuf
dependencies and adds the hnswlib extras install for docarray (which
installs hnswlib and protobuf, as originally intended). Because version
`0.32.0`
of the docarray hnswlib extras added protobuf, we bump the docarray
dependency from `^0.31.0` to `^0.32.0`.

# revert docarray explicit transitive dependencies and use extras
instead

## Who can review?

@dev2049 -- reviewed the original PR
@eyurtsev -- bumped the pinned protobuf dependency a few days ago

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase 10ba201d05
Harrison/neo4j (#5078)
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Harrison Chase b0431c672b
Harrison/psychic (#5063)
Co-authored-by: Ayan Bandyopadhyay <ayanb9440@gmail.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Davis Chase 080eb1b3fc
Fix graphql tool (#4984)
Fix construction and add unit test.
1 year ago
Mike McGarry ddd595fe81
feature/4493 Improve Evernote Document Loader (#4577)
# Improve Evernote Document Loader

When exporting from Evernote you may export more than one note.
Currently the Evernote loader concatenates the content of all notes in
the export into a single document and only attaches the name of the
export file as metadata on the document.

This change ensures that each note is loaded as an independent document
and all available metadata on the note e.g. author, title, created,
updated are added as metadata on each document.

It also uses an existing optional dependency of `html2text` instead of
`pypandoc` to remove the need to download the pandoc application via
`download_pandoc()` to be able to use the `pypandoc` python bindings.

Fixes #4493 

Co-authored-by: Mike McGarry <mike.mcgarry@finbourne.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Eugene Yurtsev e46202829f
feat #4479: TextLoader auto detect encoding and improved exceptions (#4927)
# TextLoader auto detect encoding and enhanced exception handling

- Add an option to enable encoding detection on `TextLoader`. 
- The detection is done using `chardet`
- The loading is done by trying all detected encodings by order of
confidence or raise an exception otherwise.

### New Dependencies:
- `chardet`

Fixes #4479 

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

- @eyurtsev

---------

Co-authored-by: blob42 <spike@w530>
1 year ago
Davis Chase 8966f61ca5
Zep memory (#4898)
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
1 year ago
Eugene Yurtsev c5ab9782c6
Add beautiful soup 4 to extended testing extra (#4869)
# Add bs4 to extended testing extra

Updating extended testing extra in preparation for more refactors.
1 year ago
Adam Quigley e78c9be312
Add Confluence Loader unit tests (#3333)
Adds some basic unit tests for the ConfluenceLoader that can be extended
later. Ports this [PR from
llama-hub](https://github.com/emptycrown/llama-hub/pull/208) and adapts
it to `langchain`.

@Jflick58 and @zywilliamli adding you here as potential reviewers

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Raduan Al-Shedivat 00c6ec8a2d
fix(document_loaders/telegram): fix pandas calls + add tests (#4806)
# Fix Telegram API loader + add tests.
I was testing this integration and it was broken with next error:
```python
message_threads = loader._get_message_threads(df)
KeyError: False
```
Also, this particular loader didn't have any tests / related group in
poetry, so I added those as well.

@hwchase17 / @eyurtsev please take a look on this fix PR.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Eugene Yurtsev c3b6129beb
Block sockets for unit-tests (#4803)
# Block usage of sockets during unit tests

Catch any tests that attempt to use the network.
1 year ago
Eugene Yurtsev d403f659ea
Update google protobuf dep (#4798)
# Update google protobuf dep

Resolve: https://github.com/hwchase17/langchain/security/dependabot/11
1 year ago
Eugene Yurtsev 3ecd7c9641
Add check to verify poetry.toml (#4794)
# Add poetry check to github action

Check poetry toml file during tests for errors
1 year ago
Eugene Yurtsev 14bedf1cc5
Github Action: Fix poetry lock file checking (#4789)
Fix how poetry lock file is checked to avoid skipping caches silently.
1 year ago
Roma cb802edf75
[Feature] Add GraphQL Query Tool (#4409)
# Add GraphQL Query Support

This PR introduces a GraphQL API Wrapper tool that allows LLM agents to
query GraphQL databases. The tool utilizes the httpx and gql Python
packages to interact with GraphQL APIs and provides a simple interface
for running queries with LLM agents.

@vowelparrot

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
1 year ago
Eugene Yurtsev 09587a3201
Clean up tests for pdf parsers (#4595)
# Organize tests for pdf parsers

Clean up tests for pdf parsers, remove duplicate tests, convert to unit
tests.
1 year ago
Eugene Yurtsev 3c490b5ba3
Docugami DataLoader (#4727)
### Adds a document loader for Docugami

Specifically:

1. Adds a data loader that talks to the [Docugami](http://docugami.com)
API to download processed documents as semantic XML
2. Parses the semantic XML into chunks, with additional metadata
capturing chunk semantics
3. Adds a detailed notebook showing how you can use additional metadata
returned by Docugami for techniques like the [self-querying
retriever](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query_retriever.html)
4. Adds an integration test, and related documentation

Here is an example of a result that is not possible without the
capabilities added by Docugami (from the notebook):

<img width="1585" alt="image"
src="https://github.com/hwchase17/langchain/assets/749277/bb6c1ce3-13dc-4349-a53b-de16681fdd5b">

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>
1 year ago
Harrison Chase cdc20d1203
Harrison/json loader fix (#4686)
Co-authored-by: Triet Le <112841660+triet-lq-holistics@users.noreply.github.com>
1 year ago
Eugene Yurtsev 08ed927c32
Turn on extended tests (#4588)
# Turn on strict extended tests

This PR turns on strict testing for extended tests.
1 year ago
Zander Chase d96f6a106b
Add Steamship Image Generation Tool (#4580)
Co-authored-by: Enias Cailliau <enias@steamship.com>
1 year ago
Davis Chase 46b100ea63
Add DocArray vector stores (#4483)
Thanks to @anna-charlotte and @jupyterjazz for the contribution! Made
few small changes to get it across the finish line

---------

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>
Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
Co-authored-by: anna-charlotte <charlotte.gerhaher@jina.ai>
Co-authored-by: jupyterjazz <saba.sturua@jina.ai>
Co-authored-by: Saba Sturua <45267439+jupyterjazz@users.noreply.github.com>
1 year ago
Eugene Yurtsev 80558b5b27
Add workflow for testing with all deps (#4410)
# Add action to test with all dependencies installed

PR adds a custom action for setting up poetry that allows specifying a
cache key:
https://github.com/actions/setup-python/issues/505#issuecomment-1273013236

This makes it possible to run 2 types of unit tests: 

(1) unit tests with only core dependencies
(2) unit tests with extended dependencies (e.g., those that rely on an
optional pdf parsing library)


As part of this PR, we're moving some pdf parsing tests into the
unit-tests section and making sure that these unit tests get executed
when running with extended dependencies.
1 year ago
Aivin V. Solatorio 6567b73e1a
JSON loader (#4067)
This implements a loader of text passages in JSON format. The `jq`
syntax is used to define a schema for accessing the relevant contents
from the JSON file. This requires dependency on the `jq` package:
https://pypi.org/project/jq/.

---------

Signed-off-by: Aivin V. Solatorio <avsolatorio@gmail.com>
1 year ago
Harrison Chase fba6921b50
Harrison/one drive loader (#4081)
Co-authored-by: José Ferraz Neto <netoferraz@gmail.com>
1 year ago
Harrison Chase bd7e0a534c
Harrison/csv loader (#3771)
Co-authored-by: mrT23 <tal.r@codium.ai>
1 year ago
Harrison Chase c55ba43093
Harrison/vespa (#3761)
Co-authored-by: Lester Solbakken <lesters@users.noreply.github.com>
1 year ago
Davis Chase b807a114e4
Add query parsing unit tests (#3672) 1 year ago
Davis Chase 3b609642ae
Self-query with generic query constructor (#3607)
Alternate implementation of #3452 that relies on a generic query
constructor chain and language and then has vector store-specific
translation layer. Still refactoring and updating examples but general
structure is there and seems to work s well as #3452 on exampels

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase a35bbbfa9e
Harrison/lancedb (#3634)
Co-authored-by: Minh Le <minhle@canva.com>
1 year ago
Eduard van Valkenburg a3e3f26090
Some more PowerBI pydantic and import fixes (#3461) 1 year ago
Eduard van Valkenburg ba7a5ac9d7
Azure CosmosDB memory (#3434)
Still needs docs, otherwise works.
1 year ago
Davit Buniatyan 2c0023393b
Deep Lake mini upgrades (#3375)
Improvements
* set default num_workers for ingestion to 0
* upgraded notebooks for avoiding dataset creation ambiguity
* added `force_delete_dataset_by_path`
* bumped deeplake to 3.3.0
* creds arg passing to deeplake object that would allow custom S3

Notes
* please double check if poetry is not messed up (thanks!)

Asks
* Would be great to create a shared slack channel for quick questions

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
1 year ago
Harrison Chase a6664be79c
Harrison/myscale (#3352)
Co-authored-by: Fangrui Liu <fangruil@moqi.ai>
Co-authored-by: 刘 方瑞 <fangrui.liu@outlook.com>
Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>
1 year ago
Harrison Chase cc6fe18152
Harrison/power bi (#3205)
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
1 year ago
Harrison Chase d2520a5f1e
Harrison/ddg (#3206)
Co-authored-by: itai <itai.marks@gmail.com>
Co-authored-by: Itai Marks <itaim@users.noreply.github.com>
Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com>
Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>
Co-authored-by: Adilzhan Ismailov <13088690+aismlv@users.noreply.github.com>
Co-authored-by: Justin Flick <Justinjayflick@gmail.com>
Co-authored-by: Justin Flick <jflick@homesite.com>
1 year ago
Harrison Chase f19b3890c9
Harrison/site map tqdm (#3184)
Co-authored-by: Tianyi Pan <60060750+tipani86@users.noreply.github.com>
Co-authored-by: Tianyi Pan <tianyi.pan@clobotics.com>
1 year ago
Harrison Chase 68cd37175e
Harrison/arxiv tool (#3186)
Co-authored-by: leo-gan <leo.gan.57@gmail.com>
1 year ago
Harrison Chase afd3e70ae5
Harrison/confluent loader (#2994)
Co-authored-by: Justin Flick <Justinjayflick@gmail.com>
1 year ago
vowelparrot 5ca7ce77cd
Remove pythonrepl from LLM-MathChain (#2943)
Use numexpr evaluate instead of the python REPL to avoid malicious code
injection.

Tested against the (limited) math dataset and got the same score as
before.

For more permissive tools (like the REPL tool itself), other approaches
ought to be provided (some combination of Sanitizer + Restricted python
+ unprivileged-docker + ...), but for a calculator tool, only
mathematical expressions should be permitted.

See https://github.com/hwchase17/langchain/issues/814
1 year ago
Ankush Gola ec59e9d886
Fix ChatAnthropic stop_sequences error (#2919) (#2920)
Note to self: Always run integration tests, even on "that last minute
change you thought would be safe" :)

---------

Co-authored-by: Mike Lambert <mike.lambert@anthropic.com>
1 year ago
Harrison Chase 1e9378d0a8
Harrison/weaviate fixes (#2872)
Co-authored-by: cs0lar <cristiano.solarino@gmail.com>
Co-authored-by: cs0lar <cristiano.solarino@brightminded.com>
1 year ago
sergerdn 04c458a270
feat: improve pinecone tests (#2806)
Improve the integration tests for Pinecone by adding an `.env.example`
file for local testing. Additionally, add some dev dependencies
specifically for integration tests.

This change also helps me understand how Pinecone deals with certain
things, see related issues
https://github.com/hwchase17/langchain/issues/2484
https://github.com/hwchase17/langchain/issues/2816
1 year ago
Harrison Chase e49f1e628c
Harrison/gpt cache (#2744)
Co-authored-by: SimFG <bang.fu@zilliz.com>
1 year ago
Harrison Chase 507cee5ee5
Harrison/pinecone hybrid update (#2742)
Co-authored-by: acatav <39461369+acatav@users.noreply.github.com>
Co-authored-by: Amnon Catav <catav.amnon1@gmail.com>
1 year ago
sergerdn 4bdcedab54
fix: some imports for integration tests (#2612)
Add more missed imports for integration tests. Bump `pytest` to the
current latest version.
Fix `tests/integration_tests/vectorstores/test_elasticsearch.py` to
update its cassette(easy fix).

Related PR: https://github.com/hwchase17/langchain/pull/2560
1 year ago
sergerdn cd9336469e
fix: missed deps integrations tests (#2560)
Almost all integration tests have failed, but we haven't encountered any
import errors yet. Some tests failed due to lazy import issues. It
doesn't seem like a problem to resolve some of these errors in the next
PR.
I have a headache from resolving conflicts with `deeplake` and `boto3`,
so I will temporarily comment out `boto3`.


fix https://github.com/hwchase17/langchain/issues/2426
1 year ago
Kacper Łukawski d8967e28d0
Upgrade Qdrant to 1.1.2 (#2554)
This is a minor upgrade for Qdrant. We made a small bugfix in the local
mode, so it might also be good to upgrade Qdrant for LangChain users.
1 year ago
sergerdn 6dc86ad48f
feat: add pytest-vcr for recording HTTP interactions in integration tests (#2445)
Using `pytest-vcr` in integration tests has several benefits. Firstly,
it removes the need to mock external services, as VCR records and
replays HTTP interactions on the fly. Secondly, it simplifies the
integration test setup by eliminating the need to set up and tear down
external services in some cases. Finally, it allows for more reliable
and deterministic integration tests by ensuring that HTTP interactions
are always replayed with the same response.
Overall, `pytest-vcr` is a valuable tool for simplifying integration
test setup and improving their reliability

This commit adds the `pytest-vcr` package as a dependency for
integration tests in the `pyproject.toml` file. It also introduces two
new fixtures in `tests/integration_tests/conftest.py` files for managing
cassette directories and VCR configurations.

In addition, the
`tests/integration_tests/vectorstores/test_elasticsearch.py` file has
been updated to use the `@pytest.mark.vcr` decorator for recording and
replaying HTTP interactions.

Finally, this commit removes the `documents` fixture from the
`test_elasticsearch.py` file and replaces it with a new fixture defined
in `tests/integration_tests/vectorstores/conftest.py` that yields a list
of documents to use in any other tests.

This also includes my second attempt to fix issue :
https://github.com/hwchase17/langchain/issues/2386

Maybe related https://github.com/hwchase17/langchain/issues/2484
1 year ago
Harrison Chase 26314d7004
Harrison/openapi parser (#2461)
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
1 year ago
sergerdn b410dc76aa
fix: elasticsearch (#2402)
- Create a new docker-compose file to start an Elasticsearch instance
for integration tests.
- Add new tests to `test_elasticsearch.py` to verify Elasticsearch
functionality.
- Include an optional group `test_integration` in the `pyproject.toml`
file. This group should contain dependencies for integration tests and
can be installed using the command `poetry install --with
test_integration`. Any new dependencies should be added by running
`poetry add some_new_deps --group "test_integration" `

Note:
New tests running in live mode, which involve end-to-end testing of the
OpenAI API. In the future, adding `pytest-vcr` to record and replay all
API requests would be a nice feature for testing process.More info:
https://pytest-vcr.readthedocs.io/en/latest/

Fixes https://github.com/hwchase17/langchain/issues/2386
1 year ago
Kacper Łukawski 585f60a5aa
Qdrant update to 1.1.1 & docs polishing (#2388)
This PR updates Qdrant to 1.1.1 and introduces local mode, so there is
no need to spin up the Qdrant server. By that occasion, the Qdrant
example notebooks also got updated, covering more cases and answering
some commonly asked questions. All the Qdrant's integration tests were
switched to local mode, so no Docker container is required to launch
them.
1 year ago
sergerdn 870cd33701
fix: testing in Windows and add missing dev dependency (#2340)
This changes addresses two issues.

First, we add `setuptools` to the dev dependencies in order to debug
tests locally with an IDE, especially with PyCharm. All dependencies dev
dependencies should be installed with `poetry install --extras "dev"`.

Second, we use PurePosixPath instead of Path for URL paths to fix issues
with testing in Windows. This ensures that forward slashes are used as
the path separator regardless of the operating system.

Closes https://github.com/hwchase17/langchain/issues/2334
1 year ago
Mike Lambert 393cd3c796
Bump anthropic version (#2352)
Improves async support (and a few other bug fixes I'd prefer folks be
forced to grab)
1 year ago
Harrison Chase b35260ed47
Harrison/memory base (#2122)
@3coins + @zoltan-fedor.... heres the pr + some minor changes i made.
thoguhts? can try to get it into tmrws release

---------

Co-authored-by: Zoltan Fedor <zoltan.0.fedor@gmail.com>
Co-authored-by: Piyush Jain <piyushjain@duck.com>
1 year ago
Ankush Gola ccee1aedd2
add async support for anthropic (#2114)
should not be merged in before
https://github.com/anthropics/anthropic-sdk-python/pull/11 gets released
1 year ago
Honkware aff33d52c5
Add OpenWeatherMap API Tool (#2083)
Added tool for OpenWeatherMap API
1 year ago
Harrison Chase f281033362
rm pandas dependency (#2102) 1 year ago
Harrison Chase eff5eed719
Harrison/jina (#2043)
Co-authored-by: numb3r3 <wangfelix87@gmail.com>
Co-authored-by: felix-wang <35718120+numb3r3@users.noreply.github.com>
1 year ago
Harrison Chase 30e3b31b04
Harrison/document cleanup (#2062)
Co-authored-by: Delip Rao <delip@users.noreply.github.com>
1 year ago
Memento Mori 31f9ecfc19
Fix tiktoken version (#1882)
Fix https://github.com/hwchase17/langchain/issues/1881
This issue occurs when using `'gpt-3.5-turbo'` with
`VectorDBQAWithSourcesChain`
1 year ago
Alexandros Mavrogiannis 5d8dc83ede
Bump duckdb-engine to 0.7.0 (#1726)
Resolves https://github.com/hwchase17/langchain/issues/1272
Resolves https://github.com/hwchase17/langchain/issues/1578
1 year ago
Harrison Chase 0b29e68c17
Harrison/pgvector (#1679)
Co-authored-by: Aman Kumar <krsingh.aman@gmail.com>
1 year ago
Eugene Yurtsev bd4a2a670b
Add copy button to sphinx notebooks (#1622)
This adds a copy button at the top right corner of all notebook cells in
sphinx
notebooks.
1 year ago
Kacper Łukawski 9ac442624c
Add Qdrant named arguments (#1386)
This PR:
- Increases `qdrant-client` version to 1.0.4
- Introduces custom content and metadata keys (as requested in #1087)
- Moves all the `QdrantClient` parameters into the method parameters to
simplify code completion
1 year ago
Harrison Chase 166cda2cc6
Harrison/deeplake (#1316)
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
1 year ago
Harrison Chase aaad6cc954
Harrison/atlas db (#1315)
Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>
1 year ago
Harrison Chase 002da6edc0
ruff ruff (#1203) 1 year ago
Harrison Chase 0963096491
fix imports (#1288) 1 year ago
Dennis Antela Martinez 53c67e04d4
add aleph alpha llm (#1207)
Integrate Aleph Alpha's client into Langchain to provide access to the
luminous models - more info on latest benchmarks here:
https://www.aleph-alpha.com/luminous-performance-benchmarks
1 year ago
Naveen Tatikonda 0118706fd6
Add Support for OpenSearch Vector database (#1191)
### Description
This PR adds a wrapper which adds support for the OpenSearch vector
database. Using opensearch-py client we are ingesting the embeddings of
given text into opensearch cluster using Bulk API. We can perform the
`similarity_search` on the index using the 3 popular searching methods
of OpenSearch k-NN plugin:

- `Approximate k-NN Search` use approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Also, supports
brute force, exact k-NN search like Script Scoring.

### Issues Resolved 
https://github.com/hwchase17/langchain/issues/1054

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
Harrison Chase 88bebb4caa
Harrison/llm integrations (#1039)
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
1 year ago
Harrison Chase 0c553d2064
Harrion/kg (#1016)
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
1 year ago
Harrison Chase 231da14771
bump version to 0082 (#980)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
1 year ago
Harrison Chase c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
1 year ago
Harrison Chase 5f8082bdd7
Harrison/deps (#963)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
1 year ago
Ankush Gola bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841)
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)

Summary of changes:

**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`

**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now

**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpaPI `arun`

Create demo notebook.

Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
1 year ago
Harrison Chase d43250bfa5
Harrison/ver0079 (#927) 1 year ago
Harrison Chase bc53c928fc
Harrison/athropic (#921)
Co-authored-by: Mike Lambert <mlambert@gmail.com>
Co-authored-by: mrbean <sam@you.com>
Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
1 year ago
Ankush Gola 933441cc52
Add retry to OpenAI llm (#849)
add ability to retry when certain exceptions are raised by
`openai.Completions.create`

Test plan: ran all OpenAI integration tests.
1 year ago
Harrison Chase 7b4882a2f4
Harrison/tf embeddings (#817)
Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>
1 year ago
Roy Williams 6086292252
Centralize logic for loading from LangChainHub, add ability to pin dependencies (#805)
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.

Centralizing this logic and using urllib fixes an issue identified by
some windows users highlighted in this video -
https://youtu.be/aJ6IQUh8MLQ?t=537
1 year ago
Ankush Gola 57609845df
add tracing support to langchain (#741)
* add implementations of `BaseCallbackHandler` to support tracing:
`SharedTracer` which is thread-safe and `Tracer` which is not and is
meant to be used locally.
* Tracers persist runs to locally running `langchain-server`

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase 0b204d8c21
Harrison/quadrant (#665)
Co-authored-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>
1 year ago
Harrison Chase ffc7e04d44
Harrison/wolfram alpha (#579)
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
1 year ago
Harrison Chase 9753bccc71
Feature: linkcheck-action (#534) (#542)
- Add support for local build and linkchecking of docs
- Add GitHub Action to automatically check links before prior to
publication
- Minor reformat of Contributing readme
- Fix existing broken links

Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>

Co-authored-by: Hunter Gerlach <HunterGerlach@users.noreply.github.com>
Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
1 year ago
Harrison Chase 0072686aab
Harrison/new search engine (#477)
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
1 year ago
Harrison Chase 95157d0aad
Add schema property to sql database utility class (#448) (#462)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>

Signed-off-by: Diwank Singh Tomer <diwank.singh@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Diwank Singh Tomer <diwank.singh@gmail.com>
1 year ago
Nuno Campos 451665cfdf
Add watch mode for test runner (#453) 1 year ago
Samantha Whitmore 6bc8ae63ef
Add Redis cache implementation (#397)
I'm using a hash function for the key just to make sure its length
doesn't get out of hand, otherwise the implementation is quite similar.
1 year ago
Harrison Chase c104d507bf
Harrison/improve data augmented generation docs (#390)
Co-authored-by: cameronccohen <cameron.c.cohen@gmail.com>
Co-authored-by: Cameron Cohen <cameron.cohen@quantco.com>
1 year ago
Harrison Chase ffed5e0056
Harrison/jinja formatter (#385)
Co-authored-by: Benjamin <BenderV@users.noreply.github.com>
1 year ago
mrbean fe6695b9e7
Add HuggingFacePipeline LLM (#353)
https://github.com/hwchase17/langchain/issues/354

Add support for running your own HF pipeline locally. This would allow
you to get a lot more dynamic with what HF features and models you
support since you wouldn't be beholden to what is hosted in HF hub. You
could also do stuff with HF Optimum to quantize your models and stuff to
get pretty fast inference even running on a laptop.
1 year ago
Harrison Chase 2dd895d98c
add openai tokenizer (#355) 1 year ago
Hunter Gerlach 482611f426
unit test / code coverage improvements (#322)
This PR has two contributions:

1. Add test for when stop token is found in middle of text

2. Add code coverage tooling and instructions
- Add pytest-cov via poetry
- Add necessary config files
- Add new make instruction for `coverage`
- Update README with coverage guidance
- Update minor README formatting/spelling

Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
1 year ago
Harrison Chase 28be37f470
LLMRequestsChain (#267) 2 years ago
Harrison Chase 5cd6956d58
Harrison/version 0028 (#259) 2 years ago
Steven Hoelscher 98fb19b535
chore: use poetry as dependency manager (#242)
* Adopts [Poetry](https://python-poetry.org/) as a dependency manager
* Introduces dependency version requirements
* Deprecates Python 3.7 support

**TODO**
- [x] Update developer guide
- [x] Add back `playwright`, `manifest-ml`, and `jupyter` to dependency
group

**Not Doing => Fast Follow**
- Investigate single source for version, perhaps relying on GitHub tags
and [tackling this
issue](https://github.com/hwchase17/langchain/issues/26)
2 years ago