Commit Graph

9400 Commits (master)
 

Author SHA1 Message Date
Massimiliano Pronesti 0c0db7c5db
feat(community): support semantic hybrid score threshold in Azure AI Search (#21527)
Support semantic hybrid search with a score threshold -- similar to what
we do for similarity search and for hybrid search (#20907).
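
As a rough illustration of what a score threshold does, the sketch below filters scored results in plain Python. `filter_by_score` and the sample data are hypothetical, and the inclusive `>=` comparison is an assumption; none of this is the Azure AI Search integration's API.

```python
from typing import List, Tuple


def filter_by_score(
    results: List[Tuple[str, float]], score_threshold: float
) -> List[Tuple[str, float]]:
    """Keep only (document, score) pairs whose score meets the threshold."""
    return [(doc, score) for doc, score in results if score >= score_threshold]


# Hypothetical search results: (document text, relevance score).
hits = [("doc a", 0.91), ("doc b", 0.42), ("doc c", 0.77)]
kept = filter_by_score(hits, score_threshold=0.75)
print(kept)
```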
4 days ago
Erick Friis 5e445a7e4e
docs: dont rewrite ipynb links that have double slash (#21775) 4 days ago
Eugene Yurtsev e3a03b324d
docs: concepts -- add information about tool calling models, update tools section (#21760)
- Add information about native tool calling capabilities
- Add information about standard langchain interface for tool calling
- Update description for tools

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
4 days ago
Bagatur 6416d16d39
anthropic[patch]: Release 0.1.13, tool_choice support (#21773) 4 days ago
Stefano Lottini 040597e832
community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765)
This PR improves on the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and also introduces
several minor improvements around these classes.

### Init signature

A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.

In this way, a much less unwieldy instantiation can be done, such as
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
everything else falling back to defaults.

A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. As a way to
mitigate this problem, this PR typechecks its first argument trying to
detect the legacy usage.
(And to make this point less tricky in the future, most arguments are
now keyword-only.)

If this is considered too harsh, I'd like guidance on how to further
smoothen this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so that a
repeatable strategy would be ideal. A possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, to later remove them with 0.2 - please
advise on this point.
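
The pattern described above can be sketched roughly as follows. `SketchSemanticCache` and `FakeSession` are hypothetical stand-ins (not the real `CassandraSemanticCache` or a real driver session), and raising `TypeError` on legacy usage is an assumption made for this sketch only.

```python
from typing import Any, Optional


class FakeSession:
    """Stand-in for a DB session object (hypothetical)."""


class SketchSemanticCache:
    """Hypothetical cache; not the real CassandraSemanticCache."""

    def __init__(
        self,
        embedding: Any = None,
        *,
        session: Optional[FakeSession] = None,
        keyspace: Optional[str] = None,
    ) -> None:
        # Legacy callers passed the session first; detect that by type.
        if isinstance(embedding, FakeSession):
            raise TypeError(
                "Legacy positional usage detected: pass session= and "
                "keyspace= as keyword arguments."
            )
        self.embedding = embedding
        # None means "fall back to a globally initialized connection".
        self.session = session
        self.keyspace = keyspace


# New-style usage: connection arguments are simply omitted.
cache = SketchSemanticCache(embedding="my-embedding")

# Legacy-style usage: the first positional argument is a session.
try:
    SketchSemanticCache(FakeSession())
    legacy_detected = False
except TypeError:
    legacy_detected = True
print(legacy_detected)
```

Making `session` and `keyspace` keyword-only means later reordering cannot silently break positional callers.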

### Other changes

- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: uniform flow with the Cassandra and Astra DB
separate cases; better and Cassandra-first description; all imports made
explicit and from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
4 days ago
fzowl 8db4a14648
docs: new voyageai text_embeddings model: voyage-large-2-instruct (#21706) 4 days ago
Bagatur 901e09aa30
docs: datacamp course (#21767) 4 days ago
Kyle Cassidy eca8c4bcc6
Standardized openai init params (#21739)
## Patch Summary
community:openai[patch]: standardize init args

## Details
I made changes to the OpenAI Chat API wrapper test in the LangChain
open-source repository.

- **File**: `libs/community/tests/unit_tests/chat_models/test_openai.py`
- **Changes**:
  - Updated `max_retries` with Pydantic Field
  - Updated the corresponding unit test
- **Related Issues**: #20085

---------

Co-authored-by: JuHyung Son <sonju0427@gmail.com>
4 days ago
laishzh c03fd93fc1
docs: Remove unnecessary comment marks from the Makefile help section (#21749)
**Previous screenshot:**
<img width="758" alt="image"
src="https://github.com/langchain-ai/langchain/assets/1683919/7b90626e-35ab-4486-b41d-b664e69eec0b">

**Current:**
<img width="744" alt="image"
src="https://github.com/langchain-ai/langchain/assets/1683919/cdb69512-dc6c-4b7f-a466-4be92d94c076">
4 days ago
Ethan Yang e44b448ec3
community: update openvino doc with streaming support (#21519)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 days ago
Eugene Yurtsev 7022260bc5
How to: Streaming (#21715)
Update the how to guide on streaming

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
4 days ago
ccurme 19e6bf814b
community: fix CI (#21766) 4 days ago
Michael Ozery dda5a9c97a
docs: sql_qa.ipynb tutorial update (#21756)
1. Updated deprecated method usage.
2. Added LangGraph required installation in tutorial.

X: MichaelOzery
4 days ago
Mish Ushakov d77e60a7f4
community: updated Browserbase loader (#21757)
- [x] **PR title**: "community: updated Browserbase loader"

- [x] **PR message**:
    Updates the Browserbase loader with more options and improved docs.
4 days ago
Ikko Eltociear Ashimine 1e6517ba73
docs: update sql_large_db.ipynb (#21765)
mispelling -> misspelling
4 days ago
Eugene Yurtsev 6ed0aa3239
core[major]: only use function description (#21622)
Do not prefix function signature

---

* Reason for this is that information is already present with tool
calling models.
* This will save on tokens for those models, and makes it more obvious
what the description is!
* The `@tool` decorator can get more parameters to allow a user to re-introduce
the signature if we want.
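
To make the difference concrete, here is a minimal sketch using only the standard library. `describe` and its `include_signature` switch are hypothetical helpers for illustration; the signature-prefixed format shown is an approximation, not necessarily the exact string the library produced.

```python
import inspect


def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int


def describe(func, include_signature: bool = False) -> str:
    """Build a tool description from a function's docstring.

    `include_signature` is a hypothetical switch for this sketch only.
    """
    doc = inspect.getdoc(func) or ""
    if include_signature:
        # Older style: prefix the description with name and signature.
        return f"{func.__name__}{inspect.signature(func)} - {doc}"
    # Newer style: the docstring alone.
    return doc


print(describe(multiply))
print(describe(multiply, include_signature=True))
```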
4 days ago
William FH 8498b41cda
Finish agent migration doc (#21731) 4 days ago
Cheese 0ead09f84d
community: Implement `bind_tools` for ChatTongyi (#20725)
## Description

Implement `bind_tools` in ChatTongyi. Usage example:

```py
from langchain_core.tools import tool
from langchain_community.chat_models.tongyi import ChatTongyi

@tool
def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int

llm = ChatTongyi(model="qwen-turbo")

llm_with_tools = llm.bind_tools([multiply])

msg = llm_with_tools.invoke("What's 5 times forty two")

print(msg)
```

Streaming is also supported.

## Dependencies

No new dependencies are required for this change.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 days ago
yoogle b216a1dddb
docs: fix monorepo typo (#21761)
### Description
fix monorepo typo. `monorep` -> `monorepo`
4 days ago
Bagatur 347166874f
docs: aca-ds nit (#21759) 4 days ago
Bagatur 867adbf27b
docs: add aca-ds (#21746) 4 days ago
Bagatur 74f54599f4
docs: aza-ds cookbook (#21747) 4 days ago
Erick Friis be15740084
fireworks: add secret (#21744) 4 days ago
Erick Friis 06110e20b9
pinecone: bump min core version (#21742) 4 days ago
Erick Friis bd3e7d50f3
fireworks: bump min core version (#21741) 4 days ago
Erick Friis 1647b28a87
infra: release min version dont clobber current lib (#21740) 4 days ago
Erick Friis f5c31078d7
airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738) 5 days ago
Erick Friis 3d33b89fa4
ibm[patch]: release 0.1.7 (#21737) 5 days ago
Erick Friis e41d801369
openai[patch]: fix embedding float precision issue (#21736)
also clean up + comment some of the embedding batching code
5 days ago
JuHyung Son 38c297a025
upstage: Support batch input in embedding request. (#21730)
**Description:** upstage embedding now supports batch input.
5 days ago
junefish c5a981e3b4
docs: Update Pinecone example notebook with embedded widget (#21719)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 days ago
Erick Friis 0aea7f4b1d
docs: fix installation link (#21728) 5 days ago
Harrison Chase 15be439719
Harrison/move flashrank rerank (#21448)
third party integration, should be in community
5 days ago
Harrison Chase c6c2649a5a
move installation (#21711) 5 days ago
Erick Friis aca98fd150
multiple: releases with relaxed core dep (#21724) 5 days ago
Bagatur af284518bc
openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723) 5 days ago
Bagatur 0405933914
docs: add feedback link to 0.2 banner (#21600) 5 days ago
William FH ca768c8353
[Core] Check is async callable (#21714)
To permit proper coercion of objects like the following:


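```python
import asyncio


class MyAsyncCallable:
    async def __call__(self, foo):
        # Coroutine-style callable: awaits, then returns a value.
        return await asyncio.sleep(0, result=foo)


class MyAsyncGenerator:
    async def __call__(self, foo):
        # Async-generator-style callable: awaits, then yields.
        await asyncio.sleep(0)
        yield foo
```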
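
One way such a check could be written with the standard library is sketched below. `is_async_callable`, `is_async_generator`, and the two sample classes are hypothetical names for this illustration and are not taken from the PR's code.

```python
import inspect
from typing import Any


def is_async_callable(obj: Any) -> bool:
    """True if obj, or its __call__ method, is a coroutine function."""
    return inspect.iscoroutinefunction(obj) or (
        hasattr(obj, "__call__") and inspect.iscoroutinefunction(obj.__call__)
    )


def is_async_generator(obj: Any) -> bool:
    """True if obj, or its __call__ method, is an async generator function."""
    return inspect.isasyncgenfunction(obj) or (
        hasattr(obj, "__call__") and inspect.isasyncgenfunction(obj.__call__)
    )


class CoroCallable:
    async def __call__(self, foo):
        return foo


class GenCallable:
    async def __call__(self, foo):
        yield foo


print(is_async_callable(CoroCallable()), is_async_generator(GenCallable()))
```

Checking `__call__` matters because `inspect.iscoroutinefunction` applied to the instance itself returns `False`.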
5 days ago
ccurme 7128c2d8ad
docs: add tutorial for vector stores and retrievers (#21683)
also update how-to guide for parent document retriever
5 days ago
Eugene Yurtsev 5c2cfabec6
core[minor]: Add v2 implementation of astream events (#21638)
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with v1 implementation.

The v2 implementation significantly reduces both the amount of code
behind the astream events implementation and its overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this re-write, a number of issues were discovered with
the existing implementation.

## Changes in V2 vs. V1

### on_chat_model_end `output`

In v1, the output associated with `on_chat_model_end` differed depending
on whether the chat model ran within a chain or not.

As a root level runnable the output was: 

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain the output was:

```python
"data": {
    "output": {
        "generations": [
            [
                {
                    "generation_info": None,
                    "message": AIMessageChunk(
                        content="hello world!", id=AnyStr()
                    ),
                    "text": "hello world!",
                    "type": "ChatGenerationChunk",
                }
            ]
        ],
        "llm_output": None,
    }
},
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE** Non chat models (i.e., regular LLMs) are still associated with
the more verbose format.
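
Consumers that still need to handle both shapes could normalize them with something like the sketch below. `extract_message` is a hypothetical helper, and plain strings stand in for `AIMessageChunk` objects so the example stays self-contained.

```python
from typing import Any, Dict


def extract_message(data: Dict[str, Any]) -> Any:
    """Return the message from either `on_chat_model_end` payload shape."""
    output = data["output"]
    if isinstance(output, dict) and "generations" in output:
        # Verbose shape: generations is a list of lists of generation dicts.
        return output["generations"][0][0]["message"]
    # Simple shape: the output already is the message.
    return output


simple = {"output": "hello world!"}
verbose = {
    "output": {
        "generations": [[{"message": "hello world!", "text": "hello world!"}]],
        "llm_output": None,
    }
}
print(extract_message(simple) == extract_message(verbose))
```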

### Remove some `_stream` events

`on_retriever_stream` and `on_tool_stream` events were removed -- these
were not real events, but created as an artifact of implementing on top
of astream_log.

The same information is already available in the `x_on_end` events.

### Propagating Names

Names of runnables have been updated to be more consistent:

```python
model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
    messages=ConfigurableField(
        id="messages",
        name="Messages",
        description="Messages returned by the LLM",
    )
)
```

Before:
```python
"name": "RunnableConfigurableFields",
```

After:
```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

`on_retriever_end` will always return `output`, which is a list of
documents (rather than a dict containing a key called "documents").

### Retry events

Removed the `on_retry` callback handler. It was incorrectly showing that
the failed function being retried had invoked `on_chain_end`.


https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
5 days ago
Rajendra Kadam 54e003268e
langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641)
- **Description:** PebbloRetrievalQA chain introduces identity
enforcement using vector-db metadata filtering
- **Dependencies:** None
- **Issue:** None
- **Documentation:** Adding documentation for PebbloRetrievalQA chain in
a separate PR (https://github.com/langchain-ai/langchain/pull/20746)
- **Unit tests:** New unit-tests added

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
5 days ago
Bagatur f2f970f93d
docs: openai bind tools nit (#21692) 6 days ago
Erick Friis 5fa5a73dc0
docs: disable contextual search (#21691) 6 days ago
Erick Friis 3ee0747382
infra: remove prints from notebook build (#21688) 6 days ago
Erick Friis 024c11ff9c
docs: v0.2 search index (#21619) 6 days ago
Bagatur 241a6e43a5
docs: update structured how to (#21679) 6 days ago
Jib f369495fa0
mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608) 6 days ago
Eugene Yurtsev e69a9bedf8
core[patch]: Update mypy config (#21684)
Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)
6 days ago
Erick Friis 9973547aef
mongodb: release 0.1.4 (#21678) 6 days ago
Jib a97473c846
mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394) 6 days ago