ragflow

zlei6/ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-06 03:18:36 +08:00

Commit Graph

Author	SHA1	Message	Date

nickmopen

bebf6ed244

fix(llm): strip non-generation keys from gen_conf for LiteLLM providers (#15427) (#15432)

### What problem does this PR solve?

Fixes #15427.

All LiteLLM-routed chats fail with:

- Anthropic: `litellm.BadRequestError: AnthropicException -
{"type":"invalid_request_error","message":"model_type: Extra inputs are
not permitted"}`
- OpenAI: `litellm.BadRequestError: OpenAIException - Unknown parameter:
'model_type'`

This is a regression from v0.25.4.

#### Root cause

A chat assistant's `llm_setting` is forwarded to the model as
`gen_conf`. `llm_setting` can legitimately carry RAGFlow-internal
metadata such as `model_type` (the chat REST APIs in
`api/apps/restful_apis/` read it back out of `llm_setting`), so that key
ends up inside `gen_conf`.

`Base._clean_conf` (OpenAI-compatible providers) already **whitelists**
the keys it forwards, so direct-OpenAI providers were unaffected.
`LiteLLMBase._clean_conf` only dropped `max_tokens` and passed
everything else straight through to `litellm.acompletion`, which
forwarded `model_type` to the upstream provider — and Anthropic / OpenAI
reject it. Because both Claude and GPT route through LiteLLM, every chat
broke.

#### Fix

- Extract the allowed-key set into a shared `ALLOWED_GEN_CONF_KEYS`
constant and reuse it in `Base._clean_conf`.
- Apply the same whitelist in `LiteLLMBase._clean_conf`, plus the
LiteLLM-specific reasoning params (`thinking`, `reasoning_effort`,
`extra_body`) that the model-family policies inject for reasoning
models.

This covers all four LiteLLM completion paths (`async_chat`,
`async_chat_streamly`, `async_chat_with_tools`,
`async_chat_streamly_with_tools`), since they all route through
`_clean_conf`.
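
The whitelist approach described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the key set is a small assumed subset (the real contents of `ALLOWED_GEN_CONF_KEYS` are not listed in this message), and `clean_conf` / `clean_conf_litellm` are hypothetical stand-ins for the two `_clean_conf` methods.

```python
# Assumed, illustrative subset; the actual ALLOWED_GEN_CONF_KEYS is not shown here.
ALLOWED_GEN_CONF_KEYS = frozenset(
    {"temperature", "top_p", "presence_penalty", "frequency_penalty", "stop"}
)

# Reasoning params named in the commit message as LiteLLM-specific extras.
LITELLM_EXTRA_KEYS = frozenset({"thinking", "reasoning_effort", "extra_body"})


def clean_conf(gen_conf, allowed=ALLOWED_GEN_CONF_KEYS):
    """Return a copy of gen_conf containing only whitelisted keys."""
    return {k: v for k, v in gen_conf.items() if k in allowed}


def clean_conf_litellm(gen_conf):
    """Same whitelist, extended with the LiteLLM reasoning params."""
    return clean_conf(gen_conf, ALLOWED_GEN_CONF_KEYS | LITELLM_EXTRA_KEYS)


# A setting dict that carries internal metadata alongside generation params.
llm_setting = {
    "temperature": 0.2,
    "max_tokens": 512,       # not whitelisted, so it is dropped
    "model_type": "chat",    # internal metadata, must not reach the provider
    "thinking": {"type": "enabled"},
}

cleaned = clean_conf_litellm(llm_setting)
print(cleaned)  # {'temperature': 0.2, 'thinking': {'type': 'enabled'}}
```

Filtering by an allow-list rather than deleting known-bad keys means any future internal key added to `llm_setting` is dropped by default instead of leaking to the provider.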

#### Tests

Adds `test/unit_test/rag/llm/test_clean_conf_whitelist.py` covering both
backends: `model_type` (and other stray keys) are dropped, genuine
generation params and `thinking` survive, `max_tokens` is removed, and
the whitelist invariants hold.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Added test cases

2026-06-02 10:04:11 +08:00

Yongteng Lei

b33d2fdea5

Refa: GraphRAG to use async chat methods instead of thread pool execution (#14002)

### What problem does this PR solve?

GraphRAG now calls `_async_chat` directly instead of running chat calls in a thread pool.

### Type of change

- [x] Refactoring
- [x] Performance Improvement


## Summary by CodeRabbit

* **Refactor**
  * Unified chat calls to an async invocation across extractors, improving
    timeout handling and ensuring task IDs propagate reliably.
* **Tests**
  * Added and expanded unit tests and mocks to cover extractor behavior,
    timeout scenarios, and safe test-package imports, reducing regression
    risk.
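
The general pattern behind this refactor, awaiting an async chat call under a timeout rather than dispatching a blocking call to a thread pool, might look like the sketch below. It is an assumption-laden illustration: `fake_async_chat` and `chat_with_timeout` are hypothetical names, and the actual extractor code in the PR is not shown in this message.

```python
import asyncio


async def fake_async_chat(prompt, delay):
    """Hypothetical stand-in for a model's async chat method."""
    await asyncio.sleep(delay)  # simulates network latency
    return f"echo: {prompt}"


async def chat_with_timeout(prompt, delay, timeout, task_id=""):
    """Await the chat coroutine directly and bound it with a timeout."""
    try:
        return await asyncio.wait_for(fake_async_chat(prompt, delay), timeout=timeout)
    except asyncio.TimeoutError:
        # The task ID travels with the call, so the failure can be attributed.
        return f"[task {task_id}] timed out"


async def main():
    fast = await chat_with_timeout("hello", delay=0.01, timeout=1.0, task_id="t1")
    slow = await chat_with_timeout("hello", delay=0.5, timeout=0.05, task_id="t2")
    return fast, slow


fast, slow = asyncio.run(main())
print(fast)  # echo: hello
print(slow)  # [task t2] timed out
```

With `asyncio.wait_for`, a timed-out coroutine is cancelled, whereas a blocking call already running in a worker thread generally cannot be interrupted.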

2026-04-09 19:57:35 +08:00

tunsuy

e1f1184b01

test: add unit tests for graphrag/utils.py (87 test cases) (#13328)

Add comprehensive unit tests for `graphrag/utils.py`, covering 15
functions/classes with 87 test cases.

Tested functions:
- clean_str, dict_has_keys_with_types, perform_variable_replacements
- get_from_to, compute_args_hash, is_float_regex
- GraphChange dataclass
- handle_single_entity_extraction, handle_single_relationship_extraction
- graph_merge, tidy_graph
- split_string_by_multi_markers, pack_user_ass_to_openai_messages
- is_continuous_subsequence, merge_tuples, flat_uniq_list

All 327 existing + new tests pass with no regressions.

2026-03-05 15:30:43 +08:00

3 Commits