Commit Graph

3 Commits

Author SHA1 Message Date
nickmopen
bebf6ed244 fix(llm): strip non-generation keys from gen_conf for LiteLLM providers (#15427) (#15432)
### What problem does this PR solve?

Fixes #15427.

All LiteLLM-routed chats fail with:

- Anthropic: `litellm.BadRequestError: AnthropicException -
{"type":"invalid_request_error","message":"model_type: Extra inputs are
not permitted"}`
- OpenAI: `litellm.BadRequestError: OpenAIException - Unknown parameter:
'model_type'`

This is a regression from v0.25.4.

#### Root cause

A chat assistant's `llm_setting` is forwarded to the model as
`gen_conf`. `llm_setting` can legitimately carry RAGFlow-internal
metadata such as `model_type` (the chat REST APIs in
`api/apps/restful_apis/` read it back out of `llm_setting`), so that key
ends up inside `gen_conf`.

`Base._clean_conf` (OpenAI-compatible providers) already **whitelists**
the keys it forwards, so direct-OpenAI providers were unaffected.
`LiteLLMBase._clean_conf` only dropped `max_tokens` and passed
everything else straight through to `litellm.acompletion`, which
forwarded `model_type` to the upstream provider — and Anthropic / OpenAI
reject it. Because both Claude and GPT route through LiteLLM, every chat
broke.

#### Fix

- Extract the allowed-key set into a shared `ALLOWED_GEN_CONF_KEYS`
constant and reuse it in `Base._clean_conf`.
- Apply the same whitelist in `LiteLLMBase._clean_conf`, plus the
LiteLLM-specific reasoning params (`thinking`, `reasoning_effort`,
`extra_body`) that the model-family policies inject for reasoning
models.

This covers all four LiteLLM completion paths (`async_chat`,
`async_chat_streamly`, `async_chat_with_tools`,
`async_chat_streamly_with_tools`), since they all route through
`_clean_conf`.

#### Tests

Adds `test/unit_test/rag/llm/test_clean_conf_whitelist.py` covering both
backends: `model_type` (and other stray keys) are dropped, genuine
generation params and `thinking` survive, `max_tokens` is removed, and
the whitelist invariants hold.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Added test cases
2026-06-02 10:04:11 +08:00
Yongteng Lei
b33d2fdea5 Refa: GraphRAG to use async chat methods instead of thread pool execution (#14002)
### What problem does this PR solve?

GraphRAG _async_chat.

### Type of change

- [x] Refactoring
- [x] Performance Improvement


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Unified chat calls to an async invocation across extractors, improving
timeout handling and ensuring task IDs propagate reliably.
* **Tests**
* Added and expanded unit tests and mocks to cover extractor behavior,
timeout scenarios, and safe test-package imports, reducing regression
risk.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-09 19:57:35 +08:00
tunsuy
e1f1184b01 test: add unit tests for graphrag/utils.py (87 test cases) (#13328)
Add comprehensive unit tests for `graphrag/utils.py`, covering 15
functions/classes with 87 test cases.

Tested functions:
- clean_str, dict_has_keys_with_types, perform_variable_replacements
- get_from_to, compute_args_hash, is_float_regex
- GraphChange dataclass
- handle_single_entity_extraction, handle_single_relationship_extraction
- graph_merge, tidy_graph
- split_string_by_multi_markers, pack_user_ass_to_openai_messages
- is_continuous_subsequence, merge_tuples, flat_uniq_list

All 327 existing + new tests pass with no regressions.
2026-03-05 15:30:43 +08:00