mirror of
https://github.com/infiniflow/ragflow.git
synced 2026-06-29 23:41:12 +08:00
### What problem does this PR solve?

Multiple `requests.post()` calls across the LLM integration layer lack a `timeout` parameter. Without a timeout, a single unresponsive upstream service can block the calling thread **indefinitely**, eventually exhausting the thread pool and degrading the entire system.

This is a well-known issue: Python's `requests` library defaults to `timeout=None` (infinite wait), and [the library docs explicitly recommend](https://requests.readthedocs.io/en/latest/user/advanced/#timeouts) always setting a timeout.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Change

Added `timeout` to all `requests.post()` calls missing it:

| File | Calls fixed | Timeout |
|------|-------------|---------|
| `rag/llm/rerank_model.py` | 9 | 30s |
| `rag/llm/embedding_model.py` | 8 | 30s |
| `rag/llm/cv_model.py` | 3 | 60s |
| `rag/llm/tts_model.py` | 2 | 60s |
| `rag/llm/sequence2txt_model.py` | 2 | 60s |

Embedding/rerank calls use 30s (lightweight API calls). Vision, TTS, and audio transcription use 60s (heavier workloads with file uploads).

Note: other files in the codebase (e.g. `check_minio_alive`, `check_ragflow_server_alive`) already use `timeout=10`, so this PR brings the LLM layer in line with existing practice.

Signed-off-by: Ricardo-M-L <Sibyl_Hartmanbnb@webname.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
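
For reviewers who want to see the shape of the change, here is a minimal sketch of the pattern described above. It is not the code from this PR: the helper `post_with_timeout`, the `TIMEOUTS_S` table, and its keys are hypothetical names chosen for illustration, and only the 30s/60s values come from the table in this description. The `post` parameter is an assumption added so the sketch can be exercised with a stub instead of a live endpoint.

```python
from typing import Any, Callable, Optional

# Hypothetical lookup table; the values mirror the table in this description.
TIMEOUTS_S = {
    "rerank": 30,        # lightweight API call
    "embedding": 30,     # lightweight API call
    "cv": 60,            # heavier workload (file uploads)
    "tts": 60,           # heavier workload
    "sequence2txt": 60,  # heavier workload (audio uploads)
}


def post_with_timeout(
    url: str,
    payload: dict,
    kind: str,
    post: Optional[Callable[..., Any]] = None,
) -> Any:
    """POST `payload` as JSON, always passing an explicit timeout.

    `post` defaults to `requests.post`; it is injectable only so this
    sketch can be tested without a network connection.
    """
    if post is None:
        import requests  # imported lazily so a stub can replace it in tests

        post = requests.post
    # Without `timeout`, requests waits indefinitely for the server.
    return post(url, json=payload, timeout=TIMEOUTS_S[kind])
```

With the real `requests.post`, exceeding the limit raises `requests.exceptions.Timeout`, which callers can catch and report. Note that `requests` applies the value to connecting and to each wait for data from the server, not to the total duration of the response download.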