ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 23:41:12 +08:00

Files

Ricardo-M-L 13922209e6 fix(llm): add timeout to HTTP requests in LLM integration layer (#14313 )

### What problem does this PR solve?

Multiple `requests.post()` calls across the LLM integration layer lack a
`timeout` parameter. Without a timeout, a single unresponsive upstream
service can block the calling thread **indefinitely**, eventually
exhausting the thread pool and degrading the entire system.

This is a well-known issue — Python's `requests` library defaults to
`timeout=None` (infinite wait), and [the library docs explicitly
recommend](https://requests.readthedocs.io/en/latest/user/advanced/#timeouts)
always setting a timeout.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Change

Added `timeout` to all `requests.post()` calls missing it:

| File | Calls fixed | Timeout |
|------|-------------|---------|
| `rag/llm/rerank_model.py` | 9 | 30s |
| `rag/llm/embedding_model.py` | 8 | 30s |
| `rag/llm/cv_model.py` | 3 | 60s |
| `rag/llm/tts_model.py` | 2 | 60s |
| `rag/llm/sequence2txt_model.py` | 2 | 60s |

Embedding/rerank calls use 30s (lightweight API calls). Vision, TTS, and
audio transcription use 60s (heavier workloads with file uploads).

Note: other files in the codebase (e.g. `check_minio_alive`,
`check_ragflow_server_alive`) already use `timeout=10`, so this PR
brings the LLM layer in line with existing practice.

Signed-off-by: Ricardo-M-L <Sibyl_Hartmanbnb@webname.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>

2026-05-11 11:19:07 +08:00