ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-06-29 23:41:12 +08:00

Author	SHA1	Message	Date
dependabot[bot]	c1c79c2e55	build(deps): bump python-multipart from 0.0.21 to 0.0.31 (#16088)	2026-06-17 19:39:42 +08:00
Liu An	4379269374	Docs: Update version references to v0.26.1 in READMEs and docs (#16158 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.26.0 to v0.26.1 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-06-17 19:35:32 +08:00
Wang Qi	02ccd35241	Fix RAGFlow cannot start (#16116 ) # Summary - The culprit is commit `b4c8711d5` / PR #15415 (fix: upgrade crawl4ai to 0.8.0). - That upgrade brought in unclecode-litellm, which installs the same top-level litellm namespace as upstream litellm. - The crash happens when files from one LiteLLM distribution are mixed with files from the other: custom_guardrail.py expects GuardrailTracingDetail, but types/utils.py can come from the older conflicting package.	2026-06-17 11:27:31 +08:00
dependabot[bot]	b732636546	build(deps): bump aiohttp from 3.13.3 to 3.14.1 (#16090)	2026-06-16 20:07:32 +08:00
dependabot[bot]	59aedad5e1	build(deps): bump starlette from 0.51.0 to 1.3.1 (#16089) Bumps [starlette](https://github.com/Kludex/starlette) from 0.51.0 to 1.3.1. ### Changelog (sourced from [starlette's changelog](https://github.com/Kludex/starlette/blob/main/docs/release-notes.md), truncated) ## 1.3.1 (June 12, 2026) #### Fixed - Enforce `max_fields` and `max_part_size` in `FormParser` (#3329). - Enforce `FormParser` limits in parser callbacks (#3331). The 1.3.1 release notes also list: use `StarletteDeprecationWarning` instead of `DeprecationWarning` (#3119). ## 1.3.0 (June 11, 2026) #### Added - Add `httpx2` to the `full` extra (#3323). - Annotate the `URLPath` `protocol` parameter with `Literal` (#3285). #### Fixed - Build `request.url` from structured components (#3326). - Clamp oversized suffix ranges in `FileResponse` (#3307). - Catch `OSError` alongside `MultiPartException` when closing temp files (#3191). - Avoid collapsing exception groups raised from user code (#2830). - Use `removeprefix` to strip the weak `ETag` indicator in `is_not_modified` (#3193). - Fix `IndexError` in `URL.replace()` on a URL with no authority (#3317). - Adjust `testclient` typing and warnings (#3322). ## 1.2.1 (May 31, 2026) #### Fixed - Use `httpx2` for type checking in the `testclient` module (#3304). - Add assert error for `requires()` when the request parameter is not a `Request` type (#3298). ## 1.2.0 (May 28, 2026) #### Added - Support httpx2 in the test client (#3291). ## 1.1.0 (May 23, 2026) #### Added - Use `"application/octet-stream"` as the `FileResponse` media type fallback (#3283). #### Fixed - Only dispatch standard HTTP verbs in `HTTPEndpoint` (#3286). - Reject absolute paths in `StaticFiles.lookup_path` (#3287). ## 1.0.1 (May 21, 2026) ... (truncated) Additional commits viewable in the [compare view](https://github.com/Kludex/starlette/compare/0.51.0...1.3.1). Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>	2026-06-16 19:24:45 +08:00
Zhichang Yu	3fa15c0e2f	feat(agent): Go port — canvas engine, 22 components, DSL v2, 13 endpoints (#15952 ) Ports the agent canvas subsystem from Python to Go. ## What's included ### Canvas Engine (Phase 0/1) - State engine, scheduler, variable resolver, Redis checkpoint store, cancel protocol - 209 tests across canvas / component / io packages ### 22 Components (P0–P4) \| Tier \| Components \| \|---\|---\| \| P0 T1+T2+T3 \| LLM, Agent, ExitLoop, Switch, Categorize, Begin, Message, Invoke \| \| P1 T3 \| VariableAggregator, VariableAssigner, StringTransform, ListOperations, DataOperations \| \| P2 T3 \| Iteration, IterationItem, Loop, LoopItem \| \| P3 T3 \| UserFillUp, Fillup \| \| P4 T5 \| Browser, ExcelProcessor, DocsGenerator \| ### DSL v2 Schema (Phase 2.5) - Typed v2 in-memory model with v1-to-v2 auto-detect converter - v1 legacy field stripping per plan §2.11.7 ### HTTP Endpoints & Bug Fixes (Plans PR1–PR3) - DELETE SQL bug fix: gorm v2 `Where("id = ?", id).Delete(...)` pattern - CreateAgent validation: title/DSL required, duplicate check, 103 envelope - 13 new endpoints: templates, prompts, tags, sessions CRUD, chat/completions (SSE + non-stream stubs), rerun, test_db_connection, logs, webhook/logs - 756 Go unit tests (745 → 756, +18) - 17 → 0 Python integration test failures (test_agents.py + test_session_management/) ### Tools 21 eino tools: HTTPHelper, search tools, financial/data tools, mandatory stubs ### Infrastructure OTel observability, NATS message queue, DeepDoc gRPC client, SSRF guards, IDOR mitigation	2026-06-12 22:58:28 +08:00
Kevin Hu	b5a426e6e0	Feat: chat channels — connect assistants to external messaging bots (#15850 ) ### What problem does this PR solve? #15844 Adds a Chat channels capability so a RAGFlow assistant (Dialog) can be exposed as a bot on external messaging platforms (Feishu/Lark, Discord, Telegram, Slack, WeCom, LINE, etc.). An admin configures a bot in the UI, connects it to an assistant, and inbound messages are answered from that assistant's knowledge base — replies are delivered back on the channel. Feishu/Lark is implemented and tested end-to-end. Discord, Telegram, LINE, and WeCom are scaffolded against the same interface; the remaining listed channels are tracked as follow-ups. ### Design Backend - New `chat_channel` table (`tenant_id`, `name`, `channel`, `config` JSON holding `{credential: {...}}`, `dialog_id`, `status`) + `ChatChannelService` and RESTful CRUD under `/api/v1/chat_channels`. - Channel framework under `api/channels/`: a `core` registry + per-channel packages that self-register a builder and implement a common `Channel` interface (`start`/`stop`/`send` + inbound normalization) over `IncomingMessage`/`OutgoingMessage`. - Embedded reconcile loop in `ragflow_server` (`api/channels/bootstrap.py`): loads enabled bots, and starts/stops/restarts them as rows change (no server restart needed). Inbound messages run the connected dialog via the non-streaming completion path, keeping per-end-user conversation history. - Missing optional channel SDKs degrade gracefully (channel skipped with a warning; others unaffected). Channel-level errors are logged, not crashed. - Feishu's WebSocket client runs in a dedicated thread with its own event loop to avoid cross-loop/contextvars conflicts with the channel runtime. Frontend - Settings → Chat channels panel: available-channels grid + configured-bots list with add/edit/delete and a Connect assistant popup that binds a bot to a dialog. 
- Brand icons via simple-icons / reused shared data-source assets, with colored fallbacks for brands not available. - Route, sidebar entry, i18n (en/zh), and a top-nav segment-boundary fix so the settings page no longer highlights the Chat tab. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Notes - DB: new `chat_channel` table is auto-created; `chat_channel.dialog_id` is also covered by a `migrate_db` `alter_db_add_column` for existing installs. - Channel SDKs (`lark-oapi`, `discord.py`, `python-telegram-bot`, `line-bot-sdk`, `wechatpy`, `aiohttp`) added to dependencies. - Screenshots / per-channel credential docs to follow. <img width="1338" height="1290" alt="Image" src="https://github.com/user-attachments/assets/042cb2f9-0dad-4e6a-bcf7-43ced4bbd704" /> <img width="1344" height="738" alt="Image" src="https://github.com/user-attachments/assets/373cd08e-ec40-4c67-9c51-4d948b1ba617" /> <img width="672" height="887" alt="Image" src="https://github.com/user-attachments/assets/5a34953f-a9a3-4c1e-869e-5eff0dc64c84" /> ---------	2026-06-12 18:21:30 +08:00
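The channel framework described in this commit (a registry of self-registering builders behind a common `Channel` interface with `start`/`stop`/`send`) can be illustrated roughly as follows. This is only a sketch of the pattern under assumed names: `register`, `build`, `EchoChannel`, and the fields of `OutgoingMessage` are hypothetical and do not reproduce the code under `api/channels/`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class OutgoingMessage:
    # Hypothetical fields; the real message type may differ.
    chat_id: str
    text: str


class Channel(ABC):
    """Common interface every channel package implements."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def send(self, message: OutgoingMessage) -> None: ...


# Registry mapping a channel name to a builder that takes its config.
_BUILDERS: Dict[str, Callable[[dict], Channel]] = {}


def register(name: str):
    """Decorator a channel package uses to register its builder."""
    def decorator(builder):
        _BUILDERS[name] = builder
        return builder
    return decorator


def build(name: str, config: dict) -> Optional[Channel]:
    """Build a channel, or return None if no builder is registered
    (for example because an optional SDK is missing)."""
    builder = _BUILDERS.get(name)
    return builder(config) if builder is not None else None


@register("echo")
class EchoChannel(Channel):
    """Toy channel that records outgoing messages instead of sending them."""

    def __init__(self, config: dict):
        self.config = config
        self.running = False
        self.sent: List[OutgoingMessage] = []

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def send(self, message: OutgoingMessage) -> None:
        if not self.running:
            raise RuntimeError("channel is not started")
        self.sent.append(message)
```

Returning `None` for an unregistered name mirrors the graceful-degradation behavior the commit describes: a caller can log a warning and skip that channel while the others keep running.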
Liu An	92c4b7688b	Docs: Update version references to v0.26.0 in READMEs and docs (#15941 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.6 to v0.26.0 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-06-11 18:34:26 +08:00
OrbisAI Security	b4c8711d51	fix: upgrade crawl4ai to 0.8.0 (CVE-2026-26217) (#15415 ) ## Summary Upgrade crawl4ai from 0.7.6 to 0.8.0 to fix CVE-2026-26217. ## Vulnerability \| Field \| Value \| \|-------\|-------\| \| ID \| CVE-2026-26217 \| \| Severity \| CRITICAL \| \| Scanner \| trivy \| \| Rule \| `CVE-2026-26217` \| \| File \| `uv.lock` \| \| Assessment \| Likely exploitable \| Description: Crawl4AI Has Local File Inclusion in Docker API via file:// URLs ## Evidence Scanner confirmation: trivy rule `CVE-2026-26217` flagged this pattern. Production code: This file is in the production codebase, not test-only code. ## Threat Model Context This is a web service - vulnerabilities in request handlers are directly exploitable by remote attackers. ## Changes - `pyproject.toml` - `uv.lock` ## Verification - [x] Build passes - [x] Scanner re-scan confirms fix - [x] LLM code review passed --- This change addresses a pattern flagged by static analysis. The code path handles user-influenced input and the fix reduces the attack surface against both manual and automated exploitation. --- Automated security fix by [OrbisAI Security](https://orbisappsec.com)	2026-05-29 21:38:41 +08:00
Liu An	0639dba89a	Docs: Update version references to v0.25.6 in READMEs and docs (#15248 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.5 to v0.25.6 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-05-26 19:45:43 +08:00
天海蒼灆	3e5b11a523	Feat(browser control): Add new agent component 'browser' to control browser by AI (#14888) ### What problem does this PR solve? This PR adds a new `Browser` operator to Agent workflows, enabling prompt-driven browser automation in RAGFlow. It is technically based on 'Browser-Use'. It includes: - Backend browser component execution with tenant LLM integration - Upload source support (file IDs, URLs, variables, CSV/JSON array) - Downloaded file persistence to RAGFlow storage - Frontend node/operator integration, form config, icon, and i18n updates - Unit tests for upload/download and ID parsing logic - Dependency and Docker updates for browser-use runtime support ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-21 15:32:32 +08:00
Jin Hai	90c76e73d0	Docs: Update version references to v0.25.5 in READMEs and docs (#15059 ) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-20 20:05:45 +08:00
Magicbook1108	b28e134944	Feat: add local & ssh provider in admin panel (#15039 ) ### What problem does this PR solve? Feat: add local & ssh provider in admin panel ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-20 16:56:20 +08:00
qinling0210	9d94527b1d	Bump to infinity v0.7.0 (#14968 ) ### What problem does this PR solve? Upgrade infinity ### Type of change - [x] Refactoring	2026-05-18 10:25:59 +08:00
wdeveloper16	14c0985182	feat: bump Python minimum from 3.12 to 3.13, drop strenum backport (#14767 ) Closes #14753 ## What changed \| File \| Change \| \|---\|---\| \| `pyproject.toml` \| `requires-python` → `>=3.13,<3.15`; remove `strenum==0.4.15` \| \| `Dockerfile` \| `uv python install 3.13`, `uv sync --python 3.13` \| \| `.github/workflows/tests.yml` \| `uv sync --python 3.13` on both matrix legs \| \| `CLAUDE.md` \| dev setup command + requirements note updated \| \| `deepdoc/parser/mineru_parser.py` \| `from strenum import StrEnum` → `from enum import StrEnum` \| \| `agent/tools/code_exec.py` \| same \| `StrEnum` has been in the stdlib since Python 3.11 — the `strenum` backport package is no longer needed once the floor is 3.13. ## Why uv.lock is not regenerated `uv lock --python 3.13` fails because: 1. The infiniflow/graspologic fork pins `numpy>=1.26.4,<2.0.0` 2. `tensorflow-cpu>=2.20.0` (the first release with cp313 wheels) depends on `ml-dtypes>=0.5.1`, which requires `numpy>=2.1.0` 3. These two constraints are irreconcilable on Python 3.13 The lockfile regeneration requires loosening the `numpy` upper bound in the `infiniflow/graspologic` fork. Once that fork commit is updated and the SHA in `pyproject.toml:49` is bumped, `uv lock --python 3.13` will succeed. ## RFC corrections Two claims in the original RFC (#14753) did not hold up under code review: - "graspologic hard-blocks 3.13" — the infiniflow fork at the pinned commit has no `<3.13` Python constraint. The blocker is the transitive `numpy<2.0.0` conflict with tensorflow-cpu's test dependency, not a direct Python version cap. - "free-threading throughput gains for I/O-bound workload" — Python 3.13 free-threading requires a special `--disable-gil` build and provides no benefit for async I/O code (the GIL is already released during I/O). The real motivation is forward compatibility and improved error messages.	2026-05-15 14:40:53 +08:00
Liu An	f038a34154	Docs: Update version references to v0.25.4 in READMEs and docs (#14912 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.3 to v0.25.4 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-05-14 11:07:08 +08:00
Jin Hai	87516edadf	Bump to infinity v0.7.0-dev7 (#14897 ) ### What problem does this PR solve? Upgrade infinity ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-13 19:42:50 +08:00
Liu An	3182fd0789	Docs: Update version references to v0.25.3 in READMEs and docs (#14896 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.2 to v0.25.3 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-05-13 18:42:42 +08:00
Jin Hai	c55e23e7e2	Go: refactor embedding interface (#14757 ) ### What problem does this PR solve? Provide embedding index according to the input text ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-05-11 14:45:30 +08:00
Wang Qi	3838770e7a	GraphRAG feature - Part 1 - add spacy to extract entity and relation (#14670 ) ### What problem does this PR solve? GraphRAG feature - Part 1 - add spacy to extract entity and relation <img width="1621" height="1288" alt="image" src="https://github.com/user-attachments/assets/aadeddad-94da-46c6-adad-9c3784181f61" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-11 12:59:59 +08:00
VincentLambert	08bb53bbb1	Feat: add BedrockCV for vision/image2text inference via LiteLLM (#14705 ) ## Summary - `CvModel["Bedrock"]` was absent from `rag/llm/cv_model.py`, causing `model_instance()` to return `None` when a Bedrock model was used as a PDF parser — even after correct model resolution. - This PR adds `BedrockCV`, enabling Bedrock vision models (e.g. `amazon.nova-pro-v1:0`, `anthropic.claude-3-5-sonnet`) to be used as PDF parsers. ## What problem does this PR solve? When a Bedrock model is selected as the PDF parser in a knowledge base, ingestion failed with: ``` 'LiteLLMBase' object has no attribute 'describe_with_prompt' ``` The root cause: `LiteLLMBase` (the Bedrock chat implementation) was the only registered handler for the Bedrock factory. It does not implement `describe_with_prompt`. `CvModel` had no Bedrock entry, so `model_instance()` returned `None` for `image2text` requests. ## Type of change - [x] New Feature (non-breaking change which adds functionality) ## Changes `rag/llm/cv_model.py` Adds `BedrockCV(Base)` with `_FACTORY_NAME = "Bedrock"`: - Uses `litellm.completion` with the `bedrock/` prefix (consistent with `LiteLLMBase`) - Parses AWS credentials from the JSON key assembled by `add_llm` (`auth_mode`, `bedrock_ak`, `bedrock_sk`, `bedrock_region`, `aws_role_arn`) - Supports three auth modes: `access_key_secret`, `iam_role` (via STS `assume_role`), and default credential chain (IRSA, instance profile) - Implements `describe_with_prompt` and `describe` ## Test plan - [ ] Configure a Bedrock vision model (e.g. `amazon.nova-pro-v1:0`) with valid AWS credentials - [ ] Select it as PDF parser in a knowledge base - [ ] Verify ingestion of a PDF document completes without errors - [ ] Verify `CvModel["Bedrock"]` resolves to `BedrockCV` 🤖 Generated with [Claude Code](https://claude.ai/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-11 10:29:58 +08:00
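The three auth modes listed in this commit can be pictured as a small dispatch over the JSON key. The sketch below is an assumption-laden illustration, not the `BedrockCV` implementation: only the field names (`auth_mode`, `bedrock_ak`, `bedrock_sk`, `bedrock_region`, `aws_role_arn`) and the mode names come from the commit message, while the function `resolve_bedrock_auth` and the shape of its return value are hypothetical. It deliberately stops short of calling AWS or LiteLLM.

```python
import json


def resolve_bedrock_auth(key_json: str) -> dict:
    """Decide which credential source to use from the assembled JSON key.

    Returns a plain dict describing the choice; a real implementation
    would turn this into client credentials (for `iam_role`, via an STS
    assume-role call) before invoking the model.
    """
    cfg = json.loads(key_json)
    mode = cfg.get("auth_mode", "")
    region = cfg.get("bedrock_region", "")

    if mode == "access_key_secret":
        # Static access key and secret supplied by the user.
        return {
            "mode": mode,
            "region": region,
            "access_key": cfg["bedrock_ak"],
            "secret_key": cfg["bedrock_sk"],
        }
    if mode == "iam_role":
        # Role to be assumed through STS by the caller.
        return {"mode": mode, "region": region, "role_arn": cfg["aws_role_arn"]}

    # Anything else falls back to the default credential chain
    # (IRSA, instance profile, and so on).
    return {"mode": "default_chain", "region": region}
```

Keeping the decision separate from the network calls makes each branch easy to unit test without AWS access.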
Liu An	57b24be6d6	Docs: Update version references to v0.25.2 in READMEs and docs (#14731 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.1 to v0.25.2 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-05-09 19:06:05 +08:00
qinling0210	12f80f170c	Bump to infinity v0.7.0-dev6 (#14606 ) ### What problem does this PR solve? Bump to infinity v0.7.0-dev6 (uv lock --upgrade-package infinity-sdk) ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-05-07 10:51:17 +08:00
dependabot[bot]	9e4f3614de	Chore(deps-dev): Bump pillow from 12.1.1 to 12.2.0 (#14578) As title	2026-05-06 11:08:38 +08:00
Liu An	ce4c782fd7	Docs: Update version references to v0.25.1 in READMEs and docs (#14488 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.25.0 to v0.25.1 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-04-30 10:49:26 +08:00
RazmikGevorgyan	c41b5e8a5d	fix: migrate Langfuse integration from start_generation to start_obse… (#14205) The Langfuse Python SDK v3+ removed the `start_generation()` method. RAGFlow's code still called this removed method, causing an `AttributeError` when Langfuse tracing is enabled. Replace all `start_generation()` calls with `start_observation(as_type="generation")`, which is the correct v4 SDK API. Affected files: - api/db/services/llm_service.py (12 occurrences) - api/db/services/dialog_service.py (1 occurrence) Fixes #14204 Related to #9243 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 10:03:57 +08:00
Jin Hai	01c5437fdf	Fix uv.lock (#14285 ) ### What problem does this PR solve? As title. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-22 13:09:21 +08:00
Jin Hai	74b44e1aa3	Go: add balance command (#14262 ) ### What problem does this PR solve? ``` RAGFlow(user)> list supported models from 'moonshot' 'test'; +---------------------------------+ \| model_name \| +---------------------------------+ \| moonshot-v1-32k-vision-preview \| \| kimi-k2.6 \| \| moonshot-v1-8k \| \| moonshot-v1-auto \| \| moonshot-v1-128k \| \| moonshot-v1-32k \| \| kimi-k2.5 \| \| moonshot-v1-8k-vision-preview \| \| moonshot-v1-128k-vision-preview \| +---------------------------------+ RAGFlow(user)> show balance from 'moonshot' 'test'; +---------+----------+ \| balance \| currency \| +---------+----------+ \| 0 \| CNY \| +---------+----------+ ``` ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-04-21 21:31:50 +08:00
Liu An	a33d0737cd	Docs: Update version references to v0.25.0 in READMEs and docs (#14257 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.24.0 to v0.25.0 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-04-21 17:26:50 +08:00
dependabot[bot]	b34a726acd	Build(deps): Bump pypdf from 6.9.2 to 6.10.2 (#14184 ) Bumps [pypdf](https://github.com/py-pdf/pypdf) from 6.9.2 to 6.10.2. Release notes, sourced from [pypdf's releases](https://github.com/py-pdf/pypdf/releases): ## Version 6.10.2, 2026-04-15 ### Security (SEC) - Do not rely on possibly invalid /Size for incremental cloning (py-pdf/pypdf#3735) by @stefan6419846 - Introduce limits for FlateDecode parameters and image decoding (py-pdf/pypdf#3734) by @stefan6419846 [Full Changelog](https://github.com/py-pdf/pypdf/compare/6.10.1...6.10.2) ## Version 6.10.1, 2026-04-14 ### Security (SEC) - Limit the allowed size of xref and object streams (py-pdf/pypdf#3733) by @stefan6419846 ### Robustness (ROB) - Consider strict mode setting for decryption errors (py-pdf/pypdf#3731) by @stefan6419846 ### Documentation (DOC) - Use new parameter names for compress_identical_objects by @stefan6419846 [Full Changelog](https://github.com/py-pdf/pypdf/compare/6.10.0...6.10.1) ## Version 6.10.0, 2026-04-10 ### Security (SEC) - Disallow custom XML entity declarations for XMP metadata (py-pdf/pypdf#3724) by @stefan6419846 ### New Features (ENH) - Skip MD5 key derivation for AES-256 encrypted PDFs (py-pdf/pypdf#3694) by @Ygnas ### Bug Fixes (BUG) - Use remove_orphans in compress_identical_objects (py-pdf/pypdf#3310) by @j-t-1 - Fix PdfReadError when xref table contains comments before trailer (py-pdf/pypdf#3710) by @rassie - Correctly verify AES padding during decryption (py-pdf/pypdf#3699) by @stefan6419846 - Fix stale object cache from non-authoritative object streams (py-pdf/pypdf#3698) by @astahlman - Fix extract_links pairing when annotations include non-links (py-pdf/pypdf#3687) by @ReinerBRO ### Documentation (DOC) - Add AI policy (py-pdf/pypdf#3717) by @stefan6419846 [Full Changelog](https://github.com/py-pdf/pypdf/compare/6.9.2...6.10.0) Commits: c476b4f REL: 6.10.2; c50a010 SEC: Do not rely on possibly invalid /Size for incremental cloning (py-pdf/pypdf#3735); ac734da SEC: Introduce limits for FlateDecode parameters and image decoding (py-pdf/pypdf#3734); b49e7eb REL: 6.10.1; 62338e9 SEC: Limit the allowed size of xref and object streams (py-pdf/pypdf#3733); 5dcc0ae DEV: Update pytest-benchmark to 5.2.3; b42e4aa DEV: Update pinned pillow and pytest where possible (py-pdf/pypdf#3732); 717446b ROB: Consider strict mode setting for decryption errors (py-pdf/pypdf#3731); 9e461d3 DEV: Bump softprops/action-gh-release from 2 to 3 (py-pdf/pypdf#3730); 500d09d TST: Update `test_embedded_file__basic` to use `tmp_path` fixture (py-pdf/pypdf#3726). Additional commits viewable in [compare view](https://github.com/py-pdf/pypdf/compare/6.9.2...6.10.2). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-17 18:43:19 +08:00
Jack	3b7723855c	Fix: revert xgboost version to 1.6.0 (#13984 ) ### What problem does this PR solve? Revert xgboost version to 1.6.0 ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ## Summary by CodeRabbit * Chores * Updated xgboost dependency from version 3.2.0 to 1.6.0	2026-04-08 19:53:47 +08:00
Jack	c4b0aaa874	Fix: #6098 - Add validation logic for parser_config when update document (#13911 ) ### What problem does this PR solve? Add validation logic for parser_config and refactor the processing flow. Before this change, validation and update logic were interleaved: some validation ran, then some updates, then another round of "validation-and-then-update", which is undesirable. After this change, all validation logic runs first, and update logic runs only after ALL validation has completed. Parameters that come from the front end are validated with Pydantic; validation that depends on data from the DB lives in separate methods. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Refactoring	2026-04-07 11:33:05 +08:00
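The validate-everything-first flow described in the commit above could look roughly like the sketch below. It is a minimal illustration, not the code from the PR: a plain dataclass stands in for the Pydantic model, and `UpdateRequest`, `chunk_token_num`, and the dictionary used as a database are hypothetical.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised when a request fails any check; nothing has been written yet."""


@dataclass
class UpdateRequest:
    # Hypothetical request shape; the real parser_config fields are not shown in the commit.
    doc_id: str
    chunk_token_num: int


def validate_params(req):
    # Stage 1: checks that need only the request itself.
    errors = []
    if not req.doc_id:
        errors.append("doc_id is required")
    if req.chunk_token_num <= 0:
        errors.append("chunk_token_num must be positive")
    return errors


def validate_against_db(req, db):
    # Stage 2: checks that need stored data.
    return [] if req.doc_id in db else [f"document {req.doc_id!r} not found"]


def update_document(req, db):
    errors = validate_params(req)
    if not errors:
        errors = validate_against_db(req, db)
    if errors:
        raise ValidationError("; ".join(errors))
    # Stage 3: only reached when every check has passed.
    db[req.doc_id]["parser_config"] = {"chunk_token_num": req.chunk_token_num}
    return db[req.doc_id]


db = {"doc-1": {"parser_config": {}}}
update_document(UpdateRequest("doc-1", 512), db)

try:
    update_document(UpdateRequest("doc-1", -1), db)
except ValidationError as exc:
    rejected = str(exc)
```

Because the write happens in one place after both validation stages, a rejected request leaves the stored record untouched.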
qinling0210	a8bbe167a9	Bump to infinity v0.7.0-dev5 (#13846 ) ### What problem does this PR solve? Bump to infinity v0.7.0-dev5 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-30 10:19:06 +08:00
KeJun	cb78ce0a7b	feat: support rss datasource (#13721 ) ### What problem does this PR solve? Supporting public RSS/Atom feed URLs as data sources for RagFlow. link https://github.com/infiniflow/ragflow/issues/12313 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-27 22:58:44 +08:00
Krishna Chaitanya	cdbbd2620c	Fix: upgrade pyasn1 from 0.6.2 to 0.6.3 to address CVE-2026-30922 (#13773 ) ## Summary - Adds `pyasn1>=0.6.3` as a `[tool.uv.constraint-dependencies]` entry to mitigate CVE-2026-30922 (CVSS 7.5 HIGH) - Regenerates `uv.lock` so the resolved pyasn1 version moves from 0.6.2 to 0.6.3 ## Details CVE-2026-30922 is a Denial of Service vulnerability in pyasn1 caused by unbounded recursion when decoding ASN.1 data with deeply nested structures. An attacker can send crafted payloads with thousands of nested SEQUENCE or SET tags to trigger a `RecursionError` crash or memory exhaustion. - Severity: HIGH (CVSS 7.5) - Affected versions: pyasn1 < 0.6.3 - Fixed in: pyasn1 >= 0.6.3 - NVD: https://nvd.nist.gov/vuln/detail/CVE-2026-25769 `pyasn1` is not a direct dependency of RAGFlow but is pulled in transitively via `google-auth` -> `rsa` -> `pyasn1-modules` -> `pyasn1`. The `constraint-dependencies` mechanism in uv is the correct way to enforce a minimum version for transitive dependencies without polluting the direct dependency list. ## Test plan - [x] `pyproject.toml` passes TOML validation - [x] `uv lock` resolves successfully with the new constraint - [x] pyasn1 version in `uv.lock` is now 0.6.3 - [ ] Existing CI/CD tests continue to pass Closes #13686	2026-03-27 10:37:34 +08:00
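The class of bug described in the commit above, unbounded recursion on deeply nested input, is commonly mitigated by capping nesting depth. The sketch below illustrates that idea on a toy bracket format. It is not ASN.1 and not pyasn1's code, the commit does not say how pyasn1 0.6.3 implements its fix, and `nesting_depth` and `max_depth` are hypothetical names.

```python
def nesting_depth(data, max_depth=100):
    """Return the nesting depth of bracketed input, refusing anything too deep.

    Toy format: '(' opens a container and ')' closes it. The scan is iterative,
    so the interpreter's recursion limit is never involved.
    """
    depth = 0
    deepest = 0
    for ch in data:
        if ch == "(":
            depth += 1
            if depth > max_depth:
                raise ValueError(f"nesting deeper than {max_depth} levels")
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing bracket")
    if depth != 0:
        raise ValueError("unclosed bracket")
    return deepest


shallow = nesting_depth("(()(()))")

try:
    nesting_depth("(" * 5000 + ")" * 5000)
    rejected = False
except ValueError:
    rejected = True
```

Rejecting the payload as soon as the cap is exceeded keeps the cost proportional to the cap rather than to the attacker-chosen depth.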
Yongteng Lei	ea1430bec5	Security: do not use litellm 1.82.7 and 1.82.8 (#13768 ) ### What problem does this PR solve? See [issue](https://github.com/BerriAI/litellm/issues/24518) from Litellm. Upgraded from `1.81.15` to `1.82.6`, so RAGFlow is safe as always. ### Type of change - [x] Security Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-03-25 22:39:33 +08:00
Liu An	5b3bb25010	Fix: switch Python package mirror from Tsinghua to Aliyun (#13617 ) ### What problem does this PR solve? Replace pypi.tuna.tsinghua.edu.cn with mirrors.aliyun.com to resolve issues with missing packages on the Tsinghua mirror. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-16 12:12:25 +08:00
guptas6est	32d31284cc	Fix: upgrade pypdf to 6.7.5 and migrate from deprecated pypdf2 to fix CVE-2026-28804 and CVE-2023-36464 (#13454 ) ### What problem does this PR solve? This PR addresses security vulnerabilities in PDF processing dependencies identified by Trivy security scan: 1. CVE-2026-28804 (MEDIUM): pypdf 6.7.4 vulnerable to inefficient decoding of ASCIIHexDecode streams 2. CVE-2023-36464 (MEDIUM): pypdf2 3.0.1 susceptible to infinite loop when parsing malformed comments Since pypdf2 is deprecated with no available fixes, this PR migrates all pypdf2 usage to the actively maintained pypdf library (version 6.7.5), which resolves both vulnerabilities. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2026-03-09 12:06:00 +08:00
Heyang Wang	c217b8f3d8	Feat: add DingTalk AI Table connector and integration for data synch… (#13413 ) ### What problem does this PR solve? Add DingTalk AI Table connector and integration for data synchronization Issue #13400 ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: wangheyang <wangheyang@corp.netease.com>	2026-03-06 21:13:23 +08:00
Jin Hai	6bb00e2762	Update graspologic to gitee (#13362 ) ### What problem does this PR solve? Accelerate python module downloading ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2026-03-04 17:48:47 +08:00
dependabot[bot]	b956ad180c	Build(deps): Bump pypdf from 6.7.3 to 6.7.4 (#13298 ) Bumps [pypdf](https://github.com/py-pdf/pypdf) from 6.7.3 to 6.7.4. Release notes, sourced from [pypdf's releases](https://github.com/py-pdf/pypdf/releases): ## Version 6.7.4, 2026-02-27 ### Security (SEC) - Allow limiting output length for RunLengthDecode filter (py-pdf/pypdf#3664) by @stefan6419846 ### Robustness (ROB) - Deal with invalid annotations in extract_links (py-pdf/pypdf#3659) by @stefan6419846 [Full Changelog](https://github.com/py-pdf/pypdf/compare/6.7.3...6.7.4) Commits: 1650bc3 REL: 6.7.4; f309c60 SEC: Allow limiting output length for RunLengthDecode filter (py-pdf/pypdf#3664); 993f052 DEV: Bump actions/upload-artifact from 6 to 7 (py-pdf/pypdf#3662); a3c996b DEV: Bump actions/download-artifact from 7 to 8 (py-pdf/pypdf#3663); 37de320 ROB: Deal with invalid annotations in extract_links (py-pdf/pypdf#3659). See full diff in [compare view](https://github.com/py-pdf/pypdf/compare/6.7.3...6.7.4). Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>	2026-03-02 11:32:12 +08:00
Magicbook1108	158503a1aa	Feat: optimize ingestion pipeline with preprocess (#13211 ) ### What problem does this PR solve? Feat: optimize ingestion pipeline with preprocess ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-02-26 10:24:13 +08:00
Magicbook1108	98e1d5aa5c	Refact: switch from google-generativeai to google-genai (#13140 ) ### What problem does this PR solve? Refact: switch from google-generativeai to google-genai #13132 Refact: comment out unused pywencai. ### Type of change - [x] Refactoring	2026-02-24 10:28:33 +08:00
Liu An	392ec99651	Docs: Update version references to v0.24.0 in READMEs and docs (#13095 ) ### What problem does this PR solve? - Update version tags in README files (including translations) from v0.23.1 to v0.24.0 - Modify Docker image references and documentation to reflect new version - Update version badges and image descriptions - Maintain consistency across all language variants of README files ### Type of change - [x] Documentation Update	2026-02-10 17:24:03 +08:00
Clint-chan	38289084a8	Chore/upgrade dashscope to 1.25.11 (#13007 ) ## Description Upgrade dashscope package to support text-embedding-v4 model. ## Changes - Update dashscope version from 1.20.11 to 1.25.11 in pyproject.toml ## Reason The text-embedding-v4 model requires dashscope >= 1.25.0 to function properly. This upgrade ensures compatibility with the latest embedding models. Co-authored-by: Clint-chan <Clint-chan@users.noreply.github.com>	2026-02-06 19:06:41 +08:00
MkDev11	6f31c5fed2	feat/add MySQL and PostgreSQL data source connectors (#12817 ) ### What problem does this PR solve? This PR adds MySQL and PostgreSQL as data source connectors, allowing users to import data directly from relational databases into RAGFlow for RAG workflows. Many users store their knowledge in databases (product catalogs, documentation, FAQs, etc.) and currently have no way to sync this data into RAGFlow without exporting to files first. This feature lets them connect directly to their databases, run SQL queries, and automatically create documents from the results. Closes #763 Closes #11560 ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe): ### What this PR does New capabilities: - Connect to MySQL and PostgreSQL databases - Run custom SQL queries to extract data - Map database columns to document content (vectorized) and metadata (searchable) - Support incremental sync using a timestamp column - Full frontend UI with connection form and tooltips Files changed: Backend: - `common/constants.py` - Added MYSQL/POSTGRESQL to FileSource enum - `common/data_source/config.py` - Added to DocumentSource enum - `common/data_source/rdbms_connector.py` - New connector (368 lines) - `common/data_source/__init__.py` - Exported the connector - `rag/svr/sync_data_source.py` - Added MySQL and PostgreSQL sync classes - `pyproject.toml` - Added mysql-connector-python dependency Frontend: - `web/src/pages/user-setting/data-source/constant/index.tsx` - Form fields - `web/src/locales/en.ts` - English translations - `web/src/assets/svg/data-source/mysql.svg` - MySQL icon - `web/src/assets/svg/data-source/postgresql.svg` - PostgreSQL icon ### Testing done Tested with MySQL 8.0 and PostgreSQL 16: - Connection validation works correctly - Full sync imports all query results as 
documents - Incremental sync only fetches rows updated since last sync - Custom SQL queries filter data as expected - Invalid credentials show clear error messages - Lint checks pass (`ruff check` returns no errors) --------- Co-authored-by: mkdev11 <YOUR_GITHUB_ID+MkDev11@users.noreply.github.com>	2026-02-04 10:14:32 +08:00
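The incremental sync by timestamp column mentioned in the commit above can be sketched as a watermark query. This is a minimal illustration under stated assumptions, not the connector from `rdbms_connector.py`: an in-memory SQLite database stands in for MySQL or PostgreSQL, and the `faq` table, its columns, and `fetch_documents` are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/PostgreSQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE faq (id INTEGER PRIMARY KEY, question TEXT, answer TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO faq (id, question, answer, updated_at) VALUES (?, ?, ?, ?)",
    [
        (1, "How do I reset my password?", "Use the account page.", "2026-01-01 09:00:00"),
        (2, "Where are invoices?", "Under Billing.", "2026-01-05 12:30:00"),
        (3, "Is there an API?", "Yes, see the docs.", "2026-01-09 08:15:00"),
    ],
)


def fetch_documents(conn, last_sync=None):
    """Return (documents, new_watermark); only rows newer than last_sync are read."""
    sql = "SELECT id, question, answer, updated_at FROM faq"
    params = ()
    if last_sync is not None:
        sql += " WHERE updated_at > ?"
        params = (last_sync,)
    sql += " ORDER BY updated_at"
    docs = []
    watermark = last_sync
    for row_id, question, answer, updated_at in conn.execute(sql, params):
        docs.append({
            "content": f"{question}\n{answer}",                      # text that would be vectorized
            "metadata": {"id": row_id, "updated_at": updated_at},    # searchable fields
        })
        watermark = updated_at
    return docs, watermark


full, mark = fetch_documents(conn)                      # first run: full sync
conn.execute(
    "UPDATE faq SET answer = 'Under Billing > History.', "
    "updated_at = '2026-01-10 10:00:00' WHERE id = 2"
)
delta, mark2 = fetch_documents(conn, last_sync=mark)    # later run: incremental sync
```

The watermark comparison relies on timestamps stored in a sortable form; with text columns that means a fixed-width ISO-style format, as used here.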
Haipeng LI	e385b19d67	Test: Add code coverage reporting to CI (#12874 ) ### What problem does this PR solve? Add code coverage reporting to CI ### Type of change - [x] Test (please describe): coverage report --------- Co-authored-by: Liu An <asiro@qq.com>	2026-01-30 14:49:16 +08:00
Kevin Hu	f1c2fac03e	Refa: remove ppt image. (#12909 ) ### What problem does this PR solve? remove `aspose` ### Type of change - [x] Refactoring	2026-01-30 13:35:42 +08:00
Magicbook1108	2c4499ec45	Fix: key error "content" #12844 (#12847 ) ### What problem does this PR solve? Fix: key error "content" #12844 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2026-01-28 14:39:34 +08:00
Zhichang Yu	fd11aca8e5	feat: Implement pluggable multi-provider sandbox architecture (#12820 ) ## Summary Implement a flexible sandbox provider system supporting both self-managed (Docker) and SaaS (Aliyun Code Interpreter) backends for secure code execution in agent workflows. Key Changes: - ✅ Aliyun Code Interpreter provider using official `agentrun-sdk>=0.0.16` - ✅ Self-managed provider with gVisor (runsc) security - ✅ Arguments parameter support for dynamic code execution - ✅ Database-only configuration (removed fallback logic) - ✅ Configuration scripts for quick setup Issue #12479 ## Features ### 🔌 Provider Abstraction Layer 1. Self-Managed Provider (`agent/sandbox/providers/self_managed.py`) - Wraps existing executor_manager HTTP API - gVisor (runsc) for secure container isolation - Configurable pool size, timeout, retry logic - Languages: Python, Node.js, JavaScript - ⚠️ Requires: gVisor installation, Docker, base images 2. Aliyun Code Interpreter (`agent/sandbox/providers/aliyun_codeinterpreter.py`) - SaaS integration using official agentrun-sdk - Serverless microVM execution with auto-authentication - Hard timeout: 30 seconds max - Credentials: `AGENTRUN_ACCESS_KEY_ID`, `AGENTRUN_ACCESS_KEY_SECRET`, `AGENTRUN_ACCOUNT_ID`, `AGENTRUN_REGION` - Automatically wraps code to call `main()` function 3. E2B Provider (`agent/sandbox/providers/e2b.py`) - Placeholder for future integration ### ⚙️ Configuration System - `conf/system_settings.json`: Default provider = `aliyun_codeinterpreter` - `agent/sandbox/client.py`: Enforces database-only configuration - Admin UI: `/admin/sandbox-settings` - Configuration validation via `validate_config()` method - Health checks for all providers ### 🎯 Key Capabilities Arguments Parameter Support: All providers support passing arguments to `main()` function: ```python # User code def main(name: str, count: int) -> dict: return {"message": f"Hello {name}!" 
* count} # Executed with: arguments={"name": "World", "count": 3} # Result: {"message": "Hello World!Hello World!Hello World!"} ``` Self-Describing Providers: Each provider implements `get_config_schema()` returning form configuration for Admin UI Error Handling: Structured `ExecutionResult` with stdout, stderr, exit_code, execution_time ## Configuration Scripts Two scripts for quick Aliyun sandbox setup: Shell Script (requires jq): ```bash source scripts/configure_aliyun_sandbox.sh ``` Python Script (interactive): ```bash python3 scripts/configure_aliyun_sandbox.py ``` ## Testing ```bash # Unit tests uv run pytest agent/sandbox/tests/test_providers.py -v # Aliyun provider tests uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter.py -v # Integration tests (requires credentials) uv run pytest agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py -v # Quick SDK validation python3 agent/sandbox/tests/verify_sdk.py ``` Test Coverage: - 30 unit tests for provider abstraction - Provider-specific tests for Aliyun - Integration tests with real API - Security tests for executor_manager ## Documentation - `docs/develop/sandbox_spec.md` - Complete architecture specification - `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration from legacy sandbox - `agent/sandbox/tests/QUICKSTART.md` - Quick start guide - `agent/sandbox/tests/README.md` - Testing documentation ## Breaking Changes ⚠️ Migration Required: 1. Directory Move: `sandbox/` → `agent/sandbox/` - Update imports: `from sandbox.` → `from agent.sandbox.` 2. Mandatory Configuration: - SystemSettings must have `sandbox.provider_type` configured - Removed fallback default values - Configuration must exist in database (from `conf/system_settings.json`) 3. Aliyun Credentials: - Requires `AGENTRUN_` environment variables (not `ALIYUN_`) - `AGENTRUN_ACCOUNT_ID` is now required (Aliyun primary account ID) 4. 
Self-Managed Provider: - gVisor (runsc) must be installed for security - Install: `go install gvisor.dev/gvisor/runsc@latest` ## Database Schema Changes ```python # SystemSettings.value: CharField → TextField api/db/db_models.py: Changed for unlimited config length # SystemSettingsService.get_by_name(): Fixed query precision api/db/services/system_settings_service.py: startswith → exact match ``` ## Files Changed ### Backend (Python) - `agent/sandbox/providers/base.py` - SandboxProvider ABC interface - `agent/sandbox/providers/manager.py` - ProviderManager - `agent/sandbox/providers/self_managed.py` - Self-managed provider - `agent/sandbox/providers/aliyun_codeinterpreter.py` - Aliyun provider - `agent/sandbox/providers/e2b.py` - E2B provider (placeholder) - `agent/sandbox/client.py` - Unified client (enforces DB-only config) - `agent/tools/code_exec.py` - Updated to use provider system - `admin/server/services.py` - SandboxMgr with registry & validation - `admin/server/routes.py` - 5 sandbox API endpoints - `conf/system_settings.json` - Default: aliyun_codeinterpreter - `api/db/db_models.py` - TextField for SystemSettings.value - `api/db/services/system_settings_service.py` - Exact match query ### Frontend (TypeScript/React) - `web/src/pages/admin/sandbox-settings.tsx` - Settings UI - `web/src/services/admin-service.ts` - Sandbox service functions - `web/src/services/admin.service.d.ts` - Type definitions - `web/src/utils/api.ts` - Sandbox API endpoints ### Documentation - `docs/develop/sandbox_spec.md` - Architecture spec - `agent/sandbox/tests/MIGRATION_GUIDE.md` - Migration guide - `agent/sandbox/tests/QUICKSTART.md` - Quick start - `agent/sandbox/tests/README.md` - Testing guide ### Configuration Scripts - `scripts/configure_aliyun_sandbox.sh` - Shell script (jq) - `scripts/configure_aliyun_sandbox.py` - Python script ### Tests - `agent/sandbox/tests/test_providers.py` - 30 unit tests - `agent/sandbox/tests/test_aliyun_codeinterpreter.py` - Provider tests - 
`agent/sandbox/tests/test_aliyun_codeinterpreter_integration.py` - Integration tests - `agent/sandbox/tests/verify_sdk.py` - SDK validation ## Architecture ``` Admin UI → Admin API → SandboxMgr → ProviderManager → [SelfManaged\|Aliyun\|E2B] ↓ SystemSettings ``` ## Usage ### 1. Configure Provider Via Admin UI: 1. Navigate to `/admin/sandbox-settings` 2. Select provider (Aliyun Code Interpreter / Self-Managed) 3. Fill in configuration 4. Click "Test Connection" to verify 5. Click "Save" to apply Via Configuration Scripts: ```bash # Aliyun provider export AGENTRUN_ACCESS_KEY_ID="xxx" export AGENTRUN_ACCESS_KEY_SECRET="yyy" export AGENTRUN_ACCOUNT_ID="zzz" export AGENTRUN_REGION="cn-shanghai" source scripts/configure_aliyun_sandbox.sh ``` ### 2. Restart Service ```bash cd docker docker compose restart ragflow-server ``` ### 3. Execute Code in Agent ```python from agent.sandbox.client import execute_code result = execute_code( code='def main(name: str) -> dict: return {"message": f"Hello {name}!"}', language="python", timeout=30, arguments={"name": "World"} ) print(result.stdout) # {"message": "Hello World!"} ``` ## Troubleshooting ### "Container pool is busy" (Self-Managed) - Cause: Pool exhausted (default: 1 container in `.env`) - Fix: Increase `SANDBOX_EXECUTOR_MANAGER_POOL_SIZE` to 5+ ### "Sandbox provider type not configured" - Cause: Database missing configuration - Fix: Run config script or set via Admin UI ### "gVisor not found" - Cause: runsc not installed - Fix: `go install gvisor.dev/gvisor/runsc@latest && sudo cp ~/go/bin/runsc /usr/local/bin/` ### Aliyun authentication errors - Cause: Wrong environment variable names - Fix: Use `AGENTRUN_` prefix (not `ALIYUN_`) ## Checklist - [x] All tests passing (30 unit tests + integration tests) - [x] Documentation updated (spec, migration guide, quickstart) - [x] Type definitions added (TypeScript) - [x] Admin UI implemented - [x] Configuration validation - [x] Health checks implemented - [x] Error handling with 
structured results - [x] Breaking changes documented - [x] Configuration scripts created - [x] gVisor requirements documented Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-28 13:28:21 +08:00
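The provider abstraction described in the commit above can be sketched as an abstract base class plus one toy implementation. This is an illustration under stated assumptions, not the code in `agent/sandbox/providers/base.py`: the method names `get_config_schema()` and `validate_config()` and the `ExecutionResult` fields come from the commit message, while the signatures, the `execute_code` method name on the provider, and `InProcessProvider` are hypothetical. The toy provider runs code in the current process and offers no isolation at all.

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    # Field names follow the commit message; the types are assumptions.
    stdout: str
    stderr: str
    exit_code: int
    execution_time: float


class SandboxProvider(ABC):
    """Hypothetical interface sketch for a pluggable execution backend."""

    @abstractmethod
    def get_config_schema(self) -> dict: ...

    @abstractmethod
    def validate_config(self, config: dict) -> list: ...

    @abstractmethod
    def execute_code(self, code: str, arguments: dict) -> ExecutionResult: ...


class InProcessProvider(SandboxProvider):
    """Toy provider that runs code in-process. NOT a sandbox; illustration only."""

    def get_config_schema(self):
        return {"timeout": {"type": "integer", "default": 30}}

    def validate_config(self, config):
        timeout = config.get("timeout", 30)
        ok = isinstance(timeout, int) and timeout > 0
        return [] if ok else ["timeout must be a positive integer"]

    def execute_code(self, code, arguments):
        started = time.perf_counter()
        namespace = {}
        try:
            exec(code, namespace)                     # define the user's main()
            value = namespace["main"](**arguments)    # call it with keyword arguments
            out, err, status = repr(value), "", 0
        except Exception as exc:
            out, err, status = "", f"{type(exc).__name__}: {exc}", 1
        return ExecutionResult(out, err, status, time.perf_counter() - started)


provider = InProcessProvider()
user_code = 'def main(name: str, count: int) -> dict:\n    return {"message": f"Hello {name}!" * count}\n'
result = provider.execute_code(user_code, {"name": "World", "count": 3})
print(result.stdout)
```

Returning a structured result instead of raising lets a caller treat every backend the same way, whether the failure came from user code or from the provider itself.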

158 Commits