Commit Graph

5750 Commits

Author SHA1 Message Date
Qi Wang
57aec2e65d Fix bug: run Knowledge graph or RAPTOR, it will update an existing task (#14102)
### What problem does this PR solve?

This PR fixes the bug reported in https://github.com/infiniflow/ragflow/issues/14101.
When Knowledge graph or RAPTOR is run, the running status of the last
document is set incorrectly (see the image below).
Running either task should never touch an existing document's result.

![Image](https://github.com/user-attachments/assets/14fe1f9e-0541-4093-8111-ed0bd25b87ba)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-14 16:37:41 +08:00
balibabu
27ebc64ec0 Feat: Adapted for the upgraded knowledge graph of @antv/g6. (#14103)
### What problem does this PR solve?

Feat: Adapted for the upgraded knowledge graph of @antv/g6.

### Type of change

- [x] Refactoring
2026-04-14 16:33:52 +08:00
Magicbook1108
1376c004a9 Fix: update docs generator (#14070)
### What problem does this PR solve?

Refactor: update docs generator

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

1. Support multiple document generator components and correctly display
messages in the message component. The document generator will not
overwrite other messages.

<img width="700" alt="Screenshot from 2026-04-13 13-56-17"
src="https://github.com/user-attachments/assets/3f3e06e8-33ce-4df1-8b05-510c86af70a4"
/>

2. Support Chinese content and ensure correct Markdown rendering in PDF
and DOCX
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/69bf1f7b-261d-48e5-a9f3-8e94462b90ed"
/>

3. Simplify the configuration page and support more output formats
 
<img height="700" alt="image"
src="https://github.com/user-attachments/assets/8647374c-c055-4daa-ad71-cd9052eb138e"
/>

4. Hide the download option in every component except the message component
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/a723dfcb-b60d-4eb5-b2f6-d41ca5955eb4"
/>

<img width="700" alt="image"
src="https://github.com/user-attachments/assets/a8762ac4-807b-4f0b-9287-65f82f7c9c98"
/>

5. Sanitize filename
 
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/df49509f-37c0-40f9-b03d-bd6ce7fdefa8"
/>


6. Further usability changes
2026-04-14 15:24:43 +08:00
chanx
1031aebc8f feat(file): Add file ancestor directory lookup feature by go (#14037)
### What problem does this PR solve?

feat(file): Add file ancestor directory lookup feature in Go

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-14 15:22:03 +08:00
chanx
6aec8058bb refactor: Remove knowledge base-related API handlers that are already included in the dataset. (#14094)
### What problem does this PR solve?

refactor: Remove knowledge base-related API handlers that are already
included in the dataset.

### Type of change

- [x] Refactoring
2026-04-14 15:19:31 +08:00
Jin Hai
2b6c50734f Sync code from EE (#14080)
### What problem does this PR solve?

As title.

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-14 15:03:46 +08:00
Ricardo-M-L
c22811f096 fix: close file handles in json.load() calls in resume parser (#14061)
## Summary
- Replace `json.load(open(...))` with `with open(...) as f:
json.load(f)` in 2 resume parser files
- Fixes 4 leaked file descriptors in `corporations.py` (3) and
`schools.py` (1)

## Why
In a long-running server process like RAGFlow, leaked file handles can
accumulate and hit the OS file descriptor limit (`OSError: [Errno 24]
Too many open files`). The other instances mentioned in the issue
(`infinity_conn_base.py` and `init_data.py`) have already been fixed.
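
The pattern described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `load_json` and the demo file name are hypothetical, and only the `with open(...) as f: json.load(f)` shape comes from the description.

```python
import json
import os
import tempfile


def load_json(path):
    """Load a JSON file and close the handle as soon as parsing finishes."""
    # Leaky form: json.load(open(path)) leaves closing to the garbage collector.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo with a throwaway file; the file name and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "schools.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"name": "example"}, f)
    loaded = load_json(demo_path)
```

The `with` block closes the descriptor even when `json.load` raises, which is the property a long-running process needs.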

## Test plan
- [x] Verified affected files use `with` statement after fix
- [x] Grep confirms no remaining `json.load(open(` patterns in codebase

Fixes #13996

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:43:58 +08:00
Idriss Sbaaoui
de6a8e789a Fix: rerank overflow by enforcing top_k and 64 cap (#14084)
### What problem does this PR solve?

This fixes rerank overflow where retrieval could send more documents
than allowed (for example 66 when `page_size=6`), causing provider 400
errors and bypassing the user’s `top_k` intent in rerank-enabled paths.
This PR fixes #14081.
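
One way to picture the cap is the sketch below. It assumes the limit is simply the smaller of `top_k` and 64, and that a missing or non-positive `top_k` falls back to the cap; the function and constant names are hypothetical and not taken from the RAGFlow code.

```python
RERANK_HARD_CAP = 64  # cap named in the PR title; the constant name is hypothetical


def rerank_candidate_limit(top_k, hard_cap=RERANK_HARD_CAP):
    """Return how many candidates may be sent to the reranker."""
    # Assumption: an unset or non-positive top_k falls back to the hard cap.
    if top_k is None or top_k <= 0:
        return hard_cap
    return min(top_k, hard_cap)


def select_for_rerank(candidates, top_k):
    """Truncate the candidate list so the provider never sees more than the limit."""
    return candidates[: rerank_candidate_limit(top_k)]


docs = [f"doc-{i}" for i in range(66)]
capped = select_for_rerank(docs, top_k=1024)
narrow = select_for_rerank(docs, top_k=10)
```

With 66 retrieved documents, as in the example from the description, `capped` holds 64 items and `narrow` holds 10.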

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-14 10:47:25 +08:00
Idriss Sbaaoui
d6987b4d8f Fix p3 ci fails (#14069)
### What problem does this PR solve?

Fix stale tests at the P3 level.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-14 10:47:07 +08:00
balibabu
d2b744facd Fix: The indented tree text generated on the search page overlaps. #14077 (#14078)
### What problem does this PR solve?

Fix: The indented tree text generated on the search page overlaps.
#14077

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-04-14 10:02:00 +08:00
Magicbook1108
8723c3aa86 Feat: more templates (#14075)
### What problem does this PR solve?

Feat: more templates
<img width="700" alt="image"
src="https://github.com/user-attachments/assets/533e88f1-fc56-4337-a026-6623fc978893"
/>


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
2026-04-14 10:00:55 +08:00
chanx
6ffa566ec3 Refactor: Standardize naming convention to camelCase (#14079)
### What problem does this PR solve?

Refactor: Standardize naming convention to camelCase

### Type of change

- [x] Refactoring
2026-04-13 21:07:07 +08:00
balibabu
9a38af7cbf Feat: Hide the download button embedded in the agent page. (#14083)
### What problem does this PR solve?

Feat: Hide the download button embedded in the agent page.

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2026-04-13 21:06:41 +08:00
Syed Shahmeer Ali
c7ce062ea8 Fix: model_type not passed in ensure_tenant_model_id_for_params causing wrong tenant model lookup (#13782)
Summary

When setting a default model for an OpenAI-API-Compatible provider,
`ensure_tenant_model_id_for_params` called `get_api_key` without a
`model_type` filter. If the same model name was registered under
multiple types (e.g., both chat and embedding), it could return the
wrong `tenant_llm_id`, leading to `Model(@None) not authorized` errors
during chat.

This applies the same type-scoped fix that PR #13569 introduced in
`get_model_config_by_type_and_name`, now consistently in
`tenant_utils.py` as well.

Changes

- Added `_KEY_TO_MODEL_TYPE` mapping in `tenant_utils.py`
- Each model key (`llm_id`, `embd_id`, etc.) now passes its correct
`LLMType` to `get_api_key`

Fixes #13775
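
The effect of scoping the lookup by type can be sketched with a toy registry. Only the names `_KEY_TO_MODEL_TYPE`, `llm_id`, `embd_id`, and `LLMType` come from the description; the enum members, the registry, and `find_tenant_model_id` are assumptions made for illustration and do not mirror the real `get_api_key`.

```python
from enum import Enum


class LLMType(str, Enum):
    # Stand-in for RAGFlow's LLMType; the members listed here are assumptions.
    CHAT = "chat"
    EMBEDDING = "embedding"


# Maps each parameter key to the model type that should scope the lookup.
_KEY_TO_MODEL_TYPE = {"llm_id": LLMType.CHAT, "embd_id": LLMType.EMBEDDING}

# Toy tenant registry: the same model name is registered under two types.
_TENANT_MODELS = [
    {"id": 1, "name": "my-model", "type": LLMType.EMBEDDING},
    {"id": 2, "name": "my-model", "type": LLMType.CHAT},
]


def find_tenant_model_id(name, model_type=None):
    """Return the first matching record id; an unscoped lookup can pick the wrong type."""
    for rec in _TENANT_MODELS:
        if rec["name"] == name and (model_type is None or rec["type"] == model_type):
            return rec["id"]
    return None


unscoped = find_tenant_model_id("my-model")
scoped = find_tenant_model_id("my-model", _KEY_TO_MODEL_TYPE["llm_id"])
```

Here the unscoped call returns the embedding record (id 1) even though a chat model was wanted, while the scoped call returns id 2.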
2026-04-13 20:57:28 +08:00
天海蒼灆
356d45fda1 Feat: add cell type coercion for Excel export (#13808)
### What problem does this PR solve?

- Implemented a helper function to convert markdown cell text to native
numeric types for Excel output.
- Ensured that leading zeros are preserved and handled various numeric
formats, including those with thousand separators and scientific
notation.
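
A helper of this kind might look like the sketch below. It is not the PR's implementation: `coerce_cell` and the regular expressions are hypothetical, and the rule that any multi-digit value starting with `0` stays text is an assumption about how leading zeros are preserved.

```python
import re

_INT_RE = re.compile(r"[+-]?\d+")
_THOUSANDS_RE = re.compile(r"[+-]?\d{1,3}(,\d{3})+(\.\d+)?")
_FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")


def coerce_cell(text):
    """Convert a markdown table cell to int or float when it is unambiguously numeric."""
    s = text.strip()
    if not s:
        return text
    digits = s.lstrip("+-")
    # Leading zeros usually mean an identifier (ZIP code, part number): keep as text.
    if len(digits) > 1 and digits[0] == "0" and digits[1].isdigit():
        return text
    # Drop thousand separators only when the grouping is well formed.
    if _THOUSANDS_RE.fullmatch(s):
        s = s.replace(",", "")
    if _INT_RE.fullmatch(s):
        return int(s)
    if _FLOAT_RE.fullmatch(s):
        return float(s)
    return text
```

Anything that does not match one of the numeric shapes is returned unchanged, so a malformed value such as `12,34` stays a string rather than being guessed at.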

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-13 20:54:57 +08:00
Lynn
47d3741dcc Feat: migrate script (#14076)
### What problem does this PR solve?

Add command-line arguments for the MySQL config.

### Type of change

- [x] Other (please describe): tool scripts.
2026-04-13 20:45:11 +08:00
bitloi
853021ff2a feat: support multiple canvas_types for agent templates and remove duplicate files (#14030)
### What problem does this PR solve?

Closes #13907

The template catalog had duplicate files (e.g. `*_r.json`) only to place
the same template into multiple sidebar groups.
This increases maintenance cost and makes template updates error-prone.

This PR adds first-class support for multiple template categories in a
single file via `canvas_types`, then removes duplicate template files.

What changed:
- Added `canvas_types` to `CanvasTemplate` model and DB migration.
- Added normalization logic when loading templates:
  - accepts legacy `canvas_type`
  - accepts new `canvas_types`
  - merges/deduplicates values
  - preserves backward compatibility by keeping `canvas_type` as the first
normalized value.
- Updated template import flow to load only `.json` files and in stable
sorted order.
- Updated frontend template filtering to match on `canvas_types` first,
with fallback to legacy `canvas_type`.
- Consolidated duplicated template pairs into single files and removed:
  - `deep_search_r.json`
  - `reflective_academic_paper_generator_r.json`
  - `seo_article_writer_r.json`
- Added regression/edge-case tests for category normalization and route
serialization expectations.
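
The normalization rules listed above can be sketched as a small function. This is an illustration under assumptions: `normalize_canvas_types` is a hypothetical name, the category values in the usage are made up, and placing the legacy value first in the merged list is a guess at the ordering.

```python
def normalize_canvas_types(template):
    """Merge legacy `canvas_type` and new `canvas_types` into one deduplicated list."""
    merged = []
    # Assumption: the legacy single value is considered before the new list.
    raw = [template.get("canvas_type"), template.get("canvas_types")]
    for value in raw:
        items = value if isinstance(value, (list, tuple)) else [value]
        for item in items:
            if isinstance(item, str) and item.strip() and item.strip() not in merged:
                merged.append(item.strip())
    normalized = dict(template)
    normalized["canvas_types"] = merged
    # Backward compatibility: the legacy field mirrors the first normalized value.
    normalized["canvas_type"] = merged[0] if merged else None
    return normalized


both = normalize_canvas_types({"canvas_type": "Agent", "canvas_types": ["Recommended", "Agent"]})
legacy_only = normalize_canvas_types({"canvas_type": "Agent"})
```

A template carrying both fields ends up with one entry per category, which is what lets a single file appear in several sidebar groups.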

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2026-04-13 20:26:30 +08:00
writinwaters
ef07faea80 Doc: Updated frequently asked questions and answers. (#14085)
### What problem does this PR solve?

Updated frequently asked questions. 

### Type of change

- [x] Documentation Update
2026-04-13 20:26:16 +08:00
Tong Liu
6fdca2d212 [Security] Fix jinja2 SSTI vulnerability using SandboxedEnvironment (#14068)
2026-04-13 19:24:13 +08:00
balibabu
a023305b96 Fix: The chat page is not displaying the meta tags. (#14071)
### What problem does this PR solve?

Fix: The chat page is not displaying the meta tags.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-13 16:18:25 +08:00
Krishna Chaitanya
5ece2d8aa8 Fix: upgrade Apache Tika from 3.2.3 to 3.3.0 to address GHSA-72hv-8253-57qq (#13769)
### What problem does this PR solve?

Upgrades Apache Tika from 3.2.3 to 3.3.0 to address the security
vulnerability GHSA-72hv-8253-57qq (TIKA-4687).

Closes #13601

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Changes

- `Dockerfile`: Updated tika JAR filename and `TIKA_SERVER_JAR` env var
from 3.2.3 to 3.3.0
- `Dockerfile.deps`: Updated tika JAR filename in COPY instruction from
3.2.3 to 3.3.0
- `download_deps.py`: Updated both Maven Central and Huawei Cloud mirror
download URLs from 3.2.3 to 3.3.0

### References

- Apache Tika 3.3.0 release:
https://www.apache.org/dyn/closer.lua/tika/3.3.0/tika-app-3.3.0.jar
- TIKA-4687: https://issues.apache.org/jira/browse/TIKA-4687
- GHSA-72hv-8253-57qq
2026-04-13 16:01:08 +08:00
Jin Hai
3e787b3b09 Go: update search (#14023)
### What problem does this PR solve?

Update search

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-13 15:07:04 +08:00
Yongteng Lei
1638083e18 Fix: sandbox cannot accept large args list (#14063)
### What problem does this PR solve?

Sandbox cannot accept large args list.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-13 14:14:08 +08:00
Jack
51ce6aab01 Consolidate set_meta into update_document (#14045)
### What problem does this PR solve?

Consolidate the "set_meta" API into "update_document".

Before consolidation:
- Web API: POST /api/v1/document/set_meta
- HTTP API: PUT /v1/datasets/<dataset_id>/document/<document_id>

After consolidation:
- RESTful API: PUT /v1/datasets/<dataset_id>/document/<document_id>

### Type of change

- [x] Refactoring
2026-04-13 12:47:17 +08:00
akie
3911d90993 Fix: agent application can not show Cite (#14047)
Close #14018

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### Problem
In Agent applications, even with the cite option enabled, only inline
[ID: x] citation markers are visible (showing chunk content on hover).
The Agent does not display the referenced file cards below the response,
unlike Chat applications.

### Root Cause
The Agent's Retrieval tool (agent/tools/retrieval.py) calls
retriever.retrieval() with aggs=False, which means the retrieval results
do not include doc_aggs (document aggregation) data. Without doc_aggs,
the frontend ReferenceDocumentList component has no data to render the
file cards.

In contrast, the Chat application (api/db/services/dialog_service.py)
calls the same retriever.retrieval() method with aggs=True.

### Fix
Changed aggs=False to aggs=True in agent/tools/retrieval.py so that
document aggregation data is returned along with the retrieved chunks.
2026-04-13 11:06:14 +08:00
writinwaters
52442c8eb5 Docs: Added a guide on adding Github repo as data source (#14048)
### What problem does this PR solve?

Added a guide on adding a GitHub repo as a data source.

### Type of change


- [x] Documentation Update
2026-04-10 21:32:26 +08:00
balibabu
462be53b76 Fix: When creating a dataset, if no chunk_method is selected, there is no indication that this is a required field. (#14039)
### What problem does this PR solve?
Fix: When creating a dataset, if no `chunk_method` is selected, there is
no indication that this is a required field.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-10 19:05:14 +08:00
Magicbook1108
82d74fd276 Refact: update pipeline template (#14036)
### What problem does this PR solve?

Refact: update pipeline template

### Type of change

- [x] Refactoring
2026-04-10 19:04:52 +08:00
Jack
4046a4cfb6 Consolidate metadata summary API (#14031)
### What problem does this PR solve?

Consolidate the Web API and HTTP API for the document metadata summary.

Before consolidation:
- Web API: POST /api/v1/document/metadata/summary
- HTTP API: GET /v1/datasets/<dataset_id>/metadata/summary

After consolidation:
- RESTful API: GET /v1/datasets/<dataset_id>/metadata/summary

### Type of change

- [x] Refactoring
2026-04-10 18:41:30 +08:00
balibabu
11c89d87da Fix: The dataset on the search page is not displaying the required field error message. (#14041)
### What problem does this PR solve?

Fix: The dataset on the search page is not displaying the required field
error message.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-10 18:20:50 +08:00
Zhichang Yu
a9ca4ea1a1 Disable flask and quart debug (#14042)
### What problem does this PR solve?

Visiting
`http://127.0.0.1:9381/?__debugger__=yes&cmd=resource&f=debugger.js`
exposes the Flask debugger code:
```
docReady(() => {
  if (!EVALEX_TRUSTED) {
    initPinBox();
  }
  // if we are in console mode, show the console.
  if (CONSOLE_MODE && EVALEX) {
    createInteractiveConsole();
  }

  const frames = document.querySelectorAll("div.traceback div.frame");
  if (EVALEX) {
    addConsoleIconToFrames(frames);
  }
  addEventListenersToElements(document.querySelectorAll("div.detail"), "click", () =>
    document.querySelector("div.traceback").scrollIntoView(false)
  );
  addToggleFrameTraceback(frames);
  addToggleTraceTypesOnClick(document.querySelectorAll("h2.traceback"));
  addInfoPrompt(document.querySelectorAll("span.nojavascript"));
  wrapPlainTraceback();
});

function addToggleFrameTraceback(frames) {
  frames.forEach((frame) => {
    frame.addEventListener("click", () => {
      frame.getElementsByTagName("pre")[0].parentElement.classList.toggle("expanded");
    });
  })
}

```

### Type of change

- [x] Other (please describe): Fix security risk
2026-04-10 18:01:49 +08:00
Jin Hai
cfc2928de2 Go: remove unused API route (#14028)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-10 18:00:41 +08:00
Jin Hai
3d59448b0d Go: add parameter parsing of list chats (#14026)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-10 14:33:32 +08:00
Magicbook1108
18cafff790 Fix: markdown parser in pipeline (#14032)
### What problem does this PR solve?

Fix: markdown parser in pipeline

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-10 14:11:14 +08:00
Magicbook1108
9ce293a736 Refact: update exesql notification (#14027)
### What problem does this PR solve?

Refact: update exesql notification

### Type of change


- [x] Refactoring
2026-04-10 13:42:57 +08:00
Magicbook1108
87a87a7122 Feat: pipeline support ONE chunking method (#14024)
### What problem does this PR solve?

Feat: pipeline support ONE chunking method

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-04-10 13:11:22 +08:00
Jin Hai
a37605cbd2 Go: add get chat (#14025)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-10 13:06:51 +08:00
eason
aa92abe73c fix: close file handles properly in json.load() calls (#13997)
## Summary

Fixes #13996

Replace `json.load(open(...))` with `with open(...) as f: json.load(f)`
in two files to ensure file descriptors are properly closed.

**Affected files:**
- `common/doc_store/infinity_conn_base.py` — schema loading for Infinity
doc store
- `api/db/init_data.py` — agent template loading at startup

## Why this matters

In a long-running server process like RAGFlow, leaked file descriptors
from `json.load(open(...))` can accumulate over time. While CPython's
refcounting usually cleans these up, it's not guaranteed (especially
under memory pressure or with alternative Python runtimes), and can lead
to `OSError: [Errno 24] Too many open files`.

## Test plan

- [ ] Verify Infinity doc store schema loading still works correctly
- [ ] Verify agent templates load correctly on startup

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Refactor**
* Improved file handling in internal data processing to ensure proper
resource cleanup.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: easonysliu <easonysliu@tencent.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:16:49 +08:00
chanx
4538910b52 feat: Implement file-related functionality (#14011)
### What problem does this PR solve?

feat: Implement file-related functionality

- Implement file deletion API and business logic
- Add context support for file deletion operations and prevent root
folder deletion
- Implement file move functionality
- Add file download API endpoints and utility functions

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-04-10 12:15:27 +08:00
corevibe555
e7d044413f Fix: Google Drive connector missing new files after initial sync (#13943)
Closes https://github.com/infiniflow/ragflow/issues/13939

## What problem does this PR solve?

The Google Drive connector fails to detect new files after the initial
sync (#13939). The root cause is that `generate_time_range_filter()`
applies a strict `modifiedTime > poll_range_start` cutoff when querying
the Google Drive API. Files uploaded to Google Drive that retain their
original `modifiedTime` (common behavior) get silently excluded if their
timestamp predates the last sync's cutoff.

Unlike the Confluence and Jira connectors which use a configurable time
buffer (`CONFLUENCE_SYNC_TIME_BUFFER_SECONDS`) to offset
`poll_range_start` backward, the Google Drive connector had no such
mechanism — resulting in a razor-sharp timestamp boundary with zero
tolerance for overlap.
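
The buffering idea can be sketched as follows. This is not the connector's code: the setting name, its default of 300 seconds, and `buffered_time_filter` are hypothetical, and matching on `createdTime` follows the summary's remark about recently created files rather than a confirmed query.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical setting name and default; the PR only says the buffer is configurable.
GOOGLE_DRIVE_SYNC_TIME_BUFFER_SECONDS = 300


def buffered_time_filter(poll_range_start, buffer_seconds=GOOGLE_DRIVE_SYNC_TIME_BUFFER_SECONDS):
    """Build a Drive query clause whose cutoff is shifted back by the buffer."""
    cutoff = poll_range_start - timedelta(seconds=buffer_seconds)
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Also match createdTime so files that keep an old modifiedTime are still seen.
    return f"(modifiedTime > '{stamp}' or createdTime > '{stamp}')"


last_sync = datetime(2026, 4, 10, 12, 0, 0, tzinfo=timezone.utc)
clause = buffered_time_filter(last_sync)
```

Shifting the cutoff backward means some files are returned twice across syncs, so this approach relies on the ingest step being idempotent.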

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)


## Summary

* **New Features**
* Added a configurable time buffer for Google Drive synchronization to
address timing delays and improve sync reliability.
* Improved file detection logic to include recently created files
alongside modified ones, reducing missed synchronizations.
2026-04-10 11:39:19 +08:00
NeedmeFordev
7315d25cbc Fix retrieval API handling for omitted dataset IDs (#13990)
### What problem does this PR solve?

This PR fixes a mismatch between the MCP retrieval contract and the
backend retrieval API.

`ragflow_retrieval` already describes `dataset_ids` as optional, but
`/api/v1/retrieval` still rejected omitted or empty `dataset_ids` with
`` `dataset_ids` is required. ``. That made MCP retrieval fail even
though the tool schema promised that the request could search across all
available datasets.

This change updates `/api/v1/retrieval` to accept missing or empty
`dataset_ids`, resolve all accessible datasets for the authenticated
user, and keep the route schema aligned with the new runtime behavior.
It also adds focused unit coverage for the fallback resolution path and
the no-accessible-datasets case.
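
The fallback path might be sketched like this. Every name here is hypothetical, including `resolve_dataset_ids`, the paged listing callback, and the error type; the sketch only shows the shape of "use the given ids, otherwise page through everything the user can access".

```python
def resolve_dataset_ids(requested_ids, list_accessible_page, page_size=100):
    """Return the requested ids, or every accessible id when none were supplied."""
    if requested_ids:
        return list(requested_ids)
    resolved, page = [], 1
    while True:
        batch = list_accessible_page(page, page_size)
        resolved.extend(batch)
        # A short page means the listing is exhausted.
        if len(batch) < page_size:
            break
        page += 1
    if not resolved:
        raise LookupError("No accessible datasets found for this user.")
    return resolved


# Fake paged listing for the demo; a real lookup would query the user's datasets.
_ALL = [f"ds-{i}" for i in range(5)]


def fake_page(page, page_size):
    start = (page - 1) * page_size
    return _ALL[start : start + page_size]


everything = resolve_dataset_ids(None, fake_page, page_size=2)
explicit = resolve_dataset_ids(["ds-3"], fake_page)
```

Raising a distinct error for the empty case keeps "nothing accessible" separate from "nothing matched the query".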

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Fixes: #13981

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved dataset resolution to reliably discover all accessible
datasets through proper pagination, replacing the previous parsing
method.
* Enhanced error handling with clearer messaging when no datasets are
available for retrieval.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-10 11:34:15 +08:00
Magicbook1108
27329b40ed Refact: refact on parser structure (#14012)
### What problem does this PR solve?

Refact: refact on parser structure

### Type of change

- [x] Refactoring
2026-04-10 10:03:44 +08:00
Jin Hai
cd04467b9b Go: add delete search (#14014)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-10 09:42:37 +08:00
balibabu
56810ec5a3 Fix: The knowledge base selected by the retrieval node is not displayed. (#14013)
### What problem does this PR solve?

Fix: The knowledge base selected by the retrieval node is not displayed.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-10 09:40:35 +08:00
Magicbook1108
52f5880d21 Fix: support vlm fall back in pipeline (#14007)
### What problem does this PR solve?

Fix: support VLM fallback in pipeline for image/table parsing

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-04-09 20:20:11 +08:00
Jin Hai
5951e2b564 Go: Add create search (#13998)
### What problem does this PR solve?

As title

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-09 20:04:06 +08:00
Yongteng Lei
b33d2fdea5 Refa: GraphRAG to use async chat methods instead of thread pool execution (#14002)
### What problem does this PR solve?

GraphRAG now uses `_async_chat` instead of thread pool execution.
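
The general shape of awaiting a chat coroutine directly, with a timeout, is sketched below. This is a generic asyncio illustration, not GraphRAG's code: `fake_async_chat` and `chat_with_timeout` are hypothetical stand-ins, and returning `None` on timeout is an arbitrary choice for the demo.

```python
import asyncio


async def fake_async_chat(prompt, delay):
    """Stand-in for an async chat call; a real model client would go here."""
    await asyncio.sleep(delay)
    return f"reply to {prompt}"


async def chat_with_timeout(prompt, delay, timeout):
    """Await the chat coroutine directly and cancel it if it exceeds the timeout."""
    try:
        return await asyncio.wait_for(fake_async_chat(prompt, delay), timeout=timeout)
    except asyncio.TimeoutError:
        return None


async def run_all():
    # Coroutines run concurrently on the event loop, with no worker threads involved.
    return await asyncio.gather(
        chat_with_timeout("a", delay=0.01, timeout=1.0),
        chat_with_timeout("b", delay=1.0, timeout=0.05),
    )


results = asyncio.run(run_all())
```

Unlike a call parked in a worker thread, a coroutine that times out under `asyncio.wait_for` is actually cancelled.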

### Type of change

- [x] Refactoring
- [x] Performance Improvement


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Unified chat calls to an async invocation across extractors, improving
timeout handling and ensuring task IDs propagate reliably.
* **Tests**
* Added and expanded unit tests and mocks to cover extractor behavior,
timeout scenarios, and safe test-package imports, reducing regression
risk.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-09 19:57:35 +08:00
Octopus
c2ce49e037 fix: strip single quotes from synonym terms to prevent Infinity TokenError (#13969)
Fixes #13823

## Problem

When querying with words like `cat`, RAGFlow's query expansion system
looks up synonyms via WordNet, which can return terms containing single
quotes (e.g., `cat-o'-nine-tails`). When using Infinity as the document
store, these unescaped single quotes in the query string cause a
`TokenError` because Infinity's lexer treats `'` as a string delimiter.

```
TokenError: Error tokenizing ' OR "big cat" OR "computerized tomography")^0.7)': Missing ' from 1:531
```

## Solution

Strip single quotes from synonym terms before they are inserted into
query expressions, consistent with how single quotes are already
stripped from the input query text (line 51 of `query.py`):

- **`common/query_base.py`**: In `sub_special_char()`, strip `'` before
escaping other special characters. This fixes the Chinese text
processing path and the `paragraph()` method.
- **`rag/nlp/query.py`**: In the English text path, strip `'` from
tokenized synonym terms.
- **`memory/services/query.py`**: Same fix for the memory query English
text path.
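
The sanitization step can be sketched as below. The helper names and the exact expression syntax are illustrative assumptions; only the idea of stripping `'` from synonym terms before building the query comes from the description.

```python
def strip_single_quotes(term):
    """Remove single quotes so the term cannot open an unterminated string literal."""
    return term.replace("'", "")


def build_synonym_clause(synonyms, weight=0.7):
    """Join sanitized synonyms into an OR expression; the exact syntax is illustrative."""
    cleaned = [strip_single_quotes(s) for s in synonyms]
    quoted = " OR ".join(f'"{s}"' for s in cleaned if s)
    return f"({quoted})^{weight}"


clause = build_synonym_clause(["cat-o'-nine-tails", "big cat", "computerized tomography"])
```

Stripping rather than escaping matches how the description says the input query text is already handled, at the cost of slightly altering the synonym.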

## Testing

The fix can be verified by:
1. Using Infinity as the document store (`DOC_ENGINE=infinity`)
2. Creating a dataset and running a retrieval test with the keyword
`cat`
3. Confirming no `TokenError` is raised and results are returned
normally

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Enhanced special character handling in query processing and synonym
expansion by properly sanitizing single quotes before text processing.
* Simplified OCR detection output by removing timing metadata while
preserving core detection accuracy.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: ximi <octo-patch@github.com>
2026-04-09 19:10:34 +08:00
Jin Hai
e2b879b258 Fix tiny issues (#14006)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Chores**
* Improved authentication error logging to better distinguish between
JWT and API token failures.
* Enhanced code documentation with clarifying comments for better
maintainability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-09 19:01:36 +08:00
balibabu
3c5a3e5fb4 Feat: Integrate the name, avatar, and description of chat and search into a single component. (#14008)
### What problem does this PR solve?

Feat: Integrate the name, avatar, and description of chat and search
into a single component.
### Type of change


- [x] New Feature (non-breaking change which adds functionality)


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Inline-editable avatar, name, and description fields
  * Expandable content blocks in search results
  * New RAGFlow heading/logo component

* **Refactor**
* Replaced scattered form fields with a composed Avatar/Name/Description
component
  * Mindmap drawer converted to a sheet-based drawer and layout cleanup
* Simplified search page controls and layout; improved scroll viewport
handling

* **Chores**
* Added/updated English and Chinese localization keys (placeholders,
view more/less)
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2026-04-09 18:51:45 +08:00