Files
ragflow/web
akie 35b2a714f9 Fix: tag datasets not visible in tag sets dropdown (#13921)
## Problem Description

When a user creates Dataset A using the **Tag parser** (for CSV/Excel
files with tag definitions), and then creates Dataset B, the Tag Sets
dropdown in Dataset B's Configuration page cannot display Dataset A.

### Steps to Reproduce
1. Create Dataset A with **Tag** as the chunking method
2. Upload a CSV file to Dataset A to generate tags
3. Create Dataset B
4. Navigate to Dataset B → Configuration → Tag Sets
5. **Expected**: Dataset A should appear in the dropdown
6. **Actual**: The dropdown is empty, Dataset A is not visible

---

## Root Cause Analysis

After thorough code review, **the original code logic is correct**. The
`chunk_method` field flows properly through the system:

### Data Flow

```mermaid
sequenceDiagram
    participant Frontend
    participant Pydantic
    participant API
    participant Database

    Note over Frontend,Database: Creating a Tag Dataset
    Frontend->>Pydantic: POST {chunk_method: "tag"}
    Pydantic->>API: serialization_alias converts<br/>chunk_method → parser_id
    API->>Database: INSERT {parser_id: "tag"}

    Note over Frontend,Database: Querying Datasets
    Frontend->>API: GET /api/v1/datasets
    API->>Database: SELECT parser_id, ...
    Database-->>API: Returns {parser_id: "tag"}
    API->>API: remap_dictionary_keys()<br/>parser_id → chunk_method
    API-->>Frontend: {chunk_method: "tag"}

    Note over Frontend: Filter: x.chunk_method === 'tag'
    Note over Frontend:  Match found!
```

### Field Mapping

**Location**: `api/utils/api_utils.py:657-662`
```python
DEFAULT_KEY_MAP = {
    "chunk_num": "chunk_count",
    "doc_num": "document_count",
    "parser_id": "chunk_method",  # Maps DB field to API response
    "embd_id": "embedding_model",
}
```

### Frontend Filtering (Already Correct)

**Location**:
`web/src/pages/dataset/dataset-setting/components/tag-item.tsx:24`
```typescript
const knowledgeOptions = knowledgeList
  .filter((x) => x.chunk_method === 'tag')  //  Correct field
  .map((x) => ({...}));
```

---

## Actual Issue

The most likely causes for the "bug" are:

1. **Browser Cache**: Old data cached before proper deployment
2. **Stale Data**: Datasets created before the code was fully deployed
3. **Container Not Restarted**: Changes not applied to running container

---

## Resolution

**No code changes are needed.** The existing code correctly:

1. Accepts `chunk_method` from frontend
2. Converts to `parser_id` via Pydantic serialization_alias
3. Stores in database as `parser_id`
4. Maps back to `chunk_method` in API response
5. Frontend filters by `chunk_method === 'tag'`
2026-04-03 17:29:10 +08:00
..
2026-01-05 15:28:57 +08:00
2024-11-21 18:14:22 +08:00
2026-01-04 19:14:20 +08:00

Install front-end dependencies

npm install

Launch front-end

npm run dev

The following output confirms a successful launch of the system:

Login to RAGFlow web UI

Open your browser and navigate to:

http://localhost:9222 or http://[YOUR_MACHINE_IP]:9222

Replace [YOUR_MACHINE_IP] with your actual machine IP address (e.g., http://192.168.1.49:9222).

Login to RAGFlow web admin UI

Open your browser and navigate to:

http://localhost:9222/admin or http://[YOUR_MACHINE_IP]:9222/admin

Replace [YOUR_MACHINE_IP] with your actual machine IP address (e.g., http://192.168.1.49:9222/admin).

Shutdown front-end

Ctrl + C or

kill -f "umi dev"