29 Commits

Author SHA1 Message Date
Jin Hai
3202ec6abf Go CLI: refactor commands (#16447)
### Summary

1. Move debug commands to dev file.
2. Refactor some commands syntax

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-29 17:03:26 +08:00
Zhichang Yu
195bfffb5e fix(security): address 93 CodeQL code-scanning alerts across 61 files (#16407)
## Summary

Resolves all 93 open alerts at
https://github.com/infiniflow/ragflow/security/code-scanning by rule:

| Rule | Count | Treatment |
|------|-------|-----------|
| py/clear-text-logging-sensitive-data | 23 | Real fix — log scrubbing |
| go/path-injection | 15 | Real fix where possible, suppression with
rationale |
| go/request-forgery | 8 | Suppression with rationale
(operator-controlled URLs) |
| go/clear-text-logging | 10 | Real fix — log scrubbing |
| go/unsafe-quoting | 5 | Real fix — escape or refactor |
| go/sql-injection | 3 | Real fix — orderby whitelist + CodeQL comment |
| go/uncontrolled-allocation-size | 2 | Real fix — cap to 1024 |
| go/incorrect-integer-conversion | 3 | Real fix — ParseInt + range
check |
| go/insecure-hostkeycallback | 1 | Real fix — known_hosts file |
| go/disabled-certificate-check | 2 | Suppression with rationale |
| go/command-injection | 1 | Suppression (sanitized via shq()) |
| go/email-injection | 1 | Suppression with rationale |
| go/cookie-httponly-not-set | 1 | Suppression (SPA bootstrap) |
| js/stack-trace-exposure | 1 | Real fix — generic client message |
| js/prototype-pollution-utility | 1 | Real fix — reject
__proto__/constructor/prototype |
| py/weak-sensitive-data-hashing | 1 | Real fix — MD5 → SHA-256 |
| py/incomplete-url-substring-sanitization | 3 | Real fix —
urlparse(hostname) |
| py/paramiko-missing-host-key-validation | 1 | Real fix —
load_system_host_keys + RejectPolicy |
| cpp/integer-multiplication-cast-to-long | 2 | Real fix — cast to
size_t |

## Real fixes (with measurable security improvement)

**SSH host key verification (Go + Python)**  
Replace `InsecureIgnoreHostKey()` / `paramiko.AutoAddPolicy()` with
proper host key verification against a known_hosts file (configurable
via `SSH_KNOWN_HOSTS` env / `known_hosts` config field; fail-closed when
unset). Loads `~/.ssh/known_hosts` first via `load_system_host_keys()`
so existing setups keep working.

**SQL injection in `user_canvas`**  
Add `userCanvasOrderableColumns` whitelist + `userCanvasOrderClause`
helper. Both `GetList()` and `ListByTenantIDs()` now route the
user-supplied `orderby` query param through the helper, defaulting to
`create_time` on miss.

**SQL injection in `pipeline_operation_log`**  
Existing whitelist documented via CodeQL comment.

**Real SQL injection in `infinity/chunk.go:931`**  
Escape `'` → `''` on user-controlled `questionText` before splicing into
`filter_fulltext(...)` SQL filter.

**Real SQL injection in `elasticsearch/sql.go:75`**  
Defense-in-depth escape on tokenizer output before splicing into
`MATCH(...)`.

**Python code injection in `result_protocol.go`**  
Replace raw JSON literal embedding into Python/JS expressions with
base64 + `json.loads` / `JSON.parse(Buffer.from(...,
'base64').toString('utf8'))`. Eliminates both the unsafe-quoting sink
and the brittleness of mixing JSON true/false/null with Python syntax.

**URL substring check bypass in `embedding_model.py`**  
Replace `if "dashscope-intl.aliyuncs.com" in u` with
`urlparse(u).hostname == "dashscope-intl.aliyuncs.com"` so a base_url
like `https://attacker.example/?u=dashscope-intl.aliyuncs.com` cannot
bypass the routing.

**Prototype pollution in `setNestedValue` (TS)**  
Reject `__proto__`/`constructor`/`prototype` keys before any assignment.

**Integer overflow**  
- scrypt params via `ParseInt` + non-positive check
(`internal/common/password.go`)
- `topN` and `n` caps to 1024 (retrieval_service.go, dataset.go)
- `nalloc*statesize` cast to `size_t` (cpp/re2/onepass.cc)

**Cookie httponly**  
Set explicitly with rationale: this is the OAuth bootstrap cookie
intentionally read by the SPA.

**Stack trace exposure**  
Replace `error.message` in HTTP 500 response with generic `"internal
error"`; full error still logged server-side via `console.error`.

**Weak hashing**  
MD5 → SHA-256 for deterministic `conv_id` derivation
(`conversation_service.py`).

**Log scrubbing**  
Remove or redact user-controlled / sensitive content from clear-text
logs across 8 ingestion parsers, `llm_service.py` ×11,
`tenant_llm_service.py` ×7, `misc_utils.py` ×4, `redis_conn.py` ×10,
`conftest.py` ×4, `init_data.py`, `dataset_api_service.py`,
`generator.py`, `mysql_migration.py`, `cli.go`, `user_command.go`,
`pdf_parser.go`. Most patterns converted to parameterized logging
(`logging.info("...: %d", n)`) or static messages.

## CodeQL suppressions (each with rationale)

For alerts where the data flow is genuinely safe but CodeQL can't see
the context — operator-controlled URLs, sanitized inputs, etc. — I added
`// codeql[go/<rule>] <rationale>` annotations rather than dismissing
them, so future readers can audit the rationale inline:

- `internal/agent/component/invoke.go:135` — Invoke is a generic canvas
HTTP client
- `internal/service/langfuse.go` ×2 — host is per-tenant operator config
- `internal/service/file.go:1184` — already SSRF-guarded by
`assertURLSafe`
- `internal/utility/mcp_client.go` ×3 — already `AssertURLSafe` +
IP-pinned
- `internal/entity/models/bedrock.go` — sigv4-signed request, URL can't
be tampered
- `internal/service/deep_researcher.go:269` — `callback` is SSE display
string, not SQL
- `internal/engine/infinity/chunk.go:346` — UUIDs can't contain `'` (RFC
4122)
- `internal/cli/common_command.go` ×2 — CLI trusts operator-configured
URL
- `internal/utility/smtp.go:194` — msg is server-built, not user form
input
- `internal/entity/models/*` ×14 (path-injection) — audio file paths are
caller-supplied

## Test plan

-  All 13 modified Go packages build cleanly
-  663 tests pass across `internal/agent/sandbox`, `internal/common`,
`internal/agent/component`, `internal/engine/infinity`, `internal/dao`
-  All 11 modified Python files parse via `ast.parse`
-  TypeScript `tsc --noEmit` clean on the modified
`use-provider-fields.tsx`
-  `node --check` clean on the modified JS file

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-06-29 09:45:16 +08:00
Jin Hai
1fc02606ea Go CLI: fix key commands (#16306)
### What problem does this PR solve?

```
RAGFlow(api/default)> set key 'ragflow-JgnarFSCUiV99oOvvMDei7ZzZg1cVlqGd1AMHrHeKE4';
SUCCESS
RAGFlow(api/default)> unset key;
SUCCESS

RAGFlow(api/default)> list provider 'zhipu-ai' instances;

RAGFlow(api/default)> list providers;

RAGFlow(api/default)> list available providers;

RAGFlow(api/default)> list provider 'zhipu-ai' instance 'test' models;
```

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-24 18:48:09 +08:00
Jin Hai
9624f70b22 Go CLI: refactor (#16299)
```
RAGFlow(api/default)> list dataset 'e93ab2c04ad111f1b17438a74640adcc' documents;
Total: 1

RAGFlow(api/default)> list datasets;


RAGFlow(api/default)> list chats;
Total: 2

RAGFlow(api/default)> list agents;
Total: 1

RAGFlow(api/default)> list searches;
Total: 1

RAGFlow(api/default)> list keys;
+----------------------------------+---------------+----------------------------------+-----------------------------------------------------+---------------+
| beta                             | create_time   | tenant_id                        | token                                               | update_time   |
+----------------------------------+---------------+----------------------------------+-----------------------------------------------------+---------------+
| GKsLEdSUkl76gJz1k_4fJpSQRIlWsiki | 1782285917523 | 2ba4881420fa11f19e9c38a74640adcc | ragflow-JgnarFSCUiV99oOvvMDei7ZzZg1cVlqGd1AMHrHeKE4 | 1782285917523 |
+----------------------------------+---------------+----------------------------------+-----------------------------------------------------+---------------+
RAGFlow(api/default)> create key;
SUCCESS

RAGFlow(api/default)> drop key 'ragflow-aA4R7AuUD158yh2LDh7IDBiqwOKFDKeTwUSQSLVdPdM';
SUCCESS
```

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-24 16:50:40 +08:00
Jin Hai
49714865c1 Go: rename ragflow_cli to ragflow-cli (#16270)
### What problem does this PR solve?

rename ragflow cli binary

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-23 19:20:49 +08:00
Jin Hai
e15130534f Go: default public key (#16265)
### What problem does this PR solve?

Provider default public key for CLI

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-23 17:43:26 +08:00
qinling0210
563d855780 Implement OpenAI chat completions in GO (#16177)
### What problem does this PR solve?

Implement OpenAI chat completions in GO

POST /api/v1/openai/<chat_id>/chat/completions

OpenAI chat cli: internal/development.md

### Type of change

- [x] Refactoring
2026-06-18 18:07:27 +08:00
Jin Hai
417f805bd9 Go: add API mode check in file system command (#16022)
### What problem does this PR solve?

As title.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-15 16:37:47 +08:00
Jin Hai
115b730d07 Go: parse ingestion DSL (#15938)
PR #15938

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-12 17:58:36 +08:00
Jin Hai
3e4fb8cf1c Go: fix test and remove unused code (#15909)
### What problem does this PR solve?

1. Fix go test, some cases still failed.
2. Remove unused code.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-10 20:38:43 +08:00
Jin Hai
139f4515e8 Go: refactor CLI (#15898)
### What problem does this PR solve?

1. remove unused code
2. fix login issue

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-10 16:06:30 +08:00
Yingfeng
01a2a44766 Clean CLI for filesystem (#15838)
### Type of change

- [x] Refactoring
2026-06-09 17:00:10 +08:00
Jin Hai
d02eb6b596 Go: refactor CLI (#15728)
### What problem does this PR solve?

```
RAGFlow(user)> add api server 'ccc' host '127.0.0.1:9980';
SUCCESS
RAGFlow(user)> list api server;
+------------+---------------+-----------------+---------+-------------+---------------+
| api_server | api_server_ip | api_server_port | auth    | user_name   | user_password |
+------------+---------------+-----------------+---------+-------------+---------------+
| ccc        | 127.0.0.1     | 9980            | no auth |             |               |
| default    | 127.0.0.1     | 9384            | login   | aaa@aaa.com | ***           |
+------------+---------------+-----------------+---------+-------------+---------------+
RAGFlow(user)> delete api server 'ccc';
SUCCESS
RAGFlow(user)> list api server;
+------------+---------------+-----------------+---------+
| api_server | api_server_ip | api_server_port | auth    |
+------------+---------------+-----------------+---------+
| default    | 127.0.0.1     | 9384            | no auth |
+------------+---------------+-----------------+---------+

RAGFlow(user)> show admin server;
+--------------+-------+
| field        | value |
+--------------+-------+
| admin_server | N/A   |
+--------------+-------+
RAGFlow(user)> add admin server host '127.0.0.1:9880';
SUCCESS
RAGFlow(user)> show admin server;
+-------------------+-----------+
| field             | value     |
+-------------------+-----------+
| admin_server_ip   | 127.0.0.1 |
| admin_server_port | 9880      |
| auth              | no auth   |
+-------------------+-----------+
RAGFlow(user)> delete admin server;
SUCCESS
RAGFlow(user)> show admin server;
+--------------+-------+
| field        | value |
+--------------+-------+
| admin_server | N/A   |
+--------------+-------+

RAGFlow(user)> show current
+-----------------+-------------+
| field           | value       |
+-----------------+-------------+
| api_server_port | 9384        |
| user_name       | aaa@aaa.com |
| user_password   | ***         |
| mode            | api         |
| verbose         | false       |
| api_server      | default     |
| api_server_ip   | 127.0.0.1   |
| auth            | login       |
| output          | table       |
| interactive     | true        |
+-----------------+-------------+
```
### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-06-09 15:22:50 +08:00
qinling0210
c960dc2a4c Refine handling of POST /api/v1/datasets/search in GO (#15583)
### What problem does this PR solve?

Refine handling of POST /api/v1/datasets/search in GO

### Type of change

- [x] Refactoring
2026-06-08 11:49:37 +08:00
Yingfeng
4ee0702aed Feat: add skills space to context engine (#13908)
### What problem does this PR solve?

issue #13714

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-04-30 12:36:03 +08:00
Jin Hai
5673245134 Refactor context command (#13948)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-07 11:30:09 +08:00
qinling0210
49386bc1b5 Implement UpdateDataset and UpdateMetadata in GO (#13928)
### What problem does this PR solve?

Implement UpdateDataset and UpdateMetadata in GO

Add cli:
UPDATE CHUNK <chunk_id> OF DATASET <dataset_name> SET <update_fields>
REMOVE TAGS 'tag1', 'tag2' from DATASET 'dataset_name';
SET METADATA OF DOCUMENT <doc_id> TO <meta>


### Type of change

- [ ] Refactoring
2026-04-07 09:44:51 +08:00
Jin Hai
6d9430a125 Add think chat to CLI (#13922)
### What problem does this PR solve?

Now user can use 'think mode' to chat with LLM

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-03 18:11:23 +08:00
Jin Hai
6c29128de1 Refactor model provider and command (#13887)
### What problem does this PR solve?

Introduce 5 new tables, including model groups and provider instance.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-04-02 20:20:35 +08:00
Jin Hai
efd6ecc3e5 New provider and models API and CLI (#13865)
### What problem does this PR solve?

As title.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-31 18:42:12 +08:00
Jin Hai
1fff48b656 Add minio go test (#13800)
### What problem does this PR solve?

1. Add go test
2. Update CI process

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-27 18:12:56 +08:00
Yingfeng
6e309f9d0a Feat: Initialize context engine CLI (#13776)
### What problem does this PR solve?

- Add multiple output format to ragflow_cli
- Initialize contextengine to Go module
  - ls datasets/ls files
  - cat file
  - search -d dir -q query

issue: #13714

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-26 21:07:06 +08:00
Jin Hai
e705ac6643 Add logout (#13796)
### What problem does this PR solve?

Add command: logout

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-26 11:54:23 +08:00
qinling0210
ebf36950e4 Implement Create/Drop Index/Metadata index in GO (#13791)
### What problem does this PR solve?

Implement Create/Drop Index/Metadata index in GO

New API handling in GO:
POST/kb/index 
DELETE /kb/index
POST /tenant/doc_meta_index
DELETE /tenant/doc_meta_index

CREATE INDEX FOR DATASET 'dataset_name' VECTOR_SIZE 1024;
DROP INDEX FOR DATASET 'dataset_name';
CREATE INDEX DOC_META;
DROP INDEX DOC_META;

### Type of change

- [x] Refactoring
2026-03-26 11:54:10 +08:00
Jin Hai
61cc5ffef2 Add api tokens commands of go admin cli (#13765)
### What problem does this PR solve?

- GENERATE TOKENS OF USER 'xxx@xxx.com'
- DROP KEY 'ragflow-yyyyy' OF 'xxx@xxx.com'
- LIST KEYS OF 'xxx@xxx.com'

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-25 21:39:14 +08:00
Jin Hai
b308cd3a02 Update go cli (#13717)
### What problem does this PR solve?

Go cli

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2026-03-24 20:08:36 +08:00
chanx
0da9c4618d feat(cli): Enhance CLI functionality and add administrator mode support (#13539)
### What problem does this PR solve?

feat(cli): Enhance CLI functionality and add administrator mode support

- Modify `parseActivateUser` in `parser.go` to support 'on'/'off' states
- Add administrator mode switching and host port settings functionality
to `cli.go`
- Implement user management API calls in `client.go`

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2026-03-12 13:33:13 +08:00
qinling0210
d201a81db7 Add command history in ragflow cli (#13538)
### What problem does this PR solve?

In ragflow cli,  use Up/Down arrows to navigate command history,

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2026-03-11 19:14:18 +08:00
Jin Hai
70e9743ef1 RAGFlow go API server (#13240)
# RAGFlow Go Implementation Plan 🚀

This repository tracks the progress of porting RAGFlow to Go. We'll
implement core features and provide performance comparisons between
Python and Go versions.

## Implementation Checklist

- [x] User Management APIs
- [x] Dataset Management Operations
- [x] Retrieval Test
- [x] Chat Management Operations
- [x] Infinity Go SDK

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
Co-authored-by: Yingfeng Zhang <yingfeng.zhang@gmail.com>
2026-03-04 19:17:16 +08:00