Files
ragflow/test
galuis116 d1f6594618 Fix: JWT algorithm-confusion in OIDC ID token verification (#15181)
### What problem does this PR solve?

Closes #15180.

`OIDCClient.parse_id_token` in `api/apps/auth/oidc.py` read the JWT
signing
algorithm from the **unverified** JWT header and passed it through to
`jwt.decode(..., algorithms=[alg], ...)` as the trust anchor. This is
the
textbook JWT algorithm-confusion vulnerability (CWE-345 / CWE-347). Any
unauthenticated client capable of reaching the OIDC callback could take
over
an arbitrary account on any RAGFlow deployment with OIDC login enabled:

1. **`alg: "none"`** — present a JWT with `{"alg": "none"}` and no
   signature segment → `jwt.decode(..., algorithms=["none"])` → PyJWT's
   `NoneAlgorithm` accepts the token without verification → login as any
   user.
2. **RSA / HMAC confusion** — fetch the public RSA key from the
provider's
   JWKS (it's public), forge a JWT with `{"alg": "HS256"}` HMAC-signed
   using the public-key bytes as the secret → `jwt.decode(...,
   algorithms=["HS256"], key=public_key)` → verifier accepts → login as
   any user. (Modern PyJWT independently refuses to use a PEM-formatted
   key as an HMAC secret, which mitigates this leg for PEM key formats;
the fix here is the only mitigation for raw / DER / JWK octet keys and
   for older PyJWT versions.)

### What changed

**`api/apps/auth/oidc.py`:**

- New module constants `_ALLOWED_OIDC_SIGNING_ALGS` (asymmetric-only:
  `RS*`, `ES*`, `PS*`, `EdDSA` — explicitly excludes `none` and `HS*`)
  and `_DEFAULT_OIDC_SIGNING_ALGS = ("RS256",)` (the OIDC Core 1.0 §2
  spec default).
- New helper `_resolve_id_token_signing_algs(metadata)` — intersects the
  provider's advertised `id_token_signing_alg_values_supported` from
`/.well-known/openid-configuration` with the safe allowlist; falls back
  to RS256 when the field is missing or contains only unsafe values.
- `OIDCClient.__init__` now stores the resolved allowlist on
  `self.id_token_signing_algs` — pinned once, from a trusted source, at
  construction time.
- `parse_id_token` no longer calls `jwt.get_unverified_header` and no
  longer reads `alg` from the JWT header. It passes
  `self.id_token_signing_algs` to `jwt.decode(..., algorithms=...)`.
  `PyJWKClient.get_signing_key_from_jwt` still reads the `kid` from the
  header internally for JWKS lookup — that's fine, `kid` is not a
  security decision; the signature still proves which key was actually
  used.


**`test/testcases/test_web_api/test_auth_app/test_oidc_client_unit.py`:**

- Existing `test_parse_id_token_success_and_error` drops its
`jwt.get_unverified_header` mock (no longer called by `parse_id_token`).
- `_metadata` and `_make_client` helpers grew an optional `signing_algs`
parameter so tests can configure what the discovery document advertises.
- New `TestSSRFValidation` / algorithm-confusion regression block (7
  tests):
  - `test_id_token_signing_algs_default_to_rs256_when_metadata_missing`
  - `test_id_token_signing_algs_intersect_metadata_with_safe_allowlist`
  - `test_id_token_signing_algs_fall_back_when_only_unsafe_advertised`
  - `test_id_token_signing_algs_ignores_non_string_entries`
  - `test_id_token_signing_algs_handles_non_list_metadata_field`
  - `test_parse_id_token_passes_pinned_algorithms_to_jwt_decode` —
    sabotages `jwt.get_unverified_header` to raise on call, proving the
    verification path never consults the unverified header.
- `test_parse_id_token_rejects_alg_none` — uses real PyJWT to encode an
    `alg: "none"` token; `parse_id_token` raises `ValueError("Error
    parsing ID Token: …")` instead of accepting it.
  - `test_parse_id_token_rejects_hs256_when_allowlist_is_asymmetric` —
    uses real PyJWT to forge an `alg: "HS256"` token with a non-PEM
    shared secret (so PyJWT's incidental PEM-as-HMAC refusal isn't what
    blocks it); `parse_id_token` raises because `HS256` is not in the
    pinned allowlist.

Sanity-checked end-to-end with real PyJWT outside the project test
runner:

- `alg=none` forged token + `algorithms=["RS256"]` →
`InvalidAlgorithmError` ✓
- `alg=HS256` forged token + `algorithms=["RS256"]` →
`InvalidAlgorithmError` ✓
- Same `alg=HS256` token + `algorithms=["HS256"]` → **accepted**
({'sub': 'admin'})
  — confirming the attack path was real before the fix.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: galuis116 <contact@duerrimports.com>
2026-05-29 19:37:01 +08:00
..


(1). Deploy RAGFlow services and images

https://ragflow.io/docs/build_docker_image

(2). Configure the required environment for testing

Install Python dependencies (including test dependencies):

uv sync --python 3.13 --only-group test --no-default-groups --frozen

Activate the environment:

source .venv/bin/activate

Install SDK:

uv pip install sdk/python 

Modify the .env file: Add the following code:

COMPOSE_PROFILES=${COMPOSE_PROFILES},tei-cpu
TEI_MODEL=BAAI/bge-small-en-v1.5
RAGFLOW_IMAGE=infiniflow/ragflow:v0.25.6 #Replace with the image you are using

Start the containerwait two minutes:

docker compose -f docker/docker-compose.yml up -d


(3). Test Elasticsearch

a) Run sdk tests against Elasticsearch:

export HTTP_API_TEST_LEVEL=p2
export HOST_ADDRESS=http://127.0.0.1:9380  # Ensure that this port is the API port mapped to your localhost
pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_sdk_api 

b) Run http api tests against Elasticsearch:

pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_http_api 


(4). Test Infinity

Modify the .env file:

DOC_ENGINE=${DOC_ENGINE:-infinity}

Start the container:

docker compose -f docker/docker-compose.yml down -v 
docker compose -f docker/docker-compose.yml up -d

a) Run sdk tests against Infinity:

DOC_ENGINE=infinity pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_sdk_api 

b) Run http api tests against Infinity:

DOC_ENGINE=infinity pytest -s --tb=short --level=${HTTP_API_TEST_LEVEL} test/testcases/test_http_api