4 Commits

Author SHA1 Message Date
NeedmeFordev
6b7989b4b4 Add file type validation (#13802)
### What problem does this PR solve?

This PR fixes WebDAV sync behavior for unsupported file types
([#13795](https://github.com/infiniflow/ragflow/issues/13795)).

Previously, the WebDAV connector selected files primarily by modified
time (and size threshold) and could still pass unsupported extensions
into the download/document-generation path. This caused unnecessary
processing and inconsistent behavior compared with connectors that
validate file type earlier.

This change adds extension validation in two places:

1. **Early filter during recursive listing** to skip unsupported files
before they enter the download flow.
2. **Defensive filter before download/document creation** to prevent
unsupported files from being processed if any listing edge case slips
through.

It also wires `allow_images` into the WebDAV sync path so image
extension handling follows connector policy.
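
The two-stage validation described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the extension sets and the names `is_supported`, `list_supported`, `download_all`, and `fetch` are all hypothetical, and the real supported-extension policy may differ.

```python
from pathlib import PurePosixPath

# Hypothetical extension sets; the real connector policy may differ.
DOCUMENT_EXTENSIONS = {".pdf", ".txt", ".md"}
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported(path: str, allow_images: bool = False) -> bool:
    """Return True if the file extension is accepted under the given policy."""
    ext = PurePosixPath(path).suffix.lower()
    if ext in DOCUMENT_EXTENSIONS:
        return True
    # Image extensions are accepted only when the policy allows them.
    return allow_images and ext in IMAGE_EXTENSIONS


def list_supported(paths, allow_images=False):
    """Early filter: drop unsupported files from a listing before download."""
    return [p for p in paths if is_supported(p, allow_images)]


def download_all(paths, fetch, allow_images=False):
    """Defensive filter: re-check each path right before downloading it."""
    docs = []
    for p in paths:
        if not is_supported(p, allow_images):
            continue  # skip anything that slipped past the listing filter
        docs.append((p, fetch(p)))
    return docs
```

Sharing one predicate between both stages keeps the listing filter and the pre-download check from drifting apart.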

Scope is intentionally limited to WebDAV for a focused bug-fix PR.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

### How was this tested?

- Manual verification with mixed file types under the configured WebDAV
path:
  - supported: `.pdf`, `.txt`, `.md`
  - unsupported: `.exe`, `.bin`, `.dat`
- Triggered full sync and polling sync.
- Confirmed unsupported files are skipped before download.
- Confirmed supported files are still indexed normally.
- Confirmed image handling follows `allow_images` setting.

Fixes: #13795
2026-04-02 14:12:27 +08:00
Magicbook1108
96810b7d97 Fix: webdav connector (#12380)
### What problem does this PR solve?

Fix the WebDAV connector added in #11422.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-12-31 19:00:00 +08:00
Jonah Hartmann
6587acef88 Feat: use filepath for files with the same name (#11752)
### What problem does this PR solve?

When multiple files shared the same name, they appeared as duplicates,
making it hard to distinguish between them. Now, files that share a
name are named after their folder path in the WebDAV storage unit.
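
One way to picture this behavior is the sketch below. It is an assumption-laden illustration: the function name `display_names` is hypothetical, and the exact name format the connector produces is not stated here.

```python
from collections import Counter
from pathlib import PurePosixPath


def display_names(paths):
    """Map each remote path to a display name, using the path only for duplicates."""
    counts = Counter(PurePosixPath(p).name for p in paths)
    names = {}
    for p in paths:
        base = PurePosixPath(p).name
        # Unique basenames stay short; clashing ones fall back to the folder path.
        names[p] = base if counts[base] == 1 else p.lstrip("/")
    return names
```

With `/a/report.pdf` and `/b/report.pdf` in the same listing, the two files would be labeled `a/report.pdf` and `b/report.pdf`, while a uniquely named file keeps its plain name.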

The same could be done for the other connectors, since most of them
will have similar issues when iterating through folder paths.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Contribution by RAGcon GmbH, visit us [here](https://www.ragcon.ai/)
2025-12-05 10:10:26 +08:00
Jonah Hartmann
2fd5ac1031 Feat: Add Webdav storage as data source (#11422)
### What problem does this PR solve?

This PR adds WebDAV storage as a data source for the data sync service.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-11-26 14:14:42 +08:00