feat: implement Microsoft Teams data source connector (#15193)

### What problem does this PR solve?

Closes #15191.

RAGFlow shipped a Microsoft Teams connector stub
(`common/data_source/teams_connector.py`) whose document-loading methods
all returned `[]`. In addition, `Teams._generate()` was a `pass`, and
Teams was commented out of the data-source settings UI. As a result,
there was no way to index Teams channel conversations into a knowledge
base.

This PR implements the connector end to end on top of Microsoft Graph
(via Office365-REST-Python-Client). It shares the MSAL client-credentials
auth shape with the SharePoint connector.
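
For reviewers unfamiliar with the callback form, here is a minimal sketch of a lazy MSAL client-credentials token callback. It is not the code from this PR: `make_token_callback` is a hypothetical helper name, and the authority URL and `.default` scope are the usual values for Graph application permissions rather than values taken from the diff.

```python
from typing import Any, Callable


def make_token_callback(
    tenant_id: str, client_id: str, client_secret: str
) -> Callable[[], dict[str, Any]]:
    """Return a zero-argument callable that fetches a Graph token on demand.

    Hypothetical helper for illustration; the connector's real code may differ.
    """

    def acquire_token() -> dict[str, Any]:
        # Imported lazily so that building the callback needs neither the
        # msal package nor any network access.
        import msal

        app = msal.ConfidentialClientApplication(
            client_id,
            authority=f"https://login.microsoftonline.com/{tenant_id}",
            client_credential=client_secret,
        )
        # ".default" asks for the application permissions already granted
        # to the app registration (admin consent).
        return app.acquire_token_for_client(
            scopes=["https://graph.microsoft.com/.default"]
        )

    return acquire_token
```

The callback would then be handed to the client, for example `GraphClient(make_token_callback(tenant_id, client_id, client_secret))`, assuming the installed Office365-REST-Python-Client version accepts a token callback as described in this PR. No token is requested until the client first calls it.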

**Backend**

- `common/data_source/teams_connector.py`
  - `load_credentials()` now builds the Graph client using an MSAL
    client-credentials **token callback**, which is the form `GraphClient`
    actually expects. (The previous stub passed a raw access-token string
    to `GraphClient(...)`, which is not how that client is driven.) Token
    acquisition is lazy, so credential loading performs no network call.
  - `validate_connector_settings()` lists teams via Graph.
  - `load_from_checkpoint()` is now a generator that pages teams →
    channels → messages and flattens each top-level post together with its
    replies into one blob-based `Document` (`extension` `.txt`/`.html`,
    `blob`, `size_bytes`, `doc_updated_at`). Incremental syncs are bounded
    by message `lastModifiedDateTime` (falling back to `createdDateTime`).
    Per-message errors surface as `ConnectorFailure` instead of aborting
    the run.
  - `retrieve_all_slim_docs_perm_sync()` yields id-only `SlimDocument`
    batches, and the checkpoint helpers return proper `TeamsCheckpoint`s.
  - ACL → `ExternalAccess` mapping is intentionally left best-effort
    (`load_from_checkpoint_with_perm_sync` delegates to the standard load)
    because the sync pipeline does not currently persist `ExternalAccess`.
- `rag/svr/sync_data_source.py`
  - Implemented `Teams._generate()` using the existing
    `CheckpointOutputWrapper` pattern (same shape as Confluence/Jira/Google
    Drive), supporting full reindex and incremental polling from
    `poll_range_start`.
- `TeamsConnector` is already exported from
  `common/data_source/__init__.py`.
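
The flattening and incremental-window behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the connector's code: `flatten_thread`, `message_updated_at`, and `in_window` are hypothetical names, messages are plain dicts shaped like Graph `chatMessage` JSON, and the rules for choosing `.html` versus `.txt` and for joining messages are guesses.

```python
from datetime import datetime
from typing import Any, Optional


def parse_graph_time(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp such as '2026-05-01T10:00:00Z'."""
    if not value:
        return None
    # Older Python versions do not accept a trailing 'Z' in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def message_updated_at(msg: dict[str, Any]) -> Optional[datetime]:
    """Prefer lastModifiedDateTime, falling back to createdDateTime."""
    return parse_graph_time(msg.get("lastModifiedDateTime")) or parse_graph_time(
        msg.get("createdDateTime")
    )


def flatten_thread(post: dict[str, Any], replies: list[dict[str, Any]]) -> dict[str, Any]:
    """Merge a top-level post and its replies into one blob-based record."""
    messages = [post, *replies]
    # Assumption: the thread is HTML if any message body is HTML.
    is_html = any((m.get("body") or {}).get("contentType") == "html" for m in messages)
    parts = []
    for m in messages:
        user = (m.get("from") or {}).get("user") or {}
        author = user.get("displayName") or "unknown"
        content = (m.get("body") or {}).get("content") or ""
        parts.append(f"{author}: {content}")
    blob = ("<br>\n" if is_html else "\n").join(parts).encode("utf-8")
    stamps = [t for t in map(message_updated_at, messages) if t is not None]
    return {
        "extension": ".html" if is_html else ".txt",
        "blob": blob,
        "size_bytes": len(blob),
        # The newest edit anywhere in the thread marks the document as updated.
        "doc_updated_at": max(stamps) if stamps else None,
    }


def in_window(updated_at: Optional[datetime], start: datetime, end: datetime) -> bool:
    """Assumed half-open poll window (start, end]."""
    return updated_at is not None and start < updated_at <= end
```

Taking the newest timestamp across the post and its replies means an edited reply causes the whole thread document to be re-emitted on the next incremental poll. Whether the real connector applies the window per thread or per message is not stated here.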

**Frontend (`web/`)**

- Enabled the `TEAMS` data-source enum and added its form fields
(`tenant_id`, `client_id`, `client_secret`), default values, display
metadata, and a Teams icon.
- Added `teamsDescription` / `teamsTenantIdTip` to `en.ts` and `zh.ts`.

**Tests**

- `test/unit_test/data_source/test_teams_connector_unit.py`: mock-based
unit tests covering credential loading (incomplete creds raise, happy
path sets the Graph client, fetch-without-creds raises), post/reply
flattening (incl. the HTML vs text extension), incremental
`lastModifiedDateTime` filtering, and slim-doc listing. All 6 pass;
`ruff check` is clean.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
web-dev0521
2026-05-28 03:10:38 -06:00
committed by GitHub
parent b7d88f0b09
commit 98bc9ca6ac
7 changed files with 642 additions and 69 deletions


@@ -0,0 +1,9 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 48 48" width="48" height="48">
<path d="M28.5 19h14.2c.7 0 1.3.6 1.3 1.3v12.4a7.1 7.1 0 0 1-7.1 7.1 7.1 7.1 0 0 1-7.1-7.1V19z" fill="#5059C9"/>
<circle cx="38" cy="11.5" r="4.5" fill="#5059C9"/>
<circle cx="23.5" cy="9.5" r="6.5" fill="#7B83EB"/>
<path d="M31.8 19H13.4c-.7 0-1.3.6-1.2 1.4v11.6a11 11 0 0 0 9.3 10.9 11 11 0 0 0 12.6-9.7c0-.4.1-.8.1-1.2V20.4c0-.8-.6-1.4-1.4-1.4z" fill="#7B83EB"/>
<path d="M24 16H12.2v18.4c2 .4 4 .3 5.8-.4V19h6V16z" opacity=".1"/>
<rect x="2" y="13" width="22" height="22" rx="2" fill="#4B53BC"/>
<path d="M18.2 20.3H7.8v2.7h3.6v9.4h3.2V23h3.6v-2.7z" fill="#fff"/>
</svg>


@@ -1238,6 +1238,10 @@ Example: Virtual Hosted Style`,
'Upload the OAuth JSON generated from Google Console. If it only contains client credentials, run the browser-based verification once to mint long-lived refresh tokens.',
dropboxDescription:
'Connect your Dropbox to sync files and folders from a chosen account.',
teamsDescription:
'Connect Microsoft Teams via Microsoft Graph to sync channel posts and replies.',
teamsTenantIdTip:
'Azure AD tenant ID. Requires an app with Team.ReadBasic.All and ChannelMessage.Read.All application permissions (admin consent).',
slackDescription:
'Connect your Slack workspace to sync channel messages and threads.',
slackBotTokenTip:


@@ -1099,6 +1099,9 @@ NER使用 spaCy NER 和基于规则的关键词提取来抽取实体和关系
gmailTokenTip:
'请上传由 Google Console 生成的 OAuth JSON。如果仅包含 client credentials,请通过浏览器授权一次以获取长期有效的刷新 Token。',
dropboxDescription: '连接 Dropbox,同步指定账号下的文件与文件夹。',
teamsDescription: '通过 Microsoft Graph 连接 Microsoft Teams,同步频道帖子与回复。',
teamsTenantIdTip:
'Azure AD 租户 ID。需要具备 Team.ReadBasic.All 与 ChannelMessage.Read.All 应用权限(管理员同意)的应用。',
slackDescription: '连接你的 Slack 工作区,同步频道消息与讨论串。',
slackBotTokenTip:
'Slack 机器人用户 OAuth Token(以 xoxb- 开头)。应用需具备 channels:read、channels:history 和 users:read 权限。',


@@ -43,10 +43,9 @@ export enum DataSourceKey {
POSTGRESQL = 'postgresql',
REST_API = 'rest_api',
RSS = 'rss',
TEAMS = 'teams',
SLACK = 'slack',
SHAREPOINT = 'sharepoint',
// TEAMS = 'teams',
}
type DataSourceFeatureVisibility = {
@@ -130,6 +129,9 @@ export const DataSourceFeatureVisibilityMap: Partial<
[DataSourceKey.MOODLE]: {
syncDeletedFiles: true,
},
[DataSourceKey.TEAMS]: {
syncDeletedFiles: true,
},
[DataSourceKey.SLACK]: {
syncDeletedFiles: true,
},
@@ -219,6 +221,11 @@ export const generateDataSourceInfo = (t: TFunction) => {
description: t(`setting.${DataSourceKey.MOODLE}Description`),
icon: <SvgIcon name={'data-source/moodle'} width={38} />,
},
[DataSourceKey.TEAMS]: {
name: 'Microsoft Teams',
description: t(`setting.${DataSourceKey.TEAMS}Description`),
icon: <SvgIcon name={'data-source/teams'} width={38} />,
},
[DataSourceKey.SLACK]: {
name: 'Slack',
description: t(`setting.${DataSourceKey.SLACK}Description`),
@@ -670,6 +677,27 @@ export const DataSourceFormFields = {
required: true,
},
],
[DataSourceKey.TEAMS]: [
{
label: 'Tenant ID',
name: 'config.credentials.tenant_id',
type: FormFieldType.Text,
required: true,
tooltip: t('setting.teamsTenantIdTip'),
},
{
label: 'Client ID',
name: 'config.credentials.client_id',
type: FormFieldType.Text,
required: true,
},
{
label: 'Client Secret',
name: 'config.credentials.client_secret',
type: FormFieldType.Password,
required: true,
},
],
[DataSourceKey.SLACK]: [
{
label: 'Slack Bot Token',
@@ -1569,6 +1597,17 @@ export const DataSourceFormDefaultValues = {
},
},
},
[DataSourceKey.TEAMS]: {
name: '',
source: DataSourceKey.TEAMS,
config: {
credentials: {
tenant_id: '',
client_id: '',
client_secret: '',
}
}
},
[DataSourceKey.SLACK]: {
name: '',
source: DataSourceKey.SLACK,