ragflow

mirror of https://github.com/infiniflow/ragflow.git synced 2026-07-04 01:29:35 +08:00

Author	SHA1	Message	Date
sapienza yoan	811e9826d0	perf: avoid O(n²) array growth in embedding accumulation (#14369 ) ### What problem does this PR solve? Both tokenizer (`rag/flow/tokenizer/tokenizer.py`) and `BuiltinEmbed.encode` (`rag/llm/embedding_model.py`) currently accumulate embedding batches via `np.concatenate` inside the per-batch loop. `np.concatenate` allocates a new array and copies all existing data on every call, so accumulating N batches is O(N²) in both time and peak memory. Replacing the incremental concatenate with a list-of-batches + a single `np.vstack` at the end gives O(N) total work. For tokenizer the title-vector broadcast `np.concatenate([vts[0]] * N)` is also replaced by `np.tile`, which does the same job with a single contiguous allocation instead of building a Python list of references. This is purely a CPU/memory optimisation — output shape and dtype are unchanged. Measured impact grows with document size: - 1k chunks (batch 512, 2 iters): ~negligible - 10k chunks (20 iters): ~10× speedup on this stage - 100k chunks (195 iters): ~100× speedup, and peak RAM drops from O(N) extra to near-zero ### Type of change - [x] Performance Improvement Co-authored-by: yoan sapienza <Yoan Sapienza yoan.sapienza@orange.fr Yoan Sapienza zappy@macbookpro.home>	2026-04-30 11:00:10 +08:00
FuturMix	2548c28d65	feat: add FuturMix as model provider (#14419 ) ## Summary Add [FuturMix](https://futurmix.ai) as a new model provider. FuturMix is an OpenAI-compatible unified AI gateway that provides access to 22+ models (GPT, Claude, Gemini, DeepSeek, and more) through a single API endpoint and key. - API Base: `https://futurmix.ai/v1` (OpenAI-compatible) - Supported capabilities: Chat, Embedding, Image2Text, TTS, Speech2Text, Rerank ### Changes \| File \| Change \| \|------\|--------\| \| `rag/llm/__init__.py` \| Add `FuturMix` to `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` \| \| `rag/llm/chat_model.py` \| Add `FuturMixChat(Base)` — follows Astraflow/Avian pattern \| \| `rag/llm/embedding_model.py` \| Add `FuturMixEmbed(OpenAIEmbed)` — follows Astraflow pattern \| \| `rag/llm/cv_model.py` \| Add `FuturMixCV(GptV4)` — follows SILICONFLOW/OpenRouter pattern \| \| `rag/llm/tts_model.py` \| Add `FuturMixTTS(OpenAITTS)` — follows CometAPI/DeerAPI pattern \| \| `rag/llm/sequence2txt_model.py` \| Add `FuturMixSeq2txt(GPTSeq2txt)` — follows StepFun pattern \| \| `rag/llm/rerank_model.py` \| Add `FuturMixRerank(OpenAI_APIRerank)` \| \| `conf/llm_factories.json` \| Add factory config with 8 chat, 2 embedding, 1 image2text, 2 TTS, 1 speech2text models \| \| `docs/guides/models/supported_models.mdx` \| Add FuturMix to supported models table \| ### Models included - Chat: claude-sonnet-4-20250514, claude-3.5-haiku, gpt-4o, gpt-4o-mini, gemini-2.5-flash, gemini-2.0-flash, deepseek-chat, deepseek-reasoner - Embedding: text-embedding-3-small, text-embedding-3-large - Image2Text: gpt-4o - TTS: tts-1, tts-1-hd - Speech2Text: whisper-1 ## Test plan - [ ] Verify FuturMix appears in the model provider list in RAGFlow UI - [ ] Configure FuturMix with API key and test chat completion - [ ] Test embedding model with document indexing - [ ] Test image2text with a sample image 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-30 10:59:37 +08:00
ucloudnb666	f853a39b40	feat: Add Astraflow provider support (global + China endpoints) (#14270 ) ## Add Astraflow Provider Support This PR integrates [Astraflow](https://astraflow.ucloud.cn/) (by UCloud / 优刻得) as a new AI model provider in RAGFlow, with support for both global and China endpoints. ### About Astraflow Astraflow is an OpenAI-compatible AI model aggregation platform supporting 200+ models from major providers including DeepSeek, Qwen, GPT, Claude, Gemini, Llama, Mistral, and more. \| Variant \| Factory Name \| Endpoint \| Env Var \| \|---------\|-------------\|----------\|---------\| \| Global \| `Astraflow` \| `https://api-us-ca.umodelverse.ai/v1` \| `ASTRAFLOW_API_KEY` \| \| China \| `Astraflow-CN` \| `https://api.modelverse.cn/v1` \| `ASTRAFLOW_CN_API_KEY` \| - API key signup: https://astraflow.ucloud.cn/ --- ### Files Changed \| File \| Change \| \|------\|--------\| \| `rag/llm/__init__.py` \| Register `Astraflow` and `Astraflow-CN` in `SupportedLiteLLMProvider` enum, `FACTORY_DEFAULT_BASE_URL`, and `LITELLM_PROVIDER_PREFIX` \| \| `rag/llm/chat_model.py` \| Add `AstraflowChat` and `AstraflowCNChat` (OpenAI-compatible `Base` subclass) \| \| `rag/llm/embedding_model.py` \| Add `AstraflowEmbed` and `AstraflowCNEmbed` (subclasses of `OpenAIEmbed`) \| \| `rag/llm/rerank_model.py` \| Add `AstraflowRerank` and `AstraflowCNRerank` (subclasses of `OpenAI_APIRerank`) \| \| `rag/llm/cv_model.py` \| Add `AstraflowCV` and `AstraflowCNCV` (subclasses of `GptV4`) \| \| `rag/llm/tts_model.py` \| Add `AstraflowTTS` and `AstraflowCNTTS` (subclasses of `OpenAITTS`) \| \| `rag/llm/sequence2txt_model.py` \| Add `AstraflowSeq2txt` and `AstraflowCNSeq2txt` (subclasses of `GPTSeq2txt`) \| \| `conf/llm_factories.json` \| Register `Astraflow` and `Astraflow-CN` factories with a curated list of popular models \| --- ### Supported Model Types - ✅ Chat / LLM — DeepSeek-V3/R1, Qwen3, GPT-4o/4.1, Claude 3.5/3.7, Gemini 2.0/2.5 Flash, Llama 3.3/4, Mistral, and 200+ more - ✅ Text Embedding — text-embedding-3-small/large - ✅ Image / Vision (IMAGE2TEXT) — GPT-4o, GPT-4.1, Claude, Gemini, Llama-4, etc. - ✅ Text Re-Rank - ✅ TTS — tts-1 - ✅ Speech-to-Text (SPEECH2TEXT) — whisper-1 ### Implementation Notes - Uses the `openai/` LiteLLM prefix — consistent with other OpenAI-compatible aggregation platforms (SILICONFLOW, DeerAPI, CometAPI, OpenRouter, n1n, Avian, etc.) - `Astraflow` (global, rank 250) and `Astraflow-CN` (China, rank 249) are separate factory entries, allowing users to choose the optimal endpoint based on their region. - All model classes cleanly subclass existing base classes (`Base`, `OpenAIEmbed`, `OpenAI_APIRerank`, `GptV4`, `OpenAITTS`, `GPTSeq2txt`) with no custom logic needed — the provider is fully OpenAI-compatible. --------- Co-authored-by: user <user@xzaaaMacBook-Air.local>	2026-04-22 15:38:34 +08:00
tmimmanuel	13d0df1562	feat: add Perplexity contextualized embeddings API as a new model provider (#13709 ) ### What problem does this PR solve? Adds Perplexity contextualized embeddings API as a new model provider, as requested in #13610. - `PerplexityEmbed` provider in `rag/llm/embedding_model.py` supporting both standard (`/v1/embeddings`) and contextualized (`/v1/contextualizedembeddings`) endpoints - All 4 Perplexity embedding models registered in `conf/llm_factories.json`: `pplx-embed-v1-0.6b`, `pplx-embed-v1-4b`, `pplx-embed-context-v1-0.6b`, `pplx-embed-context-v1-4b` - Frontend entries (enum, icon mapping, API key URL) in `web/src/constants/llm.ts` - Updated `docs/guides/models/supported_models.mdx` - 22 unit tests in `test/unit_test/rag/llm/test_perplexity_embed.py` Perplexity's API returns `base64_int8` encoded embeddings (not OpenAI-compatible), so this uses a custom `requests`-based implementation. Contextualized vs standard model is auto-detected from the model name. Closes #13610 ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2026-03-20 10:47:48 +08:00
Jonah Hartmann	6023eb27ac	feat: add Ragcon provider (#13425 ) ### What problem does this PR solve? This PR aims to extend the list of possible providers. Adds new Provider "RAGcon" within the Ollama Modal. It provides all model types except OCR via Openai-compatible endpoints. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Jakob <16180662+hauberj@users.noreply.github.com>	2026-03-06 09:37:27 +08:00
Magicbook1108	5fc3bd38b0	Feat: Support siliconflow.com (#13308 ) ### What problem does this PR solve? Feat: Support siliconflow.com ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2026-03-02 15:37:42 +08:00
Magicbook1108	98e1d5aa5c	Refact: switch from google-generativeai to google-genai (#13140 ) ### What problem does this PR solve? Refact: switch from oogle-generativeai to google-genai #13132 Refact: commnet out unused pywencai. ### Type of change - [x] Refactoring	2026-02-24 10:28:33 +08:00
Yongteng Lei	3a86e7c224	Feat: support doubao-embedding-vision model (#12983 ) ### What problem does this PR solve? Add support `doubao-embedding-vision` model. `doubao-embedding-large-text` is deprecated. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality)	2026-02-05 09:49:46 +08:00
Magicbook1108	f8fd1ea7e1	Feat: Further update Bedrock model configs (#12029 ) ### What problem does this PR solve? Feat: Further update Bedrock model configs #12020 #12008 <img width="700" alt="2b4f0f7fab803a2a2d5f345c756a2c69" src="https://github.com/user-attachments/assets/e1b9eaad-5c60-47bd-a6f4-88a104ce0c63" /> <img width="700" alt="afe88ec3c58f745f85c5c507b040c250" src="https://github.com/user-attachments/assets/9de39745-395d-4145-930b-96eb452ad6ef" /> <img width="700" alt="1a21bb2b7cd8003dce1e5207f27efc69" src="https://github.com/user-attachments/assets/ddba1682-6654-4954-aa71-41b8ebc04ac0" /> ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-19 11:32:20 +08:00
Magicbook1108	e84d5412bc	Feat: bedrock iam authentication (#12020 ) ### What problem does this PR solve? Feat: bedrock iam authentication #12008 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-12-18 17:13:09 +08:00
Stephen Hu	a63dcfed6f	Refactor: improve cohere calculate total counts (#12007 ) ### What problem does this PR solve? improve cohere calculate total counts ### Type of change - [x] Refactoring	2025-12-18 10:04:28 +08:00
Yongteng Lei	03f9be7cbb	Refa: only support MinerU-API now (#11977 ) ### What problem does this PR solve? Only support MinerU-API now, still need to complete frontend for pipeline to allow the configuration of MinerU options. ### Type of change - [x] Refactoring	2025-12-17 12:58:48 +08:00
Stephen Hu	ef5d1d4b74	Fix: 'AzureEmbed' object has no attribute 'total_token_count_from_response' (#11962 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/11956 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-16 11:29:07 +08:00
Stephen Hu	2a0f835ffe	Refactor: Improve the logic to calculate embedding total token count (#11943 ) ### What problem does this PR solve? Improve the logic to calculate embedding total token count ### Type of change - [x] Refactoring	2025-12-15 11:33:57 +08:00
Magicbook1108	7d23c3aed0	Fix: presentation parsing & Embedding encode exception handling (#11933 ) ### What problem does this PR solve? Fix: presentation parsing #11920 Fix: Embeddin encode exception handling ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-13 11:37:42 +08:00
Billy Bao	41cff3e09e	Fix: jina embedding issue (#11628 ) ### What problem does this PR solve? Fix: jina embedding issue #11614 Feat: Add jina embedding v4 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-12-01 14:24:35 +08:00
cnJasonZ	3fcf2ee54c	feat: add new LLM provider Jiekou.AI (#11300 ) ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] New Feature (non-breaking change which adds functionality) Co-authored-by: Jason <ggbbddjm@gmail.com>	2025-11-17 19:47:46 +08:00
Zhichang Yu	68b952abb1	Don't select vector on infinity (#11151 ) ### What problem does this PR solve? Don't select vector on infinity ### Type of change - [x] Performance Improvement	2025-11-10 18:01:40 +08:00
Jin Hai	f98b24c9bf	Move api.settings to common.settings (#11036 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-06 09:36:38 +08:00
Jin Hai	1a9215bc6f	Move some vars to globals (#11017 ) ### What problem does this PR solve? As title. ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-05 14:14:38 +08:00
Jin Hai	96c015fb85	Fix and refactor imports (#11010 ) ### What problem does this PR solve? 1. Move EMBEDDING_CFG to common.globals 2. Fix error imports 3. Move signal handles to common/signal_utils.py ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-05 11:07:54 +08:00
Jin Hai	378bdfccfc	Refactor log utils (#10973 ) ### What problem does this PR solve? As title. ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-03 20:25:02 +08:00
Jin Hai	360f5c1179	Move token related functions to common (#10942 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2025-11-03 08:50:05 +08:00
Zhichang Yu	fe4852cb71	TEI auto truncate inputs (#10916 ) ### What problem does this PR solve? TEI auto truncate inputs ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-10-31 16:46:20 +08:00
Zhichang Yu	73144e278b	Don't release full image (#10654 ) ### What problem does this PR solve? Introduced gpu profile in .env Added Dockerfile_tei fix datrie Removed LIGHTEN flag ### Type of change - [x] Documentation Update - [x] Refactoring	2025-10-23 23:02:27 +08:00
Kevin Hu	cbf04ee470	Feat: Use data pipeline to visualize the parsing configuration of the knowledge base (#10423 ) ### What problem does this PR solve? #9869 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: jinhai <haijin.chn@gmail.com> Signed-off-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: chanx <1243304602@qq.com> Co-authored-by: balibabu <cike8899@users.noreply.github.com> Co-authored-by: Lynn <lynn_inf@hotmail.com> Co-authored-by: 纷繁下的无奈 <zhileihuang@126.com> Co-authored-by: huangzl <huangzl@shinemo.com> Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com> Co-authored-by: Wilmer <33392318@qq.com> Co-authored-by: Adrian Weidig <adrianweidig@gmx.net> Co-authored-by: Zhichang Yu <yuzhichang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yongteng Lei <yongtengrey@outlook.com> Co-authored-by: Liu An <asiro@qq.com> Co-authored-by: buua436 <66937541+buua436@users.noreply.github.com> Co-authored-by: BadwomanCraZY <511528396@qq.com> Co-authored-by: cucusenok <31804608+cucusenok@users.noreply.github.com> Co-authored-by: Russell Valentine <russ@coldstonelabs.org> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Billy Bao <newyorkupperbay@gmail.com> Co-authored-by: Zhedong Cen <cenzhedong2@126.com> Co-authored-by: TensorNull <129579691+TensorNull@users.noreply.github.com> Co-authored-by: TensorNull <tensor.null@gmail.com> Co-authored-by: TeslaZY <TeslaZY@outlook.com> Co-authored-by: Ajay <160579663+aybanda@users.noreply.github.com> Co-authored-by: AB <aj@Ajays-MacBook-Air.local> Co-authored-by: 天海蒼灆 <huangaoqin@tecpie.com> Co-authored-by: He Wang <wanghechn@qq.com> Co-authored-by: Atsushi Hatakeyama <atu729@icloud.com> Co-authored-by: Jin Hai <haijin.chn@gmail.com> Co-authored-by: Mohamed Mathari <155896313+melmathari@users.noreply.github.com> Co-authored-by: Mohamed Mathari <nocodeventure@Mac-mini-van-Mohamed.fritz.box> Co-authored-by: Stephen Hu <stephenhu@seismic.com> Co-authored-by: Shaun Zhang <zhangwfjh@users.noreply.github.com> Co-authored-by: zhimeng123 <60221886+zhimeng123@users.noreply.github.com> Co-authored-by: mxc <mxc@example.com> Co-authored-by: Dominik Novotný <50611433+SgtMarmite@users.noreply.github.com> Co-authored-by: EVGENY M <168018528+rjohny55@users.noreply.github.com> Co-authored-by: mcoder6425 <mcoder64@gmail.com> Co-authored-by: lemsn <lemsn@msn.com> Co-authored-by: lemsn <lemsn@126.com> Co-authored-by: Adrian Gora <47756404+adagora@users.noreply.github.com> Co-authored-by: Womsxd <45663319+Womsxd@users.noreply.github.com> Co-authored-by: FatMii <39074672+FatMii@users.noreply.github.com>	2025-10-09 12:36:19 +08:00
DeerAPI	dfc5fa1f4d	Feat: add DeerAPI support (#10303 ) ### Related issues #10078 ### What problem does this PR solve? Integrate DeerAPI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update Co-authored-by: DeerAPI <tensor.null@gmail.com>	2025-10-09 11:14:49 +08:00
TensorNull	ef59c5bab9	FIX: Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and CometAPISeq2txt, and correct supported_models.mdx. (#10298 ) ### What problem does this PR solve? Rename the CometEmbed and CometSeq2txt classes to CometAPIEmbed and CometAPISeq2txt, and correct supported_models.mdx. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-26 10:50:56 +08:00
Stephen Hu	94dbd4aac9	Refactor: use the same implement for total token count from res (#10197 ) ### What problem does this PR solve? use the same implement for total token count from res ### Type of change - [x] Refactoring	2025-09-22 17:17:06 +08:00
buua436	91b609447d	Fix: embedding model failure in CometAPI (#10137 ) ### What problem does this PR solve? Related PR: Feat: add CometAPI to LLMFactory and update related mappings #10119 Change: Fixes the issue where the embedding model in CometAPI was not being called correctly ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: TensorNull <tensor.null@gmail.com>	2025-09-18 14:49:47 +08:00
TensorNull	f12b9fdcd4	Feat: add CometAPI to LLMFactory and update related mappings (#10119 ) ### Related issues #10078 ### What problem does this PR solve? Integrate CometAPI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2025-09-18 09:51:29 +08:00
Lynn	3d39b96c6f	Fix: token num exceed (#10046 ) ### What problem does this PR solve? fix text input exceed token num limit when using siliconflow's embedding model BAAI/bge-large-zh-v1.5 and BAAI/bge-large-en-v1.5, truncate before input. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-09-11 12:02:12 +08:00
Yongteng Lei	0d9c1f1c3c	Feat: dataflow supports Spreadsheet and Word processor document (#9996 ) ### What problem does this PR solve? Dataflow supports Spreadsheet and Word processor document ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-09-10 13:02:53 +08:00
Haiyue Wang	2e00d8d3d4	Use 'float' explicitly for OpenAI's embedding "encoding_format" (#9838 ) ### What problem does this PR solve? The default value for OpenAI '/v1/embeddings' parameter 'encoding_format' is 'base64'. Use 'float' explicitly to avoid base64 encoding & decoding, larger data size. https://github.com/openai/openai-python/blob/main/src/openai/resources/embeddings.py if not is_given(encoding_format): params["encoding_format"] = "base64" ### Type of change - [x] Performance Improvement	2025-09-02 10:31:51 +08:00
so95	35539092d0	Add kwargs to model base class constructors (#9252 ) Updated constructors for base and derived classes in chat, embedding, rerank, sequence2txt, and tts models to accept kwargs. This change improves extensibility and allows passing additional parameters without breaking existing interfaces. - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: IT: Sop.Son <sop.son@feavn.local> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-07 09:45:37 +08:00
JI4JUN	aeaeb169e4	Feat/support 302ai provider (#8742 ) ### What problem does this PR solve? Support 302.AI provider. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-31 14:48:30 +08:00
Stephen Hu	20b4d88098	Refactor: Improve the try catch logic for XinferenceEmbed (#9128 ) ### What problem does this PR solve? Improve the try catch logic for XinferenceEmbed ### Type of change - [x] Refactoring	2025-07-31 12:14:50 +08:00
謝富祥	021e8b57ae	Fix: fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model (#9106 ) ### What problem does this PR solve? fix error 429 api rate limit when building knowledge graph for all chat model and Mistral embedding model. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-30 11:37:49 +08:00
Stephen Hu	ba563f8095	Update embedding_model.py (#9083 ) ### What problem does this PR solve? Reduce the logic scope for DefaultEmbedding ### Type of change - [x] Refactoring	2025-07-30 09:44:30 +08:00
Stephen Hu	86b4da0844	Refactor: Remove Useless split for BedrockEmbed (#9067 ) ### What problem does this PR solve? Remove Useless split for BedrockEmbed ### Type of change - [x] Refactoring	2025-07-28 10:16:38 +08:00
Stephen Hu	53b0b0e583	get keep alive from env (#9039 ) ### What problem does this PR solve? get keepalive from env ### Type of change - [x] Refactoring	2025-07-25 12:16:33 +08:00
Yongteng Lei	a2f73af1a4	Fix: typo Bearer token (#8998 ) ### What problem does this PR solve? Typo Bearer token. #8960 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-23 18:10:51 +08:00
Yongteng Lei	7ebc1f0943	Feat: add model provider DeepInfra (#9003 ) ### What problem does this PR solve? Add model provider DeepInfra. This model list comes from our community. NOTE: most endpoints haven't been tested, but they should work as OpenAI does. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-07-23 18:10:35 +08:00
Stephen Hu	ec21d9a98f	Refactor:remove use less convert for FastEmbed (#8984 ) ### What problem does this PR solve? remove use less convert for FastEmbed ### Type of change - [x] Refactoring	2025-07-23 10:51:48 +08:00
Stephen Hu	5fa6f2f151	Update embedding_model.py (#8836 ) ### What problem does this PR solve? Remove useless covert for bge encode_queries ### Type of change - [x] Performance Improvement	2025-07-15 14:04:58 +08:00
Stephen Hu	5383e254c4	Perf:Remove Useless Convert When BGE Embedding (#8816 ) ### What problem does this PR solve? FlagModel internal support returns as numpy ### Type of change - [x] Performance Improvement	2025-07-14 14:02:48 +08:00
Stephen Hu	8d027813f5	Refactor: Improve How To Handle QWenEmbed (#8765 ) ### What problem does this PR solve? Based on https://github.com/infiniflow/ragflow/issues/8740 1. A better handle for 'NoneType' object is not subscriptable 2. Add some logs to get the internal message ### Type of change - [x] Refactoring	2025-07-10 10:30:18 +08:00
Stephen Hu	19419281c3	Fix: Change Ollama Embedding Keep Alive (#8734 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8733 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-09 12:17:26 +08:00
Stephen Hu	e60ec0a31b	Fix:disallowed special token while embedding (#8692 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/8567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-07-07 14:13:37 +08:00
6607changchun	9580e99650	fix: retry embedding with Qwen family models when limits temporarily reached. (#8690 ) fix: retry embedding with Qwen family models when limits temporarily reached. APIs of Qwen family models are limited by calling rates. When reached, the "output" attribute of the "resp" will be None, and in turn cause TypeError when trying to retrieve "embeddings". Since these limits are almost temporary, I have added a simple retry mechanism to avoid it. Besides, if retry_max reached, the error can be early raised, instead of hidden behind "TypeError". ### What problem does this PR solve? Sometimes Qwen blocks calling due to rate limits, but it will cause the whole parsing procedure stops when creating knowledge base. In this situation, resp["output"] will be None, and resp["output"]["embeddings"] will cause TypeError. Since the limits are temporary, I apply a simple retry mechanism to solve it. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-07-07 12:15:52 +08:00

1 2 3 4

156 Commits