2024-09-29 16:28:07 +08:00
#!/bin/bash
Change launch backend script to handle errors gracefully (#3334)
### What problem does this PR solve?
The `launch_backend_service.sh` script enters infinite loops for both
the task executors and the backend server. When an error occurs in any
of these processes, the script continuously restarts them without
properly handling termination signals. This behavior causes the script
to even ignore interrupts, leading to persistent error messages and
making it difficult to exit the script gracefully.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Explanation of Modifications
1. **Signal Trapping with `trap`:**
- The `trap cleanup SIGINT SIGTERM` line ensures that when a `SIGINT` or
`SIGTERM` signal is received, the cleanup function is invoked.
- The `cleanup` function sets the `STOP` flag to `true`, iterates
through all child process IDs stored in the `PIDS` array, and sends a
`kill` signal to each process to terminate them gracefully.
2. **Retry Limits:**
- Introduced a `MAX_RETRIES` variable to limit the number of restart
attempts for both `task_executor.py` and `ragflow_server.py`.
- The loops now check if the retry count has reached the maximum limit.
If so, they invoke the `cleanup` function to terminate all processes and
exit the script.
3. **Process Tracking with `PIDS` Array:**
- After launching each background process (`task_exe` and `run_server`),
their Process IDs (PIDs) are stored in the `PIDS` array.
- This allows the `cleanup` function to terminate all child processes
effectively when needed.
4. **Graceful Shutdown:**
- When the `cleanup` function is called, it iterates over all child PIDs
and sends a termination signal (`kill`) to each, ensuring that all
subprocesses are stopped before the script exits.
5. **Logging Enhancements:**
- Added `echo` statements to provide clearer logs about the state of
each process, including attempts, successes, failures, and retries.
6. **Exit on Successful Completion:**
- If `ragflow_server.py` or a `task_executor.py` process exits with a
success code (0), the loop breaks, preventing unnecessary retries.
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
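
The pattern described in the modifications above (a `trap` handler, a bounded retry loop, and a `PIDS` array) can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the script's actual code: `run_with_retries`, `flaky_job`, `ATTEMPTS`, and `RESULT` are hypothetical names, `MAX_RETRIES` is lowered to 3, and `cleanup` here omits the final `exit 0` so the demonstration can continue after it is called.

```shell
#!/bin/bash
# Illustrative sketch only; helper names below are hypothetical.

MAX_RETRIES=3   # retry budget (lowered for the demo)
STOP=false      # set to true once a shutdown has been requested
PIDS=()         # PIDs of background children to terminate on shutdown

# Set the stop flag and send SIGTERM to every tracked child still alive.
cleanup() {
    STOP=true
    for pid in "${PIDS[@]}"; do
        if kill -0 "$pid" 2>/dev/null; then
            kill "$pid"
        fi
    done
}
trap cleanup SIGINT SIGTERM

# Run a command until it succeeds, the retry budget is spent, or STOP is set.
run_with_retries() {
    local retry_count=0
    while ! $STOP && [ "$retry_count" -lt "$MAX_RETRIES" ]; do
        if "$@"; then
            return 0
        fi
        retry_count=$((retry_count + 1))
    done
    return 1
}

# Stand-in for a service process: fails twice, then succeeds.
ATTEMPTS=0
flaky_job() {
    ATTEMPTS=$((ATTEMPTS + 1))
    [ "$ATTEMPTS" -ge 3 ]
}

run_with_retries flaky_job
RESULT=$?
echo "result=$RESULT attempts=$ATTEMPTS"

# Track a background child, then simulate a shutdown request.
sleep 30 &
PIDS+=("$!")
cleanup
wait "${PIDS[0]}" 2>/dev/null || true   # reap the terminated child
echo "stop=$STOP"
```

Testing the command directly in the `if` condition keeps the loop running even when the script uses `set -e`, because a failing command in a condition does not trigger an immediate exit.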
2024-11-12 12:51:38 +05:00
# Exit immediately if a command exits with a non-zero status
set -e

usage() {
    local exit_code=${1:-1}
    echo "Usage: $0 [ragflow|task_executor|admin|data_sync]..."
    echo
    echo "Without arguments, starts ragflow and task_executor."
    echo "Available service types:"
    echo " ragflow Start RAGFlow server based on API_PROXY_SCHEME"
    echo " task_executor Start rag/svr/task_executor.py workers"
    echo " admin Start Admin server based on API_PROXY_SCHEME"
    echo " data_sync Start rag/svr/sync_data_source.py"
    echo
    echo "Examples:"
    echo " $0"
    echo " $0 ragflow"
    echo " $0 task_executor"
    echo " $0 admin"
    echo " $0 data_sync"
    exit "$exit_code"
}

# Function to load environment variables from .env file
load_env_file() {
    # Get the directory of the current script
    local script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
    local env_file="$script_dir/.env"

    # Check if .env file exists
    if [ -f "$env_file" ]; then
        echo "Loading environment variables from: $env_file"
        # Source the .env file
        set -a
        source "$env_file"
        set +a
    else
        echo "Warning: .env file not found at: $env_file"
    fi
}

# Load environment variables
load_env_file

# Unset HTTP proxies that might be set by Docker daemon
export http_proxy=""; export https_proxy=""; export no_proxy=""; export HTTP_PROXY=""; export HTTPS_PROXY=""; export NO_PROXY=""
export PYTHONPATH=$(pwd)

export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/
JEMALLOC_PATH=$(pkg-config --variable=libdir jemalloc)/libjemalloc.so

PY=python3

# Set default number of workers if WS is not set or less than 1
if [[ -z "$WS" || $WS -lt 1 ]]; then
    WS=1
fi

# Maximum number of retries for each task executor and server
MAX_RETRIES=5

# Flag to control termination
STOP=false

# Array to keep track of child PIDs
PIDS=()

# Set the path to the NLTK data directory
export NLTK_DATA="./nltk_data"

# Function to handle termination signals
cleanup() {
    echo "Termination signal received. Shutting down..."
    STOP=true
    # Terminate all child processes
    for pid in "${PIDS[@]}"; do
        if kill -0 "$pid" 2>/dev/null; then
            echo "Killing process $pid"
            kill "$pid"
        fi
    done
    exit 0
}

# Trap SIGINT and SIGTERM to invoke cleanup
trap cleanup SIGINT SIGTERM

# Function to execute task_executor with retry logic
task_exe(){
    local task_id=$1
    local retry_count=0
    while ! $STOP && [ $retry_count -lt $MAX_RETRIES ]; do
        echo "Starting task_executor.py for task $task_id (Attempt $((retry_count+1)))"
        LD_PRELOAD=$JEMALLOC_PATH $PY rag/svr/task_executor.py -i "$task_id"
        EXIT_CODE=$?
        if [ $EXIT_CODE -eq 0 ]; then
            echo "task_executor.py for task $task_id exited successfully."
            break
        else
            echo "task_executor.py for task $task_id failed with exit code $EXIT_CODE. Retrying..." >&2
            retry_count=$((retry_count + 1))
            sleep 2
        fi
    done
    if [ $retry_count -ge $MAX_RETRIES ]; then
        echo "task_executor.py for task $task_id failed after $MAX_RETRIES attempts. Exiting..." >&2
        cleanup
    fi
}

# Function to execute ragflow_server with retry logic
run_server(){
    local server_name="ragflow_server.py"
    local -a server_cmd=("$PY" "api/ragflow_server.py")
    if [[ "${API_PROXY_SCHEME}" == "go" ]]; then
        prepare_for_go
        server_name="ragflow_server"
        server_cmd=("bin/ragflow_server")
    fi
    local retry_count=0
    while ! $STOP && [ $retry_count -lt $MAX_RETRIES ]; do
        echo "Starting $server_name (Attempt $((retry_count+1)))"
        "${server_cmd[@]}"
        EXIT_CODE=$?
        if [ $EXIT_CODE -eq 0 ]; then
            echo "$server_name exited successfully."
            break
        else
            echo "$server_name failed with exit code $EXIT_CODE. Retrying..." >&2
            retry_count=$((retry_count + 1))
            sleep 2
        fi
    done

    if [ $retry_count -ge $MAX_RETRIES ]; then
        echo "$server_name failed after $MAX_RETRIES attempts. Exiting..." >&2
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-11-12 12:51:38 +05:00
|
|
|
cleanup
|
|
|
|
|
fi
|
|
|
|
|
}
|
|
|
|
|
|
2026-06-16 11:53:13 +08:00
|
|
|
# Function to execute admin_server with retry logic
run_admin_server(){
    local server_name="admin_server.py"
    local -a server_cmd=("$PY" "admin/server/admin_server.py")
    if [[ "${API_PROXY_SCHEME}" == "go" ]]; then
        prepare_for_go
        server_name="admin_server"
        server_cmd=("bin/admin_server")
    fi
    local retry_count=0
    while ! $STOP && [ $retry_count -lt $MAX_RETRIES ]; do
        echo "Starting $server_name (Attempt $((retry_count+1)))"
        "${server_cmd[@]}"
        EXIT_CODE=$?
        if [ $EXIT_CODE -eq 0 ]; then
            echo "$server_name exited successfully."
            break
        else
            echo "$server_name failed with exit code $EXIT_CODE. Retrying..." >&2
            retry_count=$((retry_count + 1))
            sleep 2
        fi
    done
    if [ $retry_count -ge $MAX_RETRIES ]; then
        echo "$server_name failed after $MAX_RETRIES attempts. Exiting..." >&2
        cleanup
    fi
}

# Function to execute sync_data_source with retry logic
run_data_sync(){
    local retry_count=0
    while ! $STOP && [ $retry_count -lt $MAX_RETRIES ]; do
        echo "Starting sync_data_source.py (Attempt $((retry_count+1)))"
        $PY rag/svr/sync_data_source.py
        EXIT_CODE=$?
        if [ $EXIT_CODE -eq 0 ]; then
            echo "sync_data_source.py exited successfully."
            break
        else
            echo "sync_data_source.py failed with exit code $EXIT_CODE. Retrying..." >&2
            retry_count=$((retry_count + 1))
            sleep 2
        fi
    done

    if [ $retry_count -ge $MAX_RETRIES ]; then
        echo "sync_data_source.py failed after $MAX_RETRIES attempts. Exiting..." >&2
        cleanup
    fi
}

ensure_db_init() {
    echo "Initializing database tables..."
    "$PY" -c "from api.db.db_models import init_database_tables as init_web_db; init_web_db()"
    echo "Database tables initialized."
}

run_mysql_migrations() {
    echo "Running model provider table migrations..."
    "$PY" tools/scripts/mysql_migration.py \
        --stages tenant_model_provider,tenant_model_instance,tenant_model,model_id_config \
        --config conf/service_conf.yaml \
        --execute \
        --database-version "v0.26.1" \
        --mark-database-version-on-success
    echo "Model provider table migrations completed."
}

prepare_for_go() {
    if [ -d /usr/share/infinity/resource ]; then
        echo "Resource directory already exists. Skipping preparation."
        return
    fi
    mkdir -p /usr/share/infinity/resource
    if [ "$NEED_MIRROR" == "1" ]; then
        git clone --depth 1 --single-branch https://gitee.com/infiniflow/resource /tmp/resource;
    else
        git clone --depth 1 --single-branch https://github.com/infiniflow/resource.git /tmp/resource;
    fi
    cp -r /tmp/resource/* /usr/share/infinity/resource
    rm -rf /tmp/resource
}

START_RAGFLOW=0
START_TASK_EXECUTOR=0
START_ADMIN=0
START_DATA_SYNC=0

if [ $# -eq 0 ]; then
    START_RAGFLOW=1
    START_TASK_EXECUTOR=1
fi

for arg in "$@"; do
    case $arg in
        ragflow|server|webserver)
            START_RAGFLOW=1
            ;;
        task_executor|task-executor|taskexecutor)
            START_TASK_EXECUTOR=1
            ;;
        admin|admin_server|admin-server)
            START_ADMIN=1
            ;;
        data_sync|data-sync|datasync)
            START_DATA_SYNC=1
            ;;
        all)
            START_RAGFLOW=1
            START_TASK_EXECUTOR=1
            START_ADMIN=1
            START_DATA_SYNC=1
            ;;
        -h|--help)
            usage 0
            ;;
        *)
            echo "Unknown service type: $arg" >&2
            usage
            ;;
    esac
done

if [[ "$START_RAGFLOW" -eq 1 ]]; then
    ensure_db_init
    run_mysql_migrations
fi

# Start task executors
if [[ "$START_TASK_EXECUTOR" -eq 1 ]]; then
    for ((i=0;i<WS;i++))
    do
        task_exe "$i" &
        PIDS+=($!)
    done
fi

# Start the RAGFlow server
if [[ "$START_RAGFLOW" -eq 1 ]]; then
    run_server &
    PIDS+=($!)
fi

# Start the Admin server
if [[ "$START_ADMIN" -eq 1 ]]; then
    run_admin_server &
    PIDS+=($!)
fi

# Start the data sync server
if [[ "$START_DATA_SYNC" -eq 1 ]]; then
    run_data_sync &
    PIDS+=($!)
fi

# Wait for all background processes to finish
wait