diff --git a/README.md b/README.md
index c59e70f9a4a3c7760acc3b42faaa0b4bea119bf8..382d33225bc71b320b96a4ffe5b9b9f8182aa6e4 100644
--- a/README.md
+++ b/README.md
@@ -4,74 +4,73 @@ ModelGate logo

-ModelGate is a FastAPI-based LLM gateway for multi-provider routing, API key management, request logging, and dashboard monitoring.
-
-## Highlights
-
-- Multi-provider routing: Zhipu, DeepSeek, Ollama, Minimax, and any OpenAI-compatible API
-- OpenAI-compatible proxy endpoints: `/v1/chat/completions`, `/v1/embeddings`, `/v1/models`
-- Layered concurrency control: API key model limit -> provider key limit with per-key semaphore
-- Provider multi-key support with sticky routing and key-level disable/reenable
-- Auto-disable provider/key on usage limit errors, auto-reenable on scheduled task
-- API key management with per-key model access control
-- Streaming request lifecycle tracking: `pending` -> `success` / `error` / `timeout`
-- Upstream and downstream status code logging
-- MCP proxy: proxy remote MCP servers with API key binding, admin UI, tool sync, logging, and stats
-- AI-powered daily error analysis with persisted reports
-- AI-powered model recommendations and timing advice for users
-- AI-powered usage report generation (DOCX export with stats, trends, and fun awards)
-- API key time-based access rules (time windows, date ranges, weekday restrictions)
-- Document sharing for admin and user portal
-- User portal: personal stats, health score, recommendations, OpenCode config export
-- OpenCode integration: auto-generated config with per-model context/output limits
-- WeChat iLink Bot integration via MCP (QR login, auto-reply, message persistence)
-- MinIO integration for file storage
-- English / Chinese i18n with Babel
-- Desktop and mobile admin UI with dark/light theme
-- Localized static assets (no CDN dependencies)
-- Reverse proxy support via configurable base path
-- Docker Compose with Nginx reverse proxy and static file serving
-- Daily stats aggregation and 30-day log archiving
-
-## Screenshots
-
-### Admin Dashboard
+ModelGate 是一个基于 FastAPI 的多提供商路由、API 密钥管理、请求日志记录和仪表盘监控的 LLM 网关。
+
+## 功能亮点
+
+- 多提供商路由:支持 Zhipu、DeepSeek、Ollama、Minimax 和任何兼容 OpenAI 的 API
+- 兼容 OpenAI 的代理端点:`/v1/chat/completions`, `/v1/embeddings`, `/v1/models`
+- 分层并发控制:API 密钥模型限制 -> 提供商密钥限制(每个密钥信号量)
+- 提供商多密钥支持,具有粘性路由和密钥级启用/禁用功能
+- 出现使用限制错误时自动禁用提供商/密钥,定时任务自动重新启用
+- API 密钥管理,每个密钥模型访问控制
+- 流式请求生命周期追踪:`pending` -> `success` / `error` / `timeout`
+- 上游和下游状态码日志记录
+- MCP 代理:代理远程 MCP 服务器,绑定 API 密钥,管理后台,工具同步,日志记录和统计
+- AI 驱动的每日错误分析,报告持久化
+- AI 驱动的模型推荐和用户时间建议
+- AI 驱动的使用报告生成(DOCX 格式,包含统计、趋势和趣味奖项)
+- API 密钥基于时间的访问规则(时间窗口、日期范围、工作日限制)
+- 管理员和用户门户的文档共享
+- 用户门户:个人统计、健康评分、推荐、OpenCode 配置导出
+- OpenCode 集成:自动生成配置,包含每个模型的上下文/输出限制
+- 微信 iLink 机器人集成(二维码登录、自动回复、消息持久化)
+- MinIO 文件存储集成
+- 英文 / 中文多语言支持(Babel)
+- 桌面和移动端管理界面,支持深色/浅色主题
+- 本地静态资源(无 CDN 依赖)
+- 可配置基础路径的反向代理支持
+- Docker Compose 集成 Nginx 反向代理和静态文件服务
+- 每日统计汇总和30天日志归档
+
+## 截图
+
+### 管理员仪表盘

 ![Admin Dashboard](image/admin-dashboard.png)

-### Admin Monitor
+### 管理员监控页

 ![Admin Monitor](image/admin-monitor.png)

-### User Dashboard
+### 用户仪表盘

 ![User Dashboard](image/user-dashboard.png)

-### User Report
+### 用户报告

 ![User Report](image/user-report.png)

-
-### Mobile Dashboard
+### 移动端仪表盘

 ![Mobile Dashboard](image/mobile-dashboard.png)

-## Quick Start
+## 快速开始

 ```bash
 pip install -r requirements.txt
 python main.py
 ```

-Default local addresses:
+默认本地地址:

-- Server: `http://localhost:8765`
-- Admin: `http://localhost:8765/admin/home`
-- User portal: `http://localhost:8765/user/login`
+- 服务端:`http://localhost:8765`
+- 管理端:`http://localhost:8765/admin/home`
+- 用户门户:`http://localhost:8765/user/login`

-Windows helper: `start.bat` prompts for log level and restarts the service on port 8765.
+Windows 辅助工具:`start.bat` 提示输入日志级别,并在 8765 端口重启服务。

-## Docker
+## Docker 部署

 ### Docker Run

@@ -93,202 +92,202 @@ docker run -d --name modelgate \

 ### Docker Compose

-The repository includes a `docker-compose.yml` with ModelGate + Nginx services. Nginx handles static file serving and reverse proxying with WebSocket support.
+仓库中包含一个 `docker-compose.yml`,集成 ModelGate 和 Nginx 服务。Nginx 处理静态文件服务和反向代理,支持 WebSocket。

 ```bash
 docker compose up -d
 ```

-See [DEPLOY.md](DEPLOY.md) for full deployment instructions.
+详细部署说明请查看 [DEPLOY.md](DEPLOY.md)。

-## Environment Variables
+## 环境变量

-| Variable | Required | Description |
+| 变量 | 是否必需 | 描述 |
 |----------|----------|-------------|
-| `DATABASE_URL` | Yes | PostgreSQL connection string |
-| `PORT` | No | Service port, default `8765` |
-| `ADMIN_USERS` | Recommended | Admin accounts, format: `user:pass,user:pass` |
-| `ADMIN_USERNAME` | No | Fallback admin username |
-| `ADMIN_PASSWORD` | No | Fallback admin password |
-| `LOG_LEVEL` | No | `DEBUG`, `INFO`, `WARNING`, `ERROR` |
-| `MINIO_ENDPOINT` | No | MinIO endpoint, default `localhost:9000` |
-| `MINIO_ACCESS_KEY` | No | MinIO access key |
-| `MINIO_SECRET_KEY` | No | MinIO secret key |
-| `MINIO_BUCKET` | No | MinIO bucket name, default `modelgate` |
-| `MINIO_SECURE` | No | Use HTTPS for MinIO, default `false` |
-| `ICP_NUMBER` | No | ICP filing number shown on landing page |
-
-## Database
+| `DATABASE_URL` | 是 | PostgreSQL 连接字符串 |
+| `PORT` | 否 | 服务端口,默认 `8765` |
+| `ADMIN_USERS` | 推荐 | 管理员账号,格式:`user:pass,user:pass` |
+| `ADMIN_USERNAME` | 否 | 后备管理员用户名 |
+| `ADMIN_PASSWORD` | 否 | 后备管理员密码 |
+| `LOG_LEVEL` | 否 | `DEBUG`, `INFO`, `WARNING`, `ERROR` |
+| `MINIO_ENDPOINT` | 否 | MinIO 地址,默认 `localhost:9000` |
+| `MINIO_ACCESS_KEY` | 否 | MinIO 访问密钥 |
+| `MINIO_SECRET_KEY` | 否 | MinIO 秘密密钥 |
+| `MINIO_BUCKET` | 否 | MinIO 存储桶名,默认 `modelgate` |
+| `MINIO_SECURE` | 否 | 是否使用 HTTPS,默认 `false` |
+| `ICP_NUMBER` | 否 | ICP 备案号(显示在首页) |
+
+## 数据库

 ```sql
 CREATE USER "modelgate" WITH PASSWORD 'your_password';
 CREATE DATABASE "modelgate" OWNER "modelgate";
 ```

-Schema: [`schema.sql`](schema.sql)
+数据库结构详见 [`schema.sql`](schema.sql)。

-The app performs runtime compatibility migrations on startup (e.g., adding new columns to `request_logs`).
+应用在启动时会自动执行兼容性迁移(如在 `request_logs` 表中添加新列)。

-## API
+## API 接口

-### OpenAI-compatible Endpoints
+### 兼容 OpenAI 的端点

-- `POST /v1/chat/completions` - Chat completions (streaming and non-streaming)
-- `POST /v1/embeddings` - Text embeddings
-- `GET /v1/models` - List available models
+- `POST /v1/chat/completions` - 聊天补全(支持流式和非流式)
+- `POST /v1/embeddings` - 文本嵌入
+- `GET /v1/models` - 列出可用模型

-### Model Naming
+### 模型命名格式

 ```text
 provider/model
 ```

-Examples: `zhipu/glm-4`, `deepseek/chat`, `minimax/MiniMax-M2.5`
+示例:`zhipu/glm-4`, `deepseek/chat`, `minimax/MiniMax-M2.5`

-## Dashboards
+## 仪表盘

-### Admin
+### 管理端

-- `/admin/home` - Overview, realtime stats, slow requests, trends
-- `/admin/config` - Provider, model, and binding configuration
-- `/admin/api-keys` - API key management and per-key model access
-- `/admin/monitor` - Composition, hotspots, response-time analysis
-- `/admin/errors` - Daily error log viewer with AI-powered analysis reports
-- `/admin/reports` - AI-powered usage report generation and DOCX download
-- `/admin/system-config` - Outbound User-Agent management and UA stats
-- `/admin/usage` - Client configuration examples and setup guides
-- `/admin/m` - Mobile admin dashboard
+- `/admin/home` - 概览,实时统计,慢请求,趋势
+- `/admin/config` - 提供商、模型和绑定配置
+- `/admin/api-keys` - API 密钥管理以及每个密钥的统计/日志
+- `/admin/monitor` - 组成分析、热点、响应时间分析
+- `/admin/errors` - 每日错误日志查看器和 AI 分析报告
+- `/admin/reports` - AI 生成使用报告和 DOCX 下载
+- `/admin/system-config` - 出站 User-Agent 管理和 UA 统计
+- `/admin/usage` - 客户端配置示例和设置指南
+- `/admin/m` - 移动端管理仪表盘

-### User Portal
+### 用户门户

-API key holders log in at `/user/login` to access:
+API 密钥持有者可通过 `/user/login` 登录访问:

-- Personal request and token statistics (day/week/month)
-- 20-minute system health score (error rate, latency, load, active users)
-- AI-powered model recommendations with scored reasons
-- AI-generated timing advice based on hourly usage patterns
-- Active session tracking
-- Model catalog with context/output limits and multimodal info
-- OpenCode configuration export (`/opencode/setup.md?api_key=...`)
+- 个人请求和 token 统计(日/周/月)
+- 20 分钟系统健康评分(错误率、延迟、负载、活跃用户)
+- AI 驱动的模型推荐(带评分原因)
+- AI 生成的使用时间建议(基于每小时使用模式)
+- 活动会话追踪
+- 模型目录(上下文/输出限制、多模态信息)
+- OpenCode 配置导出(`/opencode/setup.md?api_key=...`)

-## API Key Time-Based Access Rules
+## API 密钥时段访问规则

-API keys can be restricted by time of day, date ranges, and weekdays. Rules are validated on every request:
+API 密钥可以基于时间窗口、日期范围和工作日进行限制。每条规则在每次请求时都会进行验证:

-- **Time windows** — `start_time` / `end_time` (e.g., only allow 09:00–18:00)
-- **Date ranges** — `start_date` / `end_date`
-- **Weekday filters** — restrict to specific days of the week
-- **Allow/deny semantics** — explicit `allowed` flag per rule
+- **时间窗口** — `start_time` / `end_time`(例如仅允许 09:00–18:00)
+- **日期范围** — `start_date` / `end_date`
+- **工作日限制** — 限制特定星期几
+- **允许/拒绝语义** — 每条规则具有 `allowed` 标志

-## WeChat iLink Bot (MCP)
+## 微信 iLink 机器人(MCP)

-ModelGate includes an MCP (Model Context Protocol) server for WeChat iLink Bot integration at `/weixin`:
+ModelGate 包含微信 iLink 机器人集成的 MCP(模型上下文协议)服务器,位于 `/weixin`:

-- QR code login flow
-- Message polling, sending, and auto-reply via internal LLM proxy
-- Message persistence to database
-- Per-user context threading for conversations
-- See [docs/guides/weixin-mcp.md](docs/guides/weixin-mcp.md) for setup instructions
+- 二维码登录流程
+- 消息轮询、发送和自动回复
+- 消息持久化到数据库
+- 每个用户的上下文线程
+- 详见 [docs/guides/weixin-mcp.md](docs/guides/weixin-mcp.md)

-## Request Logging
+## 请求日志

-`request_logs` stores: API key, provider, model, tokens, latency, status, upstream/downstream HTTP status codes, client IP, user agent, and error details.
+`request_logs` 表记录 API 密钥、提供商、模型、token 数量、延迟、状态、上游/下游 HTTP 状态码、客户端 IP、User-Agent 和错误详情。

-Streaming requests are inserted as `pending` first, then updated to `success`, `error`, `timeout`, or `cancelled`.
+流式请求首先以 `pending` 状态插入,后续更新为 `success`、`error`、`timeout` 或 `cancelled`。

-Logs older than 30 days are automatically archived to `request_logs_history`. A `request_logs_all` view unions both tables for transparent querying.
+超过 30 天的日志自动归档至 `request_logs_history`,`request_logs_all` 视图联合两张表,透明查询。

-## Concurrency Control
+## 并发控制

-Three-layer semaphore-based rate control:
+三层信号量并发控制:

-1. **API key model limit** — per (api_key, model) concurrency cap
-2. **Provider key limit** — per provider key with configurable max_concurrent
-3. **System-level limit** — global concurrency with `local_rate_limited` rejection when exceeded
+1. **API 密钥模型限制** — 每个 (api_key, model) 的最大并发
+2. **提供商密钥限制** — 每个提供商密钥,可配置 max_concurrent
+3. **系统级限制** — 全局限制,超过则返回 `local_rate_limited`

-Provider keys support sticky routing (requests from the same API key route to the same provider key).
+提供商密钥支持粘性路由(来自同一 API 密钥的请求路由到同一提供商密钥)。

-## Provider Auto-Disable & Reenable
+## 提供商自动启用/禁用

-- When a usage limit error is detected (quota exceeded, billing deactivated, etc.), the provider or provider key is automatically disabled with a reason
-- Disabled state is shown in admin dashboard with warning icons and error details returned to the client
-- A scheduled task reenables all disabled providers and keys every 5 minutes
-- Manual reset available in admin config page
+- 检测到使用限制错误时(配额已满、账单失效等),提供商或提供商密钥自动禁用,并附原因
+- 禁用状态在管理仪表盘显示警告图标,客户端返回错误详情
+- 定时任务每 5 分钟重新启用一次所有被禁用的提供商和密钥
+- 支持在管理配置页手动重置

-## Scheduled Tasks
+## 定时任务

-| Task | Schedule | Description |
+| 任务 | 时间间隔 | 描述 |
 |------|----------|-------------|
-| Auto-reenable | Every 5 minutes | Reenable disabled provider keys and providers |
-| Timeout cleanup | Every 10 minutes | Mark stale pending requests (>10 min) as `timeout` |
-| Daily aggregation | 00:05 | Aggregate request counts into daily/hourly stats tables |
-| MCP stats aggregation | 00:10 | Aggregate MCP tool usage stats |
-| Log archival | 00:20 | Archive request logs older than 30 days |
-| Recommendation analysis | 08:00 | Daily AI-powered model recommendation analysis |
+| 自动启用 | 每 5 分钟 | 重新启用被禁用的提供商密钥和提供商 |
+| 超时清理 | 每 10 分钟 | 标记超过 10 分钟的待处理请求为 `timeout` |
+| 每日统计聚合 | 00:05 | 聚合请求计数到每日/每小时统计表 |
+| MCP 统计聚合 | 00:10 | 聚合 MCP 工具使用统计 |
+| 日志归档 | 00:20 | 归档超过 30 天的请求日志 |
+| 推荐分析 | 08:00 | 每日 AI 驱动模型推荐分析 |

-## Project Structure
+## 项目结构

 ```text
 modelgate/
-├── main.py # App init, middleware, routers, exception handlers
+├── main.py # 应用初始化、中间件、路由、异常处理
 ├── core/
-│ ├── config.py # Logging, caches, stats, session management
-│ ├── database.py # SQLAlchemy async engine, all ORM models
-│ ├── deps.py # Auth dependencies
-│ ├── i18n.py # Internationalization
-│ ├── app_paths.py # Base path for reverse proxy
-│ ├── client_ip.py # Multi-header client IP extraction
-│ └── log_sanitizer.py # Sensitive data redaction for logs
+│ ├── config.py # 日志、缓存、统计、会话管理
+│ ├── database.py # SQLAlchemy 异步引擎和 ORM 模型
+│ ├── deps.py # 认证依赖
+│ ├── i18n.py # 多语言支持
+│ ├── app_paths.py # 基础路径(反向代理)
+│ ├── client_ip.py # 多头客户端 IP 提取
+│ └── log_sanitizer.py # 日志中的敏感数据脱敏
 ├── routes/
 │ ├── proxy.py # /v1/chat/completions, /v1/embeddings, /v1/models
-│ ├── auth.py # Admin login/logout
-│ ├── providers.py # Provider CRUD
-│ ├── models.py # Model CRUD
-│ ├── provider_models.py # Provider-model bindings + auto-sync
-│ ├── keys.py # API key CRUD + per-key stats/logs + time rules
-│ ├── stats.py # Statistics, aggregation, live WebSocket
-│ ├── logs.py # Log viewer + AI error analysis
-│ ├── pages.py # Admin HTML pages
-│ ├── user.py # User portal API + pages
-│ ├── opencode.py # OpenCode config generation
-│ ├── reports.py # Usage report generation + DOCX export
-│ ├── system_config.py # System config (outbound UA management)
-│ ├── mcp.py # MCP server CRUD endpoints
-│ └── weixin.py # WeChat MCP server endpoints
+│ ├── auth.py # 管理员登录/登出
+│ ├── providers.py # 提供商 CRUD
+│ ├── models.py # 模型 CRUD
+│ ├── provider_models.py # 提供商-模型绑定 + 自动同步
+│ ├── keys.py # API 密钥 CRUD + 每个密钥的统计/日志 + 时间规则
+│ ├── stats.py # 统计、聚合、实时 WebSocket
+│ ├── logs.py # 日志查看器 + AI 错误分析
+│ ├── pages.py # HTML 页面(管理员)
+│ ├── user.py # 用户门户 API + 页面
+│ ├── opencode.py # OpenCode 配置生成
+│ ├── reports.py # 使用报告生成 + DOCX 导出
+│ ├── system_config.py # 系统配置(出站 UA 管理)
+│ ├── mcp.py # MCP 服务器 CRUD 接口
+│ └── weixin.py # 微信 MCP 服务器接口
 ├── services/
-│ ├── proxy.py # Main proxy logic, streaming, provider dispatch
-│ ├── proxy_runtime/ # Runtime helpers: SSE, MiniMax, message preprocessing
-│ ├── auth.py # API key validation + time-based access rules
-│ ├── provider.py # Provider/model resolution, sticky routing
-│ ├── provider_limiter.py # Provider/key disable, reenable, usage limit detection
-│ ├── scheduler.py # APScheduler tasks
-│ ├── stats_aggregator.py # Daily stats aggregation, archiving
-│ ├── logging.py # Request log CRUD
-│ ├── tokens.py # Token estimation and response parsing
-│ ├── message.py # Message preprocessing (merge, truncate)
-│ ├── minimax.py # MiniMax-specific response/tool_call parsing
-│ ├── sse.py # SSE stream normalization
-│ ├── analysis_store.py # AI analysis task persistence
-│ ├── usage_report.py # DOCX usage report generation
-│ ├── system_config.py # Outbound UA auto-detection
-│ ├── mcp.py # MCP server pool, tool sync, proxy
-│ └── weixin.py # WeChat iLink Bot client
-├── templates/ # Jinja2 HTML (admin/, user/, public/, components/)
-├── nginx/ # nginx.conf for Docker reverse proxy
-├── locales/ # i18n: en, zh
+│ ├── proxy.py # 代理逻辑、流式处理、提供商调度
+│ ├── proxy_runtime/ # 运行时帮助类:SSE、MiniMax、消息预处理
+│ ├── auth.py # API 密钥验证 + 时间规则访问
+│ ├── provider.py # 提供商/模型解析,粘性路由
+│ ├── provider_limiter.py # 提供商/密钥禁用、重新启用、使用限制检测
+│ ├── scheduler.py # APScheduler 任务
+│ ├── stats_aggregator.py # 每日统计聚合、归档
+│ ├── logging.py # 请求日志 CRUD
+│ ├── tokens.py # token 估算和响应解析
+│ ├── message.py # 消息预处理(合并、截断)
+│ ├── minimax.py # MiniMax 特定响应/工具调用解析
+│ ├── sse.py # SSE 流标准化
+│ ├── analysis_store.py # AI 分析任务持久化
+│ ├── usage_report.py # DOCX 使用报告生成
+│ ├── system_config.py # 出站 UA 自动检测
+│ ├── mcp.py # MCP 服务器池、工具同步、代理
+│ └── weixin.py # 微信 iLink 客户端
+├── templates/ # Jinja2 HTML(管理员、用户、公共、组件)
+├── nginx/ # Docker 反向代理的 nginx.conf
+├── locales/ # i18n:en, zh
 ├── schema.sql
 ├── Dockerfile
 └── DEPLOY.md
 ```

-## Development
+## 开发说明

-- Python 3.10+ | FastAPI | SQLAlchemy async | PostgreSQL
-- Lint & format: `ruff check . && ruff format .`
-- Type check: `mypy main.py core/*.py --ignore-missing-imports`
-- i18n compile: `pybabel compile -d locales`
-- Logs: `logs/proxy.log`, `logs/admin.log`, `logs/error.log`
+- Python 3.10+ | FastAPI | SQLAlchemy 异步 | PostgreSQL
+- 代码规范:`ruff check . && ruff format .`
+- 类型检查:`mypy main.py core/*.py --ignore-missing-imports`
+- i18n 编译:`pybabel compile -d locales`
+- 日志:`logs/proxy.log`, `logs/admin.log`, `logs/error.log`

 ## License

-Apache 2.0
+Apache 2.0
\ No newline at end of file
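
The time-based access rules described in the README (time windows, date ranges, weekday filters, and a per-rule `allowed` flag) could be evaluated roughly as in the Python sketch below. Only the field names `start_time`, `end_time`, `start_date`, `end_date`, and `allowed` come from the README. The `TimeRule` dataclass, the `weekdays` field and its encoding, both function names, and the way allow and deny rules combine are assumptions made for illustration, not ModelGate's actual implementation.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class TimeRule:
    # Field names start_time/end_time/start_date/end_date/allowed follow the
    # README; the dataclass itself and `weekdays` are hypothetical.
    allowed: bool = True
    start_time: Optional[time] = None
    end_time: Optional[time] = None
    start_date: Optional[date] = None
    end_date: Optional[date] = None
    weekdays: Optional[FrozenSet[int]] = None  # assumed: 0 = Monday ... 6 = Sunday


def rule_matches(rule: TimeRule, now: datetime) -> bool:
    """Return True when `now` satisfies every constraint the rule defines."""
    today = now.date()
    if rule.start_date is not None and today < rule.start_date:
        return False
    if rule.end_date is not None and today > rule.end_date:
        return False
    if rule.weekdays is not None and now.weekday() not in rule.weekdays:
        return False
    # The time window is applied only when both bounds are present.
    if rule.start_time is not None and rule.end_time is not None:
        current = now.time()
        if rule.start_time <= rule.end_time:
            return rule.start_time <= current <= rule.end_time
        # Window wraps past midnight, e.g. 22:00 to 06:00.
        return current >= rule.start_time or current <= rule.end_time
    return True


def is_request_allowed(rules: Iterable[TimeRule], now: datetime) -> bool:
    """Assumed semantics: a matching deny rule wins; otherwise, if any allow
    rules exist, at least one must match; with no rules, access is open."""
    rules = list(rules)
    matching = [r for r in rules if rule_matches(r, now)]
    if any(not r.allowed for r in matching):
        return False
    if any(r.allowed for r in rules):
        return any(r.allowed for r in matching)
    return True


# Example: allow weekdays between 09:00 and 18:00 only.
office_hours = TimeRule(
    start_time=time(9, 0), end_time=time(18, 0), weekdays=frozenset(range(5))
)
print(is_request_allowed([office_hours], datetime(2024, 1, 3, 10, 30)))
```

Letting a matching deny rule override allow rules is one plausible reading of the "allow/deny semantics" bullet. If the real service resolves conflicts differently, for example by rule order, only `is_request_allowed` would need to change.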