Mag1cFall
diff --git a/README.md b/README.md
Lines changed: 5 additions & 0 deletions
diff --git a/README_en.md b/README_en.md
Lines changed: 6 additions & 0 deletions
diff --git a/docs/api-usage.md b/docs/api-usage.md
Lines changed: 16 additions & 0 deletions
diff --git a/docs/development-guide.md b/docs/development-guide.md
Lines changed: 7 additions & 0 deletions
@@ -24,6 +24,8 @@
 
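 - **OpenAI-compatible API**: Fully compatible with the OpenAI-format `/v1/chat/completions` endpoint
 - **TTS speech generation**: Supports single- and multi-speaker audio generation with Gemini 2.5 TTS models
+- **Image generation**: Supports image generation with Imagen 3 and Gemini 2.5 Flash (Nano Banana)
+- **Video generation**: Supports Veo 2 video generation, including image-to-video
 - **Smart model switching**: Dynamically switch models in AI Studio via the `model` field
 - **Anti-fingerprint detection**: Uses the Camoufox browser to reduce the risk of detection
 - **GUI launcher**: Feature-rich **web** launcher that simplifies configuration and management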
@@ -274,6 +276,7 @@ AIStudio2API/
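 │   ├── config/                  # Configuration management
 │   ├── models/                  # Data models
 │   ├── tts/                     # TTS speech generation module
+│   ├── media/                   # Media generation module (Imagen/Veo/Nano)
 │   ├── proxy/                   # Streaming proxy
 │   └── static/                  # Static resources
 ├── data/                        # Runtime data directory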
@@ -352,6 +355,8 @@ cp .env.example .env
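 ## 📅 Development Roadmap
 
 - ✅ **TTS support**: Adapted the `gemini-2.5-flash/pro-preview-tts` speech generation models
+- ✅ **Media generation**: Supports image/video generation with Imagen 3, Veo 2, and Nano Banana
+- **Unified click logic**: Extract the `_safe_click` method into a global `operations.py` to unify click operations across all controllers
 - **Documentation**: Update and improve the detailed usage docs and API specification under `docs/`
 - **One-click deployment**: Provide fully automated install and launch scripts for Windows/Linux/macOS
 - **Docker support**: Provide a standard Dockerfile and Docker Compose files to simplify deployment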
 
@@ -24,6 +24,8 @@
 
 - **OpenAI Compatible API**: Fully compatible with OpenAI format `/v1/chat/completions` endpoint
 - **TTS Speech Generation**: Supports Gemini 2.5 TTS models for single/multi-speaker audio generation
+- **Image Generation**: Supports Imagen 3 and Gemini 2.5 Flash (Nano Banana) image generation
+- **Video Generation**: Supports Veo 2 video generation, including image-to-video
 - **Smart Model Switching**: Dynamically switch models in AI Studio via the `model` field
 - **Anti-Fingerprint Detection**: Uses Camoufox browser to reduce detection risk
 - **GUI Launcher**: Feature-rich **web** launcher for simplified configuration and management
@@ -268,6 +270,7 @@ AIStudio2API/
 │   ├── config/                  # Configuration management
 │   ├── models/                  # Data models
 │   ├── tts/                     # TTS Speech Generation modules
+│   ├── media/                   # Media Generation modules (Imagen/Veo/Nano)
 │   ├── proxy/                   # Streaming proxy
 │   └── static/                  # Static resources
 ├── data/                        # Runtime data directory
@@ -346,10 +349,13 @@ Issues and Pull Requests are welcome!
 ## 📅 Development Roadmap
 
 - ✅ **TTS Support**: Adapted `gemini-2.5-flash/pro-preview-tts` speech generation models
+- ✅ **Media Generation**: Supports image/video generation with Imagen 3, Veo 2, and Nano Banana
+- **Unified Click Logic**: Extract the `_safe_click` method into a global `operations.py` to unify click operations across all controllers
 - **Documentation**: Update and optimize documentation in `docs/` directory
 - **One-Click Deployment**: Provide fully automated install and launch scripts for Windows/Linux/macOS
 - **Docker Support**: Provide standard Dockerfile and Docker Compose orchestration files
 - **Go Refactoring**: Migrate core proxy service to Go for improved concurrency and reduced resource usage
 - **CI/CD Pipeline**: Establish GitHub Actions automated testing and build release process
 - **Unit Testing**: Increase test coverage for core modules (especially browser automation)
 - **Load Balancing**: Support multi-Google account rotation pool for higher concurrency limits
+
@@ -259,6 +259,22 @@ curl -X POST http://localhost:2048/generate-speech \
 
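 **Detailed documentation**: See the [TTS Usage Guide](tts-guide.md)
 
+### Image/Video Generation
+
+**Endpoints**:
+- `POST /generate-image` - Imagen image generation
+- `POST /generate-video` - Veo video generation
+- `POST /nano/generate` - Nano Banana image generation
+
+Image and video generation is supported through Imagen 3, Veo 2, and Gemini 2.5 Flash.
+
+**Supported models**:
+- Imagen: `imagen-3.0-generate-002`
+- Veo: `veo-2.0-generate-001`
+- Nano Banana: `gemini-2.5-flash-image`
+
+**Detailed documentation**: See the [Media Generation Guide](media-generation-guide.md)
+
 ### Ollama Compatibility Layer
 
 The project also provides Ollama-format API compatibility: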
 
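Reviewer note on the media endpoints added in the hunk above: the diff lists the three paths and model IDs but not the request body, so the following is only a minimal sketch of how a client might target them. The base URL comes from the `curl` example in the hunk header; the JSON field names `prompt` and `model`, the `ENDPOINTS` table, and the `build_media_request` helper are all hypothetical and should be checked against `media-generation-guide.md`.

```python
import json
import urllib.request

# Base URL taken from the curl example shown in docs/api-usage.md.
BASE_URL = "http://localhost:2048"

# Endpoint paths and model IDs exactly as listed in the added section.
ENDPOINTS = {
    "imagen": ("/generate-image", "imagen-3.0-generate-002"),
    "veo": ("/generate-video", "veo-2.0-generate-001"),
    "nano": ("/nano/generate", "gemini-2.5-flash-image"),
}


def build_media_request(kind: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one media endpoint.

    ASSUMPTION: the JSON fields "model" and "prompt" are hypothetical;
    the real schema is not shown in this diff.
    """
    path, model = ENDPOINTS[kind]
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_media_request("imagen", "a lighthouse at dusk")
    print(req.get_method(), req.full_url)
    # Sending is left to the caller, e.g. urllib.request.urlopen(req),
    # once the server is running and the payload shape is confirmed.
```

Keeping request construction separate from sending makes it easy to adjust the payload once the actual schema is known, without touching any network code.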
@@ -61,6 +61,13 @@ AIStudio2API/
 │   │   ├── models.py           # TTS data models
 │   │   ├── tts_controller.py   # TTS page controller
 │   │   └── tts_processor.py    # TTS request processor
+│   ├── media/                  # Media generation module
+│   │   ├── __init__.py         # Module initialization
+│   │   ├── models.py           # Media data models
+│   │   ├── nano_controller.py  # Nano Banana controller
+│   │   ├── imagen_controller.py# Imagen controller
+│   │   ├── veo_controller.py   # Veo controller
+│   │   └── media_processor.py  # Media request processor
 │   ├── proxy/                  # Streaming proxy service
 │   │   ├── runner.py           # Proxy service entry point
 │   │   ├── server.py           # Proxy server