自問自答,安放學習筆記。
一、背景
Manus這個產品就先不評價了,沒有GA之前的行銷都沒什麼意義,benchmark和demo看看就行,最終都得拉出來溜溜(讓人用)才作數。
OpenManus是對Manus的一個開源複現,趁著這個熱度收割了一部分流量,大概看了下專案,目前還比較簡單,只能算是個玩具專案,跑了幾個例子感覺效果一般。和之前火過一陣的AutoGPT、MetaGPT概念上其實沒有本質區別。隨著基模型越來越強,大家對這個方向的希望也會間歇性地被點燃。
PS:以下基於2025.03.07版本。
二、OpenManus概覽
OpenManus 是一個基於智慧體(Agent)的工作流自動化框架,採用分層架構設計,支援複雜任務的規劃、執行和驗證。系統通過可擴展的工具集和提示範本庫實現靈活的任務處理能力。
目前只是一個雛形,代碼量很少,大致的結構如下:

三、核心元件說明
1. 智慧體系統 (Agents)

BaseAgent
- 核心職責:所有Agent的基類
- 關鍵功能:
- 管理Agent生命週期(初始化、運行、終止)
- 維護狀態機(THINKING/ACTING/WAITING)
- 記憶管理(messages屬性存儲對話歷史)
- 提供卡死檢測與恢復機制(is_stuck/handle_stuck_state)
ReActAgent
- 設計定位:實現經典的 ReAct 範式
- 抽象方法:
- think(): 生成下一步決策(返回bool表示是否繼續)
- act(): 執行具體操作(返回執行結果摘要)
ToolCallAgent
- 核心價值:擴展基礎工具調用能力
- 主要特性:
- 統一工具執行介面(execute_tool)
- 特殊工具處理機制(_handle_special_tool)
- 執行流程控制(_should_finish_execution)
- 支援工具白名單機制
PlanningAgent
- 業務場景:複雜任務規劃與執行
- 核心能力:
- 多步驟計劃生成(create_initial_plan)
- 計劃狀態跟蹤(update_plan_status)
- 步驟索引管理(_get_current_step_index)
- 工具驗證機制(initialize_plan_and_verify_tools)
SWEAgent
- 專業領域:軟體工程任務處理
- 特色功能:
- 代碼生成驗證機制
- 自動化測試集成
- 依賴關係分析
- 支援git操作等開發工具
Manus
- 業務定位:通用任務處理代理
- 典型應用:
- 自然語言指令解析
- 多工具協同工作流
- 上下文感知的任務執行
- 支援外掛程式式工具擴展
2. 工作流引擎 (Flow)

3. 工具系統 (Tools)

支援的工具:
工具名稱 | 功能描述 | 類別 |
---|
BashTool | 執行Bash命令,支援後台進程和互動式會話 | 代碼執行類 |
PythonExecuteTool | 執行Python代碼並捕獲執行結果 | 代碼執行類 |
FileSaverTool | 檔保存與路徑驗證 | 檔操作類 |
StrReplaceEditor | 正則表達式文本替換工具 | 檔操作類 |
GoogleSearchTool | 谷歌搜索API集成 | 網路工具類 |
BrowserUseTool | 瀏覽器自動化控制 | 網路工具類 |
TerminateTool | 進程終止與資源清理 | 系統工具類 |
CreateChatCompletion | LLM對話生成介面 | 系統工具類 |
4. 擴展能力
- 工具擴展:在 app/tool/ 目錄添加新工具類
- 流程擴展:通過 FlowFactory 註冊新流程類型
- Agent擴展:繼承 BaseAgent/ToolCallAgent 實現新策略
- 提示範本擴展:在 app/prompt/ 添加範本檔
四、專案中使用的Prompts分析
1. app/prompt/manus.py
SYSTEM_PROMPT = "You are OpenManus, an all-capable AI assistant, aimed at solving any task presented by the user. You have various tools at your disposal that you can call upon to efficiently complete complex requests. Whether it's programming, information retrieval, file processing, or web browsing, you can handle it all." NEXT_STEP_PROMPT = """You can interact with the computer using PythonExecute, save important content and information files through FileSaver, open browsers with BrowserUseTool, and retrieve information using GoogleSearch. PythonExecute: Execute Python code to interact with the computer system, data processing, automation tasks, etc. FileSaver: Save files locally, such as txt, py, html, etc. BrowserUseTool: Open, browse, and use web browsers.If you open a local HTML file, you must provide the absolute path to the file. GoogleSearch: Perform web information retrieval Based on user needs, proactively select the most appropriate tool or combination of tools. For complex tasks, you can break down the problem and use different tools step by step to solve it. After using each tool, clearly explain the execution results and suggest the next steps. """
- SYSTEM_PROMPT: 描述OpenManus的角色和能力,強調其可以使用各種工具完成複雜任務。
- NEXT_STEP_PROMPT: 提供工具的詳細說明,並指導如何選擇和使用這些工具。
2. app/prompt/planning.py
PLANNING_SYSTEM_PROMPT = """ You are an expert Planning Agent tasked with solving complex problems by creating and managing structured plans. Your job is: 1. Analyze requests to understand the task scope 2. Create clear, actionable plans with the `planning` tool 3. Execute steps using available tools as needed 4. Track progress and adapt plans dynamically 5. Use `finish` to conclude when the task is complete Available tools will vary by task but may include: - `planning`: Create, update, and track plans (commands: create, update, mark_step, etc.) - `finish`: End the task when complete Break tasks into logical, sequential steps. Think about dependencies and verification methods. """ NEXT_STEP_PROMPT = """ Based on the current state, what's your next step? Consider: 1. Do you need to create or refine a plan? 2. Are you ready to execute a specific step? 3. Have you completed the task? Provide reasoning, then select the appropriate tool or action. """
- PLANNING_SYSTEM_PROMPT: 描述Planning Agent的角色和任務,包括分析請求、創建計劃、執行步驟、跟蹤進度和動態調整計劃。
- NEXT_STEP_PROMPT: 提供關於下一步行動的指導,幫助用戶決定是否需要創建或優化計劃、執行具體步驟或完成任務。
3. app/prompt/swe.py
SYSTEM_PROMPT = """SETTING: You are an autonomous programmer, and you're working directly in the command line with a special interface. The special interface consists of a file editor that shows you {{WINDOW}} lines of a file at a time. In addition to typical bash commands, you can also use specific commands to help you navigate and edit files. To call a command, you need to invoke it with a function call/tool call. Please note that THE EDIT COMMAND REQUIRES PROPER INDENTATION. If you'd like to add the line ' print(x)' you must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run. RESPONSE FORMAT: Your shell prompt is formatted as follows: (Open file: ) (Current directory: ) bash-$ First, you should _always_ include a general thought about what you're going to do next. Then, for every response, you must include exactly _ONE_ tool call/function call. Remember, you should always include a _SINGLE_ tool call/function call and then wait for a response from the shell before continuing with more discussion and commands. Everything you include in the DISCUSSION section will be saved for future reference. If you'd like to issue two commands at once, PLEASE DO NOT DO THAT! Please instead first submit just the first tool call, and then after receiving a response you'll be able to issue the second tool call. Note that the environment does NOT support interactive session commands (e.g. python, vim), so please do not invoke them. """ NEXT_STEP_TEMPLATE = """{{observation}} (Open file: {{open_file}}) (Current directory: {{working_dir}}) bash-$ """
- SYSTEM_PROMPT: 描述自主程式師的工作環境和規則,包括如何使用工具調用來編輯檔、導航目錄以及正確的代碼縮進要求。
- NEXT_STEP_TEMPLATE: 提供一個範本,用於顯示當前觀察結果、打開的檔案路徑和工作目錄。
4. app/prompt/toolcall.py
SYSTEM_PROMPT = "You are an agent that can execute tool calls" NEXT_STEP_PROMPT = ( "If you want to stop interaction, use `terminate` tool/function call." )
- SYSTEM_PROMPT: 描述一個可以執行工具調用的代理角色。
- NEXT_STEP_PROMPT: 提供關於如何終止交互的指導。
五、總結
整體我感覺還沒有超出AutoGPT/MetaGPT的範疇。不管是不是蹭,人家開源了,就值得鼓勵。希望後面越做越完善。