---
name: openmanus-runtime
description: "OpenManus on shadow runs from its venv against Ollama qwen2.5-coder:7b; tool calls work via a local rescue patch in app/llm.py"
metadata:
  node_type: memory
  type: project
  originSessionId: 71d40fa2-b151-4c81-9821-f0dfeb7a0f66
---

OpenManus is configured to run on shadow as of 2026-05-15. Invocation:

```sh
cd /home/ubuntu/workspace/ai-agents/openmanus
./venv/bin/python main.py --prompt "<your prompt>"
# or interactive:
./venv/bin/python main.py
```

Other entrypoints: `run_mcp.py`, `run_flow.py`, `run_mcp_server.py`,
`sandbox_main.py` — all use the same `config/config.toml`.

**Config** (`config/config.toml`) was edited 2026-05-15:
- `[llm]` and `[llm.vision]` model: `deepseek-r1:7b` → `qwen2.5-coder:7b`
(after a brief detour through `qwen2.5:3b` while only that model was
installed). qwen2.5:3b kept on disk for the vvv-bots monitor.
- Added `api_type = "ollama"` and `api_version = ""` to both sections;
`app/config.py` `LLMSettings` lists those as `Field(...)` (required).
- Backup: `config/config.toml.before-2026-05-15.bak`.

**Tool-calling patch in `app/llm.py`** (2026-05-15): qwen2.5-coder (and
likely other Ollama-served models) emits tool calls as JSON inside the
assistant message `content` instead of the OpenAI-style `tool_calls`
array. Without rescue, OpenManus reports "selected 0 tools to use" and
the agent loops forever. The patch adds `_rescue_ollama_tool_calls()`
plus a post-response hook in `ask_tool` that:
1. fires only when `api_type == "ollama"` and `tool_calls` is empty,
2. parses `<tool_call>...</tool_call>` blocks if present, otherwise
falls back to bare-JSON `{"name": ..., "arguments": ...}`,
3. wraps each match into a `ChatCompletionMessageToolCall` and clears
the original `content`.
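
The three steps can be sketched roughly as follows. This is not the actual patch: `rescue_tool_calls` is a hypothetical standalone function, it returns plain dicts shaped like OpenAI tool calls rather than `ChatCompletionMessageToolCall` objects (to stay dependency-free), and it assumes the wrapper is Qwen's `<tool_call>...</tool_call>` tag.

```python
import json
import re
import uuid

# Assumed wrapper format: <tool_call>{...json...}</tool_call>
TOOL_CALL_BLOCK = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def rescue_tool_calls(api_type, content, tool_calls):
    """Return (content, tool_calls), recovering calls embedded in content."""
    # Step 1: act only on the Ollama path when no native tool calls came back.
    if api_type != "ollama" or tool_calls or not content:
        return content, tool_calls

    # Step 2: prefer wrapped blocks; otherwise try the whole content as bare JSON.
    candidates = TOOL_CALL_BLOCK.findall(content) or [content.strip()]

    rescued = []
    for raw in candidates:
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON, so not a tool call
        if not isinstance(payload, dict) or "name" not in payload:
            continue
        args = payload.get("arguments", {})
        rescued.append({
            "id": f"call_{uuid.uuid4().hex[:8]}",
            "type": "function",
            "function": {
                "name": payload["name"],
                # OpenAI-style tool calls carry arguments as a JSON string.
                "arguments": args if isinstance(args, str) else json.dumps(args),
            },
        })

    # Step 3: clear content only when something was actually rescued.
    if rescued:
        return "", rescued
    return content, tool_calls
```

Leaving `content` untouched when nothing parses keeps ordinary text replies intact, which is why the rescue is a no-op for responses that already carry `tool_calls`.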

End-to-end verified 2026-05-15: `--prompt "Create a file at /tmp/...
with content '...' then stop"` actually creates the file and calls
`terminate`. Per-step time is ~25–30 s on shadow.

**Why local patch:** Ollama 0.23.4 (latest as of 2026-05-15) still
behaves the same; bumping versions did not help. The model template
asks for `<tool_call>` wrappers, but the model frequently skips them.
A future Ollama or model template fix may make the rescue redundant —
the patch is harmless when `tool_calls` are already populated.

**Why kept simple:** Both vision and text use the same `qwen2.5-coder:7b`
because no real multimodal model is installed; vision flows that include
images will degrade. Memory is tight: model resident ~4.6 GB, swap is
already taking ~1–2 GB at idle on this host (see [[shadow-role]]).

**How to apply:** OpenManus is *working* now, including tool calls.
If a user says it "doesn't use tools," check whether they are running
against `api_type = "ollama"`; the rescue only fires for that path.
For non-Ollama backends the original AsyncOpenAI behaviour is unchanged.
