shadow/monitor_model_mismatch

---
name: monitor-model-mismatch
description: monitor and aider use Ollama qwen2.5:3b on shadow because RAM (~3 GB free) cannot hold a 7B model
metadata:
node_type: memory
type: project
originSessionId: 71d40fa2-b151-4c81-9821-f0dfeb7a0f66
---

`ai-agents/monitor/summarize_logs.py` was switched from `deepseek-r1:7b`
to `qwen2.5:3b` on 2026-05-15. The 7B model was pulled and then deleted
because loading it failed with
`model requires more system memory (4.3 GiB) than is available (3.1 GiB)`.

**Why:** Shadow has 7.7 GB physical RAM and **no swap**. Dify (Celery + beat
+ 2× gunicorn ≈ 1.2 GB), n8n (~200 MB), the VS Code remote extension host
(~566 MB), and Ollama itself leave only ~3 GB free at idle. A 7B Q4 model
needs more than 4 GB resident (Ollama reported 4.3 GiB for this one), so
it cannot load.
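
A pre-flight check along these lines could catch the mismatch before a
pull is attempted. This is a minimal sketch, not what
`summarize_logs.py` does: the helper names `mem_available_gib` and
`model_fits` are hypothetical, it assumes a Linux host exposing
`/proc/meminfo`, and Ollama's own accounting may differ from the kernel's
`MemAvailable` figure.

```python
from typing import Optional


def mem_available_gib(meminfo_text: str) -> Optional[float]:
    """Return the MemAvailable value from /proc/meminfo text, in GiB.

    Returns None when the field is missing (very old kernels).
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this field in kB (KiB)
            return kib / (1024 * 1024)
    return None


def model_fits(required_gib: float, meminfo_text: str,
               headroom_gib: float = 0.0) -> bool:
    """True if the model's stated requirement plus headroom fits in memory."""
    available = mem_available_gib(meminfo_text)
    if available is None:
        return False  # be conservative when the figure is unknown
    return available >= required_gib + headroom_gib


if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        text = fh.read()
    # 4.3 GiB is the requirement Ollama reported for deepseek-r1:7b.
    print("deepseek-r1:7b fits:", model_fits(4.3, text))
```

With no swap on shadow, `MemAvailable` is the whole budget, which is why
the check ignores `SwapFree`.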

**How to apply:** Don't propose pulling 7B+ models on shadow without first
freeing memory or adding swap. If the user wants better summary quality,
the cheapest options are (1) enable swap (inference through swap is slow
but acceptable for a once-a-day job), (2) stop Dify around the job with a
cron pre/post hook, or (3) move inference to arcana. Don't silently
re-pull `deepseek-r1:7b`.

**daily_bot_report.sh Ollama detection** (fixed 2026-05-15): the script
no longer runs `ollama serve &` directly, because that would conflict
with the systemd `ollama.service`. It now tries `systemctl start ollama`
(which fails without sudo; the failure is logged), then falls through to
the `summarize_logs.py` fallback path. Net effect: if Ollama is down, the
report still ships, just with the ⚠️ stats-only fallback. Don't add
sudo or polkit rules for ollama unless the user explicitly wants the
script to be able to revive it on its own.
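
The fall-through behaviour can be sketched as below. This is an
illustration under assumptions, not the contents of `summarize_logs.py`:
the function names (`ollama_generate`, `stats_only`, `summarize`), the
prompt, and the fallback message format are hypothetical. The endpoint is
Ollama's default local HTTP API (`/api/generate` on port 11434).

```python
import json
import urllib.request
from typing import Callable

# Ollama's default local endpoint; adjust if OLLAMA_HOST is set differently.
OLLAMA_URL = "http://localhost:11434/api/generate"


def ollama_generate(prompt: str, model: str = "qwen2.5:3b",
                    timeout: float = 120.0) -> str:
    """Ask a local Ollama server for a non-streamed completion."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def stats_only(log_text: str) -> str:
    """Hypothetical fallback: plain counts, no model involved."""
    lines = log_text.splitlines()
    errors = sum(1 for line in lines if "ERROR" in line)
    return f"⚠️ stats-only fallback: lines={len(lines)} errors={errors}"


def summarize(log_text: str,
              generate: Callable[[str], str] = ollama_generate) -> str:
    """Use the model when reachable; otherwise ship the stats-only report."""
    try:
        return generate("Summarize these logs:\n" + log_text)
    except (OSError, KeyError, ValueError):
        # OSError covers connection refused, timeouts, and urllib's URLError;
        # KeyError/ValueError cover a malformed or unexpected response body.
        return stats_only(log_text)
```

Passing `generate` as a parameter keeps the fallback testable without a
running Ollama: `summarize(text, generate=some_failing_callable)` takes
the stats-only path.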

memory shadow claude-memory