Keeping up with technical developments across AI, infrastructure, developer tooling, and open source is a multi-source problem. The relevant signal is distributed across dozens of RSS feeds, GitHub trending repositories, X/Twitter accounts, and search results — none of which overlap cleanly, and all of which update continuously. Reading across even 20 of them manually takes 45 minutes. At 109, it is not a human-scale task.
The 109-source digest agent solves this with a three-stage pipeline: parallel fetch across all source types, a per-item quality score that identifies the highest-signal items, and a deduplication pass that collapses the five articles covering the same product launch into one ranked entry. The result is a single digest delivered every morning — the 10–15 items most worth reading from the previous 24 hours, ranked by relevance score.
This is one of the highest-volume OpenClaw use cases in the Research & Learning category, and the one where per-call cost savings compound most visibly at scale.
By the end of this tutorial you'll have a daily pipeline fetching from 109+ sources across RSS, X/Twitter, GitHub, and web search — scoring each item, deduplicating overlapping stories, and delivering a ranked digest every morning. The per-item WisGate savings at that call volume add up to a material annual figure, calculated in the cost section below. Validate the scoring logic against a sample batch before the first full run, and confirm current model prices at wisgate.ai/models. Get your key at wisgate.ai/hall/tokens.
WisGate + OpenClaw Configuration
Step 1 — Open the configuration file
nano ~/.openclaw/openclaw.json
Step 2 — Add the WisGate provider
Paste the following into your models section. This registers Claude Haiku (claude-haiku-4-5-20251001) for per-item quality scoring and Claude Sonnet (claude-sonnet-4-5) for the deduplication and ranking pass:
"models": {
"mode": "merge",
"providers": {
"wisgate": {
"baseUrl": "https://api.wisgate.ai/v1",
"apiKey": "WISGATE-API-KEY",
"api": "openai-completions",
"models": [
{
"id": "claude-haiku-4-5-20251001",
"name": "Claude Haiku 4.5",
"reasoning": false,
"input": ["text"],
"cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
"contextWindow": 256000,
"maxTokens": 8192
},
{
"id": "claude-sonnet-4-5",
"name": "Claude Sonnet 4.5",
"reasoning": false,
"input": ["text"],
"cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
"contextWindow": 256000,
"maxTokens": 8192
}
]
}
}
}
Replace WISGATE-API-KEY with your key from wisgate.ai/hall/tokens. Confirm both model prices at wisgate.ai/models before calculating annual cost.
Step 3 — Save and restart
Press Ctrl + O, then Enter, then Ctrl + X to save and exit. Press Ctrl + C to stop the running session, then run openclaw tui.
Step 4 — Validate the scoring prompt before the first full run
curl -s -X POST "https://api.wisgate.ai/v1/chat/completions" \
-H "Authorization: Bearer $WISDOM_GATE_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "claude-haiku-4-5-20251001",
"messages": [
{"role": "system", "content": "[PASTE QUALITY SCORING PROMPT HERE]"},
{"role": "user", "content": "Score these items:\n\n[PASTE 5 SAMPLE ITEMS]"}
],
"max_tokens": 512
}' | jq -r '.choices[0].message.content'
Note: OpenClaw was previously known as ClawdBot and MoltBot. These steps apply to all versions.
OpenClaw API News Aggregation: Four Source Types, Two Endpoints
| Source type | Example | Fetch method | WisGate endpoint |
|---|---|---|---|
| RSS feeds | Hacker News, ArXiv, vendor blogs | HTTP GET + XML parse | OpenAI-compatible |
| GitHub | Trending repos by topic | GitHub API | OpenAI-compatible |
| X/Twitter | Specific handles or lists | X API Bearer token | OpenAI-compatible |
| Web search queries | "LLM inference optimization 2026" | Gemini-native + google_search | Gemini-native (grounding required) |
RSS, GitHub, and X content is fetched directly — the agent receives raw text and scores it. Web search queries require live retrieval via the google_search grounding tool on the Gemini-native endpoint. Using the OpenAI-compatible endpoint for web search returns stale training-data results, not current web content. This distinction is architectural, not optional.
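The routing rule above can be sketched as a small helper. This is illustrative only — the function name route_source is not part of OpenClaw, and the URLs simply mirror the two endpoints used elsewhere in this tutorial:

```python
# Endpoint URLs as used in this tutorial's curl examples.
OPENAI_COMPATIBLE = "https://api.wisgate.ai/v1/chat/completions"
GEMINI_NATIVE = "https://wisgate.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent"

def route_source(source_type: str) -> str:
    """Map a source type to the WisGate endpoint its items must go through."""
    if source_type in ("rss", "github", "twitter"):
        # Content is fetched directly over HTTP; the model only scores raw text.
        return OPENAI_COMPATIBLE
    if source_type == "web_search":
        # Live retrieval requires the google_search grounding tool, Gemini-native only.
        return GEMINI_NATIVE
    raise ValueError(f"unknown source type: {source_type}")
```

Routing a web_search query through the OpenAI-compatible endpoint is the one mistake this guard catches early — it fails loudly instead of silently returning stale training-data results.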
sources.yaml — store your full source list here:
sources:
rss:
- name: "Hacker News"
url: "https://news.ycombinator.com/rss"
topic_tags: ["dev", "ai", "infra"]
- name: "ArXiv CS.AI"
url: "https://arxiv.org/rss/cs.AI"
topic_tags: ["ai", "research"]
github:
- topic: "large-language-models"
language: "python"
time_window: "daily"
twitter:
- handle: "example_handle"
max_posts: 10
web_search:
- query: "LLM inference optimization 2026"
- query: "open source AI models this week"
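A quick sanity check on the parsed file confirms your call volume before you turn on the cron — one Haiku scoring call per source per day. In practice you'd load sources.yaml with a YAML parser (e.g. PyYAML's yaml.safe_load); the parsed structure is inlined here to keep the sketch dependency-free:

```python
# Parsed form of the sources.yaml example above, inlined for illustration.
sources = {
    "rss": [
        {"name": "Hacker News", "url": "https://news.ycombinator.com/rss"},
        {"name": "ArXiv CS.AI", "url": "https://arxiv.org/rss/cs.AI"},
    ],
    "github": [{"topic": "large-language-models"}],
    "twitter": [{"handle": "example_handle"}],
    "web_search": [{"query": "LLM inference optimization 2026"}],
}

def count_sources(cfg: dict) -> int:
    """Total fetch targets across all four source types = daily scoring calls."""
    return sum(len(entries) for entries in cfg.values())

print(count_sources(sources))  # prints 5 for this starter file
```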
LLM Multi-Source Summarization: Quality Scoring and Deduplication Prompts
Two-pass processing:
| Pass | Model | Input | Output |
|---|---|---|---|
| Pass 1: Quality scoring | Haiku | One batch per source (all items) | Score + one-line summary per item |
| Pass 2: Deduplication + ranking | Sonnet | All scored items above threshold | Deduplicated, ranked digest (max 15) |
Pass 1 — Quality scoring system prompt (Haiku, run per source batch):
You are a tech news quality scorer.
For each item in the list provided, return a JSON object:
{
"id": "[ITEM ID]",
"score": [1-10],
"novelty": "new" | "follow-up" | "evergreen",
"summary": "one sentence — the specific technical claim or development",
"tags": ["tag1", "tag2"]
}
Scoring criteria:
- 8–10: Significant new development, benchmark, release, or architectural finding
- 5–7: Useful reference, interesting but not novel, or follow-up to known story
- 1–4: Promotional content, opinion without new data, or already widely covered
Rules:
- Score for a technically literate developer audience
- summary must contain the specific claim, not just the topic area
- Return a JSON array only. No preamble.
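The Pass 1 request body for one source batch can be sketched as follows. The model id matches the configuration above; SCORING_PROMPT stands in for the full system prompt, and build_scoring_request is an illustrative name, not an OpenClaw API:

```python
import json

SCORING_PROMPT = "You are a tech news quality scorer. ..."  # full prompt goes here

def build_scoring_request(items: list[dict]) -> dict:
    """One batch per source: every item goes into a single user message."""
    listing = "\n\n".join(f"[{item['id']}] {item['title']}" for item in items)
    return {
        "model": "claude-haiku-4-5-20251001",
        "messages": [
            {"role": "system", "content": SCORING_PROMPT},
            {"role": "user", "content": f"Score these items:\n\n{listing}"},
        ],
        "max_tokens": 512,
    }

body = build_scoring_request([{"id": "hn-1", "title": "New KV-cache eviction trick"}])
print(json.dumps(body, indent=2))
```

POST this body to the OpenAI-compatible endpoint exactly as in the Step 4 validation call. Batching all of a source's items into one user message is what keeps the per-call token count near the ~700 estimated in the cost table.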
Pass 2 — Deduplication and ranking system prompt (Sonnet, run once on full scored set):
You are a tech news digest editor.
INPUT: quality-scored items from multiple sources, some covering the same story.
TASK:
1. Group items covering the same underlying story or development
2. Keep the highest-scored item per group; append: "Also covered by: [N] other sources"
3. Discard all items with score < 5
4. Rank remaining items by score descending
5. Return the top 15 as a ranked digest:
## [RANK]. [ONE-LINE SUMMARY]
**Score:** [X]/10 | **Tags:** [tag1, tag2] | **Source:** [source name]
[One sentence of context if the summary alone is ambiguous]
Return clean Markdown. No preamble. Max 15 items.
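The grouping and ranking rules the Sonnet prompt encodes are deterministic once items are grouped, which makes them useful as a sanity check on the model's output. A minimal sketch, assuming each item carries a precomputed story key (in the real pass, Sonnet does the grouping itself):

```python
def dedupe_and_rank(items: list[dict], threshold: int = 5, top_n: int = 15) -> list[dict]:
    """Apply the Pass 2 rules: best item per story, drop score < threshold, rank, cap."""
    groups: dict[str, list[dict]] = {}
    for item in items:
        groups.setdefault(item["story"], []).append(item)

    digest = []
    for story_items in groups.values():
        best = max(story_items, key=lambda it: it["score"])  # rule 2: keep highest-scored
        if best["score"] < threshold:
            continue  # rule 3: discard low-signal groups
        digest.append(dict(best, also_covered_by=len(story_items) - 1))

    digest.sort(key=lambda it: it["score"], reverse=True)  # rule 4: score descending
    return digest[:top_n]  # rule 5: max 15 items
```

Running this over the scored items alongside the model pass gives you a cheap regression check: if the model's digest diverges wildly from the rule-based one, the prompt or the scoring threshold needs attention.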
Raw fetch cache: store all fetched results in ~/.openclaw/news-digest/raw/[DATE]/ — one file per source. If the scoring pass fails, re-run it without re-fetching 109 sources. This is an operational requirement, not an optional optimization.
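The cache layout described above is one file per source under a dated folder. A sketch of the path convention — the function and file-naming scheme here are illustrative, not an OpenClaw internal:

```python
from pathlib import Path

def cache_path(source_name: str, day: str, base: Path) -> Path:
    """Cache file for one source on one day, e.g. raw/2026-02-01/hacker-news.txt."""
    safe = source_name.lower().replace(" ", "-").replace("/", "-")
    return base / day / f"{safe}.txt"

# Example: where today's Hacker News fetch would land.
print(cache_path("Hacker News", "2026-02-01", Path.home() / ".openclaw/news-digest/raw"))
```

When the score stage runs, it reads from this directory instead of re-fetching; a failed scoring pass then costs one re-run of Pass 1, not another 109 network fetches.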
Daily Cron Schedule and Grounded Web Search
Four-stage cron — fetch, score, deduplicate, deliver by 07:00:
0 5 * * * openclaw run --agent news-digest --mode fetch
0 6 * * * openclaw run --agent news-digest --mode score
30 6 * * * openclaw run --agent news-digest --mode deduplicate-and-rank
45 6 * * * openclaw run --agent news-digest --mode deliver
Grounded web search call (Gemini-native endpoint, web_search sources only):
curl -s -X POST \
"https://wisgate.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
-H "x-goog-api-key: $WISDOM_GATE_KEY" \
-H "Content-Type: application/json" \
-d '{
"contents": [{"parts": [{"text": "Search for recent developments: [INSERT QUERY]"}]}],
"tools": [{"google_search": {}}],
"generationConfig": {"responseModalities": ["TEXT"]}
}' | jq -r '.candidates[0].content.parts[0].text' >> raw_web_results.md
Run one grounded call per web search query in sources.yaml. RSS, GitHub, and X sources use standard HTTP fetch — the Gemini-native endpoint is only for live web search queries.
OpenClaw Use Cases: Annual Cost Across 109 Sources
Daily token estimate:
| Operation | Model | Calls/day | Tokens/call | Daily tokens |
|---|---|---|---|---|
| Quality scoring | Haiku | 109 | ~700 | ~76,300 |
| Deduplication + ranking | Sonnet | 1 | ~9,200 | ~9,200 |
| Grounded web search | Gemini-native | ~10 | ~1,000 | ~10,000 |
| Daily total | Mixed | 120 | — | ~95,500 |
| Annual total (365 days) | — | — | — | ~34.9M |
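The daily and annual totals in the table reduce to a few lines of arithmetic, reproduced here so you can re-derive them after changing source counts or per-call token estimates:

```python
# (calls per day, estimated tokens per call) for each operation in the table.
operations = {
    "haiku_scoring": (109, 700),
    "sonnet_dedup": (1, 9_200),
    "gemini_search": (10, 1_000),
}

daily = {name: calls * tokens for name, (calls, tokens) in operations.items()}
daily_total = sum(daily.values())
annual_total = daily_total * 365

print(daily_total, annual_total)  # prints 95500 34857500 (~95.5K/day, ~34.9M/year)
```

Swap in your own source count and re-run this before updating the cost comparison — the annual figure scales linearly with the number of scoring calls.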
Annual cost comparison — confirm all prices from wisgate.ai/models before publishing:
| Model / endpoint | Annual tokens | WisGate cost | Direct API cost | Annual saving |
|---|---|---|---|---|
| Haiku — scoring | ~27.8M | Confirm + calculate | Confirm + calculate | Calculate |
| Sonnet — dedup | ~3.36M | Confirm + calculate | Confirm + calculate | Calculate |
| Gemini-native — search | ~3.65M | Confirm + calculate | Confirm + calculate | Calculate |
| Total | ~34.9M | Calculate | Calculate | Calculate |
At 109 sources per day, the per-token savings that appear negligible on a single call accumulate into a material annual figure. This is the OpenClaw use case where volume makes the cost differential most legible — insert confirmed dollar figures before publishing.
The source list format is defined. Both system prompts are ready to copy. The endpoint routing is documented — OpenAI-compatible for RSS, GitHub, and X; Gemini-native for web search queries. The four-stage cron runs fetch, score, deduplicate, and deliver as separate steps with cached intermediate outputs.
Deployment sequence: populate sources.yaml with your first 10–20 sources → run the Step 4 validation call → execute one full pipeline run manually → review the digest output → activate the daily cron. Start at 20 sources and add in batches of 10 once the scoring threshold is calibrated.
All configuration is on this page. Generate your WisGate key at wisgate.ai/hall/tokens — one key covers the OpenAI-compatible endpoint for Haiku and Sonnet and the Gemini-native endpoint for grounded web search. Before enabling the full cron schedule, test the scoring prompt against five real items at wisgate.ai. Start the pipeline running tonight — the first digest lands tomorrow morning.