Initial commit with translated description

2026-03-29 10:21:46 +08:00
commit 18e90b0b09
67 changed files with 20609 additions and 0 deletions

docs/PREMIUM_SOURCES.md Normal file

@@ -0,0 +1,212 @@
# Premium Source Authentication
## Contents
- [Overview](#overview)
- [Option 1: Keep It Simple (Recommended)](#option-1-keep-it-simple-recommended)
- [Option 2: Use Premium Sources (Advanced)](#option-2-use-premium-sources-advanced)
- [Troubleshooting](#troubleshooting)
- [Alternative: Use APIs Instead](#alternative-use-apis-instead)
- [Recommendation](#recommendation)
## Overview
WSJ and Barron's are premium financial news sources that require subscriptions. This guide explains how to authenticate and use premium sources with the finance-news skill.
**Recommendation:** For simplicity, we recommend using **free sources only** (Yahoo Finance, CNBC, MarketWatch). Premium sources add complexity and maintenance burden.
If you have subscriptions and want premium content, follow the steps below.
---
## Option 1: Keep It Simple (Recommended)
**Use free sources only.** They provide 90% of the value without authentication complexity:
- ✅ Yahoo Finance (free, reliable)
- ✅ CNBC (free, real-time news)
- ✅ MarketWatch (free, broad coverage)
- ✅ Reuters (free via Yahoo RSS)
**To disable premium sources:**
1. Edit `config/config.json` (legacy: `config/sources.json`)
2. Set `"enabled": false` for WSJ/Barron's entries
3. Done - no authentication needed
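The same edit can be scripted. The sketch below is only an illustration: it assumes a hypothetical layout in which the config has a top-level `sources` object keyed by source name (`wsj`, `barrons`), each entry carrying an `enabled` flag. The real schema of `config/config.json` may differ, so inspect the file before adapting it.

```python
import json
from pathlib import Path

# Hypothetical source keys; replace with the names that appear in your config.
PREMIUM_KEYS = {"wsj", "barrons"}


def disable_premium(config: dict) -> dict:
    """Return a copy of config with premium sources switched off.

    Assumes config["sources"] maps source names to dicts with an "enabled" flag.
    """
    updated = json.loads(json.dumps(config))  # simple deep copy for JSON data
    for name, entry in updated.get("sources", {}).items():
        if name.lower() in PREMIUM_KEYS and isinstance(entry, dict):
            entry["enabled"] = False
    return updated


def disable_premium_in_file(path: Path) -> None:
    """Rewrite the config file in place with premium sources disabled."""
    config = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(json.dumps(disable_premium(config), indent=2) + "\n", encoding="utf-8")
```

Calling `disable_premium_in_file(Path("config/config.json"))` would apply the change, provided the assumed layout matches.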
---
## Option 2: Use Premium Sources (Advanced)
### Prerequisites
- Active WSJ or Barron's subscription
- Browser with active login session (Chrome/Firefox)
- **Option B only (see Step 3):** Install the `requests` library if it is not already present:
```bash
pip install requests
```
### Step 1: Export Cookies from Browser
**Chrome:**
1. Install extension: [EditThisCookie](https://chrome.google.com/webstore/detail/editthiscookie/)
2. Navigate to wsj.com (logged in)
3. Click EditThisCookie icon → Export → Copy JSON
**Firefox:**
1. Install extension: [Cookie Quick Manager](https://addons.mozilla.org/en-US/firefox/addon/cookie-quick-manager/) (optional; the built-in Storage Inspector used in step 3 also works)
2. Navigate to wsj.com (logged in)
3. Right-click page → Inspect → Storage → Cookies
4. Copy relevant cookies (see format below)
### Step 2: Create Cookie File
Create `config/cookies.json` (this file is gitignored):
```json
{
  "feeds.a.dj.com": {
    "wsjgeo": "US",
    "djcs_session": "YOUR_SESSION_TOKEN_HERE",
    "djcs_route": "YOUR_ROUTE_HERE"
  },
  "www.barrons.com": {
    "wsjgeo": "US",
    "djcs_session": "YOUR_SESSION_TOKEN_HERE"
  }
}
```
**Important:** Cookie domain must match feed URL domain:
- WSJ feeds use `feeds.a.dj.com` (not `wsj.com`)
- Barron's feeds use `www.barrons.com`
- Check `config/config.json` for actual feed URLs
**Note:** Cookie names/values vary by site. Export from browser to get actual values.
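A browser export rarely matches this layout directly. As a rough sketch, assuming the export is a JSON array of objects that each carry `name` and `value` fields (verify this against your own export), the helper below regroups it under a feed domain. `group_exported_cookies` and its parameters are illustrative names, not part of the skill.

```python
from typing import Optional


def group_exported_cookies(exported: list, feed_domain: str, wanted: Optional[set] = None) -> dict:
    """Build a cookies.json-style mapping for one feed domain.

    exported:    parsed browser export, assumed to be a list of
                 {"name": ..., "value": ...} objects (other keys are ignored).
    feed_domain: host of the feed URL, e.g. "feeds.a.dj.com".
    wanted:      optional set of cookie names to keep; None keeps all.
    """
    cookies = {}
    for item in exported:
        name, value = item.get("name"), item.get("value")
        if name is None or value is None:
            continue  # skip malformed entries
        if wanted is None or name in wanted:
            cookies[name] = value
    return {feed_domain: cookies}
```

The result for each feed domain can be merged into one dictionary and written to `config/cookies.json` with `json.dump`.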
### Step 3: Pass Cookies to fetch_news.py
**Option A: Modify fetch_news.py (not officially supported)**
Add cookie loading to the `fetch_rss()` function (the existing signature is unchanged):
```python
import json
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_rss(url: str, limit: int = 10) -> list[dict]:
    """Fetch and parse RSS feed with optional cookie authentication."""
    # Load cookies if they exist
    cookie_file = Path(__file__).parent.parent / "config" / "cookies.json"
    cookies = {}
    if cookie_file.exists():
        with open(cookie_file) as f:
            all_cookies = json.load(f)
        # Extract domain from URL (e.g., feeds.a.dj.com)
        domain = urlparse(url).netloc
        cookies = all_cookies.get(domain, {})

    # Fetch with cookies and User-Agent
    req = urllib.request.Request(url, headers={'User-Agent': 'OpenClaw/1.0'})
    if cookies:
        cookie_header = "; ".join([f"{k}={v}" for k, v in cookies.items()])
        req.add_header("Cookie", cookie_header)

    # ... rest of function (unchanged)
```
**Note:** This is a doc-only suggestion, not officially supported by the skill.
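The cookie-selection logic in Option A can be checked on its own, without touching the network. The sketch below pulls it into two small helpers. `cookies_for_url` and `build_cookie_header` are hypothetical names introduced here for illustration and do not exist in `fetch_news.py`; the feed path in the example is a placeholder.

```python
from urllib.parse import urlparse


def cookies_for_url(all_cookies: dict, url: str) -> dict:
    """Return the cookie mapping whose key equals the URL's host, or {}."""
    host = urlparse(url).hostname or ""
    return all_cookies.get(host, {})


def build_cookie_header(cookies: dict) -> str:
    """Join name/value pairs into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


all_cookies = {"feeds.a.dj.com": {"wsjgeo": "US", "djcs_session": "TOKEN"}}
url = "https://feeds.a.dj.com/rss/example.xml"  # placeholder path
print(build_cookie_header(cookies_for_url(all_cookies, url)))
# prints: wsjgeo=US; djcs_session=TOKEN
print(cookies_for_url(all_cookies, "https://www.wsj.com/news"))
# prints: {}
```

The second call returns nothing because the lookup is an exact host match, which is why cookies stored under `wsj.com` are never sent to `feeds.a.dj.com`.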
**Option B: Use requests library instead of urllib**
Replace `urllib` with `requests` for easier cookie handling (the new `cookies_dict` parameter is optional, so existing calls keep working):
```python
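from typing import Optional

import requests


def fetch_rss(url: str, limit: int = 10, cookies_dict: Optional[dict] = None) -> list[dict]:
    # cookies_dict: cookie names mapped to values for this feed's domain (None sends no cookies)
    response = requests.get(url, cookies=cookies_dict, timeout=10)
    response.raise_for_status()
    # ... parse with feedparser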
```
### Step 4: Security Considerations
**Critical: Do NOT commit cookies to git**
1. **`.gitignore` already includes cookie files:**
- `config/cookies.json`
- `*.cookie`
- No action needed (already configured)
2. **Set restrictive file permissions:**
```bash
chmod 600 config/cookies.json
```
3. **Rotate cookies regularly:**
- Login cookies expire (usually within 7-30 days)
- Re-export cookies when authentication fails
4. **Never share cookie files:**
- Cookies grant full account access
- Treat like passwords
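The permission step can also be verified from Python before the cookie file is read. This is a minimal sketch for POSIX systems (permission bits behave differently on Windows); `is_owner_only` and `lock_down` are illustrative helpers, not functions provided by the skill.

```python
import os
import stat
from pathlib import Path


def is_owner_only(path: Path) -> bool:
    """True if no group/other permission bits are set (e.g. mode 600)."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def lock_down(path: Path) -> None:
    """Equivalent of `chmod 600`: read/write for the owner only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

A loader could call `is_owner_only(Path("config/cookies.json"))` and refuse to continue, or call `lock_down`, when the check fails.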
---
## Troubleshooting
### "HTTP 403 Forbidden" errors
**Cause:** Cookies expired or invalid
**Fix:**
1. Log in to WSJ/Barron's in browser
2. Re-export cookies
3. Update `config/cookies.json`
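To tell an expired cookie apart from other failures, the feed can be requested once and the status code inspected. The following is a sketch under stated assumptions: `check_feed_auth` is a hypothetical helper, the `opener` parameter exists only so the function can be exercised without network access, and treating 401/403 as a cookie problem is a heuristic rather than a guarantee.

```python
import urllib.error
import urllib.request


def check_feed_auth(url: str, cookie_header: str = "", opener=urllib.request.urlopen) -> str:
    """Request the feed once and return a short diagnosis string."""
    req = urllib.request.Request(url, headers={"User-Agent": "OpenClaw/1.0"})
    if cookie_header:
        req.add_header("Cookie", cookie_header)
    try:
        with opener(req, timeout=10) as resp:
            return f"ok (HTTP {resp.status})"
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its base class.
        if err.code in (401, 403):
            return f"HTTP {err.code}: cookies are likely expired or invalid; re-export them"
        return f"HTTP {err.code}: probably not a cookie problem"
    except urllib.error.URLError as err:
        return f"network error: {err.reason}"
```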
### "Paywall detected" in articles
**Cause:** RSS feed doesn't require auth, but full article does
**Fix:**
- Premium sources often provide headlines/snippets in RSS (no auth needed)
- Full articles require subscription + cookie auth
- If you only need headlines → no cookies needed
### Cookies not working
**Debug checklist:**
- [ ] Correct domain in cookies.json:
  - WSJ: Use `feeds.a.dj.com` (not `wsj.com`)
  - Barron's: Use `www.barrons.com` (not `barrons.com`)
  - Check `config/config.json` for actual feed URLs
- [ ] Cookie values copied completely (no truncation)
- [ ] Browser session still active (test by visiting site)
- [ ] File permissions correct (chmod 600)
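Parts of this checklist can be automated. The sketch below checks the domain keys and looks for leftover placeholder values. It assumes the cookie file has already been parsed into a dict and that you supply the feed URLs yourself (for example, copied from `config/config.json`); `audit_cookie_file` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def audit_cookie_file(all_cookies: dict, feed_urls: list) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for url in feed_urls:
        host = urlparse(url).hostname or ""
        entry = all_cookies.get(host)
        if not entry:
            problems.append(f"no cookies stored under '{host}' for {url}")
            continue
        for name, value in entry.items():
            # Template values such as YOUR_SESSION_TOKEN_HERE were never replaced.
            if not value or str(value).startswith("YOUR_"):
                problems.append(f"{host}: cookie '{name}' is empty or still a placeholder")
    return problems
```

It does not confirm that the browser session is still active; that still requires visiting the site or requesting the feed.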
---
## Alternative: Use APIs Instead
Some premium sources offer APIs:
- **WSJ API:** Not publicly available
- **Barron's API:** Part of Dow Jones API (enterprise only)
- **Bloomberg API:** Enterprise only
**Conclusion:** Cookie-based auth is the only practical option for individual users.
---
## Recommendation
**For most users:** Stick with free sources. They're reliable, need no authentication, and provide comprehensive market coverage.
**For premium subscribers:** Follow Option 2, but be prepared to maintain cookie files and handle expiration.