Initial commit with translated description

2026-03-29 09:49:22 +08:00
commit dc93ba7c9d
7 changed files with 593 additions and 0 deletions

165
SKILL.md Normal file

@@ -0,0 +1,165 @@
---
name: Data Analysis
slug: data-analysis
version: 1.0.2
homepage: https://clawic.com/skills/data-analysis
description: "Data analysis and visualization."
changelog: Added metric contracts, chart guidance, and decision brief templates for more reliable analysis.
metadata: {"clawdbot":{"emoji":"D","requires":{"bins":[]},"os":["linux","darwin","win32"]}}
---
## When to Use
Use this skill when the user needs to analyze, explain, or visualize data from SQL, spreadsheets, notebooks, dashboards, exports, or ad hoc tables.
Use it for KPI debugging, experiment readouts, funnel or cohort analysis, anomaly reviews, executive reporting, and quality checks on metrics or query logic.
Prefer this skill over generic coding or spreadsheet help when the hard part is analytical judgment: metric definition, comparison design, interpretation, or recommendation.
User asks about: analyzing data, finding patterns, understanding metrics, testing hypotheses, cohort analysis, A/B testing, churn analysis, or statistical significance.
## Core Principle
Analysis without a decision is just arithmetic. Always clarify: **What would change if this analysis shows X vs Y?**
## Methodology First
Before touching data:
1. **What decision** is this analysis supporting?
2. **What would change your mind?** (the real question)
3. **What data do you actually have** vs what you wish you had?
4. **What timeframe** is relevant?
## Statistical Rigor Checklist
- [ ] Sample size sufficient? (small N = wide confidence intervals)
- [ ] Comparison groups fair? (same time period, similar conditions)
- [ ] Multiple comparisons? (20 tests = 1 "significant" by chance)
- [ ] Effect size meaningful? (statistically significant != practically important)
- [ ] Uncertainty quantified? ("12-18% lift" not just "15% lift")
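
To make the last item concrete, here is a minimal stdlib-only sketch of a normal-approximation interval for an absolute conversion lift (all counts are hypothetical):

```python
import math

def lift_ci(conv_a, n_a, conv_b, n_b, z=1.96):
    """95% CI for the absolute lift p_b - p_a (normal approximation)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# Hypothetical experiment: 120/1000 control vs 150/1000 treatment conversions.
low, high = lift_ci(120, 1000, 150, 1000)
print(f"lift: {0.15 - 0.12:.1%}, 95% CI: [{low:.1%}, {high:.1%}]")
```

Reporting the interval rather than the point estimate is what turns "15% lift" into "12-18% lift" in a readout.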
## Architecture
This skill does not require local folders, persistent memory, or setup state.
Use the included reference files as lightweight guides:
- `metric-contracts.md` for KPI definitions and caveats
- `chart-selection.md` for visual choice and chart anti-patterns
- `decision-briefs.md` for stakeholder-facing outputs
- `pitfalls.md` and `techniques.md` for analytical rigor and method choice
## Quick Reference
Load only the smallest relevant file to keep context focused.
| Topic | File |
|-------|------|
| Metric definition contracts | `metric-contracts.md` |
| Visual selection and chart anti-patterns | `chart-selection.md` |
| Decision-ready output formats | `decision-briefs.md` |
| Failure modes to catch early | `pitfalls.md` |
| Method selection by question type | `techniques.md` |
## Core Rules
### 1. Start from the decision, not the dataset
- Identify the decision owner, the question that could change a decision, and the deadline before doing analysis.
- If no decision would change, reframe the request before computing anything.
### 2. Lock the metric contract before calculating
- Define entity, grain, numerator, denominator, time window, timezone, filters, exclusions, and source of truth.
- If any of those are ambiguous, state the ambiguity explicitly before presenting results.
### 3. Separate extraction, transformation, and interpretation
- Keep query logic, cleanup assumptions, and analytical conclusions distinguishable.
- Never hide business assumptions inside SQL, formulas, or notebook code without naming them in the write-up.
### 4. Choose visuals to answer a question
- Select charts based on the analytical question: trend, comparison, distribution, relationship, composition, funnel, or cohort retention.
- Do not add charts that make the deck look fuller but do not change the decision.
### 5. Brief every result in decision format
- Every output should include the answer, evidence, confidence, caveats, and recommended next action.
- If the output is going to a stakeholder, translate the method into business implications instead of leading with technical detail.
### 6. Stress-test claims before recommending action
- Segment by obvious confounders, compare the right baseline, quantify uncertainty, and check sensitivity to exclusions or time windows.
- Strong-looking numbers without robustness checks are not decision-ready.
### 7. Escalate when the data cannot support the claim
- Block or downgrade conclusions when sample size is weak, the source is unreliable, definitions drifted, or confounding is unresolved.
- It is better to say "we do not know yet" than to produce false confidence.
## Common Traps
- Reusing a KPI name after changing numerator, denominator, or exclusions -> trend comparisons become invalid.
- Comparing daily, weekly, and monthly grains in one chart -> movement looks real but is mostly aggregation noise.
- Showing percentages without underlying counts -> leadership overreacts to tiny denominators.
- Using a pretty chart instead of the right chart -> the output looks polished but hides the actual decision signal.
- Hunting for interesting cuts after seeing the result -> narrative follows chance instead of evidence.
- Shipping automated reports without metric owners or caveats -> bad numbers spread faster than they can be corrected.
- Treating observational patterns as causal proof -> action plans get built on correlation alone.
## Approach Selection
| Question type | Approach | Key output |
|---------------|----------|------------|
| "Is X different from Y?" | Hypothesis test | p-value + effect size + CI |
| "What predicts Z?" | Regression/correlation | Coefficients + R² + residual check |
| "How do users behave over time?" | Cohort analysis | Retention curves by cohort |
| "Are these groups different?" | Segmentation | Profiles + statistical comparison |
| "What's unusual?" | Anomaly detection | Flagged points + context |
For technique details and when to use each, see `techniques.md`.
## Output Standards
1. **Lead with the insight**, not the methodology
2. **Quantify uncertainty** - ranges, not point estimates
3. **State limitations** - what this analysis can't tell you
4. **Recommend next steps** - what would strengthen the conclusion
## Red Flags to Escalate
- User wants to "prove" a predetermined conclusion
- Sample size too small for reliable inference
- Data quality issues that invalidate analysis
- Confounders that can't be controlled for
## External Endpoints
This skill makes no external network requests.
| Endpoint | Data Sent | Purpose |
|----------|-----------|---------|
| None | None | N/A |
No data is sent externally.
## Security & Privacy
Data that leaves your machine:
- Nothing by default.
Data that stays local:
- Everything you analyze; this skill persists nothing on its own.
This skill does NOT:
- Access undeclared external endpoints.
- Store credentials or raw exports in hidden local memory files.
- Create or depend on local folder systems for persistence.
- Create automations or background jobs without explicit user confirmation.
- Rewrite its own instruction source files.
## Related Skills
Install with `clawhub install <slug>` if user confirms:
- `sql` - query design and review for reliable data extraction.
- `csv` - cleanup and normalization for tabular inputs before analysis.
- `dashboard` - implementation patterns for KPI visualization layers.
- `report` - structured stakeholder-facing deliverables after analysis.
- `business-intelligence` - KPI systems and operating cadence beyond one-off analysis.
## Feedback
- If useful: `clawhub star data-analysis`
- Stay updated: `clawhub sync`

6
_meta.json Normal file

@@ -0,0 +1,6 @@
{
"ownerId": "kn73vp5rarc3b14rc7wjcw8f8580t5d1",
"slug": "data-analysis",
"version": "1.0.2",
"publishedAt": 1773241910484
}

40
chart-selection.md Normal file

@@ -0,0 +1,40 @@
# Chart Selection
Choose visuals based on the question, not on what is easiest to render.
## Question to Chart Map
| Question | Preferred chart | Notes |
|----------|-----------------|-------|
| How is a metric changing over time? | line chart | annotate structural breaks and missing data |
| Which groups are highest or lowest? | sorted bar chart | keep a shared baseline |
| How is the distribution shaped? | histogram or box plot | avoid average-only summaries |
| Are two variables related? | scatter plot | show trend and outliers separately |
| How do parts contribute to the whole? | stacked bar with totals | keep category count low |
| Where are users dropping? | funnel chart | define the time window explicitly |
| How do cohorts retain over time? | cohort table or heatmap | show cohort size alongside retention |
## Default Rules
- Bars start at zero unless there is a strong reason not to.
- Show underlying counts next to percentages when denominators are small.
- Prefer direct labels over legends when possible.
- Use one chart per decision question, not one chart per available metric.
## Visual Anti-Patterns
- Pie charts with many slices -> comparisons become guesswork.
- Dual-axis charts -> viewers infer relationships that are not there.
- Cumulative-only charts -> hide recent deterioration or recovery.
- Truncated bar axes -> exaggerate small differences.
- Stacked areas with many categories -> impossible to compare layers.
## Before Shipping a Chart
Check:
1. What decision question this chart answers.
2. Whether the baseline is visible.
3. Whether the grain and time window match the narrative.
4. Whether annotations explain outages, launches, or missing data.
5. Whether a table would be clearer than the chart.

45
decision-briefs.md Normal file

@@ -0,0 +1,45 @@
# Decision Briefs
Use these templates to turn analysis into action instead of dumping findings.
## Standard Decision Brief
1. Decision question.
2. Short answer.
3. Evidence: key numbers and comparison baseline.
4. Confidence: high, medium, or low, with one sentence why.
5. Caveats and what could still change the conclusion.
6. Recommended next action, owner, and due date.
## Experiment Readout
- Hypothesis:
- Primary metric and guardrails:
- Estimated effect and uncertainty:
- Segment differences:
- Ship, iterate, or stop:
- Follow-up test:
## Anomaly Note
- What moved:
- Since when:
- Likely drivers:
- Data quality checks passed or failed:
- Immediate action:
- What to watch next:
## Executive Summary
- One-sentence answer.
- Two or three supporting bullets with numbers.
- One caveat.
- One decision or escalation request.
## Writing Rules
- Lead with the answer, not the method.
- Translate statistics into business implications.
- Separate observations from recommendations.
- If confidence is low, say what would raise confidence.
- Avoid dumping every cut you explored; keep only evidence that changes the decision.

48
metric-contracts.md Normal file

@@ -0,0 +1,48 @@
# Metric Contracts
Use this when a KPI, dashboard tile, or report number could be interpreted in more than one way.
## Contract Template
Capture each metric in this order before trusting comparisons:
1. Business question the metric is meant to answer.
2. Entity and grain: user, account, order, session, day, week, month.
3. Numerator and denominator with exact inclusion logic.
4. Filters and exclusions: internal traffic, refunds, test accounts, paused users.
5. Time window, timezone, and refresh cadence.
6. Source of truth and owner.
7. Known caveats, version changes, and safe interpretation range.
## Minimum Contract Output
| Field | Example |
|-------|---------|
| Metric | Paid conversion rate |
| Question | Is onboarding quality improving? |
| Grain | weekly |
| Numerator | first paid subscriptions |
| Denominator | qualified onboarding starts |
| Filters | excludes employees and QA accounts |
| Timezone | UTC |
| Source | warehouse.subscriptions_daily |
| Owner | Growth lead |
| Caveat | Launch week excluded because tracking was partial |
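
As a sketch, the same contract can travel with analysis code as a small structure (field names here are illustrative, not a required schema):

```python
from dataclasses import dataclass, field

@dataclass
class MetricContract:
    """One record per metric; fields mirror the template above."""
    metric: str
    question: str
    grain: str
    numerator: str
    denominator: str
    filters: list[str] = field(default_factory=list)
    timezone: str = "UTC"
    source: str = ""
    owner: str = ""
    caveats: list[str] = field(default_factory=list)

paid_conversion = MetricContract(
    metric="Paid conversion rate",
    question="Is onboarding quality improving?",
    grain="weekly",
    numerator="first paid subscriptions",
    denominator="qualified onboarding starts",
    filters=["excludes employees and QA accounts"],
    source="warehouse.subscriptions_daily",
    owner="Growth lead",
    caveats=["Launch week excluded because tracking was partial"],
)
```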
## Stop Conditions
Do not present a metric as stable if:
- Numerator or denominator changed between periods.
- Source ownership is unclear.
- Filters were applied ad hoc and not documented.
- Time windows or timezones differ across comparisons.
- A dashboard label hides a formula change.
## Fast Questions to Ask
- "What exactly counts in the numerator?"
- "Who is excluded and why?"
- "What is the comparison baseline?"
- "Has this definition changed over time?"
- "Who would dispute this number internally?"

120
pitfalls.md Normal file
View File

@@ -0,0 +1,120 @@
# Analytical Pitfalls — Detailed Examples
## Simpson's Paradox
**What it is:** A trend that appears in aggregated data reverses when you segment by a key variable.
**Example:**
- Overall: Treatment A has 80% success, Treatment B has 85% -> "B is better"
- But segmented by severity:
- Mild cases: A=90%, B=85% -> A is better
- Severe cases: A=70%, B=65% -> A is better
- Paradox: A is better in BOTH groups, but B looks better overall because B got more mild cases
**How to catch:** Always segment by obvious confounders (user type, time period, source, severity) before concluding.
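
The reversal can be reproduced with hypothetical counts where B receives far more mild cases:

```python
# Hypothetical case counts: (successes, total) per (treatment, severity).
data = {
    ("A", "mild"):   (18, 20),   # 90%
    ("A", "severe"): (56, 80),   # 70%
    ("B", "mild"):   (68, 80),   # 85%
    ("B", "severe"): (13, 20),   # 65%
}

def rate(pairs):
    wins = sum(s for s, _ in pairs)
    total = sum(n for _, n in pairs)
    return wins / total

overall_a = rate([v for k, v in data.items() if k[0] == "A"])  # 0.74
overall_b = rate([v for k, v in data.items() if k[0] == "B"])  # 0.81

# B wins overall, yet A wins inside every severity segment:
assert overall_b > overall_a
for severity in ("mild", "severe"):
    s_a, n_a = data[("A", severity)]
    s_b, n_b = data[("B", severity)]
    assert s_a / n_a > s_b / n_b
```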
---
## Survivorship Bias
**What it is:** Drawing conclusions only from "survivors" while ignoring those who dropped out.
**Example:**
- "Users who completed onboarding have 80% retention!"
- Problem: You're only looking at users who already demonstrated commitment by completing onboarding
- The 60% who abandoned onboarding aren't in your "user" dataset
**How to catch:** Ask "Who is NOT in this dataset that should be?" Include churned users, failed attempts, non-converters.
---
## Comparing Unequal Periods
**What it is:** Comparing metrics across time periods of different lengths or characteristics.
**Examples:**
- February (28 days) vs January (31 days) revenue
- Holiday week vs normal week traffic
- Q4 (holiday season) vs Q1 for e-commerce
**How to catch:**
- Normalize to per-day, per-user, or per-session
- Compare same period last year (YoY) not sequential months
- Flag seasonal factors explicitly
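
A minimal normalization sketch (revenue figures hypothetical):

```python
# January looks bigger in raw totals, but per-day it is not.
jan_revenue, jan_days = 310_000, 31
feb_revenue, feb_days = 290_000, 28

jan_per_day = jan_revenue / jan_days  # 10_000.0
feb_per_day = feb_revenue / feb_days  # ~10_357.14

assert jan_revenue > feb_revenue   # the raw comparison misleads
assert feb_per_day > jan_per_day   # the normalized view flips it
```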
---
## p-Hacking (Multiple Comparisons)
**What it is:** Running many statistical tests until finding a "significant" result, then reporting only that one.
**Example:**
- Test 20 different user segments for conversion difference
- At alpha = 0.05, expect about 1 of the 20 to look "significant" by chance alone
- Report: "Segment X shows significant improvement!" (cherry-picked)
**How to catch:**
- Apply Bonferroni correction (divide alpha by number of tests)
- Pre-register hypotheses before looking at data
- Report ALL tests run, not just significant ones
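
A sketch of the naive cut versus the Bonferroni-corrected one (p-values hypothetical):

```python
# 20 segment tests; exactly one clears the naive alpha = 0.05 threshold.
p_values = [0.03] + [0.3, 0.5, 0.7] * 6 + [0.9]
alpha = 0.05

naive_hits = [p for p in p_values if p < alpha]
corrected_alpha = alpha / len(p_values)          # Bonferroni: 0.05 / 20
corrected_hits = [p for p in p_values if p < corrected_alpha]

assert len(p_values) == 20
assert naive_hits == [0.03]   # one chance finding survives the naive cut
assert corrected_hits == []   # none survive the corrected threshold
```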
---
## Spurious Correlation in Time Series
**What it is:** Two variables both trending over time appear correlated, but the relationship is meaningless.
**Example:**
- "Revenue and employee count are 95% correlated!"
- Both grew over time. Controlling for time, there's no relationship.
- Classic: "Ice cream sales correlate with drowning deaths" (both rise in summer)
**How to catch:**
- Detrend both series before correlating
- Check if relationship holds within time periods
- Ask: "Is there a causal mechanism, or just shared time trend?"
---
## Aggregating Percentages
**What it is:** Averaging percentages instead of recalculating from underlying totals.
**Example:**
- Store A: 10/100 = 10% conversion
- Store B: 5/10 = 50% conversion
- Wrong: "Average conversion is 30%"
- Right: 15/110 = 13.6% conversion
**How to catch:** Never average percentages. Sum numerators, sum denominators, recalculate.
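
The store example, as a check you can run:

```python
stores = [(10, 100), (5, 10)]  # (conversions, visitors) per store

wrong = sum(c / n for c, n in stores) / len(stores)            # 0.30
right = sum(c for c, _ in stores) / sum(n for _, n in stores)  # 15 / 110

assert round(wrong, 2) == 0.30
assert round(right, 3) == 0.136
```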
---
## Selection Bias in A/B Tests
**What it is:** Treatment and control groups differ systematically before treatment is applied.
**Examples:**
- Users who opted into new feature vs those who didn't
- Early adopters (Monday signups) vs late week (Friday signups)
- Users who saw the experiment (loaded fast enough) vs those who didn't
**How to catch:**
- Verify pre-experiment metrics are balanced
- Use intention-to-treat analysis
- Check for differential attrition
---
## Confusing Causation
**What it is:** Assuming X causes Y when the relationship might be: Y causes X, Z causes both, or it's coincidental.
**Example:**
- "Power users have higher retention"
- Did power usage cause retention? Or did retained users become power users over time? Or does a third factor (job role) drive both?
**How to catch:**
- Can you run an experiment? (randomize treatment)
- Is there a natural experiment? (policy change, feature rollout)
- At minimum: control for obvious confounders

169
techniques.md Normal file

@@ -0,0 +1,169 @@
# Analysis Techniques — When to Use Each
## Hypothesis Testing
**Use when:** Comparing two groups to determine if a difference is real or random chance.
**Technique selection:**
| Data type | Groups | Test |
|-----------|--------|------|
| Continuous | 2 | t-test (if normal) or Mann-Whitney |
| Continuous | 3+ | ANOVA or Kruskal-Wallis |
| Proportions | 2 | Chi-square or Fisher's exact |
| Paired data | 2 | Paired t-test or Wilcoxon signed-rank |
**Key outputs:**
- p-value (probability of a difference at least this large if there were truly no difference)
- Effect size (how big is the difference - Cohen's d, odds ratio)
- Confidence interval (range of plausible true values)
**Watch out for:**
- Large samples make everything "significant" - focus on effect size
- Multiple comparisons inflate false positives
- Normality assumptions (use non-parametric if violated)
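
The two-group proportions row can be sketched stdlib-only with a two-proportion z-test, which for a 2x2 comparison is equivalent to the chi-square test (counts hypothetical):

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 == p2, pooled SE. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical A/B result: 120/1000 control vs 150/1000 treatment.
z, p = two_proportion_ztest(120, 1000, 150, 1000)
```

Pair the p-value with the effect size (here, a 3-point absolute lift) before calling the result decision-ready.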
---
## Cohort Analysis
**Use when:** Understanding how user behavior changes over time, segmented by when they started.
**Types:**
- **Retention cohorts:** % of users still active N days after signup
- **Revenue cohorts:** Revenue per cohort over time
- **Behavioral cohorts:** Feature adoption by signup cohort
**Setup:**
1. Define cohort (usually signup week/month)
2. Define event (login, purchase, specific action)
3. Define time windows (day 1, 7, 30, 90)
4. Build matrix: cohort × time period
**Key outputs:**
- Retention curves (line chart by cohort)
- Cohort comparison (are newer cohorts performing better?)
- Time-to-event patterns
**Watch out for:**
- Cohort size differences (small cohorts = noisy data)
- Seasonality (December cohort behaves differently)
- Definition consistency (what counts as "active"?)
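
Steps 1-4 above can be sketched as a small retention matrix (events and cohort sizes are hypothetical):

```python
from collections import defaultdict

# (user_id, signup_cohort, weeks_since_signup) per active week.
events = [
    ("u1", "2024-W01", 0), ("u1", "2024-W01", 1),
    ("u2", "2024-W01", 0),
    ("u3", "2024-W02", 0), ("u3", "2024-W02", 1), ("u3", "2024-W02", 2),
]
cohort_sizes = {"2024-W01": 2, "2024-W02": 1}

# Build the cohort x time-period matrix: (cohort, offset) -> active users.
active = defaultdict(set)
for user, cohort, offset in events:
    active[(cohort, offset)].add(user)

retention = {
    (cohort, offset): len(users) / cohort_sizes[cohort]
    for (cohort, offset), users in active.items()
}

assert retention[("2024-W01", 1)] == 0.5  # 1 of 2 users back in week 1
```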
---
## Funnel Analysis
**Use when:** Understanding conversion through a multi-step process.
**Setup:**
1. Define stages (visit -> signup -> activate -> purchase)
2. Count users at each stage
3. Calculate drop-off rates between stages
**Key outputs:**
- Conversion rates per stage
- Biggest drop-off points
- Segment comparison (mobile vs desktop funnels)
**Watch out for:**
- Time window (did they convert eventually, or just not today?)
- Stage ordering (users don't always follow linear paths)
- Defining "same session" vs "ever"
---
## Regression Analysis
**Use when:** Understanding what predicts an outcome, controlling for other factors.
**Types:**
- **Linear:** Continuous outcome (revenue, time spent)
- **Logistic:** Binary outcome (churned/retained, converted/didn't)
- **Poisson:** Count outcome (purchases, logins)
**Key outputs:**
- Coefficients (effect of each variable, holding others constant)
- R² (how much variance is explained)
- p-values per variable
- Residual plots (are assumptions met?)
**Watch out for:**
- Multicollinearity (correlated predictors)
- Omitted variable bias (missing important controls)
- Extrapolation beyond data range
- Causation claims from observational data
---
## Segmentation/Clustering
**Use when:** Discovering natural groups in your data.
**Techniques:**
- **K-means:** Simple, fast, assumes spherical clusters
- **Hierarchical:** Shows cluster relationships, good for exploration
- **RFM:** Business-specific (Recency, Frequency, Monetary)
**Process:**
1. Select features (what defines a segment?)
2. Normalize features (so scale doesn't dominate)
3. Choose number of clusters (elbow method, silhouette score)
4. Profile each cluster (what makes them different?)
**Key outputs:**
- Cluster profiles (avg values per segment)
- Segment sizes
- Distinguishing characteristics
**Watch out for:**
- Garbage in, garbage out (feature selection matters)
- Cluster count is subjective
- Stability (do clusters hold with different random seeds?)
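
A toy 1-D k-means sketch of the process (deterministic init instead of random seeds, and normalization is skipped because there is only one feature):

```python
def kmeans_1d(values, k, iters=25):
    """Tiny 1-D k-means: assign to nearest center, recenter, repeat."""
    sorted_v = sorted(values)
    # Deterministic init: evenly spaced picks from the sorted data (k >= 2).
    centers = [sorted_v[i * (len(sorted_v) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical monthly spend values with three obvious groups.
centers, clusters = kmeans_1d([1, 2, 2, 3, 10, 11, 12, 50, 55, 60], 3)
assert centers == [2.0, 11.0, 55.0]
```

Profiling each cluster (sizes, average values) is still the step that turns these groups into usable segments.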
---
## Anomaly Detection
**Use when:** Finding unusual data points that warrant investigation.
**Approaches:**
- **Statistical:** Points beyond 2-3 standard deviations
- **IQR method:** Below Q1-1.5×IQR or above Q3+1.5×IQR
- **Isolation Forest:** For multivariate anomalies
- **Domain rules:** Negative revenue, future dates, impossible values
**Key outputs:**
- Flagged records with anomaly scores
- Context (why is this unusual?)
- Severity (how far from normal?)
**Watch out for:**
- Seasonality (Black Friday isn't an anomaly)
- Trends (growth makes old "normal" look like anomalies)
- False positives (investigate before acting)
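
The IQR method as a runnable sketch (counts hypothetical; note that `statistics.quantiles` uses the exclusive method by default, so quartiles can differ slightly from some textbooks):

```python
from statistics import quantiles

data = [10, 12, 11, 13, 12, 14, 13, 100]  # hypothetical daily order counts

q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

anomalies = [v for v in data if v < low or v > high]
assert anomalies == [100]  # flag, then investigate before acting
```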
---
## Time Series Analysis
**Use when:** Understanding patterns in data over time.
**Components:**
- **Trend:** Long-term direction
- **Seasonality:** Repeating patterns (daily, weekly, yearly)
- **Noise:** Random variation
**Techniques:**
- **Moving averages:** Smooth out noise
- **Decomposition:** Separate trend, seasonal, residual
- **Year-over-year:** Compare same period last year
**Key outputs:**
- Trend direction and strength
- Seasonal patterns identified
- Forecast with uncertainty bands
**Watch out for:**
- Comparing different lengths (months vary in days)
- Holidays/events (one-time vs recurring)
- Structural breaks (COVID, product changes)
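
A moving-average sketch: when the cycle is purely weekly, a 7-day window absorbs it completely (series hypothetical):

```python
def moving_average(series, window):
    """Trailing moving average; each point covers the last `window` values."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

# Hypothetical daily signups: four identical weeks with a strong weekday cycle.
daily = [100, 80, 90, 110, 120, 140, 60] * 4
smoothed = moving_average(daily, 7)

# Every 7-day window contains each weekday exactly once, so the cycle vanishes:
assert all(abs(v - 100) < 1e-9 for v in smoothed)
```

With real data, compare the smoothed series (or the same period last year) rather than raw daily points when judging trend direction.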