Initial commit with translated description
189
SKILL.md
Normal file
@@ -0,0 +1,189 @@
---
name: Playwright (Automation + MCP + Scraper)
slug: playwright
version: 1.0.3
homepage: https://clawic.com/skills/playwright
description: "Browser automation via Playwright MCP."
changelog: Clarified the MCP-first browser automation flow and improved quick-start guidance for forms, screenshots, and extraction.
metadata: {"clawdbot":{"emoji":"P","requires":{"bins":["node","npx"]},"os":["linux","darwin","win32"],"install":[{"id":"npm-playwright","kind":"npm","package":"playwright","bins":["playwright"],"label":"Install Playwright"},{"id":"npm-playwright-mcp","kind":"npm","package":"@playwright/mcp","bins":["playwright-mcp"],"label":"Install Playwright MCP (optional)"}]}}
---

## When to Use

Use this skill for real browser tasks: JS-rendered pages, multi-step forms, screenshots or PDFs, UI debugging, Playwright test authoring, MCP-driven browser control, and structured extraction from rendered pages.

Prefer it when static fetch is insufficient or when the task depends on browser events, visible DOM state, authentication context, uploads or downloads, or user-facing rendering.

If the user mainly wants the agent to drive a browser with simple actions like navigate, click, fill, screenshot, download, or extract, treat MCP as a first-class path.

Use direct Playwright for scripts and tests. Use MCP when browser tools are already in the loop, the user explicitly wants MCP, or the fastest path is browser actions rather than writing new automation code.

Primary fit is repo-owned browser work: tests, debugging, repros, screenshots, and deterministic automation. Treat rendered-page extraction as a secondary use case, not the default identity.

## Architecture

This skill is instruction-only. It does not create local memory, setup folders, or persistent profiles by default.

Load only the smallest reference file needed for the task. Keep auth state temporary unless the repository already standardizes it and the user explicitly wants browser-session reuse.

## Quick Start

### MCP browser path
```bash
npx @playwright/mcp --headless
```

Use this path when the agent already has browser tools available or the user wants browser automation without writing new Playwright code.

### Common MCP actions

Typical Playwright MCP tool actions include:
- `browser_navigate` for opening a page
- `browser_click` and `browser_press` for interaction
- `browser_type` and `browser_select_option` for forms
- `browser_snapshot` and `browser_evaluate` for inspection and extraction
- `browser_choose_file` for uploads
- screenshot, PDF, trace, and download capture through the active browser workflow

### Common browser outcomes

| Goal | Typical MCP-style action |
|------|--------------------------|
| Open and inspect a site | navigate, wait, inspect, screenshot |
| Complete a form | navigate, click, fill, select, submit |
| Capture evidence | screenshot, PDF, download, trace |
| Pull structured page data | navigate, wait for rendered state, extract |
| Reproduce a UI bug | headed run, trace, console or network inspection |

### Existing test suite
```bash
npx playwright test
npx playwright test --headed
npx playwright test --trace on
```

### Bootstrap selectors and flows
```bash
npx playwright codegen https://example.com
```

### Direct script path
```javascript
const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'page.png', fullPage: true });
  await browser.close();
})();
```

## Quick Reference

| Topic | File |
|------|------|
| Selector strategy and frame handling | `selectors.md` |
| Failure analysis, traces, logs, and headed runs | `debugging.md` |
| Test architecture, mocks, auth, and assertions | `testing.md` |
| CI defaults, retries, workers, and failure artifacts | `ci-cd.md` |
| Rendered-page extraction, pagination, and respectful throttling | `scraping.md` |

## Approach Selection

| Situation | Best path | Why |
|----------|-----------|-----|
| Static HTML or a simple HTTP response is enough | Use a cheaper fetch path first | Faster, cheaper, less brittle |
| You need a reliable first draft of selectors or flows | Start with `codegen` or a headed exploratory run | Faster than guessing selectors from source or stale DOM |
| Local app, staging app, or repo-owned E2E suite | Use `@playwright/test` | Best fit for repeatable tests and assertions |
| One-off browser automation, screenshots, downloads, or rendered extraction | Use direct Playwright API | Simple, explicit, and easy to debug in code |
| Agent/browser-tool workflow already depends on `browser_*` tools or the user wants no-code browser control | Use Playwright MCP | Fastest path for navigate-click-fill-screenshot workflows |
| CI failures, flake, or environment drift | Start with `debugging.md` and `ci-cd.md` | Traces and artifacts matter more than new code |
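
The table above can be read as a small decision procedure. The sketch below is one hypothetical encoding of it in TypeScript. The type names, the flag names, and the order in which the rows are checked are assumptions made for illustration; none of them come from Playwright or from this skill.

```typescript
// Hypothetical names throughout: an illustration of the table, not an API.
type Approach =
  | 'http-fetch'
  | 'codegen-or-headed-run'
  | 'playwright-test'
  | 'direct-playwright-api'
  | 'playwright-mcp'
  | 'debugging-and-ci-docs';

interface Situation {
  staticHtmlIsEnough?: boolean;
  ciFailureOrFlake?: boolean;
  needsSelectorDraft?: boolean;
  repoOwnedSuite?: boolean;
  browserToolsInLoop?: boolean;
}

function chooseApproach(s: Situation): Approach {
  // Cheapest option first: skip the browser entirely when HTTP is enough.
  if (s.staticHtmlIsEnough) return 'http-fetch';
  // Failures in CI call for evidence (traces, artifacts) before new code.
  if (s.ciFailureOrFlake) return 'debugging-and-ci-docs';
  if (s.needsSelectorDraft) return 'codegen-or-headed-run';
  if (s.repoOwnedSuite) return 'playwright-test';
  if (s.browserToolsInLoop) return 'playwright-mcp';
  // Fallback row: one-off automation, screenshots, downloads, extraction.
  return 'direct-playwright-api';
}
```

The check order is a judgment call: it puts the "use something cheaper" row first and treats the direct API as the default for one-off work.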

## Core Rules

### 1. Test user-visible behavior and the real browser boundary
- Do not spend Playwright on implementation details that unit or API tests can cover more cheaply.
- Use Playwright when success depends on rendered UI, actionability, auth, uploads or downloads, navigation, or browser-only behavior.

### 2. Make runs isolated before making them clever
- Keep tests and scripts independent so retries, parallelism, and reruns do not inherit hidden state.
- Extend the repository's existing Playwright harness, config, and fixtures before inventing a parallel testing shape from scratch.
- Do not share mutable accounts, browser state, or server-side data across parallel runs unless the suite was explicitly designed for it.

### 3. Reconnaissance before action
- Open, wait, and inspect the rendered state before locking selectors or assertions.
- Use `codegen`, headed mode, or traces to discover stable locators instead of guessing from source or stale DOM.
- For flaky or CI-only failures, capture a trace before rewriting selectors or waits.

### 4. Prefer resilient locators and web-first assertions
- Use role, label, text, alt text, title, or test ID before CSS or XPath.
- Assert the user-visible outcome with Playwright assertions instead of checking only that a click or fill command executed.
- If a locator is ambiguous, disambiguate it. Do not silence strictness with `first()`, `last()`, or `nth()` unless position is the actual behavior under test.

### 5. Wait on actionability and app state, not arbitrary time
- Let Playwright's actionability checks work for you before reaching for sleeps or forced actions.
- Prefer `expect`, URL waits, response waits, and explicit app-ready signals over generic timing guesses.
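
To make the contrast with a fixed sleep concrete, here is a minimal, framework-free sketch of waiting on a condition. `pollUntil` and its option names are hypothetical helpers written for this example; inside a Playwright test the built-in `expect.poll` and web-first assertions already cover this and should be preferred.

```typescript
// Hypothetical helper: re-check a condition until it holds or a deadline passes.
async function pollUntil<T>(
  read: () => T | Promise<T>,
  isReady: (value: T) => boolean,
  options: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<T> {
  const { timeoutMs = 5000, intervalMs = 100 } = options;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const value = await read();
    // Return as soon as the app state is ready, however early that is.
    if (isReady(value)) return value;
    if (Date.now() >= deadline) {
      throw new Error(`Condition not met within ${timeoutMs} ms`);
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}
```

Unlike a fixed sleep, this returns as soon as the state is ready and fails with a clear error when it never becomes ready.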

### 6. Control what you do not own
- Mock or isolate third-party services, flaky upstream APIs, analytics noise, and cross-origin dependencies whenever the goal is to verify your app.
- For rendered extraction, prefer documented APIs or plain HTTP paths before driving a full browser.
- Do not make live third-party widgets or upstream integrations the reason your suite flakes unless that exact integration is what the user asked to validate.

### 7. Keep auth, production access, and persistence explicit
- Do not persist saved browser state by default.
- Reuse auth state only when the repository already standardizes it or the user explicitly asks for session reuse.
- For destructive, financial, medical, production, or otherwise high-stakes flows, prefer staging or local environments and require explicit user confirmation before continuing.
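
Rule 7 can be expressed as a pre-flight check that runs before any browser action. The sketch below is only an illustration: `PlannedFlow`, its fields, and `checkFlow` are hypothetical names, and how an agent actually collects confirmation is left open.

```typescript
// Hypothetical types and names, for illustration only.
type TargetEnvironment = 'local' | 'staging' | 'production';

interface PlannedFlow {
  environment: TargetEnvironment;
  highStakes: boolean;                // destructive, financial, medical, or similar
  userConfirmed: boolean;             // explicit confirmation for this specific flow
  reuseAuthState: boolean;
  repoStandardizesAuthState: boolean;
  userAskedForSessionReuse: boolean;
}

// Returns the reasons the flow must not start yet; an empty list means "go".
function checkFlow(flow: PlannedFlow): string[] {
  const blockers: string[] = [];

  const needsConfirmation = flow.highStakes || flow.environment === 'production';
  if (needsConfirmation && !flow.userConfirmed) {
    blockers.push('explicit user confirmation required');
  }

  const reuseJustified = flow.repoStandardizesAuthState || flow.userAskedForSessionReuse;
  if (flow.reuseAuthState && !reuseJustified) {
    blockers.push('auth state reuse not justified');
  }

  return blockers;
}
```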

## Playwright Traps

- Guessing selectors from source or using `first()`, `last()`, or `nth()` to silence ambiguity -> the automation works once and then flakes.
- Starting a new Playwright structure when the repo already has config, fixtures, auth setup, or conventions -> the new flow fights the existing harness and wastes time.
- Testing internal implementation details instead of visible outcomes -> the suite passes while the user path is still broken.
- Sharing one authenticated state across parallel tests that mutate server-side data -> failures become order-dependent and hard to trust.
- Reaching for `force: true` before understanding overlays, disabled state, or actionability -> the test hides a real bug.
- Waiting on `networkidle` for chatty SPAs -> analytics, polling, or sockets keep the page "busy" even when the UI is ready.
- Driving a full browser when HTTP or an API would answer the question -> more cost, more flake, less signal.
- Treating third-party widgets and live upstream services as if they were stable parts of your own product -> failures stop being actionable.

## External Endpoints

| Endpoint | Data Sent | Purpose |
|----------|-----------|---------|
| User-requested web origins | Browser requests, form input, cookies, uploads, and page interactions required by the task | Automation, testing, screenshots, PDFs, and rendered extraction |
| `https://registry.npmjs.org` | Package metadata and tarballs during optional installation | Install Playwright or Playwright MCP |

No other data is sent externally.

## Security & Privacy

Data that leaves your machine:
- Requests sent to the websites the user asked to automate.
- Optional package-install traffic to npm when installing Playwright tooling.

Data that stays local:
- Source code, traces, screenshots, videos, PDFs, and temporary browser state kept in the workspace or system temp directory.

This skill does NOT:
- Create hidden memory files or local folder systems.
- Recommend browser-fingerprint hacks, challenge-solving services, or rotating exits.
- Persist sessions or credentials by default.
- Make undeclared network requests beyond the target sites involved in the task and optional tool installation.
- Treat high-stakes production flows as safe to automate without explicit user direction.

## Trust

By using this skill, browser requests go to the websites you automate and optional package downloads go through npm.
Only install if you trust those services and the sites involved in your workflow.

## Related Skills
Install with `clawhub install <slug>` if user confirms:
- `web` - HTTP-first investigation before escalating to a real browser.
- `scrape` - Broader extraction workflows when browser automation is not the main challenge.
- `screenshots` - Capture and polish visual artifacts after browser work.
- `multi-engine-web-search` - Find and shortlist target pages before automating them.

## Feedback
- If useful: `clawhub star playwright`
- Stay updated: `clawhub sync`
6
_meta.json
Normal file
@@ -0,0 +1,6 @@
{
  "ownerId": "kn73vp5rarc3b14rc7wjcw8f8580t5d1",
  "slug": "playwright",
  "version": "1.0.3",
  "publishedAt": 1773251092810
}
176
ci-cd.md
Normal file
@@ -0,0 +1,176 @@
# CI Success Defaults

## GitHub Actions

```yaml
name: Playwright Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

      - name: Run tests
        run: npx playwright test

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 7
```

Use the official Playwright image or install browsers explicitly. Always keep traces and failure artifacts.

## GitLab CI

```yaml
playwright:
  image: mcr.microsoft.com/playwright:v1.40.0-jammy
  stage: test
  script:
    - npm ci
    - npx playwright test
  artifacts:
    when: on_failure
    paths:
      - playwright-report/
    expire_in: 7 days
```

## Docker Setup

```dockerfile
FROM mcr.microsoft.com/playwright:v1.40.0-jammy

WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .

CMD ["npx", "playwright", "test"]
```

## When to Add Sharding

```yaml
# GitHub Actions matrix
jobs:
  test:
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - name: Run tests
        run: npx playwright test --shard=${{ matrix.shard }}/4
```

Do not shard a small or unstable suite just to look faster. Add sharding only after the suite is already deterministic.

## playwright.config.ts for CI

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  reporter: process.env.CI
    ? [['html'], ['github']]
    : [['html']],

  use: {
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'on-first-retry',
  },

  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },
  ],
});
```

## Caching Browsers

```yaml
# GitHub Actions
- name: Cache Playwright browsers
  uses: actions/cache@v4
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
```

## Environment Variables

```yaml
env:
  BASE_URL: https://staging.example.com
  CI: true
```

```typescript
// playwright.config.ts
use: {
  baseURL: process.env.BASE_URL || 'http://localhost:3000',
}
```

## Flaky Test Management

```typescript
// Mark a known flaky test.
// The details-object argument needs Playwright 1.42 or newer.
test('sometimes fails', {
  annotation: { type: 'flaky', description: 'Network timing issue' },
}, async ({ page }) => {
  // test code
});

// Retry configuration
export default defineConfig({
  retries: 2,
  expect: {
    timeout: 10000,
  },
});
```

## Common CI Issues

| Issue | Fix |
|-------|-----|
| Browsers not found | Use official Playwright Docker image |
| Display errors | Headless mode or `xvfb-run` |
| Out of memory | Reduce workers, close contexts |
| Timeouts | Increase `actionTimeout`, add retries |
| Inconsistent screenshots | Set fixed viewport, disable animations |
| Order-dependent failures | Remove shared auth or shared mutable test data |
136
debugging.md
Normal file
@@ -0,0 +1,136 @@
# Debugging Guide

## First Moves

1. Reproduce in headed mode.
2. Capture a trace before rewriting selectors or waits.
3. Check whether the failure is selector drift, actionability, environment drift, shared state, or a real product regression.

## Inspector and Headed Runs

```bash
npx playwright test --debug
npx playwright test my-test.spec.ts --debug
npx playwright test --headed
```

```typescript
await page.pause();
```

## Trace Viewer

```bash
npx playwright test --trace on
npx playwright show-trace trace.zip
```

```typescript
use: {
  trace: 'retain-on-failure',
}
```

Start with traces for CI and flaky failures. Use screenshots and videos as supporting evidence, not as the primary debugging tool.

## Common Errors

### Element Not Found

Use explicit waits and confirm the right frame or shadow boundary before rewriting selectors. If the locator is ambiguous, improve the locator instead of clicking the first match.

```typescript
// Wait until the element is attached and visible.
await page.waitForSelector('.element');
// If the element lives inside an iframe, locate it through the frame.
const frame = page.frameLocator('iframe');
await frame.locator('.element').click();
// Last resort: give a genuinely slow element a longer timeout.
await page.click('.element', { timeout: 60000 });
```

### Flaky Click

Check visibility, scrolling, overlays, and disabled state before forcing the click.

```typescript
await page.locator('.btn').waitFor({ state: 'visible' });
await page.locator('.btn').scrollIntoViewIfNeeded();
await page.locator('.btn').click();
```

Use `force: true` only after confirming that the overlay or disabled state is not the real bug.

If the click target keeps changing, inspect actionability conditions first: visible, stable, enabled, and actually receiving pointer events.

### Timeout in CI

Slow environments usually need better waits, traces, or fewer workers before they need bigger timeouts.

```typescript
// playwright.config.ts: per-test and per-assertion timeouts
export default defineConfig({
  timeout: 60000,
  expect: { timeout: 10000 },
});

// In a test: poll app state instead of sleeping
await expect.poll(async () => {
  return await page.locator('.items').count();
}, { timeout: 30000 }).toBeGreaterThan(5);
```

### Network Issues

```typescript
page.on('request', request => {
  console.log('>>', request.method(), request.url());
});

page.on('response', response => {
  console.log('<<', response.status(), response.url());
});

const responsePromise = page.waitForResponse('**/api/data');
await page.click('.load-data');
const response = await responsePromise;
```

## Failure Artifacts

```typescript
test.afterEach(async ({ page }, testInfo) => {
  if (testInfo.status !== 'passed') {
    await page.screenshot({
      path: `screenshots/${testInfo.title}.png`,
      fullPage: true,
    });
  }
});
```
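
Test titles can contain spaces, slashes, colons, and other characters that are awkward or invalid in file paths, so a hook like the one above may benefit from sanitizing the title first. The helper below is a hypothetical sketch (`safeArtifactName` is not a Playwright API) that keeps a conservative set of characters.

```typescript
// Hypothetical helper: turn an arbitrary test title into a path-safe file name.
function safeArtifactName(title: string, maxLength = 100): string {
  const cleaned = title
    .replace(/[^a-zA-Z0-9._-]+/g, '-') // replace each run of other characters with one dash
    .replace(/^-+|-+$/g, '');          // trim leading and trailing dashes
  // Fall back to a fixed name when nothing usable is left.
  return (cleaned || 'untitled').slice(0, maxLength);
}
```

It would be used as `path: \`screenshots/${safeArtifactName(testInfo.title)}.png\``. Titles are not guaranteed to be unique across files, so Playwright's `testInfo.outputPath()` is an alternative when a per-test output directory is preferred.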

## Console and Runtime Errors

```typescript
page.on('console', msg => {
  console.log('PAGE LOG:', msg.text());
});

page.on('pageerror', error => {
  console.log('PAGE ERROR:', error.message);
});
```

## Compare Local vs CI

| Check | Command |
|-------|---------|
| Viewport | `page.viewportSize()` |
| User agent | `await page.evaluate(() => navigator.userAgent)` |
| Timezone | `await page.evaluate(() => Intl.DateTimeFormat().resolvedOptions().timeZone)` |
| Network | `page.on('request', ...)` |
| Shared auth/data | verify whether tests mutate the same account or fixtures |
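
Once the values in the table have been captured in both environments, comparing them is plain data work. The sketch below assumes a hypothetical `EnvSnapshot` shape holding the first three checks and reports only the fields that differ; the names are illustrative, not part of Playwright.

```typescript
// Hypothetical snapshot of the environment checks listed above.
interface EnvSnapshot {
  viewport: { width: number; height: number } | null;
  userAgent: string;
  timeZone: string;
}

function describeViewport(v: EnvSnapshot['viewport']): string {
  return v ? `${v.width}x${v.height}` : 'none';
}

// Returns one human-readable line per field that differs between local and CI.
function diffSnapshots(local: EnvSnapshot, ci: EnvSnapshot): string[] {
  const rows: Array<[string, string, string]> = [
    ['viewport', describeViewport(local.viewport), describeViewport(ci.viewport)],
    ['userAgent', local.userAgent, ci.userAgent],
    ['timeZone', local.timeZone, ci.timeZone],
  ];
  return rows
    .filter(([, a, b]) => a !== b)
    .map(([name, a, b]) => `${name}: local=${a} ci=${b}`);
}
```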

## Debugging Checklist

1. Run with `--debug` or `--headed`.
2. Add `await page.pause()` before the failure point.
3. Capture trace, screenshot, and console output before changing selectors.
4. Check for iframes, shadow DOM, overlays, loading states, and shared auth or data collisions.
5. Compare viewport, network behavior, workers, and environment flags between local and CI.
6. Only then rewrite selectors, waits, fixtures, or test structure.
139
scraping.md
Normal file
@@ -0,0 +1,139 @@
# Rendered-Page Extraction Patterns

Use browser extraction only when the data is hidden behind rendering, client-side interactions, pagination, or downloads. If a plain HTTP fetch or documented API can answer the question, use that first.

## Basic Extraction

```typescript
import { chromium } from 'playwright';

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example.com/products');
await page.waitForSelector('.product-card');

const products = await page.$$eval('.product-card', cards =>
  cards.map(card => ({
    name: card.querySelector('.name')?.textContent?.trim(),
    price: card.querySelector('.price')?.textContent?.trim(),
    url: card.querySelector('a')?.href,
  }))
);

await browser.close();
```

## Wait Strategies for Dynamic Apps

```typescript
import { expect } from '@playwright/test';

await page.waitForSelector('[data-loaded="true"]');
await page.waitForSelector('.loading-spinner', { state: 'hidden' });

await expect.poll(async () => {
  return await page.locator('.product').count();
}).toBeGreaterThan(0);
```

Use `networkidle` only when the app genuinely becomes quiet. Polling, analytics, and sockets often make it the wrong condition.

## Infinite Scroll

```typescript
import { expect, type Page } from '@playwright/test';

async function scrollToBottom(page: Page) {
  let previousHeight = 0;

  while (true) {
    const currentHeight = await page.evaluate(() => document.body.scrollHeight);
    if (currentHeight === previousHeight) break;

    previousHeight = currentHeight;
    const previousCount = await page.locator('.product-card').count();
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));

    try {
      // Wait for the next batch of cards to render.
      await expect.poll(async () => {
        return await page.locator('.product-card').count();
      }).toBeGreaterThan(previousCount);
    } catch {
      // No new cards before the poll timed out: treat this as the end of the list.
      break;
    }
  }
}
```

## Pagination

```typescript
import type { Page } from 'playwright';

// `extractData` is assumed to be a caller-defined helper that returns an array of records.
async function scrapeAllPages(page: Page) {
  const allData = [];

  while (true) {
    allData.push(...await extractData(page));

    const nextButton = page.getByRole('button', { name: 'Next' });
    if (!(await nextButton.isVisible()) || await nextButton.isDisabled()) break;

    await nextButton.click();
    await page.locator('.results').waitFor();
  }

  return allData;
}
```

## Retries and Error Handling

```typescript
import type { BrowserContext } from 'playwright';

// `extractData` is assumed to be a caller-defined helper that reads the page.
async function scrapeWithRetry(context: BrowserContext, url: string, retries = 3) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    const page = await context.newPage();
    try {
      await page.goto(url, { timeout: 30000 });
      return await extractData(page);
    } catch (error) {
      if (attempt === retries) throw error;
      // Linear backoff: wait 1 s, then 2 s, before the next attempt.
      await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
    } finally {
      await page.close();
    }
  }
}
```
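
The retry policy in the function above does not depend on the browser, so it can be factored out and exercised without one. The following is a minimal sketch under that assumption; `withRetry` and `backoffDelayMs` are hypothetical names, and the linear backoff mirrors the `1000 * attempt` delay used above.

```typescript
// Hypothetical helpers: the same linear-backoff policy, independent of Playwright.
function backoffDelayMs(attempt: number, baseMs = 1000): number {
  return baseMs * attempt;
}

async function withRetry<T>(
  task: (attempt: number) => Promise<T>,
  retries = 3,
  baseMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // Sleep only if another attempt will follow.
      if (attempt < retries) {
        await new Promise<void>(resolve => setTimeout(resolve, backoffDelayMs(attempt, baseMs)));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

A page-level task would open a fresh page inside `task` and close it in a `finally` block, as the original function does.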

## Respectful Throttling

```typescript
class RateLimiter {
  private lastRequest = 0;

  constructor(private minDelay: number) {}

  async wait() {
    const elapsed = Date.now() - this.lastRequest;
    if (elapsed < this.minDelay) {
      await new Promise(resolve => setTimeout(resolve, this.minDelay - elapsed));
    }
    this.lastRequest = Date.now();
  }
}
```

Throttle multi-page work, respect robots and terms where applicable, and keep scope aligned with the user's request.
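
As a usage sketch, the limiter can gate a sequential loop over URLs. The block repeats a copy of the class so that it runs on its own (with the field assigned explicitly in the constructor); `visitAll` and the `visit` callback are hypothetical names, and in real use `visit` would wrap `page.goto` plus extraction.

```typescript
// Self-contained copy of the limiter so the example runs on its own.
class RateLimiter {
  private lastRequest = 0;
  private readonly minDelayMs: number;

  constructor(minDelayMs: number) {
    this.minDelayMs = minDelayMs;
  }

  async wait(): Promise<void> {
    const elapsed = Date.now() - this.lastRequest;
    if (elapsed < this.minDelayMs) {
      await new Promise<void>(resolve => setTimeout(resolve, this.minDelayMs - elapsed));
    }
    this.lastRequest = Date.now();
  }
}

// Hypothetical driver: visit URLs one at a time, never faster than the limiter allows.
async function visitAll(
  urls: string[],
  limiter: RateLimiter,
  visit: (url: string) => Promise<void>,
): Promise<void> {
  for (const url of urls) {
    await limiter.wait();
    await visit(url);
  }
}
```

Keeping the loop sequential is deliberate: running visits in parallel would bypass the spacing the limiter is meant to enforce.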

## Structured Data Extraction

```typescript
const jsonLd = await page.$eval(
  'script[type="application/ld+json"]',
  el => JSON.parse(el.textContent || '{}')
);

const tableData = await page.$$eval('table tbody tr', rows =>
  rows.map(row =>
    Array.from(row.querySelectorAll('td')).map(td => td.textContent?.trim())
  )
);
```

When extracting repeated items, prefer locating the correct collection first and only then evaluating over that collection. Do not scrape the whole page if the user only needs one bounded region.
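
The JSON-LD snippet above parses inside the page and throws if the script content is not valid JSON. One alternative is to return the raw text and parse it defensively on the Node side. The helper below is a hypothetical sketch of that idea (`parseJsonLd` is not a Playwright API); it also flattens the standard JSON-LD `@graph` container.

```typescript
type JsonLdNode = Record<string, unknown>;

function isPlainObject(value: unknown): value is JsonLdNode {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Hypothetical helper: parse one JSON-LD script body into a flat list of nodes.
function parseJsonLd(raw: string | null | undefined): JsonLdNode[] {
  if (!raw || !raw.trim()) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return []; // malformed markup is treated as "no structured data"
  }

  const candidates: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  const nodes: JsonLdNode[] = [];
  for (const candidate of candidates) {
    if (!isPlainObject(candidate)) continue;
    const graph = candidate['@graph'];
    if (Array.isArray(graph)) {
      // A document may wrap its nodes in an @graph array.
      nodes.push(...graph.filter(isPlainObject));
    } else {
      nodes.push(candidate);
    }
  }
  return nodes;
}
```

With Playwright, the raw strings could come from `page.$$eval('script[type="application/ld+json"]', els => els.map(el => el.textContent))`, with each one passed through `parseJsonLd`.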

## Avoid by Default

- Browser-fingerprint hacks, rotating exits, or challenge-solving services.
- Blind session persistence across tasks.
- Extraction plans that ignore cheaper API or HTTP paths.
- Wide crawls when the user only asked for one page, one result set, or one bounded workflow.
84
selectors.md
Normal file
@@ -0,0 +1,84 @@
# Selector Strategies

## Hierarchy (Most to Least Resilient)

### 1. Role-Based (Best)
```typescript
page.getByRole('button', { name: 'Submit' })
page.getByRole('link', { name: /sign up/i })
page.getByRole('heading', { level: 1 })
page.getByRole('textbox', { name: 'Email' })
```

### 2. Test IDs (Explicit)
```typescript
page.getByTestId('checkout-button')
page.getByTestId('product-card')
```
Configure in `playwright.config.ts`:
```typescript
use: { testIdAttribute: 'data-testid' }
```

### 3. Label/Placeholder (Forms)
```typescript
page.getByLabel('Email address')
page.getByPlaceholder('Enter your email')
```

### 4. Text Content (Visible)
```typescript
page.getByText('Add to Cart', { exact: true })
page.getByText(/welcome/i)  // regex for flexibility
```

### 5. CSS (Last Resort)
```typescript
// Avoid these patterns:
page.locator('.css-1a2b3c')                 // generated class
page.locator('div > span:nth-child(2)')     // positional
page.locator('#root > div > div > button')  // deep nesting

// Acceptable:
page.locator('[data-product-id="123"]')     // semantic attribute
page.locator('form.login-form')             // stable class
```

## Chaining and Filtering

```typescript
page.getByRole('listitem').filter({ hasText: 'Product A' })

page.getByTestId('cart').getByRole('button', { name: 'Remove' })
```

Prefer filtering or parent-child chaining over `first()`, `last()`, or `nth()`. Use positional locators only when order is the thing being tested or there is genuinely no stable identity.

If a locator matches multiple elements, do not silence strictness with position by default. Disambiguate the locator until it represents the intended target.

## Frame Handling

```typescript
// Named frame
const frame = page.frameLocator('iframe[name="checkout"]')
await frame.getByRole('button', { name: 'Pay' }).click()

// Frame by URL
page.frameLocator('iframe[src*="stripe"]')
```

## Shadow DOM

```typescript
// Playwright locators pierce open shadow roots by default (XPath does not)
page.locator('my-component').getByRole('button')
```

## Common Mistakes

| Mistake | Better |
|---------|--------|
| `page.locator('button').click()` | `page.getByRole('button', { name: 'Submit' }).click()` |
| `page.getByTestId('product-card').first()` | filter or chain until only the intended card matches |
| `nth-child(3)` | Filter by text, role, test ID, or parent context |
| `//div[@class="xyz"]/span[2]` | Role-based or test ID |
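
The patterns flagged in this file are regular enough to check mechanically, for example in a review script. The function below is a rough, hypothetical heuristic (`brittleSelectorReasons` is not part of Playwright); the `.css-` prefix check assumes CSS-in-JS style generated class names, and the nesting threshold of three `>` combinators is an arbitrary choice.

```typescript
// Hypothetical lint-style heuristic for the selector patterns this guide advises against.
function brittleSelectorReasons(selector: string): string[] {
  const reasons: string[] = [];

  if (selector.startsWith('//') || selector.startsWith('xpath=')) {
    reasons.push('XPath');
  }
  if (/:nth-(child|of-type)\(/.test(selector)) {
    reasons.push('positional');
  }
  if (/\.css-[a-z0-9]+/i.test(selector)) {
    reasons.push('generated class');
  }
  // Count child combinators as a proxy for structural coupling.
  if ((selector.match(/>/g) ?? []).length >= 3) {
    reasons.push('deep nesting');
  }

  return reasons;
}
```

An empty result does not prove a selector is resilient; it only means none of these specific patterns matched.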
148
testing.md
Normal file
@@ -0,0 +1,148 @@
# Testing Patterns

## What to Test First

Prioritize critical user journeys, auth boundaries, payments, file upload or download flows, and state transitions that are expensive to break.

Do not spend E2E budget on trivial presentational details that cheaper unit or component tests can cover.

Prefer assertions on what the user can observe: visible state, text, URL, enabled or disabled controls, downloads, navigation, and persisted app state.

## Test Structure

```typescript
import { test, expect } from '@playwright/test';

test.describe('Checkout Flow', () => {
  test.beforeEach(async ({ page }) => {
    await page.goto('/products');
  });

  test('completes purchase with valid card', async ({ page }) => {
    await page.getByTestId('product-card')
      .filter({ hasText: 'Product A' })
      .click();
    await page.getByRole('button', { name: 'Add to Cart' }).click();
    await page.getByRole('link', { name: 'Checkout' }).click();
    await expect(page.getByRole('heading', { name: 'Order Summary' })).toBeVisible();
  });
});
```

## Page Object Model

Use page objects when a flow is reused across many tests. Do not build a giant abstraction layer before duplication is real.

```typescript
import { expect, type Locator, type Page } from '@playwright/test';

export class CheckoutPage {
  readonly cartItems: Locator;
  readonly checkoutButton: Locator;
  readonly totalPrice: Locator;

  // Locators are assigned in the constructor: field initializers that read
  // `this.page` can be rejected or run too early under `useDefineForClassFields`.
  constructor(private readonly page: Page) {
    this.cartItems = page.getByTestId('cart-item');
    this.checkoutButton = page.getByRole('button', { name: 'Checkout' });
    this.totalPrice = page.getByTestId('total-price');
  }

  // Removes the cart row whose text contains `name`.
  async removeItem(name: string) {
    await this.cartItems
      .filter({ hasText: name })
      .getByRole('button', { name: 'Remove' })
      .click();
  }

  async expectTotal(amount: string) {
    await expect(this.totalPrice).toHaveText(amount);
  }
}
```

## Fixtures

```typescript
import { test as base } from '@playwright/test';
import { CheckoutPage } from './pages/checkout.page';

type Fixtures = {
  checkoutPage: CheckoutPage;
};

export const test = base.extend<Fixtures>({
  checkoutPage: async ({ page }, use) => {
    await page.goto('/checkout');
    await use(new CheckoutPage(page));
  },
});
```

## Isolation Rules

- Keep tests independent so they can run alone, in parallel, or after retries without hidden dependencies.
- If tests mutate shared backend state, use dedicated accounts, seeded data, or per-worker isolation instead of reusing one mutable user everywhere.
- When auth is shared, prefer the Playwright setup-project pattern or one account per worker over ad hoc state reuse.
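
One way to get one account per worker, sketched under assumptions: the helper below derives a dedicated test account from a worker's index. The name `accountForWorker`, the `e2e-worker-N` naming scheme, and the `example.test` domain are all hypothetical, and the accounts are assumed to exist already in seeded data. In a Playwright suite the index would typically come from `test.info().parallelIndex`.

```typescript
// Hypothetical helper: maps a worker index to its own pre-seeded test account,
// so parallel workers never mutate the same user's data.
interface WorkerAccount {
  username: string;
  email: string;
}

function accountForWorker(parallelIndex: number, domain = 'example.test'): WorkerAccount {
  // Reject anything that is not a non-negative integer index.
  if (!Number.isInteger(parallelIndex) || parallelIndex < 0) {
    throw new RangeError(`invalid worker index: ${parallelIndex}`);
  }
  // Assumed naming scheme; adjust to whatever the seed data actually creates.
  const username = `e2e-worker-${parallelIndex}`;
  return { username, email: `${username}@${domain}` };
}
```

A login fixture could call this once per worker and sign in with the returned account, which keeps retries and parallel runs from colliding on one shared user.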

## Mock What You Do Not Need to Re-Test

```typescript
import { test, expect } from '@playwright/test';

test('shows error on API failure', async ({ page }) => {
  // Stub the checkout endpoint so the test does not depend on the real backend.
  await page.route('**/api/checkout', async route => {
    await route.fulfill({
      status: 500,
      contentType: 'application/json',
      body: JSON.stringify({ error: 'Payment failed' }),
    });
  });

  await page.goto('/checkout');
  await page.getByRole('button', { name: 'Pay' }).click();
  await expect(page.getByText('Payment failed')).toBeVisible();
});
```

Avoid testing third-party widgets, analytics, payment processors, or upstream APIs end to end unless the point of the test is that exact integration.
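
When several tests stub failing endpoints, a small helper can keep the mocked response shape consistent. This is only a sketch: `jsonErrorResponse` is a hypothetical name, and the `{ error: message }` body shape is assumed from the checkout example rather than from any real API contract.

```typescript
// Covers only the route.fulfill() options this helper sets.
interface MockResponse {
  status: number;
  contentType: string;
  body: string;
}

// Builds a JSON error payload suitable for passing to route.fulfill().
function jsonErrorResponse(status: number, message: string): MockResponse {
  return {
    status,
    contentType: 'application/json',
    body: JSON.stringify({ error: message }),
  };
}
```

Inside a route handler this would be used as `await route.fulfill(jsonErrorResponse(500, 'Payment failed'))`.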

## Visual Regression

Use visual assertions for layout or rendering regressions that humans would otherwise miss. Keep viewport, fonts, and animations deterministic.

```typescript
test('matches snapshot', async ({ page }) => {
  await page.goto('/dashboard');
  await expect(page).toHaveScreenshot('dashboard.png', {
    maxDiffPixels: 100,
  });
});
```
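
The determinism advice above can also be pinned at the config level. The object below is a sketch of what the default export of `playwright.config.ts` might contain; the name `screenshotConfig` and the specific viewport, locale, and timezone values are assumptions, not project requirements.

```typescript
// Sketch: pin the inputs that most often cause screenshot noise.
// In playwright.config.ts this object would be the default export.
const screenshotConfig = {
  use: {
    viewport: { width: 1280, height: 720 }, // fixed size (assumed value)
    deviceScaleFactor: 1,                   // avoid HiDPI differences between machines
    locale: 'en-US',                        // stable number and date formatting
    timezoneId: 'UTC',                      // stable rendered timestamps
  },
  expect: {
    toHaveScreenshot: {
      animations: 'disabled' as const,      // freeze CSS animations and transitions
      maxDiffPixels: 100,                   // same tolerance as the test above
    },
  },
};
```

Fonts are not something this config controls. Generating and comparing baselines on the same OS image as CI is usually what keeps text rendering stable.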

## Parallelization

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  workers: process.env.CI ? 4 : undefined, // undefined keeps Playwright's default worker count
  fullyParallel: true,
});

// In a test file, pick one mode per file or describe block:
test.describe.configure({ mode: 'parallel' }); // run the tests in this scope concurrently
test.describe.configure({ mode: 'serial' }); // run in order; after a failure the rest are skipped
```

## Authentication State

Persist auth only when the suite already standardizes that pattern and the stored state is safe to reuse.

```typescript
const authFile = 'playwright/.auth/user.json';
// Reuse a saved auth file only in suites that intentionally standardize it.
```

For one-off debugging, privileged accounts, or stateful flows that mutate backend data, prefer logging in inside the test or using isolated worker accounts instead of carrying one shared session everywhere.
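
For suites that do standardize saved auth, the setup-project pattern might be wired roughly as follows. This is a sketch written as a plain object so it stands alone; in a real `playwright.config.ts` it would be the default export. The name `authConfig`, the project names `setup` and `chromium`, and the `*.setup.ts` file pattern are assumptions.

```typescript
// Sketch of the setup-project pattern as a plain config object.
const authFile = 'playwright/.auth/user.json';

const authConfig = {
  projects: [
    // Runs first: files matching *.setup.ts log in and save storage state to authFile.
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: { storageState: authFile }, // tests in this project start from the saved session
      dependencies: ['setup'],         // do not start until the setup project has passed
    },
  ],
};
```

The setup file itself would sign in through the UI or an API and then call `page.context().storageState({ path: authFile })`. Because that file holds live session data, it should stay out of version control.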

## Assertions

```typescript
await expect(locator).toBeVisible();
await expect(locator).toHaveText('Expected');
await expect(locator).toBeEnabled();
await expect(locator).toHaveAttribute('href', '/path');
await expect(page).toHaveURL(/dashboard/);

// Poll arbitrary page state when no built-in locator assertion fits.
await expect.poll(async () => {
  // `dataLoaded` is an app-defined global, so it is not on the Window type.
  return await page.evaluate(() => (window as any).dataLoaded);
}).toBe(true);
```