feat: audit any live site with npx aeo.js check <url> #63
Ports the check.aeojs.org crawler into the library: AEO surface discovery (robots.txt, llms.txt, llms-full.txt, sitemap.xml, ai-index.json, homepage), a 23-bot AI crawler access matrix parsed from robots.txt, and a bounded page crawler (sitemap-first with homepage-link fallback, configurable timeout/concurrency/max pages). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
remoteAuditSite() runs the same 5-category / 100-point GEO audit as auditSite() but against crawled live-site data (discovery surface + pages). buildRemoteReport() adds per-page citability scores, platform hints (including Claude and Gemini driven by live bot access), and aeo.js usage detection. formatRemoteReport() renders the terminal view. This is the scoring engine check.aeojs.org reimplements today; it can now import it from the library so web, CLI, and extension stay in sync. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`npx aeo.js check mysite.com` (or a full https:// URL) now scans the live site — discovery surface, bot access matrix, bounded crawl — and prints the same 5-category GEO readiness score the local audit uses. `report <url>` adds platform hints and per-page citability; --json emits the full scan report with raw HTML stripped. Bare domains get https:// prepended; invalid targets and unreachable hosts exit 1. Also exports the remote crawl/audit API from the package root so check.aeojs.org and the browser extension can share the same scoring. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Docs Preview
Preview URL: https://feat-remote-url-check.aeojs.pages.dev
This preview was deployed from the latest commit on this PR.
Greptile Summary
This PR adds

Confidence Score: 3/5
Safe to merge for CLI use, but the unbounded response-body read in `fetchText` needs addressing before check.aeojs.org imports these functions server-side.

The crawler's `fetchText` buffers the full response body with no size limit. Because the PR explicitly designates this code for import by the check.aeojs.org web service, where arbitrary user-submitted URLs are processed, a target serving a large response body could exhaust server memory. The SSRF gap (private IPs not blocked) compounds this for the server context. Both issues are straightforward to fix but should be resolved before the web service migration.

src/core/remote-crawl.ts: specifically `fetchText` (response body size) and the `discover`/`crawlPages` public API (SSRF validation).

| Filename | Overview |
|---|---|
| src/core/remote-crawl.ts | New crawler: fetches AEO surface files, parses robots.txt into a 23-bot access matrix, and crawls up to 10 inner pages. Unbounded res.text() call in fetchText poses a DoS risk for server-side use; no SSRF guard on the public API; all exported types use interface instead of type. |
| src/core/remote-audit.ts | New 5-category / 100-point remote audit engine mirroring the local audit, plus platform hints for Claude and Gemini driven by live bot-access data. Logic looks correct; RemoteScanReport uses interface instead of type. |
| src/cli.ts | Adds positional URL support to check and report, bare-domain normalization, Node 18 guard, and remote scan dispatch. cmdCheckRemote and cmdReportRemote produce identical JSON output despite having different text-mode behavior. |
| src/index.ts | Re-exports all new crawler and audit symbols, making them available as a package-level API for consumers like check.aeojs.org. |
| src/core/remote-crawl.test.ts | Good coverage of robots.txt parsing (wildcard, specific bot, grouped agents, allow-override), sitemap URL extraction, link extraction, and discover/crawlPages with mocked fetch including an unreachable-site path. |
| src/core/remote-audit.test.ts | Covers all five audit categories, aeo.js detection, full report construction (including Claude/Gemini platform hints), and the terminal formatter. Tests are well-structured and representative. |
| src/cli.test.ts | Adds tests for positional argument capture, mixed flags/positionals, and normalizeTargetUrl including rejection of localhost, FTP, and bare non-domain strings. |
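
The robots.txt cases named in the table (wildcard, specific bot, grouped agents, allow-override) can be illustrated with a small sketch. This is not the library's `parseRobotsTxtBotAccess`; `parseGroups` and `isBotAllowedAtRoot` are hypothetical names, the bot names in the usage note are only examples, and the check is simplified to the site root (`/`) rather than full path matching.

```typescript
// Hypothetical sketch: decide whether a crawler may fetch "/" according to robots.txt.
type Rule = { allow: boolean; path: string };
type Group = { agents: string[]; rules: Rule[] };

function parseGroups(robotsTxt: string): Group[] {
  const groups: Group[] = [];
  let current: Group | null = null;
  let lastWasAgent = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines share one group ("grouped agents").
      if (current === null || !lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else {
      if ((field === "allow" || field === "disallow") && current !== null && value !== "") {
        // An empty Disallow means "nothing is disallowed", so it is skipped.
        current.rules.push({ allow: field === "allow", path: value });
      }
      lastWasAgent = false;
    }
  }
  return groups;
}

function isBotAllowedAtRoot(robotsTxt: string, bot: string): boolean {
  const groups = parseGroups(robotsTxt);
  const name = bot.toLowerCase();
  // A group naming the bot takes precedence over the "*" group.
  const specific = groups.filter((g) => g.agents.indexOf(name) !== -1);
  const matching = specific.length > 0 ? specific : groups.filter((g) => g.agents.indexOf("*") !== -1);

  let allowsRoot = false;
  let disallowsRoot = false;
  for (const group of matching) {
    for (const rule of group.rules) {
      if (rule.path !== "/") continue; // simplification: only root-level rules
      if (rule.allow) allowsRoot = true;
      else disallowsRoot = true;
    }
  }
  // "Allow: /" overrides "Disallow: /" in the same matching set.
  return allowsRoot || !disallowsRoot;
}
```

With `User-agent: *` / `Disallow: /` followed by a group listing `GPTBot` and `ClaudeBot` with `Allow: /`, this sketch reports both named bots as allowed and any other bot as blocked. A full access matrix would repeat the check for each tracked crawler.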
Sequence Diagram
sequenceDiagram
participant User
participant CLI as cli.ts (main)
participant RC as remote-crawl.ts
participant RA as remote-audit.ts
participant Site as Target Site
User->>CLI: npx aeo.js check mysite.com
CLI->>CLI: "normalizeTargetUrl("mysite.com") -> "https://mysite.com/""
CLI->>RC: discover(targetUrl)
par AEO surface discovery
RC->>Site: GET /robots.txt
RC->>Site: GET /llms.txt
RC->>Site: GET /llms-full.txt
RC->>Site: GET /sitemap.xml
RC->>Site: GET /ai-index.json
RC->>Site: GET / (homepage)
end
RC->>RC: "parseRobotsTxtBotAccess(robotsTxt) -> 23-bot matrix"
RC-->>CLI: DiscoveryResult
CLI->>RC: crawlPages(discovery, targetUrl)
loop "sitemap URLs (up to 10), batched by concurrency=5"
RC->>Site: GET /page-n
end
RC-->>CLI: CrawledPage[]
CLI->>RA: buildRemoteReport(url, discovery, pages)
RA->>RA: "remoteAuditSite() -> 5 categories / 100 pts"
RA->>RA: scorePageCitability() per page
RA->>RA: generatePlatformHints() + Claude/Gemini hints
RA-->>CLI: RemoteScanReport
CLI->>User: formatRemoteReport(report) or JSON
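
The crawl loop in the diagram (up to 10 URLs, fetched in batches of 5) could be sketched roughly as follows. This is an illustration under assumptions, not the code in `remote-crawl.ts`: `crawlInBatches`, `FetchPage`, and `Page` are hypothetical names, and the fetcher is injected so the batching logic can be exercised without network access.

```typescript
// Hypothetical sketch of a bounded, batched page crawl.
type FetchPage = (url: string) => Promise<string>;
type Page = { url: string; html: string };

async function crawlInBatches(
  urls: string[],
  fetchPage: FetchPage,
  maxPages = 10, // cap on pages, matching the default described in the PR
  concurrency = 5 // pages fetched in parallel per batch
): Promise<Page[]> {
  const pages: Page[] = [];
  const queue = urls.slice(0, maxPages);

  for (let i = 0; i < queue.length; i += concurrency) {
    const batch = queue.slice(i, i + concurrency);
    // Each batch runs in parallel; a failed page is dropped instead of failing the crawl.
    const settled = await Promise.all(
      batch.map(async (url): Promise<Page | null> => {
        try {
          return { url, html: await fetchPage(url) };
        } catch {
          return null;
        }
      })
    );
    for (const page of settled) {
      if (page !== null) pages.push(page);
    }
  }
  return pages;
}
```

Batching is simpler than a worker pool but waits for the slowest page in each batch before starting the next one; a pool would keep all slots busy at the cost of a little more bookkeeping.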
Comments Outside Diff (2)
-
src/core/remote-crawl.ts, line 1193-1197
**Unbounded response body buffering**
`res.text()` reads the entire response body into a single string with no size cap. The PR explicitly targets check.aeojs.org as a future consumer of these exports, meaning `discover` will run server-side against arbitrary user-submitted URLs. A malicious or pathological target can serve a multi-GB HTML body; `res.text()` will buffer all of it before returning, risking OOM and crashing the server.
Consider reading the `content-length` header first and bailing early when it exceeds a reasonable threshold (e.g. 2 MB). Note that a missing or lying `content-length` won't be caught this way; streaming is safer for a hardened server deployment.
-
src/core/remote-crawl.ts, line 1333-1378
**SSRF exposure when used as a library**
`discover` (and by extension `crawlPages`) will faithfully fetch whatever URL it is given, including private IP ranges (`10.x.x.x`, `192.168.x.x`, `172.16-31.x.x`), loopback (`127.0.0.1`, `::1`), and cloud metadata endpoints (`169.254.169.254`). The CLI's `normalizeTargetUrl` rejects bare `localhost` but does nothing for raw IP addresses or link-local hosts.
Because the PR description explicitly calls out that check.aeojs.org will import these functions directly, any server-side integration that passes a user-supplied URL to `discover` without an additional validation layer would be vulnerable to SSRF.
Prompt To Fix All With AI
Fix the following 5 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 5
src/core/remote-crawl.ts:3-40
`interface` used for simple data structures across both new files, violating custom rules that mandate `type` for DTOs and data structures that don't use inheritance or extension. This pattern repeats for `BotAccessEntry`, `DiscoveryResult`, `CrawledPage`, and `RemoteCrawlOptions` in this file, and `RemoteScanReport` in `remote-audit.ts`.
```suggestion
export type BotAccessEntry = {
bot: string;
company: string;
purpose: string;
allowed: boolean;
};
export type DiscoveryResult = {
robotsTxt: { exists: boolean; content: string | null; hasAiDisallow: boolean };
llmsTxt: { exists: boolean; contentLength: number; content?: string | null };
llmsFullTxt: { exists: boolean; contentLength: number };
sitemap: { exists: boolean; urls: string[] };
aiIndex: { exists: boolean; content?: string | null };
homepage: { html: string; url: string } | null;
botAccess: BotAccessEntry[];
};
export type CrawledPage = {
url: string;
pathname: string;
html: string;
title?: string;
description?: string;
content?: string;
jsonLd?: object[];
ogTags?: Record<string, string>;
};
export type RemoteCrawlOptions = {
/** Per-request timeout in milliseconds. Default: 12000. */
timeoutMs?: number;
/** Maximum inner pages to crawl beyond the homepage. Default: 10. */
maxPages?: number;
/** Concurrent page fetches. Default: 5. */
concurrency?: number;
/** User-Agent header for all requests. */
userAgent?: string;
};
```
### Issue 2 of 5
src/core/remote-audit.ts:9-22
`RemoteScanReport` is a pure data container with no inheritance — should be `type` per the same convention as the other DTOs in this module.
```suggestion
export type RemoteScanReport = {
url: string;
scannedAt: string;
discovery: DiscoveryResult;
pages: CrawledPage[];
audit: AuditResult;
citability: {
averageScore: number;
pages: PageCitabilityResult[];
};
platformHints: PlatformHint[];
botAccess: BotAccessEntry[];
usesAeoJs: boolean;
};
```
### Issue 3 of 5
src/core/remote-crawl.ts:1193-1197
**Unbounded response body buffering**
`res.text()` reads the entire response body into a single string with no size cap. The PR explicitly targets check.aeojs.org as a future consumer of these exports, meaning `discover` will run server-side against arbitrary user-submitted URLs. A malicious or pathological target can serve a multi-GB HTML body; `res.text()` will buffer all of it before returning, risking OOM and crashing the server.
Consider reading the `content-length` header first and bailing early when it exceeds a reasonable threshold (e.g. 2 MB). Note that a missing or lying `content-length` won't be caught this way; streaming is safer for a hardened server deployment.
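
One possible shape for the cap described in Issue 3, as a sketch rather than the fix the PR adopts: check `content-length` as a cheap early exit, then stream the body and stop once a byte budget is exceeded. `readTextCapped` and its `maxBytes` parameter are hypothetical; the 2 MB default follows the threshold suggested above. It relies only on the standard `Response`, `ReadableStream`, and `TextDecoder` globals available in Node 18+.

```typescript
// Hypothetical sketch: read a response body as text, giving up past maxBytes.
async function readTextCapped(res: Response, maxBytes = 2 * 1024 * 1024): Promise<string | null> {
  // Cheap early exit when the server declares an oversized body.
  const declared = Number(res.headers.get("content-length"));
  if (Number.isFinite(declared) && declared > maxBytes) return null;

  if (!res.body) return ""; // no body stream, e.g. a bodyless response

  // The header can be missing or wrong, so count the bytes actually received.
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let received = 0;
  let text = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    received += value.byteLength;
    if (received > maxBytes) {
      await reader.cancel(); // stop downloading the rest of the body
      return null;
    }
    text += decoder.decode(value, { stream: true });
  }
  return text + decoder.decode(); // flush any buffered partial character
}
```

A caller would use it in place of `res.text()`, for example `const html = await readTextCapped(await fetch(url));`, and treat `null` as "too large, skip this resource".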
### Issue 4 of 5
src/core/remote-crawl.ts:1333-1378
**SSRF exposure when used as a library**
`discover` (and by extension `crawlPages`) will faithfully fetch whatever URL it is given, including private IP ranges (`10.x.x.x`, `192.168.x.x`, `172.16-31.x.x`), loopback (`127.0.0.1`, `::1`), and cloud metadata endpoints (`169.254.169.254`). The CLI's `normalizeTargetUrl` rejects bare `localhost` but does nothing for raw IP addresses or link-local hosts.
Because the PR description explicitly calls out that check.aeojs.org will import these functions directly, any server-side integration that passes a user-supplied URL to `discover` without an additional validation layer would be vulnerable to SSRF.
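
A validation layer of the kind Issue 4 asks for might start from a literal-address check like the sketch below. `isPrivateIPv4` and `isBlockedTarget` are hypothetical helpers, not part of the library. The sketch only inspects the hostname as written: it does not resolve DNS, follow redirects, or cover private IPv6 ranges beyond `::1`, so a hardened server would also need to validate the resolved address of every request.

```typescript
// Hypothetical sketch: reject URLs whose literal host is loopback, private, or link-local.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local, includes 169.254.169.254
  );
}

function isBlockedTarget(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is rejected outright
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  // URL.hostname keeps the brackets around IPv6 literals.
  if (host === "[::1]") return true;
  return isPrivateIPv4(host);
}
```

Calling `isBlockedTarget(url)` before handing a user-supplied URL to `discover` would catch the literal cases listed in the comment; it is a first filter, not a complete SSRF defence.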
### Issue 5 of 5
src/cli.ts:213-245
**`check --json` and `report --json` produce identical output**
Both `cmdCheckRemote` and `cmdReportRemote` delegate to `remoteReportJson(report)` in the JSON branch, so `npx aeo.js check mysite.com --json` and `npx aeo.js report mysite.com --json` emit byte-for-byte the same JSON. The extra detail that `cmdReportRemote` adds in text mode (platform hints loop, per-page citability list) is not reflected in the structured output. A note in the help text or docs would prevent confusion for users scripting around the output.
Reviews (1): Last reviewed commit: "docs: document URL mode for check and re..."
src/core/remote-crawl.ts, lines 3-40:

    export interface BotAccessEntry {
      bot: string;
      company: string;
      purpose: string;
      allowed: boolean;
    }

    export interface DiscoveryResult {
      robotsTxt: { exists: boolean; content: string | null; hasAiDisallow: boolean };
      llmsTxt: { exists: boolean; contentLength: number; content?: string | null };
      llmsFullTxt: { exists: boolean; contentLength: number };
      sitemap: { exists: boolean; urls: string[] };
      aiIndex: { exists: boolean; content?: string | null };
      homepage: { html: string; url: string } | null;
      botAccess: BotAccessEntry[];
    }

    export interface CrawledPage {
      url: string;
      pathname: string;
      html: string;
      title?: string;
      description?: string;
      content?: string;
      jsonLd?: object[];
      ogTags?: Record<string, string>;
    }

    export interface RemoteCrawlOptions {
      /** Per-request timeout in milliseconds. Default: 12000. */
      timeoutMs?: number;
      /** Maximum inner pages to crawl beyond the homepage. Default: 10. */
      maxPages?: number;
      /** Concurrent page fetches. Default: 5. */
      concurrency?: number;
      /** User-Agent header for all requests. */
      userAgent?: string;
    }

`interface` used for simple data structures across both new files, violating custom rules that mandate `type` for DTOs and data structures that don't use inheritance or extension. This pattern repeats for `BotAccessEntry`, `DiscoveryResult`, `CrawledPage`, and `RemoteCrawlOptions` in this file, and `RemoteScanReport` in `remote-audit.ts`.
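Rule Used: Use `type` by default in TypeScript unless you spe... ([source](https://app.greptile.com/multivm-labs/-/custom-context?memory=c862f053-5655-4b41-be69-c840e3c9f280))
Learned From: [cytonic-network/ai-frontend#48](https://github.com/cytonic-network/ai-frontend/pull/48)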
src/core/remote-audit.ts, lines 9-22:

    export interface RemoteScanReport {
      url: string;
      scannedAt: string;
      discovery: DiscoveryResult;
      pages: CrawledPage[];
      audit: AuditResult;
      citability: {
        averageScore: number;
        pages: PageCitabilityResult[];
      };
      platformHints: PlatformHint[];
      botAccess: BotAccessEntry[];
      usesAeoJs: boolean;
    }

`RemoteScanReport` is a pure data container with no inheritance, so it should be `type` per the same convention as the other DTOs in this module.
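Rule Used: Use `type` instead of `interface` for DTOs and sim... ([source](https://app.greptile.com/multivm-labs/-/custom-context?memory=2b2a7a55-162e-44b9-8c4c-3f52514f7037))
Learned From: [cytonic-network/ai-frontend#48](https://github.com/cytonic-network/ai-frontend/pull/48)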
src/cli.ts, lines 213-245:

    async function scanRemote(targetUrl: string): Promise<RemoteScanReport> {
      if (typeof fetch !== 'function') {
        console.error('[aeo.js] URL checks require Node 18+ (global fetch).');
        process.exit(1);
      }

      console.error(`[aeo.js] Scanning ${targetUrl} ...`);
      const discovery = await discover(targetUrl);

      if (!discovery.homepage) {
        console.error(`[aeo.js] Could not reach ${targetUrl} — check the URL and try again.`);
        process.exit(1);
      }

      const pages = await crawlPages(discovery, targetUrl);
      console.error(`[aeo.js] Crawled ${pages.length} page(s).`);
      return buildRemoteReport(targetUrl, discovery, pages);
    }

    /** Report JSON for terminal output — drops raw page HTML to keep it readable. */
    function remoteReportJson(report: RemoteScanReport): string {
      return JSON.stringify(
        {
          ...report,
          discovery: {
            ...report.discovery,
            homepage: report.discovery.homepage ? { url: report.discovery.homepage.url } : null,
          },
          pages: report.pages.map(({ html: _html, ...page }) => page),
        },
        null,
        2

**`check --json` and `report --json` produce identical output**
Both `cmdCheckRemote` and `cmdReportRemote` delegate to `remoteReportJson(report)` in the JSON branch, so `npx aeo.js check mysite.com --json` and `npx aeo.js report mysite.com --json` emit byte-for-byte the same JSON. The extra detail that `cmdReportRemote` adds in text mode (platform hints loop, per-page citability list) is not reflected in the structured output. A note in the help text or docs would prevent confusion for users scripting around the output.
What
The audit no longer requires a local build: `check` and `report` now accept a URL or bare domain.

How
- `src/core/remote-crawl.ts`: AEO surface discovery (robots.txt, llms.txt, llms-full.txt, sitemap.xml, ai-index.json, homepage), a 23-bot AI crawler access matrix parsed from robots.txt, and a bounded crawler (sitemap-first, homepage-link fallback; configurable timeout/concurrency/max pages, defaults 12s/5/10).
- `src/core/remote-audit.ts`: the same 5-category / 100-point GEO audit driven by live data, per-page citability, platform hints (adds Claude + Gemini based on actual bot access), aeo.js usage detection, and a terminal formatter.
- `src/cli.ts`: bare-domain normalization (`mysite.com` → `https://mysite.com/`), `--json` output with raw page HTML stripped, exit 1 on invalid/unreachable targets, Node 18+ guard for global fetch.
- `src/index.ts`: exports the remote crawl/audit API, so check.aeojs.org can replace its `lib/remote-audit.ts` / `lib/crawler.ts` with these imports and the web checker, CLI, and future browser extension report identical scores.

This is the scoring logic that already runs in production on check.aeojs.org, upstreamed (minus checker-specific language/country detection, which stays in the app).
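
The bare-domain normalization described above might look roughly like the following. This is a sketch under assumptions, not the CLI's actual `normalizeTargetUrl`: `normalizeTarget` is a hypothetical name, and the "hostname must contain a dot" rule is one simple way to reject `localhost` and bare words, which the PR's tests say are rejected.

```typescript
// Hypothetical sketch: turn "mysite.com" into "https://mysite.com/" and reject bad targets.
function normalizeTarget(input: string): string | null {
  const trimmed = input.trim();
  if (trimmed === "") return null;

  // Inputs that already name a scheme are kept; bare domains get https:// prepended.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : "https://" + trimmed;

  let url: URL;
  try {
    url = new URL(candidate);
  } catch {
    return null; // not parseable as a URL
  }

  // Only web URLs are valid scan targets.
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;
  // Require a dotted hostname so "localhost" and single words are rejected.
  if (url.hostname.indexOf(".") === -1) return null;

  return url.href;
}
```

A caller would exit with status 1 when this returns `null`, matching the "invalid targets exit 1" behaviour described in the PR.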
Verification
- `tsc --noEmit` clean
- `check aeojs.org` → 100/100, `check example.com` → 32/100, invalid input and unreachable hosts exit 1

🤖 Generated with Claude Code