feat: audit any live site with `npx aeo.js check <url>` by rubenmarcus · Pull Request #63 · multivmlabs/aeo.js

rubenmarcus · 2026-06-10T11:03:09Z

What

The audit no longer requires a local build — check and report now accept a URL or bare domain:

npx aeo.js check mysite.com
npx aeo.js report https://mysite.com --json

How

src/core/remote-crawl.ts — AEO surface discovery (robots.txt, llms.txt, llms-full.txt, sitemap.xml, ai-index.json, homepage), a 23-bot AI crawler access matrix parsed from robots.txt, and a bounded crawler (sitemap-first, homepage-link fallback; configurable timeout/concurrency/max pages, defaults 12s/5/10).
src/core/remote-audit.ts — the same 5-category / 100-point GEO audit driven by live data, per-page citability, platform hints (adds Claude + Gemini based on actual bot access), aeo.js usage detection, and a terminal formatter.
CLI — positional URL support, bare-domain normalization (mysite.com → https://mysite.com/), --json output with raw page HTML stripped, exit 1 on invalid/unreachable targets, Node 18+ guard for global fetch.
All of it is exported from the package root, so check.aeojs.org can replace its reimplemented lib/remote-audit.ts/lib/crawler.ts with these imports and the web checker, CLI, and future browser extension report identical scores.

This is the scoring logic that already runs in production on check.aeojs.org, upstreamed (minus checker-specific language/country detection, which stays in the app).
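
The bare-domain normalization described under How can be sketched as follows. This is a minimal illustration, not the code from src/cli.ts: the function name `normalizeTargetUrlSketch` and the "hostname must contain a dot" heuristic are assumptions chosen to reproduce the behavior the PR describes (mysite.com becomes https://mysite.com/, while localhost, FTP URLs, and bare non-domain strings are rejected).

```typescript
// Hypothetical sketch; the real normalizeTargetUrl in src/cli.ts may differ.
function normalizeTargetUrlSketch(input: string): string | null {
  const trimmed = input.trim();
  if (trimmed === "") return null;

  // Prepend https:// only when no scheme is present (the bare-domain case).
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const candidate = hasScheme ? trimmed : `https://${trimmed}`;

  let url: URL;
  try {
    url = new URL(candidate);
  } catch {
    return null; // not parseable as a URL at all
  }

  // Only web URLs can be audited; ftp:// and similar schemes are rejected.
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;

  // Assumed heuristic: reject localhost and hostnames without a dot.
  if (url.hostname === "localhost" || !url.hostname.includes(".")) return null;

  return url.toString();
}
```

Like the behavior the review describes, this sketch lets raw IP addresses through, so it is not a substitute for server-side target validation.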

Verification

tsc --noEmit clean
210 tests pass (30 new: crawler with mocked fetch, audit categories, report builder, formatter, arg parsing, URL normalization)
Smoke-tested against live sites: check aeojs.org → 100/100, check example.com → 32/100, invalid input and unreachable hosts exit 1

🤖 Generated with Claude Code

Ports the check.aeojs.org crawler into the library: AEO surface discovery (robots.txt, llms.txt, llms-full.txt, sitemap.xml, ai-index.json, homepage), a 23-bot AI crawler access matrix parsed from robots.txt, and a bounded page crawler (sitemap-first with homepage-link fallback, configurable timeout/concurrency/max pages).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

remoteAuditSite() runs the same 5-category / 100-point GEO audit as auditSite() but against crawled live-site data (discovery surface + pages). buildRemoteReport() adds per-page citability scores, platform hints (including Claude and Gemini driven by live bot access), and aeo.js usage detection. formatRemoteReport() renders the terminal view. This is the scoring engine check.aeojs.org reimplements today; it can now import it from the library so web, CLI, and extension stay in sync.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

`npx aeo.js check mysite.com` (or a full https:// URL) now scans the live site (discovery surface, bot access matrix, bounded crawl) and prints the same 5-category GEO readiness score the local audit uses. `report <url>` adds platform hints and per-page citability; --json emits the full scan report with raw HTML stripped. Bare domains get https:// prepended; invalid targets and unreachable hosts exit 1. Also exports the remote crawl/audit API from the package root so check.aeojs.org and the browser extension can share the same scoring.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

vercel · 2026-06-10T11:03:15Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
aeo-js	Ready	Preview, Comment	Jun 10, 2026 11:03am

github-actions · 2026-06-10T11:04:03Z

Docs Preview

Preview URL: https://feat-remote-url-check.aeojs.pages.dev

This preview was deployed from the latest commit on this PR.

greptile-apps · 2026-06-10T11:10:27Z

Greptile Summary

This PR adds npx aeo.js check <url> / report <url> support, letting users audit any live site without a local build. It introduces two new modules (remote-crawl.ts, remote-audit.ts) that fetch a site's AEO surface files, parse a 23-bot robots.txt access matrix, crawl up to 10 pages, and run the same 5-category / 100-point GEO audit already used by check.aeojs.org.

remote-crawl.ts — parallel discovery (robots.txt, llms.txt, sitemap, ai-index.json, homepage) + bounded concurrent page crawler + robots.txt parser into a per-bot access matrix.
remote-audit.ts — 5-category remote audit, per-page citability scoring, Claude/Gemini platform hints driven by live bot-access data, and a terminal formatter.
CLI — positional URL support, bare-domain normalization (mysite.com → https://mysite.com/), Node 18 guard, exit 1 on invalid/unreachable targets; all new symbols exported from src/index.ts.
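
To make the "robots.txt parser into a per-bot access matrix" step concrete, here is a simplified sketch. It is not the PR's `parseRobotsTxtBotAccess`: the name `parseBotAccessSketch`, the `BotAccess` shape, and the decision to look only at rules for the site root (`/`) are assumptions made for illustration. It covers the cases the tests are said to exercise: wildcard groups, bot-specific groups, grouped agents, and an Allow overriding a Disallow.

```typescript
type BotAccess = { bot: string; allowed: boolean };
type Rules = { allow: string[]; disallow: string[] };

// Hypothetical sketch: decides access from root-level rules only.
function parseBotAccessSketch(robotsTxt: string, bots: string[]): BotAccess[] {
  const groups = new Map<string, Rules>(); // lower-cased agent -> rules
  let currentAgents: string[] = [];
  let lastWasAgent = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      if (!lastWasAgent) currentAgents = [];
      const agent = value.toLowerCase();
      currentAgents.push(agent);
      if (!groups.has(agent)) groups.set(agent, { allow: [], disallow: [] });
      lastWasAgent = true;
    } else if (field === "allow" || field === "disallow") {
      for (const agent of currentAgents) {
        const rules = groups.get(agent);
        if (!rules) continue;
        if (field === "allow") rules.allow.push(value);
        else rules.disallow.push(value);
      }
      lastWasAgent = false;
    }
    // Other fields (Sitemap, Crawl-delay, ...) are ignored in this sketch.
  }

  return bots.map((bot) => {
    // A bot-specific group wins over the wildcard group.
    const rules = groups.get(bot.toLowerCase()) ?? groups.get("*");
    if (!rules) return { bot, allowed: true };
    const blocksRoot = rules.disallow.includes("/");
    const allowsRoot = rules.allow.includes("/");
    return { bot, allowed: !blocksRoot || allowsRoot };
  });
}
```

A production parser would also need longest-path matching and wildcard patterns; the sketch only answers "may this bot fetch the site at all?".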

Confidence Score: 3/5

Safe to merge for CLI use, but the unbounded response-body read in fetchText needs addressing before check.aeojs.org imports these functions server-side.

The crawler's fetchText buffers the full response body with no size limit. Because the PR explicitly designates this code for import by the check.aeojs.org web service where arbitrary user-submitted URLs are processed, a target serving a large response body could exhaust server memory. The SSRF gap (private IPs not blocked) compounds this for the server context. Both issues are straightforward to fix but should be resolved before the web service migration.

src/core/remote-crawl.ts — specifically fetchText (response body size) and the discover/crawlPages public API (SSRF validation).

Security Review

SSRF (library boundary) — discover and crawlPages accept any URL including private IP ranges, loopback, and cloud metadata endpoints (e.g. 169.254.169.254). The CLI's normalizeTargetUrl only blocks bare localhost; raw IPs pass through. When check.aeojs.org imports these functions directly with user-supplied URLs (per the PR description), SSRF is possible without an additional validation layer in the consuming application.
No secrets or credential leakage introduced.
No injection vulnerabilities in the audit/scoring logic.

Important Files Changed

Filename	Overview
src/core/remote-crawl.ts	New crawler: fetches AEO surface files, parses robots.txt into a 23-bot access matrix, and crawls up to 10 inner pages. Unbounded `res.text()` call in `fetchText` poses a DoS risk for server-side use; no SSRF guard on the public API; all exported types use `interface` instead of `type`.
src/core/remote-audit.ts	New 5-category / 100-point remote audit engine mirroring the local audit, plus platform hints for Claude and Gemini driven by live bot-access data. Logic looks correct; `RemoteScanReport` uses `interface` instead of `type`.
src/cli.ts	Adds positional URL support to `check` and `report`, bare-domain normalization, Node 18 guard, and remote scan dispatch. `cmdCheckRemote` and `cmdReportRemote` produce identical JSON output despite having different text-mode behavior.
src/index.ts	Re-exports all new crawler and audit symbols, making them available as a package-level API for consumers like check.aeojs.org.
src/core/remote-crawl.test.ts	Good coverage of robots.txt parsing (wildcard, specific bot, grouped agents, allow-override), sitemap URL extraction, link extraction, and discover/crawlPages with mocked fetch including an unreachable-site path.
src/core/remote-audit.test.ts	Covers all five audit categories, aeo.js detection, full report construction (including Claude/Gemini platform hints), and the terminal formatter. Tests are well-structured and representative.
src/cli.test.ts	Adds tests for positional argument capture, mixed flags/positionals, and `normalizeTargetUrl` including rejection of localhost, FTP, and bare non-domain strings.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as cli.ts (main)
    participant RC as remote-crawl.ts
    participant RA as remote-audit.ts
    participant Site as Target Site

    User->>CLI: npx aeo.js check mysite.com
    CLI->>CLI: "normalizeTargetUrl('mysite.com') -> 'https://mysite.com/'"
    CLI->>RC: discover(targetUrl)
    par AEO surface discovery
        RC->>Site: GET /robots.txt
        RC->>Site: GET /llms.txt
        RC->>Site: GET /llms-full.txt
        RC->>Site: GET /sitemap.xml
        RC->>Site: GET /ai-index.json
        RC->>Site: GET / (homepage)
    end
    RC->>RC: "parseRobotsTxtBotAccess(robotsTxt) -> 23-bot matrix"
    RC-->>CLI: DiscoveryResult
    CLI->>RC: crawlPages(discovery, targetUrl)
    loop "sitemap URLs (up to 10), batched by concurrency=5"
        RC->>Site: GET /page-n
    end
    RC-->>CLI: CrawledPage[]
    CLI->>RA: buildRemoteReport(url, discovery, pages)
    RA->>RA: "remoteAuditSite() -> 5 categories / 100 pts"
    RA->>RA: scorePageCitability() per page
    RA->>RA: generatePlatformHints() + Claude/Gemini hints
    RA-->>CLI: RemoteScanReport
    CLI->>User: formatRemoteReport(report) or JSON
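
The crawl loop in the diagram (at most 10 URLs, fetched in batches of 5) can be sketched as below. This is an illustration under assumptions, not the PR's `crawlPages`: the name `crawlInBatchesSketch`, the injected `fetchPage` callback, and the `Page` type are hypothetical, and the real crawler works from a `DiscoveryResult` rather than a plain URL list.

```typescript
type Page = { url: string; html: string };
type FetchPage = (url: string) => Promise<string | null>;

// Hypothetical sketch of a bounded, batched crawl.
async function crawlInBatchesSketch(
  urls: string[],
  fetchPage: FetchPage,
  maxPages = 10,
  concurrency = 5,
): Promise<Page[]> {
  // De-duplicate, then cap the total number of pages up front.
  const queue = urls.filter((u, i) => urls.indexOf(u) === i).slice(0, maxPages);
  const pages: Page[] = [];

  for (let i = 0; i < queue.length; i += concurrency) {
    const batch = queue.slice(i, i + concurrency);
    // Each batch runs in parallel; a failed page is skipped, not fatal.
    const results = await Promise.all(
      batch.map(async (url) => {
        try {
          const html = await fetchPage(url);
          return html === null ? null : { url, html };
        } catch {
          return null;
        }
      }),
    );
    for (const page of results) {
      if (page) pages.push(page);
    }
  }
  return pages;
}
```

Batching keeps at most `concurrency` requests in flight against the target site, at the cost of waiting for the slowest page in each batch.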

Comments Outside Diff (2)

src/core/remote-crawl.ts, lines 1193-1197

Unbounded response body buffering

res.text() reads the entire response body into a single string with no size cap. The PR explicitly targets check.aeojs.org as a future consumer of these exports, meaning discover will run server-side against arbitrary user-submitted URLs. A malicious or pathological target can serve a multi-GB HTML body; res.text() will buffer all of it before returning, risking OOM and crashing the server.

Consider reading the content-length header first and bailing early when it exceeds a reasonable threshold (e.g. 2 MB). Note that a missing or lying content-length won't be caught this way; streaming is safer for a hardened server deployment.
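
One way to apply both suggestions (an early `content-length` check plus a streaming cap) is sketched here. It is an assumption-laden illustration rather than a patch for `fetchText`: the helper name `readTextCapped`, its `null`-on-overflow return convention, and the 2 MB default are hypothetical. It relies only on the standard `Response`, `ReadableStream`, and `TextDecoder` APIs available alongside global fetch in Node 18+.

```typescript
// Hypothetical helper: returns the body text, or null when it exceeds maxBytes.
async function readTextCapped(
  res: Response,
  maxBytes = 2 * 1024 * 1024,
): Promise<string | null> {
  // Cheap early exit when the server declares an oversized body.
  const declared = Number(res.headers.get("content-length"));
  if (Number.isFinite(declared) && declared > maxBytes) return null;

  if (!res.body) return ""; // no body at all

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let received = 0;
  let text = "";

  for (;;) {
    const chunk = await reader.read();
    if (chunk.done) break;
    received += chunk.value.byteLength;
    if (received > maxBytes) {
      // Covers a missing or understated content-length: stop downloading.
      await reader.cancel();
      return null;
    }
    text += decoder.decode(chunk.value, { stream: true });
  }
  return text + decoder.decode(); // flush any buffered partial character
}
```

Because the cap is enforced while streaming, memory use stays bounded by roughly `maxBytes` plus one chunk, regardless of what the header claims.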


src/core/remote-crawl.ts, lines 1333-1378

SSRF exposure when used as a library

discover (and by extension crawlPages) will faithfully fetch whatever URL it is given, including private IP ranges (10.x.x.x, 192.168.x.x, 172.16-31.x.x), loopback (127.0.0.1, ::1), and cloud metadata endpoints (169.254.169.254). The CLI's normalizeTargetUrl rejects bare localhost but does nothing for raw IP addresses or link-local hosts.

Because the PR description explicitly calls out that check.aeojs.org will import these functions directly, any server-side integration that passes a user-supplied URL to discover without an additional validation layer would be vulnerable to SSRF.
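
A consuming server could add a validation layer along these lines before calling `discover`. This is a sketch under assumptions, not part of the PR: `isBlockedHostSketch` and `isSafeTargetSketch` are hypothetical names, and the check only inspects the literal hostname. It does not resolve DNS, so a public name that points at a private address (or an IPv4-mapped IPv6 literal) would still get through; a hardened deployment would also need to validate the resolved address and any redirects.

```typescript
import { isIP } from "node:net";

// Hypothetical guard: true when the host is loopback, private, or link-local.
function isBlockedHostSketch(hostname: string): boolean {
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = hostname.replace(/^\[|\]$/g, "").toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;

  const family = isIP(host);
  if (family === 4) {
    const [a, b] = host.split(".").map(Number);
    return (
      a === 0 || // "this network"
      a === 10 || // 10.0.0.0/8
      a === 127 || // loopback
      (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
      (a === 192 && b === 168) || // 192.168.0.0/16
      (a === 169 && b === 254) // link-local, incl. cloud metadata
    );
  }
  if (family === 6) {
    return (
      host === "::1" || // loopback
      host === "::" || // unspecified
      /^f[cd]/.test(host) || // fc00::/7 unique local
      /^fe[89ab]/.test(host) // fe80::/10 link-local
    );
  }
  return false; // a DNS name: needs a resolution-time check as well
}

function isSafeTargetSketch(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  return !isBlockedHostSketch(url.hostname);
}
```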



Reviews (1): Last reviewed commit: "docs: document URL mode for check and re..." | Re-trigger Greptile

greptile-apps · 2026-06-10T11:10:31Z

+export interface BotAccessEntry {
+  bot: string;
+  company: string;
+  purpose: string;
+  allowed: boolean;
+}
+
+export interface DiscoveryResult {
+  robotsTxt: { exists: boolean; content: string | null; hasAiDisallow: boolean };
+  llmsTxt: { exists: boolean; contentLength: number; content?: string | null };
+  llmsFullTxt: { exists: boolean; contentLength: number };
+  sitemap: { exists: boolean; urls: string[] };
+  aiIndex: { exists: boolean; content?: string | null };
+  homepage: { html: string; url: string } | null;
+  botAccess: BotAccessEntry[];
+}
+
+export interface CrawledPage {
+  url: string;
+  pathname: string;
+  html: string;
+  title?: string;
+  description?: string;
+  content?: string;
+  jsonLd?: object[];
+  ogTags?: Record<string, string>;
+}
+
+export interface RemoteCrawlOptions {
+  /** Per-request timeout in milliseconds. Default: 12000. */
+  timeoutMs?: number;
+  /** Maximum inner pages to crawl beyond the homepage. Default: 10. */
+  maxPages?: number;
+  /** Concurrent page fetches. Default: 5. */
+  concurrency?: number;
+  /** User-Agent header for all requests. */
+  userAgent?: string;
+}


interface used for simple data structures across both new files, violating custom rules that mandate type for DTOs and data structures that don't use inheritance or extension. This pattern repeats for BotAccessEntry, DiscoveryResult, CrawledPage, and RemoteCrawlOptions in this file, and RemoteScanReport in remote-audit.ts.

Suggested change

-export interface BotAccessEntry {
-  bot: string;
-  company: string;
-  purpose: string;
-  allowed: boolean;
-}
-
-export interface DiscoveryResult {
-  robotsTxt: { exists: boolean; content: string | null; hasAiDisallow: boolean };
-  llmsTxt: { exists: boolean; contentLength: number; content?: string | null };
-  llmsFullTxt: { exists: boolean; contentLength: number };
-  sitemap: { exists: boolean; urls: string[] };
-  aiIndex: { exists: boolean; content?: string | null };
-  homepage: { html: string; url: string } | null;
-  botAccess: BotAccessEntry[];
-}
-
-export interface CrawledPage {
-  url: string;
-  pathname: string;
-  html: string;
-  title?: string;
-  description?: string;
-  content?: string;
-  jsonLd?: object[];
-  ogTags?: Record<string, string>;
-}
-
-export interface RemoteCrawlOptions {
-  /** Per-request timeout in milliseconds. Default: 12000. */
-  timeoutMs?: number;
-  /** Maximum inner pages to crawl beyond the homepage. Default: 10. */
-  maxPages?: number;
-  /** Concurrent page fetches. Default: 5. */
-  concurrency?: number;
-  /** User-Agent header for all requests. */
-  userAgent?: string;
-}
+export type BotAccessEntry = {
+  bot: string;
+  company: string;
+  purpose: string;
+  allowed: boolean;
+};
+
+export type DiscoveryResult = {
+  robotsTxt: { exists: boolean; content: string | null; hasAiDisallow: boolean };
+  llmsTxt: { exists: boolean; contentLength: number; content?: string | null };
+  llmsFullTxt: { exists: boolean; contentLength: number };
+  sitemap: { exists: boolean; urls: string[] };
+  aiIndex: { exists: boolean; content?: string | null };
+  homepage: { html: string; url: string } | null;
+  botAccess: BotAccessEntry[];
+};
+
+export type CrawledPage = {
+  url: string;
+  pathname: string;
+  html: string;
+  title?: string;
+  description?: string;
+  content?: string;
+  jsonLd?: object[];
+  ogTags?: Record<string, string>;
+};
+
+export type RemoteCrawlOptions = {
+  /** Per-request timeout in milliseconds. Default: 12000. */
+  timeoutMs?: number;
+  /** Maximum inner pages to crawl beyond the homepage. Default: 10. */
+  maxPages?: number;
+  /** Concurrent page fetches. Default: 5. */
+  concurrency?: number;
+  /** User-Agent header for all requests. */
+  userAgent?: string;
+};

Rule Used: Use type by default in TypeScript unless you spe... (source)

Learned From
cytonic-network/ai-frontend#48


Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-06-10T11:10:32Z

+export interface RemoteScanReport {
+  url: string;
+  scannedAt: string;
+  discovery: DiscoveryResult;
+  pages: CrawledPage[];
+  audit: AuditResult;
+  citability: {
+    averageScore: number;
+    pages: PageCitabilityResult[];
+  };
+  platformHints: PlatformHint[];
+  botAccess: BotAccessEntry[];
+  usesAeoJs: boolean;
+}


RemoteScanReport is a pure data container with no inheritance — should be type per the same convention as the other DTOs in this module.

Suggested change

-export interface RemoteScanReport {
-  url: string;
-  scannedAt: string;
-  discovery: DiscoveryResult;
-  pages: CrawledPage[];
-  audit: AuditResult;
-  citability: {
-    averageScore: number;
-    pages: PageCitabilityResult[];
-  };
-  platformHints: PlatformHint[];
-  botAccess: BotAccessEntry[];
-  usesAeoJs: boolean;
-}
+export type RemoteScanReport = {
+  url: string;
+  scannedAt: string;
+  discovery: DiscoveryResult;
+  pages: CrawledPage[];
+  audit: AuditResult;
+  citability: {
+    averageScore: number;
+    pages: PageCitabilityResult[];
+  };
+  platformHints: PlatformHint[];
+  botAccess: BotAccessEntry[];
+  usesAeoJs: boolean;
+};

Rule Used: Use type instead of interface for DTOs and sim... (source)

Learned From
cytonic-network/ai-frontend#48


greptile-apps · 2026-06-10T11:10:35Z


+async function scanRemote(targetUrl: string): Promise<RemoteScanReport> {
+  if (typeof fetch !== 'function') {
+    console.error('[aeo.js] URL checks require Node 18+ (global fetch).');
+    process.exit(1);
+  }
+
+  console.error(`[aeo.js] Scanning ${targetUrl} ...`);
+  const discovery = await discover(targetUrl);
+
+  if (!discovery.homepage) {
+    console.error(`[aeo.js] Could not reach ${targetUrl} — check the URL and try again.`);
+    process.exit(1);
+  }
+
+  const pages = await crawlPages(discovery, targetUrl);
+  console.error(`[aeo.js] Crawled ${pages.length} page(s).`);
+  return buildRemoteReport(targetUrl, discovery, pages);
+}
+
+/** Report JSON for terminal output — drops raw page HTML to keep it readable. */
+function remoteReportJson(report: RemoteScanReport): string {
+  return JSON.stringify(
+    {
+      ...report,
+      discovery: {
+        ...report.discovery,
+        homepage: report.discovery.homepage ? { url: report.discovery.homepage.url } : null,
+      },
+      pages: report.pages.map(({ html: _html, ...page }) => page),
+    },
+    null,
+    2


check --json and report --json produce identical output

Both cmdCheckRemote and cmdReportRemote delegate to remoteReportJson(report) in the JSON branch, so npx aeo.js check mysite.com --json and npx aeo.js report mysite.com --json emit byte-for-byte the same JSON. The extra detail that cmdReportRemote adds in text mode (platform hints loop, per-page citability list) is not reflected in the structured output. A note in the help text or docs would prevent confusion for users scripting around the output.


rubenmarcus and others added 4 commits June 10, 2026 12:57

docs: document URL mode for check and report

bdd8151

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

vercel Bot deployed to Preview June 10, 2026 11:03

greptile-apps Bot reviewed Jun 10, 2026

rubenmarcus mentioned this pull request Jun 10, 2026

feat: MCP server — let agents audit sites and generate AEO files #66

Open


rubenmarcus wants to merge 4 commits into main from feat/remote-url-check
