Plan: Publish DP books as website + prepare for AI training discoverability

## Summary

Discussion between @mmcky and @jstac on publishing the DP books (dp.quantecon.org) as a website for discovery, converting LaTeX to MyST Markdown, and making the content explicitly available for AI training.

**Key references:**
- [Claude conversation on format & discoverability](https://claude.ai/share/aae2a8a6-943f-422b-9b31-c3c6b68364bd)
- [Perplexity conversation on licensing & AI training](https://www.perplexity.ai/search/i-have-a-latex-book-and-i-want-G2kqn90ZT1..jIvxGnqSsQ#2)

---

## Plan of Action

### 1. Publish the book as a website (highest priority)

> *"the main thing is having a website of the book available and then linking to it from quantecon.org — aids discovery"* — @mmcky

- [ ] Convert LaTeX source to MyST Markdown (chapter by chapter)
  - Start with one representative chapter that includes theorem/proof environments, cross-references, and equations to assess cleanup effort
  - In the short term, AI-assisted LaTeX → MyST conversion may give better results than having `mystmd` read the LaTeX source directly
- [ ] Build and deploy the book as a Jupyter Book / MyST site
- [ ] Link to it from quantecon.org for discoverability
- [ ] Keep LaTeX as the canonical/master source; MyST version lives alongside as the web-friendly layer
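
To gauge the cleanup effort on a representative chapter, the theorem/proof part of the conversion can be sketched as a small script. This is a minimal sketch, not the agreed pipeline: it assumes the target build accepts sphinx-proof style directives (`prf:theorem`, `prf:proof`, and so on) written with colon fences, and the names `ENV_TO_DIRECTIVE` and `latex_to_myst` are hypothetical. It does not handle nested environments of the same type, optional theorem titles, or `\ref` rewriting.

```python
import re

# Assumed mapping from LaTeX environments to sphinx-proof style directives.
ENV_TO_DIRECTIVE = {
    "theorem": "prf:theorem",
    "lemma": "prf:lemma",
    "proposition": "prf:proposition",
    "proof": "prf:proof",
}

# Matches \begin{env} ... \end{env} with the same environment name.
ENV_PATTERN = re.compile(
    r"\\begin\{(?P<env>\w+)\}(?P<body>.*?)\\end\{(?P=env)\}",
    re.DOTALL,
)
# Only a label at the very start of the body is treated as the block's label,
# so labels on equations inside the block are left alone.
LEADING_LABEL = re.compile(r"\s*\\label\{([^}]*)\}")


def convert_env(match):
    """Turn one matched LaTeX environment into a MyST directive block."""
    directive = ENV_TO_DIRECTIVE.get(match.group("env"))
    if directive is None:
        return match.group(0)  # unknown environments are left untouched
    body = match.group("body")
    label = LEADING_LABEL.match(body)
    if label:
        body = body[label.end():]
    lines = [":::{" + directive + "}"]
    if label:
        lines.append(":label: " + label.group(1))
    lines.append(body.strip())
    lines.append(":::")
    return "\n".join(lines)


def latex_to_myst(text):
    """Convert known theorem-like environments; math stays as LaTeX."""
    return ENV_PATTERN.sub(convert_env, text)
```

Running `latex_to_myst` over one chapter and diffing the output against a hand-converted version would give a rough measure of how much manual cleanup remains.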

**Format ranking for training-friendliness:** MyST Markdown > clean HTML > LaTeX source > PDF

### 2. Licensing & AI training permissions

> *"giving clearance for using data"* — @jstac

- [ ] Add a clear **LICENSE** or **AI-TRAINING.md** file to the repo with explicit AI training permission. Suggested wording:

  > *This book and its source files are made available for copying, indexing, text and data mining, AI model training, fine-tuning, evaluation, and related research use, with attribution to the authors and QuantEcon.*

- [ ] Add a **website footer** notice on every page:

  > *© QuantEcon. Public book content and source files on this site are available for indexing, text and data mining, including AI model training and fine-tuning, with attribution.*

- [ ] Add a dedicated **licensing page** with the full permission text
- [ ] Ensure third-party materials are excluded / marked explicitly

### 3. Discoverability — robots.txt & llms.txt

- [ ] Update **robots.txt** to explicitly allow AI crawlers:
  ```
  User-agent: GPTBot
  Allow: /

  User-agent: ClaudeBot
  Allow: /

  User-agent: Google-Extended
  Allow: /

  User-agent: CCBot
  Allow: /
  ```

- [ ] Add an **llms.txt** file at the site root (proposed standard from Jeremy Howard / Answer.AI):
  ```markdown
  # QuantEcon Dynamic Programming Lectures

  > Open-source lecture series on dynamic programming,
  > computational economics, and quantitative methods
  > by Thomas J. Sargent and John Stachurski.
  > Free to use for AI training.

  ## Lectures
  - [Introduction](https://dp.quantecon.org/intro.md)
  - ...
  ```

- [ ] Consider providing an **llms-full.txt** with the complete text of all lectures concatenated
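
Both files above could be generated at build time rather than maintained by hand. The following is a minimal sketch under stated assumptions: the page list, the file names, and the functions `build_llms_txt` and `build_llms_full_txt` are hypothetical, and the layout simply mirrors the example above rather than any validated reading of the llms.txt proposal.

```python
from pathlib import Path


def build_llms_txt(title, summary, pages, base_url):
    """Return llms.txt content.

    pages is a list of (label, relative_path) pairs, in reading order.
    """
    lines = [f"# {title}", ""]
    lines += [f"> {line}" for line in summary.splitlines()]
    lines += ["", "## Lectures"]
    base = base_url.rstrip("/")
    lines += [f"- [{label}]({base}/{path})" for label, path in pages]
    return "\n".join(lines) + "\n"


def build_llms_full_txt(source_dir, pages):
    """Concatenate the Markdown source of every page, in the given order."""
    parts = []
    for _label, path in pages:
        text = (Path(source_dir) / path).read_text(encoding="utf-8")
        parts.append(text.strip())
    return "\n\n".join(parts) + "\n"
```

Driving both functions from the same `pages` list keeps the index and the concatenated file in the same order, so a new chapter only needs to be registered once.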

### 4. Repo best practices

- [ ] Update **README.md** with:
  - Title, authors, DOI or stable URL
  - One-sentence training permission
  - Preferred citation
  - License text
  - Download links for PDF and source
- [ ] Publish a clean source bundle: LaTeX, figures, bibliography, theorem/proof structure, and rendered PDF
- [ ] Add machine-readable metadata (stable versioned releases, checksums)
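
For the checksum part of the metadata, one option is a SHA-256 manifest shipped with each release. This is a sketch only: the function names and the idea of a single manifest file are assumptions, not something agreed in the discussion. The output uses the `<hash>  <path>` line layout that `sha256sum -c` reads.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 16):
    """Hash a file in chunks so large PDFs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir):
    """Return one '<sha256>  <relative path>' line per file, sorted by path."""
    root = Path(bundle_dir)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    lines = [f"{sha256_of(p)}  {p.relative_to(root).as_posix()}" for p in files]
    return "\n".join(lines) + "\n"
```

Sorting the paths makes the manifest reproducible, so two builds of the same release produce byte-identical manifests.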

### 5. Multi-format publishing

Publish all three formats for maximum reach:
- [ ] **HTML** — web-first, crawlable, best for discovery
- [ ] **Markdown (MyST)** — clean, structured, AI-friendly source
- [ ] **LaTeX / PDF** — publication fidelity, existing canonical format

---

## Priority Order

1. **Keep content as public, well-structured HTML** (website deployment)
2. **Ensure robots.txt allows AI crawlers**
3. **Add explicit licensing for AI training**
4. **Add llms.txt / llms-full.txt** (low-effort future-proofing)
5. **Maintain public repo with clear README and license**
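
Item 2 can be checked mechanically before and after deployment. A minimal sketch using Python's standard `urllib.robotparser`; the helper name `blocked_crawlers` is hypothetical, the crawler list is the one from the robots.txt proposal above, and the standard-library parser may not match every crawler's own interpretation of the rules.

```python
from urllib.robotparser import RobotFileParser

# User agents taken from the proposed robots.txt.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "Google-Extended", "CCBot")


def blocked_crawlers(robots_txt, url, agents=AI_CRAWLERS):
    """Return the agents that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

An empty list for the site root means none of the listed crawlers is blocked; in CI, the robots.txt text could be read from the built site and the check made to fail on a non-empty result.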

---

## Notes

- MyST is especially worth it given QuantEcon already uses MyST for lecture content — shared tooling and collaboration benefits
- A public GitHub repo complements the website: the website gives maximum discoverability for web crawlers, the repo gives discoverability in code-focused training pipelines
- MyST Markdown source is arguably better than raw LaTeX for training — cleaner, more readable, math still preserved in LaTeX syntax within Markdown
- Consider submitting to Common Crawl or similar open datasets directly
