Add ChatGPT JSON cleanup tool to reduce metadata bloat #45

Open
chefplay wants to merge 1 commit into diegocard:master from chefplay:claude/reduce-json-metadata-01UpkFtC4pjTH4R15hSW3Jhu

Conversation

chefplay commented Dec 9, 2025

This commit adds a Python script and documentation for cleaning up ChatGPT export JSON files. The tool addresses the common "metadata bloat" problem where JSON exports can be 300MB+ but contain mostly structural metadata rather than actual conversation content.

Features:

  • Strips metadata from ChatGPT JSON exports (timestamps, IDs, node relationships)
  • Reduces file size by 85-95% (typically 300MB → 15-20MB)
  • Extracts pure conversation text in human-readable format
  • Preserves conversation titles, dates, and message order
  • No external dependencies (pure Python standard library)
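
For reviewers who want a feel for the approach, here is a minimal sketch of the kind of extraction described above. It is not the code in clean_my_chat.py. It assumes the commonly seen export layout, where `conversations.json` is a list of conversations, each with a `title`, a `create_time`, and a `mapping` of node IDs to nodes whose `message` holds `author.role` and `content.parts`. The function names `extract_messages` and `clean_export` are hypothetical.

```python
from datetime import datetime, timezone


def extract_messages(conversation):
    """Return (role, text) pairs for one conversation, oldest first.

    Assumes each node in "mapping" may carry a "message" with
    "author.role", "content.parts", and "create_time" (assumed layout).
    """
    messages = []
    for node in conversation.get("mapping", {}).values():
        message = node.get("message")
        if not message:
            continue  # structural node with no message attached
        content = message.get("content") or {}
        parts = content.get("parts") or []
        # Keep only plain-text parts; skip non-string entries.
        text = "\n".join(p for p in parts if isinstance(p, str)).strip()
        if not text:
            continue
        role = (message.get("author") or {}).get("role", "unknown")
        messages.append((message.get("create_time") or 0, role, text))
    # Stable sort by timestamp so message order is preserved.
    messages.sort(key=lambda m: m[0])
    return [(role, text) for _, role, text in messages]


def clean_export(conversations):
    """Render conversations as plain text, keeping title, date, and order."""
    lines = []
    for conv in conversations:
        title = conv.get("title") or "Untitled"
        created = conv.get("create_time")
        date = (
            datetime.fromtimestamp(created, tz=timezone.utc).strftime("%Y-%m-%d")
            if created
            else "unknown date"
        )
        lines.append(f"# {title} ({date})")
        for role, text in extract_messages(conv):
            lines.append(f"{role}: {text}")
        lines.append("")  # blank line between conversations
    return "\n".join(lines)
```

To try it, load the export with `json.load` and pass the resulting list to `clean_export`. Dropping node IDs, parent/child links, and per-message metadata is where the size reduction would come from in a sketch like this.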

Files added:

  • clean_my_chat.py: Main cleanup script
  • CHAT_CLEANUP_GUIDE.md: Comprehensive usage documentation

The cleaned output can be used with:

  • Google NotebookLM for Q&A
  • VS Code for keyword search
  • GPT4All for private local AI analysis


2 participants