chefplay commented Dec 9, 2025

This commit adds a Python script and documentation for cleaning up ChatGPT export JSON files. The tool addresses the common "metadata bloat" problem: an export can exceed 300 MB yet consist mostly of structural metadata rather than actual conversation content.

Features:

  • Strips metadata from ChatGPT JSON exports (timestamps, IDs, node relationships)
  • Reduces file size by 85-95% (typically 300MB → 15-20MB)
  • Extracts pure conversation text in human-readable format
  • Preserves conversation titles, dates, and message order
  • No external dependencies (pure Python standard library)
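
For reviewers who want a feel for what the cleanup involves, here is a minimal sketch of the idea. It is not the code in clean_my_chat.py. It assumes the export is a list of conversations, each with `title`, `create_time`, and a `mapping` of nodes whose `message` holds `author.role`, `create_time`, and `content.parts`; that layout is an assumption about the export format, and the function names `extract_messages`, `render`, and `clean_export` are hypothetical. Ordering by message timestamp is also a simplification: it does not distinguish regenerated or branched replies.

```python
import json
from datetime import datetime, timezone


def extract_messages(conversation):
    """Return (role, text) pairs ordered by timestamp, dropping node metadata.

    Assumes each node in `mapping` may carry a `message` dict (assumed layout).
    """
    messages = []
    for node in conversation.get("mapping", {}).values():
        message = node.get("message")
        if not message:
            continue  # structural node without any message payload
        parts = (message.get("content") or {}).get("parts") or []
        # Keep only plain-text parts; skip non-string payloads.
        text = "\n".join(p for p in parts if isinstance(p, str)).strip()
        if not text:
            continue
        role = (message.get("author") or {}).get("role", "unknown")
        messages.append((message.get("create_time") or 0, role, text))
    messages.sort(key=lambda m: m[0])  # stable sort keeps ties in file order
    return [(role, text) for _, role, text in messages]


def render(conversation):
    """Format one conversation as plain text with its title and date."""
    title = conversation.get("title") or "Untitled"
    created = conversation.get("create_time")
    date = (
        datetime.fromtimestamp(created, tz=timezone.utc).strftime("%Y-%m-%d")
        if created
        else "unknown date"
    )
    lines = [f"# {title} ({date})"]
    lines += [f"{role}: {text}" for role, text in extract_messages(conversation)]
    return "\n".join(lines)


def clean_export(in_path, out_path):
    """Read an export JSON file and write the text-only version."""
    with open(in_path, encoding="utf-8") as f:
        conversations = json.load(f)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n\n".join(render(c) for c in conversations))
```

Under these assumptions, calling `clean_export("conversations.json", "cleaned.txt")` would write one titled block per conversation; both file names are placeholders.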

Files added:

  • clean_my_chat.py: Main cleanup script
  • CHAT_CLEANUP_GUIDE.md: Comprehensive usage documentation

The cleaned output can be used with:

  • Google NotebookLM for Q&A
  • VS Code for keyword search
  • GPT4All for private local AI analysis
