Bookbot is a small command-line utility for analyzing plain-text books. It counts words and computes character frequency statistics (case-insensitive, letters only). The repository includes a few public-domain sample books in the books/ folder so you can try the tool immediately.
Features:
- Word count for a book
- Character frequency analysis (letters only, case-insensitive)
- Simple, dependency-free Python code suitable for learning or small scripts
Requirements:
- Python 3.8+ (tested with Python 3.10+)
Installation:
- Clone the repository:
git clone <repo-url>
cd bookbot
- (Optional) Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
There are no external dependencies to install for the basic functionality.
Usage:
Run the analyzer by passing the path to a text file (one of the included books or your own text file):
python3 main.py books/frankenstein.txt
The CLI uses a small argument parser and shows a clear error if the file is missing or unreadable.
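The argument handling described above could look roughly like the following. This is a minimal sketch, not the actual contents of main.py: it assumes the parser is argparse from the standard library, and the argument name book_path, the helper load_book, and the exact error wording are all hypothetical.

```python
import argparse
import sys


def get_book_text(path):
    """Return the full contents of the text file at path."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Analyze a plain-text book.")
    # "book_path" is a hypothetical argument name chosen for this sketch.
    parser.add_argument("book_path", help="path to a plain-text book file")
    return parser.parse_args(argv)


def load_book(path):
    """Read the book, or exit with a readable message if that fails."""
    try:
        return get_book_text(path)
    except (OSError, UnicodeDecodeError) as exc:
        # OSError covers missing files, directories, and permission errors.
        sys.exit(f"error: could not read {path}: {exc}")


def main(argv=None):
    args = parse_args(argv)
    text = load_book(args.book_path)
    print(f"Read {len(text)} characters from {args.book_path}")
    return 0
```

To run this as a script, the file would end with a guarded call such as sys.exit(main()). Passing a string to sys.exit prints it to stderr and exits with status 1, which keeps the failure path short.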
Example output (abbreviated):
============ BOOKBOT ============
Analyzing book found at books/frankenstein.txt...
----------- Word Count ----------
Found 77,000 total words
--------- Character Count -------
e: 45000
t: 30000
a: 28000
...
============= END ===============
Notes:
- main.py is the CLI entry point. It reads a file, computes word count using get_num_words, counts alphabetic characters with count_characters, and prints a sorted list of character counts.
- stats.py contains reusable functions: get_book_text, get_num_words, count_characters, sort_char_counts, and print_report.
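For readers who want a feel for the counting logic, here is one way the three counting functions might be written. The function names come from stats.py, but the bodies, return types, and tie-breaking order below are assumptions made for illustration, not the project's actual code.

```python
def get_num_words(text):
    """Count whitespace-separated words."""
    return len(text.split())


def count_characters(text):
    """Map each lowercase letter to its number of occurrences."""
    counts = {}
    for ch in text.lower():
        # str.isalpha() is also True for non-ASCII letters such as "é".
        if ch.isalpha():
            counts[ch] = counts.get(ch, 0) + 1
    return counts


def sort_char_counts(counts):
    """Return (char, count) pairs, most frequent first.

    Ties are broken alphabetically; that ordering is an assumption.
    """
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "The cat sat. THE END!"
print(f"Found {get_num_words(sample)} total words")
for ch, n in sort_char_counts(count_characters(sample)):
    print(f"{ch}: {n}")
```

Splitting on whitespace means punctuation stays attached to words ("sat." counts as one word), which is usually fine for a rough total.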
Project structure:
- main.py: CLI script to analyze a book file
- stats.py: core analysis functions
- books/: sample book text files (tracked in the repository)
  - frankenstein.txt
  - mobydick.txt
  - prideandprejudice.txt
- .gitignore: standard Python ignores (virtualenvs, caches)
- README.md: this file
Development notes & suggestions:
- The project purposely keeps dependencies minimal. If you add new dependencies, add a requirements.txt or pyproject.toml.
- Consider adding unit tests for stats.py (for example, using pytest) to validate counting logic and edge cases (empty files, punctuation-only content, non-ASCII characters).
- There is some duplicate functionality: both main.py and stats.py include a get_book_text helper; consider centralizing file IO in stats.py and having main.py import it.
- Download some more books from Project Gutenberg!
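As a starting point for the suggested tests, the sketch below covers the edge cases listed above in a form pytest can collect. The two functions at the top are hypothetical stand-ins so the example runs on its own; in the repository they would be replaced by an import from stats.py, and the expected values should be adjusted if the real functions treat non-ASCII letters differently.

```python
from collections import Counter


# Stand-ins for the real functions; in the project, use
# "from stats import count_characters, get_num_words" instead.
def count_characters(text):
    return dict(Counter(ch for ch in text.lower() if ch.isalpha()))


def get_num_words(text):
    return len(text.split())


def test_empty_text():
    assert get_num_words("") == 0
    assert count_characters("") == {}


def test_punctuation_only():
    assert count_characters("?!... --- 123") == {}


def test_case_insensitive():
    assert count_characters("AaA") == {"a": 3}


def test_non_ascii_letters():
    # "\u00c9t\u00e9" is "Été"; str.isalpha() accepts accented letters,
    # so this stand-in counts them alongside ASCII letters.
    assert count_characters("\u00c9t\u00e9") == {"\u00e9": 2, "t": 1}
```

Saved under a name such as tests/test_stats.py (a hypothetical path), these run with a plain pytest invocation from the repository root.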
Contributing
Contributions are welcome. Please fork the repo, make changes on a feature branch, and open a pull request against main. If you add behavior or bug fixes, include tests and update the README with any new usage instructions.
License
See the LICENSE file for details.