🗺️ Sitemap Harvester

🚀 A blazingly fast Python tool to harvest URLs and metadata from website sitemaps like a digital archaeologist!

🚀 Quick Start

Installation

pip install sitemap-harvester

Basic Usage

# Harvest a website's sitemap
sitemap-harvester --url https://example.com

# Custom output file and timeout
sitemap-harvester --url https://example.com --output my_data.csv --timeout 15
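
For a feel of what harvesting a sitemap involves, here is a minimal sketch that pulls the `<loc>` entries out of a sitemap document using only the standard library. It is an illustration under stated assumptions, not the tool's actual code: the function name `extract_locs` and the sample XML are hypothetical, and the real implementation may handle sitemap indexes, compression, and errors differently.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap document (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]


# Hypothetical sample input, not taken from a real site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in extract_locs(SAMPLE):
        print(url)
```

The same helper also works on a sitemap index file, since the nested sitemap locations there use `<loc>` elements in the same namespace.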

🎯 What Gets Extracted?

📝 Page Title - The main title of each page
📄 Meta Description - SEO descriptions
🏷️ Keywords - Meta keywords (if present)
👤 Author - Page author information
🔗 Canonical URL - Canonical link references
🖼️ Open Graph Data - Social media metadata
🌐 Custom Meta Tags - Any additional meta information
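
The fields above map onto a handful of HTML tags in a page's `<head>`. The sketch below shows one way to collect them with the standard library's `html.parser`. It is only an illustration: `MetaCollector`, `extract_metadata`, the dictionary keys, and the sample HTML are all assumptions, and the tool itself may use a different parser and different output column names.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects the title, meta tags, and canonical link (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.data = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Standard meta tags use name=..., Open Graph tags use property=...
            key = attrs.get("name") or attrs.get("property")
            content = attrs.get("content")
            if key and content is not None:
                self.data[key.lower()] = content
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            if attrs.get("href"):
                self.data["canonical"] = attrs["href"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.data["title"] = self.data.get("title", "") + data


def extract_metadata(html: str) -> dict:
    """Parse an HTML string and return the collected metadata as a dict."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    if "title" in collector.data:
        collector.data["title"] = collector.data["title"].strip()
    return collector.data


# Hypothetical sample page.
SAMPLE_HTML = """<html><head>
<title>Example Page</title>
<meta name="description" content="A sample page.">
<meta name="author" content="Jane Doe">
<meta property="og:title" content="Example OG Title">
<link rel="canonical" href="https://example.com/page">
</head><body></body></html>"""

if __name__ == "__main__":
    print(extract_metadata(SAMPLE_HTML))
```

Any meta tag with a `name` or `property` attribute ends up in the result, which is one simple way to pick up custom meta tags alongside the well-known ones.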

💡 Pro Tips

Raise --timeout (for example, --timeout 15) for slower websites or large sitemaps
The tool automatically deduplicates URLs for you
Check the console output for real-time progress updates
Large sitemaps? Grab a coffee ☕ and let it work its magic!
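
The deduplication tip above can be pictured with a small order-preserving sketch. This assumes exact string matching; how the tool actually compares URLs (for example, whether it normalizes trailing slashes or query strings) is not stated here, and `dedupe_urls` is a hypothetical name.

```python
def dedupe_urls(urls: list[str]) -> list[str]:
    """Drop repeated URLs while keeping first-seen order (hypothetical helper)."""
    # dict keys are unique and keep insertion order, so this removes repeats
    # without reshuffling the list.
    return list(dict.fromkeys(urls))


if __name__ == "__main__":
    harvested = [
        "https://example.com/",
        "https://example.com/about",
        "https://example.com/",
    ]
    print(dedupe_urls(harvested))
```

Keeping the first-seen order means the output still follows the order in which the sitemap listed its pages.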

🤝 Contributing

Found a bug? Have a feature request? Contributions are welcome! Feel free to open an issue or submit a pull request.

📜 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Happy harvesting! 🌾

meysam81/sitemap-harvester
