cameronmore/LinkGo

LinkGo

A link checker written in Go.

Note: this tool is still in development.

Overview

I built this tool for two reasons. First, many existing link checkers are designed to crawl webpages rather than check a plain list of URLs. Second, I simply wanted to write a tool in Go.

Without goroutines, checking each link sequentially, the tool took 1m40.1707165s to validate 691 links. After optimization, the same job now takes 21.5917739s!
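To illustrate the kind of speedup goroutines make possible, here is a minimal worker-pool sketch. It is not LinkGo's actual implementation: the names checkAll, httpStatus, and result, the worker count of 8, and the 10-second timeout are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

// result pairs a URL with the status code observed for it.
// (Hypothetical type, not taken from the LinkGo source.)
type result struct {
	URL  string
	Code int
}

// checkAll runs check on every URL using a fixed number of worker
// goroutines. Results are returned in the same order as the input.
func checkAll(urls []string, workers int, check func(string) int) []result {
	if workers < 1 {
		workers = 1
	}
	results := make([]result, len(urls))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker, so the
			// writes to results never overlap.
			for i := range jobs {
				results[i] = result{URL: urls[i], Code: check(urls[i])}
			}
		}()
	}
	for i := range urls {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

// httpStatus returns a checker that issues a GET request and reports
// the status code, or 0 if the request itself failed.
func httpStatus(client *http.Client) func(string) int {
	return func(url string) int {
		resp, err := client.Get(url)
		if err != nil {
			return 0
		}
		defer resp.Body.Close()
		return resp.StatusCode
	}
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second} // assumed timeout
	urls := []string{
		"https://www.id.uscourts.gov/glossary.htm",
		"https://www.merriam-webster.com/dictionary/anthropogenic",
	}
	for _, r := range checkAll(urls, 8, httpStatus(client)) {
		fmt.Println(r.Code, r.URL)
	}
}
```

Passing the checker in as a function keeps the pool independent of the network, which makes it easy to try out with a stub.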

Usage

This tool takes a very specific input format: a CSV file with no header row and exactly one URL per line. It also produces a very specific output: a JSON Lines file.
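As a rough sketch of what reading that input format can look like, the snippet below uses the standard encoding/csv package. The function names readURLs and parseURLs are hypothetical, and rejecting rows with more than one column is an assumption about how strict the tool should be.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// readURLs reads a header-less CSV with exactly one column and
// returns the value of each row. Blank lines are skipped by the
// csv package itself.
func readURLs(r io.Reader) ([]string, error) {
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = 1 // assumption: reject rows with extra columns
	cr.TrimLeadingSpace = true
	var urls []string
	for {
		rec, err := cr.Read()
		if err == io.EOF {
			return urls, nil
		}
		if err != nil {
			return nil, err
		}
		urls = append(urls, rec[0])
	}
}

// parseURLs is a convenience wrapper for in-memory input.
func parseURLs(data string) ([]string, error) {
	return readURLs(strings.NewReader(data))
}

func main() {
	input := "https://www.id.uscourts.gov/glossary.htm\n" +
		"https://www.merriam-webster.com/dictionary/anthropogenic\n"
	urls, err := parseURLs(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

For a real file, the same readURLs function can be handed the *os.File returned by os.Open.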

There are no 'commands' for this tool, since it does exactly one thing, though it does have several flags.

  • -i indicates the input .csv file. (required)
  • -o indicates the output .jsonl file. (required)
  • -a is an optional flag that selects whether all links (true) or only broken links (false) are reported. It defaults to false, reporting only broken links.

Here is an example of how to run the tool:

linkgo -i input.csv -o output.jsonl -a false
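For readers curious how flags like these can be handled in Go, here is a minimal sketch using the standard flag package. It is an illustration rather than LinkGo's actual code: the options struct and parseArgs function are hypothetical. One detail worth knowing is that the flag package only accepts boolean flags in the form -a=false, so to support the space-separated -a false shown above, this sketch reads -a as a string and converts it with strconv.ParseBool.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"strconv"
)

// options holds the parsed command-line settings (hypothetical type).
type options struct {
	input  string
	output string
	all    bool
}

// parseArgs parses the -i, -o and -a flags described above.
func parseArgs(args []string) (options, error) {
	fs := flag.NewFlagSet("linkgo", flag.ContinueOnError)
	in := fs.String("i", "", "input .csv file (required)")
	out := fs.String("o", "", "output .jsonl file, or stdout (required)")
	// Read as a string so that "-a false" works; a flag.Bool would
	// treat the separate "false" as a positional argument.
	all := fs.String("a", "false", "true: all links, false: only broken links")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	if *in == "" || *out == "" {
		return options{}, errors.New("both -i and -o are required")
	}
	allLinks, err := strconv.ParseBool(*all)
	if err != nil {
		return options{}, fmt.Errorf("invalid value for -a: %w", err)
	}
	return options{input: *in, output: *out, all: allLinks}, nil
}

func main() {
	// A real tool would pass os.Args[1:] here instead.
	opts, err := parseArgs([]string{"-i", "input.csv", "-o", "output.jsonl", "-a", "false"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", opts)
}
```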

Example

Using the tool on the file provided in the example directory looks like this:

linkgo -i Links.csv -o Result.jsonl -a true

Here, Links.csv looks like:

https://www.id.uscourts.gov/glossary.htm
https://www.merriam-webster.com/dictionary/anthropogenic

The command produces a Result.jsonl that looks like:

{"url":"https://www.id.uscourts.gov/glossary.htm","response_status_code":404,"response_status":"Not Found","time_of_validation":"2025-02-01T13:32:20Z"}

{"url":"https://www.merriam-webster.com/dictionary/anthropogenic","response_status_code":200,"response_status":"OK","time_of_validation":"2025-02-01T13:32:20Z"}

If stdout is passed as the argument to the -o flag, the results are printed to standard output instead of a file.
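The output records above map naturally onto a Go struct with JSON tags. The sketch below shows one way to produce such lines. The field names come from the example output, but the Result type, the encodeJSONL function, and the rule that any status outside 200 to 399 counts as broken are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Result mirrors the fields seen in the example output.
// (Hypothetical type; the timestamp is kept as an RFC 3339 string.)
type Result struct {
	URL              string `json:"url"`
	StatusCode       int    `json:"response_status_code"`
	Status           string `json:"response_status"`
	TimeOfValidation string `json:"time_of_validation"`
}

// encodeJSONL renders one JSON object per line. When all is false,
// only links considered broken are included. Assumption: a link is
// broken if its status code is outside the 200-399 range.
func encodeJSONL(results []Result, all bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false) // keep characters like '&' in URLs readable
	for _, r := range results {
		ok := r.StatusCode >= 200 && r.StatusCode < 400
		if !all && ok {
			continue
		}
		// Encode appends a newline after each object.
		if err := enc.Encode(r); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}

func main() {
	results := []Result{
		{"https://www.id.uscourts.gov/glossary.htm", 404, "Not Found", "2025-02-01T13:32:20Z"},
		{"https://www.merriam-webster.com/dictionary/anthropogenic", 200, "OK", "2025-02-01T13:32:20Z"},
	}
	out, err := encodeJSONL(results, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out) // writing to os.Stdout or a file works the same way
}
```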

Benchmarks

Some basic benchmarking produced the following timings:

  • 5528 links: 5m0.5996156s
  • 2764 links: 2m34.1643896s
  • 1382 links: 22.0518904s
  • 691 links: 21.5917739s
  • 355 links: 21.5984185s
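To compare these runs more directly, the timings can be converted into links per second. The helper below is a small sketch for that arithmetic; linksPerSecond is a hypothetical name, and the durations are the ones listed above, written in the format Go's time.ParseDuration accepts.

```go
package main

import (
	"fmt"
	"time"
)

// linksPerSecond converts a link count and a Go duration string
// (for example "21.5917739s") into a throughput figure.
func linksPerSecond(links int, elapsed string) (float64, error) {
	d, err := time.ParseDuration(elapsed)
	if err != nil {
		return 0, err
	}
	if d <= 0 {
		return 0, fmt.Errorf("non-positive duration %q", elapsed)
	}
	return float64(links) / d.Seconds(), nil
}

func main() {
	runs := []struct {
		links   int
		elapsed string
	}{
		{5528, "5m0.5996156s"},
		{2764, "2m34.1643896s"},
		{1382, "22.0518904s"},
		{691, "21.5917739s"},
		{355, "21.5984185s"},
	}
	for _, r := range runs {
		rate, err := linksPerSecond(r.links, r.elapsed)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%5d links: %.1f links/s\n", r.links, rate)
	}
}
```

Note that the three smallest runs finish in roughly the same wall-clock time, so throughput does not scale evenly across these sizes.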

TODO

I have several features and basic functionality planned for development, including:

  • ✅ Outputting only 'dead' links
  • ✅ Printing to standard out.
  • ✅ Using goroutines to speed up link checking
  • ⬜️ Handle different output formats, like .csv or .tsv.
  • ⬜️ Handle different input formats.
