Markup Link Checker

Check for broken links in markup files. Currently HTML and Markdown files are supported. The Markup Link Checker can easily be integrated into your CI pipeline to prevent broken links in your markup docs.

Features

  • Find and check links in Markdown and HTML files
  • Validate absolute and relative file paths and URLs
  • Support for ignore/disable comments to skip specific links or blocks
  • User-friendly command line interface
  • Easy CI pipeline integration
  • Very fast execution using async Rust
  • Efficient link-resolving strategy that minimizes network load
  • Throttle option to prevent 429 Too Many Requests errors
  • Report broken links via GitHub workflow commands

Install Locally

There are different ways to install and use mlc.

Cargo

Use Rust's package manager cargo to install mlc from crates.io:

cargo install mlc

Download Binaries

To download a compiled binary version of mlc, go to the GitHub releases page and download the binary compiled for your platform:

  • Linux: x86_64 and aarch64 (arm64)
  • macOS: aarch64 (Apple Silicon)
  • Windows: x86_64

Arch Linux

You can install from the official repositories using pacman:

pacman -S markuplinkchecker

CI Pipeline

GitHub Actions

Use mlc on GitHub via the GitHub Action from the Marketplace.

- name: Markup Link Checker (mlc)
  uses: becheran/mlc@v1

Pass mlc command line arguments via the with key:

- name: Markup Link Checker (mlc)
  uses: becheran/mlc@v1
  with:
    args: >-
      ./README.md
      -H "User-Agent: Mozilla/5.0"
      -H "Authorization: Bearer ${{ secrets.MY_TOKEN }}"

The action uses GitHub workflow commands to highlight broken links as annotations.

Binary

To integrate mlc into a CI pipeline running in a Linux x86_64 environment, add the following commands to download and execute it:

curl -L https://github.com/becheran/mlc/releases/download/v1.2.0/mlc-x86_64-linux -o mlc
chmod +x mlc
./mlc

For Linux aarch64/arm64 environments, use:

curl -L https://github.com/becheran/mlc/releases/download/v1.2.0/mlc-aarch64-linux -o mlc
chmod +x mlc
./mlc
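If the same CI script has to run on both architectures, the two download commands above can be folded into one helper. The following is only a sketch: the function names and the MLC_VERSION variable are hypothetical, and the asset names and the v1.2.0 version are copied from the commands above.

```shell
#!/bin/sh
# Map an architecture name (as reported by `uname -m`) to the Linux
# release asset names used in this README. Other platforms are not covered.
mlc_asset_for_arch() {
  case "$1" in
    x86_64|amd64) echo "mlc-x86_64-linux" ;;
    aarch64|arm64) echo "mlc-aarch64-linux" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# MLC_VERSION is a hypothetical variable; it defaults to the version shown above.
MLC_VERSION="${MLC_VERSION:-v1.2.0}"

# Download the matching binary into ./mlc and make it executable.
download_mlc() {
  asset="$(mlc_asset_for_arch "$(uname -m)")" || return 1
  curl -fL "https://github.com/becheran/mlc/releases/download/${MLC_VERSION}/${asset}" -o mlc
  chmod +x mlc
}
```

A CI step could then call download_mlc followed by ./mlc. The -f flag makes curl fail on HTTP errors instead of saving an error page as the binary.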

For an example, take a look at the ntest repo, which uses mlc in its CI pipeline.

Docker

Use the mlc Docker image from Docker Hub:

docker run becheran/mlc mlc

Usage

Once you have mlc installed, it can be called from the command line. The following call will check all links in markup files found in the current folder and all subdirectories:

mlc

Another example is to call mlc on a certain directory or file:

mlc ./docs

To check only specific files, for example all README.md files in a monorepo:

mlc --files "./README.md,./project1/README.md,./project2/README.md"

Alternatively, you may want to ignore all files currently ignored by git (requires the git binary to be found on $PATH) and set a root-dir for relative links:

mlc --gitignore --root-dir .
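According to the argument table below, --root-dir changes how file system links that start with a slash are resolved. The following sketch only illustrates that mapping for the Linux case; it is not mlc's actual implementation, and the function name resolve_rooted_link is hypothetical.

```shell
#!/bin/sh
# Illustrative only: prefix links that start with a slash with the root dir.
# $1: root dir, $2: link target as written in the markup file
resolve_rooted_link() {
  case "$2" in
    # Strip one trailing slash from the root dir to avoid a double slash.
    /*) printf '%s%s\n' "${1%/}" "$2" ;;
    # Links without a leading slash are left untouched in this sketch.
    *) printf '%s\n' "$2" ;;
  esac
}
```

For instance, resolve_rooted_link /env/another/dir /dir/other/file.md prints /env/another/dir/dir/other/file.md, which matches the example given for --root-dir in the argument table.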

Call mlc with the --help flag (short form -h) to display all available CLI arguments:

mlc -h

The following arguments are available:

| Argument | Short | Description |
|---|---|---|
| <directory> | | Only positional argument. Path to the directory which shall be checked with all sub-dirs. Can also be a specific filename which shall be checked. |
| --help | -h | Print help |
| --debug | -d | Show verbose debug information |
| --do-not-warn-for-redirect-to | | Do not warn for links which redirect to the given URL. Allows the same link format as --ignore-links. For example, --do-not-warn-for-redirect-to "http*://crates.io*" will not warn for links which redirect to the crates.io website. |
| --offline | -o | Do not check any web links. Renamed from --no-web-links, which is still an alias for backward compatibility. |
| --match-file-extension | -e | Set this flag if the file extension shall be checked as well. For example, the markup link [link](dir/file) matches if a file called file.md exists in dir, but fails when the --match-file-extension flag is set. |
| --version | -V | Print current version of mlc |
| --ignore-path | -p | Comma separated list of directories or files which shall be ignored. For example |
| --gitignore | -g | Ignore all files currently ignored by git (requires git binary to be available on $PATH). |
| --gituntracked | -u | Ignore all files currently untracked by git (requires git binary to be available on $PATH). |
| --ignore-links | -i | Comma separated list of links which shall be ignored. Use simple ? and * wildcards. For example, --ignore-links "http*://crates.io*" will skip all links to the crates.io website. See the library used for more information. |
| --markup-types | -t | Comma separated list of markup types which shall be checked. Possible values: md, html |
| --root-dir | -r | All links to the file system starting with a slash on Linux or backslash on Windows will use another virtual root dir. For example, the link in a file [link](/dir/other/file.md) checked with the CLI arg --root-dir /env/another/dir will let mlc check the existence of /env/another/dir/dir/other/file.md. |
| --throttle | -T | Number of milliseconds to wait between web requests to the same host. Default is zero, which means no throttling. Set this if you need to slow down the web request frequency to avoid 429 - Too Many Requests responses. For example, with --throttle 15, mlc waits 15 ms between each HTTP check to the same host. Note that this setting can slow down the link checker. |
| --csv | | Path to a CSV file which contains all failed requests and warnings in the format source,line,column,target,severity. The severity column contains ERR for errors and WARN for warnings. |
| --files | -f | Comma separated list of files which shall be checked. For example, --files "./README.md,./docs/README.md" will check only the specified files. This is useful for checking specific files in a monorepo without having to exclude many directories. |
| --http-headers | -H | Comma separated list of custom HTTP headers in the format 'Name: Value'. This is useful for setting custom user agents or other headers required by specific websites. For example, --http-headers "User-Agent: Mozilla/5.0,X-Custom-Header: value" will set both a custom user agent and an additional header. |
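The file written by --csv can be post-processed in a CI script, for example to count errors and warnings separately. The sketch below assumes only the column order stated above (source,line,column,target,severity). The function name count_severity is hypothetical, and whether the file starts with a header row is not assumed either way, since a header row never matches ERR or WARN.

```shell
#!/bin/sh
# Count the rows of a given severity in a CSV report.
# $1: path to the CSV file, $2: severity to count (ERR or WARN)
count_severity() {
  # Compare the last field, so a comma inside the target column
  # does not shift the severity column.
  awk -F',' -v sev="$2" '$NF == sev { n++ } END { print n + 0 }' "$1"
}
```

After a run such as mlc --csv output.csv, calling count_severity output.csv ERR prints the number of error rows, which a pipeline could compare against a threshold.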

Ignore Comments

You can use HTML comments to disable link checking for specific lines or blocks in both markdown and HTML files:

Disable for Current Line

<!-- mlc-disable-line --> [This link](http://broken-link.invalid) will be ignored

Disable for Next Line

<!-- mlc-disable-next-line -->
[This link](http://broken-link.invalid) will be ignored

Disable/Enable Blocks

[This link](http://example.com) will be checked

<!-- mlc-disable -->
[This link](http://broken-link.invalid) will be ignored
[This link](http://also-broken.invalid) will also be ignored
<!-- mlc-enable -->

[This link](http://example.org) will be checked again

If you use <!-- mlc-disable --> without a corresponding <!-- mlc-enable -->, all links from that point until the end of the file will be ignored.

These comments work in both markdown and HTML files.

All optional arguments which can be passed via the command line can also be configured via the .mlc.toml config file in the working directory:

# Print debug information to console
debug = true
# Do not warn for links which redirect to the given URL
do-not-warn-for-redirect-to = ["http*://crates.io*"]
# Do not check web links
offline = true
# Check the exact file extension when searching for a file
match-file-extension = true
# List of files and directories which will be ignored
ignore-path = ["./ignore-me", "./src"]
# Ignore all files ignored by git
gitignore = true
# List of links which will be ignored
ignore-links = ["http://ignore-me.de/*", "http://*.ignoresub-domain/*"]
# List of markup types which shall be checked
markup-types = ["Markdown", "Html"]
# Wait time in milliseconds between http request to the same host
throttle = 100
# Path to the root folder used to resolve all relative paths
root-dir = "./"
# Path to csv file which contains all failed requests and warnings
csv = "output.csv"
# List of specific files to check
files = ["./README.md", "./docs/README.md"]
# Custom HTTP headers to send with web requests
http-headers = ["User-Agent: Mozilla/5.0", "X-Custom-Header: value"]
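The ignore-links and do-not-warn-for-redirect-to patterns above use simple ? and * wildcards. To get a feel for which links a pattern would cover, the sketch below approximates those semantics with shell pattern matching. This is an assumption-laden stand-in, not the matching library mlc uses: shell patterns also interpret bracket expressions such as [a-z], and the function name matches_pattern is hypothetical.

```shell
#!/bin/sh
# Approximate wildcard matching: ? matches one character, * matches any run
# of characters (including slashes), and the whole link must match.
# $1: wildcard pattern, $2: link to test
matches_pattern() {
  case "$2" in
    # Unquoted on purpose so the shell treats $1 as a pattern.
    $1) return 0 ;;
    *) return 1 ;;
  esac
}
```

With this approximation, matches_pattern 'http*://crates.io*' 'https://crates.io/crates/mlc' succeeds, while the same pattern tested against 'https://docs.rs/mlc' fails.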

Changelog

Check out the changelog file to see the changes between different versions.

License

This project is licensed under the MIT License - see the LICENSE file for more details.
