Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Oct 16, 2025

Task: Create GitHub workflow to check links on live site ✅

Summary

Implemented automated link checking workflow aligned with @cncf/techdocs setup using htmltest (built from @chalin's maintained fork) and Makefile, with improvements based on code review feedback.

Latest Updates (addressing code review feedback)

Fixed Makefile - Removed broken ifeq/endif, added file existence check in refcache-save
Improved workflow conditions - Use steps.htmltest.outcome instead of failure()/success()
Added x.com - Twitter's new domain added to ignore list
Cleaned up code - Removed unused imports, fixed file paths
Use @chalin's htmltest fork - Build from source using Go for latest patches and improvements
Go-based installation - Workflow uses Go to build htmltest from @chalin's dev/main branch
Local development support - Makefile builds from source with helpful error if Go not installed
Added .nvmrc - Set to lts/* for flexible Node version management
Updated workflow - Now uses node-version-file: .nvmrc instead of hardcoded version
Use npm script - Workflow calls npm run check:links instead of make directly
Extracted JS code - Moved analysis script to .github/scripts/analyze-htmltest.js
Cleaned .htmltest.yml - Removed empty fields, improved URL patterns with ^https://(www\.)? prefix
Streamlined Makefile - Added .PHONY targets, removed unnecessary default target, better organization

Implementation Details

Files Added:

  1. .nvmrc - Node version set to lts/*
  2. .github/scripts/analyze-htmltest.js - Link check analysis script (extracted from workflow)
  3. Makefile - Build orchestration (builds htmltest from @chalin's fork using Go)
  4. .htmltest.yml - Link checker configuration with precise URL patterns
  5. static/refcache.json - Link check cache

Files Modified:

  1. package.json - Already had correct scripts calling make check-links
  2. .github/workflows/check-links.yml - Uses Go to build htmltest from source, uses .nvmrc, extracted script, improved conditions
  3. .gitignore - Added tmp/ for htmltest artifacts
  4. README.md - Documentation about the setup, Go requirement, and @chalin's fork

Key Features

Uses maintained htmltest fork - @chalin's fork (github.com/chalin/htmltest) with latest patches and improvements
Built from source - Uses Go to build htmltest in CI and locally for latest updates
Aligned with @cncf/techdocs - Uses same tools and structure
Automated weekly checks - Runs every Monday at 9:00 UTC
Manual workflow dispatch - Can be triggered from Actions tab
Issue automation - Creates/updates/closes issues based on results
Reference caching - Speeds up subsequent checks (12-month cache)
Flexible Node version - Uses .nvmrc for easier version management
Maintainable code - JS analysis in separate file, cleaner Makefile
Robust error handling - Better condition checking, file existence guards

How It Works

The workflow:

  1. Sets up Node.js and Go
  2. Builds htmltest from @chalin's fork (github.com/chalin/htmltest, dev/main branch)
  3. Installs npm dependencies and builds the site with npm run build
  4. Runs npm run check:links which calls the Makefile
  5. Makefile runs htmltest with cache restore/save (guarded by file existence checks)
  6. Analyzes results using .github/scripts/analyze-htmltest.js
  7. Creates GitHub issues if broken links found (using steps.htmltest.outcome == 'failure')
  8. Closes issues when all links are fixed (using steps.htmltest.outcome == 'success')
  9. Uploads artifacts for review

Testing

  • ✅ YAML syntax validated
  • ✅ .htmltest.yml syntax validated
  • ✅ Analysis script validated
  • ✅ Site builds successfully
  • ✅ Makefile structure improved and streamlined
  • ✅ Configuration aligned with techdocs standards
  • ✅ Go-based installation implemented
  • ✅ File path references fixed
  • ✅ Workflow conditions improved

Local Usage

# Check links locally (requires Go to be installed)
npm run check:links

Note: Local link checking requires Go to build htmltest from source. If you don't have Go, you can install a pre-built htmltest binary from the releases page or use the bash install script.

This implementation uses @chalin's maintained fork to ensure we have the latest patches and improvements for our link checking needs, addressing all code review feedback while maintaining alignment with @cncf/techdocs standards.

Original prompt

On .github, use the check:links npm command to check links on the live site, and make suggestions as to how to fix them.


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

@netlify
Copy link

netlify bot commented Oct 16, 2025

Deploy Preview for contribute-cncf-io ready!

Name Link
🔨 Latest commit 05d35da
🔍 Latest deploy log https://app.netlify.com/projects/contribute-cncf-io/deploys/6939fb54249c4e0008bd1f2c
😎 Deploy Preview https://deploy-preview-83--contribute-cncf-io.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@nate-double-u
Copy link
Member

nate-double-u commented Oct 16, 2025

@copilot, I based the package.json file for this site on the CNCF Techdocs site, you can find it here: https://github.com/cncf/techdocs (@cncf/techdocs). That repo may have details about the makefile that is missing here. Please stay as close to that site's setup as you can here.

Copilot AI changed the title [WIP] Check links on the live site and suggest fixes Add automated link checking workflow for live site with intelligent fix suggestions Oct 16, 2025
Copilot AI requested a review from nate-double-u October 16, 2025 22:30
Copy link
Member

@nate-double-u nate-double-u left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot, Let's revise using the Makefile method from @cncf/techdocs (https://github.com/cncf/techdocs/blob/main/Makefile) I had mentioned.

Here's the content of that site's Makefile in case you're not able to access it:

# cSpell:ignore htmltest refcache
# Set REFCACHE to another value to disable htmltest refcache-file manipulation
REFCACHE?=refcache
HTMLTEST_DIR=tmp
HTMLTEST?=htmltest # Specify as make arg if different
HTMLTEST_ARGS?=--log-level 1
LINK_CACHE_FILE?=refcache.json
LINK_CACHE_FILE_DEST_DIR?=static
LINK_CACHE_FILE_SRC_DIR?=$(HTMLTEST_DIR)/.htmltest

# Use $(HTMLTEST) in PATH, if available; otherwise, we'll get a copy
ifeq (, $(shell which $(HTMLTEST)))
override HTMLTEST=$(HTMLTEST_DIR)/bin/htmltest
ifeq (, $(shell which $(HTMLTEST)))
GET_LINK_CHECKER_IF_NEEDED=get-link-checker
endif
endif

default:
	@echo "Make what? Target list:\n"
	@make -rpn | grep '^[a-z]\S*:' | sed 's/://' | sort

$(LINK_CACHE_FILE_SRC_DIR):
	mkdir -p $(LINK_CACHE_FILE_SRC_DIR)

$(LINK_CACHE_FILE_DEST_DIR)/$(LINK_CACHE_FILE):
	mkdir -p $(LINK_CACHE_FILE_DEST_DIR)
	echo '{}' > $(LINK_CACHE_FILE_DEST_DIR)/$(LINK_CACHE_FILE)

refcache-restore: $(LINK_CACHE_FILE_DEST_DIR)/$(LINK_CACHE_FILE) $(LINK_CACHE_FILE_SRC_DIR)
ifeq (refcache, $(REFCACHE))
	cp $(LINK_CACHE_FILE_DEST_DIR)/$(LINK_CACHE_FILE) $(LINK_CACHE_FILE_SRC_DIR)/
else
	@echo "SKIPPING refcache-restore"
endif

refcache-save:
ifeq (refcache, $(REFCACHE))
	cp $(LINK_CACHE_FILE_SRC_DIR)/$(LINK_CACHE_FILE) $(LINK_CACHE_FILE_DEST_DIR)/
	npx prettier --prose-wrap=always --write $(LINK_CACHE_FILE_DEST_DIR)/$(LINK_CACHE_FILE)
else
	@echo "SKIPPING refcache-save"
endif

check-links: $(GET_LINK_CHECKER_IF_NEEDED) \
	refcache-restore check-links-only refcache-save

check-links-only:
	$(HTMLTEST) $(HTMLTEST_ARGS)

clean:
	rm -rf $(HTMLTEST_DIR) public/* resources

get-link-checker:
	rm -Rf $(HTMLTEST_DIR)/bin
	curl https://htmltest.wjdp.uk | bash -s -- -b $(HTMLTEST_DIR)/bin

@nate-double-u
Copy link
Member

@copilot, for reference, here's that site's package.json as well:

{
  "name": "techdocs",
  "version": "0.0.0",
  "description": "Resources provided by the CNCF Technical Documentation team.",
  "scripts": {
    "_build": "npm run docus:build",
    "_check:format:any": "npx prettier --check --ignore-path ''",
    "_check:format:delta": "npm run _check:format:any -- $(npm run -s _list:git:delta)",
    "_check:format": "npx prettier --check .",
    "_check:links-md": "bash -c 'for f in *.md `find analyses -name \"*.md\"`; do npx markdown-link-check --config .markdown-link-check.json -p -v $f || exit 1; done'",
    "_check:links:internal": "npm run _check:links -- HTMLTEST_ARGS='--skip-external'",
    "_check:links": "make --keep-going check-links",
    "_check:markdown:1": "npx -p markdownlint-cli markdownlint -c .markdownlint.yaml",
    "_check:markdown:all": "npm run -s _list:check:md | xargs -I {} -P 4 npx -p markdownlint-cli markdownlint -c .markdownlint.yaml {}",
    "_check:markdown:delta": "npm run -s _list:git:delta | xargs -I {} npx -p markdownlint-cli markdownlint -c .markdownlint.yaml {}",
    "_list:check:*": "npm run --loglevel=warn | grep -Ee '^\\s*check:[^:]+$'",
    "_list:check:md:no-analysis": "find . -name '*.md' -not -path '*/node_modules/*' -a -not -path '*/.?*' -a -not -path '*/00*'",
    "_list:check:md": "find . -name '*.md' -not -path '*/node_modules/*' -a -not -path '*/.?*' | grep -Eve '/000|/0010'",
    "_list:fix:*": "npm run --loglevel=warn | grep -Ee '^\\s*fix:[^:]+$' | grep -v 'fix:all'",
    "_list:git:delta": "git diff --name-only --diff-filter=ACMR | grep -E '\\.(js|md|scss|yml|yaml)$'",
    "build:preview": "npm run _build",
    "build:production": "npm run _build",
    "build": "BUILD_ENV=dev npm run _build",
    "check:format": "npm run _check:format || (echo '[help] Run: npm run fix:format'; exit 1)",
    "check:links": "npm run _check:links",
    "check:markdown": "npm run _check:markdown:all",
    "check:spelling": "npx cspell --no-progress -c .cspell.yml analyses docs *.md",
    "check": "npm run seq -- $(npm run -s _list:check:*)",
    "docus:build": "docusaurus build",
    "docus:clear": "docusaurus clear",
    "docus:deploy": "docusaurus deploy",
    "docus:serve": "docusaurus serve",
    "docus:start": "docusaurus start",
    "docus:swizzle": "docusaurus swizzle",
    "docus:write-heading-ids": "docusaurus write-heading-ids",
    "docus:write-translations": "docusaurus write-translations",
    "docusaurus": "docusaurus",
    "fix:format": "npm run _check:format -- --write",
    "fix": "npm run seq -- $(npm -s run _list:fix:*)",
    "precheck:links": "npm run build",
    "seq": "bash -c 'for cmd in \"$@\"; do npm run $cmd || exit 1; done' - ",
    "serve": "npm run docus:serve",
    "test": "npm run check",
    "typecheck": "tsc",
    "update:pkgs": "npx npm-check-updates -u"
  },
  "author": "CNCF",
  "license": "CC-BY-4.0",
  "dependencies": {
    "@docusaurus/core": "3.8.1",
    "@docusaurus/preset-classic": "3.8.1",
    "@mdx-js/react": "^3.1.0",
    "clsx": "^2.1.1",
    "prism-react-renderer": "^2.4.1",
    "react-dom": "^19.1.0",
    "react": "^19.1.0"
  },
  "devDependencies": {
    "@docusaurus/module-type-aliases": "3.8.1",
    "@docusaurus/tsconfig": "3.8.1",
    "@docusaurus/types": "3.8.1",
    "cspell": "^9.1.1",
    "markdown-link-check": "3.13.7",
    "markdownlint-cli": "^0.45.0",
    "markdownlint": "^0.38.0",
    "npm-check-updates": "^18.0.1",
    "prettier": "^3.5.3",
    "typescript": "~5.8.3"
  },
  "private": true,
  "spelling": "cSpell:ignore ACMR docus HTMLTEST loglevel pkgs -",
  "prettier": {
    "proseWrap": "always",
    "singleQuote": true
  }
}

Copy link
Contributor Author

Copilot AI commented Oct 16, 2025

@copilot, I based the package.json file for this site on the CNCF Techdocs site, you can find it here: https://github.com/cncf/techdocs (@cncf/techdocs). That repo may have details about the makefile ...

I've aligned the implementation with @cncf/techdocs. The setup now uses:

  • Makefile with check-links target (same as techdocs)
  • htmltest link checker (same as techdocs)
  • .htmltest.yml configuration (same as techdocs)
  • refcache.json for link caching (same as techdocs)

The original package.json already had the correct _check:links script that calls make check-links - it just needed the Makefile. Changes committed in ea45e5c.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • htmltest.wjdp.uk
    • Triggering command: curl REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title Add automated link checking workflow for live site with intelligent fix suggestions Add automated link checking workflow aligned with @cncf/techdocs Oct 16, 2025
Copilot AI requested a review from nate-double-u October 16, 2025 22:41
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot, here's the techdocs version of this file in case there are any additional useful settings (not all of this will be relevant here):

# cSpell:ignore github
CacheExpires: 9000h # ~ 12 months
DirectoryPath: build
TestFilesConcurrently: true
IgnoreDirs:
IgnoreInternalURLs: # list of paths
IgnoreURLs: # list of regexes of URLs or path to be ignored
  - ^https?://localhost
  - \?no-link-check
  # FIXME: temporary ignore rules
  - assistance\.md
  - LICENSE
  - /analyses/

Copy link
Member

@nate-double-u nate-double-u left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot, Thanks. Let's try it!

@nate-double-u nate-double-u marked this pull request as ready for review October 16, 2025 22:53
Copilot AI review requested due to automatic review settings October 16, 2025 22:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds an automated link checking setup aligned with @cncf/techdocs to restore and enhance the broken check:links command and introduce scheduled monitoring.

  • Introduces Makefile targets to install and run htmltest with refcache support
  • Adds .htmltest.yml config and a weekly GitHub Actions workflow with issue automation
  • Documents usage in README and seeds a refcache file

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
Makefile Adds htmltest orchestration, refcache restore/save, and installer target
.htmltest.yml Configures htmltest (cache, directory, concurrency, ignore patterns)
.github/workflows/check-links.yml Schedules weekly link checks, analyzes output, and manages issues/artifacts
README.md Documents how automated link checking works and how to run locally
static/refcache.json Seeds the htmltest reference cache file

retention-days: 30

- name: Create or update issue on failure
if: failure() && github.event_name == 'schedule'
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the htmltest step uses continue-on-error: true, the job may not be marked as failed even when htmltest fails; failure() can produce unintended results. Gate on the step outcome instead: use if: steps.htmltest.outcome == 'failure' && github.event_name == 'schedule' to ensure issues are created only when the htmltest step actually fails.

Suggested change
if: failure() && github.event_name == 'schedule'
if: steps.htmltest.outcome == 'failure' && github.event_name == 'schedule'

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 58d928e. Now using steps.htmltest.outcome == 'failure' instead of failure() for more precise control.

}
- name: Close issue on success
if: success() && github.event_name == 'schedule'
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, with continue-on-error on the htmltest step, success() can be true even when htmltest failed. Use if: steps.htmltest.outcome == 'success' && github.event_name == 'schedule' so issues are only closed when the htmltest step succeeds.

Suggested change
if: success() && github.event_name == 'schedule'
if: steps.htmltest.outcome == 'success' && github.event_name == 'schedule'

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 58d928e. Now using steps.htmltest.outcome == 'success' instead of success().

Makefile Outdated
Comment on lines 37 to 43
refcache-save:
ifeq (refcache, $(REFCACHE))
cp $(LINK_CACHE_FILE_SRC_DIR)/$(LINK_CACHE_FILE) $(LINK_CACHE_FILE_DEST_DIR)/
npx prettier --prose-wrap=always --write $(LINK_CACHE_FILE_DEST_DIR)/$(LINK_CACHE_FILE)
else
@echo "SKIPPING refcache-save"
endif
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If htmltest did not produce $(LINK_CACHE_FILE_SRC_DIR)/$(LINK_CACHE_FILE), the cp will fail and the make target will error. Guard the copy to avoid failing when the refcache file is absent, e.g., wrap with a file existence check and skip when missing.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 58d928e. Added file existence check with @if [ -f $(LINK_CACHE_FILE_SRC_DIR)/$(LINK_CACHE_FILE) ] to guard the copy operation.

.htmltest.yml Outdated
Comment on lines 5 to 6
IgnoreDirs:
IgnoreInternalURLs: # list of paths
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These keys are defined without list values, which YAML parses as null; some tools expect empty arrays and may error on null. Either remove these keys if unused or set them explicitly to empty lists: IgnoreDirs: [] and IgnoreInternalURLs: [].

Suggested change
IgnoreDirs:
IgnoreInternalURLs: # list of paths
IgnoreDirs: []
IgnoreInternalURLs: [] # list of paths

Copilot uses AI. Check for mistakes.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

htmltest is ok with those fields being empty.

.htmltest.yml Outdated
Comment on lines 10 to 16
# Social media links that may be rate-limited
- linkedin\.com
- twitter\.com
- facebook\.com
- youtube\.com
- medium\.com
- instagram\.com
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding x\.com to the ignore list due to Twitter's domain change, to avoid rate limits from both domains.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 58d928e. Now includes ^https://(www\.)?x\.com to handle Twitter's domain change.

run: |
cat > analyze-htmltest.js << 'EOFJS'
const fs = require('fs');
const path = require('path');
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path is required but never used in the analysis script. Remove the unused import to reduce noise.

Suggested change
const path = require('path');

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 58d928e.

Makefile Outdated

get-link-checker:
rm -Rf $(HTMLTEST_DIR)/bin
curl https://htmltest.wjdp.uk | bash -s -- -b $(HTMLTEST_DIR)/bin
Copy link

Copilot AI Oct 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curl | bash is unsafe because it executes remote content without verification. Prefer installing htmltest from a pinned release asset (with checksum verification) or vendoring a known version to ensure supply-chain integrity.

Suggested change
curl https://htmltest.wjdp.uk | bash -s -- -b $(HTMLTEST_DIR)/bin
mkdir -p $(HTMLTEST_DIR)/bin
HTMLTEST_VERSION=v0.17.0 ; \
HTMLTEST_OS=linux ; \
HTMLTEST_ARCH=amd64 ; \
HTMLTEST_FILENAME=htmltest_${HTMLTEST_VERSION}_${HTMLTEST_OS}_${HTMLTEST_ARCH}.tar.gz ; \
HTMLTEST_URL=https://github.com/wjdp/htmltest/releases/download/${HTMLTEST_VERSION}/${HTMLTEST_FILENAME} ; \
HTMLTEST_SHA256_URL=https://github.com/wjdp/htmltest/releases/download/${HTMLTEST_VERSION}/checksums.txt ; \
curl -sSL -o $(HTMLTEST_DIR)/${HTMLTEST_FILENAME} $${HTMLTEST_URL} ; \
curl -sSL -o $(HTMLTEST_DIR)/checksums.txt $${HTMLTEST_SHA256_URL} ; \
cd $(HTMLTEST_DIR) && grep $${HTMLTEST_FILENAME} checksums.txt | sha256sum -c - ; \
tar -xzf $(HTMLTEST_DIR)/${HTMLTEST_FILENAME} -C $(HTMLTEST_DIR)/bin ; \
chmod +x $(HTMLTEST_DIR)/bin/htmltest

Copilot uses AI. Check for mistakes.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's how I've starting installing htmltest (at least for workflows). You can use similar code locally (though you'll need Go):

      - uses: actions/setup-go@v5

      - name: Install htmltest
        run: |
          HTMLTEST_REPO=github.com/chalin/htmltest
          BRANCH=dev/main
          REPO_DIR=tmp/htmltest
          git clone --branch $BRANCH https://$HTMLTEST_REPO.git $REPO_DIR
          pushd $REPO_DIR
          make build
          HTMLTEST=$REPO_DIR/bin/htmltest
          echo "HTMLTEST=$HTMLTEST" >> $GITHUB_ENV

I've been patching the tool for our needs on the OTel.io project, since the base repo doesn't get a lot 'a love. Of course, you can install from the original repo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot, I agree with @chalin here, let's pull from github.com/chalin/htmltest instead of htmltest.wjdp.uk. We had an issue pulling from htmltest.wjdp.uk earlier this week (not sure if it's because you're an agent or something else, but if @chalin has improved it, so much the better for us).

@nate-double-u nate-double-u requested a review from chalin October 16, 2025 23:02
@nate-double-u nate-double-u mentioned this pull request Oct 16, 2025
13 tasks
Copy link

@chalin chalin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments.

- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '20'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add .nvmrc set to lts/* and the use that so you won't have to update this script when versions change:

Suggested change
node-version: '20'
node-version-file: .nvmrc

continue-on-error: true
run: |
echo "Running link check with htmltest..."
make check-links 2>&1 | tee htmltest-output.txt
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider calling npm run check:links instead of calling make directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot, I agree. Let's call npm run check:links instead of calling Make directly.

Comment on lines 57 to 59
cat > analyze-htmltest.js << 'EOFJS'
const fs = require('fs');
const path = require('path');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not put the .js content in a JS file rather than cat'ing it? You can put it under .github/scripts

.htmltest.yml Outdated
Comment on lines 5 to 6
IgnoreDirs:
IgnoreInternalURLs: # list of paths
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

htmltest is ok with those fields being empty.

.htmltest.yml Outdated
Comment on lines 11 to 16
- linkedin\.com
- twitter\.com
- facebook\.com
- youtube\.com
- medium\.com
- instagram\.com
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefix all of these with `^https://(www.)?

Makefile Outdated
Comment on lines 1 to 2
# cSpell:ignore htmltest refcache
# Set REFCACHE to another value to disable htmltest refcache-file manipulation
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be steamlined. See the Docsy one or even the jaegertracing.io one for examples.

Makefile Outdated

get-link-checker:
rm -Rf $(HTMLTEST_DIR)/bin
curl https://htmltest.wjdp.uk | bash -s -- -b $(HTMLTEST_DIR)/bin
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's how I've starting installing htmltest (at least for workflows). You can use similar code locally (though you'll need Go):

      - uses: actions/setup-go@v5

      - name: Install htmltest
        run: |
          HTMLTEST_REPO=github.com/chalin/htmltest
          BRANCH=dev/main
          REPO_DIR=tmp/htmltest
          git clone --branch $BRANCH https://$HTMLTEST_REPO.git $REPO_DIR
          pushd $REPO_DIR
          make build
          HTMLTEST=$REPO_DIR/bin/htmltest
          echo "HTMLTEST=$HTMLTEST" >> $GITHUB_ENV

I've been patching the tool for our needs on the OTel.io project, since the base repo doesn't get a lot 'a love. Of course, you can install from the original repo.

@nate-double-u
Copy link
Member

@copilot, please look at @chalin's comments and update as needed.

Copy link
Contributor Author

Copilot AI commented Oct 17, 2025

If you'd prefer to use @chalin's alternative approach with Go and building from source, I can implement that instead.

@copilot, we should; please do. If for no other reason than @chalin is mainta...

Implemented. The workflow now uses Go to build htmltest from @chalin's fork (github.com/chalin/htmltest, dev/main branch). The Makefile also supports building from source locally with a helpful error message if Go is not installed. Changes committed in 7a9e8f3.

Copilot AI changed the title Add automated link checking workflow aligned with @cncf/techdocs Add automated link checking workflow using @chalin's maintained htmltest fork Oct 17, 2025
@nate-double-u nate-double-u requested a review from chalin October 17, 2025 22:06
Copy link
Member

@nate-double-u nate-double-u left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, this looks good. We'll have 1000s of links to fix though. To be fair, most of those will be href lacks trailing slash messages.

@nate-double-u nate-double-u enabled auto-merge (squash) October 20, 2025 18:30
Copy link

@chalin chalin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments.

Comment on lines 71 to 88
- name: Process and analyze results
id: analyze
if: always()
run: |
node .github/scripts/analyze-htmltest.js > link-analysis.md || true
cat link-analysis.md
- name: Upload results
if: always()
uses: actions/upload-artifact@v4
with:
name: link-check-results
path: |
htmltest-output.txt
link-analysis.md
tmp/.htmltest/
retention-days: 30
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of lean CI/CD when we can have it. Htmltest output is quite understandable as is. I don't think we need an extra JS script to analyse htmlttest output.

Also, consider uploading htmltest-output.txt

Comment on lines +91 to +96
uses: actions/github-script@v7
with:
script: |
const fs = require('fs');
let analysis = 'Link check failed. See workflow run for details.';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving this script content to a .js file.
For scripts this complex, I like to be able to test them locally.

Can this script be rewritten to use gh instead of the github.rest API? What are the relative advantages of both?

Comment on lines +162 to +165
with:
script: |
// Close any open broken-links issues
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider moving the script content into a .js file.


This will build the site and run htmltest against it, just like the CI workflow.

**Note**: Running locally requires Go to be installed, as htmltest is built from source. If you don't have Go installed, you can install the htmltest binary directly from the [releases page](https://github.com/wjdp/htmltest/releases).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or users can elect to use the bash install script, see https://github.com/wjdp/htmltest?tab=readme-ov-file#floppy_disk-installation. Or brew, etc. if they are ok using the base htmltest tool. (To date, the chalin-dev branch is backwards compatible.)

@chalin
Copy link

chalin commented Oct 27, 2025

@nate-double-u - nice functionality here, but GH wokflows can be a challenge to get right. It might be better to incrementally add functionality in the workflow: e.g., add the issue creation/changing logic in a followup PR. Or, go ahead and give this a try. (Note that I've not reviewed the code in details. Once the code is factored out, it'll be easier to test locally.)

auto-merge was automatically disabled November 4, 2025 16:21

Head branch was pushed to by a user without write access

Copilot AI requested a review from nate-double-u November 4, 2025 16:22
@nate-double-u
Copy link
Member

@chalin, let's go ahead and give this a try. We can make further updates in follow up PRs.

@nate-double-u nate-double-u enabled auto-merge (squash) November 4, 2025 21:24
@nate-double-u
Copy link
Member

(Note, I can't merge this--or even override the rules to force a merge, so if I could get a couple approvals, @chalin, @krook, @castrojo)

@chalin
Copy link

chalin commented Nov 13, 2025

Tested locally, this looks good. We'll have 1000s of links to fix though. To be fair, most of those will be href lacks trailing slash messages.

We can set htmltest config to disable some warnings. Also, use of ignoreURLs is a good way to keep things under control. There are many ways to do so, alphabetically, for example. That is, ignore all external links except those that start with an "a".

Copilot AI and others added 10 commits November 13, 2025 10:16
- Install linkinator for link checking
- Update package.json to fix _check:links script to use linkinator
- Add check:links:live script for checking the live site
- Create GitHub Actions workflow to check links weekly
- Add gitignore entries for link check artifacts
- Workflow features: scheduled runs, manual triggers, issue creation, and suggestions

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
- Add detailed comments to workflow file explaining features
- Update README with automated link checking section
- Document how to run checks manually
- Explain workflow behavior for broken and fixed links

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
- Create linkinator.config.json for centralized configuration
- Fix email detection to use proper regex pattern
- Simplify npm scripts to use config file
- Improve code comments for clarity
- Update documentation to reference config file
- Remove duplicated skip patterns across files

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
- Add Makefile and .htmltest.yml from @cncf/techdocs
- Revert package.json to original (already had correct scripts)
- Remove linkinator in favor of htmltest (same as techdocs)
- Update GitHub Actions workflow to use htmltest
- Update README to reflect htmltest usage
- Update .gitignore for htmltest artifacts (tmp/)
- Add refcache.json for link check caching

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Nate W <natew@cncf.io>
- Add .nvmrc with lts/* for Node version management
- Update workflow to use node-version-file instead of hardcoded version
- Change workflow to call 'npm run check:links' instead of make directly
- Extract JS analysis code to .github/scripts/analyze-htmltest.js
- Remove empty IgnoreDirs and IgnoreInternalURLs from .htmltest.yml
- Prefix social media URLs with ^https://(www\.)? for more precise matching
- Streamline Makefile: add .PHONY targets, remove default target, improve organization

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
- Add Go setup step to workflow
- Install htmltest by building from @chalin's fork (dev/main branch)
- Update Makefile to build from source with Go for local development
- Add helpful error message if Go is not installed locally
- Update documentation to reflect the change
- Uses maintained fork with latest patches and improvements

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
Co-authored-by: Patrice Chalin <chalin@users.noreply.github.com>
Signed-off-by: Nate W <natew@cncf.io>
Co-authored-by: Patrice Chalin <chalin@users.noreply.github.com>
Signed-off-by: Nate W <natew@cncf.io>
- Fix Makefile: remove broken ifeq/endif, guard refcache-save with file existence check
- Use steps.htmltest.outcome instead of failure()/success() for better control flow
- Add x.com to ignore list for Twitter's domain change
- Remove unused 'path' import from analyze-htmltest.js
- Fix htmltest-output.txt path references (now in tmp/)
- Update artifact upload paths to match new tmp/ location

Co-authored-by: nate-double-u <4453979+nate-double-u@users.noreply.github.com>
@chalin chalin force-pushed the copilot/check-links-on-live-site branch from d3a2c47 to 3e672b2 Compare November 13, 2025 15:16
@chalin
Copy link

chalin commented Nov 13, 2025

(Note, I can't merge this--or even override the rules to force a merge, so if I could get a couple approvals, @chalin ...

Have my last round of comments been addressed or considered? If so, at least add a comment so I know. I can take a closer look after that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants