
Conversation

@joelostblom (Contributor) commented Nov 27, 2025

Description

I think we need to go through many types of error messages manually and see whether the way each error is presented via explain is helpful. This debug option helps with that by always printing the properties dict, its issues, and the output from explain. It also lets us easily do this for all the messages in the test suite.
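
Roughly, the idea is the following (a minimal sketch; `find_issues` and the placeholder bodies are invented stand-ins for the real validation and explain functions, only `os.getenv` and rich's `print` are real APIs):

```python
# Hypothetical sketch of the debug hook; `find_issues` and `explain`
# are stand-ins for the real validation and explanation functions.
import os

from rich import print as rprint


def find_issues(properties: dict) -> list[str]:
    # Placeholder validation: flag a missing required field.
    return [f"'{field}' is a required property" for field in ("name",) if field not in properties]


def explain(issues: list[str]) -> str:
    # Placeholder for the package's explain() formatting.
    return "\n".join(f"Explanation: {issue}" for issue in issues)


def check_properties(properties: dict) -> list[str]:
    issues = find_issues(properties)
    if os.getenv("CDP_DEBUG"):
        # With `CDP_DEBUG=true`, print the input dict, its issues, and
        # the explain() output together for side-by-side review.
        rprint(properties)
        rprint(*issues)
        rprint(explain(issues))
    return issues
```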

Here is an example of an error message where all the ^^^^^ are not that helpful:

[Screenshot: an explain() error message where rows of ^^^^^ markers underline values without adding useful information]

This is now easy to detect by running the test suite with the debug option enabled, but we still need to decide what to do with these errors by opening a new issue for each one of them and discussing the best solution.

Needs an in-depth review.

Checklist

  • Formatted Markdown
  • Ran just run-all

@joelostblom joelostblom marked this pull request as ready for review November 27, 2025 10:40
@joelostblom joelostblom requested a review from a team as a code owner November 27, 2025 10:40
@joelostblom (Contributor, Author) commented:

@seedcase-project/developers I'll mark this as ready for review to get some early thoughts on whether this will be useful and whether you'd prefer a different approach.

@martonvago (Contributor) left a comment

I think this is very helpful to get a general idea about the types of messages!
I cannot say whether we want the debug option to actually be part of the released package; I'll leave that to Luke :P


About the wording of the messages (not sure if you're interested in that at this point, but I think Luke wanted to rewrite them):

One thing to note is that while there are many tests, they test cdp-specific behaviour, not the jsonschema output, so there are presumably more message types than what shows up in the tests. jsonschema formats its messages based on the JSON Schema validator keyword (their "validator", our "type"), so getting a list of the validators used in the Data Package JSON schema could also give us an idea of how many message types there are.
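
A quick way to get that list could be to walk the schema and collect the keywords it uses; a rough sketch (the file path and keyword set are assumptions):

```python
# Rough sketch: collect the JSON Schema keywords (validators) used in
# the Data Package schema. The file path is a placeholder.
import json

# Keywords that produce distinct jsonschema error messages; extend as needed.
KEYWORDS = {"type", "required", "minItems", "maxItems", "pattern", "enum", "format"}


def collect_keywords(schema: object, found: set[str] | None = None) -> set[str]:
    if found is None:
        found = set()
    if isinstance(schema, dict):
        found.update(key for key in schema if key in KEYWORDS)
        for value in schema.values():
            collect_keywords(value, found)
    elif isinstance(schema, list):
        for item in schema:
            collect_keywords(item, found)
    return found


with open("data-package.json") as f:  # placeholder path
    print(sorted(collect_keywords(json.load(f))))
```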

But yeah, the answer is likely a bit too many for this to be a pleasant task.

Now, jsonschema can be customised in various ways, including using a custom validator to check the input object. In a custom validator it should be possible to tweak the behaviour of each individual JSON schema rule (so minItems, required, type, etc.) and redefine the messages. I don't know if this would be easier to do than remapping the default messages after validation, but it's something to explore.
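
For reference, a minimal sketch of that idea using jsonschema's validator-extension API (the friendly wording is just an example, not a proposal for the final messages):

```python
# Minimal sketch of overriding one rule's message with jsonschema's
# validator-extension API; the wording is illustrative only.
from jsonschema import Draft202012Validator
from jsonschema.exceptions import ValidationError
from jsonschema.validators import extend


def required_with_friendly_message(validator, required, instance, schema):
    if not validator.is_type(instance, "object"):
        return
    for field in required:
        if field not in instance:
            yield ValidationError(f"Please add the required field '{field}'.")


CustomValidator = extend(
    Draft202012Validator, validators={"required": required_with_friendly_message}
)

validator = CustomValidator({"required": ["name"]})
for error in validator.iter_errors({}):
    print(error.message)  # -> Please add the required field 'name'.
```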

@lwjohnst86 lwjohnst86 moved this from Todo to In Progress in Iteration planning Nov 30, 2025
@lwjohnst86 lwjohnst86 moved this from In Progress to In Review in Iteration planning Nov 30, 2025
@lwjohnst86 (Member) left a comment

I don't love the idea of modifying behaviour via an env variable.

What if we put the explain message in the Issue itself, so that it is attached there? Then explain() would be a simple "get attribute" function with additional things like prettier printing/rich/including the preface and listing the messages. Would that solve this problem a bit?
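
A rough sketch of what that could look like (all field names here are invented, not the package's actual Issue structure):

```python
# Hypothetical sketch: store the explanation on the Issue when it is
# created, so explain() just reads it back. Field names are invented.
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    jsonpath: str
    message: str
    explanation: str  # attached at creation time


def explain(issues: list[Issue]) -> str:
    # Reduced to a "get attribute" step plus any pretty-printing.
    return "\n\n".join(issue.explanation for issue in issues)
```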

@lwjohnst86 lwjohnst86 moved this from In Review to In Progress in Iteration planning Dec 5, 2025
@joelostblom (Contributor, Author) commented:

> I don't love the idea of modifying behaviour via an env variable.

Yeah, unfortunately it's either that or modifying the script to accept a parameter; I think those are the only two ways to enable debugging output in Python. (pytest has a flag for it, so if we decide we only want this output from pytest it would be different, and we'd leave debugging the function itself to manual inspection via breakpoints.) I actually tried injecting all the required code in conftest.py, but it doesn't seem possible to get this output without modifying the actual check function.
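
For completeness, the pytest route could look roughly like the conftest.py sketch below; note it only bridges the CLI flag to the env variable, so the check function still needs its hook:

```python
# conftest.py -- hypothetical sketch of a pytest CLI flag for debug output.
# It only bridges the flag to the CDP_DEBUG env variable, so the check
# function itself still needs to read that variable.
import os


def pytest_addoption(parser):
    parser.addoption(
        "--cdp-debug",
        action="store_true",
        default=False,
        help="Print the properties dict, issues, and explain() output.",
    )


def pytest_configure(config):
    if config.getoption("--cdp-debug"):
        os.environ["CDP_DEBUG"] = "true"
```

Debug output would then come from something like `uv run pytest --cdp-debug -sv tests/test_check.py`.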

> What if we put the explain message in the Issue itself, so that it is attached there? Then explain() would be a simple "get attribute" function with additional things like prettier printing/rich/including the preface and listing the messages. Would that solve this problem a bit?

Hmmm, I don't think that would address the same issue. This PR is mainly to help us troubleshoot whether error messages are suitable, so I think it's useful to have the input dictionary, the outputted issues, and the explained error message all shown/printed together, making it quick to review whether the error messages are relevant/useful given the actual issue that occurred (without manually having to set breakpoints and print objects in the debugging console).

Alternative solutions could be to decide that this branch never goes into production and we just run off this branch when we want to test "explain". Or I could try harder to move all the code into conftest.py behind a debug flag, if we decide we only want to run this from pytest via their CLI flag. The code will be noticeably more convoluted if we go this route.

@lwjohnst86 (Member) commented:

Hmm, yeah, for now we can do your suggestion of pulling this branch to test things out, while leaving this open. I'll think about it a bit longer 🤔

@lwjohnst86 lwjohnst86 changed the title from "feat: Add optional debug output" to "feat: ✨ add env variable to help debug issues" Dec 18, 2025
@lwjohnst86 (Member) left a comment

I've thought about it, and I think this is fine, especially for dealing with some of those issues we've been encountering. Here are some comments.

issues = exclude(issues, config.exclusions)
issues = sorted(set(issues))

if os.getenv("CDP_DEBUG"):
@lwjohnst86 (Member):

Suggested change:
- if os.getenv("CDP_DEBUG"):
+ # Use by doing `CDP_DEBUG=true uv run ...`
+ if os.getenv("CDP_DEBUG"):

I don't think we need to externally document this.

Comment on lines +187 to +188
rprint(*issues)
rprint(explain(issues))
@lwjohnst86 (Member):

Suggested change:
- rprint(*issues)
- rprint(explain(issues))
+ for issue in issues:
+     rprint(issue)
+     rprint(explain([issue]))

So that the explanation is right below the issue rather than listing them all separately.

And yes, it's a for loop 😛 Contrary to how it might seem, I know for loops are useful, in particular in non-functional scenarios like printing to screen (which is inherently non-functional). This is why multi-paradigm languages are so useful: you can use the paradigm most suited to the task 😝

@@ -0,0 +1,14 @@
import os


@lwjohnst86 (Member):

Suggested change:
+ # Can get debug output by using `CDP_DEBUG=true uv run pytest -sv tests/test_check.py`

@github-project-automation github-project-automation bot moved this from In Review to In Progress in Iteration planning Dec 18, 2025
@lwjohnst86 (Member) commented:

@joelostblom you also need to resolve the failed check.

