Davem/xs fixups #24062

iabyn · 2026-01-07T11:41:47Z

Another round of ExtUtils::ParseXS improvements.

This branch:

Improves error checking and reporting. Several error messages have been improved, more error messages added etc. The new errors are generally backwards-compatible: they report things which would otherwise have generated broken C code and so failed later in a more confusing way.
Test coverage of the warning and error messages is now comprehensive.
Refactors the length(foo) pseudo-parameter implementation, adds more error checks, and makes it so that foo no longer needs to be a type that maps to T_PV; merely that it maps to a type whose INPUT template contains SvPV_nolen($arg) or similar.
Reorganises the test files and infrastructure. The 7000 line 001-basic.t has now been split into several files, some obsolete files have been deleted, and a new naming system for test files has been introduced:

0xx-parse-foo.t just run the parser on XS snippets and see if the generated C code matches expectations

1xx-api-foo.t test ExtUtils::ParseXS API functions

3xx-run-foo.t parse XS files into C files, compile and run them

5xx-typemaps-foo.t tests for ExtUtils::Typemaps

This set of changes does not require a perldelta entry.

jkeenan

Scope of review so far largely limited to lib/ExtUtils/ParseXS/Node.pm and limited to (i) locating places in test suite where warnings and error messages are or are not being exercised; and (ii) spelling.

dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS/Node.pm

jkeenan · 2026-01-07T20:56:10Z

dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS/Node.pm

-    ($init_op, $var_init) = ($1, $2) if $line =~ s/\s* ([=;+]) \s* (.*) $//xs;
+    if ($line =~ s/\s* ([=;+]) \s* (.*) $//xs) {
+        ($init_op, $var_init) = ($1, $2);
+        $pxs->death("Error: missing '$init_op' initialiser value")


Non-standard spellling: initialiser

dist/ExtUtils-ParseXS/lib/ExtUtils/ParseXS/Node.pm

jkeenan · 2026-01-07T21:11:06Z

Note: GH web interface is saying "This branch cannot be rebased due to conflicts". However, I created a local branch from the p.r. and was able to rebase it on blead.

jkeenan

If I checkout this commit and attempt to configure/build/test it by itself, I get a failure during the configuration stage.

$ git checkout 195e097f4395c5332a1401201872881cacf7dec8
Note: switching to '195e097f4395c5332a1401201872881cacf7dec8'.
... [skip]
HEAD is now at 195e097f43 ParseXS: rename t/002-more.t to t/302-run-more.t
[perlmonger: perl2] $ configure_and_test
First let's make sure your kit is complete.  Checking...
ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory

THIS PACKAGE SEEMS TO BE INCOMPLETE.

You have the option of continuing the configuration process, despite the
distinct possibility that your kit is damaged, by typing 'y'es.  If you
do, don't blame me if something goes wrong.  I advise you to type 'n'o
and contact the author (https://github.com/Perl/perl5/issues).

Continue? [n] n
ABORTING...
[perlmonger: perl2] $ ls -l dist/ExtUtils-ParseXS/t/302-more.t
ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory
[perlmonger: perl2] $ ack '302-more' .
MANIFEST
4217:dist/ExtUtils-ParseXS/t/302-more.t				Extended ExtUtils::ParseXS compile and run

UU/xdg
16:dist/ExtUtils-ParseXS/t/302-more.t

UU/missing
1:ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory

Found during an analysis using Perl5::TestEachCommit. This could potentially cause a problem for Porting/bisect.pl. Can you adjust? Thanks.

Add '#': change Error: 'else' with no matching 'if' in foo.xs, line 14 to be: Error: '#else' with no matching '#if' in foo.xs, line 14 etc.

Make this generic error message be more specific: Error: Unterminated '#if/#ifdef/#ifndef' in foo.xs, line 15 State what particular variety of #if wasn't terminated, and what line it started on: Error: Unterminated '#ifndef' from line 12 in foo.xs, line 15

There is already a WarnHint() method which prints a warning message (adding file/lineno) and then appends a hint message in parentheses. This commit adds a corresponding deathHint() method. This method will be used in the next commit. This commit also changes it so that the hint message is wrapped in a single set of parentheses, rather than each line being individually wrapped. So for example, before: Warning: no foo at foo.xs, line 7 (did you forget to add) (a foo at the end of the line) (or maybe at the start of the next line?) after: Warning: no foo at foo.xs, line 7 (did you forget to add a foo at the end of the line or maybe at the start of the next line?)

Sometimes the XS parser expects the next line *not* to be indented (usually while parsing file-scoped items). If it finds an indented line, it used to die with this error message: Code is not inside a function (maybe last function was ended by a blank line followed by a statement on column one?) in foo.xs, line 17 That error message was supposed to be a helpful hint for a common XS programmer error; however, since it also obscures the *actual* error, it can be just as confusing. This commit changes it so that the error message makes clear that the fault is that the current line is indented when an indented line wasn't expected, but also gives two different hints depending on whether the indented thing looks like a file-scoped keyword or not. So for example this XS code: #define FOO 1 PROTOTYPES: DISABLE now gives this error message Error: file-scoped keywords should not be indented in foo.xs, line 13 This is because before dying, the parser now checks whether the bad line looks a bit like an indented file-scoped keyword. If it doesn't look like a keyword, such as: #define FOO 1 blah then it now gives this error message: Error: file-scoped directives must not be indented in foo.xs, line 13 (If this line is supposed to be part of an XSUB rather than being file-scoped, then it is possible that your XSUB has a blank line followed by a line starting at column 1 which is being misinterpreted as the end of the current XSUB.) which is an alternative wording of the original hint.

Change this warning message from/to: Didn't find a 'MODULE ... PACKAGE ... PREFIX' line Warning: no MODULE line found in XS file foo.xs - As it's only a warning, include the 'Warning:' text. - Include the name of the file. - It is the lack of a MODULE keyword which is important; whether it needs a PACKAGE and PREFIX as well isn't important at this point.

When the parameters in an XSUB's signature are unparseable using the fancy regex which handles (x = ",", y) etc, it's supposed to print a warning. However, the code which emits the warning was calling an unknown Warn() method and dying instead. Fix and add a test. (Ideally the signature in the new test aught actually to be parseable, but that's an issue for another day.)

Since the recent ParseXS refactoring, an XSUB where the first CASE keyword wasn't at the start of the XSUB was, in addition to the expected error, generating spurious additional errors. For example: int abc(int x, int y) INIT: myinit CASE: x > 0 CODE: code1; CASE: CODE: code2; just before this commit was outputting all of: Error: 'CASE:' after unconditional 'CASE:' in foo.xs, line 16 Error: no 'CASE:' at top of function in foo.xs, line 16 Error: no 'CASE:' at top of function in foo.xs, line 19 and after this commit (and also before refactoring) just: Error: no 'CASE:' at top of function in foo.xs, line 16

When the XS parser finds a line starting on column 1 which isn't a recognised keyword or CPP directive or similar, it assumes that it must be the start of a new XSUB. If that line is immediately followed by a blank line, a fairly cryptic error message is emitted. For example, these two lines: int BOO: cause these two errors to be emitted: Error: function definition too short 'int' in foo.xs, line 13 Error: function definition too short 'BOO:' in foo.xs, line 15 After this commit, the first error message is changed to: Error: unrecognised line: 'int' in foo.xs, line 13 (possible start of a truncated XSUB definition?) and the second to: Error: unrecognised keyword 'BOO' in foo.xs, line 15. This change acknowledges that an unrecognised line followed by a blank line may not always be a bad XSUB start. In particular, if it looks like it might be keyword, albeit an unrecognised one, say so. Otherwise, emphasise that the line can't be parsed rather than just assuming its a bad XSUB declaration. In addition, the reporting method has been changed from blurt() to death(), so that no more parsing takes place. blurt() is best for semantic errors, where the basic syntax structure is still ok and parsing can continue. An unrecognised line means something may have gone badly wrong, so stop.

Only one of ALIAS and INTERFACE should be used per XSUB. Otherwise the CV's payload is sometimes an integer and sometimes a pointer. So any generated C code is nonsense. So ban mixing them.

Previously, something like this silently passed: INTERFACE: f1 f1 Make it an error. Also, remove the unused %map variable from Node::INTERFACE::parse().

There are two types of disallowing of default values with a length(foo) parameter: foo(char *s, int length(s) = 0) foo(int length(s), char *s = "") There are are tests only for the first form. This commit adds a test for the second, rewords it to make it clearer, and does a proper blurt() rather than a bare die, so proper file and line numbers are shown. Before: default value not supported with length(NAME) supplied at /.../perl-5.43.3.out/lib/5.43.3/ExtUtils/ParseXS/Node.pm line 1416, <__ANONIO__> line 15. After: Error: default value for s not allowed when length(s) also present in foo.xs, line 14

For some reason I managed not to add any during earlier work expanding the test suite.

In something like void foo(int i = ) the XS parser previously silently accepted it and generated broken C code. Now it dies.

In the standard framework for this test file, allow an optional extra field in the [ ...] array for each test which, if defined, marks the test as TODO and becomes the TODO message. This will be used in the next commit.

Move the various tests for INPUT: section syntax into a common place. Also modernise many of those tests: many were added before the test_many() framework was added to simplify XS syntax tests.

INPUT lines can have an optional initialiser, e.g. foo(i) int i = SvIV(ST($arg)); However if the initialiser was missing, such as: foo(i) int i = ; then the XS parser silently succeeded, generating invalid C code. Add a check and tests for this. One existing test had to be modified slightly as it was now triggering this new check; it now passes the new check, but continues to fail on the later 'invalid parameter declaration' check, which is what it was supposed to be testing.

Check that any file-scoped keywords appearing in XSUB scope raise an error.

Keywords like CODE and INIT silently ignore anything on the keyword line. For example this: CODE: aaa bbb ccc will add the two lines bbb and ccc to the output C file, and quietly ignore the aaa text. This commit instead makes it warn (but still discard). The new tests both test for the new warning and confirm that the junk is being ignored.

Clean up the regex which processes OVERLOAD keyword lines. Use //x; and since most punctuation characters within a [] character class lose their special regex meaning, remove lots of unnecessary '\'s Should be no change in functionality.

Previously this was silent; now it warns: OVERLOAD: cmp cmp

use split, rather that while (s///), to process the arguments to an OVERLOAD keyword line. This makes the code a little more comprehensible, without changing functionality.

Code like this is an error: int foo() ALIAS: 1 but this was being silently allowed: ALIAS: 0 due to code along the lines of 'die ... if $line', which was supposed to be checking for residual unrecognised content.

Previously a bad OUTPUT line like this: OUTPUT: SETMAGIC: 1 emitted a generic and confusing warning message: Error: OUTPUT SETMAGIC: not a parameter because a SETMAGIC line which wasn't syntactically correct was parsed as a general OUTPUT: line. This commit instead treats it as a SETMAGIC line, but with an invalid argument, and adds a suitable new error message: Error: SETMAGIC: invalid value '1' (should be ENABLE/DISABLE)

Add more tests for where the specified type is a function point type, such as 'int (*)(char *, long)'. This needs special processing by the XS parser to stick the variable name into the '(*)' There was an existing test for where an arg type was specified in an INPUT line. Modernise this test to use the new test_many() XS-testing framework, and add tests for where the type is specified in the XSUB's signature, and where the return type is a function pointer.

Add some basic tests for this keyword, including a test for the 'invalid value' error. Also, fix that error handling: it was calling $self->death() rather than $pxs->death() and so was croaking with 'no such method'. Also, improve the text of the error message.

Add some tests for this keyword (especially testing all the possible errors that can be raised). Also, improve the text of one of the error messages to indicate what a valid REQUIRE value can be.

See the previous commit but one for the rationale.

Add some comments at the top of these three new test files to explain their purpose.

This commit moves a test out of 001-basic.t into its own test file which checks that ExtUtils::ParseXS can be loaded and that its version matches that in lib/perxs.pod. As of this commit, 001-basic.t now contains only parse tests of XS snippets using the test_many() framework. Previous commits have removed anything else out of it. The next series of commits will split 001-basic.t into several smaller files, as its currently about 7000 lines long.

Move the test_many() sub and associated stuff out of t/001-basic.t and into its own module so that it can be shared across multiple test files. This is a prelude to splitting t/001-basic.t.

Move out of 001-basic.t and into their own test file, the tests which are concerned with things which can appear at file-scope in an XS file (i.e. outside an XSUB), apart from file-scoped keywords. This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with keywords which can appear at file-scope in an XS file (i.e. outside an XSUB). This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with parsing the declaration of an XSUB (i.e. the first two lines). This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with parsing the individual parameters of an XSUB. This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with parsing the return type of an XSUB. This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with parsing the INPUT and OUTPUT keywords of an XSUB. This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with parsing the keywords (except INPUT and OUTPUT) of an XSUB. This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

Move out of 001-basic.t and into their own test file, the tests which are concerned with parsing the C++ support features for XSUBs. This is just several simple cut and pastes of blocks of tests; no changes have been made to the tests themselves. This commit is part of splitting 001-basic.t into several smaller files, with a more coherent grouping.

In the previous commits, most of the body of t/001-basic.t has been moved out into separate 0xx-parse-foo.t files. Rename this file now so that it is also a 0xx-parse- file.

Add the standard boilerplate to the top of the file.

This test data file has not been used since v5.15.0-477-g9b58169ac2, but has been hanging around ever since.

This test file has apparently never actually tested anything, and there's already another test file, t/104-map_type.t which actually tests the map_type() method.

This test file doesn't actually test anything. The set_cond() method is fairly trivial; it just returns a short snippet of C code like "items < 1" for use in an XSUB's number-of-args checking, and there's plenty of tests in the 0xx-parse-foo.t test files for that already.

Each of the 1xx-*.t test files run tests on a particular API function. So rename all these files to include a -api- prefix (in the same way that the parsing files are all 0xx-parse-foo.t and the code running are all 3xx-run-foo.t)

Add a boilerplate comment line explaining the purpose of each file.

The 5xx-t-foo.t files all test ExtUtils::Typemaps (as opposed to all the other test files in the t/ directory which test ExtUtils::ParseXS). Rename the filename prefix from -t- to -typemaps- to better signal this fact. 600-t-compat.t is a bit of an odd one - it's also testing ExtUtils::Typemaps and I'm not sure why its 600 rather than 5xx. It's the only 6xx in the directory. I've renamed it to 518-typemaps-compat.t for consistency.

Add a boilerplate comment line or two explaining the purpose of each file.

This directory just contains typemap files to be used by various test files. Make the name give a clearer indication of its purpose.

Fix up a test to work under perl 5.8.9. This regex: qr/\Q...\".../ was meant to match the two literal characters backslash and double-quote, but that only worked correctly from 5.10.0 onwards.

This private helper module had 'strict' enabled, but not warnings. So enable them, and fix up a problem it reveals: the destructor for the tied handle was trying to write the buffer contents even if the buffer was empty. This is harmless, but was now triggering a warning for some tests which tested for an error condition that didn't produce any output, e.g.: ../dist/ExtUtils-ParseXS/t/002-parse-file-scope.t ................... 1/? print() on unopened filehandle FH at .../CountLines.pm line 44 during global destruction. This is also a fix for running under 5.8.9, whose test harness rather anti-socially invokes perl with -w. Which is how I got to know about the warnings in the first place.

iabyn · 2026-01-09T14:49:54Z

If I checkout this commit and attempt to configure/build/test it by itself, I get a failure during the configuration stage.

Now fixed and rebased.

Leont · 2026-01-09T17:24:10Z

Could you please keep the Changes file up to date? I'm trying to sync to CPAN now and figuring out what went into which release is the hardest part of it. It doesn't have to be much, probably the PR description can be reused for it really.

iabyn · 2026-01-10T12:23:38Z

Could you please keep the Changes file up to date? I'm trying to sync to CPAN now and figuring out what went into which release is the hardest part of it. It doesn't have to be much, probably the PR description can be reused for it really.

Now done.

jkeenan requested changes Jan 7, 2026

View reviewed changes

jkeenan requested changes Jan 8, 2026

View reviewed changes

iabyn added 27 commits January 9, 2026 14:20

ParseXS: improve else with no matching if err msg

a2d1b8a

Add '#': change Error: 'else' with no matching 'if' in foo.xs, line 14 to be: Error: '#else' with no matching '#if' in foo.xs, line 14 etc.

ParseXS: add error check for ALIAS and INTERFACE

291fb27

Only one of ALIAS and INTERFACE should be used per XSUB. Otherwise the CV's payload is sometimes an integer and sometimes a pointer. So any generated C code is nonsense. So ban mixing them.

ParseXS: detect duplicate INTERFACE names

31f555d

Previously, something like this silently passed: INTERFACE: f1 f1 Make it an error. Also, remove the unused %map variable from Node::INTERFACE::parse().

ParseXS: add basic tests for param default values

fa021fe

For some reason I managed not to add any during earlier work expanding the test suite.

ParseXS: detect missing param default expression

ebf04ae

In something like void foo(int i = ) the XS parser previously silently accepted it and generated broken C code. Now it dies.

ParseXS:t/001-basic.t : add support for TODO tests

077d73b

In the standard framework for this test file, allow an optional extra field in the [ ...] array for each test which, if defined, marks the test as TODO and becomes the TODO message. This will be used in the next commit.

ParseXS: 001-basic.t: consolidate INPUT tests

60c7a1f

Move the various tests for INPUT: section syntax into a common place. Also modernise many of those tests: many were added before the test_many() framework was added to simplify XS syntax tests.

ParseXS: add tests for 'misplaced FOO:' error

eb8d1ba

Check that any file-scoped keywords appearing in XSUB scope raise an error.

ParseXS: OVERLOAD: simplify regex

411426e

Clean up the regex which processes OVERLOAD keyword lines. Use //x; and since most punctuation characters within a [] character class lose their special regex meaning, remove lots of unnecessary '\'s Should be no change in functionality.

ParseXS: warn on duplicate OVERLOAD entries

eb655b7

Previously this was silent; now it warns: OVERLOAD: cmp cmp

ParseXS: refactor: split OVERLOAD args

5583cd8

use split, rather that while (s///), to process the arguments to an OVERLOAD keyword line. This makes the code a little more comprehensible, without changing functionality.

ParseXS: ban ALIAS: 0

a0814bc

Code like this is an error: int foo() ALIAS: 1 but this was being silently allowed: ALIAS: 0 due to code along the lines of 'die ... if $line', which was supposed to be checking for residual unrecognised content.

ParseXS: 001-basic.t: fix comment typos

de3cd90

ParseXS: add tests for REQUIRE and improve err msg

28bc8f1

Add some tests for this keyword (especially testing all the possible errors that can be raised). Also, improve the text of one of the error messages to indicate what a valid REQUIRE value can be.

iabyn added 24 commits January 9, 2026 14:20

ParseXS: rename t/003-usage.t to t/303-run-usage.t

240bd65

See the previous commit but one for the rationale.

ParseXS: add top comments to new 3xx-run-foo files

3698d16

Add some comments at the top of these three new test files to explain their purpose.

ParseXS: add t/lib/TestMany.pm

9ae13a9

Move the test_many() sub and associated stuff out of t/001-basic.t and into its own module so that it can be shared across multiple test files. This is a prelude to splitting t/001-basic.t.

ParseXS: rename t/001-basic.t t/001-parse-basic.t

7da0855

In the previous commits, most of the body of t/001-basic.t has been moved out into separate 0xx-parse-foo.t files. Rename this file now so that it is also a 0xx-parse- file.

ParseXS: t/001-parse-basic.t: add std comments

0c07d1c

Add the standard boilerplate to the top of the file.

ParseXS: delete t/pseudotypemap1

322aebb

This test data file has not been used since v5.15.0-477-g9b58169ac2, but has been hanging around ever since.

ParseXS: delete t/108-map_type.t

08935f4

This test file has apparently never actually tested anything, and there's already another test file, t/104-map_type.t which actually tests the map_type() method.

ParseXS: rename t/1xx-*.t to t/1xx-api-*.t

4f736bc

Each of the 1xx-*.t test files run tests on a particular API function. So rename all these files to include a -api- prefix (in the same way that the parsing files are all 0xx-parse-foo.t and the code running are all 3xx-run-foo.t)

ParseXS: add comments at the top of all t/1xx*.t

2d2304d

Add a boilerplate comment line explaining the purpose of each file.

ParseXS: add comment at the top of all t/5xx*.t

5ad4fdb

Add a boilerplate comment line or two explaining the purpose of each file.

ParseXS: rename t/data/ to t/test_typemaps

f430b42

This directory just contains typemap files to be used by various test files. Make the name give a clearer indication of its purpose.

ParseXS: 004-parse-xsub-declaration.t: fix for 589

f8084a0

Fix up a test to work under perl 5.8.9. This regex: qr/\Q...\".../ was meant to match the two literal characters backslash and double-quote, but that only worked correctly from 5.10.0 onwards.

iabyn force-pushed the davem/xs_fixups branch from 9379c12 to db99710 Compare January 9, 2026 14:47

iabyn mentioned this pull request Jan 9, 2026

Explicitly reject length(NAME) with typemaps other than T_PV #23479

Merged

ParseXS: update Changes

7af02c8

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Davem/xs fixups #24062

Davem/xs fixups #24062

iabyn commented Jan 7, 2026

Uh oh!

jkeenan left a comment

Uh oh!

Uh oh!

Uh oh!

jkeenan Jan 7, 2026

Uh oh!

Uh oh!

jkeenan commented Jan 7, 2026

Uh oh!

jkeenan left a comment

Uh oh!

iabyn commented Jan 9, 2026

Uh oh!

Leont commented Jan 9, 2026

Uh oh!

iabyn commented Jan 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Davem/xs fixups #24062

Are you sure you want to change the base?

Davem/xs fixups #24062

Conversation

iabyn commented Jan 7, 2026

Uh oh!

jkeenan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jkeenan Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jkeenan commented Jan 7, 2026

Uh oh!

jkeenan left a comment

Choose a reason for hiding this comment

Uh oh!

iabyn commented Jan 9, 2026

Uh oh!

Leont commented Jan 9, 2026

Uh oh!

iabyn commented Jan 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants