Skip to content

Conversation

@iabyn
Copy link
Contributor

@iabyn iabyn commented Jan 7, 2026

Another round of ExtUtils::ParseXS improvements.

This branch:

  • Improves error checking and reporting. Several error messages have been improved, more error messages added etc. The new errors are generally backwards-compatible: they report things which would otherwise have generated broken C code and so failed later in a more confusing way.

  • Test coverage of the warning and error messages is now comprehensive.

  • Refactors the length(foo) pseudo-parameter implementation, adds more error checks, and makes it so that foo no longer needs to be a type that maps to T_PV; merely that it maps to a type whose INPUT template contains SvPV_nolen($arg) or similar.

  • Reorganises the test files and infrastructure. The 7000 line 001-basic.t has now been split into several files, some obsolete files have been deleted, and a new naming system for test files has been introduced:

    0xx-parse-foo.t just run the parser on XS snippets and see if the generated C code matches expectations

    1xx-api-foo.t test ExtUtils::ParseXS API functions

    3xx-run-foo.t parse XS files into C files, compile and run them

    5xx-typemaps-foo.t tests for ExtUtils::Typemaps

  • This set of changes does not require a perldelta entry.

Copy link
Contributor

@jkeenan jkeenan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scope of review so far largely limited to lib/ExtUtils/ParseXS/Node.pm and limited to (i) locating places in test suite where warnings and error messages are or are not being exercised; and (ii) spelling.

($init_op, $var_init) = ($1, $2) if $line =~ s/\s* ([=;+]) \s* (.*) $//xs;
if ($line =~ s/\s* ([=;+]) \s* (.*) $//xs) {
($init_op, $var_init) = ($1, $2);
$pxs->death("Error: missing '$init_op' initialiser value")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-standard spellling: initialiser

@jkeenan
Copy link
Contributor

jkeenan commented Jan 7, 2026

Note: GH web interface is saying "This branch cannot be rebased due to conflicts". However, I created a local branch from the p.r. and was able to rebase it on blead.

Copy link
Contributor

@jkeenan jkeenan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I checkout this commit and attempt to configure/build/test it by itself, I get a failure during the configuration stage.

$ git checkout 195e097f4395c5332a1401201872881cacf7dec8
Note: switching to '195e097f4395c5332a1401201872881cacf7dec8'.
... [skip]
HEAD is now at 195e097f43 ParseXS: rename t/002-more.t to t/302-run-more.t
[perlmonger: perl2] $ configure_and_test
First let's make sure your kit is complete.  Checking...
ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory

THIS PACKAGE SEEMS TO BE INCOMPLETE.

You have the option of continuing the configuration process, despite the
distinct possibility that your kit is damaged, by typing 'y'es.  If you
do, don't blame me if something goes wrong.  I advise you to type 'n'o
and contact the author (https://github.com/Perl/perl5/issues).

Continue? [n] n
ABORTING...
[perlmonger: perl2] $ ls -l dist/ExtUtils-ParseXS/t/302-more.t
ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory
[perlmonger: perl2] $ ack '302-more' .
MANIFEST
4217:dist/ExtUtils-ParseXS/t/302-more.t				Extended ExtUtils::ParseXS compile and run

UU/xdg
16:dist/ExtUtils-ParseXS/t/302-more.t

UU/missing
1:ls: dist/ExtUtils-ParseXS/t/302-more.t: No such file or directory

Found during an analysis using Perl5::TestEachCommit. This could potentially cause a problem for Porting/bisect.pl. Can you adjust? Thanks.

iabyn added 27 commits January 9, 2026 14:20
Add '#': change

    Error: 'else' with no matching 'if' in foo.xs, line 14

to be:

    Error: '#else' with no matching '#if' in foo.xs, line 14

etc.
Make this generic error message be more specific:

    Error: Unterminated '#if/#ifdef/#ifndef' in foo.xs, line 15

State what particular variety of #if wasn't terminated, and what line it
started on:

    Error: Unterminated '#ifndef' from line 12 in foo.xs, line 15
There is already a WarnHint() method which prints a warning message
(adding file/lineno) and then appends a hint message in parentheses.
This commit adds a corresponding deathHint() method.

This method will be used in the next commit.

This commit also changes it so that the hint message is wrapped in a
single set of parentheses, rather than each line being individually
wrapped.

So for example, before:

    Warning: no foo at foo.xs, line 7
        (did you forget to add)
        (a foo at the end of the line)
        (or maybe at the start of the next line?)

after:

    Warning: no foo at foo.xs, line 7
      (did you forget to add
       a foo at the end of the line
       or maybe at the start of the next line?)
Sometimes the XS parser expects the next line *not* to be indented
(usually while parsing file-scoped items).

If it finds an indented line, it used to die with this error message:

    Code is not inside a function (maybe last function was ended by a
    blank line  followed by a statement on column one?) in foo.xs, line 17

That error message was supposed to be a helpful hint for a common XS
programmer error; however, since it also obscures the *actual* error, it
can be just as confusing.

This commit changes it so that the error message makes clear that the
fault is that the current line is indented when an indented line wasn't
expected, but also gives two different hints depending on whether the
indented thing looks like a file-scoped keyword or not. So for example
this XS code:

    #define FOO 1
      PROTOTYPES: DISABLE

now gives this error message

    Error: file-scoped keywords should not be indented in foo.xs, line 13

This is because before dying, the parser now checks whether the bad line
looks a bit like an indented file-scoped keyword. If it doesn't look
like a keyword, such as:

    #define FOO 1
      blah

then it now gives this error message:

    Error: file-scoped directives must not be indented in foo.xs, line 13
      (If this line is supposed to be part of an XSUB rather than being
       file-scoped, then it is possible that your XSUB has a blank line
       followed by a line starting at column 1 which is being misinterpreted
       as the end of the current XSUB.)

which is an alternative wording of the original hint.
Change this warning message from/to:

    Didn't find a 'MODULE ... PACKAGE ... PREFIX' line

    Warning: no MODULE line found in XS file foo.xs

- As it's only a warning, include the 'Warning:' text.

- Include the name of the file.

- It is the lack of a MODULE keyword which is important; whether it
  needs a PACKAGE and PREFIX as well isn't important at this point.
When the parameters in an XSUB's signature are unparseable using the
fancy regex which handles (x = ",", y) etc, it's supposed to print a
warning.

However, the code which emits the warning was calling an unknown Warn()
method and dying instead.

Fix and add a test.

(Ideally the signature in the new test aught actually to be parseable,
but that's an issue for another day.)
Since the recent ParseXS refactoring, an XSUB where the first CASE
keyword wasn't at the start of the XSUB was, in addition to the
expected error, generating spurious additional errors. For example:

    int abc(int x, int y)
      INIT:
        myinit
      CASE: x > 0
        CODE:
          code1;
      CASE:
        CODE:
          code2;

just before this commit was outputting all of:

    Error: 'CASE:' after unconditional 'CASE:' in foo.xs, line 16
    Error: no 'CASE:' at top of function in foo.xs, line 16
    Error: no 'CASE:' at top of function in foo.xs, line 19

and after this commit (and also before refactoring) just:

    Error: no 'CASE:' at top of function in foo.xs, line 16
When the XS parser finds a line starting on column 1 which isn't a
recognised keyword or CPP directive or similar, it assumes that it must
be the start of a new XSUB. If that line is immediately followed by a
blank line, a fairly cryptic error message is emitted. For example,
these two lines:

    int

    BOO:

cause these two errors to be emitted:

    Error: function definition too short 'int' in foo.xs, line 13
    Error: function definition too short 'BOO:' in foo.xs, line 15

After this commit, the first error message is changed to:

    Error: unrecognised line: 'int' in foo.xs, line 13
      (possible start of a truncated XSUB definition?)

and the second to:

    Error: unrecognised keyword 'BOO' in foo.xs, line 15.

This change acknowledges that an unrecognised line followed by a blank
line may not always be a bad XSUB start. In particular, if it looks like
it might be keyword, albeit an unrecognised one, say so. Otherwise,
emphasise that the line can't be parsed rather than just assuming its a
bad XSUB declaration.

In addition, the reporting method has been changed from blurt() to
death(), so that no more parsing takes place. blurt() is best for
semantic errors, where the basic syntax structure is still ok and
parsing can continue. An unrecognised line means something may have gone
badly wrong, so stop.
Only one of ALIAS and INTERFACE should be used per XSUB. Otherwise the
CV's payload is sometimes an integer and sometimes a pointer. So any
generated C code is nonsense. So ban mixing them.
Previously, something like this silently passed:

    INTERFACE: f1 f1

Make it an error.

Also, remove the unused %map variable from Node::INTERFACE::parse().
There are two types of disallowing of default values with a length(foo)
parameter:

    foo(char *s, int length(s) = 0)
    foo(int length(s), char *s = "")

There are are tests only for the first form. This commit adds a test
for the second, rewords it to make it clearer, and does a proper blurt()
rather than a bare die, so proper file and line numbers are shown.

Before:

    default value not supported with length(NAME) supplied at /.../perl-5.43.3.out/lib/5.43.3/ExtUtils/ParseXS/Node.pm line 1416, <__ANONIO__> line 15.

After:

    Error: default value for s not allowed when length(s) also present in foo.xs, line 14
For some reason I managed not to add any during earlier work expanding
the test suite.
In something like

    void
    foo(int i = )

the XS parser previously silently accepted it and generated broken C
code. Now it dies.
In the standard framework for this test file, allow an optional
extra field in the  [ ...] array for each test which, if defined,
marks the test as TODO and becomes the TODO message.

This will be used in the next commit.
Move the various tests for INPUT: section syntax into a common place.
Also modernise many of those tests: many were added before the
test_many() framework was added to simplify XS syntax tests.
INPUT lines can have an optional initialiser, e.g.

    foo(i)
        int i = SvIV(ST($arg));

However if the initialiser was missing, such as:

    foo(i)
        int i = ;

then the XS parser silently succeeded, generating invalid C code.

Add a check and tests for this.

One existing test had to be modified slightly as it was now triggering
this new check; it now passes the new check, but continues to fail on
the later 'invalid parameter declaration' check, which is what it was
supposed to be testing.
Check that any file-scoped keywords appearing in XSUB scope raise an
error.
Keywords like CODE and INIT silently ignore anything on the keyword
line. For example this:

    CODE: aaa
        bbb
        ccc

will add the two lines bbb and ccc to the output C file, and quietly
ignore the aaa text. This commit instead makes it warn (but still
discard).

The new tests both test for the new warning and confirm that the junk is
being ignored.
Clean up the regex which processes OVERLOAD keyword lines. Use //x;
and since most punctuation characters within a [] character class lose
their special regex meaning, remove lots of unnecessary '\'s

Should be no change in functionality.
Previously this was silent; now it warns:

    OVERLOAD: cmp cmp
use split, rather that while (s///), to process the arguments to an
OVERLOAD keyword line. This makes the code a little more comprehensible,
without changing functionality.
Code like this is an error:

    int
    foo()
      ALIAS: 1

but this was being silently allowed:

      ALIAS: 0

due to code along the lines of 'die ... if $line', which was supposed to
be checking for residual unrecognised content.
Previously a bad OUTPUT line like this:

    OUTPUT:
        SETMAGIC: 1

emitted a generic and confusing warning message:

    Error: OUTPUT SETMAGIC: not a parameter

because a SETMAGIC line which wasn't syntactically correct was parsed
as a general OUTPUT: line.

This commit instead treats it as a SETMAGIC line, but with an invalid
argument, and adds a suitable new error message:

    Error: SETMAGIC: invalid value '1' (should be ENABLE/DISABLE)
Add more tests for where the specified type is a function point type,
such as 'int (*)(char *, long)'. This needs special processing by the XS
parser to stick the variable name into the '(*)'

There was an existing test for where an arg type was specified in an
INPUT line. Modernise this test to use the new test_many() XS-testing
framework, and add tests for where the type is specified in the XSUB's
signature, and where the return type is a function pointer.
Add some basic tests for this keyword, including a test for the
'invalid value' error.

Also, fix that error handling: it was calling $self->death() rather
than $pxs->death() and so was croaking with 'no such method'. Also, improve
the text of the error message.
Add some tests for this keyword (especially testing all the possible
errors that can be raised).

Also, improve the text of one of the error messages to indicate what a
valid REQUIRE value can be.
iabyn added 24 commits January 9, 2026 14:20
See the previous commit but one for the rationale.
Add some comments at the top of these three new test files to explain
their purpose.
This commit moves a test out of 001-basic.t into its own test file which
checks that ExtUtils::ParseXS can be loaded and that its version matches
that in lib/perxs.pod.

As of this commit, 001-basic.t now contains only parse tests of XS
snippets using the test_many() framework. Previous commits have removed
anything else out of it.

The next series of commits will split  001-basic.t into several
smaller files, as its currently about 7000 lines long.
Move the test_many() sub and associated stuff out of t/001-basic.t
and into its own module so that it can be shared across multiple test
files. This is a prelude to splitting t/001-basic.t.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with things which can appear at file-scope in an XS file
(i.e. outside an XSUB), apart from file-scoped keywords.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with keywords which can appear at file-scope in an XS file
(i.e. outside an XSUB).

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the declaration of an XSUB (i.e. the first
two lines).

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the individual parameters of an XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the return type of an XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the INPUT and OUTPUT keywords of an XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the  keywords (except INPUT and OUTPUT) of an
XSUB.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
Move out of 001-basic.t and into their own test file, the tests which
are concerned with parsing the C++ support features for XSUBs.

This is just several simple cut and pastes of blocks of tests; no
changes have been made to the tests themselves.

This commit is part of splitting 001-basic.t into several smaller files,
with a more coherent grouping.
In the previous commits, most of the body of t/001-basic.t has been
moved out into separate 0xx-parse-foo.t files. Rename this file now
so that it is also a 0xx-parse- file.
Add the standard boilerplate to the top of the file.
This test data file has not been used since v5.15.0-477-g9b58169ac2,
but has been hanging around ever since.
This test file has apparently never actually tested anything, and
there's already another test file, t/104-map_type.t which actually tests
the map_type() method.
This test file doesn't actually test anything.

The set_cond() method is fairly trivial; it just returns a short snippet
of C code like "items < 1" for use in an XSUB's number-of-args checking,
and there's plenty of tests in the 0xx-parse-foo.t test files for that
already.
Each of the 1xx-*.t test files run tests on a particular API function.
So rename all these files to include a -api- prefix (in the same way
that the parsing files are all 0xx-parse-foo.t and the code running are
all 3xx-run-foo.t)
Add a boilerplate comment line explaining the purpose of each file.
The 5xx-t-foo.t files all test ExtUtils::Typemaps (as opposed to all the
other test files in the t/ directory which test ExtUtils::ParseXS).
Rename the filename prefix from -t- to -typemaps- to better signal this
fact.

600-t-compat.t is a bit of an odd one - it's also testing
ExtUtils::Typemaps and I'm not sure why its 600 rather than 5xx. It's
the only 6xx in the directory. I've renamed it to 518-typemaps-compat.t
for consistency.
Add a boilerplate comment line or two explaining the purpose of each
file.
This directory just contains typemap files to be used by various
test files. Make the name give a clearer indication of its purpose.
Fix up a test to work under perl 5.8.9.

This regex:  qr/\Q...\".../ was meant to match the two literal
characters backslash and double-quote, but that only worked correctly
from 5.10.0 onwards.
This private helper module had 'strict' enabled, but not warnings.
So enable them, and fix up a problem it reveals:

the destructor for the tied handle was trying to write the buffer
contents even if the buffer was empty. This is harmless, but was now
triggering a warning for some tests which tested for an error condition
that didn't produce any output, e.g.:

    ../dist/ExtUtils-ParseXS/t/002-parse-file-scope.t ................... 1/?
    print() on unopened filehandle FH at .../CountLines.pm line 44 during global destruction.

This is also a fix for running under 5.8.9, whose test harness rather
anti-socially invokes perl with -w. Which is how I got to know about the
warnings in the first place.
@iabyn
Copy link
Contributor Author

iabyn commented Jan 9, 2026

If I checkout this commit and attempt to configure/build/test it by itself, I get a failure during the configuration stage.

Now fixed and rebased.

@Leont
Copy link
Contributor

Leont commented Jan 9, 2026

Could you please keep the Changes file up to date? I'm trying to sync to CPAN now and figuring out what went into which release is the hardest part of it. It doesn't have to be much, probably the PR description can be reused for it really.

@iabyn
Copy link
Contributor Author

iabyn commented Jan 10, 2026

Could you please keep the Changes file up to date? I'm trying to sync to CPAN now and figuring out what went into which release is the hardest part of it. It doesn't have to be much, probably the PR description can be reused for it really.

Now done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants