A case of parser backtracking not being supported #138

@ForNeVeR

Disclaimer

Once again, mind that I never had any formal education in language parsing, so I may ask for strange or unrealistic things. Please feel free to correct me on anything I say.

Describe the bug

Let's consider this EBNF sample, an excerpt of the actual C17 grammar with everything unrelated stripped.

function_definition: declaration_specifiers declarator
declaration_specifiers: type_specifier declaration_specifiers?
declarator: direct_declarator
direct_declarator: Identifier
type_specifier: 'int'
type_specifier: Identifier

So, a function_definition boils down to a sequence of one or more type_specifiers followed by a single direct_declarator (which is an Identifier). So far, so good.

This synthetic sample should be accepted by the grammar:

int main

It has one type_specifier, namely int, followed by a declarator, which is a direct_declarator, which is the Identifier main.

Unfortunately, I wasn't able to make Yoakke parse this sample.

Here's my program:

using Yoakke.SynKit.C.Syntax;
using Yoakke.SynKit.Lexer;
using Yoakke.SynKit.Parser.Attributes;

namespace Foo;

public record FunctionDefinition(List<IDeclarationSpecifier> Specifiers, Declarator Declarator);

public interface IDeclarationSpecifier
{
}

public record Declarator(DirectDeclarator DirectDeclarator);

public record DirectDeclarator(string Text);

public record TypeSpecifier(string Name) : IDeclarationSpecifier;

[Parser(typeof(CTokenType))]
public partial class CParser
{
    [Rule("function_definition: declaration_specifiers declarator")]
    private static FunctionDefinition MakeFunctionDefinition(
        List<IDeclarationSpecifier> specifiers,
        Declarator declarator) => new(specifiers, declarator);

    [Rule("declaration_specifiers: type_specifier declaration_specifiers?")]
    private static List<IDeclarationSpecifier> MakeDeclarationSpecifiers(
        IDeclarationSpecifier typeSpecifier,
        List<IDeclarationSpecifier>? rest) =>
        rest?.Prepend(typeSpecifier).ToList() ?? new List<IDeclarationSpecifier> { typeSpecifier };
    
    [Rule("declarator: direct_declarator")]
    private static Declarator MakeDeclarator(DirectDeclarator directDeclarator) =>
        new(directDeclarator);

    [Rule("direct_declarator: Identifier")]
    private static DirectDeclarator MakeDirectDeclarator(IToken identifier) =>
        new DirectDeclarator(identifier.Text);
    
    [Rule("type_specifier: 'int'")]
    [Rule("type_specifier: Identifier")]
    private static TypeSpecifier MakeSimpleTypeSpecifier(IToken specifier) => new(specifier.Text);
}


public class Program
{
    public static void Main(string[] args)
    {
        var parser = new CParser(new CLexer("int main"));
        Console.WriteLine(parser.ParseFunctionDefinition().Ok);
    }
}

For convenience, here's also a .csproj:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net6.0</TargetFramework>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
    </PropertyGroup>

    <ItemGroup>
      <PackageReference Include="Yoakke.SynKit.C.Syntax" Version="2022.1.24-2.29.33-nightly" />
      <PackageReference Include="Yoakke.SynKit.Parser" Version="2022.1.24-2.29.33-nightly" />
      <PackageReference Include="Yoakke.SynKit.Parser.Generator" Version="2022.1.24-2.29.33-nightly" />
    </ItemGroup>

</Project>

I expect this program to print the following:

Yoakke.SynKit.Parser.ParseOk`1[Foo.FunctionDefinition]

Instead, it prints this:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Yoakke.SynKit.Parser.ParseError' to type 'Yoakke.SynKit.Parser.ParseOk`1[Foo.FunctionDefinition]'.
   at Yoakke.SynKit.Parser.ParseResult`1.get_Ok()
   at Foo.Program.Main(String[] args) in T:\Temp\ConsoleApp4\ConsoleApp4\Program.cs:line 52

Analysis

I have read the generated code thoroughly and, as far as I understand it, the following is happening.

Yoakke eagerly consumes the declaration_specifiers collection and takes both tokens, int and main, as its items, because main also matches type_specifier: Identifier. Then it tries to parse a declarator but cannot, because no tokens are left.

At that point it is unable to drop the last item from declaration_specifiers and retry the declarator, though that would be the winning strategy here.
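To make the difference concrete, here is a minimal hand-written sketch that contrasts the two strategies. It does not use Yoakke and is not taken from the generated code. The class name BacktrackingSketch, its methods, and the simplified token model (plain strings, with "int" as the only keyword, and the whole input required to be consumed) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, independent of Yoakke: tokens are plain strings.
public static class BacktrackingSketch
{
    // Identifier: simplified to "starts with a letter or underscore and is not the keyword int".
    private static bool IsIdentifier(string token) =>
        token.Length > 0 && token != "int" && (char.IsLetter(token[0]) || token[0] == '_');

    // type_specifier: 'int' | Identifier
    private static bool IsTypeSpecifier(string token) =>
        token == "int" || IsIdentifier(token);

    // Greedy strategy: take every token that can be a type_specifier,
    // then demand an Identifier for the declarator. No retry on failure.
    public static bool ParseGreedy(IReadOnlyList<string> tokens)
    {
        int pos = 0;
        while (pos < tokens.Count && IsTypeSpecifier(tokens[pos])) pos++;
        if (pos == 0) return false; // at least one type_specifier is required

        // declarator: exactly one Identifier, then end of input (an assumption of this sketch).
        return pos + 1 == tokens.Count && IsIdentifier(tokens[pos]);
    }

    // Backtracking strategy: start from the longest specifier list and
    // give back one token at a time until the declarator can be parsed.
    public static bool ParseWithBacktracking(IReadOnlyList<string> tokens)
    {
        int max = 0;
        while (max < tokens.Count && IsTypeSpecifier(tokens[max])) max++;

        for (int count = max; count >= 1; count--)
        {
            // tokens[count] is the candidate declarator; it must be the last token.
            if (count + 1 == tokens.Count && IsIdentifier(tokens[count])) return true;
        }
        return false;
    }
}
```

With the tokens "int" and "main", ParseGreedy returns false because the specifier loop has already taken main, while ParseWithBacktracking returns true after giving main back to the declarator. In this simplified grammar the greedy variant can never succeed, since any Identifier that could serve as the declarator is also a valid type_specifier.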

Unfortunately, I don't know a workaround for this problem, so I would appreciate any feedback. If there's any hacky/ugly way to fix this, I'd love to hear it. Obviously, I would love to hear if there's an elegant solution to this problem, too :)

Environment

  • OS: Windows 10
  • .NET version: .NET 6
  • Library version: 2022.1.24-2.29.33-nightly

Labels: bug