read_ident() result differs between SIMD and non-SIMD #53

@kelko

While working on #52, I noticed that read_ident() in base.rs returns different results for the underlying string "Frameset//EN", depending on whether SIMD is used.

In stable Rust, without SIMD, it returns the full string, with a length of 12.
In nightly Rust, with SIMD, it returns only the part before "//" ("Frameset"), with a length of 8.

I'm happy to fix that in #52, but which version is correct, and which one should change?

The non-SIMD version uses (util.rs):

pub fn is_ident(c: u8) -> bool {
    (b'0'..=b'9').contains(&c)
        || (b'A'..=b'Z').contains(&c)
        || (b'a'..=b'z').contains(&c)
        || c == b'-'
        || c == b'_'
        || c == b':'
        || c == b'+'
        || c == b'/'
}

The SIMD version uses (nightly.rs):

    let needle_zero = u8x16::splat(b'0');
    let needle_nine = u8x16::splat(b'9');
    let needle_lc_a = u8x16::splat(b'a');
    let needle_lc_z = u8x16::splat(b'z');
    let needle_uc_a = u8x16::splat(b'A');
    let needle_uc_z = u8x16::splat(b'Z');
    let needle_minus = u8x16::splat(b'-');
    let needle_underscore = u8x16::splat(b'_');
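
To make the mismatch easy to reproduce without nightly, here is a minimal scalar sketch of the two character sets. It assumes the SIMD path accepts exactly the characters covered by the needles quoted above (digits, letters, '-' and '_'), so ':', '+' and '/' are missing compared to is_ident(). The names is_ident_scalar, is_ident_simd_like and ident_len are hypothetical and only illustrate the difference; they are not the crate's actual code.

```rust
/// Same character set as the non-SIMD `is_ident` quoted from util.rs.
fn is_ident_scalar(c: u8) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, b'-' | b'_' | b':' | b'+' | b'/')
}

/// Scalar stand-in for the SIMD path. ASSUMPTION: only the needles shown
/// in the nightly.rs excerpt are matched, so ':', '+' and '/' are rejected.
fn is_ident_simd_like(c: u8) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, b'-' | b'_')
}

/// Hypothetical helper: length of the leading run of bytes accepted by `pred`.
fn ident_len(input: &[u8], pred: fn(u8) -> bool) -> usize {
    input.iter().take_while(|&&c| pred(c)).count()
}

fn main() {
    let input = b"Frameset//EN";
    // The full string is accepted because '/' is an identifier byte here.
    println!("scalar:    {}", ident_len(input, is_ident_scalar)); // 12
    // The scan stops at the first '/', leaving only "Frameset".
    println!("simd-like: {}", ident_len(input, is_ident_simd_like)); // 8
}
```

If the non-SIMD behavior is the intended one, the SIMD path would presumably need additional needles for ':', '+' and '/'; if the SIMD behavior is intended, those three characters would have to be dropped from is_ident().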
