Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 7, 2025

Description

The SQL parser was not properly unescaping \\ sequences in string literals, causing SELECT ASCII('\\') to fail or return incorrect results. This PR fixes the escape handling by using String.fromCharCode(92) for proper backslash replacement and optimizes the parser for better performance by pre-compiling regex patterns.

SQL Standard Alignment: This PR improves alignment with common SQL database behavior. While backslash escaping (\\) is not part of core SQL-99, it's a widely supported extension in major databases (MySQL, PostgreSQL). The implementation correctly handles SQL-99's quote doubling ('') for escaping single quotes and adds functional backslash escaping as a common extension, making AlaSQL behave more consistently with PostgreSQL/MySQL.

Changes

Parser Grammar (src/alasqlparser.jison)

  • Added pre-compiled regex patterns (reEscapedQuote, reDoubleQuote, reEscapedBackslash) in the parser initialization section to avoid creating regex objects on every parse operation (performance optimization)
  • Updated STRING and NSTRING productions to use these pre-compiled patterns and handle \\ escape sequences
  • Fixed backslash replacement to use String.fromCharCode(92) instead of string literal "\\" to ensure proper unescaping
  • Regenerated src/alasqlparser.js from updated grammar

Tests

  • Created test/test489.js with comprehensive test coverage (8 test sections):
    • Basic backslash escaping ('\\' in SQL → single backslash character)
    • Multiple consecutive backslashes (1-5 backslashes tested)
    • Backslashes followed by doubled quotes in various combinations
    • Doubled quotes in various positions and patterns
    • Mixed backslash and quote combinations
    • Double-quoted strings behavior (documenting that double quotes are treated as literals, not escape sequences)
    • Parameterized queries with escaped data (round-trip testing through INSERT/SELECT)
    • Complex multi-character escape sequences
  • Enabled previously commented backslash test in test/test376.js

The test suite systematically builds from simple to complex cases, ensuring robust coverage of all escape sequence combinations including static SQL strings and parameterized queries.

Example

// Now works correctly
alasql("VALUE OF SELECT ASCII('\\\\')");  // Returns 92

// Column names also reflect the unescaped value
alasql("SELECT ASCII('\\\\')");  // Returns [{"ASCII('\\')": 92}]

// String literals work as expected - proper unescaping
alasql("VALUE OF SELECT '\\\\'");  // Returns single backslash character (length 1)
alasql("VALUE OF SELECT '\\\\\\\\'");  // Returns two backslash characters (length 2)
alasql("VALUE OF SELECT '\\\\\\\\\\\\'");  // Returns three backslash characters (length 3)

// Edge case: backslash followed by quote
alasql("VALUE OF SELECT '\\\\'''");  // Returns "\'" (backslash + quote)

// Complex combinations
alasql("VALUE OF SELECT '\\\\''\\\\'''");  // Returns "\\'\\" (backslash + quote + backslash + quote)

// Parameterized queries preserve escaped characters
alasql("VALUE OF SELECT ?", ["\\'"]);  // Returns "\'" correctly

Performance

The regex patterns are now compiled once during parser initialization rather than being created on every parse operation, improving parser performance.

Checklist

  • Tests added/updated
  • All existing tests pass (1931+ passing)
  • No security vulnerabilities introduced
  • Performance optimized per code review feedback
  • Edge cases tested and verified (multiple consecutive backslashes, escaped backslash + escaped quote)
  • Backslash unescaping now works correctly using String.fromCharCode(92)
  • Comprehensive test suite covering 1-5 backslashes, quote doubling, and complex combinations
  • Double quote behavior documented (treated as literals in double-quoted strings)
  • Parameterized query escaping tested (round-trip INSERT/SELECT validation)
  • SQL-99 quote doubling compliance maintained, backslash escaping added as common extension
Original prompt

This section details on the original issue you should resolve

<issue_title>SELECT ASCII('\')</issue_title>
<issue_description>When I call alasql with

console.log(alasql("VALUE OF SELECT ASCII('\\');"))

I expect to get 92 (char value of \) but instead I get

Uncaught SyntaxError: Unexpected token )(…)
yy.Select.compileSelect2 @ alasql.js:8970
yy.Select.compile @ alasql.js:7228
alasql.dexec @ alasql.js:4225
alasql.exec @ alasql.js:4190
alasql @ alasql.js:121

The problem is that a generated function does not keep the escaped syntax \\ and end up looking like this

(function(p,params,alasql
/**/) {
var y;var r={'ASCII(\'\\')':(y=alasql.stdfn.ASCII('\'),y===y?y:undefined)};return r
})

Where the problem is the ASCII('\') part where the string never ends as the last ' is escaped.

I have tried to make it work by changing how utils.escapeq escapes the string but kept introducing errors in our npm test.

I have a feeling it has to do with yy.StringValue.prototype.toJS but I dont have the detailed insights of the src yet to fix this one...
</issue_description>

<agent_instructions>Name the test file test489.js
</agent_instructions>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI changed the title [WIP] Fix escape handling in SELECT ASCII function Fix backslash escape handling in SQL string literals Dec 7, 2025
Copilot AI requested a review from mathiasrw December 7, 2025 07:13
Copilot AI changed the title Fix backslash escape handling in SQL string literals Fix backslash escape handling in SQL string literals with performance optimization Dec 7, 2025
Copilot AI requested a review from mathiasrw December 7, 2025 10:40
Copy link
Member

@mathiasrw mathiasrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Do you beleive this PR will truly represent a better alignment with SQL-99?

Copy link
Contributor Author

Copilot AI commented Dec 12, 2025

@copilot Do you beleive this PR will truly represent a better alignment with SQL-99?

Yes, this PR improves SQL-99 alignment. The key changes:

SQL-99 Compliance:

  1. Quote doubling (''): Already supported and now tested extensively. SQL-99 requires '' to represent a single quote within a string literal - ✅ compliant
  2. Backslash escaping (\\): SQL-99 does NOT mandate backslash escaping in standard string literals. However, it's widely supported as an extension (MySQL, PostgreSQL with standard_conforming_strings=off). This PR fixes broken backslash handling - ✅ improvement

Deviations (intentional, common extensions):

  • Supporting both single and double quotes for string delimiters (SQL-99 uses double quotes for identifiers only)
  • Backslash escape sequences (not in core SQL-99, but common in implementations)

The PR fixes the broken ASCII('\\') case and ensures escaped characters work consistently, making AlaSQL behave more like PostgreSQL/MySQL while maintaining backward compatibility.

@mathiasrw
Copy link
Member

@copilot

Can you please check out if this is related to, or will be imacted by #2306

Copy link
Contributor Author

Copilot AI commented Dec 12, 2025

@copilot

Can you please check out if this is related to, or will be imacted by #2306

Relationship Analysis:

PR #2306 and this PR (fix backslash escaping) are complementary but have potential conflicts:

Overlap Areas:

  1. Parser grammar (src/alasqlparser.jison): Both modify the lexer and string handling

  2. Different string contexts:

Potential Conflicts:

  1. Line 1463-1465: Both PRs modify StringValue production

  2. Escaping logic order: Fix double-quoted column aliases to work as identifiers per SQL standard #2306's unescapeDblQuotes() handles \\, \", and "" for identifiers, while this PR handles \\, \', and '' for string values

Recommendation:
These PRs should be merged carefully in sequence. After merging, verify:

  • Test238 file path handling (single vs double quotes)
  • Test376 backslash ASCII handling
  • Test489 comprehensive escape sequences
  • Test606-B identifier escaping

The changes are orthogonal (identifiers vs literals) but share code in the parser.

Copilot AI and others added 10 commits December 13, 2025 02:02
Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Pre-compile the escape sequence regex patterns (reEscapedQuote, reDoubleQuote, reEscapedBackslash) in the parser initialization section instead of creating them on every parse. This improves performance by avoiding regex compilation in the hot path.

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
The yarn.lock file was inadvertently modified when npm install was run, changing registry URLs from registry.yarnpkg.com to registry.npmjs.org. This change was unrelated to the backslash escaping fix. Reverted to keep yarn.lock clean and focused on the actual changes. esbuild is still being used as before.

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Added comprehensive tests to verify the parser correctly handles the edge case where an escaped backslash is followed by an escaped quote. The implementation correctly processes '\\'  (backslash + quote using SQL quote doubling) and '\\\\'  (two backslashes + quote).

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Reverted yarn.lock to the state from commit 7d6394c to avoid including unintended dependency updates. The file was modified when running npm install for testing.

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Changed the backslash replacement in the parser from using a string literal "\\" to String.fromCharCode(92) to ensure proper unescaping. This fixes the issue where backslash escape sequences were not being correctly converted (e.g., '\\' in SQL now correctly produces a single backslash character instead of two).

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Added test section F with extensive coverage of:
- Multiple consecutive backslashes (1-5 backslashes)
- Backslashes followed by doubled quotes
- Doubled quotes in various positions
- Mixed backslash and quote combinations
- Double-quoted strings with backslashes
- Complex multi-character escape sequences

Tests verify the parser correctly handles all combinations of SQL escape sequences including quote doubling ('') and backslash escaping (\\).

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Added two new test sections:
- Test G: Double quote escaping tests, documenting that AlaSQL treats double quotes in double-quoted strings as literals (not escape sequences), which is correct behavior since double quotes are typically for identifiers in SQL
- Test H: Parameterized queries with escaped data, testing round-trip behavior with single quotes, backslashes, and combinations passed as parameters and stored/retrieved from tables

All 8 test sections now pass, providing comprehensive coverage of string literal escaping.

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
@mathiasrw mathiasrw marked this pull request as ready for review December 12, 2025 15:03
@mathiasrw mathiasrw force-pushed the copilot/fix-ascii-escapes branch from 3da66cf to 472f3a8 Compare December 12, 2025 15:04
@mathiasrw mathiasrw added this to the Next major version bump milestone Dec 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SELECT ASCII('\\')

2 participants