forgot about strings #6

heavyk · 2015-07-22T13:29:08Z

this fixes the case console.log \'lala, a-variable

heavyk · 2015-07-22T13:29:50Z

oh yeah, and ampersands too

heavyk · 2015-07-29T22:21:30Z

just pushed another update for ')': livescript does not include it in backticks

eg. this is correct

heavyk · 2015-07-29T22:31:08Z

well, shit the weird thing is this:

# fine
console.log \)
# also fine
console.log \)lala
# syntax error: unexpected ')'
console.log \lala)lala

gabeio · 2015-07-29T22:43:58Z

is the syntax error from livescript(compile) or atom(parser)?

heavyk · 2015-07-29T22:49:47Z

it's from livescript.
it appears that if the first char is a ')' it's accepted..

gabeio · 2015-07-29T22:51:54Z

yeah that's so you can do something like:

console.log(\asdf\))
console.log(\asdf)

heavyk · 2015-07-29T23:01:40Z

ok, I got it (kinda) but I want to simplify the regex. this is what I have now:

match: "\\\\[\\w\\W][\\$\\.\\/\\%\\^\\@\\#\\&\\*\\'\\\"\\!\\=\\+\\[\\]\\(\\{\\}\\<\\>\\w-]*"

do you know how to write a regex which basically says: the first char after the \ can be anything (that's the \\\\[\\w\\W] part), but any subsequent chars can be anything except for a ')' ???

seems we could simplify the above mess to two rules (otherwise I'd have to add an exception for every unicode char), cause for example all of these compile fine:

// Generated by LiveScript 1.4.0
(function(){
  console.log('this', 'is', 'livescript');
  console.log(yay);
  console.log(')');
  console.log(')lal%&*(!@§a');
  console.log(')hello:£¢€°·‚‚Æ§l%&*(!@§a');
}).call(this);

heavyk · 2015-07-29T23:04:07Z

my wording sucks. sorry bout that. do you know a regex which will match any letter except for ')' ??

gabeio · 2015-07-29T23:04:15Z

usually a . means any character (not sure about this version of regex). as for "anything but", you can do a (?!\)), a negative lookahead meaning what follows can't match this group (which is only a ')'). so try something like:

match: "\\\\[\\.][\\$\\.\\/\\%\\^\\@\\#\\&\\*\\'\\\"\\!\\=\\+\\[\\]\\(\\{\\}\\<\\>\\w-]*(?!\\))"

but I am not sure if that works... because the \\. I changed...
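A quick way to check the doubt above is to run both ideas in plain JavaScript. This is only a sketch: it assumes the grammar engine behaves like JavaScript's RegExp here, and the variable names and shortened patterns are made up for illustration. It suggests that a dot inside a character class is a literal dot, and that a lookahead placed after the `*` only trims the match instead of filtering each character.

```javascript
// Pattern strings written the way the grammar file stores them (doubly escaped).
const anyFirstChar = new RegExp("\\\\[\\w\\W]"); // backslash + any single char
const dotInClass = new RegExp("\\\\[\\.]");      // backslash + a literal '.'

const sample = "\\)hello"; // the text \)hello

// [\w\W] accepts the ')' right after the backslash.
const anyMatch = sample.match(anyFirstChar);
// [\.] does not: inside a class the dot loses its "any char" meaning.
const dotMatch = sample.match(dotInClass);
// It only matches when the char after the backslash really is a dot.
const literalDotMatch = "\\.hello".match(dotInClass);

// A lookahead after the quantifier is checked once, at the end of the match.
// The engine backtracks one char so that the NEXT char is not ')'.
const trailingLookahead = /\\[\w\W][\w(]*(?!\))/;
const trimmed = "\\lala)".match(trailingLookahead);

console.log(anyMatch[0], dotMatch, literalDotMatch[0], trimmed[0]);
```

So the first change would stop `\)` from matching at all, and the trailing lookahead would drop the last char before a ')' rather than simply stopping in front of it.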

heavyk · 2015-07-29T23:09:04Z

they look to be compiled RegExp ... so I'm testing them in the console like this:

var r = new RegExp("\\\\[\\w\\W][\\$\\.\\/\\%\\^\\@\\#\\&\\*\\:\\'\\\"\\!\\=\\+\\[\\]\\(\\{\\}\\<\\>\\w-]*")
'\\)hello:£¢€°·‚‚Æ§l%&*(!@§a'.match(r)

ok, gonna try your suggestion
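One thing that makes this kind of console testing confusing is the double escaping. A small sketch, assuming the grammar's match strings are handed to a RegExp-style constructor as they appear to be (the shortened pattern and the names are only for illustration):

```javascript
// The grammar stores the pattern as a string, so every regex backslash is doubled.
const fromString = new RegExp("\\\\[\\w\\W][\\w-]*");
// The same pattern written as a regex literal needs only one level of escaping.
const fromLiteral = /\\[\w\W][\w-]*/;

// Both compile to the same regex source.
const sameSource = fromString.source === fromLiteral.source;

// Test text also needs its backslash escaped: this is the text \lala-variable
const testText = "\\lala-variable";
const matched = testText.match(fromString)[0];

console.log(sameSource, matched);
```

Testing with the literal form is easier to read, as long as the backslashes are doubled again when the pattern goes back into the grammar file.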

heavyk · 2015-07-29T23:50:47Z

I can't get it to work. according to this comment ... http://stackoverflow.com/questions/6851921/negative-lookahead-regular-expression#comment8148005_6851958

I would need to know the whole line. (^ ... $) for that technique to work in js ... I dunno if that's even right. this is way over my head right now. I honestly just learned about negative look-ahead ...

@gabeio do you see an easy way for this:

'\\)hello:§()'.match(new RegExp("\\\\[\\w\\W][\\$\\.\\/\\%\\^\\@\\#\\&\\*\\:\\'\\\"\\!\\=\\+\\[\\]\\(\\{\\}\\<\\>\\w-]*"))
["\)hello:"]
// to become this: ???
["\)hello:§("]

for now, I'm giving up :/

98devin · 2016-02-13T02:02:08Z

I know this is a really old topic, but I think there's a simple solution. Rather than use a character class whitelisting acceptable characters, blacklist the bad ones.
That's done in general by using [^ insert chars here] where the ^ character means everything NOT in the class when put at the beginning.

That said, this works in the engine javascript uses at least:

'\\)hello:§()'.match /\\[\w\W][^\)\]\s]*/  #=> '\\)hello:§('

I'm not sure if this regex is foolproof though, or if it will work here, but it's likely.
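For comparison, here is the same check in plain JavaScript, run against the whitelist pattern from earlier in this thread. It is a sketch that assumes the grammar engine agrees with JavaScript on these constructs; the variable names are made up.

```javascript
// Whitelist version from earlier in the thread (doubly escaped string form).
const whitelist = new RegExp("\\\\[\\w\\W][\\$\\.\\/\\%\\^\\@\\#\\&\\*\\:\\'\\\"\\!\\=\\+\\[\\]\\(\\{\\}\\<\\>\\w-]*");
// Negated class: any first char, then anything except ')', ']' or whitespace.
const negated = /\\[\w\W][^\)\]\s]*/;

const text = "\\)hello:§()"; // the text \)hello:§()

// The whitelist stops at the first char it does not know about (§).
const whitelistMatch = text.match(whitelist)[0];
// The negated class keeps going until the closing paren.
const negatedMatch = text.match(negated)[0];

// Unicode chars need no special handling with the negated class.
const unicodeText = "\\)hello:£¢€°·‚‚Æ§l%&*(!@§a";
const unicodeMatch = unicodeText.match(negated)[0];

// A ')' later in the string ends the match, as in: console.log \lala)lala
const stopsAtParen = "\\lala)lala".match(negated)[0];

console.log(whitelistMatch, negatedMatch, unicodeMatch, stopsAtParen);
```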

heavyk · 2016-02-13T14:42:16Z

well, either way, this version is a huge improvement on what's published in apm. I'll probably revisit this though, because the other day I had strange formatting.

either way, I want to figure out how to use LS's tokenizer directly instead of using regexp.

98devin · 2016-02-13T22:15:52Z

Interesting idea; do any other syntax plugins on apm use their own engine? I just wonder how complicated that would be to set up.

As for other backslash string problems, they currently don't have the right priority since any # character inside will begin a comment...

Is the priority just based on the ordering in the file? If so that's an easy fix probably.

heavyk · 2016-02-13T23:06:07Z

> Interesting idea; do any other syntax plugins on apm use their own engine?

I looked a while back and didn't see any, but that doesn't mean it doesn't exist. if not, raise an issue on atom's tracker asking how it could be done.

> Is the priority just based on the ordering in the file?

I don't remember right now. I just remember how complicated it was, and since I have little real knowledge of regexp that's what forced me to see if I could implement the existing tokenizer

98devin · 2016-02-14T03:41:20Z

I think it might be a good idea to look through all the regexes used in the grammar for redundancies and things to improve because of problems like this. Even more so because the currently available package conflicts with the language definition (for example, it allows ] and ) anywhere in a backslash string).

I couldn't find a good source on what engine Atom uses for regex, but it seems to be either javascript's or something called oniguruma. In any case they should be similar for the most part, so I'll try to understand the project as it is now.

forgot about strings

13361d4

')' paren is not included

f9e5f54
