[WIP] tools: Add pygments import script #100
Currently, we can retrieve the regex patterns for the required tokens of all languages not found in the coAST schema. Closes coala#96
Many edge cases are yet to be covered. For now, the script simply skips over all the languages for which it was unable to parse the patterns properly.
I feel like the script is getting too long. I will probably break it up into two or three files.
Also, I'm not really satisfied with the keyword extraction logic. I currently go word by word, handling each regex metacharacter and its behaviour separately, which is not sustainable and leaves out many edge cases.
I've been trying, rather unsuccessfully, to do the second one using regexes, but I'm not very skilled with them, so I couldn't figure out the proper logic. If someone can help out, it would be greatly appreciated 😃
A simple way to check the correctness of an extracted keyword is to ensure that it matches the regex pattern it was extracted from.
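That check could look roughly like the sketch below. The helper name `keywords_matching_pattern` and the sample pattern are made up for illustration and are not taken from the script in this PR. It assumes that a keyword should match the whole pattern on its own, so it uses `re.fullmatch`. Patterns that need surrounding context (for example a lookahead for whitespace) would be reported as mismatches and need separate handling.

```python
import re


def keywords_matching_pattern(pattern, keywords, flags=0):
    """Split candidate keywords into those that fully match `pattern`
    and those that do not.

    `flags` is an assumption: pass the lexer's regex flags here if the
    pattern was written with them in mind (e.g. re.IGNORECASE).
    """
    compiled = re.compile(pattern, flags)
    valid, invalid = [], []
    for keyword in keywords:
        if compiled.fullmatch(keyword):
            valid.append(keyword)
        else:
            invalid.append(keyword)
    return valid, invalid


# Hypothetical pattern and extraction result, for illustration only.
pattern = r'(if|else|while)\b'
candidates = ['if', 'else', 'while', 'whil']
valid, invalid = keywords_matching_pattern(pattern, candidates)
print('valid:', valid)
print('invalid:', invalid)
```

Any keyword that lands in `invalid` points either to a bug in the extraction or to a pattern the script should skip for now.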
I guess most of the hard part is completed now. I've hit a snag on the YAML file dumping, as the
Currently, we can retrieve the regex patterns from lexers for the required
tokens of all languages not found in the coAST schema.
TODO:
Will close #96