-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dada reference grammar #17
Comments
I've started on this. |
Of these two, LL(k) would be better, as it enforces non-ambiguity. |
https://github.com/brson/dada/blob/grammar/components/dada-reference-grammar/src/grammar.par It passes all but 28 of 107 tests. I'm struggling with it though since I don't understand how to write grammars well. So far parol says what I've written is LL2. |
parol has a neat feature that it can emit random valid syntax, which could be useful for testing. |
Now it's only 9 tests failing, all related to format strings (which I haven't tried to parse yet); and return without value and unit/tuples. The latter two both seem ambiguous to me. Return without expression because if the expression is optional then it is not possible to have any expression-statements after a return statement e.g.
Unit/tuples, at least in statement position, are ambiguous with function calls per #117 |
I am though very happy for you to correct me and fix my grammar definition. I don't understand this subject well. |
Nice! I'll make some time to check this out. |
The parser has to be generated manually right now and the command for that is here: https://github.com/brson/dada/blob/grammar/components/dada-reference-grammar/src/lib.rs#L3 |
I added two xtasks for grammar so you don't have to type the full commands:
|
33 tests failing now. I'll give it another look soon. (edit: looks to be mostly unary ops, that i haven't implemented yet). |
Well I fixed the problem with return-without-expr by making it only possible to appear as the final expression in a block. |
@brson whoops, forgot about the |
If we're going to be newline-sensitive for disambiguation elsewhere then we should probably do it for return/break/continue as well. |
Unasigning myself. Busy with other things lately. |
Regarding
the current parser parses it as return with value
if you want a return without value. |
I've been experimenting with tree-sitter here: https://github.com/vemoo/dada/blob/tree-sitter/tree-sitter-dada/grammar.js I'm not sure if tree-sitter is a good idea for defining a reference grammar, but it wasn't to difficult to create the grammar and it's parsing most of the dada_tests files but with some differences. I'm now taking a look at the work done by @brson and experimenting with it, but I've found that some changes to the grammar file make parol hang, but I'm not sure if its an issue in the grammar or a bug in parol. |
Thinking some more about this, it is a bit weird that newlines sometimes are interpreted as separators (fields, parameters, tuples, sometimes expressions) and other times as trivia. Maybe newlines should not be automatically skipped in the parser, and should only be skipped where they are not ambiguous? |
Maybe -- I was hoping to make newlines kind of magically "dwim" at all the right times, but I wouldn't be surprised if it doesn't always do what is expected. |
I've been trying to complete the parol grammar here: https://github.com/vemoo/dada/tree/grammar but I'm finding it much harder to work with than tree-sitter. I think it's because a lot of the changes I try cause the grammar to no longer be LL(k), so that may be a LL(k) vs GLR difference. Another thing, I've noticed that there are a few issues that affect the grammar: #177, #156, #134, #117 |
sorry, but what program can I use to render or check .par files?Answer: parol - https://crates.io/crates/parol |
Our parser is hand-written and I'm ok with that, but it'd be nice to have a reference grammar, particularly one that enforces non-ambiguity.
I was thinking we could do it using some kind of parser generator tool that supports LL(k) or perhaps (better) GLL/GLR. Looking around at what I see on crates.io, parol looks like the best fit, and most actively maintained.
I definitely want something on crates.io so that we can make a
dada-reference-grammar
crate, runcargo test
on it, and so forth, without having to install any external tools.The text was updated successfully, but these errors were encountered: