Dada reference grammar #17

nikomatsakis · 2022-01-08T10:56:22Z

Our parser is hand-written and I'm ok with that, but it'd be nice to have a reference grammar, particularly one that enforces non-ambiguity.

I was thinking we could do it using some kind of parser generator tool that supports LL(k) or perhaps (better) GLL/GLR. Looking around at what I see on crates.io, parol looks like the best fit, and most actively maintained.

I definitely want something on crates.io so that we can make a dada-reference-grammar crate, run cargo test on it, and so forth, without having to install any external tools.

The text was updated successfully, but these errors were encountered:

brson · 2022-02-12T08:45:44Z

I've started on this.

nikomatsakis · 2022-02-12T09:32:01Z

I was thinking we could do it using some kind of parser generator tool that supports LL(k) or perhaps (better) GLL/GLR. Looking around at what I see on crates.io, parol looks like the best fit, and most actively maintained.

Of these two, LL(k) would be better, as it enforces non-ambiguity.

brson · 2022-02-12T12:35:07Z

https://github.com/brson/dada/blob/grammar/components/dada-reference-grammar/src/grammar.par

It passes all but 28 of 107 tests. I'm struggling with it though since I don't understand how to write grammars well. So far parol says what I've written is LL2.

brson · 2022-02-12T12:36:10Z

parol has a neat feature that it can emit random valid syntax, which could be useful for testing.

brson · 2022-02-14T04:31:44Z

Now it's only 9 tests failing, all related to format strings (which I haven't tried to parse yet); and return without value and unit/tuples. The latter two both seem ambiguous to me.

Return without expression because if the expression is optional then it is not possible to have any expression-statements after a return statement

e.g.

fn foo() {
    return
    dead_code_here // can't decide what this is?
}

Unit/tuples, at least in statement position, are ambiguous with function calls per #117

brson · 2022-02-14T04:41:57Z

I am though very happy for you to correct me and fix my grammar definition. I don't understand this subject well.

nikomatsakis · 2022-02-14T14:38:51Z

Nice! I'll make some time to check this out.

brson · 2022-02-14T17:29:17Z

The parser has to be generated manually right now and the command for that is here:

https://github.com/brson/dada/blob/grammar/components/dada-reference-grammar/src/lib.rs#L3

brson · 2022-02-21T04:23:09Z

I added two xtasks for grammar so you don't have to type the full commands:

cargo xtask grammar build
cargo xtask grammar decidable which shows the maximum lookahead

brson · 2022-02-21T04:46:57Z

33 tests failing now. I'll give it another look soon. (edit: looks to be mostly unary ops, that i haven't implemented yet).

brson · 2022-02-21T05:26:38Z

Well I fixed the problem with return-without-expr by making it only possible to appear as the final expression in a block.

nikomatsakis · 2022-03-02T09:32:31Z

@brson whoops, forgot about the return question. To be honest, I hadn't thought about it, but that rule (must be last in block) makes sense. Presumably that would apply to break and continue? I would probably have expected to use newline sensitivity -- i.e., there must be some token on the same line as return (could be an opening ().

brson · 2022-03-03T05:36:30Z

If we're going to be newline-sensitive for disambiguation elsewhere then we should probably do it for return/break/continue as well.

brson · 2022-07-17T16:03:12Z

Unasigning myself. Busy with other things lately.

vemoo · 2022-07-31T14:45:23Z

Regarding

fn foo() {
    return
    bar
}

the current parser parses it as return with value bar, which I think it makes sense, because you can do:

fn foo() {
    return,
    bar
}

if you want a return without value.

vemoo · 2022-07-31T14:54:15Z

I've been experimenting with tree-sitter here: https://github.com/vemoo/dada/blob/tree-sitter/tree-sitter-dada/grammar.js

I'm not sure if tree-sitter is a good idea for defining a reference grammar, but it wasn't to difficult to create the grammar and it's parsing most of the dada_tests files but with some differences.

I'm now taking a look at the work done by @brson and experimenting with it, but I've found that some changes to the grammar file make parol hang, but I'm not sure if its an issue in the grammar or a bug in parol.

vemoo · 2022-08-03T17:15:11Z

Thinking some more about this, it is a bit weird that newlines sometimes are interpreted as separators (fields, parameters, tuples, sometimes expressions) and other times as trivia.

Maybe newlines should not be automatically skipped in the parser, and should only be skipped where they are not ambiguous?

nikomatsakis · 2022-08-05T20:31:46Z

Maybe -- I was hoping to make newlines kind of magically "dwim" at all the right times, but I wouldn't be surprised if it doesn't always do what is expected.

vemoo · 2022-08-08T16:27:24Z

I've been trying to complete the parol grammar here: https://github.com/vemoo/dada/tree/grammar but I'm finding it much harder to work with than tree-sitter. I think it's because a lot of the changes I try cause the grammar to no longer be LL(k), so that may be a LL(k) vs GLR difference.

Another thing, I've noticed that there are a few issues that affect the grammar: #177, #156, #134, #117
Should we create a label for them? To know that they could affect the reference grammmar.

igotfr · 2023-06-06T21:44:43Z

https://github.com/brson/dada/blob/grammar/components/dada-reference-grammar/src/grammar.par

It passes all but 28 of 107 tests. I'm struggling with it though since I don't understand how to write grammars well. So far parol says what I've written is LL2.

sorry, but what program can I use to render or check .par files?

Answer: parol - https://crates.io/crates/parol

brson self-assigned this Feb 12, 2022

brson removed their assignment Jul 17, 2022

vemoo mentioned this issue Aug 3, 2022

add source to diagnostics #204

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dada reference grammar #17

Dada reference grammar #17

nikomatsakis commented Jan 8, 2022

brson commented Feb 12, 2022

nikomatsakis commented Feb 12, 2022 •

edited

Loading

brson commented Feb 12, 2022

brson commented Feb 12, 2022

brson commented Feb 14, 2022 •

edited

Loading

brson commented Feb 14, 2022

nikomatsakis commented Feb 14, 2022

brson commented Feb 14, 2022

brson commented Feb 21, 2022

brson commented Feb 21, 2022 •

edited

Loading

brson commented Feb 21, 2022

nikomatsakis commented Mar 2, 2022

brson commented Mar 3, 2022

brson commented Jul 17, 2022

vemoo commented Jul 31, 2022

vemoo commented Jul 31, 2022

vemoo commented Aug 3, 2022 •

edited

Loading

nikomatsakis commented Aug 5, 2022

vemoo commented Aug 8, 2022

igotfr commented Jun 6, 2023 •

edited

Loading

Dada reference grammar #17

Dada reference grammar #17

Comments

nikomatsakis commented Jan 8, 2022

brson commented Feb 12, 2022

nikomatsakis commented Feb 12, 2022 • edited Loading

brson commented Feb 12, 2022

brson commented Feb 12, 2022

brson commented Feb 14, 2022 • edited Loading

brson commented Feb 14, 2022

nikomatsakis commented Feb 14, 2022

brson commented Feb 14, 2022

brson commented Feb 21, 2022

brson commented Feb 21, 2022 • edited Loading

brson commented Feb 21, 2022

nikomatsakis commented Mar 2, 2022

brson commented Mar 3, 2022

brson commented Jul 17, 2022

vemoo commented Jul 31, 2022

vemoo commented Jul 31, 2022

vemoo commented Aug 3, 2022 • edited Loading

nikomatsakis commented Aug 5, 2022

vemoo commented Aug 8, 2022

igotfr commented Jun 6, 2023 • edited Loading

sorry, but what program can I use to render or check .par files?

Answer: parol - https://crates.io/crates/parol

nikomatsakis commented Feb 12, 2022 •

edited

Loading

brson commented Feb 14, 2022 •

edited

Loading

brson commented Feb 21, 2022 •

edited

Loading

vemoo commented Aug 3, 2022 •

edited

Loading

igotfr commented Jun 6, 2023 •

edited

Loading