However, I consider the "opposite" sort of a mistake. It seems to me that the statement language and the expression language are separate things, and it makes sense to use recursive descent for the statement language and either top-down or bottom-up precedence for expressions. In other words, the architecture of the original C compiler is very clean, and no doubt influenced the language design.
Also, the difference between "Pratt parsing" and "Precedence climbing" is sort of a code organization issue -- do you handle a[i] and x?
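To make the comparison concrete, here is a minimal precedence-climbing sketch in Python (all names and the token format are illustrative assumptions, not from any compiler discussed here): one function handles every binary operator, looping by binding power instead of having one function per precedence level.

```python
# Minimal precedence-climbing sketch (illustrative; a real parser would use a
# proper lexer). BINARY_PREC maps operator tokens to binding power.
BINARY_PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def parse_expr(tokens, min_prec=0):
    """Parse tokens (a list consumed left to right) into a nested tuple AST."""
    left = tokens.pop(0)          # atom: assume a number or identifier
    while tokens and tokens[0] in BINARY_PREC and BINARY_PREC[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # parse the right-hand side one level tighter, giving left associativity
        right = parse_expr(tokens, BINARY_PREC[op] + 1)
        left = (op, left, right)
    return left

print(parse_expr(["1", "+", "2", "*", "3"]))   # ('+', '1', ('*', '2', '3'))
```

Pratt parsing organizes the same logic differently, dispatching to per-token handler functions, which is largely the code-organization difference being described.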
I wondered if there is an analogous thing for bottom-up expression parsing. I like having each "form" in one function. In my first compiler project, about 25 years ago or so, I "invented" precedence parsing with the help of a symbolic-expression evaluator in Prolog from a book I picked up, which I tried to figure out how to translate to Pascal. So "while" took two expressions: the condition and a block.
The hard part was not handling the control structures, but getting the priorities right. My Ruby parser uses recursive descent for the top-level productions.
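That division of labor can be sketched roughly as follows (a hypothetical toy grammar in Python, not the commenter's Pascal or Ruby code): recursive descent owns the statement forms, and each "while" collects a condition and a block by delegating to an expression parser, stubbed here as a single-token read.

```python
# Sketch: recursive descent for statements, delegating expressions to a
# precedence parser (stubbed). All names are illustrative; the point is the
# division of labor, not a realistic grammar.
def parse_expression(tokens):
    return tokens.pop(0)                      # stand-in for the precedence parser

def parse_statement(tokens):
    if tokens[0] == "while":                  # while <cond> <block>
        tokens.pop(0)
        cond = parse_expression(tokens)
        body = parse_statement(tokens)
        return ("while", cond, body)
    if tokens[0] == "{":                      # block: { stmt* }
        tokens.pop(0)
        stmts = []
        while tokens[0] != "}":
            stmts.append(parse_statement(tokens))
        tokens.pop(0)                         # consume the "}"
        return ("block", stmts)
    return ("expr-stmt", parse_expression(tokens))

print(parse_statement(["while", "x", "{", "y", "}"]))
# ('while', 'x', ('block', [('expr-stmt', 'y')]))
```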
But it was fascinating to do, as it made for a very compact parser. In Ruby that is true, though many constructs just return nil. But given that many of the "operators" are only allowed at specific levels, it makes less sense to use a precedence parser anyway. I think the reason is simply that there is a fundamental relationship between the language you are parsing (the data) and the algorithm you choose to parse it with (the code).
Those two things are tightly linked. They usually only have time to teach one parsing method, and the language being parsed is chosen to fit that. There are of course ugly parts of this with Ruby that ruined my originally quite nice and clean generic shunting-yard code. Worst case, I "surrender" and convert it to recursive descent like the higher levels of the grammar, but I want to see what I can do with the current one first.
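For readers unfamiliar with the shunting-yard algorithm mentioned above, a generic core could look like this (a Python sketch under assumed token conventions, not the commenter's Ruby code): operators are pushed onto a stack and popped by precedence, so the output comes out in reverse Polish notation.

```python
# Generic shunting-yard core (illustrative). Left-associative binary
# operators only; parentheses are handled as grouping markers.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def to_rpn(tokens):
    output, ops = [], []
    for tok in tokens:
        if tok in PREC:
            # pop operators of higher or equal precedence (left associative)
            while ops and ops[-1] in PREC and PREC[ops[-1]] >= PREC[tok]:
                output.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                output.append(ops.pop())
            ops.pop()                         # discard the "("
        else:
            output.append(tok)                # operand
    while ops:
        output.append(ops.pop())
    return output

print(to_rpn(["1", "+", "2", "*", "3"]))      # ['1', '2', '3', '*', '+']
```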
Less if you exclude the code for a decent generic parser library. I think that says a lot about the simplicity of applying both recursive descent and precedence parsing. Sure, you can also deal with terminal overloading using adjacency if you want, but the parser gets more complex and it becomes less worth it.
The variant I use for my Ruby parser does have some extra ugly hacks needed to deal with Ruby ambiguities, and "fun" stuff like variable-arity operators. Or one of my "favorites": foo 5,1 is valid and has two arguments. Sure you do. If the value stack contains fewer arguments than the current operator requires after any reduce step the priorities call for, you have an error, and the cause is generally straightforward to pin down.
Constructing precise errors from that can be a bit painful, but it usually points at the "correct" spot if you record the locations of tokens, and the operator priorities let you determine easily enough what would have been valid.
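The arity check described above could be sketched like this (illustrative names, not the commenter's actual code): while reducing RPN output, if the value stack holds fewer operands than the operator needs, report an error at the operator's recorded position.

```python
# Reduce RPN tokens into a tuple tree, flagging arity errors with positions.
# Tokens are (text, position) pairs; ARITY is an assumed operator table.
ARITY = {"+": 2, "-": 2, "*": 2, "neg": 1}

def build_tree(rpn_tokens):
    values = []
    for text, pos in rpn_tokens:
        if text in ARITY:
            n = ARITY[text]
            if len(values) < n:
                raise SyntaxError(f"{text!r} at position {pos} needs "
                                  f"{n} operands, found {len(values)}")
            args = values[-n:]
            del values[-n:]
            values.append((text, *args))
        else:
            values.append(text)
    return values

print(build_tree([("1", 0), ("2", 2), ("+", 1)]))   # [('+', '1', '2')]
try:
    build_tree([("1", 0), ("+", 1)])
except SyntaxError as err:
    print(err)    # '+' at position 1 needs 2 operands, found 1
```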
If you want to apply semantic checks, you can just as easily call out to do those tests with a precedence parser as with a recursive descent one. Depends what you mean by "checking for correctness".
In the simplest form, such as the example I posted, it applies one very basic check: does the number of available values on the value stack match what is required by the operator currently being processed? If you can do that, you can construct a well-formed tree per the rules specified by the precedence parser's configuration. Error recovery, mainly.
An unpaired brace can simply be flagged and then ignored during parsing. Maybe not so interesting for a batch parser, but it works really well in an IDE, where transient errors like this are common. Finally, for live programming contexts where you are constantly parsing and type checking, this is useful. The entire point of the parser is to "shunt" tokens onto an operand stack and an operator stack, effectively sorting them into reverse Polish notation.
The operator table is a specification of the grammar according to those rules. There are certainly subsets that are awkward or require you to augment a precedence parser, but the precedence parser itself does not need to put any limits on AST construction. In fact, the example I posted explicitly decouples tree construction from the parsing, so you can pass it a builder class that applies whatever additional constraints you want on the construction of the trees. This is nonsense.
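That decoupling might look roughly like this (a Python sketch; the class and method names are hypothetical, not the commenter's actual builder API): the reducer only calls builder methods, so a different builder can impose extra semantic constraints without touching the parser.

```python
# Decoupling tree construction from parsing: the reducer is generic, and the
# builder decides what a node looks like (and may reject some constructions).
OP_ARITY = {"+": 2, "/": 2}      # assumed operator table

class TupleBuilder:
    def atom(self, tok):
        return tok
    def apply(self, op, args):
        return (op, *args)

class ValidatingBuilder(TupleBuilder):
    def apply(self, op, args):
        if op == "/" and args[1] == "0":      # example semantic constraint
            raise ValueError("division by literal zero")
        return super().apply(op, args)

def reduce_rpn(rpn, builder):
    stack = []
    for tok in rpn:
        if tok in OP_ARITY:
            n = OP_ARITY[tok]
            args = stack[-n:]
            del stack[-n:]
            stack.append(builder.apply(tok, args))
        else:
            stack.append(builder.atom(tok))
    return stack[0]

print(reduce_rpn(["1", "0", "/"], TupleBuilder()))   # ('/', '1', '0')
# reduce_rpn(["1", "0", "/"], ValidatingBuilder())   # raises ValueError
```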
I vastly underappreciated how well Wirthian languages are designed, even after toying with the excellent offshoot Modula-3. In my youth I was too concerned with raw performance and syntax; in my old age I am concerned with semantics and maintainability.
Compiler Construction by Niklaus Wirth