parsing - Require newline or EOF after statement match


I'm just looking for a simple way of getting ANTLR4 to generate a parser that does the following (ignore everything after the ;):

    int #i ;    defines an int
    int #j ;    see how I have to go to a new line for another statement?

My parser is the following:

    compilationunit:
        (statement END?)*
        statement END?
        EOF
    ;

    statement:
        intdef |
        WS
    ;

    // 10 - 1f block.

    intdef:
        'intdef' IDENTIFIER
    ;

    // Lexer.

    IDENTIFIER: '#' LETTER LETTERORDIGIT*;
    fragment LETTER: [a-zA-Z_];
    fragment LETTERORDIGIT: [a-zA-Z0-9$_];

    // Whitespace, fragments and terminals.

    WS: [ \t\r\n\u000c]+ -> skip;
    //COMMENT: '/*' .*? '*/' -> channel(HIDDEN);
    END: (';' ~[\r\n]*) | '\n';

In essence, every time I have a statement, I need to require a newline before another one is entered. I don't care if there are 3 new lines and on the second one a bunch of tabs persist, as long as there's a new line.

The issue is that the ANTLR4 parse tree seems to be giving me errors for inputs such as:

. 

(pretend the dot isn't there; there is literally no input)

    int #i int #j

Whoops, we got 2 on the same line!

Any ideas on how I can achieve this? I appreciate the help.

I've simplified the grammar a bit and made it require an end-of-line sequence after each statement in order to parse correctly.

    grammar testnl;

    program: (statement )* EOF ;

    statement: 'int' IDENTIFIER EOL;

    IDENTIFIER: '#' LETTER LETTERORDIGIT*;
    fragment LETTER: [a-zA-Z_];
    fragment LETTERORDIGIT: [a-zA-Z0-9$_];

    EOL: ';' .*? '\r\n'
       | ';' .*? '\n'
       ;

    WS: [ \t\r\n\u000c]+ -> skip;

It parses

    int #i ;
    int #j;

into these tokens:

    [@0,0:2='int',<'int'>,1:0]
    [@1,4:5='#i',<IDENTIFIER>,1:4]
    [@2,7:9=';\r\n',<EOL>,1:7]
    [@3,10:12='int',<'int'>,2:0]
    [@4,14:15='#j',<IDENTIFIER>,2:4]
    [@5,16:18=';\r\n',<EOL>,2:6]
    [@6,19:18='<EOF>',<EOF>,3:0]

It will also ignore anything after the semicolon, because that text becomes part of the EOL token:

    [@0,0:2='int',<'int'>,1:0]
    [@1,4:5='#i',<IDENTIFIER>,1:4]
    [@2,7:20='; ignore this\n',<EOL>,1:7]
    [@3,21:23='int',<'int'>,2:0]
    [@4,25:26='#j',<IDENTIFIER>,2:4]
    [@5,27:28=';\n',<EOL>,2:6]
    [@6,29:28='<EOF>',<EOF>,3:0]

Using either a linefeed or a carriage return-linefeed works fine. Is this what you're looking for?
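To make the token behaviour above easier to experiment with, here is a rough Python sketch that mimics the lexer rules with regular expressions. It is only an approximation for illustration, not ANTLR's own matching algorithm. The names `TOKEN_SPEC`, `tokenize` and `is_valid_program` are made up for this example, and the `INT` label stands in for the `'int'` literal.

```python
import re

# Regex stand-ins for the lexer rules; an approximation, not ANTLR itself.
TOKEN_SPEC = [
    ("INT", r"int\b"),                          # the 'int' literal
    ("IDENTIFIER", r"#[a-zA-Z_][a-zA-Z0-9$_]*"),
    ("EOL", r";[^\n]*\n"),                      # ';', then everything up to and including the newline
    ("WS", r"[ \t\r\n\f]+"),                    # dropped, like "-> skip"
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (type, text) pairs, dropping whitespace; raise on unmatched input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def is_valid_program(text):
    """Check the token stream against: program: (INT IDENTIFIER EOL)* EOF."""
    try:
        kinds = [kind for kind, _ in tokenize(text)]
    except ValueError:
        return False
    if len(kinds) % 3 != 0:
        return False
    return all(
        kinds[i:i + 3] == ["INT", "IDENTIFIER", "EOL"]
        for i in range(0, len(kinds), 3)
    )
```

Calling `tokenize("int #i ; ignore this\nint #j;\n")` yields the same six non-EOF tokens as the dump above, with `'; ignore this\n'` as a single EOL token, while `is_valid_program("int #i int #j\n")` returns `False` because no EOL separates the two statements.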

Edit

Per the OP's comment, I made a small change to allow consecutive end-of-line sequences in the EOL token, and moved the EOL token out of statement to reduce repetition:

    grammar testnl;

    program: ( statement EOL )* EOF ;

    statement: 'int' IDENTIFIER;

    IDENTIFIER: '#' LETTER LETTERORDIGIT*;
    fragment LETTER: [a-zA-Z_];
    fragment LETTERORDIGIT: [a-zA-Z0-9$_];

    EOL: ';' .*? ('\r\n')+
       | ';' .*? ('\n')+
       ;

    WS: [ \t\r\n\u000c]+ -> skip;
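As a quick way to see what the revised EOL rule is meant to absorb, the following Python regex is a rough stand-in for it. This is an assumption-laden approximation of the ANTLR rule, and `EOL_RE` and `eol_text` are hypothetical names used only here.

```python
import re

# Approximation of: EOL: ';' .*? ('\r\n')+ | ';' .*? ('\n')+ ;
# ';', the rest of the line, then one or more consecutive line endings.
EOL_RE = re.compile(r";[^\r\n]*(?:\r?\n)+")


def eol_text(text):
    """Return the text the EOL stand-in would consume at the start of `text`, or None."""
    match = EOL_RE.match(text)
    return match.group() if match else None
```

For example, `eol_text("; note\n\n\nint #j;\n")` returns `'; note\n\n\n'`, so several blank lines fold into one token. With `";\n\t\nint #j;\n"` only `';\n'` is returned, and the remaining tab and newline would be left for the skipped WS rule.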
