∞ /notes/linguistic-dsls-with-antlr | 2007-10-06 | release dsl fdg linguistics nlp
Cross-posted to: https://fsteeg.wordpress.com/2007/10/06/linguistic-dsls-with-antlr/
As an update to our Functional Grammar project, I've added some experimental ANTLR v3 grammar files for Functional Discourse Grammar (FDG) structures on the Interpersonal Level (IL) and the Representational Level (RL), and updated the project page (with some details in an updated version of our overview paper).

For example, from the RL grammar, ANTLR v3 can generate a parser in a variety of programming languages. That parser will parse linguistic descriptions on the RL, such as the following serial verb construction in Jamaican Creole (Im tek naif kot mi, 'He cut me with a knife'):

    (p1:[
      (Past e1:[
        (f1:tek[ (x1:im(x1))Ag (x2:naif(x2))Inst ](f1))
        (f2:kot[ (x1:im(x1))Ag (x3:mi(x3))Pat ](f2))
      ](e1))
    ](p1))

The grammar looks like this (naming is based on FDG terminology):
    pcontent   : '(' OPERATOR? 'p' X ( ':' head '(' 'p' X ')' )* ')' FUNCTION? ;
    soaffairs  : '(' OPERATOR? 'e' X ( ':' head '(' 'e' X ')' )* ')' FUNCTION? ;
    property   : '(' OPERATOR? 'f' X ( ':' head '(' 'f' X ')' )* ')' FUNCTION? ;
    individual : '(' OPERATOR? 'x' X ( ':' head '(' 'x' X ')' )* ')' FUNCTION? ;
    location   : '(' OPERATOR? 'l' X ( ':' head '(' 'l' X ')' )* ')' FUNCTION? ;
    time       : '(' OPERATOR? 't' X ( ':' head '(' 't' X ')' )* ')' FUNCTION? ;
    head       : LEMMA? ( '[' ( soaffairs | property | individual | location | time )* ']' )? ;

And with the same short grammar it is basically possible to parse all valid RL representations. It's the same for IL structures. Such an approach makes sense for two reasons, I believe: