Deriving Lex In SDF

XT -- A Bundle of Program Transformation Tools
-----------------------------------------------------------------------------

RECOVERY OF SYNTAX DEFINITION FOR LEX

-----------------------------------------------------------------------------

No syntax definition for LEX was available in the grammar base. In
order to further automate the translation of LEX/YACC grammars to SDF2
syntax definitions, such a syntax definition is needed. In this file I
report the steps I took to get LEX in SDF2.

-- Eelco Visser 2001/09/29

-----------------------------------------------------------------------------

[Step 1] Locate the sources

   ftp://ftp.gnu.org/non-gnu/flex/

[Step 2] Inspect the source

   cd flex-2.5.4
   less parse.y 

[Step 3] Copy source to grammar base

   cp parse.y ~/res/XT/gb/grammars/lex.0

[Step 3b] Parse the YACC (Bison) source

>    parse -l yacc -i lex.y -I -o lex.af 

=>    Error: charliteral '\n' not recognized. Repair the syntax definition
   of YACC.

[Step 4] Translate to AbstractSDF

>   parse -l yacc -i lex.y -I -o lex.af 
   yacc2sdf -i lex.af -o lex.asdf 

[Step 5] Pretty-print syntax definition and inspect

>   sdf-bracket -i lex.asdf | pp -a -l sdf -o lex.def -v 2.1 
   less lex.def    

[Step 6] Regularize the syntax definition

>   parse -l yacc -i lex.y -I -o lex.af 
   yacc2sdf -i lex.af -o lex.asdf 
   sdf-regularize -i lex.asdf -o lex.reg.asdf 
   sdf-bracket -i lex.reg.asdf | pp -a -l sdf -o lex.def -v 2.1 
   less lex.def    

[Step 7] Generate constructors

>   parse -l yacc -i lex.y -I -o lex.af 
   yacc2sdf -i lex.af -o lex.asdf 
   sdf-regularize -i lex.asdf -o lex.reg.asdf 
   sdf-cons -i lex.reg.asdf -o lex.reg.cons.asdf
   sdf-bracket -i lex.reg.cons.asdf | pp -a -l sdf -o lex.def -v 2.1 
   less lex.def   
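
   Steps 4 to 7 repeat the same growing chain of commands. As a convenience,
   the chain of Step 7 could be wrapped in a shell function. This is only a
   sketch: the function name derive_def and its argument are hypothetical,
   and it assumes that the tools are on the PATH and accept exactly the
   options shown in the steps above.

```shell
#!/bin/sh
# Hypothetical helper (not part of XT): run the Step 7 chain for one grammar.
# Assumes parse, yacc2sdf, sdf-regularize, sdf-cons, sdf-bracket and pp are
# on the PATH and take the options used in Steps 4 to 7.
derive_def() {
    base=$1   # grammar name without extension, e.g. "lex" for lex.y

    # Parse the YACC source, then translate it to AbstractSDF.
    parse -l yacc -i "$base.y" -I -o "$base.af" || return 1
    yacc2sdf -i "$base.af" -o "$base.asdf" || return 1

    # Regularize the definition and generate constructors.
    sdf-regularize -i "$base.asdf" -o "$base.reg.asdf" || return 1
    sdf-cons -i "$base.reg.asdf" -o "$base.reg.cons.asdf" || return 1

    # Pretty-print the result as an SDF definition.
    sdf-bracket -i "$base.reg.cons.asdf" | pp -a -l sdf -o "$base.def" -v 2.1
}
```

   With this in place, derive_def lex followed by less lex.def would
   reproduce Step 7.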

[Step 8] Edit the definition to define lexical syntax and improve constructors

[Step 9] Unpack the definition to create separate SDF modules. The make check
   target can then be used to generate lex.def and automatically parse various
   example files. Before doing this, change the names of the modules Lexical
   and Generated to Lex-Symbols and Lex, respectively.

>   unpack-sdf lex.def

[Step 10] Further edit the modules to define lexical syntax.

=>    It turns out to be rather hard to parse complete lex files. The lexical
   syntax is very tricky. Instead I reduce the problem by editing lex files
   by hand, removing everything irrelevant and leaving only definitions
   of the form |name re| and rules of the form |re return id;|. Newlines
   cannot be used as general layout, since they delimit definitions
   and rules. No superfluous newlines are allowed.

=>    Succeed in parsing stratego.mod.l, a modified version of the Stratego
   lexical syntax in lex.
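
   The rule in Step 10 that no superfluous newlines are allowed is easy to
   violate when editing by hand. A small check could be sketched as follows,
   assuming that a superfluous newline shows up as an empty or
   whitespace-only line. The function name blank_lines is hypothetical and
   not part of XT.

```shell
#!/bin/sh
# Hypothetical check (not part of XT): print the line numbers of empty or
# whitespace-only lines in a hand-reduced lex file.
# Assumption: such lines are the "superfluous newlines" of Step 10.
blank_lines() {
    # grep -n prefixes each match with "N:"; cut keeps only the number.
    grep -n '^[[:space:]]*$' "$1" | cut -d: -f1
}
```

   Running blank_lines on the reduced file prints the numbers of the
   offending lines, or nothing if the file contains none.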

[Step 11] Improve the syntax definition to get good abstract syntax. Start with
   unfolding literals.

>   parse -l sdf -v 2.1 -I -i lex.def -o lex.adef
   unfold-literal -i lex.adef -o lex.unf.adef
   sdf-bracket -i lex.unf.adef | pp -a -l sdf -o lex.def -v 2.1 
   less lex.def   

=>    Automatic application does not work, so I do it manually. (Come back
   and repair unfold-literal later.)

[Step 12] Abstract syntax looks good. Install the parse table so that it can be
   used with the parse tool of the grammar base.

>    make install
   parse -l lex -i data/stratego.mod.l -I


[Step 13] Project finished. 

=>    Future work: 
   - parse full lex definitions
   - generate a signature from the syntax definition

-- EelcoVisser - 29 Sep 2001