This directory contains some examples illustrating techniques for extracting
high performance from flex scanners.  Each program implements a simplified
version of the Unix "wc" tool, which reads text from stdin and prints the
number of characters, words, and lines present in the text.  All programs
were compiled using gcc (version unavailable, sorry) with the -O flag, and
run on a SPARCstation 1+.  The input used was a PostScript file, mainly
containing figures, with the following "wc" counts:

        lines   words   characters
        214217  635954  2592172

The basic principles illustrated by these programs are:

        - match as much text with each rule as possible
        - adding rules does not slow you down!
        - avoid backing up

and the big caveat that comes with them is:

        - you buy performance with decreased maintainability; make
          sure you really need it before applying the above techniques.

See the "Performance Considerations" section of flexdoc for more
details regarding these principles.

The different versions of "wc":

        mywc.c  a simple but fairly efficient C version

        wc1.l   a naive flex "wc" implementation

        wc2.l   somewhat faster; adds rules to match multiple tokens at once

        wc3.l   faster still; adds more rules to match longer runs of tokens

        wc4.l   fastest; still more rules added; hard to do much better
                using flex (or, I suspect, hand-coding)

        wc5.l   identical to wc3.l except one rule has been slightly
                shortened, introducing backing-up

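For orientation, here is a minimal sketch of the kind of counting loop a
simple C "wc" needs.  It is not the contents of mywc.c: the names wc_counts
and wc_feed are hypothetical, and treating every character accepted by
isspace() as a word separator is an assumption made for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Running totals.  in_word remembers whether the previous character was
 * part of a word, so a word split across two chunks is counted once. */
struct wc_counts {
    unsigned long lines;
    unsigned long words;
    unsigned long chars;
    int in_word;
};

/* Hypothetical helper (not taken from mywc.c): add one chunk of text
 * to the running totals. */
void wc_feed(struct wc_counts *c, const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; ++i) {
        unsigned char ch = (unsigned char) buf[i];

        ++c->chars;
        if (ch == '\n')
            ++c->lines;

        if (isspace(ch)) {
            /* Assumption: any isspace() character ends a word. */
            c->in_word = 0;
        } else if (!c->in_word) {
            /* First character of a new word. */
            c->in_word = 1;
            ++c->words;
        }
    }
}
```

A caller would zero-initialize a struct wc_counts, pass each chunk read
from stdin to wc_feed(), and print the three totals at end of input.
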
Timing results (all times in user CPU seconds):

        program   time    notes
        -------   ----    -----
        wc1       16.4    default flex table compression (= -Cem)
        wc1        6.7    -Cf compression option
        /bin/wc    5.8    Sun's standard "wc" tool
        mywc       4.6    simple but better C implementation!
        wc2        4.6    as good as C implementation; built using -Cf
        wc3        3.8    -Cf
        wc4        3.3    -Cf
        wc5        5.7    -Cf; ouch, backing up is expensive