| 1 | <?xml version="1.0" encoding="iso-8859-1"?> | 
|---|
| 2 | <!DOCTYPE chapter PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc"> | 
|---|
| 3 | <chapter id="parsing"> | 
|---|
| 4 | <chapterinfo> | 
|---|
| 5 | <author> | 
|---|
| 6 | <firstname>Chris</firstname><surname>Hertel</surname> | 
|---|
| 7 | </author> | 
|---|
| 8 | <pubdate>November 1997</pubdate> | 
|---|
| 9 | </chapterinfo> | 
|---|
| 10 |  | 
|---|
| 11 | <title>The smb.conf file</title> | 
|---|
| 12 |  | 
|---|
| 13 | <sect1> | 
|---|
| 14 | <title>Lexical Analysis</title> | 
|---|
| 15 |  | 
|---|
| 16 | <para> | 
|---|
| 17 | The file is processed line by line.  There are | 
|---|
| 18 | four types of lines that are recognized by the lexical analyzer | 
|---|
| 19 | (params.c): | 
|---|
| 20 | </para> | 
|---|
| 21 |  | 
|---|
| 22 | <orderedlist> | 
|---|
| 23 | <listitem><para> | 
|---|
| 24 | Blank lines - Lines containing only whitespace. | 
|---|
| 25 | </para></listitem> | 
|---|
| 26 | <listitem><para> | 
|---|
| 27 | Comment lines - Lines beginning with either a semi-colon or a | 
|---|
| 28 | pound sign (';' or '#'). | 
|---|
| 29 | </para></listitem> | 
|---|
| 30 | <listitem><para> | 
|---|
| 31 | Section header lines - Lines beginning with an open square bracket ('['). | 
|---|
| 32 | </para></listitem> | 
|---|
| 33 | <listitem><para> | 
|---|
| 34 | Parameter lines - Lines beginning with any other character. | 
|---|
| 35 | (The default line type.) | 
|---|
| 36 | </para></listitem> | 
|---|
| 37 | </orderedlist> | 
|---|
| 38 |  | 
|---|
| 39 | <para> | 
|---|
| 40 | The first two are handled exclusively by the lexical analyzer, which | 
|---|
| 41 | ignores them.  The latter two line types are scanned for: | 
|---|
| 42 | </para> | 
|---|
| 43 |  | 
|---|
| 44 | <orderedlist> | 
|---|
| 45 | <listitem><para> | 
|---|
| 46 | Section names | 
|---|
| 47 | </para></listitem> | 
|---|
| 48 | <listitem><para> | 
|---|
| 49 | Parameter names | 
|---|
| 50 | </para></listitem> | 
|---|
| 51 | <listitem><para> | 
|---|
| 52 | Parameter values | 
|---|
| 53 | </para></listitem> | 
|---|
| 54 | </orderedlist> | 
|---|
| 55 |  | 
|---|
| 56 | <para> | 
|---|
| 57 | These are the only tokens passed to the parameter loader | 
|---|
| 58 | (loadparm.c).  Parameter names and values are divided from one | 
|---|
| 59 | another by an equal sign: '='. | 
|---|
| 60 | </para> | 
|---|
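<para>
The four line types can be told apart by the first non-whitespace
character on the line.  The following Python sketch illustrates the
rule; it is not taken from params.c, and the name classify_line is
hypothetical.
</para>

```python
def classify_line(line):
    """Return 'blank', 'comment', 'section' or 'parameter' (hypothetical helper)."""
    stripped = line.strip()          # leading whitespace is scanned past
    if not stripped:
        return "blank"               # only whitespace on the line
    if stripped[0] in ";#":
        return "comment"             # semi-colon or pound sign
    if stripped[0] == "[":
        return "section"             # open square bracket
    return "parameter"               # the default line type


for sample in ["   ", "; note", "# note", "[global]", "  path = /tmp"]:
    print(repr(sample), classify_line(sample))
```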
| 61 |  | 
|---|
| 62 | <sect2> | 
|---|
| 63 | <title>Handling of Whitespace</title> | 
|---|
| 64 |  | 
|---|
| 65 | <para> | 
|---|
| 66 | Whitespace is defined as all characters recognized by the isspace() | 
|---|
| 67 | function (see ctype(3C)) except for the newline character ('\n'). | 
|---|
| 68 | The newline is excluded because it identifies the end of the line. | 
|---|
| 69 | </para> | 
|---|
| 70 |  | 
|---|
| 71 | <orderedlist> | 
|---|
| 72 | <listitem><para> | 
|---|
| 73 | The lexical analyzer scans past whitespace at the beginning of a line. | 
|---|
| 74 | </para></listitem> | 
|---|
| 75 |  | 
|---|
| 76 | <listitem><para> | 
|---|
| 77 | Section and parameter names may contain internal whitespace.  All | 
|---|
| 78 | whitespace within a name is compressed to a single space character. | 
|---|
| 79 | </para></listitem> | 
|---|
| 80 |  | 
|---|
| 81 | <listitem><para> | 
|---|
| 82 | Internal whitespace within a parameter value is kept verbatim with | 
|---|
| 83 | the exception of carriage return characters ('\r'), all of which | 
|---|
| 84 | are removed. | 
|---|
| 85 | </para></listitem> | 
|---|
| 86 |  | 
|---|
| 87 | <listitem><para> | 
|---|
| 88 | Leading and trailing whitespace is removed from names and values. | 
|---|
| 89 | </para></listitem> | 
|---|
| 90 |  | 
|---|
| 91 | </orderedlist> | 
|---|
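<para>
The whitespace rules can be sketched in Python as follows.  This is an
illustration only, not the params.c code: normalize_name and clean_value
are hypothetical names, and the whitespace set assumes the characters
that isspace() accepts in the C locale, minus the newline.
</para>

```python
import re

# Assumption: C-locale isspace() characters, excluding the newline.
WHITESPACE = " \t\r\v\f"


def normalize_name(raw):
    """Trim a section or parameter name and compress internal whitespace runs."""
    return re.sub(r"[ \t\r\v\f]+", " ", raw.strip(WHITESPACE))


def clean_value(raw):
    """Trim a parameter value and drop carriage returns; other internal
    whitespace is kept verbatim."""
    return raw.replace("\r", "").strip(WHITESPACE)


print(repr(normalize_name("  param \t  name ")))
print(repr(clean_value("  some   value\r ")))
```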
| 92 |  | 
|---|
| 93 | </sect2> | 
|---|
| 94 |  | 
|---|
| 95 | <sect2> | 
|---|
| 96 | <title>Handling of Line Continuation</title> | 
|---|
| 97 |  | 
|---|
| 98 | <para> | 
|---|
| 99 | Long section header and parameter lines may be extended across | 
|---|
| 100 | multiple lines by use of the backslash character ('\\').  Line | 
|---|
| 101 | continuation is ignored for blank and comment lines. | 
|---|
| 102 | </para> | 
|---|
| 103 |  | 
|---|
| 104 | <para> | 
|---|
| 105 | If the last (non-whitespace) character within a section header or on | 
|---|
| 106 | a parameter line is a backslash, then the next line will be | 
|---|
| 107 | (logically) concatenated with the current line by the lexical | 
|---|
| 108 | analyzer.  For example: | 
|---|
| 109 | </para> | 
|---|
| 110 |  | 
|---|
| 111 | <para><programlisting> | 
|---|
| 112 | param name = parameter value string \ | 
|---|
| 113 |     with line continuation. | 
|---|
| 114 | </programlisting></para> | 
|---|
| 115 |  | 
|---|
| 116 | <para>This would be read as:</para> | 
|---|
| 117 |  | 
|---|
| 118 | <para><programlisting> | 
|---|
| 119 | param name = parameter value string     with line continuation. | 
|---|
| 120 | </programlisting></para> | 
|---|
| 121 |  | 
|---|
| 122 | <para> | 
|---|
| 123 | Note that there are five spaces following the word 'string', | 
|---|
| 124 | representing the one space between 'string' and '\\' in the top | 
|---|
| 125 | line, plus the four preceding the word 'with' in the second line. | 
|---|
| 126 | (Yes, I'm counting the indentation.) | 
|---|
| 127 | </para> | 
|---|
| 128 |  | 
|---|
| 129 | <para> | 
|---|
| 130 | Line continuation characters are ignored on blank lines and at the end | 
|---|
| 131 | of comments.  They are *only* recognized within section and parameter | 
|---|
| 132 | lines. | 
|---|
| 133 | </para> | 
|---|
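<para>
A minimal Python sketch of the joining rule follows.  It is an
illustration under stated assumptions, not the params.c implementation:
join_continuations is a hypothetical name, a section header whose
closing bracket has already been seen is treated as not continuable
(the backslash is then no longer within the header), and cases the
text does not describe, such as a line holding only a backslash, are
not modeled.
</para>

```python
def join_continuations(lines):
    """Yield logical lines from physical lines (hypothetical helper)."""
    pending = ""  # text carried over from lines that ended in a backslash
    for line in lines:
        text = pending + line.rstrip("\n")
        pending = ""
        body = text.strip()
        if body[:1] in ("", ";", "#"):
            continuable = False  # blank or comment line: backslash is ignored
        elif body[:1] == "[":
            continuable = "]" not in body  # backslash must be inside the brackets
        else:
            continuable = True  # parameter line
        if continuable and body.endswith("\\"):
            # Drop the backslash and any whitespace after it;
            # whitespace before the backslash is kept.
            pending = text.rstrip()[:-1]
            continue
        yield text
    if pending:
        yield pending  # assumption: a dangling continuation at EOF is emitted as-is


physical = ["param name = parameter value string \\\n",
            "    with line continuation.\n"]
for logical in join_continuations(physical):
    print(repr(logical))
```

<para>
Because the indentation of the second line is not stripped before the
lines are joined, the joined value contains the extra spaces described
above.
</para>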
| 134 |  | 
|---|
| 135 | </sect2> | 
|---|
| 136 |  | 
|---|
| 137 | <sect2> | 
|---|
| 138 | <title>Line Continuation Quirks</title> | 
|---|
| 139 |  | 
|---|
| 140 | <para>Note the following example:</para> | 
|---|
| 141 |  | 
|---|
| 142 | <para><programlisting> | 
|---|
| 143 | param name = parameter value string \ | 
|---|
| 144 |     \ | 
|---|
| 145 |     with line continuation. | 
|---|
| 146 | </programlisting></para> | 
|---|
| 147 |  | 
|---|
| 148 | <para> | 
|---|
| 149 | The middle line is *not* parsed as a blank line because it is first | 
|---|
| 150 | concatenated with the top line.  The result is | 
|---|
| 151 | </para> | 
|---|
| 152 |  | 
|---|
| 153 | <para><programlisting> | 
|---|
| 154 | param name = parameter value string         with line continuation. | 
|---|
| 155 | </programlisting></para> | 
|---|
| 156 |  | 
|---|
| 157 | <para>The same is true for comment lines.</para> | 
|---|
| 158 |  | 
|---|
| 159 | <para><programlisting> | 
|---|
| 160 | param name = parameter value string \ | 
|---|
| 161 |     ; comment \ | 
|---|
| 162 |     with a comment. | 
|---|
| 163 | </programlisting></para> | 
|---|
| 164 |  | 
|---|
| 165 | <para>This becomes:</para> | 
|---|
| 166 |  | 
|---|
| 167 | <para><programlisting> | 
|---|
| 168 | param name = parameter value string     ; comment     with a comment. | 
|---|
| 169 | </programlisting></para> | 
|---|
| 170 |  | 
|---|
| 171 | <para> | 
|---|
| 172 | On a section header line, the closing bracket (']') is considered a | 
|---|
| 173 | terminating character, and the rest of the line is ignored.  The lines | 
|---|
| 174 | </para> | 
|---|
| 175 |  | 
|---|
| 176 | <para><programlisting> | 
|---|
| 177 | [ section   name ] garbage \ | 
|---|
| 178 | param  name  = value | 
|---|
| 179 | </programlisting></para> | 
|---|
| 180 |  | 
|---|
| 181 | <para>are read as</para> | 
|---|
| 182 |  | 
|---|
| 183 | <para><programlisting> | 
|---|
| 184 | [section name] | 
|---|
| 185 | param name = value | 
|---|
| 186 | </programlisting></para> | 
|---|
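<para>
The closing-bracket rule can be sketched in Python as follows.  This is
an illustration, not code from params.c: parse_section_header is a
hypothetical name, and raising an error when the bracket is never
closed is an assumption.
</para>

```python
def parse_section_header(line):
    """Return the normalized section name from a header line (hypothetical helper)."""
    start = line.index("[") + 1
    end = line.index("]", start)  # raises ValueError if the bracket is never closed
    # Everything after the closing bracket, including a trailing
    # backslash, is ignored.
    return " ".join(line[start:end].split())


print(parse_section_header("[ section   name ] garbage \\"))
```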
| 187 |  | 
|---|
| 188 | </sect2> | 
|---|
| 189 | </sect1> | 
|---|
| 190 |  | 
|---|
| 191 | <sect1> | 
|---|
| 192 | <title>Syntax</title> | 
|---|
| 193 |  | 
|---|
| 194 | <para>The syntax of the smb.conf file is as follows:</para> | 
|---|
| 195 |  | 
|---|
| 196 | <para><programlisting> | 
|---|
| 197 | &lt;file&gt;            :==  { &lt;section&gt; } EOF | 
|---|
| 198 | &lt;section&gt;         :==  &lt;section header&gt; { &lt;parameter line&gt; } | 
|---|
| 199 | &lt;section header&gt;  :==  '[' NAME ']' | 
|---|
| 200 | &lt;parameter line&gt;  :==  NAME '=' VALUE NL | 
|---|
| 201 | </programlisting></para> | 
|---|
| 202 |  | 
|---|
| 203 | <para>This means that:</para> | 
|---|
| 204 |  | 
|---|
| 205 | <orderedlist> | 
|---|
| 206 | <listitem><para> | 
|---|
| 207 | A file is made up of zero or more sections, and is terminated by | 
|---|
| 208 | an EOF (we knew that). | 
|---|
| 209 | </para></listitem> | 
|---|
| 210 |  | 
|---|
| 211 | <listitem><para> | 
|---|
| 212 | A section is made up of a section header followed by zero or more | 
|---|
| 213 | parameter lines. | 
|---|
| 214 | </para></listitem> | 
|---|
| 215 |  | 
|---|
| 216 | <listitem><para> | 
|---|
| 217 | A section header is identified by an opening bracket and | 
|---|
| 218 | terminated by the closing bracket.  The enclosed NAME identifies | 
|---|
| 219 | the section. | 
|---|
| 220 | </para></listitem> | 
|---|
| 221 |  | 
|---|
| 222 | <listitem><para> | 
|---|
| 223 | A parameter line is divided into a NAME and a VALUE.  The *first* | 
|---|
| 224 | equal sign on the line separates the NAME from the VALUE.  The | 
|---|
| 225 | VALUE is terminated by a newline character (NL = '\n'). | 
|---|
| 226 | </para></listitem> | 
|---|
| 227 |  | 
|---|
| 228 | </orderedlist> | 
|---|
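<para>
As a rough illustration of the grammar, the following Python sketch
turns a configuration text into a list of sections.  It is not the
params.c parser.  The name parse_smb_conf is hypothetical, line
continuation is assumed to have been resolved already, and the errors
raised for a parameter line without an equal sign or before any section
header simply follow the grammar literally; how the real code reacts to
such input is not covered here.
</para>

```python
WHITESPACE = " \t\r\v\f"  # assumption: C-locale isspace() minus the newline


def parse_smb_conf(text):
    """Return [(section_name, [(param_name, value), ...]), ...] (hypothetical helper)."""
    sections = []
    for lineno, raw in enumerate(text.split("\n"), start=1):
        line = raw.strip(WHITESPACE)
        if not line or line[0] in ";#":
            continue  # blank and comment lines produce no tokens
        if line[0] == "[":
            end = line.find("]")
            if end == -1:
                raise ValueError(f"line {lineno}: missing ']'")
            sections.append((" ".join(line[1:end].split()), []))
            continue
        name, sep, value = line.partition("=")  # split at the first '=' only
        if not sep:
            raise ValueError(f"line {lineno}: no '=' on parameter line")
        if not sections:
            raise ValueError(f"line {lineno}: parameter before any section header")
        value = value.replace("\r", "").strip(WHITESPACE)
        sections[-1][1].append((" ".join(name.split()), value))
    return sections


sample = "[ global ]\n  workgroup = MY GROUP\n; a comment\n\n[share one]\npath = /a=b\n"
for section, params in parse_smb_conf(sample):
    print(section, params)
```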
| 229 |  | 
|---|
| 230 | <sect2> | 
|---|
| 231 | <title>About params.c</title> | 
|---|
| 232 |  | 
|---|
| 233 | <para> | 
|---|
| 234 | The parsing of the config file is a bit unusual if you are used to | 
|---|
| 235 | lex, yacc, bison, etc.  Both lexical analysis (scanning) and parsing | 
|---|
| 236 | are performed by params.c.  Values are loaded via callbacks to | 
|---|
| 237 | loadparm.c. | 
|---|
| 238 | </para> | 
|---|
| 239 | </sect2> | 
|---|
| 240 | </sect1> | 
|---|
| 241 | </chapter> | 
|---|