source: trunk/doc/html/qregexp.html@ 203

Last change on this file since 203 was 190, checked in by rudi, 14 years ago

reference documentation added

1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
2<!-- /home/espenr/tmp/qt-3.3.8-espenr-2499/qt-x11-free-3.3.8/src/tools/qregexp.cpp:77 -->
3<html>
4<head>
5<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
6<title>QRegExp Class</title>
7<style type="text/css"><!--
8fn { margin-left: 1cm; text-indent: -1cm; }
9a:link { color: #004faf; text-decoration: none }
10a:visited { color: #672967; text-decoration: none }
11body { background: #ffffff; color: black; }
12--></style>
13</head>
14<body>
15
16<table border="0" cellpadding="0" cellspacing="0" width="100%">
17<tr bgcolor="#E5E5E5">
18<td valign=center>
19 <a href="index.html">
20<font color="#004faf">Home</font></a>
21 | <a href="classes.html">
22<font color="#004faf">All&nbsp;Classes</font></a>
23 | <a href="mainclasses.html">
24<font color="#004faf">Main&nbsp;Classes</font></a>
25 | <a href="annotated.html">
26<font color="#004faf">Annotated</font></a>
27 | <a href="groups.html">
28<font color="#004faf">Grouped&nbsp;Classes</font></a>
29 | <a href="functions.html">
30<font color="#004faf">Functions</font></a>
31</td>
32<td align="right" valign="center"><img src="logo32.png" align="right" width="64" height="32" border="0"></td></tr></table><h1 align=center>QRegExp Class Reference</h1>
33
34<p>The QRegExp class provides pattern matching using regular expressions.
35<a href="#details">More...</a>
36<p>All the functions in this class are <a href="threads.html#reentrant">reentrant</a> when Qt is built with thread support.</p>
37<p><tt>#include &lt;<a href="qregexp-h.html">qregexp.h</a>&gt;</tt>
38<p><a href="qregexp-members.html">List of all member functions.</a>
39<h2>Public Members</h2>
40<ul>
41<li class=fn>enum <a href="#CaretMode-enum"><b>CaretMode</b></a> { CaretAtZero, CaretAtOffset, CaretWontMatch }</li>
42<li class=fn><a href="#QRegExp"><b>QRegExp</b></a> ()</li>
43<li class=fn><a href="#QRegExp-2"><b>QRegExp</b></a> ( const&nbsp;QString&nbsp;&amp;&nbsp;pattern, bool&nbsp;caseSensitive = TRUE, bool&nbsp;wildcard = FALSE )</li>
44<li class=fn><a href="#QRegExp-3"><b>QRegExp</b></a> ( const&nbsp;QRegExp&nbsp;&amp;&nbsp;rx )</li>
45<li class=fn><a href="#~QRegExp"><b>~QRegExp</b></a> ()</li>
46<li class=fn>QRegExp &amp; <a href="#operator-eq"><b>operator=</b></a> ( const&nbsp;QRegExp&nbsp;&amp;&nbsp;rx )</li>
47<li class=fn>bool <a href="#operator-eq-eq"><b>operator==</b></a> ( const&nbsp;QRegExp&nbsp;&amp;&nbsp;rx ) const</li>
48<li class=fn>bool <a href="#operator!-eq"><b>operator!=</b></a> ( const&nbsp;QRegExp&nbsp;&amp;&nbsp;rx ) const</li>
49<li class=fn>bool <a href="#isEmpty"><b>isEmpty</b></a> () const</li>
50<li class=fn>bool <a href="#isValid"><b>isValid</b></a> () const</li>
51<li class=fn>QString <a href="#pattern"><b>pattern</b></a> () const</li>
52<li class=fn>void <a href="#setPattern"><b>setPattern</b></a> ( const&nbsp;QString&nbsp;&amp;&nbsp;pattern )</li>
53<li class=fn>bool <a href="#caseSensitive"><b>caseSensitive</b></a> () const</li>
54<li class=fn>void <a href="#setCaseSensitive"><b>setCaseSensitive</b></a> ( bool&nbsp;sensitive )</li>
55<li class=fn>bool <a href="#wildcard"><b>wildcard</b></a> () const</li>
56<li class=fn>void <a href="#setWildcard"><b>setWildcard</b></a> ( bool&nbsp;wildcard )</li>
57<li class=fn>bool <a href="#minimal"><b>minimal</b></a> () const</li>
58<li class=fn>void <a href="#setMinimal"><b>setMinimal</b></a> ( bool&nbsp;minimal )</li>
59<li class=fn>bool <a href="#exactMatch"><b>exactMatch</b></a> ( const&nbsp;QString&nbsp;&amp;&nbsp;str ) const</li>
60<li class=fn>int match ( const&nbsp;QString&nbsp;&amp;&nbsp;str, int&nbsp;index = 0, int&nbsp;*&nbsp;len = 0, bool&nbsp;indexIsStart = TRUE ) const &nbsp;<em>(obsolete)</em></li>
61<li class=fn>int <a href="#search"><b>search</b></a> ( const&nbsp;QString&nbsp;&amp;&nbsp;str, int&nbsp;offset = 0, CaretMode&nbsp;caretMode = CaretAtZero ) const</li>
62<li class=fn>int <a href="#searchRev"><b>searchRev</b></a> ( const&nbsp;QString&nbsp;&amp;&nbsp;str, int&nbsp;offset = -1, CaretMode&nbsp;caretMode = CaretAtZero ) const</li>
63<li class=fn>int <a href="#matchedLength"><b>matchedLength</b></a> () const</li>
64<li class=fn>int <a href="#numCaptures"><b>numCaptures</b></a> () const</li>
65<li class=fn>QStringList <a href="#capturedTexts"><b>capturedTexts</b></a> ()</li>
66<li class=fn>QString <a href="#cap"><b>cap</b></a> ( int&nbsp;nth = 0 )</li>
67<li class=fn>int <a href="#pos"><b>pos</b></a> ( int&nbsp;nth = 0 )</li>
68<li class=fn>QString <a href="#errorString"><b>errorString</b></a> ()</li>
69</ul>
70<h2>Static Public Members</h2>
71<ul>
72<li class=fn>QString <a href="#escape"><b>escape</b></a> ( const&nbsp;QString&nbsp;&amp;&nbsp;str )</li>
73</ul>
74<hr><a name="details"></a><h2>Detailed Description</h2>
75
76
77
78The QRegExp class provides pattern matching using regular expressions.
79<p>
80
81
82
83<!-- index regular expression --><a name="regular-expression"></a>
84<p> Regular expressions, or "regexps", provide a way to find patterns
85within text. This is useful in many contexts, for example:
86<p> <center><table cellpadding="4" cellspacing="2" border="0">
87<tr bgcolor="#f0f0f0"> <td valign="top">Validation
88<td valign="top">A regexp can be used to check whether a piece of text
89meets some criteria, e.g. is an integer or contains no
90whitespace.
91<tr bgcolor="#d0d0d0"> <td valign="top">Searching
92<td valign="top">Regexps provide a much more powerful means of searching
93text than simple string matching does. For example we can
94create a regexp which says "find one of the words 'mail',
95'letter' or 'correspondence' but not any of the words
'email', 'mailman', 'mailer', 'letterbox' etc."
97<tr bgcolor="#f0f0f0"> <td valign="top">Search and Replace
98<td valign="top">A regexp can be used to replace a pattern with a piece of
99text, for example replace all occurrences of '&' with
100'&amp;amp;' except where the '&' is already followed by 'amp;'.
101<tr bgcolor="#d0d0d0"> <td valign="top">String Splitting
102<td valign="top">A regexp can be used to identify where a string should be
103split into its component fields, e.g. splitting tab-delimited
104strings.
105</table></center>
106<p> We present a very brief introduction to regexps, a description of
107Qt's regexp language, some code examples, and finally the function
108documentation itself. QRegExp is modeled on Perl's regexp
109language, and also fully supports Unicode. QRegExp can also be
110used in the weaker 'wildcard' (globbing) mode which works in a
111similar way to command shells. A good text on regexps is <em>Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools</em> by Jeffrey E. Friedl, ISBN 1565922573.
112<p> Experienced regexp users may prefer to skip the introduction and
113go directly to the relevant information.
114<p> In case of multi-threaded programming, note that QRegExp depends on
115<a href="qthreadstorage.html">QThreadStorage</a> internally. For that reason, QRegExp should only be
116used with threads started with <a href="qthread.html">QThread</a>, i.e. not with threads
117started with platform-specific APIs.
118<p> <!-- toc -->
119<ul>
120<li><a href="#1"> Introduction
121</a>
122<li><a href="#1-1"> Characters and Abbreviations for Sets of Characters
123</a>
124<li><a href="#1-2"> Sets of Characters
125</a>
126<li><a href="#1-3"> Quantifiers
127</a>
128<li><a href="#1-4"> Capturing Text
129</a>
130<li><a href="#1-5"> Assertions
131</a>
132<li><a href="#1-6"> Wildcard Matching (globbing)
133</a>
134<li><a href="#1-7"> Notes for Perl Users
135</a>
136<li><a href="#1-8"> Code Examples
137</a>
138</ul>
139<!-- endtoc -->
140
141<p> <h3> Introduction
142</h3>
143<a name="1"></a><p> Regexps are built up from expressions, quantifiers, and assertions.
144The simplest form of expression is simply a character, e.g.
145<b>x</b> or <b>5</b>. An expression can also be a set of
146characters. For example, <b>[ABCD]</b>, will match an <b>A</b> or
147a <b>B</b> or a <b>C</b> or a <b>D</b>. As a shorthand we could
148write this as <b>[A-D]</b>. If we want to match any of the
capital letters in the English alphabet we can write
150<b>[A-Z]</b>. A quantifier tells the regexp engine how many
151occurrences of the expression we want, e.g. <b>x{1,1}</b> means
152match an <b>x</b> which occurs at least once and at most once.
153We'll look at assertions and more complex expressions later.
154<p> Note that in general regexps cannot be used to check for balanced
155brackets or tags. For example if you want to match an opening html
156<tt>&lt;b&gt;</tt> and its closing <tt>&lt;/b&gt;</tt> you can only use a regexp if you
157know that these tags are not nested; the html fragment, <tt>&lt;b&gt;bold &lt;b&gt;bolder&lt;/b&gt;&lt;/b&gt;</tt> will not match as expected. If you know the
158maximum level of nesting it is possible to create a regexp that
159will match correctly, but for an unknown level of nesting, regexps
160will fail.
161<p> We'll start by writing a regexp to match integers in the range 0
162to 99. We will require at least one digit so we will start with
163<b>[0-9]{1,1}</b> which means match a digit exactly once. This
164regexp alone will match integers in the range 0 to 9. To match one
165or two digits we can increase the maximum number of occurrences so
166the regexp becomes <b>[0-9]{1,2}</b> meaning match a digit at
167least once and at most twice. However, this regexp as it stands
168will not match correctly. This regexp will match one or two digits
169<em>within</em> a string. To ensure that we match against the whole
170string we must use the anchor assertions. We need <b>^</b> (caret)
171which when it is the first character in the regexp means that the
172regexp must match from the beginning of the string. And we also
173need <b>$</b> (dollar) which when it is the last character in the
174regexp means that the regexp must match until the end of the
175string. So now our regexp is <b>^[0-9]{1,2}$</b>. Note that
176assertions, such as <b>^</b> and <b>$</b>, do not match any
177characters.
178<p> If you've seen regexps elsewhere they may have looked different from
179the ones above. This is because some sets of characters and some
180quantifiers are so common that they have special symbols to
181represent them. <b>[0-9]</b> can be replaced with the symbol
182<b>\d</b>. The quantifier to match exactly one occurrence,
183<b>{1,1}</b>, can be replaced with the expression itself. This means
184that <b>x{1,1}</b> is exactly the same as <b>x</b> alone. So our 0
185to 99 matcher could be written <b>^\d{1,2}$</b>. Another way of
186writing it would be <b>^\d\d{0,1}$</b>, i.e. from the start of the
187string match a digit followed by zero or one digits. In practice
188most people would write it <b>^\d\d?$</b>. The <b>?</b> is a
189shorthand for the quantifier <b>{0,1}</b>, i.e. a minimum of no
occurrences and a maximum of one occurrence. This is used to make an
191expression optional. The regexp <b>^\d\d?$</b> means "from the
192beginning of the string match one digit followed by zero or one
193digits and then the end of the string".
194<p> Our second example is matching the words 'mail', 'letter' or
195'correspondence' but without matching 'email', 'mailman',
196'mailer', 'letterbox' etc. We'll start by just matching 'mail'. In
197full the regexp is, <b>m{1,1}a{1,1}i{1,1}l{1,1}</b>, but since
198each expression itself is automatically quantified by <b>{1,1}</b>
199we can simply write this as <b>mail</b>; an 'm' followed by an 'a'
200followed by an 'i' followed by an 'l'. The symbol '|' (bar) is
201used for <em>alternation</em>, so our regexp now becomes
202<b>mail|letter|correspondence</b> which means match 'mail' <em>or</em>
203'letter' <em>or</em> 'correspondence'. Whilst this regexp will find the
204words we want it will also find words we don't want such as
205'email'. We will start by putting our regexp in parentheses,
<b>(mail|letter|correspondence)</b>. Parentheses have two effects:
firstly they group expressions together, and secondly they identify
208parts of the regexp that we wish to <a href="#capturing-text">capture</a>. Our regexp still matches any of the three words but now
209they are grouped together as a unit. This is useful for building
210up more complex regexps. It is also useful because it allows us to
211examine which of the words actually matched. We need to use
212another assertion, this time <b>\b</b> "word boundary":
213<b>\b(mail|letter|correspondence)\b</b>. This regexp means "match
214a word boundary followed by the expression in parentheses followed
by another word boundary". The <b>\b</b> assertion matches at a <em>position</em> in the regexp, not a <em>character</em> in the regexp. A word
boundary is any non-word character such as a space, a newline, or
the beginning or end of the string.
218<p> For our third example we want to replace ampersands with the HTML
219entity '&amp;amp;'. The regexp to match is simple: <b>&amp;</b>, i.e.
220match one ampersand. Unfortunately this will mess up our text if
221some of the ampersands have already been turned into HTML
222entities. So what we really want to say is replace an ampersand
223providing it is not followed by 'amp;'. For this we need the
224negative lookahead assertion and our regexp becomes:
225<b>&amp;(?!amp;)</b>. The negative lookahead assertion is introduced
226with '(?!' and finishes at the ')'. It means that the text it
227contains, 'amp;' in our example, must <em>not</em> follow the expression
that precedes it.
229<p> Regexps provide a rich language that can be used in a variety of
230ways. For example suppose we want to count all the occurrences of
231'Eric' and 'Eirik' in a string. Two valid regexps to match these
232are <b>&#92;b(Eric|Eirik)&#92;b</b> and <b>&#92;bEi?ri[ck]&#92;b</b>. We need
233the word boundary '\b' so we don't get 'Ericsson' etc. The second
234regexp actually matches more than we want, 'Eric', 'Erik', 'Eiric'
235and 'Eirik'.
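<p> The counting idea above can be sketched as follows. This is only a
sketch: it uses the C++ standard library's <tt>std::regex</tt> as a stand-in
for QRegExp (an assumption, so that it compiles without Qt), and the function
names <tt>countStrict</tt> and <tt>countLoose</tt> are hypothetical. With
QRegExp the same count would typically come from calling
<a href="#search">search</a>() in a loop.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts non-overlapping matches of rx in text.
static std::size_t countMatches(const std::string &text, const std::regex &rx)
{
    auto first = std::sregex_iterator(text.begin(), text.end(), rx);
    auto last = std::sregex_iterator();
    return static_cast<std::size_t>(std::distance(first, last));
}

// Strict pattern: only the whole words 'Eric' and 'Eirik'.
std::size_t countStrict(const std::string &text)
{
    return countMatches(text, std::regex("\\b(Eric|Eirik)\\b"));
}

// Looser pattern: also accepts 'Erik' and 'Eiric'.
std::size_t countLoose(const std::string &text)
{
    return countMatches(text, std::regex("\\bEi?ri[ck]\\b"));
}
```

<p> The word boundaries keep 'Ericsson' out of both counts, while the looser
pattern additionally counts 'Erik'.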
<p> We will implement some of the examples above in the
237<a href="#code-examples">code examples</a> section.
238<p> <a name="characters-and-abbreviations-for-sets-of-characters"></a>
239<h3> Characters and Abbreviations for Sets of Characters
240</h3>
241<a name="1-1"></a><p> <center><table cellpadding="4" cellspacing="2" border="0">
242<tr bgcolor="#a2c511"> <th valign="top">Element <th valign="top">Meaning
243<tr bgcolor="#f0f0f0"> <td valign="top"><b>c</b>
244<td valign="top">Any character represents itself unless it has a special
245regexp meaning. Thus <b>c</b> matches the character <em>c</em>.
246<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;c</b>
247<td valign="top">A character that follows a backslash matches the character
248itself except where mentioned below. For example if you
249wished to match a literal caret at the beginning of a string
250you would write <b>&#92;^</b>.
251<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;a</b>
252<td valign="top">This matches the ASCII bell character (BEL, 0x07).
253<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;f</b>
254<td valign="top">This matches the ASCII form feed character (FF, 0x0C).
255<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;n</b>
256<td valign="top">This matches the ASCII line feed character (LF, 0x0A, Unix newline).
257<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;r</b>
258<td valign="top">This matches the ASCII carriage return character (CR, 0x0D).
259<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;t</b>
260<td valign="top">This matches the ASCII horizontal tab character (HT, 0x09).
261<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;v</b>
262<td valign="top">This matches the ASCII vertical tab character (VT, 0x0B).
263<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;xhhhh</b>
264<td valign="top">This matches the Unicode character corresponding to the
265hexadecimal number hhhh (between 0x0000 and 0xFFFF). &#92;0ooo
266(i.e., \zero ooo) matches the ASCII/Latin-1 character
267corresponding to the octal number ooo (between 0 and 0377).
268<tr bgcolor="#d0d0d0"> <td valign="top"><b>. (dot)</b>
269<td valign="top">This matches any character (including newline).
270<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;d</b>
271<td valign="top">This matches a digit (<a href="qchar.html#isDigit">QChar::isDigit</a>()).
272<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;D</b>
273<td valign="top">This matches a non-digit.
274<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;s</b>
275<td valign="top">This matches a whitespace (<a href="qchar.html#isSpace">QChar::isSpace</a>()).
276<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;S</b>
277<td valign="top">This matches a non-whitespace.
278<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;w</b>
279<td valign="top">This matches a word character (<a href="qchar.html#isLetterOrNumber">QChar::isLetterOrNumber</a>() or '_').
280<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;W</b>
281<td valign="top">This matches a non-word character.
282<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;n</b>
283<td valign="top">The n-th <a href="#capturing-text">backreference</a>,
284e.g. &#92;1, &#92;2, etc.
285</table></center>
286<p> <em>Note that the C++ compiler transforms backslashes in strings so to include a <b>&#92;</b> in a regexp you will need to enter it twice, i.e. <b>&#92;&#92;</b>.</em>
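<p> As an illustration of the doubled backslash, the sketch below shows that
the C++ literal <tt>"\\d+"</tt> holds only three characters once compiled. It
uses <tt>std::regex</tt> as a stand-in for QRegExp (an assumption, so that it
builds without Qt); <tt>digitsPattern</tt> and <tt>isAllDigits</tt> are
hypothetical names.

```cpp
#include <regex>
#include <string>

// In source code the backslash is written twice; the resulting string
// contains a single backslash followed by 'd' and '+'.
std::string digitsPattern()
{
    return "\\d+";
}

// True if s consists of one or more digits and nothing else.
bool isAllDigits(const std::string &s)
{
    return std::regex_match(s, std::regex(digitsPattern()));
}
```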
287<p> <a name="sets-of-characters"></a>
288<h3> Sets of Characters
289</h3>
290<a name="1-2"></a><p> Square brackets are used to match any character in the set of
291characters contained within the square brackets. All the character
292set abbreviations described above can be used within square
293brackets. Apart from the character set abbreviations and the
294following two exceptions no characters have special meanings in
295square brackets.
296<p> <center><table cellpadding="4" cellspacing="2" border="0">
297<tr bgcolor="#d0d0d0"> <td valign="top"><b>^</b>
298<td valign="top">The caret negates the character set if it occurs as the
299first character, i.e. immediately after the opening square
300bracket. For example, <b>[abc]</b> matches 'a' or 'b' or 'c',
301but <b>[^abc]</b> matches anything <em>except</em> 'a' or 'b' or
302'c'.
303<tr bgcolor="#f0f0f0"> <td valign="top"><b>-</b>
304<td valign="top">The dash is used to indicate a range of characters, for
305example <b>[W-Z]</b> matches 'W' or 'X' or 'Y' or 'Z'.
306</table></center>
307<p> Using the predefined character set abbreviations is more portable
308than using character ranges across platforms and languages. For
309example, <b>[0-9]</b> matches a digit in Western alphabets but
310<b>\d</b> matches a digit in <em>any</em> alphabet.
311<p> Note that in most regexp literature sets of characters are called
312"character classes".
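<p> The two special characters inside square brackets can be tried out with
a small helper. This is a sketch using <tt>std::regex</tt> as a stand-in for
QRegExp (an assumption, so that it compiles without Qt); <tt>matchesSet</tt>
is a hypothetical name.

```cpp
#include <regex>
#include <string>

// True if the single character c is matched by the character set
// written in 'set', e.g. "[abc]" or "[^abc]".
bool matchesSet(const std::string &set, char c)
{
    return std::regex_match(std::string(1, c), std::regex(set));
}
```

<p> For instance, <tt>matchesSet( "[^abc]", 'z' )</tt> is true and
<tt>matchesSet( "[W-Z]", 'V' )</tt> is false. A caret that is not the first
character of the set is an ordinary member, so
<tt>matchesSet( "[a^c]", '^' )</tt> is true.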
313<p> <a name="quantifiers"></a>
314<h3> Quantifiers
315</h3>
316<a name="1-3"></a><p> By default an expression is automatically quantified by
317<b>{1,1}</b>, i.e. it should occur exactly once. In the following
318list <b><em>E</em></b> stands for any expression. An expression is a
319character or an abbreviation for a set of characters or a set of
320characters in square brackets or any parenthesised expression.
321<p> <center><table cellpadding="4" cellspacing="2" border="0">
322<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>?</b>
323<td valign="top">Matches zero or one occurrence of <em>E</em>. This quantifier
324means "the previous expression is optional" since it will
325match whether or not the expression occurs in the string. It
326is the same as <b><em>E</em>{0,1}</b>. For example <b>dents?</b>
327will match 'dent' and 'dents'.
328<tr bgcolor="#f0f0f0"> <td valign="top"><b><em>E</em>+</b>
329<td valign="top">Matches one or more occurrences of <em>E</em>. This is the same
330as <b><em>E</em>{1,MAXINT}</b>. For example, <b>0+</b> will match
331'0', '00', '000', etc.
332<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>*</b>
333<td valign="top">Matches zero or more occurrences of <em>E</em>. This is the same
334as <b><em>E</em>{0,MAXINT}</b>. The <b>*</b> quantifier is often
used by mistake. Since it matches <em>zero</em> or more
336occurrences it will match no occurrences at all. For example
337if we want to match strings that end in whitespace and use
338the regexp <b>\s*$</b> we would get a match on every string.
339This is because we have said find zero or more whitespace
340followed by the end of string, so even strings that don't end
341in whitespace will match. The regexp we want in this case is
342<b>\s+$</b> to match strings that have at least one
343whitespace at the end.
344<tr bgcolor="#f0f0f0"> <td valign="top"><b><em>E</em>{n}</b>
345<td valign="top">Matches exactly <em>n</em> occurrences of the expression. This
346is the same as repeating the expression <em>n</em> times. For
347example, <b>x{5}</b> is the same as <b>xxxxx</b>. It is also
348the same as <b><em>E</em>{n,n}</b>, e.g. <b>x{5,5}</b>.
349<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>{n,}</b>
350<td valign="top">Matches at least <em>n</em> occurrences of the expression. This
351is the same as <b><em>E</em>{n,MAXINT}</b>.
352<tr bgcolor="#f0f0f0"> <td valign="top"><b><em>E</em>{,m}</b>
353<td valign="top">Matches at most <em>m</em> occurrences of the expression. This
354is the same as <b><em>E</em>{0,m}</b>.
355<tr bgcolor="#d0d0d0"> <td valign="top"><b><em>E</em>{n,m}</b>
356<td valign="top">Matches at least <em>n</em> occurrences of the expression and at
357most <em>m</em> occurrences of the expression.
358</table></center>
359<p> (MAXINT is implementation dependent but will not be smaller than
3601024.)
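<p> The <b>*</b> pitfall described in the table can be checked directly. The
sketch below uses <tt>std::regex</tt> as a stand-in for QRegExp (an
assumption, so that it compiles without Qt); <tt>found</tt> is a hypothetical
helper that plays the role of a successful <a href="#search">search</a>().

```cpp
#include <regex>
#include <string>

// True if pattern matches anywhere in text.
bool found(const std::string &text, const std::string &pattern)
{
    return std::regex_search(text, std::regex(pattern));
}
```

<p> Here <tt>found( "abc", "\\s*$" )</tt> is true even though the string has
no trailing whitespace, whereas <tt>found( "abc", "\\s+$" )</tt> is false and
<tt>found( "abc  ", "\\s+$" )</tt> is true.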
361<p> If we wish to apply a quantifier to more than just the preceding
362character we can use parentheses to group characters together in
363an expression. For example, <b>tag+</b> matches a 't' followed by
364an 'a' followed by at least one 'g', whereas <b>(tag)+</b> matches
365at least one occurrence of 'tag'.
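<p> The difference between <b>tag+</b> and <b>(tag)+</b> can be seen in a
short sketch. It uses <tt>std::regex</tt> as a stand-in for QRegExp (an
assumption, so that it compiles without Qt); <tt>wholeMatch</tt> is a
hypothetical helper comparable to <a href="#exactMatch">exactMatch</a>().

```cpp
#include <regex>
#include <string>

// True if pattern matches the whole of text.
bool wholeMatch(const std::string &text, const std::string &pattern)
{
    return std::regex_match(text, std::regex(pattern));
}
```

<p> With this helper, <tt>"tag+"</tt> accepts "taggg" but not "tagtag", while
<tt>"(tag)+"</tt> accepts "tagtag" but not "taggg".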
366<p> Note that quantifiers are "greedy". They will match as much text
367as they can. For example, <b>0+</b> will match as many zeros as it
368can from the first zero it finds, e.g. '2.<u>000</u>5'.
369Quantifiers can be made non-greedy, see <a href="#setMinimal">setMinimal</a>().
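<p> Greedy and non-greedy matching can be compared in a small sketch. It uses
<tt>std::regex</tt> as a stand-in for QRegExp (an assumption, so that it
compiles without Qt). Note one difference: <tt>std::regex</tt> marks a single
quantifier as non-greedy with a trailing <tt>?</tt>, whereas QRegExp switches
the whole pattern with setMinimal(). <tt>firstMatch</tt> is a hypothetical
helper.

```cpp
#include <regex>
#include <string>

// Returns the text of the first match of pattern in text,
// or an empty string if there is no match.
std::string firstMatch(const std::string &text, const std::string &pattern)
{
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern)))
        return m.str(0);
    return std::string();
}
```

<p> <tt>firstMatch( "2.0005", "0+" )</tt> yields "000" (greedy), while the
non-greedy form <tt>firstMatch( "2.0005", "0+?" )</tt> yields "0".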
370<p> <a name="capturing-text"></a>
371<h3> Capturing Text
372</h3>
373<a name="1-4"></a><p> Parentheses allow us to group elements together so that we can
374quantify and capture them. For example if we have the expression
375<b>mail|letter|correspondence</b> that matches a string we know
376that <em>one</em> of the words matched but not which one. Using
377parentheses allows us to "capture" whatever is matched within
378their bounds, so if we used <b>(mail|letter|correspondence)</b>
379and matched this regexp against the string "I sent you some email"
380we can use the <a href="#cap">cap</a>() or <a href="#capturedTexts">capturedTexts</a>() functions to extract the
381matched characters, in this case 'mail'.
382<p> We can use captured text within the regexp itself. To refer to the
383captured text we use <em>backreferences</em> which are indexed from 1,
384the same as for cap(). For example we could search for duplicate
385words in a string using <b>\b(\w+)\W+&#92;1\b</b> which means match a
386word boundary followed by one or more word characters followed by
387one or more non-word characters followed by the same text as the
388first parenthesised expression followed by a word boundary.
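<p> The duplicate-word pattern can be exercised as follows. This is a sketch
using <tt>std::regex</tt> as a stand-in for QRegExp (an assumption, so that
it compiles without Qt); <tt>repeatedWord</tt> is a hypothetical name. With
QRegExp the captured word would be read with <a href="#cap">cap</a>( 1 ).

```cpp
#include <regex>
#include <string>

// Returns the first word that is immediately repeated in text,
// or an empty string if no word is repeated.
std::string repeatedWord(const std::string &text)
{
    static const std::regex rx("\\b(\\w+)\\W+\\1\\b");
    std::smatch m;
    if (std::regex_search(text, m, rx))
        return m.str(1);    // text of the first capturing parentheses
    return std::string();
}
```

<p> The leading word boundary matters: without it, "this is fine" would be
reported as repeating 'is', because 'this' ends in 'is'.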
389<p> If we want to use parentheses purely for grouping and not for
390capturing we can use the non-capturing syntax, e.g.
391<b>(?:green|blue)</b>. Non-capturing parentheses begin '(?:' and
392end ')'. In this example we match either 'green' or 'blue' but we
393do not capture the match so we only know whether or not we matched
394but not which color we actually found. Using non-capturing
395parentheses is more efficient than using capturing parentheses
396since the regexp engine has to do less book-keeping.
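<p> The effect of non-capturing parentheses on the number of captures can be
observed in a short sketch. It uses <tt>std::regex</tt> as a stand-in for
QRegExp (an assumption, so that it compiles without Qt);
<tt>captureCount</tt> is a hypothetical helper comparable to
<a href="#numCaptures">numCaptures</a>().

```cpp
#include <regex>
#include <string>

// Number of capturing parentheses in pattern.
unsigned captureCount(const std::string &pattern)
{
    return std::regex(pattern).mark_count();
}
```

<p> <tt>captureCount( "(green|blue)" )</tt> is 1, whereas
<tt>captureCount( "(?:green|blue)" )</tt> is 0 although both patterns match
the same text.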
397<p> Both capturing and non-capturing parentheses may be nested.
398<p> <a name="assertions"></a>
399<h3> Assertions
400</h3>
401<a name="1-5"></a><p> Assertions make some statement about the text at the point where
402they occur in the regexp but they do not match any characters. In
403the following list <b><em>E</em></b> stands for any expression.
404<p> <center><table cellpadding="4" cellspacing="2" border="0">
405<tr bgcolor="#f0f0f0"> <td valign="top"><b>^</b>
406<td valign="top">The caret signifies the beginning of the string. If you
407wish to match a literal <tt>^</tt> you must escape it by
408writing <b>&#92;^</b>. For example, <b>^#include</b> will only
409match strings which <em>begin</em> with the characters '#include'.
410(When the caret is the first character of a character set it
411has a special meaning, see <a href="#sets-of-characters">Sets of
412 Characters</a>.)
413<tr bgcolor="#d0d0d0"> <td valign="top"><b>$</b>
414<td valign="top">The dollar signifies the end of the string. For example
415<b>\d\s*$</b> will match strings which end with a digit
416optionally followed by whitespace. If you wish to match a
417literal <tt>$</tt> you must escape it by writing
418<b>&#92;$</b>.
419<tr bgcolor="#f0f0f0"> <td valign="top"><b>&#92;b</b>
420<td valign="top">A word boundary. For example the regexp
421<b>&#92;bOK&#92;b</b> means match immediately after a word
422boundary (e.g. start of string or whitespace) the letter 'O'
423then the letter 'K' immediately before another word boundary
424(e.g. end of string or whitespace). But note that the
425assertion does not actually match any whitespace so if we
426write <b>(&#92;bOK&#92;b)</b> and we have a match it will only
427contain 'OK' even if the string is "Its <u>OK</u> now".
428<tr bgcolor="#d0d0d0"> <td valign="top"><b>&#92;B</b>
429<td valign="top">A non-word boundary. This assertion is true wherever
430<b>&#92;b</b> is false. For example if we searched for
431<b>&#92;Bon&#92;B</b> in "Left on" the match would fail (space
432and end of string aren't non-word boundaries), but it would
433match in "t<u>on</u>ne".
434<tr bgcolor="#f0f0f0"> <td valign="top"><b>(?=<em>E</em>)</b>
435<td valign="top">Positive lookahead. This assertion is true if the
436expression matches at this point in the regexp. For example,
437<b>const(?=&#92;s+char)</b> matches 'const' whenever it is
438followed by 'char', as in 'static <u>const</u> char *'.
439(Compare with <b>const&#92;s+char</b>, which matches 'static
440<u>const char</u> *'.)
441<tr bgcolor="#d0d0d0"> <td valign="top"><b>(?!<em>E</em>)</b>
442<td valign="top">Negative lookahead. This assertion is true if the
443expression does not match at this point in the regexp. For
444example, <b>const(?!&#92;s+char)</b> matches 'const' <em>except</em>
445when it is followed by 'char'.
446</table></center>
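<p> The lookahead rows of the table can be tried with a small helper. This is
a sketch using <tt>std::regex</tt> as a stand-in for QRegExp (an assumption,
so that it compiles without Qt); <tt>matchedText</tt> is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Returns the text of the first match of pattern in text,
// or an empty string if there is no match.
std::string matchedText(const std::string &text, const std::string &pattern)
{
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern)))
        return m.str(0);
    return std::string();
}
```

<p> For the string "static const char *", the pattern
<tt>"const(?=\\s+char)"</tt> gives "const" because the lookahead consumes no
text, <tt>"const\\s+char"</tt> gives "const char", and
<tt>"const(?!\\s+char)"</tt> gives no match.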
447<p> <a name="wildcard-matching"></a>
448<h3> Wildcard Matching (globbing)
449</h3>
450<a name="1-6"></a><p> Most command shells such as <em>bash</em> or <em>cmd.exe</em> support "file
451globbing", the ability to identify a group of files by using
452wildcards. The <a href="#setWildcard">setWildcard</a>() function is used to switch between
453regexp and wildcard mode. Wildcard matching is much simpler than
454full regexps and has only four features:
455<p> <center><table cellpadding="4" cellspacing="2" border="0">
456<tr bgcolor="#f0f0f0"> <td valign="top"><b>c</b>
457<td valign="top">Any character represents itself apart from those mentioned
458below. Thus <b>c</b> matches the character <em>c</em>.
459<tr bgcolor="#d0d0d0"> <td valign="top"><b>?</b>
460<td valign="top">This matches any single character. It is the same as
461<b>.</b> in full regexps.
462<tr bgcolor="#f0f0f0"> <td valign="top"><b>*</b>
463<td valign="top">This matches zero or more of any characters. It is the
464same as <b>.*</b> in full regexps.
465<tr bgcolor="#d0d0d0"> <td valign="top"><b>[...]</b>
466<td valign="top">Sets of characters can be represented in square brackets,
467similar to full regexps. Within the character class, like
468outside, backslash has no special meaning.
469</table></center>
470<p> For example if we are in wildcard mode and have strings which
471contain filenames we could identify HTML files with <b>*.html</b>.
472This will match zero or more characters followed by a dot followed
473by 'h', 't', 'm' and 'l'.
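<p> The four wildcard features listed above map onto full regexps in a
mechanical way. The sketch below illustrates that mapping; it is not how
QRegExp is implemented. It uses <tt>std::regex</tt> as a stand-in (an
assumption, so that it compiles without Qt), the names
<tt>wildcardToRegex</tt> and <tt>matchesWildcard</tt> are hypothetical, and an
unterminated <tt>[</tt> is not handled.

```cpp
#include <regex>
#include <string>

// Translates a wildcard pattern into an equivalent full regexp.
std::string wildcardToRegex(const std::string &glob)
{
    // Characters that are special in a full regexp but literal in a wildcard.
    static const std::string special = "\\^$.|+(){}]";
    std::string rx;
    bool inSet = false;
    for (char c : glob) {
        if (inSet) {
            if (c == '\\')
                rx += "\\\\";       // backslash is literal in wildcard mode
            else
                rx += c;
            if (c == ']')
                inSet = false;      // end of the character set
        } else if (c == '?') {
            rx += '.';              // any single character
        } else if (c == '*') {
            rx += ".*";             // zero or more of any characters
        } else if (c == '[') {
            rx += c;
            inSet = true;           // copy the set through unchanged
        } else {
            if (special.find(c) != std::string::npos)
                rx += '\\';         // escape regexp metacharacters
            rx += c;
        }
    }
    return rx;
}

// True if the whole of name matches the wildcard pattern glob.
bool matchesWildcard(const std::string &glob, const std::string &name)
{
    return std::regex_match(name, std::regex(wildcardToRegex(glob)));
}
```

<p> With this translation <b>*.html</b> becomes the full regexp
<b>.*\.html</b>, so "index.html" matches but "index.htm" does not.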
474<p> <a name="perl-users"></a>
475<h3> Notes for Perl Users
476</h3>
477<a name="1-7"></a><p> Most of the character class abbreviations supported by Perl are
478supported by QRegExp, see <a href="#characters-and-abbreviations-for-sets-of-characters">characters
479 and abbreviations for sets of characters</a>.
480<p> In QRegExp, apart from within character classes, <tt>^</tt> always
481signifies the start of the string, so carets must always be
482escaped unless used for that purpose. In Perl the meaning of caret
483varies automagically depending on where it occurs so escaping it
484is rarely necessary. The same applies to <tt>$</tt> which in
485QRegExp always signifies the end of the string.
486<p> QRegExp's quantifiers are the same as Perl's greedy quantifiers.
487Non-greedy matching cannot be applied to individual quantifiers,
488but can be applied to all the quantifiers in the pattern. For
489example, to match the Perl regexp <b>ro+?m</b> requires:
<pre>
    QRegExp rx( "ro+m" );
    rx.<a href="#setMinimal">setMinimal</a>( TRUE );
</pre>
494
495<p> The equivalent of Perl's <tt>/i</tt> option is
496<a href="#setCaseSensitive">setCaseSensitive</a>(FALSE).
497<p> Perl's <tt>/g</tt> option can be emulated using a <a href="#cap_in_a_loop">loop</a>.
498<p> In QRegExp <b>.</b> matches any character, therefore all QRegExp
499regexps have the equivalent of Perl's <tt>/s</tt> option. QRegExp
500does not have an equivalent to Perl's <tt>/m</tt> option, but this
can be emulated in various ways, for example by splitting the input
502into lines or by looping with a regexp that searches for newlines.
503<p> Because QRegExp is string oriented there are no \A, \Z or \z
504assertions. The \G assertion is not supported but can be emulated
505in a loop.
506<p> Perl's $& is <a href="#cap">cap</a>(0) or <a href="#capturedTexts">capturedTexts</a>()[0]. There are no QRegExp
507equivalents for $`, $' or $+. Perl's capturing variables, $1, $2,
508... correspond to cap(1) or capturedTexts()[1], cap(2) or
509capturedTexts()[2], etc.
510<p> To substitute a pattern use <a href="qstring.html#replace">QString::replace</a>().
511<p> Perl's extended <tt>/x</tt> syntax is not supported, nor are
512directives, e.g. (?i), or regexp comments, e.g. (?#comment). On
513the other hand, C++'s rules for literal strings can be used to
514achieve the same:
<pre>
    QRegExp mark( "\\b"        // word boundary
                  "[Mm]ark"    // the word we want to match
                );
</pre>
520
521<p> Both zero-width positive and zero-width negative lookahead
522assertions (?=pattern) and (?!pattern) are supported with the same
523syntax as Perl. Perl's lookbehind assertions, "independent"
524subexpressions and conditional expressions are not supported.
525<p> Non-capturing parentheses are also supported, with the same
526(?:pattern) syntax.
527<p> See <a href="qstringlist.html#split">QStringList::split</a>() and <a href="qstringlist.html#join">QStringList::join</a>() for equivalents
528to Perl's split and join functions.
529<p> Note: because C++ transforms &#92;'s they must be written <em>twice</em> in
530code, e.g. <b>&#92;b</b> must be written <b>&#92;&#92;b</b>.
531<p> <a name="code-examples"></a>
532<h3> Code Examples
533</h3>
<a name="1-8"></a><p> <pre>
    QRegExp rx( "^\\d\\d?$" );  // match integers 0 to 99
    rx.<a href="#search">search</a>( "123" );          // returns -1 (no match)
    rx.<a href="#search">search</a>( "-6" );           // returns -1 (no match)
    rx.<a href="#search">search</a>( "6" );            // returns 0 (matched at position 0)
</pre>
540
541<p> The third string matches '<u>6</u>'. This is a simple validation
542regexp for integers in the range 0 to 99.
543<p> <pre>
544 QRegExp rx( "^\\S+$" ); // match strings without whitespace
545 rx.<a href="#search">search</a>( "Hello world" ); // returns -1 (no match)
546 rx.<a href="#search">search</a>( "This_is-OK" ); // returns 0 (matched at position 0)
547 </pre>
548
549<p> The second string matches '<u>This_is-OK</u>'. We've used the
550character set abbreviation '\S' (non-whitespace) and the anchors
551to match strings which contain no whitespace.
<p> In the following example we match strings containing 'mail',
'letter' or 'correspondence', but only as whole words, i.e. not
'email':
555<p> <pre>
556 QRegExp rx( "\\b(mail|letter|correspondence)\\b" );
557 rx.<a href="#search">search</a>( "I sent you an email" ); // returns -1 (no match)
558 rx.<a href="#search">search</a>( "Please write the letter" ); // returns 17
559 </pre>
560
561<p> The second string matches "Please write the <u>letter</u>". The
562word 'letter' is also captured (because of the parentheses). We
563can see what text we've captured like this:
564<p> <pre>
565 <a href="qstring.html">QString</a> captured = rx.cap( 1 ); // captured == "letter"
566 </pre>
567
568<p> This will capture the text from the first set of capturing
569parentheses (counting capturing left parentheses from left to
570right). The parentheses are counted from 1 since <a href="#cap">cap</a>( 0 ) is the
571whole matched regexp (equivalent to '&' in most regexp engines).
572<p> <pre>
573 QRegExp rx( "&amp;(?!amp;)" ); // match ampersands but not &amp;amp;
574 <a href="qstring.html">QString</a> line1 = "This &amp; that";
575 line1.<a href="qstring.html#replace">replace</a>( rx, "&amp;amp;" );
576 // line1 == "This &amp;amp; that"
577 <a href="qstring.html">QString</a> line2 = "His &amp;amp; hers &amp; theirs";
578 line2.<a href="qstring.html#replace">replace</a>( rx, "&amp;amp;" );
579 // line2 == "His &amp;amp; hers &amp;amp; theirs"
580 </pre>
581
582<p> Here we've passed the QRegExp to <a href="qstring.html">QString</a>'s replace() function to
583replace the matched text with new text.
<p> <pre>
    <a href="qstring.html">QString</a> str = "One Eric another Eirik, and an Ericsson."
                  " How many Eiriks, Eric?";
    QRegExp rx( "\\b(Eric|Eirik)\\b" ); // match Eric or Eirik
    int pos = 0;    // where we are in the string
    int count = 0;  // how many Erics and Eiriks we've counted
    while ( pos &gt;= 0 ) {
        pos = rx.<a href="#search">search</a>( str, pos );
        if ( pos &gt;= 0 ) {
            pos++;      // move along in str
            count++;    // count our Eric or Eirik
        }
    }
    </pre>
598
<p> We've used the <a href="#search">search</a>() function to repeatedly match the regexp in
the string. Note that instead of moving forward by one character
at a time with <tt>pos++</tt> we could have written <tt>pos += rx.matchedLength()</tt> to skip over the already matched string. The
count will equal 3, matching 'One <u>Eric</u> another
<u>Eirik</u>, and an Ericsson. How many Eiriks, <u>Eric</u>?'; it
doesn't match 'Ericsson' or 'Eiriks' because in those words the name
is not followed by a word boundary.
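<p> The same counting idea can be sketched with <tt>std::regex</tt> from the C++ standard library, used here only so the example compiles without Qt. The function name <tt>countMatches</tt> is hypothetical. The iterator resumes after each match, much like advancing <tt>pos</tt> by the matched length.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Counts the non-overlapping matches of `rx` in `str`.
int countMatches(const std::string& str, const std::regex& rx)
{
    // std::sregex_iterator resumes each search at the end of the previous
    // match, so matches never overlap.
    std::sregex_iterator first(str.begin(), str.end(), rx);
    std::sregex_iterator last;  // default-constructed end-of-sequence iterator
    return static_cast<int>(std::distance(first, last));
}
```

<p> Called with the string and pattern from the example above, this sketch also counts 3.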
606<p> One common use of regexps is to split lines of delimited data into
607their component fields.
608<p> <pre>
609 str = "Trolltech AS\twww.trolltech.com\tNorway";
610 <a href="qstring.html">QString</a> company, web, country;
611 rx.setPattern( "^([^\t]+)\t([^\t]+)\t([^\t]+)$" );
612 if ( rx.search( str ) != -1 ) {
613 company = rx.cap( 1 );
614 web = rx.cap( 2 );
615 country = rx.cap( 3 );
616 }
617 </pre>
618
619<p> In this example our input lines have the format company name, web
620address and country. Unfortunately the regexp is rather long and
621not very versatile -- the code will break if we add any more
622fields. A simpler and better solution is to look for the
623separator, '\t' in this case, and take the surrounding text. The
624<a href="qstringlist.html">QStringList</a> split() function can take a separator string or regexp
625as an argument and split a string accordingly.
626<p> <pre>
627 <a href="qstringlist.html">QStringList</a> field = QStringList::<a href="qstringlist.html#split">split</a>( "\t", str );
628 </pre>
629
630<p> Here field[0] is the company, field[1] the web address and so on.
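<p> For comparison, a separator-based split can be sketched with the standard library's <tt>std::sregex_token_iterator</tt>. This is not QStringList::split(); the name <tt>splitFields</tt> is hypothetical, and the sketch assumes empty fields should be kept.

```cpp
#include <regex>
#include <string>
#include <vector>

// Splits `line` wherever `separator` matches and returns the pieces in order.
std::vector<std::string> splitFields(const std::string& line,
                                     const std::regex& separator)
{
    // Submatch index -1 selects the text between matches rather than the
    // matches themselves.
    std::sregex_token_iterator first(line.begin(), line.end(), separator, -1);
    std::sregex_token_iterator last;
    return std::vector<std::string>(first, last);
}
```

<p> Adding a fourth tab-separated field to the input needs no change to the pattern, which is the advantage over the three-group regexp.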
631<p> To imitate the matching of a shell we can use wildcard mode.
632<p> <pre>
633 QRegExp rx( "*.html" ); // invalid regexp: * doesn't quantify anything
634 rx.<a href="#setWildcard">setWildcard</a>( TRUE ); // now it's a valid wildcard regexp
635 rx.<a href="#exactMatch">exactMatch</a>( "index.html" ); // returns TRUE
636 rx.<a href="#exactMatch">exactMatch</a>( "default.htm" ); // returns FALSE
637 rx.<a href="#exactMatch">exactMatch</a>( "readme.txt" ); // returns FALSE
638 </pre>
639
640<p> Wildcard matching can be convenient because of its simplicity, but
641any wildcard regexp can be defined using full regexps, e.g.
642<b>.*&#92;.html$</b>. Notice that we can't match both <tt>.html</tt> and <tt>.htm</tt> files with a wildcard unless we use <b>*.htm*</b> which will
643also match 'test.html.bak'. A full regexp gives us the precision
644we need, <b>.*&#92;.html?$</b>.
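<p> To make the relationship between wildcards and full regexps concrete, here is a minimal sketch that translates a wildcard into a regexp and tests it with <tt>std::regex</tt>. It is not how QRegExp implements wildcard mode. The names <tt>wildcardToRegex</tt> and <tt>wildcardMatch</tt> are hypothetical, and only <b>*</b> and <b>?</b> are handled; bracketed character sets are treated as literal text.

```cpp
#include <regex>
#include <string>

// Translates a shell-style wildcard into an equivalent full regexp:
// '*' becomes ".*", '?' becomes ".", and other special characters are escaped.
std::string wildcardToRegex(const std::string& wildcard)
{
    const std::string special = "$()*+.?[\\]^{|}";
    std::string out;
    for (char c : wildcard) {
        if (c == '*') {
            out += ".*";
        } else if (c == '?') {
            out += ".";
        } else {
            if (special.find(c) != std::string::npos)
                out += '\\';    // make the character literal
            out += c;
        }
    }
    return out;
}

// Returns true if the whole of `name` matches `wildcard`.
bool wildcardMatch(const std::string& wildcard, const std::string& name)
{
    // regex_match requires the entire string to match.
    return std::regex_match(name, std::regex(wildcardToRegex(wildcard)));
}
```

<p> With this sketch <b>*.html</b> becomes <b>.*&#92;.html</b>, and <b>*.htm*</b> still accepts 'test.html.bak', whereas the hand-written <b>.*&#92;.html?$</b> does not.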
645<p> QRegExp can match case insensitively using <a href="#setCaseSensitive">setCaseSensitive</a>(), and
646can use non-greedy matching, see <a href="#setMinimal">setMinimal</a>(). By default QRegExp
647uses full regexps but this can be changed with <a href="#setWildcard">setWildcard</a>().
648Searching can be forward with <a href="#search">search</a>() or backward with
649<a href="#searchRev">searchRev</a>(). Captured text can be accessed using <a href="#capturedTexts">capturedTexts</a>()
650which returns a string list of all captured strings, or using
651<a href="#cap">cap</a>() which returns the captured string for the given index. The
652<a href="#pos">pos</a>() function takes a match index and returns the position in the
653string where the match was made (or -1 if there was no match).
654<p> <p>See also <a href="qregexpvalidator.html">QRegExpValidator</a>, <a href="qstring.html">QString</a>, <a href="qstringlist.html">QStringList</a>, <a href="misc.html">Miscellaneous Classes</a>, <a href="shared.html">Implicitly and Explicitly Shared Classes</a>, and <a href="tools.html">Non-GUI Classes</a>.
655
656<p> <a name="member-function-documentation"></a>
657
658<hr><h2>Member Type Documentation</h2>
659<h3 class=fn><a name="CaretMode-enum"></a>QRegExp::CaretMode</h3>
660
661<p> The CaretMode enum defines the different meanings of the caret
662(<b>^</b>) in a <a href="qregexp.html#regular-expression">regular expression</a>. The possible values are:
663<ul>
664<li><tt>QRegExp::CaretAtZero</tt> -
665The caret corresponds to index 0 in the searched string.
666<li><tt>QRegExp::CaretAtOffset</tt> -
667The caret corresponds to the start offset of the search.
668<li><tt>QRegExp::CaretWontMatch</tt> -
669The caret never matches.
670</ul>
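<p> The three modes can be approximated with <tt>std::regex</tt>, shown here only as an analogy that compiles without Qt. The <tt>Caret</tt> enum and <tt>searchFrom</tt> function are hypothetical stand-ins. The sketch relies on the standard <tt>match_not_bol</tt> flag, which stops <b>^</b> from matching at the start of the searched range.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for QRegExp::CaretMode.
enum class Caret { AtZero, AtOffset, WontMatch };

// Searches `str` from `offset` and reports whether `rx` matched anywhere.
bool searchFrom(const std::string& str, const std::regex& rx,
                std::string::size_type offset, Caret mode)
{
    // The searched range starts at `offset`, so by default '^' matches there.
    const bool caretMatchesAtStart =
        mode == Caret::AtOffset || (mode == Caret::AtZero && offset == 0);

    std::regex_constants::match_flag_type flags =
        std::regex_constants::match_default;
    if (!caretMatchesAtStart)
        flags |= std::regex_constants::match_not_bol;  // '^' cannot match

    return std::regex_search(str.begin() + offset, str.end(), rx, flags);
}
```

<p> Searching "ab" for <b>^b</b> from offset 1 succeeds only in the at-offset mode; in the at-zero mode the caret is tied to index 0, which lies outside the searched range.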
671<hr><h2>Member Function Documentation</h2>
672<h3 class=fn><a name="QRegExp"></a>QRegExp::QRegExp ()
673</h3>
674Constructs an empty regexp.
675<p> <p>See also <a href="#isValid">isValid</a>() and <a href="#errorString">errorString</a>().
676
677<h3 class=fn><a name="QRegExp-2"></a>QRegExp::QRegExp ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;pattern, bool&nbsp;caseSensitive = TRUE, bool&nbsp;wildcard = FALSE )
678</h3>
679Constructs a <a href="qregexp.html#regular-expression">regular expression</a> object for the given <em>pattern</em>
680string. The pattern must be given using wildcard notation if <em>wildcard</em> is TRUE (default is FALSE). The pattern is case
681sensitive, unless <em>caseSensitive</em> is FALSE. Matching is greedy
682(maximal), but can be changed by calling <a href="#setMinimal">setMinimal</a>().
683<p> <p>See also <a href="#setPattern">setPattern</a>(), <a href="#setCaseSensitive">setCaseSensitive</a>(), <a href="#setWildcard">setWildcard</a>(), and <a href="#setMinimal">setMinimal</a>().
684
685<h3 class=fn><a name="QRegExp-3"></a>QRegExp::QRegExp ( const&nbsp;<a href="qregexp.html">QRegExp</a>&nbsp;&amp;&nbsp;rx )
686</h3>
687Constructs a <a href="qregexp.html#regular-expression">regular expression</a> as a copy of <em>rx</em>.
688<p> <p>See also <a href="#operator-eq">operator=</a>().
689
690<h3 class=fn><a name="~QRegExp"></a>QRegExp::~QRegExp ()
691</h3>
692Destroys the <a href="qregexp.html#regular-expression">regular expression</a> and cleans up its internal data.
693
694<h3 class=fn><a href="qstring.html">QString</a> <a name="cap"></a>QRegExp::cap ( int&nbsp;nth = 0 )
695</h3>
696Returns the text captured by the <em>nth</em> subexpression. The entire
697match has index 0 and the parenthesized subexpressions have
698indices starting from 1 (excluding non-capturing parentheses).
699<p> <pre>
700 QRegExp rxlen( "(\\d+)(?:\\s*)(cm|inch)" );
701 int pos = rxlen.<a href="#search">search</a>( "Length: 189cm" );
702 if ( pos &gt; -1 ) {
703 <a href="qstring.html">QString</a> value = rxlen.<a href="#cap">cap</a>( 1 ); // "189"
704 <a href="qstring.html">QString</a> unit = rxlen.<a href="#cap">cap</a>( 2 ); // "cm"
705 // ...
706 }
707 </pre>
708
<p> The order of elements matched by <a href="#cap">cap</a>() is as follows. The first
element, cap(0), is the entire matching string. Each subsequent
element corresponds to the next capturing left parenthesis.
Thus cap(1) is the text of the first capturing parentheses, cap(2)
is the text of the second, and so on.
714<p> <a name="cap_in_a_loop"></a>
715Some patterns may lead to a number of matches which cannot be
716determined in advance, for example:
<p> <pre>
    QRegExp rx( "(\\d+)" );
    <a href="qstring.html">QString</a> str = "Offsets: 12 14 99 231 7";
    <a href="qstringlist.html">QStringList</a> list;
    int pos = 0;
    while ( pos &gt;= 0 ) {
        pos = rx.<a href="#search">search</a>( str, pos );
        if ( pos &gt; -1 ) {
            list += rx.<a href="#cap">cap</a>( 1 );               // collect the number just matched
            pos += rx.<a href="#matchedLength">matchedLength</a>();   // continue after this match
        }
    }
    // list contains "12", "14", "99", "231", "7"
    </pre>
731
732<p> <p>See also <a href="#capturedTexts">capturedTexts</a>(), <a href="#pos">pos</a>(), <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>().
733
734<p>Examples: <a href="archivesearch-example.html#x479">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2485">regexptester/regexptester.cpp</a>.
735<h3 class=fn><a href="qstringlist.html">QStringList</a> <a name="capturedTexts"></a>QRegExp::capturedTexts ()
736</h3>
737Returns a list of the captured text strings.
738<p> The first string in the list is the entire matched string. Each
739subsequent list element contains a string that matched a
740(capturing) subexpression of the regexp.
741<p> For example:
742<pre>
743 QRegExp rx( "(\\d+)(\\s*)(cm|inch(es)?)" );
744 int pos = rx.<a href="#search">search</a>( "Length: 36 inches" );
745 <a href="qstringlist.html">QStringList</a> list = rx.<a href="#capturedTexts">capturedTexts</a>();
746 // list is now ( "36 inches", "36", " ", "inches", "es" )
747 </pre>
748
749<p> The above example also captures elements that may be present but
750which we have no interest in. This problem can be solved by using
751non-capturing parentheses:
752<p> <pre>
753 QRegExp rx( "(\\d+)(?:\\s*)(cm|inch(?:es)?)" );
754 int pos = rx.<a href="#search">search</a>( "Length: 36 inches" );
755 <a href="qstringlist.html">QStringList</a> list = rx.<a href="#capturedTexts">capturedTexts</a>();
756 // list is now ( "36 inches", "36", "inches" )
757 </pre>
758
759<p> Note that if you want to iterate over the list, you should iterate
760over a copy, e.g.
761<pre>
762 <a href="qstringlist.html">QStringList</a> list = rx.capturedTexts();
763 QStringList::Iterator it = list.<a href="qvaluelist.html#begin">begin</a>();
764 while( it != list.<a href="qvaluelist.html#end">end</a>() ) {
765 myProcessing( *it );
766 ++it;
767 }
768 </pre>
769
770<p> Some regexps can match an indeterminate number of times. For
771example if the input string is "Offsets: 12 14 99 231 7" and the
772regexp, <tt>rx</tt>, is <b>(&#92;d+)+</b>, we would hope to get a list of
773all the numbers matched. However, after calling
774<tt>rx.search(str)</tt>, <a href="#capturedTexts">capturedTexts</a>() will return the list ( "12",
775"12" ), i.e. the entire match was "12" and the first subexpression
776matched was "12". The correct approach is to use <a href="#cap">cap</a>() in a <a href="#cap_in_a_loop">loop</a>.
<p> The order of elements in the string list is as follows. The first
element is the entire matching string. Each subsequent element
corresponds to the next capturing left parenthesis. Thus
capturedTexts()[1] is the text of the first capturing parentheses,
capturedTexts()[2] is the text of the second, and so on
(corresponding to $1, $2, etc., in some other regexp languages).
783<p> <p>See also <a href="#cap">cap</a>(), <a href="#pos">pos</a>(), <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>().
784
785<h3 class=fn>bool <a name="caseSensitive"></a>QRegExp::caseSensitive () const
786</h3>
787Returns TRUE if case sensitivity is enabled; otherwise returns
788FALSE. The default is TRUE.
789<p> <p>See also <a href="#setCaseSensitive">setCaseSensitive</a>().
790
791<h3 class=fn><a href="qstring.html">QString</a> <a name="errorString"></a>QRegExp::errorString ()
792</h3>
Returns a text string that explains why a regexp pattern is
invalid, if that is the case; otherwise returns "no error occurred".
795<p> <p>See also <a href="#isValid">isValid</a>().
796
797<p>Example: <a href="regexptester-example.html#x2486">regexptester/regexptester.cpp</a>.
798<h3 class=fn><a href="qstring.html">QString</a> <a name="escape"></a>QRegExp::escape ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;str )<tt> [static]</tt>
799</h3>
800Returns the string <em>str</em> with every regexp special character
801escaped with a backslash. The special characters are $, (, ), *, +,
802., ?, [, &#92;, ], ^, {, | and }.
803<p> Example:
804<pre>
805 s1 = QRegExp::<a href="#escape">escape</a>( "bingo" ); // s1 == "bingo"
806 s2 = QRegExp::<a href="#escape">escape</a>( "f(x)" ); // s2 == "f\\(x\\)"
807 </pre>
808
809<p> This function is useful to construct regexp patterns dynamically:
810<p> <pre>
811 QRegExp rx( "(" + QRegExp::escape(name) +
812 "|" + QRegExp::escape(alias) + ")" );
813 </pre>
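<p> What escaping achieves can be sketched in portable C++. The code below is an illustration rather than the QRegExp implementation: <tt>escapeRegex</tt> and <tt>containsLiteral</tt> are hypothetical names, and <tt>std::regex</tt> is used in place of QRegExp so that the example builds without Qt.

```cpp
#include <regex>
#include <string>

// Returns `str` with each regexp special character preceded by a backslash.
std::string escapeRegex(const std::string& str)
{
    const std::string special = "$()*+.?[\\]^{|}";
    std::string out;
    out.reserve(str.size() * 2);
    for (char c : str) {
        if (special.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// Returns true if `text` contains `literal` verbatim, using a regexp built
// from the escaped literal.
bool containsLiteral(const std::string& text, const std::string& literal)
{
    return std::regex_search(text, std::regex(escapeRegex(literal)));
}
```

<p> Without the escaping step, the pattern <b>f(x)</b> would treat the parentheses as a group and match the text 'fx' instead of 'f(x)'.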
814
815
816<h3 class=fn>bool <a name="exactMatch"></a>QRegExp::exactMatch ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;str ) const
817</h3>
818Returns TRUE if <em>str</em> is matched exactly by this <a href="qregexp.html#regular-expression">regular expression</a>; otherwise returns FALSE. You can determine how much of
819the string was matched by calling <a href="#matchedLength">matchedLength</a>().
820<p> For a given regexp string, R, <a href="#exactMatch">exactMatch</a>("R") is the equivalent of
821<a href="#search">search</a>("^R$") since exactMatch() effectively encloses the regexp
822in the start of string and end of string anchors, except that it
823sets matchedLength() differently.
824<p> For example, if the regular expression is <b>blue</b>, then
825exactMatch() returns TRUE only for input <tt>blue</tt>. For inputs <tt>bluebell</tt>, <tt>blutak</tt> and <tt>lightblue</tt>, exactMatch() returns FALSE
826and matchedLength() will return 4, 3 and 0 respectively.
827<p> Although const, this function sets matchedLength(),
828<a href="#capturedTexts">capturedTexts</a>() and <a href="#pos">pos</a>().
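<p> The difference between matching the whole string and finding a match somewhere in it can be sketched with <tt>std::regex</tt>. This is an analogy only: the function names are hypothetical, and the standard library has no counterpart to the partial <a href="#matchedLength">matchedLength</a>() that QRegExp reports after a failed exactMatch().

```cpp
#include <regex>
#include <string>

// Whole-string test, comparable to exactMatch().
bool matchesWhole(const std::string& str, const std::regex& rx)
{
    return std::regex_match(str, rx);
}

// Substring test, comparable to search() returning something other than -1.
bool matchesSomewhere(const std::string& str, const std::regex& rx)
{
    return std::regex_search(str, rx);
}
```

<p> For the pattern <b>blue</b>, only the input <tt>blue</tt> passes the whole-string test, while <tt>bluebell</tt> and <tt>lightblue</tt> pass only the substring test.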
829<p> <p>See also <a href="#search">search</a>(), <a href="#searchRev">searchRev</a>(), and <a href="qregexpvalidator.html">QRegExpValidator</a>.
830
831<h3 class=fn>bool <a name="isEmpty"></a>QRegExp::isEmpty () const
832</h3>
833Returns TRUE if the pattern string is empty; otherwise returns
834FALSE.
835<p> If you call <a href="#exactMatch">exactMatch</a>() with an empty pattern on an empty string
836it will return TRUE; otherwise it returns FALSE since it operates
837over the whole string. If you call <a href="#search">search</a>() with an empty pattern
838on <em>any</em> string it will return the start offset (0 by default)
839because the empty pattern matches the 'emptiness' at the start of
840the string. In this case the length of the match returned by
841<a href="#matchedLength">matchedLength</a>() will be 0.
842<p> See <a href="qstring.html#isEmpty">QString::isEmpty</a>().
843
844<h3 class=fn>bool <a name="isValid"></a>QRegExp::isValid () const
845</h3>
846Returns TRUE if the <a href="qregexp.html#regular-expression">regular expression</a> is valid; otherwise returns
847FALSE. An invalid regular expression never matches.
848<p> The pattern <b>[a-z</b> is an example of an invalid pattern, since
849it lacks a closing square bracket.
850<p> Note that the validity of a regexp may also depend on the setting
851of the wildcard flag, for example <b>*.html</b> is a valid
852wildcard regexp but an invalid full regexp.
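<p> A comparable validity check can be sketched with <tt>std::regex</tt>, which reports a malformed pattern by throwing <tt>std::regex_error</tt> instead of setting a flag. The helper <tt>isValidPattern</tt> is a hypothetical name, and the error text it stores is implementation-defined.

```cpp
#include <regex>
#include <string>

// Returns true if `pattern` compiles; on failure stores the reason in
// `error` when a non-null pointer is supplied.
bool isValidPattern(const std::string& pattern, std::string* error = nullptr)
{
    try {
        std::regex rx(pattern);     // throws std::regex_error if malformed
        (void)rx;
        return true;
    } catch (const std::regex_error& e) {
        if (error)
            *error = e.what();
        return false;
    }
}
```

<p> As with QRegExp, the unterminated character class <b>[a-z</b> is rejected, while <b>[a-z]</b> is accepted.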
853<p> <p>See also <a href="#errorString">errorString</a>().
854
855<p>Example: <a href="regexptester-example.html#x2487">regexptester/regexptester.cpp</a>.
856<h3 class=fn>int <a name="match"></a>QRegExp::match ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;str, int&nbsp;index = 0, int&nbsp;*&nbsp;len = 0, bool&nbsp;indexIsStart = TRUE ) const
857</h3> <b>This function is obsolete.</b> It is provided to keep old source working. We strongly advise against using it in new code.
858<p> Attempts to match in <em>str</em>, starting from position <em>index</em>.
859Returns the position of the match, or -1 if there was no match.
860<p> The length of the match is stored in <em>*len</em>, unless <em>len</em> is a
861null pointer.
862<p> If <em>indexIsStart</em> is TRUE (the default), the position <em>index</em> in
863the string will match the start of string anchor, <b>^</b>, in the
864regexp, if present. Otherwise, position 0 in <em>str</em> will match.
865<p> Use <a href="#search">search</a>() and <a href="#matchedLength">matchedLength</a>() instead of this function.
866<p> <p>See also <a href="qstring.html#mid">QString::mid</a>() and <a href="qconststring.html">QConstString</a>.
867
868<p>Example: <a href="qmag-example.html#x1791">qmag/qmag.cpp</a>.
869<h3 class=fn>int <a name="matchedLength"></a>QRegExp::matchedLength () const
870</h3>
871Returns the length of the last matched string, or -1 if there was
872no match.
873<p> <p>See also <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>().
874
875<p>Examples: <a href="archivesearch-example.html#x480">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2488">regexptester/regexptester.cpp</a>.
876<h3 class=fn>bool <a name="minimal"></a>QRegExp::minimal () const
877</h3>
878Returns TRUE if minimal (non-greedy) matching is enabled;
879otherwise returns FALSE.
880<p> <p>See also <a href="#setMinimal">setMinimal</a>().
881
882<h3 class=fn>int <a name="numCaptures"></a>QRegExp::numCaptures () const
883</h3>
884Returns the number of captures contained in the <a href="qregexp.html#regular-expression">regular expression</a>.
885
886<p>Example: <a href="regexptester-example.html#x2489">regexptester/regexptester.cpp</a>.
887<h3 class=fn>bool <a name="operator!-eq"></a>QRegExp::operator!= ( const&nbsp;<a href="qregexp.html">QRegExp</a>&nbsp;&amp;&nbsp;rx ) const
888</h3>
889
890<p> Returns TRUE if this <a href="qregexp.html#regular-expression">regular expression</a> is not equal to <em>rx</em>;
891otherwise returns FALSE.
892<p> <p>See also <a href="#operator-eq-eq">operator==</a>().
893
894<h3 class=fn><a href="qregexp.html">QRegExp</a>&nbsp;&amp; <a name="operator-eq"></a>QRegExp::operator= ( const&nbsp;<a href="qregexp.html">QRegExp</a>&nbsp;&amp;&nbsp;rx )
895</h3>
896Copies the <a href="qregexp.html#regular-expression">regular expression</a> <em>rx</em> and returns a reference to the
897copy. The case sensitivity, wildcard and minimal matching options
898are also copied.
899
900<h3 class=fn>bool <a name="operator-eq-eq"></a>QRegExp::operator== ( const&nbsp;<a href="qregexp.html">QRegExp</a>&nbsp;&amp;&nbsp;rx ) const
901</h3>
902Returns TRUE if this <a href="qregexp.html#regular-expression">regular expression</a> is equal to <em>rx</em>;
903otherwise returns FALSE.
904<p> Two QRegExp objects are equal if they have the same pattern
905strings and the same settings for case sensitivity, wildcard and
906minimal matching.
907
908<h3 class=fn><a href="qstring.html">QString</a> <a name="pattern"></a>QRegExp::pattern () const
909</h3>
910Returns the pattern string of the <a href="qregexp.html#regular-expression">regular expression</a>. The pattern
911has either regular expression syntax or wildcard syntax, depending
912on <a href="#wildcard">wildcard</a>().
913<p> <p>See also <a href="#setPattern">setPattern</a>().
914
915<h3 class=fn>int <a name="pos"></a>QRegExp::pos ( int&nbsp;nth = 0 )
916</h3>
917Returns the position of the <em>nth</em> captured text in the searched
918string. If <em>nth</em> is 0 (the default), <a href="#pos">pos</a>() returns the position
919of the whole match.
920<p> Example:
921<pre>
922 QRegExp rx( "/([a-z]+)/([a-z]+)" );
923 rx.<a href="#search">search</a>( "Output /dev/null" ); // returns 7 (position of /dev/null)
924 rx.<a href="#pos">pos</a>( 0 ); // returns 7 (position of /dev/null)
925 rx.<a href="#pos">pos</a>( 1 ); // returns 8 (position of dev)
926 rx.<a href="#pos">pos</a>( 2 ); // returns 12 (position of null)
927 </pre>
928
929<p> For zero-length matches, pos() always returns -1. (For example, if
930<a href="#cap">cap</a>(4) would return an empty string, pos(4) returns -1.) This is
931due to an implementation tradeoff.
932<p> <p>See also <a href="#capturedTexts">capturedTexts</a>(), <a href="#exactMatch">exactMatch</a>(), <a href="#search">search</a>(), and <a href="#searchRev">searchRev</a>().
933
934<h3 class=fn>int <a name="search"></a>QRegExp::search ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;str, int&nbsp;offset = 0, <a href="qregexp.html#CaretMode-enum">CaretMode</a>&nbsp;caretMode = CaretAtZero ) const
935</h3>
936Attempts to find a match in <em>str</em> from position <em>offset</em> (0 by
937default). If <em>offset</em> is -1, the search starts at the last
938character; if -2, at the next to last character; etc.
939<p> Returns the position of the first match, or -1 if there was no
940match.
<p> The <em>caretMode</em> parameter determines where <b>^</b> may match:
at index 0, at <em>offset</em>, or not at all (see <a href="#CaretMode-enum">CaretMode</a>).
943<p> You might prefer to use <a href="qstring.html#find">QString::find</a>(), <a href="qstring.html#contains">QString::contains</a>() or
944even <a href="qstringlist.html#grep">QStringList::grep</a>(). To replace matches use
945<a href="qstring.html#replace">QString::replace</a>().
946<p> Example:
<pre>
    <a href="qstring.html">QString</a> str = "offsets: 1.23 .50 71.00 6.00";
    QRegExp rx( "\\d*\\.\\d+" );    // primitive floating point matching
    int count = 0;
    int pos = 0;
    while ( (pos = rx.<a href="#search">search</a>(str, pos)) != -1 ) {
        count++;
        pos += rx.<a href="#matchedLength">matchedLength</a>();
    }
    // search() finds matches at positions 9, 14, 18 and 24; count will end up as 4
    </pre>
958
959<p> Although const, this function sets <a href="#matchedLength">matchedLength</a>(),
960<a href="#capturedTexts">capturedTexts</a>() and <a href="#pos">pos</a>().
961<p> <p>See also <a href="#searchRev">searchRev</a>() and <a href="#exactMatch">exactMatch</a>().
962
963<p>Examples: <a href="archivesearch-example.html#x481">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2490">regexptester/regexptester.cpp</a>.
964<h3 class=fn>int <a name="searchRev"></a>QRegExp::searchRev ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;str, int&nbsp;offset = -1, <a href="qregexp.html#CaretMode-enum">CaretMode</a>&nbsp;caretMode = CaretAtZero ) const
965</h3>
966Attempts to find a match backwards in <em>str</em> from position <em>offset</em>. If <em>offset</em> is -1 (the default), the search starts at the
967last character; if -2, at the next to last character; etc.
968<p> Returns the position of the first match, or -1 if there was no
969match.
<p> The <em>caretMode</em> parameter determines where <b>^</b> may match:
at index 0, at <em>offset</em>, or not at all (see <a href="#CaretMode-enum">CaretMode</a>).
972<p> Although const, this function sets <a href="#matchedLength">matchedLength</a>(),
973<a href="#capturedTexts">capturedTexts</a>() and <a href="#pos">pos</a>().
974<p> <b>Warning:</b> Searching backwards is much slower than searching
975forwards.
976<p> <p>See also <a href="#search">search</a>() and <a href="#exactMatch">exactMatch</a>().
977
978<h3 class=fn>void <a name="setCaseSensitive"></a>QRegExp::setCaseSensitive ( bool&nbsp;sensitive )
979</h3>
980Sets case sensitive matching to <em>sensitive</em>.
981<p> If <em>sensitive</em> is TRUE, <b>&#92;.txt$</b> matches <tt>readme.txt</tt> but
982not <tt>README.TXT</tt>.
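<p> The effect of the flag can be sketched with <tt>std::regex</tt>, where case-insensitive matching is requested through the <tt>icase</tt> option when the pattern is compiled. The function <tt>hasTxtSuffix</tt> is a hypothetical name used only for this illustration.

```cpp
#include <regex>
#include <string>

// Reports whether `fileName` ends in ".txt", optionally ignoring case.
bool hasTxtSuffix(const std::string& fileName, bool caseSensitive)
{
    std::regex_constants::syntax_option_type flags =
        std::regex_constants::ECMAScript;
    if (!caseSensitive)
        flags |= std::regex_constants::icase;   // ignore letter case

    const std::regex rx("\\.txt$", flags);
    return std::regex_search(fileName, rx);
}
```

<p> With case sensitivity on, only <tt>readme.txt</tt> is accepted; with it off, <tt>README.TXT</tt> is accepted as well.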
983<p> <p>See also <a href="#caseSensitive">caseSensitive</a>().
984
985<p>Example: <a href="regexptester-example.html#x2491">regexptester/regexptester.cpp</a>.
986<h3 class=fn>void <a name="setMinimal"></a>QRegExp::setMinimal ( bool&nbsp;minimal )
987</h3>
988Enables or disables minimal matching. If <em>minimal</em> is FALSE,
989matching is greedy (maximal) which is the default.
990<p> For example, suppose we have the input string "We must be
991&lt;b>bold&lt;/b>, very &lt;b>bold&lt;/b>!" and the pattern
992<b>&lt;b>.*&lt;/b></b>. With the default greedy (maximal) matching,
993the match is "We must be <u>&lt;b>bold&lt;/b>, very
994&lt;b>bold&lt;/b></u>!". But with minimal (non-greedy) matching the
995first match is: "We must be <u>&lt;b>bold&lt;/b></u>, very
996&lt;b>bold&lt;/b>!" and the second match is "We must be &lt;b>bold&lt;/b>,
997very <u>&lt;b>bold&lt;/b></u>!". In practice we might use the pattern
998<b>&lt;b>[^&lt;]+&lt;/b></b> instead, although this will still fail for
999nested tags.
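<p> The difference between the two modes can be reproduced with <tt>std::regex</tt>, which has no equivalent of setMinimal() but accepts the non-greedy quantifier <b>.*?</b> in the pattern itself. This is a sketch for comparison only; <tt>firstMatch</tt> is a hypothetical helper.

```cpp
#include <regex>
#include <string>

// Returns the first match of `pattern` in `text`, or an empty string if
// there is no match.
std::string firstMatch(const std::string& text, const std::string& pattern)
{
    std::smatch m;
    if (std::regex_search(text, m, std::regex(pattern)))
        return m.str(0);
    return std::string();
}
```

<p> On the input string above, the greedy pattern <b>&lt;b>.*&lt;/b></b> returns the text from the first opening tag to the last closing tag, whereas both <b>&lt;b>.*?&lt;/b></b> and <b>&lt;b>[^&lt;]+&lt;/b></b> return only the first bold element.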
1000<p> <p>See also <a href="#minimal">minimal</a>().
1001
1002<p>Examples: <a href="archivesearch-example.html#x482">network/archivesearch/archivedialog.ui.h</a> and <a href="regexptester-example.html#x2492">regexptester/regexptester.cpp</a>.
1003<h3 class=fn>void <a name="setPattern"></a>QRegExp::setPattern ( const&nbsp;<a href="qstring.html">QString</a>&nbsp;&amp;&nbsp;pattern )
1004</h3>
1005Sets the pattern string to <em>pattern</em>. The case sensitivity,
1006wildcard and minimal matching options are not changed.
1007<p> <p>See also <a href="#pattern">pattern</a>().
1008
1009<h3 class=fn>void <a name="setWildcard"></a>QRegExp::setWildcard ( bool&nbsp;wildcard )
1010</h3>
1011Sets the wildcard mode for the <a href="qregexp.html#regular-expression">regular expression</a>. The default is
1012FALSE.
1013<p> Setting <em>wildcard</em> to TRUE enables simple shell-like wildcard
1014matching. (See <a href="#wildcard-matching">wildcard matching
1015 (globbing)</a>.)
1016<p> For example, <b>r*.txt</b> matches the string <tt>readme.txt</tt> in
1017wildcard mode, but does not match <tt>readme</tt>.
1018<p> <p>See also <a href="#wildcard">wildcard</a>().
1019
1020<p>Example: <a href="regexptester-example.html#x2493">regexptester/regexptester.cpp</a>.
1021<h3 class=fn>bool <a name="wildcard"></a>QRegExp::wildcard () const
1022</h3>
1023Returns TRUE if wildcard mode is enabled; otherwise returns FALSE.
1024The default is FALSE.
1025<p> <p>See also <a href="#setWildcard">setWildcard</a>().
1026
1027<!-- eof -->
1028<hr><p>
1029This file is part of the <a href="index.html">Qt toolkit</a>.
1030Copyright &copy; 1995-2007
1031<a href="http://www.trolltech.com/">Trolltech</a>. All Rights Reserved.<p><address><hr><div align=center>
1032<table width=100% cellspacing=0 border=0><tr>
1033<td>Copyright &copy; 2007
1034<a href="troll.html">Trolltech</a><td align=center><a href="trademarks.html">Trademarks</a>
1035<td align=right><div align=right>Qt 3.3.8</div>
1036</table></div></address></body>
1037</html>