Annotation of embedaddon/sqlite3/ext/fts3/README.syntax, revision 1.1.1.1

1.1       misho       1: 
                      2: 1. OVERVIEW
                      3: 
                      4:   This README file describes the syntax of the arguments that may be passed to
                      5:   the FTS3 MATCH operator used for full-text queries. For example, if table 
                      6:   "t1" is an Fts3 virtual table, the following SQL query:
                      7: 
                      8:     SELECT * FROM t1 WHERE <col> MATCH <full-text query>
                      9: 
                     10:   may be used to retrieve all rows that match a specified for full-text query. 
                     11:   The text "<col>" should be replaced by either the name of the fts3 table 
                     12:   (in this case "t1"), or by the name of one of the columns of the fts3 
                     13:   table. <full-text-query> should be replaced by an SQL expression that 
                     14:   computes to a string containing an Fts3 query.
                     15: 
                     16:   If the left-hand-side of the MATCH operator is set to the name of the
                     17:   fts3 table, then by default the query may be matched against any column
                     18:   of the table. If it is set to a column name, then by default the query
                     19:   may only match the specified column. In both cases this may be overriden
                     20:   as part of the query text (see sections 2 and 3 below).
                     21: 
                     22:   As of SQLite version 3.6.8, Fts3 supports two slightly different query 
                     23:   formats; the standard syntax, which is used by default, and the enhanced
                     24:   query syntax which can be selected by compiling with the pre-processor
                     25:   symbol SQLITE_ENABLE_FTS3_PARENTHESIS defined.
                     26: 
                     27:     -DSQLITE_ENABLE_FTS3_PARENTHESIS
                     28: 
                     29: 2. STANDARD QUERY SYNTAX
                     30: 
                     31:   When using the standard Fts3 query syntax, a query usually consists of a 
                     32:   list of terms (words) separated by white-space characters. To match a
                     33:   query, a row (or column) of an Fts3 table must contain each of the specified
                     34:   terms. For example, the following query:
                     35: 
                     36:     <col> MATCH 'hello world'
                     37: 
                     38:   matches rows (or columns, if <col> is the name of a column name) that 
                     39:   contain at least one instance of the token "hello", and at least one 
                     40:   instance of the token "world". Tokens may be grouped into phrases using
                     41:   quotation marks. In this case, a matching row or column must contain each
                     42:   of the tokens in the phrase in the order specified, with no intervening
                     43:   tokens. For example, the query:
                     44: 
                     45:     <col> MATCH '"hello world" joe"
                     46: 
                     47:   matches the first of the following two documents, but not the second or
                     48:   third:
                     49: 
                     50:     "'Hello world', said Joe."
                     51:     "One should always greet the world with a cheery hello, thought Joe."
                     52:     "How many hello world programs could their be?"
                     53: 
                     54:   As well as grouping tokens together by phrase, the binary NEAR operator 
                     55:   may be used to search for rows that contain two or more specified tokens 
                     56:   or phrases within a specified proximity of each other. The NEAR operator
                     57:   must always be specified in upper case. The word "near" in lower or mixed
                     58:   case is treated as an ordinary token. For example, the following query:
                     59: 
                     60:     <col> MATCH 'engineering NEAR consultancy'
                     61: 
                     62:   matches rows that contain both the "engineering" and "consultancy" tokens
                     63:   in the same column with not more than 10 other words between them. It does
                     64:   not matter which of the two terms occurs first in the document, only that
                     65:   they be seperated by only 10 tokens or less. The user may also specify
                     66:   a different required proximity by adding "/N" immediately after the NEAR
                     67:   operator, where N is an integer. For example:
                     68: 
                     69:     <col> MATCH 'engineering NEAR/5 consultancy'
                     70: 
                     71:   searches for a row containing an instance of each specified token seperated
                     72:   by not more than 5 other tokens. More than one NEAR operator can be used
                     73:   in as sequence. For example this query:
                     74: 
                     75:     <col> MATCH 'reliable NEAR/2 engineering NEAR/5 consultancy'
                     76: 
                     77:   searches for a row that contains an instance of the token "reliable" 
                     78:   seperated by not more than two tokens from an instance of "engineering",
                     79:   which is in turn separated by not more than 5 other tokens from an
                     80:   instance of the term "consultancy". Phrases enclosed in quotes may
                     81:   also be used as arguments to the NEAR operator.
                     82: 
                     83:   Similar to the NEAR operator, one or more tokens or phrases may be 
                     84:   separated by OR operators. In this case, only one of the specified tokens
                     85:   or phrases must appear in the document. For example, the query:
                     86: 
                     87:     <col> MATCH 'hello OR world'
                     88: 
                     89:   matches rows that contain either the term "hello", or the term "world",
                     90:   or both. Note that unlike in many programming languages, the OR operator
                     91:   has a higher precedence than the AND operators implied between white-space
                     92:   separated tokens. The following query matches documents that contain the
                     93:   term 'sqlite' and at least one of the terms 'fantastic' or 'impressive',
                     94:   not those that contain both 'sqlite' and 'fantastic' or 'impressive':
                     95: 
                     96:     <col> MATCH 'sqlite fantastic OR impressive'
                     97: 
                     98:   Any token that is part of an Fts3 query expression, whether or not it is
                     99:   part of a phrase enclosed in quotes, may have a '*' character appended to
                    100:   it. In this case, the token matches all terms that begin with the characters
                    101:   of the token, not just those that exactly match it. For example, the 
                    102:   following query:
                    103: 
                    104:     <col> MATCH 'sql*'
                    105: 
                    106:   matches all rows that contain the term "SQLite", as well as those that
                    107:   contain "SQL".
                    108: 
                    109:   A token that is not part of a quoted phrase may be preceded by a '-'
                    110:   character, which indicates that matching rows must not contain the 
                    111:   specified term. For example, the following:
                    112: 
                    113:     <col> MATCH '"database engine" -sqlite'
                    114: 
                    115:   matches rows that contain the phrase "database engine" but do not contain
                    116:   the term "sqlite". If the '-' character occurs inside a quoted phrase,
                    117:   it is ignored. It is possible to use both the '-' prefix and the '*' postfix
                    118:   on a single term. At this time, all Fts3 queries must contain at least
                    119:   one term or phrase that is not preceded by the '-' prefix.
                    120: 
                    121:   Regardless of whether or not a table name or column name is used on the 
                    122:   left hand side of the MATCH operator, a specific column of the fts3 table
                    123:   may be associated with each token in a query by preceding a token with
                    124:   a column name followed by a ':' character. For example, regardless of what
                    125:   is specified for <col>, the following query requires that column "col1"
                    126:   of the table contains the term "hello", and that column "col2" of the
                    127:   table contains the term "world". If the table does not contain columns
                    128:   named "col1" and "col2", then an error is returned and the query is
                    129:   not run.
                    130: 
                    131:     <col> MATCH 'col1:hello col2:world'
                    132: 
                    133:   It is not possible to associate a specific table column with a quoted 
                    134:   phrase or a term preceded by a '-' operator. A '*' character may be
                    135:   appended to a term associated with a specific column for prefix matching.
                    136: 
                    137: 3. ENHANCED QUERY SYNTAX
                    138: 
                    139:   The enhanced query syntax is quite similar to the standard query syntax,
                    140:   with the following four differences:
                    141: 
                    142:   1) Parenthesis are supported. When using the enhanced query syntax,
                    143:      parenthesis may be used to overcome the built-in precedence of the
                    144:      supplied binary operators. For example, the following query:
                    145: 
                    146:        <col> MATCH '(hello world) OR (simple example)'
                    147: 
                    148:      matches documents that contain both "hello" and "world", and documents
                    149:      that contain both "simple" and "example". It is not possible to forumlate
                    150:      such a query using the standard syntax.
                    151: 
                    152:   2) Instead of separating tokens and phrases by whitespace, an AND operator
                    153:      may be explicitly specified. This does not change query processing at
                    154:      all, but may be used to improve readability. For example, the following
                    155:      query is handled identically to the one above:
                    156: 
                    157:        <col> MATCH '(hello AND world) OR (simple AND example)'
                    158: 
                    159:      As with the OR and NEAR operators, the AND operator must be specified
                    160:      in upper case. The word "and" specified in lower or mixed case is 
                    161:      handled as a regular token.
                    162: 
                    163:   3) The '-' token prefix is not supported. Instead, a new binary operator,
                    164:      NOT, is included. The NOT operator requires that the query specified
                    165:      as its left-hand operator matches, but that the query specified as the
                    166:      right-hand operator does not. For example, to query for all rows that
                    167:      contain the term "example" but not the term "simple", the following
                    168:      query could be used:
                    169: 
                    170:        <col> MATCH 'example NOT simple'
                    171: 
                    172:      As for all other operators, the NOT operator must be specified in
                    173:      upper case. Otherwise it will be treated as a regular token.
                    174: 
                    175:   4) Unlike in the standard syntax, where the OR operator has a higher
                    176:      precedence than the implicit AND operator, when using the enhanced
                    177:      syntax implicit and explict AND operators have a higher precedence
                    178:      than OR operators. Using the enhanced syntax, the following two
                    179:      queries are equivalent:
                    180: 
                    181:        <col> MATCH 'sqlite fantastic OR impressive'
                    182:        <col> MATCH '(sqlite AND fantastic) OR impressive'
                    183: 
                    184:      however, when using the standard syntax, the query:
                    185: 
                    186:        <col> MATCH 'sqlite fantastic OR impressive'
                    187: 
                    188:      is equivalent to the enhanced syntax query:
                    189: 
                    190:        <col> MATCH 'sqlite AND (fantastic OR impressive)'
                    191: 
                    192:      The precedence of all enhanced syntax operators, in order from highest
                    193:      to lowest, is:
                    194: 
                    195:        NEAR       (highest precedence, tightest grouping)
                    196:        NOT
                    197:        AND
                    198:        OR         (lowest precedence, loosest grouping)
                    199: 
                    200:   Using the advanced syntax, it is possible to specify expressions enclosed
                    201:   in parenthesis as operands to the NOT, AND and OR operators. However both
                    202:   the left and right hand side operands of NEAR operators must be either
                    203:   tokens or phrases. Attempting the following query will return an error:
                    204: 
                    205:     <col> MATCH 'sqlite NEAR (fantastic OR impressive)'
                    206: 
                    207:   Queries of this form must be re-written as:
                    208: 
                    209:     <col> MATCH 'sqlite NEAR fantastic OR sqlite NEAR impressive'

FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>