Return to README.syntax CVS log | Up to [ELWIX - Embedded LightWeight unIX -] / embedaddon / sqlite3 / ext / fts3 |
1.1 ! misho 1: ! 2: 1. OVERVIEW ! 3: ! 4: This README file describes the syntax of the arguments that may be passed to ! 5: the FTS3 MATCH operator used for full-text queries. For example, if table ! 6: "t1" is an Fts3 virtual table, the following SQL query: ! 7: ! 8: SELECT * FROM t1 WHERE <col> MATCH <full-text query> ! 9: ! 10: may be used to retrieve all rows that match a specified for full-text query. ! 11: The text "<col>" should be replaced by either the name of the fts3 table ! 12: (in this case "t1"), or by the name of one of the columns of the fts3 ! 13: table. <full-text-query> should be replaced by an SQL expression that ! 14: computes to a string containing an Fts3 query. ! 15: ! 16: If the left-hand-side of the MATCH operator is set to the name of the ! 17: fts3 table, then by default the query may be matched against any column ! 18: of the table. If it is set to a column name, then by default the query ! 19: may only match the specified column. In both cases this may be overriden ! 20: as part of the query text (see sections 2 and 3 below). ! 21: ! 22: As of SQLite version 3.6.8, Fts3 supports two slightly different query ! 23: formats; the standard syntax, which is used by default, and the enhanced ! 24: query syntax which can be selected by compiling with the pre-processor ! 25: symbol SQLITE_ENABLE_FTS3_PARENTHESIS defined. ! 26: ! 27: -DSQLITE_ENABLE_FTS3_PARENTHESIS ! 28: ! 29: 2. STANDARD QUERY SYNTAX ! 30: ! 31: When using the standard Fts3 query syntax, a query usually consists of a ! 32: list of terms (words) separated by white-space characters. To match a ! 33: query, a row (or column) of an Fts3 table must contain each of the specified ! 34: terms. For example, the following query: ! 35: ! 36: <col> MATCH 'hello world' ! 37: ! 38: matches rows (or columns, if <col> is the name of a column name) that ! 39: contain at least one instance of the token "hello", and at least one ! 40: instance of the token "world". Tokens may be grouped into phrases using ! 41: quotation marks. In this case, a matching row or column must contain each ! 42: of the tokens in the phrase in the order specified, with no intervening ! 43: tokens. For example, the query: ! 44: ! 45: <col> MATCH '"hello world" joe" ! 46: ! 47: matches the first of the following two documents, but not the second or ! 48: third: ! 49: ! 50: "'Hello world', said Joe." ! 51: "One should always greet the world with a cheery hello, thought Joe." ! 52: "How many hello world programs could their be?" ! 53: ! 54: As well as grouping tokens together by phrase, the binary NEAR operator ! 55: may be used to search for rows that contain two or more specified tokens ! 56: or phrases within a specified proximity of each other. The NEAR operator ! 57: must always be specified in upper case. The word "near" in lower or mixed ! 58: case is treated as an ordinary token. For example, the following query: ! 59: ! 60: <col> MATCH 'engineering NEAR consultancy' ! 61: ! 62: matches rows that contain both the "engineering" and "consultancy" tokens ! 63: in the same column with not more than 10 other words between them. It does ! 64: not matter which of the two terms occurs first in the document, only that ! 65: they be seperated by only 10 tokens or less. The user may also specify ! 66: a different required proximity by adding "/N" immediately after the NEAR ! 67: operator, where N is an integer. For example: ! 68: ! 69: <col> MATCH 'engineering NEAR/5 consultancy' ! 70: ! 71: searches for a row containing an instance of each specified token seperated ! 72: by not more than 5 other tokens. More than one NEAR operator can be used ! 73: in as sequence. For example this query: ! 74: ! 75: <col> MATCH 'reliable NEAR/2 engineering NEAR/5 consultancy' ! 76: ! 77: searches for a row that contains an instance of the token "reliable" ! 78: seperated by not more than two tokens from an instance of "engineering", ! 79: which is in turn separated by not more than 5 other tokens from an ! 80: instance of the term "consultancy". Phrases enclosed in quotes may ! 81: also be used as arguments to the NEAR operator. ! 82: ! 83: Similar to the NEAR operator, one or more tokens or phrases may be ! 84: separated by OR operators. In this case, only one of the specified tokens ! 85: or phrases must appear in the document. For example, the query: ! 86: ! 87: <col> MATCH 'hello OR world' ! 88: ! 89: matches rows that contain either the term "hello", or the term "world", ! 90: or both. Note that unlike in many programming languages, the OR operator ! 91: has a higher precedence than the AND operators implied between white-space ! 92: separated tokens. The following query matches documents that contain the ! 93: term 'sqlite' and at least one of the terms 'fantastic' or 'impressive', ! 94: not those that contain both 'sqlite' and 'fantastic' or 'impressive': ! 95: ! 96: <col> MATCH 'sqlite fantastic OR impressive' ! 97: ! 98: Any token that is part of an Fts3 query expression, whether or not it is ! 99: part of a phrase enclosed in quotes, may have a '*' character appended to ! 100: it. In this case, the token matches all terms that begin with the characters ! 101: of the token, not just those that exactly match it. For example, the ! 102: following query: ! 103: ! 104: <col> MATCH 'sql*' ! 105: ! 106: matches all rows that contain the term "SQLite", as well as those that ! 107: contain "SQL". ! 108: ! 109: A token that is not part of a quoted phrase may be preceded by a '-' ! 110: character, which indicates that matching rows must not contain the ! 111: specified term. For example, the following: ! 112: ! 113: <col> MATCH '"database engine" -sqlite' ! 114: ! 115: matches rows that contain the phrase "database engine" but do not contain ! 116: the term "sqlite". If the '-' character occurs inside a quoted phrase, ! 117: it is ignored. It is possible to use both the '-' prefix and the '*' postfix ! 118: on a single term. At this time, all Fts3 queries must contain at least ! 119: one term or phrase that is not preceded by the '-' prefix. ! 120: ! 121: Regardless of whether or not a table name or column name is used on the ! 122: left hand side of the MATCH operator, a specific column of the fts3 table ! 123: may be associated with each token in a query by preceding a token with ! 124: a column name followed by a ':' character. For example, regardless of what ! 125: is specified for <col>, the following query requires that column "col1" ! 126: of the table contains the term "hello", and that column "col2" of the ! 127: table contains the term "world". If the table does not contain columns ! 128: named "col1" and "col2", then an error is returned and the query is ! 129: not run. ! 130: ! 131: <col> MATCH 'col1:hello col2:world' ! 132: ! 133: It is not possible to associate a specific table column with a quoted ! 134: phrase or a term preceded by a '-' operator. A '*' character may be ! 135: appended to a term associated with a specific column for prefix matching. ! 136: ! 137: 3. ENHANCED QUERY SYNTAX ! 138: ! 139: The enhanced query syntax is quite similar to the standard query syntax, ! 140: with the following four differences: ! 141: ! 142: 1) Parenthesis are supported. When using the enhanced query syntax, ! 143: parenthesis may be used to overcome the built-in precedence of the ! 144: supplied binary operators. For example, the following query: ! 145: ! 146: <col> MATCH '(hello world) OR (simple example)' ! 147: ! 148: matches documents that contain both "hello" and "world", and documents ! 149: that contain both "simple" and "example". It is not possible to forumlate ! 150: such a query using the standard syntax. ! 151: ! 152: 2) Instead of separating tokens and phrases by whitespace, an AND operator ! 153: may be explicitly specified. This does not change query processing at ! 154: all, but may be used to improve readability. For example, the following ! 155: query is handled identically to the one above: ! 156: ! 157: <col> MATCH '(hello AND world) OR (simple AND example)' ! 158: ! 159: As with the OR and NEAR operators, the AND operator must be specified ! 160: in upper case. The word "and" specified in lower or mixed case is ! 161: handled as a regular token. ! 162: ! 163: 3) The '-' token prefix is not supported. Instead, a new binary operator, ! 164: NOT, is included. The NOT operator requires that the query specified ! 165: as its left-hand operator matches, but that the query specified as the ! 166: right-hand operator does not. For example, to query for all rows that ! 167: contain the term "example" but not the term "simple", the following ! 168: query could be used: ! 169: ! 170: <col> MATCH 'example NOT simple' ! 171: ! 172: As for all other operators, the NOT operator must be specified in ! 173: upper case. Otherwise it will be treated as a regular token. ! 174: ! 175: 4) Unlike in the standard syntax, where the OR operator has a higher ! 176: precedence than the implicit AND operator, when using the enhanced ! 177: syntax implicit and explict AND operators have a higher precedence ! 178: than OR operators. Using the enhanced syntax, the following two ! 179: queries are equivalent: ! 180: ! 181: <col> MATCH 'sqlite fantastic OR impressive' ! 182: <col> MATCH '(sqlite AND fantastic) OR impressive' ! 183: ! 184: however, when using the standard syntax, the query: ! 185: ! 186: <col> MATCH 'sqlite fantastic OR impressive' ! 187: ! 188: is equivalent to the enhanced syntax query: ! 189: ! 190: <col> MATCH 'sqlite AND (fantastic OR impressive)' ! 191: ! 192: The precedence of all enhanced syntax operators, in order from highest ! 193: to lowest, is: ! 194: ! 195: NEAR (highest precedence, tightest grouping) ! 196: NOT ! 197: AND ! 198: OR (lowest precedence, loosest grouping) ! 199: ! 200: Using the advanced syntax, it is possible to specify expressions enclosed ! 201: in parenthesis as operands to the NOT, AND and OR operators. However both ! 202: the left and right hand side operands of NEAR operators must be either ! 203: tokens or phrases. Attempting the following query will return an error: ! 204: ! 205: <col> MATCH 'sqlite NEAR (fantastic OR impressive)' ! 206: ! 207: Queries of this form must be re-written as: ! 208: ! 209: <col> MATCH 'sqlite NEAR fantastic OR sqlite NEAR impressive'