Annotation of embedaddon/sqlite3/ext/fts3/README.content, revision 1.1
1.1 ! misho 1:
! 2: FTS4 CONTENT OPTION
! 3:
! 4: Normally, in order to create a full-text index on a dataset, the FTS4
! 5: module stores a copy of all indexed documents in a specially created
! 6: database table.
! 7:
! 8: As of SQLite version 3.7.9, FTS4 supports a new option - "content" -
! 9: designed to extend FTS4 to support the creation of full-text indexes where:
! 10:
! 11: * The indexed documents are not stored within the SQLite database
! 12: at all (a "contentless" FTS4 table), or
! 13:
! 14: * The indexed documents are stored in a database table created and
! 15: managed by the user (an "external content" FTS4 table).
! 16:
! 17: Because the indexed documents themselves are usually much larger than
! 18: the full-text index, the content option can sometimes be used to achieve
! 19: significant space savings.
! 20:
! 21: CONTENTLESS FTS4 TABLES
! 22:
! 23: In order to create an FTS4 table that does not store a copy of the indexed
! 24: documents at all, the content option should be set to an empty string.
! 25: For example, the following SQL creates such an FTS4 table with three
! 26: columns - "a", "b", and "c":
! 27:
! 28: CREATE VIRTUAL TABLE t1 USING fts4(content="", a, b, c);
! 29:
! 30: Data can be inserted into such an FTS4 table using an INSERT statements.
! 31: However, unlike ordinary FTS4 tables, the user must supply an explicit
! 32: integer docid value. For example:
! 33:
! 34: -- This statement is Ok:
! 35: INSERT INTO t1(docid, a, b, c) VALUES(1, 'a b c', 'd e f', 'g h i');
! 36:
! 37: -- This statement causes an error, as no docid value has been provided:
! 38: INSERT INTO t1(a, b, c) VALUES('j k l', 'm n o', 'p q r');
! 39:
! 40: It is not possible to UPDATE or DELETE a row stored in a contentless FTS4
! 41: table. Attempting to do so is an error.
! 42:
! 43: Contentless FTS4 tables also support SELECT statements. However, it is
! 44: an error to attempt to retrieve the value of any table column other than
! 45: the docid column. The auxiliary function matchinfo() may be used, but
! 46: snippet() and offsets() may not. For example:
! 47:
! 48: -- The following statements are Ok:
! 49: SELECT docid FROM t1 WHERE t1 MATCH 'xxx';
! 50: SELECT docid FROM t1 WHERE a MATCH 'xxx';
! 51: SELECT matchinfo(t1) FROM t1 WHERE t1 MATCH 'xxx';
! 52:
! 53: -- The following statements all cause errors, as the value of columns
! 54: -- other than docid are required to evaluate them.
! 55: SELECT * FROM t1;
! 56: SELECT a, b FROM t1 WHERE t1 MATCH 'xxx';
! 57: SELECT docid FROM t1 WHERE a LIKE 'xxx%';
! 58: SELECT snippet(t1) FROM t1 WHERE t1 MATCH 'xxx';
! 59:
! 60: Errors related to attempting to retrieve column values other than docid
! 61: are runtime errors that occur within sqlite3_step(). In some cases, for
! 62: example if the MATCH expression in a SELECT query matches zero rows, there
! 63: may be no error at all even if a statement does refer to column values
! 64: other than docid.
! 65:
! 66: EXTERNAL CONTENT FTS4 TABLES
! 67:
! 68: An "external content" FTS4 table is similar to a contentless table, except
! 69: that if evaluation of a query requires the value of a column other than
! 70: docid, FTS4 attempts to retrieve that value from a table (or view, or
! 71: virtual table) nominated by the user (hereafter referred to as the "content
! 72: table"). The FTS4 module never writes to the content table, and writing
! 73: to the content table does not affect the full-text index. It is the
! 74: responsibility of the user to ensure that the content table and the
! 75: full-text index are consistent.
! 76:
! 77: An external content FTS4 table is created by setting the content option
! 78: to the name of a table (or view, or virtual table) that may be queried by
! 79: FTS4 to retrieve column values when required. If the nominated table does
! 80: not exist, then an external content table behaves in the same way as
! 81: a contentless table. For example:
! 82:
! 83: CREATE TABLE t2(id INTEGER PRIMARY KEY, a, b, c);
! 84: CREATE VIRTUAL TABLE t3 USING fts4(content="t2", a, c);
! 85:
! 86: Assuming the nominated table does exist, then its columns must be the same
! 87: as or a superset of those defined for the FTS table.
! 88:
! 89: When a users query on the FTS table requires a column value other than
! 90: docid, FTS attempts to read this value from the corresponding column of
! 91: the row in the content table with a rowid value equal to the current FTS
! 92: docid. Or, if such a row cannot be found in the content table, a NULL
! 93: value is used instead. For example:
! 94:
! 95: CREATE TABLE t2(id INTEGER PRIMARY KEY, a, b, c, d);
! 96: CREATE VIRTUAL TABLE t3 USING fts4(content="t2", b, c);
! 97:
! 98: INSERT INTO t2 VALUES(2, 'a b', 'c d', 'e f');
! 99: INSERT INTO t2 VALUES(3, 'g h', 'i j', 'k l');
! 100: INSERT INTO t3(docid, b, c) SELECT id, b, c FROM t2;
! 101:
! 102: -- The following query returns a single row with two columns containing
! 103: -- the text values "i j" and "k l".
! 104: --
! 105: -- The query uses the full-text index to discover that the MATCH
! 106: -- term matches the row with docid=3. It then retrieves the values
! 107: -- of columns b and c from the row with rowid=3 in the content table
! 108: -- to return.
! 109: --
! 110: SELECT * FROM t3 WHERE t3 MATCH 'k';
! 111:
! 112: -- Following the UPDATE, the query still returns a single row, this
! 113: -- time containing the text values "xxx" and "yyy". This is because the
! 114: -- full-text index still indicates that the row with docid=3 matches
! 115: -- the FTS4 query 'k', even though the documents stored in the content
! 116: -- table have been modified.
! 117: --
! 118: UPDATE t2 SET b = 'xxx', c = 'yyy' WHERE rowid = 3;
! 119: SELECT * FROM t3 WHERE t3 MATCH 'k';
! 120:
! 121: -- Following the DELETE below, the query returns one row containing two
! 122: -- NULL values. NULL values are returned because FTS is unable to find
! 123: -- a row with rowid=3 within the content table.
! 124: --
! 125: DELETE FROM t2;
! 126: SELECT * FROM t3 WHERE t3 MATCH 'k';
! 127:
! 128: When a row is deleted from an external content FTS4 table, FTS4 needs to
! 129: retrieve the column values of the row being deleted from the content table.
! 130: This is so that FTS4 can update the full-text index entries for each token
! 131: that occurs within the deleted row to indicate that that row has been
! 132: deleted. If the content table row cannot be found, or if it contains values
! 133: inconsistent with the contents of the FTS index, the results can be difficult
! 134: to predict. The FTS index may be left containing entries corresponding to the
! 135: deleted row, which can lead to seemingly nonsensical results being returned
! 136: by subsequent SELECT queries. The same applies when a row is updated, as
! 137: internally an UPDATE is the same as a DELETE followed by an INSERT.
! 138:
! 139: Instead of writing separately to the full-text index and the content table,
! 140: some users may wish to use database triggers to keep the full-text index
! 141: up to date with respect to the set of documents stored in the content table.
! 142: For example, using the tables from earlier examples:
! 143:
! 144: CREATE TRIGGER t2_bu BEFORE UPDATE ON t2 BEGIN
! 145: DELETE FROM t3 WHERE docid=old.rowid;
! 146: END;
! 147: CREATE TRIGGER t2_bd BEFORE DELETE ON t2 BEGIN
! 148: DELETE FROM t3 WHERE docid=old.rowid;
! 149: END;
! 150:
! 151: CREATE TRIGGER t2_bu AFTER UPDATE ON t2 BEGIN
! 152: INSERT INTO t3(docid, b, c) VALUES(new.rowid, new.b, new.c);
! 153: END;
! 154: CREATE TRIGGER t2_bd AFTER INSERT ON t2 BEGIN
! 155: INSERT INTO t3(docid, b, c) VALUES(new.rowid, new.b, new.c);
! 156: END;
! 157:
! 158: The DELETE trigger must be fired before the actual delete takes place
! 159: on the content table. This is so that FTS4 can still retrieve the original
! 160: values in order to update the full-text index. And the INSERT trigger must
! 161: be fired after the new row is inserted, so as to handle the case where the
! 162: rowid is assigned automatically within the system. The UPDATE trigger must
! 163: be split into two parts, one fired before and one after the update of the
! 164: content table, for the same reasons.
! 165:
! 166: FTS4 features a special command similar to the 'optimize' command that
! 167: deletes the entire full-text index and rebuilds it based on the current
! 168: set of documents in the content table. Assuming again that "t3" is the
! 169: name of the external content FTS4 table, the command is:
! 170:
! 171: INSERT INTO t3(t3) VALUES('rebuild');
! 172:
! 173: This command may also be used with ordinary FTS4 tables, although it may
! 174: only be useful if the full-text index has somehow become corrupt. It is an
! 175: error to attempt to rebuild the full-text index maintained by a contentless
! 176: FTS4 table.
! 177:
! 178:
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>