sqlite-users@sqlite.org

[sqlite] FTS simple tokenizer Hamish Allan Sun Feb 26 15:00:46 2012

The docs for the simple tokenizer
(http://www.sqlite.org/fts3.html#tokenizer) say:

"A term is a contiguous sequence of eligible characters, where
eligible characters are all alphanumeric characters, the "_"
character, and all characters with UTF codepoints greater than or
equal to 128."

If I do:

CREATE VIRTUAL TABLE test USING fts3();
INSERT INTO test (content) VALUES ('hello_world');

SELECT * FROM test WHERE content MATCH 'orld';
SELECT * FROM test WHERE content MATCH 'world';

I get no match for the first query, because "orld" isn't a term. But
I do get a match for the second, whereas according to my reading of
the docs "world" shouldn't be a term either: the underscore is an
eligible character, so it shouldn't be treated as a term break.
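
To make that reading concrete, here is a minimal Python sketch of the
rule exactly as the quoted documentation words it. It is not SQLite's
tokenizer code and says nothing about what the library actually does;
the function name documented_terms is hypothetical and exists only
for this illustration.

```python
def documented_terms(text):
    """Split text into terms using the rule as quoted from the docs."""

    def eligible(ch):
        # Eligible per the quoted text: ASCII alphanumerics, the
        # underscore, and any code point >= 128.
        return (ch.isascii() and ch.isalnum()) or ch == "_" or ord(ch) >= 128

    terms, current = [], []
    for ch in text:
        if eligible(ch):
            current.append(ch)
        elif current:
            # An ineligible character ends the current term.
            terms.append("".join(current))
            current = []
    if current:
        terms.append("".join(current))
    return terms


print(documented_terms("hello_world"))  # ['hello_world']
print(documented_terms("hello world"))  # ['hello', 'world']
```

Under the rule as worded, 'hello_world' is a single term, so a MATCH
on 'world' alone would be expected to find nothing, just as 'orld'
finds nothing.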

Can anyone please help me understand this behaviour?

Thanks,
Hamish