This is a somewhat complex but simple issue. I am creating a small application that allows the storing on an object, that object has a title, description, tags, and sub-objects. the subjects contain comments. What would be an ideal way to use a search field to find any matches in these objects.
e.g. object 1 {id:1, title:test, description:fun, tags:[{games, toys}]} subobject 1 {lid:1, comment:'this is fun and fast'}
say i search: fun fast. or i search: test
i would like any of those to return the same object.
I understand i will have to explode the variable by space or comma and create an array, but how exactly should i construct the sql.
I am not trying to create a sophisticated page rank algrothim, if i have to restrict to search by title or tag or description i will do so.
note: this is a simple app, im just looking for options.
The simplest way I know is to use the PostgreSQL trigram searches. Assuming you have a table like this:
CREATE TABLE t (id int NOT NULL PRIMARY KEY, value text);
INSERT INTO t VALUES (1, '{id:1, title:test, description:fun, tags:[{games, toys}]} subobject 1 {lid:1, comment:''this is fun and fast''}');
You could prepare for searches with this:
CREATE EXTENSION pg_trgm;
CREATE INDEX t_trgm ON t USING gin (value gin_trgm_ops);
SELECT set_limit(0.05);
You only need to install the extension once per database. You can use GiST or GIN indexing; just use the corresponding operations set. GiST is generally faster to update; GIN is generally faster to search -- so the best choice depends on the workload. I set the similarity limit on the connection to 5% because you seem to be happy with matches on small portions of the string. Adjust as necessary. Remember this limit is session-based, so you need to set it up on the connection before you use the % operator.
Having done that setup, here are the searches:
SELECT * FROM t WHERE value % 'test';
SELECT * FROM t WHERE value % 'fun fast';
It doesn't get too much simpler than that.
While it probably won't even use the index when you only have one row in the table, we have found in our production databases with millions of rows to search, a similarity search will generally run in a few tens of milliseconds.
http://www.postgresql.org/docs/current/interactive/pgtrgm.html
Simplest way would be to combine your fields into a JSON object, and place that in MongoDB or another JSON db