CouchDB: use regex to search body

I am pretty new to CouchDB, Map/Reduce and NoSQL in general.

I was hoping you could guide me on how to implement a very basic search in my Node.js/CouchDB application.

I am looking at searching for certain text elements within the CouchDB documents.

Most of my documents in CouchDB follow a format similar to the one below:

{
    "_id": "b001729a2c5cf4100ea50a71ec04e20b",
    "_rev": "1-8928ea122d80fd5c1da1c22dfc6ce46e",
    "Approved To Publish": "false",
    "Article Body": "has retained the Carbon Trust Standard for energy and carbon management.Following a demanding audit process that took place in December 2012, was awarded recertification for the Carbon Trust Standard.This certificate is a mark of excellence in recognition of how well we measure, manage and reduce our carbon emissions.",
    "Headline": "Delight retains Carbon Trust Standard"
}

My search keys would be, for example, 'Carbon trust', 'emissions', 'excellence recognition', etc.

What I have so far is a temporary map function, which I send in the body of a POST request from my Node.js application, but I am sure it's not the right approach; I would expect it to be a stored view in CouchDB.

My Map function:

function (doc) {
    if ('Headline' in doc) {
        if (doc['Article Body'].search(/" + req.body.jsonValue  + "/i) !== -1
            || doc.Headline.search(/" + req.body.jsonValue + "/i) !== -1) {
            var key = doc.Headline,
                value = doc;
            emit(key, value);
        }
    }
}

Please let me know what I need to do to improve my approach, or if anything is unclear.

Regards

A list function has access to querystring values, so you can just add one that you use in conjunction with your view.

map function

function (doc) {
    if ("Headline" in doc) {
        emit(doc.Headline, doc);
    }
}

list function

function (head, req) {
    var rows = [],
        regex = new RegExp(req.query.search, "i"), // use the querystring param to create the RegExp object
        row;

    while (row = getRow()) {
        // search() returns -1 when there is no match, so both checks need "> -1"
        if (row.value.Headline.search(regex) > -1 || row.value["Article Body"].search(regex) > -1) {
            rows.push(row);
        }
    }

    // I'm just mocking up what a view's output would look like (not required)
    send(JSON.stringify({
        total_rows: rows.length,
        offset: 0,
        rows: rows
    }));
}

Of course you could modify the list function to emit data in chunks, rather than all at once, but this should give you an idea of what a solution like this looks like.
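To address the "stored view" part of the question: the map and list functions above can be saved in a design document and then queried over HTTP. Here is a minimal sketch, assuming a database called `articles`, a design document `_design/app`, a view named `by_headline` and a list named `search`. All of those names, and the helper `listSearchPath`, are made up for illustration.

```javascript
// Hypothetical names: "articles", "app", "by_headline" and "search" are
// placeholders chosen for this example, not anything CouchDB requires.
var designDoc = {
    _id: "_design/app",
    views: {
        by_headline: {
            // CouchDB stores view functions as strings
            map: "function (doc) { if ('Headline' in doc) { emit(doc.Headline, doc); } }"
        }
    },
    lists: {
        // placeholder: the list function shown above goes here, as a string
        search: "function (head, req) { /* ... */ }"
    }
};

// Build the path for GET /{db}/_design/{ddoc}/_list/{list}/{view}?search=...
// The "search" parameter is what the list function reads as req.query.search.
function listSearchPath(db, ddoc, list, view, term) {
    var segments = [db, "_design", ddoc, "_list", list, view];
    return "/" + segments.map(function (s) { return encodeURIComponent(s); }).join("/") +
        "?search=" + encodeURIComponent(term);
}

console.log(listSearchPath("articles", "app", "search", "by_headline", "Carbon trust"));
```

The search term is URL-encoded because it comes from user input; since it is also passed to `new RegExp`, characters such as `(` or `*` will be treated as regex syntax unless you escape them first.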

You could instead build a view where keys are keywords (or alternatively any words) and values are _ids.

The drawback is that this view could potentially become very large. CouchDB experts may have a better solution to what is, I suppose, a typical problem.

Naive example (see note 1):

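function (doc) {
    var words, i;
    if ('Headline' in doc) {
        // loop by index: for...in over an array yields indices, not the words
        words = doc.Headline.split(' ');
        for (i = 0; i < words.length; i++) {
            emit(words[i], doc._id);
        }
    }
    if ('Article Body' in doc) {
        words = doc['Article Body'].split(' ');
        for (i = 0; i < words.length; i++) {
            emit(words[i], doc._id);
        }
    }
}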

Then you would query it with /app/_design/app/_view/search?key="keyword" for instance.
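One detail that is easy to miss: the `key` parameter has to be a JSON value, which is why the keyword is wrapped in double quotes, and it should be URL-encoded when it comes from user input. A small sketch of building that path from Node.js follows; the helper name `viewKeyPath` is hypothetical, and the database, design document and view names are taken from the URL above.

```javascript
// Build /{db}/_design/{ddoc}/_view/{view}?key="<word>" with proper encoding.
function viewKeyPath(db, ddoc, view, word) {
    // JSON.stringify adds the double quotes CouchDB expects around a string key
    return "/" + db + "/_design/" + ddoc + "/_view/" + view +
        "?key=" + encodeURIComponent(JSON.stringify(word));
}

console.log(viewKeyPath("app", "app", "search", "emissions"));
// prints /app/_design/app/_view/search?key=%22emissions%22
```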

1: In practice you would also need to normalize case, remove punctuation, and drop common words like a, the, of and more…
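A rough sketch of that normalization step, in case it helps. The function name `tokenize` and the stop-word list are assumptions for illustration only. It handles ASCII letters and digits and would need adjusting for other alphabets.

```javascript
// Illustrative stop-word list; a real one would be considerably longer.
var STOP_WORDS = { a: true, an: true, the: true, of: true, and: true, "for": true, "in": true, is: true, to: true };

function tokenize(text) {
    var seen = {}, out = [], i, w;
    // lowercase, then turn every run of non-alphanumeric characters into a space
    var words = String(text).toLowerCase().replace(/[^a-z0-9]+/g, " ").split(" ");
    for (i = 0; i < words.length; i++) {
        w = words[i];
        // skip empty strings, stop words, and words already collected
        if (w && !STOP_WORDS.hasOwnProperty(w) && !seen.hasOwnProperty(w)) {
            seen[w] = true;
            out.push(w);
        }
    }
    return out;
}

console.log(tokenize("Delight retains Carbon Trust Standard"));
```

Inside the map function you would call this on `doc.Headline` and `doc['Article Body']` and emit each returned word with `doc._id`. The same normalization then has to be applied to the search keyword before querying, otherwise 'Carbon' and 'carbon' would not match.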