Below is the logic that I am trying to implement but I am finding it really difficult to figure out a way with MongoDB/ Node.js app
Data: country, state, val1
I need to compute mean and std. deviation using the below formula. I checked other stack overflow posts but the std dev formula that i am working is not the same:
for each row -> group by country, state
mean = sum(val1)/count ->
for each row ->
deviation += Math.pow((val1 - mean), 2)
for each row -> group by country, state
std dev = Math.sqrt(dev/ count)
the problem is with the way deviation needs to be computed. It looks like I need an aggregation for Mean before computing the deviation/ std dev through Map reduce which I dont find a way to compute. Could anyone suggest a way to do this?
If it is not possible, do we have a way to issue an update statement in mongodb similar to the below traditional merge query? I shall update the mean value for all the rows and would later invoke Mapreduce for the deviation/std dev.
merge into Tbl1 a using
(select b.country, b.state, sum(b.val1)/count(b.val1) as mean
from Tbl1 b
group by b.country, b.state) c
on (a.country = c.country and
a.state = c.state)
when matched
then update
set a.mean = c.mean
I am pretty new to the nosql and nodejs and it would be great if you guys could suggest a solution/ alternative.
Yes, computing standard deviation using map-reduce is tricky as you need to compare each data value to the mean in the traditional algorithm.
Take a look at this solution based upon the parallel calculation algorithm: https://gist.github.com/RedBeard0531/1886960