mean in Javascript

calculate mean and standard deviation from a vector of samples in C++ using boost

Is there a way to calculate mean and standard deviation for a vector containing samples using boost? Or do I have to create an accumulator and feed the vector into it?

Source: (StackOverflow)

How to use numpy with 'None' value in Python?

I'd like to calculate the mean of an array in Python in this form:

Matrice = [1, 2, None]

I'd just like to have my None value ignored by the numpy.mean calculation but I can't figure out how to do it.

Source: (StackOverflow)
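
For the None question above, one possible approach is to map each None to NaN and then call numpy.nanmean, which skips NaN entries. This is a minimal sketch, assuming that treating None as a missing value is acceptable; the helper name mean_ignoring_none is hypothetical and not part of numpy.

```python
import numpy as np

def mean_ignoring_none(values):
    """Mean of `values`, skipping None entries (hypothetical helper)."""
    # Map None to NaN so the data fits in a float array.
    arr = np.array([np.nan if v is None else v for v in values], dtype=float)
    # nanmean ignores NaN entries when averaging.
    return np.nanmean(arr)

Matrice = [1, 2, None]
print(mean_ignoring_none(Matrice))  # 1.5
```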

calculate the mean for each column of a matrix in R

I am working in R, using RStudio. I need to calculate the mean for each column of a data frame.

 cluster1        # 5 by 4 data frame
 mean(cluster1)

I got :

  Warning message:
  In mean.default(cluster1) :
  argument is not numeric or logical: returning NA

But I can use

  mean(cluster1[[1]])

to get the mean of the first column.

How do I get the means for all columns?

Any help would be appreciated.

Source: (StackOverflow)

Calculate mean across dimension in a 2D array

I have an array a like this:

a=[]
a.append([40,10])
a.append([50,11])

so it looks like this:

 >>> a
[[40, 10], [50, 11]]

I need to calculate the mean for each dimension separately, the result should be this:

[45,10.5]

45 being the mean of a[*][0] and 10.5 the mean of a[*][1].

What is the most elegant way of solving this without going to a loop?

Source: (StackOverflow)
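
For the nested list above, a loop-free sketch is to let numpy average along the first axis. This assumes numpy is available and that every inner list has the same length; the variable name column_means is illustrative.

```python
import numpy as np

a = [[40, 10], [50, 11]]

# axis=0 averages down the rows, giving one mean per column.
column_means = np.mean(a, axis=0)
print(column_means.tolist())  # [45.0, 10.5]
```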

How to average over a cell-array of arrays?

I have a cell array c of equal-sized arrays, i.e. size(c{n}) = [ m l ... ] for any n. How can I get the mean values (averaging over the cell array index n) for all array elements in one sweep? I thought about using cell2mat and mean but the former does not add another dimension but changes l to l*n. And looping manually of course takes like forever...

Source: (StackOverflow)

Get mean of 2D slice of a 3D array in numpy

I have a numpy array with a shape of:

(11L, 5L, 5L)

I want to calculate the mean over the 25 elements of each 'slice' of the array [0, :, :], [1, :, :] etc, returning 11 values.

It seems silly, but I can't work out how to do this. I thought the mean(axis=x) function would do this, but I've tried all possible values of axis and none of them gives me the result I want.

I can obviously do this using a for loop and slicing, but surely there is a better way?

Source: (StackOverflow)
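
For the 3D case above, a single integer axis only collapses one dimension, which may be why no single value gave 11 results. A sketch of two options follows, using stand-in data with the shape from the question: pass a tuple of axes, or flatten each slice first. The array contents here are made up for illustration.

```python
import numpy as np

# Stand-in data with the shape from the question.
arr = np.arange(11 * 5 * 5, dtype=float).reshape(11, 5, 5)

# A tuple of axes averages over both trailing axes at once.
slice_means = arr.mean(axis=(1, 2))

# Equivalent: flatten each 5x5 slice to 25 values, then average along that axis.
slice_means_alt = arr.reshape(arr.shape[0], -1).mean(axis=1)

print(slice_means.shape)  # (11,)
```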

Mean Squared Error in Numpy?

Is there a method in numpy for calculating the Mean Squared Error between two matrices?

I've tried searching but found none. Is it under a different name?

If there isn't, how do you overcome this? Do you write it yourself or use a different lib?

Source: (StackOverflow)
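
For the mean squared error question above, the definition is short enough to write directly with array arithmetic. The following is a minimal sketch, assuming both inputs have the same shape; the function name mean_squared_error is a local helper defined here, not a numpy function.

```python
import numpy as np

def mean_squared_error(a, b):
    """Average of the squared element-wise differences (local helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.mean((a - b) ** 2)

A = [[1, 2], [3, 4]]
B = [[1, 2], [3, 6]]
print(mean_squared_error(A, B))  # 1.0
```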

Why does isNaN(" ") equal false

I have a quick question (I hope!). In JS, why does isNaN(" ") evaluate to false, but isNaN(" x") evaluate to true?

I'm performing numerical operations on a text input field, and am checking if the field is null, "", or NaN. When someone types a handful of spaces into the field, my validation fails on all three, and I'm confused as to why it gets past the isNaN check.

Thanks!

Source: (StackOverflow)

haskell - Average floating point error using QuickCheck

I am using QuickCheck-2.5.1.1 to do QA. I am testing two pure functions gold :: a -> Float and f :: a -> Float, where a is an instance of Arbitrary.

Here gold is a reference calculation and f is a variation I am optimizing.

To date, most of my tests using quickcheck have been using tests like \a -> abs (gold a - f a) < 0.0001.

However, I would like to gather statistics along with checking the threshold, since knowing the average error and standard deviation is useful in guiding my design.

Is there any way to use QuickCheck to gather statistics like this?

Concrete example

To give a concrete example of the sort of thing I'm looking for, suppose I have the following two functions for approximating square roots:

-- Heron's method
heron :: Float -> Float
heron x = heron' 5 1
    where
      heron' n est
          | n > 0 = heron' (n-1) $ (est + (x/est)) / 2
          | otherwise = est

-- Fifth order Maclaurin series expansion
maclaurin :: Float -> Float
maclaurin x = 1 + (1/2) * (x - 1) - (1/8)*(x - 1)^2
                + (1/16)*(x - 1)^3 - (5/128)*(x - 1)^4
                + (7/256)*(x - 1)^5

A test for this might be:

test = quickCheck
       $ forAll (choose (1,2))
       $ \x -> abs (heron x - maclaurin x) < 0.02

So what I'd like to know, as a side-effect of the test, is the statistics on abs (heron x - maclaurin x) (such as the mean and standard deviation).

Source: (StackOverflow)

Element-wise mean in R

In R, I have two vectors:

a <- c(1, 2, 3, 4)
b <- c(NA, 6, 7, 8)

How do I find the element-wise mean of the two vectors, removing NA, without a loop? i.e. I want to get the vector of

(1, 4, 5, 6)

I know the function mean(), I know the argument na.rm = 1. But I don't know how to put things together. To be sure, in reality I have thousands of vectors with NA appearing at various places, so any dimension-dependent solution wouldn't work. Thanks.

Source: (StackOverflow)

z-Scores(standard deviation and mean) in PHP

I am trying to calculate Z-scores using PHP. Essentially, I am looking for the most efficient way to calculate the mean and standard deviation of a data set (PHP array). Any suggestions on how to do this in PHP?

I am trying to do this in the smallest number of steps.

Source: (StackOverflow)

What's the quickest way to get the mean of a set of numbers from the command line?

Using any tools which you would expect to find on a *nix system (in fact, if you want, MS-DOS is fine too), what is the easiest/fastest way to calculate the mean of a set of numbers, assuming you have them one per line in a stream or file?

Source: (StackOverflow)

How to get column mean for specific rows only?

I need to get the mean of one column (here: score) for specific rows (here: years). Specifically, I would like to know the average score for three periods:

period 1: year <= 1983
period 2: year >= 1984 & year <= 1990
period 3: year >= 1991

This is the structure of my data:

  country year     score        
 Algeria 1980     -1.1201501 
 Algeria 1981     -1.0526943 
 Algeria 1982     -1.0561565 
 Algeria 1983     -1.1274560 
 Algeria 1984     -1.1353926 
 Algeria 1985     -1.1734330 
 Algeria 1986     -1.1327666 
 Algeria 1987     -1.1263586 
 Algeria 1988     -0.8529455 
 Algeria 1989     -0.2930265 
 Algeria 1990     -0.1564207 
 Algeria 1991     -0.1526328 
 Algeria 1992     -0.9757842 
 Algeria 1993     -0.9714060 
 Algeria 1994     -1.1422258 
 Algeria 1995     -0.3675797 
 ...

The calculated mean values should be added to the df in an additional column ("mean"), i.e. same mean value for years of period 1, for those of period 2 etc.

This is how it should look:

country year     score         mean   
 Algeria 1980     -1.1201501     -1.089
 Algeria 1981     -1.0526943     -1.089
 Algeria 1982     -1.0561565     -1.089
 Algeria 1983     -1.1274560     -1.089
 Algeria 1984     -1.1353926     -0.839
 Algeria 1985     -1.1734330     -0.839
 Algeria 1986     -1.1327666     -0.839
 Algeria 1987     -1.1263586     -0.839
 Algeria 1988     -0.8529455     -0.839
 Algeria 1989     -0.2930265     -0.839
 Algeria 1990     -0.1564207     -0.839
 ...

Every approach I tried quickly became very complicated, and I have to calculate the mean scores for different periods of time for over 90 countries ...

Many many thanks for your help!

Source: (StackOverflow)

Mean value and standard deviation of a very huge data set

I am wondering if there is an algorithm that calculates the mean value and standard deviation of an unbounded data set.

For example, I am monitoring a measurement value, say, electric current. I would like to have the mean value of all historical data. Whenever a new value comes in, can I update the mean and stdev? Because the data is too big to store, I hope I can just update the mean and stdev on the fly without storing the data.

Even if the data is stored, the standard way, (d1+...+dn)/n, doesn't work: the sum will overflow the data representation.

I thought about sum(d1/n + d2/n + ... + dn/n), but if n is huge, the error is too big and accumulates. Besides, n is unbounded in this case.

The number of data points is definitely unbounded; whenever a new one comes in, the value has to be updated.

Does anybody know if there is an algorithm for it?

Source: (StackOverflow)
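
For the streaming case above, one well-known option is Welford's online algorithm, which keeps only a count, the current mean, and a running sum of squared deviations. The sketch below is one way to write it, not the only one; the class name RunningStats and its method names are hypothetical, and it reports the sample standard deviation (dividing by n - 1).

```python
import math

class RunningStats:
    """Running mean and standard deviation via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        # Fold one new value into the running statistics in O(1) memory.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    def stdev(self):
        # Sample standard deviation; undefined for fewer than two values.
        if self.n < 2:
            return float("nan")
        return math.sqrt(self._m2 / (self.n - 1))

stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
print(stats.mean, stats.stdev())
```

Because the mean is updated by a small correction each step rather than by accumulating a large sum, no running total grows with n.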

compute mean in python for a generator

I'm doing some statistics work and have a (large) collection of random numbers to compute the mean of. I'd like to work with generators because I only need the mean, so I don't need to store the numbers.

The problem is that numpy.mean breaks if you pass it a generator. I can write a simple function to do what I want, but I'm wondering if there's a proper, built-in way to do this?

It would be nice if I could say "sum(values)/len(values)", but len doesn't work for generators, and sum already consumed values.

here's an example:

import numpy 

def my_mean(values):
    n = 0
    Sum = 0.0
    try:
        while True:
            Sum += next(values)
            n += 1
    except StopIteration: pass
    return float(Sum)/n

X = [k for k in range(1,7)]
Y = (k for k in range(1,7))

print(numpy.mean(X))
print(my_mean(Y))

These both give the same, correct answer, but my_mean doesn't work for lists, and numpy.mean doesn't work for generators.

I really like the idea of working with generators, but details like this seem to spoil things.

Source: (StackOverflow)
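
For the generator question above, a plain for loop consumes any iterable once, so the same function can serve both lists and generators. This is a minimal sketch; the name iter_mean is hypothetical, and raising ValueError on empty input is a design assumption rather than something the question specifies.

```python
def iter_mean(values):
    """Single-pass mean of any iterable (list, generator, ...)."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
    if count == 0:
        raise ValueError("mean of an empty iterable is undefined")
    return total / count

X = [k for k in range(1, 7)]
Y = (k for k in range(1, 7))

print(iter_mean(X))  # 3.5
print(iter_mean(Y))  # 3.5
```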