The numpy.var() method computes the variance along the specified axis.
Example
import numpy as np
# create an array
array1 = np.array([0, 1, 2, 3, 4, 5, 6, 7])
# calculate the variance of the array
variance = np.var(array1)
print(variance)
# Output: 5.25
var() Syntax
The syntax of the numpy.var() method is:
numpy.var(array, axis = None, dtype = None, out = None, ddof = 0, keepdims = <no value>, where = <no value>)
var() Arguments
The numpy.var() method takes the following arguments:
array - array containing numbers whose variance is desired (can be array_like)
axis (optional) - axis or axes along which the variances are computed (int or tuple of int)
dtype (optional) - the data type to use in the calculation of variance (datatype)
out (optional) - output array in which to place the result (ndarray)
ddof (optional) - delta degrees of freedom (int)
keepdims (optional) - specifies whether to preserve the shape of the original array (bool)
where (optional) - elements to include in the variance (array of bool)
Notes: The default values of numpy.var() imply the following:
axis = None - the variance of the entire array is taken.
dtype = None - for integer input, float is used; otherwise, the variance has the same data type as the elements.
By default, keepdims and where are not passed.
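As a quick check of these defaults, the sketch below compares np.var() called without an axis against the variance of the flattened array, and inspects the result type for integer input. The array and variable names are illustrative only.

```python
import numpy as np

# illustrative integer array
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# axis = None: the array is treated as if it were flattened
var_default = np.var(array1)
var_flat = np.var(array1.ravel())

# dtype = None: integer input is computed in floating point
result_dtype = np.asarray(var_default).dtype

print(np.isclose(var_default, var_flat))
print(array1.dtype.kind, '->', result_dtype)
```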
var() Return Value
The numpy.var() method returns the variance of the array.
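The shape of the returned value depends on the axis argument. A minimal sketch, using an illustrative array: with no axis the result is a single value, and with an axis it is an array with that axis removed.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# no axis: a single value with zero dimensions
total = np.var(array1)

# axis = 1: one variance per row
per_row = np.var(array1, axis=1)

print(np.ndim(total), np.shape(per_row))
```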
Example 1: Find the variance of an ndarray
import numpy as np
# create an array
array1 = np.array([[[0, 1],
                    [2, 3]],
                   [[4, 5],
                    [6, 7]]])
# find the variance of the entire array
variance1 = np.var(array1)
# find the variance across axis 0
variance2 = np.var(array1, 0)
# find the variance across axis 0 and 1
variance3 = np.var(array1, (0, 1))
print('\nvariance of the entire array:', variance1)
print('\nvariance across axis 0:\n', variance2)
print('\nvariance across axis 0 and 1:', variance3)
Output
variance of the entire array: 5.25

variance across axis 0:
 [[4. 4.]
 [4. 4.]]

variance across axis 0 and 1: [5. 5.]
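Passing a tuple of axes reduces over all of them at once. One way to picture this, sketched below with the same 2 x 2 x 2 values, is to merge axes 0 and 1 into a single axis of length 4 and take the variance along it. The reshape is an illustration, not something np.var() requires.

```python
import numpy as np

# same values as the 3-D array above
array1 = np.arange(8).reshape(2, 2, 2)

# variance over axes 0 and 1 together
var_axes = np.var(array1, axis=(0, 1))

# merge axes 0 and 1 into one axis of length 4, then reduce over it
var_merged = np.var(array1.reshape(4, 2), axis=0)

print(np.allclose(var_axes, var_merged))
```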
Example 2: Specifying the Data Type of Variance of an ndarray
The dtype parameter controls the data type used to compute, and therefore to return, the variance.
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# by default int is converted to float
result1 = np.var(array1)
# get integer variance
result2 = np.var(array1, dtype = int)
print('Float variance:', result1)
print('Integer variance:', result2)
Output
Float variance: 2.9166666666666665
Integer variance: 3
Note: Using a lower precision dtype, such as int, can lead to a loss of accuracy.
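The same concern applies to single-precision floats. The sketch below, modeled on the example in the NumPy documentation, computes the variance of float32 data once in float32 and once with dtype = np.float64. The data is illustrative, and the exact digits printed may differ between NumPy versions, so no output is shown.

```python
import numpy as np

# illustrative data: many float32 values, half 1.0 and half 0.1
data = np.zeros((2, 512 * 512), dtype=np.float32)
data[0, :] = 1.0
data[1, :] = 0.1

# computed in float32, the input's own type
var32 = np.var(data)

# computed in float64 for higher accuracy
var64 = np.var(data, dtype=np.float64)

print(var32.dtype, var32)
print(var64.dtype, var64)
```

The float64 result should be very close to 0.2025, the variance of an even mix of 1.0 and 0.1.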
Example 3: Using Optional keepdims Argument
If keepdims is set to True, the resulting variance array has the same number of dimensions as the original array.
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# keepdims defaults to False
result1 = np.var(array1, axis = 0)
# pass keepdims as True
result2 = np.var(array1, axis = 0, keepdims = True)
print('Dimensions in original array:', array1.ndim)
print('Without keepdims:', result1, 'with dimensions', result1.ndim)
print('With keepdims:', result2, 'with dimensions', result2.ndim)
Output
Dimensions in original array: 2
Without keepdims: [2.25 2.25 2.25] with dimensions 1
With keepdims: [[2.25 2.25 2.25]] with dimensions 2
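A common reason to keep the reduced dimension is broadcasting. In the sketch below, the per-row mean and variance keep shape (2, 1), so they can be combined directly with the (2, 3) array to standardize each row. The standardization step is an illustrative use case, not part of np.var() itself.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# per-row mean and variance, kept as columns of shape (2, 1)
row_mean = np.mean(array1, axis=1, keepdims=True)
row_var = np.var(array1, axis=1, keepdims=True)

# shapes (2, 3) and (2, 1) broadcast row by row
standardized = (array1 - row_mean) / np.sqrt(row_var)

print(row_var.shape)
print(standardized)
```

Without keepdims = True, the row statistics would have shape (2,), which does not broadcast against a (2, 3) array.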
Example 4: Using Optional where Argument
The optional argument where specifies which elements to include in the variance.
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# take variance of the entire array
result1 = np.var(array1)
# variance of only even elements
result2 = np.var(array1, where = (array1 % 2 == 0))
# variance of numbers greater than 3
result3 = np.var(array1, where = (array1 > 3))
print('variance of entire array:', result1)
print('variance of only even elements:', result2)
print('variance of numbers greater than 3:', result3)
Output
variance of entire array: 2.9166666666666665
variance of only even elements: 2.6666666666666665
variance of numbers greater than 3: 0.6666666666666666
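When no axis is given, the where argument should give the same result as selecting the elements with a boolean mask first. A minimal sketch of that comparison, assuming a NumPy version that supports where (1.20 or later); the name mask is illustrative.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# True for the elements to include
mask = array1 > 3

# variance restricted with where
var_where = np.var(array1, where=mask)

# variance of the elements picked out by boolean indexing
var_indexed = np.var(array1[mask])

print(np.isclose(var_where, var_indexed))
```

Unlike boolean indexing, where can also be combined with axis, since it leaves the shape of the array intact.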
Example 5: Using Optional out Argument
The out parameter allows us to specify an output array where the result will be stored.
import numpy as np
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])
# create an output array
output = np.zeros(3)
# compute variance and store the result in the output array
np.var(array1, out = output, axis = 0)
print('variance:', output)
Output
variance: [2.25 2.25 2.25]
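The output array needs the shape of the expected result, here one slot per column. According to the NumPy documentation, when out is given the function returns a reference to that array; the sketch below captures the return value to check this. The name result is illustrative.

```python
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

# preallocated float buffer with one slot per column
output = np.zeros(3)

# the variances are written into output, which is also returned
result = np.var(array1, axis=0, out=output)

print(result is output)
print(output)
```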
Frequently Asked Questions
Variance is the average of the squared deviation from the mean. It is the measure of the spread of values around the mean in the given array.
Mathematically,
var = sum((array1 - array1.mean()) ** 2) / N
In NumPy,
import numpy as np
array1 = np.array([2, 4, 6, 8, 10])
# calculate variance using np.var()
variance1 = np.var(array1)
# calculate variance without using np.var()
mean = np.mean(array1)
diff_squared = (array1 - mean) ** 2
variance2 = np.mean(diff_squared)
print('variance with np.var():', variance1)
print('variance without np.var():', variance2)
Output
variance with np.var(): 8.0
variance without np.var(): 8.0
What is the ddof parameter of numpy.var() used for?
The ddof (Delta Degrees of Freedom) parameter in numpy.var() allows adjusting the divisor used in the calculation of variance. The default value is 0, which corresponds to dividing by N, the number of elements.
Including ddof, the formula for the variance is
var = sum((array1 - array1.mean()) ** 2) / (N - ddof)
Let's look at an example.
import numpy as np
array1 = np.array([1, 2, 3, 4, 5])
# calculate variance with the default ddof = 0
variance0 = np.var(array1)
# calculate variance with ddof = 1
variance1 = np.var(array1, ddof = 1)
print('variance (default ddof = 0):', variance0)
print('variance (ddof = 1):', variance1)
Output
variance (default ddof = 0): 2.0
variance (ddof = 1): 2.5
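The two settings correspond to what are usually called the population variance (ddof = 0) and the sample variance (ddof = 1). As a cross-check, the sketch below compares them with pvariance() and variance() from Python's standard statistics module and confirms that they differ only by the factor N / (N - 1). The variable names are illustrative.

```python
import statistics
import numpy as np

values = [1, 2, 3, 4, 5]
array1 = np.array(values)
n = array1.size

var_population = np.var(array1)       # divides by N
var_sample = np.var(array1, ddof=1)   # divides by N - 1

# the two differ only by the factor N / (N - 1)
print(np.isclose(var_sample, var_population * n / (n - 1)))

# standard-library counterparts
print(np.isclose(var_population, statistics.pvariance(values)))
print(np.isclose(var_sample, statistics.variance(values)))
```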