Python Statistics Module
The statistics module in Python's standard library provides functions for statistical calculations on numeric data. It covers common operations such as the mean, median, mode, variance, and standard deviation.
Basic Statistical Functions
mean(data)
- Returns the arithmetic mean (average) of the data.
- The input must be a non-empty sequence or iterable of numbers.
import statistics
data = [10, 20, 30, 40, 50]
print(statistics.mean(data))  # Output: 30

median(data)
- Returns the median (middle value) of the data.
- If the number of data points is odd, the middle value is returned.
- If the number of data points is even, the average of the two middle values is returned.
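The even-length rule is easy to check with a small sketch. The list even_data is made up for illustration and does not come from the original examples.

```python
import statistics

# Hypothetical even-length data set, chosen only for illustration.
even_data = [10, 20, 30, 40]

# The two middle values are 20 and 30, so the median is their average.
even_median = statistics.median(even_data)
print(even_median)  # 25.0
```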
data = [10, 20, 30, 40, 50]
print(statistics.median(data))  # Output: 30

mode(data)
- Returns the most frequent value in the data.
- If several values tie for most frequent, the first one encountered in the data is returned (Python 3.8 and later; earlier versions raise StatisticsError).
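A minimal sketch of the tie behaviour, assuming Python 3.8 or later. The list tied_data is hypothetical, and statistics.multimode is a related function not covered in the text above; it is shown only to contrast with mode.

```python
import statistics

# Hypothetical data in which 2 and 3 both appear twice.
tied_data = [1, 2, 2, 3, 3, 4]

# mode() returns the first of the tied values (Python 3.8+).
first_mode = statistics.mode(tied_data)

# multimode() returns every tied value, in the order first seen.
all_modes = statistics.multimode(tied_data)

print(first_mode)  # 2
print(all_modes)   # [2, 3]
```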
data = [1, 2, 2, 3, 3, 3, 4, 4]
print(statistics.mode(data))  # Output: 3

stdev(data)
- Returns the standard deviation of the data, a measure of the amount of variation or dispersion.
- Uses the sample standard deviation (n - 1 degrees of freedom).
data = [10, 20, 30, 40, 50]
print(statistics.stdev(data))  # Output: 15.811388300841896

variance(data)
- Returns the variance of the data.
- Uses the sample variance (n - 1 degrees of freedom).
data = [10, 20, 30, 40, 50]
print(statistics.variance(data))  # Output: 250
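To make the n - 1 divisor concrete, the sketch below recomputes the sample variance and standard deviation by hand for the same data. The variable names are illustrative, and this is a plain restatement of the textbook formula, not the module's internal implementation.

```python
import math
import statistics

data = [10, 20, 30, 40, 50]
n = len(data)
mean = sum(data) / n  # 30.0

# Sum of squared deviations from the mean, divided by n - 1.
manual_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# The sample standard deviation is the square root of the sample variance.
manual_stdev = math.sqrt(manual_variance)

print(manual_variance)  # 250.0
```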
Additional Functions
harmonic_mean(data)
- Returns the harmonic mean of the data.
- The harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the data.
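The definition can be written out directly and compared with the library call. This is a sketch for checking the formula; manual_hmean and library_hmean are illustrative names.

```python
import math
import statistics

data = [40, 60, 80]

# Reciprocal of the arithmetic mean of the reciprocals:
# n / (1/x1 + 1/x2 + ... + 1/xn)
manual_hmean = len(data) / sum(1 / x for x in data)
library_hmean = statistics.harmonic_mean(data)

print(round(manual_hmean, 2))  # 55.38
```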
data = [40, 60, 80]
print(statistics.harmonic_mean(data))  # Output: approximately 55.38 (3 / (1/40 + 1/60 + 1/80) = 720/13)

median_low(data)
- Returns the lower of the two middle values if the number of data points is even.
- If odd, it returns the middle value.
data = [10, 20, 30, 40, 50]
print(statistics.median_low(data))  # Output: 30

median_high(data)
- Returns the higher of the two middle values if the number of data points is even.
- If odd, it returns the middle value.
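With an odd number of data points, median_low, median_high, and median all return the same value, so the difference only shows up on even-length data. The list even_data below is a made-up example for illustration.

```python
import statistics

# Hypothetical even-length data; the two middle values are 20 and 30.
even_data = [10, 20, 30, 40]

low = statistics.median_low(even_data)    # lower middle value
high = statistics.median_high(even_data)  # higher middle value
middle = statistics.median(even_data)     # average of the two

print(low, high, middle)  # 20 30 25.0
```

Unlike median, the low and high variants always return a value that actually occurs in the data.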
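data = [10, 20, 30, 40, 50]
print(statistics.median_high(data))  # Output: 30

median_grouped(data, interval=1)
- Returns the median of grouped continuous data.
- Treats each data point as the midpoint of a class interval whose width is given by interval, and interpolates the median within the middle class.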
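The effect of the interval argument is easier to see when a value is repeated. In this sketch the data set is hypothetical, and each value is taken to be the midpoint of a class of the given width, which is how the Python documentation describes median_grouped.

```python
import statistics

# Hypothetical data; each value stands for the midpoint of a class interval.
grouped = [1, 3, 3, 5, 7]

# Width 1: the median class is 2.5 to 3.5.
width_1 = statistics.median_grouped(grouped, interval=1)

# Width 2: the median class is 2.0 to 4.0.
width_2 = statistics.median_grouped(grouped, interval=2)

print(width_1, width_2)  # 3.25 3.5
```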
data = [10, 20, 30, 40, 50]
print(statistics.median_grouped(data))  # Output: 30.0

pvariance(data)
- Returns the population variance of the data.
- Uses the population variance formula, which divides by n rather than n - 1.
data = [10, 20, 30, 40, 50]
print(statistics.pvariance(data))  # Output: 200

pstdev(data)
- Returns the population standard deviation of the data.
- Uses the population standard deviation formula (the square root of the population variance).
data = [10, 20, 30, 40, 50]
print(statistics.pstdev(data))  # Output: 14.142135623730951
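The sample and population functions differ only in the divisor, so one can be rescaled into the other. The following sketch checks that relationship on the same data; the variable names are illustrative.

```python
import math
import statistics

data = [10, 20, 30, 40, 50]
n = len(data)

sample_var = statistics.variance(data)       # divides by n - 1
population_var = statistics.pvariance(data)  # divides by n

# Same sum of squared deviations, different divisor.
rescaled = sample_var * (n - 1) / n

print(math.isclose(rescaled, population_var))  # True
```

As a rule of thumb, the population functions fit when the data is the entire population of interest, and the sample functions fit when the data is a sample drawn from a larger population.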
Example Usage
import statistics
data = [10, 20, 30, 40, 50]
# Mean (average)
print("Mean:", statistics.mean(data)) # Output: 30
# Median (middle value)
print("Median:", statistics.median(data)) # Output: 30
# Mode (most frequent value)
data_with_mode = [1, 2, 2, 3, 4]
print("Mode:", statistics.mode(data_with_mode)) # Output: 2
# Standard deviation
print("Standard Deviation:", statistics.stdev(data)) # Output: 15.811388300841896
# Variance
print("Variance:", statistics.variance(data)) # Output: 250
The statistics module provides a simple and effective way to perform statistical analysis on numeric data, offering essential functions to compute basic statistics, dispersion measures, and more.