Matplotlib Scatter
A scatter plot in Matplotlib is used to plot points on a 2D plane, where each point represents a pair of values (x, y). Scatter plots are ideal for visualizing relationships or correlations between two variables. In Matplotlib, you can create scatter plots using the plt.scatter() function, which allows for customization of colors, sizes, markers, and more.
1. Basic Scatter Plot
To create a simple scatter plot, you need two arrays (or lists) of equal length: one for the x-axis and one for the y-axis.
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
# Create a scatter plot
plt.scatter(x, y)
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Basic Scatter Plot")
# Show plot
plt.show()
This will create a scatter plot with default settings, where each point corresponds to a pair of (x, y) values.
2. Customizing Markers
You can change the marker style, size, and color of the points in the scatter plot using various parameters.
- Marker Style: Use the marker parameter to specify the marker shape (e.g., 'o' for circles, 's' for squares, '^' for triangles).
- Marker Size: Use the s parameter to adjust the size of the markers.
- Marker Color: Use the c or color parameter to change the color of the markers.
Example: Customizing Markers
# Create a scatter plot with customized markers
plt.scatter(x, y, color='green', marker='s', s=100)
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Custom Marker Scatter Plot")
plt.show()
In this example:
- color='green' sets the marker color to green.
- marker='s' sets the marker shape to squares.
- s=100 increases the marker size.
3. Color Mapping Based on Values
You can color points by value by passing an array of numbers to the c parameter. The cmap parameter selects the colormap that maps those numbers to colors.
Example: Color by Value
import numpy as np
# Data
x = np.random.rand(50)
y = np.random.rand(50)
colors = np.random.rand(50)
# Create a scatter plot with color mapping
plt.scatter(x, y, c=colors, cmap='viridis')
# Add colorbar
plt.colorbar()
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter Plot with Color Mapping")
plt.show()
In this example:
- c=colors assigns a color to each point based on the corresponding value in the colors array.
- cmap='viridis' specifies the colormap to use.
- plt.colorbar() adds a colorbar to the plot to show the color scale.
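The color mapping described above can also be pinned to a fixed value range, which helps when several plots should share one scale. The following is a minimal sketch, not the only way to do it: the variable names (values, points, cbar) and the 0 to 1 range are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded so the figure is reproducible
x = rng.random(50)
y = rng.random(50)
values = rng.random(50)  # hypothetical per-point measurements in [0, 1)

plt.figure()
# scatter() returns a collection that remembers the colormap and value range;
# vmin/vmax fix the ends of the color scale instead of inferring them from the data
points = plt.scatter(x, y, c=values, cmap="viridis", vmin=0.0, vmax=1.0)

# Passing the collection to colorbar() ties the color scale to these points explicitly
cbar = plt.colorbar(points)
cbar.set_label("Value")
```

Call plt.show() afterwards to display the figure. Without vmin and vmax, the color scale stretches to the smallest and largest values in the array, so two plots of different data would not be directly comparable.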
4. Varying Point Sizes
You can vary the sizes of the markers based on the data values by passing an array to the s parameter.
Example: Varying Marker Sizes
# Data
sizes = 1000 * np.random.rand(50) # Random sizes for each point
# Create a scatter plot with varying marker sizes
plt.scatter(x, y, s=sizes, alpha=0.5)
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter Plot with Varying Sizes")
plt.show()
Here, each point’s size is determined by the corresponding value in the sizes array. The alpha=0.5 parameter makes the points semi-transparent, making it easier to distinguish overlapping points.
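One detail worth knowing when sizing markers from data: the s parameter is an area in points squared, not a width. The sketch below contrasts the two common choices; the weights array and the scale factors are hypothetical values picked only to make the difference visible.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: three points whose "weight" should drive the marker size
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 1.0])
weights = np.array([1.0, 2.0, 3.0])

# s is an area in points**2, so scaling it linearly with the data makes
# marker AREA proportional to weight (marker width grows as the square root)
area_sizes = 100 * weights

# To make marker WIDTH proportional to weight instead, square the scaled value
width_sizes = (10 * weights) ** 2

fig, (ax_area, ax_width) = plt.subplots(1, 2, figsize=(8, 3))
ax_area.scatter(x, y, s=area_sizes, alpha=0.5)
ax_area.set_title("Area proportional to weight")
ax_width.scatter(x, y, s=width_sizes, alpha=0.5)
ax_width.set_title("Width proportional to weight")
```

Call plt.show() afterwards to display the figure. Area scaling is usually the safer default, because readers tend to compare markers by how much space they cover.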
5. Adding Labels to Points
To add labels to individual points in the scatter plot, you can use the plt.text() function.
Example: Labeling Points
# Create a scatter plot
plt.scatter(x, y)
# Add labels to each point
for i in range(len(x)):
    # Format to two decimals so float coordinates stay readable
    plt.text(x[i], y[i], f"({x[i]:.2f}, {y[i]:.2f})")
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter Plot with Labels")
plt.show()
This example places a label next to each point, showing its coordinates.
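Labels placed with plt.text() start exactly at the point, so they can overlap the marker. As an alternative sketch, plt.annotate() can shift each label by a fixed offset; the 5-point offset and the small dataset below are assumptions for illustration.

```python
import matplotlib.pyplot as plt

# Small illustrative dataset so the labels stay readable
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]

plt.figure()
plt.scatter(x, y)

labels = []
for xi, yi in zip(x, y):
    # xy is the point being labeled, in data coordinates;
    # textcoords="offset points" makes xytext a shift of 5 points right and up
    label = plt.annotate(f"({xi}, {yi})", xy=(xi, yi),
                         xytext=(5, 5), textcoords="offset points")
    labels.append(label)
```

Call plt.show() afterwards to display the figure. Because the offset is given in points rather than data units, the gap between marker and label stays the same when the axes are rescaled.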
6. Scatter Plot with Different Colors and Sizes
You can vary color and size together so that a single scatter plot encodes two additional variables.
Example: Combining Color and Size Variations
# Data
sizes = 1000 * np.random.rand(50) # Random sizes
colors = np.random.rand(50) # Random colors
# Create a scatter plot with varying sizes and colors
plt.scatter(x, y, s=sizes, c=colors, cmap='plasma', alpha=0.6)
# Add colorbar
plt.colorbar()
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter Plot with Sizes and Colors")
plt.show()
In this example:
- s=sizes sets the size of each point.
- c=colors assigns colors to each point.
- cmap='plasma' specifies the colormap.
- alpha=0.6 makes the points semi-transparent.
7. Scatter Plot with Multiple Datasets
You can plot multiple datasets on the same scatter plot by calling plt.scatter() multiple times with different data and styles.
Example: Multiple Datasets
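# Data for first dataset (same as in the basic example)
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 40]
# Data for second dataset
x2 = [1.5, 2.5, 3.5, 4.5, 5.5]
y2 = [15, 25, 35, 45, 55]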
# Create a scatter plot for the first dataset
plt.scatter(x, y, color='blue', label='Dataset 1')
# Create a scatter plot for the second dataset
plt.scatter(x2, y2, color='red', label='Dataset 2')
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter Plot with Multiple Datasets")
# Add legend
plt.legend()
plt.show()
This creates a scatter plot with two datasets, each with different colors and labels.
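When the datasets live in one array with a group label per point, the same idea can be written as a loop with one plt.scatter() call per group. This is a sketch under assumed data: the groups array, the group names, and the styles dictionary are all hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: each point carries a group name
x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([10, 20, 25, 30, 40, 45])
groups = np.array(["a", "b", "a", "b", "a", "b"])

# One style per group; the names and styles are illustrative
styles = {
    "a": {"color": "blue", "marker": "o"},
    "b": {"color": "red", "marker": "^"},
}

plt.figure()
for name, style in styles.items():
    mask = groups == name  # boolean mask selecting this group's points
    plt.scatter(x[mask], y[mask], label=f"Group {name}", **style)

# The legend gets one entry per scatter() call
legend = plt.legend()
```

Call plt.show() afterwards to display the figure. A separate call per group is needed here because each plt.scatter() call accepts a single marker shape.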
8. Logarithmic Scale in Scatter Plots
You can apply logarithmic scaling to the axes of a scatter plot using plt.xscale() and plt.yscale().
Example: Logarithmic Scale
# Data
x = np.linspace(1, 100, 100)
y = np.exp(x)
# Create a scatter plot with logarithmic scale on y-axis
plt.scatter(x, y)
plt.yscale('log')
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis (log scale)")
plt.title("Scatter Plot with Logarithmic Scale")
plt.show()
This example applies a logarithmic scale to the y-axis, which is useful for data that spans several orders of magnitude. On this scale, the exponential data falls on a straight line.
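Both axes can be made logarithmic at once by also calling plt.xscale('log'). The sketch below uses made-up power-law data (y = 2 * x**3, an assumption for illustration) and checks the slope numerically with np.polyfit.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical power-law data: y = 2 * x**3
x = np.logspace(0, 3, 20)  # 20 points from 1 to 1000, evenly spaced on a log scale
y = 2 * x**3

plt.figure()
plt.scatter(x, y)
plt.xscale("log")
plt.yscale("log")
plt.xlabel("X-axis (log scale)")
plt.ylabel("Y-axis (log scale)")
plt.title("Scatter Plot with Log-Log Scale")

# On log-log axes a power law is a straight line whose slope is the exponent
slope = np.polyfit(np.log10(x), np.log10(y), 1)[0]
```

Call plt.show() afterwards to display the figure. The fitted slope comes out at approximately 3, matching the exponent used to generate the data.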
9. 3D Scatter Plot
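Matplotlib also supports 3D scatter plots if you are working in three dimensions. To create one, add an axes with projection='3d' to a figure. In Matplotlib versions before 3.2, you also need to import Axes3D from the mpl_toolkits.mplot3d module to make that projection available; in later versions the import is optional.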
Example: 3D Scatter Plot
from mpl_toolkits.mplot3d import Axes3D
# Data (x, y, and z must all have the same length)
x = np.random.rand(50)
y = np.random.rand(50)
z = np.random.rand(50)
# Create a figure for 3D plotting
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Create a 3D scatter plot
ax.scatter(x, y, z, c=z, cmap='cool')
# Add labels
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_zlabel('Z-axis')
ax.set_title("3D Scatter Plot")
plt.show()
This example creates a 3D scatter plot with color mapping based on the z values.
Conclusion
Scatter plots in Matplotlib are highly customizable and useful for visualizing relationships between two variables. You can control the marker styles, colors, and sizes, and even incorporate color mapping to represent additional dimensions. Scatter plots are ideal for showing the distribution, correlation, and trends in data, and with 3D scatter plots, you can expand this visualization into three dimensions.