NumPy Interactive Handbook

01

Why Use NumPy?

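Python lists are flexible, but they are slow and memory-hungry for numerical computing: every element is a separate Python object, and every operation runs through the interpreter one element at a time. NumPy addresses both problems with fixed-type arrays stored in contiguous memory and array operations implemented in compiled C.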

Performance Comparison

Adding two arrays of 1,000,000 elements (illustrative timings; actual numbers depend on hardware and NumPy version):

Python List (Loop): ~120 ms

NumPy Array (Vectorized): ~1.2 ms

Python List Memory: an array of pointers to objects scattered across the heap, so per-element overhead is high.

NumPy Array Memory: one contiguous block of raw values, which is cache efficient.

import numpy as np
import time

size = 1_000_000

# Python list approach: element-by-element loop in the interpreter
list1 = list(range(size))
list2 = list(range(size))
start = time.perf_counter()  # high-resolution timer intended for benchmarking
list_result = [x + y for x, y in zip(list1, list2)]
python_time = time.perf_counter() - start

# NumPy approach: list-to-array conversion stays outside the timed region
arr1 = np.array(list1)
arr2 = np.array(list2)
start = time.perf_counter()
array_result = arr1 + arr2  # Vectorized operation
numpy_time = time.perf_counter() - start

print(f"NumPy is {python_time/numpy_time:.0f}x faster")
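The memory contrast described above can be checked directly. The following is a minimal sketch, assuming CPython on a 64-bit build; the variable names are illustrative, and the exact list figures vary by interpreter version.

```python
import sys
import numpy as np

size = 1_000_000

py_list = list(range(size))
np_arr = np.arange(size, dtype=np.int64)

# The list itself holds one pointer per element; each int is a
# separate Python object stored elsewhere in memory.
list_pointer_bytes = sys.getsizeof(py_list)
list_object_bytes = sum(sys.getsizeof(x) for x in py_list)

# The array stores the raw 8-byte values back to back in one buffer.
array_bytes = np_arr.nbytes

print(f"list pointers: {list_pointer_bytes / 1e6:.1f} MB")
print(f"list objects : {list_object_bytes / 1e6:.1f} MB")
print(f"numpy array  : {array_bytes / 1e6:.1f} MB")
print("contiguous   :", np_arr.flags["C_CONTIGUOUS"])
```

The array figure is exactly 8 bytes per element; the list pays for both the pointer table and the individual integer objects.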

02

Creating Arrays

From Python Lists

import numpy as np

# 1D array
arr1 = np.array([1, 2, 3, 4, 5])

# 2D array
arr2 = np.array([[1, 2, 3], 
                 [4, 5, 6]])

print(arr2.shape)  # (2, 3)

Built-in Creation Functions

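np.zeros((3, 3))        # 3x3 array of zeros (float64)
np.ones((2, 4))         # 2x4 array of ones (float64)
np.full((2, 2), 7)      # 2x2 array filled with 7
np.eye(4)               # 4x4 identity matrix
np.arange(0, 10, 2)     # [0, 2, 4, 6, 8] (stop value excluded)
np.linspace(0, 1, 5)    # 5 evenly spaced values from 0 to 1 inclusive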
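A short sketch of how the creation functions above behave, in particular the difference between np.arange (stop excluded) and np.linspace (stop included). The variable names are illustrative only.

```python
import numpy as np

zeros = np.zeros((3, 3))
evens = np.arange(0, 10, 2)    # step-based: the stop value 10 is excluded
grid = np.linspace(0, 1, 5)    # count-based: the stop value 1 is included
identity = np.eye(4)

print(zeros.dtype)       # float64
print(evens.tolist())    # [0, 2, 4, 6, 8]
print(grid.tolist())     # [0.0, 0.25, 0.5, 0.75, 1.0]
print(identity.shape)    # (4, 4)
```

As a rule of thumb, np.arange suits integer steps, while np.linspace is the safer choice when you need a fixed number of floating-point samples.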


03

Indexing & Slicing

Important: Views vs Copies

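Basic slicing (start:stop:step) returns a view, not a copy, so modifying the slice modifies the original array. Use .copy() if you need independence. Boolean masking and indexing with integer arrays, by contrast, always return copies.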
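The view behavior is easy to miss, so here is a minimal sketch of it. The variable names are illustrative; np.shares_memory is used only to confirm which results share a buffer with the original.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

view = arr[1:4]                 # basic slice: shares memory with arr
view[0] = 99                    # writes through to arr[1]
print(arr.tolist())             # [10, 99, 30, 40, 50]

independent = arr[1:4].copy()   # explicit copy: owns its data
independent[0] = -1             # arr is not affected
print(arr.tolist())             # [10, 99, 30, 40, 50]

masked = arr[arr > 25]          # boolean mask: always a copy

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
print(np.shares_memory(arr, masked))       # False
```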

Basic Indexing

arr = np.array([10, 20, 30, 40, 50])

arr[0]       # 10 (first)
arr[-1]      # 50 (last)
arr[1:4]     # [20, 30, 40]
arr[::2]     # [10, 30, 50]

Boolean Masking

arr = np.array([10, 20, 30, 40, 50])

# Filter values > 25
mask = arr > 25
arr[mask]    # [30, 40, 50]

# Direct filtering
arr[arr > 25]  # Same result
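Masks can also be combined and used for assignment. A small sketch, with illustrative variable names; note that the element-wise operators &, | and ~ are used instead of Python's and/or/not, and each condition needs its own parentheses.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Combine conditions element-wise
between = arr[(arr > 15) & (arr < 45)]   # values 20, 30, 40
outside = arr[(arr < 20) | (arr > 40)]   # values 10, 50

# Assign through a mask (done on a copy so arr stays intact)
capped = arr.copy()
capped[capped > 25] = 25                 # values 10, 20, 25, 25, 25

# Count how many elements satisfy a condition
count = np.count_nonzero(arr > 25)       # 3

print(between.tolist(), outside.tolist(), capped.tolist(), count)
```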


04

Multidimensional Arrays

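Understanding axes is crucial. In a 2D array, axis 0 runs down the rows (vertical) and axis 1 runs across the columns (horizontal). Reducing along an axis collapses it: axis=0 yields one result per column, and axis=1 yields one result per row.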

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# Access element
arr[1, 2]      # 6 (row 1, col 2)

# Slice rows 0-1 and columns 1-2
arr[0:2, 1:3]  # [[2, 3],
               #  [5, 6]]

# Operations along axes
np.sum(arr, axis=0)  # Sum down each column: [12, 15, 18]
np.sum(arr, axis=1)  # Sum across each row: [6, 15, 24]

# Modify column
arr[:, 1] = 0  # Set column 1 to 0
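The axis argument works the same way for other reductions. The sketch below uses a fresh copy of the 3x3 array (before column 1 is zeroed); the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

col_means = arr.mean(axis=0)   # collapse rows: one value per column -> 4.0, 5.0, 6.0
row_max = arr.max(axis=1)      # collapse columns: one value per row -> 3, 6, 9
total = arr.sum()              # no axis: reduce over every element -> 45

# keepdims=True keeps the reduced axis with length 1,
# which is convenient for later broadcasting against arr
row_sums = arr.sum(axis=1, keepdims=True)

print(col_means.tolist(), row_max.tolist(), total, row_sums.shape)
```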

3D Array Structure

Sheet 0 (axis 0, index 0): [[1, 2, 3], [4, 5, 6]]

Sheet 1 (axis 0, index 1): [[7, 8, 9], [10, 11, 12]]

arr3D[0, 1, 2] → 6 (Sheet 0, Row 1, Col 2)
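The two sheets above can be reproduced in code. This is a sketch that builds the array with np.arange and reshape, which is one convenient way to get the same values; the original page does not say how arr3D was constructed.

```python
import numpy as np

# Shape (2, 2, 3): 2 sheets, each with 2 rows and 3 columns
arr3D = np.arange(1, 13).reshape(2, 2, 3)

print(arr3D.shape)        # (2, 2, 3)
print(arr3D[0, 1, 2])     # 6  (sheet 0, row 1, col 2)

sheet1 = arr3D[1]         # the second sheet, shape (2, 3)
across_sheets = arr3D.sum(axis=0)   # adds sheet 0 and sheet 1 element-wise

print(sheet1.tolist())          # [[7, 8, 9], [10, 11, 12]]
print(across_sheets.tolist())   # [[8, 10, 12], [14, 16, 18]]
```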

05

Data Types

NumPy arrays are homogeneous: every element has the same data type (dtype). Choosing the smallest dtype that safely holds your values reduces memory use and can improve performance.

Type	Description	Memory per element
int32	32-bit signed integer	4 bytes
int64	64-bit signed integer (default on most 64-bit platforms)	8 bytes
float32	Single precision float	4 bytes
float64	Double precision float (default)	8 bytes
bool	Boolean	1 byte

# Specify dtype
arr = np.array([1, 2, 3], dtype=np.float32)

# Convert dtype
arr_int = arr.astype(np.int32)

# Check memory with .nbytes
arr_int64 = np.array([1, 2, 3], dtype=np.int64)
arr_int32 = np.array([1, 2, 3], dtype=np.int32)
arr_int64.nbytes   # 24 (3 elements x 8 bytes)
arr_int32.nbytes   # 12 (3 elements x 4 bytes)
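Converting between dtypes can change values, not just memory use. A minimal sketch with illustrative variable names: casting floats to integers with astype drops the fractional part (truncation toward zero) rather than rounding.

```python
import numpy as np

floats = np.array([1.7, -2.9, 3.2])     # float64 by default
print(floats.dtype, floats.itemsize)    # float64 8

as_int = floats.astype(np.int32)        # fractional part is discarded
print(as_int.tolist())                  # [1, -2, 3]

as_f32 = floats.astype(np.float32)      # half the memory, less precision
print(as_f32.nbytes, floats.nbytes)     # 12 24
```

If rounding is what you want, apply np.round before the cast.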

06

Broadcasting

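Broadcasting lets NumPy operate on arrays of different but compatible shapes without materializing copies of the smaller array. Shapes are compared from the last dimension backward, and two dimensions are compatible when they are equal or one of them is 1. The smaller array is virtually "stretched" to match the larger one.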

Array 1 (2x3): [[1, 2, 3], [4, 5, 6]]

+

Array 2 (1x3) → Broadcasted to 2x3: [[10, 20, 30], [10, 20, 30]]

arr1 = np.array([[1, 2, 3], [4, 5, 6]])      # Shape: (2, 3)
arr2 = np.array([10, 20, 30])                 # Shape: (3,)

result = arr1 + arr2  # Broadcasting adds arr2 to each row
# [[11, 22, 33],
#  [14, 25, 36]]
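Broadcasting only works when the shapes line up from the right. The sketch below, with illustrative names, shows a column combined with a row, an incompatible pair that raises ValueError, and the usual fix of adding an axis.

```python
import numpy as np

row = np.array([10, 20, 30])    # shape (3,)
col = np.array([[1], [2]])      # shape (2, 1)

# (2, 1) with (3,): the size-1 dimension stretches to 3 and the
# missing leading dimension of row is treated as 1 -> result (2, 3)
grid = col + row
print(grid.tolist())            # [[11, 21, 31], [12, 22, 32]]

a = np.ones((2, 3))
b = np.ones(2)                  # shape (2,): trailing sizes are 3 vs 2

try:
    a + b
    compatible = True
except ValueError:
    compatible = False          # shapes (2, 3) and (2,) cannot broadcast
print(compatible)               # False

# Adding an axis turns b into shape (2, 1), which is compatible
fixed = a + b[:, np.newaxis]
print(fixed.shape)              # (2, 3)
```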

Real-world: Data Normalization

data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40]])

mean = data.mean(axis=0)    # [15, 25, 35]
std = data.std(axis=0)      # [4.08, 4.08, 4.08]

normalized = (data - mean) / std  # Broadcasting!
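One way to sanity-check the normalization above is to confirm that each column of the result has mean 0 and standard deviation 1. This is a sketch using the same data; the check itself is an addition, not part of the original example.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40]])

mean = data.mean(axis=0)            # shape (3,): one mean per column
std = data.std(axis=0)              # shape (3,): one std per column
normalized = (data - mean) / std    # (3, 3) with (3,) broadcasts row by row

col_means = normalized.mean(axis=0)
col_stds = normalized.std(axis=0)

print(np.allclose(col_means, 0.0))  # True
print(np.allclose(col_stds, 1.0))   # True
print(np.round(normalized[:, 0], 4).tolist())  # [-1.2247, 0.0, 1.2247]
```

A column whose values are all identical has a standard deviation of 0, so this formula would divide by zero for it; real data usually needs a guard for that case.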

07

Mathematical Functions

Function	Description
np.mean()	Average of elements
np.std()	Standard deviation
np.sum()	Sum of elements
np.min() / np.max()	Minimum / Maximum
np.argmin() / np.argmax()	Index of min/max
np.cumsum()	Cumulative sum

arr = np.array([10, 20, 30, 40, 50])

np.mean(arr)        # 30.0
np.std(arr)         # 14.14...
np.percentile(arr, 50)  # 30.0 (median)
np.corrcoef(arr, arr * 2)  # 2x2 correlation matrix; all entries ~1.0 for a perfectly linear relation
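The remaining functions listed in this section can be sketched the same way. The array below is deliberately unsorted so that the index-returning functions give a non-trivial answer; the variable names are illustrative.

```python
import numpy as np

arr = np.array([30, 10, 50, 20, 40])

total = np.sum(arr)          # 150
lowest = np.min(arr)         # 10
highest = np.max(arr)        # 50
idx_low = np.argmin(arr)     # 1 (position of the value 10)
idx_high = np.argmax(arr)    # 2 (position of the value 50)
running = np.cumsum(arr)     # running totals: 30, 40, 90, 110, 150

print(total, lowest, highest, idx_low, idx_high, running.tolist())
```

np.argmin and np.argmax return positions rather than values, so arr[idx_high] recovers the maximum itself.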
