Dot Product Application
Every neural network computation boils down to dot products. np.dot handles three scenarios — vector·vector, matrix·vector, and matrix·matrix — and knowing the difference is key to understanding how data flows through a network.
1. Vector × Vector (Single Neuron)
np.dot(w, x) multiplies element-wise and sums: w₁x₁ + w₂x₂ + w₃x₃. This is exactly what one neuron computes before adding bias.
For two vectors, the dot product is commutative: np.dot(w, x) == np.dot(x, w). The result is always a single scalar.
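A minimal sketch of this single-neuron case (the weight, input, and bias values are invented for illustration):

```python
import numpy as np

# Invented values: one neuron with 3 inputs.
w = np.array([0.2, 0.8, -0.5])   # the neuron's weights
x = np.array([1.0, 2.0, 3.0])    # one input sample
b = 2.0                          # the neuron's bias

# w[0]*x[0] + w[1]*x[1] + w[2]*x[2] = 0.2 + 1.6 - 1.5 = 0.3
output = np.dot(w, x) + b        # 0.3 + 2.0 = 2.3

# Commutative for 1-D vectors: both orders give the same scalar.
assert np.dot(w, x) == np.dot(x, w)
print(output)  # 2.3
```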
2. Matrix × Vector (Layer)
When W is a matrix (rows = neurons), each row of W dots with the input vector to produce one output. This computes an entire layer at once.
Order matters now: matrix-vector multiplication is not commutative. np.dot(W, x) is correct (each row of W dots with x). np.dot(x, W) treats x as a row vector multiplying the columns of W, which produces a different result when W is square and a shape error otherwise.
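A sketch of one layer, assuming 3 neurons with 3 weights each (all values invented):

```python
import numpy as np

# Invented values: 3 neurons (rows), 3 inputs each (columns).
W = np.array([[ 0.2,   0.8,  -0.5 ],
              [ 0.5,  -0.91,  0.26],
              [-0.26, -0.27,  0.17]])
x = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 3.0, 0.5])    # one bias per neuron

# Each row of W dots with x: three neuron outputs in one call.
layer_out = np.dot(W, x) + b
print(layer_out)         # [2.3, 2.46, 0.21]

# Wrong order: x as a row vector times the columns of W.
print(np.dot(x, W) + b)  # [2.42, 1.17, 1.03] -- a different, meaningless result
```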
3. Matrix × Matrix (Batch)
In practice we process a batch of samples at once. Each row in X is one sample. Because W stores one neuron per row, we need the transpose so the inner dimensions line up, (samples, inputs) · (inputs, neurons): np.dot(X, W.T) + b.
This is full matrix multiplication — each sample's row dots with each neuron's weights (the columns of W.T). The GPU processes all samples in parallel.
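A sketch of the batch case, pushing a batch of 3 invented samples through the same 3-neuron layer as above:

```python
import numpy as np

# Invented values: 3 samples (rows of X), 3 neurons (rows of W).
X = np.array([[ 1.0, 2.0,  3.0],
              [ 2.0, 5.0, -1.0],
              [-1.5, 2.7,  3.3]])
W = np.array([[ 0.2,   0.8,  -0.5 ],
              [ 0.5,  -0.91,  0.26],
              [-0.26, -0.27,  0.17]])
b = np.array([2.0, 3.0, 0.5])

# (3,3) · (3,3).T + b -> (3,3): one row of outputs per sample.
batch_out = np.dot(X, W.T) + b
print(batch_out.shape)  # (3, 3)
print(batch_out[0])     # [2.3, 2.46, 0.21] -- same as the single-sample layer
```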
Shapes: (3,3) · (3,3)ᵀ + b → (3,3)

Three Levels of Dot Product
Vector · Vector — multiply matching pairs and add them up. One neuron does exactly this to produce a single number.
Matrix · Vector — each row of the weight matrix dots with the input vector. One operation computes an entire layer of neurons at once.
Matrix · Matrix — every sample in the batch gets its own row-by-column dot product. The GPU processes all samples in parallel — that's why neural networks are fast.
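To tie the three levels together, here is a quick consistency check on made-up random data: one row of the batch result equals the single-sample layer output, and one entry of that equals the single-neuron dot product.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))   # made-up batch: 8 samples, 4 features
W = rng.normal(size=(3, 4))   # made-up layer: 3 neurons, 4 weights each
b = rng.normal(size=3)

batch_out  = np.dot(X, W.T) + b          # level 3: matrix · matrix
sample_out = np.dot(W, X[0]) + b         # level 2: matrix · vector
neuron_out = np.dot(W[1], X[0]) + b[1]   # level 1: vector · vector

assert np.allclose(batch_out[0], sample_out)
assert np.isclose(sample_out[1], neuron_out)
```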