The Neuron
NN from Scratch

A neural network is just a function composed of many simple functions stacked together. Strip away the buzzwords: inputs → weighted sums → activation → repeat → output.
1. A Single Neuron
Each neuron computes one tiny formula:
output = activation( w₁x₁ + w₂x₂ + … + wₙxₙ + b )

Think of it as a yes/no/maybe detector. It takes inputs, weighs each one by importance (weights), adds a nudge (bias), then decides whether to fire through an activation function.
Each connection has a number: big weight → important input. Near-zero weight → ignored. Negative weight → suppresses the input.
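Here is that formula as a minimal NumPy sketch. The specific weights, bias, and the choice of ReLU as the activation are illustrative (they match neuron N₁ in the table below), not part of the definition:

```python
import numpy as np

def neuron(x, w, b):
    z = np.dot(w, x) + b   # weighted sum: w1*x1 + w2*x2 + ... + wn*xn + b
    return max(0.0, z)     # ReLU activation: fire if positive, stay at zero otherwise

x = np.array([1.0, 2.0, 3.0, 2.5])    # four inputs
w = np.array([0.2, 0.8, -0.5, 1.0])   # one weight per input
b = 2.0                               # the bias "nudge"

print(neuron(x, w, b))  # ~4.8
```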
2. A Layer of Neurons
A layer is just many neurons running in parallel. Every neuron looks at the same inputs but with different weights, so each learns to detect a different pattern.
In a weight matrix, each row = one neuron's weights. Computing a layer's output is just doing the dot product for every neuron at once.
The output column below uses the input (1.00, 2.00, 3.00, 2.50), which reappears as sample s1 in the next section:

|    | w·1   | w·2   | w·3   | w·4   | bias | output |
|----|-------|-------|-------|-------|------|--------|
| N₁ |  0.20 |  0.80 | -0.50 |  1.00 | 2.00 | 4.800  |
| N₂ |  0.50 | -0.91 |  0.26 | -0.50 | 3.00 | 1.210  |
| N₃ | -0.26 | -0.27 |  0.17 |  0.87 | 0.50 | 2.385  |
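As a sketch, the whole table collapses into one matrix-vector product: stack the three weight rows into W, and every neuron's weighted sum falls out at once (activation omitted here to show the raw sums from the table):

```python
import numpy as np

# each row = one neuron's weights (N1, N2, N3 from the table)
W = np.array([[ 0.20,  0.80, -0.50,  1.00],
              [ 0.50, -0.91,  0.26, -0.50],
              [-0.26, -0.27,  0.17,  0.87]])
b = np.array([2.00, 3.00, 0.50])          # one bias per neuron
x = np.array([1.00, 2.00, 3.00, 2.50])    # the same four inputs

print(W @ x + b)  # [4.8   1.21  2.385]
```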
3. Batch Input
In practice, we don't feed one sample at a time; we process a batch of samples at once. Each row in the input matrix is one sample, and the layer processes all of them simultaneously.
This is just matrix multiplication: Output = ReLU(X @ W.T + b). The GPU does all rows in parallel; that's why neural networks are fast.
Input batch X (3 samples × 4 features):

|    | X1    | X2   | X3    | X4   |
|----|-------|------|-------|------|
| s1 |  1.00 | 2.00 |  3.00 | 2.50 |
| s2 | -0.80 | 0.40 |  0.10 | 1.20 |
| s3 |  0.20 | 0.90 | -0.50 | 0.30 |
Output of the 3-neuron layer (one row per sample, one column per neuron):
|    | y1    | y2    | y3    |
|----|-------|-------|-------|
| s1 | 4.800 | 1.210 | 2.385 |
| s2 | 3.310 | 1.662 | 1.661 |
| s3 | 3.310 | 2.001 | 0.381 |
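A sketch that reproduces both tables with the one-liner from above; since every weighted sum here happens to be positive, the ReLU leaves the values unchanged:

```python
import numpy as np

X = np.array([[ 1.00,  2.00,  3.00,  2.50],   # s1
              [-0.80,  0.40,  0.10,  1.20],   # s2
              [ 0.20,  0.90, -0.50,  0.30]])  # s3

W = np.array([[ 0.20,  0.80, -0.50,  1.00],   # same 3-neuron layer as before
              [ 0.50, -0.91,  0.26, -0.50],
              [-0.26, -0.27,  0.17,  0.87]])
b = np.array([2.00, 3.00, 0.50])

out = np.maximum(0, X @ W.T + b)  # every sample through every neuron, one matmul
print(out)
# [[4.8    1.21   2.385]
#  [3.31   1.662  1.661]
#  [3.31   2.001  0.381]]
```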
One-line summary
A neural network is a giant stack of weighted sums + nonlinear activations that learns patterns from data.
Single neuron → weighted sum + bias + activation = one tiny detector
Layer → many neurons in parallel = many detectors running together
Batch → many samples at once = matrix multiplication = GPU go brrrr