The Neuron

NN from Scratch

A neural network is just a function composed of many simple functions stacked together. Strip away the buzzwords: inputs → weighted sums → activation → repeat → output.

1. A Single Neuron

Each neuron computes one tiny formula:

output = activation( w₁x₁ + wā‚‚xā‚‚ + … + wā‚™xā‚™ + b )

Think of it as a yes/no/maybe detector. It takes inputs, weighs each one by importance (weights), adds a nudge (bias), then decides whether to fire through an activation function.

Each connection has a number: big weight → important input. Near-zero weight → ignored. Negative weight → suppresses the input.

[Interactive demo: drag the sliders to change the inputs and watch the neuron respond. Shown configuration: inputs x₁ = 0.50, xā‚‚ = -0.30, xā‚ƒ = 0.70; weights w₁ = 0.50, wā‚‚ = -0.30, wā‚ƒ = 0.80; ReLU activation.]
Step-by-step computation:
z = (0.50 Ɨ 0.50) + (-0.30 Ɨ -0.30) + (0.70 Ɨ 0.80) + 0.100
z = 0.250 + 0.090 + 0.560 + 0.100 = 1.000
Å· = ReLU(1.000) = 1.000
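To make this concrete, here's a minimal NumPy sketch of the same computation, with the numbers taken from the demo above (the variable names are just for illustration):

```python
import numpy as np

x = np.array([0.50, -0.30, 0.70])   # inputs from the demo
w = np.array([0.50, -0.30, 0.80])   # one weight per input
b = 0.10                            # bias

z = np.dot(w, x) + b                # weighted sum + bias
y = max(0.0, z)                     # ReLU: pass positives, zero out negatives
print(f"z = {z:.3f}, Å· = {y:.3f}")  # z = 1.000, Å· = 1.000
```

Swap max(0.0, z) for a different activation (sigmoid, tanh) and the rest stays identical; the weighted sum is the neuron.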

2. A Layer of Neurons

A layer is just many neurons running in parallel. Every neuron looks at the same inputs but with different weights — so each learns to detect a different pattern.

In a weight matrix, each row = one neuron's weights. Computing a layer's output is just doing the dot product for every neuron at once.

Same inputs, different weights per neuron — each neuron detects a different pattern.

[Interactive demo: inputs x₁ = 1.00, xā‚‚ = 2.00, xā‚ƒ = 3.00, xā‚„ = 2.50 flow through three neurons (weighted sum + bias, then ReLU), producing y₁ = 4.800, yā‚‚ = 1.210, yā‚ƒ = 2.385.]
Weight Matrix — each row is one neuron's weights

        wĀ·1     wĀ·2     wĀ·3     wĀ·4    bias   output
N₁     0.20    0.80   -0.50    1.00    2.00    4.800
Nā‚‚     0.50   -0.91    0.26   -0.50    3.00    1.210
Nā‚ƒ    -0.26   -0.27    0.17    0.87    0.50    2.385
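In code, the whole layer collapses into one matrix-vector product: stack the weight rows into W, the biases into b, and compute ReLU(W @ x + b). A minimal NumPy sketch using the numbers from the table above:

```python
import numpy as np

x = np.array([1.00, 2.00, 3.00, 2.50])        # the four inputs

W = np.array([[ 0.20,  0.80, -0.50,  1.00],   # N1's weights
              [ 0.50, -0.91,  0.26, -0.50],   # N2's weights
              [-0.26, -0.27,  0.17,  0.87]])  # N3's weights
b = np.array([2.00, 3.00, 0.50])              # one bias per neuron

z = W @ x + b             # one dot product per row, i.e. per neuron
y = np.maximum(0.0, z)    # ReLU, applied elementwise
print(y)                  # ā‰ˆ [4.8, 1.21, 2.385]
```

Note np.maximum (elementwise) rather than Python's built-in max: it zeroes out each negative entry independently.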

3. Batch Input

In practice, we don't feed one sample at a time — we process a batch of samples at once. Each row in the input matrix is one sample, and the layer processes all of them simultaneously.

This is just matrix multiplication: Output = ReLU(X @ W.T + b). The GPU does all rows in parallel — that's why neural networks are fast.

[Interactive demo: click a sample row to select it; the neuron diagram shows that sample flowing through the layer.]

Input Batch (3 samples)

        X1      X2      X3      X4
s1    1.00    2.00    3.00    2.50
s2   -0.80    0.40    0.10    1.20
s3    0.20    0.90   -0.50    0.30

→ Layer (3 neurons) →

Output (3 Ɨ 3)

        y1      y2      y3
s1   4.800   1.210   2.385
s2   3.310   1.662   1.661
s3   3.310   2.001   0.381
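Here's the same layer from section 2, now fed all three samples at once. A minimal NumPy sketch; the transpose on W makes the shapes line up, (3, 4) @ (4, 3) → (3, 3):

```python
import numpy as np

X = np.array([[ 1.00, 2.00,  3.00, 2.50],     # sample s1
              [-0.80, 0.40,  0.10, 1.20],     # sample s2
              [ 0.20, 0.90, -0.50, 0.30]])    # sample s3

W = np.array([[ 0.20,  0.80, -0.50,  1.00],   # same weights as section 2
              [ 0.50, -0.91,  0.26, -0.50],
              [-0.26, -0.27,  0.17,  0.87]])
b = np.array([2.00, 3.00, 0.50])              # broadcast across all rows

out = np.maximum(0.0, X @ W.T + b)            # (3, 4) @ (4, 3) + (3,) -> (3, 3)
print(out.round(3))
# [[4.8    1.21   2.385]
#  [3.31   1.662  1.661]
#  [3.31   2.001  0.381]]
```

Row i of the output is exactly what you'd get by pushing sample i through the layer on its own; the batch form just lets the hardware do every row at once.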

One-line summary

A neural network is a giant stack of weighted sums + nonlinear activations that learns patterns from data.

Single neuron → weighted sum + bias + activation = one tiny detector

Layer → many neurons in parallel = many detectors running together

Batch → many samples at once = matrix multiplication = GPU go brrrr