
Introduction:
- My Learning Adventure: Join me on this exciting journey from confusion to clarity as we navigate the fascinating world of backpropagation with Andrej Karpathy's tutorial!
- Why Backpropagation Matters: Imagine backpropagation as the superhero of neural networks, wielding a magical recipe to make computers smarter!
- Thank You, Andrej Karpathy: A massive shoutout to Andrej Karpathy for creating a video that's like a friendly guide through the neural network jungle. Learning has never been this much fun!
- Who Can Join: If you know a bit of Python and recall a touch of high school math, you're ready for this adventure! No need to be a computer whiz, just bring your curiosity!
Overview of Micrograd/Neural Engine: A Tiny Magic Engine
A pint-sized neural engine that packs a punch with its implementation of backpropagation: in roughly 100 lines of code, it dynamically builds a Directed Acyclic Graph (DAG) of the computation. A petite neural networks library sits on top of it in about 50 lines, and it embraces a PyTorch-like API!
This powerhouse operates exclusively on scalar values, chopping each neuron up into its individual tiny adds and multiplies.
For those who crave visual enchantment: get ready for Graphviz visualizations! Witness the beauty of a 2D neuron, where each node shows both its data and its gradient.
Code Wizards Unite: Enough Talk, Show Me the Code
Step 1: Derivative of a Simple Function
The derivative of a function is the rate at which its output changes with respect to its input. In simpler terms, it measures how a small change in the input affects the output.
Let's define the function as
f(x) = 3x^2 - 4x + 5
def f(x):
    return 3*x**2 - 4*x + 5
Using the power rule, we can derive that the derivative of
f(x) = 3x^2 - 4x + 5 is 6x - 4.
Intuitively, if we nudge x by a small amount h, how does the function respond? The derivative tells us how sensitive the function is to that change: its sign gives the direction (up or down) and its magnitude gives the amount.
We can evaluate the derivative numerically using its definition, (f(x+h) - f(x))/h, for a small h:
h=0.001
x=3.0
print((f(x+h)-f(x))/h)
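To see how the numerical estimate relates to the analytic derivative 6x - 4, here is a minimal sketch that repeats the calculation for several step sizes. The helper name `numerical_derivative` is my own and does not come from the tutorial.

```python
def f(x):
    return 3 * x**2 - 4 * x + 5

def numerical_derivative(func, x, h):
    # Forward difference from the definition: (f(x + h) - f(x)) / h
    return (func(x + h) - func(x)) / h

x = 3.0
analytic = 6 * x - 4  # 14.0 at x = 3

for h in (0.1, 0.01, 0.001, 0.0001):
    estimate = numerical_derivative(f, x, h)
    error = abs(estimate - analytic)
    print(f"h={h}: estimate={estimate:.6f} error={error:.6f}")
```

For this quadratic the forward difference works out to 6x - 4 + 3h, so the error shrinks in proportion to h, apart from floating-point rounding.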

Now let's move on to a slightly more complex example:
Initial values:
a=2.0
b=-3.0
c=10.0
d=a*b+c
print(d)
>>
4.0
Let's see how much the output d changes when we nudge 'a' by a small amount h:
h=0.001
a=2.0
b=-3.0
c=10.0
d1=a*b+c
a+=h
d2=a*b+c
print("d1",d1)
print("d2",d2)
print("slope",(d2-d1)/h)
>>
d1 4.0
d2 3.997
slope -3.0000000000001137
We can make the same change for 'b' and 'c':
h=0.001
a=2.0
b=-3.0
c=10.0
d1=a*b+c
b+=h
d2=a*b+c
print("d1",d1)
print("d2",d2)
print("slope",(d2-d1)/h)
>>
d1 4.0
d2 4.002
slope 1.9999999999997797
h=0.001
a=2.0
b=-3.0
c=10.0
d1=a*b+c
c+=h
d2=a*b+c
print("d1",d1)
print("d2",d2)
print("slope",(d2-d1)/h)
>>
d1 4.0
d2 4.0009999999999994
slope 0.9999999999994458
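The three slopes above match the partial derivatives of d = a*b + c: the slope with respect to a is b (-3), with respect to b is a (2), and with respect to c is 1. As a sketch, the three experiments can be folded into one loop. The helper `slope` and the `inputs` dictionary are names I made up for illustration.

```python
def d(a, b, c):
    return a * b + c

def slope(func, inputs, name, h=0.001):
    # Nudge one input by h and measure how the output responds
    bumped = dict(inputs)
    bumped[name] += h
    return (func(**bumped) - func(**inputs)) / h

inputs = {"a": 2.0, "b": -3.0, "c": 10.0}

for name in inputs:
    print(name, slope(d, inputs, name))
```

The printed values land very close to -3, 2, and 1, with tiny deviations from floating-point rounding.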
Core Value Object of Micrograd:
Now that we have some intuition about derivatives, we can move on to neural networks. Neural networks are very large mathematical expressions, so to work with those expressions we need some sort of data structure. That is what we are going to build next, and we are going to build it in an iterative manner.
class Value:
    def __init__(self,data):
        self.data=data
    def __repr__(self):
        return f"Value(data={self.data})"
a=Value(2.0)
b=Value(3.0)
a,b
>>(Value(data=2.0), Value(data=3.0))
Now we want to add these two values, something like this:
a+b
>>TypeError: unsupported operand type(s) for +: 'Value' and 'Value'
Right now Python doesn't know how to add two Value objects.
Let's modify our Value class by defining __add__:
class Value:
    def __init__(self,data):
        self.data=data
    def __repr__(self):
        return f"Value(data={self.data})"
    def __add__(self,other):
        out=Value(self.data+other.data)
        return out
a=Value(2.0)
b=Value(3.0)
a+b
>>Value(data=5.0)
Now let's extend our Value class to support multiplication:
class Value:
    def __init__(self,data):
        self.data=data
    def __repr__(self):
        return f"Value(data={self.data})"
    def __add__(self,other):
        out=Value(self.data+other.data)
        return out
    def __mul__(self,other):
        out=Value(self.data*other.data)
        return out
a=Value(2.0)
b=Value(-3.0)
c=Value(10.0)
d=a*b +c
d
>>Value(data=4.0)
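It can help to see what Python does with `a*b + c` behind the scenes: each operator becomes a call to the matching special method. The sketch below writes those calls out by hand. The variable names `d_operators` and `d_explicit` are mine.

```python
class Value:
    def __init__(self, data):
        self.data = data

    def __repr__(self):
        return f"Value(data={self.data})"

    def __add__(self, other):
        return Value(self.data + other.data)

    def __mul__(self, other):
        return Value(self.data * other.data)

a = Value(2.0)
b = Value(-3.0)
c = Value(10.0)

# Operator syntax
d_operators = a * b + c
# The same expression spelled out as explicit method calls
d_explicit = (a.__mul__(b)).__add__(c)

print(d_operators, d_explicit)  # Value(data=4.0) Value(data=4.0)
```

Multiplication binds tighter than addition, so `a.__mul__(b)` runs first and its result is the object that receives `__add__(c)`.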
Now we want to keep the expression graph, so we need pointers that record which values produced which other values. For that we introduce a new constructor argument, "_children", and store it in the class as "_prev", which is the set of children.
class Value:
    def __init__(self,data,_children=()):
        self.data=data
        # set of Value objects that produced this one (empty for leaf values)
        self._prev=set(_children)
    def __repr__(self):
        return f"Value(data={self.data})"
    def __add__(self,other):
        out=Value(self.data+other.data,(self,other))
        return out
    def __mul__(self,other):
        out=Value(self.data*other.data,(self,other))
        return out
a=Value(2.0)
b=Value(-3.0)
c=Value(10.0)
d=a*b +c
d
>>Value(data=4.0)
d._prev
>>{Value(data=-6.0), Value(data=10.0)}
From the output above, we can see that d._prev contains Value(data=-6.0), which is a*b, and Value(data=10.0), which is c.
One more piece of information is still missing: which operation created each node. Let's add that as "_op", and also add a label for the graph nodes:
class Value:
    def __init__(self,data,_children=(),_op="",label=''):
        self.data=data
        self._prev=set(_children)
        # operation that produced this node ("" for leaf values)
        self._op=_op
        self.label=label
    def __repr__(self):
        return f"Value(data={self.data})"
    def __add__(self,other):
        out=Value(self.data+other.data,(self,other),"+")
        return out
    def __mul__(self,other):
        out=Value(self.data*other.data,(self,other),"*")
        return out
a=Value(2.0,label='a')
b=Value(-3.0,label='b')
c=Value(10.0,label='c')
e=a*b;e.label='e'
d=e +c ;d.label='d'
d
>>Value(data=4.0)
d._prev
>>{Value(data=-6.0), Value(data=10.0)}
d._op
>> '+'
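With `_prev`, `_op`, and `label` in place, every node carries enough information to rebuild the expression that produced it. The following is a small sketch of that idea. The helper `describe` is hypothetical and not part of micrograd, and it sorts operands by label only because `_prev` is an unordered set.

```python
class Value:
    def __init__(self, data, _children=(), _op="", label=""):
        self.data = data
        self._prev = set(_children)
        self._op = _op
        self.label = label

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), "+")

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), "*")

def describe(v):
    # Leaf nodes have no _op, so their label is the whole expression
    if not v._op:
        return v.label
    # _prev is a set, so sort by label to get a stable operand order
    operands = sorted(v._prev, key=lambda child: child.label)
    return "(" + f" {v._op} ".join(describe(child) for child in operands) + ")"

a = Value(2.0, label="a")
b = Value(-3.0, label="b")
c = Value(10.0, label="c")
e = a * b; e.label = "e"
d = e + c; d.label = "d"

print(describe(d))  # (c + (a * b))
```

Because the children are stored in a set, an expression such as `a * a` would keep only one child, so this helper is for illustration only.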
Let's visualize these expressions with the help of the code below:
from graphviz import Digraph  # needs the graphviz Python package and the Graphviz binaries
def trace(root):
    # collect every node and edge reachable from root
    nodes, edges = set(), set()
    def build(v):
        if v not in nodes:
            nodes.add(v)
            for child in v._prev:
                edges.add((child, v))
                build(child)
    build(root)
    return nodes, edges
def draw_dot(root, format='svg', rankdir='LR'):
    """
    format: png | svg | ...
    rankdir: TB (top to bottom graph) | LR (left to right)
    """
    assert rankdir in ['LR', 'TB']
    nodes, edges = trace(root)
    dot = Digraph(format=format, graph_attr={'rankdir': rankdir}) #, node_attr={'rankdir': 'TB'})
    for n in nodes:
        # one record-shaped node per Value
        dot.node(name=str(id(n)), label = "{ %s | data %.4f}" % (n.label,n.data), shape='record')
        if n._op:
            # extra node for the operation that produced this Value
            dot.node(name=str(id(n)) + n._op, label=n._op)
            dot.edge(str(id(n)) + n._op, str(id(n)))
    for n1, n2 in edges:
        # connect each child to the operation node of its parent
        dot.edge(str(id(n1)), str(id(n2)) + n2._op)
    return dot
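If Graphviz is not installed, the same traversal can still be inspected as plain text. The sketch below reuses the `trace` function from above and prints one line per edge. The text layout is my own choice and not something the tutorial prescribes.

```python
class Value:
    def __init__(self, data, _children=(), _op="", label=""):
        self.data = data
        self._prev = set(_children)
        self._op = _op
        self.label = label

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), "+")

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), "*")

def trace(root):
    # Collect every node and (child, parent) edge reachable from root
    nodes, edges = set(), set()
    def build(v):
        if v not in nodes:
            nodes.add(v)
            for child in v._prev:
                edges.add((child, v))
                build(child)
    build(root)
    return nodes, edges

a = Value(2.0, label="a")
b = Value(-3.0, label="b")
c = Value(10.0, label="c")
e = a * b; e.label = "e"
d = e + c; d.label = "d"

nodes, edges = trace(d)
print(len(nodes), "nodes,", len(edges), "edges")  # 5 nodes, 4 edges

# Sort by (parent label, child label) so the output order is stable
for child, parent in sorted(edges, key=lambda edge: (edge[1].label, edge[0].label)):
    print(f"{child.label} --{parent._op}--> {parent.label}")
```

Each printed line reads "child feeds into parent through this operation", which is the same structure draw_dot renders as a picture.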
We are going to call draw_dot on some root node to visualize it.
Let's call draw_dot() on d:
draw_dot(d)
>> [Figure: Graphviz rendering of the expression graph rooted at d]

Let's see one more example:
a=Value(2.0,label='a')
b=Value(-3.0,label='b')
c=Value(10.0,label='c')
e=a*b;e.label='e'
d=e +c ;d.label='d'
f=Value(-2.0,label='f')
L=d*f;L.label='L'
draw_dot(L)
>> [Figure: Graphviz rendering of the expression graph rooted at L]

We've built mathematical expressions using addition and multiplication, resulting in a scalar output. The forward pass takes multiple inputs (a, b, c, f) and yields a single output (L) of -8.
Summary:
We've constructed mathematical expressions using addition and multiplication operations, laying the groundwork for our neural exploration.
The emphasis was on the forward pass, which takes the inputs (a, b, c, and f) and produces a single output (L). The result? An output value of negative eight.
Now, we're diving into the fascinating world of backpropagation. Starting from the end, we calculate gradients along all the intermediate values.
The ultimate goal? To compute the derivative of the final output (L) with respect to each node.
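As a preview of what those derivatives look like, here is a sketch that estimates them numerically with plain floats, using the same nudge-by-h trick as in Step 1. This is not backpropagation itself, only a numerical check to compare against later. The function name `forward` and the `base` dictionary are my own.

```python
def forward(a, b, c, f):
    # Same expression as the graph: e = a*b, d = e + c, L = d*f
    e = a * b
    d = e + c
    return d * f

base = {"a": 2.0, "b": -3.0, "c": 10.0, "f": -2.0}
h = 0.001
L0 = forward(**base)
print("L", L0)  # L -8.0

grads = {}
for name in base:
    # Nudge one input at a time and measure the change in L
    bumped = dict(base)
    bumped[name] += h
    grads[name] = (forward(**bumped) - L0) / h
    print(f"dL/d{name} is approximately {grads[name]:.4f}")
```

By the product and sum rules the exact values are dL/da = b*f = 6, dL/db = a*f = -4, dL/dc = f = -2, and dL/df = d = 4, and the estimates agree up to floating-point rounding.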
Brace for the Thrill of Backpropagation: Get ready for a thrilling rewind! We're unraveling gradients in reverse, with each node playing a crucial role in this numeric symphony.
Ready to ride the waves of derivatives? Join me as we unravel the secrets behind every twist in this computational journey.
To Be Continued... Stay tuned for the next chapter as we continue this exciting journey into the depths of backpropagation and the mysteries of computational exploration. The adventure is just beginning!