Automatic Differentiation

Reverse-Mode Automatic Differentiation with Computational Graph

Computational Graph

[Computational graph: nodes a = 2, b = 3, a + b = 5, c = 4, and output (a + b) * c = 20]

Example: y = (a + b) * c where a=2, b=3, c=4

How It Works

Forward Pass

During the forward pass, operations are executed and their results are stored. The computational graph is built automatically by tracking the dependencies between tensors.
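
To make this concrete, here is a minimal sketch of a scalar autograd node in the spirit of AGTensor. The Node class and its fields are illustrative assumptions, not mygrad's actual implementation: each operation computes its result eagerly and records its parent nodes together with a _backward closure to be used later.

class Node:
    # Minimal scalar autograd node (illustrative sketch, not mygrad's AGTensor)
    def __init__(self, data, _prev=()):
        self.data = data                  # result of the forward computation
        self.grad = 0.0                   # accumulated gradient, filled in by backward()
        self._prev = set(_prev)           # parent nodes in the computational graph
        self._backward = lambda: None     # closure that propagates gradients to the parents

    def __add__(self, other):
        out = Node(self.data + other.data, _prev=(self, other))
        def _backward():
            # d(out)/d(self) = 1 and d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Node(self.data * other.data, _prev=(self, other))
        def _backward():
            # d(out)/d(self) = other.data and d(out)/d(other) = self.data
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out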

Backward Pass

During the backward pass, gradients are computed using the chain rule. Starting from the output, gradients flow backward through the graph.
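
Continuing the sketch above, the backward pass can be added as a method on the same hypothetical Node class: topologically order the graph, seed the output gradient with 1, and invoke each node's _backward closure in reverse order so gradients flow from the output back to the inputs. Again, this illustrates the general technique rather than mygrad's exact code.

    def backward(self):
        # Topologically order the graph so every node appears after its parents
        topo, visited = [], set()
        def build(node):
            if node not in visited:
                visited.add(node)
                for parent in node._prev:
                    build(parent)
                topo.append(node)
        build(self)

        # Seed d(output)/d(output) = 1, then apply the chain rule in reverse order
        self.grad = 1.0
        for node in reversed(topo):
            node._backward()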

Chain Rule

∂L/∂x = ∂L/∂y * ∂y/∂x
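
Applied to the running example y = (a + b) * c, with the intermediate node s = a + b:

∂y/∂c = s = 5
∂y/∂s = c = 4
∂y/∂a = ∂y/∂s * ∂s/∂a = 4 * 1 = 4
∂y/∂b = ∂y/∂s * ∂s/∂b = 4 * 1 = 4

These are exactly the values the backward pass delivers to each node's grad field.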

AGTensor Implementation

_backward: closure function that stores the backward computation logic for the operation that produced the tensor

_prev: set of parent nodes that tracks the tensor's dependencies in the graph

grad: accumulated gradients computed during the backward pass
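
In a sketch like the one above, grad is accumulated with += rather than overwritten, so a tensor that feeds several operations (or the same operation twice) receives the sum of the gradient contributions from every path. A quick check with the hypothetical Node class, including its backward method:

a = Node(3.0)
y = a * a        # a appears twice in the graph
y.backward()
print(a.grad)    # 6.0, i.e. dy/da = 2a: one contribution of 3.0 per use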

Example Code

import numpy as np
from mygrad.engine import AGTensor

# Create tensors
a = AGTensor(np.array([[2.0]]), is_cuda=True)
b = AGTensor(np.array([[3.0]]), is_cuda=True)
c = AGTensor(np.array([[4.0]]), is_cuda=True)

# Forward pass
result = (a + b) * c
print(result.numpy())  # [[20.0]]

# Backward pass
result.backward()

# Gradients: ∂result/∂a = c = 4, ∂result/∂b = c = 4, ∂result/∂c = a + b = 5
print(a.grad.to_numpy())  # [[4.0]]
print(b.grad.to_numpy())  # [[4.0]]
print(c.grad.to_numpy())  # [[5.0]]