BareNet

GPU-Accelerated Deep Learning Framework

A minimal deep learning framework built from scratch with CUDA acceleration, automatic differentiation, and a PyTorch-like API.

Key Features

🚀 GPU Acceleration

All tensor operations run on the GPU using custom CUDA kernels, achieving a 5x speedup over CPU implementations.

⚡ Automatic Differentiation

Built-in autograd engine that records the computational graph automatically and computes gradients with reverse-mode differentiation (backpropagation).
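To illustrate the idea behind reverse-mode differentiation, here is a minimal scalar sketch in pure Python. It is not BareNet's implementation: the `Value` class and its methods are hypothetical names chosen for this example, and it runs on the CPU with scalars rather than GPU tensors. Each operation records its inputs, and `backward()` walks the recorded graph from the output back to the inputs, applying the chain rule.

```python
class Value:
    """Scalar graph node (hypothetical name; not BareNet's actual API)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._backward = lambda: None    # pushes self.grad to the parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order, then apply the chain rule output-to-input.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(2.0)
y = Value(3.0)
z = x * y + x      # z = 2*3 + 2 = 8
z.backward()
print(z.data, x.grad, y.grad)  # dz/dx = y + 1 = 4, dz/dy = x = 2
```

Gradients are accumulated with `+=` so that a value used more than once (here `x`) receives the sum of the contributions from every path to the output.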

🐍 Python API

PyTorch-like interface exposed through pybind11, making it easy to build and train neural networks.

📊 MNIST Training

A 2-layer MLP trained with BareNet achieves 97.48% test accuracy on the MNIST handwritten digits dataset.
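For readers unfamiliar with the architecture, the forward pass of a 2-layer MLP on MNIST-shaped input can be sketched in pure Python as follows. This is an illustration only, not BareNet code: the hidden size of 128, the ReLU activation, the softmax output, the weight initialization, and all function names are assumptions made for the example. Only the input size (28x28 = 784 pixels) and the 10 output classes follow from MNIST itself.

```python
import math
import random


def linear(x, weights, bias):
    """Compute W x + b for one input vector; weights is a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small uniform random weights and zero biases (assumed scheme)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
IN, HIDDEN, OUT = 784, 128, 10   # HIDDEN = 128 is an assumption
w1, b1 = init_layer(HIDDEN, IN, rng)
w2, b2 = init_layer(OUT, HIDDEN, rng)

# Stand-in for one flattened 28x28 digit image with pixel values in [0, 1).
image = [rng.random() for _ in range(IN)]

# Layer 1 + ReLU, then layer 2 + softmax over the 10 digit classes.
probs = softmax(linear(relu(linear(image, w1, b1)), w2, b2))
predicted_digit = probs.index(max(probs))
```

With untrained weights the prediction is arbitrary; training would adjust `w1`, `b1`, `w2`, and `b2` using gradients from backpropagation.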

NYU Machine Learning Systems Course • Built with CUDA, C++, and Python