GitHub radar

gateGPT: a GPT baked into a chip, 56k tokens/sec

2026-06-17

A working transformer that runs entirely in hardware on an FPGA, no CPU or GPU involved. It spits out names at tens of thousands of tokens per second, and it picked up hundreds of stars in days.


01fguzman82/gateGPT · ★ 454 · Verilog

gateGPT is a complete tiny GPT implemented in Verilog and run on a Xilinx Virtex-5 FPGA, so the model lives in dedicated hardware rather than as software on a CPU or GPU. It is a single transformer block with attention, a key-value cache and fixed-point math, generating human names one character at a time. At an 80 MHz clock it reaches roughly 56,000–69,000 tokens per second, and every stage was verified to match a Python reference exactly.
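To put the throughput figure in perspective, here is a quick back-of-the-envelope calculation of how many clock cycles each token gets on average. It uses only the numbers quoted above (80 MHz, 56,000 to 69,000 tokens per second); the function name is made up for illustration, and the result is an average, not a claim about how the design actually schedules its work.

```python
CLOCK_HZ = 80_000_000  # 80 MHz clock, as quoted in the post


def cycles_per_token(tokens_per_second: int) -> float:
    """Average number of clock cycles available per generated token."""
    return CLOCK_HZ / tokens_per_second


# The two ends of the throughput range quoted in the post.
for rate in (56_000, 69_000):
    print(f"{rate} tok/s -> about {cycles_per_token(rate):.0f} cycles per token")
```

In other words, the whole forward pass for one character fits in somewhere around 1,200 to 1,400 clock ticks.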

Why a vibe-coder should care

It is a rare, hands-on look at what a language model really is once you strip away the frameworks: matrix multiplies, attention and sampling turned into logic gates. For anyone curious how AI works under the hood, this is about as concrete as it gets.
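For a feel of what "matrix multiplies turned into logic gates" means, here is a minimal Python sketch of a fixed-point matrix-vector product using only integer multiplies, adds and shifts, which are the operations that map directly onto hardware. Everything in it is an assumption for illustration: the post does not give the project's number format, so the 8 fractional bits, the helper names and the greedy argmax pick (a stand-in for whatever sampling the chip really does) are hypothetical and not taken from the gateGPT source.

```python
FRAC_BITS = 8            # hypothetical format: 8 fractional bits (not from the project)
SCALE = 1 << FRAC_BITS   # 256


def to_fixed(x: float) -> int:
    """Convert a float to a fixed-point integer with FRAC_BITS fractional bits."""
    return int(round(x * SCALE))


def fixed_dot(a: list[int], b: list[int]) -> int:
    """Integer dot product: multiply-accumulate, then shift back to FRAC_BITS."""
    acc = 0
    for x, y in zip(a, b):
        acc += x * y          # each product carries 2 * FRAC_BITS fractional bits
    return acc >> FRAC_BITS   # drop the extra fractional bits


def fixed_matvec(weights: list[list[int]], vec: list[int]) -> list[int]:
    """Matrix-vector product built from fixed_dot, one output per weight row."""
    return [fixed_dot(row, vec) for row in weights]


def argmax(values: list[int]) -> int:
    """Index of the largest value: the simplest possible 'pick the next token' rule."""
    return max(range(len(values)), key=values.__getitem__)


# Toy 2x2 example with values that are exactly representable in this format.
W = [[to_fixed(0.5), to_fixed(-1.0)],
     [to_fixed(1.5), to_fixed(0.25)]]
v = [to_fixed(2.0), to_fixed(1.0)]

logits = fixed_matvec(W, v)
print([value / SCALE for value in logits])  # [0.0, 3.25]
print(argmax(logits))                       # 1
```

No floating point is involved once the inputs are converted, which is why this style of arithmetic translates so naturally to an FPGA.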

