i open-sourced autokernel -- autoresearch for GPU kernels
you give it any pytorch model. it profiles the model, finds the bottleneck kernels, writes triton replacements, and runs experiments overnight. edit one file, benchmark, keep or revert, repeat forever.
same loop as @karpathy autoresearch, applied to kernel optimization
95 experiments. 18 TFLOPS → 187 TFLOPS. 1.31x vs cuBLAS. all autonomous
9 kernel types (matmul, flash attention, fused mlp, layernorm, rmsnorm, softmax, rope, cross entropy, reduce). amdahl's law decides what to optimize next. 5-stage correctness checks before any speedup counts
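the amdahl's-law scheduling above boils down to a simple rule: a kernel taking fraction f of runtime, sped up by s, bounds the overall gain at 1 / ((1 - f) + f / s), so the biggest time fraction is the best next target. a minimal sketch (function names and the profile dict are mine, not autokernel's API):

```python
# hypothetical sketch: picking the next kernel to optimize via amdahl's law.
# overall speedup = 1 / ((1 - f) + f / s) for a kernel taking fraction f of
# runtime and sped up by s; as s -> inf the bound is 1 / (1 - f), so the
# kernel with the largest runtime fraction wins.

def amdahl_bound(frac, speedup):
    """Overall speedup if a kernel taking `frac` of runtime gets `speedup`x faster."""
    return 1.0 / ((1.0 - frac) + frac / speedup)

def pick_next(profile):
    """profile: {kernel_name: fraction_of_total_runtime}. Pick the largest fraction."""
    return max(profile, key=profile.get)

profile = {"matmul": 0.55, "softmax": 0.10, "layernorm": 0.05}
assert pick_next(profile) == "matmul"
# even an infinite matmul speedup caps the whole model at 1 / (1 - 0.55) ≈ 2.22x
```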
the agent reads program.md (the "research org code"), edits, runs, and either keeps or reverts. ~40 experiments/hour, ~320 overnight
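the keep-or-revert loop is easy to sketch in miniature. a toy timer and a toy correctness check stand in for the real profiler and 5-stage verification; `candidate`/`reference` are placeholder callables, not autokernel's actual interfaces:

```python
# hypothetical sketch of the edit -> benchmark -> keep-or-revert loop.
# the timer and correctness check are toys standing in for the real
# profiler and 5-stage verification pipeline.
import time

def benchmark(fn, iters=100):
    """Mean wall-clock time per call (stand-in for a CUDA-event timer)."""
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - t0) / iters

def is_correct(candidate, reference, inputs, tol=1e-5):
    """Stand-in for the 5-stage checks: outputs must match the reference."""
    return all(abs(candidate(x) - reference(x)) <= tol for x in inputs)

def run_experiment(best_time, candidate, reference, inputs):
    """Keep the candidate kernel only if it is correct AND faster; else revert."""
    if not is_correct(candidate, reference, inputs):
        return best_time, "revert (incorrect)"
    t = benchmark(lambda: candidate(1.0))
    if t < best_time:
        return t, "keep"
    return best_time, "revert (slower)"
```

the point of the structure: no speedup counts unless correctness passes first, and a failed or slower experiment leaves the best-known state untouched, so the loop can run unattended overnight.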
ships with self-contained GPT-2, LLaMA, and BERT definitions so you don't need the transformers library to get started
