epoch0.org
training

epoch0

§ 0
before the first
backward pass

loss is highest here. the first pass before anything is learned

§ 1
a brief
positioning

01 About

A notebook by a first-year undergraduate in navigation engineering, currently pivoting into mechanistic interpretability research. This site is the working surface — mostly half-finished experiments, reading notes, and log entries from the transition.

The name is an admission: I'm at epoch 0. Everything here is weights before training.

§ 2
updated weekly ·
last: 2026-05-14

02 Now

shipped May 4

Pythia-6.9B emotion probing

writing Replicating Anthropic 2026 concept-vector probing on Pythia-6.9B. Written up, after an audit of 14 findings, as the first post on writing.epoch0.org.

shipped
this week

MNIST dual interp · GradCAM

experiment Onboarding project for the lab. CNN baseline at 98.7% test acc; hooking GradCAM next, then a small Transformer with logit lens for a side-by-side interp comparison.

step 2 / 5
reading

Currently on the desk

papers Power, Grokking: Generalization Beyond Overfitting (2022) · Nanda, Progress Measures for Grokking (2023) · Varma, Explaining Grokking through Circuit Efficiency (2023) · Prieto, Grokking at the Edge of Numerical Stability (2025) · Doshi, Grokking Modular Polynomials (2024).

5 active
§ 3
click a card
to unfold

03 Projects

Fuchsia-L/mnist-dual-interp
active
CNN + GradCAM vs Transformer + logit lens — comparing what two interp toolchains say about the same input.
PyTorch github just published
CNN 421K params · test acc 98.7%
GradCAM in progress
Transformer queued, naive 4×4 patch embed
notebooks 01_cnn_mnist · 02_gradcam_mnist
Step 1 done. Step 2 hook debugging. The dual-tool comparison only makes sense once both halves are wired — staying patient about Step 3.
Fuchsia-L/mobile-llm-bridge
active
Android phone event → public reverse proxy → local Fastify → LLM agent → Telegram bot. Sanitized public release of a personal automation pipeline.
Node.js · MIT github just published
stack Fastify · SQLite · Telegram Bot API · OAuth
core rate limit · offline-tolerant backfill · feedback loop
secrets all via .env, no hardcoded tokens
Sanitized from a private agent. Two-pass cleanup — one worker drafts, one auditor fixes — caught four polish items the worker missed.
Fuchsia-L/llm-gateway-writeup
writeup
System design notes for a self-hosted multi-LLM reverse proxy gateway: nginx + fail2ban + SSH reverse tunnel + per-domain HTTPS.
Markdown github just published
stack nginx · fail2ban · certbot · ssh -R
infra VPS + Task Scheduler / systemd auto-reconnect
status long-term stable, abusive IPs auto-banned
Architecture-only writeup. The live services run elsewhere; this repo is for reference and pattern reuse.
Fuchsia-L/cf-coach
active
Codeforces training dashboard — fetch / stats / weak / next / serve. Built around an anchor-question training loop.
Node.js github updated this week
cli fetch · stats · weak · next · serve
dashboard rating trend · tag ability · roadmap · buckets
loop anchor → prereqs → recall → 4-line summary
Built so I could argue with myself about my training plan. The anchor + prereqs loop got me from 800 to 1200+ (Pupil) this spring.
§ 4
essays & logs,
mixed timeline

04 Writing

Logs: experiment records · Notes: study / reading · Essays: personal thoughts · Musings: short fragments
  writing.epoch0.org →  ·  all updates & notes
§ 5
prefers email;
slow but replies

05 Contact

PGP on request. Happy to talk about SAEs, classical Chinese, or why the navigation-engineering curriculum doesn't teach gradient descent.
