epoch0.org
training

epoch0

§ 0
before the first
backward pass

loss is highest here. the first pass before anything is learned

§ 1
a brief
positioning

01 About

A notebook by a first-year undergraduate in navigation engineering, currently pivoting into mechanistic interpretability research. This site is the working surface — mostly half-finished experiments, reading notes, and log entries from the transition.

The name is an admission: I'm at epoch 0. Everything here is weights before training.

§ 2
updated weekly ·
last: this morning

02 Now

W3 · due May 11

Emotion probing post-mortem

writing · Replicating Anthropic 2026 concept-vector probing on Pythia-6.9B. An audit pass on 14 findings flipped the original "clean replication" plan into a post-mortem essay on why none of them ship raw. Drafting now; a minimal probe sketch is below.

in draft
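
A minimal sketch of the probing step, assuming a small labeled prompt set: pull one residual-stream activation per prompt, fit a linear probe, and read a candidate concept direction off its weights. The layer index, last-token pooling, and toy prompts are placeholders, not the paper's settings or my replication's.

```python
# Minimal concept-vector probe sketch. Layer index, last-token pooling,
# and the two toy prompts are placeholders, not the replication settings.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from sklearn.linear_model import LogisticRegression

MODEL = "EleutherAI/pythia-6.9b"   # swap in EleutherAI/pythia-160m for cheap prototyping
LAYER = 16                         # hypothetical probe layer

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def resid_at_last_token(text: str) -> torch.Tensor:
    """Residual-stream activation at the final token of `text`, at LAYER."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1]

# Toy stand-in for a labeled probing set (1 = concept present, 0 = absent).
texts = ["I am absolutely thrilled about this.", "This is devastating news."]
labels = [1, 0]

X = torch.stack([resid_at_last_token(t) for t in texts]).numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
concept_vector = torch.tensor(probe.coef_[0])   # candidate concept direction
```
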
this week

MNIST dual interp · GradCAM

experiment · Onboarding project for the lab. CNN baseline at 98.7% test acc; hooking up GradCAM next (a hook sketch is below), then a small Transformer with logit lens for a side-by-side interp comparison.

step 2 / 5
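
Since Step 2 is exactly the hook wiring, here is a minimal Grad-CAM sketch on a toy CNN. The architecture, the hooked layer name (conv2), and the random input are stand-ins rather than the repo's actual model; the point is the forward/backward hook pattern.

```python
# Minimal Grad-CAM sketch on a toy CNN. The architecture, the hooked
# layer (conv2), and the random input are stand-ins, not the repo model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.fc = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)   # 28x28 -> 14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)   # 14x14 -> 7x7
        return self.fc(x.flatten(1))

model = TinyCNN().eval()
acts, grads = {}, {}
model.conv2.register_forward_hook(lambda m, i, o: acts.update(a=o))
model.conv2.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))

x = torch.randn(1, 1, 28, 28)              # stand-in for an MNIST digit
logits = model(x)
logits[0, logits.argmax()].backward()      # gradient of the predicted class score

weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # GAP over spatial dims
cam = F.relu((weights * acts["a"]).sum(dim=1))        # weighted channel sum
cam = F.interpolate(cam.unsqueeze(1), size=(28, 28), mode="bilinear")
```
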
W4 → W8

Classical Chinese · developmental interp

research · A 50–100M-parameter decoder, two-stage: classical Chinese pretraining → continued pretraining on Tang poetry. Snapshot-level circuit analysis on prosodic structure (平仄 tonal patterns / 押韵 rhyme). Currently collecting corpora and standing up Stage 1 infra; a snapshot-schedule sketch is below.

data + infra
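
A sketch of the snapshot schedule the circuit analysis depends on, assuming the Stage 1 trainer will expose a per-step callback. Log-spaced checkpoints put the densest coverage on early training, where structure changes fastest; the step counts and spacing here are placeholders, not the final config.

```python
# Snapshot schedule sketch for developmental interp. Step counts and
# spacing are placeholders; the trainer callback is assumed, not built yet.
import math

def snapshot_steps(total_steps: int, per_decade: int = 10) -> list[int]:
    """Log-spaced checkpoint steps, densest early where circuits form fastest."""
    n = math.ceil(math.log10(total_steps)) * per_decade
    exps = [i * math.log10(total_steps) / n for i in range(n + 1)]
    return sorted({min(round(10 ** e), total_steps) for e in exps})

SNAPSHOTS = set(snapshot_steps(total_steps=100_000))

def maybe_checkpoint(step: int, model, save_fn) -> None:
    """Call once per training step; `save_fn(model, step)` writes the snapshot."""
    if step in SNAPSHOTS:
        save_fn(model, step)
```
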
reading

Currently on the desk

papers · Emotion Concepts and their Function in a Large Language Model (Anthropic 2026) · Nanda, Progress Measures for Grokking · Olsson, In-context Learning & Induction Heads · Anthropic, Toy Models of Superposition · Arditi, Refusal Is Mediated by a Single Direction.

5 active
§ 3
click a card
to unfold

03 Projects

Fuchsia-L/mnist-dual-interp
active
CNN + GradCAM vs Transformer + logit lens — comparing what two interp toolchains say about the same input.
PyTorch · github · just published
CNN 421K params · test acc 98.7%
GradCAM in progress
Transformer queued — naive 4×4 patch embed
notebooks 01_cnn_mnist · 02_gradcam_mnist
Step 1 done; Step 2 is hook debugging. The dual-tool comparison only makes sense once both halves are wired — staying patient about Step 3. A logit-lens sketch for the Transformer half is below.
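
The Step 3 model is still queued, so this is only a toy sketch of what the logit-lens half could look like: a small ViT over naive 4×4 patches whose classification head is reapplied to the [CLS] state after every block. Sizes, depth, and names are hypothetical, not the repo's eventual architecture.

```python
# Toy logit-lens sketch: a small ViT over 4x4 MNIST patches whose classifier
# head is reapplied after every block. Sizes and names are placeholders.
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    def __init__(self, d=64, n_layers=4, n_classes=10):
        super().__init__()
        self.patch = nn.Conv2d(1, d, kernel_size=4, stride=4)   # 28x28 -> 7x7 patches
        self.cls = nn.Parameter(torch.zeros(1, 1, d))
        self.pos = nn.Parameter(torch.zeros(1, 50, d))          # 49 patches + CLS
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
            for _ in range(n_layers)
        )
        self.norm = nn.LayerNorm(d)
        self.head = nn.Linear(d, n_classes)

    def forward(self, x, lens=False):
        h = self.patch(x).flatten(2).transpose(1, 2)            # (B, 49, d)
        h = torch.cat([self.cls.expand(len(x), -1, -1), h], 1) + self.pos
        per_layer = []
        for block in self.blocks:
            h = block(h)
            per_layer.append(self.head(self.norm(h[:, 0])))     # lens on CLS token
        return torch.stack(per_layer) if lens else per_layer[-1]

model = TinyViT().eval()
lens_logits = model(torch.randn(2, 1, 28, 28), lens=True)      # (n_layers, 2, 10)
```
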
Fuchsia-L/mobile-llm-bridge
active
Android phone event → public reverse proxy → local Fastify → LLM agent → Telegram bot. Sanitized public release of a personal automation pipeline.
Node.js · MIT · github · just published
stack Fastify · SQLite · Telegram Bot API · OAuth
core rate limit · offline-tolerant backfill · feedback loop
secrets all via .env, no hardcoded tokens
Sanitized from a private agent. Two-pass cleanup — one worker drafts, one auditor fixes — caught four polish items the worker missed.
Fuchsia-L/llm-gateway-writeup
writeup
System design notes for a self-hosted multi-LLM reverse proxy gateway: nginx + fail2ban + SSH reverse tunnel + per-domain HTTPS.
Markdown · github · just published
stack nginx · fail2ban · certbot · ssh -R
infra VPS + Task Scheduler / systemd auto-reconnect
status long-term stable, abusive IPs auto-banned
Architecture-only writeup. The running services are hosted elsewhere; this repo is for reference and pattern reuse.
Fuchsia-L/cf-coach
active
Codeforces training dashboard — fetch / stats / weak / next / serve. Built around an anchor-question training loop.
Node.js · github · updated this week
cli fetch · stats · weak · next · serve
dashboard rating trend · tag ability · roadmap · buckets
loop anchor → prereqs → recall → 4-line summary
Built so I could argue with myself about my own training plan. The anchor + prereqs loop got me from 800 → 1200+ (pupil) this spring.
§ 4
essays & logs,
mixed timeline

04 Writing

日志 (logs) — experiment records · 笔记 (notes) — study & reading · 随笔 (essays) — personal thoughts · 思绪 (musings) — short fragments
  coming soon · RSS planned · first post in draft, target W3
§ 5
prefers email;
slow but replies

05 Contact

PGP on request. Happy to talk about SAEs, classical Chinese, or why the navigation-engineering curriculum doesn't teach gradient descent.
