Research Notes

Thoughts, notes, and experiments.

LLM Learning: From Pretraining to Decoder Inference

A structured note on how large language models are built and used: tokenization, pretraining, decoder-only Transformers, post-training, prefill, decoding, KV cache, RAG, and core LLM vocabulary.

13 min read · June 09, 2026

2026 · LLM transformers pretraining decoder RAG · notes
Refining My PhD Research Direction Around 3D Perception

A personal research note on connecting my PhD preparation, semantic occupancy prediction, collaborative perception, token communication, and occupancy world models into a coherent research direction.

6 min read · June 08, 2026

2026 · PhD-preparation research-direction 3D-perception collaborative-perception · notes
From Occupancy Prediction to Occupancy World Models

A research note on extending semantic occupancy prediction from current-state reconstruction to future 4D occupancy forecasting and world modeling for autonomous agents.

5 min read · June 04, 2026

2026 · world-models occupancy-prediction 4D-perception autonomous-driving · notes
Token Communication for Multi-Agent 3D Perception

A research note on tokenized scene representations, token selection, token merging, and communication-efficient collaborative occupancy prediction.

5 min read · May 30, 2026

2026 · token-communication collaborative-perception transformers semantic-occupancy · notes
Collaborative Perception: Seeing Beyond a Single Agent

A research note on collaborative perception, multi-agent scene understanding, communication constraints, pose alignment, and why collaboration is important for 3D occupancy prediction.

6 min read · May 22, 2026

2026 · collaborative-perception multi-agent 3D-perception autonomous-driving · notes
Semantic Occupancy as a Bridge Between Perception and Planning

A research note on semantic occupancy prediction, why it matters for autonomous agents, and how it connects 3D perception, occlusion reasoning, uncertainty, and downstream planning.

6 min read · May 19, 2026

2026 · semantic-occupancy 3D-perception autonomous-driving embodied-intelligence · notes
AI Agents and Embodied Intelligence

Study notes on AI agents, embodied intelligence, perception-action loops, memory, planning, world models, and their connections to computer vision, autonomous driving, and 3D scene understanding.

17 min read · May 12, 2026

2026 · AI-agents embodied-intelligence world-models robotics autonomous-driving · notes
Reinforcement Learning and Decision Making

Study notes on reinforcement learning, decision making, Markov decision processes, dynamic programming, model-free and model-based RL, multi-agent RL, and their connections to embodied AI and autonomous driving.

16 min read · April 26, 2026

2026 · reinforcement-learning decision-making MDP robotics autonomous-driving · notes
Computer Graphics Foundations

Study notes on computer graphics foundations, geometry, rendering, NeRF, 3D Gaussian Splatting, differentiable rendering, and their connections to computer vision and 3D scene understanding.

18 min read · April 14, 2026

2026 · computer-graphics rendering NeRF 3D-Gaussian-Splatting differentiable-rendering · notes
Computer Vision Foundations

Study notes on computer vision foundations, Cornell Introduction to Computer Vision, multi-view geometry, 3D representations, depth estimation, and their connections to autonomous driving perception.

20 min read · April 05, 2026

2026 · computer-vision multi-view-geometry 3D-vision depth-estimation · notes
Deep Learning Foundations

Study notes on neural networks, CNNs, Transformers, CS231n, Andrew Ng's Deep Learning Specialization, and their connections to computer vision and autonomous driving perception.

18 min read · March 22, 2026

2026 · deep-learning CNN Transformer CS231n computer-vision · notes
Machine Learning Foundations

Study notes on core machine learning, Andrew Ng's machine learning course, PRML, statistical learning theory, representation learning, and their connections to computer vision and autonomous driving.

22 min read · March 15, 2026

2026 · machine-learning PRML statistical-learning representation-learning · notes
Mathematical Foundations

Study notes on matrix theory, numerical analysis, probability and statistics, optimization, and related mathematical foundations for machine learning and computer vision.

16 min read · March 07, 2026

2026 · math matrix-theory numerical-analysis probability optimization · notes
Building My PhD Knowledge Base for Computer Vision

A structured roadmap for building the mathematical, machine learning, computer vision, graphics, autonomous driving, and embodied AI foundations needed for PhD research.

12 min read · March 01, 2026

2026 · roadmap · notes