FP8
An 8-bit floating-point precision format that halves memory and bandwidth relative to FP16. It is used for DeepSeek V3's weights, requires specific kernel support for inference, and reflects a broader trend toward native low-precision quantization during training.
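To make the format concrete, the sketch below decodes one byte in the E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 7), the FP8 variant commonly used for weights; the function name and structure are illustrative, not from any particular library.

```python
def decode_fp8_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 byte: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF   # 4-bit biased exponent
    mant = byte & 0x7         # 3-bit mantissa
    if exp == 0xF and mant == 0x7:
        return float("nan")   # E4M3 reserves only this pattern for NaN (no infinities)
    if exp == 0:
        return sign * (mant / 8) * 2 ** (1 - 7)   # subnormal: no implicit leading 1
    return sign * (1 + mant / 8) * 2 ** (exp - 7)  # normal: implicit leading 1

print(decode_fp8_e4m3(0x38))  # 1.0
print(decode_fp8_e4m3(0x7E))  # 448.0, the largest finite E4M3 value
```

With only 3 mantissa bits, E4M3 spans roughly ±448 at coarse precision, which is why FP8 inference typically pairs the raw values with per-tensor or per-block scaling factors.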
Videos Mentioning FP8

Training Llama 2, 3 & 4: The Path to Open Source AGI — with Thomas Scialom of Meta AI
Latent Space
An 8-bit floating-point format usable for inference and potentially training, trading some numerical precision for greater memory and compute efficiency.

DeepSeek V3, SGLang, and the state of Open Model Inference in 2025 (Quantization, MoEs, Pricing)
Latent Space
A precision format (8-bit floating point) used for DeepSeek V3 weights, requiring specific kernel support for inference, and a trend in native quantization during training.

llm.c's Origin and the Future of LLM Compilers - Andrej Karpathy at CUDA MODE
Latent Space
An 8-bit floating-point format that is being added to llm.c for improved training efficiency.

Beating GPT-4 with Open Source Models - with Michael Royzen of Phind
Latent Space
An 8-bit floating-point format implemented in NVIDIA's Transformer Engine, enabling faster and more efficient deep learning inference with minimal quality loss.