LLaMA CPP
Software / App
A C/C++ framework enabling efficient on-device inference for LLMs, including small models.
Mentioned in 4 videos
Videos Mentioning LLaMA CPP
![Best of 2024: Synthetic Data / Smol Models, Loubna Ben Allal, HuggingFace [LS Live! @ NeurIPS 2024]](https://i.ytimg.com/vi/AjmdDy7Rzx0/maxresdefault.jpg)
Best of 2024: Synthetic Data / Smol Models, Loubna Ben Allal, HuggingFace [LS Live! @ NeurIPS 2024]
Latent Space
A C/C++ framework enabling efficient on-device inference for LLMs, including small models.

Building AGI with OpenAI's Structured Outputs API
Latent Space
An early project to implement grammar-constrained decoding (e.g., via Backus–Naur-form grammars), influencing later discussions of structured output in LLMs.
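
The grammar mechanism mentioned above can be sketched as a small GBNF file in llama.cpp's format, where `root` is the start symbol; the other rule names and the grammar itself are illustrative, not taken from the project.

```
# GBNF sketch: constrain model output to a yes/no answer plus a short reason.
# Syntax follows llama.cpp's GBNF format; rules other than `root` are made up here.
root   ::= answer " because " reason
answer ::= "yes" | "no"
reason ::= [a-z ]+
```

A file like this is typically passed to the inference binary (e.g., via a grammar-file flag) so that sampling can only produce strings the grammar accepts.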

Beating GPT-4 with Open Source Models - with Michael Royzen of Phind
Latent Space
A C++ implementation of Meta's LLaMA models enabling efficient local execution on consumer hardware, mentioned as a key tool for running LLMs locally.

Stanford CS336 Language Modeling from Scratch | Spring 2026 | Lecture 10: Inference
Stanford Online
An inference runtime highlighted as a popular option for running LLMs on CPUs.
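
As a rough sketch of the CPU-based workflow the lecture refers to, the commands below build llama.cpp from source and run a local GGUF model; the model path and prompt are placeholders, and flags may differ between versions, so the project's README is the authoritative reference.

```
# Sketch: build llama.cpp and run CPU inference on a local GGUF model.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release
# Model path is a placeholder; -n limits the number of generated tokens.
./build/bin/llama-cli -m ./models/model.gguf -p "Hello" -n 64
```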