Honeycomb
Software / App
Mentioned in 2 videos
An agent that scores highly on the full SWE-Bench dataset. It first attempts to reproduce a bug before taking actions such as running bash commands.
Videos Mentioning Honeycomb
![[Paper Club] SWE-Bench [OpenAI Verified/Multimodal] + MLE-Bench with Jesse Hu](https://i.ytimg.com/vi/ULcwHlxfSkQ/maxresdefault.jpg)
[Paper Club] SWE-Bench [OpenAI Verified/Multimodal] + MLE-Bench with Jesse Hu
Latent Space
An agent that scores highly on the full SWE-Bench dataset. It first attempts to reproduce a bug before taking actions such as running bash commands.

Production AI Engineering starts with Evals
Latent Space
An observability platform that built its own super-wide column store. The speaker agrees with that decision, given the lack of accessible solutions for semi-structured data like Snowflake's VARIANT type.