📊 How to evaluate skills❓️

Lots of companies are building skills for coding agents. But how do you know if your skill is actually working?

It's tempting to go by vibes, but performance varies a lot across tasks, and coding agents have a huge action space, which makes that variance even harder to predict.

We built an evaluation benchmark for our newly released LangSmith and LangChain skills.

➡️ Learn about our findings here:
➡️ Check out the benchmark for yourself: