- Ben Burtenshaw / Hugging Face Researcher601.mcp is dead, if you used it like a cli. a lot of folk were using MCP like clis or APIs, and most of the value in MCP they ignored.›LLM score 92 · about 2 months ago
- Noam Brown / OpenAI Research Scientist602.What we really need is a benchmark where AI models make AI models that play poker.›LLM score 92 · about 2 months ago
- Lewis Tunstall / Hugging Face Researcher
- Sayak Paul / Hugging Face Researcher604.We've been studying what it takes to get NVFP4 & MXFP8 deliver good speedups on modern flow models for image & video gen.›LLM score 95 · about 2 months ago
- Alex Zhang / MIT CSAIL PhD605.The "Mismanaged Geniuses" Hypothesis (blog)›LLM score 100 · about 2 months ago
- Cameron Wolfe / Researcher at Netflix606.Can't wait to get my physical copy! Definitely order the RLHF book.›LLM score 92 · about 2 months ago
- Zixuan Li / Lead Z.ai607.Often discuss my three-level vision for opening GLM to the community:›LLM score 92 · about 2 months ago
- John Carmack608.Making a scatter plot of 400_000 data points, some of the plots had odd gaps in coverage.›LLM score 92 · about 2 months ago
- Boris Cherny / Creator of Claude Code609.Just got a nice DM from a big enterprise customer using Claude Code in one of the world's biggest codebases›LLM score 95 · about 2 months ago
- Alex Zhang / MIT CSAIL PhD610.does anyone know of any evals / benchmarks that are particularly sensitive to prompting? can be for any range of reasons, e.g.›LLM score 92 · about 2 months ago
- Andrej Karpathy / AI researcher611.Judging by my tl there is a growing gap in understanding of AI capability.›LLM score 92 · about 2 months ago
- Eric Jang / ex VP of AI at 1X Robotics612.These days, instead of directly asking LLMs to write code, I'm trying a new practice where I write a description of what the program should do into a plan.md file and have a LLM execute the computations "manually", thinking through every step, instead of actually writing the›LLM score 92 · about 2 months ago
- Niklas Muennighoff / AI Researcher at Stanford
- Shuchao Bi / Meta Researcher614.I can feel their passion of health capabilities when discussing with @hwchung27 and @_jasonwei “if we can make every user of our product live 1 year longer in healthy conditions and we have 1 billion DAUs, that’s 1 billion years of human life.” With the projection of LLM https://t.co/CSA9wTr7TA ›LLM score 85 · about 2 months ago
- Noam Brown / OpenAI Research Scientist615.I'm surprised that, more than a year later, it's still the norm to compare reasoning models on evals by a single number.›LLM score 92 · about 2 months ago
- Cameron Wolfe / Researcher at Netflix616.My reaction to muse spark is similar to how I felt about llama 4.›LLM score 85 · about 2 months ago
- Sebastien Bubeck / OpenAI MTS617.The world of mathematics is rapidly changing.›LLM score 92 · about 2 months ago
- Mehtaab Sawhney / OpenAI for Science618.We’ve just released another paper solving five further Erdős problems with an internal model at OpenAI: https://t.co/yq5kb4wSNL.›LLM score 95 · about 2 months ago
- Shuchao Bi / Meta Researcher619.the model is incredible at generating playable mini-games.›LLM score 92 · about 2 months ago
- Alex Zhang / MIT CSAIL PhD620.but I thought Claude Code and RLMs were the same thing ›LLM score 85 · about 2 months ago
- Yang Chen / Nvidia Research Scientist621.
- Jason Wei / AI Researcher at Meta
- Yoshua Bengio
- Hyung Won Chung / Research Scientist at Meta624.We are releasing Muse Spark today.›LLM score 92 · about 2 months ago
- Shuchao Bi / Meta Researcher625.Muse Spark represents the beginning of a significant collaborative team effort.›LLM score 75 · about 2 months ago
- Shuchao Bi / Meta Researcher
- Jeff Dean / Chief Scientist at DeepMind
- Merve Noyan / Hugging Face ML Engineer628.why local models matter for reproducibility, proven time after time›LLM score 85 · about 2 months ago
- Cameron Wolfe / Researcher at Netflix
- Jason Weston / Meta Research Scientist630.🏋️Thinking Mid-training: RL of Interleaved Reasoning🎗️›LLM score 92 · about 2 months ago