- Chelsea Finn / Physical Intelligence Cofounder · 421. We can improve VLA generalization to new instructions at test time by: › LLM score 92 · about 1 month ago
- Jonathan Lee / DeepMind Researcher · 422. We ran our internal system Aletheia (Deep Think) on FirstProof’s research problems during the week they were released. › LLM score 92 · about 1 month ago
- Quoc Le / Google Fellow · 423.
- Thang Luong / DeepMind Principal Scientist · 424. Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top › LLM score 92 · about 1 month ago
- Ben Burtenshaw / Hugging Face Researcher · 425. Qwen3.5 has 3 models in the top 5 of Humanity's Last Exam. › LLM score 92 · about 1 month ago
- Merve Noyan / Hugging Face ML Engineer · 426. we have recently launched storage add-ons on @huggingface › LLM score 8 · about 1 month ago
- Alex Zhang / MIT CSAIL PhD · 427. Haven't gotten around to writing in a bit, here's a short blog on my thoughts since releasing RLMs on the state of AI research. › LLM score 92 · about 1 month ago
- Chelsea Finn / Physical Intelligence Cofounder · 428. Pi models are now running in production settings, in collab with @Ultraroboticsco and @weaverobotics. › LLM score 92 · about 1 month ago
- Sergey Levine / Physical Intelligence Cofounder
- Simon Zhai / ex MTS at xAI · 430. Updated Bluff mode -- right now you will see different models bluffing & taunting other models on the table. › LLM score 85 · about 1 month ago
- Eric Jang / ex VP of AI at 1X Robotics · 431. Have taken an interest in passive dynamics lately. › LLM score 92 · about 1 month ago
- Karol Hausman / Physical Intelligence Cofounder · 432. Imagine how hard it would be to build Cursor/Harvey/OpenEvidence if each had to train its own foundation model from scratch. › LLM score 92 · about 1 month ago
- Andrej Karpathy / AI researcher
- Sergey Levine / Physical Intelligence Cofounder · 434.
- Boris Cherny / Creator of Claude Code · 435. We shipped Claude Code as a research preview a year ago today. › LLM score 75 · about 1 month ago
- Lucas Beyer / Meta Researcher · 436. Holy cow! I hope for the sake of my friends there, that it won't happen... › LLM score 75 · about 1 month ago
- Merve Noyan / Hugging Face ML Engineer · 437. 3D Arena is so back! 🔥 compare closed APIs to open models for asset generation and help develop arena 🙌🏼 https://t.co/PgdR9WJjgQ › LLM score 75 · about 1 month ago
- Sebastien Bubeck / OpenAI MTS · 438. I feel the First Proof results are a bit downplayed ... › LLM score 92 · about 1 month ago
- Andrew Lampinen / Research Scientist at DeepMind
- Sholto Douglas / Researcher at Anthropic · 440. Reiner taught me much of what I know - goes without saying that I trust him to make the best chip in the world. › LLM score 75 · about 1 month ago
- Andrej Karpathy / AI researcher
- Jim Fan / NVIDIA Director of Robotics · 442. What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. › LLM score 92 · about 1 month ago
- Cameron Wolfe / Researcher at Netflix · 443. Interesting study on properly using PDF data with LLMs. › LLM score 92 · about 1 month ago
- Merve Noyan / Hugging Face ML Engineer
- Devansh Pandey / Standard Intelligence Cofounder · 445. shout out to the trusty heap for storing all the video data needed to train FDM-1. › LLM score 82 · about 1 month ago
- Lucas Beyer / Meta Researcher · 446.
- Merve Noyan / Hugging Face ML Engineer
- Chace Lee / ex MTS at xAI · 448. Awesome work. VPT is one of my favorite papers. Computer-use agents live or die by where you draw the abstraction boundary. › LLM score 92 · about 1 month ago
- Cameron Wolfe / Researcher at Netflix · 449. Rubric-based RL is a really popular topic right now, but it’s not new. › LLM score 85 · about 1 month ago
- Christian Szegedy / ex xAI Cofounder · 450. Just a different way of saying: AI is getting increasingly good at long-horizon tasks. › LLM score 75 · about 1 month ago