AI300

Thomas Wolf / Hugging Face Cofounder
1261.
json is so token inefficient it hurts these days man, these braces and quotes are costing me real $$
LLM score 82 · 4 months ago
Lewis Tunstall / Hugging Face Researcher
1262.
We've rebuilt TRL's on-policy distillation trainer from the ground up to:
LLM score 92 · 4 months ago
Merve Noyan / Hugging Face ML Engineer
1263.
MiniMax 2.7 is out 🔥 it sits in the open frontier in score per token efficiency 🥵 https://t.co/izzcB1I3vf
LLM score 15 · 4 months ago
Alex Zhang / MIT CSAIL PhD
1264.
ik this is an ai slop post but it contains a laundry list of misconceptions that are quite nice to clear up lol
LLM score 92 · 4 months ago
Eric Jang / ex VP of AI at 1X Robotics
1265.
just tried this with Claude code + computer use mcp on my Mac.
LLM score 92 · 4 months ago
Skyler Miao / MiniMax Head of Engineering
1266.
M2.7 weights are live. hope you all enjoy it 😎 been grinding sleepless on M3 and the harness
LLM score 82 · 4 months ago
Jerry Tworek / ex OpenAI VP of RL
1267.
Most interesting solutions come from looking at old problem from a new side rather than from hammering harder on old techniques
LLM score 75 · 4 months ago
Niklas Muennighoff / AI Researcher at Stanford
1268.
There's a wave of omni embedding models (gemini, nemotron, bidirlm).
LLM score 82 · 4 months ago
Alex Zhang / MIT CSAIL PhD
1269.
this is a sick idea applying a paper I think is very cool (attention matching) to RLMs
LLM score 92 · 4 months ago
Ben Burtenshaw / Hugging Face Researcher
1270.
mcp is dead, if you used it like a cli. a lot of folk were using MCP like clis or APIs, and most of the value in MCP they ignored.
LLM score 92 · 4 months ago
Noam Brown / OpenAI Research Scientist
1271.
What we really need is a benchmark where AI models make AI models that play poker.
LLM score 92 · 4 months ago
Lewis Tunstall / Hugging Face Researcher
1272.
I had a lot of fun talking with SAIR about unconventional paths into AI, how we might RL messy domains like biology, and is there life after GRPO :) https://t.co/wTzXTYoSwb
LLM score 85 · 4 months ago
Sayak Paul / Hugging Face Researcher
1273.
We've been studying what it takes to get NVFP4 & MXFP8 deliver good speedups on modern flow models for image & video gen.
LLM score 95 · 4 months ago
Alex Zhang / MIT CSAIL PhD
1274.
The "Mismanaged Geniuses" Hypothesis (blog)
LLM score 100 · 4 months ago
Cameron Wolfe / Researcher at Netflix
1275.
Can't wait to get my physical copy! Definitely order the RLHF book.
LLM score 92 · 4 months ago
Zixuan Li / Lead Z.ai
1276.
Often discuss my three-level vision for opening GLM to the community:
LLM score 92 · 4 months ago
John Carmack
1277.
Making a scatter plot of 400_000 data points, some of the plots had odd gaps in coverage.
LLM score 92 · 4 months ago
Boris Cherny / Creator of Claude Code
1278.
Just got a nice DM from a big enterprise customer using Claude Code in one of the world's biggest codebases
LLM score 95 · 4 months ago
Alex Zhang / MIT CSAIL PhD
1279.
does anyone know of any evals / benchmarks that are particularly sensitive to prompting? can be for any range of reasons, e.g.
LLM score 92 · 4 months ago
Andrej Karpathy / AI researcher
1280.
Judging by my tl there is a growing gap in understanding of AI capability.
LLM score 92 · 4 months ago
Eric Jang / ex VP of AI at 1X Robotics
1281.
These days, instead of directly asking LLMs to write code, I'm trying a new practice where I write a description of what the program should do into a plan.md file and have a LLM execute the computations "manually", thinking through every step, instead of actually writing the
LLM score 92 · 4 months ago
Niklas Muennighoff / AI Researcher at Stanford
1282.
Congrats to @mustafasuleyman & team on taking the top spot on MTEB with Harrier! Among my favorite recent additions to our leaderboard is this plot showing the embedding frontier over time📈 https://t.co/dYoZYGorYZ https://t.co/4BZYnB3ld4
LLM score 75 · 4 months ago
Shuchao Bi / Meta Researcher
1283.
I can feel their passion of health capabilities when discussing with @hwchung27 and @_jasonwei “if we can make every user of our product live 1 year longer in healthy conditions and we have 1 billion DAUs, that’s 1 billion years of human life.” With the projection of LLM https://t.co/CSA9wTr7TA
LLM score 85 · 4 months ago
Noam Brown / OpenAI Research Scientist
1284.
I'm surprised that, more than a year later, it's still the norm to compare reasoning models on evals by a single number.
LLM score 92 · 4 months ago
Cameron Wolfe / Researcher at Netflix
1285.
My reaction to muse spark is similar to how I felt about llama 4.
LLM score 85 · 4 months ago
Sebastien Bubeck / OpenAI MTS
1286.
The world of mathematics is rapidly changing.
LLM score 92 · 4 months ago
Mehtaab Sawhney / OpenAI for Science
1287.
We’ve just released another paper solving five further Erdős problems with an internal model at OpenAI: https://t.co/yq5kb4wSNL.
LLM score 95 · 4 months ago
Shuchao Bi / Meta Researcher
1288.
the model is incredible at generating playable mini-games.
LLM score 92 · 4 months ago
Alex Zhang / MIT CSAIL PhD
1289.
but I thought Claude Code and RLMs were the same thing
LLM score 85 · 4 months ago
Yang Chen / Nvidia Research Scientist
1290.
amazing! observed this think long --> compress --> think long trend during the nemotron-cascade math rl training on a 14B model last year
LLM score 85 · 4 months ago