HYB - AI300

AI300

Omar Khattab / MIT CSAIL Asst professor
1.
New updates for the RLM paper: We post-trained RLM-Qwen3-8B at tiny scale, the first natively recursive LM.
LLM score 85 · about 12 hours ago
Alex Zhang / MIT CSAIL PhD
2.
We just updated the RLM paper with some new stuff.
LLM score 85 · about 13 hours ago
alphaXiv
3.
2026 is the year of continual learning, and we are getting some amazing papers towards that.
LLM score 85 · about 15 hours ago
Andrew Lampinen / Research Scientist at DeepMind
4.
New paper studying how language models' representations of things like factuality evolve over a conversation.
LLM score 85 · about 16 hours ago
Cameron Wolfe / Researcher at Netflix
5.
Trinity large is very sparse (400B-A13B, 256 experts w/ 4 active per token).
LLM score 85 · about 21 hours ago
Ben Burtenshaw / Hugging Face Researcher
6.
We got Claude to teach open models how to write CUDA kernels.
LLM score 85 · 1 day ago
alphaXiv
7.
BIG new idea in interpretability called Patterning
LLM score 75 · 1 day ago
Boris Cherny / Creator of Claude Code
8.
@charles_irl In case it’s not clear in the docs: - Ancestor https://t.co/v4FOLUBHz9’s are loaded into context automatically on startup
LLM score 20 · 2 days ago
alphaXiv
9.
"LLM-in-Sandbox Elicits General Agentic Intelligence"›
LLM score 85 · 2 days ago
Ethan Shen / Ai2 Researcher
10.
@unsorsodicorda @saurabh_shah2 @Tim_Dettmers GA is trained using 4.5 Air as the teacher, whereas SERA-32B uses GLM 4.6.
LLM score 85 · 2 days ago
Tim Dettmers / Research Scientist at Ai2
11.
This work was mostly the genius of Ethan Shen.
LLM score 70 · 3 days ago
Tim Dettmers / Research Scientist at Ai2
12.
Our method became so efficient (26x vs RL; 57x vs other synth gen) that we could easily generate 1000s of trajectories for a single repo.
LLM score 85 · 3 days ago
Tim Dettmers / Research Scientist at Ai2
13.
This is very impactful: you can now distill frontier performance into small models that are specialized to private repositories.
LLM score 75 · 3 days ago
Tim Dettmers / Research Scientist at Ai2
14.
From there we could do a massive number of experiments and really understand what matters for training coding agents.
LLM score 80 · 3 days ago
Cameron Wolfe / Researcher at Netflix
15.
Continual learning is a popular topic in LLM research, but it might not be as far away as we think.
LLM score 65 · 3 days ago
Ethan Shen / Ai2 Researcher
16.
Finally, we conduct an analysis of variance across SWE-Bench runs.
LLM score 95 · 3 days ago
Ethan Shen / Ai2 Researcher
17.
We also experiment with mixing data from rollouts and discover that SERA’s first and second rollouts complement each other: training on both increases performance in data-constrained regimes.
LLM score 85 · 3 days ago
Ethan Shen / Ai2 Researcher
18.
Even more interesting is the fact that the best data comes from truncated trajectories that just barely exceed the context limit.
LLM score 85 · 3 days ago
Niklas Muennighoff / AI Researcher at Stanford
19.
Community-built open benchmarks work really well, e.g., Terminal-Bench, HLE, MMTEB.
LLM score 80 · 3 days ago
Noam Brown / OpenAI Research Scientist
20.
Had to cut this one for space: 2019: AI can't create art—creativity is uniquely human
LLM score 20 · 3 days ago
Andrej Karpathy / AI researcher
21.
@0xabi96 It feels like I’m cheating.
LLM score 70 · 3 days ago
Andrej Karpathy / AI researcher
22.
@ChiragLathiya The nearest neighbor really is some kind of a junior engineer.
LLM score 80 · 3 days ago
Andrej Karpathy / AI researcher
23.
@jeremytwei Love the term "comprehension debt". I haven't encountered it before, and it's very accurate.
LLM score 30 · 3 days ago
Andrej Karpathy / AI researcher
24.
@airesearch12 💯 @ Spec-driven development. It's the limit of the imperative -> declarative transition, basically being declarative entirely.
LLM score 60 · 3 days ago
Andrej Karpathy / AI researcher
25.
A few random notes from Claude coding quite a bit over the last few weeks.
LLM score 20 · 3 days ago
Noam Brown / OpenAI Research Scientist
26.
1987: AI can't win at chess—planning is uniquely human
LLM score 70 · 3 days ago
Yoshua Bengio
27.
Had a great discussion on AI's major societal impacts and emerging risks on the @RestIsPolitics podcast this past November.
LLM score 20 · 3 days ago
Jonathan Ross / TPU Creator
28.
Success in the Information Age was about being able to answer questions.
LLM score 20 · 3 days ago
Ben Burtenshaw / Hugging Face Researcher
29.
this is a blog post on claude + llama.cpp https://t.co/yej6WsNnQA
LLM score 20 · 4 days ago
Yoshua Bengio
30.
Thanks to @HugoDecrypte for the invitation to give an overview of AI risks as well as the solutions needed to move toward a better trajectory.
LLM score 20 · 4 days ago