- Sheryl Hsu / OpenAI Researcher
- Alexander Wei / OpenAI Researcher · 1202. 3/ We’ve come a long way since last summer. › LLM score 70 · 8 months ago
- Sheryl Hsu / OpenAI Researcher
- Sheryl Hsu / OpenAI Researcher
- Juntang Zhuang / MTS at xAI (pre-training lead) · 1205. It’s extremely fun though tough to train the first natively multimodal model ever in xAI. › LLM score 92 · 8 months ago
- Geoffrey Hinton · 1206. A major cut to the funding of the National Science Foundation would be very bad for the future of the US. › LLM score 20 · 8 months ago
- Ted Sanders / OpenAI Researcher · 1207. a cool thing you get to see building AI products: › LLM score 75 · 8 months ago
- Ted Sanders / OpenAI Researcher · 1208. GPT-5 is here! it's way better at coding - not just in pointless evals, but real usage. › LLM score 70 · 8 months ago
- Jeremy Bernstein / Thinking Machines Researcher · 1209. I had wondered why there was no official Dion implementation by the authors... › LLM score 75 · 8 months ago
- Sally Zhu / Researcher at Flapping Airplanes · 1210.
- Tri Dao / Chief Scientist at Together · 1211. Hierarchical layout is super elegant. › LLM score 85 · 9 months ago
- Jason Wei / AI Researcher at Meta
- Jason Wei / AI Researcher at Meta · 1213. New blog post about asymmetry of verification and "verifier's law": https://t.co/bvS8HrX1jP › LLM score 80 · 9 months ago
- Jakub Pachocki / OpenAI Chief Scientist · 1214. I am extremely excited about the potential of chain-of-thought faithfulness & interpretability. › LLM score 80 · 9 months ago
- Lilian Weng / Thinking Machines Cofounder
- Tri Dao / Chief Scientist at Together · 1216. I played w it for 1h. Went through my usual prompts (math derivations, floating point optimizations, …). › LLM score 35 · 9 months ago
- Tri Dao / Chief Scientist at Together · 1217. @RaghuGanti @cHHillee Oh you’d want to use warp reduction if the whole row fits into 1 warp. › LLM score 80 · 9 months ago
- Tri Dao / Chief Scientist at Together · 1218. They’ve finally done it. They got rid of tokenizers! https://t.co/x4CXHdCw0W › LLM score 60 · 9 months ago
- Tri Dao / Chief Scientist at Together
- Tri Dao / Chief Scientist at Together · 1220.
- Tri Dao / Chief Scientist at Together · 1221. Albert articulates really well the trade offs between transformers and SSMs. › LLM score 80 · 9 months ago
- Tri Dao / Chief Scientist at Together
- Shuchao Bi / Meta Researcher
- Yang Chen / Nvidia Research Scientist · 1224. The first thing we did was to make sure the eval setup is correct! › LLM score 92 · 10 months ago
- Yang Chen / Nvidia Research Scientist · 1225. 📢We conduct a systematic study to demystify the synergy between SFT and RL for reasoning models. › LLM score 92 · 10 months ago
- Geoffrey Hinton
- Yang Chen / Nvidia Research Scientist · 1227. Does RL incentive reasoning capability over the starting SFT model? › LLM score 92 · 10 months ago
- Geoffrey Hinton · 1228. I just watched a great compilation of various people's views about what is coming: › LLM score 10 · 10 months ago
- Ludwig Schmidt / Anthropic MTS
- Geoffrey Hinton · 1230. AGI is the most important and potentially dangerous technology of our time. › LLM score 70 · 11 months ago