- Tri Dao / Chief Scientist at Together121.This is what we've been cooking for the last 9 months: make MoE training go ~2x faster with ~2x less memory! Highlights:›LLM score 85 · about 1 month ago
- Noam Shazeer122.Gemini 3 Flash is live.›LLM score 70 · about 1 month ago
- Demis Hassabis / CEO of DeepMind123.
- Noam Brown / OpenAI Research Scientist124.Our efforts at @OpenAI to advance scientific progress aren't just limited to math/physics/coding.›LLM score 20 · about 1 month ago
- Demis Hassabis / CEO of DeepMind125.Always enjoy discussing the big picture with @FryRsquared.›LLM score 20 · about 1 month ago
- Dan Fu / VP of Kernels at Together126.@maxencefrenette Good point! I think nuanced discussions of the task, and task-specific architectures are key for this.›LLM score 70 · about 2 months ago
- Tri Dao / Chief Scientist at Together127.Nvidia continues to put out some of the strongest and fastest open models.›LLM score 80 · about 2 months ago
- Dan Fu / VP of Kernels at Together
- Dan Fu / VP of Kernels at Together129.New blog post about paths to AGI and arguing why it’s too early to say AGI is resource limited.›LLM score 80 · about 2 months ago
- Dan Fu / VP of Kernels at Together130.My response to @Tim_Dettmers great post last week that we won't reach AGI because of resource limitations.›LLM score 65 · about 2 months ago
- Noam Brown / OpenAI Research Scientist131.@tszzl Yeah, the Claude 3 announcement from March 2024 still listed GSM8K as one of the benchmarks›LLM score 70 · about 2 months ago
- Demis Hassabis / CEO of DeepMind
- Noam Brown / OpenAI Research Scientist133.
- Noam Brown / OpenAI Research Scientist134.IMO GDPVal is the most important result from our @OpenAI GPT-5.2 launch.›LLM score 80 · about 2 months ago
- Noam Brown / OpenAI Research Scientist135.I'm also really happy that @OpenAI was willing to publish the original GDPVal results showing Claude ahead of ChatGPT.›LLM score 60 · about 2 months ago
- Demis Hassabis / CEO of DeepMind
- Demis Hassabis / CEO of DeepMind
- Demis Hassabis / CEO of DeepMind138.The UK is an amazing place for science & innovation.›LLM score 50 · about 2 months ago
- Andrej Karpathy / AI researcher139.Quick new post: Auto-grading decade-old Hacker News discussions with hindsight›LLM score 80 · about 2 months ago
- Tim Dettmers / Research Scientist at Ai2140.Many people think AI will continue to improve towards AGI.›LLM score 80 · about 2 months ago
- Tim Dettmers / Research Scientist at Ai2
- Andrej Karpathy / AI researcher142.In today's episode of programming horror... In the Python docs of random.seed() def, we're told›LLM score 40 · about 2 months ago
- Noam Brown / OpenAI Research Scientist143.From inception to release, the journal publication process can easily take over a year.›LLM score 70 · about 2 months ago
- Igor Babuschkin / Cofounder of xAI144.SGLang is the best inference framework for LLMs.›LLM score 20 · about 2 months ago
- Andrej Karpathy / AI researcher145.
- Noam Brown / OpenAI Research Scientist146.
- Andrej Karpathy / AI researcher147.@DimitrisPapail There is definitely work going into engineering the "you" simulation - the personality that gets all the rewards in verifiable problems, or all the upvotes from users/judge LLMs, or mimics the responses of SFT, and there is an emergent composite personality from that.›LLM score 30 · about 2 months ago
- Andrej Karpathy / AI researcher148.Don't think of LLMs as entities but as simulators.›LLM score 85 · about 2 months ago
- Demis Hassabis / CEO of DeepMind149.Gemini has always had exceptionally strong multimodal capabilities.›LLM score 15 · about 2 months ago
- Noam Brown / OpenAI Research Scientist150.@deredleritt3r Ah yeah that could have been worded better.›LLM score 75 · about 2 months ago