AI300
Alec Radford / OpenAI Researcher
1951.
By the way - I think a valid (if extreme) take on GPT-2 is "lol you need 10,000x the data, 1 billion parameters, and a supercomputer to get current DL models to generalize to Penn Treebank."
LLM score 82 · over 7 years ago
Alec Radford / OpenAI Researcher
1952.
The DL CV community is having an "oh wait, bags of local features are a really strong baseline for classification" moment with the BagNet paper.
LLM score 92 · over 7 years ago
Alec Radford / OpenAI Researcher
1953.
Nice discussion of the progress in NLU that's happening with BERT, OpenAI GPT, ULMFiT, ELMo, and more, covered by @CadeMetz in the @nytimes. I'm super excited to see how far this line of research will be able to get in the next few years!
LLM score 8 · over 7 years ago
Alec Radford / OpenAI Researcher
1954.
Been meaning to check this - thanks @Thom_Wolf ! Random speculation: the bit of weirdness going on in BERT's position embeddings compared to GPT is due to the sentence similarity task.
LLM score 82 · over 7 years ago