RT @schrep: 2 important details:
1) Shows that smaller models trained on more data can outperform larger models (e.g., the 13B model outperforms GPT-3 175B).
2) The largest 65B model is competitive with the best models — and is freely available to the research community!
https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/
🐦🔗: https://n.respublicae.eu/lugaricano/status/1629230230304063489