Maybe the biggest insight from watching these videos is how much the revolution results from combining a few key shortcuts ("drop", "attention") with massive raw computing power.
Nothing very deep statistically about "deep learning," except the surprise of how well it works.
RT @lugaricano: This 2023 MIT course is a fantastic survey of the state of the art today in neural networks/deep learning/LLMs etc. Easy to follow with just basic knowledge of matrix alg…
🐦🔗: https://n.respublicae.eu/lugaricano/status/1647380589689880577