This educational Live Script explores the transformer, the neural network architecture behind current large language models. A character-level language model assigns a probability to each character that could continue a piece of text, and generates text by repeatedly drawing the next character from that distribution. We build a complete character-level model, train it on the text of Shakespeare, and examine the attention operation that lets each position in the text depend on the positions before it.
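The two ideas in this paragraph, causal attention and sampling the next character, can be illustrated with a minimal sketch. It is written in plain Python rather than in the Live Script's own MATLAB and is not taken from the script. The function names (`softmax`, `causal_attention`, `sample_next_char`) and the toy inputs are assumptions made for illustration, and the attention shown is a single head without the learned query, key, and value projections a real transformer would apply.

```python
import math
import random


def softmax(scores):
    """Turn a list of real scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Position t attends only to positions 0..t, so its output cannot
    depend on characters that come later in the text.
    """
    d = len(queries[0])
    outputs = []
    for t, q in enumerate(queries):
        # Scores against the current and earlier positions only.
        scores = [
            sum(qi * ki for qi, ki in zip(q, keys[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors seen so far.
        out = [
            sum(w * values[s][j] for s, w in enumerate(weights))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


def sample_next_char(probs, vocab, rng):
    """Draw one character from a next-character distribution."""
    return rng.choices(vocab, weights=probs, k=1)[0]


# Toy usage (hypothetical data): three positions with 2-dimensional vectors.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_attention(x, x, x)

# Pretend a model produced these scores for the next character.
vocab = ["a", "b", "c"]
probs = softmax([0.1, 2.0, -1.0])
print(sample_next_char(probs, vocab, random.Random(0)))
```

Changing the vector at the last position leaves the outputs at earlier positions unchanged, which is the sense in which each position depends only on the positions before it.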
The model built here has roughly 1.1e5 adjustable parameters. Its architecture resembles that of frontier models, which have on the order of 1e11 to 1e12 parameters. The construction follows the nanoGPT model of A. Karpathy [1].
This script may interest students and instructors of physics and other fields. It is appropriate for a first course that includes neural networks and assumes familiarity with a basic classifier network of the kind developed in Identify Objects Acoustically with a Neural Network [2]. A Background Information section describes the transformer, and interactive 'Try this' suggestions, coding 'Challenges,' and references are included for further exploration. Additional educational Live Scripts by the author are also available.
Cite as
Duncan Carlsmith (2026). nanoGPT Explorer (https://it.mathworks.com/matlabcentral/fileexchange/183953-nanogpt-explorer), MATLAB Central File Exchange. Retrieved .
General information
- Version 1.0.0 (4.37 MB)
MATLAB release compatibility
- Compatible with any release
Platform compatibility
- Windows
- macOS
- Linux
| Version | Published | Release notes | Action |
|---|---|---|---|
| 1.0.0 | | | |
