nanoGPT Explorer

Educational Live Script exploring how a transformer language model works, using a miniature GPT trained on Shakespeare or novels.


This educational Live Script explores the transformer, the neural network architecture behind current large language models. A language model assigns a probability to each character that could continue a piece of text, and generates text by repeatedly drawing the next character from that distribution. We build a complete character-level model, train it on Shakespeare's text, and examine the attention operation that lets each position in the text depend on the positions before it.
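The generation loop described above can be sketched in a few lines. This is a minimal Python illustration, not the Live Script's MATLAB code: the function `toy_logits` is a hypothetical stand-in for a trained model, and the five-character vocabulary is invented for the example.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_logits(context, vocab):
    """Hypothetical stand-in for a trained model: it favors the character
    that follows the last context character in vocabulary order."""
    last = vocab.index(context[-1])
    return [3.0 if i == (last + 1) % len(vocab) else 0.0
            for i in range(len(vocab))]

def generate(prompt, vocab, n_chars, rng):
    """Repeatedly draw the next character from the model's distribution."""
    text = prompt
    for _ in range(n_chars):
        probs = softmax(toy_logits(text, vocab))
        next_char = rng.choices(vocab, weights=probs, k=1)[0]
        text += next_char
    return text

vocab = list("abcd ")  # assumed toy vocabulary
sample = generate("a", vocab, 20, random.Random(0))
print(sample)
```

A trained transformer would replace `toy_logits`, computing the scores from the whole context rather than only its last character; the sampling loop itself stays the same.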
The model built here has roughly 1.1e5 adjustable parameters. Its architecture is similar to that of frontier models with on the order of 1e11 to 1e12 parameters. The construction follows the nanoGPT model of A. Karpathy [1].
This script may interest students and instructors of physics and other fields. It is appropriate for a first course that includes neural networks and assumes familiarity with a basic classifier network of the kind developed in Identify Objects Acoustically with a Neural Network [2]. A Background Information section describes the transformer, and interactive 'Try this' suggestions, coding 'Challenges,' and references are included for further exploration. Additional educational Live Scripts by the author may be found here.
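The attention operation at the core of this architecture can be illustrated with a single causal attention head. The Python sketch below is an illustration under simplifying assumptions, not the Live Script's implementation: it uses the inputs directly as queries, keys, and values, whereas a transformer first applies learned linear projections, and the three input vectors are invented for the example.

```python
import math

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.
    q, k, v are lists of T vectors (lists of floats)."""
    T, d = len(q), len(q[0])
    outputs, weights = [], []
    for t in range(T):
        # Score position t only against positions 0..t; later ones are masked.
        scores = [sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        m = max(scores)
        exps = [math.exp(x - m) for x in scores]
        z = sum(exps)
        # Softmax weights over the past, zero weight on future positions.
        w = [e / z for e in exps] + [0.0] * (T - t - 1)
        weights.append(w)
        # Output is the weighted average of the value vectors.
        outputs.append([sum(w[s] * v[s][j] for s in range(T))
                        for j in range(len(v[0]))])
    return outputs, weights

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # assumed toy inputs, T = 3, d = 2
outputs, weights = causal_attention(x, x, x)
for row in weights:
    print([round(p, 3) for p in row])
```

Each printed row is the attention pattern for one position. The first position can attend only to itself, and every entry above the diagonal is zero, which is what makes each output depend only on the current and earlier positions.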

Cite As

Duncan Carlsmith (2026). nanoGPT Explorer (https://it.mathworks.com/matlabcentral/fileexchange/183953-nanogpt-explorer), MATLAB Central File Exchange. Retrieved.

General Information

MATLAB Release Compatibility

  • Compatible with any release

Platform Compatibility

  • Windows
  • macOS
  • Linux
Version Published Release Notes
1.0.0