Docy Child


Estimated reading: 1 minute 56 views

A model architecture at the core of most state of the art (SOTA) ML research. It is composed of multiple “attention” layers which learn which parts of the input data are the most important for a given task. Transformers started in language modeling, then expanded into computer vision, audio, and other modalities.

Share to...