Attention Mechanism

Also known as: Attention

A technique in neural networks that lets a model focus on the relevant parts of the input when generating each part of the output, rather than relying solely on a single fixed-length context vector. In sequence-to-sequence models, attention computes a weighted combination of all encoder states when predicting each output token. The weights come from a learned scoring function that rates how relevant each encoder state is to the current decoding step, and they are normalized (typically with a softmax) so that they sum to one. This helps models handle long sequences and capture dependencies between distant elements. In accessibility applications, attention mechanisms improve sign language translation by letting models focus on the most relevant video frames when generating each word, and they enhance image captioning by attending to specific image regions as each part of the caption is produced.
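
The weighted combination described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: it assumes scaled dot-product scoring (one common choice among several), and the names `softmax`, `attend`, `encoder_states`, and `query`, along with all of the numbers, are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Return (weights, context) for a single decoding step.

    Assumption: relevance is scored with a scaled dot product between
    the query and each encoder state.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * h for q, h in zip(query, state)) / scale
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: the weighted sum of all encoder states.
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Three made-up encoder states and one made-up decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, context = attend(query, encoder_states)
```

Here the first and third encoder states align with the query equally well, so they receive equal weights that are larger than the weight on the second state. In a trained model the query and encoder states would be learned representations rather than hand-picked vectors.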

Category: Machine Learning · Deep Learning · Natural Language Processing

Related: Transformer · Sequence-to-Sequence · LSTM · Neural Network
