The latest in AI research: this video introduces TransformerFAM (Feedback Attention Memory), presented by https://www.youtube.com/channel/UCK8sQmJBp8GCxrOtXWBpyEA, a novel architecture designed to enhance Transformers by incorporating a feedback mechanism that emulates working memory.
It also introduces TransformerBSWA (Block Sliding Window Attention).
Based on Ring Attention, covered by https://www.youtube.com/channel/UCwbsWIWfcOL2FiUZ2hKNJHQ
This design allows the Transformer to maintain awareness of its own latent representations across different blocks of data, improving its ability to process indefinitely long sequences without additional computational overhead. Unlike traditional Transformers that suffer from quadratic complexity with sequence length, TransformerFAM operates with linear complexity, making it better suited for handling extensive data sequences efficiently.
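To make the linear-complexity point concrete, here is a rough NumPy sketch of block sliding window attention: each block of queries attends only to its own block plus a fixed number of preceding blocks, so compute grows linearly with sequence length. This is my own toy illustration, not the authors' code; the names softmax, bswa, block_size and mem_blocks are illustrative choices.

# Minimal single-head sketch of block sliding window attention (BSWA-style).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bswa(q, k, v, block_size=4, mem_blocks=1):
    # Each block of queries attends to its own block plus `mem_blocks`
    # preceding blocks, so the cost scales linearly with sequence length.
    L, d = q.shape
    out = np.zeros_like(v)
    for start in range(0, L, block_size):
        end = min(start + block_size, L)
        ctx_start = max(0, start - mem_blocks * block_size)
        scores = q[start:end] @ k[ctx_start:end].T / np.sqrt(d)
        out[start:end] = softmax(scores) @ v[ctx_start:end]
    return out

# Toy usage: 16 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8)); k = rng.normal(size=(16, 8)); v = rng.normal(size=(16, 8))
print(bswa(q, k, v).shape)  # (16, 8)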
TransformerFAM integrates seamlessly with existing pretrained models and does not introduce new weights; it retains and compresses past information within a feedback loop that runs across sequence blocks. This lets the model manage long-term dependencies effectively, improving performance on tasks that require extensive context awareness. The architecture's feedback loop mimics mechanisms found in biological neural networks, proposing a scalable answer to the limitations of current Transformers when processing long sequences.
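And a hedged sketch of the feedback idea itself: a few FAM vectors are appended to each block's attention context and then refreshed by attending over that block, so past information is compressed and carried forward. Again, this is a simplified toy version under my own assumptions; names like fam_block_attention and n_fam are illustrative, not from the paper.

# Toy feedback-memory loop: `fam` acts as working memory carried across blocks.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d)) @ v

def fam_block_attention(x, block_size=4, n_fam=2):
    # Process x block by block; the FAM vectors are appended to each block's
    # key/value context and then updated by attending over that same context.
    L, d = x.shape
    fam = np.zeros((n_fam, d))  # feedback memory; no new weights in this toy version
    out = np.zeros_like(x)
    for start in range(0, L, block_size):
        blk = x[start:start + block_size]
        ctx = np.concatenate([blk, fam], axis=0)       # block tokens + memory
        out[start:start + block_size] = attend(blk, ctx, ctx)
        fam = attend(fam, ctx, ctx)                    # compress block into new memory
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
print(fam_block_attention(x).shape)  # (16, 8)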
00:00 3 videos on infinity context length
01:19 Visualization of the new TransformerFAM
02:58 Pseudocode for the two new Transformers
04:14 Basics of Attention calculations
07:00 TransformerBSWA - Block Sliding Window Attention
12:15 TransformerFAM - Feedback Attention Memory
14:47 Symmetries in operational feedback code
20:09 Time series visualization of new FAM and BSWA
23:24 Outlook on Reasoning w/ TransformerFAM
All rights w/ the authors:
https://arxiv.org/pdf/2404.09173.pdf
TransformerFAM: Feedback attention is working memory
#airesearch
#ai