Ring Attention enables context lengths of 1 million tokens for our latest LLMs and VLMs. How is this possible? What happens to the quadratic complexity of self-attention in sequence length?
In this video, I explain the Blockwise Parallel Transformer idea from UC Berkeley, all the way to the actual code implementation on GitHub for Ring Attention with Blockwise Transformers.
The current Google Gemini 1.5 Pro has a context length of 1 million tokens on Vertex AI.
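To make the complexity question concrete, here is a minimal blockwise-attention sketch in JAX (not the authors' code): keys and values are processed block by block with a numerically stable online softmax, so the full N x N attention matrix is never materialized. The block size and tensor shapes are illustrative assumptions.

```python
# Minimal sketch of blockwise attention with an online softmax (assumed shapes).
import jax
import jax.numpy as jnp

def blockwise_attention(q, k, v, block_size=128):
    """q, k, v: [seq_len, head_dim]; returns [seq_len, head_dim]."""
    seq_len, dim = q.shape
    scale = 1.0 / jnp.sqrt(dim)

    out = jnp.zeros_like(q)                      # unnormalized weighted sum of values
    row_max = jnp.full((seq_len, 1), -jnp.inf)   # running max of scores per query
    row_sum = jnp.zeros((seq_len, 1))            # running softmax denominator

    for start in range(0, seq_len, block_size):
        k_blk = k[start:start + block_size]
        v_blk = v[start:start + block_size]

        scores = (q @ k_blk.T) * scale           # [seq_len, block_size], never [seq_len, seq_len]
        blk_max = scores.max(axis=-1, keepdims=True)
        new_max = jnp.maximum(row_max, blk_max)

        corr = jnp.exp(row_max - new_max)        # rescale old accumulators to the new max
        p = jnp.exp(scores - new_max)
        out = out * corr + p @ v_blk
        row_sum = row_sum * corr + p.sum(axis=-1, keepdims=True)
        row_max = new_max

    return out / row_sum
```

The result matches jax.nn.softmax(q @ k.T / sqrt(d)) @ v, but the intermediate tensors scale with the block size instead of the full sequence length.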
00:00 3 ways for infinite context lengths
02:05 Blockwise Parallel Transformers
03:11 Q, K, V explained in a library
06:10 BPT explained in a library
11:30 Maths for blockwise parallel transformers
12:41 Ring attention symmetries
14:25 Ring attention explained
19:52 Ring attention JAX code
23:59 Outlook: Infini Attention by Google
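The "Ring attention JAX code" chapter walks through the repository; as a rough orientation, here is a minimal sketch (not the repository's implementation) of the ring pattern it builds on: each device keeps its query block fixed and rotates its key/value block to the neighbouring device with jax.lax.ppermute, accumulating blockwise attention until every block has visited every device. The axis name 'ring', the axis_size argument, and the shard shapes are assumptions for illustration.

```python
# Minimal per-device sketch of the ring rotation pattern (assumed axis name and size).
import jax
import jax.numpy as jnp

def ring_attention_shard(q_blk, k_blk, v_blk, axis_name='ring', axis_size=8):
    """Per-device shard; intended to run inside jax.pmap(..., axis_name='ring')."""
    scale = 1.0 / jnp.sqrt(q_blk.shape[-1])
    out = jnp.zeros_like(q_blk)
    row_max = jnp.full((q_blk.shape[0], 1), -jnp.inf)
    row_sum = jnp.zeros((q_blk.shape[0], 1))
    # Each device sends its key/value shard to its right-hand neighbour.
    perm = [(j, (j + 1) % axis_size) for j in range(axis_size)]

    for _ in range(axis_size):  # after axis_size hops, every K/V block has visited every device
        scores = (q_blk @ k_blk.T) * scale
        blk_max = scores.max(axis=-1, keepdims=True)
        new_max = jnp.maximum(row_max, blk_max)
        corr = jnp.exp(row_max - new_max)
        p = jnp.exp(scores - new_max)
        out = out * corr + p @ v_blk
        row_sum = row_sum * corr + p.sum(axis=-1, keepdims=True)
        row_max = new_max
        # Rotate key/value shards one hop around the device ring.
        k_blk = jax.lax.ppermute(k_blk, axis_name, perm)
        v_blk = jax.lax.ppermute(v_blk, axis_name, perm)

    return out / row_sum
```

In the paper, the transfer of the next key/value block is overlapped with the attention computation on the current one, which is how the communication cost is hidden.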
All rights w/ authors:
Ring Attention with Blockwise Transformers for Near-Infinite Context
https://arxiv.org/pdf/2310.01889.pdf
#airesearch
#ai