An interesting scene played out at the 2024 GTC conference: NVIDIA founder Jensen Huang joined a panel discussion with eight former Google engineers, one of whom, surprisingly, is a co-founder of NEAR. Seven years earlier, these engineers had co-authored the paper “Attention Is All You Need,” which has since been cited more than 110,000 times. They probably did not expect that this research, published on June 12, 2017, would profoundly reshape the entire AI industry.
How Transformers Revolutionized AI Learning Methods
Imagine the human brain as an Amazon rainforest—full of various functional areas connected by dense pathways. Neurons are like messengers along these pathways, capable of sending and receiving signals to any part of the brain. This structure endows the human brain with powerful learning and recognition abilities.
The Transformer architecture is a neural network attempting to replicate this mechanism. By introducing the self-attention mechanism, it broke through the bottleneck of early RNNs (Recurrent Neural Networks)—which could only process sequential data step by step—allowing Transformers to analyze all parts of a sequence simultaneously, capturing long-range dependencies and context. Of course, current technology still falls far short of the human brain’s capabilities.
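To make the contrast with step-by-step RNNs concrete, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the core operation from the paper; the sequence length, dimensions, and random weights are toy assumptions, not a full multi-head Transformer.

```python
# Minimal single-head scaled dot-product self-attention (toy sizes, NumPy only).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model) token embeddings; Wq/Wk/Wv: learned projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # every position scores every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the whole sequence
    return weights @ V                               # context-aware representations

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16                             # illustrative toy dimensions
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)           # (5, 16): all positions processed at once
```

Unlike an RNN, nothing here iterates over time steps: every position attends to every other position in a single matrix operation, which is what lets long-range dependencies be captured directly.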
From voice recognition applications like Siri to today’s ChatGPT, the evolution of AI has been driven by iterative improvements in Transformer-based models: XLNet, BERT, GPT, and other derivatives have emerged. Among them, GPT is the most well-known, but it still has significant limitations in event prediction capabilities.
The Next Key for Large Language Models: Temporal Fusion
The core contribution of “Attention Is All You Need” is the attention mechanism, and the next leap in AI will come from the Temporal Fusion Transformer (TFT). When large language models (LLMs) can predict future events based on historical data and patterns, it will mark a significant step toward Artificial General Intelligence (AGI).
TFT not only predicts future values but also explains its prediction logic. This capability has unique application value in the blockchain field. By defining rules within the model, TFT can automatically manage consensus processes, increase block production speed, reward honest validators, and penalize malicious actors.
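As a rough illustration of that idea (a toy sketch, not the actual Temporal Fusion Transformer architecture), the snippet below attends over past observations to produce a forecast while exposing which time steps drove it; the data and scoring rule are invented for demonstration.

```python
# Toy attention-based forecaster: the attention weights double as an explanation.
import numpy as np

def attention_forecast(history, query):
    """history: (T, d) past feature vectors; query: (d,) encoding of the current state."""
    scores = history @ query / np.sqrt(len(query))    # relevance of each past step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over past time steps
    forecast = weights @ history[:, 0]                # weighted prediction of feature 0
    return forecast, weights                          # weights show *why* this value was predicted

rng = np.random.default_rng(1)
history = rng.normal(size=(8, 4))                     # 8 past steps, 4 features (toy data)
forecast, weights = attention_forecast(history, history[-1])
print("forecast:", round(float(forecast), 3))
print("most influential past step:", int(weights.argmax()))
```

In the real TFT, variable-selection networks and multi-horizon attention play this explanatory role, but the principle of reading the learned weights as a rationale for the forecast is the same.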
New Possibilities for Blockchain Consensus Mechanisms
The consensus process in a public blockchain network is essentially a game among validators: more than two-thirds of the validator set (typically weighted by stake) must agree on the next block. This coordination is costly and often contentious, leading to inefficiencies in networks like Ethereum.
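For intuition only, a toy stake-weighted supermajority check might look like this; the validator names and stake values are hypothetical, and real networks tally votes according to their own protocol rules.

```python
# Hypothetical stake-weighted two-thirds supermajority check.
def has_supermajority(votes_for_block: dict, total_stake: int) -> bool:
    """votes_for_block maps validator id -> stake that attested to the proposed block."""
    return 3 * sum(votes_for_block.values()) > 2 * total_stake   # strictly more than 2/3

votes = {"val_a": 40, "val_b": 30, "val_c": 5}    # invented stakes
print(has_supermajority(votes, total_stake=100))  # True: 75 of 100 staked units agreed
```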
The introduction of TFT offers a new approach. A public chain could build a reputation scoring system from each validator's voting history, block proposal record, slashing history, staked amount, and activity level. Validators with higher reputation scores would receive a larger share of block rewards, thereby improving block production efficiency.
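A hedged sketch of what such a reputation score could look like is shown below; the features, weights, and thresholds are invented for illustration and are not taken from any specific chain or from BasedAI.

```python
# Illustrative validator reputation score; all weights are arbitrary assumptions.
from dataclasses import dataclass

@dataclass
class ValidatorStats:
    vote_participation: float   # share of recent votes cast correctly and on time (0..1)
    blocks_proposed: int        # blocks successfully proposed
    slash_count: int            # times slashed for misbehaviour
    stake: float                # tokens staked
    uptime: float               # activity level (0..1)

def reputation_score(v: ValidatorStats) -> float:
    return (0.35 * v.vote_participation
            + 0.20 * min(v.blocks_proposed / 100, 1.0)   # cap so prolific proposers don't dominate
            + 0.25 * min(v.stake / 32_000, 1.0)
            + 0.20 * v.uptime
            - 0.50 * v.slash_count)                      # slashing weighs heavily

v = ValidatorStats(vote_participation=0.98, blocks_proposed=42,
                   slash_count=0, stake=64_000, uptime=0.999)
print(round(reputation_score(v), 3))   # higher scores would earn a larger reward share
```

In the scheme the article describes, a TFT-style model would learn and update weights like these from validators' historical behaviour rather than having them fixed by hand.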
The BasedAI project is exploring this path: it plans to use a TFT model to allocate token issuance among validators and network participants, and it integrates Fully Homomorphic Encryption (FHE) so that developers can deploy privacy-preserving large language models (ZK-LLMs) on its decentralized AI infrastructure, “Brains.”
Privacy Encryption: A Key Step Toward AGI
The appeal of FHE is that users can receive personalized AI services while their data stays encrypted throughout the computation. Privacy-preserving techniques such as Zero-Knowledge Machine Learning (zkML), Blind Computation, and Homomorphic Encryption are all working to close this gap between personalization and data privacy.
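As a small illustration of computing on encrypted data, the sketch below assumes the open-source TenSEAL library (CKKS scheme) with toy values for a user's features and a hypothetical server-side model; a real private-inference pipeline is considerably more involved.

```python
# Encrypted dot product with TenSEAL (CKKS); values are toy examples.
import tenseal as ts

context = ts.context(ts.SCHEME_TYPE.CKKS,
                     poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

user_features = [0.2, 1.5, 3.1]                        # private user data (client side)
enc_features = ts.ckks_vector(context, user_features)  # encrypted before leaving the device

model_weights = [0.4, -0.1, 0.05]                      # hypothetical server-side model
enc_score = enc_features.dot(model_weights)            # computed without decrypting the input

print(enc_score.decrypt())                             # only the secret-key holder can read this
```

The server never sees the plaintext features yet still produces a usable (encrypted) result, which is exactly the property that makes personalized-but-private AI services plausible.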
When people are confident that their data is protected by encryption and are willing to contribute data under strong privacy guarantees, we may be close to breakthroughs in AGI. This is because achieving AGI requires vast amounts of multi-dimensional data, but current user concerns about data security limit data flow.
However, challenges remain: these privacy-preserving technologies are all computationally expensive, which keeps them at an early stage of adoption, and large-scale deployment is still some way off. But the trend is clear: the door opened by “Attention Is All You Need” will be pushed further into the next era by the convergence of privacy, computation, and consensus.