Crypto Empowers AI Development in 7 Major Directions (with Representative Potential Projects)

Original author | @cebillhsu
Compiled by | Golem

GPT-4, Gemini 1.5, and Microsoft's AI PCs show how quickly AI technology is advancing, yet AI development still faces real problems. AppWorks Web3 researcher Bill has studied these issues in depth and identified seven directions in which crypto can empower AI.
Tokenization of Data
Traditional AI training relies mainly on publicly available Internet data, or more precisely, data circulating in the public domain. Apart from a few companies that offer open APIs, most data remains untapped. A key direction is therefore enabling more data holders to contribute or license their data for AI training while preserving privacy.
The biggest challenge in this field, however, is that data is hard to standardize the way computing power is. Distributed computing power can be quantified by GPU type, but the quantity, quality, and permitted uses of private data are difficult to measure. If distributed computing power is like ERC-20, tokenized datasets are closer to ERC-721, which makes liquidity and market formation considerably harder.
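To make the analogy concrete, here is a minimal Python sketch (all classes, prices, and scores are hypothetical, not from any real protocol): fungible compute credits can be pooled and priced per unit, while each dataset token carries its own metadata and needs individual appraisal.

```python
from dataclasses import dataclass

# Fungible "compute credit": one unit is interchangeable with any other,
# so a single market price per unit suffices (ERC-20-like).
@dataclass
class ComputeCredit:
    units: float  # e.g. GPU-hours of a known card type

    def price(self, unit_price: float) -> float:
        return self.units * unit_price

# Non-fungible "dataset token": every dataset differs in size, quality,
# and license, so each token must be appraised on its own (ERC-721-like).
@dataclass
class DatasetToken:
    token_id: int
    rows: int
    quality_score: float   # 0.0-1.0, however the market chooses to assess it
    license_terms: str

    def price(self, base_per_row: float) -> float:
        # Toy appraisal: value scales with size and quality.
        return self.rows * base_per_row * self.quality_score

# Pooling works trivially for fungible credits...
pool = sum(c.units for c in [ComputeCredit(10), ComputeCredit(2.5)])
print(f"pooled compute: {pool} GPU-hours")

# ...but dataset tokens resist pooling: summing them discards the metadata
# that determines each one's value, which is why liquidity is harder.
d1 = DatasetToken(1, rows=1_000_000, quality_score=0.9, license_terms="research-only")
d2 = DatasetToken(2, rows=50_000, quality_score=0.4, license_terms="commercial")
print(f"dataset #1 appraisal: {d1.price(0.0001):.2f}")
print(f"dataset #2 appraisal: {d2.price(0.0001):.2f}")
```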
Ocean Protocol's Compute-to-Data feature lets data owners monetize private data while preserving privacy, and Vana offers Reddit users a way to aggregate their data and sell it to companies training large AI models.
Resource Allocation
There is currently a large gap between the supply of and demand for GPU computing power. Large companies monopolize most GPU resources, making model training very expensive for small teams. Many projects aim to cut costs by aggregating small-scale, under-utilized GPUs into decentralized networks, but they still face significant challenges in guaranteeing stable compute and sufficient bandwidth.
Incentivized RLHF
RLHF (Reinforcement Learning from Human Feedback) is crucial for improving large models, but it requires trained professionals, and as market competition intensifies, hiring them gets more expensive. One of the biggest costs in data annotation is having supervisors check quality. A stake-and-slash system can reduce this cost while maintaining annotation quality: blockchains have long used economic incentives to guarantee the quality of work (PoW, PoS), and a well-designed token economy should be able to cut the cost of RLHF in the same way.
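A toy Python sketch of that economic logic (every parameter, threshold, and function here is made up for illustration): annotators lock a stake, hidden gold-label tasks are used for spot checks, and poor accuracy is slashed instead of being caught by a paid supervisor.

```python
STAKE = 100.0          # tokens an annotator must lock before working
REWARD_PER_TASK = 1.0  # paid for each accepted annotation
ACCURACY_FLOOR = 0.8   # spot-check accuracy below this triggers slashing
SLASH_FRACTION = 0.5   # share of stake burned on failure

def spot_check(annotations: dict[int, str], gold: dict[int, str]) -> float:
    """Accuracy on the subset of tasks that have hidden gold labels."""
    checked = [tid for tid in gold if tid in annotations]
    if not checked:
        return 1.0
    hits = sum(annotations[tid] == gold[tid] for tid in checked)
    return hits / len(checked)

def settle(stake: float, annotations: dict[int, str], gold: dict[int, str]) -> float:
    """Return the annotator's balance after rewards or slashing."""
    accuracy = spot_check(annotations, gold)
    if accuracy < ACCURACY_FLOOR:
        # Slashing replaces part of the supervisor's job:
        # low-quality work costs more than it earns.
        return stake * (1 - SLASH_FRACTION)
    return stake + REWARD_PER_TASK * len(annotations)

# Gold labels are hidden among ordinary tasks, so annotators cannot
# tell which answers will be audited.
gold = {3: "positive", 7: "negative"}
honest = {i: ("positive" if i == 3 else "negative") for i in range(10)}
lazy = {i: "positive" for i in range(10)}  # answers everything the same way

print("honest annotator balance:", settle(STAKE, honest, gold))  # 110.0
print("lazy annotator balance:  ", settle(STAKE, lazy, gold))    # 50.0
```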
For example, Sapien AI has introduced Tag 2 Earn and partnered with several GameFi guilds; Hivemapper has used token incentives to collect mapping data covering 2 million kilometers of roads; and QuillAudits plans to launch an open-source smart-contract auditing agent that all auditors can train collectively in exchange for rewards.
Verifiability
How can users verify that a computing-power provider actually executed an inference task with the specified model and requirements? Today they cannot check the authenticity or accuracy of AI models and their outputs, and this lack of verifiability breeds distrust, errors, and even concrete harm in fields such as finance, medicine, and law.
With cryptographic verification schemes such as ZKPs (zero-knowledge proofs), optimistic fraud proofs (OP), and TEEs (trusted execution environments), inference providers can prove that an output was generated by a specific model. The benefits: the provider keeps the model confidential, users can verify that it was executed correctly, and succinct proofs can be checked by smart contracts, sidestepping the blockchain's own computational limits. Running AI directly on the device is also worth considering as a way around the performance problem, but no satisfactory answer has emerged so far. Projects in this field include Ritual, ORA, and Aizel Network.
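A stripped-down Python illustration of what any of these schemes must establish (the model, commitment format, and verifier below are hypothetical): the provider commits to exact model weights, and verification checks that the output really came from the committed model. The naive recomputation at the end is precisely what a ZK proof, fraud proof, or TEE attestation replaces.

```python
import hashlib
import json

def digest(obj) -> str:
    """Deterministic hash used for commitments."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

# Stand-in for a real model: any deterministic function of (weights, input).
def run_model(weights: list[float], x: list[float]) -> float:
    return sum(w * xi for w, xi in zip(weights, x))

# 1. The provider commits publicly to the exact model it claims to serve.
weights = [0.5, -1.0, 2.0]
model_commitment = digest(weights)

# 2. For a user query, the provider returns the output plus the commitment.
x = [1.0, 2.0, 3.0]
receipt = {"model": model_commitment, "input": x, "output": run_model(weights, x)}

# 3. Verification: here we naively re-run the model, which requires the
#    weights and full recomputation. The whole point of a ZKP/OP/TEE scheme
#    is to let a verifier (or a smart contract) check a small proof instead,
#    without ever seeing the weights.
def naive_verify(receipt: dict, weights: list[float]) -> bool:
    return (digest(weights) == receipt["model"]
            and run_model(weights, receipt["input"]) == receipt["output"])

print("receipt valid:", naive_verify(receipt, weights))
print("tampered output detected:",
      not naive_verify({**receipt, "output": 123.0}, weights))
```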
Deepfakes
With the rise of generative AI, deepfakes are drawing more and more attention, yet deepfake generation is advancing faster than detection, making fakes ever harder to spot. Digital watermarking standards such as C2PA can help identify deepfakes, but they have limits: once an image has been edited, the public can no longer verify the signature on the original, and verification from the processed image alone is very difficult.
Blockchain technology can attack the deepfake problem from several angles. Hardware attestation: cameras with tamper-proof chips can embed a cryptographic proof in every original photo to vouch for its authenticity. Immutability: an image's metadata can be recorded on-chain with a timestamp, preventing tampering and proving provenance. Identity: wallets can attach cryptographic signatures to published posts to verify who created the content, and zk-based KYC infrastructure can bind wallets to verified identities while protecting user privacy. Finally, economic incentives can punish authors who post false information and reward users who identify it.
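A minimal sketch of the capture-and-verify flow in Python. It uses the third-party `cryptography` package for Ed25519 signatures; the registry, key handling, and record format are illustrative assumptions, not any real device's or chain's scheme.

```python
import hashlib
import time

# Ed25519 stands in for whatever key type a camera vendor or wallet would use.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# On-chain registry stand-in: an append-only list of provenance records.
CHAIN: list[dict] = []

def capture(device_key: Ed25519PrivateKey, image_bytes: bytes) -> dict:
    """What a tamper-proof camera chip would do at the moment of capture:
    hash the raw image, sign the hash, and anchor both with a timestamp."""
    image_hash = hashlib.sha256(image_bytes).hexdigest()
    record = {
        "hash": image_hash,
        "signature": device_key.sign(image_hash.encode()),
        "timestamp": time.time(),
    }
    CHAIN.append(record)  # immutability comes from the chain, not this list
    return record

def verify(image_bytes: bytes, record: dict, device_pub) -> bool:
    """Anyone can check that these exact bytes were signed by that device."""
    image_hash = hashlib.sha256(image_bytes).hexdigest()
    if image_hash != record["hash"]:
        return False  # image was modified after capture
    try:
        device_pub.verify(record["signature"], image_hash.encode())
        return True
    except InvalidSignature:
        return False

device_key = Ed25519PrivateKey.generate()
original = b"\x89PNG...raw sensor data..."
record = capture(device_key, original)

print("original verifies:", verify(original, record, device_key.public_key()))
print("edited copy verifies:",
      verify(original + b"edit", record, device_key.public_key()))
```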
Numbers Protocol has been deeply involved in this field for many years; Fox News’ verification tool is based on the Polygon blockchain, allowing users to search for articles and retrieve relevant data from the blockchain.
Privacy
When AI models handle sensitive information in fields such as finance, healthcare, and law, protecting data privacy during use is essential. Fully homomorphic encryption (FHE) can process data without ever decrypting it, which makes it possible to use an LLM while keeping the input private. The workflow is as follows:
1. The user starts inference on the local device and stops after the initial layer; this layer is not included in the model shared with the server.
2. The client encrypts the intermediate activations and forwards them to the server.
3. The server runs its part of the model (the attention computation) on the encrypted data and sends the encrypted result back to the client.
4. The client decrypts the result and continues inference locally.
In this way, FHE keeps user data private throughout the entire pipeline.
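A structural mock of that four-step split in Python. To be clear about the assumptions: `MockCiphertext` only marks which values the server may not inspect, and the toy layers stand in for real model components; an actual FHE scheme (such as Zama's) would perform the server-side arithmetic directly on ciphertexts.

```python
class MockCiphertext:
    """Stand-in for an FHE ciphertext. Real FHE lets a server compute on
    this WITHOUT seeing the wrapped values; here the wrapper only marks
    which side of the protocol may look inside."""
    def __init__(self, values: list[float]):
        self._values = values  # opaque to the server in a real system

def encrypt(values: list[float]) -> MockCiphertext:
    return MockCiphertext(values)

def decrypt(ct: MockCiphertext) -> list[float]:
    return ct._values

# --- client side: first layer runs locally, never shared with the server ---
def client_first_layer(tokens: list[float]) -> list[float]:
    return [t * 0.1 for t in tokens]  # toy embedding layer

# --- server side: operates only on ciphertexts ---
def server_attention(ct: MockCiphertext, weight: float) -> MockCiphertext:
    # A real server would apply homomorphic ops to the ciphertext itself;
    # this mock peeks inside only because it is a mock.
    return MockCiphertext([v * weight for v in decrypt(ct)])

# --- client side: decrypt and finish inference locally ---
def client_final_layers(hidden: list[float]) -> float:
    return sum(hidden)  # toy output head

tokens = [1.0, 4.0, 2.0]
hidden = client_first_layer(tokens)          # step 1: local initial layer
ct = encrypt(hidden)                         # step 2: encrypt intermediates
ct = server_attention(ct, weight=2.0)        # step 3: server computes blindly
result = client_final_layers(decrypt(ct))    # step 4: decrypt, finish locally
print("inference result:", result)
```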
Zama is building fully homomorphic encryption solutions and recently closed a $73 million funding round to support development.
AI Agents
The idea of AI agents is decidedly futuristic: what would the future look like if AI agents could own assets and trade? People may shift from using one general-purpose large model to delegating tasks to specialized agents.
These agents will collaborate with one another, and just as sound economic relationships make human collaboration more effective, adding economic relationships between AI agents can improve their efficiency too.
Blockchain can be the testing ground for this concept. Colony, for example, is experimenting with the idea through games, giving AI agents wallets so they can trade with other agents or with real players to achieve specific goals.
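A toy Python sketch of the mechanic (the ledger, agents, fees, and `hire` protocol are all invented for illustration): each agent controls a wallet and pays a specialized peer for a subtask.

```python
from dataclasses import dataclass

# Toy on-chain ledger: maps wallet addresses to token balances.
LEDGER: dict[str, float] = {"agent-A": 100.0, "agent-B": 100.0}

def transfer(sender: str, receiver: str, amount: float) -> bool:
    """A wallet transaction the agent signs and submits itself."""
    if LEDGER.get(sender, 0.0) < amount:
        return False
    LEDGER[sender] -= amount
    LEDGER[receiver] = LEDGER.get(receiver, 0.0) + amount
    return True

@dataclass
class Agent:
    address: str
    skill: str
    fee: float

    def hire(self, other: "Agent", task: str) -> str:
        """Pay another specialized agent to perform a subtask."""
        if not transfer(self.address, other.address, other.fee):
            return "insufficient funds"
        return f"{other.address} performs '{task}' ({other.skill}) for {other.fee} tokens"

planner = Agent("agent-A", skill="planning", fee=3.0)
translator = Agent("agent-B", skill="translation", fee=5.0)

print(planner.hire(translator, "translate the report"))
print("balances:", LEDGER)
```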
Conclusion
Most of these issues ultimately come back to open-source AI. If this decade's defining technology is not to be monopolized by a handful of companies, token economies can rapidly mobilize decentralized computing resources and training datasets to narrow the resource gap between open-source and closed-source AI. Blockchains can track AI training and inference for better data governance, and cryptography can underpin trust in the post-AI era, addressing both deepfakes and privacy.
Related Reading
An overview of the directions and protocols where AI-enabled crypto is landing