After waking up, I found that many frens had asked me to look at #Manus, which claims to be a truly general-purpose AI Agent, capable of thinking independently, planning and executing complex tasks, and delivering complete results. It sounds very cool. But beyond the anxious voices in many circles about losing jobs, what will it do for the takeoff of the Web3 DeFAI scene? Here are my thoughts:
About a month ago, OpenAI launched a similar product called Operator, in which AI independently completes tasks such as restaurant reservations, shopping, ticket booking, and ordering takeout in the browser. Users can watch it work and take control at any time.
This kind of Agent did not generate much discussion, because it is essentially a single-model-driven framework for tool invocation. Once users feel that critical decisions still need their intervention, they give up on relying on it to carry out tasks.
On the surface, Manus does not seem much different, just with many more application scenarios, including resume screening, stock research, and real estate purchases. In fact, the difference lies in the underlying framework and execution system: Manus is driven by a multimodal large model and innovatively adopts a multi-signature system.
In short, the AI imitates the PDCA cycle of human work (plan, do, check, act), and the cycle is completed by multiple large models working together. Each model focuses on one specific stage, which reduces the decision-making risk carried by any single model and improves execution efficiency. The so-called ‘multi-signature system’ is actually a decision verification mechanism for multi-model collaboration: a decision is only carried out after multiple specialized models jointly confirm it, which is what makes decision-making and execution reliable.
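To make the joint-confirmation idea concrete, here is a minimal sketch of a threshold check in Python. Everything in it is an assumption for illustration: the `Proposal` type, the `multi_sign` function, and the three rule-based "signers" simply stand in for specialized models, and none of this reflects how Manus is actually implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical proposal type; the field names are illustrative only.
@dataclass(frozen=True)
class Proposal:
    action: str
    amount: float


# A "signer" stands in for one specialized model: it returns True to approve.
Signer = Callable[[Proposal], bool]


def multi_sign(proposal: Proposal, signers: Dict[str, Signer], threshold: int) -> bool:
    """Approve the proposal only if at least `threshold` signers confirm it."""
    if not 1 <= threshold <= len(signers):
        raise ValueError("threshold must be between 1 and the number of signers")
    approvals = [name for name, check in signers.items() if check(proposal)]
    return len(approvals) >= threshold


# Toy rule-based stand-ins for a planning model, a risk model, and a checking model.
signers = {
    "planner": lambda p: p.action in {"swap", "rebalance"},
    "risk": lambda p: p.amount <= 1_000,
    "checker": lambda p: p.amount > 0,
}

small_swap = Proposal(action="swap", amount=250.0)
large_swap = Proposal(action="swap", amount=5_000.0)

print(multi_sign(small_swap, signers, threshold=3))  # all three signers approve
print(multi_sign(large_swap, signers, threshold=3))  # the risk signer rejects
```

Requiring all signers (threshold equal to the number of signers) is the strictest setting; lowering the threshold trades safety for fewer blocked actions.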
Seen against Operator, the advantages of Manus stand out, and the series of operations shown in the video demo does make for a striking experience. Objectively speaking, though, Manus's iteration on Operator is only a beginning, and it does not yet amount to a revolution.
The key lies in how complex the tasks it executes are, and in how the large models tolerate faults and define a successful delivery once users feed in prompts that follow no uniform standard. By the same token, can the Web3 DeFAI scenario mature and go into use immediately on the back of these innovations? Clearly not yet:
For example, in a DeFAI scenario, before an Agent can make a trading decision there has to be an Oracle layer Agent responsible for collecting and verifying on-chain data, then integrating and analyzing it. It also has to capture trading opportunities in real time, which is a huge challenge for real-time analysis. An opportunity that was valid a second ago may no longer exist (the arbitrage window has closed) by the time the data has passed from the Oracle, through the large model, to the Agent that executes the trade.
This exposes the biggest weakness of such multimodal large models when making execution decisions: how to go online, trigger on-chain calls to analyze real-time data, identify trading opportunities in it, and then capture the trade. In an ordinary web environment this is fine, because the prices of orders on many e-commerce sites do not change in real time, so the overall multimodal collaboration rarely runs into serious consistency problems while the state shifts underneath it. On-chain, such challenges exist almost all the time.
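The timing problem described above can be sketched as a simple staleness check. The names (`Opportunity`, `is_still_live`) and all latency figures below are assumptions chosen for illustration, not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opportunity:
    observed_at: float  # seconds; when the Oracle Agent saw the price gap
    window: float       # seconds the gap is assumed to stay open


def is_still_live(opp: Opportunity, now: float) -> bool:
    """An opportunity is usable only if it is acted on before its window closes."""
    return now - opp.observed_at < opp.window


# Illustrative per-stage latencies in seconds (assumed values).
stages = {"oracle": 0.2, "decision_llm": 2.5, "execution": 0.4}

# On-chain arbitrage: the gap is assumed to last about one second.
arb = Opportunity(observed_at=100.0, window=1.0)
# E-commerce order: the price is assumed stable for an hour.
order = Opportunity(observed_at=100.0, window=3600.0)

# Time at which the execution Agent finally acts on the data.
arrival = arb.observed_at + sum(stages.values())

print(is_still_live(arb, arrival))    # False: the pipeline is slower than the window
print(is_still_live(order, arrival))  # True: slow-moving prices tolerate the delay
```

The same pipeline latency is harmless for the web task and fatal for the on-chain one, which is the asymmetry the paragraph points at.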
So overall, the emergence of Manus will indeed cause a wave of anxiety in the Web2 world; after all, many highly repetitive clerical and information-processing jobs may be replaced by AI. But that is their anxiety.
We need an objective view of how far this actually advances DeFAI application scenarios in Web3.
It must be admitted that the significance is real. After all, the LLM OS and ‘Less Structure, More Intelligence’ concepts it proposes, and especially the multi-signature system, will provide great inspiration for combining DeFi and AI as Web3 expands.
This corrects a major misconception held by most DeFi projects: do not rely on a single large model to achieve complex goals such as autonomous AI Agent thinking plus decision-making, which is simply not practical in a financial context.
Realizing the true DeFAI vision requires solving hard problems such as the capability limits of individual AI models, atomicity guarantees for multimodal interactive collaboration, unified resource scheduling and allocation across multimodal systems, and system fault tolerance and failure handling.
For example: the Oracle layer Agent collects and analyzes on-chain data, monitors prices, and forms a reliable data source;
the Decision-making Agent analyzes and evaluates risk based on the data fed by the Oracle, and formulates a set of decisions and action plans;
the Execution layer Agent carries out the plans provided by the decision-making layer while accounting for actual conditions, including gas cost optimization, cross-chain state, transaction ordering conflicts, and so on.
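As a rough sketch of how the three layers above could hand work to one another, the Python below wires an oracle step, a decision step, and an execution step into one pipeline. All names, the two-venue price format, and the spread and gas rules are hypothetical simplifications; a real system would replace each function with a model or service and would need the atomicity and fault handling discussed earlier.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MarketSnapshot:
    price_a: float
    price_b: float


@dataclass(frozen=True)
class Plan:
    action: str
    expected_profit: float


def oracle_agent(raw_prices: Dict[str, float]) -> MarketSnapshot:
    """Oracle layer: collect price data and reject obviously invalid values."""
    a, b = raw_prices["venue_a"], raw_prices["venue_b"]
    if a <= 0 or b <= 0:
        raise ValueError("prices must be positive")
    return MarketSnapshot(price_a=a, price_b=b)


def decision_agent(snap: MarketSnapshot, min_spread: float) -> Optional[Plan]:
    """Decision layer: propose a trade only if the price gap is large enough."""
    spread = abs(snap.price_a - snap.price_b)
    if spread < min_spread:
        return None
    action = "buy_a_sell_b" if snap.price_a < snap.price_b else "buy_b_sell_a"
    return Plan(action=action, expected_profit=spread)


def execution_agent(plan: Plan, gas_cost: float) -> str:
    """Execution layer: account for gas before submitting the plan."""
    if plan.expected_profit <= gas_cost:
        return "skipped: gas exceeds expected profit"
    return f"submitted: {plan.action}"


def run_pipeline(raw_prices: Dict[str, float], min_spread: float, gas_cost: float) -> str:
    """Pass data through the oracle, decision, and execution layers in order."""
    snap = oracle_agent(raw_prices)
    plan = decision_agent(snap, min_spread)
    if plan is None:
        return "skipped: no opportunity"
    return execution_agent(plan, gas_cost)


print(run_pipeline({"venue_a": 100.0, "venue_b": 103.0}, min_spread=1.0, gas_cost=0.5))
print(run_pipeline({"venue_a": 100.0, "venue_b": 103.0}, min_spread=1.0, gas_cost=5.0))
```

Keeping each layer a separate function with a narrow input and output makes it possible to swap in a different model per layer, which is the division of labor the list describes.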
Only when this series of Agents works in sync within a large-scale system framework can a true DeFAI revolution be triggered.
What impact does Manus's popularity have on Web3 DeFAI?