I woke up from a nap to find many friends asking me to look at Manus, which claims to be a globally applicable, general-purpose AI Agent that can truly think independently, plan and execute complex tasks, and deliver complete results. It sounds very cool. But apart from the anxiety about losing jobs that is filling many friend circles, what will it bring to the explosive growth of the web3 DeFai scene? Here are my thoughts:
1) About a month ago, OpenAI launched a similar product, Operator, which lets an AI independently complete tasks in the browser, such as restaurant reservations, shopping, ticket booking, and takeaway orders. Users can watch what it is doing and take control at any time.
This kind of Agent did not generate much discussion, because it is a framework in which a single model calls tools. Once users realize that critical decisions still require their intervention, they give up on relying on it to carry out tasks.
2) On the surface, Manus does not look much different; it just adds more application scenarios, such as resume screening, stock research, and real estate purchases. The real difference lies in the framework and execution system behind it: Manus is driven by multimodal large models and adopts a novel multi-signature system.
In short, the AI imitates the PDCA cycle of human execution (plan - execute - check - act), with multiple large models working together to complete it. Each model focuses on a specific stage, which reduces the decision-making risk of relying on a single model and improves execution efficiency. The so-called “multi-signature system” is in fact a decision verification mechanism for multi-model collaboration: it ensures reliable decisions and execution by requiring joint confirmation from multiple specialized models.
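To make the “multi-signature” idea concrete, here is a minimal sketch of a quorum check in Python. It is not how Manus is actually implemented, which the article does not describe. The names `Plan`, `multi_sign`, and the three verifier functions are hypothetical, and the simple rules stand in for what would be separate specialized models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Plan:
    """A proposed action awaiting sign-off (hypothetical structure)."""
    action: str
    amount: float


# A verifier is any callable that approves (True) or rejects (False) a plan.
Verifier = Callable[[Plan], bool]


def multi_sign(plan: Plan, verifiers: List[Verifier], threshold: int) -> bool:
    """Return True only if at least `threshold` verifiers approve the plan."""
    if not 1 <= threshold <= len(verifiers):
        raise ValueError("threshold must be between 1 and the number of verifiers")
    approvals = sum(1 for verify in verifiers if verify(plan))
    return approvals >= threshold


# Hypothetical stand-ins for specialized models; real ones would call an LLM.
def risk_model(plan: Plan) -> bool:
    return plan.amount <= 1000


def compliance_model(plan: Plan) -> bool:
    return plan.action in {"swap", "lend"}


def sanity_model(plan: Plan) -> bool:
    return plan.amount > 0


VERIFIERS = [risk_model, compliance_model, sanity_model]
```

With these toy rules, `multi_sign(Plan("swap", 500), VERIFIERS, threshold=3)` passes. The same call with an amount of 5000 fails, because the risk check withholds its signature. Lowering `threshold` to 2 trades safety for a higher approval rate.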
By comparison, Manus's advantages stand out clearly, and the operations shown in the video demo make for a genuinely impressive experience. Objectively speaking, however, Manus's iterative innovation over Operator is only a beginning and does not yet amount to a revolutionary disruption.
The key questions are how complex the tasks it executes can be, and how the large model's fault tolerance and delivery success rate are defined when users supply prompts that follow no uniform standard. And following this innovation, can the web3 DeFai scene immediately become a mature application? Obviously, it is not there yet:
For example, in a DeFai scenario where an Agent has to make trading decisions, there should be an Oracle layer Agent responsible for collecting and verifying on-chain data and for integrating and analyzing it. It also needs to monitor on-chain prices in real time to capture trading opportunities. This puts great pressure on real-time analysis: an opportunity that was valid a second ago may no longer exist by the time the Oracle Agent's data has passed through the large model and reached the transaction-executing Agent (the arbitrage window has closed).
This exposes the biggest weakness of such multimodal large models in making execution decisions: how to connect to the Internet, make on-chain calls to analyze real-time data, identify trading opportunities in it, and then capture the trade. The ordinary web environment is relatively forgiving, since order prices on many e-commerce websites do not change in real time, so it rarely causes significant dynamic balance problems for the multimodal collaboration as a whole. On-chain, such challenges are almost always present.
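The arbitrage-window problem described above can be sketched as a freshness check that the executing Agent runs before acting. This is only an illustration under assumptions: `Opportunity`, `still_actionable`, and the one-second window are hypothetical, and a real system would also re-check the on-chain price instead of trusting the timestamp alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opportunity:
    """A trading opportunity as reported by the Oracle layer (hypothetical)."""
    pair: str
    expected_profit: float
    observed_at: float  # seconds, on the same clock the executor uses


def still_actionable(opp: Opportunity, now: float, max_age_s: float) -> bool:
    """Reject opportunities that are older than the allowed window.

    `now` is passed in explicitly so the check is deterministic and testable.
    """
    age = now - opp.observed_at
    return 0 <= age <= max_age_s and opp.expected_profit > 0


opp = Opportunity(pair="ETH/USDC", expected_profit=12.5, observed_at=100.0)
fresh = still_actionable(opp, now=100.4, max_age_s=1.0)   # True: 0.4 s old
stale = still_actionable(opp, now=102.5, max_age_s=1.0)   # False: 2.5 s old
```

If every hop between models adds latency, the time budget in `max_age_s` is consumed before execution even starts. That is exactly the weakness the paragraph points at.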
Therefore, overall, the emergence of Manus will indeed cause a wave of anxiety in the web2 world, since many highly repetitive clerical and information-processing jobs may be at risk of replacement by AI. But let them worry about that.
What we need is an objective understanding of how this development advances DeFai application scenarios in web3:
It must be acknowledged that the significance is real: the “LLM OS” and “Less Structure, More Intelligence” concepts it proposes, and especially the multi-signature system, offer great inspiration for extending the combination of DeFi and AI in web3.
It corrects a major misconception held by most DeFai projects: do not start out by relying on a single large model to achieve the complex goal of an AI Agent that thinks and decides autonomously. That is simply not practical in a financial context.
Realizing the true DeFai vision requires solving hard problems such as the capability ceiling of a single AI model, atomicity guarantees for multimodal interaction and collaboration, unified resource scheduling and management across multimodal systems, system fault tolerance, and fault-handling mechanisms.
For example: the Oracle layer Agent collects and analyzes on-chain data, monitors prices, and produces reliable data sources;
the decision-making Agent analyzes and evaluates risk based on the data fed by the Oracle layer, and formulates a set of decisions and action plans;
the execution layer Agent carries out the plans provided by the decision-making layer while taking actual conditions into account, including gas cost optimization, cross-chain state, and transaction ordering conflicts.
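The three layers listed above can be sketched as a small pipeline. This is a toy under stated assumptions, not a working trading system: all names (`oracle_agent`, `decision_agent`, `execution_agent`, `run_pipeline`), the two-venue price comparison, and the flat gas cost are hypothetical simplifications, and nothing here touches a real chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PriceSnapshot:
    pair: str
    price_a: float  # price on hypothetical venue A
    price_b: float  # price on hypothetical venue B


@dataclass(frozen=True)
class TradePlan:
    pair: str
    buy_venue: str
    sell_venue: str
    gross_edge: float  # price difference per unit, before costs


def oracle_agent(raw: dict) -> PriceSnapshot:
    """Oracle layer: collect and validate raw data into a clean snapshot."""
    if raw["price_a"] <= 0 or raw["price_b"] <= 0:
        raise ValueError("prices must be positive")
    return PriceSnapshot(raw["pair"], raw["price_a"], raw["price_b"])


def decision_agent(snap: PriceSnapshot, min_edge: float) -> Optional[TradePlan]:
    """Decision layer: evaluate the snapshot and propose a plan, or none."""
    edge = abs(snap.price_a - snap.price_b)
    if edge < min_edge:
        return None
    if snap.price_a < snap.price_b:
        return TradePlan(snap.pair, "A", "B", edge)
    return TradePlan(snap.pair, "B", "A", edge)


def execution_agent(plan: TradePlan, gas_cost: float) -> str:
    """Execution layer: account for real-world costs before acting."""
    net = plan.gross_edge - gas_cost
    if net <= 0:
        return "skipped: gas cost exceeds edge"
    return f"executed: buy {plan.buy_venue}, sell {plan.sell_venue}, net {net:.2f}"


def run_pipeline(raw: dict, min_edge: float, gas_cost: float) -> str:
    """Wire the three layers together in order."""
    snap = oracle_agent(raw)
    plan = decision_agent(snap, min_edge)
    if plan is None:
        return "no plan: edge below threshold"
    return execution_agent(plan, gas_cost)


raw = {"pair": "ETH/USDC", "price_a": 2000.0, "price_b": 2005.0}
result = run_pipeline(raw, min_edge=2.0, gas_cost=1.5)
```

Here `result` is `"executed: buy A, sell B, net 3.50"`. Raising `gas_cost` to 6.0 makes the execution layer skip the trade even though the decision layer approved it, which shows why the layers need to be separated yet kept in sync.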
Only when this series of Agents works in sync within a large system framework can a true DeFai revolution be unleashed.
Interpreting Manus: What impact will it have on the explosion of the web3 DeFai scene?
Author: Haotian