Vana: Empowering Your Data to Flow Freely like a Token and Create Value in the AI Era

思维怪怪

2024-09-24 10:30

How can Vana use "Data DAO" and "Proof of Contribution" to reshape the data value chain in the AI era?

Have you ever wondered why social media platforms like Reddit and X (formerly Twitter) are free to use? The answer lies in the posts you make, the likes you give, and even the time you spend on them every day.

Once upon a time, these platforms sold your attention to advertisers as a commodity. Now they have found a bigger buyer: AI companies. A single data licensing agreement between Reddit and Google reportedly brings Reddit $60 million in revenue per year. None of this wealth goes to you and me, the people who created the data.

What's more disturbing is that AI trained on our data may replace our jobs in the future. Although AI may also create new jobs, the concentration of wealth brought about by this data monopoly has undoubtedly exacerbated social inequality. We seem to be sliding into a cyberpunk world controlled by a few technology giants.

So how can we, as ordinary people, protect our interests in the AI era? Since the rise of AI, many people have come to regard blockchain as humanity's last line of defense against it. With this thinking in mind, some innovators began to explore solutions. They proposed two steps: first, take back ownership and control of our own data; second, use that data to jointly train AI models that truly serve ordinary people.

This idea may seem idealistic, but history tells us that every technological revolution begins with a "crazy" idea. Today, a new public chain project called "Vana" is turning this idea into reality. As the first decentralized data liquidity network, Vana attempts to convert your data into freely circulating tokens and thereby enable decentralized artificial intelligence that is truly controlled by users.

Vana's Founders and Project Origin

In fact, the birth of Vana can be traced back to a classroom in the Massachusetts Institute of Technology (MIT) Media Lab. There, two young people who wanted to change the world - Anna Kazlauskas and Art Abal - met.

Left: Anna Kazlauskas; Right: Art Abal

Anna Kazlauskas majored in computer science and economics at MIT, and her interest in data and cryptocurrency dates back to 2015. At that time, she took part in early Ethereum mining, which showed her the potential of decentralized technology. Anna later conducted data research at international financial institutions such as the Federal Reserve, the European Central Bank, and the World Bank, where she came to believe that data will become a new form of currency.

Meanwhile, Art Abal studied for a master's degree in public policy at Harvard University and conducted in-depth research on data impact assessment at the Belfer Center for Science and International Affairs. Before joining Vana, Art led the development of new data collection methods at Appen, an AI training data provider, work that contributed to the birth of many of today's generative AI tools. His insights on data ethics and AI responsibility have given Vana a strong sense of social responsibility.

When Anna and Art met in a course at the MIT Media Lab, they quickly discovered that they shared a common passion for data democratization and user data rights. They realized that to truly solve the problems of data ownership and AI fairness, a new paradigm was needed - a system that allows users to truly control their own data.

It was this shared vision that led them to co-found Vana. Their goal is to build a revolutionary platform that not only fights for data sovereignty for users, but also ensures that users can gain economic benefits from their own data. Through the innovative DLP (Data Liquidity Pool) mechanism and Proof of Contribution system, Vana enables users to safely contribute private data, jointly own and benefit from AI models trained with this data, thereby promoting the development of user-led AI.

Vana's vision was quickly recognized by the industry. To date, Vana has announced a total of US$25 million in financing: a US$5 million strategic round led by Coinbase Ventures, a US$18 million Series A round led by Paradigm, and a US$2 million seed round led by Polychain. Other well-known investors include Casey Caruso, Packy McCormick, Manifold, and GSR.

In a world where data is the new oil, the emergence of Vana gives us an important opportunity to reclaim data sovereignty. So how does this promising project work? Let's take a closer look at Vana's technical architecture and innovative ideas.

Vana's Technical Architecture and Innovative Ideas

Vana's technical architecture is a carefully designed ecosystem that aims to democratize data and maximize its value. Its core components are data liquidity pools (DLPs), the Proof of Contribution mechanism, Nagoya Consensus, user self-hosted data, and a decentralized application layer. Together, these elements form a platform that protects user privacy while unlocking the potential value of data.

1. Data Liquidity Pool (DLP): The Cornerstone of Data Valorization

The data liquidity pool is the basic unit of the Vana network and can be understood as the data version of "liquidity mining". Each DLP is essentially a smart contract designed to aggregate one specific type of data asset. For example, Reddit Data DAO (r/datadao) is a successful DLP that has attracted more than 140,000 Reddit users. It aggregates users' Reddit posts, comments, and voting history.

After users submit data to a DLP, they are rewarded in that DLP's own token. For Reddit Data DAO (r/datadao), that token is RDAT. These tokens not only represent the user's contribution to the data pool, but also give users the right to govern the DLP and a claim on future profit distribution. Notably, Vana allows each DLP to issue its own token, which gives different types of data assets a more flexible way to capture value.

In Vana's ecosystem, the top 16 DLPs can also receive additional VANA token emission rewards, which further stimulates the formation and competition of high-quality data pools. In this way, Vana cleverly transforms scattered personal data into liquid digital assets, laying the foundation for the value and liquidity of data.
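The reward flow described above can be sketched in a few lines of Python. This is a toy in-memory model, not Vana's actual contract: the class name, the `reward_per_point` rate, and the idea of ranking DLPs by a single numeric score are all assumptions made for illustration. Only the 16 reward slots come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class DataLiquidityPool:
    """Toy stand-in for a DLP smart contract (hypothetical, for illustration only)."""
    name: str
    token_symbol: str
    reward_per_point: float = 10.0  # assumed rate: tokens minted per unit of contribution score
    balances: dict = field(default_factory=dict)

    def submit(self, contributor: str, contribution_score: float) -> float:
        """Credit DLP tokens in proportion to a contribution score in [0, 1]."""
        if not 0.0 <= contribution_score <= 1.0:
            raise ValueError("contribution_score must be between 0 and 1")
        minted = contribution_score * self.reward_per_point
        self.balances[contributor] = self.balances.get(contributor, 0.0) + minted
        return minted

    def governance_share(self, contributor: str) -> float:
        """Fraction of all issued tokens held by one contributor."""
        total = sum(self.balances.values())
        return self.balances.get(contributor, 0.0) / total if total else 0.0


def top_dlps(scores: dict, slots: int = 16) -> list:
    """Names of the highest-ranked DLPs; the ranking metric itself is an assumption."""
    return sorted(scores, key=scores.get, reverse=True)[:slots]


pool = DataLiquidityPool(name="Reddit Data DAO", token_symbol="RDAT")
pool.submit("alice", 0.5)
pool.submit("bob", 0.25)
print(pool.balances, pool.governance_share("alice"))
```

Tying the token balance to both rewards and governance share mirrors the dual role the article gives DLP tokens. A real DLP would enforce these rules on-chain.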

2. Proof of Contribution: Accurate Measurement of Data Value

Proof of Contribution is a key mechanism for Vana to ensure data quality. Each DLP can customize a unique proof of contribution function based on its own characteristics. This function not only verifies the authenticity and integrity of the data, but also evaluates the contribution of the data to the performance improvement of the AI model.

Take ChatGPT Data DAO as an example. Its proof of contribution covers four key dimensions: authenticity, ownership, quality, and uniqueness. Authenticity is ensured by verifying the data export link provided by OpenAI; ownership is verified through the user's email; quality is assessed by having an LLM score randomly sampled conversations; and uniqueness is determined by computing a feature vector for the data and comparing it with existing data.

This multi-dimensional assessment ensures that only high-quality and valuable data can be accepted and rewarded. Proof of contribution is not only the basis for data pricing, but also a key guarantee for maintaining the data quality of the entire ecosystem.
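To make the four dimensions concrete, here is a minimal Python sketch of how such a function might be put together. It is not the ChatGPT Data DAO's actual implementation. The use of cosine similarity for the uniqueness check, and the rule that authenticity and ownership act as hard gates while quality and uniqueness are multiplied, are assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def uniqueness_score(candidate, existing_vectors):
    """1 minus the highest similarity to any stored vector, clamped to [0, 1]."""
    if not existing_vectors:
        return 1.0
    closest = max(cosine_similarity(candidate, v) for v in existing_vectors)
    return min(1.0, max(0.0, 1.0 - closest))


def contribution_score(authentic, owned, quality, uniqueness):
    """Combine the four dimensions into one score (the combination rule is assumed).

    Authenticity and ownership are pass/fail gates; quality and uniqueness
    are values in [0, 1] that are multiplied together.
    """
    if not (authentic and owned):
        return 0.0
    return quality * uniqueness


existing = [[1.0, 0.0, 0.0]]
new_vector = [0.0, 1.0, 0.0]  # orthogonal to the stored vector, so fully unique
u = uniqueness_score(new_vector, existing)
print(contribution_score(authentic=True, owned=True, quality=0.8, uniqueness=u))
```

In practice the quality value would come from an LLM grading sampled conversations, and the feature vectors from an embedding model. Both are left out here to keep the sketch self-contained.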

3. Nagoya Consensus: Decentralized Data Quality Assurance

Nagoya Consensus is the heart of the Vana network. It draws on and improves Bittensor's Yuma Consensus. The core idea is that a group of verification nodes collectively evaluates data quality, and a weighted average of their scores gives the final score.

More innovative still, verification nodes not only evaluate the data but also score the scoring behavior of other verification nodes. This "double-layer evaluation" mechanism greatly improves the fairness and accuracy of the system. For example, if a verification node gives a high score to obviously low-quality data, the other nodes will give that node a punitive score.

Each cycle lasts 1,800 blocks (about 3 hours). At the end of a cycle, the system allocates rewards to verification nodes based on their combined scores for that period. This mechanism not only gives verification nodes an incentive to stay honest, but also quickly identifies and eliminates bad behavior, keeping the whole network healthy.
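The two layers of scoring can be illustrated with a simplified Python sketch. It is not the actual Nagoya Consensus algorithm, whose details the article does not give. The assumptions here are that a node's weight is the mean of the ratings it receives from its peers, and that the cycle's reward pool is split in proportion to those weights. The 1,800-block cycle is from the text, and "about 3 hours" implies roughly 6 seconds per block.

```python
EPOCH_BLOCKS = 1_800  # cycle length given in the article
EPOCH_HOURS = 3       # "about 3 hours"
SECONDS_PER_BLOCK = EPOCH_HOURS * 3600 / EPOCH_BLOCKS  # implied block time: 6.0 seconds


def validator_weights(peer_ratings):
    """Weight each node by the mean rating it receives from the other nodes.

    peer_ratings[rater][target] is the rater's score (0 to 1) for how well
    the target has been scoring data. Self-ratings are ignored.
    """
    received = {}
    for rater, ratings in peer_ratings.items():
        for target, rating in ratings.items():
            if target != rater:
                received.setdefault(target, []).append(rating)
    return {node: sum(r) / len(r) for node, r in received.items()}


def consensus_score(data_scores, weights):
    """Weighted average of the nodes' scores for one piece of data."""
    total_weight = sum(weights.get(node, 0.0) for node in data_scores)
    if total_weight <= 0:
        raise ValueError("no node with positive weight scored this data")
    weighted_sum = sum(score * weights.get(node, 0.0) for node, score in data_scores.items())
    return weighted_sum / total_weight


def epoch_rewards(reward_pool, weights):
    """Split one cycle's reward pool in proportion to node weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {node: reward_pool * w / total for node, w in weights.items()}


# Node "c" gives obviously low-quality data a high score, so its peers rate it 0.1.
peer_ratings = {
    "a": {"b": 1.0, "c": 0.1},
    "b": {"a": 1.0, "c": 0.1},
    "c": {"a": 0.5, "b": 0.5},
}
weights = validator_weights(peer_ratings)
print(consensus_score({"a": 0.2, "b": 0.2, "c": 0.9}, weights))
print(epoch_rewards(160.0, weights))
```

In this example the punished node's outlier score barely moves the result: the weighted score is about 0.24, while a plain unweighted mean of the three scores would be about 0.43.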

4. Non-custodial Data Storage: The Last Line of Defense for Privacy Protection

A major innovation of Vana lies in how it manages data. In the Vana network, a user's original data is never actually put on-chain. Instead, users choose the storage location themselves, such as Google Drive, Dropbox, or even a personal server running on a MacBook.

When users submit data to a DLP, they only provide a URL pointing to the encrypted data and an optional content integrity hash. This information is recorded in Vana's data registry contract. When a verifier needs to access the data, it requests a decryption key, then downloads and decrypts the data for verification.
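A minimal Python sketch of this registry pattern follows. It models the registry as an in-memory list, not Vana's actual contract: the class and method names, the example URL, and the choice of SHA-256 for the integrity hash are assumptions. Encryption and key exchange are left out; the sketch only shows that the registry stores a pointer and a hash, never the data itself.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileRecord:
    owner: str
    url: str                            # where the encrypted file lives (user-chosen storage)
    content_hash: Optional[str] = None  # optional integrity hash of the encrypted bytes


class DataRegistry:
    """Toy in-memory stand-in for a data registry contract (hypothetical)."""

    def __init__(self):
        self._records = []

    def add_file(self, owner, url, encrypted_bytes=None):
        """Record a pointer to the encrypted file; the file itself is never stored."""
        digest = None
        if encrypted_bytes is not None:
            digest = hashlib.sha256(encrypted_bytes).hexdigest()
        self._records.append(FileRecord(owner, url, digest))
        return len(self._records) - 1  # file id

    def get(self, file_id):
        return self._records[file_id]

    def verify_download(self, file_id, downloaded_bytes):
        """Check downloaded bytes against the recorded hash.

        If no hash was recorded there is nothing to compare, so this returns True.
        """
        record = self._records[file_id]
        if record.content_hash is None:
            return True
        return hashlib.sha256(downloaded_bytes).hexdigest() == record.content_hash


registry = DataRegistry()
file_id = registry.add_file("alice", "https://example.com/alice/export.enc", b"ciphertext")
print(registry.get(file_id).url, registry.verify_download(file_id, b"ciphertext"))
```

Hashing the encrypted bytes rather than the plaintext lets anyone check that a download was not altered without needing the decryption key.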

This design cleverly solves the problem of data privacy and control. Users always maintain full control of their own data while being able to participate in the data economy. This not only ensures the security of the data, but also opens up the possibility of a wider range of data application scenarios in the future.

5. Decentralized Application Layer: Diversified Realization of Data Value

The top layer of Vana is an open application ecosystem. Here, developers can use the data liquidity accumulated by DLP to build various innovative applications, and data contributors can obtain actual economic value from these applications.

For example, a development team may train a specialized AI model based on the data of Reddit Data DAO. Users who participate in data contribution can not only use the model after training, but also obtain the income generated by the model according to their respective contribution ratios. In fact, such an AI model has been developed. For details, please read "Bottoming out, why did the old AI track currency r/datadao come back to life?".
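The phrase "according to their respective contribution ratios" amounts to a pro-rata split. A short Python sketch, assuming each contributor's share is measured by a single non-negative number (for example, a DLP token balance; the article does not specify the measure):

```python
def split_revenue(revenue, contributions):
    """Divide model revenue among contributors in proportion to their contributions."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("total contribution must be positive")
    return {user: revenue * amount / total for user, amount in contributions.items()}


# Hypothetical figures: alice contributed three times as much as bob.
print(split_revenue(1000.0, {"alice": 3.0, "bob": 1.0}))
```

With these hypothetical figures, alice receives 750.0 and bob 250.0.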

This model not only encourages the contribution of more high-quality data, but also creates a truly user-led AI development ecosystem. Users have transformed from mere data providers to co-owners and beneficiaries of AI products.

In this way, Vana is reshaping the data economy. Under this new paradigm, users are no longer passive data providers but active builders of the ecosystem who share in its benefits. This not only creates new channels for individuals to capture value, but also injects new vitality and innovation into the entire AI industry.

Vana's technical architecture not only solves core issues in the current data economy, such as data ownership, privacy protection, and value distribution, but also paves the way for future data-driven innovation. As more data DAOs join the network and more applications are built on the platform, Vana has the potential to become the infrastructure for the next generation of decentralized AI and data economy.

Satori Testnet: Vana's Public Testing Grounds

With the launch of the Satori testnet on June 11, Vana presented the prototype of its ecosystem to the public. This is not only a platform for technical verification, but also a preview of the future mainnet operation mode. At present, the Vana ecosystem provides three main paths for participants: running a DLP verification node, creating a new DLP, or submitting data to an existing DLP to participate in "data mining."

Running a DLP Verification Node

The verification node is the gatekeeper of the Vana network, responsible for verifying the quality of data submitted to the DLP. Running a verification node requires not only technical capabilities, but also sufficient computing resources. According to Vana's technical documentation, the minimum hardware requirements for a verification node are 1 CPU core, 8GB RAM, and 10GB of high-speed SSD storage space.

Users interested in becoming validators need to first select a DLP and then register as a validator through the smart contract of that DLP. Once the registration is approved, the validator can run a validation node specific to that DLP. It is worth noting that validators can run nodes for multiple DLPs at the same time, but each DLP has its own unique minimum staking requirements.

Creating a New DLP

Creating a new DLP is an attractive option for users with unique data resources or innovative ideas. It requires a deep understanding of Vana's technical architecture, especially the Proof of Contribution and Nagoya Consensus mechanisms.

The creator of a new DLP needs to design specific data contribution goals, verification methods, and reward parameters. At the same time, they also need to implement a proof of contribution function that can accurately assess the value of the data. Although this process is complex, Vana provides detailed templates and documentation support.

Participate in Data Mining

For most users, submitting data to an existing DLP to participate in "data mining" may be the most direct way to participate. At present, 13 DLPs have been officially recommended, covering multiple fields from social media data to financial prediction data.

· Finquarium: Collects financial prediction data.

· GPT Data DAO: Focuses on ChatGPT chat data export.

· Reddit Data DAO: Focuses on Reddit user data and has been officially launched.

· Volara: Focuses on the collection and utilization of Twitter data.

· Flirtual: Collects dating data.

· ResumeDataDAO: Focuses on LinkedIn data export.

· SixGPT: Collects and manages LLM chat data.

· YKYR: Collects Google Analytics data.

· Sydintel: Reveals the dark corners of the Internet through crowdsourced intelligence.

· MindDAO: Collects time series data related to user happiness.

· Kleo: Builds the world's most comprehensive browsing history dataset.

· DataPIG: Focuses on token investment preference data.

· ScrollDAO: Collects and utilizes Instagram data.

Some of these DLPs are still under development and others have already launched, but all of them are in the pre-mining stage, because users can only officially submit data for mining after the mainnet is launched. However, users can already secure their eligibility in advance in a variety of ways. For example, they can take part in challenge activities in the Vana Telegram App, or pre-register on the official websites of the various DLPs.

Summary

The emergence of Vana marks a paradigm shift in the data economy. In the current wave of AI, data has become the "oil" of the new era, and Vana is trying to reshape the mining, refining and distribution model of this resource.

In essence, Vana is building a solution to the data version of the "tragedy of the commons". Through clever incentive design and technological innovation, it transforms personal data, a seemingly unlimited but difficult-to-monetize resource, into a manageable, priced, and tradable digital asset. This not only opens up new ways for ordinary users to share in the dividends of AI, but also provides a possible blueprint for the development of decentralized AI.

However, Vana's success still faces many uncertainties. Technically, it needs to find a balance between openness and security; economically, it needs to prove that its model can generate sustained value; socially, it also needs to deal with potential data ethics and regulatory challenges.

On a deeper level, Vana represents a reflection and challenge to the existing data monopoly and AI development model. It raises an important question: In the AI era, do we choose to continue to strengthen the existing data oligopoly, or try to build a more open, fair, and diverse data ecosystem?

Whether or not Vana ultimately succeeds, its emergence gives us a window to rethink data value, AI ethics, and technological innovation. In the future, projects like Vana may become an important bridge between the Web3 ideal and AI reality, and point the way for the next stage of the digital economy.
