Original source: GraphProtocol Chinese
Overview:
· As AI develops at breakneck speed, decentralized solutions are urgently needed to prevent a handful of technology giants from centralizing control of the technology. Combining AI with blockchain is the best way to ensure that data remains open and verifiable.
· Knowledge graphs offer excellent data organization and retrieval capabilities. Paired with retrieval-augmented generation (RAG), they improve the accuracy of large language models by supplying up-to-date, context-relevant information.
· Decentralized knowledge graphs are the next major paradigm shift. They use blockchain technology to guarantee open access to information while building trust through verifiability and transparent governance.
· Geo is a pioneering decentralized knowledge graph launching soon on The Graph. Geo seamlessly integrates blockchain technology and AI to create a more accessible, reliable, and truly user-governed internet.
· With human-in-the-loop verification and AI-driven content generation, information can be generated and organized at an exponential rate while preserving trust, transparency, and a human touch.
We have already witnessed the rapid mainstream adoption of large language models (LLMs) and the heated debate over the risks this technology may bring. It is clear that AI will have a profound impact on culture, politics, and the pursuit of truth. As a global community, we therefore must not allow a few tech giants to monopolize AI behind data moats; instead, we must work together to build decentralized alternatives.
By keeping data open and public, we can build a trust layer that verifies the accuracy of data in a way that is impossible in a commercial environment controlled by large tech companies. We should not be shaped by the biases, assumptions, and opinions of a few large corporations; we must instead work together to build a decentralized brain that is truly accessible to and owned by everyone. AI, and its integration into our daily lives, should be designed as a public good from the start, rather than handed to the public by individual tech giants in a closed environment.
When discussing large language models and information retrieval, the human brain is a useful analogy: we interact with AI through its working memory and its explicit memory. Large language models excel at explicit memory. During training, they encode data into their weights, which lets them parse enormous amounts of content and recall it well. This approach is not without drawbacks, however. Because a model cannot literally store all of its training data (the volume of data grows exponentially), we get the well-known phenomenon of "hallucination", in which a model confidently gives absurd answers to seemingly simple questions. Moreover, because models are not continuously trained, they cannot absorb the latest information; they remain ignorant of recent innovations and discoveries. This is why retrieval-augmented generation (RAG) is such a natural complement to large language models.
RAG is a process that first consults a dataset outside the large language model's training knowledge in order to supply the model with new information and context. RAG can be thought of as the working memory of the AI brain. By pulling fresh knowledge from external knowledge bases and vector databases, RAG improves the accuracy and relevance of AI-generated content. However, over-reliance on unstructured information can make data extraction complicated, introduce redundancy, and fail to guarantee that the correct contextual information is used when answering.
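The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal illustration, not any real Geo or The Graph API: the tiny corpus, the naive keyword-overlap scoring, and the prompt template are all assumptions standing in for a production retriever and model call.

```python
# Minimal RAG sketch: retrieve external context first, then assemble the prompt.
# Corpus, scoring, and prompt format are illustrative assumptions only.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(corpus, key=lambda doc: -len(terms & set(doc.lower().split())))
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend retrieved context so the model answers from fresh data."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Use only the context below to answer.\nContext:\n{ctx}\nQuestion: {query}"

corpus = [
    "Geo is a decentralized knowledge graph launching on The Graph.",
    "Vector databases store embeddings for similarity search.",
    "The Graph indexes and queries blockchain data.",
]
prompt = build_prompt("What is Geo launching on?", retrieve("What is Geo launching on?", corpus))
print(prompt)
```

In a real system the keyword overlap would be replaced by embedding similarity or graph lookup, but the shape is the same: fetch context the model was never trained on, then let the model answer against it.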
Knowledge graphs can greatly enhance RAG for large language models. Compared with vector databases, knowledge graphs offer deeper semantic analysis, more efficient data retrieval, and stronger verifiability. Knowledge graphs closely mirror human cognition: they excel at capturing the complexity of natural language and the nuanced interrelationships between data. This semantic depth ensures that the model receives accurate, context-relevant information, significantly improving the quality of generated content. Vector databases, by contrast, rely on document-chunking approaches that either discard contextual information or pull in irrelevant text, which can trigger hallucination. With a knowledge graph, a large language model can quickly locate the relevant entities and traverse the graph to gather their full context.
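The "find an entity, then traverse" retrieval pattern can be shown with a toy triple store. The triples below are illustrative assumptions, and the breadth-first walk stands in for the graph queries a real knowledge graph engine would run.

```python
# Why graph retrieval helps: start from one matched entity and follow typed
# edges to collect its connected context in full. Triples are illustrative.

from collections import deque

triples = [
    ("Geo", "built_on", "The Graph"),
    ("The Graph", "indexes", "blockchain data"),
    ("Geo", "organized_into", "Spaces"),
    ("Spaces", "curated_by", "communities"),
]

def context_for(entity: str, hops: int = 2) -> list[tuple]:
    """Breadth-first walk: collect every triple reachable within `hops` edges."""
    seen, found = {entity}, []
    frontier = deque([(entity, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for s, p, o in triples:
            if s == node and (s, p, o) not in found:
                found.append((s, p, o))
                if o not in seen:
                    seen.add(o)
                    frontier.append((o, depth + 1))
    return found

print(context_for("Geo"))
```

A chunk-based vector lookup might return only the sentence mentioning "Geo"; the traversal also surfaces what The Graph indexes and how Spaces are curated, which is exactly the surrounding context the paragraph above argues is lost by chunking.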
In addition, the structured nature of a knowledge graph is well suited to organizing large amounts of data, even as the dataset grows continuously. This structure makes retrieval more precise, surfacing the most relevant data for any given query and improving the performance and efficiency of RAG applications. Combined with the information held in the model's "explicit memory", this lets prompts be served from two "memory buckets", each with its own distinct strengths, producing more accurate and useful responses.
We believe decentralized knowledge graphs are the perfect marriage of blockchain and AI: a solution that links all the world's data and makes it easily explorable through thoughtful creation, curation, organization, and composition. Until now, knowledge graphs have typically been built and maintained in a centralized way by companies or groups with proprietary knowledge bases. While that approach serves specific needs well, it falls short of our expectations for this technology's future: to become the foundation of the next internet.
Beyond the hype surrounding the combination of blockchain and AI, we believe decentralized knowledge graphs are unmatched in their importance, paradigm-shifting potential, and cultural relevance.
We are very excited about the work Geo is doing with The Graph, the world's leading decentralized protocol for indexing and querying blockchain data. Geo is pioneering how to build this technology from the ground up in the true spirit of web3: making all knowledge publicly accessible to everyone, without gatekeepers.
Geo's goal is not just to organize the world's data into a searchable database, but to make that data exceptionally composable. As with an encyclopedia, the key is an easy way to retrieve the information you need. Imagine a future in which you interact with Geo through "agents": users pose questions directly to an agent, and the knowledge graph retrieves relevant content, databases, or APIs and feeds them to a large language model (LLM) in real time. Instead of scanning search results one by one, the agent loads everything related to your query and answers your question directly.
Of course, the quality of the information fed to the agent is critical, and this is where blockchain technology excels: identity and reputation. By authenticating the original author of each piece of information and ensuring that the author's identity is traceable and verifiable, the provenance and quality of information sources can be effectively guaranteed. And because all content is highly composable, we can each customize how we interact with this information according to our own interests and needs without affecting the original data.
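The attribution idea can be sketched as "every fact carries a signature over its content, bound to an author identity". A real system would use on-chain public-key signatures (e.g. Ed25519); to keep this sketch dependency-free, HMAC with a shared key stands in, and the author registry and function names are assumptions, not a Geo API.

```python
# Illustrative attribution check: each fact carries the author's identity and a
# signature over its content. HMAC stands in for public-key signatures here.

import hashlib
import hmac
import json

AUTHOR_KEYS = {"alice.eth": b"alice-secret"}  # hypothetical identity registry

def sign_fact(author: str, fact: dict) -> str:
    """Sign a canonical (sorted-keys) serialization of the fact."""
    payload = json.dumps(fact, sort_keys=True).encode()
    return hmac.new(AUTHOR_KEYS[author], payload, hashlib.sha256).hexdigest()

def verify_fact(author: str, fact: dict, signature: str) -> bool:
    """Reject any fact whose content or claimed author was tampered with."""
    return hmac.compare_digest(sign_fact(author, fact), signature)

fact = {"subject": "Geo", "predicate": "built_on", "object": "The Graph"}
sig = sign_fact("alice.eth", fact)
print(verify_fact("alice.eth", fact, sig))                      # True
print(verify_fact("alice.eth", {**fact, "object": "X"}, sig))   # False
```

The point is the failure case: change one field of the fact and verification breaks, which is what lets an agent trust the provenance of what it retrieves.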
Our overall vision is a decentralized brain that stores information from many data sources, organized by humans into independent communities called "Spaces". This shared brain can reason over structured information, ensuring AI makes sound decisions. Once the decentralized brain exists, it can connect to the real world through APIs and become a truly autonomous agent, performing tasks for users and automating daily chores so humans can focus on more meaningful work. Meanwhile, the interconnected knowledge graph can draw data from multiple dynamic sources, ensuring data diversity.
The Graph, with its newly developed connected data graph, is in the best position to implement this architecture in the world of data services. Beyond its existing services, The Graph will add support for large language model data services, meaning indexers will provide open-source model inference. These models can access data validated by the connected data graph, either directly or through convenient developer tools. For the first time, developers can use an open, composable, low-latency, fully integrated technology stack to build more powerful AI agents than ever before.
To increase the resilience and reliability of AI systems, ensure that large language models give meaningful responses, and simplify retrieval-augmented generation (RAG), we must take a different approach to building the decentralized brains of the future. In the design and architecture of The Graph's new era, we can see how a carefully constructed knowledge graph can help us build a better future:
1. Only cryptographically verified, reputable contributors can add information to the Internet graph (and therefore to Geo Spaces)
2. Alternatively, information can be aggregated from verifiable third-party data sources
3. The large language model establishes logical connections between data already stored in working memory and the newly added information. Humans in Geo then verify this information
4. The intelligent agent receives a prompt from a human and uses RAG to retrieve the most relevant information from the Internet graph
5. Users who wish to contribute content to Geo Spaces can create higher-quality content because they have direct access to relevant data and more comprehensive information
6. The intelligent agent itself can exist as a user interface. Through the agent, users can request information and create and submit new content themselves. Intelligent agents can help users edit, add, access and link other related information
7. To close the loop on the knowledge cycle, a curator role can also be introduced. Trusted humans who label information help the knowledge graph quickly determine which data is most valuable. We will redesign the curator role in The Graph and incentivize it with GRT.
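The contribution cycle in the steps above can be sketched as a small state machine: a proposal enters as pending, a verified human accepts or rejects it, and a curator can flag accepted facts as high-value. The role names, states, and class shapes below are illustrative assumptions, not The Graph's protocol.

```python
# Minimal sketch of the contribute -> verify -> curate cycle described above.
# States and roles are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Proposal:
    fact: str
    author: str
    status: str = "pending"   # pending -> accepted / rejected
    curated: bool = False

@dataclass
class Space:
    verified_members: set
    curators: set
    proposals: list = field(default_factory=list)

    def submit(self, fact: str, author: str) -> Proposal:
        # Step 1: only cryptographically verified contributors may add data.
        if author not in self.verified_members:
            raise PermissionError("only verified contributors may submit")
        proposal = Proposal(fact, author)
        self.proposals.append(proposal)
        return proposal

    def review(self, proposal: Proposal, reviewer: str, accept: bool) -> None:
        # Step 3: trusted humans verify before anything enters the graph.
        if reviewer not in self.verified_members:
            raise PermissionError("only verified humans review")
        proposal.status = "accepted" if accept else "rejected"

    def curate(self, proposal: Proposal, curator: str) -> None:
        # Step 7: curators mark which accepted data is most valuable.
        if curator in self.curators and proposal.status == "accepted":
            proposal.curated = True

space = Space(verified_members={"alice", "bob"}, curators={"bob"})
p = space.submit("Geo runs on The Graph", "alice")
space.review(p, "bob", accept=True)
space.curate(p, "bob")
print(p.status, p.curated)   # accepted True
```

An LLM-generated proposal would enter through the same `submit` gate as a human one, which is how the design keeps model output out of the graph until a person has signed off.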
It is not hard to imagine a future in which the large language model, with human help, autonomously expands the knowledge graph, rather than humans merely using the model to retrieve information. The model can generate information and submit it to trusted humans for verification. This greatly accelerates the aggregation of information while preserving human verification of, and more importantly human contact with, the data. We can bar the model from adding data directly, filtering out potential hallucinations, while still letting it handle routine tasks.
Combining blockchain technology with knowledge graphs adds a layer of trust for data verification. Every piece of data can be attributed to a verifiable source, with a clear record of where it came from, how it was modified, and who was involved. This transparency strengthens the data's trustworthiness and creates a safe environment for its use, making decentralized knowledge graphs the best vehicle for advancing RAG in large language models.
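A record of "where the data came from, how it was modified, and who was involved" is naturally a hash chain: each provenance entry commits to the previous one, so rewriting history breaks every downstream link. The entry format below is an illustrative assumption, not an on-chain schema.

```python
# Sketch of a verifiable provenance record: each entry's hash commits to the
# previous entry, so any tampering with history is detectable.

import hashlib
import json

def entry_hash(entry: dict) -> str:
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def append(chain: list, author: str, change: str) -> None:
    """Add a provenance entry linked to the previous one."""
    prev = chain[-1]["hash"] if chain else "genesis"
    entry = {"author": author, "change": change, "prev": prev}
    entry["hash"] = entry_hash(entry)
    chain.append(entry)

def verify(chain: list) -> bool:
    """Recompute every link; any edit to history breaks a hash downstream."""
    prev = "genesis"
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev or entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

chain = []
append(chain, "alice.eth", "created fact: Geo built_on The Graph")
append(chain, "bob.eth", "added citation")
print(verify(chain))            # True
chain[0]["change"] = "forged"
print(verify(chain))            # False
```

This is the property the paragraph relies on: attribution plus an append-only modification history gives retrieval a trust layer that a plain document store cannot.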
With Geo Browser, we can easily access global information, putting The Graph in a unique position at the forefront of this exciting internet revolution. A truly open, decentralized AI brain requires open and transparent governance, which is impossible under a centralized architecture. The Graph therefore not only meets the world's demand for a decentralized knowledge graph, but also enables a global community of users to participate in governing such an important tool.
Let us forge ahead.