
Why We Need Blockchain Data Infrastructure To Build Web3

I’m Kat and I work on the Growth team here at Georgian. In my role, I spend most of my time looking for the most interesting companies within our thesis areas, having insightful conversations with co-founders and getting smart about emerging markets. 

One of those, of course, is Web3/crypto. It’s a space that’s been exciting to watch not only because of the massive amount of funding in the space but also the rapid pace of innovation and the talent that Web3 is attracting. 

At first, I found the Web3 world super intimidating and unintuitive. Last summer, I spent a lot of time reading and diving deeper, but it was hard for me to understand until I downloaded Newton and started using the technology itself. I got more comfortable with the ecosystem by using MetaMask and different exchanges, getting my own ETH domain and being part of various Discord/Telegram communities. Only then did I truly realize the potential of Web3.

One area I’m particularly excited about is blockchain data infrastructure, which can help build a new version of the internet, Web3, where people truly own their online identity and data.

In the current Web2 model, information is siloed and controlled by a few companies (namely, the FAANG companies). This can stifle innovation because, over the long term, reliance on a few winners that control most of the internet’s data means smaller companies struggle to build competitive solutions that would give people more choices on platforms they want to use. And, in this model, users have little say in how their data is used. 

With blockchain, we can move from this siloed model of the internet to one that’s more transparent, where data is public, verifiable and under the control of its users. With a blockchain-based model, innovative companies have the data they need to build their businesses and people can control what data they want to share, receive fair compensation for sharing their data, and have more choice in what platforms they want to use.

To achieve this vision, we need blockchain data infrastructure that makes it easier for people to build Web3 applications with more user-friendly interfaces, which can encourage non-crypto natives to try them out, and that helps people make sense of patterns in on-chain data. Making activity and data on the blockchain easier to understand enables transparency and makes it easier to catch bad actors. 

Blockchain data takes a lot of different forms. There are many different submarkets that fall under the broader blockchain data infrastructure market, but the two we are excited about, and will dive into here, are (1) blockchain indexing, query and storage and (2) blockchain intelligence platforms.

Blockchain indexing, query and storage

Public blockchains store data that is distributed over a decentralized network of servers. Probably the best-known use case for that type of database is storing public ledgers recording crypto transactions. As a result, many Web3 data projects are focused on making it easier to access that data for specific purposes, such as monitoring the movement of crypto assets across and between blockchain networks. 

Some data-related blockchain projects enable access to that financial data but also have more general applications for on-chain data query and storage. These projects provide the ‘data plumbing’ for Web3: they focus on indexing, query and storage for decentralized applications (dApps). While they are often used to create applications that analyze on-chain transactions, they can also support a wider range of use cases.

For example, Ceramic Network enables the storage of ‘streams’ of data using IPFS, a decentralized file storage protocol that lets people store and download files that aren’t managed by one organization. This allows dApps to store data without having to rely on centralized databases. Meanwhile, The Graph enables high-performance querying of data on various networks, including Ethereum and IPFS.

Ceramic Network

  • Decentralized data network that uses IPFS for storage (also supports Amazon S3) 
  • Supports ‘mutable’ (changeable) data streams that dApp developers can use to store state in their applications/smart contracts using a decentralized approach. 
  • Enables backup of streams to IPFS, Filecoin, or Amazon S3
  • Supports authentication via W3C Decentralized Identifiers. 
  • Create custom functions that work on the state stored in your streams.
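
To make the idea of a ‘mutable’ stream more concrete, here is a minimal Python sketch of one way mutable state can sit on top of immutable, content-addressed storage: every update is appended as a new commit that points to the hash of the previous one, and the current state is derived by replaying the commits. This is an illustration under our own assumptions, not Ceramic’s actual API or data format; the `Stream` class, the `content_id` helper and the commit layout are all hypothetical.

```python
import hashlib
import json


def content_id(obj: dict) -> str:
    """Hash a JSON-serializable object (a stand-in for a content address)."""
    encoded = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Stream:
    """Hypothetical mutable stream backed by an append-only log of commits."""

    def __init__(self, initial: dict) -> None:
        # The first commit has no predecessor.
        self.commits = [{"prev": None, "patch": initial}]

    @property
    def tip(self) -> str:
        """Content address of the latest commit."""
        return content_id(self.commits[-1])

    def update(self, patch: dict) -> str:
        """Append a commit that links to the current tip; return the new tip."""
        self.commits.append({"prev": self.tip, "patch": patch})
        return self.tip

    def state(self) -> dict:
        """Replay every commit in order to compute the current state."""
        current: dict = {}
        for commit in self.commits:
            current.update(commit["patch"])
        return current

    def verify(self) -> bool:
        """Check that each commit points at the hash of the one before it."""
        for prev, commit in zip(self.commits, self.commits[1:]):
            if commit["prev"] != content_id(prev):
                return False
        return True


profile = Stream({"name": "kat", "theme": "light"})
profile.update({"theme": "dark"})
print(profile.state())
print(profile.verify())
```

The individual commits never change, so they can live on storage like IPFS, while the application still sees data that appears to be editable.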

The Graph

  • The Graph aims to provide reliable decentralized query infrastructure for on-chain data.
  • Indexes data for high performance, GraphQL based queries. 
  • Ethereum support is generally available; many other blockchains are in beta at this time. 
  • Enables blockchain protocols to automatically query data from other projects, such as Ethereum or the InterPlanetary File System (IPFS), a decentralized file storage network.
  • The infrastructure that indexes the blockchain data is itself a decentralized network of operators that process queries. 
  • Anonymous agents called indexers stake GRT tokens to process queries and earn more GRT as a reward. Indexers require a powerful computer to do this, but the rewards can be worthwhile for effective indexers. (source)
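
As a rough illustration of what a GraphQL-based query looks like from a developer’s side, the Python sketch below builds the query text and the JSON request body that a client would typically send to a GraphQL endpoint. It does not call The Graph itself. The `transfers` entity and its fields are hypothetical, since every indexed project defines its own schema, and the `first` argument is assumed here as a simple way to limit results.

```python
import json


def build_query(entity: str, fields: list[str], first: int = 5) -> str:
    """Build a simple GraphQL query selecting `fields` from `entity`."""
    selection = " ".join(fields)
    return f"{{ {entity}(first: {first}) {{ {selection} }} }}"


def build_request_body(query: str) -> str:
    """Wrap the query in the JSON body that GraphQL servers expect over HTTP."""
    return json.dumps({"query": query})


# Hypothetical entity and field names, for illustration only.
query = build_query("transfers", ["id", "from", "to", "value"], first=3)
body = build_request_body(query)

print(query)
print(body)
```

The appeal of this model is that the application asks only for the fields it needs, and the indexing layer takes care of finding them in the underlying chain data.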


BigchainDB

  • Queryable database with blockchain characteristics such as decentralization, immutability and the ability to treat anything stored in the database as an asset. 
  • Combines MongoDB with Tendermint’s consensus algorithm to get blockchain features such as synchronizing data between nodes. 
  • Transactions in BigchainDB are structured similarly to those in Bitcoin and Cardano: a transaction takes an asset as its input and transforms it into an output, which may in turn be used as the input for a subsequent transaction.
  • Supports CREATE transactions, which generate a new asset (a JSON document in MongoDB), and TRANSFER transactions, which transfer ownership of an asset or change its metadata. The data itself is immutable. 
  • Not fully decentralized but likely to be better suited to data generating applications that need a more decentralized option than writing to a regular database management system. 
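
The CREATE/TRANSFER model described above can be sketched in a few lines of Python. This is a simplified, in-memory illustration under our own assumptions, not BigchainDB’s driver or transaction format: the `Ledger` and `Transaction` names are hypothetical, and signatures and consensus are left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    tx_id: int
    operation: str                 # "CREATE" or "TRANSFER"
    asset_id: int                  # id of the CREATE transaction for the asset
    owner: str                     # who controls this transaction's output
    input_tx: Optional[int]        # transaction whose output is being spent
    metadata: Optional[dict] = None


class Ledger:
    """Toy ledger: assets are immutable, ownership moves via transfers."""

    def __init__(self) -> None:
        self.transactions: dict[int, Transaction] = {}
        self.assets: dict[int, dict] = {}
        self.spent: set[int] = set()

    def create(self, owner: str, asset: dict) -> Transaction:
        tx_id = len(self.transactions) + 1
        self.assets[tx_id] = dict(asset)  # stored once, never modified
        tx = Transaction(tx_id, "CREATE", tx_id, owner, None)
        self.transactions[tx_id] = tx
        return tx

    def transfer(self, input_tx: int, sender: str, new_owner: str,
                 metadata: Optional[dict] = None) -> Transaction:
        prev = self.transactions[input_tx]
        if prev.owner != sender:
            raise PermissionError("only the current owner can transfer")
        if input_tx in self.spent:
            raise ValueError("this output has already been spent")
        self.spent.add(input_tx)
        tx_id = len(self.transactions) + 1
        tx = Transaction(tx_id, "TRANSFER", prev.asset_id, new_owner,
                         input_tx, metadata)
        self.transactions[tx_id] = tx
        return tx

    def owner_of(self, asset_id: int) -> str:
        """The owner is whoever holds the asset's single unspent output."""
        for tx in self.transactions.values():
            if tx.asset_id == asset_id and tx.tx_id not in self.spent:
                return tx.owner
        raise KeyError(asset_id)


ledger = Ledger()
created = ledger.create("alice", {"type": "bicycle", "serial": "abc123"})
ledger.transfer(created.tx_id, "alice", "bob", metadata={"note": "sold"})
print(ledger.owner_of(created.tx_id))
```

Each output can only be spent once, which is what lets a database record behave like an asset that has exactly one owner at a time.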


Covalent

  • Powered by the native CQT token, Covalent Network is a decentralized data infrastructure layer of Web 3.0.
  • An API can be used to pull on-chain data, such as wallet balances, token positions and transaction activity, across seventeen blockchain networks.
  • Covalent differentiates itself by fully indexing entire blockchains, including every transaction, wallet address and smart contract executed on each network. They want to be known as “the Google of blockchains.”
  • It essentially facilitates three main operations: querying, retrieval and data storage.
  • It allows developers to create new applications and augment existing applications without the need for configuration or making changes to existing infrastructures.

Blockchain intelligence platforms

As blockchains scale, there are new users, new projects and new use cases. With that comes an increased need to read that data and trace specific transactions and users. These companies offer different approaches to blockchain intelligence, but all have the purpose of driving data-driven decisions among people who track blockchain activity to build products.

Blockchain intelligence platforms make it easier to organize and make sense of blockchain data. They query, extract and visualize vast amounts of data from the blockchain and present it in dashboards covering companies, sectors and indices. Customers for these solutions include a range of crypto/DeFi investors, hedge funds, exchanges/wallets and government and law enforcement. 

Projects in this submarket include relatively straightforward scanning tools such as Solscan and Etherscan that scan blockchains to give transaction and network visibility. Other more complicated use cases include professional-grade tools for opportunity identification (e.g., Valid Network) or investigation, compliance, and risk management (e.g., TRM Labs and Chainalysis).  

One of our own favorite platforms for on-chain insights here at Georgian is Dune, whose dashboards we use to analyze the adoption of specific projects and get an overall view of the ETH ecosystem. 

As the submarket develops, we expect blockchain intelligence platforms will continue to play a key role in AML and tracking illicit crypto transactions as well as crypto due diligence for asset and risk managers. We also see opportunities for continuous monitoring of transactional and wallet-level data to assess “reputation scores” for KYC.

We are most excited about projects that are interoperable, meaning they have an API to look across multiple chains and gain full visibility. We think the next wave of innovation could come from applying machine learning techniques to gain actionable insights from this data, rather than presenting raw data or relying on heuristics. 

Valid Network

  • Blockchain regulation protocol fueled by a proprietary AI engine that blends blockchain protocols with traditional enterprise systems to provide information to internal auditors and risk and financial officers.
  • Applies neural net algorithms to blockchain data to determine key transaction and wallet-level insights.
  • Wide range of use cases including risk scores, entity profiling and predictive market signals.

TRM Labs

  • Crypto risk, fraud and financial crime focus.
  • Enables customers to implement digital asset compliance and risk management. Its main product, “Know-Your-VASP”, lets customers assess risk for Virtual Asset Service Providers (VASPs).
  • Trace the source and destination of cryptocurrency transactions.
  • Monitor crypto transactions for AML/KYC compliance. 
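
Tracing the destination of funds is, at its core, a graph problem: addresses are nodes and transfers are edges. The Python sketch below shows a simple breadth-first trace that finds which addresses are reachable from a starting address within a number of hops. It is a generic illustration with made-up data, not TRM Labs’ method; real tools also have to handle amounts, timing, mixers and address clustering.

```python
from collections import defaultdict, deque

# Hypothetical transfers: (sender, receiver, amount)
transfers = [
    ("flagged", "wallet_1", 2.0),
    ("wallet_1", "wallet_2", 1.5),
    ("wallet_2", "exchange", 1.5),
    ("wallet_3", "wallet_1", 0.1),
    ("exchange", "wallet_4", 1.0),
]


def trace_destinations(transfers, start, max_hops=3):
    """Return {address: hops} for addresses reachable from `start`."""
    outgoing = defaultdict(list)
    for sender, receiver, _amount in transfers:
        outgoing[sender].append(receiver)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not follow funds past the hop limit
        for receiver in outgoing[current]:
            if receiver not in hops:
                hops[receiver] = hops[current] + 1
                queue.append(receiver)

    del hops[start]  # report only downstream addresses
    return hops


print(trace_destinations(transfers, "flagged", max_hops=2))
```

Running the same search over reversed edges would trace the source of funds rather than the destination.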

Solidus Labs

  • Crypto-native risk monitoring suite, the HALO platform, which protects market participants and investors.
  • Focuses on trade surveillance, transaction monitoring and threat intelligence.
  • Becoming the “integration layer” for compliance / risk monitoring ecosystem and integrates with KYC/AML, KYT, wallet screening and bank account verification providers.
  • Uses supervised and unsupervised ML models to detect forms of market abuse and crypto-specific risks.


Chainalysis

  • Chainalysis tools aim to help government agencies, cryptocurrency businesses and financial institutions understand which real-world entities transact with each other (source). In March 2021, Chainalysis raised a $100M Series E as part of its mission to “build trust in blockchains,” according to TechCrunch.
    • Chainalysis leverages ML and deep analytics to help government agencies and private sector businesses track illicit crypto transactions and money laundering.
    • Chainalysis maps cryptocurrency addresses to real-world entities and provides access to its data through APIs for FIs, governments and crypto exchanges. 


Nansen

  • Blockchain analytics platform that enriches on-chain data with millions of wallet labels.
  • Crypto investors use Nansen to discover trading opportunities, do due diligence and get real-time dashboards and alerts. 
  • “It enables easier exploration of complex on-chain data on investors’ money flows, exchange activity, and analyzing emerging trends in DeFi, NFTs and DAOs through user-friendly, streamlined visual dashboards” (source).

Dune Analytics 

  • Dune allows users to query, extract and visualize data from the blockchain, the three basic building blocks of working with Ethereum’s on-chain information. It turns difficult-to-access data into tables and charts.
  • Dune mostly deals with Ethereum on-chain data and enables users to access, analyze and visualize decoded Ethereum smart contracts into self-explanatory graphs and reports.
  • Instead of starting from scratch, you are able to start from what tens of thousands of community members have already created, bringing the open finance dynamic of composability to the data layer.
  • The main use cases are sector dashboards, project dashboards and ecosystem dashboards.
    • For example, you can see DEX trading volume and market share across different DEXs.
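
To give a feel for the kind of query behind a DEX dashboard, here is a small Python sketch that uses the built-in `sqlite3` module to compute trading volume and market share per exchange with SQL. The `dex_trades` table, its columns and the numbers in it are hypothetical stand-ins, not Dune’s actual schema or real market data.

```python
import sqlite3

# In-memory database with a hypothetical table of decoded DEX trades.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dex_trades (project TEXT, usd_amount REAL)")
conn.executemany(
    "INSERT INTO dex_trades VALUES (?, ?)",
    [("dex_a", 600.0), ("dex_a", 200.0), ("dex_b", 150.0), ("dex_c", 50.0)],
)

# Volume per project, plus each project's share of total volume.
QUERY = """
SELECT project,
       SUM(usd_amount) AS volume,
       SUM(usd_amount) / (SELECT SUM(usd_amount) FROM dex_trades) AS share
FROM dex_trades
GROUP BY project
ORDER BY volume DESC
"""

rows = conn.execute(QUERY).fetchall()
for project, volume, share in rows:
    print(f"{project}: ${volume:,.0f} volume, {share:.0%} share")
```

Because queries like this are shared openly, another user could start from it and add a date filter or a per-chain breakdown rather than writing a new query from scratch.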

What’s next

As with any project in Web3, network effects are difficult to achieve, and only some data projects (mainly within intelligence and indexing/querying) are widely recognized and used. The “dataverse” is still evolving, with new projects popping up, and we are excited about all the different use cases and applications. 

Beyond blockchain data infrastructure, there are so many other building blocks that are fundamental to building the future of Web3. 

The content in this post is provided for informational purposes only and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. Furthermore, this content is not intended for use by any investor or prospective investor and may not be relied upon when making a decision to invest in any Georgian managed fund. An offering to invest in a fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of such fund which should be read in their entirety. Any investments or portfolio companies mentioned, referred to, or described in this presentation are not representative of all investments in vehicles managed by Georgian, and there can be no assurance that the investments referenced will be profitable or that other investments made in the future will have similar characteristics or results. A full list of investments made by funds managed by Georgian is available at As for investments in any cryptocurrency or token project, Georgian is acting in its own financial interest and not necessarily in the interests of other token holders. Georgian has no special role in any referenced projects or power over their management. Georgian does not undertake to continue to have any involvement in these projects other than as an investor and token holder, and other token holders should not expect that it will or rely on it to have any particularly involvement. There can be no assurances that Georgian’s investment objectives will be achieved or investment strategies will be successful. Any investment in a vehicle managed by Georgian involves a high degree of risk including the risk that the entire amount invested is lost. Past performance is not necessarily indicative of future results. Charts and graphs provided herein are solely for informational purposes and should not be relied upon when making any investment decision. 
All content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice. Please see for additional important information.
