Layer-1 blockchains like Ethereum are not designed to "talk" to other blockchains or real-world data sources. However, the ability to trigger transactions based on off-chain data makes decentralized applications considerably more capable. For example, i) decentralized prediction markets need real-world information to validate users' predictions, and ii) trading venues for tokenized commodities need commodity prices from exchanges outside their blockchains. Broadly speaking, oracles bring data from the outside world into an application or a blockchain.
How Do Oracles Work?
At a high level, oracles have two components: an on-chain contract and a network of nodes that gathers off-chain data. The node network brings the off-chain data on-chain, and the contract receives requests from other contracts and delivers the results to them.
An oracle's design depends on its use case. Immediate-read oracles are used when data is needed immediately and is unlikely to change. For example, if a prediction market needs to resolve a bet like 'who will be the Prime Minister of the UK in November 2022?' (after the election took place), it can use an immediate-read oracle.
A publish-subscribe oracle is a data feed streamed to smart contracts. This type of oracle is used when the data changes frequently. For example, a decentralized futures exchange constantly needs to know the price of ETHUSD.
Another type of oracle design is request-response, where a client contract requests arbitrary data. It allows clients to access data other than what is provided by publish-subscribe oracles. In this design, the oracle's on-chain contract listens for requests from other on-chain smart contracts. Nodes on the oracle's network then gather the requested data, either directly from sources or through API calls to data providers like Amberdata and Kaiko, and deliver it to the oracle contract. The oracle contract either delivers the data directly to the contracts that requested it or first performs computations on the data, such as aggregation. This design is ideal when smart contracts require only a small part of the data at one point in time. For example, when a supply-chain-based smart contract requires geolocation, it may employ a request-response oracle. Interested readers can refer to this paper to further explore blockchain oracle designs.
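The request-response flow described above can be sketched as a small in-memory simulation. This is an illustrative sketch only, not any provider's actual contract interface: the `OracleContract` class, its method names, the `run_nodes` helper, the median aggregation, and the fixed source values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import Callable, Dict, List


@dataclass
class OracleContract:
    """Toy stand-in for the on-chain oracle contract (hypothetical API)."""
    pending: Dict[int, str] = field(default_factory=dict)        # request id -> query
    reports: Dict[int, List[float]] = field(default_factory=dict)
    next_id: int = 0

    def request(self, query: str) -> int:
        """Called by a client contract; records the request for nodes to pick up."""
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = query
        self.reports[request_id] = []
        return request_id

    def submit(self, request_id: int, value: float) -> None:
        """Called by an oracle node once it has fetched the off-chain value."""
        self.reports[request_id].append(value)

    def fulfill(self, request_id: int, callback: Callable[[float], None]) -> None:
        """Aggregate node reports (median, as an assumption) and deliver the result."""
        values = self.reports[request_id]
        if not values:
            raise ValueError("no node reports yet")
        callback(median(values))
        del self.pending[request_id]


def run_nodes(contract: OracleContract, sources: List[Callable[[str], float]]) -> None:
    """Each node answers every pending request from its own data source."""
    for request_id, query in list(contract.pending.items()):
        for fetch in sources:
            contract.submit(request_id, fetch(query))


# Hypothetical off-chain sources; a real node would call an external API here.
sources = [lambda q: 101.0, lambda q: 99.0, lambda q: 100.0]

contract = OracleContract()
received: List[float] = []          # stands in for the client contract's callback
rid = contract.request("ETHUSD")
run_nodes(contract, sources)
contract.fulfill(rid, received.append)
print(received)  # [100.0]
```

The client only ever interacts with the oracle contract; the nodes and their sources stay hidden behind it, which is why the choice of aggregation inside `fulfill` matters so much for data quality.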
First-Party vs. Third-Party Oracles
Another important design distinction is whether an oracle is first-party or third-party. In a third-party oracle, as the name suggests, the oracle nodes are not themselves data sources. The oracle incentivizes a network of nodes to gather data, ideally from multiple third-party sources. It then normalizes the data collected from multiple nodes using methodologies like weighted averages or VWAP (volume-weighted average price) before delivering it to the end user. If a third-party oracle has sufficiently diversified data sources, it is more robust to individual sources having issues. When too few sources are used, however, simultaneous errors can still get through. For example, in December 2021, Band Protocol reported an incident of misquoting USDT prices when CoinMarketCap and Brave New Coin, two of the four data sources used by its nodes, reported highly irregular prices. This resulted in inaccurate reporting despite the contract using the average of the two middle prices. With third-party oracles, users trust the economic incentives for the nodes to deliver the correct data.
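To make the aggregation step concrete, the sketch below implements the two methods mentioned above: the average of the two middle prices out of four, and VWAP. The function names and all price and volume figures are hypothetical and are not the values from the incident; they only illustrate why a middle-two average tolerates one faulty source out of four but not two.

```python
from typing import List, Tuple


def middle_two_average(prices: List[float]) -> float:
    """Average of the two middle values of four quotes (the median of four)."""
    if len(prices) != 4:
        raise ValueError("expected exactly four quotes")
    ordered = sorted(prices)
    return (ordered[1] + ordered[2]) / 2


def vwap(trades: List[Tuple[float, float]]) -> float:
    """Volume-weighted average price over (price, volume) pairs."""
    total_volume = sum(volume for _, volume in trades)
    if total_volume == 0:
        raise ValueError("total volume is zero")
    return sum(price * volume for price, volume in trades) / total_volume


# Hypothetical quotes for a token that should trade near 1.00.
one_bad_source = [1.00, 1.01, 0.99, 5.00]    # the outlier is discarded
two_bad_sources = [1.00, 1.01, 5.00, 5.20]   # one outlier lands in the middle pair

print(middle_two_average(one_bad_source))    # stays close to 1.00
print(middle_two_average(two_bad_sources))   # pulled to roughly 3.00
print(vwap([(100.0, 2.0), (110.0, 1.0)]))    # weighted toward the larger trade
```

With four sources, any two simultaneous errors in the same direction put at least one bad quote into the middle pair, so source diversification matters as much as the choice of aggregation function.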
First-party oracles are provided by the API providers themselves. As the data provider publishes data on-chain, it doesn't have to pass through middlemen (nodes), reducing the trust a user needs to place in the nodes. Getting rid of the intermediate nodes that handle data allows first-party oracles to be cheaper than third-party ones and reduces the attack surface. Since the data sources are known in this type of oracle design, their credibility is at stake when misreporting data. Thus, they are more likely to play by the rules and provide accurate data.
In effect, there is a trade-off between the trust assumptions of first-party and third-party oracles. In the case of first-party oracles, users trust the data provider and the security practices it relies on to ensure data integrity. In the case of third-party oracles, users trust a network of nodes, the incentive model that assures data quality, and the oracle that normalizes and "vets" data from multiple providers. This boils down to whom users trust more: the data providers themselves or an oracle that "vets" the data from multiple nodes. On the technical side, first-party oracles have faced more implementation challenges than third-party oracles. Some API or data providers have been apprehensive about running oracle nodes because, for example, i) first-party node operations can be unstable, ii) they may require regular user intervention, and iii) data providers are unwilling to spend or get paid in digital assets.
Oracle Solution Providers
Chainlink is the most widely used oracle within the digital asset industry. It primarily operates third-party oracles but also provides first-party oracles for AD Derivatives (options data) and CipherTrace (KYC-related data). Chainlink uses its LINK token to incentivize its node operators to aggregate off-chain data.
Band Protocol has built its own Cosmos-SDK-based chain, BandChain. BandChain is a blockchain-agnostic solution that can seamlessly query data from chains that comply with the Inter-Blockchain Communication (IBC) protocol. This design is similar to that of Chainlink, but it is built on Cosmos rather than Ethereum. Like Chainlink, Band uses its native token, BAND, to compensate node operators.
API3 has built a solution called Airnode, an API wrapper that allows web APIs to connect to blockchains and provide data directly. Data providers that are reluctant to run oracle nodes themselves, because these require constant maintenance and monitoring, can use this solution.
Similar to API3, Pyth Network allows financial market participants such as CBOE (Chicago Board Options Exchange) and Jane Street to publish data on Pyth Network. Pyth is focused on pricing data and currently provides data for commodities, digital assets, equity, and forex. Operating on Solana allows Pyth to update data every 400 ms (each block). Pyth requires data publishers to stake PYTH tokens and shares 20% of the fee revenue with them. If published data is found to be incorrect, the publisher's stake is slashed.
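The stake-and-slash incentive described above can be illustrated with a toy calculation. This is a sketch under stated assumptions, not Pyth's actual on-chain logic: the 20% publisher share comes from the text, while the pro rata split by stake, the slash fraction, the function names, and all amounts are hypothetical.

```python
from typing import Dict

PUBLISHER_FEE_SHARE = 0.20  # share of fee revenue paid to publishers, per the text


def distribute_fees(fee_revenue: float, stakes: Dict[str, float]) -> Dict[str, float]:
    """Split the publishers' share of fees pro rata by stake (an assumption)."""
    total_stake = sum(stakes.values())
    if total_stake <= 0:
        raise ValueError("no stake to distribute against")
    pool = fee_revenue * PUBLISHER_FEE_SHARE
    return {name: pool * stake / total_stake for name, stake in stakes.items()}


def slash(stakes: Dict[str, float], publisher: str, fraction: float) -> Dict[str, float]:
    """Return new stakes with `fraction` of one publisher's stake removed."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    updated = dict(stakes)
    updated[publisher] = stakes[publisher] * (1.0 - fraction)
    return updated


# Hypothetical publishers and amounts.
stakes = {"publisher_a": 300.0, "publisher_b": 100.0}
payouts = distribute_fees(1000.0, stakes)        # 20% of 1000 shared 3:1
after_slash = slash(stakes, "publisher_b", 0.5)  # hypothetical 50% penalty
print(payouts, after_slash)
```

The point of the design is that a publisher's expected fee income is only worth collecting if it avoids the slashing penalty, which ties revenue to data accuracy.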
While most oracles aggregate data from different sources and process it themselves before delivering it to requesters, Flux Protocol allows data providers to control data flow from the source to the destination chain. Its open-source provider node constantly queries the data providers' API endpoints and then calls the first-party oracle contract to post the data on the destination chain. Flux Protocol also uses its token, FLX, to ensure data correctness.
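The provider-node loop described above, which polls the data provider's endpoint and posts each reading to the oracle contract, can be sketched as follows. This is not Flux Protocol's actual node code: `run_provider_node`, its parameters, and the sample readings are hypothetical, and the `fetch` and `post` callables stand in for an API request and an on-chain transaction respectively.

```python
import time
from typing import Callable, List


def run_provider_node(
    fetch: Callable[[], float],
    post: Callable[[float], None],
    rounds: int,
    interval_seconds: float = 0.0,
) -> None:
    """Poll the provider's endpoint and push each reading to the oracle contract."""
    for _ in range(rounds):
        value = fetch()   # stand-in for a call to the provider's API endpoint
        post(value)       # stand-in for a transaction to the first-party oracle contract
        time.sleep(interval_seconds)


# Hypothetical provider feed and an in-memory "contract" that records posts.
readings = iter([1850.0, 1851.5, 1849.0])
posted: List[float] = []

run_provider_node(fetch=lambda: next(readings), post=posted.append, rounds=3)
print(posted)  # [1850.0, 1851.5, 1849.0]
```

Because the data travels from the provider's endpoint to the contract without passing through an aggregation layer, whatever the provider publishes is exactly what lands on the destination chain.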
Appendix B: Data Delivery Methods
The table below provides an overview of key providers' market data delivery methods.