Inside Ethereum’s Smart Contract Ecosystem: Security, Tokens, and Whitelist Verification

Abstract and 1. Introduction

  2. Background

    2.1 Ethereum Primer

    2.2 Whitelisted Address Verification

    2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model

  3. Motivating Example and Challenges

    3.1 Motivating Example

    3.2 Challenges

    3.3 Limitations of Existing Tools

  4. Design of AVVERIFIER and 4.1 Overview

    4.2 Notations

    4.3 Component#1: Code Grapher

    4.4 Component#2: EVM Simulator

    4.5 Component#3: Vulnerability Detector

  5. Evaluation

    5.1 Experimental Setup & Research Questions

    5.2 RQ1: Effectiveness & Efficiency

    5.3 RQ2: Characteristics of Real-world Vulnerable Contracts

    5.4 RQ3: Real-time Detection

  6. Discussion

    6.1 Threats to Validity and 6.2 Limitations

    6.3 Ethical Consideration

  7. Related Work

  8. Conclusion, Availability, and References

2 Background

2.1 Ethereum Primer

Ethereum has two types of accounts: externally owned accounts (EOAs) and smart contracts. An EOA is an ordinary account, identified by a unique address and controlled by a private key. Smart contracts can be regarded as scripts, mainly written in Solidity [28], a well-defined and easy-to-use programming language developed by the Ethereum project. Accounts interact by initiating transactions, which carry the corresponding data. Smart contracts are executed in the Ethereum Virtual Machine (EVM) [43], which is embedded in Ethereum client nodes. The EVM is stack-based, and it stores data either temporarily or permanently. Specifically, all operands and intermediate values are pushed onto and popped from the stack. The memory area [42] is temporary and keeps data only within the context of the current transaction. Only data in the storage area [56] is permanent, i.e., stored on-chain. A set of smart contracts that jointly achieve a specific functionality is often called a decentralized application (DApp).
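The difference between the stack, memory, and storage can be illustrated with a toy interpreter. This is a minimal sketch in Python, not the real EVM: the `ToyContract` class, the `run_transaction` function, and the reduced opcode set are assumptions made for illustration, and gas, word sizes, and byte-addressed memory are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContract:
    """Hypothetical contract account: only `storage` survives between transactions."""
    storage: dict = field(default_factory=dict)


def run_transaction(contract, program):
    """Run a list of (opcode, operand) pairs on a toy stack machine."""
    stack = []   # operands and intermediate values
    memory = {}  # scratch space, recreated for every transaction
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "MSTORE":  # top of stack is the offset, next is the value
            offset, value = stack.pop(), stack.pop()
            memory[offset] = value
        elif op == "MLOAD":
            stack.append(memory.get(stack.pop(), 0))
        elif op == "SSTORE":  # top of stack is the key, next is the value
            key, value = stack.pop(), stack.pop()
            contract.storage[key] = value
        elif op == "SLOAD":
            stack.append(contract.storage.get(stack.pop(), 0))
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return stack


contract = ToyContract()

# Transaction 1: store 2 + 3 in storage slot 0 and the value 7 in memory slot 0.
tx1 = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 0), ("SSTORE", None),
       ("PUSH", 7), ("PUSH", 0), ("MSTORE", None)]
run_transaction(contract, tx1)

# Transaction 2: read memory slot 0, then storage slot 0.
tx2 = [("PUSH", 0), ("MLOAD", None), ("PUSH", 0), ("SLOAD", None)]
result = run_transaction(contract, tx2)
print(result)  # memory was reset to empty, storage kept the sum
```

In the second transaction the memory read yields the default value while the storage read still returns the sum written earlier, which mirrors the temporary versus on-chain distinction described above.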

DApps have demonstrated significant potential since their rise in 2016 [40]. Many genres of DApps have emerged, e.g., gambling [61], token swap [76], and lending [71]. With hundreds of billions of USD invested in Ethereum, decentralized versions of traditional financial tools, e.g., exchanges and insurance, have appeared. These are collectively called decentralized finance (DeFi). Taking advantage of the decentralization, permissionlessness, and transparency of Ethereum, DeFi has grown rapidly. According to statistics, DeFi accounted for $163 billion at the end of 2022 [54].

Besides the native token, Ether, Ethereum allows users to issue tokens as they wish, as long as these tokens comply with a standard such as ERC-20 [63]. The ERC-20 standard consists of six mandatory functions. Any smart contract that implements these functions can issue valid tokens that circulate in Ethereum, like USDT [16] and USDC [9]. A DeFi project can therefore issue its own ERC-20 tokens and accept other ERC-20 tokens as valid ones. This interoperability between ERC-20 tokens and DeFi drives the prosperity of Ethereum.
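For orientation, the six mandatory ERC-20 functions are `totalSupply`, `balanceOf`, `transfer`, `transferFrom`, `approve`, and `allowance`. The following is a rough in-memory model in Python, not a Solidity implementation: the `ToyERC20` class is hypothetical, the explicit `sender` argument stands in for Solidity's `msg.sender`, and the `Transfer` and `Approval` events required by the standard are omitted.

```python
class ToyERC20:
    """Hypothetical in-memory ledger exposing the six mandatory ERC-20 functions."""

    def __init__(self, issuer, supply):
        self._total = supply
        self._balances = {issuer: supply}
        self._allowances = {}  # (owner, spender) -> remaining amount

    def totalSupply(self):
        return self._total

    def balanceOf(self, owner):
        return self._balances.get(owner, 0)

    def transfer(self, sender, to, value):
        return self._move(sender, to, value)

    def approve(self, sender, spender, value):
        self._allowances[(sender, spender)] = value
        return True

    def allowance(self, owner, spender):
        return self._allowances.get((owner, spender), 0)

    def transferFrom(self, sender, owner, to, value):
        # `sender` spends on behalf of `owner`, limited by the approved amount.
        remaining = self.allowance(owner, sender)
        if remaining < value:
            raise ValueError("allowance exceeded")
        self._move(owner, to, value)
        self._allowances[(owner, sender)] = remaining - value
        return True

    def _move(self, src, dst, value):
        if self.balanceOf(src) < value:
            raise ValueError("insufficient balance")
        self._balances[src] = self.balanceOf(src) - value
        self._balances[dst] = self.balanceOf(dst) + value
        return True


token = ToyERC20("alice", 100)
token.transfer("alice", "bob", 30)             # alice pays bob directly
token.approve("bob", "dex", 20)                # bob lets "dex" spend up to 20
token.transferFrom("dex", "bob", "carol", 15)  # "dex" moves 15 of bob's tokens
```

Because every compliant token exposes the same interface, a DeFi contract can call `transferFrom` on any token address it is given, which is exactly why the address it is given needs to be verified.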

2.2 Whitelisted Address Verification

In Ethereum, checking the validity of given addresses is a common practice, called whitelisted address verification. It is widely adopted in DeFi apps such as lending [77] and banking [6]. Address verification is a cornerstone of smart contract safety. For this reason, OpenZeppelin [64], a well-known standard library provider in Ethereum, offers a whitelisted verification method. Moreover, through a comprehensive study of the top 40 DeFi projects ranked by TVL (Total Value Locked) [53], which account for over 95% of the total DeFi market, we summarized the verification techniques they adopt. In short, three whitelisted verification methods are involved: hard-encoded comparison, mapping validation, and hard-encoded address enumeration. Although we cannot guarantee that every address verification technique in use is covered, we cover the most prevalent ones. Considering how extensively code is copied and reused across Ethereum smart contracts [40], these three mechanisms are representative.

Listing 1: An example of hard-encoded address comparison.

Listing 2: An example of hard-encoded address enumeration.

Listing 1 illustrates how hard-encoded comparison works. The token passed at L2 (line 2) is required to equal the address of usdt; otherwise, an exception is raised. Mapping validation adopts a mapping structure that dynamically maintains the whitelisted status of addresses, e.g., mapping(address => bool) whitelist. Hard-encoded address enumeration is a variant of the first method. As shown in Listing 2, an array named addresses keeps all whitelisted addresses. Once the deposit function is invoked, the argument token is passed to the contains function, defined at L3, which is essentially a hard-encoded comparison wrapped in a loop. At the bytecode level, these three techniques behave similarly. The contract loads the address from the arguments with CALLDATALOAD and performs the check via the conditional jump opcode JUMPI. If the address is whitelisted, control flow is directed to the fallthrough branch and the subsequent logic executes. Otherwise, the jumpdest branch handles the failed assertion.
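The three verification methods described above can be modeled side by side. This is a hedged Python sketch of the logic only, not the Solidity of Listings 1 and 2: the addresses are placeholders, and the function and class names other than `whitelist`, `addresses`, `contains`, and `deposit` are assumptions introduced for illustration.

```python
# Placeholder 20-byte addresses, not real token contracts.
USDT = "0x" + "11" * 20
USDC = "0x" + "22" * 20
UNKNOWN = "0x" + "ff" * 20


def check_hard_encoded(token):
    """Hard-encoded comparison: exactly one address is accepted."""
    if token != USDT:
        raise PermissionError("token is not whitelisted")


class MappingWhitelist:
    """Mapping validation: models `mapping(address => bool) whitelist`."""

    def __init__(self):
        self.whitelist = {}

    def set_status(self, token, allowed):
        self.whitelist[token] = allowed  # status can change at runtime

    def check(self, token):
        if not self.whitelist.get(token, False):
            raise PermissionError("token is not whitelisted")


addresses = [USDT, USDC]  # fixed list of accepted addresses


def contains(token):
    """Hard-encoded enumeration: a comparison wrapped in a loop."""
    for candidate in addresses:
        if candidate == token:
            return True
    return False


def check_enumeration(token):
    if not contains(token):
        raise PermissionError("token is not whitelisted")


def deposit(token, amount, verify):
    """Toy entry point: verification runs before any other logic."""
    verify(token)  # failed check raises, mirroring the failing branch
    return (token, amount)  # reached only on the passing branch


mapping = MappingWhitelist()
mapping.set_status(USDC, True)

accepted = [
    deposit(USDT, 10, check_hard_encoded),
    deposit(USDC, 10, mapping.check),
    deposit(USDC, 10, check_enumeration),
]
```

In all three variants the untrusted `token` argument reaches a single pass-or-fail decision before the rest of `deposit` runs, which corresponds to the shared CALLDATALOAD and JUMPI pattern at the bytecode level.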

2.3 Taint Analysis on Smart Contracts

Taint analysis is a fundamental program analysis method, used for detecting vulnerabilities [34] and tracking sensitive information flow [47]. Before performing taint analysis, sources and sinks must be defined: a source is an input field controlled by adversaries, and a sink is any part of the system where potentially dangerous data can be used in an unsafe manner. Taint analysis tracks data flow from sources to sinks and identifies any operations or transformations applied to the data along the way.

In the context of Ethereum smart contracts, the source is often a contract function that accepts transactions from other accounts, while the sink varies depending on the specific goal. For example, to examine whether a contract can be destructed, Ethainter [14] takes the SELFDESTRUCT opcode as the sink. Moreover, Michael et al. [34] introduce a tool based on symbolic execution and taint analysis that designates SSTORE as the primary sink for its evaluations.
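The source-to-sink idea can be sketched with a small forward taint propagation pass. This is a simplified illustration, not the analysis used by any of the cited tools: it assumes straight-line code in a hypothetical three-address form `(opcode, destination, arguments)`, treats every `CALLDATALOAD` result as a source, and ignores control flow, memory, and storage aliasing.

```python
def find_tainted_sinks(instructions, sinks=("SSTORE", "SELFDESTRUCT")):
    """Return (index, opcode) for every sink that consumes a tainted variable."""
    tainted = set()
    hits = []
    for index, (op, dest, args) in enumerate(instructions):
        uses_taint = any(arg in tainted for arg in args)
        if op == "CALLDATALOAD":
            tainted.add(dest)        # source: value supplied by the caller
        elif op in sinks:
            if uses_taint:
                hits.append((index, op))
        elif dest is not None:
            if uses_taint:
                tainted.add(dest)    # taint flows through the operation
            else:
                tainted.discard(dest)  # overwritten with a clean value
    return hits


program = [
    ("CALLDATALOAD", "v0", []),       # attacker-controlled input
    ("PUSH", "v1", []),               # constant, not tainted
    ("ADD", "v2", ["v0", "v1"]),      # v2 depends on v0, so it is tainted
    ("SSTORE", None, ["v1", "v2"]),   # tainted value written to storage
    ("SELFDESTRUCT", None, ["v1"]),   # beneficiary is a clean constant
]

hits = find_tainted_sinks(program)
print(hits)
```

Here only the `SSTORE` at index 3 is reported, because its value argument derives from call data, while the `SELFDESTRUCT` argument does not.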

2.4 Threat Model

Adversaries in our study do not require any extra privileges. This is because Ethereum is a permissionless blockchain platform, which allows any non-privileged account, including malicious ones, to initiate transactions with sufficient gas, deploy valid smart contracts, and invoke any already deployed ones. However, certain limitations still apply. For instance, adversaries cannot breach the integrity of the Ethereum network, manipulate the block generation process, or access the private keys of legitimate accounts. In short, adversaries are practically indistinguishable from well-behaved accounts.

Authors:

(1) Tianle Sun, Huazhong University of Science and Technology;

(2) Ningyu He, Peking University;

(3) Jiang Xiao, Huazhong University of Science and Technology;

(4) Yinliang Yue, Zhongguancun Laboratory;

(5) Xiapu Luo, The Hong Kong Polytechnic University;

(6) Haoyu Wang, Huazhong University of Science and Technology.

