Zechu (Steven) Li

I am currently a research assistant at MIT CSAIL, advised by Prof. Pulkit Agrawal. My research interests lie in reinforcement learning, especially high-performance, scalable systems and their applications (e.g., robotics, finance, and transportation).

Prior to this, I received my bachelor's degree from Columbia University in May 2022, majoring in computer science. During my undergraduate studies, I was fortunate to work with Prof. Xiaodong Wang, Prof. Anwar Walid and Prof. Sharon (Xuan) Di.

Email  /  Google Scholar  /  GitHub /  LinkedIn

Research Projects [* Equal contribution]
Social Learning for Sequential Driving Dilemmas
Xu Chen, Sharon (Xuan) Di, Zechu Li
Games, 2023
paper

Autonomous vehicle (AV) technology has elicited discussion on social dilemmas where trade-offs between individual preferences, social norms, and collective interests may impact road safety and efficiency. In this study, we aim to identify whether social dilemmas exist in AVs’ sequential decision making, which we call “sequential driving dilemmas” (SDDs), to help policymakers and AV manufacturers better understand under what circumstances SDDs arise and how to design rewards that incentivize AVs to avoid SDDs.

Parallel Q-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation
Zechu Li*, Tao Chen*, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal
International Conference on Machine Learning (ICML), 2023
paper / code

We present a novel parallel Q-learning framework that not only achieves better sample efficiency but also reduces the training wall-clock time compared to PPO. Unlike prior work on distributed off-policy learning, such as Apex, our framework is designed specifically for massively parallel GPU-based simulation and optimized to work on a single workstation. We demonstrate the capability of scaling up Q-learning methods to tens of thousands of parallel environments.

Stationary Deep Reinforcement Learning with Quantum K-spin Hamiltonian Equation
Xiao-Yang Liu*, Zechu Li*, Shixun Wu, Xiaodong Wang
Workshop on Physics for Machine Learning, International Conference on Learning Representations (ICLR), 2023
paper

We propose a quantum K-spin Hamiltonian regularization term (called H-term) to help a policy network converge to a high-quality local minimum. We take a quantum perspective by modeling a policy as a K-spin Ising model and employ a Hamiltonian equation to measure the energy of a policy. We derive a novel Hamiltonian policy gradient theorem and design a generic actor-critic algorithm that utilizes the H-term to regularize the policy network. The proposed method significantly reduces the variance of cumulative rewards by 65.2% to 85.6% on six MuJoCo tasks.

Homomorphic Matrix Completion
Xiao-Yang Liu*, Zechu Li*, Xiaodong Wang
Advances in Neural Information Processing Systems (NeurIPS), 2022
paper

We propose a homomorphic matrix completion algorithm for privacy preservation. We first formulate a homomorphic matrix completion problem where a server performs matrix completion on ciphertexts, and propose an encryption scheme that is fast and easy to implement. We prove that the proposed scheme satisfies both the homomorphism property and the differential privacy property. With a similar level of privacy guarantee, we reduce the best-known error bound to EXACT recovery at the price of more samples.

Social Learning In Markov Games: Empowering Autonomous Driving
Xu Chen, Zechu Li, Sharon (Xuan) Di
IEEE Intelligent Vehicles Symposium (IV), 2022
paper / code

We apply the social learning scheme to Markov games and leverage deep reinforcement learning (DRL) to investigate how individual AVs learn policies and form social norms in traffic scenarios. To capture agents' different attitudes toward traffic environments, a heterogeneous agent pool with cooperative and defective AVs is introduced to the social learning scheme. To solve for the social norms formed by AVs, we propose a DRL algorithm and apply it to two traffic scenarios: an unsignalized intersection and a highway platoon.

FinRL-Podracer: High Performance and Scalable Deep Reinforcement Learning for Quantitative Finance
Zechu Li, Xiao-Yang Liu, Jiahao Zheng, Zhaoran Wang, Anwar Walid, Jian Guo
ACM International Conference on AI in Finance (ICAIF), 2021
paper / code

We introduce an RLOps-in-finance paradigm and present the FinRL-Podracer framework to accelerate the development pipeline of deep reinforcement learning (DRL)-driven trading strategies and to improve both trading performance and training efficiency. We evaluate the FinRL-Podracer framework on a stock trend prediction task on an NVIDIA DGX SuperPOD cloud, and demonstrate its high scalability by training a trading agent in 10 minutes with 80 A100 GPUs, on NASDAQ-100 constituent stocks with minute-level data over 10 years.

ElegantRL-Podracer: Scalable and Elastic Library for Cloud-native Deep Reinforcement Learning
Xiao-Yang Liu*, Zechu Li*, Zhuoran Yang, Jiahao Zheng, Zhaoran Wang, Anwar Walid, Jian Guo, Michael Jordan
Deep Reinforcement Learning Workshop, NeurIPS, 2021
paper / code

We present a scalable and elastic library, ElegantRL-podracer, for cloud-native deep reinforcement learning, which efficiently supports millions of GPU cores to carry out massively parallel training at multiple levels. At a high level, ElegantRL-podracer employs a tournament-based ensemble scheme to orchestrate the training process on hundreds or even thousands of GPUs, scheduling the interactions between a leaderboard and a training pool with hundreds of pods. At a low level, each pod simulates agent-environment interactions in parallel by fully utilizing nearly 7,000 GPU CUDA cores in a single GPU.

Open Source Projects
FinRL: Financial Reinforcement Learning
project page / code / GitHub Star

FinRL is the first open-source framework to show the great potential of financial reinforcement learning. It has evolved into an ecosystem, including hundreds of financial markets, state-of-the-art algorithms, financial applications (portfolio allocation, cryptocurrency trading, high-frequency trading), live trading, cloud deployment, etc.

ElegantRL “小雅”: Massively Parallel Library for Cloud-native Deep Reinforcement Learning
project page / code / GitHub Star

ElegantRL is a massively parallel library for cloud-native deep reinforcement learning (DRL) applications. Our mission is to provide a scalable, efficient, and accessible DRL platform for researchers and practitioners to develop cutting-edge RL applications. My specific interests are the development and usage of massively parallel simulations, and large-scale training with population-based training, ensemble methods, etc.

As a leader of this project, I have:

  • developed a series of large-scale training frameworks,
  • implemented SOTA algorithms and techniques,
  • built the documentation website.
Since Mar. 2021, I have been writing tutorial blogs for the community.

Book Chapter
High-performance Tensor Decompositions for Compressing and Accelerating Deep Neural Networks
Xiao-Yang Liu, Yiming Fang, Liuqing Yang, Zechu Li, Anwar Walid
Tensors for Data Processing, Elsevier, 2021
chapter / book

Large-scale deep neural networks (DNNs) have led to impressive successes in many applications. However, two challenges often arise when deploying DNNs on Internet of Things (IoT) devices and in real-time applications: training time and memory footprint. This chapter takes a practical approach to seeking a better efficiency-accuracy trade-off, using high-performance tensor decompositions to compress and accelerate neural networks by exploiting low-rank structures of the network weight matrices.

