ATCS Projects 2023 Spring


Below, we list some tentative topics for your project. You are more than welcome to propose your own ideas.
Your project can be related to your own research as well.

If you are not sure whether a topic is appropriate, please contact the instructor.


You are encouraged to think about some cool new applications based on the learning techniques we covered in class (feel free to use other techniques as well, but your project has to relate to some theoretical elements of learning, optimization, and prediction). Simply applying an existing algorithm to a small dataset does not constitute a final project; nor does an arbitrary heuristic for an arbitrary formulation without any insights.
If you have an idea but are not sure whether it is feasible, talk to the instructor (but you are advised to do some preliminary searching on Google yourself beforehand).
If you need computing resources beyond your own PC or laptop, contact us and we will help you.



Mid-term report (May 12):

A team should have at most 2 persons.

By that time, you should already have started (you should have fixed your topic, your team, and your plan for how to pursue it).
You need to start early by reading relevant papers, collecting and processing data, and/or writing some preliminary code.
You need to submit a brief mid-term report covering: what your topic is, what others have done on this topic, what your new idea is, how you plan to pursue it, and any preliminary results.



Final report deadline: TBA. No late submissions accepted.

By this deadline, you need to submit your final project report, your code, and your experimental results (including generated plots).
Final presentation (date TBA, during the exam weeks, i.e., the 17th or 18th week):
Each team needs to present its results in class. You should prepare slides. Each project has 5-10 minutes. The slides should be in English; the presentation can be given in either English or Chinese.



WARNING:
There is plenty of open-source code for various ML tasks online.
If you use any open-source code or package, you must cite it properly (in your report and slides)! This is very important!
You may modify existing open-source code, but if you do so, you need to be very explicit in your report about which code you are using, which parts are your modifications, and what your modifications do.
Failing to do so is considered plagiarism (you will get 0 for the project).



The project can be either theoretical or empirical (or both).


For a theoretical work, you can do the following:
(1) Be creative: design a new algorithm with some theoretical guarantee, or prove some new theorems. This may be a difficult direction. Note that a very small and uninteresting result would not constitute a course project. If you are not sure whether your result (or direction) is interesting, contact the instructor.

For empirical work, you can do the following:
(1) Be creative: find an interesting new problem and solve it, and/or design a new algorithm for an existing problem, with or without theoretical guarantees.

(2) Implement others' methods: In this case, you need to survey a direction. The survey needs to be very detailed. You also need to implement a few of the most popular algorithms, compare them experimentally, and write the experimental results into the survey as well. I would expect you to obtain at least some insights or potential improvements over existing methods (not just run existing open-source code).

In either case, your final report should be in the format of a NeurIPS paper (NeurIPS format, including references). It should contain a title, an abstract, an introduction (this is where you tell others why you do this, i.e., the motivation, plus a summary of what you do), a related work section (you have to mention what others have done; it is important), the main text (the details of your method), and an experimental section. It should be at least 6 pages long (excluding the references). In sum, it should look like a paper.


Using ideas from online learning/bandits for hyperparameter optimization

  1. Hyperband: A novel bandit-based approach to hyperparameter optimization
  2. Practical Bayesian Optimization of Machine Learning Algorithms
  3. Fast Bayesian optimization of machine learning hyperparameters on large datasets

(Direction: try to provide better modeling of the training process and better allocation of resources.)
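
To get a feel for the resource-allocation idea in Hyperband [1], here is a minimal sketch of successive halving, its core subroutine (the full Hyperband algorithm additionally loops over several brackets with different trade-offs). The function train_and_eval(config, budget) is a placeholder you would supply: it trains under config for budget units of resource and returns a validation loss.

    import numpy as np

    def successive_halving(configs, train_and_eval, min_budget=1, eta=3):
        # Start every configuration with a small budget; repeatedly keep the best
        # 1/eta fraction and grow the budget by a factor of eta.
        budget = min_budget
        while len(configs) > 1:
            losses = [train_and_eval(cfg, budget) for cfg in configs]
            k = max(1, len(configs) // eta)
            keep = np.argsort(losses)[:k]
            configs = [configs[i] for i in keep]
            budget *= eta
        return configs[0]

    # Toy usage: "training" just evaluates a noisy quadratic in the hyperparameter,
    # with noise shrinking as the budget grows.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        def train_and_eval(cfg, budget):
            return (cfg - 0.3) ** 2 + rng.normal(scale=1.0 / budget)
        candidates = list(rng.uniform(0, 1, size=27))
        print(successive_halving(candidates, train_and_eval))

One possible project direction, in line with the note above, is to replace the crude "keep the best 1/eta fraction" rule with an explicit model of how the validation loss evolves with budget.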


Theory and Applications of Diffusion Models

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

Statistical Efficiency of Score Matching: The View from Isoperimetry

Diffusion Models are Minimax Optimal Distribution Estimators


Theory of Deep Learning

There is already a large body of literature on this topic.

Recent active research topics:

  1. Generalization in deep learning: the well-known rethinking paper: Understanding deep learning requires rethinking generalization
  2. Optimization algorithms in deep learning
  3. Neural Tangent Kernel / Lazy Training
  4. Mean Field regime and analysis
  5. Margin Theory
  6. Implicit bias
  7. Landscape of NN
  8. Robustness
  9. Compression
  10. Ensemble, Self-distillation

Talk to the instructor if you want to do something in this domain


Langevin Dynamics

classic paper: Bayesian Learning via Stochastic Gradient Langevin Dynamics
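
For reference, here is a minimal sketch of the (unadjusted) Langevin update underlying that paper. Note this toy version uses the full gradient of log p with a fixed step size; SGLD proper uses minibatch stochastic gradients of the log posterior and a decreasing step-size schedule.

    import numpy as np

    def langevin_sample(grad_log_p, x0, step=1e-2, n_iters=5000, rng=None):
        # Langevin update: x <- x + (step/2) * grad log p(x) + sqrt(step) * N(0, I).
        rng = rng or np.random.default_rng(0)
        x = np.array(x0, dtype=float)
        samples = []
        for _ in range(n_iters):
            noise = rng.normal(size=x.shape)
            x = x + 0.5 * step * grad_log_p(x) + np.sqrt(step) * noise
            samples.append(x.copy())
        return np.array(samples)

    # Toy usage: sample from a standard Gaussian, where grad log p(x) = -x.
    if __name__ == "__main__":
        samples = langevin_sample(lambda x: -x, x0=[3.0])
        print(samples[1000:].mean(), samples[1000:].var())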

Convergence/hitting time:

  1. Sharp convergence rates for Langevin dynamics in the nonconvex setting
  2. Convergence of Langevin MCMC in KL-divergence
  3. On Thompson Sampling with Langevin Algorithms
  4. A hitting time analysis of stochastic gradient langevin dynamics (COLT 17 best paper)
  5. Hitting Time of Stochastic Gradient Langevin Dynamics to Stationary Points: A Direct Analysis (a simpler and better analysis than the COLT 17 paper; leverages tools from Ito calculus)

Generalization:

  1. Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis
  2. Generalization bounds of SGLD for non-convex learning: Two theoretical viewpoints
  3. Generalization error bounds for noisy, iterative algorithms
  4. On generalization error bounds of noisy gradient methods for non-convex learning

Direction: analyze SGLD in the NTK or mean-field regime


Theory of Semi-supervised/self-supervised learning


  1. Combining Labeled and Unlabeled Data with Co-Training

  2. Co-Training and Expansion: Towards Bridging Theory and Practice

  3. Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

  4. Predicting What You Already Know Helps: Provable Self-Supervised Learning.

  5. A theoretical analysis of contrastive unsupervised representation learning

Possible direction: generalize the common representation assumption in [4]. For example, one may want to model the scenario where learning the pretext task helps build a partial (or approximate) representation for the downstream task, so that some fine-tuning is still needed for the downstream task. This is more realistic.
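
As a concrete reference point for [5], below is a minimal numpy sketch of an InfoNCE-style contrastive loss with in-batch negatives (related in spirit to, though not identical to, the loss analyzed in that paper). The embeddings of the two views are assumed to be precomputed.

    import numpy as np

    def info_nce_loss(z_anchor, z_positive, temperature=0.1):
        # z_anchor, z_positive: (batch, dim) embeddings of two views of the same examples.
        # For each anchor, its matching row is the positive; all other rows are negatives.
        z_a = z_anchor / np.linalg.norm(z_anchor, axis=1, keepdims=True)
        z_p = z_positive / np.linalg.norm(z_positive, axis=1, keepdims=True)
        logits = z_a @ z_p.T / temperature              # (batch, batch) similarity matrix
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))             # cross-entropy with the true match

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        z = rng.normal(size=(8, 16))
        print(info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape)))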


Theory of Meta-learning (multi-task, transfer learning)

  1. A Model of Inductive Bias Learning

  2. A Survey on Domain Adaptation Theory: Learning Bounds and Theoretical Guarantees

  3. Learning multiple tasks using shared hypotheses

  4. Risk Bounds for Transferring Representations With and Without Fine-tuning

  5. On Learning Invariant Representation for Domain Adaptation

  6. Bounds for Linear Multi-Task Learning

  7. The Benefit of Multitask Representation Learning

  8. Few-shot learning via learning the representation, provably

  9. On the Theory of Transfer Learning: The Importance of Task Diversity

  10. Awesome transfer learning papers: theory


Optimization in NN

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability (a very interesting recent empirical paper; direction: explain the empirical findings of the paper)

On the global landscape of neural networks: an overview (a good survey)


Convex? NN

  1. Convex Neural Networks

  2. Neural Networks are Convex Regularizers

  3. Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
  4. Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time

Direction: the above-mentioned papers are mainly about optimization. How about generalization (using the new convex formulation)?


Neural Tangent Kernel / Wide NN

  1. Neural tangent kernel: Convergence and generalization in neural networks
  2. Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks
  3. On Lazy Training in Differentiable Programming
  4. On Exact Computation with an Infinitely Wide Neural Net
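
For experiments in this direction, the empirical NTK of a finite network can be computed directly from parameter gradients, K(x, x') = <grad_theta f(x), grad_theta f(x')>. A minimal PyTorch sketch for a scalar-output model (fine for small networks and datasets; it materializes full per-example gradients, so it does not scale):

    import torch

    def empirical_ntk(model, x1, x2):
        # Empirical NTK entry: inner product of parameter gradients of the outputs.
        def flat_grad(x):
            out = model(x.unsqueeze(0)).squeeze()
            grads = torch.autograd.grad(out, model.parameters())
            return torch.cat([g.reshape(-1) for g in grads])
        G1 = torch.stack([flat_grad(x) for x in x1])   # (n1, n_params)
        G2 = torch.stack([flat_grad(x) for x in x2])   # (n2, n_params)
        return G1 @ G2.T                               # (n1, n2) kernel matrix

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(torch.nn.Linear(5, 64), torch.nn.ReLU(),
                                    torch.nn.Linear(64, 1))
        x = torch.randn(4, 5)
        print(empirical_ntk(model, x, x))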

Mean Field Regime of wide NN

  1. Deep Information Propagation
  2. Exponential expressivity in deep neural networks through transient chaos

  3. A mean field view of the landscape of two-layer neural networks
  4. On the global convergence of gradient descent for over-parameterized models using optimal transport
  5. Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
  6. Feature Learning in Infinite-Width Neural Networks

Robustness

  1. Adversarial Examples Are Not Bugs, They Are Features
  2. Certified Adversarial Robustness via Randomized Smoothing
  3. Certifying some distributional robustness with principled adversarial training

  4. Provable defenses against adversarial examples via the convex outer adversarial polytope

  5. Formal guarantees on the robustness of a classifier against adversarial manipulation
  6. Randomized Smoothing of All Shapes and Sizes

    Check out this blog as well https://gradientscience.org
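
To illustrate the idea in [2] and [6], here is a minimal sketch of prediction with a randomly smoothed classifier (Monte Carlo majority vote under Gaussian input noise). The statistical certification of a robustness radius, which is the main contribution of those papers, is omitted here.

    import numpy as np

    def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, rng=None):
        # Smoothed classifier g(x) = argmax_c P[f(x + noise) = c], noise ~ N(0, sigma^2 I),
        # estimated by Monte Carlo.
        rng = rng or np.random.default_rng(0)
        noise = rng.normal(scale=sigma, size=(n_samples,) + x.shape)
        preds = np.array([base_classifier(x + n) for n in noise])
        classes, counts = np.unique(preds, return_counts=True)
        return classes[np.argmax(counts)]

    if __name__ == "__main__":
        # Toy base classifier: sign of the first coordinate.
        f = lambda v: int(v[0] > 0)
        print(smoothed_predict(f, np.array([0.1, -2.0])))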


Explanation in ML

SHAP

LIME

Integrated Gradients
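
For instance, Integrated Gradients attributes a prediction to input features by integrating the model's gradient along a straight path from a baseline to the input. A minimal PyTorch sketch using a Riemann-sum approximation of the path integral (the baseline defaults to the all-zeros input here):

    import torch

    def integrated_gradients(model, x, baseline=None, steps=50):
        # Attribution_i = (x_i - b_i) * integral_0^1 [d model(b + a*(x-b)) / dx_i] da,
        # approximated with a Riemann sum over `steps` points on the straight-line path.
        baseline = torch.zeros_like(x) if baseline is None else baseline
        alphas = torch.linspace(0.0, 1.0, steps).view(-1, 1)
        path = baseline + alphas * (x - baseline)          # (steps, dim)
        path.requires_grad_(True)
        outputs = model(path).sum()
        grads = torch.autograd.grad(outputs, path)[0]      # (steps, dim)
        return (x - baseline) * grads.mean(dim=0)

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Tanh(),
                                    torch.nn.Linear(8, 1))
        x = torch.randn(4)
        print(integrated_gradients(model, x))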


Pricing Derivatives and Machine Learning

  1. Online trading algorithms and robust option pricing
  2. Minimax Option Pricing Meets Black-Scholes in the Limit
  3. QLBS: Q-Learner in the Black-Scholes (-Merton) Worlds
  4. A Nonparametric Approach to Pricing and Hedging Derivative Securities via Learning Networks
  5. A reinforcement learning approach for pricing derivatives
  6. Pricing Options with an Artificial Neural Network: A Reinforcement Learning Approach

(idea: connect the fundamental theorem of finance and statistical learning theory)


Apply online learning algorithms to trade stocks/futures - 1

  1. Universal Portfolios With and Without Transaction Costs
  2. Universal Portfolios with Side Information
  3. Semi-Universal Portfolios with Transaction Costs
  4. Portfolio Choices with Orthogonal Bandit Learning
  5. Portfolio Optimization by Means of a χ-Armed Bandit Algorithm
  6. Online portfolio selection: A survey

  7. Nonparametric kernel-based sequential investment strategies

  8. Corn: Correlation-driven nonparametric learning approach for portfolio selection

  9. Moving average reversion strategy for on-line portfolio selection

  10. On-line portfolio selection with moving average reversion

Talk to the instructor for the idea and the data
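
As a simple baseline to start from (not one of the algorithms listed above), here is a minimal sketch of the exponentiated-gradient online portfolio update of Helmbold et al.; price_relatives[t, i] is the ratio of asset i's closing prices on days t and t-1, and transaction costs are ignored.

    import numpy as np

    def eg_portfolio(price_relatives, eta=0.05):
        # Exponentiated-gradient update: start from the uniform portfolio and tilt the
        # weights multiplicatively toward assets that did well in the last period.
        T, n = price_relatives.shape
        w = np.full(n, 1.0 / n)
        wealth = 1.0
        for t in range(T):
            x = price_relatives[t]
            wealth *= float(w @ x)                    # wealth after day t (no transaction costs)
            w = w * np.exp(eta * x / (w @ x))         # multiplicative update
            w /= w.sum()                              # project back to the simplex
        return wealth

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        rel = 1.0 + 0.01 * rng.normal(size=(250, 5))  # synthetic daily price relatives
        print(eg_portfolio(rel))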


Statistical Arbitrage

Statistical arbitrage is a major class of methods in quantitative trading.
You can read the following material.

  1. statistical arbitrage pairs trading strategies: review and outlook
  2. Pairs trading book (notes)
  3. Statistical Arbitrage in the US Equities Market

In this task, you can do one of the following:
(1) Be creative and come up with a new statistical arbitrage strategy. Implement it and test it on real data.
(2) Implement a few methods (statistics-based or machine-learning-based) mentioned in the book or the review. Try to optimize the parameters (if any).
Do extensive comparisons among them.
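
As a baseline for option (2), here is a minimal sketch of the classic z-score pairs-trading rule: trade the spread between two (hopefully cointegrated) assets when it deviates from its rolling mean. The hedge ratio, transaction costs, and position sizing are all ignored; these are exactly the things a serious project would need to handle.

    import numpy as np

    def pairs_trading_positions(price_a, price_b, window=60, entry_z=2.0, exit_z=0.5):
        # Spread = log(A) - log(B); go long the spread when its z-score is very negative,
        # short when very positive, and flatten when it reverts toward the mean.
        spread = np.log(price_a) - np.log(price_b)
        positions = np.zeros_like(spread)
        pos = 0.0
        for t in range(window, len(spread)):
            hist = spread[t - window:t]
            z = (spread[t] - hist.mean()) / (hist.std() + 1e-12)
            if pos == 0.0:
                pos = -1.0 if z > entry_z else (1.0 if z < -entry_z else 0.0)
            elif abs(z) < exit_z:
                pos = 0.0
            positions[t] = pos
        return positions  # +1: long A / short B, -1: short A / long B, 0: flat

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        common = np.cumsum(rng.normal(size=500))
        a = np.exp(0.01 * (common + rng.normal(size=500)))
        b = np.exp(0.01 * (common + rng.normal(size=500)))
        print(pairs_trading_positions(a, b)[-10:])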

Continuous processes in HFT (some papers also require knowledge of stochastic control, such as the Hamilton–Jacobi–Bellman (HJB) equation)

Algorithmic Trading of Co-integrated Assets

Buy Low Sell High: a High Frequency Trading Perspective

Refer to the following lecture notes for background knowledge (very well written): Stochastic Calculus, Filtering, and Stochastic Control


Multi-factor Models in Finance

  1. Fama–French three-factor model
  2. Deep Learning Factor Alpha
  3. 101 Formulaic Alphas
  4. Autoencoder Asset Pricing Models
  5. AutoAlpha: an Efficient Hierarchical Evolutionary Algorithm for Mining Alpha Factors in Quantitative Investment

Please contact the instructor for the data.
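
For orientation, fitting a multi-factor model of this kind reduces to a time-series regression of each asset's excess returns on the factor returns. A minimal numpy sketch on synthetic data (the course data will look different):

    import numpy as np

    def fit_factor_model(excess_returns, factor_returns):
        # Time-series regression: r_t = alpha + beta' f_t + eps_t, estimated by OLS.
        T = len(excess_returns)
        X = np.column_stack([np.ones(T), factor_returns])   # first column = intercept (alpha)
        coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
        alpha, betas = coef[0], coef[1:]
        return alpha, betas

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        factors = rng.normal(size=(500, 3))       # e.g., market, size, value factors
        true_beta = np.array([1.0, 0.5, -0.2])
        r = factors @ true_beta + 0.01 * rng.normal(size=500)
        print(fit_factor_model(r, factors))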


Asset Allocation

Risk parity

Hierarchical Risk Parity

Value-at-Risk (VaR) and Conditional VaR (CVaR)
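
To make the first item concrete, risk-parity weights can be computed from a convex reformulation: minimize (1/2) x'Σx - (1/n) Σ_i log x_i and then normalize x; at the optimum each asset contributes equally to portfolio variance. A minimal sketch using scipy (a sketch under these assumptions, not a production allocator):

    import numpy as np
    from scipy.optimize import minimize

    def risk_parity_weights(cov, budget=None):
        # Equal-risk-contribution portfolio via the convex formulation
        #   min_x  0.5 * x'Cx - sum_i b_i * log(x_i),
        # whose optimum satisfies x_i * (Cx)_i = b_i; normalizing x then equalizes
        # the risk contributions w_i * (Cw)_i.
        n = cov.shape[0]
        b = np.full(n, 1.0 / n) if budget is None else budget
        obj = lambda x: 0.5 * x @ cov @ x - b @ np.log(x)
        grad = lambda x: cov @ x - b / x
        x0 = np.full(n, 1.0 / n)
        res = minimize(obj, x0, jac=grad, method="L-BFGS-B",
                       bounds=[(1e-8, None)] * n)
        return res.x / res.x.sum()

    if __name__ == "__main__":
        cov = np.array([[0.04, 0.01, 0.00],
                        [0.01, 0.09, 0.02],
                        [0.00, 0.02, 0.16]])
        w = risk_parity_weights(cov)
        print(w, w * (cov @ w))   # risk contributions should be (nearly) equal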

Extend coding assignment 1

1. Try to capture both the time-series and cross-sectional dependencies of the assets.

2. Dynamic correlation models.

3. Problems with estimating the covariance matrix (see the shrinkage sketch below).
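
For item 3, a standard remedy is to shrink the sample covariance toward a structured target. A minimal sketch with a hand-picked shrinkage intensity toward a scaled identity (a simplified stand-in for the Ledoit–Wolf estimator, which chooses the intensity from the data, e.g., sklearn.covariance.LedoitWolf):

    import numpy as np

    def shrunk_covariance(returns, shrinkage=0.2):
        # Shrink the sample covariance toward a scaled identity target:
        #   Sigma_hat = (1 - delta) * S + delta * (average variance) * I.
        # This trades a little bias for a large variance reduction when the number of
        # observations is small relative to the number of assets.
        S = np.cov(returns, rowvar=False)
        n = S.shape[0]
        target = np.trace(S) / n * np.eye(n)
        return (1.0 - shrinkage) * S + shrinkage * target

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        R = rng.normal(size=(120, 50))   # 120 days, 50 assets: sample covariance is noisy
        Sigma = shrunk_covariance(R)
        print(np.linalg.cond(np.cov(R, rowvar=False)), np.linalg.cond(Sigma))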


