ATCS Projects 2023 Spring


Below, we list some tentative topics for your project. You are more than welcome to propose your own ideas.
Your project can be related to your own research as well.

If you are not sure whether a topic is appropriate, please contact the instructor.


You are encouraged to think about some cool new applications based on the learning techniques we covered in class (feel free to use other techniques as well, but your project has to relate to some theoretical elements of learning, optimization, and prediction). Simply applying an existing algorithm to a small dataset does not constitute a final project; nor does an arbitrary heuristic for an arbitrary formulation without any insights.
If you have an idea but are not sure whether it is feasible, talk to the instructor (but you are advised to do some preliminary searching on Google yourself beforehand).
If you need computing resources beyond your own PC or laptop, contact us and we will help you.



Mid-term report (May 12):

A team should have at most 2 persons.

By that time, you should already have started (you should have fixed your topic, your team, and your plan for how to pursue it).
You need to start early by reading relevant papers, collecting and processing data, and/or writing some preliminary code.
You need to submit a brief mid-term report covering: what your topic is, what others have done on this topic, what your new idea is, how you plan to pursue it, and any preliminary results.



Final report deadline: TBA. No late submissions accepted.

By this deadline, you need to submit your final project report, your code, and your experimental results (including generated plots).
Final presentation (date TBA, during the exam weeks, i.e., the 17th or 18th week):
Each team needs to present its results in class. You should prepare slides. Each project has 5-10 minutes. The slides should be in English; the presentation can be given in either English or Chinese.



WARNING:
There is plenty of open-source code for various ML tasks online.
If you use any open-source code or package, you must cite it properly (in your report and slides)! This is very important!
You may modify existing open-source code, but if you do so, you need to be very explicit in your report about which code you are using, which parts are your modifications, and what your modifications do.
Failing to do so is considered plagiarism (you will get 0 for the project).



The project can be either theoretical or empirical (or both).


For a theoretical work, you can do the following:
(1) Be creative: design a new algorithm with some theoretical guarantee, or prove some new theorems. This may be a difficult direction. Note that a very small and uninteresting result would not constitute a course project. If you are not sure whether your result (or direction) is interesting, contact the instructor.

For empirical work, you can do the following:
(1) Be creative: find an interesting new problem and solve it, and/or design a new algorithm for an existing problem, with or without theoretical guarantees.

(2) Implement others' methods: In this case, you need to survey a direction. The survey needs to be very detailed. You also need to implement a few of the most popular algorithms, compare them experimentally, and write the experimental results into the survey as well. I would expect you to obtain at least some insights or potential improvements over existing methods (not just run existing open-source code).

In either case, your final report should be in the format of a NeurIPS paper (NeurIPS format, including references). It should contain a title, an abstract, an introduction (this is where you tell others why you do this, i.e., the motivation, plus a summary of what you do), a related work section (you have to mention what others have done; it is important), the main text (the details of your method), and an experimental section. It should be at least 6 pages long (excluding the references). In sum, it should look like a paper.


Using ideas from online learning/bandits for hyperparameter optimization

  1. Hyperband: A novel bandit-based approach to hyperparameter optimization
  2. Practical Bayesian Optimization of Machine Learning Algorithms
  3. Fast Bayesian optimization of machine learning hyperparameters on large datasets

(Direction: try to provide better modeling of the training process and better allocation of resources.)
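
To get a feel for the resource-allocation idea in Hyperband [1], here is a minimal sketch of successive halving, its core subroutine (the full Hyperband algorithm additionally loops over several brackets with different trade-offs). The function train_and_eval(config, budget) is a placeholder you would supply: it trains under config for budget units of resource and returns a validation loss.

    import numpy as np

    def successive_halving(configs, train_and_eval, min_budget=1, eta=3):
        # Start every configuration with a small budget; repeatedly keep the best
        # 1/eta fraction and grow the budget by a factor of eta.
        budget = min_budget
        while len(configs) > 1:
            losses = [train_and_eval(cfg, budget) for cfg in configs]
            k = max(1, len(configs) // eta)
            keep = np.argsort(losses)[:k]
            configs = [configs[i] for i in keep]
            budget *= eta
        return configs[0]

    # Toy usage: "training" just evaluates a noisy quadratic in the hyperparameter,
    # with noise shrinking as the budget grows.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        def train_and_eval(cfg, budget):
            return (cfg - 0.3) ** 2 + rng.normal(scale=1.0 / budget)
        candidates = list(rng.uniform(0, 1, size=27))
        print(successive_halving(candidates, train_and_eval))

One possible project direction, in line with the note above, is to replace the crude "keep the best 1/eta fraction" rule with an explicit model of how the validation loss evolves with budget.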


Theory and Applications of Diffusion Models

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

Statistical Efficiency of Score Matching: The View from Isoperimetry

Diffusion Models are Minimax Optimal Distribution Estimators


Theory of Deep Learning

There is already a large body of literature on this topic.

Recent active research topics:

  1. Generalization in deep learning: the well-known rethinking paper: Understanding deep learning requires rethinking generalization
  2. Optimization algorithms in deep learning
  3. Neural Tangent Kernel / Lazy Training
  4. Mean Field regime and analysis
  5. Margin Theory
  6. Implicit bias
  7. Landscape of NN
  8. Robustness
  9. Compression
  10. Ensemble, Self-distillation

Talk to the instructor if you want to do something in this domain


Langevin Dynamics

classic paper: Bayesian Learning via Stochastic Gradient Langevin Dynamics
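
For reference, here is a minimal sketch of the (unadjusted) Langevin update underlying that paper. Note this toy version uses the full gradient of log p with a fixed step size; SGLD proper uses minibatch stochastic gradients of the log posterior and a decreasing step-size schedule.

    import numpy as np

    def langevin_sample(grad_log_p, x0, step=1e-2, n_iters=5000, rng=None):
        # Langevin update: x <- x + (step/2) * grad log p(x) + sqrt(step) * N(0, I).
        rng = rng or np.random.default_rng(0)
        x = np.array(x0, dtype=float)
        samples = []
        for _ in range(n_iters):
            noise = rng.normal(size=x.shape)
            x = x + 0.5 * step * grad_log_p(x) + np.sqrt(step) * noise
            samples.append(x.copy())
        return np.array(samples)

    # Toy usage: sample from a standard Gaussian, where grad log p(x) = -x.
    if __name__ == "__main__":
        samples = langevin_sample(lambda x: -x, x0=[3.0])
        print(samples[1000:].mean(), samples[1000:].var())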

Convergence/hitting time:

  1. Sharp convergence rates for Langevin dynamics in the nonconvex setting
  2. Convergence of Langevin MCMC in KL-divergence
  3. On Thompson Sampling with Langevin Algorithms
  4. A hitting time analysis of stochastic gradient langevin dynamics (COLT 17 best paper)
  5. Hitting Time of Stochastic Gradient Langevin Dynamics to Stationary Points: A Direct Analysis (a simpler and better analysis than the COLT 17 paper; leverages tools from Ito calculus)

Generalization:

  1. Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis
  2. Generalization bounds of SGLD for non-convex learning: Two theoretical viewpoints
  3. Generalization error bounds for noisy, iterative algorithms
  4. On generalization error bounds of noisy gradient methods for non-convex learning

Direction: analyze SGLD in the NTK or mean-field regime


Theory of Semi-supervised/self-supervised learning


  1. Combining Labeled and Unlabeled Data with Co-Training

  2. Co-Training and Expansion: Towards Bridging Theory and Practice

  3. Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data

  4. Predicting What You Already Know Helps: Provable Self-Supervised Learning.

  5. A theoretical analysis of contrastive unsupervised representation learning

Possible direction: generalize the common representation assumption in [4]. For example, one may want to model the scenario where learning the pretext task helps build a partial (or approximate) representation for the downstream task, so that some fine-tuning is still needed for the downstream task. This is more realistic.
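
As a concrete reference point for [5], below is a minimal numpy sketch of an InfoNCE-style contrastive loss with in-batch negatives (related in spirit to, though not identical to, the loss analyzed in that paper). The embeddings of the two views are assumed to be precomputed.

    import numpy as np

    def info_nce_loss(z_anchor, z_positive, temperature=0.1):
        # z_anchor, z_positive: (batch, dim) embeddings of two views of the same examples.
        # For each anchor, its matching row is the positive; all other rows are negatives.
        z_a = z_anchor / np.linalg.norm(z_anchor, axis=1, keepdims=True)
        z_p = z_positive / np.linalg.norm(z_positive, axis=1, keepdims=True)
        logits = z_a @ z_p.T / temperature              # (batch, batch) similarity matrix
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))             # cross-entropy with the true match

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        z = rng.normal(size=(8, 16))
        print(info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape)))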


Theory of Meta-learning (multi-task, transfer learning)

  1. A Model of Inductive Bias Learning

  2. A Survey on Domain Adaptation Theory: Learning Bounds and Theoretical Guarantees

  3. Learning multiple tasks using shared hypotheses

  4. Risk Bounds for Transferring Representations With and Without Fine-tuning

  5. On Learning Invariant Representation for Domain Adaptation

  6. Bounds for Linear Multi-Task Learning

  7. The Benefit of Multitask Representation Learning

  8. Few-shot learning via learning the representation, provably

  9. On the Theory of Transfer Learning: The Importance of Task Diversity

  10. Awesome transfer learning papers: theory


Optimization in NN

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability (a very interesting recent empirical paper; direction: explain the empirical findings of the paper)

On the global landscape of neural networks: an overview (a good survey)


Convex? NN

  1. Convex Neural Networks

  2. Neural Networks are Convex Regularizers

  3. Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
  4. Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time

Direction: the above-mentioned papers are mainly about optimization. How about generalization (using the new convex formulation)?


Neural Tangent Kernel / Wide NN

  1. Neural tangent kernel: Convergence and generalization in neural networks
  2. Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks
  3. On Lazy Training in Differentiable Programming
  4. On Exact Computation with an Infinitely Wide Neural Net
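
For experiments in this direction, the empirical NTK of a finite network can be computed directly from parameter gradients, K(x, x') = <grad_theta f(x), grad_theta f(x')>. A minimal PyTorch sketch for a scalar-output model (fine for small networks and datasets; it materializes full per-example gradients, so it does not scale):

    import torch

    def empirical_ntk(model, x1, x2):
        # Empirical NTK entry: inner product of parameter gradients of the outputs.
        def flat_grad(x):
            out = model(x.unsqueeze(0)).squeeze()
            grads = torch.autograd.grad(out, model.parameters())
            return torch.cat([g.reshape(-1) for g in grads])
        G1 = torch.stack([flat_grad(x) for x in x1])   # (n1, n_params)
        G2 = torch.stack([flat_grad(x) for x in x2])   # (n2, n_params)
        return G1 @ G2.T                               # (n1, n2) kernel matrix

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(torch.nn.Linear(5, 64), torch.nn.ReLU(),
                                    torch.nn.Linear(64, 1))
        x = torch.randn(4, 5)
        print(empirical_ntk(model, x, x))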

Mean Field Regime of wide NN

  1. Deep Information Propagation
  2. Exponential expressivity in deep neural networks through transient chaos

  3. A mean field view of the landscape of two-layer neural networks
  4. On the global convergence of gradient descent for over-parameterized models using optimal transport
  5. Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
  6. Feature Learning in Infinite-Width Neural Networks

Robustness

  1. Adversarial Examples Are Not Bugs, They Are Features
  2. Certified Adversarial Robustness via Randomized Smoothing
  3. Certifying some distributional robustness with principled adversarial training

  4. Provable defenses against adversarial examples via the convex outer adversarial polytope

  5. Formal guarantees on the robustness of a classifier against adversarial manipulation
  6. Randomized Smoothing of All Shapes and Sizes

    Check out this blog as well https://gradientscience.org
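
To illustrate the idea in [2] and [6], here is a minimal sketch of prediction with a randomly smoothed classifier (Monte Carlo majority vote under Gaussian input noise). The statistical certification of a robustness radius, which is the main contribution of those papers, is omitted here.

    import numpy as np

    def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, rng=None):
        # Smoothed classifier g(x) = argmax_c P[f(x + noise) = c], noise ~ N(0, sigma^2 I),
        # estimated by Monte Carlo.
        rng = rng or np.random.default_rng(0)
        noise = rng.normal(scale=sigma, size=(n_samples,) + x.shape)
        preds = np.array([base_classifier(x + n) for n in noise])
        classes, counts = np.unique(preds, return_counts=True)
        return classes[np.argmax(counts)]

    if __name__ == "__main__":
        # Toy base classifier: sign of the first coordinate.
        f = lambda v: int(v[0] > 0)
        print(smoothed_predict(f, np.array([0.1, -2.0])))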


Explanation in ML

SHAP

LIME

Integrated Gradients
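
For instance, Integrated Gradients attributes a prediction to input features by integrating the model's gradient along a straight path from a baseline to the input. A minimal PyTorch sketch using a Riemann-sum approximation of the path integral (the baseline defaults to the all-zeros input here):

    import torch

    def integrated_gradients(model, x, baseline=None, steps=50):
        # Attribution_i = (x_i - b_i) * integral_0^1 [d model(b + a*(x-b)) / dx_i] da,
        # approximated with a Riemann sum over `steps` points on the straight-line path.
        baseline = torch.zeros_like(x) if baseline is None else baseline
        alphas = torch.linspace(0.0, 1.0, steps).view(-1, 1)
        path = baseline + alphas * (x - baseline)          # (steps, dim)
        path.requires_grad_(True)
        outputs = model(path).sum()
        grads = torch.autograd.grad(outputs, path)[0]      # (steps, dim)
        return (x - baseline) * grads.mean(dim=0)

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Tanh(),
                                    torch.nn.Linear(8, 1))
        x = torch.randn(4)
        print(integrated_gradients(model, x))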


Pricing Derivatives and Machine Learning

  1. Online trading algorithms and robust option pricing
  2. Minimax Option Pricing Meets Black-Scholes in the Limit
  3. QLBS: Q-Learner in the Black-Scholes (-Merton) Worlds
  4. A Nonparametric Approach to Pricing and Hedging Derivative Securities via Learning Networks
  5. A reinforcement learning approach for pricing derivatives
  6. Pricing Options with an Artificial Neural Network: A Reinforcement Learning Approach

(idea: connect the fundamental theorem of finance and statistical learning theory)


Apply online learning algorithms to trade stocks/futures - 1

  1. Universal Portfolios With and Without Transaction Costs
  2. Universal Portfolios with Side Information
  3. Semi-Universal Portfolios with Transaction Costs
  4. Portfolio Choices with Orthogonal Bandit Learning
  5. Portfolio Optimization by Means of a χ-Armed Bandit Algorithm
  6. Online portfolio selection: A survey

  7. Nonparametric kernel-based sequential investment strategies

  8. Corn: Correlation-driven nonparametric learning approach for portfolio selection

  9. Moving average reversion strategy for on-line portfolio selection

  10. On-line portfolio selection with moving average reversion

Talk to the instructor for the idea and the data
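
As a simple baseline to start from (not one of the algorithms listed above), here is a minimal sketch of the exponentiated-gradient online portfolio update of Helmbold et al.; price_relatives[t, i] is the ratio of asset i's closing prices on days t and t-1, and transaction costs are ignored.

    import numpy as np

    def eg_portfolio(price_relatives, eta=0.05):
        # Exponentiated-gradient update: start from the uniform portfolio and tilt the
        # weights multiplicatively toward assets that did well in the last period.
        T, n = price_relatives.shape
        w = np.full(n, 1.0 / n)
        wealth = 1.0
        for t in range(T):
            x = price_relatives[t]
            wealth *= float(w @ x)                    # wealth after day t (no transaction costs)
            w = w * np.exp(eta * x / (w @ x))         # multiplicative update
            w /= w.sum()                              # project back to the simplex
        return wealth

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        rel = 1.0 + 0.01 * rng.normal(size=(250, 5))  # synthetic daily price relatives
        print(eg_portfolio(rel))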


Statistical Arbitrage

Statistical arbitrage is a major class of methods in quantitative trading.
You can read the following material.

  1. statistical arbitrage pairs trading strategies: review and outlook
  2. Pairs trading book (notes)
  3. Statistical Arbitrage in the US Equities Market

In this task, you can do one of the following:
(1) Be creative and come up with a new statistical arbitrage strategy. Implement it and test it on real data.
(2) Implement a few methods (statistics-based or machine-learning-based) mentioned in the book or the review. Try to optimize the parameters (if any).
Do extensive comparisons among them.
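
As a baseline for option (2), here is a minimal sketch of the classic z-score pairs-trading rule: trade the spread between two (hopefully cointegrated) assets when it deviates from its rolling mean. The hedge ratio, transaction costs, and position sizing are all ignored; these are exactly the things a serious project would need to handle.

    import numpy as np

    def pairs_trading_positions(price_a, price_b, window=60, entry_z=2.0, exit_z=0.5):
        # Spread = log(A) - log(B); go long the spread when its z-score is very negative,
        # short when very positive, and flatten when it reverts toward the mean.
        spread = np.log(price_a) - np.log(price_b)
        positions = np.zeros_like(spread)
        pos = 0.0
        for t in range(window, len(spread)):
            hist = spread[t - window:t]
            z = (spread[t] - hist.mean()) / (hist.std() + 1e-12)
            if pos == 0.0:
                pos = -1.0 if z > entry_z else (1.0 if z < -entry_z else 0.0)
            elif abs(z) < exit_z:
                pos = 0.0
            positions[t] = pos
        return positions  # +1: long A / short B, -1: short A / long B, 0: flat

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        common = np.cumsum(rng.normal(size=500))
        a = np.exp(0.01 * (common + rng.normal(size=500)))
        b = np.exp(0.01 * (common + rng.normal(size=500)))
        print(pairs_trading_positions(a, b)[-10:])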

Continuous processes in HFT (some papers also require knowledge of stochastic control, such as the Hamilton–Jacobi–Bellman (HJB) equation)

Algorithmic Trading of Co-integrated Assets

Buy Low Sell High: a High Frequency Trading Perspective

Refer to the following lecture notes for background knowledge (very well written): Stochastic Calculus, Filtering, and Stochastic Control


Multi-factor Models in Finance

  1. Fama–French three-factor model
  2. Deep Learning Factor Alpha
  3. 101 Formulaic Alphas
  4. Autoencoder Asset Pricing Models
  5. AutoAlpha: an Efficient Hierarchical Evolutionary Algorithm for Mining Alpha Factors in Quantitative Investment

Please contact the instructor for the data.
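
For orientation, fitting a multi-factor model of this kind reduces to a time-series regression of each asset's excess returns on the factor returns. A minimal numpy sketch on synthetic data (the course data will look different):

    import numpy as np

    def fit_factor_model(excess_returns, factor_returns):
        # Time-series regression: r_t = alpha + beta' f_t + eps_t, estimated by OLS.
        T = len(excess_returns)
        X = np.column_stack([np.ones(T), factor_returns])   # first column = intercept (alpha)
        coef, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
        alpha, betas = coef[0], coef[1:]
        return alpha, betas

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        factors = rng.normal(size=(500, 3))       # e.g., market, size, value factors
        true_beta = np.array([1.0, 0.5, -0.2])
        r = factors @ true_beta + 0.01 * rng.normal(size=500)
        print(fit_factor_model(r, factors))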


Asset Allocation

Risk parity

Hierarchical Risk Parity

Value-at-Risk (VaR) and Conditional VaR (CVaR)
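
To make the first item concrete, risk-parity weights can be computed from a convex reformulation: minimize (1/2) x'Σx - (1/n) Σ_i log x_i and then normalize x; at the optimum each asset contributes equally to portfolio variance. A minimal sketch using scipy (a sketch under these assumptions, not a production allocator):

    import numpy as np
    from scipy.optimize import minimize

    def risk_parity_weights(cov, budget=None):
        # Equal-risk-contribution portfolio via the convex formulation
        #   min_x  0.5 * x'Cx - sum_i b_i * log(x_i),
        # whose optimum satisfies x_i * (Cx)_i = b_i; normalizing x then equalizes
        # the risk contributions w_i * (Cw)_i.
        n = cov.shape[0]
        b = np.full(n, 1.0 / n) if budget is None else budget
        obj = lambda x: 0.5 * x @ cov @ x - b @ np.log(x)
        grad = lambda x: cov @ x - b / x
        x0 = np.full(n, 1.0 / n)
        res = minimize(obj, x0, jac=grad, method="L-BFGS-B",
                       bounds=[(1e-8, None)] * n)
        return res.x / res.x.sum()

    if __name__ == "__main__":
        cov = np.array([[0.04, 0.01, 0.00],
                        [0.01, 0.09, 0.02],
                        [0.00, 0.02, 0.16]])
        w = risk_parity_weights(cov)
        print(w, w * (cov @ w))   # risk contributions should be (nearly) equal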

Extend coding assignment 1

1. Try to capture both the time-series and cross-sectional dependencies of the assets.

2. Dynamic correlation models.

3. Problems with estimating the covariance matrix (see the shrinkage sketch below).
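
For item 3, a standard remedy is to shrink the sample covariance toward a structured target. A minimal sketch with a hand-picked shrinkage intensity toward a scaled identity (a simplified stand-in for the Ledoit–Wolf estimator, which chooses the intensity from the data, e.g., sklearn.covariance.LedoitWolf):

    import numpy as np

    def shrunk_covariance(returns, shrinkage=0.2):
        # Shrink the sample covariance toward a scaled identity target:
        #   Sigma_hat = (1 - delta) * S + delta * (average variance) * I.
        # This trades a little bias for a large variance reduction when the number of
        # observations is small relative to the number of assets.
        S = np.cov(returns, rowvar=False)
        n = S.shape[0]
        target = np.trace(S) / n * np.eye(n)
        return (1.0 - shrinkage) * S + shrinkage * target

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        R = rng.normal(size=(120, 50))   # 120 days, 50 assets: sample covariance is noisy
        Sigma = shrunk_covariance(R)
        print(np.linalg.cond(np.cov(R, rowvar=False)), np.linalg.cond(Sigma))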


