ATCS: Selected Topics in Learning, Prediction, and Optimization (with Applications in Finance)
2020 Spring
Lecturer: Jian Li ( lapordge at gmail dot com)
TA: Tianping Zhang ( ztp18 at mails.tsinghua.edu.cn )
time: every Monday, 9:50am-12:15pm
Room: TBD (we use ZOOM for now)
We intend to cover a subset of the following topics (tentative):
(1) I assume you already know the basics (convex optimization and machine learning, stochastic gradient descent, gradient boosting, deep learning basics, CNNs, RNNs; please see my undergrad course). If you don't know any machine learning, I do not suggest you take this course. I will recall some concepts briefly when necessary.
(2) online learning, multi-armed bandits, statistical learning theory, theory of deep learning
I won't strictly follow the above order. I may skip some topics mentioned above and cover some that are not. It is a graduate course.
I will be talking about several applications of ML and optimization in finance (trading, pricing derivatives, etc.), and of course in typical CS areas like vision, NLP, and social networks as well.
I will teach about 2/3 of the classes. For the rest, I will choose some topics and students will need to give class presentations.
Tentative topics for class presentations: GANs, adversarial training, federated learning, AutoML, various financial applications.
Some knowledge about convex optimization may be useful; see this course (by S. Boyd) and a previous course by myself, but it will be fine if you didn't take those courses. Basic machine learning knowledge is a must; see Andrew Ng's undergrad lecture notes.
The course is a blending of theory and practice. We will cover both the underlying mathematics as well as interesting heuristics.
Grading:
Schedule:
Feb 17: Basics of online learning; the expert problem; the multiplicative weights method; online gradient descent.
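To make the expert problem concrete, here is a minimal sketch of the multiplicative weights method in Python. The function name, the [0, 1] loss convention, and the fixed learning rate eta are my own illustrative choices, not the lecture's exact formulation:

```python
import numpy as np

def multiplicative_weights(losses, eta=0.5):
    """Multiplicative weights over N experts for T rounds.

    losses: (T, N) array with losses[t, i] in [0, 1], the loss of
    expert i at round t. Returns the algorithm's total expected loss
    when playing the distribution proportional to the weights.
    """
    T, N = losses.shape
    w = np.ones(N)
    total = 0.0
    for t in range(T):
        p = w / w.sum()                # play distribution proportional to weights
        total += p @ losses[t]         # expected loss this round
        w *= np.exp(-eta * losses[t])  # exponentially downweight lossy experts
    return total
```

With two experts, one perfect and one always wrong, the weight on the bad expert decays geometrically, so the algorithm's total loss stays bounded by a constant while the horizon grows.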

Feb 24: Basics of stock and futures markets; Cover's universal portfolio; second-order bounds; online mirror descent: Fenchel conjugate functions (closely related to other notions of duality, such as the polar dual and dual norm).

Mar 2: Online mirror descent; interval regret, sleeping experts, adaptive regret.

Mar 9: Bandits. Stochastic: UCB; adversarial: EXP3; pure exploration: median elimination.
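For reference, a minimal sketch of UCB1 on Bernoulli arms. The interface and the sqrt(2 log t / n) exploration bonus are standard textbook choices, but the details here are my own illustration, not taken from the lecture:

```python
import numpy as np

def ucb1(arm_means, T, seed=0):
    """Run UCB1 for T rounds on Bernoulli arms with the given means.

    Returns the pull count of each arm; the best arm should dominate
    once the confidence intervals separate.
    """
    rng = np.random.default_rng(seed)
    K = len(arm_means)
    counts = np.zeros(K)
    sums = np.zeros(K)
    # initialization: pull each arm once
    for a in range(K):
        sums[a] += float(rng.random() < arm_means[a])
        counts[a] += 1
    for t in range(K, T):
        # empirical mean plus confidence bonus
        ucb = sums / counts + np.sqrt(2.0 * np.log(t + 1) / counts)
        a = int(np.argmax(ucb))
        sums[a] += float(rng.random() < arm_means[a])
        counts[a] += 1
    return counts
```

On a two-armed instance with a large gap (e.g. means 0.9 and 0.4), UCB1 concentrates almost all pulls on the better arm, which is exactly the logarithmic-regret behavior proved in class.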

Mar 16: Thompson sampling; contextual bandits: EXP4, epsilon-greedy.
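As a concrete companion to the Thompson sampling topic, here is a minimal sketch for Bernoulli arms with Beta(1, 1) priors; the function name and parameters are illustrative choices of mine:

```python
import numpy as np

def thompson_bernoulli(arm_means, T, seed=0):
    """Thompson sampling on Bernoulli arms with Beta(1, 1) priors.

    Each round: sample one mean estimate per arm from its posterior,
    play the argmax, then update that arm's Beta posterior.
    Returns the pull count of each arm.
    """
    rng = np.random.default_rng(seed)
    K = len(arm_means)
    alpha = np.ones(K)  # 1 + observed successes
    beta = np.ones(K)   # 1 + observed failures
    counts = np.zeros(K, dtype=int)
    for _ in range(T):
        theta = rng.beta(alpha, beta)          # one posterior sample per arm
        a = int(np.argmax(theta))              # play the optimistic-looking arm
        reward = float(rng.random() < arm_means[a])
        alpha[a] += reward
        beta[a] += 1.0 - reward
        counts[a] += 1
    return counts
```

Like UCB, it quickly concentrates play on the best arm, but the exploration comes from posterior randomness rather than explicit confidence bonuses.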
Selected Reading:

Mar 23: Hyperparameter optimization: Hyperband, Bayesian optimization; online-to-batch conversion; introduction to stock and futures markets; the efficient market hypothesis.
Selected Reading: Gaussian process optimization in the bandit setting: No regret and experimental design 
Mar 30: Digression: the martingale convergence theorem; brief intro to futures and options; the no-arbitrage argument; basics of Brownian motion; Ito calculus.
See corresponding chapters in Options, Futures and Other Derivatives 
Apr 6: Break. No class.
Apr 13: Pricing derivatives; the binomial tree model; the Black-Scholes-Merton (BSM) model; pricing derivatives via no-regret online learning.
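To make the binomial tree model concrete, here is a minimal Cox-Ross-Rubinstein sketch for pricing a European call by backward induction under the risk-neutral measure. The parameter names and the CRR choice of up/down factors are my own illustrative conventions:

```python
import numpy as np

def binomial_call(S0, K, r, sigma, T, n):
    """European call price from an n-step Cox-Ross-Rubinstein tree.

    S0: spot, K: strike, r: risk-free rate, sigma: volatility,
    T: maturity in years, n: number of tree steps.
    """
    dt = T / n
    u = np.exp(sigma * np.sqrt(dt))        # up factor
    d = 1.0 / u                            # down factor
    q = (np.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    # terminal stock prices and call payoffs
    j = np.arange(n + 1)
    ST = S0 * u**j * d**(n - j)
    V = np.maximum(ST - K, 0.0)
    # backward induction: discounted risk-neutral expectation at each step
    disc = np.exp(-r * dt)
    for _ in range(n):
        V = disc * (q * V[1:] + (1.0 - q) * V[:-1])
    return float(V[0])
```

As n grows, the tree price converges to the BSM closed-form value (about 10.45 for an at-the-money call with S0 = K = 100, r = 0.05, sigma = 0.2, T = 1).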

Apr 20: Relation between SDEs and PDEs; the Feynman-Kac formula; the risk-neutral valuation representation of the BS formula; the Fokker-Planck formula; robust option pricing using online learning; the volatility smile.
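The risk-neutral valuation representation mentioned above has the familiar closed form. A minimal sketch of the BSM call price using scipy.stats.norm (the function name is my own; treat it as an illustration):

```python
import numpy as np
from scipy.stats import norm

def bsm_call(S0, K, r, sigma, T):
    """Black-Scholes-Merton price of a European call.

    Implements the discounted risk-neutral expectation
    C = S0 * N(d1) - K * exp(-r T) * N(d2).
    """
    d1 = (np.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S0 * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
```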
Selected Reading:

Apr 27: Quick review of classical statistical learning theory: Rademacher complexity; maxima of (sub-)Gaussian processes; chaining; packing/covering numbers; VC dimension.
There are many good references for these classical results. I mainly followed the very well-written [lecture notes] Probability in High Dimension by van Handel (chapters 5 and 7). The [Book] Neural Network Learning: Theoretical Foundations is also a very good source.
May 4: Labor Day break; class moved to May 9.
May 9: Pseudo-dimension, fat-shattering dimension; Rademacher complexity; relations between these dimensions and covering/packing numbers.
Selected reading:

May 11: Classical VC bounds for neural networks (overview); generalization bounds in deep learning: rethinking generalization, spectral-norm generalization bounds, generalization via compression.
Selected Reading:

May 18: RKHS; MMD, universality, integral operators; generalization bounds for kernel methods (overview); why existing (kernel) generalization bounds are not enough.
Selected reading:

May 25: Student presentations, group 1: About Double Descent.
Jun 1: Student presentations, group 2: About Kernels.

Jun 8: Project presentations.
References:
[Book] Introduction to Online Convex Optimization
[Book] Prediction, Learning, and Games
[Book] Options, Futures and Other Derivatives
[Book] Advances in Financial Machine Learning
[Book] Foundations of Machine Learning
[Book] Understanding Machine Learning: From Theory to Algorithms
Python is the default programming language we will use in the course.
If you haven't used it before, don't worry. It is very easy to learn (if you know any other programming language) and very efficient, especially for prototyping things related to scientific computing and numerical optimization. Python code is usually much shorter than C/C++ code (the language does a lot for you). It is also more flexible and generally faster than MATLAB.
A standard combination for this class is Python + numpy (a numerical library for Python) + scipy (a scientific computing library for Python) + matplotlib (for generating nice plots).
Another, somewhat easier, way is to install Anaconda (a free Python distribution that bundles most of the popular packages).
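As a quick taste of the numpy part of that stack, here is a small self-contained sketch; the simulated return parameters are arbitrary, chosen only for illustration:

```python
import numpy as np

# simulate 250 days of Gaussian log-returns and recover the price path
rng = np.random.default_rng(0)
log_returns = rng.normal(loc=0.0005, scale=0.01, size=250)
prices = 100.0 * np.exp(np.cumsum(log_returns))  # fully vectorized, no Python loop

# annualized volatility estimate from the daily returns
ann_vol = log_returns.std(ddof=1) * np.sqrt(252)
```

Vectorized operations like np.cumsum and array-wide exp are the idiom that makes numpy code both short and fast compared with explicit loops.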