ATCS – Selected Topics in Learning, Prediction, and Optimization (with applications in Finance)

2025 Spring

Lecturer: Jian Li (lapordge at gmail dot com)

TA: Shaowen Wang (wangsw5653@gmail.com), Sichen Bao

Time: every Monday, 9:50am-12:15pm

Room: 4102, Fourth Teaching Building (四教4102)


We intend to cover a subset of the following topics (tentative):

Prerequisites: I assume you already know the basics (convex optimization and machine learning, stochastic gradient descent, gradient boosting, deep learning basics, CNNs, RNNs; please see my undergrad course). If you don't know much machine learning (e.g., you do not yet know how to derive the dual of the SVM), please do NOT take this course. I will briefly recall some concepts when necessary.
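As a quick self-check of that SVM example: below are the standard hard-margin SVM primal and its Lagrangian dual, in the usual notation with labels y_i in {+1, -1}. If this derivation is unfamiliar, the course will likely be hard to follow.

\[
\min_{w,\,b}\ \tfrac{1}{2}\|w\|^2 \quad \text{s.t.}\quad y_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,n,
\]
\[
\max_{\alpha \ge 0}\ \sum_{i=1}^n \alpha_i \;-\; \tfrac{1}{2}\sum_{i,j=1}^n \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j \quad \text{s.t.}\quad \sum_{i=1}^n \alpha_i y_i = 0.
\]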

I will cover modern topics related to AI and its applications in finance. It is a graduate course.

 

Selected topics on the fundamental theories and cutting-edge applications of modern artificial intelligence and machine learning.

I will cover topics related to:

1. machine learning theory, including deep learning theory

2. topics in large language models (compression & intelligence, CoT, expressive power, scaling laws, etc.)

3. theory of diffusion models

4. fundamental theories of quantitative finance

5. applications of large models and other machine learning methods in the financial domain

6. other possible topics in ML: robustness, explainable AI, fairness, calibration

You may take a look at the previous editions of this course: ATCS 2024 and ATCS 2021.

 

It is a very diverse set of topics, and I don't expect you to be interested in everything. But if you have enough undergrad AI background and are interested in one or two of the topics above, you can take this course.

It may also be a good opportunity for you to get a flavor of the other topics. In any case, there is no closed-book exam.

I will teach about 2/3 to 3/4 of the classes. For the rest, I will choose some topics and students will give class presentations.

 

Basic machine learning knowledge is a must; see, e.g., Andrew Ng's undergrad lecture notes.

The course may use various mathematical tools from convex optimization, spectral theory, matrix perturbation, probability, high-dimensional geometry, functional analysis, Fourier analysis, real algebraic geometry, stochastic differential equations, information theory, and so on. Only standard CS undergrad math and machine learning knowledge are required; otherwise the course will be self-contained. But a certain level of mathematical maturity is required.

Some knowledge of convex optimization may be useful. See this course (by S. Boyd) and a previous course by myself. But it will be fine if you didn't take those courses.

The course is a blend of theory and practice. We will cover both the underlying mathematics and interesting heuristics.

 


Grading:

  1. Homeworks (20 pts): 2-3 small coding homeworks.
  2. Taking notes (5-10 pts): each student should take notes for at least one topic, using LaTeX (use this template: sample.tex, algorithm2e.sty).
  3. Class participation / class presentation (10 pts).
  4. Course project (60 pts: 10 pts for the mid-term report, 10 pts for the final presentation, 40 pts for the final report).
  5. No closed-book exam.

 


Schedule:

 

Feb 17: Introduction of the course. Intro to modern AI, deep learning theory, LLMs, diffusion models.
Feb 24: Language modeling and LLMs. Word2Vec, neural language models, Seq2Seq, Transformer, cross-entropy loss (perplexity) and source coding, compression and prediction (see the short sketch after the schedule), GPT (pre-training, masked self-attention, parallel training, data mixing, training algorithms, MoE).
Mar 3: More on LLMs. Inference (sampling methods, speculative decoding), Chain of Thought, Tree of Thought, MCTS, RAG, LangChain, ReAct, Reflexion, agents.
Mar 10: Introduction to diffusion models. Basics of Brownian motion; SDEs, Ito's integral and calculus; the OU process (see the simulation sketch after the schedule); Anderson's theorem; diffusion models.
Mar 17: More on diffusion models. SMLD, DDPM, hallucinations in diffusion models.
Mar 24: First part (guest lecture by Lijie Chen): theoretical limitations of multi-layer Transformers. Second part: consistency models, Latent Consistency Models, LCM-LoRA.
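A quick illustration of the cross-entropy/perplexity/compression connection from the Feb 24 lecture: perplexity is the exponential of the average cross-entropy loss, and cross-entropy measured in bits is the average code length per token that an ideal source coder driven by the model would achieve. A minimal sketch, with made-up token probabilities:

    import numpy as np

    # Probabilities a (hypothetical) model assigns to the observed tokens.
    token_probs = np.array([0.2, 0.5, 0.1, 0.4])

    # Average cross-entropy in nats; perplexity is its exponential.
    cross_entropy = -np.mean(np.log(token_probs))
    perplexity = np.exp(cross_entropy)

    # In bits: the average code length per token an ideal source coder
    # driven by this model would pay -- the compression view of prediction.
    bits_per_token = cross_entropy / np.log(2)

    print(f"cross-entropy: {cross_entropy:.3f} nats, "
          f"perplexity: {perplexity:.3f}, bits/token: {bits_per_token:.3f}")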
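Similarly, a minimal sketch of the Ornstein-Uhlenbeck (OU) process from the Mar 10 lecture, simulated with the standard Euler-Maruyama scheme; the parameters and initial condition below are arbitrary choices for illustration:

    import numpy as np

    # Euler-Maruyama discretization of the OU process
    #   dX_t = -theta * X_t dt + sigma dW_t
    theta, sigma = 1.0, 0.5   # arbitrary mean-reversion speed and noise scale
    T, n_steps = 5.0, 1000
    dt = T / n_steps

    rng = np.random.default_rng(0)
    x = np.empty(n_steps + 1)
    x[0] = 2.0                # arbitrary initial condition
    for t in range(n_steps):
        dW = rng.normal(scale=np.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x[t + 1] = x[t] - theta * x[t] * dt + sigma * dW

    # The path reverts toward 0; stationary std is sigma / sqrt(2 * theta).
    print(f"final value: {x[-1]:.3f}")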

 


References:

[Book] Introduction to Online Convex Optimization

[Book] Prediction, Learning, and Games

[Book] Options, Futures, and Other Derivatives

[Book] Advances in Financial Machine Learning

[Book] Foundations of Machine Learning

[Book] Understanding Machine Learning: From Theory to Algorithms

Lecture notes for STAT928: Statistical Learning and Sequential Prediction


Python is the default programming language we will use in the course.

If you haven't used it before, don't worry. It is very easy to learn (if you know any other programming language) and very efficient, especially for prototyping things related to scientific computing and numerical optimization. Python code is usually much shorter than C/C++ code (the language does a lot for you). It is also more flexible and generally faster than Matlab.

A standard combination for this class is Python + numpy (a numerical library for Python) + scipy (a scientific computing library for Python) + matplotlib (for generating nice plots).

Another, somewhat easier, way is to install Anaconda (a free Python distribution that comes with most popular packages).
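If you want a quick sanity check of the numpy/scipy/matplotlib setup described above, here is a tiny sketch (the matrix, vector, and objective are made up for illustration; nothing course-specific):

    import numpy as np
    from scipy.optimize import minimize
    import matplotlib.pyplot as plt

    # Toy least-squares objective f(x) = ||Ax - b||^2 with made-up data.
    A = np.array([[2.0, 0.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])

    def f(x):
        r = A @ x - b
        return r @ r

    res = minimize(f, x0=np.zeros(2))  # defaults to BFGS here
    print("minimizer:", res.x)

    # Plot the objective along the first coordinate, second held at optimum.
    xs = np.linspace(res.x[0] - 2, res.x[0] + 2, 200)
    plt.plot(xs, [f(np.array([x, res.x[1]])) for x in xs])
    plt.xlabel("x[0]")
    plt.ylabel("objective")
    plt.show()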