ML4T Book 2nd Edition

🇹🇭 Thai

Machine Learning for Algorithmic Trading (2nd Edition) by Stefan Jansen
Published July 2020 | Packt Publishing | 858 pages | 23 chapters + appendix

The 3rd Edition is coming (June 2026) as part of the ML4T Platform: it expands to 27 chapters and adds GenAI, causal inference, and production MLOps.

ML4T Workflow (Central Framework)

Data → Features → ML Model → Signal → Backtest → Portfolio → Live

Chapter Structure

Part 1 — Data (Ch 1–5)

Ch	Title	Key Content
1	ML for Trading – From Idea to Execution	ML4T workflow, use cases, strategy lifecycle
2	Market and Fundamental Data	ITCH feed, tick→bars, pandas-datareader
3	Alternative Data for Finance	Categories, evaluation criteria, web scraping
4	Financial Feature Engineering	Alpha factors, TA-Lib, Kalman filter, Alphalens
5	Portfolio Optimization and Performance Evaluation	Sharpe, HRP, pyfolio

Part 2 — ML Foundations (Ch 6–8)

Ch	Title	Key Content
6	The Machine Learning Process	Bias-variance, cross-validation, purging/embargoing
7	Linear Models	OLS, ridge, lasso, Fama-French, logistic regression
8	The ML4T Workflow – From Model to Strategy Backtesting	backtrader, Zipline Pipeline API

Part 3 — Classical ML (Ch 9–13)

Ch	Title	Key Content
9	Time-Series Models	ARIMA, GARCH, VAR, cointegration, pairs trading
10	Bayesian ML	PyMC3, Bayesian Sharpe ratio, rolling regression
11	Random Forests	Decision trees, RF, long-short Japanese stocks, LightGBM
12	Boosting	GBM, XGBoost, LightGBM, CatBoost, SHAP
13	Unsupervised Learning	PCA, clustering, HRP portfolio

Part 4 — NLP (Ch 14–16)

Ch	Title	Key Content
14	Text Data for Trading – Sentiment Analysis	spaCy, TF-IDF, naive Bayes
15	Topic Modeling	LDA (sklearn + Gensim), earnings call topics
16	Word Embeddings	word2vec, GloVe, doc2vec, BERT intro

Part 5 — Deep Learning (Ch 17–21)

Ch	Title	Key Content
17	Deep Learning for Trading	Feedforward NN, TF2, PyTorch, long-short strategy
18	CNNs	LeNet5, transfer learning, 1D conv for time series
19	RNNs	LSTM, GRU, multivariate time series, SEC filings
20	Autoencoders	VAE, conditional autoencoder for asset pricing
21	GANs	TimeGAN, synthetic financial time series

Part 6 — RL + Conclusions (Ch 22–23)

Ch	Title	Key Content
22	Deep Reinforcement Learning	DDQN, OpenAI Gym, custom TradingEnvironment
23	Conclusions and Next Steps	Key lessons, backtest overfitting, platform comparison

Appendix: 100+ alpha factors in TA-Lib, WorldQuant formulaic alphas

Key Concepts

Concept	Description
IC (Information Coefficient)	Spearman rank correlation between predicted and actual returns
Lookahead Bias	Unintentionally using future data
Deflated Sharpe Ratio	Sharpe ratio adjusted for multiple testing
Purging/Embargoing	Cross-validation technique for time series
HRP	Hierarchical Risk Parity: portfolio construction with clustering

Related

ML4T Platform: 3rd edition ecosystem
TradingView MCP: connects Claude Code to TradingView Desktop
Algorithmic Trading: domain concept

🇬🇧 English

Machine Learning for Algorithmic Trading (2nd Edition) by Stefan Jansen
Published July 2020 | Packt Publishing | 858 pages | 23 chapters + appendix | 400+ notebooks

The 3rd Edition is due in June 2026 as part of the ML4T Platform; it expands to 27 chapters and adds GenAI, causal inference, and production MLOps.

The ML4T Workflow (Central Framework)

Data → Features → ML Model → Signal → Backtest → Portfolio → Live
         ↑                                   ↓
         └───────── learn from results ───────┘

Every chapter applies this workflow to a different ML approach or data type.
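As a rough illustration of the Features → Signal → Backtest steps, the sketch below uses a toy momentum rule. The function names (`momentum_signal`, `backtest`), the rule itself, and the price series are assumptions made for this example, not code from the book. The point it shows is that the position held over the next period uses only the signal known at the start of that period.

```python
import numpy as np

def momentum_signal(prices, lookback=2):
    """Signal step: +1 if price rose over the lookback window, -1 if it fell, 0 otherwise."""
    prices = np.asarray(prices, dtype=float)
    signal = np.zeros(len(prices))
    # Change over the lookback window, available only once enough history exists.
    signal[lookback:] = np.sign(prices[lookback:] - prices[:-lookback])
    return signal

def backtest(prices, signal):
    """Backtest step: the position held from t to t+1 is the signal observed at t."""
    prices = np.asarray(prices, dtype=float)
    returns = prices[1:] / prices[:-1] - 1.0  # return earned from t to t+1
    positions = signal[:-1]                   # decided at t, so no future data is used
    return positions * returns

prices = [100, 101, 102, 103, 102, 101, 100, 99]
sig = momentum_signal(prices, lookback=2)
pnl = backtest(prices, sig)
print(sig)
print(pnl.sum())
```

The feedback loop from results to features is left out here; in practice it would mean revising the signal definition after evaluating `pnl` on data held out from that revision.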

Complete Chapter Structure

Part 1 — Data (Ch 1–5)

Ch	Title	Key Content
1	ML for Trading – From Idea to Execution	ML4T workflow overview, use cases, strategy lifecycle
2	Market and Fundamental Data	Nasdaq ITCH feed, tick→bars (time/volume/dollar), pandas-datareader, XBRL
3	Alternative Data for Finance	Categories (individuals/business/sensors/satellites), evaluation criteria, web scraping
4	Financial Feature Engineering	Alpha factors: momentum, value, volatility, quality; TA-Lib; Kalman filter; Alphalens
5	Portfolio Optimization and Performance Evaluation	Sharpe ratio, mean-variance, Black-Litterman, Kelly criterion, HRP, pyfolio
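The tick→bars step listed for Chapter 2 can be sketched for the dollar-bar case as follows. This is a minimal illustration, assuming ticks arrive as `(price, size)` pairs; the function name `dollar_bars` and the output fields are hypothetical and not taken from the book's notebooks.

```python
def dollar_bars(ticks, threshold):
    """Group (price, size) ticks into bars that each hold at least `threshold` of traded value."""
    bars, current, traded = [], [], 0.0
    for price, size in ticks:
        current.append((price, size))
        traded += price * size
        if traded >= threshold:
            prices = [p for p, _ in current]
            volume = sum(s for _, s in current)
            bars.append({
                "open": prices[0],
                "high": max(prices),
                "low": min(prices),
                "close": prices[-1],
                "volume": volume,
                "vwap": traded / volume,  # value-weighted average price of the bar
            })
            current, traded = [], 0.0
    # Trailing ticks that never reach the threshold are left out of the result.
    return bars

ticks = [(10.0, 100), (10.5, 100), (10.2, 50), (9.8, 200), (10.0, 10)]
bars = dollar_bars(ticks, threshold=2000)
```

Unlike time bars, each bar here closes after a fixed amount of traded value, so busy periods produce more bars than quiet ones.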

Part 2 — ML Foundations (Ch 6–8)

Ch	Title	Key Content
6	The Machine Learning Process	Supervised/unsupervised/RL overview; bias-variance tradeoff; cross-validation; purging/embargoing
7	Linear Models	OLS, ridge, lasso, CAPM→Fama-French factor models, logistic regression, return prediction
8	The ML4T Workflow – From Model to Strategy Backtesting	Backtest pitfalls (lookahead/survivorship/outlier); backtrader; Zipline Pipeline API
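The purging and embargoing idea listed for Chapter 6 can be illustrated with a small index splitter. This sketch assumes that the label of sample `i` is computed from observations `i` through `i + label_horizon`; the function name `purged_kfold` and its parameters are hypothetical and this is not the book's implementation.

```python
def purged_kfold(n_samples, n_splits=5, label_horizon=0, embargo=0):
    """Yield (train_idx, test_idx) with contiguous test folds, purging, and an embargo.

    Assumes the label of sample i depends on observations i..i+label_horizon.
    """
    fold_size = n_samples // n_splits
    for k in range(n_splits):
        start = k * fold_size
        stop = n_samples if k == n_splits - 1 else start + fold_size
        test = list(range(start, stop))
        test_end = stop - 1 + label_horizon  # last observation touched by any test label
        train = []
        for i in range(n_samples):
            if start <= i < stop:
                continue
            # Purge: the training label window [i, i+label_horizon] overlaps [start, test_end].
            overlaps = i <= test_end and i + label_horizon >= start
            # Embargo: also drop a few samples right after the test window.
            embargoed = test_end < i <= test_end + embargo
            if not overlaps and not embargoed:
                train.append(i)
        yield train, test

splits = list(purged_kfold(20, n_splits=4, label_horizon=2, embargo=1))
```

With these settings the second fold tests on samples 5 to 9 and trains on samples 0 to 2 and 13 to 19: samples 3, 4, 10, and 11 are purged because their label windows overlap the test labels, and sample 12 is embargoed.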

Part 3 — Classical ML (Ch 9–13)

Ch	Title	Key Content
9	Time-Series Models	ARIMA, SARIMAX, ARCH/GARCH, VAR, cointegration, pairs trading backtest
10	Bayesian ML	PyMC3, MAP/MCMC/variational inference; Bayesian Sharpe ratio; rolling regression for pairs
11	Random Forests	Decision trees, bagging, RF; long-short Japanese stocks with LightGBM; Alphalens evaluation
12	Boosting Your Trading Strategy	AdaBoost, GBM, XGBoost, LightGBM, CatBoost; SHAP values; intraday strategy
13	Unsupervised Learning for Risk Factors	PCA, ICA, t-SNE, UMAP; k-means, hierarchical, DBSCAN clustering; HRP portfolio
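The PCA entry for Chapter 13 refers to extracting statistical risk factors from a panel of asset returns. A minimal sketch using an eigendecomposition of the sample covariance matrix is shown below; the function name `pca_risk_factors` and the simulated one-factor returns are assumptions for illustration only.

```python
import numpy as np

def pca_risk_factors(returns, n_factors=2):
    """Return factor returns, loadings, and explained-variance shares from a T x N return matrix."""
    returns = np.asarray(returns, dtype=float)
    demeaned = returns - returns.mean(axis=0)
    cov = np.cov(demeaned, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]     # keep the largest ones
    loadings = eigvecs[:, order]
    factor_returns = demeaned @ loadings
    explained = eigvals[order] / eigvals.sum()
    return factor_returns, loadings, explained

# Simulated data: three assets driven by one common "market" factor plus small noise.
rng = np.random.default_rng(0)
market = rng.normal(0, 0.01, 250)
returns = market[:, None] * np.array([1.0, 0.8, 1.2]) + rng.normal(0, 0.001, (250, 3))
factor_returns, loadings, explained = pca_risk_factors(returns, n_factors=2)
```

Because the simulated assets share a single driver, the first component should account for most of the variance; real return panels are usually far less concentrated.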

Part 4 — NLP (Ch 14–16)

Ch	Title	Key Content
14	Text Data for Trading – Sentiment Analysis	NLP pipeline (spaCy, TextBlob); TF-IDF; naive Bayes on news and Yelp data
15	Topic Modeling	LSI, pLSA, LDA (sklearn + Gensim); earnings call topic modeling
16	Word Embeddings	word2vec, GloVe, doc2vec; BERT/transformer intro; SEC filings for return prediction

Part 5 — Deep Learning (Ch 17–21)

Ch	Title	Key Content
17	Deep Learning for Trading	Feedforward NN, activation functions, dropout, SGD/Adam; TF2 and PyTorch; long-short strategy
18	CNNs for Financial Time Series	LeNet5, AlexNet, VGG16 transfer learning; 1D convolutions; CNN-TA clustering
19	RNNs for Multivariate Time Series	LSTM, GRU, bidirectional RNN; S&P500 regression; multivariate macro; SEC filing sentiment
20	Autoencoders	Feedforward/conv/denoising autoencoders; VAE; conditional autoencoder for asset pricing
21	GANs for Synthetic Time-Series Data	DCGAN, conditional GAN; TimeGAN (train on synthetic, test on real)

Part 6 — RL + Conclusions (Ch 22–23)

Ch	Title	Key Content
22	Deep Reinforcement Learning	MDP, value iteration, Q-learning, DDQN; OpenAI Gym; custom TradingEnvironment
23	Conclusions and Next Steps	Key lessons: data quality, bias-variance, backtest overfitting, platform comparison
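Backtest overfitting, listed for Chapter 23, is what the Deflated Sharpe Ratio of Bailey and López de Prado is meant to guard against. The sketch below follows the published formula as I understand it: the observed Sharpe ratio is compared with the expected maximum Sharpe ratio across a number of unskilled trials, and the result is a probability rather than a ratio. The function names are hypothetical, the Sharpe ratio is assumed to be per period (not annualized), and `sharpe_variance` is the variance of Sharpe ratios across the trials.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329

def expected_max_sharpe(n_trials, sharpe_variance):
    """Approximate expected maximum Sharpe ratio among n_trials (>= 2) strategies with no skill."""
    z = NormalDist()
    return math.sqrt(sharpe_variance) * (
        (1 - EULER_GAMMA) * z.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * z.inv_cdf(1 - 1 / (n_trials * math.e))
    )

def deflated_sharpe_ratio(sharpe, n_obs, n_trials, sharpe_variance, skew=0.0, kurtosis=3.0):
    """Probability that the true Sharpe ratio exceeds the multiple-testing benchmark.

    `kurtosis` is the raw (not excess) kurtosis of returns, so 3.0 corresponds to normal returns.
    """
    benchmark = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1 - skew * sharpe + (kurtosis - 1) / 4 * sharpe ** 2)
    return NormalDist().cdf((sharpe - benchmark) * math.sqrt(n_obs - 1) / denom)

few_trials = deflated_sharpe_ratio(0.2, n_obs=1000, n_trials=10, sharpe_variance=0.01)
many_trials = deflated_sharpe_ratio(0.2, n_obs=1000, n_trials=100, sharpe_variance=0.01)
```

The same observed Sharpe ratio earns a lower score when it was selected from 100 trials than from 10, which is the adjustment for multiple testing.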

Appendix — Alpha Factor Library: 100+ factors in TA-Lib (moving averages, momentum, volume, volatility) + WorldQuant formulaic alphas (Alpha001, Alpha054).

Key Concepts Across the Book

Concept	Description
IC (Information Coefficient)	Spearman rank correlation between predicted and actual returns — the primary signal quality metric
Lookahead Bias	Accidentally using future information in features — causes unrealistically good backtests
Deflated Sharpe Ratio	Sharpe ratio adjusted for multiple testing — guards against backtest overfitting
Purging/Embargoing	Cross-validation technique for time series — prevents leakage between train and test
Alpha Factor	A signal expected to predict returns before being arbitraged away
HRP	Hierarchical Risk Parity — portfolio construction using clustering instead of matrix inversion
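The Information Coefficient defined in the table above can be computed by ranking both series and taking the Pearson correlation of the ranks. The sketch below is a minimal version that assumes there are no tied values; the function name `information_coefficient` and the sample numbers are made up for illustration.

```python
import numpy as np

def information_coefficient(predicted, realized):
    """Spearman rank correlation between predicted and realized returns (assumes no ties)."""
    pred_rank = np.argsort(np.argsort(predicted))  # rank of each prediction, 0 = smallest
    real_rank = np.argsort(np.argsort(realized))   # rank of each realized return
    return float(np.corrcoef(pred_rank, real_rank)[0, 1])

predicted = [0.02, -0.01, 0.03, 0.00, -0.02]
realized = [0.01, 0.005, 0.015, -0.02, -0.03]
ic = information_coefficient(predicted, realized)
```

Because only ranks matter, the IC rewards getting the ordering of assets right and ignores the scale of the forecasts. With tied values, a rank function that averages ties would be needed instead of the double `argsort`.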

Main Tools Used

Data & Features: pandas, NumPy, TA-Lib, Quandl, yfinance, Zipline bundles
ML: scikit-learn, statsmodels, PyMC3, LightGBM, XGBoost, CatBoost
Deep Learning: TensorFlow 2, PyTorch, Keras
Backtesting: backtrader, Zipline, Alphalens, pyfolio
NLP: spaCy, Gensim, TextBlob
