AI & Machine Learning

Models that think. Markets that respond.

Random Forest classifiers. LSTM networks. Hidden Markov regime detection. XGBoost ensembles. Built in Python, tested on real data, deployed for an edge.

73%
RF Test Accuracy
10K+
GBM Sim Paths
50+
Model Features
2.1x
Backtested Sharpe
20+
Custom Indicators
The Process

Every model starts with the same pipeline.

Before writing a single line of model code, the data has to be clean, the features engineered, the train/test splits done correctly. Walk-forward validation prevents data leakage — the mistake most amateur quants make.

01
Raw Data
OHLCV, on-chain metrics, macro data, sentiment feeds
02
Feature Engineering
50+ technical + macro features, lag transforms, normalisation
03
Walk-Forward Split
Time-series CV — no future leakage, rolling train window
04
Train & Tune
GridSearchCV / Optuna hyperparameter optimisation
05
Evaluate
Accuracy, F1, Sharpe, max drawdown — not just accuracy
06
Backtest
Backtrader / vectorbt with transaction costs and slippage
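The first two stages of the pipeline can be sketched in a few lines. This is a minimal, hypothetical version assuming a pandas DataFrame `df` with `close` and `volume` columns; the real feature set runs to 50+ columns.

```python
import numpy as np
import pandas as pd

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Minimal feature-engineering sketch: RSI, volume z-score, lag transforms."""
    out = pd.DataFrame(index=df.index)

    # RSI-14, with Wilder smoothing approximated by an EMA
    delta = df["close"].diff()
    gain = delta.clip(lower=0).ewm(alpha=1 / 14).mean()
    loss = (-delta.clip(upper=0)).ewm(alpha=1 / 14).mean()
    out["rsi_14"] = 100 - 100 / (1 + gain / loss)

    # Rolling volume z-score over a 50-bar window
    vol = df["volume"]
    out["volume_zscore"] = (vol - vol.rolling(50).mean()) / vol.rolling(50).std()

    # Lagged log returns
    ret = np.log(df["close"]).diff()
    for lag in (1, 2, 4):
        out[f"ret_lag_{lag}"] = ret.shift(lag)

    # Column-wise z-score normalisation (fit on the train window in practice)
    return (out - out.mean()) / out.std()
```

In production the normalisation statistics would be computed on the training window only, then reused on the test window, to respect the walk-forward split in step 03.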

Random Forest
Classifier

Ensemble of 500 decision trees trained on 50+ features — RSI divergence, volume profile anomalies, order flow imbalance, VIX regime, PCE delta, funding rates, on-chain NVT. Each tree votes on directional probability. The majority wins — and the feature importances reveal exactly what the market is responding to most.

73%
Accuracy
Test Set
1.42
Sharpe
500
Trees
0.71
F1 Score
−12%
Max DD
random_forest_classifier.py — scikit-learn · 500 estimators · BTC/USD 4H
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import classification_report

# Feature engineering — 50+ signals
features = ['rsi_14','rsi_divergence','macd_hist',
  'volume_zscore','obv_slope','vwap_dev',
  'funding_rate','oi_change','liq_score',
  'vix_regime','pce_delta','nvt_ratio']

# Walk-forward cross-validation
tscv = TimeSeriesSplit(n_splits=5)
rf = RandomForestClassifier(
  n_estimators=500,
  max_depth=8,
  min_samples_leaf=20,
  class_weight='balanced',
  random_state=42
)
for train, test in tscv.split(X):
  rf.fit(X[train], y[train])
  preds = rf.predict(X[test])
  print(classification_report(y[test], preds))

# Probability outputs for position sizing
proba = rf.predict_proba(X_live)
signal = proba[0][1] # P(bullish)
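The ranked importances come straight from the fitted forest's `feature_importances_` attribute. A self-contained toy sketch (data and feature names here are synthetic, for illustration only):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real feature matrix
rng = np.random.default_rng(42)
features = ["rsi_14", "volume_zscore", "funding_rate", "vix_regime"]
X = rng.normal(size=(500, len(features)))
# Label driven mostly by volume_zscore, partly by rsi_14, plus noise
y = (X[:, 1] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Gini importances, sorted descending
importances = pd.Series(rf.feature_importances_, index=features)
print(importances.sort_values(ascending=False))
```

Importances sum to 1 across all features, so each value reads as that feature's share of the forest's total impurity reduction.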
Feature Importances — Top 10

volume_zscore    0.14
liq_score        0.12
rsi_divergence   0.11
oi_change        0.10
funding_rate     0.09
vwap_dev         0.08
macd_hist        0.07
vix_regime       0.06
nvt_ratio        0.05
pce_delta        0.04
Confusion Matrix + Classification Report

                  Predicted BULL   Predicted BEAR
Actual BULL          847 (TP)         198 (FN)
Actual BEAR          312 (FP)         643 (TN)

              precision   recall   f1
BULL (1)         0.73      0.81   0.77
BEAR (0)         0.76      0.67   0.71
─────────────────────────────
accuracy         0.73
macro avg        0.75      0.74   0.74
─────────────────────────────
OOB score: 0.711
Sharpe ratio: 1.42
Max drawdown: −12.3%
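The Sharpe and drawdown figures come from the strategy's return series, not from classification metrics. A minimal sketch of both, assuming a numpy array of per-bar strategy returns; the annualisation factor of 2190 assumes 4H bars.

```python
import numpy as np

def sharpe_ratio(returns: np.ndarray, periods_per_year: int = 2190) -> float:
    """Annualised Sharpe from per-bar returns (2190 is roughly 4H bars per year)."""
    return np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1)

def max_drawdown(returns: np.ndarray) -> float:
    """Worst peak-to-trough decline of the compounded equity curve (negative number)."""
    equity = np.cumprod(1 + returns)
    peak = np.maximum.accumulate(equity)
    return ((equity - peak) / peak).min()
```

Evaluating on risk-adjusted return and drawdown, not accuracy alone, is what step 05 of the pipeline refers to: a 73%-accurate model can still lose money if its errors cluster in the worst regimes.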

LSTM Price Forecasting Network

Long Short-Term Memory networks are designed for exactly this — sequential data where what happened 30 bars ago still matters. With attention mechanisms layered on top, the model learns which timesteps in the lookback window carry the most signal. Walk-forward validation across BTC, ETH and SPY. Mean Absolute Percentage Error of 3.8%.
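Before the network sees anything, the series has to be sliced into overlapping lookback windows matching the (60, 12) input shape. A minimal sketch, with illustrative data:

```python
import numpy as np

def make_windows(features: np.ndarray, target: np.ndarray, lookback: int = 60):
    """Slice a (T, n_features) matrix into (T - lookback, lookback, n_features)
    windows, each paired with the target value at the step that follows it."""
    X, y = [], []
    for t in range(lookback, len(features)):
        X.append(features[t - lookback:t])
        y.append(target[t])
    return np.array(X), np.array(y)

# 1000 bars of 12 features, windowed for the LSTM input layer
X, y = make_windows(np.random.rand(1000, 12), np.random.rand(1000))
print(X.shape)  # (940, 60, 12)
```

Each window ends one step before its target, so the model only ever sees information available at prediction time.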

LSTM Architecture — Sequence to Prediction
Input (60 bars × 12 features) → LSTM, 128 units (return_seq=True) → Dropout 0.2 → LSTM, 64 units (return_seq=False) → Dropout 0.2 → Bahdanau attention → Dense (ReLU) → forecast of the t+1 price.
lstm_forecast.py — TensorFlow/Keras · Attention · Walk-Forward CV · BTC+ETH+SPY
lstm_model.py — Keras Sequential
import numpy as np
from tensorflow.keras import Sequential, layers
from tensorflow.keras.callbacks import EarlyStopping

model = Sequential([
  layers.LSTM(128, return_sequences=True,
    input_shape=(60,12)),
  layers.Dropout(0.2),
  layers.LSTM(64, return_sequences=False),
  layers.Dropout(0.2),
  layers.Dense(32, activation='relu'),
  layers.Dense(1)
])

model.compile(optimizer='adam',
  loss='huber', # robust to outliers
  metrics=['mae'])

es = EarlyStopping(monitor='val_loss',
  patience=15, restore_best_weights=True)

history = model.fit(X_train, y_train,
  epochs=200, batch_size=32,
  validation_split=0.15,
  callbacks=[es])

# MAPE evaluation on held-out data
y_pred = model.predict(X_test).ravel()
mape = np.mean(np.abs((y_test - y_pred) / y_test)) * 100
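The Sequential snippet above leaves out the attention stage shown in the architecture diagram. One way to wire it in is the Keras functional API; this sketch uses Keras's built-in dot-product `Attention` layer as a simplified stand-in for full Bahdanau attention, and keeps the second LSTM's sequence output so attention has timesteps to weight.

```python
from tensorflow.keras import Model, layers

inp = layers.Input(shape=(60, 12))
x = layers.LSTM(128, return_sequences=True)(inp)
x = layers.Dropout(0.2)(x)
x = layers.LSTM(64, return_sequences=True)(x)   # keep the sequence for attention
x = layers.Dropout(0.2)(x)

# Dot-product self-attention over the 60 timesteps, then pool to one vector
att = layers.Attention()([x, x])
x = layers.GlobalAveragePooling1D()(att)

x = layers.Dense(32, activation="relu")(x)
out = layers.Dense(1)(x)

model = Model(inp, out)
model.compile(optimizer="adam", loss="huber", metrics=["mae"])
```

The attention weights can also be read back out at inference time to see which bars in the lookback window drove a given forecast.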
Training & Validation Loss — 85 epochs, early stop (loss falling from ~0.08 to ~0.01)
Predicted vs Actual — BTC/USD out-of-sample: MAPE 3.8% · RMSE $1,847 · R² 0.94
Models 03 & 04

XGBoost & Hidden Markov Regime Detection

XGBoost for gradient-boosted directional classification. Hidden Markov Models to detect which regime the market is actually in — trending, mean-reverting, or high-vol breakdown. The two work together: HMM decides which strategy to deploy, XGBoost executes the signals.

XGBoost Classifier
Gradient Boosted Trees · SHAP Explainability
Ensemble
from xgboost import XGBClassifier
import shap

xgb = XGBClassifier(
  n_estimators=400, max_depth=5,
  learning_rate=0.05, subsample=0.8,
  colsample_bytree=0.8,
  eval_metric='logloss'
)
xgb.fit(X_train, y_train)

# SHAP attribution on the fitted model
explainer = shap.TreeExplainer(xgb)
shap_vals = explainer.shap_values(X_test)
SHAP Feature Impact — XGBoost signal attribution (AUC-ROC 0.81)
volume_zscore +0.068 · liq_score +0.058 · funding_rate +0.048 · rsi_divergence +0.038 · vix_regime −0.024 · pce_delta −0.019
0.81
AUC-ROC
400
Estimators
SHAP
Explainable
Hidden Markov Model
Market Regime Detection · 3-State HMM · hmmlearn
Regime
from hmmlearn.hmm import GaussianHMM

# 3 hidden states: Bull / Bear / High-Vol
hmm = GaussianHMM(
  n_components=3,
  covariance_type='full',
  n_iter=200, tol=1e-5
)
obs = np.column_stack([
  log_returns, realised_vol, volume_delta
])
hmm.fit(obs)
regimes = hmm.predict(obs)

# Route to correct strategy
strategy = {0: momentum,
            1: mean_revert,
            2: defensive}[regimes[-1]]
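The fitted model exposes its transition matrix directly as `hmm.transmat_`. The same quantity can be estimated empirically from any decoded state sequence, which is a useful sanity check; a numpy-only sketch with a toy label sequence:

```python
import numpy as np

def empirical_transmat(states: np.ndarray, n_states: int = 3) -> np.ndarray:
    """Row-stochastic matrix of P(state_{t+1} | state_t) counted from a label sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# Toy decoded sequence: 0 = Bull, 1 = High-Vol, 2 = Bear
states = np.array([0, 0, 0, 1, 1, 2, 2, 2, 0, 0])
print(empirical_transmat(states).round(2))
```

Each row sums to 1; large diagonal entries mean regimes are persistent, which is what makes routing strategy by regime worthwhile.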
Regime classification — BTC/USD 2022–2024 (simulated)
States: Bull Trend · High Vol · Bear / Crash

Transition Matrix — P(state t+1 | state t)

           Bull   HighV   Bear
Bull  →    0.82   0.14    0.04
HighV →    0.38   0.41    0.21
Bear  →    0.09   0.22    0.69
3
States
HMM
Model Type
BIC
State Selection
Backtesting

Strategy performance on historical data.

All strategies backtested using Backtrader and vectorbt with realistic assumptions — 0.1% transaction costs, 2% slippage on large positions, no look-ahead bias. Walk-forward out-of-sample results only. Past performance does not guarantee future results.
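The cost model can be made concrete in a few lines of numpy: a long/flat signal acts on the next bar (no look-ahead), and a fee is charged on every position change. The real backtests run through Backtrader and vectorbt; this is a simplified illustration.

```python
import numpy as np

def backtest(prices: np.ndarray, signal: np.ndarray, fee: float = 0.001) -> np.ndarray:
    """Equity curve for a long/flat signal (1 = long, 0 = flat).
    The signal at bar t is applied to the t -> t+1 return, so there is no
    look-ahead; each change of position pays the transaction fee."""
    rets = np.diff(prices) / prices[:-1]
    pos = signal[:-1]                                   # position held over each interval
    trades = np.abs(np.diff(np.concatenate([[0], pos])))  # 1 wherever position changes
    strat = pos * rets - trades * fee
    return np.cumprod(1 + strat)
```

Slippage would enter the same way as an extra per-trade cost, scaled by position size; omitting costs entirely is the fastest way to turn a losing strategy into a "winning" backtest.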

+142%
RF Strategy Return
4-Year Backtest
2.1×
Sharpe Ratio
HMM Hybrid
−18%
Max Drawdown
Worst Period
58%
Win Rate
All Signals
Equity Curve — RF + HMM Hybrid Strategy
BTC/USD · 4H · 2021–2024 · $10,000 initial capital · out-of-sample
Strategy +142% vs Buy & Hold +88% over 2021–2024, with two marked drawdown periods.
Custom Indicator Library

20+ indicators beyond the standard library.

TA-Lib covers the basics. These are the ones that don't come pre-packaged — built from first principles and tested for alpha generation.

Adaptive ATR Bands
Dynamic volatility bands that widen in high-vol regimes and tighten during consolidation — avoids the fixed-multiplier flaw of Bollinger Bands.
Liquidation Cluster Proximity
Scores how close current price is to major open interest clusters — elevated scores signal mean reversion or breakout acceleration risk.
Multi-Timeframe Confluence
Scores alignment of signals across 1H/4H/1D timeframes — only high-confluence setups pass the threshold. Reduces noise dramatically.
On-Chain NVT Integration
Network Value to Transactions ratio — adapted as a real-time feature fed into RF and XGBoost models. Anchors speculation to network utility.
Volume Profile Anomaly
Detects abnormal volume distribution using Z-score on VWAP deviation — flags institutional accumulation or distribution in progress.
SOPR Momentum Signal
Spent Output Profit Ratio — on-chain indicator adapted into a momentum signal. SOPR crossing 1.0 from below is historically a high-conviction long signal.
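As an illustration, the first indicator above might look like the sketch below. The parameter choices here (multiplier range 1.5 to 3.0, 100-bar percentile lookback) are assumptions for demonstration, not the production values.

```python
import numpy as np
import pandas as pd

def adaptive_atr_bands(df: pd.DataFrame, period: int = 14, lookback: int = 100):
    """Midline +/- ATR * multiplier, where the multiplier expands as current
    ATR climbs through its own percentile rank over the lookback window."""
    prev_close = df["close"].shift(1)
    true_range = pd.concat([
        df["high"] - df["low"],
        (df["high"] - prev_close).abs(),
        (df["low"] - prev_close).abs(),
    ], axis=1).max(axis=1)
    atr = true_range.ewm(alpha=1 / period).mean()

    # Multiplier between 1.5 and 3.0, driven by where vol sits in its history
    pct = atr.rolling(lookback).rank(pct=True)
    mult = 1.5 + 1.5 * pct

    mid = df["close"].rolling(period).mean()
    return mid - mult * atr, mid + mult * atr
```

Because the multiplier is a function of realised volatility rather than a fixed constant, the bands widen in high-vol regimes and tighten in consolidation, which is exactly the fixed-multiplier flaw the card above describes.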
Python Stack
scikit-learn TensorFlow Keras XGBoost hmmlearn pandas / NumPy SHAP Backtrader vectorbt Optuna TA-Lib statsmodels