Machine Learning Backtesting: How SignalScope Gets Smarter Over Time

7 min read · machine-learning · backtesting · methodology

Most signal detection tools score signals once and move on. The scoring rules stay static until someone manually adjusts them. SignalScope takes a different approach: every signal's real-world outcome is tracked, measured, and fed back into a machine learning model that continuously refines how signals are scored, filtered, and staged. The platform gets smarter with every scan.

The feedback loop

The backtesting pipeline follows five steps: price snapshots, return computation, feature engineering, model training, and threshold optimization. Twice daily — at market open and close — automated price snapshots capture the current price of every validated ticker. Returns are then computed at 1, 3, 7, and 30 days after detection, building a growing time-series for each signal. Tolerance windows handle weekends and holidays: the 1-day return uses an 18-48 hour window, the 3-day return uses 54-120 hours, and so on. The system always picks the snapshot closest to the target time within these windows.
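The snapshot-selection rule above can be sketched in a few lines. The window bounds come from the text; the data shapes and the function name are assumptions:

```python
from datetime import datetime, timedelta

# Tolerance windows from the pipeline: horizon -> (nominal hours, min hours, max hours).
# The 7- and 30-day windows would follow the same pattern.
WINDOWS = {
    "1d": (24, 18, 48),
    "3d": (72, 54, 120),
}

def closest_snapshot(detected_at, snapshots, horizon):
    """Return the (timestamp, price) snapshot closest to the nominal target
    time, restricted to the horizon's tolerance window. Returns None when no
    snapshot lands inside the window (e.g. an extended market holiday)."""
    nominal, lo, hi = WINDOWS[horizon]
    target = detected_at + timedelta(hours=nominal)
    in_window = [
        (ts, px) for ts, px in snapshots
        if lo <= (ts - detected_at).total_seconds() / 3600 <= hi
    ]
    if not in_window:
        return None
    return min(in_window, key=lambda s: abs((s[0] - target).total_seconds()))
```

For a signal detected Friday afternoon, the first usable snapshot may fall 40+ hours later; the wide window keeps that return computable instead of dropping it.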

Feature engineering

Each signal in the dataset carries dozens of features: the number and type of sources, source weights, AI score, opportunity score, signal stage, P&D flag count, market cap, price, volume metrics, 52-week range position, sector, signal freshness, velocity, novelty indicators, and more. These features capture both the signal characteristics at detection time and the market context. The dataset grows with every scan, giving the model more examples to learn from.
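As a sketch, one signal record might be flattened into a fixed-order numeric row like this. The field names and sector list are illustrative assumptions; the real schema isn't shown in the article:

```python
# Illustrative subset of the features listed above; the real pipeline
# carries dozens of columns.
SECTORS = ["tech", "biotech", "energy", "other"]

def featurize(signal: dict) -> list:
    """Map a signal dict to a fixed-order numeric feature vector."""
    row = [
        float(signal["source_count"]),
        float(signal["source_weight_sum"]),
        float(signal["ai_score"]),
        float(signal["opportunity_score"]),
        float(signal["pd_flag_count"]),
        float(signal["market_cap"]),
        float(signal["price"]),
        float(signal["range_position_52w"]),   # 0 = at 52-week low, 1 = at 52-week high
        float(signal["hours_since_detection"]),  # freshness
    ]
    # One-hot encode the sector so tree models can split on it cleanly.
    row += [1.0 if signal["sector"] == s else 0.0 for s in SECTORS]
    return row
```

Keeping the column order fixed matters: the trained model and later SHAP attributions both index features by position.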

XGBoost gradient boosting

The ML model uses XGBoost, a gradient boosted decision tree algorithm widely used in quantitative finance. XGBoost excels at finding non-linear relationships between features and outcomes — for example, that signals with a specific combination of source types, price ranges, and social velocity tend to outperform. The model is retrained periodically as the dataset grows, using standard train/test splits to validate that improvements generalize rather than overfit.
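The retraining step looks roughly like the following. This is a dependency-light sketch that uses scikit-learn's gradient boosting as a stand-in for XGBoost (same boosted-tree family, different library), a synthetic dataset in place of the real one, and an assumed binary label of "positive 7-day return"; the hyperparameters are illustrative:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the backtest dataset: one feature row per signal,
# binary label = 1 if the (simulated) 7-day return was positive.
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Hold out a test split so improvements must generalize rather than overfit.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GradientBoostingClassifier(
    n_estimators=200, max_depth=3, learning_rate=0.05
)
model.fit(X_train, y_train)

# Held-out AUC is the generalization check: retrains are only promoted when
# the new model beats the old one on unseen signals.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
```

Swapping in `xgboost.XGBClassifier` keeps the same fit/predict shape; the validation discipline (score on held-out signals, not training ones) is the part that carries over regardless of library.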

SHAP analysis for interpretability

Raw model predictions are useful, but understanding why the model makes certain predictions is essential for improving the signal pipeline. SHAP (SHapley Additive exPlanations) provides a principled way to attribute each prediction to individual features. This reveals which factors are actually driving accuracy. For example, SHAP analysis might show that the combination of SEC insider purchases and volume spikes is a much stronger predictor than AI score alone. These insights directly inform which thresholds to adjust, which flags to add or remove, and how to weight different source types.
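To make the attribution idea concrete, here is an exact Shapley computation on a toy scoring function with an interaction term; SHAP's TreeExplainer computes the same quantities efficiently for real tree ensembles. The scoring function, feature names, and baseline are illustrative assumptions, not the production model:

```python
from itertools import combinations
from math import factorial

def score(x):
    """Toy model: insider buys and volume spikes only pay off together;
    ai_score contributes linearly."""
    return 2.0 * x["insider_buy"] * x["volume_spike"] + 0.5 * x["ai_score"]

def shapley(x, baseline, f):
    """Exact Shapley values: phi[i] = sum over subsets S (i not in S) of
    |S|!(n-|S|-1)!/n! * (f(S + {i}) - f(S)), with features outside S held
    at their baseline values. Exponential cost, fine for a toy example."""
    names = list(x)
    n = len(names)

    def eval_subset(S):
        z = {k: (x[k] if k in S else baseline[k]) for k in names}
        return f(z)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += w * (eval_subset(set(S) | {i}) - eval_subset(set(S)))
        phi[i] = total
    return phi
```

With `x = {"insider_buy": 1, "volume_spike": 1, "ai_score": 0.8}` and a zero baseline, the interaction's 2.0 contribution splits evenly between the two interacting features, while `ai_score` keeps its linear 0.4; the attributions sum exactly to `score(x) - score(baseline)`. That additivity is what makes SHAP values directly comparable across features.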

Closing the loop

The insights from XGBoost and SHAP flow back into the signal pipeline as concrete optimizations: adjusting AI score thresholds for stage assignments, refining P&D flag logic, reweighting sources, and tuning novelty bonuses. Each optimization is tracked in an experiment log with commit hashes, performance metrics, and descriptions of what changed. Over time, this creates a record of which changes moved the needle and which were noise — informing future iterations of the model and the pipeline.
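One of those optimizations, adjusting an AI-score threshold, can be sketched as a sweep over candidate cutoffs scored against backtested outcomes. The field names, the hit-rate objective, and the minimum-sample guard are assumptions:

```python
def best_threshold(signals, candidates, min_samples=5):
    """signals: list of (ai_score, return_7d) pairs from the backtest.
    Returns the (threshold, hit_rate) that maximizes the share of positive
    7-day returns among signals at or above the cutoff."""
    best = (None, -1.0)
    for t in candidates:
        selected = [r for s, r in signals if s >= t]
        if len(selected) < min_samples:
            continue  # too few signals above this cutoff to trust the estimate
        hit_rate = sum(r > 0 for r in selected) / len(selected)
        if hit_rate > best[1]:
            best = (t, hit_rate)
    return best
```

A chosen cutoff would then be recorded in the experiment log alongside the commit hash and the hit rate it achieved, so a later regression can be traced back to the change that caused it.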