Abstract
This study presents an explainable self-attention framework for predicting individual financial vulnerability from large-scale longitudinal banking data. The model achieves strong discriminative performance (AUC-ROC ≈ 0.90) while prioritising transparency through outcome-aware feature attribution analysis across all prediction classes. Our results show that high-risk classifications are driven by asymmetric “risk activation” triggered by financial and employment volatility, specifically income variability and payment irregularity, rather than by static demographic factors. Analysis of misclassifications indicates that errors stem from structural limitations in how the model aggregates evidence rather than from random noise. By linking predictive outcomes to the underlying feature-level logic, this work provides a scalable approach to explainable automated decision-making in high-stakes financial risk assessment.