Auto-Tagging and Temporal Refresh Using Reinforcement Learning-Based Meta-Learning Frameworks for Real-Time Fraud Detection
PayPal
Detecting fraud is challenging not only because of systemic changes in fraud snapshots, but also because of temporal changes stemming from incremental product releases, feature creep, and variations in starkly heterogeneous traffic from buyers and sellers transacting over global ecommerce. These shifts undermine the temporal stability of machine learning models in an ever-evolving ecosystem, making a strong case for periodic model retraining. However, such a refresh comes with its own set of subjective considerations, such as the frequency, extent, and nature of the refresh. Furthermore, the source of truth for the outcome of an exchange becomes available only after a delay, as indicators of fraudulent activity become quantifiable following notifications from various stakeholders during the different stages of payment transaction processing. We'll present an end-to-end framework for auto-tagging and temporal retraining built on meta-learning, and provide an outline of results using metric-, model-, and optimization-based meta-learning approaches. We'll present different variations of policy gradient approaches, coupled with trust region optimization methods, along with a quantitative comparison of the efficacy and temporal robustness of each method. Last, we'll address infrastructure challenges and associated solutions that use GPU hardware for training and real-time inference.