Financial decision‐making using reinforcement learning with Dirichlet priors and quantum‐inspired genetic optimization
Debjit Dhar, Prasun Nandy, Rik Das
Abstract
We introduce a hybrid reinforcement learning (RL) framework for dynamic budget allocation that combines Dirichlet‐inspired stochasticity with quantum‐mutation genetic refinement. Trained on Apple Inc.'s quarterly financials (2009–2025), the RL agent learns to allocate budgets between R&D and SG&A to maximize profitability, while penalties enforce adherence to historical spending patterns and prevent unrealistic deviations. A Dirichlet state‐evolution model shifts financial contexts, and a genetic algorithm augmented by quantum mutation, implemented as parameterized qubit‐rotation circuits, refines the policy to escape local minima and improve generalization. Generation‐wise rewards and penalties are logged to visualize convergence and policy behavior. On held‐out fiscal data, the proposed approach achieves near‐perfect alignment with actual allocations, as measured by cosine similarity and KL divergence, underscoring the promise of combining deep RL, stochastic modeling, and quantum‐inspired heuristics for adaptive enterprise budgeting.
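As an illustration of the two alignment measures named in the abstract, the following minimal Python sketch compares a predicted allocation vector with an actual one. The function names, the `eps` smoothing term, and the example R&D/SG&A shares are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
import math

def cosine_similarity(p, q):
    """Cosine of the angle between two allocation vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions that each sum to 1.

    eps is an assumed smoothing constant that guards against log(0).
    """
    return sum(a * math.log((a + eps) / (b + eps))
               for a, b in zip(p, q) if a > 0)

# Hypothetical budget shares (R&D, SG&A); not values from the paper.
actual = [0.30, 0.70]
predicted = [0.32, 0.68]

print("cosine similarity:", cosine_similarity(actual, predicted))
print("KL divergence:", kl_divergence(actual, predicted))
```

A cosine similarity close to 1 together with a KL divergence close to 0 indicates that the predicted split closely tracks the actual one, which is the sense of "alignment" used here.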