NeurIPS 2026 | Under Review | 2026

Density Matrix MDPs: Structured Probabilistic State Representations for Reinforcement Learning under Demand Uncertainty

Abstract

This paper introduces density matrix Markov decision processes (DM-MDPs), which use structured probabilistic state representations for reinforcement learning under demand uncertainty. It derives theoretical convergence and sample complexity guarantees.

Methodology

01 Formalized density matrix state representations for MDPs
02 Derived convergence and sample complexity bounds
03 Benchmarked on retail demand datasets against standard MDP baselines
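
As a rough illustration of the first step, the sketch below builds a density matrix from a probabilistic mixture of state vectors and checks the standard validity conditions (Hermitian, unit trace, positive semidefinite). It is a generic textbook construction, not the paper's formalization: the function names, the two-level "low/high demand" basis, and the mixture weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def density_matrix(probs, states):
    """Build rho = sum_i p_i |psi_i><psi_i| from a mixture of state vectors."""
    probs = np.asarray(probs, dtype=float)
    states = np.asarray(states, dtype=complex)
    # Normalize each state vector so every outer product has unit trace.
    states = states / np.linalg.norm(states, axis=1, keepdims=True)
    # Sum p_i * psi_i psi_i^dagger over the mixture index i.
    return np.einsum("i,ij,ik->jk", probs, states, states.conj())

def is_valid_density_matrix(rho, tol=1e-9):
    """Check the Hermitian, unit-trace, and positive-semidefinite conditions."""
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    unit_trace = abs(np.trace(rho).real - 1.0) < tol
    psd = np.all(np.linalg.eigvalsh(rho) > -tol)
    return bool(hermitian and unit_trace and psd)

# Hypothetical example: two demand levels used as basis states.
low = [1.0, 0.0]
high = [0.0, 1.0]
mixed = [1.0, 1.0]  # equal-weight combination, normalized inside density_matrix

# Assumed mixture weights; they must sum to 1 for rho to have unit trace.
rho = density_matrix([0.5, 0.3, 0.2], [low, high, mixed])

# The diagonal gives the probability of observing each demand level.
demand_probs = np.real(np.diag(rho))
```

Under these assumptions the diagonal of rho gives the marginal probability of each demand level, while the off-diagonal entries carry correlation structure that a plain probability vector over states cannot express.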

Tools Used

Python, JAX, NumPy, Qiskit, LaTeX
