PolicyWindow

The number of consecutive steps of observations and actions over which to train the policy.
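
To make the idea of a window concrete, the following is a minimal sketch in plain Python, not the library's implementation. It assumes that a window is a contiguous slice of paired observations and actions from a single trajectory, and that windows slide forward one step at a time. The helper name `split_into_windows` and its parameters are hypothetical, and the source does not say whether windows overlap or how trajectories shorter than the window are handled.

```python
def split_into_windows(observations, actions, policy_window):
    """Return every contiguous run of `policy_window` (observation, action) pairs.

    Hypothetical helper for illustration only; the stride of 1 is an assumption.
    """
    if policy_window < 1:
        raise ValueError("policy_window must be a positive integer")
    if len(observations) != len(actions):
        raise ValueError("observations and actions must have the same length")

    # Pair each observation with the action taken at the same step.
    steps = list(zip(observations, actions))

    # Slide a fixed-length window over the trajectory one step at a time.
    # A trajectory shorter than the window produces no windows at all.
    return [
        steps[start:start + policy_window]
        for start in range(len(steps) - policy_window + 1)
    ]


# Example: a 5-step trajectory with a window of 3 consecutive steps.
windows = split_into_windows([0, 1, 2, 3, 4], ["a", "b", "c", "d", "e"], 3)
for window in windows:
    print(window)
```

Under these assumptions, a larger window gives the policy more consecutive context per training example but yields fewer windows from a trajectory of the same length.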