mirror of
https://github.com/wassname/ray.git
synced 2026-06-27 23:08:32 +08:00
6bb1103930
## What do these changes do? Previously we logged a warning if the PPO configuration would waste many samples. However, this didn't apply in the case of long episodes in `complete_episodes` batch mode, and also the amount of waste is up to 2x in common cases. This pr: - Estimates the number of sampling tasks needed to avoid over-sampling. - Collects all sample results and never discards any. In principle this can degrade performance at large scale if certain machines are slower. Add a config flag to enable this legacy behavior. ## Related issue number Closes: https://github.com/ray-project/ray/issues/3549