utils
Source code: tianshou/trainer/utils.py
- gather_info(start_time: float, train_collector: Collector | None, test_collector: Collector | None, best_reward: float, best_reward_std: float) → dict[str, float | str]
A simple wrapper for gathering information from collectors.
- Returns:
A dictionary with the following keys:
  - train_step: the total number of steps collected by the training collector;
  - train_episode: the total number of episodes collected by the training collector;
  - train_time/collector: the time spent collecting transitions in the training collector;
  - train_time/model: the time spent training models;
  - train_speed: the speed of training (env_step per second);
  - test_step: the total number of steps collected by the test collector;
  - test_episode: the total number of episodes collected by the test collector;
  - test_time: the time spent testing;
  - test_speed: the speed of testing (env_step per second);
  - best_reward: the best reward over the test results;
  - duration: the total elapsed time.
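As a rough illustration of how a dictionary with these keys could be assembled, here is a minimal standalone sketch. It does not reproduce the library's implementation: `CollectorStats`, `summarize_run`, and `_rate` are hypothetical names, the string formatting is an assumption, and the way time is split between collection and model updates is one plausible reading of the key descriptions above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class CollectorStats:
    """Hypothetical stand-in for the counters a collector keeps."""

    collect_step: int      # total environment steps collected
    collect_episode: int   # total episodes collected
    collect_time: float    # seconds spent collecting


def _rate(steps: int, seconds: float) -> float:
    """Steps per second, guarding against a zero or negative time span."""
    return steps / seconds if seconds > 0 else 0.0


def summarize_run(
    duration: float,
    train: Optional[CollectorStats],
    test: Optional[CollectorStats],
    best_reward: float,
) -> dict[str, Union[float, str]]:
    """Build a summary dict using the key names listed above (illustrative)."""
    info: dict[str, Union[float, str]] = {"duration": f"{duration:.2f}s"}
    # Assumption: whatever time was not spent collecting went into model updates.
    model_time = duration
    if test is not None:
        model_time = max(0.0, model_time - test.collect_time)
        info["test_step"] = test.collect_step
        info["test_episode"] = test.collect_episode
        info["test_time"] = f"{test.collect_time:.2f}s"
        info["test_speed"] = f"{_rate(test.collect_step, test.collect_time):.2f} step/s"
        info["best_reward"] = best_reward
    if train is not None:
        model_time = max(0.0, model_time - train.collect_time)
        info["train_step"] = train.collect_step
        info["train_episode"] = train.collect_episode
        info["train_time/collector"] = f"{train.collect_time:.2f}s"
        info["train_time/model"] = f"{model_time:.2f}s"
        # Training speed counts both collection and model-update time.
        train_wall = train.collect_time + model_time
        info["train_speed"] = f"{_rate(train.collect_step, train_wall):.2f} step/s"
    return info


summary = summarize_run(
    duration=100.0,
    train=CollectorStats(collect_step=5000, collect_episode=50, collect_time=30.0),
    test=CollectorStats(collect_step=1000, collect_episode=10, collect_time=20.0),
    best_reward=195.0,
)
```

In this sketch either collector may be `None`, mirroring the `Collector | None` parameters in the signature; the corresponding keys are then simply left out.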
- test_episode(policy: BasePolicy, collector: Collector, test_fn: collections.abc.Callable[[int, int | None], None] | None, epoch: int, n_episode: int, logger: BaseLogger | None = None, global_step: int | None = None, reward_metric: collections.abc.Callable[[numpy.ndarray], numpy.ndarray] | None = None) → dict[str, Any]
A simple wrapper for testing a policy with a collector.
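The `test_fn` parameter is typed as `Callable[[int, int | None], None]`. A callback of that shape might look like the sketch below. The helper `make_test_fn` and the idea of recording the arguments are hypothetical, and the assumption that the two arguments correspond to `epoch` and `global_step` is inferred from the signature rather than stated in the text.

```python
from typing import Callable, List, Optional, Tuple


def make_test_fn(
    calls: List[Tuple[int, Optional[int]]],
) -> Callable[[int, Optional[int]], None]:
    """Return a callback matching Callable[[int, int | None], None].

    Hypothetical helper: the returned function only records its arguments,
    standing in for whatever setup a real test hook would perform.
    """

    def test_fn(epoch: int, env_step: Optional[int]) -> None:
        # Assumption: called with the current epoch and the global step count.
        calls.append((epoch, env_step))

    return test_fn


recorded: List[Tuple[int, Optional[int]]] = []
hook = make_test_fn(recorded)
hook(3, 12000)   # step count known
hook(4, None)    # step count not supplied
```

Because the callback returns `None`, any effect it has must come from side effects such as updating shared state, which is why the sketch writes into a list passed in by the caller.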