utils

Source code: tianshou/trainer/utils.py

gather_info(start_time: float, train_collector: Collector | None, test_collector: Collector | None, best_reward: float, best_reward_std: float) → dict[str, float | str]

A simple wrapper that gathers information from the collectors.

Returns:

A dictionary with the following keys:

train_step: the total number of steps collected by the training collector;
train_episode: the total number of episodes collected by the training collector;
train_time/collector: the time spent collecting transitions in the training collector;
train_time/model: the time spent training models;
train_speed: the speed of training (env_step per second);
test_step: the total number of steps collected by the test collector;
test_episode: the total number of episodes collected by the test collector;
test_time: the time spent testing;
test_speed: the speed of testing (env_step per second);
best_reward: the best reward over the test results;
duration: the total elapsed time.
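
As a rough illustration of how the keys above relate to one another, the following sketch assembles a dictionary of the same shape from plain counters. It is not Tianshou's implementation: `FakeCollector`, `sketch_gather_info`, the `now` parameter, and the attribute names `collect_step`, `collect_episode`, and `collect_time` are stand-ins assumed for this example. The sketch also assumes that model-training time is whatever part of the elapsed time was spent neither collecting nor testing, and that training speed is measured against the non-testing time.

```python
import time
from dataclasses import dataclass


@dataclass
class FakeCollector:
    """Hypothetical stand-in for a collector; not the Tianshou class."""

    collect_step: int = 0       # total environment steps collected
    collect_episode: int = 0    # total episodes collected
    collect_time: float = 0.0   # seconds spent collecting


def sketch_gather_info(start_time, train_collector, test_collector,
                       best_reward, now=None):
    """Build a dictionary with the documented keys (illustrative only)."""
    now = time.time() if now is None else now
    duration = max(0.0, now - start_time)
    info = {"best_reward": best_reward, "duration": duration}

    test_time = 0.0
    if test_collector is not None:
        test_time = test_collector.collect_time
        info["test_step"] = test_collector.collect_step
        info["test_episode"] = test_collector.collect_episode
        info["test_time"] = test_time
        # Guard against division by zero when nothing has been tested yet.
        info["test_speed"] = (
            test_collector.collect_step / test_time if test_time > 0 else 0.0
        )

    if train_collector is not None:
        # Assumption: everything that is not testing counts as training time.
        non_test_time = max(0.0, duration - test_time)
        collect_time = train_collector.collect_time
        info["train_step"] = train_collector.collect_step
        info["train_episode"] = train_collector.collect_episode
        info["train_time/collector"] = collect_time
        # Assumption: model time is the remainder after collection.
        info["train_time/model"] = max(0.0, non_test_time - collect_time)
        info["train_speed"] = (
            train_collector.collect_step / non_test_time
            if non_test_time > 0 else 0.0
        )
    return info


info = sketch_gather_info(
    start_time=0.0,
    train_collector=FakeCollector(5000, 40, 30.0),
    test_collector=FakeCollector(1000, 10, 20.0),
    best_reward=195.0,
    now=100.0,
)
print(sorted(info))
```

Passing `None` for either collector simply omits the corresponding keys in this sketch, which mirrors the `Collector | None` annotations in the signature.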

test_episode(policy: BasePolicy, collector: Collector, test_fn: collections.abc.Callable[[int, int | None], None] | None, epoch: int, n_episode: int, logger: BaseLogger | None = None, global_step: int | None = None, reward_metric: collections.abc.Callable[[numpy.ndarray], numpy.ndarray] | None = None) → dict[str, Any]

A simple wrapper that tests the policy using the given collector.
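
The two optional callables can be easy to misread from the signature alone. The sketch below shows hypothetical callbacks whose shapes match the annotations: a `test_fn` that takes two integers (the second possibly `None`) and returns nothing, and a `reward_metric` that maps a reward array to a reward array. The names `log_before_test` and `mean_over_agents`, the reading of the two `test_fn` arguments as the epoch and the global step, and the per-agent reward layout are assumptions for illustration. The callbacks are invoked directly here rather than through `test_episode`.

```python
from __future__ import annotations

import numpy as np

# Records each (epoch, global_step) pair the hook was called with.
calls: list[tuple[int, int | None]] = []


def log_before_test(epoch: int, global_step: int | None) -> None:
    """Hypothetical test_fn: matches Callable[[int, int | None], None]."""
    calls.append((epoch, global_step))


def mean_over_agents(rews: np.ndarray) -> np.ndarray:
    """Hypothetical reward_metric: matches Callable[[ndarray], ndarray].

    Assumes rewards shaped (n_episode, n_agent) and reduces them to
    one scalar reward per episode.
    """
    return rews.mean(axis=1)


# Direct calls, standing in for what a testing wrapper might do.
log_before_test(3, 12000)
rews = np.array([[1.0, 3.0], [2.0, 6.0]])
reduced = mean_over_agents(rews)
print(calls, reduced)
```

Either callback may be left as `None`, since both parameters are annotated as optional.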