
Additionally, a threshold function is applied to the frame such that the walls and the player belong to the foreground and everything else belongs to the background. See superhexagon.SuperHexagonInterface._preprocess_frame for more implementation details.
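As a rough illustration of such a threshold step, a minimal sketch might look like the following. The function name, grayscale conversion, and threshold value are assumptions made for illustration only; the actual logic lives in superhexagon.SuperHexagonInterface._preprocess_frame and may differ:

```python
import numpy as np

def preprocess_frame(frame: np.ndarray, threshold: float = 100.0) -> np.ndarray:
    """Illustrative sketch only: convert an RGB frame to grayscale and binarize it
    so that bright pixels (walls, player) become foreground (1.0) and everything
    else becomes background (0.0). The real preprocessing is implemented in
    superhexagon.SuperHexagonInterface._preprocess_frame and may differ."""
    gray = frame.astype(np.float32).mean(axis=2)   # naive grayscale: average over RGB channels
    return (gray > threshold).astype(np.float32)   # binary foreground/background mask
```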

For the fully connected part of the network, the feature maps of both streams are flattened and concatenated. See utils.Network for more implementation details.
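A minimal PyTorch sketch of that two-stream layout is shown below. The layer sizes, the number of actions, and the assumption that one stream sees the full frame while the other sees a cropped view are illustrative guesses, not the actual architecture (see utils.Network for that):

```python
import torch
import torch.nn as nn

class TwoStreamNet(nn.Module):
    """Illustrative sketch: two convolutional streams whose feature maps are
    flattened and concatenated before the fully connected layers. All layer
    sizes are made up; the real architecture is in utils.Network."""

    def __init__(self, n_actions: int = 3):  # 3 actions is an assumption (left, right, none)
        super().__init__()
        self.stream_a = nn.Sequential(  # e.g. the full (downsampled) frame
            nn.Conv2d(1, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.stream_b = nn.Sequential(  # e.g. a crop around the center of the frame
            nn.Conv2d(1, 16, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.LazyLinear(256), nn.ReLU(),   # LazyLinear infers the concatenated input size
            nn.Linear(256, n_actions),
        )

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
        f_a = self.stream_a(x_a).flatten(start_dim=1)  # flatten feature maps of stream A
        f_b = self.stream_b(x_b).flatten(start_dim=1)  # flatten feature maps of stream B
        return self.fc(torch.cat([f_a, f_b], dim=1))   # concatenate and run the FC head
```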
The hyperparameters used can be found at the bottom of trainer.py, below `if __name__ == '__main__':`.

All six Rainbow extensions have been evaluated:

- Double Q-Learning and Dueling Networks did not improve the performance.
- n-step returns significantly decreased the performance.
- Prioritized experience replay performs better at first; however, after roughly 300,000 training steps the agent trained without prioritized experience replay performs better.
- The distributional approach significantly increases the performance of the agent; distributional RL with quantile regression gives similar results.
- Noisy networks facilitate the exploration process.
