Effects of Saltatory Rewards and Generalized Advantage Estimation on Reference-Based Deep Reinforcement Learning of Humanlike Motions