Mean Actor Critic

A new reinforcement learning algorithm for estimating the policy gradient with lower variance than traditional approaches.