Taylor TD-learning

Provides lower-variance temporal-difference (TD) updates through a first-order Taylor expansion of the expected TD update.
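
The idea can be illustrated with a minimal sketch. It is not the project's implementation: the single-state setup, the quadratic critic, the reward model, and every name and constant below are hypothetical choices made for illustration. The sketch assumes actions are drawn as a = mu + xi with Gaussian noise xi of standard deviation sigma. Expanding both the TD error delta and the critic gradient phi to first order around mu and averaging over xi analytically gives delta(mu) * phi(mu) + sigma^2 * delta'(mu) * phi'(mu), which no longer depends on the sampled action.

```python
import random
import statistics

# Hypothetical toy setup (not from the source): one state, a 1-D action,
# a quadratic critic Q(a) = th0 + th1*a + th2*a^2, and a known reward model.
THETA = (0.5, -0.3, 0.2)
GAMMA, V_NEXT = 0.9, 1.0      # discount and a fixed bootstrap value
MU, SIGMA = 0.4, 0.1          # mean action and exploration noise std


def features(a):
    """Gradient of Q with respect to theta, i.e. [1, a, a^2]."""
    return (1.0, a, a * a)


def d_features(a):
    """Derivative of the features with respect to the action."""
    return (0.0, 1.0, 2.0 * a)


def q_value(a):
    """Critic value Q(a) as a dot product of THETA and the features."""
    return sum(t * f for t, f in zip(THETA, features(a)))


def td_error(a):
    """delta(a) = r(a) + gamma * V_next - Q(a), with r(a) = -(a - 1)^2."""
    return -(a - 1.0) ** 2 + GAMMA * V_NEXT - q_value(a)


def d_td_error(a):
    """Derivative of the TD error with respect to the action."""
    return -2.0 * (a - 1.0) - (THETA[1] + 2.0 * THETA[2] * a)


def sampled_update(rng):
    """Standard TD update direction at one sampled action a = mu + xi."""
    a = MU + rng.gauss(0.0, SIGMA)
    d = td_error(a)
    return tuple(d * f for f in features(a))


def taylor_update():
    """First-order expansion of both factors, averaged over xi analytically:
    delta(mu) * phi(mu) + sigma^2 * delta'(mu) * phi'(mu)."""
    d, dd = td_error(MU), d_td_error(MU)
    return tuple(d * f + SIGMA ** 2 * dd * df
                 for f, df in zip(features(MU), d_features(MU)))


rng = random.Random(0)
samples = [sampled_update(rng) for _ in range(20000)]
mc_mean = [statistics.fmean(col) for col in zip(*samples)]
mc_std = [statistics.pstdev(col) for col in zip(*samples)]
taylor = taylor_update()

print("sampled mean  :", [round(x, 3) for x in mc_mean])
print("sampled std   :", [round(x, 3) for x in mc_std])
print("Taylor update :", [round(x, 3) for x in taylor])
```

In this toy problem the per-sample spread of the standard update should be of the same order as its mean, while the Taylor update is deterministic given the mean action. The expansion drops higher-order terms, so the result is an approximation of the expected update with a small bias of order sigma^2 rather than an exact value.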