@haldaume3 @yoavgo @karpathy seems like that shoul…

@haldaume3 @yoavgo @karpathy seems like that should make it much harder to learn long-distance functions, e.g. “forget” gate.

