
@Naren219

I found out that tanh wasn't implemented in the repo, so I copied Karpathy's code from the video into the Value class as a method.

Using this nonlinearity should let you train better when the data includes negative numbers. I found this out the hard way: I tried to replicate the dataset from the video with relu instead, and my loss stayed very high.

hope this helps!
-naren

@conscell

This implementation is numerically unstable. For large x, e.g. x=1000, it raises the following error:
OverflowError: math range error
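
One possible way around the overflow, sketched on plain floats rather than on the repo's Value class: the PR diff is not shown in this thread, so `naive_tanh` below is only an assumption about what the copied code computes, namely (exp(2x) - 1) / (exp(2x) + 1), and `stable_tanh` is a hypothetical name. The idea is to evaluate the exponential at -2|x| only, so its argument is never positive and the result stays in (0, 1].

```python
import math

def naive_tanh(x):
    # Assumed form of the copied code: exp(2x) overflows for large x.
    e = math.exp(2 * x)
    return (e - 1) / (e + 1)

def stable_tanh(x):
    # exp(-2|x|) lies in (0, 1], so it can underflow to 0.0 but never overflow.
    e = math.exp(-2.0 * abs(x))
    t = (1.0 - e) / (1.0 + e)
    # tanh is an odd function, so restore the sign of x at the end.
    return t if x >= 0 else -t

try:
    naive_tanh(1000)
except OverflowError as err:
    print("naive_tanh(1000) failed:", err)

print(stable_tanh(1000), stable_tanh(-1000), stable_tanh(0.5))
```

The built-in `math.tanh(x)` avoids the problem as well and is probably the simpler choice. Either way the local derivative used in the backward pass, 1 - t**2, does not change.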


3 participants