nn_zero_to_hero

Why does the loss in makemore2.jl bottom out at 4? The same code written with torch and Flux consistently hits ~2.5.

todo: build a bigram word model on the concatenated yt transcripts
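
For the bigram word model todo, a rough count-based starting point might look like the sketch below. It is only an illustration under assumptions not stated in these notes: the transcripts are already concatenated into one string, words are split on whitespace and lowercased, and each line is treated as its own sequence with `<s>` / `</s>` markers. The function names (`build_bigram_model`, `next_word_probs`, `avg_nll`) are hypothetical, and there is no smoothing.

```python
import math
from collections import Counter, defaultdict


def line_bigrams(text):
    """Yield (prev, next) word pairs, one sequence per non-empty line.

    Assumption: whitespace tokenization, lowercased, with <s> and </s>
    added around each line as start/end markers.
    """
    for line in text.splitlines():
        words = line.lower().split()
        if not words:
            continue
        tokens = ["<s>"] + words + ["</s>"]
        yield from zip(tokens, tokens[1:])


def build_bigram_model(text):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for prev, nxt in line_bigrams(text):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Normalize the counts for `prev` into a probability distribution."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return {word: c / total for word, c in following.items()}


def avg_nll(counts, text):
    """Average negative log-likelihood (in nats) of `text` under the counts.

    No smoothing: this fails on any bigram that was never counted,
    so it is only meant for text the model was built from.
    """
    nll, n = 0.0, 0
    for prev, nxt in line_bigrams(text):
        total = sum(counts[prev].values())
        nll -= math.log(counts[prev][nxt] / total)
        n += 1
    return nll / n


if __name__ == "__main__":
    sample = "the cat sat\nthe cat ran\nthe dog sat"  # stand-in for the transcripts
    model = build_bigram_model(sample)
    print(next_word_probs(model, "the"))
    print(avg_nll(model, sample))
```

The average negative log-likelihood here is the same kind of quantity as the loss in the question above, but over a word vocabulary rather than characters, so the numbers are not directly comparable.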