G (approximate GELU)

Study / Research
Mentioned in 1 video

The approximate GELU nonlinearity used by GPT-2. The video explains the difference between the exact and approximate forms of GELU and the historical reasons the approximate form was used.
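To make the exact versus approximate distinction concrete, here is a minimal sketch in plain Python. It uses the standard definitions: the exact form is x · Φ(x), written with the error function, and the approximate form is the commonly cited tanh approximation. The function names `gelu_exact` and `gelu_tanh` are illustrative choices, not names taken from the video or from any library.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (the 'approximate' form)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Compare the two forms at a few sample inputs.
for x in (-3.0, -1.0, 0.0, 0.5, 1.0, 3.0):
    exact = gelu_exact(x)
    approx = gelu_tanh(x)
    print(f"x={x:+.1f}  exact={exact:+.6f}  approx={approx:+.6f}  "
          f"diff={abs(exact - approx):.2e}")
```

The two curves agree closely over typical activation ranges, so the choice between them is mostly a matter of matching a given model's original implementation rather than of accuracy.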