Oct 12, 2021
As you state in the article and others have reinforced, this isn't really about CUDA vs Go but more about a parallelized architecture vs a general-purpose one.
With that in mind, it's probably worth taking a look at Gorgonia, which allows some level of low-level GPU programming in Go. While I have done some CUDA work in the past (in C, of course), I have not worked with Gorgonia personally, so I don't know much about it beyond what its GitHub landing page states.
If you're curious..