With all the current news about NVIDIA AI/ML chips;
Can anybody give an overview of AI/ML/NPU/TPU/etc chips and pointers to detailed technical papers/books/videos about them? All I am able to find are marketing/sales/general overviews which really don't explain anything.
Am looking for a technical deep dive.
If you are familiar with linear algebra: these specialized chips literally etch into silicon the circuitry to perform vector (and more generally multi-dimensional array, or tensor) computations faster than a general-purpose CPU. They do that by loading and operating on a whole set of numbers (a chunk of a vector or a matrix) simultaneously, whereas a CPU would operate mostly serially, one number at a time.
The advantage, in a nutshell, is a significant speedup. How much depends on the problem and on how big a chunk you can process simultaneously, but it can be a large factor.
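A minimal sketch of the serial-vs-chunked distinction, using NumPy on an ordinary CPU as a stand-in (an accelerator applies the same idea in hardware at much larger widths; the function name and sizes here are just illustrative):

```python
import numpy as np

def matmul_serial(A, B):
    """Matrix multiply one scalar at a time, the way a naive CPU loop works."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(n):          # each output element is computed
        for j in range(m):      # by a serial chain of scalar
            for p in range(k):  # multiply-accumulate operations
                C[i, j] += A[i, p] * B[p, j]
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 64))
B = rng.standard_normal((64, 64))

# The vectorized version hands the whole problem to optimized routines
# that process chunks of the matrices simultaneously; a TPU/NPU bakes
# that chunked multiply-accumulate pattern directly into silicon.
C_fast = A @ B
C_slow = matmul_serial(A, B)
assert np.allclose(C_fast, C_slow)  # same math, very different speed
```

Timing the two versions (e.g. with `timeit`) already shows a large gap on a CPU; dedicated tensor hardware widens it further by operating on much bigger chunks per cycle.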
There are disadvantages that people ignore in the current AI hype:
* The speedup is a one-off gain; the death of Moore's law applies equally to "AI chips" and CPUs.
* The software you need to develop and run is extremely specialized and fine-tuned, and it only applies to the linear algebra problems described above.
* Such specialized numerical linear algebra hardware used to be the domain of HPC (high-performance computing). Many a supercomputer vendor went bankrupt because the cost versus market size was not there.