A machine-learning wishlist for hardware designers
Pete Warden (previously) is one of my favorite commentators on machine learning and computer science. Yesterday he gave a keynote at the IEEE Custom Integrated Circuits Conference on the ways that hardware specialization could improve machine learning. His main point is that although there's a wealth of hardware specialized for creating models, we need more hardware optimized for running them.
I've saved what I expect may be my most controversial request until last. The typical design process I've seen from hardware teams is that they will look at some existing ML workloads, note that almost all of the time goes into just a few operations, and so design an accelerator that speeds up those critical-path ops.
This sounds fine in principle, but when an accelerator like that is integrated into a full system it often fails to live up to its potential. The problem is that even though most of the compute for almost all models does go into a handful of common operations, there are hundreds of others that often appear. Almost every model I see has some of these, and they're almost always different from network to network. A good example is non-max suppression in MobileSSD and similar object detection models, where we need some very specific and custom operations to merge the many bounding boxes that are output by the model into just a few coherent final results. This doesn't require very much raw compute, but it does take a lot of logic, and is hard to express except as general C++ code.
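To make the non-max suppression example concrete, here is a minimal sketch of the common greedy variant in plain Python. It is an illustration only, not the post-processing that MobileSSD or any particular framework actually ships: the box layout `(x_min, y_min, x_max, y_max)`, the function names `iou` and `non_max_suppression`, and the default `iou_threshold` of 0.5 are all assumptions made for this example. Note how little arithmetic there is compared with the amount of branching and data-dependent control flow, which is the kind of logic that tends not to map neatly onto an accelerator built around a few dense operations.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # hypothetical default, tune per model
) -> List[int]:
    """Greedy NMS: return indices of the boxes to keep, best score first."""
    # Visit candidates from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box
        # that has already been kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(boxes, scores))
```

In this usage example the second box overlaps the first heavily and has a lower score, so it is dropped, while the third box is far away and survives.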
Original Link: http://feeds.boingboing.net/~r/boingboing/iBag/~3/jr7TIGbCYPE/8-bits-are-enough.html