Hot Chips 31 Live Blogs: Habana's Approach to AI Scaling
by Dr. Ian Cutress on August 19, 2019 9:15 PM EST- Posted in
- Artificial Intelligence
- Hot Chips
- Live Blog
- Habana

09:21PM EDT - The final talk today at Hot Chips is from Habana, which is discussing its approach to scaling AI compute.
09:21PM EDT - Goya and Gaudi
09:22PM EDT - Recapping Training vs Inference requirements
09:24PM EDT - Goya processor architecture
09:24PM EDT - 3 engines: TPC, GEMM, and DMA. They work concurrently with shared SRAM
09:24PM EDT - TPC is VLIW SIMD core, C-programmable
09:24PM EDT - PCIe Gen 4.0 x16
09:24PM EDT - Two DDR4-2666 channels, built on TSMC 16nm
09:25PM EDT - Supports UINT8 to FP32
09:25PM EDT - Dedicated HW and TPC ISA for special function acceleration
09:25PM EDT - Have to adjust quantization to balance accuracy vs power
09:26PM EDT - PCIe card - Software stack is more important.
09:26PM EDT - Habana is a software company that just happens to do hardware
09:27PM EDT - Graph compiler with built-in quantization engine
09:27PM EDT - Multiple recipes can be loaded for the hardware
09:28PM EDT - Goya supports models trained on any processor: CPU, GPU, TPU, Gaudi etc
09:28PM EDT - Users can create custom layers and kernels
09:29PM EDT - Still claims market leadership vs common CPU/GPU, 11 months after the benchmarks were made
09:29PM EDT - New for today, natural language benchmark results
09:30PM EDT - Support BERT architecture on Goya
09:30PM EDT - GEMMs and TPCs are fully utilized
09:30PM EDT - Chip was designed long before BERT was invented
09:30PM EDT - High degree of accuracy when quantized
09:30PM EDT - Software managed SRAM
09:31PM EDT - Now Gaudi, the training processor
09:31PM EDT - Performance at Scale, high throughput at low batch size, high power efficiency
09:32PM EDT - Enable native ethernet scale out - on chip RDMA over Converged Ethernet
09:32PM EDT - Open Compute Project Accelerator Module: OAM = (OCP)AM
09:32PM EDT - Framework and ML compiler support, rich TPC Kernel Library
09:32PM EDT - Architecture looks similar to Goya
09:33PM EDT - Networking has changed, memory has changed
09:33PM EDT - PCIe 4.0 x16, 4x8GB HBM
09:33PM EDT - 10x 100 GbE, or 20x 50 GbE
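A quick arithmetic check on the Gaudi figures above, as a minimal sketch: both Ethernet configurations add up to the same raw aggregate, and four 8 GB HBM stacks give 32 GB. The helper name is hypothetical, and the numbers are raw line rates taken from the slides as reported here, not measured throughput.

```python
# Raw line-rate arithmetic only; protocol overhead is ignored (assumption).
def aggregate_gbps(ports: int, gbps_per_port: int) -> int:
    """Total raw Ethernet bandwidth in Gb/s for a given port configuration."""
    return ports * gbps_per_port

config_100g = aggregate_gbps(10, 100)   # 10x 100 GbE
config_50g = aggregate_gbps(20, 50)     # 20x 50 GbE
aggregate_gbytes_per_s = config_100g / 8  # bits to bytes

hbm_capacity_gb = 4 * 8  # 4 stacks of 8 GB each

print(config_100g, config_50g, aggregate_gbytes_per_s, hbm_capacity_gb)
```

Either way the chip exposes 1 Tb/s of raw Ethernet bandwidth; the choice is between fewer, faster ports and more, slower ones.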
09:33PM EDT - Supports UINT8 to FP32 and BF16
09:34PM EDT - SW supports profiling tools
09:34PM EDT - Only AI Training chip with RoCE v2
09:35PM EDT - NVIDIA was first to showcase RoCE v2 for AI, but they haven't implemented it yet
09:36PM EDT - NVIDIA GPU is much more complex with RoCE v2 support via Mellanox
09:36PM EDT - Gaudi integrates both
09:36PM EDT - Supports Lossless and Lossy fabrics
09:36PM EDT - Advanced congestion controls
09:37PM EDT - Customers can buy OAM cards or an 8 card Server
09:38PM EDT - Server box has no CPU, up to the customer to configure as needed. Uses mini-SAS HD
09:38PM EDT - Ethernet connectivity for point-to-point links with non-blocking full mesh
09:38PM EDT - 3 ports per card for scale up
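The port budget above can be sanity-checked with a short sketch. It assumes the 8-card server and the 10 ports per chip mentioned earlier in the talk; the function name is hypothetical, and how Habana actually wires the box is not described beyond what is reported here.

```python
from itertools import combinations

def full_mesh_links(cards: int) -> list:
    """Every unordered pair of cards gets one direct point-to-point link."""
    return list(combinations(range(cards), 2))

CARDS = 8            # 8-card server, per the talk
PORTS_PER_CARD = 10  # 10x 100 GbE per Gaudi, per the talk

links = full_mesh_links(CARDS)
ports_for_mesh = CARDS - 1                   # one port to each other card
ports_left = PORTS_PER_CARD - ports_for_mesh  # remaining ports per card

print(len(links), ports_for_mesh, ports_left)
```

Under these assumptions a full mesh uses 7 ports per card across 28 links, which leaves 3 ports per card, consistent with the figure quoted above.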
09:39PM EDT - Can choose ratio of CPUs to Gaudi cards
09:39PM EDT - Gaudi vs DGX
09:40PM EDT - Unlike DGX, do not force user to separate PCIe between management and scaleout. Gaudi offers separate PCIe ports
09:41PM EDT - PCIe card dual slot also available
09:41PM EDT - HL-200
09:41PM EDT - Data parallel possible, model parallel possible
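For readers unfamiliar with the term, data parallel training means each card computes gradients on its own slice of the batch, and the gradients are then averaged across cards. A minimal sketch of that averaging step, in plain Python with hypothetical names; it is not Habana's communication library, only the idea behind an all-reduce.

```python
def all_reduce_mean(per_worker_grads):
    """Average gradients element-wise across workers (one list per worker)."""
    workers = len(per_worker_grads)
    return [sum(column) / workers for column in zip(*per_worker_grads)]

# Four workers, each holding a gradient for the same two parameters.
grads = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
]
averaged = all_reduce_mean(grads)
print(averaged)
```

Model parallel training instead splits the model itself across cards, so the traffic between cards is activations rather than averaged gradients.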
09:44PM EDT - Can leapfrog performance over DGX-2 due to better connectivity. Can connect 64 Gaudi chips with non-blocking throughput
09:45PM EDT - Q&A time
09:46PM EDT - Q: What type of quantization requires a processor? A: There is no quantization processor. There's a software engine that takes an FP32 model, quantizes it to more efficient data types, and gives feedback on the accuracy
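To illustrate the kind of flow described in that answer, here is a minimal sketch of quantizing FP32 values to 8-bit integers and reporting the resulting error. It is not Habana's engine: symmetric per-tensor scaling to signed 8-bit is an assumption chosen for simplicity, and all function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to signed 8-bit integers (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]

def max_abs_error(original, restored):
    """Accuracy feedback: worst-case absolute difference after the round trip."""
    return max(abs(a - b) for a, b in zip(original, restored))

weights = [0.02, -1.27, 0.64, 0.5, -0.33]  # made-up FP32 weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
error = max_abs_error(weights, restored)
print(quantized, scale, error)
```

The round-trip error is bounded by half the scale step, which is the kind of number a tool could report back so the user can trade accuracy against efficiency.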
09:47PM EDT - Q: Can you comment on interconnectivity of GEMM? A: It's one functional unit.
09:48PM EDT - Q: What is the minimum viable for an IoT gateway? A: You can use a single card. You can put a Gaudi in a single PCIe slot.
09:48PM EDT - That's a wrap for today. More talks tomorrow!