Intel’s View on AI: Do What NV Doesn't

On the whole, Intel has a good point that there is "a wide range of AI applications", e.g. there is AI life beyond CNNs. In many real-life scenarios, traditional machine learning techniques outperform CNNs, and not all deep learning is done with the ultra-scalable CNNs. And in other real-world cases, having massive amounts of RAM is another big performance advantage, both while training the model and using it to infer new data. 

So despite NVIDIA’s massive advantage in running CNNs, high end Xeons can offer a credible alternative in the data analytics market. To be sure, nobody expects the new Cascade Lake Xeons to outperform NVIDIA GPUs in CNN training, but there are lots of cases where Intel might be able to convince customers to invest in a more potent Xeon instead of an expensive Tesla accelerator:

  • Inference of AI models that require a lot of memory
  • "Light" AI models that do not require long training times.
  • Data architectures where the batch or stream processing time is more important than the model training time.
  • AI models that depend on traditional “non-neural network” statistical models

As result, there might be an opportunity for Intel to keep NVIDIA at bay until they have a reasonable alternative for NVIDIA’s GPUs in CNN workloads. Intel has been feverishly adding features to the Xeons Scalable family and optimizing its software stack to combat NVIDIA AI hegemony. Optimized AI software like Intel’s own distribution for Python, the Intel Math Kernel Library for Deep Learning, and even the Intel Data Analytics Acceleration Library – mostly for traditional machine learning... 

All told then, for the second generation of Intel’s Xeon Scalable processors, the company has added new AI hardware features under the Deep Learning (DL) Boost name. This primarily includes the Vector Neural Network Instruction (VNNI) set, which can do in one instruction what would have previously taken three. However even farther down the line, Cooper Lake, the third-generation Xeon Scalable processor, will add support for bfloat16, further improving training performance.

In summary, Intel trying to recapture the market for “lighter AI workloads” while making a firm stand in the rest of data analytics market, all the while adding very specialized hardware (FPGA, ASICs) to their portfolio. This is of critical importance to Intel's competitiveness in the IT market. Intel has repeatedly said that the data center group (DCG) or “enterprise part” is expected to be the company's main growth engine in the years ahead.

Convolutional, Recurrent, & Scalability NVIDIA’s Answer: Bringing GPUs to More Than CNNs
Comments Locked

56 Comments

View All Comments

  • tipoo - Monday, July 29, 2019 - link

    Fyi, when on page 2 and clicking "convolutional, etc" for page 3, it brings me back to the homepage
  • Ryan Smith - Monday, July 29, 2019 - link

    Fixed. Sorry about that.
  • Eris_Floralia - Monday, July 29, 2019 - link

    Johan's new piece in 14 months! Looking forward to your Rome review :)
  • JohanAnandtech - Monday, July 29, 2019 - link

    Just when you think nobody noticed you were gone. Great to come home again. :-)
  • Eris_Floralia - Tuesday, July 30, 2019 - link

    Your coverage on server processors are great!
    Can still well remember Nehalem, Barcelona, and especially Bulldozer aftermath articles
  • djayjp - Monday, July 29, 2019 - link

    Not having a Tesla for such an article seems like a glaring omission.
  • warreo - Monday, July 29, 2019 - link

    Doubt Nvidia is sourcing AT these cards, so it's likely an issue of cost and availability. Titan is much cheaper than a Tesla, and I'm not even sure you can get V100's unless you're an enterprise customer ordering some (presumably large) minimum quantity.
  • olafgarten - Monday, July 29, 2019 - link

    It is available https://www.scan.co.uk/products/32gb-pny-nvidia-te...
  • abufrejoval - Tuesday, July 30, 2019 - link

    Those bottlenecks are over now and P100, V100 can be bought pretty freely, as well as RTX6000/8000 (Turings). Actually the "T100" is still missing and the closest siblings (RTX 6000/8000) might never get certified for rackmount servers, because they have active fans while the P100/V100 are designed to be cooled by server fans. I operate a handful of each and getting budget is typically the bigger hurdle than purchasing.
  • SSNSeawolf - Monday, July 29, 2019 - link

    I've been trying to find more information on Cascade Lake's AI/VNNI performance, but came up dry. Thanks, Johan. Eagerly putting this aside for my lunch reading today.

Log in

Don't have an account? Sign up now