For the first time in history, most of the flops added to the TOP500 list came from GPUs rather than CPUs. Is this the shape of things to come?
In the latest TOP500 rankings announced this week, 56 percent of the additional flops were a result of NVIDIA Tesla GPUs running in new supercomputers - that according to the Nvidians, who enjoy keeping track of such things. In this case, most of those additional flops came from three top systems new to the list: Summit, Sierra, and the AI Bridging Cloud Infrastructure (ABCI).
Summit, the new TOP500 champ, pushed the previous number one system, the 93-petaflop Sunway TaihuLight, into second place with a Linpack score of 122.3 petaflops. Summit is powered by IBM servers, each one equipped with two Power9 CPUs and six V100 GPUs. According to NVIDIA, 95 percent of Summit’s peak performance (187.7 petaflops) comes from the system’s 27,686 GPUs.
NVIDIA did a similar calculation for the less powerful, and somewhat less GPU-intense Sierra, which now ranks as the third fastest supercomputer in the world at 71.6 Linpack petaflops. Although very similar to Summit, it has four V100 GPUs in each dual-socket Power9 node, rather than six. However, the 17,280 GPUs in Sierra still represent the lion’s share of that system’s flops.
Likewise for the new ABCI machine in Japan, which is now that country’s fastest supercomputer and is ranked fifth in the world. Each of its servers pairs two Intel Xeon Gold CPUs with four V100 GPUs. Its 4,352 V100s deliver the vast majority of the system’s 19.9 Linpack petaflops.
As dramatic as that 56 percent number is for new TOP500 flops, the reality is probably even more impressive. According to Ian Buck, vice president of NVIDIA’s Accelerated Computing business unit, more than half the Tesla GPUs they sell into the HPC/AI/data analytics space are bought by customers who never submit their systems for TOP500 consideration. Although many of these GPU-accelerated machines would qualify for a spot on the list, these particular customers either don’t care about all the TOP500 fanfare or would rather not advertise their hardware-buying habits to their competitors.
It’s also worth mentioning that the Tensor Cores in the V100 GPUs, with their specialized 16-bit matrix math capability, endow these three new systems with more deep learning potential than any previous supercomputer. Summit alone features over three peak exaflops of deep learning performance. Sierra’s performance in this regard is more in the neighborhood of two peak exaflops, while the ABCI number is around half an exaflop. Taken together, these three supercomputers represent more deep learning capability than the other 497 systems on the TOP500 list combined, at least from the perspective of theoretical performance.
The addition of AI/machine learning/deep learning into the HPC application space is a relatively new phenomenon, but the V100 appears to be acting as a catalyst. “This year’s TOP500 list represents a clear shift towards systems that support both HPC and AI computing,” noted TOP500 author Jack Dongarra, Professor at University of Tennessee and Oak Ridge National Laboratory.
While companies like Intel, Google, Fujitsu, Wave Computing, Graphcore, and others are developing specialized deep learning accelerators for the datacenter, NVIDIA is sticking with an integrated AI-HPC design for its Tesla GPU line. And this certainly seems to be paying off, given the growing trend of using artificial intelligence to accelerate traditional HPC applications. Although the number of users integrating HPC and AI is still relatively small, this mixed-workflow model is slowly being extended to nearly every science and engineering domain, from weather forecasting and financial analytics, to genomics and oil & gas exploration.
Buck admits this interplay between traditional HPC modeling and machine learning is still in the early stages, but maintains “it’s only going to get more intertwined.” He says that even though some customers use only a subset of the Tesla GPU’s features, the benefit of supporting 64-bit HPC, machine learning, and visualization on the same chip far outweighs any advantages that could be realized by single-purpose accelerators.
And, thanks in large part to these deep-learning-enhanced V100 GPUs, mixed-workload machines are now popping up on a fairly regular basis. For example, although Summit was originally going to be just another humongous supercomputer, it is now being groomed as a platform for cutting-edge AI as well. By contrast, the ABCI system was conceived from the beginning as an AI-capable supercomputer that would serve users running both traditional simulations and analytics, as well as deep learning workloads. Earlier this month, the MareNostrum supercomputer added three racks of Power9/V100 nodes, paving the way for serious deep learning work to commence at the Barcelona Supercomputing Center. And even the addition of just 12 V100 GPUs to the Nimbus cloud service at the Pawsey Supercomputing Centre was enough to claim that AI would now be fair game on the Aussie system.
As Buck implied, you don’t have to take advantage of the Tensor Cores to get your money’s worth from the V100. At seven double-precision teraflops, the V100 is a very capable accelerator for conventional supercomputing. And according to NVIDIA, there are 554 codes ported to these graphics chips, including all of the top 15 HPC applications.
But as V100-powered systems make their way into research labs, universities, and commercial datacenters, more scientists and engineers will be tempted to inject AI into their 64-bit applications. And whether this turns out to be a case of the tail wagging the dog or the other way around, in the end, it doesn’t really matter. The HPC application landscape will be forever changed.
The TOP500 list is an immensely valuable resource for the HPC community, tracking aggregate trends over 25 years. However, a few observers have noted that recent publications of the TOP500 list have many duplicate entries, often at anonymous sites.
Let’s park the debate on what is or is not a legitimate HPC system for now and assume that any system that has completed a High Performance Linpack (HPL) run is a valid entry on the list.
But there are 124 entries on the list that are identical copies of other entries. In other words, a single HPL run has been done on one system, and the vendor has said “Since we’ve sold N copies of that machine, we can submit N entries on the list”.
What happens to the list statistics if we delete all of the duplicate systems?
The list of 500 reduces to 376 entries.
The biggest change is Lenovo, dropping from 117 entries to just 61 - yes, there are 56 duplicate entries from Lenovo! HPE drops from 79 to 62 entries and retakes the top spot for the greatest share of the list with 16.5 percent. Lenovo drops to second place with a 16.2 percent share.
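The share figures follow directly from the entry counts quoted above. As a rough sketch (the variable and function names are illustrative, and only the counts stated in this article are used), the arithmetic can be checked like this:

```python
# Counts quoted in the article for the June 2018 list.
TOTAL_ENTRIES = 500
DUPLICATE_ENTRIES = 124

# Entries per vendor before and after removing duplicate submissions.
entries_before = {"Lenovo": 117, "HPE": 79}
entries_after = {"Lenovo": 61, "HPE": 62}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Size of the list once the duplicates are dropped.
unique_total = TOTAL_ENTRIES - DUPLICATE_ENTRIES

shares_before = {v: share(n, TOTAL_ENTRIES) for v, n in entries_before.items()}
shares_after = {v: share(n, unique_total) for v, n in entries_after.items()}
duplicates = {v: entries_before[v] - entries_after[v] for v in entries_before}

for vendor in entries_before:
    print(
        f"{vendor}: {entries_before[vendor]} entries ({shares_before[vendor]}%) -> "
        f"{entries_after[vendor]} entries ({shares_after[vendor]}%), "
        f"{duplicates[vendor]} duplicates removed"
    )
```

Note that the denominator changes along with the numerator: the de-duplicated shares are taken against 376 entries, not 500, which is why HPE's share rises even though its entry count falls.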
Does this matter? Well, it probably does to Lenovo, who chose to submit many copies, and HPE, who probably sold many copies but chose not to submit the duplicates. And ultimately it matters to their market share PR.
For the rest of us, it comes down to what the list is for. If it is meant to list the 500 fastest supercomputers in the world, it clearly doesn’t do that, as many supercomputer owners choose not to publicize their systems. Is it the list of known supercomputers? No, because several known supercomputers are not listed, for example, Blue Waters. Thus, it can only be a list of acknowledged HPL runs, which would suggest that the copies approach is wrong.
However, it isn’t as simple as that. If the list is for tracking vendor/technology market share, then list stuffing is okay - even desirable. If the list is for tracking adoption of technologies, vendor wins, the fortunes of HPC sites, and progress of nations, then I’d argue that stuffing in this way breaks the usefulness of the list.
The comparison of progress of nations is also affected. Do we measure how many systems are deployed? Or who has the biggest system? I think the list stuffing is less critical here, as we can readily extract both trends.
I’m not sure there is a right or wrong answer to this but, as always, the headline statistics of market share only tell one part of the story, and users of the list must dig deeper for proper insight.
Another value of the list over the years has been the ability to track who is buying supercomputers and what they are using them for.
In the June 2018 list, 97 percent of the systems do not specify an application area. This means the categorization is essentially meaningless. Either drop it from future lists or require future systems to identify an application area.
But, beyond that, the June 2018 list has 283 (!) anonymous entries. That’s over half of the list where we do not know the company or site that has deployed the system nor, in most cases, what it is being used for.
The big question is: does that render meaningless the ability to track the who and what of HPC, or would we be in a worse position if we excluded anonymous systems?
There are roughly 238 systems that can be lumped into cloud / IT service / web hosting companies, and it is the sheer quantity of these that creates the potentially unhelpful distortion.
The remaining 45 arguably represent real HPC deployments and have enough categorization (e.g., “Energy Company”) to be useful. Reasonable guesses can even be made for some of these anonymous systems. For example, one might guess that “Energy Company (A)” located in Italy is actually Eni. The 45 systems are a mix of energy, manufacturing, government, and finance sectors. Most interestingly, a few university systems are listed anonymously too!
Of the 284 systems listed as Industry, only 16 actually have named owners that aren’t cloud providers (the other 268 are all the anonymous companies and stuffing discussed above). In fact, due to multiple systems per site, we actually only have a handful of listed companies. These are Eni, Total, PGS, Saudi Aramco, BASF, EDF, Volvo and the vendors Intel, NVIDIA, and PEZY.
I assure you there are a lot more supercomputers out there in industry than this. I intend to explore the TOP500 trends for HPC systems in industry, along with lessons from my independent HPC consulting work, in a future article. Look out for this after ISC18.
I’ve previously said that the HPC community is lucky to have the TOP500 list. It is a data set collected in a consistent manner twice per year for 25 years, with each entry recording many attributes - vendor, technology details, performance, location, year of entry, etc. This is a hugely rich resource for our community, and the TOP500 authors have done an amazing job over 25 years to keep the list valuable as the world of HPC has changed enormously. I have high confidence in the authors successfully addressing the challenges here and keeping the list as a great resource for years to come.