Can they put a GPU next to 2 CPUs?

Other urls found in this thread:

hsafoundation.com
youtu.be/NoelgG8JoyQ?t=7m29s
techpowerup.com/235092/intel-says-amd-epyc-processors-glued-together-in-official-slide-deck

Yep, HSA comes to life: hsafoundation.com

dunno, but I'm more interested in whether they can put some HBM2 stacks next to it

Yes. The custom processor for CERN is on track.

you can glue anything together with infinity fabric, user.

DELET or else.

It would be cool if you only needed "one" chip to run the most important parts of the computer.

That's what Vega is.

next to 2 CPUs to act as a giant L4 cache, not next to a GPU

Apparently AMD can glue them but Intel can't.

AMD puts the glue *between* dies to connect them together.
Intel puts the glue *on* the die to save 4 cents on TIM.

Fuck indonesia

What?

Kek

>ryzen will enable CERN to rule the world

yes

What's that?

that's a 32-core APU with memory on the package

Looks like a Final Solution to the "Intel in HPC" problem.

if they manage to make it comparable to a real gpu in compute, yes

at least for organizations that don't want to buy separate cpu and gpus this could work

Beats anything Intel has to offer, in any case.

fair enough

oi vey, anudda shoa!

it's called an igpu

so yeah

It's designed for exascale, i.e. a billion billion (10^18) calculations per second. Imagine entire farms running cabinets stuffed full of dual-socket nodes built around chips like this. It would be near the top of the Top500, if not first.

There are like three places in the world where something like that would even be stressed.

so like an APU?

Nigga they can do what ever the fuck they want.

That is not the same thing at all. An integrated GPU with its own on-package RAM would be better than one sharing system RAM.

implementing a gpu on-die with the cpu is called an igpu

it's integrated
how are you missing the point

TR4-socket APUs would probably cannibalise their discrete GPU market and need a lot of cooling

what are some examples of organizations that'd need/want this?

Governments that want VERY powerful supercomputers.

DARPA, CERN, the Department of Energy, any national lab, whether it's for aeronautics, nuclear energy, or space exploration: anything that needs the computational power of millions of CPU/GPU cores.

sorry for all the questions, but why are those APUs better than that server AMD showed with 100 TFLOPS of compute in a 2U form factor?

Because an exaflop is 1,000 petaflops, and each petaflop is 1,000 teraflops. A 100 TFLOPS box is four orders of magnitude short of exascale.
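Back-of-envelope, taking that 100 TFLOPS-per-2U figure at face value and ignoring scaling losses (the rack math assumes fully populated 42U racks):

    # How many of those 2U boxes equal one exaflop?
    EXAFLOP = 1e18        # FLOPS, 10^18, i.e. a billion billion per second
    PER_2U  = 100e12      # the 100 TFLOPS 2U server mentioned above

    nodes = EXAFLOP / PER_2U
    print(nodes)          # 10000.0 -> ten thousand 2U boxes before any scaling losses

    racks = nodes / (42 // 2)
    print(racks)          # ~476 full 42U racks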

I understand that; what I don't understand is why someone would take these over dedicated GPUs and CPUs, which are arguably better at their own jobs

I see the future! CPU + GPU in one chip! Nvidia BTFO. SSD so fast that it competes with RAM! I need more RAM! Just create a larger partition!

Less latency getting shit to and from the GPU portion. Right now they're punished by PCIe latency (and limited bandwidth) when getting data from the CPU and main memory over to the GPU.
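Rough numbers, assuming PCIe 3.0 x16 (~15.75 GB/s theoretical per direction) against a single HBM2 stack sitting right next to the GPU; this ignores the fixed per-transfer latency, which only makes the discrete case look worse:

    # Time to move a 1 GiB working set, bandwidth only
    GIB = 2**30
    pcie3_x16  = 15.75e9   # bytes/s, theoretical PCIe 3.0 x16, one direction
    hbm2_stack = 256e9     # bytes/s, one HBM2 stack

    print(GIB / pcie3_x16 * 1e3)   # ~68 ms over the bus
    print(GIB / hbm2_stack * 1e3)  # ~4 ms if the GPU already sits next to the data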

>SSD so fast that it competes with RAM!
You mean SCM?

Read up on HSA.

With your solution you have either serial (CPU) OR parallel (GPU) compute.

With HSA you have serial AND parallel at the same time, on the same data. The A in APU stands for "accelerated", so it can complete mixed tasks MUCH faster than either one alone.
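A toy sketch of that point, with made-up costs and hypothetical function names (nothing here is a real HSA runtime call):

    # Assumed, made-up timings just to show where the win comes from
    BUS_COPY = 0.010   # s, assumed cost to shove the working set over PCIe (one way)
    KERNEL   = 0.005   # s, assumed GPU time for the parallel part
    SERIAL   = 0.002   # s, assumed CPU time for the serial part

    def discrete_gpu():
        # copy in, run the kernel, copy out, then the CPU does its serial pass
        return BUS_COPY + KERNEL + BUS_COPY + SERIAL

    def hsa_apu():
        # CPU and GPU share one address space: no copies, the CPU picks up
        # the buffer the moment the kernel finishes (or works alongside it)
        return KERNEL + SERIAL

    print(discrete_gpu(), hsa_apu())   # 0.027 vs 0.007 for this toy case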

so literally squeezing every last drop of performance out of their equipment?

makes sense actually, every microsecond saved adds up to a lot of time when you consider the amount of nodes they're using

Exactly. Also, not sure if it's true, but if such a "super-APU" with HBM is possible, there's a rumor going around that with HSA the CPU cores would be able to directly access the HBM and use it as an L4 cache.

Fuck no they wouldn't.

Noice.

They can test this with EPYC 2.
Slap some HBM2 into the SP3 package, like an 8-Hi stack per die.

>CPU +GPU in one chip
>wat is APU

nice read, now I understand, thanks

fast L4 cache would be a nice thing to have

Better than that: if HBM is acting as an L4$, that means the data from the GPU is dropping into the L4 and it becomes truly heterogeneous. There is zero delay between parallel and serial compute.

They'd have to install the GPU though because of the requirement of an interposer to mount the HBM to in the first place.

>There is zero delay between parallel and serial compute.
Well, significantly less vs bouncing it from GPU memory to main memory and back for processing. Still non-zero due to the unavoidable latency from sending the data out across the IF link then through the GPU to the HBM, which IIRC has a latency penalty of its own simply due to how it works.

what size would that L4 cache need to be to do this properly?

>They'd have to install the GPU though because of the requirement of an interposer to mount the HBM to in the first place.
You can use a silicon interposer without a GPU, dummy.

So how does HBM compare in speed to RAM?

Yes, that's what an APU is. It's hard to buy an Intel CPU that isn't one.

There are two problems with what you're specifically thinking:
- Power and cooling. Performance GPUs are expected to nominally cap out at 300 W a chip, while even the most housefire of CPUs tops out around 150 W. GPUs specifically live on riser cards to move that heat away to somewhere more manageable while providing room for a lot of power circuitry; without wildly reengineering everything from sockets to PSU designs, the best you can integrate is a budget gaming GPU plus a truly garbage CPU, or a good CPU plus an >intel integrated tier GPU.

- Workload scalability: if your work is being done on GPUs, by definition it scales amazingly; you're using a GPU for its parallelization. Which means you don't just want a budget GPU, you want the beefiest one you can possibly fit. And by "one", I mean "four dual-socket parts per machine for whiteboxes, with a serious look at engineering your own backplanes to fit more".
Meanwhile, all of this can be controlled by a single CPU. So if you integrate the whole mess on a single die, you're sacrificing money and performance for seven (or more!) useless CPUs, even before we get into the problem of cramming several kilowatts of TDP into a physical standard designed for around fifty watts and later "supplemented" to 150.
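Putting rough numbers on the power half of that, using the figures from the post above (300 W per performance GPU, ~150 W for a top-end CPU, sockets and coolers built for roughly the same):

    # Why "several GPUs + a CPU on one package" breaks the socket, in watts
    GPU_TDP    = 300    # W, nominal cap for a performance GPU
    CPU_TDP    = 150    # W, top-end CPU
    SOCKET_CAP = 150    # W, roughly what current sockets and coolers are built for

    integrated = 4 * GPU_TDP + CPU_TDP
    print(integrated)               # 1350 W on one package
    print(integrated / SOCKET_CAP)  # 9x what the socket ecosystem is designed to feed and cool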

HBM2 caps at 256GB/s per stack.
That's VERY fast.
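If you want to see where that number comes from, and how it stacks up against ordinary desktop RAM (standard HBM2 and DDR4 spec figures, nothing AMD-specific):

    # HBM2: 1024-bit interface per stack, up to 2 Gb/s per pin
    hbm2 = 1024 * 2e9 / 8          # bits * bit rate / 8 = bytes per second
    print(hbm2 / 1e9)              # 256.0 GB/s per stack

    # Dual-channel DDR4-2666 for comparison: 2 channels x 64 bits x 2666 MT/s
    ddr4 = 2 * 64 * 2666e6 / 8
    print(ddr4 / 1e9)              # ~42.7 GB/s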

True, but it would require spending space on the CPU dies for the HBM interface that would otherwise not be used in most cases (Do you see AMD making Ryzen2 with a fat lump of still expensive as fuck HBM2?). IMO it would be more efficient to use the GPU for the HBM interface on its own interposer, and just link the CPUs and GPU via IF.

The HBM PHY is relatively small, and a full node shrink like 7nm LP makes it feasible.
>expensive as fuck HBM2
Meme. The volume just isn't there yet; HBM itself is relatively cheap to make, since the dies are peanut-sized.

why are we still using normal ram when this exists then? just make some hbm modules that can be popped into the motherboard and cooled with a heatsink

Capacity.
Upgradability.

damn, that's almost L3 cache levels of fast

So... A SoC?

Yes.
Did I tell you it was made by AMD?
ATi/AMD are historically good at inventing fucking memory and I don't really know why.
They don't even fab it.

>capacity
aren't there 8GB hbm2 stacks? just slap a bunch of those under a heatsink and done

There are, but that's still nowhere near enough memory.

For what?

Facebook on chrome?

No, some database on a server.

There's also an old idea of stacking SRAM under the chip itself.
Intel did that in Polaris.

it could still be used as "L3.5" cache, even 1GB of that would help a lot in some workloads

You mean L4 cache?
Also AMD needs to make IF faster and even lower latency to leverage advantages of on-package HBM.
We'll see.
It's their tech, i'm sure they'll find a good use for it.

yes, but with speeds that close to L3, it's not that far off really

IF can hold its own up to 512 GB/s; the problem is that it runs at too low a frequency on Ryzen

>the problem is that it runs at too low a frequency on Ryzen
IF is a protocol.
The physical layer speed depends on the implementation.
GMI caps at 42.6 GB/s bidirectional.
IF going through the PCIe root complex caps at 37.9 GB/s bidirectional.
If Navi truly is MCMed GPUs, we'll see what kind of PHY they will engineer for it.
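For reference, the 42.6 GB/s GMI figure falls out of the fabric clock if you assume IF runs at MEMCLK (half the DDR4 transfer rate) and a GMI link moves 32 bytes per fabric clock; the 32 B/clk width is my assumption, not an AMD-published spec:

    # Assumed derivation of the die-to-die GMI number quoted above
    memclk        = 2666e6 / 2     # DDR4-2666 -> 1333 MHz fabric/memory clock
    bytes_per_clk = 32             # assumption: 32 B per fabric clock per GMI link
    print(memclk * bytes_per_clk / 1e9)   # ~42.7 GB/s, matching the quoted 42.6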

I stopped giving a shit about amd and intel and nvidia two years ago.

Quick rundown?

Is it athlon 64 all over again?

Is my 2500k still good?

>Quick rundown?
Intel is panicking and screaming
>EPYC IS ANNUDA SHOAH
in official SKL-SP slides.
Vega may or may not be R300 2.0: electric boogaloo.

yes
also yes

Jokes aside, what has amd done?

Did they kill bulldozer?

Did they release their fucking arm+x86 soc?

Is intel TRULLY JOKES ASIDE NO HOMO NOT A PRANK NOT RUSING NOT BAMBOOZLING doomed?

>Did they kill bulldozer?
Yes.
>Did they release their fucking arm+x86 soc?
There was never one.
>Is intel TRULLY JOKES ASIDE NO HOMO NOT A PRANK NOT RUSING NOT BAMBOOZLING doomed?
If their new x86 uarch is shit, they will die like DEC did.

AMD has roughly caught up to Intel (Broadwell-E/Skylake IPC, but 20% better virtualized thread performance versus HT), but has undercut Intel in price while not resorting to Jew tactics to artificially segment their product line.
Intel is doing really poor damage control as a result.

AMD has almost caught up to Nvidia, but Vega is still not good enough. Blame poor drivers (again) rather than a shitty architecture. Nvidia is laughing at Vega's unoptimized state and not giving a single fuck.

>Vega is still not good enough
Looks like Vega is doing mighty fine where it works.

AMD released a scalable architecture with which they can just slap four 8-core dies together and get ~90% scaling. Intel's monolithic dies, with their shit yields, low clock speeds and high price tags, can't compete against this. So Intel resorted to screeching like a little kid that AMD's arch is "4 desktop dies glued together"; the result was massive hilarity and laughter all around.
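Taking those scaling numbers at face value:

    # 90% scaling across four 8-core dies, per the claim above
    dies, cores_per_die, scaling = 4, 8, 0.90
    print(dies * cores_per_die * scaling)   # 28.8 "effective" cores out of 32 physical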

youtu.be/NoelgG8JoyQ?t=7m29s

It has a larger die size than Fiji, but only 1.15% of the performance at similar clock speeds. The FE card can only beat a GTX 1070. Something has gone horribly wrong with Vega, since Nvidia's pushing 12 TFLOPs on a similar-sized die

tl;dr?

>It has a larger die size than Fiji
What are you smoking?
>1.15% of the performance at similar clock speeds

What kind of jewish tricks has intel done?

The only one i actually fell for was a "binned" 2500k

...

>What kind of jewish tricks has intel done?
Spreading FUD riiiiight in the official SKL-SP launch slides.

they released 56 cpus that are basically 15 different models with certain features on/off and they cost a fuckton of money

they're also resorting to FUDding on AMD products because they're desperate

techpowerup.com/235092/intel-says-amd-epyc-processors-glued-together-in-official-slide-deck

Now about GPUs: is AMD still powerful but also hot and an energy hog?

We don't know anything substantial about Vega.
Also GPUs are inherently housefires.

now this is some good marketing, not that ""marketing"" from intel

Yes.
There was a video about IF but they removed it.

he said "tomorrow" a lot of times, what is actually happening today?

Nothing. Looks like it was filmed a day before EPYC launch.

>8c/16thread + vega with hbm2
muh dick

Samsung needs to hurry with low-cost HBM already.
GDDR really-really needs to die already.

but he mentioned "glued together", "FUD" and "ecosystem" a number of times; wasn't he referring to those Intel slides?

These slides are from June.
AMD knew about them.
You know that Intel has no friends left anymore?
Price gouging is bad. Bad!

7:29

oh well, now Intel doesn't look like that much of a retard anymore. I wonder what they did when they saw EPYC's presentation

They look even more retarded, user.
It was a closed-door presentation for a chosen few. And Intel was showing THAT.