>According to this paper, AMD wants to get around this "large die issue" by building their Exascale APUs out of a large number of smaller dies, which are connected via a silicon interposer. This is similar to how AMD GPUs connect to HBM memory and can, in theory, be used to connect two or more GPU dies, or in this case CPU and GPU dies, to create what is effectively a larger final chip made from several smaller parts.
>In the image below you can see that this APU uses eight different CPU dies/chiplets and eight different GPU dies/chiplets to create an exascale APU that can effectively act like a single unit. If these CPU chiplets use AMD's Ryzen CPU architecture they will have a minimum of 4 CPU cores, giving this hypothetical APU a total of 32 CPU cores and 64 threads.
>This new APU type will also use onboard memory, using a next-generation memory type that can be stacked directly onto a GPU die, rather than being stacked beside the GPU like HBM. Combine this with an external bank of memory (perhaps DDR4) and AMD's new GPU memory architecture and you will have a single APU that can work with a seemingly endless amount of memory and easily compute using both CPU and GPU resources using HSA (Heterogeneous System Architecture).
>In this chip both the CPU and GPU portions can use the package's onboard memory as well as external memory, opening up a lot of interesting possibilities for the HPC market, possibilities that neither Intel nor Nvidia can provide themselves.
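Quick sanity check on the core count math in the quoted bit above (just arithmetic, assuming the 4-core Ryzen chiplets and 2-way SMT the article describes):

# back-of-the-envelope core count for the hypothetical exascale APU
cpu_chiplets = 8          # CPU dies on the interposer, per the paper's diagram
cores_per_chiplet = 4     # minimum assumed per Ryzen chiplet in the article
smt_ways = 2              # Ryzen is 2-way SMT

cores = cpu_chiplets * cores_per_chiplet        # 32
threads = cores * smt_ways                      # 64
print(f"{cores} cores / {threads} threads")     # -> 32 cores / 64 threads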
Finally, I was wondering why GPUs were first to receive on-chip memory, considering there are far fewer products which use them.
My only question is, what will board partners do with all the space created by removing RAM?
Dylan Bennett
That thing is gonna be fuckhuge. I wanna see one fully assembled.
Ryan Jenkins
Wouldn't clustering all those components together increase the thermal load by a fuckton, though?
Interesting concept, nonetheless.
Cooper Gomez
>Combine this with an external bank of memory (perhaps DDR4)
HPC needs a lot of RAM
Elijah Myers
just slap a big ass heatsink on it
David Baker
>Right now this new "Mega APU" is currently in early design stages, with no planned release date. It is clear that this design uses a new GPU design that is beyond Vega, using a next-generation memory standard which offers advantages over both GDDR and HBM.
Sebastian Davis
Put 3 RX 480 GPUs together via interposer and the system uses them as one GPU with 100% scaling
wew.
Blake Bailey
>no grid
>horsefly gets in the fan
Julian Taylor
>chiplets
When will they learn?
Henry Davis
Nvidiots on suicide watch
Aaron Murphy
>MEGA APU
I want one now. No, yesterday. GIVE ME THE FUCKING APU.
Julian Morris
not everyone lives in a stable.
Camden Reed
>>Combine this with an external bank of memory (perhaps DDR4)
Yeah, NOW. There's no reason it can't simply be migrated onto the chip.
This is a stop-gap to something which could be amazing.
Charles Russell
>what will board partners do with all the space created by removing RAM?
batteries
no more laptops
Nathaniel Miller
>horsefly
>not housecat
it's like you live in a barn
sonypony detected
Brayden Price
>80 chiplets combined into one mega chip
gattai!
Jack Moore
>My only question is, what will board partners do with all the space created by removing RAM?
More space for on-bus flash memory.
Gabriel King
don't forget the six million barns
Landon Taylor
>tfw too intelligent too kek
Joshua Flores
> what is a cache
Owen Reyes
>This new APU type will also use onboard memory, using a next-generation memory type that can be stacked directly onto a GPU die
Oh, so that's the "next gen memory" they mentioned they'd use for Navi, their successor to Vega. In 2019.
Austin Cruz
INTEL IS FINISHED
Anthony Ross
>tfw I was right about AMD moving towards an APU that acts like a desktop-grade SoC
>tfw there will come a time when all you need is a very barebones motherboard (acting as a glorified mounting place for VRMs, I/O, audio, and optionally additional PCIe lanes and memory DIMMs) and a single chip
>tfw by nature OEMs will love this for laptops and AiOs and buy them in droves
>tfw we'll eventually be able to pack console-level performance into a mac mini sized enclosure
This shit is so fucking cool. I know it'll be quite a while before consumers will be able to get their hands on these, but at least it's something genuinely new. I also have to wonder how this will affect CPU design and production in general. Using this strategy on a bigger scale could lead to massive yields per wafer, being able to salvage even dies with only a single functional core and pair them together to make up a super-budget offering. Depending on how streamlined they can make the interposer mounting process, this could very well revolutionize the way CPUs are designed and produced.
To me, this is even more exciting than Zen and Vega combined, even if I'll likely continue using high end separates for the foreseeable future.
Brody Miller
ADORED TV WAS RIGHT NAVI CONFIRMED TO BTFO NVIDIA
Eli Rogers
There is a very good reason AMD chose to refer to the HBM on Vega as a "high bandwidth cache" and why they've invested in designing a memory controller that can access many parts of the system independent of the CPU's memory controller. Vega, like Fiji, is a test bed, proof of concept, and flashy advertisement for new, forward-thinking technology they plan to use in future designs.
This is what I like about AMD. With each new product, they introduce new stuff to rethink the way computing, architecture design, and production are currently done. Tessellation, async, HBM, GCN, Mantle, FreeSync, etc. They're working to alter the entire industry in their favor one feature and one architecture at a time. They're the literal antithesis of Intel.
Julian Garcia
More battery space my man.
Ronnie will go for days
Grayson Stewart
>what if we put the chips.... ON ANOTHER CHIP
fucking AMD
Hudson Scott
This is not going to happen anytime soon, probably 2030.
AMD should focus on how to crossfire an APU and a GPU. That could happen in a few years if they try hard.
Xavier Bennett
Should I buy AMD when the markets open tomorrow?
Oliver Walker
I recommend not fucking with the stock market at all.
Charles Morales
holy fuc
Jaxson Roberts
why not shortsell intel?
Dominic Hughes
They already did that. It's called dual graphics. And it sucked.
Kayden Campbell
Holy shit couldn't this tech improve consumer CPU yields by a lot?
>Take a Ryzen "core unit" of 4 cores
>Just make a shit ton of those as their own dies
>Bin defective ones, glue some together and glue an iGPU to it
>Arbitrarily large chips made from small, high yield dies
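Rough numbers on why the yield math works out, using a toy Poisson defect model (defect density and die sizes are made up here, just to show the scaling):

import math

# toy Poisson yield model: yield = exp(-defect_density * die_area)
defect_density = 0.2          # defects per cm^2 (made-up but plausible for a new node)
wafer_area = 70000.0          # roughly the usable mm^2 on a 300 mm wafer

def good_dies(die_area_mm2):
    die_area_cm2 = die_area_mm2 / 100.0
    yield_rate = math.exp(-defect_density * die_area_cm2)
    dies_per_wafer = wafer_area / die_area_mm2   # ignores edge loss for simplicity
    return dies_per_wafer * yield_rate

big = good_dies(400.0)        # one monolithic 16-core die
small = good_dies(100.0)      # 4-core chiplet, a quarter of the area
# 4 small chiplets replace 1 big die, so compare per "16-core equivalent"
print(f"monolithic 16-core equivalents per wafer: {big:.0f}")
print(f"chiplet    16-core equivalents per wafer: {small / 4:.0f}")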
Robert Price
>crossfire APU+GPU
That's already a thing. You can crossfire an APU with an R7 250. Also, multi-GPU in DX12/Vulkan will accomplish this sort of task in software.
Justin Morales
navi is rumored to be two gpus working on the same interposer
but it's all speculation
get dem heatsinks ready boys
Caleb Watson
Just suck my dick and I'll give you 20 bucks, which is probably more than you'd make from AMD after tax.
Chase Allen
>chiplets
c-cute!!!!
Jayden Young
yes, intel is also experimenting with it. though it's not just binning for yields; since the dies are physically smaller, they can use each wafer more efficiently.
Nathaniel King
>That's already a thing.
But it probably won't suck this time.
Caleb Russell
that's very nearly what AMD's new server platform, Naples, will be.
> each Zeppelin die has 8c/16t, 16 MB L3, 2 DDR4 channels, and 32 PCIe lanes
> sell 2x MCMs with 16c/32t, 32 MB L3, 4 DDR channels, 64 PCIe lanes
> sell 4x MCMs with 32c/64t, 64 MB L3, 8 DDR channels, 128 PCIe lanes
> support 2 sockets of either of the above
they won't have great AVX capacity for HPC/simulation stuff, but they will be beastly NVMe file and web servers
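and the MCM configs really are just per-die resources times die count, e.g. (numbers taken from the post above, nothing official):

# per-Zeppelin-die resources, as listed in the post above
zeppelin = {"cores": 8, "threads": 16, "l3_mb": 16, "ddr4_channels": 2, "pcie_lanes": 32}

def mcm(dies):
    # an MCM just multiplies every per-die resource by the number of dies
    return {k: v * dies for k, v in zeppelin.items()}

print(mcm(2))  # {'cores': 16, 'threads': 32, 'l3_mb': 32, 'ddr4_channels': 4, 'pcie_lanes': 64}
print(mcm(4))  # {'cores': 32, 'threads': 64, 'l3_mb': 64, 'ddr4_channels': 8, 'pcie_lanes': 128}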
Gabriel Parker
Once saw a horsefly an inch long
Not in my home though, I don't take my noctua systems to rural sectors
Aaron Hill
fucking chiplets, when will they learn?
Matthew Evans
>> sell 2x MCMs with 16c/32t, 32 MB L3, 4 DDR channels, 64 PCIe lanes
consumer naples when
Jonathan Robinson
that barn looks really happy/angry
Cameron Ward
I'm a huge nvidiot, but if AMD could actually demonstrate that working well, with real 100% scaling efficiency, I think nvidia would be btfo.
Grayson Flores
it's not going to work like that, GPUs have a lot of wasted transistors during "crossfire". they need to redesign everything
Cameron Rivera
As someone who tried it, can confirm.
Levi Davis
high inter-die bandwidth is necessary but not sufficient for proper GPU scaling.
you need, at the absolute least, geometry setup engines that can feed a rasterizer on not just another compute block but on potentially a different die.
control issues like this mean that successful designs will take several years to design and validate. navi might still try something like this, but Vega will almost certainly not.
Ethan Sanders
so, the madmen are actually going to make navi scalable? I'd be really surprised if nvidia doesn't rebrand volta this year; they've wanted to make the same thing for a very long time, can't imagine them being ahead on this.
Lincoln Sullivan
well, technically that's how it would work. how the dies would talk with each other is another story, no idea. if they pull it off it's going to change GPUs as we know them
Henry Turner
AMD hasn't had Navi described as "scalable" in the last few roadmap slides, so who the fuck knows.
Nvidia does great work with their fixed function units (color compression, tiling rasterizers, etc.) but lagged so much on async compute/graphics shaders that I can't imagine them being first with MCM GPUs.
Zachary Cruz
Volta was just slated to have Hybrid Memory Cube memory
Maxwell was going to have an ARM processor integrated
Of course neither one of those is gonna materialize, and they pulled Pascal out of their ass later on.
Elijah Collins
So it's just the stacked-die 3D ICs that everyone has already been looking at and testing? Yeah, Intel sure is fucked.
Christian Perez
do you think they will use what they developed for zen with "infinity fabric"?
2 years back it was "scalable" in the slides, then they changed it; so was navi. navi getting pushed back to 2019 is what makes me think they're planning something
Ryan Jones
Reminder: Intel is GPU subsidiary of AMD.
Landon Hernandez
SLI/CF are fundamentally fucked nowadays since basically every modern engine pipeline uses previous frame data for effects, and AFR rendering was invented with the assumption that frames could be rendered independently.
Every new game with SLI/CF support is basically a gigantic one-off hack, which is not the way to make MCM GPUs succeed.
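Toy model of where the serialization comes from (all numbers made up, just to illustrate the previous-frame dependency):

render_ms = 16.0        # time for one GPU to render one frame
copy_ms = 4.0           # time to ship last frame's buffers to the other GPU
frames = 100

# ideal AFR: frames are independent, the two GPUs just interleave
ideal_total = frames * render_ms / 2

# single GPU baseline for comparison
single_total = frames * render_ms

# temporal effects: frame N needs frame N-1's output, so each frame waits
# for the previous one to finish AND for the cross-GPU copy
dependent_total = frames * (render_ms + copy_ms)

print(f"ideal AFR:       {frames / (ideal_total / 1000):.0f} fps")
print(f"single GPU:      {frames / (single_total / 1000):.0f} fps")
print(f"with dependency: {frames / (dependent_total / 1000):.0f} fps")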
Ryan Gomez
I want everyone to drop CF/SLI support altogether; it wastes development time on a damn 1.5% of users while skimping on optimization for single GPUs. Problem is, those users are the most obnoxious, vocal minority.
Justin Ortiz
They did spend twice as much on GPUs; I'd be pretty pissed about being that stupid too.
DirectX 12 explicit multi-adapter sounds interesting though.
Lincoln Reyes
>that chip
THICC
Carson Gray
I really love technology. Can't wait to see how much progress is made in 10 years.
Ethan Johnson
>explicit multi-adapter sounds interesting though.
it's hard to program for, which again leads to wasted dev time
Cameron Evans
HMC was always questionable for GPUs. It was more about capacity and design flexibility than bandwidth or power. Pascal (the real GP100 one, not the Maxwell shrink GTX 10x0 series) is most of what Volta was supposed to be, if you swap HBM in for HMC.
Infinity Fabric will be used for Zeppelin-Vega MCM APUs late this year or early next year. But technically IF is more a set of communication design libraries that are even used internally in their newer chips, so it's not clear what capabilities or operational characteristics/semantics it has. A HyperTransport successor doesn't have the same needs as internal pipeline structuring or control paths, etc.
Isaac Richardson
how possible do you think it is to integrate a neural net inside a chip to handle all the complicated intercommunication? will latency be unbearable, or the other way around?
Bentley Thompson
If only I could set my APU to do only "shadows/effects" and leave the rest for the main GPU.
Zachary Sullivan
>tfw I bought AMD stocks at $12
hopefully they go up enough to fund my next upgrade
Matthew Hill
dude, they will either jump to 20 or drop to 5 after the 28th. be careful
Zachary James
But is that natty?
Anthony James
Eh, I only dropped $300 on them. Not a huge deal if the price drops a heap. If they go below $5 I'll just sit on them for the long run.
Andrew Torres
That looks like a cool office park, where is it at?
Carter Howard
"neural nets" if you're generous enough to call them that, are only good in a CPU for speculative decisions, which boil down to branch prediction and prefetching at most. something like cache eviction could be done like this in principle too but wouldn't be worth it.
intra- and inter-chip communications are just protocols for buffer management and state transitions, where heuristics don't really have much of a place.
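about the closest thing to a "neural net" in a core is the perceptron branch predictor idea (Jiménez/Lin). a minimal sketch of that, not any real chip's implementation:

# minimal perceptron branch predictor sketch (one perceptron per branch PC,
# weights dotted with global history -> taken / not-taken guess)
HIST_LEN = 8
TABLE_SIZE = 64
THRESHOLD = 16                 # training threshold, picked arbitrarily here

weights = [[0] * (HIST_LEN + 1) for _ in range(TABLE_SIZE)]   # +1 for the bias weight
history = [1] * HIST_LEN                                      # +1 = taken, -1 = not taken

def predict(pc):
    w = weights[pc % TABLE_SIZE]
    y = w[0] + sum(wi * hi for wi, hi in zip(w[1:], history))
    return y, y >= 0           # predict taken when the dot product is non-negative

def update(pc, taken):
    y, pred = predict(pc)
    t = 1 if taken else -1
    w = weights[pc % TABLE_SIZE]
    if pred != taken or abs(y) <= THRESHOLD:   # train on misses / low confidence
        w[0] += t
        for i in range(HIST_LEN):
            w[i + 1] += t * history[i]
    history.pop(0)             # shift the new outcome into the global history
    history.append(t)

# toy loop branch: taken 7 times, not taken once -- the predictor learns the pattern
misses = 0
for trial in range(200):
    for i in range(8):
        taken = i < 7
        _, pred = predict(0x400)
        if trial >= 100 and pred != taken:     # count misses after warm-up
            misses += 1
        update(0x400, taken)
print(f"mispredicts after warm-up: {misses} / 800")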
Henry Hughes
THEY AREN'T MAKING THIS.
Joseph Davis
>32 CPU cores
*"mOar corez!!1!!" off in the distance*
Jaxson Cook
Wouldn't it be possible to simply build a scheduler into the interposer (say at 28nm, so the traces stay small but the scheduler doesn't totally suck) and have that run the inter-die communication? I suspect that you could even break it down further, with smaller schedulers tying together multiple GPU dies, and that scheduler+infinity fabric then talks to the CPU scheduler+infinity fabric.
I'm clearly not a CE or EE, but would such a design be workable?
Hudson Lewis
>chiplets
are these like the manlets of the CPU world? I'm skeptical in that case!
Ryder Powell
Is it bad, gaymer?
Jaxson Wilson
>chiplets when wil they ever learn
Oliver Walker
>next-generation memory type that can be stacked directly onto a GPU die
doesn't this cause heat transfer issues?
John Wright
Should've been called subchip
Jonathan White
The term chiplet has a specific meaning: it implies die stacking, not to be confused with 3D stacking. Die stacking means you fab small parts and connect them via an interposer to reduce cost on a low-yielding, extremely dense node. Instead of one CPU die with 16 cores, they're proposing 4 smaller dies, each with 4 cores. A couple years ago AMD proposed even breaking up the CPU itself into individual components to fab them all on separate processes tailored specifically for each part.
Jaxson Barnes
>chiplets
When will they learn?
Christopher Cruz
Neat
2022 it says.
2030 is when Raja says discrete GPUs won't even exist anymore.
Ayden Perry
>we're moving towards fully modular CPU fabrication
This shit is so cool. This is how technology markets are supposed to work. Everyone's so concerned with reaching the limit in shrinking transistors they've forgotten how many other ways there are to innovate products from the initial design phase up. This is what it looks like when there's real competition: the potential for methods that turn entire pre-established norms on their fucking heads.
Charles Wood
would massively increase yield/wafer
cash money
Hudson Diaz
It will require an expensive MB and come with a WC unit like that 220W CPU ;^)
Aiden Cook
If they could reliably package it then it would be an enormous cost savings. Individual parts costing perhaps pennies instead of monolithic dies costing upwards of $20-$50 each. 2.5D integration would be pretty big.
Ryan Sullivan
Raja is wrong. GPUs would last longer. power limit/VRM capacity would be a huge factor, unless you're willing to pay a premium for the MB
Noah Evans
When you can do 2 TFLOPs of full precision, which is a bit more powerful than a 1080, on a single APU iGPU die (there are 8 of these dies in the server APU), and that is only 5 years from now, then 13 years from now he's probably right.
Easton Russell
If everyone loses money playing the stock market then who gains money? Checkmate atheists.
Zachary Brooks
...
Michael Kelly
>then who gains money
Cameron Richardson
I wonder how long the PS3 and 360 could've lasted commercially if MS and Sony had waited for something like this instead of going with the Bobcat family, since it was obvious they wanted an APU solution for the PS4 and XB1, but both opted for the only low-power moar coar option they had at the time
Consoles could ACTUALLY be competitive if they had a Ryzen-based HBM APU
Evan Wilson
i remember jewlander, good movie
Jack Reed
Consoles went with the lowest bidder for a credible platform, which ended up being mostly AMD.
Bulldozer was too high-power for the platform, AMD had nothing else to sell except for Bobcat/Jaguar, Intel didn't have credible GPU performance, and Jen-Hsun was still crying salty tears over never being able to get an x86 license.
Evan Carter
that's what they said about fusion back then, just think, a 300 watt apu, lol, hope. discrete GPUs will stay, like how DDR is still used even with HBM
Isaiah Parker
kek
Christian Nelson
Um. They're projecting 200 watts for a 16 TFLOP full precision APU, which sounds reasonable. A 4 TFLOP one won't be 300 watts. Especially in 2030 when it's on a 4nm fab or smaller instead of 7nm.
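the arithmetic behind those projections, taking the thread's numbers at face value:

# projected exascale APU numbers from the thread, taken at face value
gpu_dies = 8
tflops_per_die = 2.0          # "full precision" per GPU chiplet, as the post puts it
power_w = 200.0               # projected package power

total_tflops = gpu_dies * tflops_per_die          # 16 TFLOPs
gflops_per_watt = total_tflops * 1000 / power_w   # 80 GFLOPs/W
nodes_for_exaflop = 1e6 / total_tflops            # ~62500 APUs for 1 EFLOP, peak, ignoring efficiency
print(total_tflops, gflops_per_watt, round(nodes_for_exaflop))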