Thank you based AMD

Reminder that Naples can run a bunch of NVMe devices, InfiniBand, and SIX GPUs at PCIe x16 off a single CPU.
For comparison... the equivalent Intel Xeon can run 3 GPUs at x16 and nothing else.

Demolished.

How is this even fair anymore?

Look at this fucking thing, it gets your dick hard just thinking about that 4TB memory per node.

I hope Intel can survive this.

>P2P communication between GPUs

That caught my eye. If this is possible with the Infinity Fabric, shit's about to get very real, and since these are Vega chips, we'll see in 2 months.

They just need to hire some software developers to make OpenCL viable against CUDA.

Then another hundred or so linux developers to bring the OpenGPU drivers to the masses.

And then they can start dominating.

GPUs have always been secondary to AMD, but they seem to have a good answer to CUDA in ROCm.

...

...

There's a reason Vega has an 8GB "cache".

And each CCX in Ryzen has a big 8MB L3 victim cache that's filled from L2 evictions.

>board is empty besides a BMC

Beautiful, this is what a SoC is!

is it normal to get hard from looking at this?

I'm thinking the same. AMD has had more than enough time to release a 1080 competitor if they wanted to; from a pure performance perspective, just Polaris with more shaders and higher clocks would do it. But they're waiting a full year for something, and with Vega that something isn't raw performance, so what are they keeping hidden?

It's because Vega is dependent on HBM2, and HBM prices are too high to launch for consumers at this point.

No, it's just two stacks of HBM; that isn't expensive. You know what's expensive? 8 G5X chips and all the motherboard traces and cooling for them.

HBM price isn't an issue. There are only two stacks, and on top of that Hynix offers cheaper, lower-clocked stacks. Fury X had a more expensive HBM setup and it's still sold.

There's something more at play here.

This, people seriously underestimate the price of GDDR.

Maybe something related with this youtu.be/vyKgT5QUU2Q?t=1h5m40s ?

Noob question but what does SoC refer to and why is it significant with Zen?

System on Chip

SoC is a system on a chip. It condenses things like a motherboard's northbridge and southbridge, and sometimes even PCIe lanes or a wireless network card, like on cellphones.

It's pretty significant on Ryzen because the CPU also contains the memory controller, which in turn is part of the Infinity Fabric. It's why, in order to maximize RAM speeds on Ryzen, you need to increase the base clock.

Everything largely runs straight to the CPU, which reduces latency and power consumption. This shows up in practice: Ryzen has lower input latency than Intel when using USB ports wired to the CPU instead of the motherboard's chipset.
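Rough sketch of how that base-clock dependence plays out for memory speed (the 100 MHz reference clock and the 13.33x ratio below are just assumed example values, not what any specific AM4 board exposes):

```python
# Sketch: effective DRAM transfer rate on Ryzen follows BCLK * memory ratio.
# The reference clock and the ratio are assumed example values for illustration.

def dram_transfer_rate(bclk_mhz: float, mem_ratio: float) -> float:
    """Effective DDR rate in MT/s: BCLK * ratio * 2 (double data rate)."""
    return bclk_mhz * mem_ratio * 2

print(dram_transfer_rate(100.0, 13.33))  # ~2666 MT/s at stock BCLK
print(dram_transfer_rate(105.0, 13.33))  # ~2799 MT/s after a BCLK bump
```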

Are they going back to single slot graphics cards or some shit
Because this is what's being implied in that image

Special mobo designed by Microsoft for use in Azure?

All of their Radeon Pro cards are single slot except for the Hawaii-based one.

>they're waiting a full year for something and it's not performance with Vega, what are they keeping hidden?
Probably the fact that it's a complete piece of shit and they're just pulling the Project Scorpio marketing tactic of "what if?" and "soon™"

Is this the comeback of AMD?

why are they still making the hawaii ones?
i mean i luve muh 290X housefire, but i thought people who run server farms are among the few who actually care about power consumption

Oh look, a living Sup Forumsedditor.

I bring up some current event in video games to compare the marketing tactics being used by AMD here and some faggot instantly attacks me instead of trying to contribute to the thread's topic. Just fuck off already.

MAKE AMADA HAPPY! BUY RADEON EVERYTHING!

QUADS DEMAND IT!

Literally another shoah.

CHECKED

Back to Sup Forumseddit, child.

>2012: Is this the comeback of AMD?
>2017: Is this the comeback of AMD?

etc. etc.

The SSD is just a re-badged OCZ. Pretty decent drive though.

...

because they don't have a replacement yet.

Why isn't Vega released yet? I suppose not enough manpower to handle two big releases at the same time.

yea but radeon graphics are dogshit. and vega is a failure.

>because they don't have a replacement yet.
Polaris 10 is a thing and as fast as the big Hawaii core, probably also way cheaper to make

>Vega is a failure
It's not even out yet.

As opposed to Intel? The APUs are going to sell like hotcakes.

...

Not for FP64, which is still Hawaii's turf

MS Project Olympus, but it's probably gonna be used in more than Azure

...

Skylake-EX is 44 PCIe lanes; Naples is 128.
For Skylake to even reach that, you need chipset PCIe lanes or a PLX chip, and both increase platform power on top of having much higher latency than lanes from the CPU.

Seriously, AMD hit a home run on this, holy shit.
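Back-of-the-envelope version of that lane budget (the device mix below is an arbitrary assumption; the lane counts are the ones quoted above):

```python
# Sketch: how many x16 GPUs plus x4 NVMe drives fit into each CPU's native
# PCIe lane budget. Lane counts are from the posts above; the device mix is
# just an illustrative assumption.

def devices_fit(total_lanes: int, gpus: int, nvme: int) -> bool:
    """True if gpus (x16 each) and nvme drives (x4 each) fit in total_lanes."""
    return gpus * 16 + nvme * 4 <= total_lanes

naples_lanes = 128  # single-socket Naples
xeon_lanes = 44     # figure quoted above for the Intel part

print(devices_fit(naples_lanes, gpus=6, nvme=8))  # True  (96 + 32 = 128)
print(devices_fit(xeon_lanes, gpus=6, nvme=8))    # False (needs PLX/chipset lanes)
print(devices_fit(xeon_lanes, gpus=2, nvme=3))    # True  (32 + 12 = 44)
```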

Why is NVMe a thing?
Is SATA not enough?

SATA tops out around 500MB/s and its 4K read/write access times are several times higher than NVMe's.

For a simple storage server that's fine, but when you're serving 10k to 1M customers you'll want every latency reduction you can get; the less the CPU waits on the block devices, the better.
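Quick sketch of why the access-time gap matters at that scale (the latency figures are assumed ballpark numbers, not measurements):

```python
# Sketch: CPU time spent waiting on the block device for a burst of random
# 4K reads issued one at a time. Latencies are ballpark assumptions for a
# SATA SSD vs an NVMe SSD, not benchmark results.

def device_wait_seconds(n_reads: int, latency_us: float) -> float:
    """Total wait for n_reads sequentially issued 4K reads."""
    return n_reads * latency_us / 1e6

requests = 1_000_000  # e.g. a burst from many concurrent clients
print(device_wait_seconds(requests, latency_us=100.0))  # ~100 s on SATA
print(device_wait_seconds(requests, latency_us=20.0))   # ~20 s on NVMe
```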

2-socket servers will use 64 of those lanes for the interconnect, but I'm wondering if a 1S can expose more than 64 (minus the ones used for SATA/USB etc.).

There's been no word on 4s+ either.

This is how it goes:

1P setup = 128 PCIe lanes total from one CPU
2P setup = also 128 lanes total, at 64 x 2

Each CPU uses 64 of its PCIe lanes for socket communication, so you have 64 + 64 = 128.


4P boards aren't gonna happen; 4P is a tiny part of the market, 2-3% IIRC, and AMD is not targeting it with Naples.
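Same accounting as a trivial sketch (lane figures as quoted above):

```python
# Sketch: PCIe lanes exposed to the system for 1P and 2P Naples, using the
# per-socket figures from the post above.

LANES_PER_PACKAGE = 128  # off-package lanes per Naples package
INTERCONNECT_LANES = 64  # lanes each socket spends on the 2P link

def exposed_pcie_lanes(sockets: int) -> int:
    """Lanes left over for PCIe after paying for the socket-to-socket link."""
    if sockets == 1:
        return LANES_PER_PACKAGE
    if sockets == 2:
        return 2 * (LANES_PER_PACKAGE - INTERCONNECT_LANES)
    raise ValueError("Naples tops out at 2 sockets")

print(exposed_pcie_lanes(1))  # 128
print(exposed_pcie_lanes(2))  # 128 (64 + 64)
```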

When is Naples out? I want an ATX board with Naples for a workstation.

just wait

Nipples :3

That is everything I was hoping for.

Unless you have a workload which is very specifically AVX2-heavy, or which Intel gave locked, dedicated silicon to (and AMD are much more enthusiastic about customising SoCs, which is why the consoles went with them), or which is exclusively single-threaded and CPU-bound (and that does not describe much in the way of server instances), that is a pretty spectacular showing against Intel's ridiculously priced best Xeons.

Ryzen was competitive - certainly not dominating at gaming, but pretty good and deserving to be mentioned in the same breath once more (especially with VS2017 compiles) and brought competition back after a long drought.

But this... this is what we were waiting for. The heady return of the Opteron glory days, right into Intel's most lucrative market. Xeons have dominated for so long in servers they got totally complacent. They are probably going to be furious, defensive, and aggressively FUDding and shilling everything they can against it. They are going to be offering sweetheart discounts to Google, MS, Facebook, all the big cloud providers, everyone. Intel are going to shit their pants about this - and that means everyone wins.

Songs and dances about 3D XPoint (hugely underwhelming for the next decade) won't stop the big corps looking into this. And several years from now you'll likely see a return of the Facebook special.

Let's see the pricing. I might take a rack of these instead of v7s. I'm certainly going to be suggesting that I might, and seeing what they'll do to respond...

maybe for FP64?

most of their newer shit pulled nvidia's trick and cucked the shit out of FP64

Intel will be relegated to buying hookers and blow for the hardware guys, only to have them buy AMD anyway because their employers will buy them more hookers and blow for saving YUGE money on Naples. I've already talked to guys who said "fuck Intel, where was my deal 2-3 years ago?".

user, Intel already lost MS.

Latter half of Q2

>Naples
Who cares? I have neither the need nor the money to afford server hardware.

Who are you again?

4TB is theoretically possible but unrealistic.
Most 2S Naples machines will be using 32GB DIMMs for 1TB total capacity.

>4TB is theoretically possible but unrealistic.
Jewgle/jewbook are probably cumming buckets for MOAR RAM.

>4TB is theoretically possible but unrealistic.
Anyone with enough money to blow on hardware, and that's pretty much Google/MS/Amazon/Alibaba/Paypal/Ebay/etcetc will have no qualms about buying 128GB LRDIMMs

So no, it's not really unrealistic

Most server cards are single slot

Everyone wants more, but I don't think you're aware of how sharp the elbow on the size/$ curve for memory is.

> 32GB: $300+ ($10k for 1TB)
> 64GB: $900+ ($30k for 2TB)
> 128GB: $3k+ ($100k+ for 4TB)

Companies spend a lot of time on dividing up tasks in software so cheaper machines can in aggregate do a comparable amount of work. I'll admit that some tasks can't be effectively split and can benefit from more local RAM, but they're in the tiny minority.
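Spelling the elbow out with the prices quoted above (a sketch that assumes a 2-socket board with 32 DIMM slots; the per-DIMM prices are the rough figures from the post):

```python
# Sketch: cost to populate 32 DIMM slots (assumed 2S Naples board, 16
# channels x 2 DPC) at each DIMM size, using the rough prices quoted above.

DIMM_SLOTS = 32
configs = {32: 300, 64: 900, 128: 3000}  # DIMM size (GB) -> approx $ per DIMM

for size_gb, price in configs.items():
    capacity_tb = DIMM_SLOTS * size_gb / 1024
    total_cost = DIMM_SLOTS * price
    print(f"{size_gb:>3} GB DIMMs: {capacity_tb:.0f} TB for ~${total_cost:,} "
          f"(~${total_cost / capacity_tb:,.0f}/TB)")
# 32 GB:  1 TB for ~$9,600  (~$9,600/TB)
# 64 GB:  2 TB for ~$28,800 (~$14,400/TB)
# 128 GB: 4 TB for ~$96,000 (~$24,000/TB)
```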

I can't wait for vertically mounted single slot cards for easy hotswapping

But you're forgetting that these companies run petabytes of memory across their farms, and DIMMs draw power, which adds up fast at that scale.
So fewer DIMMs with more memory per DIMM also means less cooling needed and lower electricity costs.

SMBs won't buy 128GB DIMMs, but oil and gas and big software companies will

Do you think larger capacity DIMMs don't use more power?

A 4TB pool on 16 DDR4 channels would take around 15 seconds to read into the CPUs, while the DRAM array gets refreshed roughly 15 times per second (every 64 ms), so it should be pretty clear that row-access power, at least, is not dominated by IO requests.
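Rough numbers behind that (the per-channel bandwidth is an assumed DDR4-2666 figure; 64 ms is the usual full-array refresh interval):

```python
# Sketch: time to stream a full 4 TB pool through the CPUs vs. how often the
# DRAM array refreshes itself. Per-channel bandwidth is an assumption
# (DDR4-2666, 8-byte bus); 64 ms is the standard full-array refresh interval.

POOL_BYTES = 4 * 1024**4                # 4 TB of DRAM
CHANNELS = 16                           # 2S Naples, 8 channels per socket
BYTES_PER_SEC_PER_CHANNEL = 2666e6 * 8  # ~21.3 GB/s theoretical per channel

full_read_s = POOL_BYTES / (CHANNELS * BYTES_PER_SEC_PER_CHANNEL)
refresh_passes_per_s = 1 / 0.064

print(f"full read of the pool: ~{full_read_s:.0f} s")             # ~13 s
print(f"refresh passes per second: ~{refresh_passes_per_s:.1f}")  # ~15.6
# Roughly 200 full refresh passes happen before one full read pass completes,
# so refresh/row-activation power swamps whatever the IO requests add.
```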

They survived openly lying about their clock speeds to consumers for almost a decade; I think they'll be okay.

that's sad, I was dreaming about running 128 cores / 256 threads on the same machine with 8TB of RAM

Come on, you can't even fit 64 DIMMs on a 4P motherboard even if you tried.

I bet the thing would cost like $20k

It'd have to be 1 DIMM per channel due to space restrictions, so actual RAM capacity would still be capped at 4TB.

It would also be fuckstupid expensive.
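The cap works out like this (assuming the 8 channels per socket and 128GB LRDIMMs mentioned earlier; the 4P platform itself is hypothetical):

```python
# Sketch: max RAM for a hypothetical 4P Naples board limited to 1 DIMM per
# channel. 8 channels per socket and 128 GB LRDIMMs are figures from the
# thread; the 4P platform is hypothetical.

sockets = 4
channels_per_socket = 8
dimms_per_channel = 1
dimm_gb = 128

print(sockets * channels_per_socket * dimms_per_channel * dimm_gb)  # 4096 GB = 4 TB
```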

What is InfiniBand used for? Is it just Ethernet on steroids?

Is there even an arch that supports 1DPC? Naples is 2DPC, Broadwell-E is 3, Skylake-X is 2 as well IIRC

Uh, all of them do. Just don't add a second DIMM slot per channel. 1DPC is the minimum required to run a memory channel, after all.

Hell, I'm running 2 Xeons (LGA 1366 for one, LGA 2011 v1 for the second) at 1DPC. If anything, on any modern machine, running fewer DIMMs per channel allows higher RAM clock speeds due to less stress on the memory controllers, registered/LR DIMMs or not.

How many channels?

well, at least you get full speed on 100% of the RAM, you know...

3 channels on the 1366 chip (board is a DX58SO), 4 channels on the 2011 v1 chip (board is a Biostar TPower X79). Excluding the 2 slots for the first channel on the DX58SO, both boards have 1 slot per channel, hence 1 DIMM per channel.

Even though I don't think Ryzen has quite enough IO for 4S systems (32 PCIe lanes per die, with half already used for the inter-socket link in 2S), 4S with 64 total DIMM slots is physically possible if you're willing to skimp on PCIe slots and/or use an elongated mobo/chassis.

The only reason 4S 64 DIMM builds haven't existed already is that 4DPC just doesn't exist.

ooooooo the implications are nice.

Wouldn't a theoretical 4P Naples system have 256 PCIe lanes?

How is that not enough I/O? I simply think AMD didn't bother designing the Infinity Fabric to scale to 4P at the moment.

>>P2P communication between GPUs
>That got my eye, if this is possible with the Infinity Fabric
What am I missing here?
Hasn't that been the principle behind CFX/XDMA on Radeon GPUs on any number of host platforms for years?

Is there some subtle shortcoming in existing on-CPU PCIe root complexes that generally doesn't get discussed?

If you are a cloud provider this thing is god-sent.

>Thinking AMD is not lying like they did with Ryzen
Mmmmmhhhhh....

There's a pretty big difference between point-to-point (CFX) and peer-to-peer (assuming Infinity Fabric).

Yes MS certainly believed pajeet lies and now AMD is part of Project Olympus.

Each 4-die Naples MCM has 128 off-package IO lanes:
1S Naples has 128 PCIe lanes.
2S Naples has 64 PCIe lanes from each socket, and 64 interconnect lanes between the two sockets.

4S systems have either 2 (shitty) or 3 (good, fully connected) interconnect links per socket, which would need to come out of Naples' total IO lane budget.
E.g., 3 x32 interconnect links + 32 PCIe lanes per socket = still 128 PCIe lanes across the system, but only half the bandwidth between sockets.
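The same budgeting as a quick sketch (the 4S split is hypothetical, since nothing beyond 2S has been announced):

```python
# Sketch: per-socket lane budgeting for Naples topologies. 128 off-package
# lanes per socket is from the post above; the 4S split (3 x32 links for a
# fully connected mesh) is a hypothetical assumption.

TOTAL_LANES = 128

def budget(links: int, lanes_per_link: int, sockets: int):
    """Return (PCIe lanes per socket, PCIe lanes system-wide)."""
    pcie_per_socket = TOTAL_LANES - links * lanes_per_link
    return pcie_per_socket, pcie_per_socket * sockets

print(budget(links=1, lanes_per_link=64, sockets=2))  # (64, 128) full-width 2P link
print(budget(links=3, lanes_per_link=32, sockets=4))  # (32, 128) half-width 4P mesh
```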

what's the distinction, in general and in terms of performance implications?

...

It means one device won't have to traverse multiple devices, incurring literally tens of thousands of clock cycles of latency, to read from some other device's cache.

This isn't a consumer feature, if you're looking at it in gaming terms

what about 3 sockets?

Power of two, always power of two.

at least you would have direct communication, instead of going through an intermediate processor to get to the one at the other end, or losing more lanes

there aren't enough IO lanes to make it worthwhile, considering that multi-lane links need to be power-of-2 lanes wide for cleaner designs too.

2 x64 interconnect links = no PCIe IO whatsoever.
2 x32 interconnect links = still half speed, and 192 total PCIe lanes largely wasted for most users.

I think AMD will either have some other way for socket-to-socket talk, like QPI, or they'll have more lanes per Zeppelin die next year.

Then there's the fact such a machine would have 16 NUMA nodes to contend with, one for each die. Hell, AMD has never gone beyond 8 nodes with any of their previous designs (G34 was 2 nodes per processor, earlier designs were 1 per processor, and those were typically limited to 4 sockets per system with 1 exception), so it simply wouldn't fit AMD's history to push harder than that.


As for the 1 exception: Tyan (server and workstation mobo company) went full-on fucking derp and made an 8-socket system, with 4 of those sockets hanging off a slave motherboard. How they got that wired up is something I don't think I'll ever figure out.

oh my.

AM4 is supposed to last through 3 gens (Zen 2/Zen 3 or Zen+/Zen++ or whatever), so any extra IO lanes would likely be wasted on consumer products.

On the other side of things, Naples will already have 5500-6000 pin sockets, so it's hard to see them squeezing in even more.

The motherboard trace routing has already got to be a nightmare with 2 x 8 memory channels, and it would probably take a lot more layers to pull off a 4S design with that constraint already firmly in place.

H-how many pins?