This is AMD's Vega, which you're gonna see in a month or two on the shelves.
It's a 4096 shader design, like Fury X was. The difference is that this Vega (Instinct) runs at 1540MHz, while the Fury X runs at 1050MHz. The consumer Vega should run slightly higher clocks, but not all that much higher.
Do some math, and you'll roughly understand the performance even without any architectural updates over Fury/Fiji. So, what are you waiting for? Argue away.
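For anyone who wants the math spelled out, here is a rough sketch using only the shader count and clocks quoted above. It assumes 2 FLOPs per shader per clock (one fused multiply-add) and perfectly linear scaling with clock, which real workloads rarely deliver; the function name is made up for illustration.

```python
# Back-of-the-envelope peak throughput from the numbers quoted in the post.
# Assumptions (not from the post): 2 FLOPs per shader per clock (one FMA),
# and performance scaling linearly with clock speed.
SHADERS = 4096
FLOPS_PER_SHADER_PER_CLOCK = 2


def peak_fp32_tflops(clock_mhz, shaders=SHADERS):
    """Theoretical peak fp32 TFLOPS for a given shader count and clock."""
    flops = shaders * FLOPS_PER_SHADER_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


fury_x_tflops = peak_fp32_tflops(1050)  # Fury X clock from the post
vega_tflops = peak_fp32_tflops(1540)    # Vega (Instinct) clock from the post
clock_ratio = 1540 / 1050               # uplift from clock speed alone

print(f"Fury X: {fury_x_tflops:.2f} TFLOPS")
print(f"Vega:   {vega_tflops:.2f} TFLOPS")
print(f"Clock-only uplift: {(clock_ratio - 1) * 100:.0f}%")
```

That is peak throughput only; it says nothing about how much of it a game actually uses.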
This isn't Polaris? Duh, different architecture, different pipeline and clocks, not very smart are you?
This card was already released in December, why are you arguing facts?
Kevin Lee
MI25 = 25 TFLOPS fp16 = 12.5 TFLOPS fp32 = ~1525 MHz on a 4096 ALU chip.
That's also on an enterprise part with a passive heatsink that relies on server fans to blow through it. Historically AMD sells consumer part with active cooling clocked 5-10% higher. Highest end Vega will almost certainly have 1600-1650ish clocks.
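The clock figure above can be checked with a quick sketch. It assumes packed fp16 runs at exactly twice the fp32 rate and that each ALU does one FMA (2 FLOPs) per clock; the helper name is hypothetical.

```python
# Work backwards from the MI25's rated throughput to its clock speed.
# Assumptions: fp16 rate is exactly 2x fp32, and 2 FLOPs per ALU per clock.
def clock_mhz_from_tflops(fp32_tflops, alus=4096, flops_per_alu_per_clock=2):
    """Clock in MHz needed to hit a given peak fp32 TFLOPS figure."""
    return fp32_tflops * 1e12 / (alus * flops_per_alu_per_clock) / 1e6


mi25_fp16_tflops = 25.0
mi25_fp32_tflops = mi25_fp16_tflops / 2
mi25_clock_mhz = clock_mhz_from_tflops(mi25_fp32_tflops)

# Consumer parts clocked 5-10% higher, per the post's historical pattern.
consumer_low_mhz = mi25_clock_mhz * 1.05
consumer_high_mhz = mi25_clock_mhz * 1.10

print(f"MI25 clock: ~{mi25_clock_mhz:.0f} MHz")
print(f"Consumer guess: {consumer_low_mhz:.0f}-{consumer_high_mhz:.0f} MHz")
```

A straight 5-10% bump lands a little above the top of the 1600-1650ish guess, so treat both as ballpark figures.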
Parker Foster
>why are you arguing facts?
They know that even just a Fury X at those clocks is devastating to 1080ti, as AMD's IPC is far higher than Nvidia's at the moment. Not to mention the biggest arch change for AMD since GCN rolled out 6 years ago.
So, you know, shut it down?
Josiah Myers
You know what this means? If they didn't increase the shader count from 28nm to 14nm, that means that the shaders are much bigger, ergo their IPC is notably higher. Add to that a dramatic increase in clockspeed and this thing will be insane.
Cameron Thompson
Honestly, the whole Pascal lineup outside of P100 felt like a stopgap to me, no ALU increase at all, just purely clockspeed, I expected more from the jump to 16nm FinFETs honestly.
P100 is impressive though, but GP102 and anything smaller felt underwhelming
David Smith
those are some big shaders.
Colton Smith
GCN's IPC is actually fractionally worse than Maxwell's, and Vega's play accordingly is more on efficiency than anything else.
Fiji had a ton of FLOPS that went underutilized or outright wasted due to a number of architectural shortcomings.
Liam Clark
Actual arch change is costly and takes fuckload of time, see Vega. Also NVIDIA needs hardware async in Volta.
Juan Brown
Yeah, but Nvidia already had notably bigger shaders than AMD did, this makes them roughly equal in size.
Levi Ortiz
the shaders aren't bigger, the chip is just smaller
Brayden Campbell
I had that lined up perfectly for you, how did you not say "For you."
Andrew Jones
The 480 keeps up with a 1060 with far lower clocks though?
Ian Robinson
Pascal has some neat tricks, what's the most impressive for me is that they managed to get good clock signal quality at those clocks, this is by no means a small feat, it's like IBM pushing 4.0GHz clocks on those POWER8 monsters, degradation happens at high clocks, this is a silicon issue though.
But at its core, Pascal is a higher clocked Maxwell, not a big arch change like Fermi > Kepler or Terascale > GCN was
Gabriel Williams
Vega is AMD's next big architecture change while Volta is Nvidia's next big change, makes sense they're taking their time with it
Grayson Gutierrez
4u
Hunter Hughes
I'd like to keep memes out of any slight technical discussion as I still somewhat care about this board and its rapidly decaying post quality.
Kevin Howard
Too late, the moment's over now - he ruined it.
Daniel Miller
>conso- I mean GPU wars
>board quality
fuck off Sup Forums
Aaron Cook
and many more shaders
Christopher Sanders
To add to this, high clockspeeds also have latency and precision hiccups, which is why you don't see 4.5GHz low core count Xeons for markets that strictly need ST performance and where reliability is important. Instead you get lower-clocked parts with a shitload of L3 cache (up to 60MB); yes, 4 core Xeons with 60MB LLC.
Owen James
But they're at similar die sizes (some 5-10%?), so AMD's shaders should be far smaller.
Jaxon Nelson
That should mean AMD's IPC per mm2 is higher, rather than saying their IPC per ALU is higher.
Nicholas Murphy
Technically that's right, but this is mostly nitpicking though, AMD's simple shaders aren't directly comparable to Nvidia's complex ones.
Still, this discussion came into being from the discussion of Vega having the same amount of ALUs as Fury X, which indicates a large IPC increase, and since the 1080ti is somewhere around 3600 CUDA cores, direct ALU to ALU comparison between AMD and Nvidia will be much easier now.
Austin Long
That is if Vega's die size is around GP102's die size.
Daniel Myers
It's a bit larger.
Jaxson Lopez
Even better, it has slightly more shaders with a bit larger die size, that should make this comparison even more direct since you're comparing ALUs of similar size. Gp102 isn't exactly small though, and I haven't seen (big) Vega's size yet.
Benjamin Torres
While we're on this topic, anyone know if AMD is putting a super fat scalar unit in their CUs in Vega while still mostly having vector ones? This would be a pretty amazing thing. I've seen a patent about it, and it would explain why this launch is taking this long.
Juan Roberts
>The 480 keeps up with a 1060 with far lower clocks though?
The 480 is at rough parity with the 1060 but is 15% bigger, has 33% more RAM size/bandwidth, and uses 25% more power.
GCN is amazingly competitive for a 5 year old architecture, but AMD really needs Vega to be a big efficiency jump if it wants to even be competitive in the future.
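Those percentages roughly check out. A quick sketch, assuming the commonly listed specs for the 8GB RX 480 and the 6GB GTX 1060 (die size, VRAM, memory bandwidth, board power); these figures are recalled from public spec sheets and are not from this thread, so treat them as assumptions.

```python
# Spec comparison behind the "15% bigger, 33% more RAM, 25% more power" claim.
# All figures below are assumed from public spec sheets, not from the thread.
rx480 = {"die_mm2": 232, "vram_gb": 8, "bandwidth_gbs": 256, "board_power_w": 150}
gtx1060 = {"die_mm2": 200, "vram_gb": 6, "bandwidth_gbs": 192, "board_power_w": 120}


def percent_more(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1) * 100


deltas = {key: percent_more(rx480[key], gtx1060[key]) for key in rx480}

for key, pct in deltas.items():
    print(f"{key}: RX 480 is {pct:.0f}% higher")
```

Rated board power is not measured draw, so the power gap in actual reviews may differ.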
Connor Ross
Then again Polaris has nearly the same perf/watt as Fiji despite being on 14nm, so Polaris shouldn't really be part of any discussion on efficiency, since AMD obviously prioritized cost on it over anything else.
Chase Campbell
I think I can give the answer to that: Polaris is for all intents and purposes a Hawaii cut in half with some frontend and I/O changes. Its ALU array is not very dense and its caches and SRAM cells are neither low power nor high performance. Looking purely at chip design, someone must have made this while shitfaced. From a price standpoint, though, this thing is easy to fab and make, which I guess was AMD's market hook when they released this.
Jose Torres
Also its low amount of fixed function units like ROPs is pretty telling.
Aaron Miller
Polaris is not Fiji cut in half, it's Fiji with nearly half its shaders removed, which isn't the same thing.
Cooper Baker
my nama jeff
Jayden Gray
...
Wyatt Richardson
Its memory system is nothing like Fiji's though
Jaxon Watson
Fiji had some very suspect performance regarding its memory subsystem.
While Polaris doesn't look like it actively chokes in nearly the same way, it's clear that it's still bandwidth starved (mem OCs matter), indicating that better delta color compression, tiling rasterizers, etc. would have helped a lot.
Carson Sanders
Fiji was just unbalanced: high ALU and TMU throughput, catastrophic polyfill, either from lack of ROPs or something else (L2$?)
Tyler Wilson
the memory subsystem is structurally identical and just varying in external bus and cache slice sizes.
the most commonly accepted theory is that GCN's central memory crossbar just doesn't scale past a certain point, which Fiji apparently hits. the signs are that realizable HBM bandwidth is surprisingly low and that DCC yields really meager improvements:
AYYMD HOUSEFIRES REBRANDEON can't even hit 1.5K, you think a much bigger AYYMD HOUSEFIRES can?
Eli Bailey
Fiji had several problems, most of which can be pretty easily identified here:
- really low Geometry Engine throughput (half that of the competition's): 1 tri/clock per each of 4 SEs
- low ROP output (2/3 that of 980 Ti). highly contentious how much this actually matters.
- unexpectedly poor bandwidth. allegedly the cause of "starved" shaders.
- 4x4px rasterizers vs. 2x4px for Nvidia: twice as much wasted capacity on small triangles, which is why HairWorks fucks over GCN comparatively more.
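To put rough numbers on the geometry and rasterizer points, here is a toy sketch. It assumes the worst case of a triangle covering a single pixel, treats every other pixel in the raster tile as wasted, and uses the 1050MHz Fury X clock quoted earlier in the thread; the function name is made up, and real rasterizers are more involved than this.

```python
# Toy model of raster tile waste and peak triangle rate.
# Assumption: a tiny triangle covers 1 pixel and the rest of the tile is idle.
def wasted_pixels(tile_w, tile_h, covered_pixels=1):
    """Pixels in a raster tile doing no useful work for a small triangle."""
    total = tile_w * tile_h
    return total - min(covered_pixels, total)


gcn_waste = wasted_pixels(4, 4)     # 4x4px tile
nvidia_waste = wasted_pixels(2, 4)  # 2x4px tile
waste_ratio = gcn_waste / nvidia_waste

# Peak geometry rate: 1 triangle per clock per shader engine, 4 SEs, 1050 MHz.
fiji_gtris_per_s = 4 * 1 * 1050e6 / 1e9

print(f"Wasted pixels per 1px triangle: GCN {gcn_waste}, Nvidia {nvidia_waste}")
print(f"Waste ratio: {waste_ratio:.2f}x")
print(f"Fiji peak geometry rate: {fiji_gtris_per_s:.1f} Gtris/s")
```

In this simplified model the ratio comes out slightly above 2x, which lines up with the "twice as much wasted capacity" figure.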