Mesh vs Ringbus vs Infinity Fabric

Michael Barnes

Mesh (Skylake X) vs bingbus (Kaby Lake, Coffee lake) vs Infinity Fabric (Ryzen)

Discuss what the benefits of each architecture are and why, and what would be best suited for each purpose (gaymen, productivity etc).

So far it seems like bingbus based CPUs stomp on mesh and IF based CPUs in gaymen, but is this because gaymen software is written for bingbus based CPUs, or is bingbus inherently better at single core/thread and thus better at gaymen?

October 8, 2017 - 01:40

Other urls found in this thread:

community.amd.com/community/gaming/blog/2017/06/23/even-more-performance-updates-for-ryzen-customers

Dylan Sullivan

idk

October 8, 2017 - 01:46

Jayden Collins

>So far it seems like bingbus based CPUs stomp on mesh and IF based CPUs in gaymen

That's entirely unrelated.
Ringbus is shit but does work on low core count CPUs. Low core count CPUs can afford to pump more power/heat through each individual core because there's less of them.
Most shit games are written by the retarded monkeys and only run on one or two cores anyways.

October 8, 2017 - 01:52

Henry Collins

It IS related, but it may not be what causes it. Read the rest of what I said, it addresses exactly what you're talking about. The thing here seems to be that it's not only a matter of power delivery and MC vs SC, but that each individual ringbus core outperforms each individual mesh core in gaymen, even at the same clockspeeds (see 8700k vs 7800x). I'm wondering if this is caused by code being optimized to run on ringbus CPUs, or if RB is inherently better at SC than mesh is. The former would mean a software difference, the latter would mean a hardware difference.

October 8, 2017 - 01:59

Asher Smith

Bingbus for LCC, IF for anything else.
Though intra-CCX performance is so good you might as well choose IF as the winner.
Mesh a shit.

October 8, 2017 - 01:59

David Rivera

Mesh was designed for massive throughput over latency, SKL-SP is a server first uarch for fucks sake.
Of course mesh will be worse at low core counts.

October 8, 2017 - 02:02

Charles Rivera

Software developers don't operate on such a low level to care about bus type.

bingbus is simply inferior to crossbar bus but it happens to be on the CPUs that are most fitting for current games.

October 8, 2017 - 02:02

Noah Howard

is there some youtube video explaining all this?

October 8, 2017 - 02:04

Christian Perry

So you're saying Mesh is basically useless for desktop? If Mesh really is worse at SC on a hardware level, there'd be no point for a desktop user to buy it over ringbus. Same goes for IF as its basically the same thing as Mesh, just executed differently.

October 8, 2017 - 02:05

Chase Lewis

>Same goes for IF as its basically the same thing as Mesh, just executed differently.
H-what?
Zeppelin and LCC/HCC/XCC dies are fundamentally different, Zeppelin has no logically unified L3, instead it's partitioned into 8MB chunks.
Everything inside CCX is blazing fast.

October 8, 2017 - 02:06

Tyler Evans

>bingbus
Gotta admit OP, I didn't expect to get promoted for laughing

October 8, 2017 - 02:07

Lucas Gray

>bingbus is simply inferior to crossbar bus but it happens to be on the CPUs that are most fitting for current games.

So what does cause differing results on different platforms then? I'm seeing CPUs with the same core count, roughly the same IPC, same clockspeed and same RAM get vastly differing results, where ringbus CPUs clearly outperform Mesh and IF CPUs in SC workloads. If it's not the bus, then what is it that causes this?

October 8, 2017 - 02:08

James Hill

I mean its the same general concept, isnt it. Sure there are differences but they're fairly similar

October 8, 2017 - 02:10

Christopher Morales

>vs IF
optimization and clock speed
>vs Mesh
L3 cache capacity and optimization

October 8, 2017 - 02:10

Noah Rivera

>Sure there are differences but they're fairly similar
FUNDAMENTALLY
FUCKING
DIFFERENT
Heck we don't even know what internal interconnect Zeppelin uses.
Infinity Fabric is merely a protocol.

October 8, 2017 - 02:11

Jack Diaz

I disagree. They even perform similarly and are made for similar purposes. Both are made with MC performance in mind primarily and aimed towards workstation and server use first and foremost. Furthermore they both underperform in SC compared to their much better MC performance. Though there are differences, i will readily agree with you on that, i think you can definitely bunch them together as similar archs.

October 8, 2017 - 02:16

Lucas Diaz

Infinity Fabric allows me to make a cheaper cpu farm to compile my Nim programming language files faster than I can say Nim programming language.

October 8, 2017 - 02:16

Luis Cook

>So what does cause differing results on different platforms then?
> differing results on different platforms then?

The different platforms FFS.
Skylake X sucks balls in single thread performance because it has fuckloads of slow cores that cant go fast because of heat.
Skylake has only 4 cores that can run @ 5Ghz on all cores. Skylake X cant do that.

Also the more cores you have, the longer the inter-core communication latency gets.

Long story short THE LESS CORES THE BETTER SINGLE CORE PERFORMANCE

October 8, 2017 - 02:19

Levi Price

>optimization

Could optimization also cause lower SC performance in benchmarks? Sounds a bit unlikely to me.

>clock speed

Tests were done with Ryzen and Kaby lake at both 4ghz, still got drastically different results, so i doubt its this.

>L3 cache capacity and optimization

Elaborate

October 8, 2017 - 02:19

Justin Roberts

>Elaborate
Skylake-X changes cache hierarchy from 256KB L2$ (inclusive of L1D) and 2MB of L3$ (inclusive of L2$) to 1MB L2$ (inclusive of L1D) and 1.375MB L3$ (mostly exclusive of L2$, basically like L3 in Zen).
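
As a rough back-of-envelope sketch of what that change means per chip: this assumes the per-core sizes quoted above and 6 cores on both sides (as in the 8700K vs 7800X comparison). The function names are made up for illustration, and "unique capacity" treats the mostly exclusive L3 as fully exclusive, which is a simplification.

```python
# Per-core cache sizes in MB, taken from the post above.
RING_L2, RING_L3 = 0.25, 2.0    # ringbus Skylake-family part: L3 inclusive of L2
MESH_L2, MESH_L3 = 1.0, 1.375   # Skylake-X: L3 mostly exclusive of L2


def shared_l3_mb(l3_per_core_mb, cores):
    """Total L3 on the chip, i.e. the pool every core can hit in."""
    return l3_per_core_mb * cores


def unique_capacity_mb(l2_mb, l3_mb, l3_inclusive, cores):
    """Rough upper bound on distinct data held across L2 + L3.

    With an inclusive L3, everything in L2 is duplicated in L3, so L2 adds
    no unique capacity. With an exclusive L3 (simplifying assumption), the
    two levels hold different lines and their sizes add up.
    """
    per_core = l3_mb if l3_inclusive else l2_mb + l3_mb
    return per_core * cores


CORES = 6  # assumption: 6-core parts on both sides
print("ring shared L3:", shared_l3_mb(RING_L3, CORES), "MB")
print("mesh shared L3:", shared_l3_mb(MESH_L3, CORES), "MB")
print("ring unique L2+L3:", unique_capacity_mb(RING_L2, RING_L3, True, CORES), "MB")
print("mesh unique L2+L3:", unique_capacity_mb(MESH_L2, MESH_L3, False, CORES), "MB")
```

So the mesh part has less shared L3 to go around even though its combined L2+L3 is not smaller on paper, which only helps code whose working set fits in the private L2.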

October 8, 2017 - 02:22

Chase Rogers

>Could optimization also cause lower SC performance in benchmarks
lack of optimization? yes.
>Tests were done with Ryzen and Kaby lake at both 4ghz, still got drastically different results
different archs need different optimizations.
>Elaborate
mesh arch has smaller l3 cache than bingbus but bigger L2 cache than bingbus.
all of the optimizations thus far were made for bingbus, so mesh has problems if there is no optimizations.

so everything comes to optimizations. period.

October 8, 2017 - 02:23

Nicholas Perez

It's not this either. I am talking about the same core count and same clock speed but wildly differing SC results. As an example I gave 8700k vs 7800x, where the 8700k outperformed the 7800x in SC at roughly the same clock speed. It's something with the archs themselves that causes it.

October 8, 2017 - 02:23

Isaac Young

you really don't understand anything. if the software isn't there, hardware cannot work.

October 8, 2017 - 02:25

Luis Nguyen

You dont even understand what im talking about. You talk about variables that i already told you have been ruled out by tests. Its not a matter of MC vs SC in general or power consumption/heat (assuming you're the poster i replied to since this was his argument).

October 8, 2017 - 02:30

Kevin Edwards

7800x is L3 starved.

October 8, 2017 - 02:30

Dylan Green

8700K has way more cache and only 6 cores at which bingbus is still manageable but it's not a good thing.

If anything bingbus takes less die space allowing them to put more cache in the same size.

That's the only remotely positive quality of bingbus.

October 8, 2017 - 02:31

Evan Watson

If it really is an optimization issue it would mean that the r5 1600 would be the best CPU to buy for le future proofing meme as it seems to be the most popular CPU lately, but this would mean lower perf in already existing and recent games.

Seems like whatever way you look at it you have to make a major compromise if you buy a CPU in 2017.

October 8, 2017 - 02:31

Leo Gomez

>You talk about variables that i already told you have been ruled out by tests
and who programmed those tests? what kind of a compiler did they use? which arch did they have in mind while programming?
read some Agner Fog. you cannot isolate software from hardware.
it would be definitely. I don't know if you have seen the Rise of the Tomb Raider tests.
at first ryzen performed like shit in the game. around may-june an update from the game developers fixed the problems.
Nvidia's drivers still cause problems for ryzen, because nvidia refuses to fix its driver according to zeppelin arch.
it is all about programmes themselves and how the developers code it for the existing hardware.

October 8, 2017 - 02:35

David Allen

8700k is just better than 1600
BUT 1600 is the best bang for your buck ever and offers you future upgrades to pinnacle ridge and Ryzen2

October 8, 2017 - 02:36

Ayden King

So what you're saying is the 7800x is just a bad CPU in general and its better to get the 8700k at this core count?

October 8, 2017 - 02:36

Aaron Ramirez

YES

October 8, 2017 - 02:37

Joshua Kelly

>and who programmed those tests? what kind of a compiler did they use? which arch did they have in mind while programming?

Good points but you need to understand i am not making a point here, im asking a question. So far it does not seem to me to be a software only problem. I'd say this is the case if the benchmark results came in high but the real world scores came in lower, but we're seeing lower SC performance in IF and Mesh based CPUs in ALL applications. Im willing to believe its purely a software issue if you can back that up with evidence, but for now that doesnt seem to be the case.

October 8, 2017 - 02:41

Dominic Baker

>scalability, cost, yields, power, compartmentalization(less due to IF more due to EPYC design)
IF, also is not only a core interconnect, but DRAM and GDDR/HBM interconnect as well, it's basically an all purpose bus for most parts of a IC needing high bandwidth

>consistency
Mesh

>low core count
Ringbus, though IF also works just as well inside one CCX

October 8, 2017 - 02:41

Brody Hall

Define low core count. Up to 4? 6? 8? 10?

October 8, 2017 - 02:44

Hudson Morgan

first you say this,
>So far it does not seem to me to be a software only problem.
then this,
>I'd say this is the case if the benchmark results came in high but the real world scores came in lower.
what do you mean by real world? games?
you can bet your ass the reason is developers.
look at far cry primal (single core optimized), then crysis 3(multicore optimized). digital foundry have lots of videos about this. look it up. they did one with 8700k recently.
>Im willing to believe its purely a software issue if you can back that up with evidence,
gave one above, and add one more. Agner Fog found out many years ago that intel compiler cripples amd processors. many developers were using intel compiler at that time. intel was de-optimizing the software in case it runs on a non-intel cpu.
and again, recently Agner Fog tested ryzen, found out that clock by clock ryzen has higher IPC than Skylake. check his blog.

everything comes down to software.
but for now that doesnt seem to be the case.

October 8, 2017 - 02:51

Xavier Miller

>define low
2
crossbar would work better than bingbus even on 4 cores too. but the margin of improvement is small and does not justify the costs.

October 8, 2017 - 02:52

Tyler Wood

*"but for now that doesnt seem to be the case" is your words, I forgot to delete it.

October 8, 2017 - 02:53

Eli Thompson

Alright but do we have any way of confirming this? I am just trying to work with the little information the public gets, and where im looking from this does not seem to be a software issue only.

Furthermore, if everyone uses an intel compiler and this wont change anytime soon then it wont matter if a given CPU is better or not since it will be fucked by the compiler anyways. These are things to consider as well.

To clarify i actually hope its just a software problem and will be fixed soon as AMD catching up to Intels peak SC performance would be great for the market, i just dont think its the case.

October 8, 2017 - 03:00

Robert Morris

>Alright but do we have any way of confirming this? I am just trying to work with the little information the public gets, and where im looking from this does not seem to be a software issue only.

To clarify with this i mean if this is the case for the specific arch's we're talking about now (Zen, KL and SL-X)

October 8, 2017 - 03:02

Ian Bennett

ok, you can check one of the things I mentioned here: community.amd.com/community/gaming/blog/2017/06/23/even-more-performance-updates-for-ryzen-customers
30% performance uplift with an update says there is something wrong with the code.

October 8, 2017 - 03:07

Connor Lewis

Alright then, but can we expect to see this happen to most software anytime soon? Seems like it will be better to get a Kaby/Coffee Lake CPU now and get Ryzen 2 or even 3 later if it will take a year or two for developers to catch.

October 8, 2017 - 03:10

Hunter Williams

Catch up*

October 8, 2017 - 03:11

David Gray

It depends on AMD's marketshare. it is going up, that means the code needs adjusting for the market.
Bingbus is a arch. intel still tries to squeeze it but this is the end, IPC is the same for 2 years, only clock speeds go up and it is marginal.
Ryzen is new and rough. it is open to get polished. Zen 2 targets 5ghz for base clock. for Zen 1, it was 3ghz.
AMD is coming like a freight train, developers will catch up eventually. But Nvidia needs to fucking update its shit drivers.

October 8, 2017 - 03:17

Brody Rivera

*bingbus is dead arch.

October 8, 2017 - 03:18

Joshua Watson

Actually I suspect software will get worse or stay the same because stronger hardware allows to run code that is badly written, that means "diverse" code needs less unfucking and is thusly cheaper... I hope I'm wrong, though

October 8, 2017 - 03:33

Adrian Wilson

Low core count is the number where a ring has a shorter longest path between cores than the other architecture.
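
One way to make that definition concrete is to count hops. A minimal sketch, assuming an idealized bidirectional ring and a rectangular 2D mesh with one stop per core and nothing else on the interconnect (real dies also hang memory controllers, I/O and so on off it, and a hop does not cost the same on each design). The function names are hypothetical.

```python
import math


def ring_diameter(n):
    """Longest shortest path, in hops, on a bidirectional ring of n stops."""
    return n // 2


def mesh_diameter(n):
    """Longest shortest path, in hops, on the most square rows x cols grid
    with rows * cols == n (corner to opposite corner)."""
    rows = max(d for d in range(1, math.isqrt(n) + 1) if n % d == 0)
    cols = n // rows
    return (rows - 1) + (cols - 1)


def crossover(max_stops=64):
    """Smallest stop count where the mesh's worst case is strictly shorter
    than the ring's, or None if that never happens up to max_stops."""
    for n in range(2, max_stops + 1):
        if mesh_diameter(n) < ring_diameter(n):
            return n
    return None


for n in (4, 6, 8, 10, 12, 18, 28):
    print(n, "stops: ring", ring_diameter(n), "hops, mesh", mesh_diameter(n), "hops")
print("mesh first wins at", crossover(), "stops")
```

Under these assumptions the mesh only gets a strictly shorter worst-case path at 12 stops; below that the ring ties or wins on hop count.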

October 8, 2017 - 03:43

Kayden Murphy

>"diverse" code
nice

October 8, 2017 - 03:43

Jose Sullivan

Infinity Fabric is in every single way superior to ring bus and ring bus mesh edition

October 8, 2017 - 06:18

Daniel Russell

more software supports bing bus more than mesh, skylake x was an utter failure and gave the entire HEDT market to AMD

October 8, 2017 - 06:19

James Ortiz

where is the picture for infinity fabric
that name is cringy shit btw

October 8, 2017 - 06:27

Eli Torres

>more software supports bing bus more than mesh
because bing bus has been around for nearly a decade. its time has come and there is nowhere to go with it.
it is a great name. you are just a retard.

October 8, 2017 - 06:29

Nathan Rodriguez

Because it scales into infinity

October 8, 2017 - 06:30

Henry Gomez

that is physically impossible

October 8, 2017 - 06:31

Asher Smith

Don't tell the marketing department that

October 8, 2017 - 06:33

Owen Cooper

They just mean it can scale a lot

October 8, 2017 - 08:48

Robert Jackson

i know that doesn't change that the name is bullshit

October 8, 2017 - 09:05

Luke Hall

IF is really interesting because it brings ringbus latencies to super high core counts, ringbus can't scale.

It has a catch, cross CCX hopping, can be alleviated by improving the IF, faster RAM or making bigger CCXs.
This is perfect for VMs though, lets say Skylake-X makes a 4 core VM, it will still have 80ns core to core, but make a 4 core VM on EPYC and it will have around 40ns, just like a ringbus design.
Thankfully AMD did their research, most vendors make use of multiple smaller VMs instead of big core ones.

October 8, 2017 - 10:02

Benjamin Perez

The latencies are not that crippling for most workloads, even.
Both mesh and IF are throughput over latency (but Zen designs also have low latency inside the CCX so it's the best of both worlds).
Keller, Clark and the rest of the Zen team should get a fucking medal and their own religion for that.

October 8, 2017 - 10:06

Bentley Nelson

Relational databases, though I guess that's legacy shit at this point.

October 8, 2017 - 10:11

Jace Fisher

Ye, most scale-up workloads might as well be legacy shit in the age of ebin Cloud.

October 8, 2017 - 10:13

Grayson Long

8 core ccx's when

October 8, 2017 - 10:44

Isaiah Perez

IF is a general purpose interconnect that happens to be great for high core counts, it can serve cores, VRAM, DRAM, GPUs, I/O, dies and sockets and god knows what else.

The only thing it can't be used for in its current iteration is private caches

October 8, 2017 - 13:31

David Williams

>It has a catch, cross CCX hopping
wasn't cross-ccx latency like 10ns higher than mesh is between any two cores with 3200mhz memory?

the intra-ccx latency was lower than ringbus

October 8, 2017 - 15:32

Andrew Howard

There's no 3200MHz quad rank ECC RAM

October 8, 2017 - 15:36

Daniel Robinson

sure but the desktop platform gives you a peek into IF's capabilities. we're a long way off 3200MHz ECC, we barely just got 2667MHz ECC and 99% of DDR4 on the desktop side is overclocked ICs with the original 2133MHz JEDEC spec, I believe they only just started releasing sticks that ship with proper 2400MHz 'stock' or fallback speeds outside the XMP profiles.

October 8, 2017 - 15:38

Juan Foster

Something tells me they won't ever make a +4 core ccx. I think the architecture isn't made to handle that

October 8, 2017 - 18:09

Caleb Howard

If they can get cross CCX latencies down by then, staying with 4 core CCX is optimal.

I think they'll just ride it out until DDR5 or something, when that hits latency will no longer be a problem due to the sheer DDR5 bandwidth
Also PCIe4/5, AMD gains the most from them due to their architecture.

October 8, 2017 - 22:12

Luis Collins

DDR5 is a meme user, the frequencies are increased, which will help some workloads, but CL will double too, which will fuck with others

October 9, 2017 - 00:29

Alexander Kelly

CL is not real latency, a highend kit DDR2 or 3 is not lower latency than a highend DDR4 kit.

October 9, 2017 - 01:04

Jacob Sullivan

Case in point.

October 9, 2017 - 01:05

Levi Cruz

>have 1066mhz 4-4-4-12 DDR2
>7.5ns latency
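
For anyone who wants to check the arithmetic: a minimal sketch of the usual CL-to-nanoseconds conversion, assuming the "MHz" on the box is really the transfer rate in MT/s and the command clock runs at half of it. The function name and the DDR4 kits are just examples picked for illustration, not anything from the thread.

```python
def cas_latency_ns(transfer_rate_mts, cl):
    """CAS latency in nanoseconds.

    DDR transfers twice per clock, so the clock in MHz is half the
    advertised transfer rate, and CL is counted in those clock cycles.
    """
    clock_mhz = transfer_rate_mts / 2
    return cl / clock_mhz * 1000


kits = [
    ("DDR2-800 CL3", 800, 3),
    ("DDR2-1066 CL4", 1066, 4),
    ("DDR4-3200 CL16 (example kit)", 3200, 16),
    ("DDR4-3200 CL14 (example kit)", 3200, 14),
]
for name, rate, cl in kits:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

By this measure DDR2-800 CL3 and DDR2-1066 CL4 both land at about 7.5 ns and the example DDR4-3200 CL16 kit at 10 ns. CAS latency is only one component of a full memory access though, so this alone doesn't settle total load-to-use latency either way.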

October 9, 2017 - 01:19

Brayden Rogers

Does anyone know how cores within a ccx communicate? Whether we'll ever see +4 core ccx's will depend on this.

October 9, 2017 - 03:55

Lucas Reyes

stfu
even a ddr2 800mhz cl3 has lower absolute latency than your ddr4 shit. and no rowhammer vulnerabilities either

October 9, 2017 - 03:58

Ryan Ortiz

Through some internal bus, unknown which one.
AMD and Intel have never given info about going that deep, they mostly limited it to shared chip wide interconnects

October 9, 2017 - 04:13

Zachary Foster

see

October 9, 2017 - 04:14

Aiden Garcia

HEY HEY
HO HO
THIS FUCKING BINGBUS HAS GOT TO GO

October 9, 2017 - 04:16

Christopher Price

Ringbus is literally "make useless loopty loops" tier, mesh makes more sense but still why didn't they fucking do it from the beginning
But infinity fabric is fairly kino and I don't see either mesh or ring reaching its speeds

October 9, 2017 - 04:29

David Gray

In low core counts, it's faster to ask "Hey, is this the shit you want?" than "I need to deliver this package to #3, then turn left."

October 9, 2017 - 05:24

Connor Ortiz

Hence why I'm confused as to why they continued it with higher core counts as well.
I mean, I'm an engineer but obviously I can't even come close to touching the qualifications these Intel guys have, and obviously they might have a good reason but I still question why?

October 9, 2017 - 05:37

Michael Johnson

Cost. It's easier and cheaper to make one single die and just disable features for lower end models than to make several different dies. Plus they had no competition in that market segment so there really was no incentive for them to improve it.

October 9, 2017 - 06:41

Liam Hernandez

Mesh is garbage though, its barely an improvement

October 9, 2017 - 08:30
