Mesh (Skylake X) vs bingbus (Kaby Lake, Coffee Lake) vs Infinity Fabric (Ryzen)
Discuss what the benefits of each architecture are and why, and what would be best suited for each purpose (gaymen, productivity etc).
So far it seems like bingbus based CPUs stomp on mesh and IF based CPUs in gaymen, but is this because gaymen software is written for bingbus based CPUs, or is bingbus inherently better at single core/thread and thus better at gaymen?
>So far it seems like bingbus based CPUs stomp on mesh and IF based CPUs in gaymen
That's entirely unrelated. Ringbus is shit but does work on low core count CPUs. Low core count CPUs can afford to pump more power/heat through each individual core because there are fewer of them. Most games are badly written and only run on one or two cores anyway.
Henry Collins
It IS related, but it may not be what causes it. Read the rest of what I said, it addresses exactly what you're talking about. The thing here seems to be that it's not only a matter of power delivery and MC vs SC, but that each individual ringbus core outperforms each individual mesh core in gaymen, even at the same clockspeeds (see 8700k vs 7800x). I'm wondering if this is caused by code being optimized to run on ringbus CPUs, or if RB is inherently better at SC than mesh is. The former would mean a software difference, the latter would mean a hardware difference.
Asher Smith
Bingbus for LCC, IF for anything else. Though intra-CCX performance is so good you might as well choose IF as the winner. Mesh a shit.
David Rivera
Mesh was designed for massive throughput over latency, SKL-SP is a server-first uarch for fuck's sake. Of course mesh will be worse at low core counts.
Charles Rivera
Software developers don't operate on such a low level to care about bus type.
bingbus is simply inferior to crossbar bus but it happens to be on the CPUs that are most fitting for current games.
Noah Howard
is there some youtube video explaining all this?
Christian Perry
So you're saying Mesh is basically useless for desktop? If Mesh really is worse at SC on a hardware level, there'd be no point for a desktop user to buy it over ringbus. Same goes for IF as it's basically the same thing as Mesh, just executed differently.
Chase Lewis
>Same goes for IF as its basically the same thing as Mesh, just executed differently.
H-what? Zeppelin and LCC/HCC/XCC dies are fundamentally different. Zeppelin has no logically unified L3, instead it's partitioned into 8MB chunks. Everything inside a CCX is blazing fast.
Tyler Evans
>bingbus
Gotta admit OP, I didn't expect to get promoted for laughing
Lucas Gray
>bingbus is simply inferior to crossbar bus but it happens to be on the CPUs that are most fitting for current games.
So what does cause differing results on different platforms then? I'm seeing CPUs with the same core count, roughly the same IPC, same clockspeed and same RAM get vastly differing results, where ringbus CPUs clearly outperform Mesh and IF CPUs in SC workloads. If it's not the bus, then what is it that causes this?
James Hill
I mean it's the same general concept, isn't it? Sure there are differences but they're fairly similar
Christopher Morales
>vs IF
optimization and clock speed
>vs Mesh
L3 cache capacity and optimization
Noah Rivera
>Sure there are differences but they're fairly similar
FUNDAMENTALLY FUCKING DIFFERENT
Heck, we don't even know what internal interconnect Zeppelin uses. Infinity Fabric is merely a protocol.
Jack Diaz
I disagree. They even perform similarly and are made for similar purposes. Both are made with MC performance in mind primarily and aimed towards workstation and server use first and foremost. Furthermore they both underperform in SC compared to their much better MC performance. Though there are differences, i will readily agree with you on that, i think you can definitely bunch them together as similar archs.
Lucas Diaz
Infinity Fabric allows me to make a cheaper cpu farm to compile my Nim programming language files faster than I can say Nim programming language.
Luis Cook
>So what does cause differing results on different platforms then?
> differing results on different platforms then?
The different platforms FFS. Skylake X sucks balls in single thread performance because it has fuckloads of slow cores that can't go fast because of heat. Skylake has only 4 cores that can run @ 5GHz on all cores. Skylake X can't do that.
Also, the more cores you have, the longer the inter-core communication latency.
Long story short THE FEWER CORES, THE BETTER THE SINGLE CORE PERFORMANCE
Levi Price
>optimization
Could optimization also cause lower SC performance in benchmarks? Sounds a bit unlikely to me.
>clock speed
Tests were done with Ryzen and Kaby Lake both at 4GHz, still got drastically different results, so I doubt it's this.
>L3 cache capacity and optimization
Elaborate
Justin Roberts
>Elaborate
Skylake-X changes the per-core cache hierarchy from 256KB L2$ (inclusive of L1D) and 2MB of L3$ (inclusive of L2$) to 1MB L2$ (inclusive of L1D) and 1.375MB L3$ (mostly exclusive of L2$, basically like L3 in Zen).
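A rough sketch of the arithmetic behind those sizes, using only the per-core figures quoted above. The function name and the model are assumptions for illustration: an inclusive L3 is taken to duplicate everything in L2, and "mostly exclusive" is treated as fully exclusive, so the mesh figure is an upper bound.

```python
def unique_l2_l3_kb(l2_kb, l3_kb, l3_inclusive):
    """Upper bound on distinct data one core's L2 plus L3 slice can hold, in KB."""
    if l3_inclusive:
        # Every L2 line is also held in L3, so L2 adds no unique capacity.
        return l3_kb
    # Simplification: treat "mostly exclusive" as fully exclusive.
    return l2_kb + l3_kb

# Per-core sizes from the post above (1.375MB = 1408KB).
ring_per_core = unique_l2_l3_kb(256, 2048, l3_inclusive=True)
mesh_per_core = unique_l2_l3_kb(1024, 1408, l3_inclusive=False)
print(ring_per_core, mesh_per_core)

# Total shared L3 on a 6-core part of each kind, in MB.
cores = 6
ring_l3_total_mb = cores * 2048 / 1024
mesh_l3_total_mb = cores * 1408 / 1024
print(ring_l3_total_mb, mesh_l3_total_mb)
```

Under these assumptions the per-core L2+L3 capacity is similar, but the shared L3 pool on a 6-core mesh part is noticeably smaller than on a 6-core ring part.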
Chase Rogers
>Could optimization also cause lower SC performance in benchmarks
lack of optimization? yes.
>Tests were done with Ryzen and Kaby lake at both 4ghz, still got drastically different results
different archs need different optimizations.
>Elaborate
mesh arch has a smaller L3 cache than bingbus but a bigger L2 cache. all of the optimizations thus far were made for bingbus, so mesh has problems if there are no optimizations for it.
so everything comes down to optimizations. period.
Nicholas Perez
It's not this either. I am talking about the same core count and same clock speed but wildly differing SC results. As an example I gave 8700k vs 7800x, where the 8700k outperformed the 7800x in SC at roughly the same clock speed. It's something with the archs themselves that causes it
Isaac Young
you really don't understand anything. if the software isn't there, hardware cannot work.
Luis Nguyen
You don't even understand what I'm talking about. You talk about variables that I already told you have been ruled out by tests. It's not a matter of MC vs SC in general or power consumption/heat (assuming you're the poster I replied to, since this was his argument).
Kevin Edwards
7800x is L3 starved.
Dylan Green
8700K has way more cache and only 6 cores at which bingbus is still manageable but it's not a good thing.
If anything bingbus takes less die space allowing them to put more cache in the same size.
That's the only remotely positive quality of bingbus.
Evan Watson
If it really is an optimization issue it would mean that the r5 1600 would be the best CPU to buy for le future proofing meme as it seems to be the most popular CPU lately, but this would mean lower perf in already existing and recent games.
Seems like whatever way you look at it you have to make a major compromise if you buy a CPU in 2017.
Leo Gomez
>You talk about variables that i already told you have been ruled out by tests
and who programmed those tests? what kind of a compiler did they use? which arch did they have in mind while programming? read some Agner Fog. you cannot isolate software from hardware.
it would be, definitely. I don't know if you have seen the Rise of the Tomb Raider tests. at first ryzen performed like shit in the game. around may-june an update from the game developers fixed the problems. Nvidia's drivers still cause problems for ryzen, because nvidia refuses to fix its driver for the zeppelin arch. it is all about the programs themselves and how the developers code them for the existing hardware.
David Allen
8700k is just better than 1600 BUT 1600 is the best bang for your buck ever and offers you future upgrades to Pinnacle Ridge and Ryzen 2
Ayden King
So what you're saying is the 7800x is just a bad CPU in general and it's better to get the 8700k at this core count?
Aaron Ramirez
YES
Joshua Kelly
>and who programmed those tests? what kind of a compiler did they use? which arch did they have in mind while programming?
Good points, but you need to understand I am not making a point here, I'm asking a question. So far it does not seem to me to be a software only problem. I'd say this is the case if the benchmark results came in high but the real world scores came in lower, but we're seeing lower SC performance in IF and Mesh based CPUs in ALL applications. I'm willing to believe it's purely a software issue if you can back that up with evidence, but for now that doesn't seem to be the case.
Dominic Baker
>scalability, cost, yields, power, compartmentalization (less due to IF, more due to EPYC design)
IF. It's also not only a core interconnect but a DRAM and GDDR/HBM interconnect as well, it's basically an all-purpose bus for most parts of an IC needing high bandwidth
>consistency
Mesh
>low core count
Ringbus, though IF also works just as well inside one CCX
Brody Hall
Define low core count. Up to 4? 6? 8? 10?
Hudson Morgan
first you say this,
>So far it does not seem to me to be a software only problem.
then this,
>I'd say this is the case if the benchmark results came in high but the real world scores came in lower.
what do you mean by real world? games? you can bet your ass the reason is developers. look at far cry primal (single core optimized), then crysis 3 (multicore optimized). digital foundry has lots of videos about this. look it up. they did one with the 8700k recently.
>Im willing to believe its purely a software issue if you can back that up with evidence,
gave one above, and I'll add one more. Agner Fog found out many years ago that the intel compiler cripples amd processors. many developers were using the intel compiler at that time. intel was de-optimizing the software in case it ran on a non-intel cpu. and again, recently Agner Fog tested ryzen and found that clock for clock ryzen has higher IPC than Skylake. check his blog.
everything comes down to software. but for now that doesnt seem to be the case.
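To make the compiler point concrete, here is a toy sketch of the difference between dispatching on the CPU vendor string and dispatching on reported features. It is only an illustration of the idea, not the Intel compiler's actual dispatcher: the function names, the two code path labels, and the feature sets are all hypothetical. Only the vendor strings `GenuineIntel` and `AuthenticAMD` are the real CPUID values.

```python
def pick_path_by_vendor(vendor, features):
    """Vendor-gated dispatch: the fast path is only taken on one vendor."""
    if vendor == "GenuineIntel" and "avx2" in features:
        return "avx2"
    return "baseline"

def pick_path_by_feature(vendor, features):
    """Feature-based dispatch: the vendor string is ignored."""
    if "avx2" in features:
        return "avx2"
    return "baseline"

# Hypothetical CPUs that both report AVX2 support.
intel_cpu = ("GenuineIntel", {"sse2", "avx", "avx2"})
amd_cpu = ("AuthenticAMD", {"sse2", "avx", "avx2"})

for vendor, features in (intel_cpu, amd_cpu):
    print(vendor,
          pick_path_by_vendor(vendor, features),
          pick_path_by_feature(vendor, features))
```

With vendor-gated dispatch the AMD part falls back to the baseline path even though it reports the same features, which is the kind of software-side handicap being argued about here.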
Xavier Miller
>define low
2
crossbar would work better than bingbus even on 4 cores too, but the margin of improvement is small and does not justify the costs.
Tyler Wood
*"but for now that doesnt seem to be the case" is your words, I forgot to delete it.
Eli Thompson
Alright, but do we have any way of confirming this? I am just trying to work with the little information the public gets, and from where I'm looking this does not seem to be a software issue only.
Furthermore, if everyone uses an Intel compiler and this won't change anytime soon, then it won't matter if a given CPU is better or not since it will be fucked by the compiler anyway. These are things to consider as well.
To clarify, I actually hope it's just a software problem and will be fixed soon, as AMD catching up to Intel's peak SC performance would be great for the market. I just don't think that's the case.
Robert Morris
>Alright but do we have any way of confirming this? I am just trying to work with the little information the public gets, and where im looking from this does not seem to be a software issue only.
To clarify, by this I mean whether this is the case for the specific archs we're talking about now (Zen, KL and SL-X)
Alright then, but can we expect to see this happen to most software anytime soon? Seems like it will be better to get a Kaby/Coffee Lake CPU now and get Ryzen 2 or even 3 later if it will take a year or two for developers to catch.
Hunter Williams
Catch up*
David Gray
It depends on AMD's marketshare. it is going up, that means the code needs adjusting for the market. Bingbus is a arch. intel still tries to squeeze it but this is the end, IPC is the same for 2 years, only clock speeds go up and it is marginal. Ryzen is new and rough. it is open to get polished. Zen 2 targets 5ghz for base clock. for Zen 1, it was 3ghz. AMD is coming like a freight train, developers will catch up eventually. But Nvidia needs to fucking update its shit drivers.
Brody Rivera
*bingbus is dead arch.
Joshua Watson
Actually I suspect software will get worse or stay the same, because stronger hardware can run badly written code, which means "diverse" code needs less unfucking and is thus cheaper... I hope I'm wrong, though
Adrian Wilson
Low core count is any core count at which a ring still has a shorter worst-case path between cores than the other architecture.
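A minimal sketch of that comparison, counting hops only. It assumes a bidirectional ring with one stop per core and a rectangular 2D mesh with one core per node. Both function names are made up for this example, and real dies add extra stops (memory controllers, I/O) and have different per-hop costs and clocks, none of which is modelled here.

```python
def ring_longest_path(cores):
    # Bidirectional ring: the farthest stop is halfway around.
    return cores // 2

def mesh_longest_path(rows, cols):
    # 2D mesh: worst case is corner to opposite corner.
    return (rows - 1) + (cols - 1)

# Hypothetical layouts, chosen only to show the trend.
for rows, cols in [(2, 2), (2, 3), (2, 4), (3, 4), (4, 4), (4, 5)]:
    cores = rows * cols
    print(cores, ring_longest_path(cores), mesh_longest_path(rows, cols))
```

On hop count alone the two are tied for the small layouts and the mesh pulls ahead as the core count grows, so any real advantage of a ring at low core counts would have to come from cheaper or faster hops rather than fewer of them.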
Kayden Murphy
>"diverse" code
nice
Jose Sullivan
Infinity Fabric is in every single way superior to ring bus and ring bus mesh edition
Daniel Russell
more software supports bing bus than mesh, skylake x was an utter failure and gave the entire HEDT market to AMD
James Ortiz
where is the picture for infinity fabric? that name is cringy shit btw
Eli Torres
>more software supports bing bus more than mesh
because bing bus has been around for nearly a decade. its time has come and there is nowhere to go with it.
it is a great name.
Nathan Rodriguez
Because it scales into infinity
Henry Gomez
that is physically impossible
Asher Smith
Don't tell the marketing department that
Owen Cooper
They just mean it can scale a lot
Robert Jackson
i know, that doesn't change that the name is bullshit
Luke Hall
IF is really interesting because it brings ringbus latencies to super high core counts, where ringbus can't scale.
It has a catch: cross-CCX hopping, which can be alleviated by improving the IF, faster RAM or making bigger CCXs. This is perfect for VMs though. Let's say you make a 4 core VM on Skylake-X, it will still have 80ns core to core, but make a 4 core VM on EPYC that fits inside one CCX and it will have around 40ns, just like a ringbus design. Thankfully AMD did their research, most vendors make use of multiple smaller VMs instead of a few high core count ones.
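The VM argument only holds if the VM's cores all land in the same CCX. A small sketch of that check, assuming 4 cores per CCX as discussed in this thread and assuming cores are numbered so that each CCX owns a contiguous block of IDs. Real OS core enumeration (SMT siblings, NUMA nodes) can differ, and the helper names are hypothetical.

```python
CORES_PER_CCX = 4  # CCX size discussed in this thread

def ccx_of(core_id):
    # Assumption: core IDs 0-3 are CCX 0, 4-7 are CCX 1, and so on.
    return core_id // CORES_PER_CCX

def vm_stays_in_one_ccx(core_ids):
    # True when every core pinned to the VM belongs to the same CCX,
    # so no core-to-core traffic has to hop across the fabric.
    return len({ccx_of(core) for core in core_ids}) == 1

aligned_vm = [0, 1, 2, 3]     # fits in one CCX
straddling_vm = [2, 3, 4, 5]  # split across two CCXs
print(vm_stays_in_one_ccx(aligned_vm))
print(vm_stays_in_one_ccx(straddling_vm))
```

A hypervisor that pins the second VM this way would pay the cross-CCX cost on part of its traffic, so the low figure quoted above depends on pinning.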
Benjamin Perez
The latencies are not even that crippling for most workloads. Both mesh and IF are throughput over latency (but Zen designs also have low latency inside the CCX so it's the best of both worlds). Keller, Clark and the rest of the Zen team should get a fucking medal and their own religion for that.
Bentley Nelson
Relational databases, though I guess that's legacy shit at this point.
Jace Fisher
Ye, most scale-up workloads might as well be legacy shit in the age of ebin Cloud.
Grayson Long
8 core ccx's when
Isaiah Perez
IF is a general purpose interconnect that happens to be great for high core counts, it can serve cores, VRAM, DRAM, GPUs, I/O, dies and sockets and god knows what else.
The only thing it can't be used for in its current iteration is private caches
David Williams
>It has a catch, cross CCX hopping
wasn't cross-ccx latency like 10ns higher than mesh is between any two cores with 3200mhz memory?
the intra-ccx latency was lower than ringbus
Andrew Howard
There's no 3200MHz quad rank ECC RAM
Daniel Robinson
sure but the desktop platform gives you a peek into IF's capabilities. we're a long way off 3200MHz ECC, we barely just got 2667MHz ECC and 99% of DDR4 on the desktop side is overclocked ICs with the original 2133MHz JEDEC spec, I believe they only just started releasing sticks that ship with proper 2400MHz 'stock' or fallback speeds outside the XMP profiles.
Juan Foster
Something tells me they won't ever make a +4 core ccx. I think the architecture isn't made to handle that
Caleb Howard
If they can get cross CCX latencies down by then, staying with 4 core CCX is optimal.
I think they'll just ride it out until DDR5 or something. When that hits, latency will no longer be a problem due to the sheer DDR5 bandwidth. Also PCIe 4/5, AMD gains the most from them due to their architecture.
Luis Collins
DDR5 is a meme, user. The frequencies are increased, which will help some workloads, but CL will double too, which will fuck with others
Alexander Kelly
CL is not real latency, it's counted in clock cycles. A high-end DDR2 or DDR3 kit is not lower latency than a high-end DDR4 kit.
Jacob Sullivan
Case in point.
Levi Cruz
>have 1066mhz 4-4-4-12 DDR2
>7.5ns latency
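For anyone who wants to check the CL argument, here is the usual conversion from CAS cycles to nanoseconds. It covers CAS latency only, which is just one component of total memory latency. The function name is made up for this example, and the DDR4 kit in the second call is a hypothetical one picked for comparison, not a figure from this thread.

```python
def cas_latency_ns(data_rate_mts, cl):
    # DDR transfers twice per clock, so the memory clock in MHz is
    # data_rate_mts / 2 and one cycle lasts 2000 / data_rate_mts ns.
    return cl * 2000 / data_rate_mts

ddr2_1066_cl4 = cas_latency_ns(1066, 4)    # the kit quoted above
ddr4_3200_cl16 = cas_latency_ns(3200, 16)  # hypothetical DDR4 kit
print(round(ddr2_1066_cl4, 2), round(ddr4_3200_cl16, 2))
```

The DDR2 kit comes out at roughly 7.5ns and the example DDR4 kit at 10ns, so a fourfold higher CL number does not mean fourfold higher CAS latency in time.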
Brayden Rogers
Does anyone know how cores within a ccx communicate? Whether we'll ever see +4 core ccx's depends on this.
Lucas Reyes
stfu, even DDR2 800MHz CL3 has lower absolute latency than your DDR4 shit. and no rowhammer vulnerabilities either
Ryan Ortiz
Through some internal bus, unknown which one. AMD and Intel have never given info going that deep, they've mostly limited it to the shared chip-wide interconnects
Zachary Foster
see
Aiden Garcia
HEY HEY HO HO THIS FUCKING BINGBUS HAS GOT TO GO
Christopher Price
Ringbus is literally "make useless loopty loops" tier. Mesh makes more sense, but still, why didn't they fucking do it from the beginning? Infinity Fabric is fairly kino though, and I don't see either mesh or ring reaching its speeds
David Gray
At low core counts, it's faster to ask "Hey, is this the shit you want?" than to say "I need to deliver this package to #3, then turn left."
Connor Ortiz
Hence why I'm confused as to why they continued it with higher core counts as well. I mean, I'm an engineer but obviously I can't even come close to touching the qualifications these Intel guys have, and obviously they might have a good reason, but I still question why.
Michael Johnson
Cost. It's easier and cheaper to make one single die and just disable features for lower end models than to make several different dies. Plus they had no competition in that market segment, so there really was no incentive for them to improve it.