Why are x86 processors limited to only 2 threads per core, but shit like the sparc t4 has 8 threads per core?

Especially for shit like xeons and opterons

Halp

It's easier to take advantage of two threads per core than eight.

Even for shit that eats up all the cores like video rendering?

Or servers?

Multithreaded video rendering is more of a GPU thing.
Point being, being able to do four things at once covers essentially 100% of the consumer market use case.

They aren't limited, the architectures used just don't scale like that. They aren't designed for it.

In most situations it would be more advantageous to have two cores vs one core with two threads. If you really need so many threads, you can grab a Xeon Phi or GPU like the other user suggested.

Then why does sparc have so many threads? Is it a meme?

>2 sets of physical execution units is better than 1

Thanks cpt obvious

I don't know how it works on SPARC, but on x86 the second thread is just a trick to get the OS to schedule 2 processes per core. You get another 40% on a really good day, 10-20% on average.

What happens is process one stalls for some reason and the CPU has to wait for something to happen. It has all these leftover execution units sitting around doing nothing and runs process two on them. When process one unstalls it starts running again. Tricking the OS into running 8 threads on one x86 core just means you have 7 threads that the OS thinks have been assigned to something where they have a good chance of getting done. In reality you have 1 process getting a whole core and the other 7 fighting over an effective 20%. It's better to let your OS' scheduler pick who wins that fight than a piece of silicon that has no real idea what the priorities should be.
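If you want to put numbers on that yourself, here's a rough sketch (all assumptions are mine: Linux, gcc, and that logical CPUs 0/1 are SMT siblings while 0/2 sit on different physical cores; check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list before believing any result). Pin two busy threads either onto the two hyperthreads of one core or onto two separate cores and compare wall time.

/* Rough sketch, not a rigorous benchmark. Assumes Linux and gcc, and that
 * CPUs 0/1 are SMT siblings while 0/2 are separate cores (check the sysfs
 * topology files). Build with: gcc -O2 -pthread smt_test.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>

static void *spin(void *arg)
{
    int cpu = *(int *)arg;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    /* Dependent multiply-add chain with a volatile store: a latency-bound
     * loop, i.e. the kind of code that leaves execution units idle. */
    volatile unsigned long x = 1;
    for (unsigned long i = 0; i < 500000000UL; i++)
        x = x * 6364136223846793005UL + 1442695040888963407UL;
    return NULL;
}

static double run_pair(int cpu_a, int cpu_b)
{
    pthread_t a, b;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    pthread_create(&a, NULL, spin, &cpu_a);
    pthread_create(&b, NULL, spin, &cpu_b);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    printf("two SMT siblings (cpu 0,1): %.2f s\n", run_pair(0, 1));
    printf("two real cores   (cpu 0,2): %.2f s\n", run_pair(0, 2));
    return 0;
}

What you see depends on the loop body: a latency-bound chain like this tends to show the sibling run close to the two-core run (SMT's best case), while a loop that already saturates the execution units gets almost nothing. That spread is exactly the 10-40% lottery described above.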

Other architectures may lend themselves to high thread-per-core performance. x86 doesn't right now. Supposedly it's poorly written code with cache misses left and right that leads to performance gains from hyperthreading. So you get a stall when you have to find that variable that's not in cache, and that's when thread 2 steps up. Code written for performance is going to bend over backwards to avoid that. A cache miss that goes all the way to RAM costs on the order of 100 ns, so diminishing returns kick in real fast when adding stuff to grab back that time.
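Rough way to see that stall cost on your own box (a sketch under my own assumptions, Linux/gcc, nothing from any particular codebase): sum an array front to back, then chase a shuffled pointer cycle through the same memory so nearly every load misses.

/* Crude illustration of what a cache miss costs: sequential sum vs. a
 * random pointer chase over the same memory. Numbers vary per machine.
 * Build with: gcc -O2 chase.c */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1UL << 24)            /* 16M entries * 8 bytes = 128 MB, >> any L3 */

static unsigned long rng_state = 88172645463325252UL;
static unsigned long rng(void)   /* xorshift64, just for shuffling */
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

static double now(void)
{
    struct timespec t;
    clock_gettime(CLOCK_MONOTONIC, &t);
    return t.tv_sec + t.tv_nsec / 1e9;
}

int main(void)
{
    size_t *next = malloc(N * sizeof *next);

    /* Sattolo shuffle: builds one big random cycle, so the chase below
     * visits every slot and the prefetcher can't guess the next address. */
    for (size_t i = 0; i < N; i++) next[i] = i;
    for (size_t i = N - 1; i > 0; i--) {
        size_t j = rng() % i;
        size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    double t0 = now();
    volatile size_t sum = 0;
    for (size_t i = 0; i < N; i++) sum += next[i];   /* streams, mostly cache hits */
    double seq = now() - t0;

    t0 = now();
    size_t p = 0;
    for (size_t i = 0; i < N; i++) p = next[p];      /* dependent loads, mostly misses */
    double chase = now() - t0;

    printf("sequential: %5.1f ns/element\n", seq * 1e9 / N);
    printf("chase:      %5.1f ns/element (p=%zu)\n", chase * 1e9 / N, p);
    return 0;
}

The chase typically comes out an order of magnitude or two slower per element than the sequential pass; that gap is the dead time a second hardware thread gets to fill.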

I honestly don't know why Sun and later Oracle had such a hard on for many threaded CPUs. It would be nice to hear from a Sparc aficionado on this subject.

I looked at some of the older SPARC wiki pages. Apparently some didn't even have cache. So you probably have a bunch of processes just hanging around waiting for memory access.

It looks like a trade-off. On x86 the idea is the OS decides what needs to run. It sets the CPU state up and tells the process: here you go, have a blast for your time share. On SPARC they have things like built-in ways to specify the priority of threads. So the OS sets up the threads and tells the CPU: I need these done, and if you have to choose to run only some, prioritize these. My guess is that everything from the chip layout to the instruction set to the software written for it is optimized for many slow threads instead of a couple of fast ones. It's a trade-off. You probably get more throughput but higher latency.
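For the x86 side of that trade-off, the priority knob lives in the OS rather than the chip. A minimal sketch of what that looks like on Linux (everything here is just illustration, not taken from any post above): pin two workers onto one logical CPU so they actually compete, then tell the scheduler which one to favor with plain old nice values.

/* Sketch of the "the OS decides who wins" model: both workers are pinned to
 * one logical CPU so they genuinely fight for it, and we hint the Linux
 * scheduler with nice values instead of any hardware thread-priority bits.
 * Build with: gcc -O2 -pthread prio.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <unistd.h>

static void *worker(void *arg)
{
    int niceval = *(int *)arg;

    /* Pin to CPU 0 so the two workers have to share one core. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    /* On Linux the nice value is per-thread when addressed by TID. Raising
     * priority (negative nice) needs privileges, so we only lower the
     * background worker. */
    setpriority(PRIO_PROCESS, (pid_t)syscall(SYS_gettid), niceval);

    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 1000000000UL; i++) x += i;
    printf("worker with nice %d finished\n", niceval);
    return NULL;
}

int main(void)
{
    int fg = 0, bg = 15;     /* favored vs. background worker */
    pthread_t a, b;
    pthread_create(&a, NULL, worker, &fg);
    pthread_create(&b, NULL, worker, &bg);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}

The nice-0 worker should finish well before the nice-15 one. The SPARC pitch, as described above, is that the hardware thread scheduler can make a similar call on its own inside a core.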

Who is this magnificent semen demon?

Why is Sup Forums clickbait the board?
source?

Because virgins and weaboos

you obviously don't know what you are talking about.

>Then why does sparc have so many threads?
all risc architectures use a lot of threads per physical core because they scale better.

ibm power8 (which is like the only modern commercially available risc architecture, unfortunately) offers 12-core cpus with 96 threads

risc/cisc doesn't mean anything anymore, nub.
power8 is certainly not a risc cpu, it has even more instructions than an intel i7.

>power8 is certainly not a risc cpu, it has even more instructions than an intel i7.
some evidence would be in order

We had this exact thread 1-2 weeks ago.

It boils down to hyperthreading not really being "free". The only way it really works well is if you double the majority of CPU resources so the 2nd/3rd/4th/etc. threads don't hit a bottleneck.

Basically if you JUST double the registers, you can use spare CPU time to work on other processes, BUT if the MAIN thread is using up the resources the 2nd thread has to wait until the first thread is done.

If you double up on most of your CPU resources (not all, because that defeats the point of multithreading vs just adding more real cores) then you can mitigate this. The trouble is finding the balance between what needs to be doubled and what you can get by with only one of, which the threads end up sharing.

Whatever you have the least of is likely going to be your bottleneck.
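A back-of-the-envelope model of that bottleneck argument (the 70% utilization figure is completely made up, it's only there to show the shape of the curve): assume one thread keeps the shared execution resources busy some fraction of the time, and each extra hardware thread can only soak up whatever the earlier ones left idle.

/* Toy model of diminishing returns from extra hardware threads.
 * `busy` is an assumed per-thread utilization, not a measurement. */
#include <stdio.h>

int main(void)
{
    double busy  = 0.70;   /* made-up fraction of cycles one thread keeps busy */
    double idle  = 1.0;    /* fraction of cycles still free */
    double total = 0.0;    /* overall core utilization */

    for (int t = 1; t <= 8; t++) {
        double used = idle * busy;   /* this thread grabs a share of what's left */
        total += used;
        idle  -= used;
        printf("%d threads: utilization %5.1f%%, speedup over 1 thread %.2fx\n",
               t, total * 100.0, total / busy);
    }
    return 0;
}

With those made-up numbers the second thread buys ~30%, the third ~9%, and by thread four you're fighting over scraps, which is roughly the x86 story above. Making eight threads per core worthwhile means changing the assumptions, i.e. building the core and software around lots of stalled, low-utilization threads like the SPARC posts describe.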


So why don't Xeons and such use more than 2 threads per core? It would require a massive redesign of the core, and instead of doubling up on resources you'd have 4x or 8x as many depending on how many threads you wanted per core.

power8 and intel ISA manuals. The i7 has ~700 instructions, power8 has 1000+.

Different core and CPU designs mostly.

x86 processors don't really run two full threads per core; it's more like if part of the core is idle, another thread can be slotted into it.
By this principle you shouldn't be able to scale up to three or four or more threads, because it's very unlikely that the second thread will use only part of whatever the first thread leaves idle.

To increase the number of threads you could slot into the core you would have to increase the core sizes.

It was mainly due to the client base: super-high-volume data processing, like AI, web serving (at a huge scale), video processing, and mainframes.

If you read the marketing slides they put out for every revision, you can see what they are optimizing it for.

Also the idea, from what I remember, was that the compiler would parallelize your workload for you to a small degree (call & load instructions). Basically when you call a sub you have to supply another instruction to run on another core for data loading/other stuff. The big thing was you had effectively zero cache-miss latency, because by the time you needed to hit the memory, what you wanted was already loaded. This requires another core per call thread IIRC.
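The same "data is already there when you hit it" idea exists on x86 too, just done in software inside one thread instead of by a helper thread on another core. A toy sketch (my own illustration, not the SPARC mechanism, using gcc/clang's __builtin_prefetch): while processing element i, ask for element i+8 so the miss overlaps with useful work.

/* Rough sketch of software prefetching: start the fetch a few iterations
 * before the data is needed so the miss latency overlaps with real work.
 * Single-threaded analogue of the helper-thread idea described above.
 * Build with: gcc -O2 prefetch.c */
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 24)
#define AHEAD 8                      /* how far in front to prefetch */

int main(void)
{
    int *idx  = malloc(N * sizeof *idx);
    double *v = malloc(N * sizeof *v);
    for (int i = 0; i < N; i++) { idx[i] = rand() % N; v[i] = i; }

    double sum = 0.0;
    for (int i = 0; i < N; i++) {
        if (i + AHEAD < N)
            __builtin_prefetch(&v[idx[i + AHEAD]], 0, 1);  /* read, low locality */
        sum += v[idx[i]];            /* random access would otherwise stall here */
    }
    printf("%f\n", sum);
    free(idx);
    free(v);
    return 0;
}

Whether that actually wins anything depends on the access pattern and how good the hardware prefetcher already is; it's only here to show the shape of the idea.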

Nobody video renders with a GPU, idiot, unless they want garbage quality.

Actually I wouldn't be surprised if GPU raytracing was the big thing these days