Can someone explain GPU cores, they seem misleading, are they real "cores"?

"Cores" is a dumb word. We should go back to counting ALUs and FPUs

Yes, each core runs a different thread. This is great for graphics computing because you can easily divide the work up for parallelism

They do simpler things.

They're simpler in "build" as well.

yes they're real cores
>user learns for the first time that cores don't matter outside of very specific optimized tasks

Check out my 16 ALU, 8 FPU, 12 AGU Intel i7! 4 ALUs, 2 FPUs, and 3 AGUs are scheduled at once, and the scheduler has two input instruction streams it can schedule between at any time, with special register banks to avoid collisions between the two threads! Not to mention a pretty fancy uOp unpacker because of those pesky CISC instructions, ya know? Oh, there's other stuff, but I'd hate to go on for the next hour and a half about it. How about your CPS (central processing SoC)?

in short
>CPU
each core has its own memory
fast cores
can do complex tasks
cores can be used separately
>GPU
all cores share the same memory
slow cores
can only do "simple" tasks
all of them have to do the same task (video rendering); see the sketch below
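
If you want that list in code terms, here's a rough CUDA sketch (the array size and function names are made up, purely for illustration): the CPU version is one core walking the array serially, the GPU version is thousands of threads all running the exact same tiny kernel, each on a different element.

    // add_one.cu -- illustrative only, error checking omitted
    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    __global__ void add_one_gpu(float *x, int n) {
        // every thread runs this same code, just on a different index
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] += 1.0f;
    }

    void add_one_cpu(float *x, int n) {
        // one fast core chews through the whole array by itself
        for (int i = 0; i < n; ++i) x[i] += 1.0f;
    }

    int main(void) {
        const int n = 1 << 20;
        float *h = (float *)malloc(n * sizeof(float));
        for (int i = 0; i < n; ++i) h[i] = (float)i;

        add_one_cpu(h, n);                              // CPU: serial

        float *d;
        cudaMalloc(&d, n * sizeof(float));
        cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
        add_one_gpu<<<(n + 255) / 256, 256>>>(d, n);    // GPU: ~1M threads, same kernel
        cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);

        printf("%f\n", h[0]);                           // 2.0 after both passes
        cudaFree(d);
        free(h);
        return 0;
    }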

They're similar to cores in that they've become programmable, though they're still targeted at being most efficient at rendering 3D graphics.

They're evolving, however. GPGPU has gained some serious traction over the past 3-5 years, thanks largely to Google's interest in machine learning, Tesla's self-driving cars, and Nvidia's efforts to expand hardware acceleration in conjunction with big-name CAD companies.

AMD's efforts with async compute could also make for some significant changes in the computer hardware market. Their biggest obstacle has always been that no one cares what they do until Nvidia makes their own version.

Nvidia has had async since the 980ti

No, the absolute smallest thing resembling an actual processor core is the Compute Unit (AMD) or Streaming Multiprocessor (Nvidia), which contain and operate the actual ALUs (64 each in GCN, 32 in post-Fermi Nvidia).

Even then, adjacent CUs/SMs tend to be running the exact same fucking instruction sequences just on different parts of the data, so it's very, very unlike any sort of traditional independent CPU cores.
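
Here's roughly what that lockstep business means in practice (just a kernel, a sketch assuming NVIDIA's 32-wide warps): threads in the same warp that take different branches don't go their separate ways like CPU cores would; the hardware runs both paths back to back with lanes masked off.

    __global__ void divergent(float *out) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        // Threads 0-31 of a block form one warp. This branch splits the warp
        // in half, so the warp executes BOTH paths serially, masking lanes
        // on and off. 32 independent CPU cores would each just take their own path.
        if (threadIdx.x % 32 < 16)
            out[i] = sinf((float)i);   // lanes 0-15 active, 16-31 parked
        else
            out[i] = cosf((float)i);   // lanes 16-31 active, 0-15 parked
    }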

One of the biggest obstacles to getting high performance out of GPGPU is the data transfer latency between the CPU and the GPU. Depending on the application kernel, data transfer times can be one or two orders of magnitude longer than the time required to actually execute the kernel. This causes a number of issues with respect to real-time requirements in safety-critical applications (see self-driving cars).
The GPU architecture can be exploited to obtain high performance in very specific contexts where the data set is bounded and operations can be planned ahead with a real-time scheduler running on the CPU. And of course the application must be suitable for parallelization.
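
You can see the transfer-vs-compute gap on any NVIDIA card with a few CUDA events (a rough sketch; the kernel and sizes are arbitrary): time the host-to-device copy separately from the kernel launch and compare.

    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    __global__ void scale(float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= 2.0f;                       // trivially cheap kernel
    }

    int main(void) {
        const int n = 1 << 24;                         // 64 MB of floats
        size_t bytes = n * sizeof(float);
        float *h = (float *)malloc(bytes), *d;
        for (int i = 0; i < n; ++i) h[i] = 1.0f;
        cudaMalloc(&d, bytes);

        cudaEvent_t t0, t1, t2;
        cudaEventCreate(&t0); cudaEventCreate(&t1); cudaEventCreate(&t2);

        cudaEventRecord(t0);
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);   // PCIe transfer
        cudaEventRecord(t1);
        scale<<<(n + 255) / 256, 256>>>(d, n);             // actual compute
        cudaEventRecord(t2);
        cudaEventSynchronize(t2);

        float copy_ms, kernel_ms;
        cudaEventElapsedTime(&copy_ms, t0, t1);
        cudaEventElapsedTime(&kernel_ms, t1, t2);
        printf("copy %.2f ms, kernel %.2f ms\n", copy_ms, kernel_ms);
        // with a kernel this cheap, the copy usually dominates by a wide margin
        cudaFree(d);
        free(h);
        return 0;
    }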

It's not quite accurate to say that all of the cores must do the same task: different core groups can be instructed to perform different computations, though with a limit on granularity due to the shared resources (the GPU architecture is hierarchical).
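
For instance (a sketch, assuming a card and driver that support concurrent kernel execution; the kernels themselves are made up): two kernels launched on separate CUDA streams can land on different SMs at the same time, each doing its own computation.

    #include <cuda_runtime.h>

    __global__ void job_a(float *x, int n) { int i = blockIdx.x * blockDim.x + threadIdx.x; if (i < n) x[i] += 1.0f; }
    __global__ void job_b(float *y, int n) { int i = blockIdx.x * blockDim.x + threadIdx.x; if (i < n) y[i] *= 3.0f; }

    int main(void) {
        const int n = 1 << 16;
        float *d_x, *d_y;
        cudaMalloc(&d_x, n * sizeof(float));
        cudaMalloc(&d_y, n * sizeof(float));
        cudaMemset(d_x, 0, n * sizeof(float));
        cudaMemset(d_y, 0, n * sizeof(float));

        cudaStream_t s1, s2;
        cudaStreamCreate(&s1);
        cudaStreamCreate(&s2);

        // two unrelated computations on two streams: launches small enough to
        // leave SMs free can run side by side, within the hardware's limits
        job_a<<<32, 256, 0, s1>>>(d_x, n);
        job_b<<<32, 256, 0, s2>>>(d_y, n);

        cudaStreamSynchronize(s1);
        cudaStreamSynchronize(s2);
        cudaFree(d_x); cudaFree(d_y);
        return 0;
    }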

stop it

Make me

>be me using 2012 computer with upgraded amd x6 1060t processor
>8 gigs ram
>be making music with reaper
>4 instances of amplitube
>distortion vst
>synth vst
>gorillion tracks
>delays, reverbs, choruses, modulation
>check task manager
>46% RAM usage
>average 23% / core usage spread nicely out among 6 cores

Fucking A, not bad for an old POS.

>he doesn't use his PC exclusively for single core optimized VIDYA GAEMS

Look at this faggot.

Joke's on you, comrade. This setup with a 1GB ATI HD 6970 runs Stalker just fine.

They're cores in the sense that they can execute the instruction set of the device.

Though there was a period where 'cuda cores' were a thing. And stream processors.

But really, GPUs are amazingly fast devices even for non graphics tasks. You just need to adapt your algorithms to them. Their biggest downfall is the latency.

what if we gave each gpu core (isn't that a cuda core?) its own memory

Memory organization on a GPU is already hierarchical: each core group has its own registers and local/shared memory, backed by caches and then VRAM. The post you are replying to is grossly oversimplifying things.

so your claim is that it's good to dumb down the language because it's easier to advertise?
They could list the number of flip-flops for all I care.
Nothing matters anyway. The target demographic for the advertising is gamers, and they're all morons who will buy the latest thing regardless of what the specifications say.

I believe his claim is that the number of ALUs and FPUs is not enough to characterize the processing power and capabilities of a CPU. A number of complex architectural details make comparisons much less simple than counting flip-flops or pipeline stages or quoting the nominal operating frequency.
Moving high-level specifications from the number of cores to the number of ALUs is not an improvement at all. It's like evaluating a sports car's performance by reporting the number of bolts in the engine.
In the end you can only rely on benchmarks and hope that the tested kernels are valid performance approximators.

What does the architecture of a GPU core look like? How many registers, and how big are they? Is their instruction set standardized like x86, or is it different between manufacturers?

GPU code is translated from API (D3D, OGL, etc.) code to the underlying architecture's assembly code by the drivers; the drivers are kind of like a JIT compiler at this point. The assembly code is different for each family of GPU, as they all have varying instruction sets.

The GPU then decodes the assembly code into wavefronts that can be scheduled onto each compute unit. Each unit can then distribute work to its ALUs. The ALUs do all the complex vertex and pixel calculations, geometry is handled by a geometry engine, and finally the render back-ends actually turn the different layers of shaded data into pixels.
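
You can poke at the NVIDIA half of that "driver as JIT" idea yourself with NVRTC (a sketch; the kernel string and the target-architecture flag are just placeholders): the source only becomes PTX at runtime, and the driver lowers that PTX to the machine code of whichever card is actually installed when the module gets loaded.

    // jit.cpp -- build with something like: g++ jit.cpp -lnvrtc (error checks omitted)
    #include <nvrtc.h>
    #include <cstdio>
    #include <vector>

    // device source as a plain string: nothing is compiled until runtime
    static const char *kSrc = R"(
    extern "C" __global__ void scale(float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= 2.0f;
    }
    )";

    int main() {
        nvrtcProgram prog;
        nvrtcCreateProgram(&prog, kSrc, "scale.cu", 0, nullptr, nullptr);

        const char *opts[] = {"--gpu-architecture=compute_61"};   // placeholder arch
        nvrtcCompileProgram(prog, 1, opts);

        size_t ptxSize = 0;
        nvrtcGetPTXSize(prog, &ptxSize);
        std::vector<char> ptx(ptxSize);
        nvrtcGetPTX(prog, ptx.data());
        nvrtcDestroyProgram(&prog);

        // PTX is still a virtual ISA; the driver translates it into the real
        // per-family machine code when the module is loaded onto the card
        printf("%s\n", ptx.data());
        return 0;
    }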

yea, it was easier back when every instruction completed in a fixed number of cycles and you could easily estimate execution time, but computers are faster now, clock frequency doesn't have to be constant, and power saving is more important than performance in most cases.

...

>archiecture

You could make the argument that most GPUs only have one "core" since (at least for AMD and nVidia's GPUs) everything is scheduled by a shared back end/command processor/"gigathread engine".