How difficult is it to program for more processor threads...

How difficult is it to program for more processor threads? We've had quad-core computers for over 10 years now, but we're only just beginning to embrace multi-core processing.


Really difficult.
There are a lot of difficulties that appear only in this style of programming.
First, your language needs some kind of execution thread. Every thread has its own sequence of instructions that it executes independently of all other threads. Threads can all run simultaneously, or one after the other, or anything in between.
There are several different ways to implement threads in a language.
The next thing is communication and synchronization. When threads are running independently, there needs to be a way for them to communicate through shared data.
There are several different ways to implement thread communication in a language.
With that you have concurrency. But it creates some problems.
It mostly comes down to the fact that there's no ordering on the instructions across threads. First, as a programmer you actually have to identify which parts of your program can be executed independently, which can get very tricky. You can create threads that do stuff independently, but the threads can execute in any order. If the synchronization is wrong, they can access and manipulate shared data in any order (a race condition), or they can deadlock waiting on each other. And it's very easy to get this wrong. It might even be that sometimes everything goes well and sometimes your program crashes, simply because you can't predict the order in which your program executes. This makes debugging much more difficult.
Concurrency also creates some new possibilities though. It's a new and different way to structure your program and through simultaneous execution of threads you get a speedup, but you have to take care of the issues that come with it.

>How difficult is it to program for more processor threads?
Ever tried to do it? It's incredibly difficult. You basically have to think, for every single line of code in the program, "Okay, what might go wrong if at this exact moment in execution, some other thread changed something?" Finding race conditions and minimizing the amount of locking you have to do is brain-twisting work.

This is after you restructure all your algorithms to break work up into bits that can run in parallel. For some problems there's just no way to do that.

The first issue is that not many workloads can be parallelized. Even if a workload can be parallelized, that doesn't mean it will benefit from it; it might even slow down if the threads spend much of their time waiting.

This. In most cases it slows the whole process down.

Very difficult. You're at the level of retarded where not only are you incapable of doing the task, you're incapable of even understanding what the task is.

>Q6600 released Q1 2007
>Phenom X4 9500 Q4 2007
> over 10 years.

I think you need to go back to elementary and learn how to count to 10

On the other hand, even "multi-threaded" games barely make use of 4 cores and still prioritize the first 2~3 cores, which is why the cheap i3 is still widely used today for budget gaming.

Those weren't the first.

For most problems it's basically impossible to benefit from multiple threads, but there are some problems where multiple threads can speed things up drastically, and in those cases it's often trivial to implement. Sometimes you have to deal with shared data, which is a pain; easy to implement, hard to debug.

Have either of you actually programmed with multiple threads? It's not as hard as you're making it out to be; you sound like you just started an OS course. You just have to use mutexes for shared data and it'll all work out well enough, provided you haven't done anything retarded. Debugging can be a pain in the ass, but it's not the worst thing to debug.

>You just have to use mutexes for shared data and it'll all work out well enough
and the more locking you do the less parallelism you can extract. You want all your threads making progress and doing productive work, not waiting to acquire a mutex. The hard part of parallel programming is designing things so that several threads can work together on the same thing without traffic-jamming into each other over their shared data structures all the time.

Wouldn't being able to execute 2 threads at once mean the program can work up to 2x as fast in some scenarios?

>Wouldn't being able to execute 2 threads at once mean the program can work up to 2x as fast in some scenarios?
In practice it will almost always be less than 2x, because of the cost of creating and synchronizing the threads and the parts of the program that can't be split up.

There aren't any problems I've encountered where threads spend much time waiting on locks. Most things I've dealt with just require a lot of shared reads and asynchronous writes with no chance of a race issue. Do you have any examples of problems that aren't like that?

The devil is in the "up to". Some types of problems are "embarrassingly parallel" - it's trivial to divide them up into arbitrary numbers of independent pieces. These show the linear speedup with number of cores that you describe. Double the cores, double the speed. Much of graphics rendering is like this, which is why you have over a thousand cores ("stream processors", whatever) in your GPU.

Some problems experience overhead, so 2x the cores gets you more than 1x but less than 2x the speed. Commonly there's a point of diminishing returns where adding more threads stops making things significantly faster. This happens because of contention over shared data, the program becoming limited by something else (e.g. memory bandwidth) as the number of threads increases, or just because significant extra work needs to be done to divide up the problem into independent chunks.

And some problems are inherently serial. If computation X depends on knowing the result of computation Y, X can't happen until Y is completed. You can split up such a problem into several threads, but it's pointless, only one will be making progress at a time.

And since all the kids here are especially interested in gaming, it's worth noting that games have an additional complication: what computations are to be performed are dependent on user input, which is received continuously. You can't get too far ahead of the player since you don't know what input he's going to give - you don't know you need to draw the muzzle-flash sprite on the gun barrel until the player clicks the mouse. This limits your ability to use idle computational resources to get a head start on chewing through drawing the next frame.

>How difficult is it to program for more processor threads?
Depends on your workload.

Making something run on multiple processors is easy. The difficult part is subdividing your workload.

They just sound like C programmers to me.

Multi-threading and thread-safe data structures are trivial in higher level languages, for example Haskell, but in C you (as usual) have to reinvent the wheel every single time you do something.

About the games, I'm sure there's some idle/consistent shit throughout the game that might benefit from a dedicated core?
Maybe water effects or fog? Idk, but not every aspect of the game environment is user controlled

>for most problems, it's basically impossible to benefit from multiple threads,
I'd tend to disagree, I just think that our conventional languages make it extremely difficult.

With something like pure functional programming, for example, you can easily extract parallelism wherever there are things you can reduce in parallel. The only major challenge is figuring out which parts of the code will take long enough to compute that the cost of spawning a thread is less than the cost of simply computing it, but with some simple programmer annotations you can ignore that problem for the time being.

If the player looks in a different direction this frame than last frame, the fog or water effects have to be drawn differently.

I always assumed the fog was being rendered for the whole area and not just fov.

Damn, that makes sense now.

>Some problems experience overhead, so 2x the cores gets you more than 1x but less than 2x the speed. Commonly there's a point of diminishing returns where adding more threads stops making things significantly faster. This happens because of contention over shared data, the program becoming limited by something else (e.g. memory bandwidth) as the number of threads increases, or just because significant extra work needs to be done to divide up the problem into independent chunks.
This is not the primary reason. The primary reason is much simpler than that: The ratio of parallel to serial code.

If your workload is 75% parallel and 25% serial, you have an asymptotic limit of 4x speedup no matter how many cores you add. Even if you add a hundred million cores, you won't exceed 4x speedup.

It will always depend on the problem.
And there is a huge difference between doing N different tasks simultaneously and splitting a single task into N threads.

This thread has done much to remind me how truly ignorant I am.

Can someone please recommend me a beginner's guide/book to learning more about this?

You should play wolfenstein 3d for a few minutes.
It was one of the first 3D shooters, and it rendered things more or less the way you're imagining.
Going around objects is very entertaining as you see why it doesn't work.

Pretty difficult.
Don't get me wrong here, it's easy to occupy all available cores!
But from the point of view of the performance of a single unit of work, single-core performance is the only thing that really matters. Well, that and memory. And IO. But if you look at a single stack, there is nothing you can do to make it finish faster other than improving the performance of whatever core is running it, because only one core is running it at any given time.
You can run many threads and load-balance between them, and show off your CPU load chart claiming that you solved it, but it's not real.

I've seen a picture here on Sup Forums with a cow getting separated into pieces and multiple cores making hamburgers out of it faster than a single one could. But no matter how many cores you have, you will not be able to decrease the time they take to produce the first, single hamburger. And that's the most important metric.

Like this?

Not exactly what you were looking for, but speaking of book recommendations I can recommend this to anybody who's either a Haskell programmer already or just curious about it:

chimera.labs.oreilly.com/books/1230000000929/index.html

It's not very difficult at all... for some applications. There are a few applications that make it difficult, but the majority can be scaled to multicore pretty easily.

The problem is most developers are shit. They cheated their way to their degree (or don't even have one) and have no idea what they're doing.

>trivial in higher level languages, for example Haskell
trivial languages for trivial people.

That was one of my favorite games as a child, even though I couldn't figure out how to make it past the first level.

That and totala, but totala was better

>11
>octo
?!

Pretty easy in Rust.

>itt fags who never wrote C and cant manage their own memory

Yes.