>With running a number of new Ryzen Linux tests lately, a number of readers requested I take a fresh look at the reported Ryzen segmentation fault issues / bugs affecting many Linux users. I did and am still able to reproduce the problem.
>For those that missed our earlier article on the matter from early June, heavy workloads can cause problems on Ryzen, in particular segmentation faults while there have also been reports of some stability problems.
>This Google Doc remains among the resources trying to track this issue on Linux, while the Gentoo Forums, AMD Forums, and elsewhere have more reports of various problems encountered under extreme workloads -- like a ton of code compiling for hours on end, though it can also happen in other scenarios.
>AMD hasn't publicly commented on the problem and as of Linux 4.13 the issue is still happening. If carrying out the same tests on Intel CPUs, the segmentation faults do not occur. There is even ryzen-test to easily try reproducing the issue. The ryzen-test script will build GCC in parallel loops from a compressed ramdisk, in order to easily stress the CPU. In my day-to-day benchmarking of Ryzen CPUs, however, I haven't hit this problem, not even on my main production desktop using a Ryzen 5. The problem really only comes to light under very heavy and continuous workloads, it seems.
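For anyone wondering what "build in parallel loops and watch for segfaults" looks like in practice, here is a minimal Python sketch of that kind of harness. It is not the ryzen-test script itself: the function names `run_loop` and `count_segfaults` are made up for illustration, and the command to run is whatever build step you choose. It only counts child processes that were themselves killed by SIGSEGV. A build driver such as make normally reports a crashed compiler as an ordinary non-zero exit, so a real harness would also have to scan the build log.

```python
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_loop(cmd, iterations):
    """Run cmd repeatedly; return how many runs were killed by SIGSEGV."""
    segfaults = 0
    for _ in range(iterations):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a negative return code means "terminated by that signal".
        if result.returncode == -signal.SIGSEGV:
            segfaults += 1
    return segfaults


def count_segfaults(cmd, loops, iterations):
    """Run `loops` copies of run_loop in parallel and sum their counts."""
    with ThreadPoolExecutor(max_workers=loops) as pool:
        futures = [pool.submit(run_loop, cmd, iterations) for _ in range(loops)]
        return sum(f.result() for f in futures)
```

Something like `count_segfaults(["make", "-C", "/path/to/build"], loops=8, iterations=50)` would be the shape of a real run, with the path being a placeholder. Threads are enough here because each worker just waits on a child process, so the actual load comes from the commands being launched.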
only Sup Forumstards fall for amd cpus so who cares
Camden Perry
>Linux making Ryzen crash Try again, Brian.
David Evans
>first shill thread fails > because of terrible formatting and bad image >try again with spam and a crying wojak lmao
Christian Gonzalez
>it's OK when AMD does it
Jackson Rogers
Works on my machine.
Asher Rogers
>Linux Found your problem.
Jordan Jenkins
Didn't BSD fix this already?
Christian Murphy
>Using linux hahahahahahahhaah
Andrew Smith
>Linux 'developers' haven't fixed an issue for a new product yet >this is somehow AMD's fault
TRY AGAIN, BRIAN.
Cameron Hill
Sounds like something that will get patched.
Nolan Nguyen
It's a GCC thing. Literally never happened on Clang or Microsoft's compiler.
Owen Young
Hi, Matt Dillon here. Yes, I did find what I believe to be a hardware issue with Ryzen related to concurrent operations. In a nutshell, for any given hyperthread pair, if one hyperthread is in a cpu-bound loop of any kind (can be in user mode), and the other hyperthread is returning from an interrupt via IRETQ, the hyperthread issuing the IRETQ can stall indefinitely until the other hyperthread with the cpu-bound loop pauses (aka HLT until next interrupt). After this situation occurs, the system appears to destabilize. The situation does not occur if the cpu-bound loop is on a different core than the core doing the IRETQ. The %rip the IRETQ returns to (e.g. userland %rip address) matters a *LOT*. The problem occurs more often with high %rip addresses such as near the top of the user stack, which is where DragonFly's signal trampoline traditionally resides. So a user program taking a signal on one thread while another thread is cpu-bound can cause this behavior. Changing the location of the signal trampoline makes it more difficult to reproduce the problem. I have not been able to completely mitigate it. When a cpu-thread stalls in this manner it appears to stall INSIDE the microcode for IRETQ. It doesn't make it to the return pc, and the cpu thread cannot take any IPIs or other hardware interrupts while in this state.
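The workload shape described above (one logical CPU stuck in a cpu-bound loop while its sibling keeps taking signals) can be sketched in Python roughly as follows. This is an illustration of the scenario only, not a reproducer. The names `take_signals` and `run` are invented, and the `spin_cpu` / `signal_cpu` numbers are assumptions you would have to fill in for your own machine. Pinning relies on `os.sched_setaffinity`, which only exists on Linux, and whether the kernel's return path for a signal actually goes through IRETQ depends on the OS.

```python
import os
import signal
import subprocess
import sys
import time


def take_signals(duration, interval):
    """Busy-wait for `duration` seconds while SIGALRM fires every `interval`."""
    hits = 0

    def on_alarm(signum, frame):
        nonlocal hits
        hits += 1

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline:
            pass
    finally:
        # Disarm the timer and restore the old handler before returning.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
    return hits


def run(duration=1.0, interval=0.001, spin_cpu=None, signal_cpu=None):
    """Spin in a child process while this process takes timer signals."""
    spinner = subprocess.Popen([sys.executable, "-c", "while True: pass"])
    try:
        if hasattr(os, "sched_setaffinity"):
            if spin_cpu is not None:
                os.sched_setaffinity(spinner.pid, {spin_cpu})
            if signal_cpu is not None:
                os.sched_setaffinity(0, {signal_cpu})
        return take_signals(duration, interval)
    finally:
        spinner.kill()
        spinner.wait()
```

On Linux, the two logical CPUs that share a core are listed in `/sys/devices/system/cpu/cpu0/topology/thread_siblings_list`; passing those two numbers as `spin_cpu` and `signal_cpu` puts both halves of the workload on the same core. `run` must be called from the main thread, since Python only allows signal handlers to be installed there, and it returns the number of signals handled.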
>Yes, I did find what I believe to be a hardware issue with Ryzen related to concurrent operations.
JUST WAIT(TM) FOR MICROCODE PATCHES
Gabriel Murphy
>The bug is in Clang but worse.... I can get twice the number of seg faults when using Clang.... 121 per hour... More details tomorrow. been running tons of tests all day.
>More details tomorrow. It is tomorrow, where are the details, Michael?
Logan Lopez
That's what you get for buying Rypoo garbage from Poos in Loos
David Fisher
Why do you feel the need to put a quote into a really obnoxious "code" box, faggot?
Christian Sanders
>linux That's your problem.
Carson Edwards
>muh ryzen moar coars because I can compile gorillion gentoo VMs >ryzen has critical bug related to heavy tasks like compiling >w-who gives a shit about loonix, poojeetsoft designated street 10 4lyfe wew
Isaac Flores
>no one at AMD considered testing their processors by compiling a bunch of stuff
lel
Kevin Cook
Epyc is dead in the water if they don't fix this
Camden Gray
Sounds like a Linux problem
Luis Ortiz
>Linux Same bug has been reproduced by running crypto mining software on Windows
Sup Forumstards only uses gaymer intel cpu. They don't use gaymer amd cpu.
Matthew Martin
>install gentoo on my brand new ryzen computer >SEGFAULT.ogg starts playing
Robert Howard
>buying a meme CPU and using a meme OS
Parker Gutierrez
>being a wincuck Hello, Sup Forums.
Dominic Foster
Really surprised they haven't fixed this by now.
Justin Powell
This is really to be expected. Ryzen is a completely new design. Remember: If you buy the gen 1 of anything, you're beta testing for the manufacturer.
Carson Bailey
...
Jackson Gray
Remember that FMA and VTE bug a few weeks from launch, guess what, microcode update.
Worst case, if this can't be fixed by microcode it can be fixed by a new stepping.
Evan Morgan
Good thing I only play games.
Ryan James
>Ryzen >hyperthread Is this stupid nigger serious? No wonder why they can't get the fucker to work properly, they're trying to use Intel drivers on it.
Connor Diaz
AMD has sent out 5000 EPYC samples for partner testing since 2017 started til computex, I find it hard to believe companies and AMD aren't aware of this months before ryzen launched.
Robert Sanchez
The ayymd damage controll team now look at server market as a sour grape?
Christopher Anderson
>Running loonix or wangblows I think I found the problem
Ethan Williams
Its like no one remembers the Phenom TLB bug where the only fix was to cripple memory performance. But its probably cause no one bought that pile of shit.
Eli Powell
This. Works perfectly on templeOS.
Dominic Robinson
>w-who needs to compile shit on server cpu
Jacob Watson
Works perfectly with minuet.
Ryder Harris
I know one die hard amdfag who bought it and defended it.
Julian Morris
>ocaml >fixed where's amds update?
Evan Thomas
This shit is still present on the new stepping as epycs are affected too.
Jace Jenkins
EPYC fail
Nathaniel Mitchell
>Ryzen can't into gaming >Epyc can't into compiling on OS ran by majority of servers Quick guess in what way AMD will fuck up threadripper. My bet is on the lack of proper cooling since no copper plate actually covers the IHS
Ethan Kelly
>FOSS >Being smart
chooch one
Landon Davis
Noctua is already making heatsinks that cover the entire IHS. AMD is distributing Epyc heatsinks with Threadripper. I think the shitty watercooler is Ayylienware only.
Also, this is going to be fixed like every other issue. Intel had a similar Hyperthreading crash bug recently, too.
Tyler Sanchez
Should say "fixed with microcode".
James Murphy
that affects ocaml fags only, not the whole Linux stack
Aaron Morgan
OCaml discovered it first. Doesn't mean it wouldn't affect other types of software.
Hunter Wilson
no, the increased demands of ocaml on the compiler were to blame, honey.
Nathan Young
It might just be that they don't give enough voltage to the cpu when all the cores are running at 100%.
It would explain why some people seem to be much more affected than others and why it only happens at high workloads.
>sphagetti code lincucks has problems with cutting edge hardware More news at 11
Jack Reed
> verified that the microcode fix indeed solved the OCaml issue >One important point is that the code pattern that triggered the issue in OCaml was present on gcc-generated code. There were extra constraints being placed on gcc by OCaml