Inspired by the bunch of fucktards yesterday who knew nothing about device drivers, I decided to make this thread explaining why microkernels that run drivers in user-space are a meme and need to die.
Myth 1: Running a driver as a normal user-space process is safer and doesn't crash the system.
This claim is based on a number of misunderstandings. While it is true that a protection fault (e.g. dereferencing a NULL pointer) in kernel space will cause a kernel panic, whereas in user space it only kills the offending process, this issue is minimal.
There are two situations I'd like to address here, MMIO and DMA.
For MMIO, IO devices have on-board memory regions, described by BARs (Base Address Registers), that are mapped into the physical address space by the BIOS at boot in a process called bus enumeration. This allows the CPU to read and write these memory addresses, and the reads and writes are forwarded to the IO device itself. In other words, this is how the CPU is able to read and write registers on board the device.
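To make the register read/write pattern concrete, here's a toy C sketch. Everything in it is simulated: the "BAR" is plain heap memory and the register offsets are invented, since real MMIO needs the kernel's ioremap() or an mmap() of the PCI resource file — but the volatile access pattern is the same:

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy model of MMIO register access. On real hardware the "BAR" would be a
 * physical address range assigned during bus enumeration; here it is just
 * heap memory so the access pattern can be shown. Offsets are made up. */

#define REG_CTRL   0x00  /* hypothetical control register */
#define REG_STATUS 0x04  /* hypothetical status register  */

static inline void mmio_write32(volatile uint8_t *bar, size_t off, uint32_t v)
{
    /* volatile: the compiler must not cache, elide, or reorder the access,
     * because on real hardware each access has a side effect on the device */
    *(volatile uint32_t *)(bar + off) = v;
}

static inline uint32_t mmio_read32(volatile uint8_t *bar, size_t off)
{
    return *(volatile uint32_t *)(bar + off);
}
```

The same two helpers work whether `bar` came from ioremap() in a kernel driver or from mmap() in a user-space one — which is exactly why the interesting question is who is allowed to obtain that mapping, not how it's poked afterwards.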
Running drivers in user-space would mean exposing physical addresses to user-space. With no additional form of protection, a buggy or malicious driver could read and write arbitrary locations in RAM, including where the kernel resides. It would be able not only to crash the system but to break out of user-space isolation, meaning that the separation of kernel-space and user-space becomes completely void.
You might argue that the kernel could provide some form of protection against this, for example an API that exposes only the physical memory regions that are valid for the device in question. The issue, however, is that where devices are mapped in memory is a completely arbitrary process, done solely at the discretion of the BIOS. In other words, you'd end up with an extremely bloated API doing a bunch of redundant checking just to avoid exposing physical address space to user-space. This would violate the very premise of running a microkernel in the first place, not to mention that you'd be exporting a bunch of functionality, such as pinning pages and requesting DMA buffers at certain ranges with such and such alignment, to user-space while still having to do all the checks in the kernel. If you think syscalls are a bad idea and very monolithic, imagine this monstrosity of an API.
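For illustration, a minimal sketch (names and layout invented) of the kind of redundant checking that API would have to do on every single mapping request: verify that a user-space driver's requested physical range lies entirely inside one of the BARs enumeration assigned to its device:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of kernel-side validation for a user-space MMIO API:
 * a request may only be granted if the physical range fits inside one of
 * the device's BARs. Struct and function names are invented. */

struct bar { uint64_t base; uint64_t size; };

static bool range_in_bars(const struct bar *bars, size_t nbars,
                          uint64_t addr, uint64_t len)
{
    for (size_t i = 0; i < nbars; i++) {
        /* overflow-safe containment check: addr..addr+len within the BAR */
        if (addr >= bars[i].base &&
            len <= bars[i].size &&
            addr - bars[i].base <= bars[i].size - len)
            return true;
    }
    return false;
}
```

And this is just the MMIO half; the page-pinning and DMA-buffer calls mentioned above would each need their own equivalent of this gatekeeping.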
Of course, this only addresses the issue of a driver having access to physical address space. Then there's also the issue of DMA and the device itself. When a driver does DMA, it typically requests a contiguous memory region from the kernel which the device is able to reach (some devices, such as Nvidia GPUs, only have 30 address bits, meaning they can't reach the entire 64-bit address space). The driver then passes the address of this range to the device (writing it into a register using MMIO) and the device will then read or write directly to RAM without involving the CPU. In other words, Direct Memory Access (DMA).
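A toy sketch of that addressing-limit problem (helper names invented; 30 bits as in the GPU example): check whether a buffer is reachable by the device's DMA engine, and hence whether a bounce buffer in low memory would be needed:

```c
#include <stdbool.h>
#include <stdint.h>

/* A device with an N-bit DMA engine can only reach physical addresses
 * below 2^N; buffers above that need to be bounced through low memory.
 * All names here are invented for illustration. */

static inline uint64_t dma_mask(unsigned bits)
{
    return (bits >= 64) ? ~0ull : ((1ull << bits) - 1);
}

/* true if the whole [addr, addr+len) range is reachable by the device */
static bool dma_reachable(uint64_t addr, uint64_t len, unsigned addr_bits)
{
    uint64_t mask = dma_mask(addr_bits);
    return len != 0 && addr <= mask && len - 1 <= mask - addr;
}

/* a bounce buffer is needed exactly when the buffer is out of reach */
static bool needs_bounce(uint64_t addr, uint64_t len, unsigned addr_bits)
{
    return !dma_reachable(addr, len, addr_bits);
}
```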
2 / 4
Colton Robinson
Myth 2: IOMMUs solve everything
The problem, though, is that the driver can pass along ANY physical address and make the device read or write arbitrary memory locations (again, for example, where the kernel resides). Of course, you might be thinking right now that this is where IOMMUs come in, and you're quite right.
In addition to eliminating the need for bounce buffers (the case mentioned above, where the Nvidia GPU needs to address something above its 30-bit limit), an IOMMU can also provide address isolation by grouping devices into so-called domains. This prevents a driver from flushing data into random physical addresses. HOWEVER, the problem again is that setting these up is usually the task of the device driver. Exposing IOMMU access to user-space is a bad idea, so you'd end up incorporating this into the horrendously bloated API mentioned above.
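To show what domain isolation buys, here's a toy model of IOMMU translation (a real IOMMU uses multi-level page tables; the flat array and all names are invented): only IOVAs the driver explicitly mapped translate to physical pages, everything else faults instead of hitting RAM:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy IOMMU domain: the device issues IO virtual addresses (IOVAs), and
 * only explicitly mapped IOVAs translate. Invented names, flat table. */

#define IOMMU_PAGES 16
#define PAGE_SHIFT  12

struct domain { uint64_t pfn[IOMMU_PAGES]; bool valid[IOMMU_PAGES]; };

static void domain_map(struct domain *d, uint64_t iova, uint64_t phys)
{
    uint64_t idx = iova >> PAGE_SHIFT;
    if (idx >= IOMMU_PAGES)
        return;                        /* outside the toy table's range */
    d->pfn[idx]   = phys >> PAGE_SHIFT;
    d->valid[idx] = true;
}

/* returns true and fills *phys on a valid translation; false = IOMMU fault */
static bool domain_translate(const struct domain *d, uint64_t iova,
                             uint64_t *phys)
{
    uint64_t idx = iova >> PAGE_SHIFT;
    if (idx >= IOMMU_PAGES || !d->valid[idx])
        return false;
    *phys = (d->pfn[idx] << PAGE_SHIFT) | (iova & ((1u << PAGE_SHIFT) - 1));
    return true;
}
```

Note that `domain_map` is exactly the operation that has to live somewhere privileged — whoever may call it freely can still point the device at any physical page, which is the point made above.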
This of course assumes that there is an IOMMU available in the first place, something architectures other than x86 usually don't have. There's also the issue with PCIe P2P: enabling the IOMMU means that every TLP (i.e. memory operation) is forwarded to the root complex instead of taking the shortest path. A network card reading from a disk would see serious performance degradation. There is stuff like ATS, but it is highly vendor specific, and an AMD implementation of ATS is not respected by Intel's VT-d, for example. In addition, only a minority of devices actually support ATS. Nvidia GPUs certainly don't.
3 / 4
Carter Diaz
Myth 3: The performance penalty of running in user-space is negligible
As I mentioned above, P2P performance is kill if you use an IOMMU. However, the biggest performance killer is the cost of context switch.
First of all, running in user-space means that your driver is subject to the scheduler and at risk of having its memory swapped out. Of course, you could solve this by giving the driver higher priority and pinning its memory in RAM, but then you again have a situation where there is no real separation between user-space and kernel-space. The Linux kernel also runs in virtual address space; its memory is simply always mapped into the top 1 GB of every process's address space and protected using the hardware page protections. So in effect you blur the difference between kernel-space and user-space and gain none of the "benefits" of running in user-space.
The real issue here, however, is that you'd need some sort of mechanism to disable interrupts (and thus preemption) from user-space, because sometimes the device driver might do something that requires atomicity and can't be interrupted by the scheduler. So add this functionality to the already bloated driver API, further blurring the hard separation between kernel-space and user-space.
Secondly, there's also the issue of device-initiated interrupts. Imagine a Gigabit ethernet network controller generating an interrupt for every received packet. Normally, interrupt routines are short and do very little work. For user-space drivers, however, you'd need to context switch back into user-space and then run some routine there, while at the same time providing deadline and blocking guarantees.
4 / 4
Liam Anderson
tl;dr
Someone is mad as fuck
William Sanders
tldr
Oliver Moore
No one cares, you fucking angry turbonerd
Sage
Henry Cox
>informative argument against microkernel design
>"tl;dr"
Millennials
I guess Linus was right, then.
Noah Miller
>muh millennials
It's an autistic rant spanning 4 posts about a subject no one cares about. KYS
Jayden Howard
>Inspired by the bunch of fucktards yesterday who knew nothing about device drivers, I decided to make this thread explaining why microkernels that run drivers in userspace is a meme and needs to die.
Now that's some dedicated shitposting
Noah Mitchell
I made an informative rant about why I think device drivers in user-space are a bad idea, and that's your best comeback?
Christopher Reyes
>I just slapped together some random words and hope that they sound coherent enough to seem intelligible
Fix'd. Now go back to playing with Java, pajeet, and let the grown ups discuss kernel design.
Hunter Perry
Why are you so mad? How about actually addressing the arguments instead of throwing semi-racist Sup Forums memes around?
Brody Murphy
>baaawwww it's racist!!!!
Did I hurt your little poo in loo feelings, you disgusting currynigger?
Luke Brooks
All of this is very easy to refute: managed code.
You wouldn't be willing to let heavy abstraction layers such as JVM and CLR clog up your systems in exchange for a little security to begin with if monolithic kernels weren't such a massive failure.
2/10 made me reply
Josiah Jones
Cool now go make your autism kernel that will never ever be used.
Linux, Windows, Mac and the good BSDs are popular because they deliver features instead of internet autism wars
Ryan Morales
Managed code would introduce even more abstractions...
>Cool now go make your autism kernel that will never ever be used.
I'm arguing for the design that's already used in Linux and FreeBSD, instead of some autistic pet projects
Juan Green
So for the MMIO you just map a page to that physical address, I don't see the problem here.
Carson Cruz
>All of this is very easy to refute: managed code.
Not OP, but you'd still at some point need to deal with physical addresses and actual pages, which is OP's argument. And when you do so, it doesn't matter how "safe" the rest of the driver is, you're still able to fuck up unless you introduce layers upon layers with bloated "security" abstractions (which entirely violates the microkernel design principle in the first place).
Jaxson Jenkins
>Not running NodeOS
Nicholas Watson
>So for the MMIO you just map a page to that physical address, I don't see the problem here.
The problem is that you're exposing physical addresses to user-space, which would then be able to read and write into arbitrary memory locations thus breaking out of the user-space encapsulation.
Sebastian Thompson
Read part 2. It isn't a problem to map pages to physical addresses (this is already how it's done in all systems), the problem is that you're allowing user-space access to physical memory as said.
Jace Rivera
Uh, no Richard. Linux is the operating system, not just the kernel, and correcting users to say ganoo plus Linux in a vapid attempt to stay relevant is sad.
Xavier Williams
What the fuck did you just fucking say about me, you proprietary slave? I’ll have you know I graduated top of my class at Harvard, and I’ve been involved in numerous free software projects, and I have contributed to over 300 core-utils for GNU. I am skilled in Lisp and I’m St. IGNU-cius, saint of the Church of Emacs. You are nothing to me but just another unethical non-free software advocate. I will distribute the fuck out of your source code with freedom the likes of which has never been seen before on this Earth, mark my fucking words. You think you can get away with saying that shit about me and the GPL on the Internet? Think again, fucker. As we speak I am contacting my colleagues at FSF and your binaries are being reversed engineered right now so you better prepare for the storm, maggot. The storm that wipes out the pathetic little thing you call your copyright. You're fucking dead, kid. Free software can be anywhere, anytime, and it can ensure your freedom in over four ways, and that’s just with the GPLv2. Not only am I extensively skilled at C hacking, but I have access to the source of the entire GNU userland and core-utils and I will use it to its full extent to wipe your miserable proprietary code off the face of the continent, you little shit. If only you could have known what ethical retribution your little “clever” program was about to bring down upon you, maybe you would have ensured your users' freedom. But you couldn’t, you didn’t, and now you’re paying the price, you goddamn idiot. I will shit free as in freedom all over you and you will drown in it. You’re fucking dead, kiddo.
Ian Williams
Sup Forums in a nutshell
James Turner
That's Doctor Richard Matthew Stallman, PhD to you.
Brody Baker
>Subject no one cares about
>Sup Forums - Technology
They should rename this board to mindless consumerism or something
Dylan Cox
I use Windows, couldn't care less about this.
Charles White
The Kernel/user space dichotomy shouldn't exist. There should simply be an int's worth of process spaces, with no TLB or cache flushing required to switch between them.
Processor hardware is retarded.
Isaiah Russell
>I don't care about inner workings of an operating system or how device drivers work because I'm a mindless consumer
ftfy
Jack Davis
>it's 2017 and he still uses drivers lmao
Jace Flores
I like that. I'm actually thinking of making my own microkernel, so I thought about this. How it would work in my kernel is that the drivers get loaded either by the init system or by a command run by the user (as root, of course). Upon init the driver then requests that the devices it wants to use be mapped into its address space. If the device is already mapped, the call fails and the driver is shut down. This prevents just any process from taking control of a driver.
As for the part about exposing physical addresses to user-space processes, it is not any more dangerous than running a driver in a monolithic kernel. Drivers carry certain responsibilities, part of which is not fucking up the devices they are made for.
Jaxon Perez
Fuckin nerd
Brody Mitchell
>Upon init the driver then requests the devices it wants to use to be mapped into their address space. If the device is already mapped the call fails and the driver is shut down. It prevents just any process taking control over a driver.
So what about stuff like NVMe disks that can potentially support multiple lightweight drivers (as long as one is responsible for setting up admin queues)? What about SR-IOV capable devices that have a separate driver for each virtual function (which maps to the same physical address range)?
>As for the part about exposing physical addresses to user-space processes, it is not any more dangerous than running a driver in a monolithic kernel.
This is true, and is also part of my point. You're effectively blurring out the hard line between user-space and kernel-space.
>Drivers carry certain responsibilities, part of which is not fucking up the devices they are made for.
I'm mostly concerned with them breaking out of their isolation by accessing arbitrary regions of system memory and gaining access to kernel pages. In that case, the argument for running in user-space is entirely moot.
Parker Kelly
To reformulate my posts: I'm not saying that user-space drivers are inherently any less secure than kernel-space drivers. What I'm saying is that I fail to see how the benefits of running in user-space applies when you're dealing with stuff that has full control over physical memory. Such a driver crashing will still bring down the entire system in almost all cases.
Hunter Jenkins
At last, an informative quality post. Thanks for making 4chin better.
Anthony Diaz
>informative quality post
>pure lies
huh.... really makes you think..
Benjamin Sanders
How is it lies? Or are you baiting?
Jackson Johnson
>b-but I'm too retarded to read one or two paragraphs so only Sup Forums tier GPU shitposting should be allowed on Sup Forums
Carter Ward
obvious bait
Xavier Bell
(You)
Ian Torres
This isn't smartphones, get out
Anthony Collins
I guess the reason people do microkernels is for the little added security, such as surviving a NULL pointer dereference or restarting drivers when they crash. Also, tell me in what case a driver would crash the whole system.
Mason Powell
>I guess why people do microkernels is for the little added security such as dereferencing a null pointer
This could be handled simply by adding a protection fault handler to do some graceful error handling in kernel space, seeing how most modern kernels run in virtual address space (like Linux does).
>Also tell me in what case a driver would crash the whole system.
Pointing a device's DMA at kernel memory, for example (which could be handled by an IOMMU, assuming you've set up proper domains, something your microkernel must also do, further blurring the lines between user-space and kernel-space). Or a malicious driver could retrieve the physical address of kernel pages and inject code into them.
Then there's device-initiated interrupts: with interrupt routines in user-space, you'd have to add the cost of doing a context switch.
Then there's the issue of disabling interrupts and scheduler preemption, which blurs the lines between user-space and kernel-space even further. If you disable preemption and start a blocking call by mistake (taking a lock that's already taken, or whatever), you're fucked and have deadlocked your system, no matter whether it runs in kernel space or user space.
Jose Sullivan
Not the guy you're responding to, but thanks.
Jack Cox
wtf i hate microkernels now
Kayden Martin
What does this autistic shitfest have to do with Sup Forums ?
Fuck off back to your friendly linux thread generals.
Gabriel Gonzalez
>what does computer architecture and OS design have to do with technology
Charles Anderson
Thank you based Torvalds.
Cooper Miller
Thank you based OP
Joshua Reed
So you got butthurt and rekt in the Rust thread yesterday and now you made an 8000 character long rant about how butthurt you are? keke
Sebastian Sanders
I think your computer science teachers are still teaching you from books written in the '80s, when the word "micro-kernel" was associated with a future utopia.
Jeremiah Smith
>Tanenbaum
He is a fucking retard who spouts academic memes, because he can't into real stuff.
I would even go as far as to say that anyone who would like to see some different shit should look at Terry A. Davis's TempleOS. The way he does certain things is interesting.
Dominic Cooper
I was actually referring to (uninformed) opinions touted in a thread yesterday, where a bunch of Sup Forumsentoo men claimed that microkernels and userspace drivers were the best thing since sliced bread, yet they seemed completely unknowledgeable about how device drivers actually work.
Christian Adams
Terry doesn't do things interestingly. Quite the contrary: his FAT-like file system, his 1:1 virtual memory layout, his draw-to-VGA-buffer graphics etc. are just how DOS did it 30 years ago. The only improvement upon DOS is a preemptive scheduler, which isn't fucking hard to implement.
Parker Gutierrez
desu Minix is a lot more interesting than Terry "I can't into virtual memory" Davis
Luis Wilson
>Minix is a lot more interesting than [templeOS]
This
Jason Hernandez
Sorry, I meant that more towards them, than you.
Aiden Murphy
...
Brayden Howard
>However, the biggest performance killer is the cost of context switch.
No, not really. See cost of context switch on Linux vs cost of context switch on seL4.
You'd need hundreds of context switches on seL4 to even match the overhead of one context switch on Linux.
Levi Turner
>took a semester of operating systems, had to implement a microkernel
>it was literally just a rip off of MIT's 6.828
>implying I didn't steal all the code
Hardware classes are more fun than operating systems. Fuck OS niggers.
Carter Brooks
This is not true, user. The cost of context switching on Linux is actually very low, since the kernel makes up the top 1 GB of every process's address space and is pinned to that memory.
All that drivers under seL4 can do is use the IOMMU to contain the hardware to the driver's own pagetable.
Asher Fisher
The only numbers I see in that paper are for "one-way IPC of various L4 kernels", which does not really give any indication of the cost of context switching.
>All that drivers under seL4 can do is use the IOMMU to contain the hardware to the driver's own pagetable. It clearly states that IOMMU is experimental and for the unproved/unverified variant of seL4.
Leo Jenkins
Was meant for
Josiah Taylor
OP you are so intelligent. I wish I could regurgitate my Intro Operating System lecture notes, and put them in my own words as well as you!
Ryan Cox
>IOMMU to contain the hardware to the driver's own pagetable.
That's not how IOMMUs work, though. An IOMMU provides virtual IO addresses (bus addresses) which translate into physical addresses. Either the driver or the kernel needs to set up the correct IOMMU domains; it doesn't happen magically. Usually it's done by the driver in order to have full control over the device.
Luke Diaz
>It clearly states that IOMMU is experimental and for the unproved/unverified variant of seL4.
seL4 usually implements things and only verifies them sometime later. Personally, I don't care so much about verification as I do about the ongoing virtualization support, particularly the VMM, which is userland.
> The only numbers I see in that papers are for "one-way IPC of various L4 kernels", which does not really give any indication of the cost of context switching.
That's the cost of sending one message from one process to another. That is the most relevant kind of context switch in terms of microkernel design overhead. There's also the timer causing tasks to switch so that all runnable tasks get a chance to run, but that one isn't microkernel specific.
Aaron Barnes
>I think I'm not a mindless consumer because I use a broken operating system, with the compensation "At least I understand how my OS werks XD"
Jeremiah Taylor
But it is an extremely vaguely defined metric (messages can be of arbitrary size), and it doesn't say anything about how they measured it (did they run it once, is it an average, what is the distribution, etc.). I cannot compare this number fairly; you'd need to run tests on the same hardware under fairly similar circumstances.
Anyway, as for the cost of context switching on Linux. As mentioned, it is very low because kernel memory is already mapped into the top 1 GB of every process's address space. It's only a matter of restoring a stack pointer and a couple of registers and flushing the cache (which is the highest cost of context switching). Not even seL4 can avoid this.
We're not discussing an actual OS here, we're discussing the cost of running drivers in user-space. Pay attention, you might learn something.
Nathaniel Stewart
I was remarking on your stubborn pretentiousness of not understanding people share your interests. Keep shitposting pretending to be a CS PhD, you might go somewhere.
Gavin Young
*don't share your interests
Adrian Collins
>I was remarking your stubborn pretentiousness of not understanding people share your interests
You are free to ignore this thread completely, yet you decided to post in it, pretending to be proud of being an ignoramus.
The post literally reads "I use Windows, couldn't care less about this", yet it should be obvious that even Windows has IO device drivers and needs to do things in an optimal way.
>Keep shitposting pretending to be a CS PhD, you might go somewhere.
I am actually a PhD student.
Camden Bailey
I posted in this thread because I try to ignore idiots like you every day. No, it shouldn't be obvious, because he simply "doesn't care", something you fail to understand.
>I'm a PhD student
Okay kid
Ryan Wilson
I agree OP.
You want to know what's really going on? Microsoft and other companies, (and even governments) are independently trying to kill linux and open source.
The GPL is a danger to every big tech company because it means they can't take community driven software, re-brand it, and sell it.
Governments are also anti-FOSS because they're all abusing technology to spy on us. If all software were open source and people thought negatively of closed source, it would make it much harder to spy on people (we know the NSA was working with Microsoft).
Also, these organizations use SJW tactics to try to control and destroy communities. Look what happened to Mozilla. Firefox was the alternative to the bot net, now shills post everyday about how it's an SJW company, and the SJWs have really infiltrated it.
People pushing non-Linux kernels are just attempting to disrupt the Linux and open source ecosystem.
Christopher Campbell
>I posted in this thread because I try ignoring idiots like you everyday.
That literally doesn't make any sense.
Noah Walker
What's to understand? I got tired of seeing threads like these every day, so I started shitposting in them. Was that hard, Doc?
Kayden Miller
>No it shouldn't be obvious, because he simply "doesn't care" something you fail to understand.
You're on a technology discussion board and you get angry because people are discussing how technology works?
That's some high level autism right there.
Matthew Harris
>I got tired of seeing threads like these everyday, You get tired of seeing threads discussing hardware architecture and OS design every day?
Tell me, why the fuck are you even on Sup Forums then?
Kayden Thompson
>and it doesn't say anything about how they measured it
They didn't just measure it. This is WCET (Worst Case Execution Time), which is one of the proofs they do.
Easton Harris
But still, you agree that it is a vague metric? I mean, the statement "message passing takes 0.05 microseconds" doesn't really say anything. What is a message? How long is it?
Hudson Sullivan
I get tired of reading about pseudo-intellectuals getting butthurt with other posters because they don't share interests.
Anthony Flores
More like "Shallow IT and PC Gaming".
Make /prog/ a thing again.
Dominic Martinez
>it is very low because kernel memory is already mapped into the first 1 GB of every process' address space. It's only a matter of restoring a stack pointer and a couple of registers and flushing the cache (which is the highest cost of context switching)
Linux is remarkably slow at this. It takes over a whole microsecond, even on many-GHz CPUs.
Part of the reason is that the process concept is bloated on Linux. Part of it is that having kernel memory mapped isn't anywhere near as good as a whole microkernel permanently pinned in the TLB.
Aiden Cox
> Running drivers in user-space would mean exposing physical addresses to user-space. With no additional form of protection, a bad or malicious driver could potentially read and write from arbitrary locations in RAM including where the kernel resides.
How about every driver having its virtual address space just like any ordinary process?
Samuel Bennett
>pseudo-intellectuals
Do you have some sort of inferiority complex, user?
I was making a case for why user-space drivers are a bad idea, by explaining how they would be implemented and why this is insufficient. What would you have me do, post memes and reaction images about Tanenbaum instead?
I'm not butthurt that you don't share my interest; all I did was comment on the ignorant statement saying "I use Windows therefore I don't care". Obviously Windows has IO devices and drivers too, therefore Windows users aren't unaffected by driver design choices.
Justin Bell
>Linux is remarkably slow at this. It takes over a whole microsecond, even on many-GHz CPUs.
No, it literally takes nanoseconds.
>Part of it is that kernel memory being mapped isn't anywhere as good as a whole microkernel permanently pinned in the TLB.
It *IS* permanently pinned in the TLB... Why do you think it is mapped into the same area of memory for every process?
Matthew Williams
>How about every driver having its virtual address space just like any ordinary process?
If you want an actual device to access memory, you cannot avoid having to deal with physical memory.
Also, drivers already run in virtual address space; they do on Linux and on most OSes I know of. The point is that you have to get physical addresses for stuff like DMA buffers.
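To illustrate that virtual/physical split, a Linux-only sketch using /proc/self/pagemap: user-space can ask about its own page table entries, but (since the CVE-2015-fix era) the physical frame number in the low bits is zeroed for unprivileged readers — exactly the "don't expose physical addresses to user-space" policy being argued about:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Look up this process's pagemap entry for a virtual address.
 * Bit 63 of each 64-bit entry says whether the page is resident in RAM;
 * the PFN bits are hidden from unprivileged readers. Linux-only sketch.
 * Returns 1 = present, 0 = not present, -1 = pagemap unavailable. */
static int page_is_present(const void *vaddr)
{
    long psz = sysconf(_SC_PAGESIZE);
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;

    uint64_t entry = 0;
    off_t off = (off_t)((uintptr_t)vaddr / (uintptr_t)psz) * sizeof(entry);
    ssize_t n = pread(fd, &entry, sizeof(entry), off);
    close(fd);
    if (n != (ssize_t)sizeof(entry))
        return -1;
    return (int)((entry >> 63) & 1);   /* 1 = page resident in RAM */
}
```

A kernel-space driver gets the real physical address from the same page tables in one step; a user-space driver needs a privileged service to hand it over, which is the extra API surface discussed above.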
Jason White
Isn't that why the IOMMU has been included in new processors recently? Just to have an MMU for IO devices, like the processor has? So if we have an IOMMU, every driver can deal with its own space without having to know the real physical address or worry about other processes stealing or modifying its data, right?
Henry Evans
user, this is seven years old.
Ethan Turner
If you're saying this changed, where's your newer data?
Robert Cox
It's halfway correct. First of all, only x86s have IOMMUs, and they are still highly vendor specific. In addition, enabling the IOMMU absolutely kills device-to-device access, because instead of taking the shortest PCIe path, everything is now routed through the root complex.
Secondly,
>every driver can deal with its space without having to know the real physical address or worrying about other process stealing or modifying its data, right?
Not really. Someone has to set up the correct IOMMU mappings, and this is a job for the device driver (because the device driver is aware of the addressing limitations of the device, knows how many DMA buffers it needs, knows when to set up and tear down these buffers, knows when to access potential other devices [to do RDMA], etc.). This would still lead to the situation where the device driver has control over physical address space.
Charles Roberts
>This would still lead to the situation where the device driver has control over physical address space.
No, not necessarily. See OpenBSD's pledge() for the idea of dropping privileges after initialization, and the Genode handbook 16.05 for the idea of capabilities to physical memory frames.
Jackson Nguyen
I do share your interests; you're obviously distraught when you call someone a "mindless consumer".
Jordan Cook
Again, you're using imprecise metrics, because "cost of context switch" isn't clearly defined. The blog you linked, for example, says that system calls don't trigger what he calls a "full context switch", but he doesn't define what he considers a full context switch.
What he measures is the time it takes for a thread to wait for a mutex, which isn't the same as the cost of a context switch but how long it takes for a blocked process to be rescheduled.
Seeing that flushing the cache is the most influential cost of a context switch, you and I could agree on using the time it takes for flushing the cache as a metric of the cost of a context switch. But I imagine you wouldn't agree to this, because it is the same regardless of OS.
Juan Murphy
>enabling the IOMMU absolutely kills device to device access, because instead of taking the shortest PCIE path, everything is now routed through the root complex.
Please provide a common real-life example of this device-to-device communication which isn't DMA (as DMA always has the CPU as arbiter). I'm genuinely curious.