io_uring for linux 5.1+ #923

Closed
rvolgers opened this issue Apr 6, 2019 · 25 comments

Comments

@rvolgers

rvolgers commented Apr 6, 2019

The new io_uring API for generic asynchronous IO was merged for the (currently unreleased) Linux 5.1.

An overview of the API can be found here: http://kernel.dk/io_uring.pdf

While the overall API is designed with completion-based async IO in mind, it also has IORING_OP_POLL_ADD, which I think allows you to use it as an "epoll, but more efficient" API?

As I said, Linux 5.1 isn't even out yet, but it might be interesting to start thinking about how/if mio can use this.

@Thomasdezeeuw
Collaborator

@rvolgers Thanks for pointing this out to us. Over the last couple of days I've been (slowly) reading about this, but from what I understand it is mostly aimed at disk I/O, with possible support for sockets in the future (although what I read could be outdated and sockets could already be supported). This is great because the current solutions for disk I/O are far from perfect, but mio would still need a low-overhead, cross-platform way to do this. So I don't think we can use this to offer disk I/O (yet).

For sockets and an epoll replacement the only benefit I see is performance, as you mentioned, but I have yet to see any "real world" benchmarks for this. I've read the commit message and it shows a ~2ns improvement, and I don't know whether that is worth it. Because some people might use features of epoll outside of what mio provides, this would be a serious breaking change.

So this is something to keep track of, but I think it will take a while for it to become useful for mio.

@ismell

ismell commented May 8, 2019

Here is a benchmark from libuv: libuv/libuv#1947 (comment)

@the8472

the8472 commented May 26, 2019

The PDF claims support for socket IO (page 8). Timeouts are supported via timerfd according to a tweet by @deweerdt. Linux 5.2 also adds eventfd and fsync support. So this should have all the building blocks needed for a complete event loop.

@kamalmarhubi

kamalmarhubi commented May 26, 2019

I think this will be a bit hard to force into mio's readiness-based model, and into Rust's ownership model in general. The buffers passed into the kernel must remain valid until the IO completes, which doesn't give them any statically knowable lifetime.

My best thought at the moment is this would require an extra copy in the library to get around the lifetime issues: the buffers passed to the kernel are owned by mio, and on completion mio marks those handles as ready. Later, mio copies into the requestor's buffers when requested. I think this is what miow does for IOCP on windows, but I'm not 100% sure. At that point, we'd be trading off syscall overhead against the extra copy. Benchmarking this would be super interesting though!
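The "library owns the buffers, then copies on request" idea above can be sketched roughly as follows. This is a toy model only: it makes no real io_uring calls, `complete` merely simulates the kernel finishing a read, and `Token`, `OwnedBufferRing`, and every method name are hypothetical and not part of mio or any existing crate.

```rust
use std::collections::HashMap;

/// Hypothetical token identifying one in-flight read.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Token(pub usize);

/// Toy stand-in for a ring that owns every buffer it hands to the kernel.
/// Assumption: no kernel interaction happens here at all.
#[derive(Default)]
pub struct OwnedBufferRing {
    /// Buffers the "kernel" may still be writing to.
    in_flight: HashMap<Token, Vec<u8>>,
    /// Completed reads waiting to be copied out by the requestor.
    ready: HashMap<Token, Vec<u8>>,
}

impl OwnedBufferRing {
    /// Start a read. The ring allocates and keeps the buffer, so its
    /// lifetime does not depend on the caller.
    pub fn submit_read(&mut self, token: Token, capacity: usize) {
        self.in_flight.insert(token, Vec::with_capacity(capacity));
    }

    /// Simulated completion: the "kernel" filled the buffer with `data`.
    pub fn complete(&mut self, token: Token, data: &[u8]) {
        if let Some(mut buf) = self.in_flight.remove(&token) {
            buf.extend_from_slice(data);
            self.ready.insert(token, buf);
        }
    }

    /// Readiness-style check, similar in spirit to an event from `poll()`.
    pub fn is_readable(&self, token: Token) -> bool {
        self.ready.contains_key(&token)
    }

    /// The extra copy: move completed bytes into the caller's buffer.
    /// Returns the number of bytes copied, or `None` if nothing is ready.
    pub fn read(&mut self, token: Token, dst: &mut [u8]) -> Option<usize> {
        let buf = self.ready.get_mut(&token)?;
        let n = buf.len().min(dst.len());
        dst[..n].copy_from_slice(&buf[..n]);
        buf.drain(..n);
        if buf.is_empty() {
            self.ready.remove(&token);
        }
        Some(n)
    }
}
```

A caller would `submit_read`, wait until `is_readable` returns true, and then call `read` with its own buffer. The trade-off described above shows up in `read`: the caller's buffer never has to outlive the operation, at the cost of one `copy_from_slice` per read.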

@rvolgers
Author

It also allows you to pre-register buffers so the kernel can skip the step of mapping them into kernel space (or maybe this is a future direction they wanted to move in, I forget), in which case it's even more clear the ring owns the buffers. It also makes sense in a lot of other IO models that involve memory mapping or even DMA (although we are getting really far from mio now). One downside of having the ring own the buffers is that you still have to copy the data if you cannot do anything with it immediately, or you are blocking other tasks / threads from reusing that buffer.

@kamalmarhubi

It also allows you to pre-register buffers so the kernel can skip the step of mapping them into kernel space (or maybe this is a future direction they wanted to move in, I forget), in which case it's even more clear the ring owns the buffers.

Oh I think I remember reading that. That does make the ownership clearer. But even if you don't pre-register buffers, I think the ring has to own the buffers as it's the only thing that can know when they are safe to drop. Does that seem right?

you still have to copy the data if you cannot do anything with it immediately

This is where I'm a bit hung up: I think you have to copy no matter what, at least in mio's model. You'd get a set of events back from poll() signalling things are ready. You still have to ask for the data, since all you got was that readiness signal. And by now the data is already in these buffers that the ring owns, so to get it into the requestor's buffers you'll need a copy.

To avoid that copy, the ring could hand out &mut references to the buffers' contents instead of a user requesting the copy, but again now we're outside of mio's model.
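As a rough illustration of the "hand out `&mut` references instead of copying" alternative: the sketch below is a toy with no kernel interaction, and `LendingRing`, `with_completed`, `completed_mut`, `recycle`, and `uppercase_in_place` are hypothetical names chosen for this example only.

```rust
/// Toy ring that owns a single, already completed buffer.
/// Assumption: a real ring would track many buffers and their kernel state.
pub struct LendingRing {
    completed: Vec<u8>,
}

impl LendingRing {
    /// Hypothetical constructor simulating a read that has already completed.
    pub fn with_completed(data: &[u8]) -> Self {
        Self { completed: data.to_vec() }
    }

    /// Lend the buffer's contents without copying. While the returned slice
    /// is alive, `self` stays mutably borrowed, so the ring cannot recycle
    /// the buffer underneath the caller.
    pub fn completed_mut(&mut self) -> &mut [u8] {
        &mut self.completed
    }

    /// Give the buffer back for reuse (a real ring would resubmit it).
    pub fn recycle(&mut self) {
        self.completed.clear();
    }
}

/// Example consumer: processes the data in place, then releases the buffer.
/// Returns the number of bytes that were processed.
pub fn uppercase_in_place(ring: &mut LendingRing) -> usize {
    let data = ring.completed_mut(); // zero-copy view into ring-owned memory
    data.make_ascii_uppercase();
    let n = data.len();
    // `data` is not used past this point, so the mutable borrow has ended.
    ring.recycle();
    n
}
```

The borrow checker enforces the ordering here: `recycle` cannot be called while the slice from `completed_mut` is still in use. That avoids the copy, but the caller now works with ring-owned memory rather than its own buffer, which is the departure from mio's model noted above.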

@quininer
Member

quininer commented Oct 19, 2019

I implemented a simple POC. It uses io-uring for polling (no IO).

@slanterns

FYI: linux-io-uring

@carllerche
Member

Is io-uring for TCP or mostly just files?

@vorner

vorner commented Jan 7, 2020

From what I've read around it, it was mostly designed for files, but it can be used for TCP (or UDP or probably whatever file descriptors). There are likely to be some performance gains under high loads, as it allows doing multiple recvs/sends per single syscall (or in some extreme cases without any syscalls at all). But without any experiments/measurements, my guess is the gains won't be as big as for files.
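To make the "multiple recvs/sends per single syscall" point concrete, here is a toy model of a submission queue and a completion queue. It performs no real IO: `submit` merely stands in for the one `io_uring_enter` call that hands all queued entries to the kernel, every operation is assumed to succeed immediately, and `ToyRing`, `Sqe`, `Cqe`, and `Op` are simplified, hypothetical types rather than the kernel's actual structures.

```rust
use std::collections::VecDeque;

/// Simplified operation kinds; the real API has many more opcodes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Recv,
    Send,
}

/// Submission entry: what to do, plus a caller-chosen `user_data` value
/// used to match the completion back to the request.
#[derive(Debug, Clone, Copy)]
pub struct Sqe {
    pub op: Op,
    pub user_data: u64,
}

/// Completion entry carrying the same `user_data` and a result code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cqe {
    pub user_data: u64,
    pub result: i32,
}

#[derive(Default)]
pub struct ToyRing {
    sq: VecDeque<Sqe>,
    cq: VecDeque<Cqe>,
    /// Counts simulated kernel entries, to show the effect of batching.
    pub syscalls: usize,
}

impl ToyRing {
    /// Queue an operation. In this model, queuing costs no syscall.
    pub fn push(&mut self, sqe: Sqe) {
        self.sq.push_back(sqe);
    }

    /// Stand-in for a single kernel entry that consumes every queued
    /// submission at once. Returns how many entries were submitted.
    pub fn submit(&mut self) -> usize {
        self.syscalls += 1;
        let n = self.sq.len();
        for sqe in self.sq.drain(..) {
            // Assumption: each operation completes immediately with result 0.
            self.cq.push_back(Cqe { user_data: sqe.user_data, result: 0 });
        }
        n
    }

    /// Take the next completion, if any.
    pub fn pop_completion(&mut self) -> Option<Cqe> {
        self.cq.pop_front()
    }
}
```

Pushing four operations and calling `submit` once leaves `syscalls` at 1, whereas a loop of individual `recv`/`send` calls would enter the kernel four times. Whether that saving matters in practice is exactly the open measurement question raised above.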

@rvolgers
Author

rvolgers commented Jan 7, 2020

For files there was basically only the io_submit API if you wanted asynchronous behaviour, and that had some very significant shortcomings (most importantly, it could actually block on some things like allocating room for fs metadata), so it was basically only used with O_DIRECT for things like databases. So in that sense, that's the motivation for the API, to finally have a way to do truly async fs operations.

It's rapidly moving towards a general purpose async IO API though. Basic socket IO in particular has always worked fine, it's just that some of the more esoteric things you can do with sockets aren't available yet.

It does seem to offer a nice performance boost for socket operations, especially if you use the functionality to pre-register buffers and file descriptors so the kernel doesn't have to grab a reference to them on every call. So that could be a reason to use it for socket operations, despite it only being available on bleeding edge Linux so far. There are some benchmark numbers in the Linux commits, and the main Linux dev working on it also posts updates, benchmarks, etc. on his Twitter: https://twitter.com/axboe

(Disclaimer: I've been keeping up with the kernel patches for io_uring on a superficial level, but not actually used it much so far.)

@jasonwilliams

@rvolgers Did runtimes like Go and Node (I think via libuv) open up a second thread to read from files in the past, whereas with this it's now a lot easier to read files using async IO instead of a background thread?

I think that for disk IO, using something like epoll would not be that useful on local files, right?

@jeromegn

jeromegn commented Feb 8, 2020

Is io-uring for TCP or mostly just files?

Kernels 5.5+ support TCP accept / connect. I think 5.6 will support send and recv (and more).

@ufoscout

ufoscout commented Feb 9, 2020

I suppose you are already aware of it, but in any case this seems interesting: https://github.com/spacejam/rio

@lnicola

lnicola commented Feb 19, 2020

Here's a benchmark for a TCP echo server: https://twitter.com/hielkedv/status/1218891982636027905.

@nicokoch

io_uring is slowly losing its file io focus. It's pretty much evolving into "call any system call, but asynchronously".

@lnicola

lnicola commented Mar 20, 2020

New set of benchmarks: https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.6-IO-uring-Tests. Last one from me, because I don't want to spam this issue.

@wanton7

wanton7 commented Jul 9, 2020

Links relating to .NET and io_uring
https://twitter.com/tkp1n/status/1270010546205659139
https://ndportmann.com/io_uring-preview-release/
https://gist.github.com/sebastienros/82f5dd4ef1560b793574f3c7bd8dc656

@Thomasdezeeuw
Collaborator

io_uring will not be supported in Mio v1.0, which will keep the current kqueue/epoll model. But I aim to use it in Mio v2.0.

Thomasdezeeuw removed this from the v1.0 milestone Jan 5, 2021

@ewtoombs

ewtoombs commented Mar 4, 2021

Lending the ring's IO buffers to the user only blocks the ring if the ring plans on reusing the buffers immediately. If you want a permanent copy of the data, there's no need for copying. Just don't reuse that particular IO buffer and the user can keep it for as long as they want. If that requires allocating a new IO buffer just for that user, fine. I don't see a problem with that.

For dataflows that reuse IO buffers, it seems to me like a different API should be exposed to the user anyway, reflecting this idea.
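The "let the user keep the buffer and allocate a fresh one for the ring" idea can be sketched like this. It is a toy with no kernel interaction: `fill` simulates the kernel writing into a slot, and `BufferPool`, `fill`, `take`, and `slot_count` are hypothetical names, not an existing API.

```rust
/// Toy pool standing in for the ring's set of IO buffers.
/// Assumption: a real ring would also track which slots the kernel is using.
pub struct BufferPool {
    buffers: Vec<Vec<u8>>,
    buf_size: usize,
}

impl BufferPool {
    /// Create `count` zeroed buffers of `buf_size` bytes each.
    pub fn new(count: usize, buf_size: usize) -> Self {
        Self {
            buffers: (0..count).map(|_| vec![0; buf_size]).collect(),
            buf_size,
        }
    }

    /// Simulate the kernel writing `data` into slot `index`.
    pub fn fill(&mut self, index: usize, data: &[u8]) {
        let buf = &mut self.buffers[index];
        buf.clear();
        buf.extend_from_slice(data);
    }

    /// Give the caller permanent ownership of the buffer in slot `index`.
    /// No bytes are copied; the pool allocates a replacement for the slot.
    pub fn take(&mut self, index: usize) -> Vec<u8> {
        std::mem::replace(&mut self.buffers[index], vec![0; self.buf_size])
    }

    /// Number of slots, which stays constant across `take` calls.
    pub fn slot_count(&self) -> usize {
        self.buffers.len()
    }
}
```

Here `take` costs one allocation instead of one copy, and the returned `Vec<u8>` is unaffected by later writes to the same slot. A dataflow that wants to reuse buffers would skip `take` and use a borrowing API instead, which matches the suggestion above that the two cases deserve different APIs.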

@Thomasdezeeuw
Collaborator

I don't think Mio (v1) is going to support io_uring. The design for it is just too different. For people using Tokio, please take a look at https://github.com/tokio-rs/tokio-uring. For people not using Tokio, I'm working on an io_uring library, but progress is slow and it won't be part of Mio.

@serzhiio

I don't think Mio (v1) is going to support io_uring. The design for it is just too different. For people using Tokio, please take a look at https://github.com/tokio-rs/tokio-uring. For people not using Tokio, I'm working on an io_uring library, but progress is slow and it won't be part of Mio.

Any chance to test your io-uring lib?

@Thomasdezeeuw
Collaborator

Any chance to test your io-uring lib?

@serzhiio I've just made it public at https://github.com/Thomasdezeeuw/a10. You'll need a fairly recent kernel; I'm using 6.1 myself.

@serzhiio

Any chance to test your io-uring lib?

@serzhiio I've just made it public at https://github.com/Thomasdezeeuw/a10. You'll need a fairly recent kernel; I'm using 6.1 myself.

Did you find it faster or more efficient?

@Thomasdezeeuw
Collaborator

Did you find it faster or more efficient?

I haven't had time to do proper performance testing, but with a high amount of I/O I think it will be faster than epoll.
