io_uring/io_uring.c
Source file repositories/reference/linux-study-clean/io_uring/io_uring.c
File Facts
- System: Linux kernel
- Corpus path: io_uring/io_uring.c
- Extension: .c
- Size: 90433 bytes
- Lines: 3271
- Domain: Kernel Services
- Bucket: io_uring
- Inferred role: Kernel Services: syscall or user/kernel boundary
- Status: core implementation candidate
Why This File Exists
- Part of the Kernel Services domain, a shared kernel service surface used by multiple subsystems (helpers, cryptography, virtualization support, async I/O infrastructure); this file belongs to the io_uring async I/O bucket.
- Defines or participates in a user/kernel boundary; inspect argument validation, copy_from_user/copy_to_user, credentials, and dispatch target.
- Touches user memory; correctness depends on fault-safe copying and privilege boundary handling.
- Uses kernel synchronization; read lock ordering, sleepability, and interrupt context assumptions before translating.
- Allocates kernel memory; connect allocation flags and lifetime to context constraints.
- Defines or uses C structs; map object ownership, embedded links, reference counts, and lock ownership.
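The boundary checks listed above can be illustrated in plain user-space C. This is a minimal sketch, not kernel code: `struct model_ext_arg`, `model_copy_from_user()`, and `model_get_ext_arg()` are hypothetical names introduced here, and the struct layout is invented for illustration. It shows the general order of operations the bullets ask a reader to look for: check the caller-supplied size first, copy into a private buffer second, and validate fields only on the private copy.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical argument layout; NOT the real io_uring uapi struct. */
struct model_ext_arg {
	unsigned long long sigmask;
	unsigned int sigmask_sz;
	unsigned int pad;
};

/*
 * Hypothetical stand-in for copy_from_user(): returns the number of bytes
 * NOT copied (0 on success). A NULL source models a faulting user address.
 */
static size_t model_copy_from_user(void *dst, const void *src, size_t len)
{
	if (src == NULL)
		return len;
	memcpy(dst, src, len);
	return 0;
}

/* Validate the size, copy once, then check fields on the private copy. */
static int model_get_ext_arg(const void *argp, size_t argsz,
			     struct model_ext_arg *out)
{
	struct model_ext_arg tmp;

	if (argsz != sizeof(tmp))
		return -EINVAL;		/* reject unexpected sizes before copying */
	if (model_copy_from_user(&tmp, argp, sizeof(tmp)))
		return -EFAULT;		/* the copy faulted */
	if (tmp.pad != 0)
		return -EINVAL;		/* reserved field must be zero */

	*out = tmp;			/* only validated data leaves this function */
	return 0;
}
```

Copying once and validating the copy avoids re-reading memory that user space could change between the check and the use.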
Dependency Surface
`linux/kernel.h`, `linux/errno.h`, `linux/syscalls.h`, `linux/refcount.h`, `linux/bits.h`, `linux/sched/signal.h`, `linux/fs.h`, `linux/mm.h`, `linux/percpu.h`, `linux/slab.h`, `linux/anon_inodes.h`, `linux/uaccess.h`, `linux/nospec.h`, `linux/task_work.h`, `linux/io_uring.h`, `linux/io_uring/cmd.h`, `linux/audit.h`, `linux/security.h`, `linux/jump_label.h`, `trace/events/io_uring.h`, `uapi/linux/io_uring.h`, `io-wq.h`, `filetable.h`, `io_uring.h`, `opdef.h`, `refs.h`, `tctx.h`, `register.h`, `sqpoll.h`, `fdinfo.h`, `kbuf.h`, `rsrc.h`
Detected Declarations
`syscall io_uring_enter`, `syscall io_uring_setup`, `struct io_tctx_exit`, `function io_poison_cached_req`, `function io_poison_req`, `function req_fail_link_node`, `function io_req_add_to_cache`, `function io_ring_ctx_ref_free`, `function io_alloc_hash_table`, `function io_free_alloc_caches`, `function io_clean_op`, `function io_req_track_inflight`, `function io_prep_async_work`, `function io_prep_async_link`, `function io_queue_iowq`, `function io_req_queue_iowq_tw`, `function io_req_queue_iowq`, `function io_linked_nr`, `function io_queue_deferred`, `function __io_commit_cqring_flush`, `function __io_cq_lock`, `function io_cq_lock`, `function __io_cq_unlock_post`, `function io_cq_unlock_post`, `function __io_cqring_overflow_flush`, `function io_cqring_overflow_kill`, `function io_cqring_do_overflow_flush`, `function io_cqring_overflow_flush_locked`, `function io_put_task`, `function io_task_refs_refill`, `function io_uring_drop_tctx_refs`, `function io_cqring_add_overflow`, `function io_cqring_queued`, `function io_fill_nop_cqe`, `function io_cqe_cache_refill`, `function io_fill_cqe_aux32`, `function io_fill_cqe_aux`, `function io_init_cqe`, `function io_cqe_overflow`, `function io_cqe_overflow_locked`, `function io_post_aux_cqe`, `function io_add_aux_cqe`, `function io_req_post_cqe`, `function io_req_post_cqe32`, `function io_req_complete_post`, `function io_req_defer_failed`, `function io_issue_sqe`, `function io_free_req`
Annotated Snippet
SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
		u32, min_complete, u32, flags, const void __user *, argp,
		size_t, argsz)
{
	struct io_ring_ctx *ctx;
	struct file *file;
	long ret;
	if (unlikely(flags & ~IORING_ENTER_FLAGS))
		return -EINVAL;
	file = io_uring_ctx_get_file(fd, flags & IORING_ENTER_REGISTERED_RING);
	if (IS_ERR(file))
		return PTR_ERR(file);
	ctx = file->private_data;
	ret = -EBADFD;
	/*
	 * Keep IORING_SETUP_R_DISABLED check before submitter_task load
	 * in io_uring_add_tctx_node() -> __io_uring_add_tctx_node_from_submit()
	 */
	if (unlikely(smp_load_acquire(&ctx->flags) & IORING_SETUP_R_DISABLED))
		goto out;
	if (io_has_loop_ops(ctx)) {
		ret = io_run_loop(ctx);
		goto out;
	}
	/*
	 * For SQ polling, the thread will do all submissions and completions.
	 * Just return the requested submit count, and wake the thread if
	 * we were asked to.
	 */
	ret = 0;
	if (ctx->flags & IORING_SETUP_SQPOLL) {
		if (unlikely(ctx->sq_data->thread == NULL)) {
			ret = -EOWNERDEAD;
			goto out;
		}
		if (flags & IORING_ENTER_SQ_WAKEUP)
			wake_up(&ctx->sq_data->wait);
		if (flags & IORING_ENTER_SQ_WAIT)
			io_sqpoll_wait_sq(ctx);
		ret = to_submit;
	} else if (to_submit) {
		ret = io_uring_add_tctx_node(ctx);
		if (unlikely(ret))
			goto out;
		mutex_lock(&ctx->uring_lock);
		ret = io_submit_sqes(ctx, to_submit);
		if (ret != to_submit) {
			mutex_unlock(&ctx->uring_lock);
			goto out;
		}
		if (flags & IORING_ENTER_GETEVENTS) {
			if (ctx->int_flags & IO_RING_F_SYSCALL_IOPOLL)
				goto iopoll_locked;
			/*
			 * Ignore errors, we'll soon call io_cqring_wait() and
			 * it should handle ownership problems if any.
			 */
			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
				(void)io_run_local_work_locked(ctx, min_complete);
		}
		mutex_unlock(&ctx->uring_lock);
	}
	if (flags & IORING_ENTER_GETEVENTS) {
		int ret2;
		if (ctx->int_flags & IO_RING_F_SYSCALL_IOPOLL) {
			/*
			 * We disallow the app entering submit/complete with
			 * polling, but we still need to lock the ring to
			 * prevent racing with polled issue that got punted to
			 * a workqueue.
			 */
			mutex_lock(&ctx->uring_lock);
iopoll_locked:
			ret2 = io_validate_ext_arg(ctx, flags, argp, argsz);
			if (likely(!ret2))
				ret2 = io_iopoll_check(ctx, min_complete);
			mutex_unlock(&ctx->uring_lock);
		} else {
			struct ext_arg ext_arg = { .argsz = argsz };
			ret2 = io_get_ext_arg(ctx, flags, argp, &ext_arg);
			if (likely(!ret2))
Annotation
- Immediate include surface: `linux/kernel.h`, `linux/errno.h`, `linux/syscalls.h`, `linux/refcount.h`, `linux/bits.h`, `linux/sched/signal.h`, `linux/fs.h`, `linux/mm.h`.
- Detected declarations: `syscall io_uring_enter`, `syscall io_uring_setup`, `struct io_tctx_exit`, `function io_poison_cached_req`, `function io_poison_req`, `function req_fail_link_node`, `function io_req_add_to_cache`, `function io_ring_ctx_ref_free`, `function io_alloc_hash_table`, `function io_free_alloc_caches`.
- Atlas domain: Kernel Services / io_uring.
- Implementation status: core implementation candidate.
- This snippet crosses the user/kernel memory boundary; validate fault handling and access checks before translating the pattern.
- Synchronization appears in or near this file; preserve lock ordering, sleepability, and interrupt-context constraints.
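The early checks in the `io_uring_enter` snippet (unknown-flag rejection, the acquire load of the ring flags, and the SQPOLL branch) can be modeled in user space. This is a hedged sketch that assumes a Linux `errno.h` (for `EBADFD` and `EOWNERDEAD`). Every `MODEL_*` constant, `struct model_ctx`, and `model_enter()` is hypothetical; the flag values are not the uapi constants, and the non-SQPOLL submit path and `IORING_ENTER_GETEVENTS` handling are deliberately left out.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; NOT the uapi constants. */
#define MODEL_ENTER_GETEVENTS	(1u << 0)
#define MODEL_ENTER_SQ_WAKEUP	(1u << 1)
#define MODEL_ENTER_ALL		(MODEL_ENTER_GETEVENTS | MODEL_ENTER_SQ_WAKEUP)

#define MODEL_SETUP_SQPOLL	(1u << 0)
#define MODEL_SETUP_R_DISABLED	(1u << 1)

/* Hypothetical, heavily reduced stand-in for struct io_ring_ctx. */
struct model_ctx {
	_Atomic unsigned int flags;	/* setup flags, may change concurrently */
	bool sq_thread_alive;		/* models ctx->sq_data->thread != NULL */
	unsigned int wakeups;		/* counts modeled wake_up() calls */
};

static long model_enter(struct model_ctx *ctx, unsigned int to_submit,
			unsigned int flags)
{
	unsigned int setup;

	/* Reject any bit outside the known mask before touching the ring. */
	if (flags & ~MODEL_ENTER_ALL)
		return -EINVAL;

	/* Acquire load: later reads are ordered after the enable check. */
	setup = atomic_load_explicit(&ctx->flags, memory_order_acquire);
	if (setup & MODEL_SETUP_R_DISABLED)
		return -EBADFD;

	if (setup & MODEL_SETUP_SQPOLL) {
		if (!ctx->sq_thread_alive)
			return -EOWNERDEAD;
		if (flags & MODEL_ENTER_SQ_WAKEUP)
			ctx->wakeups++;
		/* The poller thread submits; report the requested count. */
		return to_submit;
	}

	/* Non-SQPOLL submission is outside the scope of this model. */
	return 0;
}
```

In this model, a ring configured with `MODEL_SETUP_SQPOLL` and a live poller returns `to_submit` unchanged, which mirrors the `ret = to_submit` assignment in the snippet.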
Implementation Notes
- This generated page is the file-by-file coverage layer; curated subsystem chapters should link here when they synthesize a multi-file control flow.
- Core OS pages should be promoted from atlas-only to deep-reviewed when they explain data structures, invariants, locking, lifecycle, and C implementation snippets.
- Driver-family pages are intentionally pattern-oriented unless they are part of the selected PCIe/NVMe representative device path.