RDMA 101 - basics
SRC:
Netdev 0x16: RDMA Tutorial by Roland Dreier (Enfabrica) and Jason Gunthorpe (NVIDIA)
https://netdevconf.info/0x16/slides/40/RDMA%20Tutorial.pdf
Introduction to Programming Infiniband RDMA by Insu Jang
https://insujang.github.io/2020-02-09/introduction-to-programming-infiniband/
Code
Copy-Verbatim, all rights belong to the original author(s)
Async queues:
- work requests are sent to send/recv queues
- poll completion queue for completion
- One-sided operations (RDMA)
  - one host moves data directly to or from the memory of its communication peer without notifying the peer's CPU
- Two-sided operations
  - (1) one host posts a receive work request; (2) the peer posts a send work request, which consumes that receive work request to deliver the data
One-sided and two-sided operations can be mixed.
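The difference between the two styles can be sketched with a toy model in plain Python. This is not libibverbs: `Host`, `send` and `rdma_write` are hypothetical names invented for illustration, and the 16-byte `memory` buffer is an arbitrary assumption. The only point is who gets a completion in each case.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Host:
    """Toy stand-in for one endpoint (hypothetical, not a libibverbs object)."""
    name: str
    memory: bytearray = field(default_factory=lambda: bytearray(16))
    recv_queue: deque = field(default_factory=deque)   # posted receive work requests
    completions: deque = field(default_factory=deque)  # completion queue

    def post_recv(self, offset: int, length: int) -> None:
        """Two-sided, step 1: say where an incoming message may land."""
        self.recv_queue.append((offset, length))

    def poll_cq(self):
        """Return the oldest completion, or None if nothing has finished."""
        return self.completions.popleft() if self.completions else None


def send(src: Host, dst: Host, payload: bytes) -> None:
    """Two-sided, step 2: a send consumes one receive posted by the peer."""
    if not dst.recv_queue:
        raise RuntimeError("peer has no receive work request posted")
    offset, length = dst.recv_queue.popleft()
    if len(payload) > length:
        raise ValueError("payload larger than the posted receive buffer")
    dst.memory[offset:offset + len(payload)] = payload
    src.completions.append(("send", len(payload)))
    dst.completions.append(("recv", len(payload)))  # the receiver is notified


def rdma_write(src: Host, dst: Host, offset: int, payload: bytes) -> None:
    """One-sided: data lands in peer memory, only the initiator gets a completion."""
    if offset + len(payload) > len(dst.memory):
        raise ValueError("write would run past the end of peer memory")
    dst.memory[offset:offset + len(payload)] = payload
    src.completions.append(("rdma_write", len(payload)))


a, b = Host("a"), Host("b")
b.post_recv(offset=0, length=8)               # two-sided: b must post a receive first
send(a, b, b"hello")
rdma_write(a, b, offset=8, payload=b"world")  # one-sided: b posts nothing
```

After this runs, `b.memory` holds both payloads, but `b`'s completion queue contains only the `recv` entry. The one-sided write shows up on `a`'s queue alone, which is the sense in which the peer is not notified.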
transport layers
Transport Layering
                 [Application]
                  _____|_____
                 |           |
             [UD QPs]    [RC QPs]
                 |           |
                 +-----+-----+
                       |
                   [ Verbs ]                         libibverbs
                       |
     __________________|___________________
     |        |             |             |
  [RoCE]   [iWARP]    [InfiniBand]   [Omni-Path]     Transport
     |________|             |             |
         |                  |             |
     [Ethernet]            [IB]      [Omni-Path]     Physical
SW
- librdmacm: connection establishment with IP addressing
- libibverbs: API for control/data path operations; implementation of the "verbs" interface
Lingua
Reliable Connected (RC)
Unreliable Datagram (UD)
- Queue Pair (QP), Completion Queue (CQ), Send Queue (SQ), Receive Queue (RQ)
  - An SQ and an RQ are always grouped and managed as a queue pair (QP). The SQ and RQ of a queue pair may use distinct CQs or share one.
  CA 1                     CA 2
  ----                     ----
  SEND  -------------->    RECV
  RECV  <--------------    SEND
- Work Request (WR), Work Completion (WC)
  - Post a WR by placing a work queue entry (WQE) on a work queue, e.g. a SEND WR on the SQ or a RECV WR on the RQ. Once a request has completed, the HW posts a Work Completion (WC) to a CQ.
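The WR, WQE and WC flow, with one queue pair sharing a CQ and another using distinct CQs, can be sketched as a toy model in plain Python. All classes here are hypothetical stand-ins, not the libibverbs API. The names `post_send`, `post_recv`, `poll` and `wr_id` only loosely echo `ibv_post_send`, `ibv_post_recv`, `ibv_poll_cq` and the `wr_id` field. `deliver` is an invented placeholder for the work the adapters do.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkCompletion:
    wr_id: int            # identifies which work request finished
    opcode: str           # "SEND" or "RECV" in this toy model
    status: str = "success"


class CompletionQueue:
    def __init__(self):
        self._entries = deque()

    def push(self, wc: WorkCompletion) -> None:
        self._entries.append(wc)

    def poll(self, max_entries: int) -> list:
        """Non-blocking: return up to max_entries completions, possibly none."""
        out = []
        while self._entries and len(out) < max_entries:
            out.append(self._entries.popleft())
        return out


class QueuePair:
    """SQ and RQ managed together; each reports to its own CQ or a shared one."""

    def __init__(self, send_cq: CompletionQueue, recv_cq: CompletionQueue):
        self.sq, self.rq = deque(), deque()
        self.send_cq, self.recv_cq = send_cq, recv_cq

    def post_send(self, wr_id: int) -> None:
        self.sq.append(wr_id)   # the SEND WR becomes a WQE on the SQ

    def post_recv(self, wr_id: int) -> None:
        self.rq.append(wr_id)   # the RECV WR becomes a WQE on the RQ


def deliver(src: QueuePair, dst: QueuePair) -> bool:
    """Hypothetical adapter step: match the oldest SEND WQE with the oldest RECV WQE."""
    if not src.sq or not dst.rq:
        return False            # toy simplification: wait until both sides have posted
    src.send_cq.push(WorkCompletion(src.sq.popleft(), "SEND"))
    dst.recv_cq.push(WorkCompletion(dst.rq.popleft(), "RECV"))
    return True


shared_cq = CompletionQueue()
qp1 = QueuePair(send_cq=shared_cq, recv_cq=shared_cq)                  # SQ and RQ share one CQ
qp2 = QueuePair(send_cq=CompletionQueue(), recv_cq=CompletionQueue())  # distinct CQs

qp2.post_recv(wr_id=7)
qp1.post_send(wr_id=1)
deliver(qp1, qp2)       # qp1 -> qp2

qp1.post_recv(wr_id=2)
qp2.post_send(wr_id=8)
deliver(qp2, qp1)       # qp2 -> qp1
```

Polling `shared_cq` now returns both the SEND completion (`wr_id` 1) and the RECV completion (`wr_id` 2) for `qp1`, while `qp2` sees its two completions on separate queues. The `wr_id` is what lets the application tell which posted request a completion belongs to.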
Protection Domain (PD)
Memory Region (MR)
InfiniBand (IB)
- Channel Adapter (CA)
  - an end node in the IB network (comparable to a NIC)
Host Channel Adapter (HCA)