nvmf/vfio-user: avoid doorbell reads in cq_is_full()

Profiling data showed that the dereference of the CQ head doorbell in
cq_is_full() was a significant contributor to the CPU cost of
post_completion(). Use the cached ->last_head value instead of reading the
doorbell every time.

Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ib8c92ce4fa79683950555d7b0c235449e457b844
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11848
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Author:    John Levon <john.levon@nutanix.com>
Date:      2022-03-08 19:18:43 +00:00
Committer: Tomasz Zawadzki
Parent:    14ecc7787d
Commit:    a8326f8155


@@ -509,6 +509,17 @@ cq_tail_advance(struct nvmf_vfio_user_cq *cq)
 	}
 }
 
+/*
+ * As per NVMe Base spec 3.3.1.2.1, we are supposed to implement CQ flow
+ * control: if there is no space in the CQ, we should wait until there is.
+ *
+ * In practice, we just fail the controller instead: as it happens, all host
+ * implementations we care about right-size the CQ: this is required anyway for
+ * NVMEoF support (see 3.3.2.8).
+ *
+ * Since reading the head doorbell is relatively expensive, we use the cached
+ * value, so we only have to read it for real if it appears that we are full.
+ */
 static inline bool
 cq_is_full(struct nvmf_vfio_user_cq *cq)
 {
@@ -521,7 +532,13 @@ cq_is_full(struct nvmf_vfio_user_cq *cq)
 		qindex = 0;
 	}
 
-	return qindex == *cq_dbl_headp(cq);
+	if (qindex != cq->last_head) {
+		return false;
+	}
+
+	cq->last_head = *cq_dbl_headp(cq);
+
+	return qindex == cq->last_head;
 }
 
 static bool