Skip to content
This repository has been archived by the owner on Sep 5, 2023. It is now read-only.

Find a way to track struct rpma_conn from which the struct rpma_completion comes from #977

Closed
janekmi opened this issue Apr 2, 2021 · 8 comments
Labels
enhancement New feature or request work-in-progress

Comments

@janekmi
Copy link

janekmi commented Apr 2, 2021

Having an SRQ on its way (#958 #685 #783 #737) we are close to a situation when it will be no longer obvious from which struct rpma_conn a particular struct rpma_completion comes from. It will be handy to allow tracking it down.

struct ibv_wc has a field named uint32_t qp_num which allegedly is:

Local QP number of completed WR. Relevant for Receive Work Completions that are associated with an SRQ

Indeed, struct ibv_qp also has the same field which is:

The number of this Queue Pair. A 24 bits value, which is unique per RDMA device. As QPs are destroyed and created, QP numbers may be reused. However, at a given point in time, only a single QP in the RDMA device will exist with the given number. The user cannot control or influence this value

Roughly speaking we can imagine two possible ways of solving this issue:

  • allow querying the uint32_t qp_num value for struct rpma_conn and struct rpma_completion which will allow the end-user to match these two entities or
  • allow getting directly struct rpma_conn related to a particular struct rpma_completion

Let's discuss this fascinating issue. :-)

Ref: https://www.rdmamojo.com/2013/02/15/ibv_poll_cq/
Ref: https://www.rdmamojo.com/2012/12/21/ibv_create_qp/

@janekmi janekmi added the enhancement New feature or request label Apr 2, 2021
@ultra-nexus
Copy link

Hello. I have a question. The qp_num in ibv_wc is used to identify a QP uniquely. It seems the most useful usecase is when we use a CQ for multiple QPs. I can't understand why this member is relevant for Receive Work Completions that are associated with an SRQ. I can't figure out the relationship bwtween qp_num and SRQ.

Back to the question. I think the second approach would be more suitable because it's convenient for API users.

@janekmi
Copy link
Author

janekmi commented Apr 7, 2021

@ultra-nexus:

The qp_num in ibv_wc is used to identify a QP uniquely. It seems the most useful usecase is when we use a CQ for multiple QPs. I can't understand why this member is relevant for Receive Work Completions that are associated with an SRQ. I can't figure out the relationship bwtween qp_num and SRQ.

In general, a QP can have two CQs (for send completions and recv completions). These CQs can be a single CQ (as it is currently in librpma) or two separate CQs (#737). Each of these CQs might be connected to one or more connections at will.

SRQ is a feature that allows sharing RQ (part of QP) between QPs. Having a shared RQ makes sharing a recv CQ between connections potentially a very common use-case. I think this is why qp_num is especially relevant for SRQ use-cases. Nonetheless, I hope it is available all the time. :-)

I think the second approach would be more suitable because it's convenient for API users.

I do not like the second idea (allow getting directly struct rpma_conn related to a particular struct rpma_completion) for exactly the same reason I have voted against the internal SRQ table maintained by the librpma library (#737 (comment)). It has to be MT-safe and very efficient since processing completions is the performance-critical path. Whereas MT-safety does not go well with efficiency. As of now, I think the application is better equipped to address this issue in an optimal way.

@ultra-nexus
Copy link

ultra-nexus commented Apr 7, 2021

SRQ is a feature that allows sharing RQ (part of QP) between QPs. Having a shared RQ makes sharing a recv CQ between connections potentially a very common use-case. I think this is why qp_num is especially relevant for SRQ use-cases. Nonetheless, I hope it is available all the time. :-)

Thanks for explanation. Does this mean we firstly need to make sure the qp_num is available in non-SRQ scenario?

I think the second approach would be more suitable because it's convenient for API users.

I do not like the second idea (allow getting directly struct rpma_conn related to a particular struct rpma_completion) for exactly the same reason I have voted against the internal SRQ table maintained by the librpma library (#737 (comment)). It has to be MT-safe and very efficient since processing completions is the performance-critical path. Whereas MT-safety does not go well with efficiency. As of now, I think the application is better equipped to address this issue in an optimal way.

But the first approach also needs a data structure such as hashtable to build relationship between qp_num and rpma_conn. Does this data structure is also created and managed by user applications? If this data structure is managed by rpma, we also need to make it MT-safe and efficient enough.

@janekmi
Copy link
Author

janekmi commented Apr 7, 2021

Does this mean we firstly need to make sure the qp_num is available in non-SRQ scenario?

It is very important information for this discussion. :-)

the first approach also needs a data structure such as hashtable to build relationship between qu_num and rpma_conn

No. It is not. You can imagine an application having few connections and looking for a connection might be a lot simpler using e.g. a table or set of comparisons if the number of connections is known upfront.

@ultra-nexus
Copy link

Does this mean we firstly need to make sure the qp_num is available in non-SRQ scenario?

It is very important information for this discussion. :-)

After reading the implementation of ibv_cq, I find there are two APIs to poll CQ. One is ibv_poll_cq and the other is ibv_start_poll. The latter will call mlx5_parse_cqe() with the last parameter being set to 1:

mlx5_parse_cqe(cq, cqe64, cqe, &cq->cur_rsc, &cq->cur_srq, NULL, cqe_ver, 1);

If the last argument is 1, mlx5_parse_cqe will use lazy mode which won't fill wc->qp_num.

qpn = be32toh(cqe64->sop_drop_qpn) & 0xffffff;
if (lazy) {
        cq->cqe64 = cqe64;
        cq->flags &= (~MLX5_CQ_LAZY_FLAGS);
} else {
        wc->wc_flags = 0;
        wc->qp_num = qpn; // qp num
}

The lazy mode is introduced in this patch. I haven't understood it completely but it seems qp_num may not be reliable.

@janekmi
Copy link
Author

janekmi commented Apr 15, 2021

You are right that the MLX implementation of the mlx5_parse_cqe() can set or not set qp_num depending on lazy argument. But you have also noticed that lazy == 1 only for mlx5_parse_lazy_cqe() which is used only by mlx5_start_poll() and mlx5_next_poll() which are NOT part of the standard ibv_poll_cq() implementation.

The mlx5_start_poll() and mlx5_next_poll() functions are part of MLX-specific interface which we should not use at least till we decide to incorporate any MLX-specific optimizations. This is not the case right now.

Ref: https://manpages.debian.org/stretch/libibverbs-dev/ibv_create_cq_ex.3.en.html

TL;DR: Don't worry about it. qp_num is set for mlx5_poll_one() -> poll_cq() -> mlx5_poll_cq() -> ibv_poll_cq().

@yangx-jy
Copy link
Contributor

yangx-jy commented Jun 9, 2021

Hi @janekmi

How about the following design:

/*  get qp_num from rpma_conn object  */
int rpma_conn_get_qp_num(const struct rpma_conn *conn, uint32_t *qp_num)
{
    if (conn == NULL ||  == NULL)
        return RPMA_E_INVAL;

     *qp_num = conn->id->qp->qp_num;

     return 0;
}

/* add qp_num in struct rpma_completion */
struct rpma_completion {
    ...
    uint32_t qp_num;
};

/* copy the qp_num from struct ibv_wc to struct rpma_completion */
int rpma_cq_get_completion(struct rpma_cq *cq, struct rpma_completion *cmpl)
{
    ...
    cmpl->qp_num = wc.qp_num;
}

Note: let applications compare two qp_num by themselves.

yangx-jy added a commit to yangx-jy/rpma that referenced this issue Oct 7, 2021
1) add new rpma_conn_get_qp_num() to query the connection's unique ID.
2) add unit tests for the rpma_conn_get_qp_num().

Ref: pmem#977

Signed-off-by: Xiao Yang <[email protected]>
yangx-jy added a commit to yangx-jy/rpma that referenced this issue Oct 12, 2021
1) add rpma_conn_get_qp_num() to query the connection's unique ID.
2) add unit tests for the rpma_conn_get_qp_num().

Ref: pmem#977

Signed-off-by: Xiao Yang <[email protected]>
@grom72
Copy link
Contributor

grom72 commented Feb 24, 2022

Resolved by #1087

@grom72 grom72 closed this as completed Feb 24, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request work-in-progress
Projects
None yet
Development

No branches or pull requests

4 participants