Here is the comment in nvme_pcie_qpair_construct():  

/*
* Reserve space for all of the trackers in a single allocation.
*   struct nvme_tracker must be padded so that its size is already a power of 2.
*   This ensures the PRP list embedded in the nvme_tracker object will not span a
*   4KB boundary, while allowing access to trackers in tr[] via normal array indexing.
*/

It's a beautiful design in SPDK.  :)


On Wed, Aug 9, 2017 at 1:18 PM, Liu, Changpeng <changpeng.liu@intel.com> wrote:
Yes, you are right.
SPDK embeds the PRP list in struct nvme_tracker, and the data structure is 4KiB aligned.
Several other fields also occupy the structure, so only 506 entries are left for the PRP list.

> -----Original Message-----
> From: SPDK [mailto:spdk-bounces@lists.01.org] On Behalf Of Lance Hartmann
> ORACLE
> Sent: Wednesday, August 9, 2017 1:01 PM
> To: Storage Performance Development Kit <spdk@lists.01.org>
> Subject: Re: [SPDK] Determination of NVMe max_io_xfer_size
> (NVME_MAX_XFER_SIZE) ?
>
>
> Ok, but 506 * PAGE_SIZE?  Surely 506 wasn’t arbitrarily selected?  I understand
> that the controller’s Identify Controller structure may indicate far fewer pages
> supported, but if, as the comment suggests, PRP2 is pointing to a list, then why
> reduce the number by “just a few”?  I feel like I’m missing something.
>
> Let’s say PRP1 is aligned to a memory page boundary and the length of the data
> transfer is more than two (2) memory pages.  PRP1 points to the first memory
> page of data, and PRP2 points to a memory page containing PRP entries; i.e. a
> PRP list.  If the memory page size is 4096 (4KB), then up to 4096 / (size of PRP
> pointer in bytes) = 4096 / 8 = 512 of PRP entries could be created in that page.
> Thus, if I follow it correctly, with PRP1 pointing to 4KB of data in the first memory
> page, and with PRP2 pointing to a 4KB page of PRP entries, we should be able to
> transfer 1 + 512 = 513 memory pages, and so in this case 513 * 4096 = 2,101,248
> bytes of data.  And, that’s only if the implementation of the SPDK NVMe driver
> elects not to support the mechanism of using the last entry of the page of PRP
> entries to point to another page of PRP entries.
>
> --
> Lance Hartmann
> lance.hartmann@oracle.com
>
>
> > On Aug 8, 2017, at 11:24 PM, Liu, Changpeng <changpeng.liu@intel.com> wrote:
> >
> > Hi Lance,
> >
> > NVME_MAX_XFER_SIZE is the maximum data length supported by the SPDK driver;
> > of course the NVMe controller also has a field (MDTS) to show the limit from
> > hardware, so the driver chooses the smaller one as the command limit and splits
> > commands bigger than this number.
> >
> > Most Intel NVMe SSDs have a hardware value of 128KiB, so the driver limit of
> > (506*4) KiB is big enough to support it.
> >
> >> -----Original Message-----
> >> From: SPDK [mailto:spdk-bounces@lists.01.org] On Behalf Of Lance Hartmann
> >> ORACLE
> >> Sent: Wednesday, August 9, 2017 11:52 AM
> >> To: Storage Performance Development Kit <spdk@lists.01.org>
> >> Subject: [SPDK] Determination of NVMe max_io_xfer_size
> >> (NVME_MAX_XFER_SIZE) ?
> >>
> >> Hello,
> >>
> >> I’m trying to reconcile the #define NVME_MAX_XFER_SIZE and leading
> >> comment:
> >>
> >> /*
> >> * For commands requiring more than 2 PRP entries, one PRP will be
> >> *  embedded in the command (prp1), and the rest of the PRP entries
> >> *  will be in a list pointed to by the command (prp2).  This means
> >> *  that real max number of PRP entries we support is 506+1, which
> >> *  results in a max xfer size of 506*PAGE_SIZE.
> >> */
> >>
> >> in lib/nvme/nvme_pcie.c with my interpretation from reading the NVMe spec.
> >> I’d greatly appreciate if someone could “show me the math” or otherwise help
> >> me to understand this.  How was NVME_MAX_PRP_LIST_ENTRIES (506) derived?
> >> I don’t know if I’m lost in the semantics of the naming, the comment, or perhaps
> >> there’s a nuance in the “…we support…” part.  I would’ve guessed, otherwise,
> >> that the max # of PRP entries would be a function of the PAGE_SIZE.
> >>
> >> I did see that the driver in nvme_ctrlr_identify() compares this derived maximum
> >> transfer size with that which the controller can actually support as reported in
> >> the Identify Controller structure, choosing the minimum of the two values, but
> >> that’s understood and separate from the above.
> >>
> >> regards,
> >>
> >>
> >> --
> >> Lance Hartmann
> >> lance.hartmann@oracle.com
> >>
> >>
> >> _______________________________________________
> >> SPDK mailing list
> >> SPDK@lists.01.org
> >> https://lists.01.org/mailman/listinfo/spdk