From: SPDK <spdk-bounces(a)lists.01.org> on behalf of Fenggang Wu
Reply-To: Storage Performance Development Kit <spdk(a)lists.01.org>
Date: Thursday, March 29, 2018 at 1:43 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: Re: [SPDK] SPDK Async I/O Thread
I have two follow-up questions:
1) In the last email, you mentioned that the cost of starting a second polled mode
thread exceeds its benefit. I am interested in how SPDK reached this design decision.
Would you please give me some insight into the tradeoffs between the cost and the benefit?
The cost is burning an extra core for the extra polled mode thread. This cost is really
only worthwhile if the first polled mode thread cannot keep up with the rate of I/O
requests coming from the synchronous RocksDB threads. It is very difficult to come up
with a workload that can generate enough I/O requests to consume the single polled mode
thread. The flush and compaction threads will be generating very large read and write
requests which will consume a lot of the SSD bandwidth – so the number of I/O requests is
small, but each I/O request itself is relatively large (it doesn’t take much more time for
the driver to submit a 64KB I/O than it does a 4KB I/O). The application threads will
generate smaller I/O requests, but always QD=1 since they are synchronous.
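The arithmetic behind this argument can be sketched roughly as follows. Only the 64KB request size comes from the paragraph above; the 2 GiB/s bandwidth, the 100 microsecond latency, and the count of 16 application threads are illustrative assumptions, not measurements. The function names are hypothetical.

```python
def large_io_rate(bandwidth_bytes_per_s, io_size_bytes):
    """Requests per second when fixed-size I/Os saturate the given bandwidth."""
    return bandwidth_bytes_per_s / io_size_bytes


def sync_io_rate(num_threads, latency_us):
    """Upper bound on requests per second from synchronous threads.

    Each thread keeps exactly one I/O in flight (QD=1), so it can issue
    at most one request per round-trip latency.
    """
    return num_threads * 1_000_000 / latency_us


# Assumed figures for illustration only.
ASSUMED_BANDWIDTH = 2 * 1024**3   # 2 GiB/s available to flush + compaction
ASSUMED_LATENCY_US = 100          # per-request round trip seen by a sync thread
ASSUMED_APP_THREADS = 16

# Flush/compaction: large requests, so few requests per second.
background = large_io_rate(ASSUMED_BANDWIDTH, 64 * 1024)

# Application threads: small requests, but capped by QD=1.
foreground = sync_io_rate(ASSUMED_APP_THREADS, ASSUMED_LATENCY_US)

print(f"background requests/s: {background:.0f}")
print(f"foreground requests/s: {foreground:.0f}")
```

Under these assumptions, both sources together produce a request rate on the order of a few hundred thousand per second at most, which is the sense in which it is hard to overwhelm a single polled mode thread.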
A workload + configuration that would require a second polled mode thread would likely
need the following characteristics:
· Very heavily read-oriented – such that the flush+compaction traffic is minimal,
meaning more bandwidth is available for smaller read requests resulting from application
threads.
· A backing device that is striped across multiple SSDs. I know you’ve been doing
some work in this area already. ☺ With just a single NVMe SSD – best case IO/s is around
500K and a single polled mode thread can definitely drive that and more. But if it’s
striped across multiple SSDs to get a higher aggregate IO/s capability, an additional
polled mode thread might be needed if we can prove that the synchronous threads are being
bottlenecked on the polled mode thread and not the media.
2) Is the order of the events preserved through the async I/O thread's event
queue? Will any merging, reordering, or scheduling happen during the queuing?
Events will not be reordered. There’s a multi-producer, single-consumer lockless ring for
passing the events – each synchronous thread will post its events to this ring and the
polled mode thread will then execute those events always in order.
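The ordering guarantee described above can be modeled with a small sketch. This is not SPDK code: Python's `queue.Queue` takes a lock internally, so it stands in only for the FIFO behavior of the ring, not for its lockless implementation, and the consumer here blocks on the queue rather than polling. All names are hypothetical.

```python
import queue
import threading


def run_demo(num_producers=4, events_per_producer=1000):
    """Post events from several producer threads; return the consumer's execution order."""
    ring = queue.Queue()   # stand-in for the multi-producer, single-consumer ring
    executed = []          # events in the order the single consumer ran them
    stop = object()        # sentinel telling the consumer to exit

    def producer(pid):
        # Models a synchronous thread posting its events to the ring.
        for seq in range(events_per_producer):
            ring.put((pid, seq))

    def consumer():
        # Models the single thread that dequeues and executes events.
        while True:
            event = ring.get()
            if event is stop:
                break
            executed.append(event)

    consumer_thread = threading.Thread(target=consumer)
    consumer_thread.start()
    producers = [threading.Thread(target=producer, args=(p,))
                 for p in range(num_producers)]
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    ring.put(stop)         # posted only after every producer has finished
    consumer_thread.join()
    return executed


def per_producer_order_preserved(executed):
    """True if each producer's events were executed in the order it posted them."""
    last_seq = {}
    for pid, seq in executed:
        if seq != last_seq.get(pid, -1) + 1:
            return False
        last_seq[pid] = seq
    return True


if __name__ == "__main__":
    order = run_demo()
    print(len(order), per_producer_order_preserved(order))
```

In this model, events from different producers interleave according to when they reach the queue, but the events from any one producer are always executed in the order that producer posted them.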
Thank you very much!
On Thu, Mar 22, 2018 at 11:04 PM Harris, James R
From: SPDK <firstname.lastname@example.org> on
behalf of Fenggang Wu <firstname.lastname@example.org>
Reply-To: Storage Performance Development Kit
Date: Friday, March 23, 2018 at 2:30 AM
To: Storage Performance Development Kit
Subject: [SPDK] SPDK Async I/O Thread
I have some questions about the SPDK async I/O thread. I know that the async I/O thread
receives I/O requests as events from other application threads, forwards the
requests to the I/O device, and then polls for their completion.
My questions are:
1) When is the async I/O thread created? How many async I/O threads are created? If more
than one, is the number configurable? I know that for blobstore there is only one such
async I/O thread.
There is only one async I/O thread. Currently it is not configurable – additional work
would be needed to spread out the asynchronous requests to different cores. It could be
done but so far using the single async I/O thread is not a bottleneck and the cost of
starting a second polled mode thread exceeds the benefit of it. This async I/O thread is
created as part of starting the SPDK app framework – see the call to spdk_app_start() in
the RocksDB plugin (lib/rocksdb/env_spdk.cc).
2) My intuition is that the async I/O thread is crucial for I/O performance, so we
would prefer to prevent it from being evicted from the core. Does SPDK have any method that
keeps the async I/O thread on the core? With such a method, is the async I/O thread free
Each of the threads started by the SPDK app framework is pinned to its own core – this
is typically enough to ensure the thread does not get evicted, as the scheduler will run
the other threads on other available cores. But if there are many threads running in
addition to the async I/O thread, additional precautions can be taken – for example, the
Linux isolcpus kernel boot parameter, which isolates specific cores from the scheduler.
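To illustrate what pinning a thread to a core means at the OS level, here is a minimal sketch using the Linux CPU affinity calls that Python exposes. It is not how the SPDK app framework does it internally. The helper name `pin_current_thread` is hypothetical, and the code assumes a Linux host.

```python
import os


def pin_current_thread(core):
    """Restrict the caller to a single core and return the resulting affinity set.

    Linux only: os.sched_setaffinity is not available on every platform.
    A pid of 0 means "the caller".
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available here")
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)   # cores the caller may currently run on
    core = min(allowed)                 # arbitrary choice for the demo
    print("pinned to:", pin_current_thread(core))
    os.sched_setaffinity(0, allowed)    # restore the original affinity
```

Pinning only controls where this thread may run. Keeping other threads off that core is a separate step, which is what a boot parameter such as `isolcpus=2,3` (core numbers chosen here purely as an example) is for.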