From: Alastair D'Silva <alastair(a)d-silva.org>
This series adds support for OpenCAPI SCM devices, exposing
them as nvdimms so that we can make use of the existing
- "powerpc: Map & release OpenCAPI LPC memory"
- Fix #if -> #ifdef
- use pci_dev_id to get the bdfn
- use __be64 to hold be data
- indent check_hotplug_memory_addressable correctly
- Remove export of check_hotplug_memory_addressable
- "ocxl: Conditionally bind SCM devices to the generic OCXL driver"
- Improve patch description and remove redundant default
- "nvdimm: Add driver for OpenCAPI Storage Class Memory"
- Mark a few funcs as static as identified by the 0day bot
- Add OCXL dependencies to OCXL_SCM
- Use memcpy_mcsafe in scm_ndctl_config_read
- Rename scm_foo_offset_0x00 to scm_foo_header_parse & add docs
- Name DIMM attribs "ocxl" rather than "scm"
- Split out into base + many feature patches
- "powerpc: Enable OpenCAPI Storage Class Memory driver on bare metal"
- Build DEV_DAX & friends as modules
- "ocxl: Conditionally bind SCM devices to the generic OCXL driver"
- Patch dropped (easy enough to maintain this out of tree for development)
- "ocxl: Tally up the LPC memory on a link & allow it to be mapped"
- Add a warning if an unmatched lpc_release is called
- "ocxl: Add functions to map/unmap LPC memory"
- Use EXPORT_SYMBOL_GPL
Alastair D'Silva (27):
memory_hotplug: Add a bounds check to __add_pages
nvdimm: remove prototypes for nonexistent functions
powerpc: Add OPAL calls for LPC memory alloc/release
mm/memory_hotplug: Allow check_hotplug_memory_addressable to be called
powerpc: Map & release OpenCAPI LPC memory
ocxl: Tally up the LPC memory on a link & allow it to be mapped
ocxl: Add functions to map/unmap LPC memory
ocxl: Save the device serial number in ocxl_fn
ocxl: Free detached contexts in ocxl_context_detach_all()
nvdimm: Add driver for OpenCAPI Storage Class Memory
nvdimm/ocxl: Add register addresses & status values to header
nvdimm/ocxl: Read the capability registers & wait for device ready
nvdimm/ocxl: Add support for Admin commands
nvdimm/ocxl: Add support for near storage commands
nvdimm/ocxl: Register a character device for userspace to interact
nvdimm/ocxl: Implement the Read Error Log command
nvdimm/ocxl: Add controller dump IOCTLs
nvdimm/ocxl: Add an IOCTL to report controller statistics
nvdimm/ocxl: Forward events to userspace
nvdimm/ocxl: Add an IOCTL to request controller health & perf data
nvdimm/ocxl: Support firmware update via sysfs
nvdimm/ocxl: Implement the heartbeat command
nvdimm/ocxl: Add debug IOCTLs
nvdimm/ocxl: Implement Overwrite
nvdimm/ocxl: Expose SMART data via ndctl
powerpc: Enable OpenCAPI Storage Class Memory driver on bare metal
MAINTAINERS: Add myself & nvdimm/ocxl to ocxl
MAINTAINERS | 3 +
arch/powerpc/configs/powernv_defconfig | 4 +
arch/powerpc/include/asm/opal-api.h | 2 +
arch/powerpc/include/asm/opal.h | 3 +
arch/powerpc/include/asm/pnv-ocxl.h | 2 +
arch/powerpc/platforms/powernv/ocxl.c | 42 +
arch/powerpc/platforms/powernv/opal-call.c | 2 +
drivers/misc/ocxl/config.c | 50 +
drivers/misc/ocxl/context.c | 6 +-
drivers/misc/ocxl/core.c | 60 +
drivers/misc/ocxl/link.c | 60 +
drivers/misc/ocxl/ocxl_internal.h | 36 +
drivers/nvdimm/Kconfig | 2 +
drivers/nvdimm/Makefile | 2 +-
drivers/nvdimm/nd-core.h | 4 -
drivers/nvdimm/ocxl/Kconfig | 21 +
drivers/nvdimm/ocxl/Makefile | 7 +
drivers/nvdimm/ocxl/scm.c | 2220 ++++++++++++++++++++
drivers/nvdimm/ocxl/scm_internal.c | 238 +++
drivers/nvdimm/ocxl/scm_internal.h | 284 +++
drivers/nvdimm/ocxl/scm_sysfs.c | 163 ++
include/linux/memory_hotplug.h | 5 +
include/misc/ocxl.h | 19 +
include/uapi/nvdimm/ocxl-scm.h | 127 ++
mm/memory_hotplug.c | 21 +
25 files changed, 3377 insertions(+), 6 deletions(-)
create mode 100644 drivers/nvdimm/ocxl/Kconfig
create mode 100644 drivers/nvdimm/ocxl/Makefile
create mode 100644 drivers/nvdimm/ocxl/scm.c
create mode 100644 drivers/nvdimm/ocxl/scm_internal.c
create mode 100644 drivers/nvdimm/ocxl/scm_internal.h
create mode 100644 drivers/nvdimm/ocxl/scm_sysfs.c
create mode 100644 include/uapi/nvdimm/ocxl-scm.h
This patch series enables DAX support for the virtio-fs filesystem. Patches
are based on the 5.3-rc5 kernel and need the first patch series posted for
virtio-fs support with subject "virtio-fs: shared file system for virtual
Enabling DAX seems to improve performance a great deal for most
operations. I reported performance numbers in the first patch
series, so I am not repeating them here.
Any comments or feedback are welcome.
Sebastien Boeuf (3):
virtio: Add get_shm_region method
virtio: Implement get_shm_region for PCI transport
virtio: Implement get_shm_region for MMIO transport
Stefan Hajnoczi (4):
dax: remove block device dependencies
fuse, dax: add fuse_conn->dax_dev field
virtio_fs, dax: Set up virtio_fs dax_device
fuse, dax: add DAX mmap support
Vivek Goyal (12):
dax: Pass dax_dev to dax_writeback_mapping_range()
fuse: Keep a list of free dax memory ranges
fuse: implement FUSE_INIT map_alignment field
fuse: Introduce setupmapping/removemapping commands
fuse, dax: Implement dax read/write operations
fuse: Define dax address space operations
fuse, dax: Take ->i_mmap_sem lock during dax page fault
fuse: Maintain a list of busy elements
dax: Create a range version of dax_layout_busy_page()
fuse: Add logic to free up a memory range
fuse: Release file in process context
fuse: Take inode lock for dax inode truncation
drivers/dax/super.c | 3 +-
drivers/virtio/virtio_mmio.c | 32 +
drivers/virtio/virtio_pci_modern.c | 108 +++
fs/dax.c | 89 +-
fs/ext2/inode.c | 2 +-
fs/ext4/inode.c | 2 +-
fs/fuse/cuse.c | 3 +-
fs/fuse/dir.c | 2 +
fs/fuse/file.c | 1206 +++++++++++++++++++++++++++-
fs/fuse/fuse_i.h | 99 ++-
fs/fuse/inode.c | 138 +++-
fs/fuse/virtio_fs.c | 134 +++-
fs/xfs/xfs_aops.c | 2 +-
include/linux/dax.h | 12 +-
include/linux/virtio_config.h | 17 +
include/uapi/linux/fuse.h | 47 +-
include/uapi/linux/virtio_fs.h | 3 +
include/uapi/linux/virtio_mmio.h | 11 +
include/uapi/linux/virtio_pci.h | 11 +-
19 files changed, 1868 insertions(+), 53 deletions(-)
Since the last RFC patch set much of the discussion of supporting RDMA with
FS DAX has been around the semantics of the lease mechanism. Within that
thread it was suggested I try and write some documentation and/or tests for the
new mechanism being proposed. I have created a foundation to test lease
functionality within xfstests. This should be close to being accepted.
Before writing additional lease tests, or changing lots of kernel code, this
email presents documentation for the new proposed "layout lease" semantic.
At Linux Plumbers just over a week ago, I presented the current state of the
patch set and the outstanding issues. Based on the discussion there, as well as
follow up emails, I propose the following addition to the fcntl() man page.
<fcntl man page addition>
Layout (F_LAYOUT) leases are special leases which can be used to control and/or
be informed about the manipulation of the underlying layout of a file.
A layout is defined as the logical file block -> physical file block mapping
including the file size and sharing of physical blocks among files. Note that
the unwritten state of a block is not considered part of file layout.
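As an illustration only (this is not part of the proposed interface), the logical -> physical mapping described above can already be inspected from user space with the existing FIEMAP ioctl. The sketch below assumes a filesystem that implements FIEMAP; dump_layout() and MAX_EXTENTS are names made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

#define MAX_EXTENTS 32     /* arbitrary cap chosen for this example */

/*
 * Print up to MAX_EXTENTS extents of the file behind fd.
 * Returns the number of extents printed, or -errno on failure.
 */
int dump_layout(int fd)
{
    size_t sz = sizeof(struct fiemap) +
                MAX_EXTENTS * sizeof(struct fiemap_extent);
    struct fiemap *fm = calloc(1, sz);
    if (!fm)
        return -ENOMEM;

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* map the whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* flush dirty data before mapping */
    fm->fm_extent_count = MAX_EXTENTS;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) == -1) {
        int err = errno;
        free(fm);
        return -err;
    }

    assert(fm->fm_mapped_extents <= MAX_EXTENTS);
    for (unsigned int i = 0; i < fm->fm_mapped_extents; i++) {
        const struct fiemap_extent *e = &fm->fm_extents[i];
        /* A shared extent is physical storage used by more than one file. */
        printf("logical %llu -> physical %llu, length %llu%s\n",
               (unsigned long long)e->fe_logical,
               (unsigned long long)e->fe_physical,
               (unsigned long long)e->fe_length,
               (e->fe_flags & FIEMAP_EXTENT_SHARED) ? " (shared)" : "");
    }

    int count = (int)fm->fm_mapped_extents;
    free(fm);
    return count;
}
```

FIEMAP also reports an unwritten flag per extent (FIEMAP_EXTENT_UNWRITTEN); the sketch ignores it because the definition above excludes the unwritten state from the layout.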
**Read layout lease (F_RDLCK | F_LAYOUT)**
Read layout leases can be used to be informed of layout changes by the
system or other users. This lease is similar to the standard read (F_RDLCK)
lease in that any attempt to change the _layout_ of the file will be reported to
the process through the lease break process. But this lease is different
because the file can be opened for write and data can be read and/or written to
the file as long as the underlying layout of the file does not change.
Therefore, the lease is not broken if the file is simply open for write, but
_may_ be broken if an operation such as truncate(), fallocate() or write()
results in changing the underlying layout.
**Write layout lease (F_WRLCK | F_LAYOUT)**
Write Layout leases can be used to break read layout leases to indicate that
the process intends to change the underlying layout of the file.
A process which has taken a write layout lease has exclusive ownership of the
file layout and can modify that layout as long as the lease is held.
Operations which change the layout are allowed by that process. But operations
from other file descriptors which attempt to change the layout will break the
lease through the standard lease break process. The F_LAYOUT flag is used to
indicate a difference between a regular F_WRLCK and F_WRLCK with F_LAYOUT. In
the F_LAYOUT case opens for write do not break the lease. But some operations,
if they change the underlying layout, may.
The distinction between read layout leases and write layout leases is that
write layout leases can change the layout without breaking the lease within the
owning process. This is useful to guarantee a layout prior to specifying the
unbreakable flag described below.
**Unbreakable Layout Leases (F_UNBREAK)**
In order to support pinning of file pages by direct user space users an
unbreakable flag (F_UNBREAK) can be used to modify the read and write layout
lease. When specified, F_UNBREAK indicates that any user attempting to break
the lease will fail with ETXTBSY rather than follow the normal breaking
Both read and write layout leases can have the unbreakable flag (F_UNBREAK)
specified. The difference between an unbreakable read layout lease and an
unbreakable write layout lease is that an unbreakable read layout lease is
_not_ exclusive. This means that once a layout is established on a file,
multiple unbreakable read layout leases can be taken by multiple processes and
used to pin the underlying pages of that file.
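The rules above can be summarized as a small decision function. This is a toy user-space model of the semantics as worded in this proposal, not kernel code. All type and function names are invented for the example, and it makes two simplifying assumptions: the operation is already known to change the layout (the text only says such operations _may_ break the lease), and "owner" means the process holding the lease.

```c
#include <assert.h>
#include <stdbool.h>

enum lease_kind { LEASE_READ_LAYOUT, LEASE_WRITE_LAYOUT };

enum op_kind {
    OP_OPEN_FOR_WRITE,  /* open(2) with write access */
    OP_DATA_IO,         /* read/write that leaves the layout untouched */
    OP_LAYOUT_CHANGE    /* e.g. a truncate() that changes the mapping */
};

enum outcome {
    OUTCOME_ALLOWED,        /* operation proceeds, lease stays intact */
    OUTCOME_LEASE_BROKEN,   /* standard lease break process runs */
    OUTCOME_FAILS_ETXTBSY   /* unbreakable lease: the operation fails */
};

struct layout_lease {
    enum lease_kind kind;
    bool unbreakable;       /* models the proposed F_UNBREAK flag */
};

/* Decide what happens when `op` is attempted against a held lease. */
enum outcome lease_outcome(struct layout_lease lease, enum op_kind op,
                           bool by_owner)
{
    assert(op == OP_OPEN_FOR_WRITE || op == OP_DATA_IO ||
           op == OP_LAYOUT_CHANGE);

    /* Opens for write and plain data I/O never break a layout lease. */
    if (op != OP_LAYOUT_CHANGE)
        return OUTCOME_ALLOWED;

    /* Only a write layout lease lets its owner change the layout. */
    if (lease.kind == LEASE_WRITE_LAYOUT && by_owner)
        return OUTCOME_ALLOWED;

    /* Everyone else triggers a break, unless the lease is unbreakable. */
    return lease.unbreakable ? OUTCOME_FAILS_ETXTBSY : OUTCOME_LEASE_BROKEN;
}
```

Note that in this model a layout change by the holder of a read layout lease breaks that holder's own lease, which is how the text distinguishes read from write layout leases.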
Care must therefore be taken to ensure that the layout of the file is as the
user wants prior to using the unbreakable read layout lease. A safe mechanism
to do this would be to take a write layout lease and use fallocate() to set the
layout of the file. The layout lease can then be "downgraded" to unbreakable
read layout as long as no other user broke the write layout lease.
</fcntl man page addition>
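The "safe mechanism" from the last paragraph of the proposed text might look roughly like this from user space. This is a sketch under loud assumptions: F_LAYOUT and F_UNBREAK do not exist in mainline kernels, so the values below are placeholders invented for this example, and the text does not say how the downgrade is requested, so a second F_SETLEASE call is assumed. pin_file_layout() and release_and_fail() are hypothetical helper names.

```c
#define _GNU_SOURCE        /* F_SETLEASE and fallocate() are Linux/GNU extensions */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* HYPOTHETICAL placeholder values: these flags are only proposed. */
#ifndef F_LAYOUT
#define F_LAYOUT  0x100
#endif
#ifndef F_UNBREAK
#define F_UNBREAK 0x200
#endif

/* Drop whatever lease is held on fd and report err as -errno. */
static int release_and_fail(int fd, int err)
{
    fcntl(fd, F_SETLEASE, F_UNLCK);
    return -err;
}

/*
 * Establish a layout of len bytes on fd, then pin it.
 * Returns 0 on success or -errno on failure.
 */
int pin_file_layout(int fd, off_t len)
{
    assert(len > 0);

    /* 1. Take exclusive ownership of the layout. */
    if (fcntl(fd, F_SETLEASE, F_WRLCK | F_LAYOUT) == -1)
        return -errno;

    /* 2. Set the layout while the write layout lease is held. */
    if (fallocate(fd, 0, 0, len) == -1)
        return release_and_fail(fd, errno);

    /* 3. "Downgrade" to an unbreakable read layout lease (assumed call). */
    if (fcntl(fd, F_SETLEASE, F_RDLCK | F_LAYOUT | F_UNBREAK) == -1)
        return release_and_fail(fd, errno);

    return 0;
}
```

On a kernel without the proposed flags, step 1 is expected to fail (typically with EINVAL), so the function returns a negative error and leaves the file untouched.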