[PATCH 1/1] ndctl/namespace: Fix disable-namespace accounting relative to seed devices
by Redhairer Li
Seed namespaces are included in "ndctl disable-namespace all". However
since the user never "creates" them it is surprising to see
"disable-namespace" report 1 more namespace relative to the number that
have been created. Catch attempts to disable a zero-sized namespace:
Before:
{
"dev":"namespace1.0",
"size":"492.00 MiB (515.90 MB)",
"blockdev":"pmem1"
}
{
"dev":"namespace1.1",
"size":"492.00 MiB (515.90 MB)",
"blockdev":"pmem1.1"
}
{
"dev":"namespace1.2",
"size":"492.00 MiB (515.90 MB)",
"blockdev":"pmem1.2"
}
disabled 4 namespaces
After:
{
"dev":"namespace1.0",
"size":"492.00 MiB (515.90 MB)",
"blockdev":"pmem1"
}
{
"dev":"namespace1.3",
"size":"492.00 MiB (515.90 MB)",
"blockdev":"pmem1.3"
}
{
"dev":"namespace1.1",
"size":"492.00 MiB (515.90 MB)",
"blockdev":"pmem1.1"
}
disabled 3 namespaces
Signed-off-by: Redhairer Li <redhairer.li(a)intel.com>
---
ndctl/lib/libndctl.c | 11 ++++++++---
ndctl/region.c | 4 +++-
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/ndctl/lib/libndctl.c b/ndctl/lib/libndctl.c
index ee737cb..49f362b 100644
--- a/ndctl/lib/libndctl.c
+++ b/ndctl/lib/libndctl.c
@@ -4231,6 +4231,7 @@ NDCTL_EXPORT int ndctl_namespace_disable_safe(struct ndctl_namespace *ndns)
const char *bdev = NULL;
char path[50];
int fd;
+ unsigned long long size = ndctl_namespace_get_size(ndns);
if (pfn && ndctl_pfn_is_enabled(pfn))
bdev = ndctl_pfn_get_block_device(pfn);
@@ -4260,9 +4261,13 @@ NDCTL_EXPORT int ndctl_namespace_disable_safe(struct ndctl_namespace *ndns)
devname, bdev, strerror(errno));
return -errno;
}
- } else
- ndctl_namespace_disable_invalidate(ndns);
-
+ } else {
+ if (size == 0)
+ /* Don't try to disable idle namespace (no capacity allocated) */
+ return -ENXIO;
+ else
+ ndctl_namespace_disable_invalidate(ndns);
+ }
return 0;
}
diff --git a/ndctl/region.c b/ndctl/region.c
index 7945007..0014bb9 100644
--- a/ndctl/region.c
+++ b/ndctl/region.c
@@ -72,6 +72,7 @@ static int region_action(struct ndctl_region *region, enum device_action mode)
{
struct ndctl_namespace *ndns;
int rc = 0;
+ unsigned long long size;
switch (mode) {
case ACTION_ENABLE:
@@ -80,7 +81,8 @@ static int region_action(struct ndctl_region *region, enum device_action mode)
case ACTION_DISABLE:
ndctl_namespace_foreach(region, ndns) {
rc = ndctl_namespace_disable_safe(ndns);
- if (rc)
+ size = ndctl_namespace_get_size(ndns);
+ if (rc && size != 0)
return rc;
}
rc = ndctl_region_disable_invalidate(region);
--
2.20.1.windows.1
1 month, 1 week
[PATCH ndctl v1 00/10] daxctl: Support for sub-dividing soft-reserved regions
by Joao Martins
Hey,
This series introduces the daxctl support for sub-dividing soft-reserved
regions created by EFI/HMAT/efi_fake_mem. It's the userspace counterpart
of this recent patch series [0].
These new 'dynamic' regions can be partitioned into multiple different devices
which its subdivisions can consist of one or more ranges. This
is in contrast to static dax regions -- created with ndctl-create-namespace
-m devdax -- which can't be subdivided neither discontiguous.
See also cover-letter of [0].
The daxctl changes in these patches are depicted as:
* {create,destroy,disable,enable}-device:
These orchestrate/manage the sub-division devices.
It mimmics the same as namespaces equivalent commands.
* Allow reconfigure-device to change the size of an existing *dynamic* dax
device.
* Add test coverage (so far tried to cover all range allocation code paths,
but I am still fishing for bugs). Additionally, there are bugs so applying
[0] may not make it pass the added functional test yet.
I am sending the series earlier (i.e. before the kernel patches get merged)
mainly to share a common unit tests, and also letting others try it out.
The only TODOs left is documentation, and perhaps listing of the mappingX sysfs
entries.
Thoughts, comments appreciated. :)
Thanks!
Joao
[0] "device-dax: Support sub-dividing soft-reserved ranges",
https://lore.kernel.org/linux-nvdimm/158500767138.2088294.171316462598039...
Dan Williams (1):
daxctl: Cleanup whitespace
Joao Martins (9):
libdaxctl: add daxctl_dev_set_size()
daxctl: add resize support in reconfigure-device
daxctl: add command to disable devdax device
daxctl: add command to enable devdax device
libdaxctl: add daxctl_region_create_dev()
daxctl: add command to create device
libdaxctl: add daxctl_region_destroy_dev()
daxctl: add command to destroy device
daxctl/test: Add tests for dynamic dax regions
daxctl/builtin.h | 4 +
daxctl/daxctl.c | 4 +
daxctl/device.c | 310 ++++++++++++++++++++++++++++++++++++++-
daxctl/lib/libdaxctl.c | 67 +++++++++
daxctl/lib/libdaxctl.sym | 7 +
daxctl/libdaxctl.h | 3 +
test/Makefile.am | 1 +
test/daxctl-create.sh | 293 ++++++++++++++++++++++++++++++++++++
util/filter.c | 2 +-
9 files changed, 686 insertions(+), 5 deletions(-)
create mode 100755 test/daxctl-create.sh
--
2.17.1
7 months, 3 weeks
[PATCH 00/12] device-dax: Support sub-dividing soft-reserved ranges
by Dan Williams
The device-dax facility allows an address range to be directly mapped
through a chardev, or turned around and hotplugged to the core kernel
page allocator as System-RAM. It is the baseline mechanism for
converting persistent memory (pmem) to be used as another volatile
memory pool i.e. the current Memory Tiering hot topic on linux-mm.
In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [1]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.
The motivations for this facility are:
1/ Allow performance differentiated memory ranges to be split between
kernel-managed and directly-accessed use cases.
2/ Allow physical memory to be provisioned along performance relevant
address boundaries. For example, divide a memory-side cache [2] along
cache-color boundaries.
3/ Parcel out soft-reserved memory to VMs using device-dax as a security
/ permissions boundary [3]. Specifically I have seen people (ab)using
memmap=nn!ss (mark System-RAM as Peristent Memory) just to get the
device-dax interface on custom address ranges.
The baseline for this series is today's next/master + "[PATCH v2 0/6]
Manual definition of Soft Reserved memory devices" [4].
Big thanks to Joao for the early testing and feedback on this series!
Given the dependencies on the memremap_pages() reworks in Andrew's tree
and the proximity to v5.7 this is clearly v5.8 material. The patches in
most need of a second opinion are the memremap_pages() reworks to switch
from 'struct resource' to 'struct range' and allow for an array of
ranges to be mapped at once.
[1]: https://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit...
[2]: https://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit...
[3]: https://lore.kernel.org/r/20200110190313.17144-1-joao.m.martins@oracle.com/
[4]: http://lore.kernel.org/r/158489354353.1457606.8327903161927980740.stgit@d...
---
Dan Williams (12):
device-dax: Drop the dax_region.pfn_flags attribute
device-dax: Move instance creation parameters to 'struct dev_dax_data'
device-dax: Make pgmap optional for instance creation
device-dax: Kill dax_kmem_res
device-dax: Add an allocation interface for device-dax instances
device-dax: Introduce seed devices
drivers/base: Make device_find_child_by_name() compatible with sysfs inputs
device-dax: Add resize support
mm/memremap_pages: Convert to 'struct range'
mm/memremap_pages: Support multiple ranges per invocation
device-dax: Add dis-contiguous resource support
device-dax: Introduce 'mapping' devices
arch/powerpc/kvm/book3s_hv_uvmem.c | 14 -
drivers/base/core.c | 2
drivers/dax/bus.c | 877 ++++++++++++++++++++++++++++++--
drivers/dax/bus.h | 28 +
drivers/dax/dax-private.h | 36 +
drivers/dax/device.c | 97 ++--
drivers/dax/hmem/hmem.c | 18 -
drivers/dax/kmem.c | 170 +++---
drivers/dax/pmem/compat.c | 2
drivers/dax/pmem/core.c | 22 +
drivers/gpu/drm/nouveau/nouveau_dmem.c | 4
drivers/nvdimm/badrange.c | 26 -
drivers/nvdimm/claim.c | 13
drivers/nvdimm/nd.h | 3
drivers/nvdimm/pfn_devs.c | 13
drivers/nvdimm/pmem.c | 27 +
drivers/nvdimm/region.c | 21 -
drivers/pci/p2pdma.c | 12
include/linux/memremap.h | 9
include/linux/range.h | 6
mm/memremap.c | 297 ++++++-----
tools/testing/nvdimm/dax-dev.c | 22 +
tools/testing/nvdimm/test/iomap.c | 2
23 files changed, 1318 insertions(+), 403 deletions(-)
7 months, 4 weeks
[RFC PATCH 0/8] dax: Add a dax-rmap tree to support reflink
by Shiyang Ruan
This patchset is a try to resolve the shared 'page cache' problem for
fsdax.
In order to track multiple mappings and indexes on one page, I
introduced a dax-rmap rb-tree to manage the relationship. A dax entry
will be associated more than once if is shared. At the second time we
associate this entry, we create this rb-tree and store its root in
page->private(not used in fsdax). Insert (->mapping, ->index) when
dax_associate_entry() and delete it when dax_disassociate_entry().
We can iterate the dax-rmap rb-tree before any other operations on
mappings of files. Such as memory-failure and rmap.
Same as before, I borrowed and made some changes on Goldwyn's patchsets.
These patches makes up for the lack of CoW mechanism in fsdax.
The rests are dax & reflink support for xfs.
(Rebased to 5.7-rc2)
Shiyang Ruan (8):
fs/dax: Introduce dax-rmap btree for reflink
mm: add dax-rmap for memory-failure and rmap
fs/dax: Introduce dax_copy_edges() for COW
fs/dax: copy data before write
fs/dax: replace mmap entry in case of CoW
fs/dax: dedup file range to use a compare function
fs/xfs: handle CoW for fsdax write() path
fs/xfs: support dedupe for fsdax
fs/dax.c | 343 +++++++++++++++++++++++++++++++++++++----
fs/ocfs2/file.c | 2 +-
fs/read_write.c | 11 +-
fs/xfs/xfs_bmap_util.c | 6 +-
fs/xfs/xfs_file.c | 10 +-
fs/xfs/xfs_iomap.c | 3 +-
fs/xfs/xfs_iops.c | 11 +-
fs/xfs/xfs_reflink.c | 79 ++++++----
include/linux/dax.h | 11 ++
include/linux/fs.h | 9 +-
mm/memory-failure.c | 63 ++++++--
mm/rmap.c | 54 +++++--
12 files changed, 498 insertions(+), 104 deletions(-)
--
2.26.2
9 months
[PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()
by Dan Williams
Changes since v1:
- Rename memcpy_mcsafe() to copy_safe() since the x86-machine-check
specifics have already been de-emphasized in a previous commit and are
further de-emphasized by these changes. (Linus)
- Move copy_safe() out-of-line since it no longer reverts to plain
memcpy (Linus)
- Move copy_safe() to its own stand-alone compilation unit where it no
longer entangles with arch/x86/lib/memcpy_64.S. This also allows perf
to stop tracking ongoing updates to that file due to copy_safe()
updates. (Linus)
- Move the PowerPC implementation over to the new name.
[1]: http://lore.kernel.org/r/158654083112.1572482.8944305411228188871.stgit@d...
---
The primary motivation to go touch memcpy_mcsafe() is that the existing
benefit of doing slow and careful copies is obviated on newer CPUs. That
fact solves the problem of needing to detect machine-check recovery
capability. Now the old "mcsafe_key" opt-in to careful copying can be made
an opt-out from the default fast copy implementation.
The discussion with Linus further made clear that this facility had
already lost its x86-machine-check specificity starting with commit
2c89130a56a ("x86/asm/memcpy_mcsafe: Add write-protection-fault
handling"). The new changes to not require a "careful copy" further
de-emphasizes the role that x86-MCA plays in the implementation to just
one more source of recoverable trap during the operation.
With the above realizations the name "mcsafe" is no longer accurate and
copy_safe() is proposed as its replacement. x86 grows a copy_safe_fast()
implementation as a default implementation that is independent of
detecting the presence of x86-MCA.
---
Dan Williams (2):
copy_safe: Rename memcpy_mcsafe() to copy_safe()
x86/copy_safe: Introduce copy_safe_fast()
arch/powerpc/Kconfig | 2
arch/powerpc/include/asm/string.h | 2
arch/powerpc/include/asm/uaccess.h | 4
arch/powerpc/lib/Makefile | 2
arch/powerpc/lib/copy_safe.S | 4
arch/x86/Kconfig | 2
arch/x86/Kconfig.debug | 2
arch/x86/include/asm/copy_safe.h | 18 ++
arch/x86/include/asm/copy_safe_test.h | 75 +++++++++
arch/x86/include/asm/mcsafe_test.h | 75 ---------
arch/x86/include/asm/string_64.h | 32 ----
arch/x86/include/asm/uaccess_64.h | 21 ---
arch/x86/kernel/cpu/mce/core.c | 9 -
arch/x86/kernel/quirks.c | 10 -
arch/x86/lib/Makefile | 1
arch/x86/lib/copy_safe.c | 66 ++++++++
arch/x86/lib/copy_safe_64.S | 163 ++++++++++++++++++++
arch/x86/lib/memcpy_64.S | 115 --------------
arch/x86/lib/usercopy_64.c | 21 ---
drivers/md/dm-writecache.c | 12 +
drivers/nvdimm/claim.c | 2
drivers/nvdimm/pmem.c | 6 -
include/linux/string.h | 17 +-
include/linux/uio.h | 10 +
lib/Kconfig | 2
lib/iov_iter.c | 36 ++--
tools/arch/x86/include/asm/copy_safe_test.h | 13 ++
tools/arch/x86/include/asm/mcsafe_test.h | 13 --
tools/arch/x86/lib/memcpy_64.S | 115 --------------
tools/objtool/check.c | 5 -
tools/perf/bench/Build | 1
tools/perf/bench/mem-memcpy-x86-64-lib.c | 24 ---
tools/testing/nvdimm/test/nfit.c | 49 +++---
.../testing/selftests/powerpc/copyloops/.gitignore | 2
tools/testing/selftests/powerpc/copyloops/Makefile | 6 -
.../selftests/powerpc/copyloops/copy_safe.S | 0
36 files changed, 429 insertions(+), 508 deletions(-)
rename arch/powerpc/lib/{memcpy_mcsafe_64.S => copy_safe.S} (98%)
create mode 100644 arch/x86/include/asm/copy_safe.h
create mode 100644 arch/x86/include/asm/copy_safe_test.h
delete mode 100644 arch/x86/include/asm/mcsafe_test.h
create mode 100644 arch/x86/lib/copy_safe.c
create mode 100644 arch/x86/lib/copy_safe_64.S
create mode 100644 tools/arch/x86/include/asm/copy_safe_test.h
delete mode 100644 tools/arch/x86/include/asm/mcsafe_test.h
delete mode 100644 tools/perf/bench/mem-memcpy-x86-64-lib.c
rename tools/testing/selftests/powerpc/copyloops/{memcpy_mcsafe_64.S => copy_safe.S} (100%)
base-commit: b8dcd632c06b8706d22934f9bf9bf16a42b1ecc7
10 months
[PATCH v6 0/4] powerpc/papr_scm: Add support for reporting nvdimm health
by Vaibhav Jain
The PAPR standard[1][3] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[2]. Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.
To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch nvdimm health stats using hcalls described
in Ref[2]. This health and performance stats are then exposed to
userspace via syfs and PAPR-nvDimm-Specific-Methods(PDSM) issued by
libndctl.
These changes coupled with proposed ndtcl changes located at Ref[4]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.
Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:
# ndctl list -DH
[
{
"dev":"nmem0",
"health":{
"health_state":"fatal",
"shutdown_state":"dirty"
}
}
]
Dimm health report output on a pseries guest lpar with vPMEM or HMS
based nvdimms that are in perfectly healthy conditions:
# ndctl list -d nmem0 -H
[
{
"dev":"nmem0",
"health":{
"health_state":"ok",
"shutdown_state":"clean"
}
}
]
PAPR nvDimm-Specific-Methods(PDSM)
==================================
PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_scm_pdsm.h'. Support for
more PDSMs will be added in future.
Structure of the patch-set
==========================
The patchset starts with implementing support for fetching nvdimm health
information from PHYP and partially exposing it to user-space via a nvdimm
sysfs flag.
Second & Third patches deal with implementing support for servicing PDSM
commands in papr_scm module.
Finally the Fourth patch implements support for servicing PDSM
'PAPR_SCM_PDSM_HEALTH' that returns the nvdimm health information to
libndctl.
Changelog:
==========
v5..v6:
* Incorporate review comments from Mpe and Dan Williams.
* Changed the usage of term DSM to PDSM as former conflicted with
usage in ACPI context.
* UAPI updates to remove usage of bool and marking the structs
defined as 'packed'.
* Simplified the health-bitmap handling in papr_scm to use u64
instead of __be64 integers.
* Caching of the health information so reading the dimm-flag file
doesn't result in costly hcalls everytime.
* Changed dimm-flag 'save_fail' to 'flush_fail'
* Moved the dimm flag file from 'papr_flags' to 'papr/flags'.
* Added a patch to document H_SCM_HEALTH hcall return values.
* Added sysfs ABI documentation for newly introduce dimm-flag
sysfs file 'papr/flags'
v4..v5:
* Fixed a bug in new implementation of papr_scm_ndctl() that was triggering
a false error condition.
v3..v4:
* Restructured papr_scm_ndctl() to dispatch ND_CMD_CALL commands to a new
function named papr_scm_service_dsm() to serivice DSM requests. [Aneesh]
v2..v3:
* Updated the papr_scm_dsm.h header to be more confimant general kernel
guidelines for UAPI headers. [Aneesh]
* Changed the definition of macro PAPR_SCM_DIMM_UNARMED_MASK to not
include case when the nvdimm is unarmed because its a vPMEM
nvdimm. [Aneesh]
v1..v2:
* Restructured the patch-set based on review comments on V1 patch-set to
simplify the patch review. Multiple small patches have been combined into
single patches to reduce cross referencing that was needed in earlier
patch-set. Hence most of the patches in this patch-set as now new. [Aneesh]
* Removed the initial work done for fetch nvdimm performance statistics.
These changes will be re-proposed in a separate patch-set. [Aneesh]
* Simplified handling of versioning of 'struct
nd_papr_scm_dimm_health_stat_v1' as only one version of the structure is
currently in existence.
References:
[1] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[2] commit 58b278f568f0
("powerpc: Provide initial documentation for PAPR hcalls")
[3] "Linux on Power Architecture Platform Reference"
https://members.openpowerfoundation.org/document/dl/469
[4] https://patchwork.kernel.org/project/linux-nvdimm/list/?series=244625
Vaibhav Jain (4):
powerpc: Document details on H_SCM_HEALTH hcall
powerpc/papr_scm: Fetch nvdimm health information from PHYP
ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH
Documentation/ABI/testing/sysfs-bus-papr-scm | 27 ++
Documentation/powerpc/papr_hcalls.rst | 43 ++-
arch/powerpc/include/asm/papr_scm.h | 49 +++
arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 192 +++++++++++
arch/powerpc/platforms/pseries/papr_scm.c | 317 +++++++++++++++++-
include/uapi/linux/ndctl.h | 1 +
6 files changed, 617 insertions(+), 12 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-scm
create mode 100644 arch/powerpc/include/asm/papr_scm.h
create mode 100644 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
--
2.25.3
10 months, 1 week
[ndctl PATCH v2 0/6] Add support for reporting papr-scm nvdimm health
by Vaibhav Jain
This patch-set proposes changes to libndctl to add support for reporting
health for nvdimms that support the PAPR standard[1]. The standard defines
machenism (HCALL) through which a guest kernel can query and fetch health
and performance stats of an nvdimm attached to the hypervisor[2]. Until
now 'ndctl' was unable to report these stats for papr_scm dimms on PPC64
guests due to absence of ACPI/NFIT, a limitation which this patch-set tries
to address.
The patch-set introduces support for the new PAPR-SCM PDSM family
defined at [3] via a new dimm-ops named
'papr_scm_dimm_ops'. Infrastructure to probe and distinguish papr-scm
dimms from other dimm families that may support ACPI/NFIT is
implemented by updating the 'struct ndctl_dimm' initialization
routines to bifurcate based on the nvdimm type. We also introduce two
new dimm-ops member for handling initialization of dimm specific data
for specific DSM families.
These changes coupled with proposed kernel changes located at Ref[4] should
provide a way for the user to retrieve NVDIMM health status using ndtcl for
papr_scm guests. Below is a sample output using proposed kernel + ndctl
changes:
# ndctl list -DH
[
{
"dev":"nmem0",
"flag_smart_event":true,
"health":{
"health_state":"fatal",
"shutdown_state":"dirty"
}
}
]
Structure of the patchset
=========================
We start with a re-factoring patch that splits the 'add_dimm()' function
into two functions one that take care of allocating and initializing
'struct ndctl_dimm' and another that takes care of initializing nfit
specific dimm attributes.
Patch-2 introduces probe function of papr_scm nvdimms and assigning
'papr_scm_dimm_ops' defined in 'papr_scm.c' to 'dimm->ops' if
needed. The patch also code to parse the dimm flags specific to
papr-scm nvdimms
Patch-3 introduces new dimm ops 'dimm_init()' & 'dimm_uninit()' to handle
DSM family specific initialization of 'struct ndctl_dimm'.
Patches-4,5 implements scaffolding to add support for PAPR_SCM PDSM
requests and pull in their definitions from the kernel.
Finally Patch-6 add support for issuing and handling the result of
'struct ndctl_cmd' to request dimm health stats from papr_scm kernel module
and returning appropriate health status to libndctl for reporting.
Changelog
=========
v1..v2:
* Reordered and squashed couple of patches to reduce the patchset size.
* Addressed few offline review comments from Santosh Sivaraj.
* Updated the uapi header file for papr_scm_pdsm.h.
* Updated the struct names based on the new uapi header.
References
==========
[1] "Power Architecture Platform Reference"
https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[2] "Hypercall Op-codes (hcalls)"
https://github.com/torvalds/linux/blob/master/Documentation/powerpc/papr_...
[3] "[PATCH v6 3/4] ndctl/papr_scm,uapi: Add support for PAPR nvdimm
specific methods"
https://lore.kernel.org/linux-nvdimm/20200420070711.223545-4-vaibhav@linu...
[4] "powerpc/papr_scm: Add support for reporting nvdimm health"
https://lore.kernel.org/linux-nvdimm/20200420070711.223545-1-vaibhav@linu...
Vaibhav Jain (6):
libndctl: Refactor out add_dimm() to handle NFIT specific init
libncdtl: Add initial support for NVDIMM_FAMILY_PAPR_SCM dimm family
libndctl: Introduce new dimm-ops dimm_init() & dimm_uninit()
libndctl,papr_scm: Add definitions for PAPR nvdimm specific methods
libndctl,papr_scm: Add scaffolding to issue and handle PDSM requests
libndctl,papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH
ndctl/lib/Makefile.am | 1 +
ndctl/lib/libndctl.c | 260 ++++++++++++++++++++++++---------
ndctl/lib/papr_scm.c | 293 ++++++++++++++++++++++++++++++++++++++
ndctl/lib/papr_scm_pdsm.h | 192 +++++++++++++++++++++++++
ndctl/lib/private.h | 7 +
ndctl/libndctl.h | 1 +
ndctl/ndctl.h | 1 +
7 files changed, 685 insertions(+), 70 deletions(-)
create mode 100644 ndctl/lib/papr_scm.c
create mode 100644 ndctl/lib/papr_scm_pdsm.h
--
2.25.3
10 months, 1 week
[PATCH v2 0/3] mm/memory_hotplug: Allow to not create firmware memmap entries
by David Hildenbrand
This is the follow up of [1]:
[PATCH v1 0/3] mm/memory_hotplug: Make virtio-mem play nicely with
kexec-tools
I realized that this is not only helpful for virtio-mem, but also for
dax/kmem - it's a fix for that use case (see patch #3) of persistent
memory.
Also, while testing, I discovered that kexec-tools will *not* add dax/kmem
memory (anything not directly under the root when parsing /proc/iomem) to
the elfcorehdr, so this memory will never get included in a dump. This
probably has to be fixed in kexec-tools - virtio-mem will require this as
well.
v1 -> v2:
- Don't change the resource name
- Rename the flag to MHP_NO_FIRMWARE_MEMMAP to reflect what it is doing
- Rephrase subjects/descriptions
- Use the flag for dax/kmem
I'll have to rebase virtio-mem on these changes, there will be a resend.
[1] https://lkml.kernel.org/r/20200429160803.109056-1-david@redhat.com
David Hildenbrand (3):
mm/memory_hotplug: Prepare passing flags to add_memory() and friends
mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
device-dax: Add system ram (add_memory()) with MHP_NO_FIRMWARE_MEMMAP
arch/powerpc/platforms/powernv/memtrace.c | 2 +-
arch/powerpc/platforms/pseries/hotplug-memory.c | 2 +-
drivers/acpi/acpi_memhotplug.c | 2 +-
drivers/base/memory.c | 2 +-
drivers/dax/kmem.c | 3 ++-
drivers/hv/hv_balloon.c | 2 +-
drivers/s390/char/sclp_cmd.c | 2 +-
drivers/xen/balloon.c | 2 +-
include/linux/memory_hotplug.h | 15 ++++++++++++---
mm/memory_hotplug.c | 14 ++++++++------
10 files changed, 29 insertions(+), 17 deletions(-)
--
2.25.3
10 months, 1 week
[PATCH v1 0/3] mm/memory_hotplug: Make virtio-mem play nicely with kexec-tools
by David Hildenbrand
This series is based on [1]:
[PATCH v2 00/10] virtio-mem: paravirtualized memory
That will hopefull get picked up soon, rebased to -next.
The following patches were reverted from -next [2]:
[PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use
As discussed in that thread, they should be reverted from -next already.
In theory, if people agree, we could take the first two patches via the
-mm tree now and the last (virtio-mem) patch via MST's tree once picking up
virtio-mem. No strong feelings.
Memory added by virtio-mem is special and might contain logical holes,
especially after memory unplug, but also when adding memory in
sub-section size. While memory in these holes can usually be read, that
memory should not be touched. virtio-mem managed device memory is never
exposed via any firmware memmap (esp., e820). The device driver will
request to plug memory from the hypervisor and add it to Linux.
On a cold start, all memory is unplugged, and the guest driver will first
request to plug memory from the hypervisor, to then add it to Linux. After
a reboot, all memory will get unplugged (except in rare, special cases). In
case the device driver comes up and detects that some memory is still
plugged after a reboot, it will manually request to unplug all memory from
the hypervisor first - to then request to plug memory from the hypervisor
and add to Linux. This is essentially a defragmentation step, where all
logical holes are removed.
As the device driver is responsible for detecting, adding and managing that
memory, also kexec should treat it like that. It is special. We need a way
to teach kexec-tools to not add that memory to the fixed-up firmware
memmap, to not place kexec images onto this memory, but still allow kdump
to dump it. Add a flag to tell memory hotplug code to
not create /sys/firmware/memmap entries and to indicate it via
"System RAM (driver managed)" in /proc/iomem.
Before this series, kexec_file_load() already did the right thing (for
virtio-mem) by not adding that memory to the fixed-up firmware memmap and
letting the device driver handle it. With this series, also kexec_load() -
which relies on user space to provide a fixed up firmware memmap - does
the right thing with virtio-mem memory.
When the virtio-mem device driver(s) come up, they will request to unplug
all memory from the hypervisor first (esp. defragment), to then request to
plug consecutive memory ranges from the hypervisor, and add them to Linux
- just like on a reboot where we still have memory plugged.
[1] https://lore.kernel.org/r/20200311171422.10484-1-david@redhat.com/
[2] https://lore.kernel.org/r/20200326180730.4754-1-james.morse@arm.com
David Hildenbrand (3):
mm/memory_hotplug: Prepare passing flags to add_memory() and friends
mm/memory_hotplug: Introduce MHP_DRIVER_MANAGED
virtio-mem: Add memory with MHP_DRIVER_MANAGED
arch/powerpc/platforms/powernv/memtrace.c | 2 +-
.../platforms/pseries/hotplug-memory.c | 2 +-
drivers/acpi/acpi_memhotplug.c | 2 +-
drivers/base/memory.c | 2 +-
drivers/dax/kmem.c | 2 +-
drivers/hv/hv_balloon.c | 2 +-
drivers/s390/char/sclp_cmd.c | 2 +-
drivers/virtio/virtio_mem.c | 3 +-
drivers/xen/balloon.c | 2 +-
include/linux/memory_hotplug.h | 15 +++++++--
mm/memory_hotplug.c | 31 +++++++++++++------
11 files changed, 44 insertions(+), 21 deletions(-)
--
2.25.3
10 months, 1 week
[PATCH v2 0/3] Maintainer Entry Profiles
by Dan Williams
Changes since v1 [1]:
- Simplify the profile to a hopefully non-controversial set of
attributes that address the most common sources of contributor
confusion, or maintainer frustration.
- Rename "Subsystem Profile" to "Maintainer Entry Profile". Not every
entry in MAINTAINERS represents a full subsystem. There may be driver
local considerations to communicate to a submitter in addition to wider
subsystem guidelines.
- Delete the old P: tag in MAINTAINERS rather than convert to a new E:
tag (Joe Perches).
[1]: http://lore.kernel.org/r/154225759358.2499188.15268218778137905050.stgit@...
---
At last years Plumbers Conference I proposed the Maintainer Entry
Profile as a document that a maintainer can provide to set contributor
expectations and provide fodder for a discussion between maintainers
about the merits of different maintainer policies.
For those that did not attend, the goal of the Maintainer Entry Profile,
and the Maintainer Handbook more generally, is to provide a desk
reference for maintainers both new and experienced. The session
introduction was:
The first rule of kernel maintenance is that there are no hard and
fast rules. That state of affairs is both a blessing and a curse. It
has served the community well to be adaptable to the different
people and different problem spaces that inhabit the kernel
community. However, that variability also leads to inconsistent
experiences for contributors, little to no guidance for new
contributors, and unnecessary stress on current maintainers. There
are quite a few of people who have been around long enough to make
enough mistakes that they have gained some hard earned proficiency.
However if the kernel community expects to keep growing it needs to
be able both scale the maintainers it has and ramp new ones without
necessarily let them make a decades worth of mistakes to learn the
ropes.
To be clear, the proposed document does not impose or suggest new
rules. Instead it provides an outlet to document the unwritten rules
and policies in effect for each subsystem, and that each subsystem
might decide differently for whatever reason.
---
Dan Williams (3):
MAINTAINERS: Reclaim the P: tag for Maintainer Entry Profile
Maintainer Handbook: Maintainer Entry Profile
libnvdimm, MAINTAINERS: Maintainer Entry Profile
Documentation/maintainer/index.rst | 1
.../maintainer/maintainer-entry-profile.rst | 99 ++++++++++++++++++++
Documentation/nvdimm/maintainer-entry-profile.rst | 64 +++++++++++++
MAINTAINERS | 20 ++--
4 files changed, 175 insertions(+), 9 deletions(-)
create mode 100644 Documentation/maintainer/maintainer-entry-profile.rst
create mode 100644 Documentation/nvdimm/maintainer-entry-profile.rst
10 months, 1 week