[PATCH V4 0/4] Fix kvm misconceives NVDIMM pages as reserved mmio
by Zhang Yi
For device-specific memory ranges, when we move such a range of pfns into
a memory zone, we set the page reserved flag at that time. Some of these
reserved pages back device MMIO, but some do not, such as NVDIMM pmem.
Now, we map these dev_dax or fs_dax pages into KVM as a DIMM/NVDIMM
backend. Since these pages are reserved, the kvm_is_reserved_pfn()
check misclassifies them as MMIO. Therefore, we introduce two page map
types, MEMORY_DEVICE_FS_DAX/MEMORY_DEVICE_DEV_DAX, to identify pages
that come from NVDIMM pmem, and let KVM treat them as normal pages.
Without this patch, many operations are missed because pmem pages are
mistreated this way. For example, a page may never get the chance to be
unpinned for a KVM guest (in kvm_release_pfn_clean), and cannot be
marked as dirty/accessed (in kvm_set_pfn_dirty/accessed), etc.
V1:
https://lkml.org/lkml/2018/7/4/91
V2:
https://lkml.org/lkml/2018/7/10/135
V3:
https://lkml.org/lkml/2018/8/9/17
V4:
[PATCH V3 1/4] Added "Reviewed-by: David / Acked-by: Pankaj"
[PATCH V3 2/4] Added "Reviewed-by: Jan"
[PATCH V3 3/4] Added "Acked-by: Jan"
[PATCH V3 4/4] Fix several typos
Zhang Yi (4):
kvm: remove redundant reserved page check
mm: introduce memory type MEMORY_DEVICE_DEV_DAX
mm: add a function to differentiate the pages is from DAX device
memory
kvm: add a check if pfn is from NVDIMM pmem.
drivers/dax/pmem.c | 1 +
include/linux/memremap.h | 8 ++++++++
include/linux/mm.h | 12 ++++++++++++
virt/kvm/kvm_main.c | 16 ++++++++--------
4 files changed, 29 insertions(+), 8 deletions(-)
--
2.7.4
2 years, 4 months
[PATCH v2] libnvdimm, pfn: during init, clear errors in the metadata area
by Vishal Verma
If there are badblocks present in the 'struct page' area for pfn
namespaces, until now, the only way to clear them has been to force the
namespace into raw mode, clear the errors, and re-enable the fsdax mode.
This is clunky, given that it should be easy enough for the pfn driver
to do the same.
Add a new helper that uses the most recently available badblocks list to
check whether there are any badblocks that lie in the volatile struct
page area. If so, before initializing the struct pages, send down
targeted writes via nvdimm_write_bytes to write zeroes to the affected
blocks, and thus clear errors.
Cc: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
drivers/nvdimm/pfn_devs.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 60 insertions(+), 1 deletion(-)
v2:
- Rename to nd_pfn_clear_memmap_errors() (Dan)
- Move the mode check into the clearing function (Dan)
- Use page aligned chunks to clear errors (Dan)
- make bb_present an int (Dan)
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 3f7ad5bc443e..24c64090169e 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -361,6 +361,65 @@ struct device *nd_pfn_create(struct nd_region *nd_region)
return dev;
}
+/*
+ * nd_pfn_clear_memmap_errors() clears any errors in the volatile memmap
+ * space associated with the namespace. If the memmap is set to DRAM, then
+ * this is a no-op. Since the memmap area is freshly initialized during
+ * probe, we have an opportunity to clear any badblocks in this area.
+ */
+static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn)
+{
+ struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent);
+ struct nd_namespace_common *ndns = nd_pfn->ndns;
+ void *zero_page = page_address(ZERO_PAGE(0));
+ struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb;
+ int num_bad, meta_num, rc, bb_present;
+ sector_t first_bad, meta_start;
+ struct nd_namespace_io *nsio;
+
+ if (nd_pfn->mode != PFN_MODE_PMEM)
+ return 0;
+
+ nsio = to_nd_namespace_io(&ndns->dev);
+ meta_start = (SZ_4K + sizeof(*pfn_sb)) >> 9;
+ meta_num = (le64_to_cpu(pfn_sb->dataoff) >> 9) - meta_start;
+
+ do {
+ unsigned long zero_len;
+ u64 nsoff;
+
+ bb_present = badblocks_check(&nd_region->bb, meta_start,
+ meta_num, &first_bad, &num_bad);
+ if (bb_present) {
+ dev_dbg(&nd_pfn->dev, "meta: %x badblocks at %lx\n",
+ num_bad, first_bad);
+ nsoff = ALIGN_DOWN((nd_region->ndr_start
+ + (first_bad << 9)) - nsio->res.start,
+ PAGE_SIZE);
+ zero_len = ALIGN(num_bad << 9, PAGE_SIZE);
+ while (zero_len) {
+ unsigned long chunk = min(zero_len, PAGE_SIZE);
+
+ rc = nvdimm_write_bytes(ndns, nsoff, zero_page,
+ chunk, 0);
+ if (rc)
+ break;
+
+ zero_len -= chunk;
+ nsoff += chunk;
+ }
+ if (rc) {
+ dev_err(&nd_pfn->dev,
+ "error clearing %x badblocks at %lx\n",
+ num_bad, first_bad);
+ return rc;
+ }
+ }
+ } while (bb_present);
+
+ return 0;
+}
+
int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
{
u64 checksum, offset;
@@ -477,7 +536,7 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
return -ENXIO;
}
- return 0;
+ return nd_pfn_clear_memmap_errors(nd_pfn);
}
EXPORT_SYMBOL(nd_pfn_validate);
--
2.14.4
2 years, 4 months
[PATCH] libnvdimm, pfn: during init, clear errors in the metadata area
by Vishal Verma
If there are badblocks present in the 'struct page' area for pfn
namespaces, until now, the only way to clear them has been to force the
namespace into raw mode, clear the errors, and re-enable the fsdax mode.
This is clunky, given that it should be easy enough for the pfn driver
to do the same.
Add a new helper that uses the most recently available badblocks list to
check whether there are any badblocks that lie in the volatile struct
page area. If so, before initializing the struct pages, send down
targeted writes via nvdimm_write_bytes to write zeroes to the affected
blocks, and thus clear errors.
Cc: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
drivers/nvdimm/pfn_devs.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 57 insertions(+)
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 3f7ad5bc443e..04b341758cd6 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -361,8 +361,59 @@ struct device *nd_pfn_create(struct nd_region *nd_region)
return dev;
}
+static int nd_pfn_clear_meta_errors(struct nd_pfn *nd_pfn)
+{
+ struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent);
+ struct nd_namespace_common *ndns = nd_pfn->ndns;
+ void *zero_page = page_address(ZERO_PAGE(0));
+ struct nd_pfn_sb *pfn_sb = nd_pfn->pfn_sb;
+ sector_t first_bad, meta_start;
+ struct nd_namespace_io *nsio;
+ int num_bad, meta_num, rc;
+ bool bb_present;
+
+ nsio = to_nd_namespace_io(&ndns->dev);
+ meta_start = (SZ_4K + sizeof(*pfn_sb)) >> 9;
+ meta_num = (le64_to_cpu(pfn_sb->dataoff) >> 9) - meta_start;
+
+ do {
+ unsigned long zero_len;
+ u64 nsoff;
+
+ bb_present = !!badblocks_check(&nd_region->bb, meta_start,
+ meta_num, &first_bad, &num_bad);
+ if (bb_present) {
+ dev_dbg(&nd_pfn->dev, "meta: %x badblocks at %lx\n",
+ num_bad, first_bad);
+ nsoff = (nd_region->ndr_start + (first_bad << 9)) -
+ nsio->res.start;
+ zero_len = num_bad << 9;
+ while (zero_len) {
+ unsigned long chunk = min(zero_len, PAGE_SIZE);
+
+ rc = nvdimm_write_bytes(ndns, nsoff, zero_page,
+ chunk, 0);
+ if (rc)
+ break;
+
+ zero_len -= chunk;
+ nsoff += chunk;
+ }
+ if (rc) {
+ dev_err(&nd_pfn->dev,
+ "error clearing %x badblocks at %lx\n",
+ num_bad, first_bad);
+ return rc;
+ }
+ }
+ } while (bb_present);
+
+ return 0;
+}
+
int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
{
+ int rc;
u64 checksum, offset;
enum nd_pfn_mode mode;
struct nd_namespace_io *nsio;
@@ -477,6 +528,12 @@ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
return -ENXIO;
}
+ if (mode == PFN_MODE_PMEM) {
+ rc = nd_pfn_clear_meta_errors(nd_pfn);
+ if (rc)
+ return rc;
+ }
+
return 0;
}
EXPORT_SYMBOL(nd_pfn_validate);
--
2.14.4
2 years, 4 months
[ndctl PATCH] ndctl, test: add a new unit test for pfn metadata error clearing
by Vishal Verma
The pfn driver lacked a way to clear badblocks in the volatile struct
page area, but this is expected to be fixed for v4.20.
Add a unit test that creates an fsdax namespace, forces it to raw mode,
injects errors to the metadata area, and converts it back to fsdax.
For a kernel with the error clearing improvements, this will clear the
injected errors, but if the kernel changes are missing, the errors will
stay intact. At this point, convert the namespace back to raw mode and
check for the presence of badblocks.
Cc: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma(a)intel.com>
---
test/Makefile.am | 3 +-
test/pfn-meta-errors.sh | 74 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 76 insertions(+), 1 deletion(-)
create mode 100755 test/pfn-meta-errors.sh
diff --git a/test/Makefile.am b/test/Makefile.am
index 50bb2e4..ebdd23f 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -24,7 +24,8 @@ TESTS =\
rescan-partitions.sh \
inject-smart.sh \
monitor.sh \
- max_available_extent_ns.sh
+ max_available_extent_ns.sh \
+ pfn-meta-errors.sh
check_PROGRAMS =\
libndctl \
diff --git a/test/pfn-meta-errors.sh b/test/pfn-meta-errors.sh
new file mode 100755
index 0000000..2b57f19
--- /dev/null
+++ b/test/pfn-meta-errors.sh
@@ -0,0 +1,74 @@
+#!/bin/bash -Ex
+
+# SPDX-License-Identifier: GPL-2.0
+# Copyright(c) 2018 Intel Corporation. All rights reserved.
+
+blockdev=""
+rc=77
+
+. ./common
+
+force_raw()
+{
+ raw="$1"
+ $NDCTL disable-namespace "$dev"
+ echo "$raw" > "/sys/bus/nd/devices/$dev/force_raw"
+ $NDCTL enable-namespace "$dev"
+ echo "Set $dev to raw mode: $raw"
+ if [[ "$raw" == "1" ]]; then
+ raw_bdev=${blockdev}
+ test -b "/dev/$raw_bdev"
+ else
+ raw_bdev=""
+ fi
+}
+
+check_min_kver "4.20" || do_skip "may lack PFN metadata error handling"
+
+set -e
+trap 'err $LINENO' ERR
+
+# setup (reset nfit_test dimms)
+modprobe nfit_test
+$NDCTL disable-region -b $NFIT_TEST_BUS0 all
+$NDCTL zero-labels -b $NFIT_TEST_BUS0 all
+$NDCTL enable-region -b $NFIT_TEST_BUS0 all
+
+rc=1
+
+# create a fsdax namespace and clear errors (if any)
+dev="x"
+json=$($NDCTL create-namespace -b $NFIT_TEST_BUS0 -t pmem -m fsdax)
+eval "$(echo "$json" | json2var)"
+[ $dev = "x" ] && echo "fail: $LINENO" && exit 1
+
+force_raw 1
+if read -r sector len < "/sys/block/$raw_bdev/badblocks"; then
+ dd of=/dev/$raw_bdev if=/dev/zero oflag=direct bs=512 seek="$sector" count="$len"
+fi
+force_raw 0
+
+# find dataoff from sb
+force_raw 1
+doff=$(hexdump -s $((4096 + 56)) -n 4 "/dev/$raw_bdev" | head -1 | cut -d' ' -f2-)
+doff=$(tr -d ' ' <<< "0x${doff#* }${doff%% *}")
+printf "pfn dataoff: %x\n" "$doff"
+dblk="$((doff/512))"
+
+metaoff="0x2000"
+mblk="$((metaoff/512))"
+
+# inject in the middle of the struct page area
+bb_inj=$(((dblk - mblk)/2))
+$NDCTL inject-error --block="$bb_inj" --count=32 $dev
+$NDCTL start-scrub && $NDCTL wait-scrub
+
+# after probe from the enable-namespace, the error should've been cleared
+force_raw 0
+force_raw 1
+if read -r sector len < "/sys/block/$raw_bdev/badblocks"; then
+ false
+fi
+
+_cleanup
+exit 0
--
2.14.4
2 years, 4 months
[ndctl PATCH v3] ndctl: Introduce dirty-dimm command
by Dan Williams
Some DIMMs provide a facility to track dirty-shutdown events. The
counter only rolls forward after the OS sets a latch. This allows the
agent tracking dirty shutdowns to ignore events that occur while the
capacity has not been written. For these DIMMs, dirty-dimm will trigger
the counter to roll to the next state. The shutdown state can be
retrieved with 'ndctl list -DH'.
Cc: Keith Busch <keith.busch(a)intel.com>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
Changes since v2:
* report whether ndctl_cmd_submit() succeeded (vishal)
Documentation/ndctl/Makefile.am | 1 +
Documentation/ndctl/ndctl-dirty-dimm.txt | 29 +++++++++++++++++++++++++++
builtin.h | 1 +
ndctl/dimm.c | 32 ++++++++++++++++++++++++++++++
ndctl/ndctl.c | 1 +
5 files changed, 64 insertions(+)
create mode 100644 Documentation/ndctl/ndctl-dirty-dimm.txt
diff --git a/Documentation/ndctl/Makefile.am b/Documentation/ndctl/Makefile.am
index a30b139ba3a3..1a826bb001be 100644
--- a/Documentation/ndctl/Makefile.am
+++ b/Documentation/ndctl/Makefile.am
@@ -38,6 +38,7 @@ man1_MANS = \
ndctl-disable-region.1 \
ndctl-enable-dimm.1 \
ndctl-disable-dimm.1 \
+ ndctl-dirty-dimm.1 \
ndctl-enable-namespace.1 \
ndctl-disable-namespace.1 \
ndctl-create-namespace.1 \
diff --git a/Documentation/ndctl/ndctl-dirty-dimm.txt b/Documentation/ndctl/ndctl-dirty-dimm.txt
new file mode 100644
index 000000000000..02a30fe85f53
--- /dev/null
+++ b/Documentation/ndctl/ndctl-dirty-dimm.txt
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+
+ndctl-dirty-dimm(1)
+===================
+
+NAME
+----
+ndctl-dirty-dimm - set dimm to record the next dirty shutdown event
+
+SYNOPSIS
+--------
+[verse]
+'ndctl dirty-dimm <nmem0> [<nmem1>..<nmemN>]'
+
Some NVDIMMs have the capability to detect 'flush failed' events whereby
data that is pending in buffers at the time of system power loss fails to
be flushed out to media. Some DIMMs go further and count how many times
such fatal events occur, but only roll the count in response to a latch
being set. The 'dirty-dimm' command sets this latch on those devices and
is meant to be called in advance of any writes to media.
+
+OPTIONS
+-------
+<nmem>::
+include::xable-dimm-options.txt[]
+
+SEE ALSO
+--------
+http://pmem.io/documents/NVDIMM_DSM_Interface-V1.7.pdf[NVDIMM DSM Interface]
diff --git a/builtin.h b/builtin.h
index 675a6ce79b9c..1157243cdf60 100644
--- a/builtin.h
+++ b/builtin.h
@@ -48,4 +48,5 @@ int cmd_bat(int argc, const char **argv, void *ctx);
#endif
int cmd_update_firmware(int argc, const char **argv, void *ctx);
int cmd_inject_smart(int argc, const char **argv, void *ctx);
+int cmd_dirty_dimm(int argc, const char **argv, void *ctx);
#endif /* _NDCTL_BUILTIN_H_ */
diff --git a/ndctl/dimm.c b/ndctl/dimm.c
index a4203f354000..2d658dab34a5 100644
--- a/ndctl/dimm.c
+++ b/ndctl/dimm.c
@@ -61,6 +61,23 @@ static int action_zero(struct ndctl_dimm *dimm, struct action_context *actx)
return ndctl_dimm_zero_labels(dimm);
}
+static int action_dirty(struct ndctl_dimm *dimm, struct action_context *actx)
+{
+ struct ndctl_cmd *cmd;
+ int rc;
+
+ cmd = ndctl_dimm_cmd_new_ack_shutdown_count(dimm);
+ if (!cmd) {
+ fprintf(stderr, "%s: 'dirty-dimm' not supported\n",
+ ndctl_dimm_get_devname(dimm));
+ return -EOPNOTSUPP;
+ }
+
+ rc = ndctl_cmd_submit(cmd);
+ ndctl_cmd_unref(cmd);
+ return rc;
+}
+
static struct json_object *dump_label_json(struct ndctl_dimm *dimm,
struct ndctl_cmd *cmd_read, ssize_t size)
{
@@ -943,6 +960,11 @@ static const struct option update_options[] = {
OPT_END(),
};
+static const struct option dirty_options[] = {
+ BASE_OPTIONS(),
+ OPT_END(),
+};
+
static const struct option base_options[] = {
BASE_OPTIONS(),
OPT_END(),
@@ -1181,3 +1203,13 @@ int cmd_update_firmware(int argc, const char **argv, void *ctx)
count > 1 ? "s" : "");
return count >= 0 ? 0 : EXIT_FAILURE;
}
+
+int cmd_dirty_dimm(int argc, const char **argv, void *ctx)
+{
+ int count = dimm_action(argc, argv, ctx, action_dirty, dirty_options,
+ "ndctl dirty-dimm <nmem0> [<nmem1>..<nmemN>] [<options>]");
+
+ fprintf(stderr, "dirtied %d nmem%s.\n", count >= 0 ? count : 0,
+ count > 1 ? "s" : "");
+ return count >= 0 ? 0 : EXIT_FAILURE;
+}
diff --git a/ndctl/ndctl.c b/ndctl/ndctl.c
index 73dabfac3908..93d0d88a274c 100644
--- a/ndctl/ndctl.c
+++ b/ndctl/ndctl.c
@@ -83,6 +83,7 @@ static struct cmd_struct commands[] = {
{ "write-labels", cmd_write_labels },
{ "init-labels", cmd_init_labels },
{ "check-labels", cmd_check_labels },
+ { "dirty-dimm", cmd_dirty_dimm },
{ "inject-error", cmd_inject_error },
{ "update-firmware", cmd_update_firmware },
{ "inject-smart", cmd_inject_smart },
2 years, 4 months
[ndctl PATCH v2 0/3] ndctl: Remove udev rule for latch and dirty-shutdown-count
by Dan Williams
Changes since v1:
* Add an error message for the not supported case (Vishal)
* Asciidoctor format fixup
---
The latch needs to be coordinated with writes to the namespace and that
makes it not suitable as a dimm-add-event udev rule.
Additionally, the dirty-shutdown-count is something that can live in
sysfs alongside the other health state flags in /sys/.../nmemX/nfit.
Otherwise, calling any of the libndctl apis from a udev script means the
entire udev queue can get blocked behind a ndctl_bus_wait_probe() call.
That's too much overhead for a non-default policy.
In other words, while the latch event remains in userspace as close as
possible to the application that wants to manage dimm-flush-failed
policy as anything other than fatal, dirty-shutdown-count caching should
move to sysfs where it is cheap to implement and complements the
flush-failed health state flag.
---
Dan Williams (3):
ndctl: Introduce dirty-dimm command
ndctl: Revert "ndctl, intel: Fallback to smart cached shutdown_count"
ndctl: Revert "ndctl: Create ndctl udev rules for dirty shutdown"
.gitignore | 1
Documentation/ndctl/Makefile.am | 1
Documentation/ndctl/ndctl-dirty-dimm.txt | 29 ++++++
Makefile.am | 3 -
builtin.h | 1
configure.ac | 10 --
contrib/80-ndctl.rules | 3 -
ndctl.spec.in | 3 -
ndctl/Makefile.am | 5 -
ndctl/dimm.c | 31 ++++++
ndctl/lib/intel.c | 41 --------
ndctl/lib/libndctl.c | 6 -
ndctl/lib/private.h | 3 -
ndctl/ndctl-udev.c | 150 ------------------------------
ndctl/ndctl.c | 1
15 files changed, 64 insertions(+), 224 deletions(-)
create mode 100644 Documentation/ndctl/ndctl-dirty-dimm.txt
delete mode 100644 contrib/80-ndctl.rules
delete mode 100644 ndctl/ndctl-udev.c
2 years, 4 months
[ndctl PATCH 0/3] ndctl: Remove udev rule for latch and dirty-shutdown-count
by Dan Williams
The latch needs to be coordinated with writes to the namespace and that
makes it not suitable as a dimm-add-event udev rule.
Additionally, the dirty-shutdown-count is something that can live in
sysfs alongside the other health state flags in /sys/.../nmemX/nfit.
Otherwise, calling any of the libndctl apis from a udev script means the
entire udev queue can get blocked behind a ndctl_bus_wait_probe() call.
That's too much overhead for a non-default policy.
In other words, while the latch event remains in userspace as close as
possible to the application that wants to manage dimm-flush-failed
policy as anything other than fatal, dirty-shutdown-count caching should
move to sysfs where it is cheap to implement and complements the
flush-failed health state flag.
---
Dan Williams (3):
ndctl: Introduce dirty-dimm command
ndctl: Revert "ndctl, intel: Fallback to smart cached shutdown_count"
ndctl: Revert "ndctl: Create ndctl udev rules for dirty shutdown"
.gitignore | 1
Documentation/ndctl/Makefile.am | 1
Documentation/ndctl/ndctl-dirty-dimm.txt | 29 ++++++
Makefile.am | 3 -
builtin.h | 1
configure.ac | 10 --
contrib/80-ndctl.rules | 3 -
ndctl.spec.in | 3 -
ndctl/Makefile.am | 5 -
ndctl/dimm.c | 28 ++++++
ndctl/lib/intel.c | 41 --------
ndctl/lib/libndctl.c | 6 -
ndctl/lib/private.h | 3 -
ndctl/ndctl-udev.c | 150 ------------------------------
ndctl/ndctl.c | 1
15 files changed, 61 insertions(+), 224 deletions(-)
create mode 100644 Documentation/ndctl/ndctl-dirty-dimm.txt
delete mode 100644 contrib/80-ndctl.rules
delete mode 100644 ndctl/ndctl-udev.c
2 years, 4 months