[RFC v2 0/3] MP_JOIN handling WIP
by Peter Krystad
Current state of MP_JOIN handling. As I mentioned, this is not
functional yet, but I am sharing it for visibility. The intent of this
patchset is to complete the handshake for a second subflow in the
incoming direction. Sending an outgoing SYN with MP_JOIN and the
data path are not addressed.
v2 - pr_debug() cleanup and single mptcp_established() hook were
addressed in other already-merged patchsets
- split path manager skeleton into its own commit
- cleaned up logic in subflow_syn_recv_sock
- rebase onto Florian's locking change
- namespace for token table NOT addressed
Peter Krystad (3):
mptcp: Add path manager interface
mptcp: Add ADD_ADDR handling
mptcp: Add handling of incoming MP_JOIN requests
include/linux/tcp.h | 12 ++++
include/net/mptcp.h | 28 +++++++-
include/net/tcp.h | 6 ++
net/ipv4/tcp_minisocks.c | 6 ++
net/mptcp/Makefile | 2 +-
net/mptcp/options.c | 150 ++++++++++++++++++++++++++++++++++++---
net/mptcp/pm.c | 67 +++++++++++++++++
net/mptcp/protocol.c | 25 +++++++
net/mptcp/protocol.h | 53 +++++++++++++-
net/mptcp/subflow.c | 54 ++++++++++++--
net/mptcp/token.c | 124 ++++++++++++++++++++++++++++++++
11 files changed, 503 insertions(+), 24 deletions(-)
create mode 100644 net/mptcp/pm.c
--
2.17.2
[PATCH] squash-to: mptcp: Implement MPTCP receive path
by Paolo Abeni
Using an atomic type for this field does not really protect
against concurrent read/modify/update cycles and makes the code less
readable, so move to a plain u64.
We already hold the required lock in the relevant context.
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
net/mptcp/options.c | 2 +-
net/mptcp/protocol.c | 8 ++++----
net/mptcp/protocol.h | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c
index d285a33cd480..92cb0cb48f0c 100644
--- a/net/mptcp/options.c
+++ b/net/mptcp/options.c
@@ -319,7 +319,7 @@ static bool mptcp_established_options_dss(struct sock *sk, struct sk_buff *skb,
msk = mptcp_sk(subflow_ctx(sk)->conn);
if (msk) {
- opts->ext_copy.data_ack = atomic64_read(&msk->ack_seq);
+ opts->ext_copy.data_ack = msk->ack_seq;
} else {
crypto_key_sha1(subflow_ctx(sk)->remote_key, NULL,
&opts->ext_copy.data_ack);
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 941988387ea0..91e913636c45 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -435,7 +435,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
}
ssn = tcp_sk(ssk)->copied_seq - subflow->ssn_offset;
- old_ack = atomic64_read(&msk->ack_seq);
+ old_ack = msk->ack_seq;
if (unlikely(before(ssn, subflow->map_subflow_seq))) {
/* Mapping covers data later in the subflow stream,
@@ -501,7 +501,7 @@ static int mptcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
ack_seq = get_mapped_dsn(subflow);
if (before64(old_ack, ack_seq))
- atomic64_set(&msk->ack_seq, ack_seq);
+ msk->ack_seq = ack_seq;
if (!before(tcp_sk(ssk)->copied_seq - subflow->ssn_offset,
subflow->map_subflow_seq + subflow->map_data_len)) {
@@ -635,7 +635,7 @@ static struct sock *mptcp_accept(struct sock *sk, int flags, int *err,
crypto_key_sha1(msk->remote_key, NULL, &ack_seq);
msk->write_seq = subflow->idsn + 1;
ack_seq++;
- atomic64_set(&msk->ack_seq, ack_seq);
+ msk->ack_seq = ack_seq;
subflow->map_seq = ack_seq;
subflow->map_subflow_seq = 1;
subflow->rel_write_seq = 1;
@@ -760,7 +760,7 @@ void mptcp_finish_connect(struct sock *sk, int mp_capable)
crypto_key_sha1(msk->remote_key, NULL, &ack_seq);
msk->write_seq = subflow->idsn + 1;
ack_seq++;
- atomic64_set(&msk->ack_seq, ack_seq);
+ msk->ack_seq = ack_seq;
subflow->map_seq = ack_seq;
subflow->map_subflow_seq = 1;
subflow->rel_write_seq = 1;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index a4f0e7d3bd62..4bc1a094aad6 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -43,7 +43,7 @@ struct mptcp_sock {
u64 local_key;
u64 remote_key;
u64 write_seq;
- atomic64_t ack_seq;
+ u64 ack_seq;
u32 token;
struct list_head conn_list;
struct socket *subflow; /* outgoing connect/listener/!mp_capable */
--
2.20.1
[PATCH 0/4] Move MPTCP #defines to internal header
by Peter Krystad
Like it says...
Peter Krystad (4):
mptcp: Move TCPOLEN #defines to protocol.h
mptcp: Move TCPOLEN #defines to protocol.h
mptcp: Move TCPOLEN #defines to protocol.h
mptcp: make flag field naming consistent
include/net/tcp.h | 9 ---------
net/mptcp/options.c | 4 ++--
net/mptcp/protocol.h | 16 +++++++++++++---
3 files changed, 15 insertions(+), 14 deletions(-)
--
2.17.2
mptcpd 0.2a released
by Othman, Ossama
The mptcpd 0.2a alpha release is now available on GitHub at:
https://github.com/intel/mptcpd/releases/tag/v0.2a
This Multipath TCP Daemon alpha release replaces support for the
deprecated MPTCP generic netlink API with the one found in the
multipath-tcp.org kernel (0.95+), and has been verified to work with
that kernel.
Work is ongoing to leverage some of the features in the MPTCP generic
netlink API, such as availability of subflow network interface
indices, to help simplify path management plugin implementations. See
the list of mptcpd issues for further details:
https://github.com/intel/mptcpd/issues
[Weekly meetings] MoM - 13th of June 2019
by Matthieu Baerts
Hello,
We just had our 54th meeting with Mat, Peter and Ossama (Intel OTC),
Paolo, Davide and Florian (Red Hat) and myself (Tessares).
Thanks again for this new good meeting!
Here are the minutes of the meeting:
Accepted patches:
- Minor patches [5 out of 7]
- by Peter
- accepted by Paolo
- squashed in existing commits
- mptcp: selftests: increase timeout and return error on ping failure:
- by Florian
- accepted by Paolo
- squashed in "mptcp: selftests: switch to netns+veth based tests"
- Minor patches [Remaining two]:
- by Peter
- v1 and v2 reviewed by Paolo and Mat
- patch 1 had an issue, v3 sent and applied
- MPTCP debug print is an unnecessary change to common TCP code:
- by Mat
- accepted by Matth
- squashed in "mptcp: Implement MPTCP receive path"
Pending patches:
- MP_JOIN handling (v2):
- by Peter
- got some new reviews (Paolo)
- discussions, see below
- Change IPPROTO_MPTCP :
- by Mat
- discussions, see below
- net: extend INET_DIAG_INFO with information specific to TCP ULP:
- by Davide
- proposed to net-next (RFC)
- got some reviews from Mellanox and Netronome people
- no need to wait for MPTCP patches, it should be accepted in
net-next before (except the specific MPTCP patches of course)
- WIP, look at how kTLS is working.
- remove atomic64_read():
- by Paolo
- waiting for review
- using an atomic type for this field does not really protect
against concurrent read/modify/update cycles and makes the code less
readable, so move to a plain u64.
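To illustrate the point about atomics above, here is a minimal userspace sketch, not the kernel code. The names `struct conn` and `conn_advance_ack()` are hypothetical, and a pthread mutex stands in for the socket lock. An atomic read followed by an atomic set is still two separate steps, so the compare-then-update sequence is only safe when a lock covers both; once that lock is held, a plain 64-bit field is sufficient.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the connection-level state; not the kernel struct. */
struct conn {
	pthread_mutex_t lock;	/* plays the role of the socket lock */
	uint64_t ack_seq;	/* plain integer, only accessed with lock held */
};

/* Wrap-safe "a comes before b" comparison for 64-bit sequence numbers. */
static bool before64(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Move ack_seq forward to new_seq, never backward. The read, the
 * comparison and the write all happen under the same lock, so no
 * other thread can update ack_seq in between.
 */
static void conn_advance_ack(struct conn *c, uint64_t new_seq)
{
	pthread_mutex_lock(&c->lock);
	if (before64(c->ack_seq, new_seq))
		c->ack_seq = new_seq;
	pthread_mutex_unlock(&c->lock);
}
```

With separate atomic read and set calls and no lock, two threads could both read the old value and the later, smaller write could overwrite the larger one.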
MP_JOIN support:
- Paolo had some questions about the checks that have been added:
some may look useless, or may have been misinterpreted
- Sounds good to apply them like that
- will be applied tomorrow, a v3 is possible if needed
- we can improve/fix stuff later on
RFC patches to Eric:
- Cover Letter:
- https://lists.01.org/pipermail/mptcp/2019-May/001233.html
- or see below
- we removed the fact that MP_JOIN was missing
- we could mention that MP_JOIN work is still in progress
- we should explain why coupled recv win are needed (RFC)
- To whom?
- also add net-next in cc
- unless we don't want too many varied comments for
the moment? And we could send it to net-next a bit later?
- What to send?
- everything including MP_JOIN + checkpatch
- When?
- Monday evening in EU
- By whom?
- *@Mat* will send them
- with our ML (or our emails) in cc
Cover letter draft for Eric:
The MPTCP upstreaming community has prepared a net-next RFC
patch set for review:
https://github.com/multipath-tcp/mptcp_net-next/tree/export
With CONFIG_MPTCP=y, a socket created with IPPROTO_MPTCP will
attempt to create an MPTCP connection but remains compatible with
regular TCP. IPPROTO_TCP socket behavior is unchanged.
This implementation makes use of ULP between the
userspace-facing MPTCP socket and the set of in-kernel TCP sockets it
controls. ULP has been extended for use with listening sockets. skb_ext
is used to carry MPTCP metadata.
The patch set includes a self-test to exercise MPTCP in various
connection and routing scenarios.
We have more work to do to reach the initial feature set for
merging, notably:
* Finish MP_JOIN work
* Coupling receive windows across sibling subflow TCP sockets →
we should mention that this is part of RFC 6824, not just a fancy
feature we want to add :)
* IPv6
* Not exposing the subflow ULP type to userspace
Thank you for your review. You can find us at mptcp(a)lists.01.org
and https://is.gd/mptcp_upstream
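As an illustration of the userspace view described in the draft, the sketch below opens a stream socket with IPPROTO_MPTCP and falls back to IPPROTO_TCP when the kernel rejects it. The helper name `open_stream_socket()` is hypothetical. The fallback value 262 is an assumption for this thread: it is the number mainline Linux later adopted, while the value was still under discussion here.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Assumption: 262 is the value mainline Linux later assigned. At the
 * time of this thread the number was still being discussed.
 */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/*
 * Hypothetical helper: try an MPTCP socket first and fall back to
 * plain TCP if the kernel does not support it. Sets *is_mptcp to 1 or
 * 0 accordingly. Returns the file descriptor, or -1 if both fail.
 */
static int open_stream_socket(int *is_mptcp)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd >= 0) {
		*is_mptcp = 1;
		return fd;
	}

	*is_mptcp = 0;
	return socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
}
```

Existing applications that keep passing IPPROTO_TCP are unaffected, which matches the statement that IPPROTO_TCP socket behavior is unchanged.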
LPC:
- Draft sent:
- https://lists.01.org/pipermail/mptcp/2019-June/001313.html
- see below
- What to send? → Waiting for review:
- that should certainly be a different presentation than the one
from Netdev 0x12
- it could be nice to say that the modifications we make are also
interesting to others (skb ext, ULP, etc.)
- we could pause during the presentation to ask the audience
questions → might be interesting for us (architectural decisions) →
*@Paolo* can look at examples from last LPC
- When?
→ might be better not to wait until the last moment.
→ max in 2 weeks?
→ *@all* : please review and improve what has been sent on the ML
- Who? To be defined in less than 2 weeks
Multipath TCP (MPTCP) is increasingly popular, but it is not yet in
the upstream Linux kernel. A fork has been maintained on the side since
March 2009. It cannot be upstreamed as is, because that implementation
is designed specifically for MPTCP and affects the TCP stack too heavily
in terms of maintainability and, to a lesser extent, performance.
In this presentation, we would like to present the challenges we are
facing. Some are introduced by the MPTCP protocol itself, others by the
objective we set: minimize the impact on the existing TCP stack.
We would like to have no performance regression, a maintainable and
configurable solution, and an MPTCP implementation that can be used in a
variety of deployments.
The MPTCP upstreaming community is working on an RFC patch set for
net-next. We should be able to send it before the next LPC in September.
Currently, a socket can be created with IPPROTO_MPTCP to
initiate and accept an MPTCP connection. This socket remains compatible
with regular TCP, and IPPROTO_TCP socket behavior is unchanged. This
implementation makes use of ULP between the userspace-facing MPTCP
socket and the set of in-kernel TCP sockets it controls, to minimize
the impact on the current TCP stack. ULP has been extended for use
with listening sockets. skb_ext is used to carry MPTCP metadata.
Both the communication and the code are public and open. You can
find us at mptcp(a)lists.01.org and https://is.gd/mptcp_upstream
MPTCP-API:
- https://lists.01.org/pipermail/mptcp/2019-June/001339.html
- being able, from userspace and without kernel headers, to identify:
- which MPTCP implementation it is (mptcp.org vs upstream)
- which MPTCP API version it is (for the future)
- iana number:
https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
- can we ask to reserve a number even if it will never be visible on
the wire? We could have something unified across OSes and something
official for libc.
- What we can do:
- apply Mat's patch → *@Matth*
- ask at netconf *@Paolo* / *@Florian*
- ask IETF group → discussions with IANA/IESG maybe? *@Matth*
mptcpd:
- mptcpd 0.2a: Alpha release!
- https://github.com/intel/mptcpd/releases/tag/v0.2a
→ mainly support for NL PM from mptcp.org
→ more work in progress
Misc:
- MPTCP at WWDC: https://developer.apple.com/videos/play/wwdc2019/712/
- more MPTCP traffic
- SIGCOMM Networking Systems Award 2019, congratulations!
- https://lwn.net/Articles/790235/ → KUnit is coming, a new test
framework
DataFin:
- we can reuse the TCP FIN-WAITx states for the MPTCP socket
Next meeting:
- We propose to have it next Thursday, the 20th of June.
- Florian and Paolo will not be able to join but will talk about
MPTCP at Netconf
- Usual time: 16:00 UTC (9am PDT, 6pm CEST)
- Still open to everyone!
- https://annuel2.framapad.org/p/mptcp_upstreaming_20190620
Feel free to comment on these points and propose new ones for the next
meeting!
Talk to you next week,
Matthieu
--
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium
MPTCP-API
by Christoph Paasch
Hello everyone,
I am bringing up something that we already once discussed quite a while
back. Apache Traffic Server has an option to enable MPTCP (currently,
through the socket-option MPTCP_ENABLED that is exposed on the
multipath-tcp.org kernel). Leif (in CC) is the main contributor & maintainer
of ATS.
Now, ATS will be doing a 9.0.0-release soon that will be the first release
with this MPTCP-option in the ATS-config.
We want to make sure that it would also work with the upcoming upstream MPTCP
implementation. So, we can easily make this use IPPROTO_MPTCP. However,
there currently isn't a 100% guarantee that the API is stable (as upstream
might request changes).
So, we are wondering how we can do binary compatibility without having to
wait for MPTCP to land upstream. Could the MPTCP-code expose maybe a sysctl
that indicates the "API-version" ? That way, upon startup ATS can simply
check the API-version and if it doesn't match, just disable MPTCP.
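A startup check along these lines could look like the sketch below. Everything kernel-facing in it is hypothetical: no such sysctl exists, and the path `/proc/sys/net/mptcp/api_version` and the expected version `1` are placeholders chosen only to make the idea concrete.

```c
#include <stdio.h>

/* Hypothetical sysctl path and version number; neither exists in any kernel. */
#define MPTCP_API_VERSION_PATH "/proc/sys/net/mptcp/api_version"
#define EXPECTED_MPTCP_API_VERSION 1

/* Read an integer version from path. Returns -1 if the file is missing or unparsable. */
static int read_api_version(const char *path)
{
	FILE *f = fopen(path, "r");
	int version;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &version) != 1)
		version = -1;
	fclose(f);
	return version;
}

/*
 * Returns 1 only if the kernel reports exactly the API version the
 * application was built against, and 0 otherwise, in which case the
 * application should disable MPTCP and use plain TCP.
 */
static int mptcp_api_usable(const char *path, int expected)
{
	int version = read_api_version(path);

	return version >= 0 && version == expected;
}
```

An application would call `mptcp_api_usable(MPTCP_API_VERSION_PATH, EXPECTED_MPTCP_API_VERSION)` once at startup. A missing file is treated the same as a mismatch, so kernels without the sysctl simply run without MPTCP.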
What are your thoughts?
Thanks,
Christoph
[PATCH] squash-to: Implement MPTCP receive path
by Mat Martineau
This MPTCP debug print is an unnecessary change to common TCP code, and
it's best to get rid of it before upstream review.
Signed-off-by: Mat Martineau <mathew.j.martineau(a)linux.intel.com>
---
net/ipv4/tcp_output.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 08675fd0f070..a850686b8e2a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3707,9 +3707,6 @@ void __tcp_send_ack(struct sock *sk, u32 rcv_nxt)
skb_set_tcp_pure_ack(buff);
/* Send it off, this clears delayed acks for us. */
- if (sk_is_mptcp(sk))
- pr_debug("mptcp sk=%p", sk);
-
__tcp_transmit_skb(sk, buff, 0, (__force gfp_t)0, rcv_nxt);
}
EXPORT_SYMBOL_GPL(__tcp_send_ack);
--
2.22.0
[PATCH v2 0/2] Minor patches [Remaining two]
by Peter Krystad
Responding to comments and rebase.
v2 - fix mp_opt usage and remove changes to mptcp_write_options logic.
Peter Krystad (2):
mptcp: reduce number of pr_debug() calls.
mptcp: Refactor mptcp_established_options() to single hook
include/net/mptcp.h | 21 ++++---------
net/ipv4/tcp_output.c | 20 ++++---------
net/mptcp/options.c | 70 +++++++++++++++++++++++++++++--------------
3 files changed, 58 insertions(+), 53 deletions(-)
--
2.17.2
[PATCH 0/2] Minor patches [Remaining two]
by Peter Krystad
Responding to comments and rebase.
Peter Krystad (2):
mptcp: reduce number of pr_debug() calls.
mptcp: Refactor mptcp_established_options() to single hook
include/net/mptcp.h | 21 ++++---------
net/ipv4/tcp_output.c | 20 +++---------
net/mptcp/options.c | 73 ++++++++++++++++++++++++++++---------------
3 files changed, 59 insertions(+), 55 deletions(-)
--
2.17.2