13 changes: 7 additions & 6 deletions README.md
@@ -9,8 +9,8 @@ Namely:
[iproute2 repository](https://github.com/L4STeam/iproute2)
- An implementation of Accurate ECN (see branch AccECN-full)
- The base implementation of TCP Prague (see branch tcp_prague)
-- ECT(1) enabled DCTCP
-- ECT(1) enabled BBR v2 (from v2alpha branch in
+- ECT(1) and AccECN enabled DCTCP
+- ECT(1) and AccECN enabled BBR v2 (from v2alpha branch in
Collaborator

Neither DCTCP nor BBRv2 is "AccECN enabled" per se.

All CCAs will use AccECN if it is toggled, and will not notice a difference, since it preserves the internal semantics of FLAG_ECE, e.g.,

The text could however reflect that BBRv2 only uses ECT(1) if AccECN is enabled and negotiated.

Contributor Author

OK, before writing the README, I think we need to agree on what is meant to happen with the various CCs, the various feedback (f/b) sysctl settings, and the various negotiated f/b outcomes.

For DCTCP:

How about:

# CC module  sysctl  Negot'd f/b  | F/b used  ECT used  CC used
___________________________________________________________________
1 dctcp      2       Classic      | Sticky    0         dctcp
2 dctcp      2       No-ECN       | No-ECN    n/a       Default (Cubic)
3 dctcp      3       AccECN       | AccECN    1         dctcp
4 dctcp      3       Classic      | Classic   0         Default (Cubic)
5 dctcp      3       No-ECN       | No-ECN    n/a       Default (Cubic)

Rationale:

  1. By loading dctcp, but not enabling accecn, it can be assumed the admin wants to use DCTCP in the original way (in a DC where the f/b is arranged by configuration, not negotiation). Not sure about ECT - see discussion below.
  2. I believe if DCTCP cannot negotiate ECN, it switches to a different CC, and I assume it switches to the Linux default (Cubic).
  3. Straightforward
  4. The admin intended to use accecn feedback, but the other end doesn't support accecn, so we have to assume the remote peer supports classic ECN f/b (there is no reason to assume it understands sticky f/b). But dctcp doesn't work with Classic ECN f/b, or with no ECN. So we have to switch to a CC that works with the available f/b (modelled on case 2).
  5. Similar rationale to 4.

In case 1, ECT0 is proposed for backward compatibility.
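
To make the DCTCP proposal easier to check row by row, here is a minimal Python sketch of the table above. It is only a model of the proposal, not kernel code: `Outcome`, `dctcp_outcome` and `FALLBACK_CC` are hypothetical names, the strings are the table's own column values, and Cubic as the fallback is the assumption stated in rationale 2.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    feedback: str  # "F/b used" column
    ect: str       # "ECT used" column: "0", "1" or "n/a"
    cc: str        # "CC used" column


# Assumption from rationale 2: the fallback is the Linux default (Cubic).
FALLBACK_CC = "cubic"


def dctcp_outcome(sysctl: int, negotiated: str) -> Outcome:
    """Model the proposed behaviour when the dctcp module is selected."""
    if negotiated == "No-ECN":
        # Rows 2 and 5: DCTCP cannot run without ECN feedback.
        return Outcome("No-ECN", "n/a", FALLBACK_CC)
    if sysctl == 3 and negotiated == "AccECN":
        # Row 3: the straightforward case.
        return Outcome("AccECN", "1", "dctcp")
    if sysctl == 3 and negotiated == "Classic":
        # Row 4: the peer only gives Classic f/b, so DCTCP is not usable.
        return Outcome("Classic", "0", FALLBACK_CC)
    if sysctl == 2 and negotiated == "Classic":
        # Row 1: original DC-style use, sticky f/b arranged by configuration.
        return Outcome("Sticky", "0", "dctcp")
    raise ValueError(f"combination not in the table: {sysctl}, {negotiated}")


if __name__ == "__main__":
    for sysctl, negotiated in [(2, "Classic"), (2, "No-ECN"), (3, "AccECN"),
                               (3, "Classic"), (3, "No-ECN")]:
        print(sysctl, negotiated, dctcp_outcome(sysctl, negotiated))
```

Written this way, the only rows where dctcp actually stays in use are 1 and 3; every other row falls back to the default CC.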

For BBRv2 in TCP:

How about:

# CC module  sysctl  Negot'd f/b  | F/b used  ECT used  CC used
___________________________________________________________________
1 bbrv2      2       Classic      | Sticky    0         bbrv2
2 bbrv2      2       No-ECN       | No-ECN    n/a       bbrv2
3 bbrv2      3       AccECN       | AccECN    1         bbrv2
4 bbrv2      3       Classic      | No-ECN    n/a       bbrv2
5 bbrv2      3       No-ECN       | No-ECN    n/a       bbrv2

Rationale:

  1. By loading bbrv2, but configuring classic ECN, not accecn, it can be assumed the admin wants to use BBRv2 in the original way (I'm guessing that was with sticky feedback, but I'm not sure). Not sure about ECT - see discussion below.
  2. BBRv2 works fine without ECN.
  3. Straightforward
  4. The admin intended to use accecn feedback, but the other end doesn't support accecn, so we have to assume the remote peer supports classic ECN f/b (there is no reason to assume it understands sticky f/b). But BBRv2 doesn't work with Classic ECN f/b. So just don't use ECN at all, even though it's been negotiated (we still have to give Classic ECN f/b to the other end).
  5. BBRv2 works fine without ECN.
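
The BBRv2 table differs from the DCTCP one in that the CC never changes; only the ECN usage does. A small table-driven sketch in Python, assuming nothing beyond the rows above (`BBRV2_PROPOSAL` and `bbrv2_reacts_to_ecn` are hypothetical names, not anything in the kernel):

```python
# (sysctl, negotiated f/b) -> (f/b used, ECT used); the CC is always bbrv2.
BBRV2_PROPOSAL = {
    (2, "Classic"): ("Sticky", "0"),    # row 1
    (2, "No-ECN"): ("No-ECN", "n/a"),   # row 2
    (3, "AccECN"): ("AccECN", "1"),     # row 3
    (3, "Classic"): ("No-ECN", "n/a"),  # row 4: negotiated, but not used
    (3, "No-ECN"): ("No-ECN", "n/a"),   # row 5
}


def bbrv2_reacts_to_ecn(sysctl: int, negotiated: str) -> bool:
    """True if, under the proposal, BBRv2 would act on ECN feedback."""
    feedback_used, _ect = BBRV2_PROPOSAL[(sysctl, negotiated)]
    return feedback_used != "No-ECN"


if __name__ == "__main__":
    for key in BBRV2_PROPOSAL:
        print(key, BBRV2_PROPOSAL[key], bbrv2_reacts_to_ecn(*key))
```

Row 4 is the one worth reviewing: ECN is negotiated on the wire, yet `bbrv2_reacts_to_ecn(3, "Classic")` is false because the sender ignores the Classic feedback.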

The above tries to maintain backward compatibility, but I don't actually know what BBRv2 originally did. So what we decide depends on the answers to the questions below:

  • How did the original BBRv2 alpha code enable ECN?
  • And, if ECN was enabled, did BBRv2 just negotiate RFC3168 ECN, but unilaterally use sticky (DCTCP-style) feedback logic at one end, even if the other was a pure RFC3168 host?
  • When ECN was enabled, did BBRv2 send as ECT(0)?

In the L4STeam BBRv2 code, if the original way that BBRv2 used ECN included unsafe assumptions, we may not want to provide backward compatibility. But if it was used like that in DCs, we ought to.

ECT setting for DC-use

I'm not sure about defaulting dctcp or bbrv2 to ECT0 if the sysctl is 2 and Classic ECN is negotiated. It's intended for backward compatibility in DCs, but it's potentially unsafe over the Internet if hosts are configured wrongly.

Whichever way we decide, there will need to be a CC module switch (in dctcp, prague and bbrv2) to be able to force ECT to 0 or 1 (e.g. where a particular codepoint is needed in a DC environment).

Similarly, rather than falling back to sticky feedback, it might be better to have a specific sysctl (or module options?) for sticky feedback, and somehow add negotiation of sticky feedback to AccECN (as an alternative to AccECN or Classic).

However, I imagine the big existing users of BBRv2 & DCTCP in DCs will not want to have to change the way these CCAs are loaded just to keep the existing behaviour.

[BBR v2 repo](https://github.com/google/bbr))

# Installation (debian derivatives)
@@ -93,7 +93,7 @@ tc/tc qdisc replace dev eth0 root dualpi2 ...

While dualpi2 can work with DCTCP, DCTCP suffers from a few unfortunate
interactions with GSO/pacing/..., resulting in under-utilization. As a result,
-we advice you to use tcp_prague which currently has
+we advise you to use tcp_prague which currently has
basic fixes to those limitations. Note that this might still under-perform in
heavily virtualized settings, as scheduling becomes less reliable.

@@ -103,6 +103,7 @@ sysctl -w net.ipv4.tcp_congestion_control=prague
sysctl -w net.ipv4.tcp_ecn=3
```

-Prague attempts to negotiate Accurate ECN automatically.
-Note that, at the moment, Accurate ECN **must** be enabled on both ends of a
-connection in order it with DCTCP or BBR v2.
+Prague, BBRv2 and DCTCP attempt to negotiate Accurate ECN automatically once
+the AccECN sysctl is enabled.
Collaborator

AccECN will always be negotiated if enabled, irrespective of the CCA (the CCA is not even known at SYN-SENT time).

Prague, however, forces the use of AccECN, overruling the sysctl, similar in spirit to DCTCP forcing the use of ECN even if disabled.

Contributor Author

Don't use my pull request text at all then, because I misunderstood. But these comments certainly prove that the README needs to be clarified in these respects. There's also confusion between using the word DCTCP for a congestion control alone, or for both a feedback mechanism and a congestion control. Does the final sentence even need to mention BBR or DCTCP then? How about:

Note: to use AccECN feedback, the accecn sysctl has to be enabled at both ends. If a host has the tcp-prague congestion control module loaded, it has the same effect as enabling the accecn sysctl. Nonetheless, the other end still needs to have accecn enabled (either via the accecn sysctl, or by loading tcp-prague there as well).
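
As a sanity check on the wording of that note, the rule can be written as a tiny Python model. This is only a sketch of the note as worded here, not of the kernel's handshake logic; `Host`, `requests_accecn` and `accecn_feedback_used` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Host:
    accecn_sysctl: bool  # the accecn sysctl is enabled on this host
    prague_loaded: bool  # tcp-prague is in use on this host


def requests_accecn(host: Host) -> bool:
    # Per the note: tcp-prague has the same effect as enabling the sysctl.
    return host.accecn_sysctl or host.prague_loaded


def accecn_feedback_used(a: Host, b: Host) -> bool:
    # AccECN feedback is only used if both ends ask for it.
    return requests_accecn(a) and requests_accecn(b)


if __name__ == "__main__":
    prague_only = Host(accecn_sysctl=False, prague_loaded=True)
    sysctl_only = Host(accecn_sysctl=True, prague_loaded=False)
    neither = Host(accecn_sysctl=False, prague_loaded=False)
    print(accecn_feedback_used(prague_only, sysctl_only))
    print(accecn_feedback_used(prague_only, neither))
```

The second call models the case the note warns about: tcp-prague at one end is not enough if the other end has neither the sysctl nor tcp-prague.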

+Note that, Accurate ECN **must** be enabled on both ends of a
+connection in order for the negotiation to succeed.