Problem: Load-balancing schemes hash a flow to a random path.
- Long flows collide on the same path. High congestion
- Other paths underutilized.
Proposed Solution: Flowbender.
- No excessive packet reordering.
- End-host driven rehashing. Assigns flows to paths dynamically.
- Recovers from link failures within Retransmit Timeout (RTO)
- Readily deployable in data-centers. (< 50 lines of kernel code)
- Robust. Easy to tune.
Top-Level Design Decisions of Flowbender:
- Uses flow load-balancing instead of packet-level load-balancing. Shift flow to a different path only once it is congested or disconnected. This avoids:
- Excessive out-of-order packet delivery
- Hardware changes to existing datacenter infrastructure
- Uses RTT-based load-balancing instead of Link-level load-balancing. This:
- Avoids congestion spreading trees that link-level schemes can create.
- ECN etc. already demonstrated to promptly propagate congestion information.
- Using ECN => targeting long flows taking several roundtrips to finish but these flows carry most of the traffic anyway.
- Transport-layer algorithms (like TCP) can quickly detect end-to-end link-failures. Flowbender can hence promptly avoid broken paths.
Gist of the Algorithm: Transmit a flow on a single path. Reroute that individual flow when it is congested.
- How do we know a flow is congestion?
- How is rerouting done?
Sensing Congestion: (uses the same technique as DCTCP)
- Switch marks every packet exceeding queue size above a threshold.
- Keep track of the fraction of ECN-marked acks in every RTT.
If congestion is larger than a threshold, we reroute the flow.
- ECMP engines in switches compute a hash-function based on packet-header fields.
- “Hacks” a packet-header field, TTL, based on congestion to change the hash generated. Maps hash to paths. (Why TTL? long story. Feel free to email me if you want to know)
- Evaluated against existing schemes: ECMP, RPS and DeTail.
- Evaluation methods used:
- Simulation (simulated in ns-3, not lame old ns-2 with a stupid Tcl binding and incredibly fucked up C++ API from the prehistoric period)
- Test-bed implementation