networks

FlowBender: Flow-level Adaptive Routing for Improved Latency and Throughput in Datacenter Networks by Kabbani, Vamanan, Hasan et al.

Problem: Load-balancing schemes hash a flow to a random path.

  • Long flows collide on the same path. High congestion
  • Other paths underutilized.

Proposed Solution: Flowbender.

  • No excessive packet reordering.
  • End-host driven rehashing. Assigns flows to paths dynamically.
  • Recovers from link failures within Retransmit Timeout (RTO)
  • Readily deployable in data-centers. (< 50 lines of kernel code)
  • Robust. Easy to tune.

Selection_003

Top-Level Design Decisions of Flowbender:

  • Uses flow load-balancing instead of packet-level load-balancing. Shift flow to a different path only once it is congested or disconnected. This avoids:
    • Excessive out-of-order packet delivery
    • Hardware changes to existing datacenter infrastructure
  • Uses RTT-based load-balancing instead of Link-level load-balancing. This:
    • Avoids congestion spreading trees that link-level schemes can create.
    • ECN etc. already demonstrated to promptly propagate congestion information.
    • Using ECN => targeting long flows taking several roundtrips to finish but these flows carry most of the traffic anyway.
    • Transport-layer algorithms (like TCP) can quickly detect end-to-end link-failures. Flowbender can hence promptly avoid broken paths.

Gist of the Algorithm: Transmit a flow on a single path. Reroute that individual flow when it is congested.

  • How do we know a flow is congestion?
  • How is rerouting done?

Sensing Congestion: (uses the same technique as DCTCP)

  • Switch marks every packet exceeding queue size above a threshold.
  • Keep track of the fraction of ECN-marked acks in every RTT.

If congestion is larger than a threshold, we reroute the flow.

Rerouting Flows:

  • ECMP engines in switches compute a hash-function based on packet-header fields.
  • “Hacks” a packet-header field, TTL, based on congestion to change the hash generated. Maps hash to paths. (Why TTL? long story. Feel free to email me if you want to know)

Evaluation Details:

  • Evaluated against existing schemes: ECMP, RPS and DeTail.
  • Evaluation methods used:
    • Simulation (simulated in ns-3, not lame old ns-2 with a stupid Tcl binding and incredibly fucked up C++ API from the prehistoric period)
    • Test-bed implementation

 

Advertisements