diff --git a/README.md b/README.md index 48c6484..3008473 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,8 @@ Tools, libraries and statistical software for automating, managing, monitoring and testing Umber Fi‑Wi networks. +**Where it runs:** FiWiControl is built to run **on the Umber concentrator** — the Fi‑Wi control plane described in the architecture spec (**`html/Fi-Wi-L4S.php`** in this repo) — for **lab and customer** automation. Day-to-day development still uses **workstation** installs and **lab rigs** (e.g. Raspberry Pi) as in **`docs/install.md`**. + **Naming:** The **Git repository** and checkout directory are **FiWiControl** (mixed case). The **Python distribution** and **import package** are **`fiwicontrol`** (all lowercase, PEP 8) — same project, different casing rules for Git vs Python. Use **`fiwicontrol`** for `pip install` / `import`, not `FiWiControl`. This repository ships that distribution (**`fiwicontrol`** on PyPI / `pip`) with import root **`fiwicontrol`**: @@ -47,6 +49,8 @@ FiWiControl/ ├── LICENSE ├── README.md ├── pyproject.toml +├── html/ +│ └── Fi-Wi-L4S.php ├── docs/ │ ├── install.md │ ├── node-control-asyncio-design.md diff --git a/html/Fi-Wi-L4S.php b/html/Fi-Wi-L4S.php new file mode 100644 index 0000000..e965d2b --- /dev/null +++ b/html/Fi-Wi-L4S.php @@ -0,0 +1,16280 @@ + + + + + + + + Umber Fi-Wi Architecture: Cellularized Wi-Fi with Dynamic Point Selection + + + + + + +
+
+ Umber Networks Proprietary Architecture +
+ +

+ Umber Fi-Wi Architecture: Cellularized Wi-Fi, L4S, and RF Coordination +

+ +

+ Timestamp-synchronized control loops, dynamic RF grouping, and multi-RRH + operation
+ Umber Networks Fi-Wi Technical Architecture Overview (Version 1.1, + December 2025) +

+
+ +
+ Zebras look like horses, but they are not the same... Zebras, despite + man's best efforts, cannot be tamed. The Wi-Fi we have + engineered today remains fundamentally a collection of autonomous, + uncoordinated things—zebras that simply cannot be + harnessed.
+
+ Fi-Wi is architected from the ground up to be controllable, coordinated, + and directed — the horse we need for in-building communications and + sensing. As latency demands tighten and building densities increase, Fi-Wi + isn't just a better future; it's the future we can build today. +
+ +
+

0. Technical Disclaimer

+ +

+ The material presented in this document describes the Fi-Wi architecture + and associated engineering concepts. It is provided "as is" for + discussion and exploratory design purposes only. Nothing in this + document constitutes a formal specification, performance guarantee, + regulatory assertion, or commitment to implement any feature described. +

+ +

+ Several sections use simplified or idealized assumptions to illustrate + architectural differences between Wi-Fi, Multi-Link Operation (MLO), Low + Latency Low Loss Scalable throughput (L4S), and Fi-Wi queueing and + scheduling behavior. These examples are intended to clarify concepts + rather than fully model the non-linear and stochastic dynamics present + in operational wireless systems. +

+ +

+ Real system behavior depends on hardware characteristics, RF topology, + firmware behavior, congestion patterns, environmental conditions, and + interactions with legacy Wi-Fi devices. Actual performance may differ + from the representative models and examples described here. +

+ +

+ Important Note on Capabilities: This document describes + an architecture using Commercial Off-The-Shelf (COTS) Wi-Fi chipsets. + The system provides dynamic point selection, intelligent frequency + reuse, and centralized MAC scheduling. It does not provide RF phase + control, distributed MIMO, or coordinated simultaneous + transmission—capabilities that would require custom ASIC development. + All described features are achievable with commodity Wi-Fi hardware and + comply with unlicensed spectrum regulations. +

+
+ +
+

0.1 L4S Foundation and References

+ +

+ Low Latency, Low Loss, Scalable Throughput (L4S) is a + suite of IETF standards that extend the Internet's congestion control + mechanisms through + Explicit Congestion Notification (ECN) to support very + low queuing delays. L4S is a ratified protocol stack with multiple + production implementations. +

+ +

+ Fi-Wi is architected specifically to provide the deterministic + underlying transport required to satisfy the strict queuing mandates + defined in these standards. +

+ +

Core L4S Specifications

+ + + +

Transport & Production Status

+ +

+ L4S replaces capacity-seeking behavior (Reno/Cubic) with + pacing-based rate control. It is currently deployed in + production environments including: +

+ + + +

Further Reading

+ + +
+ +
+ + + +
+ +

1. Motivation and Problem Statement

+ +
+

+ "Perfection is achieved, not when there is nothing more to add, but when + there is nothing left to take away." — Antoine de Saint-Exupéry +

+ +

+ "Everything should be made as simple as possible, but not simpler." — + Albert Einstein +

+
+ +

+ With 23.3 billion Wi-Fi devices in use worldwide and 5.5 billion people
  depending on internet connectivity, with both numbers still growing, Wi-Fi
  has become the primary way we access the internet. So much so that many
  people think Wi-Fi is the internet. It's how a home
  healthcare worker video-calls to check on a patient, or a cancer patient
  connects to their support group. It's how a parent works remotely while
  their child attends school online, and how lifelong learners access the
  information they need to grow. It's how a grandmother monitors her heart
  condition through a telehealth app. It's how a family member finds their
  next job, or how a neighbor orders a meal.

+ +

+ Running quietly in the background are autonomous systems we've come to + depend on: security cameras that alert us to threats, medical monitors + that track vital signs, smart home systems that manage climate and safety, + IoT sensors that detect water leaks or carbon monoxide. These systems + don't wait for us to notice problems—they operate continuously, silently, + keeping people safe. +

+ +

+ We've moved far beyond entertainment and convenience. Wi-Fi now carries + the infrastructure of daily survival. When it breaks down under density or + congestion, it's not just buffering that fails. It's jobs, healthcare + access, human connection, and the life-safety systems we trust to work + when we're not watching. The $4.9 trillion Wi-Fi contributes to the global + economy isn't an abstract number. It's the cumulative value of billions of + human activities and critical systems that simply stop working when the + network fails. +

+ +

Why Traditional Wi-Fi Cannot Support L4S

+ +

+ The infrastructure supporting all of this is failing at scale, and it must
  be addressed for everyone. The industry is moving toward L4S and ECN-based
  control to eliminate bufferbloat, but traditional Wi-Fi makes this
  impossible. Legacy congestion-control loops fail by design once a single
  flow saturates the bottleneck queue, and even modern ECN-based systems
  such as L4S cannot converge when Wi-Fi hides queue depth, induces
  collision storms, injects firmware-created delays that look like queues,
  and constantly shifts transmission (PHY) rates through its rate-control
  and aggregation machinery. Mesh networks and additional APs make the
  experience worse by injecting more uncoordinated radios into an already
  chaotic RF environment. And because the AP industry understands these
  limits, it is no surprise that even major vendors publicly state that L4S
  cannot operate correctly over the products they sell.

+ +

+ Adding more Ethernet-attached APs makes it worse by creating more + overlapping contention domains. Hidden queues in SoCs, rate-control + firmware, and aggregation pipelines obscure the true bottleneck. In + control-theory terms: the bottleneck queue cannot expose its state, the + PHY rate is not stationary, and the closed loop cannot stabilize. This is + why user experience fails in many apartments and homes, in hotels, MDUs, + stadiums, and high-density buildings long before “capacity” is reached. +

+ +

+ QoS cannot rescue this architecture. Because the bottleneck queue inside a + Wi-Fi AP has no information about actual flow urgency or priority, no QoS + mechanism can operate meaningfully. The only real solution is to avoid + congestion altogether — which is exactly what L4S researchers have + designed for and exactly what Fi-Wi supports. +

+ +

Why Copper Infrastructure Has Reached Its Limits

+ +

+ While the protocol fails in the air, the physical infrastructure fails in
  the walls. The industry's traditional answer, running copper Ethernet to
  APs, simply extends the lifetime of an architecture that has reached its
  limits. Copper requires periodic rip-and-replace cycles: Cat5 becomes
  Cat6, then Cat7, then Cat8. A home builder has no idea what communications
  wiring to install. The RJ45 connector and its plastic tab are fragile,
  outdated, and at end of life. And at 25G, 40G, or 100G, physics takes
  over: copper loses signal in dB per inch. Data centers have abandoned
  structured cabling (long-run copper) for core transport, restricting
  copper to short-reach intra-rack DACs. Fi-Wi applies this same logic to
  the building: fiber for the long haul (halls/walls), radio for the short
  hop.

+ +

How Fi-Wi Breaks Both Cycles

+ +

+ Fi-Wi breaks the cycle. Install fiber once — and never revisit behind + walls or ceilings again. The glass is permanent; only the optics evolve. + Fiber is already the universal medium for 100G/400G data centers, DWDM + long-haul transport, and now PCIe throughout a building with Fi-Wi. Remote + Radio Heads simply convert between fiber and 802.11, eliminating embedded + routing, rate-control SoCs, switching silicon, and the security-patch + treadmill they require. When Wi-Fi standards evolve, you replace the small + radio module(s) — that's all. +

+ +
+ What is C-RAN?
+ Fi-Wi adapts the + Centralized/Cloud Radio Access Network (C-RAN) + architecture from 4G/5G cellular systems. In C-RAN, intelligence (baseband + processing) is centralized while radio heads are distributed. Fi-Wi + applies this proven approach to Wi-Fi, enabling building-scale + coordination impossible with autonomous access points. +
+ +

+ Fi-Wi turns fiber combined with 802.11 into the permanent, predictable, + control-theory-friendly transport that the L4S control loop requires, and + treats 802.11 radio heads as the small, disposable, last-meters, + connector-free interface where the in-building network behaves + deterministically. And because fiber increases the long-term value of a + building, the investment is not just technically durable — it is + financially durable. +

+ +

The Opportunity Is Here

+ +

+ There is no law of physics that says Wi-Fi cannot work at scale. The + collapse we're seeing in apartments, hotels, and high-density buildings + isn't inevitable. The researchers have shown engineers how to proceed. We + know how to build stable control loops. We know how to coordinate radios. + We know how to deploy permanent infrastructure. +

+ +

+ The conditions for solving this are here, now. Engineering talent exists + across our industry. The market has already validated the foundation: + China's FTTR deployments have installed fiber to millions of rooms, + proving that permanent infrastructure at this scale is not just + feasible—it's already happening at volume. What's missing is capital + directed at the right architecture. Investors are essential to this + challenge. Their capital will enable the engineering to serve the market. + And, once proven, market signals will sustain the development, directing + human resources toward building what humanity needs for continued + advancement. +

+ +

+ Fi-Wi is Umber's answer, but the underlying challenge belongs to all of + us. The 5.5 billion people depending on this infrastructure deserve better + than a system designed for convenience that we've repurposed for survival. + This is solvable engineering—the talent is ready, the manufacturing + exists, and the market is waiting. It's time we came together and fixed + this. +

+ +
+ About Umber Networks
+ Umber Networks was founded by Bob McMahon, a networking engineer with 35 + years of experience building internet infrastructure. Bob created and + maintains Iperf2, the industry-standard network performance measurement + tool with over 3 million downloads worldwide. His career spans + foundational work on FDDI for the International Space Station (1989), + development of the Cisco Catalyst RSM routing module deployed worldwide, + and wireless chipset testing using statistical process controls at + Broadcom. Fi-Wi represents the culmination of decades solving congestion + control, wireless scaling, and real-time transport challenges at the + protocol and silicon level. +
+ +
+ +

+ 2. The Wi-Fi Crisis: Why Evolution Failed and Control Was Lost +

+ +

+ The failure of modern Wi-Fi to support low-latency applications (L4S) is
  not a failure of bandwidth; it is a failure of control.
  With 23.3 billion Wi-Fi devices deployed globally, the protocol has hit an
  asymptotic limit where adding complexity yields diminishing returns.

+ +

+ As density rises, autonomous contention scales super-linearly—effectively + operating as the inverse of Metcalfe's Law. The result is a rising noise + floor and media access collisions that render unlicensed spectrum unusable + for the deterministic performance required by next-generation + applications. +

+ +

+ 2.1 The Evolutionary Trap: Why Incremental Improvements Failed +

+ +

+ Evolutionary engineering is powerful; it gave us twenty-five years of + Wi-Fi speed improvements. But every evolutionary curve eventually hits an + asymptote—a point where adding more complexity yields diminishing returns. + We have reached that point. +

+ +
+ "The IEEE 802.11 working group behaves like a composer writing a symphony + that effectively cannot be played. They continually add + instruments—4096-QAM, Puncturing, MLO—without considering that the + musician (the silicon) has only microseconds to react." +
+ +

+ The decision matrix for a Wi-Fi chip has exploded combinatorially. We can + trace this through the + Modulation and Coding Scheme (MCS) Table: +

+ + + +

+ The Physical Trap: When the firmware engineer fails to + optimize the radio, can we simply redesign the chip? No, because of + RTL (Register Transfer Level) Accretion. In software, + engineers "refactor" unwieldy code. In hardware, refactoring is + economically forbidden. A complex SoC takes 18–24 months to validate; + removing "dead" logic risks breaking obscure corner cases. Consequently, + vendors only add; they never subtract. 802.11be logic wraps around + 802.11ax logic, which wraps around 802.11ac logic—twenty-five years of + accumulated technical debt consuming area and leakage power. +

+ +

+ The Market Signal: The ultimate proof that the standard + has reached gridlock is the behavior of market leaders like Samsung and + Apple. They no longer rush to support every new feature—they aggressively + whitelist features and blacklist others because complexity drains battery + and destabilizes connections. When the two largest consumers of wireless + silicon effectively stop buying the complexity argument, the evolutionary + roadmap is broken. +

+ +

+ 2.2 The Density Paradox: More Capacity, Less Performance +

+ +

+ The fundamental instability of 802.11 stems from the + Birthday Paradox applied to media access. In an + autonomous system, as the number of contending stations (n) + increases linearly, the probability of collision increases + combinatorially: +

+ +
+ Collision Pairs = n(n-1)/2
+ For n=100 devices: 4,950 potential collision pairs
+ With per-pair collision probability p: P(≥1 collision) = 1 − (1 − p)^(n(n−1)/2) → 1 as n → ∞
+ +

+ Simulation data confirms that even with moderate client density, collision + probability quickly exceeds 50%, forcing the network into a state of + "Drift" where latency becomes unbounded. Under these conditions, the + network is no longer constrained by PHY capacity, but by the probability + of successful media access. +

+ +
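The scaling argument above can be sketched numerically. A toy model, assuming a fixed, independent per-pair collision probability p (an illustrative constant, not a measured value; real contention is bursty and correlated):

```python
# Toy Birthday-Paradox model of media-access contention.
# Assumption: each of the n(n-1)/2 station pairs collides
# independently with a fixed probability p per contention round.

def collision_pairs(n: int) -> int:
    """Number of distinct contending station pairs."""
    return n * (n - 1) // 2

def aggregate_collision_probability(n: int, p: float = 0.001) -> float:
    """P(at least one collision) = 1 - (1 - p)^pairs."""
    return 1.0 - (1.0 - p) ** collision_pairs(n)

for n in (10, 50, 100, 200):
    print(n, collision_pairs(n),
          round(aggregate_collision_probability(n), 3))
```

With the assumed p = 0.001, the aggregate probability passes 50% at roughly 38 stations and exceeds 99% by n = 100 (the 4,950-pair case above), illustrating why media access, not PHY capacity, becomes the binding constraint.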

+ This is Metcalfe's Law in reverse: instead of each new + node increasing the value of the network, each new node increases the + chance of interference and reduces usable capacity. +

+ +

2.3 The Three Technical Failure Modes

+ +

+ The collapse of the operator model is driven by three distinct + architectural failures inherent to the 802.11 standard. +

+ +

2.3.1 Protocol Tax: The Hidden Node Penalty

+ +

+ Standard Wi-Fi relies on Carrier Sense Multiple Access (CSMA), which + assumes that all stations can hear each other. In real-world MDU + (Multi-Dwelling Unit) environments, this assumption fails + catastrophically. +

+ +

+ Field measurements using ESP32-based sensors reveal that hidden node + contention consumes 30-50% of available airtime in typical MDU + deployments—airtime paid for in spectrum acquisition costs but lost to + protocol overhead invisible to traditional monitoring. This represents a + massive protocol tax where significant airtime is consumed by retries and + backoff slots rather than payload delivery. +

+ +

2.3.2 The MCS Matrix: Un-Engineerable Complexity

+ +

+ The most critical failure for a network operator is the + loss of state control. Modern 802.11ax supports 12 MCS + indices × 4 bandwidth options × 8 spatial stream configurations × 3 guard + intervals = >1,000 valid PHY states. Autonomous rate + selection must navigate this space at sub-millisecond timescales under + non-stationary noise. +

+ +

This creates a Non-Stationary System:

+ + + +

+ Because Wi-Fi is non-stationary, autonomous rate selection under + contention has no bounded outcome. The IEEE 802.11 standard has allowed + the MCS table to explode into hundreds of valid permutations—a chaotic + state space that firmware must navigate in microseconds with incomplete + information. +

+ +
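The size of that state space can be checked by direct enumeration. A sketch using the dimension counts quoted above (real chipsets prune many combinations, e.g. some MCS/stream pairings are invalid, so 1,152 is a nominal upper bound):

```python
from itertools import product

# Dimension counts taken from the text; hardware prunes many entries.
MCS_INDICES = range(12)                 # MCS 0-11
BANDWIDTHS_MHZ = (20, 40, 80, 160)      # 4 bandwidth options
SPATIAL_STREAMS = range(1, 9)           # 1-8 spatial stream configurations
GUARD_INTERVALS_US = (0.8, 1.6, 3.2)    # 3 guard intervals

phy_states = list(product(MCS_INDICES, BANDWIDTHS_MHZ,
                          SPATIAL_STREAMS, GUARD_INTERVALS_US))
print(len(phy_states))  # 12 * 4 * 8 * 3 = 1152 nominal PHY states
```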

2.3.3 The Spatial Contention Cascade

+ +

+ As load increases, the spatial precision of the network degrades. + Mathematical modeling shows that the condition number (κ)—a measure of how + well-conditioned the MIMO channel matrix is—degrades from 6 dB (excellent + spatial separation) to >12 dB (severe interference) under load. This + collapse means that 4×4 MIMO effectively degrades to 2×2 or worse, turning + additional spatial streams into self-interference rather than capacity. +

+ +
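The condition-number metric itself is straightforward to compute. A sketch with NumPy; the two 2×2 channel matrices are invented for illustration and are not field data:

```python
import numpy as np

def condition_number_db(H: np.ndarray) -> float:
    """Kappa in dB: ratio of largest to smallest singular value of H."""
    s = np.linalg.svd(H, compute_uv=False)
    return 20.0 * np.log10(s[0] / s[-1])

# Illustrative channels (assumed values):
well_separated = np.array([[1.0, 0.1],
                           [0.1, 1.0]])   # nearly orthogonal spatial paths
correlated     = np.array([[1.0, 0.9],
                           [0.9, 1.0]])   # paths almost collinear

print(round(condition_number_db(well_separated), 1))  # ~1.7 dB: both streams usable
print(round(condition_number_db(correlated), 1))      # ~25.6 dB: second stream is noise
```

As the off-diagonal correlation grows under load, κ climbs past the 12 dB mark and the second spatial mode carries interference rather than capacity.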

+ This degradation collapses the theoretical gains of Mu-MIMO, transforming + high-order spatial streams into interference rather than usable capacity. + The "Efficiency Paradox" emerges: Wi-Fi evolution has focused on shrinking + Payload Duration (faster PHY rates like 4096-QAM) while MAC Overhead (LBT, + Backoff, Preamble) remains constant. To amortize the overhead, chips must + build massive Aggregates (A-MPDUs). This destroys latency. We have + engineered a Ferrari engine (the PHY) inside a garbage truck (the MAC). +

+ +

2.4 The Operator's Dilemma

+ +

+ For network operators—whether cable MSOs, telcos, or fiber providers—this + architectural chaos presents a fundamental business risk: + You own the customer experience, but not the air interface. +

+ + + +

2.5 Why Conventional Solutions Don't Scale

+ +

+ Traditional attempts to solve Wi-Fi density problems fail because they + address symptoms rather than the underlying architectural failure: +

+ + + +

+ The Trillion-Dollar Context: The mobile industry spent + $600 billion building 5G to get scheduled, deterministic performance + outdoors. They understand that unlicensed spectrum + autonomous contention + = chaos. The genius of 5G is its architecture; its Achilles heel is its + cost. In recent auctions, 20 MHz of licensed mid-band spectrum sold for + over $17 billion for U.S. rights alone. +

+ +

+ Fi-Wi applies the cellular C-RAN architecture indoors—but on unlicensed + spectrum that costs nothing. This is the arbitrage opportunity. +

+ +

+ 2.6 The Client Side: L4S and the End of Uplink Contention +

+ +

+ The architectural reset is not limited to the infrastructure; it + fundamentally alters the behavior of the Station (STA). In legacy Wi-Fi, + the STA is an autonomous agent that fights for upstream airtime using EDCA + (Enhanced Distributed Channel Access). It maintains its own local WMM + queues and blindly transmits whenever it wins a contention window, often + oblivious to the fact that the AP's receive buffer is already full. +

+ +

+ The L4S Inversion: With L4S, the "Quality of Service" + decision moves from the Wi-Fi card's firmware to the application's + congestion control algorithm. We replace the rigid, static categories of + WMM with the dynamic, adaptive responsiveness of + TCP Prague and other L4S-compliant congestion controls. +

+ + + +

+ Eliminating the "Uplink Queue": This effectively + virtualizes the queue. Instead of a deep buffer sitting on the Wi-Fi chip + waiting to be transmitted, the packets are held in user-space memory on + the client device, waiting for the "go" signal (or rather, the absence of + a "stop" signal). The traffic never enters the contention domain until + there is guaranteed capacity to service it. The STA no longer needs + complex internal QoS schedulers because it is no longer trying to force + more data than the pipe can hold. +

+ +
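A minimal sketch of this "virtualized queue" idea (the class name, rate, and polling interface are invented for illustration, not an API from this document): packets wait in user-space memory and are released one at a time at the paced rate, so nothing accumulates below the application.

```python
from collections import deque

class PacedSender:
    """Toy L4S-style sender: the backlog lives in user space and is
    drained at the paced rate, so driver/hardware queues stay empty."""

    def __init__(self, rate_pps: float):
        self.interval = 1.0 / rate_pps  # seconds between releases
        self.backlog = deque()          # user-space memory, not a TX ring
        self.next_tx = 0.0              # earliest time the next packet may go

    def enqueue(self, packet) -> None:
        self.backlog.append(packet)

    def poll(self, now: float):
        """Release one packet if the pacing gate is open, else None."""
        if self.backlog and now >= self.next_tx:
            self.next_tx = now + self.interval
            return self.backlog.popleft()
        return None

s = PacedSender(rate_pps=1000.0)        # 1 packet per millisecond
for i in range(3):
    s.enqueue(f"pkt{i}")
print(s.poll(0.0))     # pkt0 released immediately
print(s.poll(0.0))     # None: pacing gate closed
print(s.poll(0.001))   # pkt1 after one pacing interval
```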
+

+ Technical Insight: The "Driver Queue" Trap +

+ +

+ In legacy systems, flow control happens at the driver level. When the + Wi-Fi card's hardware buffer fills up (the TX Ring), it signals the + Operating System to "Stop the Queue." The OS then buffers packets in + software (qdisc) until the hardware signals "Go." +

+ +

+ This is catastrophic for latency. It creates a hidden + reservoir of old data sitting in the kernel, waiting for the hardware to + clear. By the time the hardware is ready, the packets in the OS queue + are already stale. +

+ +

+ L4S eliminates this layer of buffering entirely. + Because TCP Prague adjusts the send rate to match the + actual airtime capacity (signaled via ECN), the application + never sends enough data to fill the hardware ring buffer. The driver + never has to assert flow control, the OS queue remains empty, and every + packet that hits the driver is fresh, ensuring immediate transmission. +

+
+ +
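The rate adaptation described above can be sketched with a DCTCP-style control law, a simplification of what TCP Prague does. The gain g and the additive-increase step are assumed constants, and real Prague operates on a congestion window per RTT rather than a scalar rate:

```python
def prague_like_step(rate: float, alpha: float, marked_fraction: float,
                     g: float = 1.0 / 16, step: float = 1.0):
    """One RTT of a DCTCP-style loop: scale back in proportion to the
    smoothed ECN-mark fraction, otherwise probe upward additively.
    Returns (new_rate, new_alpha)."""
    alpha = (1.0 - g) * alpha + g * marked_fraction  # EWMA of mark signal
    if marked_fraction > 0.0:
        rate *= 1.0 - alpha / 2.0                    # proportional decrease
    else:
        rate += step                                 # additive increase
    return rate, alpha

rate, alpha = 100.0, 0.0
rate, alpha = prague_like_step(rate, alpha, marked_fraction=0.0)
print(rate)   # 101.0: no marks, probe upward
rate, alpha = prague_like_step(rate, alpha, marked_fraction=1.0)
print(round(rate, 3), round(alpha, 4))
```

The key property is that the reduction is proportional to the smoothed marked fraction, so the loop settles near the marking threshold instead of oscillating between full and half rate as loss-based controllers do.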

2.7 The Strategic Reset: Splitting the Graph

+ +

+ Solving this requires a "Subtractive Architecture." Instead of adding more + features to the radio, we must remove them. The architectural breakthrough + of Fi-Wi is decoupling the MCS State Graph described in + Section 2.3.2 into its constituent parts: +

+ + + +

+ This architectural shift—from distributed chaos to centralized + control—mirrors the evolution from + analog transmission systems (noise-prone, + operator-invisible) to digital QAM (deterministic, + monitorable). + Fi-Wi completes this transformation for the last 10 meters, moving the network from a model of probabilistic negotiation to one of + deterministic execution. +

+ +

+ Section 13 describes the Concentrator's scheduling algorithm that + implements this graph traversal, while Appendix C details the RRH's + scatter-gather DMA mechanism that executes the chosen state transitions at + microsecond timescales. +

+ +
+

+ Technical Insight: The QoS Fallacy +

+ +

+ Traditional QoS mechanisms in Wi-Fi—WMM access categories, priority + queues, and traffic shaping—reflect a fundamental architectural flaw: + treating contention as inevitable and attempting to optimize it + through priority classes. + This approach attempts to infer urgency by classifying packets, then + granting probabilistic access to the medium—essentially rolling dice + with weighted odds. +

+ +

+ L4S changes the premise entirely. Flows signal their + tolerance for delay using ECN, allowing the network to signal sources to + control their own send rates. Across many flows, this controls the + aggregate arrival rates at the forwarding plane based on real-time queue + feedback rather than static classes. +

+ +

+ In a Fi-Wi architecture, where all wireless transmissions are centrally + scheduled with unified state, traffic no longer competes through + contention. The Concentrator controls arrival rates to each Remote Radio + Head, ensuring packets are transmitted at the precise moment they are + needed. + This deterministic scheduling replaces the probabilistic contention + that WMM attempts to optimize. + Consequently, the complex web of traditional QoS queues is rendered + obsolete; we replace "Priority" (deciding who waits) with "Isolation" + (ensuring no one waits). +

+
+ +
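The contrast with contention can be made concrete with a toy schedule builder: every frame receives an exact start time instead of rolling dice for the medium. The RRH names, queue contents, and the naive serialization policy are invented; a fixed 250 µs TXOP is assumed for illustration.

```python
TXOP_US = 250  # assumed fixed TXOP length in microseconds

def build_schedule(rrh_queues: dict, start_us: int = 0) -> list:
    """Serialize per-RRH frame queues into back-to-back, non-overlapping
    TXOPs: deterministic start times replace probabilistic contention."""
    schedule, t = [], start_us
    for rrh, frames in rrh_queues.items():
        for frame in frames:
            schedule.append((t, rrh, frame))  # (start_us, radio, frame)
            t += TXOP_US
    return schedule

plan = build_schedule({"rrh-1": ["dl-a", "dl-b"], "rrh-2": ["dl-c"]})
for slot in plan:
    print(slot)
# (0, 'rrh-1', 'dl-a'), (250, 'rrh-1', 'dl-b'), (500, 'rrh-2', 'dl-c')
```

A real Concentrator would interleave RRHs by RF group and deadline rather than draining queues in order; the point is only that "who transmits when" is computed, not contended.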

+ 2.8 Interactive Visualization: The MCS Collapse Under Load +

+ +

+ The following interactive simulation demonstrates the architectural + differences between Fi-Wi, autonomous APs, and mesh networks under varying + load conditions. It visualizes the + MCS State Graph discussed in Section 2.7, showing how + autonomous systems fail to navigate this state space under density. +

+ +

+ Each "room" represents a device with a 4 × 12 grid of MCS states (4 + spatial streams × 12 MCS indices). The + ghost node (dashed) shows the ideal state based on + channel quality, while the active node shows the actual + state selected by the rate control algorithm. +

+ +
+
+ +
+ Click anywhere to open interactive version ↗ +
+ +
+ +
+

How to Use the Simulation

+ +

Quick Start - Try These Scenarios:

+ + + +

Interactive Controls:

+ + + +

What to Watch For:

+ + +
+ +

Technical Details: Understanding the Visualization

+ +

+ MCS Grid: Each 4×12 grid shows all possible MCS states. + Top rows = Mu-MIMO (multi-user), bottom rows = standard 2×2 MIMO. Columns + = MCS index (0-11, higher = faster but needs better SNR). +

+ +

+ Eigenvalues (λ₁, λ₂): Strength of spatial modes in the + MIMO channel. As density increases in autonomous mode, λ₂ collapses → + spatial interference. +

+ +

+ Condition Number (κ): Ratio λ₁/λ₂ in dB. Low (~6 dB) = + good. High (>12 dB) = Mu-MIMO degraded to single-stream. This directly + demonstrates the "Spatial Contention Cascade" from Section 2.3.3. +

+ +

+ Collision Probability: Computed from the Birthday Paradox:
  n(n-1)/2 collision pairs. When the resulting aggregate collision
  probability exceeds 50%, the network enters "Drift" state with unbounded
  latency.

+ +

Why This Matters for Network Operators

+ +

+ This visualization proves the loss of control described + in Section 2.4. In autonomous mode, operators cannot engineer performance + because the system navigates a 1,000+ state MCS graph with no global + coordination. +

+ +

+ In Fi-Wi mode, the Concentrator's global state visibility allows it to: +

+ + + +

+ The result: predictable, engineerable performance that + scales with density instead of collapsing. The difference becomes visceral + when you watch autonomous mode turn red under the same load that Fi-Wi + handles in green. +

+ +
+ +

3. System Picture

+ +
+

+ System Diagram: Fi-Wi Concentrator, Central Packet Memory, and Multiple + RRHs +

+ +
+                        ┌────────────────────────────────────────────┐
+                        │              Fi-Wi Concentrator            │
+                        │────────────────────────────────────────────│
+   L4S/ECN-aware        │                                            │
+   traffic from LAN/    │   ┌────────────────────────────────────┐   │
+   WAN (IP/802.3)  ─────┼─▶│    Central Packet Memory & Queues  │   │
+                        │   │  • Per-flow / per-tenant queues    │   │
+                        │   │  • Per-airtime-domain queues       │   │
+                        │   │  • Enqueue timestamps (µs)         │   │
+                        │   └───────────────┬────────────────────┘   │
+                        │                   │                        │
+                        │   ┌───────────────▼────────────────────┐   │
+                        │   │   L4S/AQM & Scheduler              │   │
+                        │   │  • Sojourn-time based ECN marking  │   │
+                        │   │  • TXOP length control (≈250 µs)   │   │
+                        │   │  • RF grouping & spatial streams   │   │
+                        │   └───────────────┬────────────────────┘   │
+                        │                   │ PCIe over fiber        │
+                        └───────────────────┼────────────────────────┘
+                                            │
+        ┌───────────────────────────────────┼───────────────────────────────────┐
+        │                                   │                                   │
+        │                                   │                                   │
+┌───────▼─────────┐                ┌────────▼─────────┐                ┌────────▼─────────┐
+│   RRH #1        │                │   RRH #2         │                │   RRH #3         │
+│ (Thin MAC/PHY)  │                │ (Thin MAC/PHY)   │                │ (Thin MAC/PHY)   │
+│  • RF front end │                │  • RF front end  │                │  • RF front end  │
+│  • DFE + FFT    │                │  • DFE + FFT     │                │  • DFE + FFT     │
+│  • Minimal MAC  │                │  • Minimal MAC   │                │  • Minimal MAC   │
+│  • DMA engine   │                │  • DMA engine    │                │  • DMA engine    │
+│  • PTP sync     │                │  • PTP sync      │                │  • PTP sync      │
+└───────┬─────────┘                └────────┬─────────┘                └────────┬─────────┘
+        │                                   │                                   │
+        │                                   │                                   │
+        │                 PCIe-over-fiber links (no deep queues in RRHs)        │
+        │                                   │                                   │
+        │                                   │                                   │
+┌───────▼─────────┐                ┌────────▼────────┐                 ┌────────▼─────────┐
+│   RRH #4        │     ...        │   RRH #N        │                 │   Wi-Fi STAs     │
+│ (Thin MAC/PHY)  │                │ (Thin MAC/PHY)  │     (Rooms, AP-like cells, clients)│
+│  • RF front end │                │  • RF front end │                 │  • Phones        │
+│  • DFE + FFT    │                │  • DFE + FFT    │                 │  • Laptops       │
+│  • Minimal MAC  │                │  • Minimal MAC  │                 │  • IoT devices   │
+│  • DMA engine   │                │  • DMA engine   │                 │                  │
+│  • PTP sync     │                │  • PTP sync     │                 │                  │
+└─────────────────┘                └─────────────────┘                 └──────────────────┘
+  
+

+ Key properties: Central packet memory and queues live + entirely in the concentrator, where L4S-aware AQM and scheduling operate + on true bottleneck queues. RRHs are kept as simple hardware endpoints + (RF + minimal MAC + DMA + PTP), with no deep local buffering or + autonomous AP logic. This enables stable L4S behavior, explicit TXOP + control, and software-defined evolution of queueing and RF policies. +

+
+ +
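The "enqueue timestamps" and "sojourn-time based ECN marking" in the diagram can be sketched as follows. The 1 ms threshold and hard step are assumed simplifications; the DualQ Coupled AQM specified for L4S in RFC 9332 uses a probabilistic ramp rather than a step.

```python
from collections import deque

MARK_THRESHOLD_US = 1000  # assumed: mark CE when sojourn exceeds 1 ms

class CentralQueue:
    """Concentrator-side queue that stamps packets on enqueue so the
    AQM can mark on sojourn time (time spent queued), not queue length."""

    def __init__(self):
        self._q = deque()

    def enqueue(self, packet, now_us: int) -> None:
        self._q.append((now_us, packet))  # record enqueue timestamp (us)

    def dequeue(self, now_us: int):
        """Return (packet, ce_marked) based on measured sojourn time."""
        t_in, packet = self._q.popleft()
        return packet, (now_us - t_in) > MARK_THRESHOLD_US

q = CentralQueue()
q.enqueue("fresh", now_us=0)
q.enqueue("stale", now_us=0)
print(q.dequeue(now_us=500))    # ('fresh', False): sojourn 500 us
print(q.dequeue(now_us=2000))   # ('stale', True): sojourn 2000 us
```

Marking on sojourn time works only because the queue is the true bottleneck and the timestamps are trustworthy, which is exactly what centralizing the packet memory provides.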

3.1 Classical Stack vs. Fi-Wi (The C-RAN Shift)

+ +

+ To understand Fi-Wi, we must first unlearn the definition of an "Access + Point." +

+ +
+ Reality Check 1: The RRH is a Micro-Bridge, Not an Access Point
+ The industry treats the AP as a "Router on the Ceiling." Fi-Wi replaces + this with a + Tunneling Bridge. + + The Shift: The RRH does not "process" the network; it + "extends" it. It is a transparent pipe that bridges the airgap to the + fiber, leaving all decision-making to the central brain. +
+ +
+ Reality Check 2: Coordination vs. Control
+ Traditional "Centralized Controllers" (like Cisco/Aruba) provide + Coordination. They tell APs which channels to use or + which clients to kick, but the AP still decides exactly when to transmit + every packet. The "Control Loop" is still distributed.
+
+ Fi-Wi provides Control. The Concentrator does not + "suggest" a schedule; it executes it. It tells the RRH: + "Transmit these specific bytes at exactly microsecond T." There is no + disagreement, no race condition, and no distributed chaos. +
+ +

+ In a typical + controller-managed enterprise Wi-Fi deployment, a + centralized controller (e.g., Cisco WLC, Aruba Mobility Controller, + Ubiquiti UniFi Controller) coordinates AP configuration: channel + assignment, transmit power, client steering recommendations, and SSID + management. However, + each AP remains autonomous at the data plane: +

+ + + +

+ These systems are loosely-coupled: the controller manages + the control plane (configuration, policy) but the data plane — queuing, + MAC scheduling, aggregation, and packet forwarding — remains + distributed and autonomous across individual APs. +

+ +

+ In Umber Fi-Wi (C-RAN for Wi-Fi), we split the AP and + cellularize the RF domain, down to room-level. The concentrator sees all + flows, all queues, and all RRHs. The RRHs handle 802.11 MAC/PHY but are + tightly time-synchronized and behave as DMA-driven PHY/MAC endpoints + rather than autonomous APs. A set of RRHs and their shared queues form a + cellularized Wi-Fi domain within the building, often at + “cell per room” granularity. +

+ +

+ Fi-Wi centralizes both control plane AND data plane with + shared state across all RRHs. The concentrator doesn't just configure + RRHs; it directly manages their queues, schedules their TXOPs, and + maintains unified timestamp-synchronized state across the entire + cellularized RF domain. +

+ +

3.2 Dual-Loop Control Model

+ +

+ Conceptually, Fi-Wi decouples the system into two nested feedback loops, + separated by timescale: +

+ +
+ Outer loop (End-to-End Latency): [ L4S Sender ] ──(ms)──> [ Group Queue ] ──> [ Feedback (ECN) ]
+ Inner loop (MAC Efficiency):     [ Aggregation Buffer ] ──(µs)──> [ Airtime / PHY ]
+ +

+ The Outer Loop manages congestion and end-to-end latency + (Internet speed). The Inner Loop manages MAC efficiency + and radio timing (Airtime). +

+ +

+ The Problem with Legacy Wi-Fi: Traditional APs couple + these loops unpredictably, creating "sawtooth" latency patterns that + confuse TCP. +

+ +

+ The Fi-Wi Solution: By centralizing both loops in the + Concentrator, Fi-Wi enforces a strict + Time-Scale Separation. The Inner Loop runs so fast (3–5 + kHz) that it appears as "constant service" to the slower Outer Loop (10–20 + Hz), allowing L4S to stabilize perfectly. +

+ +

+ (See Section 5: Control Architecture for the + rigorous control-theoretic analysis and stability criteria.) +

+ +
+ +

4. Key Fi-Wi Mechanisms

+ +

4.1 Time Synchronization

+ +

+ Fi-Wi operates across two distinct time domains simultaneously. The first
  is the concentrator's internal master clock, disciplined via PTP/802.1AS
  over the PCIe fronthaul (detailed in Section 4.7). The second is the
  802.11 TSF (Timing Synchronization Function) domain that 802.11 clients use
  to coordinate with the MAC layer. In a traditional AP this relationship is
  trivial — the AP runs one TSF and one local clock. In Fi-Wi, with 24 RRHs each
  presenting a TSF-aware BSS, managing the relationship between them is a
  foundational architectural responsibility of the concentrator.

+ +

4.1.1 The Fronthaul Clock: PTP/802.1AS

+ +

+ The concentrator distributes its master clock to all attached RRHs,
  achieving synchronization on the order of microseconds (and substantially
  tighter when using PCIe-native timing mechanisms such as PTM — see Section
  4.7 for the full hardware chain). This master clock gives every packet:

+ + + +

+ This clock lives entirely inside the Fi-Wi domain. Clients never see it + directly. It is the coordinate system in which shim header timestamps + (Section 4.2), AQM marking decisions (Section 4.3), and the ML training + corpus (Section 15) are all expressed. Because all packet timestamps, + service events, and queue measurements are expressed in this single master + time domain, Fi-Wi can compute precise per-packet sojourn times + independent of the TSF domain, enabling stable ECN marking and L4S control + across the system. +

+ +

4.1.2 The 802.11 TSF Domain

+ +

+ The 802.11 TSF is a 64-bit microsecond counter maintained per BSS. Clients
  set their local TSF from beacons and use it to wake from power save at the
  right moment, to interpret TBTT (Target Beacon Transmission Time), and to
  coordinate TXOP timing. The TSF is the only clock the 802.11 standard
  exposes at the MAC layer.

+ +

+ In a traditional single-AP deployment this is trivial: one AP, one TSF, + one beacon stream. In Fi-Wi it is not. Consider a client in a room served + by two RRHs in the same airtime domain. That client will receive beacons + from both RRHs. If those beacons carry inconsistent TSF values, even small + inconsistencies can lead to misaligned power-save wakeups, ambiguous TBTT + interpretation, and in some implementations degraded performance or + reassociation. The coherence of the TSF domain across all RRHs in a BSS is + not optional; it is a hard correctness requirement. +

+ +

+ Fi-Wi satisfies this requirement by construction: + the concentrator generates all beacon frames. No RRH + constructs its own beacon. The concentrator writes the TSF value into + every beacon before dispatching it to the appropriate RRH for + transmission. Because all TSF values originate from the same source and + are derived from the same master clock, they are consistent by design + rather than by coordination protocol. Within a given BSS, TSF values are + identical across all participating RRHs; multiple TSF domains arise only + when multiple BSS instances are present. +
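The stamping step itself is mechanically simple. A hedged C sketch (the 8-byte little-endian Timestamp field immediately follows the 24-byte management MAC header, per the 802.11 beacon frame format; the function name and drift handling are illustrative, not taken from this codebase):

```c
#include <stdint.h>

#define MGMT_HDR_LEN 24  /* 802.11 management-frame MAC header length */

/* Map master-clock time into RRH i's TSF domain (Section 4.1.3) and
 * stamp the beacon. The Timestamp field is the first 8 bytes of the
 * beacon body, transmitted least-significant byte first.            */
static void stamp_beacon_tsf(uint8_t *frame, uint64_t t_master_us,
                             uint64_t epoch_us, int64_t drift_us)
{
    uint64_t tsf = (t_master_us - epoch_us) + (uint64_t)drift_us;
    for (int i = 0; i < 8; i++)
        frame[MGMT_HDR_LEN + i] = (uint8_t)(tsf >> (8 * i));
}
```

Because every beacon for a given BSS is stamped from the same epoch and drift terms, TSF consistency across RRHs falls out of the write path itself.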

+ +

4.1.3 The Concentrator as Time Origin

+ +

+ The concentrator maintains 25 simultaneous time references: its own + PTP-disciplined master clock and one 802.11 TSF per RRH. Each TSF has its + own epoch (established at BSS creation) and its own drift correction term, + derived from periodic synchronization updates over the fronthaul + (PTP/802.1AS or PCIe PTM), which bound long-term drift. The concentrator + knows the exact affine mapping between the master clock and every + client-visible TSF domain at all times: +

+ +
TSF_i(t) = (t_master - epoch_i) + drift_correction_i(t)
+
+ +

+ Any event — a packet enqueue, an ECN mark, a TXOP start, a beacon + transmission — can be expressed in any of the 25 frames without loss of + precision. This is the time-domain analog of a coordinate transformation: + the concentrator is the origin from which all other reference frames are + derived, and any event timestamp can be mapped between frames via a known, + invertible affine transform, updated continuously via the fronthaul + synchronization loop. +

+ +
+

Figure 4.1-1: The Concentrator as Time Origin

+
+Concentrator master clock (PTP-disciplined)
+  │
+  ├─ Master frame: all shim timestamps, sojourn times, AQM marks, ML labels
+  │
+  ├─ TSF_1:  epoch_1, drift_1(t)  →  beacon stream for RRH 1  ┐
+  ├─ TSF_2:  epoch_2, drift_2(t)  →  beacon stream for RRH 2  │ identical within
+  ├─ TSF_3:  epoch_3, drift_3(t)  →  beacon stream for RRH 3  │ a given BSS
+  │   ...                                                       ┘
+  └─ TSF_24: epoch_24, drift_24(t) → beacon stream for RRH 24
+
+Any event E has coordinates in all 25 frames simultaneously.
+Mapping between any two frames: affine transform, known at the concentrator,
+updated continuously via the fronthaul sync loop.
+    
+

+ The concentrator as the origin of 25 simultaneous time reference frames + (for a 24-RRH deployment). Client-visible TSF domains are derived from + the master clock via known affine transforms. Within a BSS, TSF values + are identical across participating RRHs. +

+
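The affine mapping between frames can be sketched directly. A minimal C illustration (structure and function names are hypothetical; drift is treated as piecewise-constant between fronthaul sync updates):

```c
#include <stdint.h>

/* Per-RRH clock relation: TSF_i(t) = (t_master - epoch_i) + drift_i(t). */
typedef struct {
    uint64_t epoch_us;   /* TSF epoch, set at BSS creation            */
    int64_t  drift_us;   /* correction term from the last sync update */
} tsf_frame_t;

static uint64_t master_to_tsf(uint64_t t_master, const tsf_frame_t *f)
{ return (t_master - f->epoch_us) + (uint64_t)f->drift_us; }

static uint64_t tsf_to_master(uint64_t tsf, const tsf_frame_t *f)
{ return (tsf - (uint64_t)f->drift_us) + f->epoch_us; }

/* Map an event stamped in frame a into frame b: an invertible
 * composition through the master frame — the "time origin".       */
static uint64_t tsf_to_tsf(uint64_t tsf_a, const tsf_frame_t *a,
                           const tsf_frame_t *b)
{ return master_to_tsf(tsf_to_master(tsf_a, a), b); }
```

Every mapping passes through the master frame, which is why the concentrator's clock is the origin rather than merely one peer among 25.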
+ +
+ Why Distributed APs Cannot Do This
+ +

+ In a controller-managed AP deployment, each AP runs its own TSF + independently. The controller can nudge APs toward a common time + reference via 802.11v BSS Transition Management or out-of-band NTP, but + it does not generate beacon frames — each AP does. This means TSF values + across APs can diverge by the inter-AP sync error (typically tens to + hundreds of microseconds with Ethernet-based PTP, more without it). +

+ +

+ A client roaming between two such APs may see a TSF discontinuity at + handoff. Power-save state, TBTT alignment, and any MAC-layer timing + assumption the client holds must be renegotiated. In Fi-Wi, roaming + between RRHs within the same concentrator domain is a TSF-transparent + event: the client's TSF counter simply continues, because the new RRH's + beacon carries the same TSF value the old one would have carried at that + moment. The client does not know a handoff occurred at the MAC layer. +

+
+ +

+ This unified time model also enables the concentrator to schedule + transmissions across RRHs against a single global timeline, rather than + relying on independent per-RRH contention processes. TSF continuity across + RRH handoffs is a direct consequence of centralized beacon generation, and + it is what makes Fi-Wi's active redundancy claims in Section 8 + operationally credible: per-packet steering between RRHs is transparent to + clients because the client's MAC-layer time reference never changes. This + unified time model enables not only precise measurement, but coordinated + control of transmission behavior across RRHs, as described in Section + 4.1.4. +

+ +

4.1.4 Time-Driven EDCA Orchestration

+ +

+ The unified time model described above is not only a measurement + framework; it is the foundation for Fi-Wi's centralized MAC scheduling. In + conventional 802.11 deployments, EDCA (Enhanced Distributed Channel + Access) operates as a stochastic contention mechanism: each AP + independently selects random backoff values within its CWmin/CWmax range, + and medium access emerges probabilistically. +

+ +

+ In Fi-Wi, EDCA is not treated as a distributed random process. It is + treated as a centrally orchestrated actuation layer, + driven by the concentrator's master time reference. +

+ +

Because the concentrator maintains:

+ + + +

+ it can shape medium access behavior across RRHs by dynamically controlling + EDCA parameters on a per-radio basis. The key parameters are: +

+ + + +

+ By assigning narrowly bounded contention windows and staggered AIFS values + across RRHs, the concentrator can bias contention outcomes such that one + RRH is overwhelmingly likely to win access at a given moment. Rotating + these parameters over time creates a + soft time-division multiplexing (TDM) effect using + standard EDCA semantics. +

+ +

+ This transformation is only possible because all RRHs share a common time + reference. The concentrator can schedule EDCA parameter updates relative + to the master clock and ensure that all RRHs apply them in a coordinated + manner. Without this shared time base, independent EDCA processes would + quickly decorrelate and revert to stochastic contention. +

+ +

Conceptually, the concentrator executes a scheduling loop:

+ +
for each scheduling interval:
+  observe queue state across RRHs        // centralized visibility
+  select next RRH (or RF group) to serve // queue-aware decision
+  assign EDCA parameters (CWmin, CWmax, AIFS, TXOP)
+  enforce timing relative to master clock // coordinated application
+
+ +
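A more concrete rendering of this scheduling loop, as a hedged C sketch — the per-RRH state layout and the EDCA values are illustrative assumptions, not tuned parameters from this document:

```c
#include <stdint.h>

/* Hypothetical per-RRH state visible to the concentrator. */
typedef struct {
    uint32_t qlen_pkts;        /* group-queue depth, observed centrally */
    uint8_t  cwmin, cwmax, aifsn;
    uint16_t txop_limit_us;
} rrh_state_t;

/* One scheduling interval: bias EDCA so the most-backlogged RRH is
 * overwhelmingly likely to win the next contention round while the
 * others defer — a soft-TDM rotation, not strict TDMA.             */
static int schedule_interval(rrh_state_t *rrh, int n)
{
    int winner = 0;
    for (int i = 1; i < n; i++)
        if (rrh[i].qlen_pkts > rrh[winner].qlen_pkts) winner = i;

    for (int i = 0; i < n; i++) {
        if (i == winner) {             /* near-immediate access */
            rrh[i].cwmin = 1;   rrh[i].cwmax = 3;
            rrh[i].aifsn = 2;   rrh[i].txop_limit_us = 4000;
        } else {                       /* back off this interval */
            rrh[i].cwmin = 63;  rrh[i].cwmax = 255;
            rrh[i].aifsn = 7;   rrh[i].txop_limit_us = 0;
        }
    }
    return winner;  /* all RRHs apply at the same master-clock edge */
}
```

The coordinated-application step (all RRHs switching parameters at the same master-clock edge) is the part distributed controllers cannot reproduce, as the following box explains.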

+ The result is not strict TDMA — 802.11 contention semantics are preserved + and the system remains compliant with standard client behavior — but the + distribution of outcomes is shaped by the concentrator. Over short time + horizons, access becomes highly predictable and service intervals can be + bounded. This has two critical consequences: +

+ + + +

+ Because TSF values are consistent across RRHs, these scheduling decisions + are MAC-transparent to clients. From the client's perspective, the network + behaves as a single, coherent AP with stable timing characteristics, even + as transmissions are steered across multiple physical radios. +

+ +
+ Why Distributed AP Systems Cannot Replicate This
+ +

+ Controller-based Wi-Fi systems can configure EDCA parameters on + individual APs, but they cannot coordinate their application in time + with sufficient precision. Each AP maintains its own clock, its own + contention process, and its own transmit queues. +

+ +

+ Without a shared time origin and centralized queue visibility, EDCA + remains a probabilistic mechanism. Attempts to tune contention + parameters across APs produce statistical bias at best, not + deterministic scheduling. The lack of a unified time domain prevents + coordinated rotation of access privileges across radios. +

+ +

+ Fi-Wi's ability to treat EDCA as a controllable scheduling primitive is + a direct consequence of the concentrator's role as both the time origin + and the sole owner of transmit queues. +

+
+ +

+ This time-driven EDCA orchestration is the mechanism by which Fi-Wi + converts the inherently stochastic 802.11 MAC into a + predictable, centrally scheduled system — completing the + chain from time synchronization through queue observability to stable L4S + control. +

+ +

4.2 Fi-Wi Shim Header

+ +

+ Between 802.3/IP and the fronthaul link we add a small internal metadata + header. Conceptual form: +

+ +
struct FiWiMeta {
+  uint64_t seq;          // fronthaul sequence number
+  uint64_t t_ingress_us; // time packet enqueued into group queue (central DRAM)
+  uint32_t txop_id;      // TXOP this MSDU is in
+  uint8_t  mpdu_idx;     // index within aggregate
+  uint8_t  mpdu_cnt;     // total MPDUs in this aggregate
+  uint8_t  ecn_flags;    // CE applied? which queue? reason bits
+  uint32_t qlen_pkts;    // queue depth snapshot at TXOP start
+};
+
+

This header is visible only inside the Fi-Wi domain. It lets us:

+ + + +
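As one illustration of what the shim header enables, a hedged C sketch computing the tail sojourn time of a TXOP's aggregate (the struct mirrors `FiWiMeta` above; the helper function is hypothetical):

```c
#include <stdint.h>

/* Mirrors the conceptual FiWiMeta struct above (Section 4.2). */
typedef struct {
    uint64_t seq;
    uint64_t t_ingress_us;   /* enqueue time, master clock domain */
    uint32_t txop_id;
    uint8_t  mpdu_idx, mpdu_cnt, ecn_flags;
    uint32_t qlen_pkts;
} fiwi_meta_t;

/* Tail (worst-case) sojourn across the MPDUs of one TXOP: because both
 * timestamps live in the concentrator's master time domain (Section
 * 4.1.1), this is a plain subtraction — no clock translation needed. */
static uint64_t txop_tail_sojourn_us(const fiwi_meta_t *m, int n,
                                     uint64_t t_txop_start_us)
{
    uint64_t worst = 0;
    for (int i = 0; i < n; i++) {
        uint64_t s = t_txop_start_us - m[i].t_ingress_us;
        if (s > worst) worst = s;
    }
    return worst;
}
```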

4.3 AQM / L4S Marking Placement

+ +

+ We choose the group queues in the concentrator—each + corresponding to a cellularized airtime domain shared by one RRH or by + multiple interfering RRHs—as the only places where deep queues + are allowed and where we apply ECN: +

+ + + +

+ Other queues (within RRH hardware, on the fiber/fronthaul link) are kept
  shallow via pacing and controlled descriptor posting. The group queue thus
  becomes the sole bottleneck in each cellularized
  airtime domain, which is exactly what L4S wants: a small number of stable,
  well-behaved bottlenecks with known behavior. The control policy is
  explicitly tuned to keep both average and
  tail queueing delay low.
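A minimal sketch of the per-packet marking decision, in the spirit of L4S immediate (step) marking on sojourn time — the threshold constant is illustrative, not a tuned value from this document:

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative shallow target for the L4S queue (~1 ms). */
#define L4S_STEP_THRESH_US 1000

/* ECN marking at the group queue: mark CE when the packet's sojourn
 * time in the central queue exceeds the threshold. Both timestamps
 * come from the shim header (Section 4.2) and are expressed in the
 * master time domain (Section 4.1.1).                              */
static bool mark_ce(uint64_t t_ingress_us, uint64_t t_dequeue_us)
{
    uint64_t sojourn_us = t_dequeue_us - t_ingress_us;
    return sojourn_us >= L4S_STEP_THRESH_US;
}
```

A production AQM (e.g., a DualQ-style coupled design) would add a ramp and coupling to classic traffic; the point here is only that marking keys off sojourn time measured at the single central bottleneck.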

+ +

4.4 Centralized Packet Memory and DMA

+ +
+ DMA (Direct Memory Access): Why RRHs Can Be Simple
+ +

+ The Standard AP Architecture: Traditional Wi-Fi chips + already use DMA to move packets from host memory to the radio without + CPU involvement. But they require a local CPU to create + descriptors, manage buffers, and run the network stack. Every AP is a + complete computer running millions of lines of Linux. +

+ +

The Fi-Wi Innovation: DMA Over Distance (not RDMA)

+ +

+ Fi-Wi extends the PCIe bus over fiber, allowing the RRH's DMA engine to + read and write remote memory in the Concentrator. To + the RRH silicon, memory 100 meters away appears "local"—accessible with + the same PCIe transactions a traditional Wi-Fi chip uses to access DRAM + 10 millimeters away on the motherboard. +

+ +

+ Result: The local CPU, local DRAM, and entire Linux + stack can be eliminated. The RRH becomes a pure "micro-bridge"—just DMA + + MAC/PHY logic. +

+ +

The Silicon Cost Difference:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ComponentTraditional APFi-Wi RRH
+ MAC/PHY Silicon
+ (802.11 Radio Logic) +
+ ~15-20M gates
+ MIMO, error correction, etc.
+ Complexity dictated by physics +
+ ~15-20M gates
+ Same physics, same complexity
+ No savings here +
+ Host SoC / CPU
+ (The "Brains") +
+ ~50-100M gates
+ Multi-core ARM CPU
+ DDR4 controller
+ Peripherals, caches, etc. +
+ ~100K-500K gates
+ Simple DMA state machine
+ Descriptor buffer only
+ 100-1000x simpler +
DRAM + 256MB - 1GB DDR4
+ (Required for OS + buffers) +
+ 16-64KB SRAM
+ (Descriptor storage only) +
Operating System + Linux (millions of LOC)
+ Requires security patches +
+ None
+ Zero software attack surface +
Total Silicon~70-120M gates~15-20M gates
+ +

Direct Implications:

+ + + +

The Economic Model:

+ +

+ Traditional Architecture: 50 APs = 50 CPUs, 50 DRAM modules, 50 power + supplies, 50 Linux installations, 50 security update cycles. +

+ +

+ Fi-Wi Architecture: 1 powerful Concentrator (workstation-class) + 50 + simple RRHs (DMA + radio only). +

+ +

+ Total system cost is lower because you're paying for + intelligence once, not 50 times. +

+ +

Why Incumbents Cannot Do This:

+ +

+ Traditional AP vendors have already optimized their SoC designs—the CPU, + DRAM controller, and peripherals are as efficient as they can be. But + their architecture requires these components at every radio + because each AP operates autonomously. Even if they wanted to simplify, + the distributed control model forces complexity at the edge. +

+ +

+ Fi-Wi's centralized architecture enables the per-radio simplification. + This is a structural cost advantage, not a manufacturing + efficiency. + Replicating it would require incumbents to abandon their entire product + line and business model—a classic Innovator's Dilemma. +

+ +

+ Bottom Line: C-RAN works because + silicon economics favor centralized intelligence. The + gate count difference isn't cosmetic—it's the foundation of Fi-Wi's + cost, power, and reliability advantages. +

+
+ +

In Fi-Wi, packet memory is centralized in the concentrator:

+ + + +
+ Central DRAM (Fi-Wi Concentrator)
+ ──────────────────────────────────
+ Group queue A → RRH1, RRH2  (shared RF cell)
+ Group queue B → RRH3        (isolated cell)
+ Group queue C → RRH4–RRH7   (shared RF cell)
+ ...
+ Queues live centrally; RRHs are DMA clients draining those queues into airtime.
+ +

This design:

+ + + +

4.5 RRH Edge Control via Beacon Power Shaping

+ +

+ Because the Fi-Wi concentrator maintains shared state for + the entire RF domain, it can directly control the + RF footprint of each RRH by adjusting per-RRH beacon + transmit power. This alters: +

+ + + +

+ Beacon power is one of the most effective tools for + dynamic RF cell shaping because it affects STA + association and roaming decisions without modifying data-plane PHY rates. + By lowering beacon power at certain RRHs and raising it at others, the + concentrator can: +

+ + + +

+ Traditional controller+AP systems attempt similar behavior but lack true + shared state because each AP maintains its own queueing and PHY + decisions. In Fi-Wi, beacon shaping is coordinated with: +

+ + + +

+ This makes beacon power a first-class control variable in defining and + stabilizing the boundaries of each cellularized RF domain. +
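One plausible shaping step, sketched in C — the hysteresis thresholds, step size, and power bounds are illustrative assumptions, not values from this document:

```c
#include <stdint.h>

/* Illustrative per-RRH beacon power bounds (dBm). */
#define PWR_MIN_DBM  5
#define PWR_MAX_DBM 20

/* Shrink the RF footprint of an overloaded RRH and grow that of a
 * lightly loaded one, one step per control interval. Because the
 * concentrator owns the group queues, qlen_pkts is the true load
 * signal, not an estimate reported by an autonomous AP.           */
static int8_t shape_beacon_power(int8_t pwr_dbm, uint32_t qlen_pkts,
                                 uint32_t lo_thresh, uint32_t hi_thresh)
{
    if (qlen_pkts > hi_thresh && pwr_dbm > PWR_MIN_DBM) pwr_dbm -= 1;
    if (qlen_pkts < lo_thresh && pwr_dbm < PWR_MAX_DBM) pwr_dbm += 1;
    return pwr_dbm;   /* applied coherently via the master clock */
}
```

The hysteresis band (no change while `lo_thresh ≤ qlen ≤ hi_thresh`) is what keeps cell boundaries stable rather than oscillating.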

+ +

4.6 Fronthaul Requirements and Feasibility

+ +

+ The Fi-Wi architecture requires deterministic, low-latency fronthaul links + between the concentrator and RRHs. Because RRHs function as DMA engines + accessing centralized packet memory (Section 4.4), Umber's implementation + uses PCIe (PCI Express) over fiber rather than Ethernet. + This section quantifies bandwidth, latency, and jitter requirements, and + demonstrates that PCIe over fiber not only meets these requirements but + provides superior performance compared to network-based alternatives. +

+ +

4.6.1 Why PCIe Over Fiber?

+ +

+ The choice of PCIe over fiber instead of Ethernet is driven by the Fi-Wi + architectural model: +

+ +

+ RRHs as DMA engines: Each RRH directly reads packet + descriptors from concentrator DRAM, fetches packet data, and writes + received packets back to memory. This is native PCIe behavior—exactly how + a network card or storage controller operates. +

+ +

+ Latency advantage: PCIe avoids the network stack + entirely: +

+ + + +

+ Determinism: PCIe provides guaranteed bandwidth + allocation and predictable latency through: +

+ + + +

+ Simplicity: The RRH sees the concentrator's memory space + directly. No protocol translation, no socket APIs, no network + configuration. +

+ +

4.6.2 PCIe Bandwidth Requirements

+ +

Each RRH requires bandwidth for:

+ +

1. Downlink packet DMA (concentrator → RRH)

+ +

+ For an RRH serving one or more STAs with aggregate capacity + Ceff: +

+ +
+BWDL = Ceff · (1 + OHdesc)                    (4.1)
+
+

+ where OHdesc accounts for DMA descriptors, metadata, and PCIe + TLP (Transaction Layer Packet) overhead (typically 10-20%). +

+ +

+ Example: For Ceff = 600 Mbps (typical 802.11ax + 2×2 MIMO) with OHdesc = 0.15: +

+ +
+BWDL = 600 · 1.15 = 690 Mbps
+
+

2. Uplink packet DMA (RRH → concentrator)

+ +

+ Typically symmetric or slightly higher than downlink due to ACKs and + control frames: +

+ +
+BWUL ≈ BWDL · 1.1 ≈ 760 Mbps                   (4.2)
+
+

3. CSI and status updates

+ +

+ Channel State Information and MAC statistics are written to concentrator + memory via PCIe: +

+ +
+BWCSI = Nsta · Nsc · Ntx · Nrx · Bsample · fCSI    (4.3)
+
+

+ For Nsta=4, Nsc=234, Ntx=2, + Nrx=2, Bsample=24 bits, fCSI=50 Hz: +

+ +
+BWCSI = 4.49 Mbps per RRH
+
+

4. Control and command traffic (concentrator → RRH)

+ +

+ Configuration updates, timing sync corrections, power/channel commands: +

+ +
+BWcontrol ≈ 1-5 Mbps per RRH                         (4.4)
+
+

Total bidirectional bandwidth per RRH:

+ +
+BWtotal = BWDL + BWUL + BWCSI + BWcontrol           (4.5)
+BWtotal ≈ 690 + 760 + 4.5 + 2 = 1456 Mbps ≈ 1.5 Gbps
+
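Equations (4.1)–(4.5) can be checked with a few lines of C (a sketch; the function names are ours, not from any codebase):

```c
/* Eq. (4.3): raw CSI telemetry rate, result in Mbps. */
static double bw_csi_mbps(int n_sta, int n_sc, int n_tx, int n_rx,
                          int b_sample_bits, int f_csi_hz)
{
    return (double)n_sta * n_sc * n_tx * n_rx
         * b_sample_bits * f_csi_hz / 1e6;
}

/* Per-RRH fronthaul budget, Eqs. (4.1)-(4.5). All rates in Mbps. */
static double bw_total_mbps(double c_eff, double oh_desc,
                            double bw_csi, double bw_ctrl)
{
    double dl = c_eff * (1.0 + oh_desc);  /* (4.1) downlink DMA        */
    double ul = dl * 1.1;                 /* (4.2) uplink + ACK traffic */
    return dl + ul + bw_csi + bw_ctrl;    /* (4.5) total per RRH       */
}
```

Plugging in the worked values above (Ceff = 600 Mbps, OHdesc = 0.15, 4 STAs of 2×2 CSI at 50 Hz, 2 Mbps control) reproduces the ~1.5 Gbps per-RRH figure.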
+

4.6.3 PCIe Link Configuration

+ +

PCIe bandwidth is determined by generation and lane count:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
PCIe GenPer-Lane Ratex1 Linkx4 Linkx8 Link
Gen 3~8 GT/s + ~985 MB/s
+ (7.88 Gbps) +
+ ~3.94 GB/s
+ (31.5 Gbps) +
+ ~7.88 GB/s
+ (63 Gbps) +
Gen 4~16 GT/s + ~1.97 GB/s
+ (15.75 Gbps) +
+ ~7.88 GB/s
+ (63 Gbps) +
+ ~15.75 GB/s
+ (126 Gbps) +
Gen 5~32 GT/s + ~3.94 GB/s
+ (31.5 Gbps) +
+ ~15.75 GB/s
+ (126 Gbps) +
+ ~31.5 GB/s
+ (252 Gbps) +
+ +

+ Note: Effective bandwidth accounts for 128b/130b encoding (Gen 3+) and + protocol overhead. +

+ +

RRH link sizing: For 1.5 Gbps per RRH requirement:

+ + + +

+ A single PCIe Gen 3 x1 lane is sufficient per RRH with substantial + headroom. +

+ +

4.6.4 Concentrator PCIe Topology

+ +

+ The concentrator must aggregate multiple RRH connections. Consider a + 50-RRH deployment: +

+ +

Total aggregate bandwidth requirement:

+ +
+BWaggregate = NRRH · BWtotal                      (4.6)
+BWaggregate = 50 · 1.5 Gbps = 75 Gbps (peak)
+
+

With 40% average utilization (typical for building-wide traffic):

+ +
+BWtypical = 75 · 0.40 = 30 Gbps
+
+

Architecture Options:

+ +

Option 1: PCIe switch fabric

+ + + +

Option 2: Multi-host server (Dual Socket)

+ + + +
+ Option 3: The Fi-Wi Choice — Workstation-Class Single-Socket
+ To achieve perfect determinism, Fi-Wi standardizes on + High-End Desktop (HEDT) / Workstation silicon (e.g., AMD + Threadripper Pro or Intel Xeon W-3400 series). + + This "Goldilocks" topology enables the + Non-Blocking Architecture detailed in + Section 13. +
+ +

4.6.5 PCIe Over Fiber: Physical Layer

+ +

+ Standard PCIe uses copper traces on motherboards (limited to ~30cm at Gen + 3/4 speeds). To reach RRHs distributed throughout a building, PCIe signals + are carried over fiber using optical transceivers. +

+ +

Technologies:

+ +

1. Active Optical Cables (AOC)

+ + + +

2. Optical PCIe adapter cards

+ + + +

3. PCIe fabric extenders

+ + + +

+ Recommended approach for Fi-Wi: Optical PCIe adapter + cards with standard fiber infrastructure, providing flexibility and + leveraging commodity fiber installation. +

+ +

4.6.6 Latency Analysis

+ +

PCIe over fiber latency components:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ComponentLatency
PCIe TLP formation (concentrator)0.2-0.5 µs
Optical transceiver (TX)0.1-0.3 µs
Fiber propagation (100m)0.5 µs
Optical transceiver (RX)0.1-0.3 µs
PCIe TLP processing (RRH)0.2-0.5 µs
PCIe switch (if used)0.1-0.3 µs per hop
Total one-way1.2-2.4 µs
Round-trip (DMA read)2.4-4.8 µs
+ +

Comparison to Ethernet:

+ + + + + + + + + + + + + + + + + + + + + + + + + +
Fronthaul TypeRound-Trip LatencyDeterminism
PCIe over fiber2.4-4.8 µsExcellent (credit-based)
10GbE (cut-through)10-30 µsGood (with QoS)
10GbE (store-forward)20-100 µsFair (subject to congestion)
+ +

+ PCIe over fiber provides 5-10× lower latency than even
  optimized Ethernet, which is critical for the inner control loop (Appendix
  B) operating at 200-500 µs timescales.

+ +

4.6.7 Jitter and Determinism

+ +

+ PCIe's credit-based flow control eliminates congestion drops and provides + deterministic latency: +

+ + + +

+ Measured jitter: PCIe over fiber typically exhibits + <50 ns jitter, well under the 200 ns budget for 1 µs time + synchronization (Section 4.1). +

+ +

+ This determinism is impossible to achieve with Ethernet without + time-sensitive networking (TSN) extensions, which add complexity and cost. +

+ +

4.6.8 Distance Limitations

+ +

+ PCIe over fiber distance depends on optical budget and signal integrity: +

+ + + + + + + + + + + + + + + + + + + + + + + + + +
PCIe GenMulti-Mode FiberSingle-Mode Fiber
Gen 3 (8 GT/s)300 m10 km
Gen 4 (16 GT/s)100 m2-10 km
Gen 5 (32 GT/s)50-100 m2 km
+ +

+ Fi-Wi requirement: Building-scale deployments require + ≤100 m reach, easily achieved with Gen 3/4 over multi-mode fiber or any + generation over single-mode fiber. +

+ +

4.6.9 Cost Analysis

+ +

PCIe over fiber cost per RRH:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ComponentCost (approx.)
RRH-side PCIe optical adapter$150-300
Fiber pair (50m installed)$50-100
Optical transceiver pair$50-100
PCIe switch port allocation$100-200
Total per RRH$350-700
+ +

Comparison to network alternatives:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ApproachCost per RRHLatencyDeterminism
PCIe over fiber$350-7002-5 µsExcellent
10GbE + TSN$300-60010-30 µsGood
Standard 10GbE$200-40020-100 µsFair
+ +

+ PCIe over fiber costs moderately more than standard Ethernet but delivers + 5-10× better latency and superior determinism. For Fi-Wi's DMA-based + architecture, this cost is justified by the performance and architectural + simplicity gains. +

+ +

+ For context: a typical enterprise AP costs $500-2000, and a cellular small + cell costs $1000-5000. The fronthaul cost is comparable to or less than + the radio cost difference, making it economically viable. +

+ +

4.6.10 Alternative: Hybrid PCIe + Ethernet

+ +

+ For deployments where PCIe over fiber infrastructure is unavailable, a + hybrid approach is possible: +

+ + + +

+ This reduces PCIe bandwidth requirements (only packet data, not + CSI/control) and allows leveraging existing Ethernet infrastructure for + non-latency-critical traffic. +

+ +

+ However, the pure PCIe approach is architecturally cleaner and avoids the + complexity of dual-protocol RRH implementation. +

+ +

4.6.11 Comparison to Cellular Fronthaul Standards

+ +

For context, cellular systems use:

+ +

CPRI (Common Public Radio Interface):

+ + + +

eCPRI (Enhanced CPRI) / Fronthaul Gateway:

+ + + +

Fi-Wi (PCIe over fiber):

+ + + +

+ Fi-Wi's functional split and PCIe transport provides a unique balance: + lower bandwidth than CPRI, lower latency than eCPRI, and native + integration with the DMA-based architecture. +

+ +

4.6.12 Summary: PCIe Over Fiber Enables Fi-Wi Architecture

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
RequirementTargetAchieved with PCIe Gen 3 x1
Bandwidth per RRH~1.5 Gbps✓ 7.88 Gbps (5× margin)
Aggregate (50 RRH)~30 Gbps avg✓ PCIe switch or multi-CPU
Round-trip latency<10 µs✓ 2.4-4.8 µs
Jitter<200 ns✓ <50 ns (credit-based)
Distance≤100 m✓ 300m MM / 10km SM
DeterminismNo drops, predictable✓ Credit-based flow control
Cost per RRH<$700✓ $350-700
+ +

Why PCIe over fiber is the right choice for Fi-Wi:

+ +
+
+  1. Native DMA model: RRHs are DMA engines—PCIe is the natural transport
+  2. Lowest latency: 2-5 µs vs. 10-100 µs for Ethernet
+  3. Perfect determinism: Credit-based flow control eliminates jitter and drops
+  4. Architectural simplicity: No network stack, no protocol translation
+  5. Proven technology: Used in HPC, storage (NVMe-oF), and telecom
+

+ The deterministic, sub-5-microsecond fronthaul is what enables Fi-Wi's + centralized control, time synchronization, and single-bottleneck queueing + architecture. Unlike Wi-Fi mesh, controller-based systems with + over-the-air backhaul, or even Ethernet-based approaches, PCIe over fiber + provides the predictable substrate needed for the control loops described + in Appendices A and B to operate with the precision required for + sub-millisecond tail latency control. +

+ +
+

+ 4.7 Precision Clock Synchronization over Fronthaul +

+ +

+ The "cellularization" of Wi-Fi relies on a unified timebase. In the + Fi-Wi architecture, time is not merely used for logging; it is a + control variable. To achieve coordinated scheduling, + accurate queue measurements, and seamless mobility, every RRH must share + a precise understanding of "now" down to the microsecond level. +

+ +

+ To achieve this, Fi-Wi establishes a strict + Hierarchical Clock Tree over the PCIe fronthaul, + leveraging the native determinism of the bus rather than the best-effort + nature of packet switching. +

+ +

4.7.1 The Concentrator as Grandmaster (GM)

+ +

+ The Fi-Wi Concentrator acts as the + PTP Grandmaster (IEEE 1588v2 / 802.1AS) for the entire + building. It houses the primary reference oscillator (typically a + high-stability OCXO). +

+ + + +
+

Diagram 4-2: The Fi-Wi Clock Tree Topology

+ +
+          External Reference (Optional GPS/GNSS)
+                       │
+                       ▼
+    ┌──────────────────────────────────────────────┐
+    │            Fi-Wi Concentrator                │
+    │    [ High-Stability Oscillator (OCXO) ]     │ ◄── Grandmaster (GM)
+    │           (System Timebase t0)               │
+    └──────────────────┬───────────────────────────┘
+                       │ PCIe PTM / Hardware Sync
+                       │ (Compensates for fiber flight time)
+          ┌────────────┼─────────────┐
+          ▼            ▼             ▼
+    ┌───────────┐ ┌───────────┐ ┌───────────┐
+    │   RRH 1   │ │   RRH 2   │ │   RRH 3   │      ◄── Slaves
+    │ [LocalOsc]│ │ [LocalOsc]│ │ [LocalOsc]│
+    │  Locked   │ │  Locked   │ │  Locked   │
+    └─────┬─────┘ └─────┬─────┘ └─────┬─────┘
+          │             │             │
+          ▼             ▼             ▼
+     Frequency-Coordinated Operation
+    
+
+ +

4.7.2 What Clock Synchronization Actually Enables

+ +

+ A defining advantage of the Fi-Wi architecture is the use of "Hard + Synchronization" via PCIe, rather than "Soft Synchronization" via + Ethernet. While Ethernet-based APs rely on IEEE 1588 PTP, they are + subject to switch jitter and software stack latency. PCIe over fiber + eliminates these variables. +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
FeatureFi-Wi (PCIe over Fiber)Traditional APs (Ethernet)
Protocol + PCIe PTM (Precision Time Measurement)
+ Hardware-native, bus-level messages +
+ IEEE 1588 PTP
+ Packet-based, software/firmware stack +
Sync Accuracy + 20-50 nanoseconds
+ Bus cycle precision + fiber margin +
+ 100ns – 10µs
+ Highly dependent on network load +
Jitter Source + Minimal
+ Point-to-point hardware flow control +
+ High
+ Switch queuing & software interrupt latency +
CPU Overhead + Zero
+ Handled entirely by PCIe PHY/Controller +
+ Moderate to High
+ CPU must interrupt to process sync packets +
Primary Benefits + Accurate L4S timestamps, TSF synchronization, unified timeline for + clients + Basic time sync for logging and management
+ +

+ Important Note: While frequency-locked clocks provide + excellent timing consistency, they do not enable RF phase control or + coordinated simultaneous transmission. COTS Wi-Fi chips have independent + RF synthesizers with arbitrary phase offsets that cannot be controlled + externally. The value of clock synchronization lies in accurate + timestamping for L4S queue management and consistent TSF counters for + seamless client mobility, not in RF phase alignment. +

+ +

4.7.3 Operating Modes: GPS-Disciplined vs. Free-Wheeling

+ +

+ The Concentrator's clock behavior depends on the deployment environment + and regulatory requirements. There are two distinct modes of operation: +

+ +
Mode A: GPS-Disciplined (Absolute Synchronization)
+ +

+ In this mode, the Concentrator is connected to an external GNSS + (GPS/Galileo) receiver. The internal oscillator is disciplined to align + with UTC (Coordinated Universal Time). This connects + the internal timing of the Fi-Wi system to external absolute time. +

+ +
Mode B: Free-Wheeling (Relative Synchronization)
+ +

+ In deep indoor environments (basements, bunkers) where GPS is + unavailable, or cost-sensitive deployments where 6 GHz AFC is not + required, the Concentrator operates in + Free-Wheeling mode. +

+ +
+ The Engineering Reality: Timing Consistency vs. Absolute Time
+ For dynamic RRH selection and coordinated scheduling, what matters is + consistent timing across RRHs, not absolute UTC accuracy. As long as all + RRHs maintain synchronized TSF counters relative to the Concentrator, + the system can provide seamless mobility and accurate queue + measurements—even if the system's concept of "UTC" is drifting by + seconds per year relative to atomic time.
+
+ Because all RRHs are frequency-locked to the same Concentrator + oscillator, if the Concentrator drifts, the entire system drifts in + unison. This uniform time base enables coordinated operation without + requiring external time references for basic functionality. +
+ +

4.7.4 When Absolute Time Becomes Mandatory

+ +

+ While Free-Wheeling mode is sufficient for core system operation, + GPS-Disciplined (Absolute) mode becomes mandatory when + the Fi-Wi system interacts with external systems that require UTC + timestamps: +

+ +
    +
  1. + 6 GHz AFC (Automated Frequency Coordination): To + operate at Standard Power in the 6 GHz band + (essential for outdoor or large-venue coverage), the FCC requires the + system to check a central database for incumbent microwave links. The + database operates on UTC. The Concentrator must sign its request with + a precise, absolute timestamp and geolocation. A drifting clock will + cause the AFC request to be rejected, forcing the system into Low + Power Indoor (LPI) mode. +
  2. + +
  3. + Inter-Concentrator Handoffs (Multi-Building Roaming): + In a campus environment with two distinct Concentrators (e.g., + Building A and Building B), a client roaming between them may + experience time jumps. If Concentrator A and B are free-wheeling + independently, their timestamps may differ by seconds. This jump can + break high-level security protocols (like Kerberos or 802.1X + re-authentication) that reject "replay attacks" based on timestamp + windows. +
  4. + +
  5. + Correlated Debugging: If a user reports a + connectivity drop at 10:04 AM, but the Concentrator has drifted by 45 + seconds, the system logs will be stamped 10:04:45. Correlating Fi-Wi + logs with client-side logs (which are usually synced to NTP/Cellular + time) becomes operationally difficult, complicating root-cause + analysis. +
  6. +
+ +

4.7.5 RRH Clock Distribution Hardware

+ +

+ Standard enterprise APs utilize free-running crystal oscillators with + ~20 ppm frequency error. This causes TSF counters to drift relative to + each other, making seamless mobility difficult. To achieve the timing + consistency required for Fi-Wi's coordinated operation, the RRH hardware + architecture must be fundamentally different. +
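A back-of-envelope check of the drift claim above (a sketch; the ±20 ppm figure is from the text, the worst-case pairing of one fast and one slow oscillator is an assumption):

```python
# Worst-case relative drift between two free-running ±20 ppm oscillators:
# one runs 20 ppm fast, the other 20 ppm slow (assumed worst-case pairing).
ppm = 20e-6
relative_error = 2 * ppm                  # 40 ppm between the two TSF counters
drift_us_per_s = relative_error * 1e6     # microseconds of divergence per second
```

At 40 µs of divergence per second, two uncoordinated TSF counters fall outside tight scheduling tolerances within seconds, which is why the fronthaul-recovered clock is required.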

+ +

+ The Fi-Wi Solution: The RRH hardware uses + Mobile-Class Wi-Fi Silicon (which natively supports + external clock inputs) driven by a + Fronthaul-Recovered Precision Clock. +

+ +
+

Diagram 4-3: RRH Precision Clock Distribution Chain

+ +
+┌──────────────────────────────────────────────────────────────────────────────┐
+│                        RRH CLOCK DISTRIBUTION ARCHITECTURE                   │
+└──────────────────────────────────────────────────────────────────────────────┘
+
+        [ PCIe Over Fiber ]
+                 │
+                 │ (1) PTM Timestamps (Implicit Clock)
+                 ▼
+   ┌─────────────────────────────┐
+   │      RRH FPGA / Retimer     │
+   │   (Clock Recovery Circuit)  │
+   └─────────────┬───────────────┘
+                 │
+                 │ (2) "Dirty" Recovered Clock (High Jitter)
+                 ▼
+   ┌─────────────────────────────┐           ┌─────────────────────────────┐
+   │    JITTER ATTENUATOR IC     │           │    WI-FI 7 SOC (Client)     │
+   │    (e.g., Si5395 / LMK05)   │           │                             │
+   │                             │           │                             │
+   │   ┌─────────────────────┐   │           │    ┌───────────────────┐    │
+   │   │  Digital Servo Loop │   │ (3) Clean │    │   Internal PLL    │    │
+   │   │      (DSPLL)        │───┼───────────┼───►│ (RF Synthesizer)  │    │
+   │   └─────────────────────┘   │ 40 MHz    │    └─────────┬─────────┘    │
+   │                             │ Reference │              │              │
+   └─────────────────────────────┘           └──────────────┼──────────────┘
+                                                            │
+                                                            ▼
+                                                   [ 5 GHz / 6 GHz ]
+                                                   [ RF Carrier    ]
+                                                   (Independent phase per RRH)
+    
+

+ Signal Flow: The RRH recovers a noisy clock from the + PCIe fronthaul. A digital Jitter Attenuator cleans the signal using an + internal DSP servo loop. This provides the ultra-low phase noise + reference required for 4096-QAM while maintaining frequency lock to + the Concentrator's timebase. Note: The Wi-Fi chip's internal PLL + establishes its own RF carrier phase, which is independent across + RRHs. +

+
+ +

The clock distribution chain operates as follows:

+ +
    +
  1. + Concentrator (Grandmaster): Distributes the master + timebase via PTM packets over the PCIe-over-fiber link. +
  2. + +
  3. + RRH FPGA / Retimer: Recovers the implicit clock from + the PCIe bitstream or explicit PTM timestamps. +
  4. + +
  5. + Network Synchronizer (Jitter Attenuator): +
      +
    • + Component: e.g., Silicon Labs Si5395 or TI LMK05318. +
    • + +
    • + Function: Feeds the "dirty" recovered clock digitally + into this dedicated IC. +
    • + +
    • + Cleaning: The IC uses an internal, narrow-bandwidth DSP + servo loop to filter out PCIe transport jitter, synthesizing a + pristine 40 MHz reference. +
    • +
    +
  6. + +
  7. + Wi-Fi SoC (Client SKU): The cleaned signal is fed + directly into the chip's Ext_Ref / + XO_IN pin. The chip's internal PLLs lock to this external + frequency reference, ensuring consistent TSF counter operation across + all RRHs. +
  8. +
+ +
+ Architectural Decision: Digital Holdover vs. Voltage Control
+ Fi-Wi uses a Digital Network Synchronizer rather than a + traditional VCTCXO servo loop. In a VCTCXO design, any noise on the + analog control voltage line translates directly into phase noise, which + degrades 4096-QAM EVM. By using digital jitter attenuation, the control + loop remains in the digital domain until final synthesis, ensuring + ultra-low phase noise while providing superior holdover stability if the + fiber link flickers. +
+ +

4.7.6 Why Mobile Wi-Fi SKUs?

+ +

+ Fi-Wi explicitly selects + Mobile/Client Wi-Fi 7 chipsets (e.g., Qualcomm + FastConnect or Broadcom BCM43xx client series) rather than traditional + Enterprise AP SKUs. This choice is driven by specific architectural + needs: +

+ + + +

4.7.7 What Clock Synchronization Does NOT Enable

+ +

+ It is important to understand the limitations of frequency-locked clocks + with COTS Wi-Fi hardware: +

+ + + +
+ Key Insight: The frequency-locked clock discipline + ensures that TSF counters increment synchronously across all RRHs. This + enables consistent timing for seamless mobility and accurate + queue measurements—but does not enable RF phase control or coordinated + simultaneous transmission. Those capabilities would require custom ASIC + development with externally-controllable RF synthesizers, which is + beyond the scope of COTS Wi-Fi chipsets. +
+
+ +
+ +

5. Control Architecture: The Dual-Integrator System

+ +

+ A rigorous control-theoretic analysis of Wi-Fi reveals a fundamental + challenge: there are not one, but + two distinct integrators in the transmit path. In + traditional autonomous APs, these integrators are coupled in undefined + ways, leading to instability (bufferbloat) and poor interaction with TCP + congestion control. Fi-Wi explicitly separates these integrators, applies + distinct control laws to each, and enforces a strict + Time-Scale Separation to guarantee system stability. +

+ +

5.1 The Two Integrators

+ +

+ To achieve stability, we must model and control two distinct accumulation + processes: +

+ +
    +
  1. + The Outer Integrator (Group Queue): Located in the + Concentrator. This accumulates packets based on the mismatch between + arriving traffic (internet speed) and the wireless link capacity. It + operates on the RTT timescale (milliseconds). +
  2. + +
  3. + The Inner Integrator (Aggregation Buffer): Located + logically between the Concentrator and RRH. This accumulates packets to + build 802.11 A-MPDU aggregates for PHY efficiency. It operates on the + TXOP timescale (hundreds of microseconds). +
  4. +
+ +

5.2 The Outer Loop: L4S and Group Queue Dynamics

+ +

+ The primary bottleneck managed by the AQM (Active Queue Management) is the + Group Queue. This loop drives the end-to-end congestion + control (L4S/TCP). +

+ +

5.2.1 Queue Dynamics

+ +

+ The queue depth Q(t) evolves based on the mismatch between the arrival + rate λ(t) and the effective service rate μ(t): +

+ +
+dQ/dt = λ(t - τ_fwd) - μ(t)
+
+

5.2.2 The PI² Control Law

+ +

+ Fi-Wi uses a PI² controller to calculate a marking probability p(t),
+ targeting a shallow queue reference Q_ref (typically 200 µs). This
+ provides a coherent signal to L4S senders:

+ +
+p(t) = K_alpha * (Q(t) - Q_ref) + K_beta * ∫ (Q(t) - Q_ref) dt
+
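A minimal sketch of the control law above. The gains K_ALPHA and K_BETA and the clamping are illustrative placeholders, not Fi-Wi's tuned values; the 200 µs reference and 5 ms update interval are taken from the design parameters in this section:

```python
# Sketch of the outer-loop PI controller computing the marking probability.
Q_REF = 200e-6       # queue reference: 200 microseconds
DT = 5e-3            # update interval: ~1 RTT (5 ms)
K_ALPHA = 0.25       # proportional gain (illustrative)
K_BETA = 2.5         # integral gain (illustrative)

class PI2Controller:
    def __init__(self):
        self.integral = 0.0

    def update(self, q_delay: float) -> float:
        """Return the marking probability for the current interval."""
        error = q_delay - Q_REF
        self.integral += error * DT
        p = K_ALPHA * error + K_BETA * self.integral
        return min(max(p, 0.0), 1.0)   # clamp to a valid probability

ctl = PI2Controller()
# A queue sitting above target drives the marking probability upward.
p1 = ctl.update(400e-6)
p2 = ctl.update(400e-6)
```

A queue persistently above Q_ref accumulates integral action, so marking pressure keeps rising until senders back off.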
+
+ Concept Shift: AQM vs. Active Rate Management (ARM)
+ +

+ Traditional congestion control relies on + Active Queue Management (AQM): a queue must physically + build up before the network detects congestion and signals the sender to + slow down. The goal is to manage the queue size. +

+ +

+ L4S enables a new paradigm called + Active Rate Management (ARM). +

+ + + +

+ Reference: Koen De Schepper, "Understanding Latency 4.0", December + 2025.
+ Watch the explanation (19:15) +

+
+ +

5.3 The Inner Loop: MAC Aggregation and TXOPs

+ +

+ The Inner Loop manages the trade-off between PHY + efficiency (large aggregates) and latency (small aggregates). In + traditional APs, this integrator is effectively unbounded to maximize + benchmark scores, creating a "sawtooth" latency pattern that confuses TCP. +

+ +

Fi-Wi bounds this integrator via two mechanisms:

+ + + +

5.4 System Integration: Time-Scale Separation

+ +

+ For the nested loops to remain stable, the Inner Loop must look like + "constant service" to the Outer Loop. This requires the Inner Loop + bandwidth (ωmac) to be significantly higher than the Outer Loop bandwidth (ωtcp): +

+ +
+ω_mac >> ω_tcp   (typically > 20:1 ratio)
+
+

5.4.1 Frequency Domain Constraint

+ +

+ By forcing the MAC to operate at a frequency of 3–5 kHz (via 250 µs + TXOPs), the aggregation noise is pushed high enough that it is naturally + filtered out by the TCP loop (which operates at 10–20 Hz). +
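The separation criterion can be checked directly from the figures above (250 µs TXOP cadence against the upper end of the 10–20 Hz TCP loop):

```python
# Time-scale separation: MAC service frequency vs. TCP control frequency.
txop_s = 250e-6
f_mac = 1 / txop_s          # 4 kHz MAC service frequency
f_tcp = 20                  # upper end of the 10-20 Hz TCP loop
ratio = f_mac / f_tcp       # 200:1, far beyond the 20:1 criterion
```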

+ +

+ 5.4.2 A-MPDU Aggregation Coherence and ECN Marking Precision +

+ +

+ The 250 µs TXOP constraint serves a dual purpose: it maintains time-scale + separation and ensures L4S receives coherent ECN feedback. + Traditional Wi-Fi's massive A-MPDU aggregation creates a fundamental + mismatch between Layer 2 efficiency and Layer 3 control precision. +

+ +

The Aggregation-Feedback Mismatch

+ +

+ In wide-channel deployments (160 MHz), APs build large A-MPDU aggregates + containing dozens of IP packets to amortize MAC overhead. This creates + three control-loop pathologies: +

+ + + +

Fi-Wi's Coherence Strategy

+ +

Fi-Wi resolves this through coordinated design:

+ +
    +
  1. + 40 MHz Channel Width: Narrower channels require smaller + aggregates, naturally increasing MAC service frequency. More frequent + transmissions with smaller payloads ensure sojourn time measurement + occurs at packet granularity. +
  2. + +
  3. + Concentrator-Level ECN Marking: The Concentrator + performs sojourn time measurement and ECN marking + before handing packets to RRHs for PHY transmission, preserving + microsecond-level queueing visibility. +
  4. + +
  5. + Bounded TXOP Duration: The 250 µs maximum ensures MAC + service frequency remains >10× higher than L4S control frequency (~1 + RTT), enabling senders to interpret ECN marks as smooth probability + signals rather than discrete bursts. +
  6. +
+ +

+ This approach maintains the benefits of A-MPDU efficiency while preserving + the feedback coherence L4S requires. The result: DualQ can sustain its + ~1ms target drain time without artificial inflation from aggregate + assembly delays. For detailed analysis, see Appendix I.7. +

+ +

5.4.3 Design Parameters for Stability

+ +

+ Fi-Wi uses these parameters to ensure the system remains critically + damped: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
LoopParameterTarget ValueRationale
OuterQueue Reference200 µsMaintains ultra-low queuing delay.
OuterUpdate Interval5 ms (~1 RTT)Matches typical control loop frequency.
InnerTarget TXOP250 µsEnsures ωmac >> ωtcp.
InnerMax Aggregate32 MSDUsLimits tail latency contribution.
+ +
+ +

6. Airtime Domains and Dynamic Queue Grouping

+ +

+ In Fi-Wi, the core rule is: + there is one deep queue per independent airtime resource. + The physical queue lives in concentrator memory, but it represents the + airtime of one RRH or a dynamic group of RRHs whose RF + signals are coupled strongly enough to behave like a single cell. +

+ +

+ If two RRHs can interfere, they cannot transmit simultaneously and + therefore must share a single logical queue. If RRHs are RF-isolated, each + receives its own queue. This preserves the “one bottleneck per control + loop” structure required by L4S. +

+ +

6.1 Why airtime determines queue structure

+ +

+ Service at each queue corresponds to over-the-air transmission. Any RRHs + that share RF space must share a service process and therefore share a + queue. RRHs that do not interfere have independent airtime and get + independent queues. +

+ +
+ Concentrator Queues (central DRAM, cellularized domains)
+ ────────────────────────────────────────────────────────
+ Queue A (airtime domain A)
+ ├── RRH1
+ └── RRH2
+ Queue B (airtime domain B)
+ └── RRH3
+ Queue C (airtime domain C)
+ ├── RRH4
+ ├── RRH5
+ ├── RRH6
+ └── RRH7
+ Queue D (airtime domain D)
+ └── RRH8
+ +

6.2 Forming airtime groups dynamically

+ +

+ Crucially, these RF groups and their queues are + not static. The concentrator forms and maintains airtime + domains dynamically using: +

+ + + +

+ Beyond simple interference, Fi-Wi’s groupings also consider the + spatial structure of the channels: +

+ + + +

Over time, the Fi-Wi system continuously adjusts:

+ + + +

+ Groups may merge if interference appears or split if RRHs become + effectively isolated (e.g., after a channel change or power adjustment, + including beacon power shaping). The AQM and ECN marking logic always runs + at the current group queue, so L4S always sees a single, + well-defined bottleneck per cellularized domain. +
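One way to realize the merge/split behavior described above is to recompute airtime domains as connected components of an interference graph. This is a sketch under a simplified assumption: RRH names and the pairwise "interferes" report format are illustrative, whereas a real concentrator would derive edges from CSI, RSSI, and beacon reports:

```python
from collections import defaultdict

def airtime_domains(rrhs, interference_pairs):
    """Group RRHs so that any pair that can interfere shares one queue.

    interference_pairs: iterable of frozensets {a, b} (hypothetical format,
    derived in practice from CSI/RSSI/beacon telemetry).
    """
    adj = defaultdict(set)
    for pair in interference_pairs:
        a, b = tuple(pair)
        adj[a].add(b)
        adj[b].add(a)
    seen, domains = set(), []
    for rrh in rrhs:
        if rrh in seen:
            continue
        group, stack = set(), [rrh]
        while stack:                      # depth-first flood fill
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        domains.append(group)
    return domains

links = {frozenset({"RRH1", "RRH2"}),
         frozenset({"RRH4", "RRH5"}),
         frozenset({"RRH5", "RRH6"})}
doms = airtime_domains(["RRH1", "RRH2", "RRH3", "RRH4", "RRH5", "RRH6"], links)
# RRH3 is RF-isolated and gets its own queue; RRH4-6 couple into one domain.
```

Re-running this after each telemetry update naturally merges groups when interference appears and splits them when RRHs become isolated.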

+ +

+ Because all RRHs expose real-time CSI, queue metrics, retry statistics, + airtime usage, and beacon reports into the concentrator’s shared state, + Fi-Wi can form RF groups that are tuned not just for coverage but for: +

+ + + +

6.3 Room-Level RRH Density (FTTR-Class Deployment)

+ +

+ Fi-Wi is not designed around a small number of big AP cells per floor. The + architecture assumes something much closer to + Fiber-to-the-Room (FTTR): one cell per room, + with fiber or equivalent deterministic fronthaul feeding small RRHs in + each room. +

+ +

+ In higher-end deployments, each room can contain + multiple RRHs (e.g., 2–4 per room) to support: +

+ + + +
+ Room-level Fi-Wi layout (conceptual)
+
+        [Fi-Wi Concentrator]
+                  │  Fiber / fronthaul
+     ┌────────────┼────────────┬────────────┐
+     │            │            │            │
+  Room 1       Room 2       Room 3       Room 4
+     │            │            │            │
+  RRH1..4      RRH5..8      RRH9..12     RRH13..16
+ (2–4/rm)     (2–4/rm)     (2–4/rm)     (2–4/rm)
+ +

+ This density dramatically improves RF control. With RRHs separated by just + a few meters, the concentrator sees: +

+ + + +
+ Within a single room (example: 4 RRHs)
+
+ Ceiling plan (top view)
+ ───────────────────────
+
+   RRH-A       RRH-B
+     ●-----------●
+     |           |
+     |           |
+     ●-----------●
+   RRH-C       RRH-D
+
+ All four RRHs feed central queues with shared state and CSI.
+ +

+ Traditional AP-based architectures cannot achieve this cleanly because + they lack shared state and maintain separate, isolated + queues and PHY/MAC processes in each AP. Even with a central controller, + they are limited to heuristic steering and static power/channel tweaks. +

+ +

Fi-Wi, by contrast:

+ + + +

+ A cell-per-room architecture makes Fi-Wi fundamentally different from + controller-based Wi-Fi: it behaves more like + cellular small cells with centralized coordination than + like a set of autonomous APs. +

+ +
+ +

7. Queue Architecture for Fi-Wi

+ +

+ Fi-Wi centralizes packet memory, queueing, AQM, and TXOP scheduling inside + the concentrator. Because the concentrator is the true bottleneck for all + wireless transmissions, Fi-Wi can use a clean, minimal queue structure + that behaves predictably under load and exposes stable delay semantics to + L4S congestion controllers. This stands in contrast to traditional APs, + where dozens of hidden queues (per-station, per-TID, firmware rings, + retry/BA windows, PS-poll buffers, rate-control queues) produce variable + and unobservable queueing delay. +

+ +

+ This section describes Fi-Wi’s queue architecture, why WMM priority + becomes largely unnecessary, and how centralized TXOP scheduling + eliminates the stochastic contention that drives Wi-Fi collapse in legacy + systems. The goal is simple: a minimal number of queues, explicit queue + semantics, and predictable latency for all traffic classes. +

+ +

7.1 Why queue architecture matters

+ +

+ Because all packets live in the concentrator’s memory until the moment + they are transmitted over the air, Fi-Wi can explicitly control: +

+ + + +

+ This allows Fi-Wi to do what distributed APs cannot: construct a + consistent, visible bottleneck queue that L4S congestion controllers can + lock onto with stable behavior. +

+ +

+ 7.2 The theoretical case: L4S makes most priority obsolete +

+ +

+ If queue delay is capped around 500 µs, legacy WMM categories provide + little additional value. For example, consider a voice stream: +

+ +
+Voice codec:             80 bytes every 20 ms (32 kbps)
+Transmit time at 1 Gbps: ~0.64 µs
+L4S queue target:        500 µs
+Voice latency budget:    ~150,000 µs
+
+Queue share: 500 / 150,000 = 0.3%
+
+

+ If L4S keeps queueing delay under ~500 µs, then all traffic — + including voice — stays far inside its latency budget. WMM’s role in + combatting bufferbloat disappears when bufferbloat itself is removed. +
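The budget arithmetic can be recomputed directly, assuming the 80-byte/20 ms codec framing, the 1 Gbps link, and the budgets given in the example:

```python
# Voice-budget arithmetic from the example above.
frame_bytes = 80
frame_interval_s = 20e-3
bitrate_bps = frame_bytes * 8 / frame_interval_s   # 32 kbps payload rate
tx_time_s = frame_bytes * 8 / 1e9                  # serialization at 1 Gbps
queue_share = 500e-6 / 150e-3                      # L4S target / voice budget
```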

+ +

7.3 Practical complications

+ +

Three real-world issues motivate a cautious design:

+ +

• UDP does not respond to ECN

+ +

Voice and video often use UDP. They:

+ + + +

+ Fi-Wi can mitigate this using + per-flow fair queuing inside the L4S queue, keeping UDP + in check without needing a separate WMM hierarchy. +
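A sketch of such per-flow fairness using deficit round-robin inside the L4S queue; the 1500-byte quantum and the flow/packet representation are illustrative assumptions, not Fi-Wi's actual scheduler:

```python
from collections import deque, defaultdict

QUANTUM = 1500  # bytes of service credit per flow per round (illustrative)

class FairL4SQueue:
    def __init__(self):
        self.flows = defaultdict(deque)   # flow_id -> queued packet sizes
        self.deficit = defaultdict(int)

    def enqueue(self, flow_id, size):
        self.flows[flow_id].append(size)

    def dequeue_round(self):
        """Serve each backlogged flow up to its deficit; return (flow, size)s."""
        served = []
        for fid in list(self.flows):
            self.deficit[fid] += QUANTUM
            q = self.flows[fid]
            while q and q[0] <= self.deficit[fid]:
                size = q.popleft()
                self.deficit[fid] -= size
                served.append((fid, size))
            if not q:                     # flow drained: reset its credit
                del self.flows[fid]
                self.deficit[fid] = 0
        return served

q = FairL4SQueue()
for _ in range(4):
    q.enqueue("udp", 1500)                # unresponsive bulk UDP
q.enqueue("voice", 80)                    # small voice frame
served = q.dequeue_round()
```

Even with a backlogged unresponsive UDP flow, the small voice frame is served in the same round, so the non-responsive flow cannot monopolize the queue.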

+ +

• Airtime vs. queue time

+ +
+Total latency = Queue delay + Contention delay + TX delay + Retry delay
+                ^^^^^^^^^^^
+             L4S controls this
+
+

+ WMM historically manipulates AIFS, CW, and TXOP to reduce contention + delay. Fi-Wi eliminates contention entirely using + centralized TXOP scheduling, so WMM’s airtime hacks lose + relevance. +

+ +

• Failure modes and defense-in-depth

+ +

Even L4S can fail under:

+ + + +

+ Hence, Fi-Wi benefits from a small amount of priority separation, at least + in early deployments. +

+ +

7.4 Minimal 3-queue structure

+ +

+ The theoretically sufficient minimal queue architecture + for Fi-Wi is three queues: +

+ + + +
+

Figure 7-1: Minimal 3-Queue Fi-Wi Architecture

+ +
+                    ┌──────────────────────────────────────────┐
+                    │               Concentrator               │
+                    │ (Central Packet Memory • AQM • TXOP)     │
+                    └──────────────────────────────────────────┘
+                                   ▲
+                                   │
+                     ┌─────────────┼──────────────────┐
+                     │             │                  │
+                     │             │                  │
+            ┌────────┴───┐   ┌─────┴─────┐    ┌───────┴──────┐
+            │ Q_mgmt     │   │ Q_L4S     │    │ Q_classic    │
+            │ (Strict    │   │ (ECT(1),  │    │ (ECT(0),     │
+            │  priority) │   │  dual-Q)  │    │  classic)    │
+            └──────┬─────┘   └─────┬─────┘    └───────┬──────┘
+                   │               │                  │
+                   └───────────────┼──────────────────┘
+                                   │
+                          TXOP Scheduler
+                  (Build AMPDU • Select RRH • 200–250µs)
+                                   │
+         ┌─────────────────────────┼──────────────────────────┐
+         │                         │                          │
+     ┌───▼───┐               ┌─────▼─────┐              ┌─────▼─────┐
+     │  RRH1 │               │   RRH2    │              │   RRH3    │
+     │ (PHY) │               │  (PHY)    │              │  (PHY)    │
+     └───────┘               └───────────┘              └───────────┘
+
+

+ The minimal Fi-Wi queue architecture contains a strict-priority + management queue plus dual-queue L4S (L4S + Classic). All buffering + lives in the concentrator; RRHs keep no deep queues. L4S senders see a + clean single-bottleneck model, and all 802.11 management frames bypass + AQM entirely for correctness. +

+
+ +

+ In this design, WMM is unnecessary at the wireless + bottleneck. All data traffic benefits from the same controlled queue + delay, and fairness is enforced by per-flow scheduling rather than EDCA. +
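A schematic dequeue policy for the three queues. This is a sketch: the simple weighted alternation stands in for the coupled DualQ mechanism, and the 3:1 weighting is an illustrative placeholder:

```python
from collections import deque

class MinimalQueues:
    def __init__(self):
        self.q_mgmt = deque()      # 802.11 management frames, bypass AQM
        self.q_l4s = deque()       # ECT(1) traffic
        self.q_classic = deque()   # ECT(0) / classic traffic
        self._l4s_credit = 0

    def dequeue(self):
        if self.q_mgmt:                        # strict priority for management
            return self.q_mgmt.popleft()
        # Weighted alternation between L4S and Classic (illustrative stand-in
        # for the coupled DualQ scheduler, which couples via marking).
        if self.q_l4s and (self._l4s_credit < 3 or not self.q_classic):
            self._l4s_credit += 1
            return self.q_l4s.popleft()
        self._l4s_credit = 0
        if self.q_classic:
            return self.q_classic.popleft()
        return self.q_l4s.popleft() if self.q_l4s else None

q = MinimalQueues()
q.q_mgmt.append("mgmt-frame")
q.q_l4s.extend(["l4s-1", "l4s-2"])
q.q_classic.append("classic-1")
```

Management frames always drain first; data then alternates so neither the L4S nor the Classic queue can starve the other.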

+ +

7.5 Pragmatic 5-queue structure

+ +

+ A more conservative deployment uses five queues per + airtime domain: +

+ +
    +
  1. + Qmgmt — Management & control (strict + priority) +
  2. + +
  3. + QL4S-hi — High-priority L4S (voice, control) +
  4. + +
  5. + Qclassic-hi — High-priority classic (legacy + VoIP) +
  6. + +
  7. + QL4S-be — L4S best-effort (bulk QUIC/TCP) +
  8. + +
  9. + Qclassic-be — Classic best-effort (legacy + devices) +
  10. +
+ +
+

+ Figure 7-2: Pragmatic 5-Queue Fi-Wi Architecture (Defense-in-Depth) +

+ +
+                  ┌───────────────────────────────────────────┐
+                  │               Concentrator                │
+                  │   (Central Packet Memory • AQM • TXOP)    │
+                  └───────────────────────────────────────────┘
+                                        ▲
+                                        │   Five logical queues per airtime domain
+        ┌───────────────┬───────────────┼──────────────────┬───────────────────┐
+        │               │               │                  │                   │
+  ┌─────┴──────┐  ┌─────┴────┐  ┌───────┴──────┐  ┌────────┴────────┐  ┌───────┴──────┐
+  │ Q_mgmt     │  │ Q_L4S-hi │  │ Q_classic-hi │  │ Q_L4S-be        │  │ Q_classic-be │
+  │ (priority) │  │ (Voice)  │  │ (Legacy VoIP)│  │ (Bulk TCP/QUIC) │  │ (Legacy bulk)│
+  └─────┬──────┘  └─────┬────┘  └───────┬──────┘  └────────┬────────┘  └───────┬──────┘
+        │               │               │                  │                   │
+        └───────────────┴───────────────┼──────────────────┴───────────────────┘
+                                        │
+                                 TXOP Scheduler
+                   (Build AMPDU • Select RRH • Delay Targets)
+                                        │
+             ┌──────────────────────────┼──────────────────────────┐
+             │                          │                          │
+         ┌───▼───┐                ┌─────▼─────┐              ┌─────▼─────┐
+         │  RRH1 │                │   RRH2    │              │   RRH3    │
+         │ (PHY) │                │   (PHY)   │              │   (PHY)   │
+         └───────┘                └───────────┘              └───────────┘
+
+

+ The 5-queue design provides a two-tier priority system across L4S and + Classic traffic. This conservative architecture offers compatibility + with legacy UDP voice/video, while still keeping Fi-Wi’s centralized L4S + semantics intact. Over time, deployments can collapse from 5 queues to 3 + as performance data validates the simpler model. +

+
+ +

7.6 Numerical examples

+ +

+ Consider 10 simultaneous HD video calls (~20 Mbps total) plus a + saturating background TCP flow: +

+ +

Legacy WMM:

+ + + +

Fi-Wi with L4S + fair queuing:

+ + + +

+ This is roughly 1000× lower queueing latency than legacy + WMM systems, and it applies to all traffic, not only traffic in a + “priority” AC. +

+ +

7.7 Deployment strategy

+ +

Fi-Wi can phase its queue structure over time:

+ + + +

Metrics to monitor include:

+ + + +

7.8 WMM support in Fi-Wi

+ +

WMM exists to correct three historical problems in distributed Wi-Fi:

+ + + +

Fi-Wi removes the root causes of these behaviors:

+ + + +

+ Because of this, full WMM support at the air bottleneck is + not necessary. However, Fi-Wi does support WMM + semantics for: +

+ + + +

Fi-Wi handles WMM as an admission-time mapping:

+ + + +

+ This preserves compatibility while avoiding the complexity and + unpredictability of EDCA-based priority systems. Over time, Fi-Wi + deployments can rely on pure L4S semantics and collapse WMM to a + compatibility shim, not a required scheduling mechanism. +
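The admission-time mapping can be sketched as a simple classification table. The UP-to-queue table and queue names below are illustrative assumptions layered on the 5-queue model, not a normative mapping:

```python
# Hypothetical WMM User Priority -> internal queue mapping at admission time.
UP_TO_QUEUE = {
    7: "Q_mgmt",       6: "Q_mgmt",          # network control
    5: "Q_l4s_hi",     4: "Q_l4s_hi",        # voice / video
    3: "Q_l4s_be",     0: "Q_l4s_be",        # best effort
    2: "Q_classic_be", 1: "Q_classic_be",    # background
}

def classify(user_priority: int, ect1: bool) -> str:
    """Map (WMM UP, ECT(1) marking) to an internal Fi-Wi queue."""
    q = UP_TO_QUEUE.get(user_priority, "Q_classic_be")
    # Traffic not marked ECT(1) falls through to the Classic queues.
    if not ect1 and q.startswith("Q_l4s"):
        q = q.replace("l4s", "classic")
    return q
```

WMM thus survives only as a one-shot classification at the ingress, never as an EDCA contention mechanism at the air interface.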

+ +

7.9 Summary

+ +

Fi-Wi’s centralized queue architecture enables:

+ + + +

+ Traditional Wi-Fi uses WMM to work around bufferbloat and contention. + Fi-Wi removes those problems entirely through tight queue control, shared + state, and central scheduling. Priority becomes a policy choice — not a + crutch for an unstable MAC. +

+ +

+ In Fi-Wi, the Carve-Out ensures the voice packet (L4S) bypasses the + accumulated Classic bulk data completely. The file download continues to + saturate the link, but the + latency of the L4S flow is decoupled from the load of the Classic + flow. +

+ + + +

8. RRH-Level Active Redundancy

+ +

+ Fi-Wi’s centralized shared state across RRHs makes it natural to treat + multiple radios as an active redundant set for the same + STA or room. This is analogous in spirit to 802.11be’s + Multi-Link Operation (MLO), where a single multi-link + device (MLD) can use multiple links for reliability and capacity. In + Fi-Wi, the concentrator is the coordination point leveraging shared state, + and the RRHs are the distributed radios providing multiple RF paths. +

+ +

8.1 Uplink: Duplicate Reception & Diversity

+ +

+ In many deployments, a client STA will be audible at more than one RRH + (overlapping coverage). On the uplink, Fi-Wi exploits this spatial + diversity to improve reliability without requiring changes to the client. +

+ +
    +
  1. + Multi-Point Reception: Multiple RRHs may receive the + same MPDU from a transmitting STA, potentially at different SNR/MCS + levels. +
  2. + +
  3. + Forwarding: Each RRH decodes the frame locally. If the + Frame Check Sequence (FCS) passes, the RRH timestamps the frame (using + the shared global timebase), attaches metadata (RSSI, SNR, Channel State + Information), and forwards it to the Concentrator via the + PCIe-over-Fiber link. +
  4. + +
  5. + Post-Detection Selection: Effectively, the Concentrator + acts as a Post-Detection Selection Diversity combiner: + +
  6. +
+ +

+ This approach leverages the spatial diversity of distributed RRHs to + mitigate shadowing and multipath fading. Because the selection logic + operates on valid MAC frames (after FCS verification) rather than raw I/Q + samples, this architecture maintains compatibility with standard COTS + Wi-Fi silicon at the Radio Head. +
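The selection-diversity combiner can be sketched as deduplication keyed on the station and 802.11 sequence number, keeping the best-SNR copy. The field names and metadata format are illustrative:

```python
from dataclasses import dataclass

@dataclass
class RxFrame:
    sta: str          # transmitting station
    seq: int          # 802.11 sequence number
    snr_db: float     # per-RRH receive quality (from attached metadata)
    rrh: str          # which RRH forwarded this copy
    payload: bytes

class DiversityCombiner:
    """Post-detection selection diversity: one winner per (sta, seq)."""
    def __init__(self):
        self.best = {}          # (sta, seq) -> RxFrame

    def offer(self, f: RxFrame) -> None:
        key = (f.sta, f.seq)
        cur = self.best.get(key)
        if cur is None or f.snr_db > cur.snr_db:
            self.best[key] = f   # keep the higher-quality duplicate

c = DiversityCombiner()
c.offer(RxFrame("sta1", 42, 18.0, "RRH1", b"data"))
c.offer(RxFrame("sta1", 42, 25.5, "RRH2", b"data"))   # duplicate, better SNR
```

Only FCS-valid frames ever reach `offer()`, so the combiner works purely on decoded MAC frames, consistent with COTS silicon at the RRH.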

+ +
+ Uplink redundancy
+
+                            STA
+                          ╱     ╲  (same frame)
+              RRH1                       RRH2
+                │                          │
+                └──► Fi-Wi Concentrator ◄──┘
+                     (dedup + select)
+ +

8.2 Downlink: per-packet steering

+ +

+ On the downlink, the concentrator can treat multiple RRHs as candidate + transmitters for a given STA or room: +

+ + + +

This gives Fi-Wi:

+ + + +
+ Group Queue (airtime domain A)
+ ──────────────────────────────
+   │
+   ├─► RRH1 TXOPs to STA
+   └─► RRH2 TXOPs to STA (backup or parallel)
+
+ Concentrator chooses RRH per TXOP based on CSI + load + shared state.
+ +

+ 8.2.1 Listen-Before-Talk (LBT) and RRH Eligibility for Downlink Scheduling +

+ +

+ In a multi-RRH Fi-Wi deployment, each radio head operates on the same + BSSID and channel but sits in a different physical location with its own + RF conditions. While Fi-Wi centralizes all queueing and scheduling + decisions, every RRH must still obey the fundamental 802.11 rule: + listen-before-talk (LBT). +

+ +

+ This is where Fi-Wi diverges sharply from classical multi-AP systems. In + UniFi, Ruckus, Aruba, and all controller-based Wi-Fi architectures, each + AP queue is blind to the RF medium state until it attempts to transmit. + The AP commits a packet to the hardware queue, and if the medium is busy, + the packet waits (Head-of-Line blocking) while the AP performs backoff. +

+ +

+ Fi-Wi inverts this. RRHs continuously report their
+ LBT Eligibility Status (Clear/Busy) to the Concentrator
+ over the high-speed PCIe telemetry path, with update intervals of
+ 100–500 µs, well matched to inter-TXOP scheduling decisions. While the
+ Concentrator cannot react within a single 9 µs backoff slot, it operates
+ on the Inter-TXOP timescale (200–500 µs1).

+ +

+ Before posting a new DMA descriptor to an RRH, the Scheduler checks this + eligibility: +

+ + + +

+ This prevents Head-of-Line Blocking where a packet sits + in a hardware queue on a jammed radio. When multiple RRHs report clear + airtime, Fi-Wi selects among them based on link quality (CSI) and + predicted airtime efficiency. Conversely, if all RRHs report medium-busy, + no RRH is primed; the scheduler pauses the flow to prevent backpressure + from accumulating in the RRH hardware, keeping the queue depth visible in + the Concentrator where L4S can measure it. +
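The eligibility check before descriptor posting can be sketched as follows; the telemetry field names and the airtime-efficiency score are illustrative assumptions:

```python
# Inter-TXOP RRH selection gated on reported LBT eligibility.
def pick_rrh(candidates):
    """candidates: dicts with 'name', 'lbt_clear', 'airtime_score' (hypothetical
    telemetry fields). Returns the chosen RRH name, or None if all are busy."""
    eligible = [c for c in candidates if c["lbt_clear"]]
    if not eligible:
        # All radios report medium-busy: pause the flow so the backlog stays
        # in the Concentrator queue, where L4S can measure it.
        return None
    # Among clear radios, prefer the best predicted airtime efficiency (CSI).
    return max(eligible, key=lambda c: c["airtime_score"])["name"]

rrhs = [
    {"name": "RRH-A", "lbt_clear": True,  "airtime_score": 0.80},
    {"name": "RRH-B", "lbt_clear": False, "airtime_score": 0.95},
]
choice = pick_rrh(rrhs)   # RRH-B is jammed, so data is staged at RRH-A
```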

+ +

+ The result is a form of + Centralized Selection based on LBT Eligibility. Multi-AP + systems coordinate configuration (channels, power), but they cannot + coordinate transmit starts because they lack the real-time + feedback loop to steer packets away from busy radios before they are + queued. +

+ +

+ 1 Representative scheduling interval for mixed traffic + workloads; actual TXOP durations range from tens of microseconds (small + frames) to several milliseconds (large aggregates). +

+ +
+
+ Figure 8-3: Per-RRH LBT eligibility feeding the + centralized Fi-Wi scheduler. +
+ +
+                        (Shared RF / Airtime Domain)
+
+       +----------------------+                 +----------------------+
+       |      RRH-A           |                 |      RRH-B           |
+       |  (Room / Zone A)     |                 |  (Room / Zone B)     |
+       +----------------------+                 +----------------------+
+       |  LBT: Clear          |                 |  LBT: Busy (ED high) |
+       |  Eligible = YES      |                 |  Eligible = NO       |
+       +----------+-----------+                 +-----------+----------+
+                  |                                           |
+                  |  Fiber fronthaul (low latency)            |
+                  |                                           |
+                  v                                           v
+
+                     +-----------------------------------+
+                     |  Fi-Wi Concentrator / Scheduler   |
+                     +-----------------------------------+
+                     |  Centralized queue for building   |
+                     |  L4S feedback / congestion state  |
+                     |                                   |
+                     |  Decision: Post Descriptor to A   |
+                     |  (RRH-B flagged as ineligible;    |
+                     |   prevents HoL blocking)          |
+                     +----------------+------------------+
+                                      |
+                                      | Downlink frames / aggregates
+                                      v
+
+                               +--------------+
+                               |   Client(s)  |
+                               +--------------+
+  
+
+ +
+
+ Figure 8-4: Inter-TXOP Steering. The Scheduler uses LBT + state to decide where to stage the next packet. Note: The RRH + still performs local backoff; the Scheduler simply ensures data is + staged at the RRH that + currently reports clear channel conditions. +
+ +
+Time →
+------------------------------------------------------------------------------------------------->
+
+RRH-A (Room A):        [ Sense medium ]  [ Idle ]  [ Clear ]  [  Transmit TXOP  ]  [ Idle ... ]
+                       |<-- DIFS --->|   |<---- contention window (few slots) ---->|
+
+RRH-B (Room B):        [ Sense medium ]  [  ED high: medium busy  ]  [ Backoff ... ]
+                       |<---- busy ---->|
+
+RRH LBT → Scheduler:       A: "Clear"                  B: "Busy"
+
+Scheduler View:        [ Receive LBT states from A, B ]
+                       [ Mark A = eligible, B = ineligible ]
+                       [ Dequeue next packets from central queue ]
+                       [ Post descriptor to RRH-A only ]
+
+Downlink Action:       RRH-A receives descriptor, enters backoff, wins, transmits.
+                       RRH-B remains silent (no descriptor posted).
+
+Effect:                • No packet trapped in RRH-B's buffer
+                       • No exponential backoff storm
+                       • Deterministic selection of the RRH with clear airtime
+  
+
+ +

8.3 Analogy to Wi-Fi 7 MLO

+ +

+ 802.11be MLO allows a multi-link device (AP/STA) to use multiple links
  (e.g., channels in the 2.4 GHz, 5 GHz, and 6 GHz bands) under a single
  MAC entity. Features include:

+ + + +

+ Fi-Wi provides a similar effect at the building scale, + but with important differences: +

+ + + +

+ Because the RRHs are spatially distributed around rooms + and hallways, Fi-Wi gains advantages that co-located antennas cannot + provide: +

+ + + +

+ These advantages come from intelligent packet routing and + dynamic RRH selection, not from RF phase coordination or + simultaneous beamforming across RRHs. +

+ +

+ 8.3.1 Fi-Wi vs Wi-Fi 7 MLO: Compliance and Control +

+ +

+ Fi-Wi strictly adheres to local regulatory compliance. + The Concentrator manages the queue and the schedule, but + the RRH manages the compliance. +

+ +

+ When the Scheduler assigns a TXOP to an RRH, it posts a descriptor. The + RRH hardware then performs standard 802.11 EDCA: +

+ +
  1. It senses the medium.
  2. It draws a random backoff counter.
  3. It counts down only when the medium is idle.
  4. It transmits when the counter reaches zero.
+ +

The Architectural Difference:

+ +

+ In MLO or Mesh: If an AP commits a packet to a radio and + that radio hits congestion, the packet is trapped in the local buffer. The + backoff might take 50ms. During this time, the AP's other radios (or other + APs in the mesh) might be idle, but they cannot help because the packet is + already "owned" by the busy MAC. +

+ +

+ In Fi-Wi: The packet remains in the Concentrator's + central memory until the last possible moment (see Appendix F). If the + Concentrator sees an RRH entering deep backoff (via real-time telemetry) + or reporting "Busy," it stops posting new descriptors to that RRH and + steers subsequent traffic to a free RRH. The backoff engine remains local + (compliance), but the queue feeding it is steered globally (performance). +
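The late-binding behavior described here can be reduced to a minimal sketch, with all names illustrative. The essential property is that a packet carries no RRH assignment until the moment a descriptor is posted:

```python
import collections

class CentralQueue:
    """Late-binding queue: a packet is bound to a radio only at the
    moment a descriptor is posted, never at enqueue time."""

    def __init__(self):
        self.q = collections.deque()

    def enqueue(self, pkt):
        self.q.append(pkt)            # no RRH chosen yet

    def post_next(self, rrh_busy):
        """Bind the head packet to a currently-clear RRH, if any.

        rrh_busy maps RRH name -> True if its medium is busy."""
        clear = [name for name, busy in rrh_busy.items() if not busy]
        if not self.q or not clear:
            return None               # nothing to send, or all radios busy
        return (self.q.popleft(), clear[0])
```

With RRH-A in deep backoff, `post_next({"RRH-A": True, "RRH-B": False})` hands the packet to RRH-B; in the MLO or mesh case the same packet would already be trapped in RRH-A's local buffer.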

+ +

+ This allows Fi-Wi to scale airtime domains across an entire building while + preventing the multi-node contention collapse that plagues traditional + Wi-Fi networks. +

+ +
+
+ Figure 8-6: Per-airtime-domain queueing and scheduling + in MLO versus Fi-Wi. +
+ +
+Wi-Fi 7 MLO: per-radio queues and MAC logic           Fi-Wi: one centralized queue per airtime-domain
+================================================      ===============================================
+
+   Airtime-domain                                    Airtime-domain
+   --------------                                    --------------
+
+   +-------------+   +-------------+                +-------------------------+
+   |  Radio 1    |   |  Radio 2    |                |   Fi-Wi Concentrator    |
+   | MAC engine  |   | MAC engine  |                |  (per airtime-domain)   |
+   | Backoff     |   | Backoff     |                +-------------------------+
+   | DMA queues  |   | DMA queues  |                |  Centralized queue      |
+   +------+------+   +------+------+                |  AQM / L4S feedback     |
+          |                 |                       |  Scheduler              |
+          |                 |                       +-----------+-------------+
+          v                 v                                   |
+   Packet trapped          Packet trapped                       |
+   in local queue          in local queue                       |
+   during backoff          during backoff                       v
+
+                                                     +--------+-------+    +--------+-------+
+                                                     |   RRH A        |    |   RRH B        |
+                                                     | RF front-end   |    | RF front-end   |
+                                                     | LBT + backoff  |    | LBT + backoff  |
+                                                     +--------+-------+    +--------+-------+
+                                                              ^                    ^
+                                                              |                    |
+                                                   Scheduler posts descriptor only to
+                                                   the RRH that is clear and eligible.
+
+  
+
+ +

8.4 Preserving the "single bottleneck" L4S view

+ +

+ To keep the L4S control loop stable, Fi-Wi needs to preserve a
  single bottleneck queue per flow even while using
  multiple RRHs:

+ + + +

In other words:

+ + + +

+ 9. Dynamic Point Selection and Intelligent Frequency Reuse +

+ +

+ Traditional Wi-Fi deployments suffer from two fundamental problems in + high-density environments: (1) clients are statically associated to a + single AP based on initial connection, leading to suboptimal performance + as they move, and (2) autonomous APs compete for airtime through CSMA/CA + contention, creating interference. Fi-Wi inverts this paradigm through + Dynamic Point Selection—continuously choosing the optimal + RRH per packet—and Intelligent Frequency Reuse—leveraging + spatial isolation to maximize capacity. +

+ +

9.1 Dynamic Point Selection: The Core Capability

+ +

+ Unlike traditional Wi-Fi where clients are physically and logically tied + to a single Access Point (AP), Fi-Wi treats the entire building as a + single Virtual Cell. The Concentrator maintains real-time + Channel State Information (CSI) from all RRHs and dynamically selects the + optimal transmission point for each individual packet. +

+ +

9.1.1 The Roaming Paradigm Shift: Negotiation vs. Execution

+ +

+ To understand the magnitude of this shift, we must compare the standard + "Fast BSS Transition" (802.11r) with the Fi-Wi approach. In standard + Wi-Fi, mobility is a negotiation. In Fi-Wi, it is an execution. +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
StepStandard Wi-Fi (802.11r / Fast Roaming)Fi-Wi (Dynamic Point Selection)
1. TriggerClient detects low RSSI and decides to scan. + Concentrator detects better path via Uplink SNR. +
2. Action + Client tunes radio off-channel to scan for beacons (Latency spike: + 50–100ms). + Zero Action. Client stays on channel.
3. Handshake + Client sends Auth + Re-Assoc frames. AP validates + keys. + None. No Over-the-Air frames.
4. SwitchAP 1 tears down keys; AP 2 installs keys. + Concentrator updates the DL_RRH_ID pointer in memory. +
Total Time~50ms – 150ms (Best case)< 1ms (PCIe Write)
+ +

+ While 802.11r is sufficient for buffered video (Netflix), it typically + breaks real-time applications like Voice over Wi-Fi (VoWiFi) and VR/XR, + where a 50ms gap causes audio dropouts or visual artifacts. Fi-Wi's + sub-millisecond switching ensures true continuity. +
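Under the table's step 4, the entire Fi-Wi "roam" is a single field update in Concentrator memory. A minimal sketch follows; `ClientRecord` and `handover` are hypothetical names, but `DL_RRH_ID` is the pointer named in the table:

```python
class ClientRecord:
    """Hypothetical per-client downlink state held in Concentrator
    memory; dl_rrh_id stands in for the DL_RRH_ID pointer."""

    def __init__(self, mac, dl_rrh_id):
        self.mac = mac
        self.dl_rrh_id = dl_rrh_id    # RRH currently serving downlink

def handover(client, new_rrh_id):
    """A Fi-Wi 'roam': no off-channel scan, no Auth/Re-Assoc frames,
    no key teardown. The next dequeued packet simply DMAs to the
    newly selected RRH."""
    client.dl_rrh_id = new_rrh_id

alice = ClientRecord("aa:bb:cc:dd:ee:ff", "RRH-A")
handover(alice, "RRH-B")              # the entire "handshake"
```

Because the 802.11 security association lives at the Concentrator rather than per-AP, there is nothing else to move.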

+ +

9.1.2 How It Works

+ + + +

9.1.3 Example Scenario

+ +

Consider "Alice" on a VR headset walking down a hallway:

+ +
  1. Alice starts a session in Room 304 (near RRH-A: RSSI -40 dBm). The Concentrator routes packets via RRH-A.
  2. Alice walks toward the doorway. RRH-A degrades (-55 dBm) while the hallway unit, RRH-B, improves (-45 dBm).
  3. The Concentrator detects this crossing point in the CSI data.
  4. For the very next packet, the pointer switches to RRH-B.
  5. Result: Alice's VR stream continues without a single dropped frame or latency spike. She is unaware that the transmission point changed.
+ +
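Alice's hand-off can be expressed as a per-packet selection over the latest RSSI/CSI report. This is a deliberately stripped-down sketch; the real selector also weighs LBT eligibility and predicted airtime efficiency (Section 8):

```python
def best_rrh(rssi_dbm):
    """Per-packet point selection: the RRH with the strongest entry
    in the latest uplink RSSI/CSI report wins."""
    return max(rssi_dbm, key=rssi_dbm.get)

# Values from the walking scenario above:
assert best_rrh({"RRH-A": -40, "RRH-B": -60}) == "RRH-A"  # in Room 304
assert best_rrh({"RRH-A": -55, "RRH-B": -45}) == "RRH-B"  # at the doorway
```

Because the decision is re-evaluated for every packet, there is no "sticky client" problem and no hysteresis tuning: the crossing point is simply the packet at which the maximum changes.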

9.3 Intelligent Frequency Reuse

+ +

+ In traditional Wi-Fi, neighboring APs on the same channel create + co-channel interference. The standard solution is to assign different + channels (e.g., AP-A uses Channel 36, AP-B uses Channel 48), but this + wastes spectrum. Fi-Wi enables + intelligent frequency reuse—using the same channel across + multiple RRHs when spatial conditions allow. +

+ +

When Frequency Reuse Works

+ +

+ Frequency reuse is viable when clients are in
  spatially separated locations with significant isolation
  (typically >25–30 dB attenuation due to walls, floors, or distance).

+ +

Example: Adjacent Rooms

+ + + +

The Fi-Wi Decision:

+ +
  1. Concentrator detects >30 dB spatial isolation via CSI measurements
  2. Configures both RRH-A and RRH-B to operate on Channel 36
  3. Each RRH performs independent CSMA/CA in its local environment
  4. Cross-interference is minimal due to spatial isolation
  5. Result: Effective channel capacity is doubled without requiring additional spectrum
+ +
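The reuse decision above reduces to a threshold test on measured isolation. In this sketch the 30 dB threshold and the channel numbers follow the example in the text; a real planner would also track client positions and re-evaluate continuously as people move:

```python
def plan_channels(isolation_db, reuse_threshold_db=30.0,
                  channels=(36, 48)):
    """Decide whether two RRHs may share one channel.

    isolation_db: CSI-measured path loss between the two cells.
    Returns a (rrh_a_channel, rrh_b_channel) tuple."""
    if isolation_db > reuse_threshold_db:
        # Enough wall/floor attenuation: reuse roughly doubles capacity.
        return (channels[0], channels[0])
    # Poor isolation (e.g. an open doorway): fall back to a split plan.
    return (channels[0], channels[1])
```

For example, `plan_channels(35.0)` keeps both RRHs on Channel 36, while `plan_channels(18.0)` reverts to the static 36/48 split.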

Dynamic Adaptation

+ +

+ The key advantage over static channel planning is + real-time adaptation: +

+ + + +

Why Autonomous APs Cannot Do This

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
RequirementFi-Wi (C-RAN)Autonomous APs
Global CSI Visibility + Complete: Concentrator sees CSI from all RRHs to all clients in + real-time + + Fragmented: Each AP only knows its own channel. Must exchange info + over backhaul (high latency) +
Decision Latency + Microseconds: Concentrator makes decisions in software at µs + granularity + + Milliseconds to seconds: APs coordinate via slow management protocols +
Adaptation SpeedPer-packet: Can switch RRH or channel based on every CSI update + Minutes: Channel changes require beacon updates, client reassociation +
Client DisruptionNone: Decisions are transparent to clients + High: Channel changes or AP reassignment cause connectivity + interruptions +
+ +

9.4 Transparent Integration with L4S

+ +

+ The complexity of dynamic point selection and frequency reuse is hidden + from the L4S congestion control loop. Traffic still lives in + per-airtime-domain group queues. When the Concentrator enables frequency + reuse or optimizes RRH selection, it simply affects the effective service + rate μ(t) of the queue. +

+ +

+ The PI² controller in the outer loop (see + Section 5) sees the queue draining faster and + naturally reduces ECN marking. This allows L4S senders (TCP Prague) to + ramp up their congestion windows to fill the expanded capacity. The system + automatically discovers and exploits available spatial capacity without + requiring changes to congestion control algorithms or application + awareness. +
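The coupling between service rate and marking can be sketched in the style of the PI² AQM (RFC 9332). The gains, the 15 ms target, and the linear/squared coupling below are illustrative defaults borrowed from that style, not values specified by this document:

```python
def pi2_update(p, qdelay_s, qdelay_prev_s, target_s=0.015,
               alpha=0.16, beta=3.2):
    """One controller step in the style of PI2: adjust the base
    probability p from queue delay error and its rate of change.
    Delays are in seconds; gains are illustrative defaults."""
    p += alpha * (qdelay_s - target_s) + beta * (qdelay_s - qdelay_prev_s)
    return min(max(p, 0.0), 1.0)

def coupled_probabilities(p, k=2.0):
    """Couple the base probability to the two traffic classes:
    L4S ECN-mark probability ~ k*p, Classic drop ~ p**2."""
    return min(k * p, 1.0), p ** 2

# Frequency reuse doubles the service rate mu(t): queue delay falls,
# p falls, L4S marking eases, and Prague senders ramp up.
p = pi2_update(0.10, qdelay_s=0.005, qdelay_prev_s=0.015)
```

The point for Fi-Wi is that the controller needs no knowledge of RRH selection or reuse decisions; it only sees the queue drain faster.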

+ +
+

+ 9.5 Governing Station Media Access: The Control Hierarchy +

+ +

+ A common critique of centralized wireless architectures is the + "autonomous client problem": while the infrastructure can be + coordinated, the stations (STAs) are independent entities that contend + for the medium using their own logic. +

+ +

+ Fi-Wi addresses this by enforcing a + Control Hierarchy that governs client behavior from the + physical layer up to the transport layer. Instead of passively hoping + for "good client behavior," Fi-Wi uses four distinct mechanisms to + throttle, steer, or schedule station media access. +

+ +
+

Figure 9-3: The Four Tiers of Client Governance

+ +
+Level 1: Deterministic (Hard)
+   [ 802.11ax Trigger Frames ] ──▶ STA must wait for Schedule
+                                    (Zero contention)
+
+Level 2: Transport (Adaptive)
+   [ L4S / ECN Marking ] ────────▶ OS Kernel throttles pacing
+                                    (Reduces MAC load before enqueue)
+
+Level 3: RF Physics (Steering)
+   [ Beacon Power Shaping ] ─────▶ STA firmware seeks new cell
+                                    (Moves demand to different domain)
+
+Level 4: Statistical (Soft)
+   [ WMM / AIFS Parameters ] ────▶ STA adjusts backoff aggression
+                                    (Statistical deprioritization)
+    
+
+ +

1. Deterministic Scheduling (802.11ax/be)

+ +

+ For modern clients (Wi-Fi 6/7), Fi-Wi removes autonomy entirely for + uplink traffic. The Concentrator generates + Trigger Frames via the RRH. +

+ + + +

2. Transport-Layer Pacing (L4S)

+ +

+ For the growing ecosystem of L4S-capable clients (iOS, macOS, Linux, + Windows), control is applied at the Operating System kernel. +

+ + + +

3. RF Footprint Shaping (Beacon Power)

+ +

+ Fi-Wi manipulates the physical environment to restrict which RRHs a + client perceives as viable, effectively "shoving" media access demand to + specific airtime domains. +

+ + + +

4. Statistical Parameter Biasing (WMM/AIFS)

+ +

+ As a defense-in-depth measure for legacy clients, Fi-Wi advertises tuned + WMM EDCA parameters. +

+ + + +
+ Summary: Fi-Wi does not rely on a single method to + control clients. It uses Triggers for precision, + L4S for flow-rate discipline, + RF Shaping for load balancing, and + WMM as a statistical safety net. +
+
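One way to picture the hierarchy in Figure 9-3 is as a capability-driven policy lookup. The capability flag names here are invented for illustration; the tier descriptions follow the figure:

```python
def governance_tiers(client):
    """Return the control tiers governing a client (Figure 9-3).

    `client` is a dict of capability flags. Tiers are cumulative:
    a Wi-Fi 7 L4S client is governed by all four, a legacy client
    only by tiers 3 and 4."""
    tiers = []
    if client.get("he_trigger"):       # 802.11ax/be uplink scheduling
        tiers.append("1: Trigger-frame scheduling (deterministic)")
    if client.get("l4s_ecn"):          # ECN-capable transport stack
        tiers.append("2: L4S/ECN pacing (transport)")
    tiers.append("3: Beacon power shaping (RF steering)")   # everyone
    tiers.append("4: WMM/AIFS biasing (statistical)")       # everyone
    return tiers
```

Even a client with no modern capabilities still lands in tiers 3 and 4, which is what makes WMM the statistical safety net rather than the primary mechanism.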
+ +

9.6 What Dynamic Point Selection Does NOT Enable

+ +

+ To maintain technical accuracy, it is important to clarify what Fi-Wi's + dynamic point selection does not provide: +

+ + + +

These capabilities would require either:

+ + + +

+ Fi-Wi's architecture deliberately focuses on capabilities achievable with
  COTS Wi-Fi chips, providing a 2–3x capacity improvement through
  intelligent management rather than pursuing the 4–6x gains that would
  require custom silicon development.

+ +

9.7 Performance Expectations

+ +

+ Based on the capabilities described above, Fi-Wi provides the following + performance improvements over traditional autonomous AP deployments: +

+ + + +

+ These gains are achieved through + centralized intelligence and microsecond-latency fronthaul, not through RF phase control or coordinated transmission. The + architecture remains fully compliant with unlicensed spectrum regulations + and works with commodity Wi-Fi chipsets. +

+ +

9.8 Summary

+ +

+ Fi-Wi transforms the problem of wireless density by treating it as a + routing and scheduling problem rather than an RF + coordination problem. By centralizing packet memory and MAC scheduling, + Fi-Wi converts adjacent radios from interferers into + dynamically selected access points, allowing the network + to scale capacity through intelligent management rather than collapsing + under interference. +

+ +

+ The key insight is that most Wi-Fi performance problems stem from poor + decisions (wrong AP, wrong channel, wrong timing) rather than fundamental + RF limitations. Fi-Wi solves this by providing the Concentrator with + complete visibility and control, enabling microsecond-granularity + optimization that autonomous APs cannot match. +

+ +
+ +

10. Fi-Wi value vs. Traditional Distributed APs

+ +

+ Modern enterprise Wi-Fi deployments use centralized controllers (Cisco + WLC, Aruba Mobility Controller, Ubiquiti UniFi, Ruckus SmartZone, etc.) to + manage multiple APs. These controllers coordinate the + control plane: channel assignment, transmit power, client + association hints, roaming policies, and security. However, these remain + loosely-coupled systems where the data plane — + queueing, MAC scheduling, aggregation, and packet memory — remains + distributed inside individual APs. +

+ +

+ A traditional AP is not just “running EDCA.” It is running EDCA + after juggling dozens or hundreds of logical MAC queues and state + machines: +

+ + + +

+ With N stations, an AP can easily have on the order of + N × (4–8) logical queues behind a single RF channel. + Every AP in the same RF domain runs this large, isolated, queue-filled + state machine independently. No AP has a global view; controllers see only + coarse statistics. +

+ +

The result:

+ + + +

+ Fi-Wi is fundamentally different: it centralizes both control plane + and data plane with shared state across all RRHs. The + concentrator does not just configure RRHs; it directly manages their + queues, schedules their TXOPs, maintains unified CSI and airtime state, + and applies coordinated ECN marking for each airtime domain. This + architectural difference — not just improved control-plane coordination — + is what enables Fi-Wi’s latency, L4S, and spatial multiplexing advantages. +

+ +
+

Diagram 10-1: Queue Explosion Inside a Traditional AP

+ +
+┌──────────────────────────── Traditional Distributed AP ───────────────────────────┐
+│                                                                                   │
+│  Many MAC queues hidden inside each AP:                                           │
+│                                                                                   │
+│    ┌─────────────┐  ┌─────────────┐  ┌─────────────┐                              │
+│    │ STA 1 TID   │  │ STA 2 TID   │  │ STA N TID   │   ... (N stations × 4–8 TIDs)│
+│    │ Queues      │  │ Queues      │  │ Queues      │                              │
+│    └─────┬───────┘  └─────┬───────┘  └─────┬───────┘                              │
+│          │                │                │                                      │
+│   ┌──────▼────────────────▼────────────────▼──────────┐                           │
+│   │   Firmware Queues (Aggregation, Reorder, BAR/BA)  │                           │
+│   └───────────┬───────────────────────────────────────┘                           │
+│               │                                                                   │
+│   ┌───────────▼──────────────┐                                                    │
+│   │ Hardware MAC Ring Buffers│   (TX/RX DMA)                                      │
+│   └───────────┬──────────────┘                                                    │
+│               │                                                                   │
+│   ┌───────────▼──────────────┐                                                    │
+│   │ EDCA / CSMA-CA Contention│   (Per-AP, no coordination)                        │
+│   └───────────┬──────────────┘                                                    │
+│               │                                                                   │
+│        Long, multi-ms TXOP bursts, inconsistent ECN, early collapse               │
+│                                                                                   │
+└───────────────────────────────────────────────────────────────────────────────────┘
+  
+

+ See also: + Section 2.1 — Why L4S + Legacy Wi-Fi Struggle, + Appendix A — 802.11 Backoff & Collapse Dynamics. +

+
+ +

+ The following subsections detail specific benefits of Fi-Wi’s + cellularized, tightly-coupled architecture compared to controller-managed, + loosely-coupled AP systems. +

+ +

10.1 Deterministic low latency

+ +

Traditional APs:

+ +

+ Each AP builds its own local queues. Under load, large aggregates, + retries, and hidden buffering produce multi-millisecond queueing and + service delays. Tail latency is largely uncontrolled, and varies across + APs sharing the same channel. +

+ +

Fi-Wi (cellularized Wi-Fi, cell-per-room):

+ + + +

10.2 Stable L4S behavior

+ +

Traditional APs:

+ +

+ L4S flows traverse multiple hidden queues: wired bottlenecks, AP-local + queues, firmware queues, and EDCA contention. ECN marking (if it exists at + all) is inconsistent and not tied to a single bottleneck. Collapse + produces noisy, bursty marking or loss, and the L4S control loop becomes + oscillatory or falls back toward classic congestion behavior, especially + in the tails that matter to users. +

+ +

Fi-Wi:

+ + + +

10.3 Aggregation without losing visibility

+ +

Traditional APs:

+ +

+ Aggregation improves PHY efficiency but hides individual packet timing + from the congestion controller. The controller does not know which MSDUs + were grouped into a TXOP, what the queue state was when the TXOP started, + or how long each device has been waiting. +

+ +

Fi-Wi:

+ + + +

+ This combination yields high PHY efficiency and transport-layer + visibility into congestion, instead of having to choose one or the other. +

+ +

10.4 Building-scale coordination

+ +

Controller-managed loosely-coupled APs:

+ +

+ The controller can adjust channels, power, and send steering hints (e.g., + 802.11v), but it cannot see or control: +

+ + + +

+ As a result, these systems rely on heuristic, reactive policies: channel + reassignment after interference is observed, power adjustments based on + neighbor reports, and client steering using RSSI or airtime snapshots. + These help, but they operate on coarse time scales (seconds to minutes) + and cannot fix the fundamental data-plane issues of distributed queues, + MAC contention, and tail latency under load. +

+ +

Fi-Wi cellularized architecture:

+ +

+ The concentrator maintains true + shared state across all RRHs in the building: +

+ + + +

+ Because RRHs are distributed in space (often 2–4 per room in high-density + deployments), Fi-Wi can leverage spatial separation for intelligent + frequency reuse. The concentrator sees CSI from all RRHs and can make + microsecond-granularity decisions about which RRH should transmit each + packet — all while preserving the "single bottleneck queue per airtime + domain" discipline required for stable L4S behavior. +

+ +
+

+ Diagram 10-2: Fi-Wi Centralized Queueing, Scheduling, and Shared State +

+ +
+┌─────────────────────────── Fi-Wi Cellularized Architecture ────────────────────────────┐
+│                                                                                        │
+│     One deep queue per airtime domain                     Shared CSI + µs timestamps   │
+│                                                                                        │
+│          ┌───────────────────────────────────────────┐                                 │
+│          │ Centralized Airtime-Domain Queue (ECN AQM)│◄──────────┐                     │
+│          └───────────────────┬──────────────────────┘            │                     │
+│                              │                                   │                     │
+│   ┌──────────────────────────▼──────────────────────────┐        │                     │
+│   │   Concentrator Scheduler (L4S, TXOP, RF Grouping)   │◄───────┘                     │
+│   │        Dynamic Point Selection per Packet           │                              │
+│   └───────────────┬─────────────────────────┬───────────┘                              │
+│                   │                         │                                          │
+│       PCIe/Fiber  │                         │   PCIe/Fiber                             │
+│                   │                         │                                          │
+│   ┌───────────────▼─────────────┐  ┌────────▼──────────────┐  ...                      │
+│   │    RRH 1 (Thin MAC/PHY)     │  │   RRH 2 (Thin MAC/PHY)│                           │
+│   └───────────────┬─────────────┘  └────────┬──────────────┘                           │
+│                   │                         │                                          │
+│             Selected RRH transmits; others silent in this TXOP                         │
+│                                                                                        │
+└────────────────────────────────────────────────────────────────────────────────────────┘
+  
+

+ See also: + Section 4 — Key Fi-Wi Mechanisms, + Section 5 — Control Architecture, + Section 9 — Dynamic Point Selection. +

+
+ +

10.5 Control Plane vs. Data Plane

+ +

+ The table below summarizes the architectural differences between + controller-managed, loosely-coupled APs and Fi-Wi's cellularized, + tightly-coupled architecture: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
CapabilityController-Managed Loosely-Coupled APsFi-Wi Cellularized Tightly-Coupled
Control Plane
Channel assignment✓ Centralized✓ Centralized
Transmit power control✓ Centralized✓ Centralized + dynamic beacon shaping
Client steering hints✓ Centralized (802.11v/k)✓ Centralized
Data Plane
Packet queues + ✗ Distributed per-AP; many hidden per-STA/per-TID/firmware queues + ✓ Exactly one deep queue per airtime domain in the concentrator
MAC scheduling & aggregation✗ Autonomous per-AP; long TXOPs under load✓ Coordinated across RRH groups; TXOP length explicitly bounded
Timestamp synchronization✗ Not available at packet level✓ µs-accurate (PTM/PTP) shared across RRHs
Shared CSI state✗ Per-AP only; summarized to controller✓ Building-wide CSI aggregation at the concentrator
Queue visibility & AQM✗ Hidden in each AP; no global AQM + ✓ Fully visible per domain; explicit L4S/AQM on the true bottleneck +
L4S/ECN marking point✗ Inconsistent or absent; multiple uncontrolled bottlenecks✓ Single, well-defined marking point per airtime domain
Dynamic point selection✗ Clients statically associated to one AP✓ Per-packet RRH selection based on real-time CSI (Section 9)
Selection diversity✗ Single AP receives uplink✓ Multiple RRHs receive; best copy selected (Section 9)
Intelligent frequency reuse✗ Static channel plan✓ Dynamic adaptation based on spatial isolation (Section 9)
Per-packet steering between radios✗ Not available✓ Active redundancy and fast failover (Section 8)
Dynamic RF grouping✗ Static AP boundaries✓ Adaptive airtime domains based on CSI and load (Section 6)
+ +
+ Key insight: controller-managed systems coordinate + configuration but leave data-plane behavior distributed and autonomous. + Fi-Wi unifies the data plane with shared state and explicit control of + queues and TXOPs, enabling fundamentally different behavior for latency + control, dynamic point selection, and building-scale coordination. All + capabilities are achieved with COTS Wi-Fi chipsets and comply with + unlicensed spectrum regulations. +
+ +

10.6 Operational and lifecycle advantages

+ +

Controller-managed loosely-coupled APs:

+ + + +

Fi-Wi cellularized architecture:

+ + + +
+ +
+

+ 11. RRH Physical Envelope: Power, Thermals, and Size +

+ +

+ The economic viability of a "Cell-Per-Room" architecture hinges on the + Remote Radio Head (RRH) being fundamentally simpler, cooler, and cheaper + than a traditional Enterprise Access Point. By offloading complex logic + to the Concentrator (Section 13) and precision timing to the Fronthaul + (Section 4.7), the RRH becomes a lean physical device. +

+ +

+ 11.1 The Silicon Strategy: Mobile vs. Enterprise SKUs +

+ +

+ Fi-Wi explicitly selects + Mobile/Client Wi-Fi 7 chipsets (e.g., Qualcomm + FastConnect or Broadcom BCM43xx client series) rather than traditional + Enterprise AP/Networking SKUs. While Section 4.7 detailed how this + enables external clocking, this choice is equally critical for the + physical envelope: +

+ + + +

11.2 Power Budget Composition

+ +

+ We set a hard budget of 3.5–4 W total per RRH, enabling + Power over Ethernet (PoE) Class 1 or 2 operation, or simple remote + powering over hybrid fiber/copper cables. +

+ + + +

11.3 Thermal and Mechanical Implications

+ +

+ A sub-4W envelope fundamentally changes the industrial design + possibilities for the RRH: +

+ + + +

11.4 Concentrator-Side Considerations

+ +

+ Fi-Wi relies on a "Split Thermal" architecture. We deliberately shift + the power density from the edge (the ceiling) to the core (the wiring + closet). +

+ + +
+ +
+ +

12. PCIe Fronthaul (Gen3 x1 over Fiber)

+ +

12.1 Why PCIe as the RRH interface

+ +

+ A central hardware design choice is to make the RRH look like a + PCIe endpoint to the Fi-Wi concentrator. This leverages + the fact that: +

+ + + +

Benefits of this choice:

+ + + +

+ We start with PCIe Gen3, one lane (x1), carried over + fiber via a retimer + optical interface. Higher generations or widths + (Gen4, x2/x4) are possible later but not required for the initial Fi-Wi + performance targets. +

+ +

12.2 Gen3 x1 throughput

+ +

PCIe Gen3 provides:

+ + + +

+ After protocol overhead (TLP headers, DLLPs, flow control), the + sustained payload throughput for Gen3 x1 is in the rough + range of 6–7 Gb/s for large transfers. This is more + than sufficient for: +

+ + + +

+ If a future RRH design must exceed this, the same architecture scales to: +

+ + + +

+ For our initial Fi-Wi deployment assumptions, + Gen3 x1 over fiber is a sensible and sufficient starting point. +
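The 6–7 Gb/s figure follows from simple accounting: 8 GT/s line rate, 128b/130b encoding, per-TLP framing/header/LCRC overhead, and a small share of the link spent on DLLPs. A back-of-envelope sketch, where the overhead values are typical numbers rather than measurements from this system:

```python
def gen3_x1_payload_gbps(mps_bytes=256, tlp_overhead_bytes=24,
                         dllp_share=0.05):
    """Back-of-envelope PCIe Gen3 x1 payload throughput.

    8 GT/s line rate with 128b/130b encoding gives ~7.88 Gb/s of
    data bits; each max-payload TLP carries ~24 B of framing,
    header, and LCRC; a few percent of the link goes to DLLPs
    (ACKs, flow-control credits)."""
    raw_gbps = 8.0 * 128 / 130
    tlp_efficiency = mps_bytes / (mps_bytes + tlp_overhead_bytes)
    return raw_gbps * tlp_efficiency * (1.0 - dllp_share)

print(round(gen3_x1_payload_gbps(), 2))   # ~6.8, within the 6-7 Gb/s range
```

Varying the max payload size or overhead assumptions moves the result within roughly 6–7 Gb/s, which is why the text quotes a range rather than a single number.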

+ +

12.3 Latency characteristics and budget

+ +

PCIe Gen3 latency has several components:

+ + + +

Order-of-magnitude:

+ + + +

Compared to:

+ + + +

+ the PCIe-over-fiber latency is effectively negligible. It + comfortably fits within the microsecond-level time base used for: +

+ + + +

12.4 Mapping queues and metadata

+ +

+ The PCIe model fits naturally with the Fi-Wi queueing and metadata scheme. + Each RRH behaves like a PCIe endpoint with: +

+ + + +

+ The FiWiMeta header lives in host memory adjacent to packet + payloads and is referenced by these descriptors. +

+ +

Downlink flow:

+ +
  1. Concentrator enqueues IP/Ethernet packets into a group queue in DRAM, allocates or updates FiWiMeta (including t_ingress_us and queue snapshot).
  2. Scheduler posts PCIe descriptors to the RRH for the next TXOP, selecting which MSDUs and which RF group/airtime domain.
  3. RRH DMA-fetches the MSDUs via Gen3 x1, builds an aggregate (A-MPDU), transmits over the air, and reports:
+ +
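The downlink steps above can be sketched in Python. Apart from `FiWiMeta` and `t_ingress_us`, which the text defines, all names and fields here are hypothetical illustrations, not the production driver API:

```python
# Minimal sketch of the downlink enqueue -> descriptor-post path.
# Only FiWiMeta / t_ingress_us come from the spec; the rest is assumed.
import time
from dataclasses import dataclass, field

@dataclass
class FiWiMeta:
    t_ingress_us: int   # ingress timestamp on the microsecond time base
    queue_depth: int    # queue snapshot taken at enqueue

@dataclass
class GroupQueue:
    packets: list = field(default_factory=list)

    def enqueue(self, payload: bytes) -> FiWiMeta:
        meta = FiWiMeta(t_ingress_us=int(time.monotonic_ns() // 1000),
                        queue_depth=len(self.packets))
        self.packets.append((payload, meta))
        return meta

def post_txop_descriptors(q: GroupQueue, budget: int) -> list:
    """Select the MSDUs for the next TXOP; the RRH DMA-fetches these."""
    batch, q.packets = q.packets[:budget], q.packets[budget:]
    return batch
```

The key property the sketch shows: metadata lives beside the payload in host memory, and the RRH only ever sees the descriptors the scheduler chooses to post.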

Uplink flow:

+ +
    +
  1. + RRH receives 802.11 frames from STAs, decodes them, and attaches CSI and + MAC status. +
  2. + +
  3. + RRH DMA-writes the frames + metadata into concentrator DRAM via PCIe. +
  4. + +
  5. + Concentrator: + +
  6. +
+ +

In both directions, the PCIe fronthaul:

+ + + +

12.5 PCIe Hot Swap

+ +

+ A critical operational requirement for Fi-Wi is the ability to service, + replace, or add RRHs without bringing down the entire building's wireless + network. PCIe provides native support for this through + hot-plug capability, which is standard in enterprise + server platforms and can be leveraged for Fi-Wi deployments. +

+ +

12.5.1 Hot-plug fundamentals

+ +

+ PCIe hot-plug allows physical insertion and removal of endpoint devices + (RRHs) while the system is running: +

+ + + +

12.5.2 RRH insertion flow

+ +

When a new RRH is connected or powered on:

+ +
    +
  1. + Physical detection: PCIe hot-plug controller detects + the new device via link training. +
  2. + +
  3. + Enumeration: Concentrator OS (Linux) enumerates the new + PCIe endpoint: + +
  4. + +
  5. + Driver initialization: Fi-Wi driver: + +
  6. + +
  7. + RF group integration: Concentrator control plane: + +
  8. +
+ +

+ Time from physical insertion to active traffic forwarding: typically + 1–5 seconds, depending on link training, driver + initialization, and RF group discovery. +

+ +

12.5.3 RRH removal flow

+ +

+ When an RRH is removed (planned maintenance, failure, or surprise + disconnection): +

+ +
    +
  1. + Detection: PCIe hot-plug event or surprise removal + detected: + +
  2. + +
  3. + Traffic rerouting: Concentrator immediately: + +
  4. + +
  5. + Queue cleanup: Driver: + +
  6. + +
  7. + RF group adjustment: Control plane: + +
  8. +
+ +

+ Impact on active connections: minimal to none for STAs + served by multi-RRH domains. Traffic seamlessly fails over to remaining + RRHs within the same RF group. For isolated single-RRH cells, removal + causes brief disconnection until STAs reassociate with neighboring cells. +
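The failover behaviour described above can be sketched as a small control-plane routine. The data structures and the `await_reassociation` flag are illustrative assumptions; the real control plane is more involved:

```python
# Sketch of rerouting when an RRH disappears (surprise removal).
# Group/RRH naming and the orphan flag are hypothetical illustrations.

def fail_over(rf_groups: dict, removed_rrh: str) -> dict:
    """Drop the removed RRH from each RF group's eligible-transmitter set.

    Groups with surviving members keep serving traffic from the same
    centralized queues; groups left empty are flagged so their STAs can
    reassociate with neighbouring cells.
    """
    orphaned = {}
    for group, rrhs in rf_groups.items():
        rrhs.discard(removed_rrh)
        if not rrhs:
            orphaned[group] = "await_reassociation"
    return orphaned

groups = {"floor1": {"rrh-a", "rrh-b"}, "closet": {"rrh-b"}}
orphans = fail_over(groups, "rrh-b")
```

Because the queues never lived on the removed RRH, "rerouting" is just membership editing, which is what makes sub-second failover plausible.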

+ +

12.5.4 Operational advantages

+ +

Hot-plug capability provides critical operational benefits:

+ + + +

12.5.5 Design considerations

+ +

To fully support hot-swap in production deployments:

+ + + +

12.5.6 Contrast with traditional APs

+ +

Traditional distributed APs handle failures differently:

+ + + +

+ Fi-Wi's PCIe hot-plug, combined with multi-RRH airtime domains and + centralized queues, enables + sub-second failover with minimal packet loss—a + qualitative improvement over traditional Wi-Fi high-availability + approaches. +

+ +

12.5.7 Integration with L4S and queue management

+ +

+ Hot-swap events interact cleanly with Fi-Wi's L4S and queueing + architecture: +

+ + + +

+ This separation—queues and control in the concentrator, timing-critical + MAC in hot-swappable RRHs—is precisely what enables graceful hardware + lifecycle management while maintaining the control-theoretic cleanliness + that L4S requires (Appendix A). +

+ +
+ +

+ 13. Hardware Architecture: The Workstation Concentrator vs. The Legacy AP +

+ +

+ To understand why Fi-Wi achieves deterministic latency where traditional + Wi-Fi fails, we must look beyond the protocol and into the physical + architecture of the devices. The feasibility of the "Cut-Through" RRH + design relies on the upstream link being non-blocking. Fi-Wi achieves this + by replacing the internal switching fabric of legacy APs with the massive + PCIe lane overprovisioning of a workstation-class Concentrator. +

+ +

+ 13.1 The Legacy Bottleneck: Anatomy of a Traditional AP +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Component       | Traditional AP (The Appliance)           | Fi-Wi RRH (The Peripheral)
Core Silicon    | Complex SoC (Quad-core CPU, NPU, Switch) | Thin PHY/MAC + PCIe Retimer
Data Path       | Store-and-Forward (Switch → CPU → DMA)   | Cut-Through (Fiber → PCIe → Air)
Queues          | 1000s of opaque hardware queues          | Zero deep queues (FIFO only)
Decision Making | Autonomous (Local Scheduler)             | None (Slave to Concentrator)
+ +

+ A traditional Enterprise Access Point is functionally a + "Router-on-a-Stick." It forces high-speed wireless traffic through a + series of internal serialization bottlenecks before the software ever sees + the packet. +

+ +
TRADITIONAL AP ARCHITECTURE (The Traffic Jam)

        [ Cat6 Cable ]
              |
   +----------v-----------+
   |    RJ45 Magnetics    |
   +----------+-----------+
              |
   +----------v-----------+
   |   Ethernet Switch    |  <--- Queuing Point A: Switch Buffer
   |       (or PHY)       |       (Head-of-Line Blocking / Opaque)
   +----------+-----------+
              |
              |  GMII / RGMII / SGMII Link
              |  (Fixed 1G or 2.5G Pipe)
              |
   +----------v-----------+
   |        AP SoC        |
   |                      |
   |   [ CPU / OS ]       |  <--- Queuing Point B: Kernel/Driver
   |        |             |       (Software Bridging Latency)
   |        v             |
   |   [ HW DMA Rings ]   |  <--- Queuing Point C: Hardware Queues
   |   (Per Station/AC)   |       (The "Blind" Enqueue Point)
   |        |             |
   |   [ Wi-Fi MAC/BB ]   |
   +--------+-------------+
            |
        [ Radios ]
+ +

Architectural Flaws in Legacy APs:

+ +
    +
  1. + The GMII Choke: The interface between the internal + Switch and the CPU is a serialized bottleneck (typically GMII/SGMII). + High-speed bursts from Wi-Fi 6E/7 radios can saturate this single link, + causing invisible backpressure inside the SoC. +
  2. + +
  3. + Triple Buffering: A packet is buffered at the Switch + (Point A), then in system RAM (Point B), and finally in the Hardware DMA + Ring (Point C). This "Store-and-Forward" chain destroys the precise + timing required for L4S. +
  4. + +
  5. + Opaque Switching: The internal switch operates + autonomously. The CPU has no visibility into the depth of the switch's + internal buffers, meaning latency accumulates invisibly before the OS + can measure it. +
  6. +
+ +

13.2 The Fi-Wi Solution: The 92-Lane Fabric

+ +

+ Fi-Wi eliminates the internal switch, the GMII link, and the autonomous + CPU. By utilizing high-end workstation silicon (e.g., AMD Threadripper Pro + or Intel Xeon W-3400 series), the Concentrator provides + 92 to 128 native PCIe lanes directly from a CPU with + 24 to 96 high-performance cores. +

+ +

The 92+ lanes of PCIe eliminate the need for an internal Ethernet switch anywhere in the datapath.

+ +
TOPOLOGY COMPARISON

 Standard Server + Switch        Fi-Wi Workstation Concentrator
 ┌─────────────┐                 ┌──────────────────────────┐
 │  Dual CPU   │ 20 Lanes        │     Workstation CPU      │
 │ (High Core) │ per CPU         │ (24-96 Cores, High Freq) │
 └──────┬──────┘                 └────────────┬─────────────┘
        │                        ||||||||||||||| (92 Native Lanes)
 ┌──────▼──────┐                 ↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
 │ PCIe Switch │ (Congestion     RRH RRH RRH RRH (Direct Attach)
 └─┬─┬─┬─┬─┬─┬─┘  Point)         ... ... ... ...
   ↓ ↓ ↓ ↓ ↓ ↓
 RRH Connections
+ +

13.3 Dedicated Resources and Determinism

+ +

+ By mapping each RRH (or small groups of RRHs) to dedicated root ports on + the CPU, Fi-Wi achieves a Non-Blocking Architecture: +

+ + + +

+ This guarantees that the host DRAM behaves like + Deterministic Ultra-Low Latency Memory rather than a + shared network resource. This stability is the physical foundation that + allows the software-defined queues (Section 14) to operate with + microsecond precision. +

+ +
+ Historical Analogy: How the Cisco 7500 Removed the "Global + Lock"
+ +

+ Just as Fi-Wi removes blocking via massive PCIe lane availability, the + CyBus ASIC in the Cisco 7500 (1990s) solved a similar + bottleneck in routing. +

+ + + +

+ Fi-Wi applies this same "Non-Blocking" philosophy to the wireless + stack, utilizing 92+ lanes of PCIe to ensure that RRH memory access is + never gated by a shared internal switch or software mutex. +

+
+ +

14. Hardware Queues and the Software Advantage

+ +

14.1 The Hardware Queue Problem

+ +

+ Traditional Wi-Fi APs use hardware DMA (Direct Memory Access) rings to + meet strict 802.11 MAC timing requirements—SIFS and DIFS deadlines + measured in microseconds. While this solves the timing problem, it creates + a cascade of architectural constraints that Fi-Wi explicitly avoids. +

+ +

Hardware queues are expensive to implement in silicon. Each queue requires dedicated SRAM for descriptor storage, control logic for pointer management and overflow handling, and power even when idle. Current chip design limits traditional APs to hardware queues at L2 or MAC—typically the four WMM access categories (AC_VO, AC_VI, AC_BE, AC_BK) per radio, multiplied across the N associated stations.

+ +

This handful of queues is sufficient for basic priority handling, but it prevents the sophisticated per-flow scheduling that modern high-density networks require:

+ +
What AP hardware queues prevent:
 ✗ Per-flow fair queuing (would require 100+ queues)
 ✗ DualQ L4S per flow
 ✗ Dynamic queue allocation based on traffic patterns
+ +

14.2 The DMA Ownership Constraint

+ +

+ An equally significant problem is that once packets are enqueued to + hardware DMA rings, the CPU cannot access them without causing + race conditions. This "ownership transfer" creates fundamental + limitations: +

+ +
+ Critical constraint: All packet inspection, + classification, ECN marking, and policy decisions must occur + before handing packets to hardware. After DMA enqueue, software + is blind until transmission completes. +
+ +

This prevents:

+ + + +

14.3 Compensating Hardware

+ +

+ Because hardware queues are limited and packets become inaccessible after + DMA, traditional AP vendors must add compensating hardware functionality + to address these fundamental architectural limitations: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Fundamental Limitation                   | Hardware Workaround Required             | Complexity Added
Only 4-8 queues → no per-flow fairness   | Airtime fairness tracking engine         | Significant additional logic
Only 4-8 queues → no per-STA queuing     | MU-MIMO grouping and coordination        | Complex scheduling algorithms
Can't inspect after enqueue              | Hardware deep packet inspection engine   | Pattern matching, state tracking
Can't mark ECN in real-time              | Hardware ECN marker with threshold logic | Queue monitoring, marking logic
Can't reclassify flows dynamically       | Flow classification accelerator (TCAM)   | Fixed rules; high-priority only; cannot update easily
+ +

+ This compensating hardware represents substantial additional silicon area, + design complexity, and verification effort. More critically, + hardware-based solutions are fundamentally limited to fixed thresholds and + simple policies that were designed into the chip. They cannot implement + sophisticated algorithms like CoDel, PIE, or adaptive per-flow policies + that require complex state and frequent updates. +

+ +

14.4 Fi-Wi's Architectural Solution

+ +

Fi-Wi escapes these constraints through architectural separation:

+ +

RRH: Timing without queuing

+ +

+ RRH silicon implements only timing-critical functions (MAC/PHY, + synchronization) with zero hardware queues. Packets arrive from the + concentrator milliseconds before transmission, stay in simple descriptor + rings briefly, then transmit. No autonomous queuing or scheduling logic. +

+ +

Concentrator: Unlimited software queues

+ +

All queues live in concentrator DRAM. Because the concentrator operates at TXOP granularity (~600 µs) rather than SIFS granularity (16 µs), it has time for software scheduling. Queue structures are simple data structures in memory—vastly cheaper than dedicated silicon:

+ +
Concentrator per RF group:
 - 1000+ per-flow queues implemented as hash tables in DRAM
 - Each queue is a simple software structure (linked list or array)
 - Memory cost is negligible compared to 8+ GB server DRAM in concentrator
 - No power consumption when idle
 - Can be allocated/deallocated dynamically as needed

Enables what traditional APs cannot do:
 ✓ Per-flow fair queuing (stochastic fairness)
 ✓ DualQ L4S with separate queues per flow class
 ✓ Real-time ECN marking (actual sojourn time at TX)
 ✓ Sophisticated AQM (CoDel, PIE, custom algorithms)
 ✓ Deep packet inspection any time before TX
 ✓ Dynamic flow reclassification
 ✓ Full queue visibility for debugging
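A minimal sketch of such software queues, using plain Python containers keyed by the flow 5-tuple (the service discipline shown is a simplified round-robin stand-in for the real scheduler):

```python
# Per-flow queues as dictionary-backed software structures: allocation
# and deallocation are effectively free compared to fixed silicon queues.
from collections import defaultdict, deque

class RFGroupQueues:
    def __init__(self):
        self.flows = defaultdict(deque)   # 5-tuple -> FIFO of packets

    def enqueue(self, five_tuple, pkt):
        self.flows[five_tuple].append(pkt)   # queue created on demand

    def dequeue_round_robin(self):
        """One service pass across active flows (stochastic fairness)."""
        served = []
        for key in list(self.flows):
            served.append(self.flows[key].popleft())
            if not self.flows[key]:          # reclaim idle flow queues
                del self.flows[key]
        return served
```

The point is not the scheduling algorithm but the cost model: a new per-flow queue is a dictionary entry, so "1000+ queues" is unremarkable in DRAM.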
+ +

Packet ownership until last moment

+ +

+ The critical difference: packets remain in concentrator DRAM + (software-accessible) until milliseconds before transmission. The + scheduler can: +

+ + + +

The RRH owns packets only for the ~1 ms it takes to transmit a TXOP—too brief to constrain the system.
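Late ownership is what makes marking-at-dequeue possible: the sojourn time is computed at the moment the packet is handed to the RRH, not at enqueue. A minimal sketch, with an illustrative threshold in place of the real L4S ramp:

```python
# Sketch of ECN marking at dequeue time. The fixed threshold is an
# illustrative stand-in for a real L4S marking function.
L4S_MARK_THRESHOLD_US = 1000   # assumed 1 ms sojourn target

def dequeue_for_txop(pkt_meta: dict, now_us: int) -> dict:
    sojourn_us = now_us - pkt_meta["t_ingress_us"]
    return {
        "ecn_ce": sojourn_us > L4S_MARK_THRESHOLD_US,  # mark, don't drop
        "sojourn_us": sojourn_us,
    }

fresh = dequeue_for_txop({"t_ingress_us": 0}, now_us=400)
stale = dequeue_for_txop({"t_ingress_us": 0}, now_us=2500)
```

A hardware DMA ring cannot do this: once enqueued, the packet's headers are out of reach, so any mark reflects queue state at enqueue time, not at transmission.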

+ +

14.5 Economic and Strategic Impact

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Aspect                    | Traditional AP                          | Fi-Wi
Queue count               | N stations × 4-8 (at MAC or L2 level)   | 1000+ (dynamically allocated, per-flow 5-tuple)
Queue implementation      | Dedicated silicon (expensive)           | Software data structures (negligible cost)
Compensating logic        | Substantial silicon for workarounds     | None needed
Per-flow fairness         | Impossible (insufficient queues)        | Standard capability
Sophisticated AQM         | Simple thresholds only (hardware fixed) | Any algorithm (CoDel, PIE, ML-based)
Policy updates            | Requires new silicon design             | Software configuration or code update
Operational visibility    | Aggregate counters only                 | Full per-flow statistics and queue contents
Algorithm experimentation | Impossible in production                | A/B testing, gradual rollout possible
+ +

+ Beyond the direct silicon cost advantages, Fi-Wi gains strategic + advantages that compound over time: +

+ + + +

14.6 Architectural Principle

+ +

Fi-Wi's approach follows a clear design principle:

+ +
+ RRH (hardware): Only latency-critical functions requiring + microsecond determinism (MAC timing, PHY processing, synchronization).
+
+ Concentrator (software): All scheduling, queuing, + inspection, marking, policy, and adaptation—anything that benefits from + flexibility, visibility, or frequent updates. +
+ +

+ This separation is not arbitrary. It's driven by fundamental constraints: + hardware is expensive, inflexible, and opaque; software is cheap, + updatable, and inspectable. By placing intelligence in software and only + timing-critical functions in hardware, Fi-Wi achieves both the performance + of hardware-accelerated systems and the flexibility of software-defined + networking—advantages that traditional distributed-AP architectures cannot + replicate due to their need for autonomous per-AP decision-making at + microsecond timescales. +

+ +

15. Adaptive Control via Machine Learning

+ +

+ The Fi-Wi architecture's centralized observability enables machine + learning to optimize MCS transition dynamics on a per-site basis. Unlike + autonomous APs that operate on partial, local state, the Concentrator + observes the complete state-transition graph for all RRHs under a single + clock. This section describes how Fi-Wi combines physics-based models with + adaptive learning to optimize performance. +

+ +

+ 15.1 The MCS State Graph as a Probability Current Network +

+ +

+ The MCS state graph from Section 2.7 can be formalized as a probability + current network, where each node represents a PHY configuration state (MCS + index, spatial stream count) and edges represent transitions between + states. The system's behavior follows probability flow dynamics: +

+ +
+

+ Figure 15-1: Interactive Animation: MCS and Spatial Stream Performance + (with Eigen Space) +

+ +
+

+ Interactive Animation: MCS and Spatial Stream Performance (with Eigen + Space) +

+ +
+
+

Autonomous AP(s)

+ + + +
+ PER: 0.0% +
+ +
+ Eigen Vectors: 2 +
+ +
+ WLAN Util: + 0.0% (0 Mbps) +
+ +
+ P99.9 Latency: + 0 ms +
+ (802.11: 0 ms + 802.3: + 0 ms) +
+
+
+ +
+

Centralized Concentrator

+ + + +
+ PER: 0.0% +
+ +
+ Eigen Vectors: 16 +
+ +
+ WLAN Util: + 0.0% (0 Mbps) +
+ +
+ P99.9 Latency: + 0 ms +
+ (802.11: 0 ms + 802.3: + 0 ms) +
+
+
+
+ + +
+

+ Flow Field Visualization +

+ +
+
+

Autonomous AP(s) - Flow Field

+ + + +
+ Turbulent Flow (High Entropy) +
+
+ +
+

Centralized Concentrator - Flow Field

+ + + +
+ Laminar Flow (Low Entropy) +
+
+
+
+ + +
+ + +
+

+ Probability Current (J) - Flow Field Visualization +

+ +
+

+ What you're seeing: The vector field (arrows) + shows the "flow" of PPDUs through the MCS/Spatial Stream + space—the "river" of probability current that drives system + behavior. +

+ +

+ Autonomous AP (Left): Turbulent flow with + chaotic arrow directions, sometimes pointing backward when + collisions occur. Multiple shallow potential wells create + competing forces. This represents + High Entropy—the system doesn't know which way + is optimal. +

+ +

+ Centralized Concentrator (Right): Laminar flow + with smooth, coherent streamlines pointing toward the optimum. + Steeper gradients and deeper potential wells create strong + convergence. This represents + Low Entropy (Determinism)—the system has clear + direction toward the optimal state. +

+
+
+ + +
+
+ +
+ +
+ +
+ When enabled, only one device is visualized. All devices still + run in the background to drive system dynamics, but you can see + the turbulence affecting a single device more clearly. +
+
+ +
+ +
+ L4S ON: Optimizes both PHY rates and latency + (conservative, stable MCS)
+ Note: L4S ECN signaling only works with + Centralized Concentrator architecture.
+ Autonomous APs can't coordinate aggregate WAN state, so queue + delay reduction doesn't apply.
+ L4S OFF (Greedy): Maximizes PHY rates + (aggressive, higher MCS targets) +
+
+ +
+ +
+ Unchecked (Phase 1): Software MAC Coordination + only. Eliminates collisions, but Eigenvectors capped at 4 + (Hardware Limit).
+ Checked (Phase 2): FPGA-based Coherency. + Unlocks Distributed MIMO (Rank Expansion). Eigenvectors scale to + 16+. +
+
+ +
+
+ + +
+ 15 devices +
+
+ +
+ +
+
+
+ Autonomous AP(s): +
+ +
+ 1 domain +
+
+ +
+
+ Fi-Wi RRH per room: +
+ +
+ 400 sq ft/room (25 + domains) +
+
+
+
+ +
+ + +
+ 10,000 sq ft (1.5 + devices per 1,000 sq ft) +
+
+
+
+
+
+ + +
+
+ Technical Justification for FPGA (Phase 2) +
+ +
+
+ Phase 1: Coordinated Scheduling (Software/MAC) +

+ In Phase 1, the Central Concentrator uses standard MAC-level + timing to prevent APs from transmitting simultaneously on the same + frequency.
+
+ Result: This successfully eliminates the "Red" + (collisions) seen in the Autonomous model. However, because the + Radio Heads (RRHs) are not phase-aligned, they cannot perform + Joint Transmission. The channel rank is limited to the physical + antennas of a single RRH (Rank 4). Throughput hits a "Glass + Ceiling." +

+
+ +
+ Phase 2: Distributed MIMO (FPGA/PHY) +

+ In Phase 2, we introduce an FPGA to achieve sub-nanosecond + synchronization between RRHs. This allows multiple RRHs to act as + a single, distributed antenna array.
+
+ Result: This unlocks Rank Expansion. The + system can resolve 16+ spatial streams (Eigenvectors) + simultaneously. The "Glass Ceiling" is removed, and throughput + scales linearly with the number of RRHs deployed. +

+ +
+ Implementation Mechanism: To achieve <1ns + precision over fiber, the system utilizes the + White Rabbit (IEEE 1588 HA) protocol. An FPGA on + the RRH compensates for variable PCIe bus latency (using + PCIe PTM) and fiber propagation delay, ensuring + the RRH clock is phase-locked to the Central Concentrator. +
+
+
+
+ +
+ +

+ 15.2 What Gets Learned: The Transition Rate Matrix +

+ +

+ Machine learning in Fi-Wi optimizes the transition rate matrix + W based on telemetry that is only observable in a + centralized architecture. For each potential transition from state + i (MCSi, SSi) to state + j (MCSj, SSj), the learned rate depends on: +

+ +
+ Per-Transition Learning Inputs: + +
+ +

The learned transition rate function takes the form:

+ +
+
+ W[i→j] = f(CSI, PER, queue_depth, interference, density, time, + site_params) +
+
+ +

+ This learned function answers: + "Given the current state and observed conditions, what is the optimal + next MCS/SS configuration to meet the L4S latency target while + maximizing achievable throughput?" +

+ +
+ Slow Learning, Fast Execution: +

+ The ML engine operates on the control plane timescale with adaptive + update rates: milliseconds for sudden events (interference spike + detection requiring rapid response), seconds for typical rate adaptation + (matching the timescales demonstrated by minstrel/minstrel_ht + schedulers), and minutes for long-term pattern learning (daily traffic + patterns, where slower updates are sufficient). This decouples the + computational cost of learning from the latency constraints of packet + transmission. The scheduler does not run neural network inference per + packet—it uses a pre-computed policy matrix updated at rates appropriate + to the dynamics being observed. +

+
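The split between slow policy updates and fast per-TXOP lookups can be sketched as follows. The state discretization (MCS, SS, load bucket) and the table contents are assumptions for illustration, not the production data structure:

```python
# "Slow learning, fast execution": the ML engine refreshes a policy
# table asynchronously; the scheduler only performs an O(1) lookup.

policy = {}   # (mcs, ss, load_bucket) -> recommended next (mcs, ss)

def refresh_policy(learned_rates: dict) -> None:
    """Control-plane path (ms to minutes): rebuild the lookup table."""
    policy.update(learned_rates)

def next_config(mcs: int, ss: int, queue_depth: int):
    """Data-plane path (per TXOP): pure table lookup, no inference."""
    load_bucket = min(queue_depth // 32, 3)   # assumed coarse bucketing
    return policy.get((mcs, ss, load_bucket), (mcs, ss))  # default: hold

refresh_policy({(9, 2, 3): (7, 2)})   # e.g. back off under heavy load
```

The design choice this illustrates: inference cost is paid on the control-plane timescale, so the per-packet path never waits on the model.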
+ +

15.3 Physics-Informed Learning

+ +

+ Fi-Wi uses physics-informed machine learning that + combines Shannon capacity theory with learned corrections. This hybrid + approach provides explainability, sample efficiency, and principled + generalization. +

+ +

The transition rate decomposes into two components:

+ +
+
W[i→j] = Wphysics(SNR, BW) · Wlearned(site, time, load)
               ↑                        ↑
       Shannon-theoretic          Site-specific
            baseline               corrections
+
+ +

+ Wphysics: The physics baseline uses Shannon + capacity to establish theoretical bounds. For each MCS index, the required + SNR is known from 802.11 specifications (e.g., MCS 11 requires ~30 dB). + The base transition rate is the probability that current SNR exceeds the + threshold given measured CSI. +

+ +

+ Wlearned: The learned correction factor + captures deviations from ideal conditions on a per-station basis, as + different spatial stream capabilities and local RF environments require + station-specific adaptation: +

+ + + +

+ This approach uses residual learning: the physics model + Wphysics provides the coarse steering (the "prior"), while the + ML model learns the residual error Δ specific to the site. This guarantees + the system never performs worse than a standard physics-based model, even + before site-specific training converges. The ML correction is additive (or + multiplicative) to a known-good baseline. +
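The decomposition can be sketched numerically. The SNR thresholds and the soft-threshold shape are illustrative approximations, not values from the 802.11 specifications:

```python
# Sketch of the residual decomposition W = Wphysics * Wlearned.
# MCS SNR thresholds and the sigmoid soft-threshold are assumed values.
import math

MCS_SNR_DB = {7: 25.0, 9: 28.0, 11: 30.0}   # approx required SNR per MCS

def w_physics(target_mcs: int, snr_db: float) -> float:
    """Physics prior: probability the measured SNR clears the threshold."""
    margin = snr_db - MCS_SNR_DB[target_mcs]
    return 1.0 / (1.0 + math.exp(-margin))   # soft threshold on margin

def w_total(target_mcs: int, snr_db: float, w_learned: float = 1.0) -> float:
    # w_learned stays ~1.0 before site training converges, so the system
    # never does worse than the physics-based baseline.
    return w_physics(target_mcs, snr_db) * w_learned
```

With `w_learned = 1.0` the system behaves exactly like the prior; site training only bends the rates where measured outcomes diverge from theory.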

+ +

This decomposition provides three advantages:

+ +
    +
  1. + Explainability: When Wlearned deviates + significantly from 1.0, the system can flag anomalies and explain why + performance differs from theory. +
  2. + +
  3. + Sample Efficiency: The physics prior means the ML model + only needs to learn corrections rather than the full mapping + from scratch. +
  4. + +
  5. + Generalization: The base model Wphysics is + universal. Site-specific Wlearned factors can be initialized + from similar deployments and fine-tuned with site-specific data. +
  6. +
+ +

15.4 Training Data from Centralized Observability

+ +

+ The Concentrator's complete state visibility provides labeled training + examples that are impossible to obtain in distributed AP systems. Each + scheduling decision creates a training tuple: +

+ +
+ Training Example Structure: +
State_t:
 • MCS = 9, SS = 2 (current PHY configuration)
 • Queue depth = 50 packets
 • Sojourn time = 800 µs
 • CSI = [λ₁=0.92, λ₂=0.58, κ=8.2 dB] (from RRH-A)
 • PER_recent = 0.02 (last 100 packets)
 • Client density = 12 stations
 • Interference = -75 dBm

Action:
 • Transition to MCS = 7, SS = 2

Outcome_t+1:
 • PER = 0.01 (improved)
 • Throughput = 380 Mbps
 • Latency = 450 µs (met L4S target)
 • Queue drain rate = increased

Label: ✓ GOOD TRANSITION
+
+ +

+ Over time, the Concentrator accumulates thousands of these labeled + examples across varying conditions. The ML model learns patterns such as: +

+ + + +

+ This supervised learning is + only possible with centralized observability. As detailed + in Appendix H, autonomous APs lack: +

+ + + +

+ It's worth noting that supervised learning doesn't require perfect ground + truth labels to be effective—even relative quality assessments ("better" + vs "worse") can drive learning. However, Fi-Wi's complete observability + provides significantly richer training signals: precise measurements of + queue impact, throughput changes, and latency effects that enable more + efficient learning compared to the partial observability available to + autonomous systems. +
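How a scheduling decision becomes a labeled example can be sketched as below. The labeling rule shown (latency target met and PER not worse) is a simplified stand-in for the real outcome-quality function:

```python
# Sketch: turning one (state, action, outcome) tuple into a label.
# The GOOD/BAD rule is an assumed simplification of the real criterion.

L4S_TARGET_US = 500   # assumed latency target for the example

def label_transition(per_before: float, per_after: float,
                     latency_us: float) -> str:
    good = latency_us <= L4S_TARGET_US and per_after <= per_before
    return "GOOD" if good else "BAD"

example = {
    "state":   {"mcs": 9, "ss": 2, "per": 0.02},
    "action":  {"mcs": 7, "ss": 2},
    "outcome": {"per": 0.01, "latency_us": 450},
}
example["label"] = label_transition(0.02, 0.01, 450)
```

The crucial precondition is that every field in the tuple is measured under one clock; without centralized queues, `latency_us` and the queue impact simply are not observable.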

+ +

15.5 Transfer Learning Across Sites

+ +

+ Fi-Wi's ML strategy uses transfer learning to balance generalization + across sites with site-specific optimization: +

+ +

Base Model (Cross-Site Training):

+ +

+ A foundational model is trained across multiple deployment sites to learn + universal patterns: +

+ +
+
Wbase[i→j] = funiversal(CSI, PER, queue_depth, density)

Learns: general relationships between SNR, MCS, PER, and density
+
+ +

Site-Specific Adaptation:

+ +

+ When deployed to a new site, the base model is augmented with learned + corrections: +

+ +
+
Wsite[i→j] = Wbase[i→j] + Δbuilding + Δtemporal

Δbuilding: Building-specific RF corrections
 • Material attenuation (concrete vs drywall)
 • Room geometry (open-plan vs cubicles)
 • Persistent interference sources

Δtemporal: Time-varying patterns
 • Rush hour density
 • Weekend vs weekday usage
 • Seasonal variations
+
+ +

Continuous Adaptation:

+ +

+ The system continues to adapt using online learning with safety + constraints: +

+ + + +

15.6 The Learning Feedback Loop

+ +

+ Fi-Wi's ML capability creates a feedback loop that improves system + performance over time: +

+ +
+
1. Centralized Observability → Complete visibility of state, actions, outcomes
2. Supervised Learning → Labeled examples: (State, Action) → Outcome quality
3. Improved Transition Rates → Wlearned optimizes MCS selection per-site
4. Better User Experience → Higher throughput, lower latency, fewer errors
5. More Training Data → New conditions explored → model improves

[Cycle repeats continuously]
+
+ +

+ This loop is unique to centralized architectures. + Autonomous APs cannot generate ground truth labels without queue + observability. Coordinated AP systems (where APs share summaries via a + controller) see effects (latency, ECN) but not causes (queue growth, retry + timing, aggregation depth) due to high inference distance. +

+ +

+ Fi-Wi's centralized state graph provides the + causal observability that machine learning requires. The + probability current framework gives this learning a rigorous mathematical + foundation: we are learning the transition rate matrix of a physical + system governed by conservation laws. +

+ +
+ Summary: Centralization Enables Learning +

+ Machine learning requires complete, structured training examples where + actions, states, and outcomes are observable under consistent + measurement. Fi-Wi's centralized architecture provides this by design: + all state transitions occur under a single clock, all queue dynamics are + visible, and all RF outcomes are measurable. This makes the MCS + probability current learnable—something that is architecturally + impossible in distributed, autonomous systems. +

+
+ +

+ 15.7 The Multi-RRH Advantage: Learning the Spatial Network +

+ +

+ The presence of multiple concurrent Radio Heads (RRHs) serves as the + primary multiplier for the Fi-Wi machine learning capability. It + transforms the learning problem from optimizing a single isolated link + into optimizing a spatially coupled network. While a + traditional AP optimizes a local objective function (its own throughput), + the Fi-Wi Concentrator utilizes concurrent RRHs to construct a global view + of the RF environment. +

+ +

+ This multi-RRH architecture impacts the learning model in three critical + ways: +

+ +

1. Global RF State Visibility ("The Super-Eye")

+ +

+ In traditional systems, an AP is blind to the interference seen by its + neighbors. In Fi-Wi, the Concentrator aggregates real-time telemetry from + all RRHs simultaneously. +

+ +

This creates a Global RF State Matrix composed of:

+ + + +

+ This state matrix is sparse, time-aliased, and derived from + standards-compliant telemetry rather than continuous per-packet baseband + capture. +

+ +

+ The model learns not just that "Client A has a weak signal," but + specifically that "Client A is weak on RRH 1, strong on RRH 2, and creates + -80 dBm interference on RRH 3." This global observability enables the + prediction of building-wide interference patterns invisible to single-cell + learners. +

+ +

2. Expanded Action Space (Selection & Redundancy)

+ +

+ Because Fi-Wi treats multiple RRHs as an active redundant set, the ML + engine has a broader action space than a standard rate-control algorithm. + It learns not only how to transmit (MCS and scheduling decisions) + but which RRHs are eligible transmitters for a given packet. +

+ + + +

3. Phase 2 Capability: Eigenstructure & Rank Expansion

+ +

+ Note: This capability requires the hardware-synchronized FPGA + architecture (Phase 2). +

+ +

+ With sub-nanosecond synchronization, the ML engine will be able to resolve + the true distributed Eigenstructure of the + environment—the "shape" of available RF paths across distributed radios. + This allows for Rank Expansion, where the system resolves + more spatial streams (Eigenvectors) than a single physical AP could + support, scaling capacity approximately with the number of RRHs, subject + to channel rank and geometry. +

+ +

+ 15.8 Operational Calibration: Zero-Occupancy Sounding +

+ +

+ To ensure the physics-informed model converges accurately, Fi-Wi employs a + specific operational strategy: Zero-Occupancy Sounding. +

+ +

+ As described in Section 15.5, the site-specific transfer function is + composed of static building characteristics (Hstatic) and + dynamic temporal variations (Δtemporal). To disentangle these + variables, the system schedules automated channel sounding during hours of + minimum occupancy. +

+ + + +
+ The "Tare" Operation: +

+ In metrology, "tare" refers to zeroing a scale by removing known + weights to isolate what you want to measure. Similarly, Fi-Wi "tares" + the RF environment by measuring when human activity (the known + variable) is absent. +

+ +

+ Hmeasured(empty) ≈ Hstatic + Δbuilding +

+ +

+ By sounding when the building is empty, the system effectively removes + the noise of human movement and dynamic scatterers. This allows the + Concentrator to: +

+ +
    +
  1. + Isolate Hstatic: Establish a high-fidelity + ground truth of the static RF environment (walls, glass, steel). +
  2. + +
  3. + Calibrate the Physics Prior: Fine-tune the Shannon + capacity baseline (CShannon) against the specific physical + constraints of the deployment. +
  4. +
+
+ +

+ This establishes a stable baseline "Zero State" for the learning model, + ensuring that subsequent online learning is optimizing for dynamic changes + rather than relearning the static environment. + This separation dramatically improves offline RL dataset conditioning + by preventing the model from relearning static structure while adapting + to temporal dynamics. +

+ +

15.9 Bounded Model Validation During Idle Periods

+ +

+ While the primary learning mode is offline (using historical data), the + centralized Concentrator architecture enables a hybrid approach: + opportunistic, bounded model validation during predicted idle + periods. +

+ +

Idle Period Detection

+ +

+ Because the Concentrator has global visibility of queue states across all + RRHs in an Airtime Domain, it can predict when the RF channel will be + underutilized—a capability fundamentally unavailable to autonomous APs + that see only their local queues. +

+ + + +
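A minimal sketch of domain-wide idle detection, which only the concentrator can compute because only it sees every RRH's backlog (the threshold is an illustrative assumption):

```python
# Sketch of idle-window detection from global queue state: sum backlog
# across every RRH in an airtime domain. Threshold is illustrative.

IDLE_BACKLOG_PKTS = 4   # assumed "near-empty" threshold for the domain

def domain_is_idle(queue_depths_by_rrh: dict) -> bool:
    return sum(queue_depths_by_rrh.values()) <= IDLE_BACKLOG_PKTS

busy = domain_is_idle({"rrh-a": 40, "rrh-b": 3})
quiet = domain_is_idle({"rrh-a": 1, "rrh-b": 0, "rrh-c": 2})
```

An autonomous AP evaluating only its local queue would call the first case idle from `rrh-b`'s perspective; the domain-wide sum shows it is not.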

Safe Validation Protocol

+ +

+ During high-confidence idle predictions, the system can perform controlled + validation and calibration—not arbitrary exploration: +

+ + + +

+ These activities refine the offline model without introducing risk to + production traffic. +

+ +

Production Traffic Protection

+ +

+ Validation is strictly bounded to prevent interference with real traffic: +

+ + + +

+ This hybrid approach provides the safety of offline learning with the + adaptability of continuous refinement, exploiting natural traffic lulls + that autonomous APs cannot collectively identify. +

+ +

+ 15.10 Architectural Comparison: Why Autonomous APs Cannot Learn +

+ +

+ Machine learning for MCS optimization is fundamentally enabled by Fi-Wi's + centralized architecture and impossible in distributed AP systems: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Requirement for MLAutonomous APFi-Wi Concentrator
Global CSI visibility + ❌ Each AP sees only local channel; no cross-AP interference data + + ✅ Concentrator receives CSI from all RRHs; computes spatial + correlation matrix +
Cross-AP coordination state + ❌ Cannot observe other APs' band selection, power levels, or + scheduling decisions + + ✅ Centralized scheduler has complete visibility of all RRH + configurations and decisions +
Queue observability❌ Queue depth hidden in firmware; sojourn time not exposed✅ Centralized queuing with microsecond-resolution timestamps
Deterministic replay + ❌ Cannot reproduce exact RF conditions; firmware decisions opaque + + ✅ Complete event log enables replay of scheduling decisions and + outcomes +
Inference distance❌ High (5-10 steps from cause to transport-layer effect) + ✅ Low (1-2 steps; queue → schedule → TX outcome directly linked) +
+ +

+ This observability gap is not a vendor implementation issue—it is an + architectural limitation. Autonomous APs cannot generate + high-quality training labels without queue observability. +

+ +

+ 16. Concentrator Fast Path: DPDK, DMA, and Queue Determinism +

+ +

+ The preceding sections established the architecture of the Fi-Wi + concentrator: centralized packet memory (Section 4.4), group queues as the + sole AQM bottleneck (Section 4.3), microsecond timestamps written into the + Fi-Wi shim header (Section 4.2), and ML-driven MCS selection running + continuously against that centralized data (Section 15). This section + explains how the concentrator executes that pipeline with the determinism + the architecture requires — maintaining a single observable bottleneck per + airtime domain, applying ECN marks at the right moment, and keeping the + RRH free of scheduling logic. +

+ +

16.1 Why a Kernel-Bypass Data Plane

+ +

+ The Fi-Wi concentrator's latency and determinism targets strongly favor a + kernel-bypass data plane. A conventional interrupt-driven kernel path + would reintroduce jitter at exactly the point where the architecture is + trying to remove it. +

+ +

+ L4S requires ECN marks to be applied at the group queue on the same time
+ scale as a single 802.11 TXOP. The Linux kernel's
+ softirq-based packet path is subject to interrupt coalescing
+ and scheduler contention, and the resulting jitter accumulates across
+ bursts. More fundamentally: every packet that transits the kernel stack
+ competes with arbitrary OS activity for CPU time, queue depth is visible
+ to userspace only via a syscall, and the marking decision cannot be
+ co-located with the queue measurement in the same cache line.

+ +

+ Fi-Wi's concentrator data plane therefore runs via + DPDK (Data Plane Development Kit): tight busy-poll loops + on dedicated cores, with no interrupt-driven jitter. All packet operations + — receive, classify, AQM mark, forward — execute in a cache-resident loop + that preserves the single-bottleneck, fully-observable queue structure + that the rest of the architecture depends on. +

+ +

16.2 The Memory Model: IOMMU, VFIO, and Hugepages

+ +

+ DPDK allocates all packet buffers (mbufs) from hugepages, + eliminating TLB misses during packet processing. Each airtime domain's + group queue is a logically contiguous region within this space. The pool + is allocated once at startup; no per-packet memory allocation occurs on + the fast path. +

+ +

+ Each SFP+ NIC is bound to the vfio-pci driver. The system + IOMMU enforces DMA isolation: a card can only reach the memory regions + explicitly registered with it at startup. This gives the concentrator two + properties simultaneously: +

+ + + +
+
+Startup (once):
+  rte_pktmbuf_pool_create()
+    └─ VFIO registers hugepages with IOMMU
+    └─ NIC DMA engine can now reach mbuf pool directly
+
+Per-burst (dedicated lcore, busy-poll):
+  rte_eth_rx_burst(rrh_port, queue, pkts[], N)   ← NIC DMA → mbuf, no interrupt
+    └─ classify_airtime_domain(pkt)              ← (port, queue_id) → group queue index
+    └─ aqm_mark_l4s(pkt, queue_depth)            ← ECN CE if sojourn > threshold
+    └─ rte_eth_tx_burst(out_port, ...)           ← mbuf → NIC DMA, zero copy
+
+ Figure 16-1: Concentrator polling loop. No interrupts, no kernel + crossings, no per-packet allocation after startup. Queue depth and + sojourn time are visible in the same execution context as the ECN + marking decision. +
+
+ +

16.3 Airtime Domains as Hardware Queue Partitions

+ +

+ DPDK exposes each NIC's hardware receive queues independently. Fi-Wi uses + this to achieve a direct, lockless mapping from PCIe port and queue index + to airtime domain — the same logical grouping described in Section 6. Each + lcore owns a fixed set of (port, queue) pairs. Because ownership is + exclusive, there are no locks on the fast path and no shared state between + lcores during steady-state forwarding. +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Fast-Path PropertyKernel StackFi-Wi DPDK Pipeline
Receive and Queue Observability
Interrupt model + Hardware IRQ → softirq → NAPI poll; coalescing adds + jitter + + No interrupts. Dedicated lcore polls hardware queue register directly. +
Queue depth visibilityVisible inside kernel only; userspace access requires syscall + Directly readable by AQM loop in same CPU cache line as packet pointer +
Buffer allocationPer-packet skb allocation from kernel slabPre-allocated mbuf pool; zero allocation on fast path
AQM and Forwarding
ECN marking timingMarked in kernel qdisc; subject to scheduling lagMarked in polling loop body; co-located with queue measurement
Forwarding lookupRouting table + netfilter traversal(port, queue_id) → group queue index; O(1), cache-hot
Packet copyTypically 1–2 copies through socket buffer chainZero copies; mbuf pointer passed through the pipeline
Transmit
IOMMU interactionKernel maps and unmaps DMA regions per packet + IOMMU mapping established once at pool creation; static thereafter +
+ +

16.4 The L4S Marking Loop

+ +

+ The AQM marking step is deliberately minimal. The DPDK data plane does not + run a full queue scheduler — that is the outer control loop's + responsibility (Section 5). The inner loop does one thing: read sojourn + time from the shim header (Section 4.2) and set the ECN CE codepoint if + the threshold is exceeded. +

+ +
// Per-packet in the rx → tx burst loop:
+uint64_t sojourn_ns = now_tsc() - pkt->t_ingress;
+if (sojourn_ns > THRESHOLD_NS) {
+    rte_ipv4_l4s_mark(pkt);                       // in-place, no copy
+    fiwi_meta(pkt)->ecn_flags |= ECN_CE_APPLIED;
+}
+rte_eth_tx_burst(out_port, queue_id, &pkt, 1);
+
+ +

+ Because t_ingress is written by the same lcore at enqueue, no + cross-core communication is needed to compute sojourn time at dequeue. The + marking decision is local to the polling thread. This is what Section 4.3 + means when it says AQM runs "exactly where the integrator lives": the + integrator is the group queue, the group queue is an mbuf ring in hugepage + memory, and the marking loop touches that ring on every poll cadence with + no additional indirection. +

+ +

16.5 Fault Isolation via IOMMU Groups

+ +

+ In a multi-card concentrator, each SFP+ card appears in its own IOMMU
+ group, so each card can be bound to VFIO independently and the IOMMU
+ enforces that one card's DMA cannot reach another card's memory regions.
+ The IOMMU topology therefore provides natural fault isolation at the
+ card boundary: a PCIe error or runaway DMA event from one RRH is
+ contained within its card's group and cannot corrupt the packet memory
+ of an adjacent airtime domain. This is a hardware guarantee, not a
+ software policy.

+ +

16.6 What DPDK Does and Does Not Solve

+ +

+ The kernel-bypass data plane is not a complexity cost — it is the + mechanism that justifies the RRH's simplicity. Because the + concentrator runs a deterministic, observable pipeline that applies AQM, + tracks sojourn time, and manages all descriptor posting without OS + intervention, the RRH never needs to make a queuing or scheduling + decision. It remains a pure DMA client, exactly as the silicon cost + argument in Section 4.4 requires. +

+ +

+ Incumbent distributed APs have no equivalent. Because each AP operates + autonomously, it must run its own Linux network stack, its own + qdisc, and its own firmware scheduler. The CPU carrying that + stack is the dominant gate cost per RRH (Section 4.4, silicon cost table). + A centralized DPDK pipeline eliminates that requirement across every RRH + simultaneously — not by optimizing the AP implementation, but by removing + the architectural condition that forces the CPU to exist there in the + first place. +

+ +

+ That said, DPDK solves a specific problem: it gives the concentrator a + deterministic, observable, zero-copy execution path in which queue state, + ECN marking, and packet steering remain under unified software control. It + does not solve the radio-side interface. Per-packet MCS selection, EDCA + parameter control, and TX-outcome metadata from the Wi-Fi silicon remain + the next required interface boundary — the point at which concentrator + intelligence must reach into the RRH to close the control loop. DPDK is + the precondition; radio-side per-packet programmability is what completes + it. +

+ +

16.7 DualPI2 Baseline: Control Law and Queue Structure

+ +

+ Section 16.4 described the minimal ECN marking step — reading queue state and + applying a CE mark in the fast path. That sketch is sufficient to illustrate + where marking occurs, but it elides the control structure that makes + L4S coexistence with legacy traffic work: the + dual-queue coupled AQM defined in RFC 9332. +

+ +

+ This section defines the baseline DualPI2 control law as it + would be realized inside the DPDK polling loop. Fi-Wi preserves this + dual-queue topology, coupling mechanism, and PI-based control structure, but + Section 17 replaces the underlying congestion signal with + Airtime Debt (Di), grounding the controller in + predicted wireless service time rather than raw queue occupancy. +

+ +

16.7.1 The Two Queues

+ +

+ Each airtime domain maintains two logically independent mbuf rings in the + concentrator's hugepage pool: an L4S queue for scalable + congestion-control flows (senders marking with ECT(1)), and a + Classic queue for legacy RFC 3168 flows and unmarked + traffic. Classification happens at ingress on the fast path, before the + packet is enqueued, and costs a single bitfield check on the IP ECN field: +

+ +
// Ingress classification — per-packet, inline in the rx burst loop
+uint8_t ecn = (pkt_ip->type_of_service & 0x03);
+bool is_l4s = (ecn == 0x01 || ecn == 0x03);   // ECT(1) or CE — scalable sender
+
+fiwi_meta(pkt)->queue_class = is_l4s ? QUEUE_L4S : QUEUE_CLASSIC;
+enqueue_to_domain(pkt, domain_id, fiwi_meta(pkt)->queue_class);
+
+ +

+ Both queues drain toward the same transmit burst for that airtime domain. + The scheduler services the L4S queue with a strict low-latency budget and the + Classic queue at a rate that saturates the domain's aggregate share, matching + the DualPI2 service model from RFC 9332. +

+ +

16.7.2 The Coupling Mechanism

+ +

+ The key property of DualPI2 is that the two queues are not independent. + The Classic queue's drop probability pc — computed by + a PI controller from a congestion signal representing pressure at the shared + bottleneck — also governs the L4S queue's ECN marking probability via a + coupling factor k (default 2 in the Linux + sch_dualpi2 reference implementation). +

+ +
// Outer control loop — runs on a slow timer cadence (~16 ms), same lcore,
+// non-preemptive. Not per-packet.
+double signal_classic = ewma_update(&domain->classic_signal,
+                                    ring_depth(QUEUE_CLASSIC));
+domain->pi_integral += signal_classic - TARGET_CLASSIC;
+double p_c = fmax(0.0, K_P * (signal_classic - TARGET_CLASSIC)
+                     + K_I * domain->pi_integral);        // PI controller
+
+double p_l = COUPLING_K * p_c;   // Coupled L4S marking probability
+
+// Applied per-packet in the L4S dequeue path:
+double p_l_step = (sojourn_L4S_ns > THRESHOLD_L4S_NS) ? 1.0 : p_l;
+if (rte_rand() < (uint64_t)(p_l_step * (double)UINT64_MAX))
+    rte_ipv4_l4s_mark(pkt);      // Set ECN CE in-place, no copy
+
+ +

+ In a conventional queue-based implementation, signal_classic + would be an EWMA of Classic queue depth. In Fi-Wi, that queue-derived signal + is replaced as the PI controller input by + Airtime Debt (Di), a forward estimate of wireless + service time. The DualPI2 control law, coupling mechanism, and + dual-queue topology remain unchanged; only the input signal changes. +

+ +

+ Queue depth is a lagging indicator in Wi-Fi because contention, retries, and + variable PHY rates consume airtime without necessarily appearing in buffer + occupancy. Airtime Debt provides a forward-looking signal that better matches + the true wireless bottleneck while preserving the DualPI2 coexistence + structure required for L4S and Classic traffic to share the medium. +
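+ The control law can be exercised outside the fast path. The sketch below
+ (illustrative Python; the gains, target, and class name are assumed for
+ the example, not shipped constants) keeps the DualPI2 coupling
+ p_l = k · p_c while feeding the PI controller an Airtime Debt sample
+ instead of queue depth:

```python
COUPLING_K = 2.0           # k, matching the sch_dualpi2 default of 2
K_P, K_I = 0.0005, 0.0001  # illustrative PI gains (assumed values)
TARGET_US = 2000.0         # illustrative Airtime Debt target, microseconds

class CoupledPi:
    """Per-domain controller: PI on the congestion signal, with the
    L4S marking probability derived by coupling from p_c."""
    def __init__(self):
        self.integral = 0.0

    def update(self, airtime_debt_us):
        err = airtime_debt_us - TARGET_US
        self.integral += err
        p_c = min(1.0, max(0.0, K_P * err + K_I * self.integral))
        p_l = min(1.0, COUPLING_K * p_c)  # coupled L4S mark probability
        return p_c, p_l

ctl = CoupledPi()
p_c, p_l = ctl.update(2500.0)  # debt 500 us above target: nonzero pressure
```

+ Note that only the input changes: a debt sample above target raises
+ p_c, and the coupling lifts the L4S marking probability with it.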

+ +

16.7.3 Per-Domain State and the fiwi_update Interface

+ +

+ Each airtime domain carries its own DualPI2 state alongside the + fiwi_rrh_state struct (Section 17.5). Because each lcore + owns a fixed set of domains exclusively (Section 16.8), this state is + never shared across cores — no locks, no atomics, no cache-line bouncing on + the fast path. +

+ +

+ The telemetry path (Section 17.8) delivers ground-truth airtime + measurements back to the lcore via a lockless ring carrying + fiwi_update objects. The struct is defined here because it + originates in the DPDK fast-path layer and is consumed by it; + Section 17.8 populates it from Netlink/vendor telemetry events: +

+ +
/**
+ * fiwi_update — telemetry record posted by the Netlink callback,
+ * consumed by the DPDK lcore during its scheduling loop.
+ * Allocated from fiwi_update_pool (rte_mempool); returned after use.
+ */
+struct fiwi_update {
+    uint8_t  type;          /* AIRTIME_RECONCILE (only type currently defined) */
+    uint32_t rrh_id;        /* RRH index, validated < FIWI_MAX_RRHS before enqueue */
+    uint64_t actual_us;     /* Hardware-path-to-status interval (ground truth) */
+    uint64_t expected_us;   /* Forward estimate: T_phy + T_agg at enqueue time */
+    uint32_t retry_us;      /* Observed retry airtime from telemetry metadata */
+};
+
+ +
+
+Per-domain fast-path structure (allocated in hugepages, lcore-local): + + domain[d] + ├── l4s_ring mbuf ring, N_L4S slots (RING_F_SP_ENQ | RING_F_SC_DEQ) + ├── classic_ring mbuf ring, N_CLASSIC slots (RING_F_SP_ENQ | RING_F_SC_DEQ) + ├── classic_signal EWMA accumulator for controller input + ├── pi_integral PI controller integral term + ├── p_c current Classic drop probability + ├── p_l coupled L4S mark probability (= COUPLING_K * p_c) + └── port_queue_map (PCIe port, hw queue_id) → this domain + + rrh_update_rings[d] per-RRH lockless ring (RING_F_MP_HTS_ENQ | RING_F_SC_DEQ) + fiwi_update_pool shared rte_mempool; safe to get() from non-EAL threads + +Slow-path timer (~16 ms, same lcore, non-preemptive): + ewma_update → pi_update → refresh p_c, p_l + +Fast-path (every poll cadence): + rx_burst → classify ECN → enqueue l4s / classic + dequeue l4s (strict sojourn threshold) → mark CE → tx_burst + dequeue classic (weighted, drop at p_c) → tx_burst + drain rrh_update_rings → apply fiwi_apply_updates() +
+
+ Figure 16-2: Per-domain DualPI2 state layout. All per-domain state is + lcore-local and single-writer. The update ring uses + RING_F_MP_HTS_ENQ because the Netlink callback runs on a + non-EAL thread; the lcore-side dequeue uses + RING_F_SC_DEQ (single consumer). +
+
+ +

16.8 Multi-RRH lcore Topology and Control Ownership

+ +

+ The Umber concentrator runs on a workstation-class host with a Threadripper PRO + processor and multiple PCIe-connected RRHs. This section describes how DPDK + lcore assignments map onto that hardware topology to preserve cache locality, + single-writer semantics, and deterministic fast-path execution. +

+ +

+ Each lcore owns both the DualPI2 control state (Section 16.7) and the Airtime + Debt estimator (Section 17) for its assigned RRHs. This ensures that congestion + estimation, scheduling, and ECN marking operate within a single execution context. +

+ +

16.8.1 RRH Assignment

+ + + + + + + + + + + + + + + + + +
RRH RangeAssigned lcoreAirtime Domains
0–3lcore 2domains 0–3
4–7lcore 4domains 4–7
8–11lcore 6domains 8–11
12–15lcore 8domains 12–15
16–19lcore 10domains 16–19
20–23lcore 12domains 20–23
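+ The assignment above follows a fixed stride (blocks of four RRHs on
+ even-numbered lcores starting at lcore 2), so the owning lcore can be
+ computed rather than looked up; a minimal sketch:

```python
def rrh_to_lcore(rrh_id: int) -> int:
    """Map an RRH index to its owning lcore: RRHs are assigned in
    blocks of four to even lcores starting at lcore 2."""
    return (rrh_id // 4) * 2 + 2
```

+ Because the mapping is static and collision-free, per-RRH state never
+ migrates between cores, preserving the single-writer property of
+ Section 16.7.3.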
+ +

16.8.2 Control and Data Flow

+ +

+ Each RRH lcore applies its per-domain DualPI2 loop as described in Section 16.7, + with Airtime Debt (Di) serving as the PI controller + input in place of queue depth. This presents a single, airtime-grounded congestion + signal per domain to the L4S control loop. +

+ +

+ Downlink traffic is classified at ingress and directed to the appropriate + airtime domain. The owning lcore performs scheduling, ECN marking, and transmission. + Uplink traffic follows the reverse path toward the WAN interface. +

+ +

+ Because each lcore exclusively owns its RRHs and associated Airtime Debt state, + congestion estimation, scheduling, and ECN marking operate without cross-core + coordination. This preserves deterministic fast-path behavior. +

+ +
+
+Ingress → classify → assign domain → lcore owns RRH + → compute D_i → schedule → transmit + → measure → update C_i/R_i → recompute D_i +
+
+ Figure 16-3: lcore ownership of RRHs and control loop execution. +
+
+ +
+

17. Airtime-Assisted ECN: Airtime Debt as the Congestion Signal

+ +

+ Fi-Wi does not infer congestion from queue depth alone. The bottleneck + is the wireless medium, and the relevant state variable is the time + required to successfully transmit packets over that medium. The system + replaces the queue sojourn-time inputs of traditional PI2 + controllers with Airtime Debt (Di), + converting a stochastic medium into a controlled service process. +

+ +

17.1 The Bottleneck is Airtime, Not a Queue

+

+ In traditional L4S systems, ECN marking is derived from queue sojourn
+ time, which assumes a stationary service rate. That assumption fails in
+ Wi-Fi because service time varies per client based on PHY rates,
+ contention, and retries. Fi-Wi replaces backward-looking buffer metrics
+ with a forward model of wireless service time. The
+ Concentrator maintains this model continuously and makes scheduling
+ decisions on predicted service outcomes, not observed
+ queue growth. This approach provides the AQM with a signal that has a
+ more stationary distribution than raw queue depth over a variable-rate
+ medium, improving marking coherence and L4S stability.

+ +

17.2 Airtime Debt Model (Per RRH)

+

+ For each RRH (i), the Concentrator maintains a real-time + Airtime Debt (Di): +

+
+ Di = Ai + Ci + Ri +
+ + +

+ 17.3 Measuring Ground Truth (Hardware-Path-to-Status) +

+

+ The "Ground Truth" for airtime consumption is measured as the interval + from + descriptor posting into the hardware transmit path to + TX Status (hardware completion signal via + driver/vendor-specific telemetry events such as mt76 TX + status reports). This interval captures the full service duration, + including the full wait for TXOP eligibility (AIFS + backoff), + aggregation delay, and all hardware-level retransmission attempts. +

+ +

17.4 Predicted Sojourn Time (Si)

+

+ For any packet, the + Predicted Sojourn Time (Si) is a forward + estimate of delivery time: +

+
+ Si(packet) = Di + Tservice(packet) +
+

+ The Tservice calculation is decomposed into: + Tagg (aggregation hold time) + + Tphy (modulation time at current MCS) + + Tretry (statistical retry overhead). This + estimate is packet- and client-specific; it is not a constant service + quantum. +
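+ A worked instance of this decomposition (illustrative Python; the
+ microsecond figures are invented for the example, not measurements):

```python
def t_service_us(t_agg_us, t_phy_us, t_retry_us):
    """Per-packet service estimate: aggregation hold time plus PHY
    modulation time at the current MCS plus statistical retry overhead."""
    return t_agg_us + t_phy_us + t_retry_us

def predicted_sojourn_us(d_i_us, t_agg_us, t_phy_us, t_retry_us):
    """S_i(packet) = D_i + T_service(packet)."""
    return d_i_us + t_service_us(t_agg_us, t_phy_us, t_retry_us)

# An RRH carrying 1800 us of Airtime Debt; the next packet's estimate is
# 120 us aggregation hold + 85 us at the current MCS + 30 us expected retries.
s_i = predicted_sojourn_us(1800, 120, 85, 30)
```

+ Because T_service is recomputed per packet and per client, two packets
+ enqueued back-to-back for different STAs can carry very different
+ S_i values against the same debt.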

+ +

17.5 Implementation: DPDK Fast Path State

+

+ The Concentrator tracks RRH state in hugepage-backed memory. The DPDK + lcore is the sole writer of fiwi_rrh_state; telemetry + updates are applied via per-RRH lockless ring buffers to preserve + single-writer semantics and microsecond-level determinism. +

+
+
+struct __rte_cache_aligned fiwi_rrh_state {
+    uint32_t rrh_id;
+    uint64_t D_i;            /* Total airtime debt (A+C+R) */
+    
+    /* Component Estimates (microseconds) */
+    uint64_t A_i;            /* Total scheduled airtime (queued + in-flight) */
+    uint32_t C_i;            /* Estimated contention delay */
+    uint32_t R_i;            /* Estimated retry penalty */
+
+    /* Feedback & Synchronization */
+    uint64_t last_update_us;     /* Timestamp of last lcore application */
+    uint64_t last_tx_status_us;  /* TSC of last hardware completion */
+    uint32_t moving_avg_per;     /* Recent PER (Section 15.4) */
+};
+    
+
+

+ Di is recomputed in the DPDK fast path after + each update to Ai, Ci, or Ri. The loop updates Ai when packets are assigned + to an RRH and decrements it upon TX completion using telemetry feedback. +

+ +

17.6 Authoritative Congestion Signaling

+

+ Airtime Debt replaces physical queue depth as the authoritative input
+ for the Dual-Queue AQM, providing a single, consistent congestion
+ signal across all RRHs without relying on a shared physical buffer.

+ + +

17.7 Slow-Path Observability

+

+ While Di provides fast-path control, the system + monitors + Airtime Utilization (Uair = ΔTX_DURATION / + Δt) + as a slow-path observability metric. This metric is used to identify + external interference patterns and long-term capacity shifts in the + airtime domain, calibrating the confidence weights applied to the + Ci and Ri estimators. +
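+ As a sketch (illustrative Python; names are assumptions), Uair
+ is a plain ratio of counter deltas over the observation window:

```python
def airtime_utilization(tx_dur_prev_us, tx_dur_now_us, t_prev_us, t_now_us):
    """U_air = delta TX_DURATION / delta t over the observation window."""
    return (tx_dur_now_us - tx_dur_prev_us) / (t_now_us - t_prev_us)

# Over a one-second window the domain accumulated 420 ms of TX airtime.
u_air = airtime_utilization(0, 420_000, 0, 1_000_000)
```

+ A rising Uair while Di stays flat points at airtime
+ consumed by something the debt model does not schedule, i.e. external
+ interference.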

+ +

17.8 Telemetry Feedback: Netlink Calibration

+

+ The following logic processes TX_STATUS events from the + mt76 driver. Completion data is retrieved from a + pre-allocated mempool and posted to a per-RRH lockless ring to reconcile + state without lcore contention. +

+
+
+/* Telemetry Path (Netlink Callback) */
+static int fiwi_handle_mt76_telemetry(struct nl_msg *msg, void *arg) {
+    struct nlattr *attrs[MT76_ATTR_MAX + 1];
+    nla_parse(attrs, MT76_ATTR_MAX, genlmsg_attrdata(nlmsg_data(nlmsg_hdr(msg)), 0),
+              genlmsg_attrlen(nlmsg_data(nlmsg_hdr(msg)), 0), NULL);
+
+    if (!attrs[MT76_ATTR_TX_DURATION] || !attrs[MT76_ATTR_RRH_ID])
+        return NL_SKIP;
+
+    uint32_t rrh_id = nla_get_u32(attrs[MT76_ATTR_RRH_ID]);
+    if (rrh_id >= FIWI_MAX_RRHS) return NL_SKIP;
+
+    struct fiwi_update *update;
+    if (rte_mempool_get(fiwi_update_pool, (void**)&update) < 0) return NL_SKIP;
+
+    update->type = AIRTIME_RECONCILE;
+    update->rrh_id = rrh_id;
+    update->actual_us = nla_get_u64(attrs[MT76_ATTR_TX_DURATION]);
+    update->retry_us = nla_get_u32(attrs[MT76_ATTR_RETRY_DURATION]);
+    update->expected_us = estimate_service_time(msg); 
+
+    rte_ring_enqueue(rrh_update_rings[rrh_id], update);
+    return NL_OK;
+}
+    
+
+ +

17.8.1 Telemetry Application (DPDK lcore)

+

+ The DPDK lcore closes the control loop by draining the update ring. It + decrements the backlog and calibrates penalties to ensure the + Airtime Debt remains an accurate representation of + physical medium pressure. +

+ +
+
+/* DPDK lcore: apply telemetry updates */
+static inline void
+fiwi_apply_updates(struct fiwi_rrh_state *rrh, struct rte_ring *ring)
+{
+    struct fiwi_update *upd;
+    while (rte_ring_dequeue(ring, (void**)&upd) == 0) {
+        /* 1. Discharge processed backlog */
+        rrh->A_i = (rrh->A_i > upd->actual_us) ? (rrh->A_i - upd->actual_us) : 0;
+
+        /* 2. Update contention estimate (drift from expected modulation time) */
+        uint32_t drift = (upd->actual_us > (upd->expected_us + upd->retry_us)) ? 
+                         (upd->actual_us - upd->expected_us - upd->retry_us) : 0;
+        rrh->C_i = (rrh->C_i * 7 + drift) >> 3;
+
+        /* 3. Update retry penalty */
+        rrh->R_i = (rrh->R_i * 7 + upd->retry_us) >> 3;
+
+        /* 4. Recompute total Airtime Debt (D_i) */
+        rrh->D_i = rrh->A_i + rrh->C_i + rrh->R_i;
+
+        rrh->last_tx_status_us = rte_get_tsc_cycles();
+        rte_mempool_put(fiwi_update_pool, upd);
+    }
+}
+    
+
+
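+ The `(x * 7 + sample) >> 3` updates in steps 2 and 3 implement an
+ integer EWMA with α = 1/8. A small model (illustrative Python) shows the
+ filter converging toward a steady input, settling a few microseconds low
+ because the right shift truncates:

```python
def ewma8(avg: int, sample: int) -> int:
    """Integer EWMA with alpha = 1/8, as used for C_i and R_i."""
    return (avg * 7 + sample) >> 3

avg = 0
for _ in range(100):
    avg = ewma8(avg, 800)  # feed a constant 800 us retry penalty
# avg settles at 793: the truncating shift stops it just under the input
```

+ The truncation bias is bounded by the EWMA divisor, so for
+ microsecond-scale penalties it is negligible against the quantities
+ being estimated.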

17.9 Visualization: The Airtime Debt Control Loop

+ +
+
+ Figure 17-1: Airtime Debt Control Loop showing Forward Service Model and Ground Truth Calibration + + + +

Figure 17-1: The Fi-Wi recursive control loop for stabilizing stochastic wireless service.

+
+ +

Diagram Overview: Closing the Feedback Loop

+

+ Figure 17-1 synthesizes the technical components of the Airtime Debt model into a continuous functional loop. The architecture separates the Speculative Forward Path (Fast Path) from the Calibrated Feedback Path (Telemetry Path). +

+ +
+ 1. Forward Service Model (Prediction): + Every ingress packet triggers a per-STA calculation of Tservice. This is not a global constant; it is a client-specific sum of aggregation hold time (Tagg), PHY modulation time (Tphy), and predicted retry overhead (Tretry) based on that STA's specific RF context. +
+ +
+ 2. Debt Update & Marking Decision: + The predicted Tservice is added to the RRH's Ai (Backlog). If the resulting Predicted Sojourn Time (Si) exceeds Tlow, an ECN CE mark is applied immediately in the DPDK fast path. This provides the "Virtual Backpressure" that stabilizes L4S senders. +
+ +
+ 3. Ground Truth Calibration (Correction): + As the packet is dispatched via DMA, the hardware records the precise interval from descriptor posting into the hardware transmit path to TX Status completion. The Telemetry Path calculates the Drift—the delta between the forward prediction and physical reality. +
+ +
+ 4. Estimator Refinement: + This drift is fed back into the EWMA filters for Ci (Contention) and Ri (Retries). This ensures that subsequent predictions for the same STA or RRH domain are corrected for changing medium pressure, effectively regularizing the stochastic nature of the 802.11 medium. +
+
+ + +

18. Summary

+ +

+ The core idea of Umber’s Fi-Wi architecture is to make a building full of + Wi-Fi radios behave like a + large number of predictable, low-latency, cellularized + bottlenecks + (often cell-per-room) that integrate cleanly with L4S, and to avoid Wi-Fi + collapse in the regime that matters most for users: + tail latency. +

+ +

We do that by:

+ + + +

Compared to a building filled with independent APs, Fi-Wi provides:

+ + + +
+ +

+ Appendix A: 802.11 Backoff Timing & Collapse Dynamics +

+ +

+ This appendix explains the precise behavior of the 802.11 CSMA/CA backoff + algorithm, why the freeze/resume mechanics create strong nonlinearities + under load, and how this drives the collapse behavior discussed in + Sections 2 and 6. We also include reference diagrams, accurate pseudocode, + and probability scaling that shows why birthday-paradox collisions appear + long before PHY saturation. +

+ +

A.1 Overview

+ +

The 802.11 MAC is built around two core mechanisms:

+ + + These mechanisms interact in a way that works beautifully for light to + moderate station counts, but begins to break down sharply once multiple + stations become backlogged. Collapse is not a "bug"; it is the + mathematically expected outcome under high concurrency. +

A.2 Backoff Decrements Only During Idle SlotTime

+ +

When a station has a frame to send, it chooses a random integer:

+ +
B ← Uniform[0, CW]
+ where CW is the contention window. The counter + decrements only when: + + +

+ If any of these conditions break during a SlotTime boundary, backoff does + not decrement. +

+ +

Diagram A-A — Backoff Countdown with Idle Slots and Freezes

+ +
+Time →  ───────────────────────────────────────────────────────────────────────→
+
+Channel:    Busy TXOP      Idle slot     Idle slot     Busy TXOP      Idle ...
+           ────────────┐  ┌─────────┐   ┌─────────┐  ┌───────────┐
+                       │  │ slot OK │   │ slot OK │  │collision  │
+                       └──┘         └───┘         └──────────────┘
+
+Backoff B:   [frozen]        B:=B-1       B:=B-2        [frozen]       B:=B-3
+
+

+ This "idle-slot-only" decrement rule is the source of nonlinear timing + behavior. +

+ +

A.3 Freeze Conditions: Physical Busy + NAV Busy

+ +

+ The backoff counter freezes immediately under either + condition: +

+ + + +

+ NAV counts down in microseconds, not slot units, so a NAV may span dozens + or hundreds of SlotTimes, creating long frozen periods. +

+ +

Diagram A-B — NAV Freezes Backoff for Entire Duration

+ +
+Frame overheard with Duration=480µs
+     NAV := 480 µs  ─────────────────────────────────────────────▶ 0 µs
+
+Backoff:
+   Frozen until NAV==0
+   Then: AIFS idle interval → first idle SlotTime → resume B countdown
+
+

A.4 Full Backoff State Machine

+ +

+ The following pseudocode describes the real 802.11 backoff and retry + machine: +

+ +
+# Variables
+B   = random integer in [0, CW]
+CW  = CWmin initially, doubled on failures
+NAV = virtual carrier sense (µs timer)
+Slot = 9 microseconds (typical)
+AIFS = access category-specific inter-frame space
+
+while True:
+
+    wait_until( medium_idle() and NAV == 0 )
+    wait(AIFS)  # must see idle for entire AIFS
+
+    # Backoff countdown
+    while B > 0:
+
+        if medium_idle() and NAV == 0:
+            wait(Slot)
+            if medium_idle() and NAV == 0:
+                B -= 1      # decrement only if entire slot was idle
+        else:
+            # Freeze B until another idle AIFS appears
+            wait_until( medium_idle() and NAV == 0 )
+            wait(AIFS)
+
+    # Backoff fully expired, attempt TX
+    transmit()
+
+    if ack_received():
+        CW = CWmin
+        B = random(0, CW)
+    else:
+        CW = min(2 * CW, CWmax)
+        B = random(0, CW)
+
+

+ The critical detail: + multiple stations freeze and resume their counters in lock-step + after every long TXOP or NAV, making collisions statistically inevitable + as station count grows. +

+ +

+ A.5 Collision Probability and the Birthday Paradox +

+ +

+ Each station independently picks a backoff slot in [0, CW]. + The probability that no two stations choose the same slot is: +

+ +
+P(no collision) = (CW+1)! / [(CW+1 - n)! · (CW+1)^n]
+
+

where n = number of active contenders. Therefore:

+ + + +

Diagram A-C — Collision Probability vs. Number of Stations

+ +
+Stations (n) →   4     6      8      10     12     16
+--------------------------------------------------------
+P(collision)   ~33%   66%    88%    97%    99.7%  >99.9%
+
+(CWmin = 15, i.e. 16 backoff slots; values from the formula in A.5)
+
+

+ This is the MAC-level reason collapse begins long before PHY capacity is + reached. +

+ +

A.6 Why Collapse Appears as 2–3 ms TXOP Tails

+ +

Once collisions become frequent:

+ + + +

Diagram A-D — TXOP Length as Collapse Indicator

+ +
+Healthy:    T50 ≈ 200–500 µs,   T95 < 0.8 ms,    T99 < 1.2 ms
+Degraded:   T95 = 1–2 ms,       T99 = 2–3 ms
+Collapsed:  T95 > 2 ms AND      T99 ≥ 3 ms (dominant channel monopolization)
+
+

+ A single 3 ms TXOP already violates the bottleneck-delay budget required + by L4S (≈250–300 µs). With multiple stations taking such TXOPs, service + gaps can reach 10–50 ms for unlucky flows. +

+ +

A.7 Multi-Station Synchronization Example

+ +

+ The following diagram illustrates how multiple stations become + phase-aligned: +

+ +
+Time →  ────────────────────────────────────────────────────────────────→
+
+TXOP1 by STA-A:   ────────────────
+NAV for others:   ──────────────── (all B frozen)
+
+After NAV expires:
+All stations wait AIFS → begin countdown
+Slot 1:  B_A=2, B_B=4, B_C=2
+Slot 2:  B_A=1, B_B=3, B_C=1
+Slot 3:  B_A=0, B_B=2, B_C=0   → STA-A and STA-C transmit simultaneously → collision
+
+

+ This synchronization is why the birthday paradox applies so strongly in + Wi-Fi. +

+ +

A.8 Why Fi-Wi Breaks the Cycle

+ +

Fi-Wi removes the “every station fends for itself” randomness by:

+ + + Thus Fi-Wi converts Wi-Fi from a chaotic CSMA/CA system into a + scheduled, low-latency cellular MAC. + +
+ +

+ Appendix B: Channel State Information (CSI) and Learning-Enhanced Fi-Wi +

+ +

+ This appendix describes how Fi-Wi can use Channel State Information (CSI) + from each RRH, together with learning models (e.g. LSTM or TCN), to + improve grouping, scheduling, redundancy, and control beyond what is + possible with queue-based feedback alone. +

+ +

B.1 What CSI provides in a Fi-Wi context

+ +
+ Concept: What is CSI?
+ Imagine shouting in a complex room. You hear echoes bouncing off walls, + furniture, and people. If you analyze those echoes, you can map the + environment.
+
+ In Wi-Fi, Channel State Information (CSI) is that map. It + describes exactly how the radio wave traveled from the transmitter to the + receiver—including all the bounces (multipath), fading, and phase shifts + caused by the physical environment. + + Traditional APs throw this data away after decoding the packet. Fi-Wi + sends it to the Concentrator, allowing the system to "see" the RF + environment and mathematically calculate how to steer beams or combine + signals.
+
+ Wi-Fi Sensing: Because physical objects reflect radio + waves, any movement in the room changes the CSI pattern. By monitoring + these changes over time, Fi-Wi can detect presence—such as a person + walking or a pet breathing—turning the network into a ubiquitous sensor + without cameras. +
+ +

+ Modern 802.11 chipsets can export CSI per subcarrier or + per resource unit: complex-valued estimates of the channel between an RRH + and a station (STA). In a Fi-Wi deployment, each RRH periodically reports: +

+ + + +

+ Thanks to centralized time synchronization and packet memory, the + concentrator can align CSI reports with: +

+ + + +

This gives Fi-Wi a rich per-domain, per-STA time series:

+ + + +

B.2 What we want to predict

+ +

+ Using this data, Fi-Wi can learn models to help answer questions such as: +

+ + + +

These predictions can feed directly into:

+ + + +

B.3 Example model: LSTM / TCN

+ +

+ One reasonable approach is to use a sequence model such as an LSTM or + Temporal Convolutional Network (TCN) per airtime domain: +

+ +
Input features (per timestep):
+  - queue depth q_k
+  - marking probability p_k
+  - throughput, PER, retries
+  - per-RRH CSI summary (e.g. dominant eigenvalues/eigenvectors)
+  - beacon power settings, channel, bandwidth
+
+Outputs:
+  - predicted effective capacity C_eff,k+1
+  - predicted collapse risk score
+  - recommended group reconfiguration / beacon adjustments (optional)
+
+

A higher-level policy layer then uses these predictions to:

+ + + +

+ The key point is that Fi-Wi has access to the + joint state across all RRHs—queues, CSI, MAC outcomes, + and beacon configuration—so learning can be done on a true building-scale + view rather than a per-AP snippet. +
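As a deliberately simplified illustration of the input side, the per-domain time series can be assembled into fixed-length windows for any sequence model. The class and field names below are illustrative, and the LSTM/TCN itself is left as a pluggable component:

```python
from collections import deque

class DomainFeatureWindow:
    """Rolling window of per-airtime-domain observations, shaped for a
    sequence model (LSTM/TCN). Field names are illustrative."""

    def __init__(self, window=32):
        self.window = deque(maxlen=window)

    def push(self, q_depth, p_mark, throughput, per, retries, csi_eig):
        # One timestep: queue depth, marking probability, MAC outcomes,
        # and a per-RRH CSI summary (list of dominant eigenvalues).
        self.window.append([q_depth, p_mark, throughput, per, retries, *csi_eig])

    def as_matrix(self):
        # (timesteps x features) input for the sequence model
        return [list(row) for row in self.window]
```

A trained model would consume `as_matrix()` per domain and emit the predicted effective capacity and collapse-risk score described above.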

+ +

+ B.4 The Non-Linear Control Policy (Feature Vectors) +

+ +

+ While the PI² controller (Section 5.2) provides a robust baseline using + linear control theory, the wireless medium is inherently non-linear. A + small drop in SNR can cause a discrete, non-linear step-down in MCS, + cutting capacity by half in microseconds. A linear controller often reacts + too slowly to these step-changes. +

+ +

+ Because the Concentrator terminates both the MAC (Inner Loop) and L4S + (Outer Loop), it possesses a complete, global view of the system state. + This allows Fi-Wi to implement a + Non-Linear Marking Signal derived from a rich real-time + feature vector: +

+ +
+
+Feature Vector x(t) = [
+   MCS_t,          // Current Modulation (Capacity potential)
+   PHY_Rate_t,     // Raw drain rate
+   RTT_outer,      // End-to-end latency (Sojourn + Flight)
+   Q_depth_t,      // Current backlog
+   d_arrival/dt    // Arrival rate gradient (ARM Policer)
+]
+  
+
+ +

+ Optimization Objective: Efficiency vs. Latency
+ The system uses this vector to solve the fundamental Wi-Fi trade-off: + Aggregation Efficiency vs. Serialized Latency. +

+ + + +

+ This creates a Non-Linear Marking Signal that optimizes + Throughput per Microsecond of Latency, rather than simply + targeting a fixed queue depth. +
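One way to picture such a policy is a hand-written heuristic standing in for the learned non-linear function — all coefficients and thresholds below are illustrative, not tuned values from the system:

```python
def marking_probability(mcs, prev_mcs, q_depth, q_target, arrival_gradient):
    """Non-linear marking sketch: a PI-style base term on queue depth,
    plus step responses to MCS downgrades and arrival-rate surges."""
    # Base term: linear in queue excess, like a simplified PI2 signal
    p = max(0.0, min(1.0, (q_depth - q_target) / (4.0 * q_target)))
    if mcs < prev_mcs:
        # Discrete capacity step-down: react immediately, not gradually
        p = min(1.0, p + 0.25 * (prev_mcs - mcs))
    if arrival_gradient > 0:
        # ARM policer input: pre-empt queue growth before it materializes
        p = min(1.0, p + 0.1 * arrival_gradient)
    return p
```

The point of the sketch is the shape, not the numbers: the MCS term injects a step change into the marking signal the instant capacity halves, which a purely queue-driven linear controller cannot do.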

+ +
+ +

Appendix C: Latency Hiding via Scatter-Gather DMA

+ +

+ Early architectural models of C-RAN often assumed a "Store-and-Forward" + approach, where full packets must be buffered at the edge to meet timing. + Fi-Wi eliminates this inefficiency by leveraging the natural physics of + the 802.11 air interface. We utilize a + Scatter-Gather DMA engine with + Preamble Hiding to enable a "Thin RRH" design with + minimal local SRAM. +

+ +

C.1 The "Preamble Shield" Physics

+ +

+ The critical timing constraint in Wi-Fi is the transition from "Decision + to Transmit" to "Energy on Air." However, the 802.11 PHY does not transmit + user data immediately. Every transmission begins with a PHY Preamble + (PLCP) and MAC Headers. +

+ +
+ Time-Domain View of a Transmission Start:
+
+ T=0 µs         T=5 µs                        T=24 µs (approx)
+ |              |                             |
+ | TX Trigger   | Preamble & Headers          | Payload Data Starts...
+ [ MAC Logic ]->[/////////////////////////][......................]
+                 ^                            ^
+                 |                            |
+      Source: Local RRH SRAM       Source: Host Concentrator DRAM
+      (Instant Access)             (Fetched via Fiber)
+ +

+ The Insight: The transmission of the Preamble and Headers + takes roughly 20–40 µs (depending on PHY generation). The + round-trip time to fetch payload data over 100m of PCIe-over-Fiber is + roughly 2–5 µs. +

+ +

+ Consequently, the fetch latency is completely "hidden" behind the + transmission of the headers. The payload data arrives at the RRH's small + FIFO well before the PHY is ready to modulate it. +
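The timing claim reduces to simple arithmetic. A sketch using the figures from the text (20–40 µs of preamble vs. 2–5 µs of fiber round trip; the function name is ours):

```python
def fetch_is_hidden(preamble_us, fiber_rtt_us, fabric_us=0.0):
    """True if the remote payload fetch completes before the locally
    stored preamble/headers finish transmitting on the air."""
    return fiber_rtt_us + fabric_us < preamble_us

# Worst case from the text: shortest preamble, longest fetch
print(fetch_is_hidden(preamble_us=20.0, fiber_rtt_us=5.0))  # still hidden
```

Even in the worst case the fetch has a 4x margin, which is what makes the "Thin RRH" with a few kilobytes of FIFO viable.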

+ +

C.2 Scatter-Gather Architecture

+ +

+ Instead of a large packet buffer, the Fi-Wi RRH implements a + Scatter-Gather DMA engine that composes frames on the fly from two + distinct memory regions: +

+ +
    +
  1. + Template RAM (Local RRH SRAM): Stores 802.11 MAC + headers, PLCP headers, and delimiter signatures. This memory is small + (< 16 KB), fast, and populated by the Concentrator during the + descriptor posting phase. +
  2. + +
  3. + Payload Buffer (Remote Concentrator DRAM): Stores the + actual 802.3 Ethernet payloads. These remain in the host server's memory + until the exact moment of transmission. +
  4. +
+ +

C.3 The Transmit Sequence

+ +
    +
  1. + Descriptor Posting: The Concentrator posts a descriptor + to the RRH. This descriptor points to the header in Local RAM and the + payload in Remote DRAM. +
  2. + +
  3. + Contention: The RRH MAC performs EDCA backoff. No data + is moved during this phase. +
  4. + +
  5. + TX Trigger: When backoff reaches zero, the MAC + immediately begins transmitting the Preamble from Local RAM. +
  6. + +
  7. + Just-in-Time Fetch: Simultaneously with the Preamble + start, the DMA engine issues a read request to the Concentrator for the + payload data. +
  8. + +
  9. + Cut-Through: Data returns from the fiber, flows into a + small speed-matching FIFO (e.g., 4 KB), and flows directly into the PHY + serialization path immediately following the header. +
  10. +
+ +

C.4 Solving the Retry Timing (SIFS)

+ +

+ A common objection to C-RAN is the SIFS deadline (16 µs) required for + retries. If a transmission fails, the station must retransmit immediately. +

+ +

+ With Scatter-Gather, the RRH does not need to buffer the
+ packet for retries. If the expected ACK does not arrive (an implicit
+ NACK), the MAC simply resets the Scatter-Gather engine. It re-transmits
+ the Preamble (from Local RAM) while re-issuing the DMA fetch (from Remote
+ RAM). Because the fiber latency (~5 µs) is significantly shorter than the
+ SIFS + Preamble duration, the data again arrives in time.

+ +

C.5 Architectural Benefits

+ + + +
+ +

+ Appendix D: 802.11ax/be Features and Fi-Wi Integration +

+ +

+ Modern Wi-Fi standards — particularly 802.11ax (Wi-Fi 6/6E) and 802.11be + (Wi-Fi 7) — introduce features that appear to address some of the same + problems as Fi-Wi: uplink scheduling, spatial reuse, and multi-AP + coordination. This appendix clarifies how these features relate to Fi-Wi's + architecture, where they're complementary, and why they don't eliminate + the need for Fi-Wi's centralized data-plane approach. +

+ +

+ Key takeaway: 802.11ax/be features like trigger frames + and multi-AP coordination are valuable enhancements that Fi-Wi can + leverage when client support is available, but they operate at a different + architectural level (per-AP MAC features vs. building-scale data-plane + unification) and cannot replace Fi-Wi's core innovations: centralized + queues, shared state, L4S marking coordination, and dynamic RF grouping + across the entire building. +

+ +

D.1 Trigger Frames and Uplink Scheduling

+ +

+ 802.11ax introduced trigger frames (TF) to enable + centralized uplink scheduling. Instead of clients contending for the + channel using stochastic EDCA backoff, the AP sends a trigger frame that + grants specific clients permission to transmit on specific OFDMA resource + units (RUs) or spatial streams at a specific time. +

+ +

What trigger frames provide:

+ + + +

How trigger frames align with Fi-Wi:

+ +

+ Trigger frames match Fi-Wi's philosophy of centralized scheduling rather + than distributed contention. In a Fi-Wi deployment where RRHs support + 802.11ax and clients support uplink OFDMA/MU-MIMO, the concentrator can: +

+ + + +

Reality check — client support in 2025:

+ +

+ While 802.11ax products have been certifying since 2019, uplink OFDMA
+ support remains inconsistent.
+ Crucially, trigger frames only control 802.11ax/be clients; legacy
+ pre-Wi-Fi 6 devices (802.11ac handsets, older IoT) are invisible to this
+ schedule.
+ These legacy clients cannot parse the trigger, so they continue to contend
+ via random EDCA, acting as unmanaged interference sources. In contrast,
+ Fi-Wi's reception diversity (Section 8.1) enhances uplink reliability for
+ all clients, regardless of generation, by combining signals from
+ multiple RRHs.

+ +

+ D.2 Why Trigger Frames Don't Eliminate the Need for Fi-Wi +

+ +

+ A natural question: "If 802.11ax APs can use trigger frames for uplink + scheduling, why do we need Fi-Wi's centralized architecture?" +

+ +

+ Answer: Trigger frames address only a + small subset of the problems Fi-Wi solves, and even for + uplink scheduling, they provide per-AP control, not building-scale + coordination. +

+ +

What trigger frames do NOT provide:

+ +
    +
  1. + Centralized queues across APs: Even with trigger + frames, each AP maintains its own independent downlink and uplink + queues. There's no shared queue state, no unified bottleneck, and no + coordinated ECN marking across APs. +
  2. + +
  3. + Shared state: Trigger-capable APs still operate + autonomously. They don't share CSI, retry statistics, airtime usage, or + queue metrics. Each AP makes trigger scheduling decisions based only on + its local view. +
  4. + +
  5. + Coordinated L4S marking: There's no mechanism in + 802.11ax for multiple APs to coordinate ECN marking or present a single + logical bottleneck to L4S. Each AP marks (or doesn't mark) + independently. +
  6. + +
  7. + Dynamic RF grouping: 802.11ax APs don't dynamically + reconfigure which radios share airtime resources based on interference, + CSI structure, or collapse risk. They're fixed islands. +
  8. + +
  9. + Tail latency control: Trigger frames help with uplink + efficiency, but they don't address the fundamental problem of hidden + queues, uncontrolled aggregation, and tail latency blowup under load + across a multi-AP building. +
  10. +
+ +

D.3 OFDMA Resource Units and Airtime Domains

+ +

+ 802.11ax OFDMA subdivides a channel into resource units (RUs). In Fi-Wi, + an airtime domain is a logical entity representing a + shared RF resource. OFDMA RUs provide + finer-grained subdivision of that airtime resource. +

+ +

Conceptually:

+ + + +

+ This does not change the fact that all RRHs in that + airtime domain share a single group queue and marking point. It simply + allows the service process to be more efficient. +

+ +

D.4 BSS Coloring and Spatial Reuse

+ +

+ 802.11ax BSS coloring allows STAs to distinguish between intra-BSS frames + (same color) and inter-BSS frames (different color), enabling more + aggressive spatial reuse. +

+ +

+ Relationship to Fi-Wi RF grouping: Fi-Wi's dynamic RF + grouping (Section 6) serves a similar but more sophisticated purpose. + Fi-Wi uses richer information (CSI, retry statistics, airtime) to decide + grouping, not just RSSI thresholds. In a Fi-Wi deployment, the + concentrator can assign BSS colors to RRHs strategically: RRHs in the same + airtime domain get the same color, while isolated domains get different + colors. +

+ +

D.5 802.11be (Wi-Fi 7) Multi-AP Coordination

+ +

+ 802.11be (Wi-Fi 7) introduces + multi-AP coordination features that appear to move in + Fi-Wi's direction: +

+ + + +

+ How these relate to Fi-Wi: These features acknowledge the + problem of autonomous APs but approach it incrementally. 802.11be uses + distributed AP-to-AP messaging, which limits scale and speed. Fi-Wi + centralizes the data plane, enabling deeper coordination than distributed + messaging can achieve. +

+ +

D.6 Deployment Strategy: Mixed Client Populations

+ +

+ A key advantage of Fi-Wi's architecture is that it + degrades gracefully with mixed client populations and + doesn't require forklift client upgrades. +

+ +

Client capability tiers in a 2025 deployment:

+ +
    +
  1. + Legacy 802.11ac and earlier: No trigger frame support, + no OFDMA, no BSS coloring. +
      +
    • + Fi-Wi provides: centralized downlink queuing, L4S marking, reception + diversity on uplink, beacon shaping to reduce contention. +
    • + +
    • + Result: Significantly better latency and stability than traditional + multi-AP, even without 802.11ax features. +
    • +
    +
  2. + +
  3. + 802.11ax with partial features: May support downlink + OFDMA, BSS coloring, some power save enhancements, but not uplink OFDMA + or uplink MU-MIMO. +
      +
    • + Fi-Wi provides: All of the above, plus downlink MU-OFDMA where + beneficial, coordinated BSS coloring across RRH groups. +
    • + +
    • + Result: Better spatial reuse and efficiency, still robust to clients + that don't support full 802.11ax. +
    • +
    +
  4. + +
  5. + 802.11ax with full features: Supports uplink OFDMA and + uplink MU-MIMO via trigger frames. +
      +
    • + Fi-Wi provides: All of the above, plus trigger-based uplink + scheduling, uplink MU-OFDMA for small packets, coordinated + uplink/downlink airtime management. +
    • + +
    • + Result: Bidirectional sub-millisecond latency control, maximum + airtime efficiency. +
    • +
    +
  6. + +
  7. + 802.11be (Wi-Fi 7): Adds MLO, 320 MHz channels, + 4096-QAM, possibly multi-AP coordination support. +
      +
    • + Fi-Wi provides: Can leverage MLO via concentrator coordination + (Section 13.3), wider channels for capacity, and potentially + integrate with 802.11be multi-AP features while maintaining superior + shared-state coordination. +
    • + +
    • + Result: Cutting-edge performance while maintaining backward + compatibility. +
    • +
    +
  8. +
+ +

Deployment strategy:

+ + + +

+ D.7 Summary: 802.11ax/be as Enhancements, Not Replacements +

+ +

+ 802.11ax and 802.11be introduce valuable features — trigger frames, OFDMA, + BSS coloring, multi-AP coordination — that align with Fi-Wi's centralized + control philosophy and can enhance Fi-Wi deployments when clients support + them. However: +

+ +
    +
  1. + These features do not eliminate the need for Fi-Wi's + architecture. + They provide per-AP enhancements and limited inter-AP coordination, but + they cannot create the unified data plane, shared state, and + building-scale control that Fi-Wi provides. +
  2. + +
  3. + Fi-Wi is designed to work with or without them. Core + benefits (centralized queues, L4S marking, tail latency control) are + independent of client 802.11ax/be support. +
  4. + +
  5. + Fi-Wi leverages them when available. As client + capabilities improve, Fi-Wi automatically benefits from trigger-based + uplink scheduling, OFDMA efficiency, and other enhancements without + requiring architectural changes. +
  6. +
+ +

+ In short: + 802.11ax/be features make Fi-Wi better, but Fi-Wi solves problems these + standards cannot address within the constraints of the distributed-AP + model. + Fi-Wi is not "better APs" — it's a different architecture that happens to + integrate well with modern Wi-Fi standards as they evolve. +

+ +
+ +

Appendix E: ASIC Evolution to Complexity

+ +

E.1 Why ASICs accumulate legacy complexity

+ +

+ Unlike software, ASICs cannot easily “refactor away” unused features. + Removing blocks typically requires re-verifying entire subsystems, while + adding blocks often requires verifying only the new logic. This asymmetry + encourages accumulation: +

+ + + +

+ Over many product generations, this leads to RTL codebases that only grow. + Legacy modulation modes, preambles, power-save FSMs, calibration paths, + and debug hooks persist long after their practical value has disappeared. +

+ +

E.2 Real costs of legacy bloat

+ +

This accumulated complexity has tangible costs:

+ + + +

E.3 How Fi-Wi changes the design equation

+ +

Fi-Wi’s architecture separates the system into:

+ + + +

+ This separation dictates where complexity must live. RRHs implement only + what must be fast and deterministic: RF front end, PHY processing, minimal + MAC TX/RX, DMA, PTP synchronization, and PCIe-over-fiber transport. All + high-level behavior (queueing, L4S policy, aggregation strategy) lives in + the concentrator. +

+ +

E.4 Economic and engineering leverage

+ +

+ For a modern Wi-Fi chip at an advanced node, even a modest reduction in + unnecessary logic can translate into significant savings: smaller die, + lower power, simpler verification, and faster time to market. +

+ +

E.5 Design principle for Fi-Wi RRH silicon

+ +

The guiding principle for Fi-Wi RRH design is:

+ +
+ Complexity belongs in the concentrator; only latency-critical functions + belong in RRH silicon. +
+ +

+ Concretely, this means: no autonomous AP queueing/scheduling logic, no + legacy PHY/MAC support beyond what Fi-Wi needs, and no embedded firmware + CPU managing per-station behavior at the edge. +

+ +
+ +

+ Appendix F: A Day in the Life of a Packet (The "Preamble Shield" in + Action) +

+ +

+ To truly understand Fi-Wi, we must follow a single packet through the + system at the microsecond scale. This narrative illustrates how the + Workstation Concentrator (Section 13) and the + Scatter-Gather RRH (Appendix C) collaborate to trick the + physics of latency. +

+ +

F.1 The Scenario

+ +
+ The Setting: Room 304 (served by RRH-A and RRH-B).
+ The Flow: a 4K video frame (downlink) destined for "Alice's Laptop."
+ The Constraint: L4S requires <1 ms tail latency.
+ The Challenge: the packet is currently 200 meters away in the Concentrator's DRAM.
+ +

F.2 The Downlink Race (The "Preamble Shield")

+ +

+ T = 0 µs (Arrival): The video packet arrives at the + Concentrator's NIC. The CPU timestamps it immediately. +

+ +

+ T = 2 µs (The Decision): The Concentrator's software + scheduler inspects the packet. +

+ + + +

+ T = 10 µs (The Setup): The scheduler posts a + DMA Descriptor to RRH-A via PCIe.
+ Note: The payload data (1500 bytes) stays in the Concentrator. Only a + 16-byte pointer moves to the edge. +

+ +

+ T = 50 µs (The Trigger): RRH-A's LBT logic sees the + airtime is clear. It begins the transmission sequence. + This is where the magic happens: +

+ +
+ The Race Against the PHY:
+ Action 1: RRH-A starts transmitting the 802.11 Preamble + (PLCP) from its local SRAM. This takes 20 µs of + airtime.
+ Action 2: Simultaneously, RRH-A issues a PCIe + Read Request to fetch the payload from the Concentrator.
+
+ The payload must travel 200m up the fiber and back before the + Preamble finishes transmitting. +
+ +

+ T = 52 µs (The Fetch): The Read Request hits the + Concentrator's PCIe controller. Because of the 92-lane non-blocking fabric + (Section 13), there is zero switching delay. +

+ +

+ T = 55 µs (The Return): The payload data flies back down + the fiber. +

+ +

+ T = 58 µs (The Handover): The payload data arrives at
+ RRH-A's FIFO while the PHY is still serializing the Preamble, with
+ roughly 12 µs of headers left to transmit.
+

+

+ T = 70 µs (Seamless Serialization): The PHY seamlessly
+ switches from transmitting the Preamble to transmitting the payload. To
+ the air, it looks like one continuous stream. The 200-meter fiber latency
+ effectively vanished because it was hidden behind the mandatory PHY
+ training sequence.

+ +

F.3 The Uplink Journey (Diversity & Sensing)

+ +

T = 200 µs: Alice sends a TCP ACK.

+ +

+ T = 204 µs (The Multi-Stat): Both RRH-A and RRH-B hear + the ACK. +

+ + + +

+ T = 210 µs (The Race Up): Both RRHs push the packet + CSI + metadata to the Concentrator. +

+ +

+ T = 215 µs (The Deduplication): The Concentrator sees two
+ copies of Sequence #104. It discards the weaker copy from RRH-B but keeps
+ its CSI data to update the "Sensing Model" (detecting that someone is
+ standing near RRH-B, blocking the line of sight).

+ +

F.4 Contrast with Legacy Wi-Fi

+ +

If this were a traditional AP:

+ + + +

F.5 Edge Cases and Advanced Scenarios

+ +

+ RRH Failure: If RRH-A fails during the prefetch (e.g., + power loss), the concentrator detects the link loss immediately via PCIe + link state. Because the packet payload never left Concentrator DRAM, the + scheduler simply re-posts the descriptor to RRH-B. No packet is lost, and + TCP does not see a drop. +

+ +

+ Congestion: The scatter-gather pipeline depth allows the + Concentrator to queue up the next descriptor while the current + one is transmitting. This allows back-to-back TXOPs (SIFS spacing) without + idle gaps on the air, even with the fiber latency. +

+ +

+ Coordinated Transmission: The Concentrator can schedule + RRH-A and RRH-B to transmit concurrently to spatially separated clients. + It analyzes the CSI matrix to determine if spatial isolation is sufficient + (>25 dB cross-coupling attenuation). If yes, both RRHs transmit + simultaneously using standard 802.11 frames. If interference is detected, + the Concentrator schedules sequential TXOPs. This dynamic decision happens + per-packet based on real-time CSI. +

+ +

F.6 Summary: The Packet's Perspective

+ +

+ From the packet's view, Fi-Wi provides uplink diversity, per-flow fair + queuing, accurate ECN marking, and speculative DMA that hides PCIe + latency. The packet experiences the network as a transparent, zero-wait + pipe. +

+ +

F.7 The Critical Insight: Timing vs. Intelligence

+ +

+ Fi-Wi separates timing (RRH hardware) from + intelligence (Concentrator software), bridged by the + speculative DMA prefetch pipeline. This allows the hardware to meet strict + microsecond deadlines while the software retains the flexibility to run + complex scheduling, L4S, and spatial multiplexing logic. +

+ +
+ +

+ Appendix G: The Strategic Case for Fiber Infrastructure +

+ +

+ The upfront cost of installing fiber is often the primary friction point + for C-RAN adoption ("The Fiber Tax"). However, this framing ignores the + physics of modern signaling and the macroeconomics of construction. + Fi-Wi's reliance on fiber is not a tax; it is a strategic asset + conversion. +

+ +

G.1 The Physics of 100G (The Copper Wall)

+ +

+ We are hitting a hard physical limit with copper cabling. At modern data + center speeds (100Gb/s), signal loss in copper is so high it is + characterized in dB per inch. +

+ + + +

G.2 Labor Rate Hedging (Inflation Proofing)

+ +

+ In low-voltage construction, the cost of cabling is dominated by + labor (often 70-80%), not material. +

+ + + +

G.3 Asset vs. Consumable

+ +

+ Unlike HDMI or Copper Ethernet—which are + purpose-built cables engineered for a single generation—fiber is a raw + transport medium. It is a "pipe for light" that supports Ethernet, DWDM, + and PCIe-over-Fiber simultaneously. +

+ +

+ While cable standards have cycled (Cat5e → Cat6 → Cat6A), they remain + tethered to the legacy RJ45 connector. This physical + interface is rapidly becoming obsolete. Fi-Wi recognizes that + the connection is what matters, not the physical port. In + this architecture, the + 802.11 wireless interface becomes the new connector. By + installing fiber once as a permanent asset and treating Wi-Fi as the + universal 'plug' inside the room, the building infrastructure is 'one and + done'. This finally breaks the cycle of physical obsolescence. +

+ +

+ Appendix H: Centralized Observability and the ML Advantage +

+ +

+ Fi-Wi's centralized architecture provides observability that is difficult + or impractical to achieve in distributed AP systems. This appendix + presents the Observability Matrix—a systematic comparison + of what telemetry is directly observable, partially observable, or hidden + across different measurement approaches. This complete visibility is the + prerequisite for effective machine learning (Section 15) and deterministic + L4S control. +

+ +

The Observability Gap

+ +

+ Traditional Wi-Fi deployments rely on tools that provide only partial + visibility into system state. Operators attempt to infer problems from + symptoms (latency spikes, ECN marks, throughput degradation) without + directly observing root causes (queue growth, retry timing, MCS selection + under interference). This inference distance—the number + of steps between observable effects and hidden causes—makes control + systems less stable and limits the effectiveness of machine learning. +

+ +

+ The table below compares observability across six measurement approaches. + The legend indicates: +

+ +
+
+ Direct: Directly measurable with + microsecond-resolution timestamps +
+ +
+ Partial: Partially observable or requires + inference +
+ +
+ Not Observable: Hidden or cannot be reliably + measured +
+
+ +

Observability Matrix

+ +
+ Telemetry / Metric         ESP32-C5   RPi 5     RPi 5     tcpdump   iperf2    Fi-Wi
+                            RF sensor  Monitor   L4S node  Capture   L4S       Concentrator
+ ------------------------------------------------------------------------------------------
+ Energy detect / CCA        Direct     Partial   -         -         -         Direct
+ Channel busy time          Direct     Partial   -         -         -         Direct
+ NAV / medium reservation   -          Partial   -         -         -         Direct
+ CSI / channel matrix       Partial    -         -         -         -         Direct
+ MCS / GI / NSS             -          Partial   -         -         -         Direct
+ PER / retry counts         -          Partial   Partial   -         -         Direct
+ RSSI / SNR                 Direct     Direct    Partial   -         -         Direct
+ Queue depth                -          -         Partial   -         -         Direct
+ Sojourn time               -          -         Partial   -         -         Direct
+ ECN marks                  -          -         Direct    Direct    Direct    Direct
+ One-way delay (OWD)        -          -         Partial   Partial   Partial   Direct
+ Responsiveness             -          -         Partial   -         Direct    Direct
+ Throughput / goodput       -          -         Direct    Direct    Direct    Direct
+ Deterministic playback     -          -         -         -         -         Direct
+
+ ("-" = Not Observable)
+ +

Critical Observations

+ +

Queue Depth and Sojourn Time (highlighted rows):

+ +

+ These metrics are essential for L4S congestion control and machine + learning. Traditional tools (tcpdump, Wi-Fi packet capture) cannot + directly observe queue state because it exists inside firmware or kernel + layers. While synchronized ingress and egress packet captures could + theoretically infer queue depth through timing correlation, this approach + requires nanosecond-precise time synchronization across physically + separated capture points, perfect packet correlation despite potential + losses, and still cannot observe firmware-internal retry queues, + aggregation buffer states, or PHY scheduling decisions. External sniffers + see the explosion (the packet hitting the air), but they cannot see the + fuse burning (the packet sitting in the driver queue). Only centralized + queueing architectures expose these values with direct + microsecond-resolution timestamps. +

+ +

MCS / GI / NSS (PHY Configuration):

+ +

+ Monitor-mode packet capture can partially infer MCS from radiotap headers, + but this only shows what was transmitted—not the decision process, CSI + data, or PER history that informed the choice. The Fi-Wi Concentrator has + direct access to the complete decision state. +

+ +

Deterministic Playback (bottom row):

+ +

+ This capability enables machine learning. Deterministic playback means the + Concentrator can reproduce its own decision sequence from a log file: + packet arrivals, queue transitions, scheduling decisions, MCS selections, + and RRH transmission commands. While actual RF outcomes depend on station + behavior and channel conditions that may vary, the Concentrator can replay + its control decisions under the logged RF environment to evaluate + alternative strategies offline and verify whether different MCS/scheduling + choices would have improved performance. This is only possible when all + Concentrator-controlled components operate under a single clock with + complete state visibility. Distributed systems cannot reconstruct this + causal chain from partial packet traces because they lack visibility into + queue state, retry logic, and the decision-making process itself. +
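The replay idea can be made concrete with a toy event log. The record layout and the stand-in rate policy below are invented for illustration; a real log would carry full descriptor, queue, and CSI state:

```python
def replay(log):
    """Re-derive the decision sequence from a logged trace. Same inputs
    under the same clock must yield the identical decision sequence."""
    decisions = []
    for event in log:
        # Stand-in policy: keep the logged MCS unless the logged PER
        # exceeded 10%, in which case step down one rate.
        mcs = event["mcs"] - 1 if event["per"] > 0.10 else event["mcs"]
        decisions.append((event["t_us"], mcs))
    return decisions

log = [
    {"t_us": 0,   "mcs": 9, "per": 0.02},
    {"t_us": 120, "mcs": 9, "per": 0.15},
]
assert replay(log) == replay(log)  # deterministic on every run
```

The property being demonstrated is the architectural one: because all inputs to the decision are in the log, alternative policies can be evaluated offline against the same trace.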

+ +

Why This Enables More Effective Machine Learning

+ +

+ Section 15 describes how Fi-Wi uses machine learning to optimize MCS + transition rates. The observability matrix demonstrates significant + practical advantages that Fi-Wi's centralized architecture provides for ML + training: +

+ + + +

+ Fi-Wi's centralized architecture provides these observability advantages. + The Concentrator's event log becomes a high-quality training dataset where + every state transition is labeled with measured outcomes under consistent + instrumentation. While autonomous AP systems could attempt ML-based rate + adaptation using the partial observability available to them, Fi-Wi's + richer telemetry—particularly queue visibility, global CSI, and + deterministic replay—enables significantly more effective learning and + optimization. +

+ +
+ Coordination Shares Outcomes; Fi-Wi Centralizes Causes +

+ Coordinated AP systems can share summaries (throughput, ECN marks, + interference reports) but cannot share hidden internal state (queue + depth, firmware retry logic, aggregation decisions). This creates + inference distance—the controller sees effects but not causes. Fi-Wi + eliminates inference distance by removing autonomous decision-making + from the edge. Queues, scheduling, and PHY selection are centralized + under a single clock, producing an observable state graph where causes + are explicit, replayable, and directly controllable. This architectural + difference translates to measurably better ML training data quality. +

+
+ +
+

Appendix I: Channel Width Orchestration and Service Time Variance

+ +

+ The Fi-Wi architecture treats channel width as a dynamic control + parameter managed by the Concentrator. While 802.11be + (Wi-Fi 7) emphasizes 320 MHz peak PHY rates, Fi-Wi's orchestration + engine strategically selects 40 MHz channel widths in + high-density environments to ensure + Service Time Stationarity and the stability of the L4S + control loop. +

+ +

+ I.1 The Contention-Domain Collapse of Wideband Channels +

+ +

+ In shared-spectrum MDUs (Multi-Dwelling Units), the theoretical gain of + wider channels is often negated by + contention-domain collapse. In a CSMA/CA environment, a + transmission opportunity (TXOP) requires the entire bonded channel to be + idle. In a 6-AP overlapping scenario with 50% aggregate airtime + occupancy, the probability of finding all sub-bands simultaneously idle + drops exponentially with bandwidth. +

+ +

+ Under a simplified independent-sub-band occupancy assumption, a basic + model suggests P(160 MHz idle) ≈ (P(40 MHz idle))^4, + resulting in 4–16× fewer transmission opportunities. In practice, + partial correlation between sub-bands moderates the exponent but does + not eliminate the super-linear decline in idle probability. This leads + to: +

+ + + +
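The super-linear decline in idle probability can be sketched numerically. This is a minimal sketch under the simplified independent-sub-band assumption stated above; the 50% occupancy figure is the illustrative case from the text, not a measurement.

```python
# Illustrative sketch (independent-sub-band assumption from I.1, not a
# measurement): probability that an entire bonded channel is idle.

def idle_probability(p_idle_40mhz: float, width_mhz: int) -> float:
    """P(entire bonded channel idle) = p^k, with k = width / 40 MHz sub-bands."""
    sub_bands = width_mhz // 40
    return p_idle_40mhz ** sub_bands

# 50% aggregate occupancy -> each 40 MHz sub-band is idle half the time.
p40 = idle_probability(0.5, 40)    # 0.5
p160 = idle_probability(0.5, 160)  # 0.5^4 = 0.0625
print(f"TXOP opportunity ratio 40/160 MHz: {p40 / p160:.0f}x")  # 8x
```

At 50% sub-band occupancy this lands in the middle of the 4–16× range quoted above; lighter or heavier occupancy moves the ratio toward the ends of that range.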

I.2 Queueing Theory and L4S Stability

+ +

+ From an M/G/1 queueing perspective, the performance of + the L4S control loop depends on the stability of the service rate (μ). + L4S stability requires frequent service opportunities and low variance + in service time to prevent the decoupling of the sender's congestion + window from the actual queue state. +

+ + + +
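The dependence of queueing delay on service-time *variance* (not just its mean) can be illustrated with the standard Pollaczek–Khinchine mean-wait formula for an M/G/1 queue. The arrival rate and service-time figures below are illustrative assumptions, not Fi-Wi measurements.

```python
# Sketch: Pollaczek-Khinchine mean waiting time for an M/G/1 queue.
# Shows that, at identical utilization, raising the service-time standard
# deviation alone (2 ms -> 50 ms, a "heavy-tailed" stand-in) inflates the
# mean wait. All numbers are illustrative assumptions.

def mg1_mean_wait(arrival_rate: float, mean_service: float,
                  var_service: float) -> float:
    """W = lambda * (sigma^2 + E[S]^2) / (2 * (1 - rho)), rho = lambda * E[S]."""
    rho = arrival_rate * mean_service
    assert rho < 1, "queue is unstable"
    return arrival_rate * (var_service + mean_service ** 2) / (2 * (1 - rho))

lam, mean_s = 80.0, 0.010                              # 80 TXOPs/s, 10 ms mean
low_var = mg1_mean_wait(lam, mean_s, 0.002 ** 2)       # near-stationary service
high_var = mg1_mean_wait(lam, mean_s, 0.050 ** 2)      # heavy-tailed service
print(f"mean wait grows {high_var / low_var:.0f}x from variance alone")  # 25x
```

The mean service rate is identical in both cases; only the second moment changes, which is exactly the stationarity argument made above.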

I.3 Link Adaptation and Spectral Robustness

+ +

+ Narrower channels reduce the probability that partial-band interference + (e.g., unmanaged IoT bursts) forces a full MCS downgrade across the + entire bonded width. This allows the Concentrator to + maintain stable link adaptation and a predictable drain rate, avoiding + the chaotic rate-shifting common in 160 MHz deployments. +

+ +

I.4 Orchestration: Width as a Control Variable

+ +

+ Fi-Wi is not anti-wideband; channel width is an orchestrated variable. + The system expands width opportunistically when contention is low to + leverage PHY gains and contracts it to 40 MHz when deterministic latency + is required. This prioritizes + spatial reuse and airtime isolation over maximum burst + rate—the fundamental technical unlock for Fi-Wi’s cell-per-room model. +

+ +

+ I.5 Capacity Density: Throughput Under a Latency SLO +

+ +

+ Fi-Wi optimizes Capacity Density under a Latency SLO, + rather than peak PHY on a single link. In dense OBSS environments, wide + channels reduce spatial reuse; narrower channels increase the number of + bounded contention domains. Consequently, aggregate goodput per area + increases even if per-link PHY decreases. +

+ +
+ Metric Definition: Low-Latency Goodput Density (ρ_LL) +

+ ρ_LL [Mbps / 1,000 sq ft] = (Σ Goodput_i) / Area | subject to p95 OWD + ≤ 20ms +

+ +

+ Where Goodput_i is the application-layer payload throughput + delivered while maintaining the p95 one-way delay (OWD) constraint. + The 20ms threshold reflects the target for interactive L4S + applications. +

+
+ +

+ Example Calculation (1,000 sq ft section of a 10,000 sq ft + floor): +

+ +

+ Assumptions: 50% aggregate offered load per BSS, default EDCA + parameters, and no explicit inter-AP coordination in the autonomous + case. +

+ + + +
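As a back-of-envelope sketch of how the density comparison works out, the per-cell and shared-domain goodput figures below are assumptions chosen to be consistent with the comparison table at the end of this appendix, not simulation outputs.

```python
# Hedged arithmetic sketch of the rho_LL comparison: 8 orthogonal 40 MHz
# cells vs. one collapsed 160 MHz contention domain. Per-cell goodput
# values are illustrative assumptions.

FLOOR_SQFT = 10_000
CELLS = 8                        # RRHs on orthogonal 40 MHz channels

# Orchestrated: each bounded cell sustains ~160 Mbps SLO-compliant goodput
# (assumed), and the cells reuse spectrum spatially.
per_cell_goodput_mbps = 160.0
rho_ll_fiwi = per_cell_goodput_mbps * CELLS / (FLOOR_SQFT / 1000)

# Autonomous: ~120 Mbps SLO-compliant goodput (assumed) shared across the
# whole floor's overlapping contention domains.
rho_ll_auto = 120.0 / (FLOOR_SQFT / 1000)

print(f"autonomous:   {rho_ll_auto:.0f} Mbps / 1,000 sq ft")   # 12
print(f"orchestrated: {rho_ll_fiwi:.0f} Mbps / 1,000 sq ft")   # 128
```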

+ I.6 Application: Aligning Wireless Capacity to Gigabit WAN Service +

+ +

+ To align with a Gigabit-class WAN service, the wireless architecture + must match the aggregate wireline supply to + orchestrated spatial demand. In a dense MDU, Contention + Delay is 10–100× larger than serialization time. A single 160 MHz AP + attempting to serve a Gigabit load creates a "fast but flaky" link that + collapses under co-channel interference, delivering only a fraction of + the ISP's provided capacity to real-time applications. +

+ +

+ Fi-Wi resolves this by using 40 MHz orchestration to spread the Gigabit + load across N coordinated spatial domains. This ensures + that the building-wide wireless fabric can actually saturate a 1 Gbps + WAN link with deterministic, multi-user goodput, rather than relying on + single-device peak bursts that starve other users and destabilize shared + airtime. +

+ +

+ I.7 Aggregation Quantization and L4S Feedback Mismatch +

+ +

+ L4S signals congestion at Layer 3 (IP ECN), but wideband Wi-Fi operates + via massive Layer 2 A-MPDU aggregation to maintain PHY efficiency. This + creates a fundamental control-loop mismatch: +

+ + + +

+ The Fi-Wi architecture addresses these challenges through its DualQ + implementation (Section 5.2), which maintains separate queues for L4S + and Classic traffic and performs per-packet sojourn time measurements at + the Concentrator before entering the A-MPDU aggregation pipeline. +
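A minimal sketch of per-packet sojourn-time marking ahead of the aggregation pipeline, assuming a simple 1 ms step threshold; the actual DualQ marking law in Section 5.2 may differ, and the threshold value is an assumption.

```python
# Sketch (assumptions, not the Concentrator implementation): CE-mark L4S
# packets based on per-packet queue sojourn time *before* A-MPDU batching,
# so the ECN signal tracks queueing delay rather than aggregation size.

from collections import deque
from dataclasses import dataclass, field

MARK_THRESHOLD_S = 0.001  # assumed L4S sojourn threshold (1 ms)

@dataclass
class Packet:
    enqueue_t: float
    ce_marked: bool = False

@dataclass
class L4SQueue:
    q: deque = field(default_factory=deque)

    def enqueue(self, now: float) -> None:
        self.q.append(Packet(enqueue_t=now))

    def dequeue(self, now: float) -> Packet:
        pkt = self.q.popleft()
        if now - pkt.enqueue_t > MARK_THRESHOLD_S:
            pkt.ce_marked = True  # signal congestion at L3 before L2 batching
        return pkt

q = L4SQueue()
q.enqueue(now=0.0)
q.enqueue(now=0.0005)
print(q.dequeue(now=0.0003).ce_marked)  # False: sojourn 0.3 ms
print(q.dequeue(now=0.0020).ce_marked)  # True: sojourn 1.5 ms
```

Because marking happens at dequeue into the aggregation pipeline rather than after a batch completes, the feedback granularity is per packet, which is the coherence property the table below attributes to the orchestrated case.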

+ +
+ +

Comparison of Service Metrics (Dense MDU Contention Model)

+ +

+ Scenario: 2x2 MIMO, 6+ overlapping BSSIDs, shared unlicensed spectrum + (5/6 GHz), 50% aggregate offered load, autonomous EDCA parameters. See + Appendix J for full simulation parameters. +

| Metric | 160 MHz (Autonomous CSMA) | 40 MHz (Fi-Wi Orchestrated) |
| --- | --- | --- |
| Peak PHY Rate (2x2, MCS 11) | ~1.2 Gbps | ~300–400 Mbps |
| Effective Airtime Utilization | <10% (Fragmented TXOPs) | 30–50% (Planned reuse / Bounded domain) |
| Service Time Variance (σ²) | High (Heavy-tailed) | Low (Near-stationary) |
| Queue Service Interval (median) | Tens to >100 ms | 5–15 ms (Stationary) |
| DualQ ECN Feedback Coherence | Sparse / Burst-marked | Continuous / Stable marking |
| Goodput Density (ρ_LL) (Mbps per 1,000 sq ft) | ~12 Mbps (Overlapping contention domains) | ~128 Mbps (8 RRHs, orthogonal 40 MHz channels) |
+ +

+ Economic Conclusion: Under realistic dense MDU + conditions, Fi-Wi's orchestrated 40 MHz architecture delivers ~10× + higher usable goodput density compared to autonomous wide-channel + deployments. This is the fundamental advantage of Fi-Wi: capacity scales + with RRH density and spatial reuse, not channel width alone. +

+ +

+ See Appendix J for detailed contention modeling and simulation + methodology. +

+
+ +
+

Appendix J: 10-Node MDU Simulation Methodology

+ +

+ This appendix details the Monte Carlo simulation and analytical models + used to derive the + Low-Latency Goodput Density (ρ_LL) metrics. The + framework evaluates Fi-Wi's spatial capacity gains under realistic + Multi-Dwelling Unit (MDU) contention scenarios. +

+ +

J.1 Spatial and RF Environment Model

+ +

+ The simulation contrasts traditional wide-area coverage with Fi-Wi's + localized orchestration. +

+ +
+ Building & RF Assumptions: +
    +
  • Geometry: 10,000 sq ft floor divided into 8 units
    (~1,250 sq ft each). Metrics are normalized to "per 1,000 sq ft" for
    comparative analysis.
  • Path Loss Model:
    PL(d) = PL(d₀) + 10n log₁₀(d/d₀) + Xσ with
    n = 2.8.
  • OBSS Overlap: Autonomous case assumes 6 neighboring
    BSSIDs audible at ≥ -62 dBm.
  • Fi-Wi Isolation: 8 RRHs achieving >25 dB
    co-channel isolation through planned orthogonal reuse.
+
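The log-distance path-loss model above translates directly to code. The reference loss PL(d₀) = 40 dB at d₀ = 1 m is an assumed value for illustration; only the exponent n = 2.8 comes from the appendix.

```python
# Sketch of the J.1 log-distance path-loss model with optional shadowing:
#   PL(d) = PL(d0) + 10 * n * log10(d / d0) + X_sigma,  n = 2.8
# PL(d0) = 40 dB at d0 = 1 m is an assumed reference value.

import math
import random

def path_loss_db(d_m: float, pl_d0_db: float = 40.0, d0_m: float = 1.0,
                 n: float = 2.8, sigma_db: float = 0.0) -> float:
    shadowing = random.gauss(0.0, sigma_db) if sigma_db > 0 else 0.0
    return pl_d0_db + 10.0 * n * math.log10(d_m / d0_m) + shadowing

# Deterministic check (sigma = 0): each decade of distance adds 28 dB.
print(f"PL(10 m) = {path_loss_db(10.0):.1f} dB")  # 68.0 dB
```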
+ +

J.2 Contention and Backoff Logic

+ +

+ The simulation models 20 active stations (STAs) distributed across the + 8-unit floor (average 2.5 STAs per unit). + Service Time Variance (σ²) is calculated by observing + the delay between TX_START and ACK_END across + 10⁶ simulated TXOPs. +

+ + + +
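A toy Monte Carlo in the spirit of this measurement, assuming a simple binary-exponential-backoff model with illustrative busy probabilities standing in for the wide- and narrow-channel contention regimes. This is not the Appendix J simulator; it only demonstrates how service-time variance is extracted from sampled TXOPs.

```python
# Toy Monte Carlo (illustrative assumptions only): sample TX_START -> ACK_END
# service times as fixed airtime plus CSMA backoff, doubling the contention
# window on each busy sensing round. p_busy values are stand-ins for the
# 160 MHz (collapsed) and 40 MHz (bounded) contention domains.

import random
import statistics

def sample_service_time(p_busy: float, airtime_s: float = 0.002,
                        slot_s: float = 9e-6, cw_min: int = 16,
                        cw_max: int = 1024) -> float:
    delay, cw = 0.0, cw_min
    while random.random() < p_busy:            # channel sensed busy: back off
        delay += random.randint(0, cw - 1) * slot_s
        cw = min(cw * 2, cw_max)               # binary exponential backoff
    delay += random.randint(0, cw - 1) * slot_s
    return delay + airtime_s

random.seed(7)
wide = [sample_service_time(p_busy=0.9) for _ in range(20_000)]    # 160 MHz-like
narrow = [sample_service_time(p_busy=0.3) for _ in range(20_000)]  # 40 MHz-like
ratio = statistics.variance(wide) / statistics.variance(narrow)
print(f"service-time variance ratio (wide/narrow): {ratio:.0f}x")
```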

J.3 The ρ_LL Filtration Process

+ +

+ The Goodput Density is derived by filtering raw + throughput through the 20ms p95 OWD constraint. +

+ +
+// Derivation for ρ_LL Calculation
+for each packet i:
+    delay_i = contention_delay + serialization_delay + retry_overhead
+    if delay_i <= 20ms:
+        accepted_payload += size_i
+    else:
+        dropped_from_goodput_metric += 1
+
+ρ_LL = (accepted_payload) / (total_time * area)
+  
+
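A runnable version of the filtration above, using synthetic packet delays in place of simulator output; the packet sizes and timings are illustrative.

```python
# Runnable rho_LL filtration: only packets meeting the 20 ms p95 OWD budget
# contribute to goodput. Inputs are synthetic stand-ins for simulator output.

def goodput_density(packets, duration_s: float, area_ksqft: float,
                    slo_s: float = 0.020) -> float:
    """packets: iterable of (delay_s, size_bits). Returns Mbps per 1,000 sq ft."""
    accepted_bits = sum(size for delay, size in packets if delay <= slo_s)
    return accepted_bits / duration_s / 1e6 / area_ksqft

# Three 12 Mbit packets over 1 s on a 1,000 sq ft section; the 45 ms packet
# misses the SLO and is excluded from the goodput numerator.
pkts = [(0.005, 12_000_000), (0.018, 12_000_000), (0.045, 12_000_000)]
print(goodput_density(pkts, duration_s=1.0, area_ksqft=1.0))  # 24.0
```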

J.3.1 Numerical Results and Derivation

+ +

+ The simulation produces the following goodput derivation for a 1,000 sq
+ ft section:

+ + + +

J.4 Traffic Model and Payload Composition

| Traffic Type | % of Load | Constraint |
| --- | --- | --- |
| Interactive (L4S/Gaming) | 20% | Strict SLO subject |
| Streaming (4K Video) | 50% | Freeze sensitive |
| Bulk (Background) | 30% | Throughput focused |
+
+ + + ↑ Contents + +