add AI strategy docs
parent
05fbff0092
commit
a84f9a0db6
@ -0,0 +1,141 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>AI Strategy: Wi-Fi Collapse Detection</title>
    <style>
        body {
            font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
            line-height: 1.6;
            color: #333;
            max-width: 960px;
            margin: 0 auto;
            padding: 20px;
            background-color: #f4f6f8;
        }
        .container {
            background-color: #fff;
            padding: 50px;
            border-radius: 8px;
            box-shadow: 0 4px 6px rgba(0,0,0,0.05);
        }
        h1 { border-bottom: 3px solid #8e44ad; padding-bottom: 15px; color: #2c3e50; }
        h2 { margin-top: 40px; color: #34495e; border-left: 5px solid #8e44ad; padding-left: 15px; }
        h3 { margin-top: 25px; color: #7f8c8d; }

        .architecture-box {
            background: #f3e5f5;
            border: 1px solid #e1bee7;
            padding: 20px;
            border-radius: 8px;
            margin: 20px 0;
        }

        code {
            background-color: #2d3436;
            color: #fab1a0;
            padding: 2px 6px;
            border-radius: 4px;
            font-family: "SFMono-Regular", Consolas, monospace;
        }
        pre {
            background-color: #2d3436;
            color: #dfe6e9;
            padding: 15px;
            border-radius: 5px;
            overflow-x: auto;
        }

        .step-number {
            display: inline-block;
            background: #8e44ad;
            color: white;
            width: 24px;
            height: 24px;
            border-radius: 50%;
            text-align: center;
            line-height: 24px;
            font-weight: bold;
            margin-right: 10px;
        }
    </style>
</head>
<body>

<div class="container">

    <h1>AI Strategy: Wi-Fi Collapse Detection</h1>
    <p><strong>Technical Brief for Development Team</strong></p>
    <p>This document outlines the machine learning pipeline for detecting Wi-Fi "Collapse" events (Hidden Node, Saturation, Interference) using ESP32 sensors. We use an <strong>LLM-Assisted Weak Supervision</strong> approach to overcome the lack of labeled training data.</p>

    <div class="architecture-box">
        <h3>🚀 The Core Concept</h3>
        <p>We cannot hard-code thresholds because RF environments vary wildly. Instead, we use <strong>Gemini (LLM)</strong> as a "Senior Network Engineer" to label raw logs, and then train a fast, lightweight model (Random Forest/XGBoost) to mimic that decision logic in real time.</p>
    </div>

    <h2>Phase 1: Firmware Data Engineering</h2>
    <p>The ESP32 firmware is the data generator. It must output time-series features suitable for ML, not just human-readable logs.</p>

    <h3>1. Feature Vector (The CSV)</h3>
    <p>Once per second, the firmware writes a CSV line to the internal <code>storage</code> partition. This is our feature set:</p>
    <ul>
        <li><code>Timestamp</code>: Epoch time.</li>
        <li><code>RetryRate</code>: (Float 0-100) Percentage of frames requiring retransmission.</li>
        <li><code>AvgNAV</code>: (UInt16) Average Network Allocation Vector duration (microseconds).</li>
        <li><code>MaxNAV</code>: (UInt16) Peak NAV duration (microseconds) seen in that second.</li>
        <li><code>Collisions</code>: (UInt8) Count of inferred collision events.</li>
        <li><code>AvgPHY</code>: (UInt16) Average data rate (Mbps). A low PHY rate combined with a high NAV indicates a degraded channel.</li>
        <li><code>Mismatches</code>: (UInt8) Count of packets where duration &gt; expected airtime.</li>
    </ul>

    <h3>2. Storage Strategy</h3>
    <p>A custom partition table allocates roughly 5-13MB to <strong>LittleFS/SPIFFS</strong>, which allows 24-72 hours of continuous logging. NVS is strictly for config.</p>

    <hr>

    <h2>Phase 2: The "Weak Supervision" Pipeline</h2>
    <p>We solve the "Cold Start Problem" (having data but no labels) by using Generative AI.</p>

    <h3>Step A: Context-Aware Capture</h3>
    <p>Technicians capture logs in known scenarios. The filename conveys the context:</p>
    <ul>
        <li><code>microwave_interference.csv</code></li>
        <li><code>hidden_node_scenario_A.csv</code></li>
        <li><code>clean_baseline.csv</code></li>
    </ul>

    <h3>Step B: LLM Labeling (Gemini)</h3>
    <p>We feed the raw CSV chunks to Gemini via API with a prompt that injects domain knowledge:</p>
    <div style="background:#eee; padding:15px; border-left:4px solid #8e44ad; font-style:italic;">
        "Act as a Wi-Fi expert. Analyze this CSV log. This data represents a Hidden Node scenario. Look for periods where AvgNAV is high (&gt;10ms) but AvgPHY remains normal, yet Retries spike. Label each timestamp row as 'Normal', 'Congestion', or 'Collapse'."
    </div>
    <p><strong>Result:</strong> A "Silver Standard" labeled dataset ready for supervised learning.</p>

    <hr>

    <h2>Phase 3: Training &amp; Inference</h2>
    <p>Gemini is too slow for real-time packet analysis, so we train a classical model to do the real-time classification.</p>

    <h3>1. Model Selection</h3>
    <p><strong>Algorithm:</strong> Random Forest Classifier or XGBoost.</p>
    <ul>
        <li><strong>Why?</strong> They excel at tabular data, handle non-linear relationships (e.g., high NAV is fine <em>unless</em> Retries are also high), and classify a single row in milliseconds or less.</li>
        <li><strong>Input:</strong> The 7-column Feature Vector from Phase 1.</li>
        <li><strong>Output:</strong> Probability Score (0.0 - 1.0) for "Collapse".</li>
    </ul>

    <h3>2. Runtime Inference Architecture</h3>
    <p><strong>Deployment Target:</strong> Linux Gateway / Edge Server.</p>
    <ol>
        <li><span class="step-number">1</span> ESP32 streams CSV lines via UDP (Broadcast/Unicast) to the Linux gateway.</li>
        <li><span class="step-number">2</span> Python script listens on UDP port.</li>
        <li><span class="step-number">3</span> Script loads pre-trained <code>model.pkl</code>.</li>
        <li><span class="step-number">4</span> Incoming CSV -&gt; Feature Vector -&gt; <code>model.predict_proba()</code>.</li>
        <li><span class="step-number">5</span> If <code>Collapse_Prob &gt; 0.8</code> for 3 consecutive seconds -&gt; <strong>TRIGGER ALERT</strong>.</li>
    </ol>

</div>

</body>
</html>
@ -0,0 +1,94 @@
# AI Strategy: Wi-Fi Collapse Detection

**Target Audience:** Development Team

**Objective:** Build a real-time detection engine using Weak Supervision.

## 🏗️ Architecture Overview

The system uses a two-stage AI approach:
1. **Teacher (Offline):** Gemini (LLM) analyzes historical logs to create weak ("silver standard") labels that stand in for ground truth.
2. **Student (Real-Time):** A lightweight Random Forest model runs on the Linux gateway for sub-second inference.

[Image of AI Pipeline Diagram]

---

## Phase 1: Firmware Data Engineering

The ESP32 firmware is responsible for **Feature Extraction**. It must aggregate raw packet events into 1-second statistical snapshots.

### The Feature Vector
The firmware writes the following struct to flash/UDP every 1000ms:

| Feature | Type | Description |
| :--- | :--- | :--- |
| `timestamp` | `uint32` | Epoch or Uptime. |
| `retry_rate` | `float` | % of frames with Retry bit set. |
| `avg_nav` | `uint16` | Average Network Allocation Vector (microseconds). |
| `max_nav` | `uint16` | Maximum NAV observed in the interval (microseconds). |
| `collisions` | `uint8` | Count of inferred collisions (High NAV + Retry). |
| `avg_phy` | `uint16` | Average PHY Rate (Mbps). |
| `mismatches` | `uint8` | Count of duration anomalies (Spoofing/Bugs). |
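
To illustrate the aggregation step, the sketch below folds one second of per-frame observations into a single CSV line in the column order of the table. It is a Python prototype for offline experiments only (the firmware would do this natively); the `Frame` fields, the `nav_high_us` threshold, and the `expected_us` airtime input are assumptions introduced for the example, not values from this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """One sniffed frame (hypothetical shape, not the firmware's struct)."""
    retry: bool        # Retry bit set in the 802.11 header
    nav_us: int        # Duration/NAV field, microseconds
    phy_mbps: int      # PHY rate the frame was sent at
    expected_us: int   # Airtime we would expect for this frame


def snapshot(ts: int, frames: List[Frame], nav_high_us: int = 5000) -> str:
    """Fold one second of frames into a CSV line in table-column order."""
    n = len(frames)
    if n == 0:
        return f"{ts},0.0,0,0,0,0,0"
    retry_rate = 100.0 * sum(f.retry for f in frames) / n
    avg_nav = sum(f.nav_us for f in frames) // n
    max_nav = max(f.nav_us for f in frames)
    # Inferred collision: a retried frame that also carried a high NAV.
    collisions = sum(1 for f in frames if f.retry and f.nav_us > nav_high_us)
    avg_phy = sum(f.phy_mbps for f in frames) // n
    # Duration anomaly: NAV longer than the expected airtime.
    mismatches = sum(1 for f in frames if f.nav_us > f.expected_us)
    # Clamp the counters to the uint8 range given in the table.
    collisions, mismatches = min(collisions, 255), min(mismatches, 255)
    return f"{ts},{retry_rate:.1f},{avg_nav},{max_nav},{collisions},{avg_phy},{mismatches}"
```

The threshold is passed as a parameter because a sensible "high NAV" value depends on the environment and would need to be tuned against real captures.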

**Storage:**

* Do **NOT** use NVS. Use a custom partition table with **LittleFS** or **SPIFFS**.
* Capacity: ~1.1 days (8MB chip) to ~3 days (16MB chip) at 1Hz sampling.
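
The capacity figures can be sanity-checked with a little arithmetic. The sketch below assumes roughly 55 bytes per CSV row and ignores filesystem overhead; both are assumptions made for illustration. The 5 MB and 13 MB data-partition sizes are the approximate allocations mentioned in the HTML brief in this commit.

```python
SECONDS_PER_DAY = 86_400
MIB = 1024 * 1024


def logging_days(partition_bytes: int, bytes_per_row: int = 55, rate_hz: float = 1.0) -> float:
    """Days of logging that fit in a partition, ignoring filesystem overhead."""
    rows = partition_bytes / bytes_per_row
    return rows / (rate_hz * SECONDS_PER_DAY)


print(f"{logging_days(5 * MIB):.1f} days")    # ~5 MB data partition (8MB chip)  -> 1.1 days
print(f"{logging_days(13 * MIB):.1f} days")   # ~13 MB data partition (16MB chip) -> 2.9 days
```

Measuring the real average row length on a device and re-running the calculation would give a firmer estimate than the assumed 55 bytes.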

---

## Phase 2: The "Weak Supervision" Pipeline

We lack labeled data. Manually reviewing 100,000 log rows to mark each collapse is not practical, so we use Gemini to do the labeling.

### 1. Data Collection (Contextual)
Technicians run `async_mass_deploy` and collect logs in specific, controlled environments:
* **Clean:** Basement, Faraday cage.
* **Noisy:** Microwave running, Baby monitor active.
* **Hostile:** Hidden Node simulation (2 ESPs blasting UDP, hidden from each other).

### 2. The Labeling Loop (Python + Gemini API)
We will write a script (`label_data.py`) that:
1. Reads the raw CSVs.
2. Injects a **System Prompt** based on the filename (Context).
3. Asks Gemini to output a classification column: `0` (Normal), `1` (Interference), `2` (Collapse).

> **Prompt Logic:** "In a hidden node scenario, we expect high Retries and low Throughput, but standard NAV values might look normal because the nodes can't hear each other. Label rows matching this pattern as 'Collapse'."
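
One possible shape for the core of `label_data.py` is sketched below. It deliberately leaves the Gemini call out: `ask_llm` is a hypothetical callable that takes a prompt string and returns the model's text reply, so any client library can be wrapped behind it. The `CONTEXT_HINTS` mapping, the prompt wording, and the function names are assumptions for illustration, not the final prompts.

```python
import csv
import io
from pathlib import Path
from typing import Callable, List

# Hypothetical filename-keyword -> context hints; the real prompts are still to be written.
CONTEXT_HINTS = {
    "hidden_node": "This capture is a Hidden Node scenario.",
    "microwave": "This capture has a microwave oven running nearby.",
    "clean": "This capture is a clean baseline with no known interference.",
}
VALID_LABELS = {"0", "1", "2"}  # 0 Normal, 1 Interference, 2 Collapse


def build_prompt(filename: str, csv_chunk: str) -> str:
    """Compose the scenario context (from the filename) and the labeling request."""
    stem = Path(filename).stem.lower()
    hint = next((h for key, h in CONTEXT_HINTS.items() if key in stem), "Scenario unknown.")
    return (
        f"Act as a Wi-Fi expert. {hint}\n"
        "For each CSV row below, output one line containing only 0 (Normal), "
        "1 (Interference), or 2 (Collapse).\n\n" + csv_chunk
    )


def label_chunk(filename: str, rows: List[List[str]],
                ask_llm: Callable[[str], str]) -> List[List[str]]:
    """Label one chunk of rows; reject replies that do not match row-for-row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    reply = ask_llm(build_prompt(filename, buf.getvalue()))
    labels = [line.strip() for line in reply.splitlines() if line.strip()]
    if len(labels) != len(rows) or not set(labels) <= VALID_LABELS:
        raise ValueError(f"unusable LLM reply for {filename}: {reply!r}")
    return [row + [label] for row, label in zip(rows, labels)]
```

Validating the reply row-for-row matters because an LLM can silently skip or merge rows; a rejected chunk can simply be retried rather than polluting the training set.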

---

## Phase 3: Runtime Inference (Linux)

We do not run the LLM live. We run a pre-trained, serialized Scikit-Learn model.

### Training
* **Input:** The Gemini-labeled CSVs.
* **Model:** Random Forest Classifier (Robust, interpretable feature importance).
* **Artifact:** `wifi_collapse_model.pkl`
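
A minimal training sketch under stated assumptions: it needs scikit-learn, pandas, and joblib installed, it uses all seven table columns as features, and it substitutes made-up synthetic rows for the Gemini-labeled CSVs so that it runs end to end. The hyperparameters, the `label` column name, and the `demo_model.pkl` path are illustrative choices, not decisions recorded in this document.

```python
import random

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

COLUMNS = ["timestamp", "retry_rate", "avg_nav", "max_nav",
           "collisions", "avg_phy", "mismatches"]


def train(labeled: pd.DataFrame, out_path: str = "wifi_collapse_model.pkl") -> RandomForestClassifier:
    """Fit a Random Forest on labeled rows and persist it with joblib."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(labeled[COLUMNS], labeled["label"])
    joblib.dump(model, out_path)
    return model


def synthetic_rows(n: int = 60) -> pd.DataFrame:
    """Made-up stand-in for labeled captures so the sketch is runnable."""
    rng = random.Random(0)
    # label -> (retry_rate, avg_nav, collisions) centres; values are illustrative only.
    centres = {0: (2.0, 300, 0), 1: (20.0, 2000, 3), 2: (60.0, 9000, 20)}
    rows = []
    for i in range(n):
        label = i % 3
        retry, nav, coll = centres[label]
        rows.append({
            "timestamp": i, "retry_rate": retry + rng.random(),
            "avg_nav": nav + rng.randint(0, 50), "max_nav": 2 * nav,
            "collisions": coll, "avg_phy": 54, "mismatches": 0, "label": label,
        })
    return pd.DataFrame(rows)


# Demo run on synthetic data; written to a separate file so a real model is not overwritten.
model = train(synthetic_rows(), out_path="demo_model.pkl")
print(dict(zip(COLUMNS, model.feature_importances_.round(3))))
```

Printing the feature importances is a quick way to check that the model leans on the radio metrics rather than on an incidental column such as the timestamp.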

### The Real-Time Loop
The Linux monitoring service performs the following loop:

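```python
import io
import socket

import joblib
import pandas as pd

# Column order of the firmware's CSV line (Phase 1 feature vector).
COLUMNS = ["timestamp", "retry_rate", "avg_nav", "max_nav",
           "collisions", "avg_phy", "mismatches"]
COLLAPSE = 2  # Label ids from the labeling step: 0 Normal, 1 Interference, 2 Collapse


def parse_packet(data: bytes) -> pd.DataFrame:
    """Parse one CSV line from the ESP32 into a single-row DataFrame."""
    return pd.read_csv(io.BytesIO(data), header=None, names=COLUMNS)


def trigger_alert(addr, message: str) -> None:
    """Placeholder alert hook: log the sender address and the message."""
    print(f"ALERT from {addr[0]}: {message}")


# Load Model
model = joblib.load('wifi_collapse_model.pkl')

# Listen for ESP32 Data
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('0.0.0.0', 5000))

while True:
    data, addr = sock.recvfrom(1024)
    # Parse CSV -> DataFrame
    features = parse_packet(data)

    # Inference: predict() returns one label per row, so take the first
    prediction = model.predict(features)[0]

    if prediction == COLLAPSE:
        trigger_alert(addr, "Network Collapse Detected")
```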