Leaderboard
Performance rankings for traffic model evaluation
Robustness
Baseline Performance (Real Data): TrafficFormer 0.8038, ET-BERT 0.7849, NetGPT 0.8604, YaTC 0.8566
| Model | R1 (Light) | R2 (Medium) | R3 (Heavy) | R4 (Label Noise) | Mean(R1–R4)/Baseline | Worst/Baseline |
|---|---|---|---|---|---|---|
| TrafficFormer | 0.8000 (-0.47%) | 0.7283 (-9.39%) | 0.7245 (-9.87%) | 0.7358 (-8.46%) | 92.95% | 90.13% |
| ET-BERT | 0.7811 (-0.48%) | 0.7207 (-8.18%) | 0.6981 (-11.06%) | 0.7000 (-10.82%) | 92.37% | 88.94% |
| NetGPT | 0.8340 (-3.07%) | 0.8264 (-3.95%) | 0.8151 (-5.26%) | 0.8226 (-4.39%) | 95.83% | 94.74% |
| YaTC | 0.8164 (-4.69%) | 0.8008 (-6.51%) | 0.7143 (-16.61%) | 0.8302 (-3.08%) | 92.27% | 83.39% |
Baseline Performance (Real Data): TrafficFormer 0.8718, ET-BERT 0.8278, NetGPT 0.8755, YaTC 0.8425
| Model | R1 (Light) | R2 (Medium) | R3 (Heavy) | R4 (Label Noise) | Mean(R1–R4)/Baseline | Worst/Baseline |
|---|---|---|---|---|---|---|
| TrafficFormer | 0.8498 (-2.52%) | 0.8278 (-5.05%) | 0.7912 (-9.25%) | 0.8681 (-0.42%) | 95.69% | 90.75% |
| ET-BERT | 0.8205 (-0.88%) | 0.7949 (-3.97%) | 0.7326 (-11.50%) | 0.7949 (-3.97%) | 94.92% | 88.50% |
| NetGPT | 0.8645 (-1.26%) | 0.8425 (-3.77%) | 0.8132 (-7.12%) | 0.8571 (-2.10%) | 96.44% | 92.88% |
| YaTC | 0.8015 (-4.87%) | 0.8039 (-4.58%) | 0.7414 (-12.00%) | 0.8315 (-1.30%) | 94.86% | 88.03% |
Baseline Performance (Real Data): TrafficFormer 0.7904, ET-BERT 0.7857, NetGPT 0.7780, YaTC 0.8962
| Model | R1 (Light) | R2 (Medium) | R3 (Heavy) | R4 (Label Noise) | Mean(R1–R4)/Baseline | Worst/Baseline |
|---|---|---|---|---|---|---|
| TrafficFormer | 0.6923 (-12.41%) | 0.5754 (-27.20%) | 0.4453 (-43.67%) | 0.7789 (-1.46%) | 78.82% | 56.34% |
| ET-BERT | 0.6675 (-15.04%) | 0.5445 (-30.70%) | 0.4104 (-47.77%) | 0.7793 (-0.81%) | 76.42% | 52.23% |
| NetGPT | 0.6823 (-12.30%) | 0.5715 (-26.54%) | 0.4474 (-42.49%) | 0.7597 (-2.35%) | 79.08% | 57.51% |
| YaTC | 0.8408 (-6.18%) | 0.7533 (-15.95%) | 0.6213 (-30.67%) | 0.8644 (-3.55%) | 85.91% | 69.33% |
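
The derived columns in the Robustness tables can be reproduced from the baseline and the four perturbed scores. The sketch below is a minimal illustration, assuming the parenthesized values are relative changes against the baseline and that the last two columns are the mean and the minimum of R1 to R4 divided by the baseline; the function name `robustness_summary` is hypothetical and not part of the leaderboard.

```python
def robustness_summary(baseline, scores):
    """Return per-setting relative change (%) plus the two summary ratios (%).

    Assumed definitions (inferred from the table values, not stated by the page):
      relative change = (score - baseline) / baseline
      Mean/Baseline   = mean(scores) / baseline
      Worst/Baseline  = min(scores) / baseline
    """
    rel_change = [100.0 * (s - baseline) / baseline for s in scores]
    mean_ratio = 100.0 * (sum(scores) / len(scores)) / baseline
    worst_ratio = 100.0 * min(scores) / baseline
    return rel_change, mean_ratio, worst_ratio


# TrafficFormer row of the first Robustness table (baseline 0.8038).
rel, mean_ratio, worst_ratio = robustness_summary(
    0.8038, [0.8000, 0.7283, 0.7245, 0.7358]
)
print([round(x, 2) for x in rel])   # [-0.47, -9.39, -9.87, -8.46]
print(round(mean_ratio, 2))         # 92.95
print(round(worst_ratio, 2))        # 90.13
```

Because the published scores are rounded to four decimals, recomputed ratios for other rows may differ from the table in the last digit.
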
Data Efficiency
Interpretability
ISCX-VPN(app) · ET-BERT
ISCX-VPN(service) · ET-BERT
CSTNET-TLS1.3 · ET-BERT
ISCX-VPN(app) · YaTC
ISCX-VPN(service) · YaTC
CSTNET-TLS1.3 · YaTC
Distribution Fidelity
IDS2017 · Numerical Columns Metrics
| Model | JSD | TVD |
|---|---|---|
| TrafficLLM | 0.2195 | 0.4051 |
| NetDiffusion | 0.4555 | 0.6932 |
IDS2017 · Categorical Columns Metrics
| Model | JSD | TVD |
|---|---|---|
| TrafficLLM | 0.3515 | 0.5140 |
| NetDiffusion | 0.6402 | 0.9651 |
VPN2016 · Numerical Columns Metrics
| Model | JSD | TVD |
|---|---|---|
| TrafficLLM | 0.4430 | 0.5241 |
| NetDiffusion | 0.3660 | 0.5927 |
VPN2016 · Categorical Columns Metrics
| Model | JSD | TVD |
|---|---|---|
| TrafficLLM | 0.6937 | 0.7738 |
| NetDiffusion | 0.4364 | 0.6784 |
IoT2023 · Numerical Columns Metrics
| Model | JSD | TVD |
|---|---|---|
| TrafficLLM | 0.3794 | 0.6501 |
| NetDiffusion | 0.4146 | 0.6629 |
IoT2023 · Categorical Columns Metrics
| Model | JSD | TVD |
|---|---|---|
| TrafficLLM | 0.5391 | 0.8519 |
| NetDiffusion | 0.4727 | 0.7019 |
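
For readers unfamiliar with the two column metrics, the sketch below shows how Jensen-Shannon divergence (JSD) and total variation distance (TVD) can be computed between a real and a generated categorical column. It is an illustration under stated assumptions: the page does not say which logarithm base is used, whether JSD is reported as a divergence or as its square root, or how numerical columns are binned. Base 2 and the plain divergence are assumed here, and the helper names are hypothetical.

```python
import math
from collections import Counter


def aligned_distributions(real, generated):
    """Empirical probabilities of both samples over the union of observed values."""
    support = sorted(set(real) | set(generated))
    real_counts, gen_counts = Counter(real), Counter(generated)
    p = [real_counts[v] / len(real) for v in support]
    q = [gen_counts[v] / len(generated) for v in support]
    return p, q


def tvd(p, q):
    """Total variation distance: half the L1 distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def jsd(p, q):
    """Jensen-Shannon divergence in bits (assumed base 2), bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy columns: the generated data collapses onto a single value.
p, q = aligned_distributions(["a", "a", "b", "b"], ["a", "a", "a", "a"])
print(round(tvd(p, q), 4), round(jsd(p, q), 4))
```

Both metrics are 0 for identical distributions and reach 1 (with base-2 logarithms for JSD) when the two columns share no values, so lower is better in the tables above.
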
Downstream Utility
Baseline Performance (Real Data): ET-BERT: 0.9756, NetGPT: 0.9868, TrafficFormer: 0.9640
IDS2017 · 50% Real, 50% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.9664 (-0.0092) | 0.9836 (-0.0032) | 0.9588 (-0.0052) |
| NetDiffusion | 0.9776 (+0.0020) | 0.9824 (-0.0044) | 0.9784 (+0.0144) |
IDS2017 · 75% Real, 25% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.9584 (-0.0172) | 0.9848 (-0.0020) | 0.9380 (-0.0260) |
| NetDiffusion | 0.9748 (-0.0008) | 0.9832 (-0.0036) | 0.9672 (+0.0032) |
IDS2017 · 100% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.4528 (-0.5228) | 0.4612 (-0.5256) | 0.4528 (-0.5112) |
| NetDiffusion | 0.2188 (-0.7568) | 0.2232 (-0.7636) | 0.1692 (-0.7948) |
Baseline Performance (Real Data): ET-BERT: 0.9600, NetGPT: 0.8868, TrafficFormer: 0.8192
VPN2016 · 50% Real, 50% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.9992 (+0.0392) | 0.9992 (+0.1124) | 0.9988 (+0.1796) |
| NetDiffusion | 0.9080 (-0.0520) | 0.9836 (+0.0968) | 0.9940 (+0.1748) |
VPN2016 · 75% Real, 25% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.9992 (+0.0392) | 0.7672 (-0.1196) | 0.9984 (+0.1792) |
| NetDiffusion | 0.9932 (+0.0332) | 0.9972 (+0.1104) | 0.9856 (+0.1664) |
VPN2016 · 100% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.6796 (-0.2804) | 0.6796 (-0.2072) | 0.6796 (-0.1396) |
| NetDiffusion | 0.6476 (-0.3124) | 0.6700 (-0.2168) | 0.6568 (-0.1624) |
Baseline Performance (Real Data): ET-BERT: 0.6636, NetGPT: 0.6784, TrafficFormer: 0.6748
IoT2023 · 50% Real, 50% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.7012 (+0.0376) | 0.7512 (+0.0728) | 0.6744 (-0.0004) |
| NetDiffusion | 0.7252 (+0.0616) | 0.7420 (+0.0636) | 0.7440 (+0.0692) |
IoT2023 · 75% Real, 25% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.6840 (+0.0204) | 0.7456 (+0.0672) | 0.6396 (-0.0352) |
| NetDiffusion | 0.7600 (+0.0964) | 0.7736 (+0.0952) | 0.7184 (+0.0436) |
IoT2023 · 100% Generated
| Model | ET-BERT | NetGPT | TrafficFormer |
|---|---|---|---|
| TrafficLLM | 0.0940 (-0.5696) | 0.0980 (-0.5804) | 0.0940 (-0.5808) |
| NetDiffusion | 0.0348 (-0.6288) | 0.0464 (-0.6320) | 0.0504 (-0.6244) |
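
The parenthesized values in the Downstream Utility tables equal the score minus the corresponding real-data baseline, so they are absolute differences in the metric's own units rather than relative percentages. A minimal sketch of that formatting follows; the helper name `format_delta` is hypothetical.

```python
def format_delta(score, baseline):
    """Format a score with its signed absolute difference from the baseline."""
    return f"{score:.4f} ({score - baseline:+.4f})"


# IDS2017, 50% real / 50% generated, ET-BERT column (baseline 0.9756).
print(format_delta(0.9664, 0.9756))  # 0.9664 (-0.0092)
print(format_delta(0.9776, 0.9756))  # 0.9776 (+0.0020)
```
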
Protocol Correctness
| Model | Compliance Rate |
|---|---|
| TrafficLLM | -- |
| NetDiffusion | 0.6624 |
| Model | Compliance Rate |
|---|---|
| TrafficLLM | -- |
| NetDiffusion | 0.7246 |
| Model | Compliance Rate |
|---|---|
| TrafficLLM | -- |
| NetDiffusion | 0.7478 |
Generation Diversity
| Model | Entropy | Coverage | Novelty |
|---|---|---|---|
| TrafficLLM | 2.2598 | 0.4197 | 0.5277 |
| NetDiffusion | 1.8197 | 0.0989 | 0.6364 |
| Model | Entropy | Coverage | Novelty |
|---|---|---|---|
| TrafficLLM | 1.8232 | 0.1539 | 0.5307 |
| NetDiffusion | 1.7984 | 0.2197 | 0.7850 |
| Model | Entropy | Coverage | Novelty |
|---|---|---|---|
| TrafficLLM | 2.9316 | 0.1531 | 0.6110 |
| NetDiffusion | 1.3713 | 0.2297 | 0.7043 |
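
This page does not define Entropy, Coverage, or Novelty. The sketch below shows one common reading, offered purely as an assumption: entropy as the Shannon entropy (in bits) of the generated values, coverage as the fraction of distinct real values that also appear in the generated data, and novelty as the fraction of generated samples not present in the real data. The function name `diversity_metrics` is hypothetical, and the leaderboard's actual definitions may differ.

```python
import math
from collections import Counter


def diversity_metrics(real, generated):
    """Return (entropy, coverage, novelty) under the assumed definitions above."""
    counts = Counter(generated)
    n = len(generated)
    # Shannon entropy of the generated sample's empirical distribution, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    real_set, gen_set = set(real), set(generated)
    # Share of distinct real values that the generator reproduces at least once.
    coverage = len(real_set & gen_set) / len(real_set)
    # Share of generated samples whose value never occurs in the real data.
    novelty = sum(1 for g in generated if g not in real_set) / n
    return entropy, coverage, novelty


# Toy example with protocol labels standing in for generated records.
real = ["tcp", "udp", "icmp", "dns"]
generated = ["tcp", "tcp", "udp", "quic"]
print(diversity_metrics(real, generated))  # (1.5, 0.5, 0.25)
```
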