
Multimodal Wireless Foundation Model

WavesFM for Multimodal Wireless Intelligence

4 modalities · Self-supervised pretraining · Reproducible benchmarks

A single wireless foundation model that provides a shared representation for raw IQ streams, spectrograms, channel state information (CSI), and channel impulse response (CIR).

Our vision

Multimodal design

One ViT backbone learns a shared representation for IQ streams, spectrograms, CSI, and CIR.
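
As a rough sketch of this design (the module names, patch sizes, and embedding width below are illustrative assumptions, not the released WavesFM architecture), per-modality tokenizers can project each input type into a common token space that a single Transformer encoder then shares:

```python
# Sketch: per-modality patch embedders feeding one shared Transformer encoder.
# Names, shapes, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class SharedWirelessBackbone(nn.Module):
    def __init__(self, dim=256, depth=6, heads=8):
        super().__init__()
        # One lightweight embedder per modality maps patches into the shared token space.
        self.embed = nn.ModuleDict({
            "iq": nn.Linear(2 * 64, dim),            # 64-sample IQ patches (I and Q)
            "spectrogram": nn.Linear(16 * 16, dim),  # 16x16 time-frequency patches
            "csi": nn.Linear(64, dim),               # per-subcarrier CSI vectors
            "cir": nn.Linear(32, dim),               # CIR tap windows
        })
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)  # shared across modalities

    def forward(self, patches: torch.Tensor, modality: str) -> torch.Tensor:
        tokens = self.embed[modality](patches)  # (batch, num_patches, dim)
        return self.encoder(tokens)             # shared representation

model = SharedWirelessBackbone()
iq_patches = torch.randn(4, 100, 2 * 64)       # a batch of flattened IQ patches
features = model(iq_patches, modality="iq")    # (4, 100, 256)
```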

Multitask support

Jointly supports sensing, communication, and localization tasks.

Benchmarks + artifacts

Linked datasets, protocols, reproducible benchmarks, weights, and code.

Benchmarks at a glance

Compare performance across datasets and fine-tuning regimes.

How to read

Regimes: LP = linear probe, FT2 = partial fine-tuning, LoRA = low-rank adaptation. Positioning tasks are displayed as 1/(1 + error) so higher is better.
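
As a concrete example of that normalization (an illustrative helper, not project code), an error of 0 maps to a score of 1 and larger errors decay toward 0:

```python
def positioning_score(error: float) -> float:
    """Map a positioning error (in the benchmark's units) to (0, 1]; higher is better."""
    return 1.0 / (1.0 + error)

print(positioning_score(0.0))  # 1.0 -> perfect estimate
print(positioning_score(1.0))  # 0.5
print(positioning_score(4.0))  # 0.2
```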

Figure: task wheel comparison across LP, FT2, and LoRA regimes.

Why WavesFM

Masked wireless modeling

Self-supervised pretraining to learn a shared representation across modalities.
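
A minimal sketch of this objective in the spirit of masked autoencoders; the masking ratio, the model interface, and the loss below are generic assumptions, not the exact WavesFM recipe:

```python
import torch

def masked_reconstruction_loss(model, patches, mask_ratio=0.75):
    """MSE loss on randomly masked patch tokens (generic MAE-style sketch)."""
    batch, num_patches, dim = patches.shape
    num_keep = int(num_patches * (1 - mask_ratio))
    # Per-sample random permutation; the first num_keep patches stay visible.
    perm = torch.rand(batch, num_patches).argsort(dim=1)
    keep_idx, mask_idx = perm[:, :num_keep], perm[:, num_keep:]
    visible = torch.gather(patches, 1, keep_idx.unsqueeze(-1).expand(-1, -1, dim))
    # Hypothetical interface: the model reconstructs the masked positions
    # from the visible tokens.
    pred = model(visible, mask_idx)
    target = torch.gather(patches, 1, mask_idx.unsqueeze(-1).expand(-1, -1, dim))
    return ((pred - target) ** 2).mean()  # loss on masked patches only
```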

Downstream evaluations

Evaluation spans a range of tasks: human activity sensing, signal and modulation classification, 5G NR and UWB positioning, beam prediction, and more.

Fine-tuning regimes

Adapt with linear probing (LP), partial fine-tuning (FT2), or low-rank adaptation (LoRA).
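
The sketch below shows how the three regimes could differ in which parameters are trained; the layer names (e.g. backbone.encoder.layers) and the LoRA wrapper are illustrative assumptions, not the released recipes:

```python
import torch.nn as nn

def linear_probe(backbone: nn.Module, head: nn.Module) -> nn.Module:
    # LP: freeze the entire backbone and train only the task head.
    for p in backbone.parameters():
        p.requires_grad = False
    return nn.Sequential(backbone, head)

def partial_finetune(backbone: nn.Module, head: nn.Module, last_n: int = 2) -> nn.Module:
    # FT2: unfreeze only the last two encoder blocks plus the head.
    # Assumes the backbone exposes its blocks as backbone.encoder.layers.
    for p in backbone.parameters():
        p.requires_grad = False
    for block in list(backbone.encoder.layers)[-last_n:]:
        for p in block.parameters():
            p.requires_grad = True
    return nn.Sequential(backbone, head)

class LoRALinear(nn.Module):
    # LoRA: keep the pretrained weight frozen and learn a low-rank update B @ A.
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        self.A = nn.Linear(base.in_features, rank, bias=False)
        self.B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.B.weight)  # low-rank update starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.B(self.A(x))
```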

Reproducibility

Docs, recipes, and versioned releases.

Paper

Main reference for the current model and benchmark suite.