Tracking the Invisible: Self-Supervised Wheel Track Detection with Thermal-RGB Fusion
Under review at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2026

Motivation
Autonomous systems struggle in snow-covered environments where visual cues disappear and terrain properties become homogeneous. Traditional perception systems fail to leverage the most reliable navigation cues available in these conditions: the physical imprint of previously traversed paths. This research addresses the critical need for robust wheel track detection to enable safe winter autonomy.
Solution
Following existing tracks provides verified path stability through compacted snow, predictable vehicle dynamics from preserved kinematic patterns, and optimal traction from pre-deformed terrain. These physical advantages persist even in white-out conditions, offering a failsafe navigation solution when other features vanish.
Methods
We present a wheel track detection system that leverages RGB-Thermal (RGB-T) imaging. The thermal channel reveals temperature differentials between compacted tracks and loose snow: tracks exhibit higher thermal inertia and lower reflectivity, so they emit stronger radiation signatures even in visually homogeneous conditions. By fusing these distinctive thermal patterns with RGB spatial information, our method reliably identifies navigable tracks, enabling robust path-following in complete white-out conditions where snow textures and terrain features become indistinguishable.

1. Self-supervised learning

- Get vehicle trajectories from recorded GNSS/LiDAR data
- Calculate wheel track contact points from vehicle trajectories
- Project wheel track contact points to images as labels
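
The three labeling steps above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a planar vehicle pose (x, y, yaw), a hypothetical track width, and a forward-facing pinhole camera with no roll or pitch mounted at a known height. The intrinsics and all function names are invented for the example.

```python
import math


def wheel_contact_points(x, y, yaw, track_width):
    """Left and right wheel ground contacts for a planar pose.

    World frame (assumed): x forward, y left, z up, ground at z = 0.
    """
    half = track_width / 2.0
    # Unit vector pointing to the vehicle's left.
    left_x, left_y = -math.sin(yaw), math.cos(yaw)
    left = (x + half * left_x, y + half * left_y, 0.0)
    right = (x - half * left_x, y - half * left_y, 0.0)
    return left, right


def world_to_camera(point, cam_height):
    """Express a world point in the camera frame (X right, Y down, Z forward).

    Assumes the camera sits above the world origin at cam_height,
    looking along +x with no roll or pitch.
    """
    xw, yw, zw = point
    return (-yw, cam_height - zw, xw)


def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection; returns None for points behind the camera."""
    X, Y, Z = point_cam
    if Z <= 0:
        return None
    return (fx * X / Z + cx, fy * Y / Z + cy)


def label_pixels(poses, track_width, cam_height, fx, fy, cx, cy, size=512):
    """Project every wheel contact point and keep those inside the image."""
    pixels = []
    for x, y, yaw in poses:
        for contact in wheel_contact_points(x, y, yaw, track_width):
            uv = project(world_to_camera(contact, cam_height), fx, fy, cx, cy)
            if uv is None:
                continue
            u, v = round(uv[0]), round(uv[1])
            if 0 <= u < size and 0 <= v < size:
                pixels.append((u, v))
    return pixels
```

In a real setup the poses would come from the recorded GNSS/LiDAR trajectory and the camera model from calibration; the returned pixels would then be rasterized into a label mask.
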
2. Image fusion

- The RGB image and thermal image are downsampled to 512 x 512 pixels
- Image alignment via a static homography transformation
- Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied to enhance the contrast of aligned images
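
As an illustration of the alignment step, the sketch below maps thermal pixel coordinates into the RGB frame with a fixed 3 x 3 homography. The mapping direction and the matrix values are assumptions made for the example, not the calibration used in this work, and `apply_homography` is a hypothetical helper. Resizing and CLAHE are not shown.

```python
def apply_homography(H, u, v):
    """Map pixel (u, v) through a 3x3 homography H given as nested lists."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to get back to pixel units.
    return x / w, y / w


# Placeholder calibration: slight scale plus a shift (illustrative values only).
H_THERMAL_TO_RGB = [
    [1.05, 0.0, 12.0],
    [0.0, 1.05, -8.0],
    [0.0, 0.0, 1.0],
]

u_rgb, v_rgb = apply_homography(H_THERMAL_TO_RGB, 100.0, 200.0)
```

A static homography can be estimated once from corresponding points in the two cameras and then reused for every frame, which fits a rigidly mounted sensor pair.
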
3. Encoder-decoder design

- Use ResNet-18 as the encoder backbone
- Attention gates focus the network on wheel tracks by amplifying the rare track pixels, addressing severe class imbalance
- Atrous (dilated) convolutions preserve track boundaries while capturing multi-scale context, handling spatial imbalance
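
To show why atrous convolutions add context without adding weights, here is a small 1D sketch. It is a generic illustration of dilation, not the dilated block used in this network, and both function names are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a kernel whose taps are spaced `dilation` apart."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D atrous convolution (cross-correlation, as in deep learning)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Taps are read every `dilation` samples instead of contiguously.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


# A 3-tap kernel covers 3, 5, and 9 samples at dilations 1, 2, and 4.
spans = [effective_kernel_size(3, d) for d in (1, 2, 4)]
```

Stacking or running several dilation rates in parallel is a common way to gather multi-scale context while keeping the feature map resolution unchanged.
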
4. Model training and validation
- 10 epochs with a batch size of 4 and an image size of 512 x 512
- Adam optimizer with a learning rate scheduler starting at 1e-4
- L2 regularizer
Results
- The designed neural network outputs a pixel-wise probability map
- Each value ∈ [0,1] represents the predicted likelihood of the corresponding image pixel belonging to a wheel track
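
One simple way to turn the probability map into a binary track mask is a fixed threshold. The 0.5 cutoff below is an assumption for illustration; the threshold actually used is not stated here, and the helper names are hypothetical.

```python
def binarize(prob_map, threshold=0.5):
    """Convert a 2D probability map (nested lists) to a 0/1 mask."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


def track_fraction(mask):
    """Fraction of pixels labeled as wheel track."""
    total = sum(len(row) for row in mask)
    return sum(map(sum, mask)) / total if total else 0.0


example_mask = binarize([[0.1, 0.5], [0.9, 0.49]])
```

The track fraction gives a quick check on how sparse the positive class is in a given frame.
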
Ablation study on thermal image

Ablation study on attention gate and atrous dilated block

Dataset
The ThermalTrack dataset can be downloaded here: ThermalTrack
We derived our ThermalTrack dataset from WADS, a specialized dataset designed for autonomous vehicle and robotics research in inclement winter weather. WADS focuses exclusively on challenging conditions such as heavy snowfall and occasional white-out scenarios, providing a robust platform for testing and development. It comprises images captured across diverse conditions, with variations in road surface, weather, lighting, and terrain type.
To enhance the robustness and generalization of our model, we augmented the dataset with random flip, rotation, brightness, contrast, scaling, noise injection, and perspective change. These transformations simulate a wider variety of challenging road conditions and improve the model’s ability to handle real-world variability. Augmentation expanded 951 images to 7608 in total for model training.
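
The counts above correspond to eight training images per source image (7608 = 8 x 951). The sketch below illustrates one point that matters for segmentation data: geometric transforms must be applied to the image and its label mask together, while photometric transforms touch only the image. It covers just flip and brightness, the brightness range is an assumed value, and the helpers are hypothetical rather than the augmentation code used for ThermalTrack.

```python
import random


def hflip(grid):
    """Mirror a 2D grid (nested lists) left to right."""
    return [row[::-1] for row in grid]


def adjust_brightness(image, delta):
    """Shift 8-bit pixel values by delta, clamped to [0, 255]."""
    return [[min(255, max(0, px + delta)) for px in row] for row in image]


def augment(image, mask, rng):
    """Return one augmented (image, mask) pair."""
    if rng.random() < 0.5:
        # Geometric change: apply to image and mask so labels stay aligned.
        image, mask = hflip(image), hflip(mask)
    # Photometric change: image only, the mask is unaffected.
    image = adjust_brightness(image, rng.randint(-30, 30))
    return image, mask


rng = random.Random(0)  # seeded for reproducibility
aug_image, aug_mask = augment([[100, 150, 200]], [[0, 0, 1]], rng)
```
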
License
This data is made available under the Creative Commons Attribution-NonCommercial-ShareAlike license. You are free to share and adapt the data, but you must give appropriate credit, may not use the work for commercial purposes, and must distribute any adaptations under the same license.
Maintainers
- Yiming Yang yimingya@mtu.edu
- Jeremy P. Bos jpbos@mtu.edu
