From Data Uncertainty to Spectral Consistency: Curriculum-Derived Semantic Understanding in GPS-Denied Maritime Environments

Alastair Finlinson

doi:10.15126/thesis.901948

Maritime vision systems, whether deployed on unmanned surface vessels, aerial drones, or coastal surveillance cameras, rely on accurate extraction of three sparse semantic boundaries: the Visible Horizon Line (VHL), True Horizon Line (THL), and Shoreline (SL). These boundaries are used to stabilise cameras, geo-register imagery to digital elevation models, and provide last-resort localisation when GPS is jammed or denied. Yet these contours are challenging to detect in scenes dominated by vast, low-texture sky-and-water regions, specular reflections, wave clutter, haze, and rapidly changing weather; fewer than 1% of pixels belong to the true boundaries, and horizon cues often blur into clouds or disappear behind vessel wakes. Classical Canny–Hough or super-pixel heuristics collapse under such conditions, while generic deep networks demand dense, costly annotation. To overcome these obstacles, this thesis introduces a unified, data-efficient framework, built from four complementary advances, that achieves state-of-the-art semantic-boundary detection and multi-boundary estimation for robust, GPS-denied maritime navigation.

The first work builds a framework on top of a streamlined, low-latency UNet-like architecture that ingests RGB frames alongside inertial measurement unit (IMU) streams, enabling cross-modal feature alignment. Evaluated on multiple benchmarks, the network achieves highly accurate predictions for the key semantic boundaries while sustaining real-time performance. It is a first step towards extracting multiple semantic boundaries from a single image using a simple framework and, to the best of our knowledge, the first work to attempt to extract all three semantic boundaries simultaneously within real-time constraints. Boundary extraction applies to many autonomous platforms, including UAVs, USVs, and land-based vehicles, all of which require the detection of surface edges and horizons. Improved contour detection benefits downstream tasks such as safer navigation in complex environments, obstacle detection, and collision avoidance.

Second, recognising that even modest label noise severely degrades thin-boundary prediction, we introduce a novel lightweight Edge Prior Module (EPM) that guides early layers to suppress water-reflection artefacts and sky texture. The proposed framework predicts multiple high-quality semantic boundaries in complex scenarios. The maritime environment presents numerous obstacles and photometric distortions that make the task particularly challenging for classical methods. We train a robust, deployable model that copes with varying lighting conditions, obfuscations, and reflections from water surfaces.

Third, to address the pixel-level class imbalance and accelerate convergence, we devise curriculum learning with Fourier Spectral Alignment (FSA). The FSA loss aligns the Fourier spectra of the predictions and ground truth, forcing the model to capture both low-frequency global shape and high-frequency edge detail. The method demonstrates a significant improvement over the current state of the art in semantic segmentation on the LaRS dataset, setting a new state of the art.
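To make the idea of spectral alignment concrete, the following is a minimal sketch only. The abstract does not give the form of the FSA loss, so this example assumes one plausible variant: the mean magnitude of the difference between the 2-D discrete Fourier transforms of a predicted and a ground-truth boundary mask. The function names `dft2` and `spectral_alignment_loss` are hypothetical, and the naive transform is written in pure Python for readability, not speed.

```python
import cmath
import math


def dft2(grid):
    """Naive 2-D discrete Fourier transform of a small rectangular grid."""
    rows, cols = len(grid), len(grid[0])

    def dft1(seq):
        n = len(seq)
        return [
            sum(seq[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)
        ]

    # Transform every row, then every column of the row-transformed grid.
    row_f = [dft1(row) for row in grid]
    col_f = [dft1([row_f[r][c] for r in range(rows)]) for c in range(cols)]
    # col_f is indexed [column][row frequency]; transpose back to [u][v].
    return [[col_f[c][u] for c in range(cols)] for u in range(rows)]


def spectral_alignment_loss(pred, target):
    """Assumed loss form: mean |F(pred) - F(target)| over all frequencies."""
    fp, ft = dft2(pred), dft2(target)
    rows, cols = len(pred), len(pred[0])
    total = sum(abs(fp[u][v] - ft[u][v]) for u in range(rows) for v in range(cols))
    return total / (rows * cols)


# Toy 4x4 masks: a horizon on row 1, and the same horizon shifted to row 2.
target = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]

print(spectral_alignment_loss(target, target))   # identical masks
print(spectral_alignment_loss(shifted, target))  # misplaced horizon is penalised
```

Comparing complex coefficients rather than amplitudes alone keeps the loss sensitive to where the boundary lies, since a circular shift changes only the phase of the spectrum. In practice a batched FFT from a tensor library would replace the naive transform.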

Fourth, to curb data and annotation costs, we introduce policy-driven curriculum distillation, where image selection is framed as a Markov decision process solved by a Deep Q-Network (DQN). At each epoch, the agent evaluates image embeddings, predicted difficulty, and past validation gains to assign a utility score, admitting only the most informative samples. This saliency-informed curation compresses the training set while preserving key performance metrics relative to full-data training, reducing annotation effort and GPU compute. Because the policy re-scores incoming data in real time, it can continuously adapt the distilled subset to evolving operational scenarios while retaining the model’s predictive strength.
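The selection loop described above can be illustrated with a deliberately simplified stand-in. The sketch below uses tabular Q-learning, not a Deep Q-Network, and assumes a state made of a coarse difficulty bucket, two actions (`skip` and `admit`), and a reward equal to the validation gain observed after training on the sample. All of these names and values are hypothetical and are not taken from the thesis.

```python
ACTIONS = ("skip", "admit")


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', a')."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose(q, state):
    """Greedy policy: admit a sample only if admitting has the higher Q-value."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


# Hypothetical experience: admitting "hard" images improved validation
# performance, so the learned value of admitting them rises above skipping.
q_table = {}
first = q_update(q_table, "hard", "admit", reward=1.0, next_state="hard")
second = q_update(q_table, "hard", "admit", reward=1.0, next_state="hard")

print(first, second)
print(choose(q_table, "hard"))  # policy for a bucket with positive experience
print(choose(q_table, "easy"))  # unseen bucket: ties resolve to "skip"
```

A DQN replaces the lookup table with a network that maps image embeddings and training history to Q-values, which lets the policy score images it has never seen.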

Collectively, these four contributions (multi-contour semantic segmentation, edge-prior conditioning, spectral curriculum learning, and reinforcement learning-guided dataset distillation) compose a principled pipeline for maritime boundary detection. The resulting system sets new state-of-the-art results across ReMaSTrED300, MaSTr1325, and LaRS, achieves robust, real-time performance in GPS-denied conditions, and reduces data requirements by more than half. Beyond maritime navigation, the proposed techniques generalise to any sparse-label, edge-centric vision task, offering a scalable blueprint for boundary-aware perception in autonomous robotic navigation.
