Deep clustering analysis and representation learning for high-dimensional data

FOIVOS NTELEMIS

doi:10.15126/thesis.900536

Doctoral Thesis

Open access

Doctor of Philosophy (PhD), University of Surrey

30/11/2022

DOI: https://doi.org/10.15126/thesis.900536

Abstract

Advanced technologies have accelerated the collection and storage of unlabeled, high-dimensional data, such as visual data. Manually annotating these large datasets is time-consuming and offers only a temporary solution, since the annotation relates to one particular dataset. Machine learning algorithms are capable of managing and analysing this large volume of data. However, traditional clustering techniques are inadequate for data of such high dimensionality. In this thesis, we aim to overcome the limitations of traditional clustering methods by proposing a deep framework trained in two phases. The first phase consists of a generative adversarial network (GAN) used as a feature extraction pipeline. To improve the GAN's capacity, we introduce pre-defined kernel filters that encourage the identification of visual edges. The second phase deploys an auxiliary classifier which clusters the extracted features. Although the proposed framework demonstrates promising results, the two-phase training increases the computational burden and the number of hyper-parameters. Therefore, we develop an innovative deep clustering framework that can be optimised in a single training phase. The GAN module is replaced with a grouping-based self-supervised learning (SSL) strategy, and by leveraging its learning process, clustering is achieved in real time with the incorporation of mutual information as an objective function. Despite the high accuracy that this framework achieves, its SSL strategy relies on transformation schemes that apply exclusively to visual data. To address this constraint, our last work focuses on developing a generic SSL method that does not require an explicit transformation scheme to be defined for particular datasets. To accomplish this task, an internal transformation mechanism is introduced whose formulation is not limited to specific data types and can be widely applied to visual, audio, text, or mass spectrometry data.
All our proposed frameworks are evaluated with detailed ablation studies, compared with the latest state-of-the-art methods, and demonstrate high clustering precision and strong representation learning across a broad range of datasets.
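
As an illustration of how mutual information can serve as a clustering objective, the following is a minimal sketch and not the thesis's implementation. It assumes (this record does not say so) that each sample and a transformed view of it receive soft cluster assignments, and that training would maximise the mutual information between the two sets of assignments. The function name `mutual_information` and the symmetrisation step are hypothetical choices made for this example.

```python
import math


def mutual_information(p_a, p_b, eps=1e-12):
    """Mutual information between two sets of soft cluster assignments.

    p_a, p_b: lists of rows, one row per sample, each row a probability
    vector over k clusters. Row i of p_a and row i of p_b are assumed
    (for this sketch) to describe two views of the same sample.
    """
    n = len(p_a)
    k = len(p_a[0])

    # Empirical joint distribution over cluster pairs, averaged over samples.
    joint = [[0.0] * k for _ in range(k)]
    for row_a, row_b in zip(p_a, p_b):
        for i in range(k):
            for j in range(k):
                joint[i][j] += row_a[i] * row_b[j] / n

    # Symmetrise (the order of the two views is arbitrary) and clamp away
    # from zero so the logarithm below is always defined.
    joint = [
        [max((joint[i][j] + joint[j][i]) / 2.0, eps) for j in range(k)]
        for i in range(k)
    ]
    total = sum(sum(row) for row in joint)
    joint = [[v / total for v in row] for row in joint]

    # Marginals of the joint distribution.
    marg_a = [sum(row) for row in joint]
    marg_b = [sum(joint[i][j] for i in range(k)) for j in range(k)]

    # I(A; B) = sum_ij P(i, j) * log( P(i, j) / (P(i) * P(j)) )
    return sum(
        joint[i][j] * math.log(joint[i][j] / (marg_a[i] * marg_b[j]))
        for i in range(k)
        for j in range(k)
    )


# Two views that agree on balanced, confident assignments.
aligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
# Assignments that carry no cluster information at all.
uniform = [[0.5, 0.5]] * 4

mi_aligned = mutual_information(aligned, aligned)
mi_uniform = mutual_information(uniform, uniform)
```

In this sketch the score is highest when the two views agree and the clusters are used evenly, and it falls to zero when the assignments are uninformative, which is why a quantity of this kind discourages the degenerate solution of placing every sample in one cluster.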

Files and links (1)

Foivos_Ntelemis_thesis_Fivos Ntelemis (PDF, 10.78 MB)

CC BY-NC-SA V4.0, Open Access

Details

Title: Deep clustering analysis and representation learning for high-dimensional data
Creators: FOIVOS NTELEMIS - University of Surrey, Department of Computer Science
Contributors: YAOCHU JIN (Supervisor) - University of Surrey, Department of Computer Science
SPENCER A THOMAS (Supervisor)
HONGYING LILIAN TANG (Supervisor) - University of Surrey, Department of Computer Science
Awarding Institution: University of Surrey; Doctor of Philosophy (PhD)
Theses and Dissertations: Doctor of Philosophy (PhD), University of Surrey
Grants: Engineering and Physical Sciences Research Council (United Kingdom, Swindon) - EPSRC
Identifiers: 99695965902346
Academic Unit: Department of Computer Science
Resource Type: Doctoral Thesis
