Investigating unimodal isolated signer-independent sign language recognition

Marais, Marc Jason

Title: Investigating unimodal isolated signer-independent sign language recognition
Creator: Marais, Marc Jason
ThesisAdvisor: Brown, Dane
ThesisAdvisor: Connan, James
Subject: Convolutional neural network
Subject: Sign language recognition
Subject: Human activity recognition
Subject: Pattern recognition systems
Subject: Neural networks (Computer science)
Date: 2024-04-04
Type: Academic theses
Type: Master's theses
Type: text
Identifier: http://hdl.handle.net/10962/435343
Identifier: vital:73149
Description: Sign language serves as the mode of communication for the Deaf and Hard of Hearing community, embodying a rich linguistic and cultural heritage. Recent developments in Sign Language Recognition (SLR) systems aim to facilitate seamless communication between the Deaf community and broader society. However, most existing systems are limited by signer-dependent models, which hinders their adaptability to diverse signing styles and signers and impedes their practical implementation in real-world scenarios. This research explores various unimodal approaches, both pose-based and vision-based, for isolated signer-independent SLR using RGB video input on the LSA64 and AUTSL datasets. The unimodal RGB-only input strategy provides a realistic SLR setting where alternative data sources are either unavailable or necessitate specialised equipment. Through systematic testing scenarios, isolated signer-independent SLR experiments are conducted on both datasets, focusing primarily on AUTSL, a signer-independent dataset. The vision-based R(2+1)D-18 model emerged as the top performer, achieving 90.64% accuracy on the unseen AUTSL dataset test split, closely followed by the pose-based Spatio-Temporal Graph Convolutional Network (ST-GCN) model with an accuracy of 89.95%. Furthermore, these models achieved comparable accuracies at a significantly lower computational demand. Notably, the pose-based approach demonstrates robust generalisation to substantial background and signer variation. Moreover, the pose-based approach demands significantly less computational power and training time than vision-based approaches. Both the proposed unimodal pose-based and vision-based systems were found to be effective at classifying sign classes in the LSA64 and AUTSL datasets.
Description: Thesis (MSc) -- Faculty of Science, Computer Science, 2024
Format: computer, online resource, application/pdf, 1 online resource (154 pages), pdf
Publisher: Rhodes University, Faculty of Science, Computer Science
Language: English
Rights: Marais, Marc Jason
Rights: Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-ShareAlike" License (http://creativecommons.org/licenses/by-nc-sa/2.0/)

Collections

RU Department of Computer Science

File: MARAIS-MSC-TR24-42.pdf (SOURCE1, 2 MB, Adobe Acrobat PDF)