- Title
- Investigating unimodal isolated signer-independent sign language recognition
- Creator
- Marais, Marc Jason
- ThesisAdvisor
- Brown, Dane
- ThesisAdvisor
- Connan, James
- Subject
- Convolutional neural network
- Subject
- Sign language recognition
- Subject
- Human activity recognition
- Subject
- Pattern recognition systems
- Subject
- Neural networks (Computer science)
- Date
- 2024-04-04
- Type
- Academic theses
- Type
- Master's theses
- Type
- text
- Identifier
- http://hdl.handle.net/10962/435343
- Identifier
- vital:73149
- Description
- Sign language serves as the mode of communication for the Deaf and Hard of Hearing community, embodying a rich linguistic and cultural heritage. Recent Sign Language Recognition (SLR) system developments aim to facilitate seamless communication between the Deaf community and the broader society. However, most existing systems are limited by signer-dependent models, hindering their adaptability to diverse signing styles and signers, thus impeding their practical implementation in real-world scenarios. This research explores various unimodal approaches, both pose-based and vision-based, for isolated signer-independent SLR using RGB video input on the LSA64 and AUTSL datasets. The unimodal RGB-only input strategy provides a realistic SLR setting where alternative data sources are either unavailable or necessitate specialised equipment. Through systematic testing scenarios, isolated signer-independent SLR experiments are conducted on both datasets, primarily focusing on AUTSL – a signer-independent dataset. The vision-based R(2+1)D-18 model emerged as the top performer, achieving 90.64% accuracy on the unseen AUTSL dataset test split, closely followed by the pose-based Spatio-Temporal Graph Convolutional Network (ST-GCN) model with an accuracy of 89.95%. Furthermore, these models achieved comparable accuracies at a significantly lower computational demand. Notably, the pose-based approach demonstrates robust generalisation to substantial background and signer variation. Moreover, the pose-based approach demands significantly less computational power and training time than vision-based approaches. The proposed unimodal pose-based and vision-based systems were both concluded to be effective at classifying sign classes in the LSA64 and AUTSL datasets.
- Description
- Thesis (MSc) -- Faculty of Science, Computer Science, 2024
- Format
- computer, online resource, application/pdf, 1 online resource (154 pages), pdf
- Publisher
- Rhodes University, Faculty of Science, Computer Science
- Language
- English
- Rights
- Marais, Marc Jason
- Rights
- Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-ShareAlike" License (http://creativecommons.org/licenses/by-nc-sa/2.0/)
File | Description | Size | Format
---|---|---|---
SOURCE1 | MARAIS-MSC-TR24-42.pdf | 2 MB | Adobe Acrobat PDF