Improving signer-independence using pose estimation and transfer learning for sign language recognition
- Marais, Marc, Brown, Dane L, Connan, James, Boby, Alden
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463406 , vital:76406 , https://doi.org/10.1007/978-3-031-35644-5
- Description: Automated Sign Language Recognition (SLR) aims to bridge the communication gap between the hearing and the hearing disabled. Computer vision and deep learning lie at the forefront in working toward these systems. Most SLR research focuses on signer-dependent SLR and fails to account for variations across different signers who gesticulate naturally. This paper investigates signer-independent SLR on the LSA64 dataset, focusing on different feature extraction approaches. Two approaches are proposed: an InceptionV3-GRU architecture, which uses raw images as input, and a pose estimation LSTM architecture. MediaPipe Holistic is implemented to extract pose estimation landmark coordinates. A final third model applies augmentation and transfer learning using the pose estimation LSTM model. The research found that the pose estimation LSTM approach achieved the best performance with an accuracy of 80.22%. MediaPipe Holistic struggled with the augmentations introduced in the final experiment. Thus, looking into introducing more subtle augmentations may improve the model. Overall, the system shows significant promise toward addressing the real-world signer-independence issue in SLR.
- Date Issued: 2022
Iterative Refinement Versus Generative Adversarial Networks for Super-Resolution Towards Licence Plate Detection
- Boby, Alden, Brown, Dane L, Connan, James
- Authors: Boby, Alden , Brown, Dane L , Connan, James
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463417 , vital:76407 , https://link.springer.com/chapter/10.1007/978-981-99-1624-5_26
- Description: Licence plate detection in unconstrained scenarios can be difficult because of the medium used to capture the data. Such data is not captured at very high resolution for practical reasons. Super-resolution can be used to improve the resolution of an image with fidelity beyond that of non-machine learning-based image upscaling algorithms such as bilinear or bicubic upscaling. Technological advances have introduced more than one way to perform super-resolution, with the best results coming from generative adversarial networks and iterative refinement with diffusion-based models. This paper puts the two best-performing super-resolution models against each other to see which is best for licence plate super-resolution. Quantitative results favour the generative adversarial network, while qualitative results lean towards the iterative refinement model.
- Date Issued: 2022