Enabling Vehicle Search Through Robust Licence Plate Detection
- Authors: Boby, Alden , Brown, Dane L , Connan, James , Marais, Marc , Kuhlane, Luxolo L
- Date: 2023
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463372 , vital:76403 , xlink:href="https://ieeexplore.ieee.org/abstract/document/10220508"
- Description: Licence plate recognition has many practical applications in security and surveillance. This paper presents a robust licence plate detection system that uses string-matching algorithms to identify a target vehicle in video data. Object detection models have had limited application in the character recognition domain. The system utilises the YOLO object detection model to perform character recognition, ensuring more accurate character predictions, and incorporates super-resolution techniques to enhance the quality of licence plate images and increase character recognition accuracy. The proposed system accurately detects licence plates in diverse conditions and handles licence plates with varying fonts and backgrounds. Its effectiveness is demonstrated through experimentation on individual components, showing promising licence plate detection and character recognition accuracy. The complete system combines these components to track vehicles by matching a target string against licence plates detected in a scene. The system has potential applications in law enforcement, traffic management, and parking systems, and can significantly advance surveillance and security through automation.
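The string-matching step described in the abstract can be sketched as a fuzzy comparison, so a plate mis-read by one character still matches its target. This is an illustrative assumption using Python's difflib, not the paper's exact algorithm; the threshold is hypothetical:

```python
from difflib import SequenceMatcher

def plate_similarity(target: str, detected: str) -> float:
    """Similarity ratio in [0, 1] between a target plate and an OCR'd detection."""
    return SequenceMatcher(None, target.upper(), detected.upper()).ratio()

def find_vehicle(target, detections, threshold=0.8):
    """Return the detected plate best matching the target, tolerating OCR errors."""
    best = max(detections, key=lambda d: plate_similarity(target, d), default=None)
    if best is not None and plate_similarity(target, best) >= threshold:
        return best
    return None

# A plate with one mis-read character ('O' instead of '6') still matches.
print(find_vehicle("CA123456", ["GP998877", "CA12345O", "EC555123"]))
```

A tolerant match like this is what lets the system confirm a target vehicle even when super-resolution and OCR leave one or two characters wrong.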
- Full Text:
- Date Issued: 2023
Real-Time Detecting and Tracking of Squids Using YOLOv5
- Authors: Kuhlane, Luxolo L , Brown, Dane L , Marais, Marc
- Date: 2023
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463467 , vital:76411 , xlink:href="https://ieeexplore.ieee.org/abstract/document/10220521"
- Description: This paper proposes a real-time system for detecting and tracking squids using the YOLOv5 object detection algorithm. The system utilizes a large dataset of annotated squid images and videos to train a YOLOv5 model optimized for detecting and tracking squids. The model is fine-tuned to minimize false positives and optimize detection accuracy. The system is deployed on a GPU-enabled device for real-time processing of video streams and tracking of detected squids across frames. The accuracy and speed of the system make it a valuable tool for marine scientists, conservationists, and fishermen to better understand the behavior and distribution of these elusive creatures. Future work includes incorporating additional computer vision techniques and sensor data to improve tracking accuracy and robustness.
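Tracking detections across frames, as the abstract describes, is commonly done by associating each frame's boxes with existing tracks. The paper does not name its tracker, so the greedy IoU-based association below is a hypothetical sketch:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if inter else 0.0

def update_tracks(tracks, detections, next_id, iou_min=0.3):
    """Greedily assign this frame's detections to existing tracks by IoU;
    unmatched detections start new tracks."""
    updated = {}
    for det in detections:
        best_id, best_iou = None, iou_min
        for tid, box in tracks.items():
            if tid not in updated and iou(box, det) > best_iou:
                best_id, best_iou = tid, iou(box, det)
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        updated[best_id] = det
    return updated, next_id

tracks, next_id = update_tracks({}, [(10, 10, 50, 50)], 0)   # new squid gets id 0
tracks, next_id = update_tracks(tracks, [(12, 12, 52, 52)], next_id)
print(tracks)  # the overlapping box in the next frame keeps id 0
```

Real deployments typically add motion prediction (e.g. a Kalman filter) on top of the association step; this sketch shows only the core idea of identity persistence across frames.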
- Full Text:
- Date Issued: 2023
Spatiotemporal Convolutions and Video Vision Transformers for Signer-Independent Sign Language Recognition
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2023
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463478 , vital:76412 , xlink:href="https://ieeexplore.ieee.org/abstract/document/10220534"
- Description: Sign language is a vital tool of communication for individuals who are deaf or hard of hearing. Sign language recognition (SLR) technology can assist in bridging the communication gap between deaf and hearing individuals. However, existing SLR systems are typically signer-dependent, requiring training data from the specific signer for accurate recognition. This presents a significant challenge for practical use, as collecting data from every possible signer is not feasible. This research focuses on developing a signer-independent isolated SLR system to address this challenge. The system implements two model variants on the signer-independent datasets: an R(2+1)D spatiotemporal convolutional block and a Video Vision Transformer (ViViT). These models learn to extract features from raw sign language videos from the LSA64 dataset and classify signs without needing handcrafted features, explicit segmentation or pose estimation. Overall, the R(2+1)D model architecture significantly outperformed the ViViT architecture for signer-independent SLR on the LSA64 dataset, achieving a near-perfect accuracy of 99.53% on the unseen test set against the ViViT model's 72.19%, demonstrating that spatiotemporal convolutions are effective for signer-independent SLR.
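An R(2+1)D block factorises a t×d×d 3D convolution into a 1×d×d spatial convolution followed by a t×1×1 temporal one, choosing the intermediate channel count so the factorised block keeps roughly the 3D convolution's parameter budget. The sketch below applies that standard parameter-matching rule from the R(2+1)D formulation; the channel and kernel sizes are illustrative, not taken from this paper:

```python
from math import floor

def r2plus1d_mid_channels(n_in, n_out, t=3, d=3):
    """Intermediate channel count M so the (2+1)D pair matches the
    parameter budget of a full t x d x d 3D convolution."""
    return floor(t * d * d * n_in * n_out / (d * d * n_in + t * n_out))

def params_3d(n_in, n_out, t=3, d=3):
    """Weights in one t x d x d 3D convolution (bias ignored)."""
    return n_in * n_out * t * d * d

def params_2plus1d(n_in, n_out, t=3, d=3):
    """Weights in the factorised spatial (1xdxd) + temporal (tx1x1) pair."""
    m = r2plus1d_mid_channels(n_in, n_out, t, d)
    return n_in * m * d * d + m * n_out * t

print(params_3d(64, 64), params_2plus1d(64, 64))  # same budget, 110592 each
```

Keeping the parameter count fixed while adding a nonlinearity between the spatial and temporal convolutions is what gives the factorised block extra representational power over plain 3D convolution.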
- Full Text:
- Date Issued: 2023
An evaluation of hand-based algorithms for sign language recognition
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465124 , vital:76575 , xlink:href="https://ieeexplore.ieee.org/abstract/document/9856310"
- Description: Sign language recognition is an evolving research field in computer vision that assists communication between deaf and hearing people. Hand gestures carry the majority of the information when signing, so focusing feature extraction on the information stored in hand data may improve classification accuracy. Pose estimation is a popular method for extracting body and hand landmarks. We implement and compare different feature extraction and segmentation algorithms, focusing only on the hands, using the LSA64 dataset. To extract hand landmark coordinates, MediaPipe Holistic is applied to the sign images. Classification is performed using popular CNN architectures, namely ResNet and a pruned VGG network. A separate 1D-CNN is utilised to classify the hand landmark coordinates extracted using MediaPipe. The best performance was achieved on the unprocessed raw images using a pruned VGG network, with an accuracy of 95.50%. However, the more computationally efficient model, using the hand landmark data and a 1D-CNN for classification, achieved an accuracy of 94.91%.
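MediaPipe's hand model returns 21 (x, y, z) landmarks per hand, so one hand flattens to a 63-value vector suitable as 1D-CNN input. The sketch below shows that preparation step; centring on the wrist (landmark 0) is a hypothetical preprocessing choice, not stated in the abstract:

```python
def normalise_hand(landmarks):
    """Centre 21 MediaPipe hand landmarks on the wrist (landmark 0)
    and flatten them into a 63-value feature vector."""
    wx, wy, wz = landmarks[0]
    flat = []
    for x, y, z in landmarks:
        flat.extend((x - wx, y - wy, z - wz))
    return flat

# Synthetic hand: a wrist plus 20 fingertip/joint landmarks.
hand = [(0.5, 0.5, 0.0)] + [(0.5 + i * 0.01, 0.5 - i * 0.01, 0.0)
                            for i in range(1, 21)]
vec = normalise_hand(hand)
print(len(vec))  # 63 features per hand for the 1D-CNN
```

Working on these 63 coordinates instead of full images is what makes the landmark-based 1D-CNN so much cheaper than the image CNNs while staying within a fraction of a percent of their accuracy.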
- Full Text:
- Date Issued: 2022
Deep Learning Approach to Image Deblurring and Image Super-Resolution using DeblurGAN and SRGAN
- Authors: Kuhlane, Luxolo L , Brown, Dane L , Connan, James , Boby, Alden , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465157 , vital:76578 , xlink:href="https://www.researchgate.net/profile/Luxolo-Kuhlane/publication/363257796_Deep_Learning_Approach_to_Image_Deblurring_and_Image_Super-Resolution_using_DeblurGAN_and_SRGAN/links/6313b5a01ddd44702131b3df/Deep-Learning-Approach-to-Image-Deblurring-and-Image-Super-Resolution-using-DeblurGAN-and-SRGAN.pdf"
- Description: Deblurring is the task of restoring a blurred image to a sharp one, retrieving the information lost to the blur. Image deblurring and super-resolution, as representative image restoration problems, have been studied for a decade. Due to their wide range of applications, numerous techniques have been proposed to tackle these problems, inspiring innovations for better performance. Deep learning has become a robust framework for many image processing tasks, including restoration. In particular, generative adversarial networks (GANs), proposed by [1], have demonstrated remarkable performance in generating plausible images. However, training GANs for image restoration is a non-trivial task. This research investigates optimization schemes for GANs that improve image quality by providing meaningful training objective functions. In this paper, we apply DeblurGAN and the Super-Resolution Generative Adversarial Network (SRGAN) to the chosen dataset.
- Full Text:
- Date Issued: 2022
Exploring the Incremental Improvements of YOLOv7 over YOLOv5 for Character Recognition
- Authors: Boby, Alden , Brown, Dane L , Connan, James , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463395 , vital:76405 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-35644-5_5"
- Description: Technological advances are being applied to many aspects of life to improve quality of living and efficiency, particularly through automation in industry. The growing number of vehicles on the road has created a need to monitor more vehicles than ever to enforce traffic rules. One way to identify a vehicle is through its licence plate, which contains a unique string of characters that makes it identifiable within an external database. Detecting characters on a licence plate using an object detector has only recently been explored. This paper uses the latest versions of the YOLO object detector to perform character recognition on licence plate images. It expands upon existing object detection-based character recognition by investigating how improvements in the framework translate to licence plate character recognition accuracy compared with character recognition based on older architectures. Results indicate that the newer YOLO models outperform older YOLO-based character recognition models such as CRNET.
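When an object detector performs character recognition, each character is a separate detection, so the plate string must be assembled by reading the boxes left to right. A minimal sketch of that assembly step; the detection tuple layout and confidence threshold are assumptions, not the paper's exact pipeline:

```python
def read_plate(char_detections, conf_min=0.5):
    """Assemble a plate string from per-character detections, each given as
    (label, confidence, x_centre), by ordering boxes left to right."""
    ordered = sorted(char_detections, key=lambda c: c[2])
    return "".join(label for label, conf, _ in ordered if conf >= conf_min)

# Detections arrive in arbitrary order; sorting by x recovers the string.
detections = [("3", 0.91, 120), ("C", 0.97, 20), ("A", 0.95, 55),
              ("1", 0.88, 90), ("4", 0.90, 150)]
print(read_plate(detections))  # CA134
```

For multi-row plates the same idea applies after first grouping boxes by vertical position, then sorting each row by x.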
- Full Text:
- Date Issued: 2022
Golf Swing Sequencing Using Computer Vision
- Authors: Marais, Marc , Bradshaw, Karen
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/464129 , vital:76479 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-04881-4_28"
- Description: Analysis of golf swing events is a valuable tool to aid all golfers in improving their swing. Image processing and machine learning enable an automated system to perform golf swing sequencing using images. The majority of existing swing sequencing systems rely on expensive camera equipment or a motion capture suit. An image-based swing classification system is proposed and evaluated on the GolfDB dataset. The system combines an automated golfer detector with traditional machine learning algorithms and a CNN to classify swing events. The best-performing classifier, the LinearSVM, achieved a recall score of 88.3% on the entire GolfDB dataset when combined with the golfer detector. Without golfer detection, the pruned VGGNet achieved a recall score of 87.9%, significantly better (>10.7%) than the traditional machine learning models. The proposed system also outperformed a Bi-LSTM deep learning approach to swing sequencing, which achieved a recall score of 76.1% on the same GolfDB dataset. Overall, the results are promising and work towards a system that can assist all golfers in swing sequencing without expensive equipment.
- Full Text:
- Date Issued: 2022
Improving signer-independence using pose estimation and transfer learning for sign language recognition
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463406 , vital:76406 , xlink:href="https://doi.org/10.1007/978-3-031-35644-5"
- Description: Automated Sign Language Recognition (SLR) aims to bridge the communication gap between the hearing and the hearing disabled. Computer vision and deep learning lie at the forefront of work toward these systems. Most SLR research focuses on signer-dependent SLR and fails to account for variation between signers, who gesticulate naturally. This paper investigates signer-independent SLR on the LSA64 dataset, focusing on different feature extraction approaches. Two approaches are proposed: an InceptionV3-GRU architecture, which uses raw images as input, and a pose estimation LSTM architecture. MediaPipe Holistic is implemented to extract pose estimation landmark coordinates. A final third model applies augmentation and transfer learning to the pose estimation LSTM model. The research found that the pose estimation LSTM approach achieved the best performance, with an accuracy of 80.22%. MediaPipe Holistic struggled with the augmentations introduced in the final experiment; introducing more subtle augmentations may therefore improve the model. Overall, the system shows significant promise toward addressing the real-world signer-independence issue in SLR.
- Full Text:
- Date Issued: 2022
Investigating signer-independent sign language recognition on the lsa64 dataset
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden , Kuhlane, Luxolo L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465179 , vital:76580 , xlink:href="https://www.researchgate.net/profile/Marc-Marais/publication/363174384_Investigating_Signer-Independ-ent_Sign_Language_Recognition_on_the_LSA64_Dataset/links/63108c7d5eed5e4bd138680f/Investigating-Signer-Independent-Sign-Language-Recognition-on-the-LSA64-Dataset.pdf"
- Description: Conversing with hearing-disabled people is a significant challenge; however, advancements in computer vision have significantly improved this through automated sign language recognition. One of the common issues in sign language recognition is signer-dependence, where variation arises between signers, who gesticulate naturally. Utilising the LSA64 dataset, a small-scale Argentinian isolated sign language dataset, we investigate signer-independent sign language recognition. An InceptionV3-GRU architecture is employed to extract and classify spatial and temporal information for automated sign language recognition. The signer-dependent approach yielded an accuracy of 97.03%, whereas the signer-independent approach achieved an accuracy of 74.22%. The signer-independent system shows promise towards addressing the real-world and common issue of signer-dependence in sign language recognition.
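Signer-independent evaluation means the test signers never appear in training, unlike the usual random split over clips. A minimal sketch of such a split, assuming LSA64's ten signers; which signers are held out is an illustrative choice:

```python
def signer_independent_split(samples, held_out_signers):
    """Split (signer_id, clip) samples so held-out signers never
    appear in the training set."""
    train = [s for s in samples if s[0] not in held_out_signers]
    test = [s for s in samples if s[0] in held_out_signers]
    return train, test

# Synthetic dataset: 10 signers, 3 clips each (LSA64 itself has 64 signs
# repeated 5 times per signer).
samples = [(signer, f"sign_{k}") for signer in range(10) for k in range(3)]
train, test = signer_independent_split(samples, held_out_signers={8, 9})
print(len(train), len(test))  # 24 6
```

The gap the abstract reports (97.03% vs 74.22%) is exactly what this protocol exposes: a model that memorises signer-specific motion style scores well on a random clip split but drops sharply once whole signers are withheld.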
- Full Text:
- Date Issued: 2022
Investigating the Effects of Image Correction Through Affine Transformations on Licence Plate Recognition
- Authors: Boby, Alden , Brown, Dane L , Connan, James , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465190 , vital:76581 , xlink:href="https://ieeexplore.ieee.org/abstract/document/9856380"
- Description: Licence plate recognition has many real-world applications in security and surveillance. In recent years, deep learning has been adopted to improve existing image-based licence plate recognition techniques. Object detectors, all of which are some form of convolutional neural network, are a popular choice for this task. The You Only Look Once (YOLO) framework and Region-Based Convolutional Neural Networks are popular models within this field. The Warped Planar Object Detector, a recent development by Zou et al., is a novel architecture inspired by YOLO and Spatial Transformer Networks. This paper compares the performance of the Warped Planar Object Detector and YOLO on licence plate recognition by training both models on the same data, directing their output to an Enhanced Super-Resolution Generative Adversarial Network to upscale the output images, and finally using an Optical Character Recognition engine to classify the detected characters.
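An affine correction like the one investigated here can be fitted from three corner correspondences, since an affine map has six degrees of freedom. The sketch below solves for the map with Cramer's rule and applies it to rectify a skewed plate; the corner coordinates are illustrative, not from the paper:

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def affine_from_points(src, dst):
    """Fit (a, b, c, d, e, f) with x' = ax + by + c, y' = dx + ey + f
    from three non-collinear corner correspondences via Cramer's rule."""
    A = [[x, y, 1] for x, y in src]
    base = det3(A)
    coeffs = []
    for col in range(2):          # solve once for x', once for y'
        rhs = [p[col] for p in dst]
        for i in range(3):
            M = [row[:] for row in A]
            for j in range(3):
                M[j][i] = rhs[j]  # replace column i with the targets
            coeffs.append(det3(M) / base)
    return coeffs

def apply_affine(t, point):
    a, b, c, d, e, f = t
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Map three corners of a skewed plate onto an upright 200x50 rectangle.
skewed = [(10, 30), (210, 50), (20, 80)]
T = affine_from_points(skewed, [(0, 0), (200, 0), (0, 50)])
print(apply_affine(T, (10, 30)))  # maps to roughly (0.0, 0.0)
```

In practice the same six coefficients would be used to resample every pixel of the plate crop; rectifying the plate before character recognition is the motivation for warped-plate detectors over axis-aligned boxes.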
- Full Text:
- Date Issued: 2022