A systematic methodology to evaluating optimised machine learning based network intrusion detection systems
- Authors: Chindove, Hatitye Ethridge
- Date: 2022-10-14
- Subjects: Intrusion detection systems (Computer security) , Machine learning , Computer networks Security measures , Principal components analysis
- Language: English
- Type: Academic theses , Master's theses , text
- Identifier: http://hdl.handle.net/10962/362774 , vital:65361
- Description: A network intrusion detection system (NIDS) is essential for mitigating computer network attacks in various scenarios. However, the increasing complexity of computer networks and attacks makes classifying unseen or novel network traffic challenging. Supervised machine learning (ML) techniques used in a NIDS can be affected by different scenarios. Thus, dataset recency, size, and applicability are essential factors when selecting and tuning a machine learning classifier. This thesis explores developing and optimising several supervised ML algorithms with relatively new datasets constructed to depict real-world scenarios. The methodology includes empirical analyses of systematic ML-based NIDS for a near real-world network system to improve intrusion detection. The thesis relies heavily on experiments for model assessment. Data preparation methods are explored, followed by feature engineering techniques. The model evaluation process involves three experiments testing against a validation, un-trained, and retrained set. These experiments compare several traditional machine learning and deep learning classifiers to identify the best NIDS model. Results show that per-model attention to feature scaling, feature selection methods and ML algorithm hyper-parameter tuning is an essential optimisation component. Distance-based ML algorithms performed much better with quantile transformation, whilst the tree-based algorithms performed better without scaling. Permutation importance performs as a feature selection method compared to feature extraction using Principal Component Analysis (PCA) when applied against all ML algorithms explored. Random forests, Support Vector Machines and recurrent neural networks consistently achieved the best results, with macro F1-scores of 90%, 81% and 73% on the CICIDS 2017 dataset, and 72%, 68% and 73% on the CICIDS 2018 dataset. , Thesis (MSc) -- Faculty of Science, Computer Science, 2022
- Date Issued: 2022-10-14
A multispectral and machine learning approach to early stress classification in plants
- Authors: Poole, Louise Carmen
- Date: 2022-04-06
- Subjects: Machine learning , Neural networks (Computer science) , Multispectral imaging , Image processing , Plant stress detection
- Language: English
- Type: Master's thesis , text
- Identifier: http://hdl.handle.net/10962/232410 , vital:49989
- Description: Crop loss and failure can impact both a country’s economy and food security, often with devastating effect. As such, successfully detecting plant stresses early in their development is essential to minimize spread and damage to crop production. Identification of the stress and the stress-causing agent is the most critical and challenging step in plant and crop protection. With the development of, and easier access to, new equipment and technology in recent years, the use of spectroscopy in the early detection of plant diseases has become notably popular. This thesis narrows down the most suitable multispectral imaging techniques and machine learning algorithms for early stress detection. Datasets of visible and multispectral images were collected. Dehydration was selected as the plant stress type for the main experiments, and data was collected from six plant species typically used in agriculture. Key contributions of this thesis include multispectral and visible datasets showing plant dehydration as well as a separate preliminary dataset on plant disease. Promising results on dehydration showed statistically significant accuracy improvements for multispectral imaging compared to visible imaging in early stress detection, with multispectral input obtaining 92.50% accuracy against visible input’s 77.50% on general plant species. The system was effective at stress detection on known plant species, with multispectral imaging introducing greater improvement to early stress detection than to advanced stress detection. Furthermore, strong species discrimination was achieved when exclusively testing either early or advanced dehydration against healthy species. , Thesis (MSc) -- Faculty of Science, Ichthyology & Fisheries Sciences, 2022
- Date Issued: 2022-04-06