- Title
- A systematic methodology to evaluating optimised machine learning based network intrusion detection systems
- Creator
- Chindove, Hatitye Ethridge
- ThesisAdvisor
- Brown, Dane Lesley
- ThesisAdvisor
- Irwin, Barry Vivian William
- Subject
- Intrusion detection systems (Computer security)
- Subject
- Machine learning
- Subject
- Computer networks Security measures
- Subject
- Principal components analysis
- Date
- 2022-10-14
- Type
- Academic theses
- Type
- Master's theses
- Type
- text
- Identifier
- http://hdl.handle.net/10962/362774
- Identifier
- vital:65361
- Description
- A network intrusion detection system (NIDS) is essential for mitigating computer network attacks in various scenarios. However, the increasing complexity of computer networks and attacks makes classifying unseen or novel network traffic challenging. Supervised machine learning (ML) techniques used in a NIDS can be affected by different scenarios. Thus, dataset recency, size, and applicability are essential factors when selecting and tuning a machine learning classifier. This thesis explores developing and optimising several supervised ML algorithms with relatively new datasets constructed to depict real-world scenarios. The methodology includes empirical analyses of systematic ML-based NIDS for a near real-world network system to improve intrusion detection. The thesis relies heavily on experiments for model assessment. Data preparation methods are explored, followed by feature engineering techniques. The model evaluation process involves three experiments testing against a validation, untrained, and retrained set. The experiments compare several traditional machine learning and deep learning classifiers to identify the best NIDS model. Results show that the focus on feature scaling, feature selection methods and ML algorithm hyper-parameter tuning per model is an essential optimisation component. Distance-based ML algorithms performed much better with quantile transformation, whilst the tree-based algorithms performed better without scaling. Permutation importance performs as a feature selection method compared to feature extraction using Principal Component Analysis (PCA) when applied against all ML algorithms explored. Random forests, Support Vector Machines and recurrent neural networks consistently achieved the best results, with high macro F1-scores of 90%, 81% and 73% for the CICIDS 2017 dataset, and 72%, 68% and 73% against the CICIDS 2018 dataset.
- Description
- Thesis (MSc) -- Faculty of Science, Computer Science, 2022
- Format
- computer, online resource, application/pdf, 1 online resource (134 pages), pdf
- Publisher
- Rhodes University, Faculty of Science, Computer Science
- Language
- English
- Rights
- Chindove, Hatitye Ethridge
- Rights
- Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-ShareAlike" License (http://creativecommons.org/licenses/by-nc-sa/2.0/)
File | Description | Size | Format
---|---|---|---
SOURCE1 | CHINDOVE-MSC-TR22-156.pdf | 1 MB | Adobe Acrobat PDF
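
The abstract reports results as macro F1-scores. As a minimal sketch of that metric (not the thesis's own code), the function below computes macro F1 as the unweighted mean of per-class F1 scores. The function name `macro_f1` and the `benign`/`attack` labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced sample: 8 benign flows, 2 attack flows,
# scored against a classifier that labels everything as benign.
y_true = ["benign"] * 8 + ["attack"] * 2
y_pred = ["benign"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this illustrative sample the classifier misses every attack, so accuracy stays high while macro F1 drops sharply, because each class contributes equally to the mean regardless of how many samples it has. That property is one common reason for preferring macro F1 on imbalanced intrusion detection data.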