Evaluation of the effectiveness of small aperture network telescopes as IBR data sources
- Authors: Chindipha, Stones Dalitso
- Date: 2023-03-31
- Subjects: Computer networks Monitoring , Computer networks Security measures , Computer bootstrapping , Time-series analysis , Regression analysis , Mathematical models
- Language: English
- Type: Academic theses , Doctoral theses , text
- Identifier: http://hdl.handle.net/10962/366264 , vital:65849 , DOI https://doi.org/10.21504/10962/366264
- Description: Network telescopes have been used for over two decades to collect unsolicited network traffic by monitoring unallocated address space. Past research has shown that a great deal of activity takes place in this unallocated space, and that it needs monitoring because it carries threat intelligence data that has proven very useful in the security field. Prior to the emergence of the Internet of Things (IoT), the commercialisation of IP addresses and the widespread adoption of mobile devices, there was a large pool of IPv4 addresses, so reserving IPv4 addresses for monitoring unsolicited activity in the unallocated space was not a problem. Now, preserving such IPv4 addresses purely for monitoring is increasingly difficult, as there are not enough free addresses in the IPv4 address space and such monitoring is seen as a 'non-productive' use of IP addresses. This research addresses the problem that IPv4 address space exhaustion poses for Internet Background Radiation (IBR) monitoring. To address the research questions, this research developed four mathematical models: Absolute Mean Accuracy Percentage Score (AMAPS), Symmetric Absolute Mean Accuracy Percentage Score (SAMAPS), Standardised Mean Absolute Error (SMAE), and Standardised Mean Absolute Scaled Error (SMASE). These models are used to evaluate the research objectives and quantify the variations that exist between different samples. The sample sizes represent different lens sizes of the telescopes. The study has brought to light a time series plot that shows the expected proportion of unique source IP addresses collected over time. The study also imputed data using the smaller /24 IPv4 net-block subnets, regenerating the missing data points with bootstrapping to create confidence intervals (CI). The findings from the simulated data support the findings computed from the models. The CI offers a boost to decision making. Through a series of experiments with monthly and quarterly datasets, the study proposed that a 95% to 99% confidence level be used. It was known that large network telescopes collect more threat intelligence data than small-sized network telescopes; however, no study, to the best of our knowledge, has ever quantified this gap. With the findings from the study, users of small-sized network telescopes can now operate them with full knowledge of the gap that exists in the data collected between different network telescopes. , Thesis (PhD) -- Faculty of Science, Computer Science, 2023
- Full Text:
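The description above mentions bootstrapping to create confidence intervals at a 95% to 99% confidence level. As a rough illustration only, a generic percentile bootstrap for the mean can be sketched as follows. This is not the thesis's own procedure: the function name `bootstrap_ci`, the resample count, and the example counts of unique source IPs per /24 block are all hypothetical.

```python
import random
import statistics


def bootstrap_ci(samples, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `samples`.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    # Made-up counts of unique source IPs seen by eight /24 blocks.
    counts = [12, 15, 9, 20, 14, 11, 18, 16]
    print("95% CI:", bootstrap_ci(counts, confidence=0.95))
    print("99% CI:", bootstrap_ci(counts, confidence=0.99))
```

With the same seed, the 99% interval always contains the 95% interval, which is the usual trade-off between confidence level and interval width.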
A systematic methodology to evaluating optimised machine learning based network intrusion detection systems
- Authors: Chindove, Hatitye Ethridge
- Date: 2022-10-14
- Subjects: Intrusion detection systems (Computer security) , Machine learning , Computer networks Security measures , Principal components analysis
- Language: English
- Type: Academic theses , Master's theses , text
- Identifier: http://hdl.handle.net/10962/362774 , vital:65361
- Description: A network intrusion detection system (NIDS) is essential for mitigating computer network attacks in various scenarios. However, the increasing complexity of computer networks and attacks makes classifying unseen or novel network traffic challenging. Supervised machine learning (ML) techniques used in a NIDS can be affected by different scenarios. Thus, dataset recency, size, and applicability are essential factors when selecting and tuning a machine learning classifier. This thesis explores developing and optimising several supervised ML algorithms with relatively new datasets constructed to depict real-world scenarios. The methodology includes empirical analyses of systematic ML-based NIDS for a near real-world network system to improve intrusion detection. The thesis is experiment-heavy for model assessment. Data preparation methods are explored, followed by feature engineering techniques. The model evaluation process involves three experiments testing against a validation, un-trained, and retrained set. They compare several traditional machine learning and deep learning classifiers to identify the best NIDS model. Results show that the focus on feature scaling, feature selection methods and ML algorithm hyper-parameter tuning per model is an essential optimisation component. Distance-based ML algorithms performed much better with quantile transformation, whilst the tree-based algorithms performed better without scaling. Permutation importance performs as a feature selection method compared to feature extraction using Principal Component Analysis (PCA) when applied against all ML algorithms explored. Random forests, Support Vector Machines and recurrent neural networks consistently achieved the best results, with high macro F1-scores of 90%, 81% and 73% for the CICIDS 2017 dataset, and 72%, 68% and 73% against the CICIDS 2018 dataset. , Thesis (MSc) -- Faculty of Science, Computer Science, 2022
- Full Text:
Evolving IoT honeypots
- Authors: Genov, Todor Stanislavov
- Date: 2022-10-14
- Subjects: Internet of things , Malware (Computer software) , QEMU , Honeypot , Cowrie
- Language: English
- Type: Academic theses , Master's theses , text
- Identifier: http://hdl.handle.net/10962/362819 , vital:65365
- Description: The Internet of Things (IoT) is the emerging world where arbitrary objects from our everyday lives gain basic computational and networking capabilities to become part of the Internet. Researchers estimate that between 25 and 35 billion devices will be part of the Internet by 2022. Unlike conventional computers, where one hardware platform (Intel x86) and three operating systems (Windows, Linux and OS X) dominate the market, the IoT landscape is far more heterogeneous. To meet the growth in demand, the number of System-on-Chip (SoC) manufacturers has seen a corresponding exponential growth, making embedded platforms based on ARM, MIPS or SH4 processors abundant. The pursuit of market share is further leading to a price war and cost-cutting, ultimately resulting in cheap systems with limited hardware resources and capabilities. The frugality of IoT hardware has a domino effect. Due to resource constraints, vendors are packaging devices with custom, stripped-down Linux-based firmware optimized for performing the device's primary function. Device management, monitoring and security features are by and large absent from IoT devices. This has created an asymmetry favouring attackers and disadvantaging defenders. This research sets out to reduce the opacity and identify a viable strategy, tactics and tooling for gaining insight into the IoT threat landscape by leveraging honeypots to build and deploy an evolving world-wide Observatory, based on cloud platforms, to help with studying attacker behaviour and collecting IoT malware samples. The research produces useful tools and techniques for identifying behavioural differences between Medium-Interaction honeypots and real devices by replaying interactive attacker sessions collected from the Honeypot Network. The behavioural delta is used to evolve the Honeypot Network and improve its collection capabilities. Positive results are obtained with respect to the effectiveness of the above technique. Findings by other researchers in the field are also replicated. The complete dataset and source code used for this research are made publicly available on the Open Science Framework website at https://osf.io/vkcrn/. , Thesis (MSc) -- Faculty of Science, Computer Science, 2022
- Full Text: