Adaptive machine learning based network intrusion detection
- Authors: Chindove, Hatitye E , Brown, Dane L
- Date: 2021
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/464052 , vital:76471 , xlink:href="https://doi.org/10.1145/3487923.3487938"
- Description: Network intrusion detection system (NIDS) adoption is essential for mitigating computer network attacks in various scenarios. However, the increasing complexity of computer networks and attacks makes it challenging to classify network traffic. Machine learning (ML) techniques in a NIDS can be affected by different scenarios, and thus the recency, size and applicability of datasets are vital factors to consider when selecting and tuning an ML classifier. The proposed approach evaluates relatively new datasets constructed to depict real-world scenarios. It includes analyses of dataset balancing and sampling, feature engineering, and systematic ML-based NIDS model tuning focused on the adaptive improvement of intrusion detection. A comparison between ML classifiers forms part of the evaluation process, and the effectiveness of the proposed approach for NIDS modelling is discussed. Recurrent neural network and random forest models consistently achieved high macro F1-scores: 0.73 and 0.87 on the CICIDS 2017 dataset, and 0.73 and 0.72 on the CICIDS 2018 dataset, respectively.
- Full Text:
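Below is a minimal sketch, in Python with scikit-learn, of the kind of pipeline the entry above describes: class balancing, feature scaling, a random forest classifier, and macro F1 evaluation. The CSV path and the "Label" column are assumptions modelled on the public CICIDS releases, not the authors' exact setup.

```python
# Hedged sketch of an ML-based NIDS pipeline; file and column names are
# illustrative assumptions, not the paper's actual configuration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("cicids2017_flows.csv")          # hypothetical flow export
X = StandardScaler().fit_transform(df.drop(columns=["Label"]))
y = df["Label"]

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# class_weight="balanced" stands in for the dataset balancing/sampling
# analyses the paper evaluates; an RNN would be tuned analogously.
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             n_jobs=-1, random_state=42)
clf.fit(X_tr, y_tr)

print("macro F1:", f1_score(y_te, clf.predict(X_te), average="macro"))
```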
Quantifying the accuracy of small subnet-equivalent sampling of IPv4 internet background radiation datasets
- Authors: Chindipha, Stones D , Irwin, Barry V W , Herbert, Alan
- Date: 2019
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/473818 , vital:77684 , xlink:href="https://doi.org/10.1145/3351108.3351129"
- Description: Network telescopes have been used for over a decade to aid in identifying threats by gathering unsolicited network traffic. This Internet Background Radiation (IBR) data has proved to be a significant source of intelligence in combating emerging threats on the Internet at large. Traditionally, operation has required a significant contiguous block of IP addresses. Continued operation of such sensors by researchers, and their adoption by organisations as part of operational intelligence, is becoming a challenge due to the global shortage of IPv4 addresses; the pressure is on to use allocated IP addresses for operational purposes. Future use of IBR collection methods is likely to be limited to smaller IP address pools, which may not be contiguous. This paper offers a first step towards evaluating the feasibility of such small sensors. An evaluation is conducted of the random sampling of various subnet-sized equivalents, and the accuracy of the observable data is compared against a traditional 'small' IPv4 network telescope using a /24 net-block. Results show that for much of the IBR data, sensors consisting of smaller, non-contiguous blocks of addresses achieve high accuracy rates relative to the base case. While the results are conditioned on the current nature of IBR, they demonstrate the viability of organisations utilising free IP addresses within their networks for IBR collection and, ultimately, the production of threat intelligence.
- Full Text:
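A small Python sketch of the sampling experiment described above: draw a random /28-equivalent (16 addresses) from a /24 telescope's address pool and measure how much of the full sensor's source view it preserves. The packet list here is synthetic; the paper works with real IBR captures.

```python
# Hedged sketch: synthetic IBR traffic, not real telescope data.
import ipaddress
import random

telescope = [ipaddress.ip_address("192.0.2.0") + i for i in range(256)]  # /24

# Synthetic stand-in for captured IBR: (destination, source) pairs.
packets = [(random.choice(telescope), f"src{random.randrange(5000)}")
           for _ in range(100_000)]

full_sources = {src for _, src in packets}            # /24 baseline view

sample = set(random.sample(telescope, 16))            # one /28-equivalent
sampled_sources = {src for dst, src in packets if dst in sample}

coverage = len(sampled_sources) / len(full_sources)
print(f"source coverage vs /24 baseline: {coverage:.1%}")
```

Averaging this coverage over many random draws and subnet sizes gives the kind of accuracy comparison the paper reports.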
Segmenting objects with indistinct edges, with application to aerial imagery of vegetation
- Authors: James, Katherine M F , Bradshaw, Karen L
- Date: 2019
- Subjects: To be catalogued
- Language: English
- Type: text , book
- Identifier: http://hdl.handle.net/10962/460614 , vital:75969 , ISBN 9781450372657 , https://doi.org/10.1145/3351108.3351124
- Description: Image segmentation mask creation relies on objects having distinct edges. While this may be true for the objects seen in many image segmentation challenges, it is less so for tasks such as the segmentation of vegetation in aerial imagery. Such datasets contain indistinct edges, or areas of mixed information at edges, which introduces a level of annotator subjectivity at edge pixels. Existing loss functions give equal weight to pixels of low and high annotation confidence. In this paper, we propose a weight-map-based loss function that accounts for low annotation confidence at object edges by down-weighting the contribution of those pixels to the overall loss. We examine different weight map designs to find the best-performing one when applied to a dataset of aerial imagery of vegetation, with the task of segmenting a particular genus of shrub from other land cover types. Compared to inverse-class-frequency-weighted binary cross-entropy loss, the weight-map-based loss produced a better-performing model, improving the F1 score by 4%.
- Full Text:
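The sketch below illustrates a weight-map-based binary cross-entropy of the kind proposed above: pixels near annotation edges receive lower weight, so low-confidence edge labels contribute less to the loss. The specific weight-map design (ramp width, floor value) is an illustrative assumption, not the paper's chosen design.

```python
# Hedged sketch of an edge-down-weighted BCE loss in NumPy/SciPy.
import numpy as np
from scipy.ndimage import distance_transform_edt

def edge_weight_map(mask, ramp=5, floor=0.2):
    """Weight ~floor at the object edge, rising to 1 within `ramp` pixels."""
    dist_in = distance_transform_edt(mask)        # distance inside the object
    dist_out = distance_transform_edt(1 - mask)   # distance outside the object
    dist = np.minimum(dist_in, dist_out)          # distance to the edge
    return floor + (1 - floor) * np.clip(dist / ramp, 0.0, 1.0)

def weighted_bce(pred, target, weights, eps=1e-7):
    pred = np.clip(pred, eps, 1 - eps)
    bce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    return float((weights * bce).mean())

mask = np.zeros((64, 64)); mask[20:40, 20:40] = 1.0   # toy annotation
pred = np.random.rand(64, 64)                         # toy prediction
print(weighted_bce(pred, mask, edge_weight_map(mask)))
```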
Extended feature-fusion guidelines to improve image-based multi-modal biometrics
- Authors: Brown, Dane L , Bradshaw, Karen L
- Date: 2016
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/473796 , vital:77682 , xlink:href="https://doi.org/10.1145/2987491.2987512"
- Description: The feature-level, unlike the match score-level, lacks multi-modal fusion guidelines. This work demonstrates a practical approach for improved image-based biometric feature-fusion. The approach extracts and combines the face, fingerprint and palmprint at the feature-level for improved human identification accuracy. Feature-fusion guidelines, proposed in recent work, are extended by adding the palmprint modality and the support vector machine classifier. Guidelines take the form of strengths and weaknesses as observed in the applied feature processing modules during preliminary experiments. The guidelines are used to implement an effective biometric fusion system at the feature-level to reduce the equal error rate on the SDUMLA and IITD datasets, using a novel feature-fusion methodology.
- Full Text:
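A minimal Python sketch of the feature-level fusion described above: per-modality feature vectors are concatenated into a single vector before SVM classification. The feature extractors are stubbed with random data; the paper uses real face, fingerprint and palmprint features and evaluates equal error rate rather than plain accuracy.

```python
# Hedged sketch of feature-level fusion with stubbed modality features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_subjects, n_samples = 10, 20
labels = np.repeat(np.arange(n_subjects), n_samples)

def modality_features(dim):
    """Stub extractor: random features with a weak per-subject signal."""
    return rng.normal(size=(n_subjects * n_samples, dim)) + labels[:, None]

face, finger, palm = (modality_features(d) for d in (64, 32, 32))
fused = np.hstack([face, finger, palm])       # feature-level fusion

X_tr, X_te, y_tr, y_te = train_test_split(fused, labels, stratify=labels,
                                          test_size=0.25, random_state=0)
clf = SVC(kernel="linear").fit(X_tr, y_tr)
print("identification accuracy:", clf.score(X_te, y_te))
```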
Towards a technical skills curriculum to supplement traditional computer science teaching
- Authors: Marais, Craig , Bradshaw, Karen L
- Date: 2016
- Subjects: To be catalogued
- Language: English
- Type: text , book
- Identifier: http://hdl.handle.net/10962/476640 , vital:77946 , ISBN 9781450342315 , https://muse.jhu.edu/book/52741
- Description: It is commonplace for students to enter university with skills deficiencies. However, this is cause for growing concern in the context of South Africa, as these 'deficient' students are becoming more numerous. Public secondary schools in South Africa are failing to create students with adequate skills for careers in the STEM fields. This paper isolates these skills deficiencies to a subset of technical skills for problem-solving. The problem-solving skills are divided into content groups, which are then aligned to existing Computer Science content. A solution is proposed that demonstrates how the content can be presented without the need for extensive curriculum changes to established course content.
- Full Text:
Problem-solving ability of first year CS students: A case study and intervention
- Authors: Marais, Craig , Bradshaw, Karen L
- Date: 2015
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/475235 , vital:77787
- Description: This paper reports the findings of computational thinking research undertaken to develop problem-solving skills in first year computer science students. Through the use of pre- and post-tests, statistical results are presented showing the definite acquisition of problem-solving skills by the students after completing the introductory first year computer science course. These skills are argued to be innate in some students and acquired in others. By identifying the component skills required and presenting a step-by-step approach to teaching problem solving, this research aims to provide a method for actively instilling these skills in learners who lack them.
- Full Text:
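The pre-/post-test comparison described above can be illustrated with a paired t-test, as in the sketch below. The score arrays are synthetic stand-ins; the paper's statistics come from real class data.

```python
# Hedged sketch of a paired pre-/post-test comparison on synthetic scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pre = rng.normal(55, 12, size=120).clip(0, 100)          # synthetic pre-test %
post = (pre + rng.normal(8, 6, size=120)).clip(0, 100)   # synthetic gain

t, p = stats.ttest_rel(post, pre)
print(f"mean gain {np.mean(post - pre):.1f} points, t={t:.2f}, p={p:.4f}")
```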
Towards an Extensible Generic Agent-Based Simulator for Mammals
- Authors: Carse, Stephen , Bradshaw, Karen L
- Date: 2015
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/475296 , vital:77793 , xlink:href="https://ieeexplore.ieee.org/abstract/document/9856380"
- Description: Modelling tools are widely used by national parks, both within South Africa and around the world. The modelling of animal behaviour, particularly in South Africa and other African countries, is well established. These models, however, tend to be developed with one particular application and species in mind and are not reusable in other scenarios, requiring further development work to add another species. This paper presents an approach towards developing a generic agent-based system that simulates a range of mammal behaviours by building a set of core behaviours that can be parameterised according to the needs of each species. The system uses XML notation to define a species and provides a GUI tool that produces the XML required to simulate a species and to set up the initial animals present in the simulation. Various feedback tools allow the simulation to be examined and analysed in detail, to ascertain how well the mammal behaviours are simulated and how well those behaviours adapt to different species.
- Full Text:
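The sketch below shows how an XML species definition of the kind described above might be parsed into parameterised behaviours. The element and attribute names (species, behaviour, param) are invented for illustration; the abstract does not give the simulator's actual schema.

```python
# Hedged sketch of an XML-driven species definition; schema is hypothetical.
import xml.etree.ElementTree as ET

SPECIES_XML = """
<species name="impala">
  <behaviour name="graze" hoursPerDay="8"/>
  <behaviour name="flee" triggerDistance="30"/>
  <param name="herdSize" value="25"/>
</species>
"""

root = ET.fromstring(SPECIES_XML)
behaviours = {b.get("name"): {k: v for k, v in b.attrib.items() if k != "name"}
              for b in root.findall("behaviour")}
params = {p.get("name"): p.get("value") for p in root.findall("param")}

print(root.get("name"), behaviours, params)
```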
Computational thinking in educational activities: an evaluation of the educational game light-bot
- Authors: Gouws, Lindsey Ann , Bradshaw, Karen L
- Date: 2013
- Subjects: To be catalogued
- Language: English
- Type: text , book
- Identifier: http://hdl.handle.net/10962/477581 , vital:78101 , ISBN 9781450320788 , https://doi.org/10.1145/2462476.2466518
- Description: Computational thinking is gaining recognition as an important skill set for students, both in computer science and other disciplines. Although there has been much focus on this field in recent years, it is rarely taught as a formal course within the curriculum, and there is little consensus on what exactly computational thinking entails and how to teach and evaluate it. To address these concerns, we have developed a computational thinking framework to be used as a planning and evaluative tool. Within this framework, we aim to unify the differing opinions about what computational thinking should involve. As a case study, we have applied the framework to Light-Bot, an educational game with a strong focus on programming, and found that the framework provides us with insight into the usefulness of the game to reinforce computer science concepts.
- Full Text:
Evaluating the acceleration of typical scientific problems on the GPU
- Authors: Tristram, Dale , Bradshaw, Karen L
- Date: 2013
- Subjects: To be catalogued
- Language: English
- Type: text , book
- Identifier: http://hdl.handle.net/10962/477607 , vital:78103 , ISBN 9781450321129 , https://doi.org/10.1145/2513456.2513473
- Description: General-purpose computation on graphics processing units (GPGPU) has great potential to accelerate many scientific models and algorithms. However, some problems are considerably more difficult to accelerate than others, and it can be hard for those new to GPGPU to judge how difficult a particular problem will be. Additionally, problems of different levels of difficulty require optimisations of varying complexity to achieve satisfactory results, and currently there is no clear separation between the different levels of known optimisations, which would be helpful to new users of GPGPU. Through what was learned in the acceleration of three problems, problem attributes have been identified to assist in evaluating the difficulty of accelerating a problem on a GPU. We envisage that with further development, these attributes could form the foundation of a difficulty classification system that could be used to determine whether GPU acceleration is practical for a candidate problem, aid in identifying appropriate techniques and optimisations, and outline the required GPGPU knowledge.
- Full Text:
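The easiest class of problems the entry above alludes to is the embarrassingly parallel, element-wise kind, which maps to the GPU with almost no restructuring. The sketch below illustrates this with CuPy; it assumes a CUDA device and is an illustration, not the paper's benchmark code.

```python
# Hedged sketch: an element-wise operation, the "easy" end of the GPGPU
# difficulty spectrum. Requires a CUDA GPU and the CuPy library.
import numpy as np
import cupy as cp

x_cpu = np.random.rand(10_000_000).astype(np.float32)

x_gpu = cp.asarray(x_cpu)              # copy to device
y_gpu = cp.sqrt(x_gpu) * 2.0 + 1.0     # element-wise: trivially parallel
y_cpu = cp.asnumpy(y_gpu)              # copy back to host

# Problems with data-dependent control flow or heavy inter-thread
# communication sit at the harder end of the spectrum the paper classifies.
print(y_cpu[:3])
```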
First year student performance in a test for computational thinking
- Authors: Gouws, Lindsey Ann , Bradshaw, Karen L , Wentworth, Peter E
- Date: 2013
- Subjects: To be catalogued
- Language: English
- Type: text , book
- Identifier: http://hdl.handle.net/10962/477618 , vital:78104 , ISBN 9781450321129 , https://doi.org/10.1145/2513456.2513484
- Description: Computational thinking, a form of thinking and problem solving within computer science, has become a popular focus of research on computer science education. In this paper, we attempt to investigate the role that computational thinking plays in the experience of introductory computer science students at a South African university. To this end, we have designed and administered a test for computational thinking ability, and contrasted the results of this test with the class marks for the students involved. The results of this test give us an initial view of the abilities that students possess when entering the computer science course. The results indicate that students who performed well in the assessment have a favourable pass rate for their class tests, and specific areas of weakness have been identified. Finally, we describe the plan for a follow-up test to take place at the end of the course to determine how students' abilities have changed over a semester of studies.
- Full Text:
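Contrasting test results with class marks, as described above, amounts to a correlation analysis; the sketch below shows one way to compute it. Both arrays are synthetic stand-ins for the real cohort data.

```python
# Hedged sketch: Pearson correlation between a computational thinking test
# score and a class mark, on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
ct_score = rng.normal(60, 15, size=150).clip(0, 100)
class_mark = (0.6 * ct_score + rng.normal(25, 10, size=150)).clip(0, 100)

r, p = stats.pearsonr(ct_score, class_mark)
print(f"Pearson r = {r:.2f}, p = {p:.4f}")
```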
Performance optimisation of sequential programs on multi-core processors
- Authors: Tristram, Waide B , Bradshaw, Karen
- Date: 2012
- Subjects: To be catalogued
- Language: English
- Type: text , book
- Identifier: http://hdl.handle.net/10962/477111 , vital:78046 , ISBN 9781450313087 , https://doi.org/10.1145/2389836.2389851
- Description: With the increasing availability of multi-core processors, the sequential programming paradigm is no longer capable of harnessing the full power of processors. Parallel programming is, however, generally complex and requires more expertise than the traditional sequential programming model. On the other hand, there is a multitude of optimisations for sequential programs that can exploit multiple cores without much effort from the programmer. The primary goal of this research is to identify available tools and techniques to aid programmers in optimising C/C++ programs for execution on multi-core processors. Using a couple of example programs, we show that improved performance is possible using the proposed methodology. However, the choice of optimisation depends on the type of problem being solved, and there is no generic best choice for all classes of problems.
- Full Text:
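The paper above targets C/C++ tooling (for example, OpenMP-style loop parallelism); the sketch below transposes the idea into Python for consistency with the other sketches here: a sequential loop handed to a process pool with minimal code change. It is an analogy, not the paper's method.

```python
# Hedged sketch: low-effort parallelisation of a CPU-bound sequential loop.
from multiprocessing import Pool

def work(n):
    """Stand-in for a CPU-bound loop body."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    tasks = [200_000] * 32

    sequential = [work(n) for n in tasks]     # original sequential form

    with Pool() as pool:                      # multi-core form, one-line change
        parallel = pool.map(work, tasks)

    assert sequential == parallel
```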