A framework for high speed lexical classification of malicious URLs
- Authors: Egan, Shaun Peter
- Date: 2014
- Subjects: Internet -- Security measures -- Research , Uniform Resource Identifiers -- Security measures -- Research , Neural networks (Computer science) -- Research , Computer security -- Research , Computer crimes -- Prevention , Phishing
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:4696 , http://hdl.handle.net/10962/d1011933
- Description: Phishing attacks employ social engineering to target end-users, with the goal of stealing identifying or sensitive information. This information is used in activities such as identity theft or financial fraud. During a phishing campaign, attackers distribute URLs which, along with false information, point to fraudulent resources in an attempt to deceive users into requesting the resource. These URLs are obscured using several techniques which make automated detection difficult. Current methods used to detect malicious URLs face multiple problems which attackers use to their advantage. These problems include the time required to react to new attacks, shifting trends in URL obfuscation, and usability problems caused by the latency of the lookups these approaches require. A new method of identifying malicious URLs using Artificial Neural Networks (ANNs) has been shown to be effective by several authors. The simple method of classification performed by ANNs results in very high classification speeds with little impact on usability. Samples used for the training, validation and testing of these ANNs are gathered from Phishtank and Open Directory. Words selected from the different sections of the samples are used to create a 'Bag-of-Words' (BOW), which is used as a binary input vector indicating the presence of a word for a given sample. Twenty additional features that measure lexical attributes of the sample are used to increase classification accuracy. A framework that is capable of generating these classifiers in an automated fashion is implemented. These classifiers are automatically stored on a remote update distribution service which has been built to supply updates to classifier implementations. An example browser plugin is created and uses ANNs provided by this service. It can both classify URLs requested by a user in real time and block these requests.
The framework is tested in terms of training time and classification accuracy. Classification speed and the effectiveness of compression algorithms on the data required to distribute updates are also tested. It is concluded that it is possible to generate these ANNs frequently, and in a form that is small enough to distribute easily. It is also shown that classifications are made at high speed and with high accuracy, resulting in little impact on usability.
- Date Issued: 2014
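The input encoding described in this record's abstract (a binary Bag-of-Words vector extended with lexical measurements) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names are hypothetical, the tokenizer ignores which section of the URL a word came from, and the three lexical features shown are placeholders, since the abstract does not list the twenty features actually used.

```python
import re
from urllib.parse import urlsplit


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed tokenization)."""
    return [tok for tok in re.split(r"[^a-z0-9]+", url.lower()) if tok]


def build_vocabulary(urls):
    """Return a sorted list of every token seen in the sample URLs."""
    return sorted({tok for url in urls for tok in tokenize(url)})


def lexical_features(url):
    """Three illustrative lexical measurements (hypothetical, not the thesis's twenty)."""
    host = urlsplit(url).hostname or ""
    return [
        len(url),                          # total URL length
        host.count("."),                   # number of dots in the host name
        sum(ch.isdigit() for ch in url),   # number of digit characters
    ]


def to_input_vector(url, vocabulary):
    """Binary word-presence vector followed by the lexical measurements."""
    tokens = set(tokenize(url))
    bow = [1 if word in tokens else 0 for word in vocabulary]
    return bow + lexical_features(url)


if __name__ == "__main__":
    samples = ["http://example.com/login", "http://paypal.example.net/secure"]
    vocabulary = build_vocabulary(samples)
    print(vocabulary)
    print(to_input_vector("http://192.168.0.1/login", vocabulary))
```

A vector of this shape would then be fed to the classifier. Keeping the vocabulary fixed after training means a new URL can be encoded with a single pass over its tokens, which is consistent with the high classification speeds the abstract reports.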
Phishing within e-commerce: reducing the risk, increasing the trust
- Authors: Megaw, Gregory M
- Date: 2010
- Subjects: Phishing , Identity theft -- Prevention , Electronic commerce , Computer security , Internet -- Safety measures
- Language: English
- Type: Thesis , Masters , MCom (Information Systems)
- Identifier: vital:11131 , http://hdl.handle.net/10353/376
- Description: E-Commerce has been plagued with problems since its inception, and this study examines one of these problems: the lack of user trust in E-Commerce created by the risk of phishing. Phishing has grown exponentially together with the expansion of the Internet. This growth and the advancement of technology have not only benefited honest Internet users, but have also enabled criminals to increase their effectiveness, which has caused considerable damage to this budding area of commerce. Moreover, phishing has negatively impacted both users and online businesses by breaking down the trust relationship between them. To explore this problem, the following were considered. First, E-Commerce's vulnerability to phishing attacks: by referring to the Common Criteria Security Model, various critical security areas within E-Commerce are identified, as well as the areas of vulnerability and weakness. Second, the methods and techniques used in phishing, such as phishing e-mails, websites and addresses, distributed attacks and redirected attacks, as well as the data that phishers seek to obtain, are examined. Furthermore, ways to reduce the risk of phishing, and in turn increase the trust between users and websites, are identified. Here the importance of Trust and the Uncertainty Reduction Theory, plus the fine balance between trust and control, is explored. Finally, the study presents Critical Success Factors that aid in phishing prevention and control: User Authentication, Website Authentication, E-mail Authentication, Data Cryptography, Communication, and Active Risk Mitigation.
- Date Issued: 2010