Browsing by Author "Kushwaha, Rajni"
Feature Extraction And Detection of Malicious URLs Using Deep Learning Approach (Indian Statistical Institute, Kolkata, 2019-07) Kushwaha, Rajni
Phishing is one of the cybercrime activities carried out over the internet. Most phishing websites try to look similar to legitimate websites; their web content and URL features mimic those of legitimate URLs. Because new techniques keep emerging, detecting and analyzing these malicious URLs is very costly owing to their complexity. Traditionally, blacklisting and whitelisting have been used for detection, but these techniques are not well suited to real-time use. To address this, recent years have witnessed several efforts to perform malicious URL detection using machine learning. The most popular and scalable approaches use lexical properties of the URL string by extracting bag-of-words-like features, followed by applying machine learning models such as SVMs, Random Forest, etc. Various machine learning and deep learning techniques are used to improve generalization in malicious URL detection. These approaches suffer from several limitations: (i) inability to effectively capture semantic meaning and sequential patterns in URL strings; (ii) requiring substantial manual feature engineering; and (iii) inability to handle unseen features and generalize to test data. To address these limitations, in this dissertation we focus on building a real-time, language-independent phishing detection model by analyzing the anatomy of URLs using deep learning techniques. To achieve this, we first extract static and dynamic features manually, drawing on previous work. After obtaining the feature-valued data set, we learn the lexical features of the URL using a CNN that takes both the characters and the words of the URL string as input to learn the URL embedding. We then merge the manually selected features with the features learned by the CNN and feed them to a Bi-LSTM model to preserve the sequence information of the URL.
A hybrid model of a CNN (convolutional neural network) and a bidirectional LSTM (Long Short-Term Memory) network is used to achieve this goal. Our model analyzes the URL without accessing the web content of websites, which avoids the latency of fetching pages.
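The input preparation described in the abstract (character-level and word-level views of the URL string, plus manually selected features) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not taken from the dissertation: the character vocabulary, the `max_len` padding length, the function names, and the particular hand-crafted features. The CNN and Bi-LSTM layers themselves are not shown.

```python
import re
from urllib.parse import urlsplit

# Hypothetical character vocabulary; ids 0 and 1 are reserved for padding
# and unknown characters. The dissertation's actual vocabulary is not given.
PAD, UNK = 0, 1
CHAR_VOCAB = {ch: i + 2 for i, ch in enumerate(
    "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%")}


def encode_chars(url, max_len=200):
    """Map a URL to a fixed-length list of character ids (character-level view)."""
    ids = [CHAR_VOCAB.get(ch, UNK) for ch in url.lower()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def split_words(url):
    """Split a URL into word tokens on non-alphanumeric characters (word-level view)."""
    return [w for w in re.split(r"[^A-Za-z0-9]+", url.lower()) if w]


def lexical_features(url):
    """Compute a few illustrative hand-crafted (static) features from the URL string."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parts.scheme == "https",
    }


if __name__ == "__main__":
    url = "http://192.168.0.1/login.php?user=admin"
    print(encode_chars(url, max_len=20))
    print(split_words(url))
    print(lexical_features(url))
```

In a pipeline like the one the abstract outlines, the character ids and word tokens would be passed to embedding and convolution layers, and the hand-crafted feature values would be concatenated with the learned representation before the Bi-LSTM stage. None of these steps requires downloading the page itself, since only the URL string is inspected.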
