With the growing use of social media, many researchers have become interested in topic extraction from Twitter. Tweets are short, unstructured, and noisy, which makes finding topics in them challenging. Topic modeling algorithms such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) were originally designed to derive topics from long documents such as articles and books, so they are often less effective when applied to short text content like tweets. Fortunately, Twitter has many features that represent the interaction between users, and tweets carry rich user-generated hashtags that serve as keywords. In this paper, we exploit hashtags to improve the topics learned from Twitter content without modifying the basic LSA and LDA topic models, on the premise that users who share the same hashtag mostly discuss the same topic. We compare the performance of the two methods (LSA and LDA) using topic coherence, with and without hashtags. The experimental results on the Twitter dataset show that LSA achieves a better coherence score with hashtags than without them. In contrast, LDA achieves a better coherence score without incorporating hashtags. Overall, LDA has a better coherence score than LSA: the best coherence obtained was 0.6047 for LDA and 0.4744 for LSA, but the number of topics in LDA was higher than in LSA. Thus, LDA may assign tweets that discuss the same subject to different clusters.
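The abstract does not say how hashtags are fed into the unmodified LSA and LDA models. One common option, assumed here purely for illustration, is hashtag pooling: tweets that share a hashtag are merged into one longer pseudo-document before topic modeling. The function name `pool_by_hashtag` and the sample tweets are hypothetical.

```python
import re
from collections import defaultdict

# Matches a hashtag and captures the word after '#'.
HASHTAG = re.compile(r"#(\w+)")

def pool_by_hashtag(tweets):
    """Merge tweets sharing a hashtag into one pseudo-document per hashtag.

    Assumption: this pooling scheme is illustrative, not the paper's method.
    Returns (pooled, untagged): a dict mapping each lowercase hashtag to its
    pooled text, and a list of tweets that carry no hashtag.
    """
    pools = defaultdict(list)
    untagged = []
    for tweet in tweets:
        tags = {t.lower() for t in HASHTAG.findall(tweet)}
        # Keep the hashtag word as an ordinary token, dropping only the '#'.
        text = HASHTAG.sub(r"\1", tweet)
        if not tags:
            untagged.append(text)
        for tag in tags:
            pools[tag].append(text)
    pooled = {tag: " ".join(texts) for tag, texts in pools.items()}
    return pooled, untagged

tweets = [
    "Voting opens today #Election",
    "Long lines at the polls #election #vote",
    "Nice weather",
]
pooled, untagged = pool_by_hashtag(tweets)
```

The pooled texts, plus any untagged tweets, could then be passed to an off-the-shelf LSA or LDA implementation as ordinary documents, which leaves the topic model itself unchanged.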
Twitter data analysis is an emerging field of research that utilizes data collected from Twitter to address many issues such as disaster response, sentiment analysis, and demographic studies. The success of data analysis relies on collecting accurate data that is representative of the studied group or phenomenon. Various Twitter analysis applications rely on collecting the locations of the users sending the tweets, but this information is not always available. There have been several attempts at estimating location-based aspects of a tweet. However, few studies have investigated data collection methods that focus on location. In this paper, we investigate the two methods for obtaining location-based dat
Detecting a moving car in the front view is difficult because of the dynamic background caused by the car's movement and the complex environment that surrounds it. To address this, this paper proposes a new method based on a linear equation that determines the region of interest by building a more effective background model for dynamic background scenes. The method exploits the permitted region between cars according to traffic law to determine the region (road) in front of the moving car on which other moving cars travel. The experimental results show that the proposed method can successfully define the region that represents the lane in front of the moving car, with precision over 94% and detection rate 86
Pattern matching algorithms are commonly used in the detection process of intrusion detection systems. The efficiency of these algorithms affects the performance of the intrusion detection system, which calls for new investigation in this field. Four matching algorithms and a combination of two algorithms, for an intrusion detection system based on a new DNA encoding, are evaluated. These algorithms are the Brute-force algorithm, the Boyer-Moore algorithm, the Horspool algorithm, the Knuth-Morris-Pratt algorithm, and a combination of the Boyer-Moore and Knuth-Morris-Pratt algorithms. The performance of the proposed approach is measured by execution time, where these algorithms are applied o
Predicting the network traffic of web pages is an area that has received increased attention in recent years. Modeling traffic helps find strategies for distributing network loads, identifying user behaviors and malicious traffic, and predicting future trends. Many statistical and intelligent methods have been studied to predict web traffic using time series of network traffic. In this paper, the use of machine learning algorithms to model Wikipedia traffic using Google's time series dataset is studied. Two data sets were used for time series, data generalization, building a set of machine learning models (XGBoost, Logistic Regression, Linear Regression, and Random Forest), and comparing the performance of the models using (SMAPE) and
A session is a period of time linked to a user, which begins when the user arrives at a web application and ends when the browser is closed or after a certain period of inactivity. Attackers can hijack a user's session by exploiting session management vulnerabilities by means of session fixation and cross-site request forgery attacks.
Very often, session IDs are not only identification tokens, but also authenticators. This means that upon login, users are authenticated based on their credentials (e.g., usernames/passwords or digital certificates) and issued session IDs that will effectively serve as temporary static passwords for accessing their sessions. This makes session IDs a very appealing target for attackers. In many c
For both individuals and companies, Wireless Sensor Networks (WSNs) have a wide range of applications and uses. Sensors are used in a wide range of industries, including agriculture, transportation, health, and many more. Many technologies, such as wireless communication protocols, the Internet of Things, cloud computing, mobile computing, and other emerging technologies, are connected to the usage of sensors. In many circumstances, this connection involves the transmission of crucial data, which must be protected from potential threats. However, as the WSN components often have constrained computation and power capabilities, protecting the communication in WSNs comes at a significant performance pena
The paper uses the Direct Synthesis (DS) method for tuning the Proportional Integral Derivative (PID) controller that controls a DC servo motor. Two algorithms are presented for enhancing the performance of the suggested PID controller: the Back-Propagation Neural Network and Particle Swarm Optimization (PSO). The performance and characteristics of the DC servo motor are explained. The simulation results obtained using Matlab show that the steady-state error is eliminated, with a shorter adjustment time, when these algorithms are used with the PID controller. A comparison between the two algorithms is presented in this paper to show their effectiveness, which finds that the PSO algorithm gives be
Akaike's Information Criterion (AIC) is a popular method for estimating the number of sources impinging on an array of sensors, which is a problem of great interest in several applications. The performance of AIC degrades under low Signal-to-Noise Ratio (SNR). This paper is concerned with the development and application of quadrature mirror filters (QMF) for improving the performance of AIC. A new system is proposed to estimate the number of sources by applying AIC to the outputs of a filter bank consisting of quadrature mirror filters. The proposed system can estimate the number of sources under low SNR.
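As a rough illustration of the baseline that the abstract builds on, the sketch below applies AIC to the eigenvalues of an array's sample covariance matrix. It assumes the widely used eigenvalue form of the criterion (geometric versus arithmetic mean of the smallest eigenvalues, with penalty 2k(2p - k)). It does not include the QMF filter bank, and the function name `aic_num_sources` and the example eigenvalues are hypothetical.

```python
import math

def aic_num_sources(eigvals, n_snapshots):
    """Estimate the number of sources from covariance eigenvalues using AIC.

    Assumption: eigenvalue-based AIC for p sensors and N snapshots,
        AIC(k) = -2 N (p - k) log(GM_k / AM_k) + 2 k (2p - k),
    where GM_k and AM_k are the geometric and arithmetic means of the
    p - k smallest eigenvalues. All eigenvalues must be positive.
    """
    lam = sorted((float(v) for v in eigvals), reverse=True)
    p = len(lam)
    best_k, best_score = 0, math.inf
    for k in range(p):
        noise = lam[k:]              # the p - k smallest eigenvalues
        m = len(noise)
        log_geo = sum(math.log(v) for v in noise) / m
        log_ari = math.log(sum(noise) / m)
        score = -2.0 * n_snapshots * m * (log_geo - log_ari) + 2.0 * k * (2 * p - k)
        if score < best_score:
            best_k, best_score = k, score
    return best_k

# Hypothetical eigenvalues: two dominant values above a flat noise floor.
estimate = aic_num_sources([10.0, 5.0, 1.0, 1.0, 1.0, 1.0], n_snapshots=100)
```

In practice the eigenvalues would come from the sample covariance of the sensor snapshots. Under low SNR the signal eigenvalues sink toward the noise floor, which is the regime where the abstract reports that plain AIC degrades.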
Software Defined Networking (SDN) is a new technology that separates the control plane from the data plane. SDN offers faster automation and programmability than traditional networks. It supports Quality of Service (QoS) for video surveillance applications. One of the most significant issues in video surveillance is how to find the best path for routing packets between the source (IP cameras) and the destination (monitoring center). A video surveillance system requires fast transmission, reliable delivery, and high QoS. To improve the QoS and to achieve the optimal path, the SDN architecture is used in this paper. In addition, different routing algorithms are used with different steps. First, we eva
Document clustering is the process of organizing a particular electronic corpus of documents into subgroups with similar text features. A number of conventional algorithms have previously been applied to document clustering, and current efforts aim to enhance clustering performance by employing evolutionary algorithms. Such efforts have become an emerging topic that has gained more attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It first provides a comprehensive examination of the document clustering model, revealing its various components and related concepts. Then it shows and analyzes the principal research wor