Improve topic modeling algorithms based on Twitter hashtags

Hayder M. Alash

doi:10.1088/1742-6596/1660/1/012100

Details

Publication Date

Sun Nov 01 2020

Journal Name

Journal Of Physics: Conference Series

Volume

1660

DOI

10.1088/1742-6596/1660/1/012100

Abstract

With the growing use of social media, many researchers have become interested in extracting topics from Twitter. Tweets are short, unstructured, and messy, which makes finding topics in them difficult. Topic modeling algorithms such as Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) were originally designed to derive topics from large documents such as articles and books, and they are often less effective when applied to short text such as tweets. Fortunately, Twitter has many features that represent the interaction between users, and tweets carry rich user-generated hashtags that act as keywords. In this paper, we exploit hashtags to improve the topics learned from Twitter content without modifying the basic LSA and LDA topic models, on the assumption that users who share the same hashtag mostly discuss the same topic. We compare the performance of the two methods (LSA and LDA), with and without hashtags, using topic coherence. The experimental results on the Twitter dataset show that LSA achieves a better coherence score with hashtags than without them. In contrast, LDA achieves a better coherence score without hashtags. Overall, LDA has a better coherence score than LSA: the best coherence obtained was 0.6047 for LDA and 0.4744 for LSA, but the number of topics in LDA was higher than in LSA. Thus, LDA may place tweets that discuss the same subject into different clusters.
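
The abstract does not say how hashtags are fed into LSA and LDA. One common way to use them without changing the topic model is to pool tweets that share a hashtag into a single longer pseudo-document before training. The sketch below illustrates that idea only; the function name `pool_by_hashtag`, the regular expression, and the handling of tweets without hashtags are assumptions for illustration, not the paper's procedure.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Group tweet texts into pseudo-documents keyed by lowercase hashtag.

    A tweet with several hashtags is added to every matching pool.
    Tweets with no hashtag are returned separately, unpooled.
    """
    pools = defaultdict(list)
    unpooled = []
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if not tags:
            unpooled.append(text)
            continue
        for tag in tags:
            pools[tag].append(text)
    # Each pool becomes one longer document for an unmodified LSA or LDA model.
    documents = {tag: " ".join(texts) for tag, texts in pools.items()}
    return documents, unpooled
```

The pooled documents can then be tokenized and passed to any standard LSA or LDA implementation in place of the individual tweets, and the resulting topics scored with a coherence measure for comparison against the unpooled baseline.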

Publication Date

Wed Apr 02 2025

Journal Name

Current Studies On Probability And Statistics

SAR-HDP: Non-parametric Topic Model for Aspect categorisation based on online reviews

Non-parametric models. Hierarchical Dirichlet process. Collapsed Gibbs sampling. Aspect extraction. Aspect categorisation. Online reviews

Omar Mustafa

Aspect categorisation and its utmost importance in the field of Aspect-based Sentiment Analysis (ABSA) has encouraged researchers to improve topic model performance for modelling the aspects into categories. In general, a majority of its current methods implement parametric models requiring a pre-determined number of topics beforehand. However, this is not efficiently undertaken with unannotated text data as they lack any class label. Therefore, the current work presented a novel non-parametric model drawing a number of topics based on the semantic association present between opinion-targets (i.e., aspects) and their respective expressed sentiments. The model incorporated the Semantic Association Rules (SAR) into the Hierarchical Dirichlet Proce

Publication Date

Wed Jan 01 2020

Journal Name

International Journal Of Computing

Twitter Location-Based Data: Evaluating the Methods of Data Collection Provided by Twitter Api

location data

Social media

Twitter

N.A.

Haneen

Twitter data analysis is an emerging field of research that utilizes data collected from Twitter to address many issues such as disaster response, sentiment analysis, and demographic studies. The success of data analysis relies on collecting accurate and representative data of the studied group or phenomena to get the best results. Various Twitter analysis applications rely on collecting the locations of the users sending the tweets, but this information is not always available. There have been several attempts at estimating location-based aspects of a tweet. However, few attempts have been made to investigate the data collection methods that are focused on location. In this paper, we investigate the two methods for obtaining location-based dat

Publication Date

Fri Aug 23 2024

Journal Name

Aro-the Scientific Journal Of Koya University

Graphical User Authentication Algorithms Based on Recognition

Zena M.

Ahmed T.

Omar Z.

In cyber security, the most crucial subject in information security is user authentication. Robust text-based password methods may offer a certain level of protection. Strong passwords are hard to remember, though, so people who use them frequently write them on paper or store them in a file on a computer. In recent years, numerous computer systems, networks, and Internet-based environments have experimented with graphical authentication techniques for user authentication. The two main characteristics of all graphical passwords are their security and usability. Regretfully, none of these methods could adequately address both of these factors concurrently. The ISO usability standards and associated characteristics for graphical

Publication Date

Sat Dec 01 2018

Journal Name

Journal Of Theoretical And Applied Information Technology

Matching Algorithms for Intrusion Detection System based on DNA Encoding

Intrusion detection

DNA Encoding

Pattern Matching Algorithm

Knuth-Morris-Pratt Algorithm

Boyer-Moore Algorithm

Omar Fitian

Zulaiha Ali

Suhaila

Pattern matching algorithms are usually used as the detection process in intrusion detection systems. The efficiency of these algorithms is affected by the performance of the intrusion detection system, which reflects the requirement of a new investigation in this field. Four matching algorithms and a combination of two algorithms, for an intrusion detection system based on a new DNA encoding, are applied for evaluation of their achievements. These algorithms are the Brute-force algorithm, the Boyer-Moore algorithm, the Horspool algorithm, the Knuth-Morris-Pratt algorithm, and the combination of the Boyer-Moore algorithm and the Knuth-Morris-Pratt algorithm. The performance of the proposed approach is calculated based on the execution time, where these algorithms are applied o

Publication Date

Wed Aug 20 2025

Journal Name

International Journal Of Advanced Research In Computer Science

IMPROVE DATA ENCRYPTION BY USING DIFFIE-HELLMAN AND DNA ALGORITHMS, AUTHENTICATED BY HMAC-HASH256

DNA encoding

Diffie-Hellman Ephemeral

HASH256

MDC and MAC

HMAC-HASH256

Nada Abdul Aziz

The need for means of transmitting data in a confidential and secure manner has become one of the most important subjects in the world of communications. Therefore, the search began for what would achieve not only the confidentiality of information sent through means of communication, but also high speed of transmission and minimal energy consumption. Thus, the encryption technology using DNA was developed, which fulfills all these requirements [1]. The system proposes to achieve high protection of data sent over the Internet by applying the following objectives: 1. The message is encrypted using one of the DNA methods with a key generated by the Diffie-Hellman Ephemeral algorithm, part of this key is secret and this makes the pro

Publication Date

Fri Mar 31 2023

Journal Name

Wasit Journal Of Computer And Mathematics Science

Security In Wireless Sensor Networks Based On Lightweight Algorithms : An Effective Survey

Data Confidentiality

Lightweight Cryptography

Security in Wireless Networks

Wireless Sensor Networks

Mohammed

Sif

For both individuals and companies, Wireless Sensor Networks (WSNs) have a wide range of applications and uses. Sensors are used in a wide range of industries, including agriculture, transportation, health, and many more. Many technologies, such as wireless communication protocols, the Internet of Things, cloud computing, mobile computing, and other emerging technologies, are connected to the usage of sensors. In many circumstances, this contact requires the transmission of crucial data, which makes it necessary to protect that data from potential threats. However, as the WSN components often have constrained computation and power capabilities, protecting the communication in WSNs comes at a significant performance pena

Publication Date

Fri Jan 01 2016

Journal Name

Journal Of Engineering

Improve the Performance of PID Controller by Two Algorithms for Controlling the DC Servo Motor

DC servo motor

direct synthesis

PID controller

neural network

particle swarm optimization (PSO).

Noor Safaa

The paper uses the Direct Synthesis (DS) method for tuning the Proportional Integral Derivative (PID) controller for controlling the DC servo motor. Two algorithms are presented for enhancing the performance of the suggested PID controller. These algorithms are Back-Propagation Neural Network and Particle Swarm Optimization (PSO). The performance and characteristics of the DC servo motor are explained. The simulation results obtained using the Matlab program show that the steady-state error is eliminated with a shorter adjusted time when using these algorithms with the PID controller. A comparison between the two algorithms is described in this paper to show their effectiveness, in which it is found that the PSO algorithm gives be

Publication Date

Wed Mar 18 2020

Journal Name

Baghdad Science Journal

A Software Defined Network of Video Surveillance System Based on Enhanced Routing Algorithms

Bellman-Ford algorithm

Dijkstra algorithm

Software defined network (SDN).

Mustafa Ismael

Software Defined Network (SDN) is a new technology that separates the control plane from the data plane. SDN provides a choice in automation and programmability faster than traditional networks. It supports the Quality of Service (QoS) for video surveillance applications. One of the most significant issues in video surveillance is how to find the best path for routing the packets between the source (IP cameras) and destination (monitoring center). The video surveillance system requires fast transmission, reliable delivery, and high QoS. To improve the QoS and to achieve the optimal path, the SDN architecture is used in this paper. In addition, different routing algorithms are used with different steps. First, we eva

Publication Date

Fri Mar 01 2019

Journal Name

Al-khwarizmi Engineering Journal

Improve Akaike’s Information Criterion Estimation Based on Denoising of Quadrature Mirror Filter Bank

Mohammed H.

Akaike’s Information Criterion (AIC) is a popular method for estimating the number of sources impinging on an array of sensors, which is a problem of great interest in several applications. The performance of AIC degrades under low Signal-to-Noise Ratio (SNR). This paper is concerned with the development and application of quadrature mirror filters (QMF) for improving the performance of AIC. A new system is proposed to estimate the number of sources by applying AIC to the outputs of a filter bank consisting of quadrature mirror filters. The proposed system can estimate the number of sources under low SNR.
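
The abstract does not state the AIC expression it uses. As a rough illustration of AIC-based source counting, the sketch below follows the widely used eigenvalue form, in which the criterion for a candidate source count k compares the geometric and arithmetic means of the smallest eigenvalues of the array's sample covariance matrix and adds the penalty 2k(2M - k) for M sensors. The function name `aic_source_count` and its inputs are assumptions for illustration, and the QMF filter-bank denoising stage proposed in the paper is not shown.

```python
import math


def aic_source_count(eigenvalues, num_snapshots):
    """Estimate the number of sources from covariance eigenvalues via AIC.

    eigenvalues: positive eigenvalues of the sample covariance matrix
                 (one per sensor); assumed to be computed elsewhere.
    num_snapshots: number of array snapshots used to form the covariance.
    Returns the candidate count k in 0..M-1 that minimizes the criterion.
    """
    lam = sorted(eigenvalues, reverse=True)
    m = len(lam)
    best_k, best_score = 0, math.inf
    for k in range(m):
        tail = lam[k:]              # the m - k smallest eigenvalues
        n_tail = m - k
        # Log of (geometric mean / arithmetic mean) of the tail; always <= 0.
        log_geo = sum(math.log(v) for v in tail) / n_tail
        log_arith = math.log(sum(tail) / n_tail)
        log_likelihood = num_snapshots * n_tail * (log_geo - log_arith)
        score = -2.0 * log_likelihood + 2.0 * k * (2 * m - k)
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```

When the noise eigenvalues are nearly equal, the ratio of means is close to one and the penalty term dominates, so the minimum falls at the number of clearly larger eigenvalues. At low SNR the signal eigenvalues sink toward the noise floor, which is the degradation the abstract refers to.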

Publication Date

Fri Oct 02 2015

Journal Name

American Journal Of Applied Sciences

Advances in Document Clustering with Evolutionary-Based Algorithms

Text Document Clustering

Hypertext Clustering

Evolutionary Algorithms

Genetic Algorithms

Text Dimensional Reduction

Sarmad

Document clustering is the process of organizing a particular electronic corpus of documents into subgroups of similar text features. Formerly, a number of conventional algorithms had been applied to perform document clustering. There are current endeavors to enhance clustering performance by employing evolutionary algorithms. Thus, such endeavors have become an emerging topic gaining more attention in recent years. The aim of this paper is to present an up-to-date and self-contained review fully devoted to document clustering via evolutionary algorithms. It first provides a comprehensive inspection of the document clustering model, revealing its various components with its related concepts. Then it shows and analyzes the principal research wor
