Big data analysis has important applications in many areas, such as sensor networks and connected healthcare. The high volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provide a manageable data structure that holds a scalable summarization for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain the summarization of big data, and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms, such as decision trees and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide the optimal split value.
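For orientation, below is a minimal sketch of classic entropy-based discretization, which finds the split value minimizing the weighted entropy of the two resulting partitions. It illustrates the technique named in the abstract, not the authors' multi-resolution summarization structure; all names and the sample data are illustrative.

import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_entropy_split(values, labels):
    """Return the split value minimizing the weighted entropy of the two partitions."""
    pairs = sorted(zip(values, labels))
    best_split, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_score = score
            best_split = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_split

print(best_entropy_split([1.0, 2.0, 3.0, 8.0, 9.0], ["a", "a", "a", "b", "b"]))  # 5.5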
One wide-ranging category of open source data is that offered by geospatial information web sites. Despite the advantages of such open source data, including ease of access and freedom from cost, its quality is a potential issue. This article tests the horizontal positional accuracy and possible integration of four web-derived geospatial datasets: OpenStreetMap (OSM), Google Maps, Google Earth, and Wikimapia. The evaluation was achieved by comparing the tested information with reference field survey data for fifty road intersections in Baghdad, Iraq. The results indicate that free geospatial data can be used to enhance authoritative maps, especially small-scale maps.
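A common way to quantify horizontal positional accuracy against surveyed reference points is the root-mean-square error of the planar offsets. The sketch below shows that computation under assumed metric coordinates; the function name and sample values are hypothetical, not from the study.

import math

def horizontal_rmse(web_points, reference_points):
    """RMSE of horizontal offsets between web-derived and field-surveyed
    coordinates, both given as (easting, northing) in metres."""
    assert len(web_points) == len(reference_points)
    sq = [(wx - rx) ** 2 + (wy - ry) ** 2
          for (wx, wy), (rx, ry) in zip(web_points, reference_points)]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical sample: three intersections, metric map coordinates
web = [(4425.1, 3691.8), (4510.4, 3720.0), (4603.9, 3755.2)]
ref = [(4423.8, 3690.5), (4512.0, 3719.1), (4602.7, 3756.0)]
print(f"Horizontal RMSE: {horizontal_rmse(web, ref):.2f} m")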
Among metaheuristic algorithms, population-based algorithms are explorative search algorithms, superior to local search algorithms in exploring the search space for globally optimal solutions. However, the primary downside of such algorithms is their low exploitative capability, which prevents the expansion of the search-space neighborhood toward more optimal solutions. The firefly algorithm (FA) is a population-based algorithm that has been widely used in clustering problems. However, FA suffers from premature convergence when no neighborhood search strategies are employed to improve the quality of clustering solutions in the neighborhood region and to explore the global regions of the search space. On the …
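For context, the standard FA movement rule (Yang's formulation) attracts a firefly toward any brighter one, with attractiveness decaying with distance. The sketch below shows one such step; parameter values are illustrative defaults, and this is the generic rule rather than the paper's clustering variant.

import math
import random

def firefly_step(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2):
    """Move firefly i toward a brighter firefly j (standard FA update):
    x_i <- x_i + beta0 * exp(-gamma * r^2) * (x_j - x_i) + alpha * eps."""
    r2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))   # squared distance r^2
    beta = beta0 * math.exp(-gamma * r2)               # attractiveness decays with distance
    return [a + beta * (b - a) + alpha * (random.random() - 0.5)
            for a, b in zip(x_i, x_j)]

# One illustrative step in 2-D
print(firefly_step([0.0, 0.0], [1.0, 1.0]))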
This research studies fuzzy sets, one of the most modern concepts applied in various practical and theoretical fields of life. It addresses the fuzzy random variable, whose values are not real numbers but fuzzy numbers, since it expresses vague or uncertain phenomena whose measurements are not definite. Fuzzy data were presented for a two-way analysis of variance of fuzzy random variables; this method depends on a number of assumptions, and when these assumptions are not satisfied, its use is precluded.
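A fuzzy number is commonly represented in triangular form (l, m, u), with membership peaking at m. The sketch below is a minimal illustration of that standard representation, assumed for exposition only; the abstract does not specify which form the study uses.

from dataclasses import dataclass

@dataclass
class TriangularFuzzyNumber:
    """A fuzzy number (l, m, u): membership rises linearly from l to a peak
    of 1 at m, then falls linearly to 0 at u."""
    l: float
    m: float
    u: float

    def membership(self, x: float) -> float:
        if self.l < x <= self.m:
            return (x - self.l) / (self.m - self.l)
        if self.m < x < self.u:
            return (self.u - x) / (self.u - self.m)
        return 1.0 if x == self.m else 0.0

# "Approximately 5" as a fuzzy observation
approx5 = TriangularFuzzyNumber(4.0, 5.0, 6.0)
print(approx5.membership(4.5))  # 0.5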
The research compared two methods for estimating the four parameters of the compound exponential Weibull-Poisson distribution: the maximum likelihood method and the Downhill Simplex algorithm. Two data cases were considered: the first assumed the original (uncontaminated) data, while the second assumed data contamination. Simulation experiments were conducted for different sample sizes, initial parameter values, and levels of contamination. The Downhill Simplex algorithm was found to be the best method for estimating the parameters, the probability function, and the reliability function of the compound distribution for both natural and contaminated data.
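To show the mechanics of derivative-free likelihood maximization, the sketch below fits a distribution by Downhill Simplex (Nelder-Mead, as implemented in scipy.optimize). Since the four-parameter compound pdf is not reproduced here, a standard two-parameter Weibull stands in; sample sizes and starting values are illustrative.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import weibull_min

def neg_log_likelihood(params, data):
    """Negative log-likelihood; a two-parameter Weibull stands in for the
    four-parameter compound pdf, which is not reproduced here."""
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return np.inf  # keep the simplex inside the valid region
    return -np.sum(weibull_min.logpdf(data, c=shape, scale=scale))

rng = np.random.default_rng(0)
data = weibull_min.rvs(c=1.5, scale=2.0, size=500, random_state=rng)

# Downhill Simplex (Nelder-Mead) needs no derivatives of the likelihood
result = minimize(neg_log_likelihood, x0=[1.0, 1.0], args=(data,),
                  method="Nelder-Mead")
print(result.x)  # estimates close to (1.5, 2.0)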
Data scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance, yet many applications have only small or inadequate datasets for training DL frameworks. Manual labeling is usually needed to provide labeled data, which typically involves human annotators with broad background knowledge. This annotation process is costly, time-consuming, and error-prone. Every DL framework must usually be fed a significant amount of labeled data to learn representations automatically; a larger amount of data generally yields a better DL model, although performance is also application dependent. This issue is the main barrier for …
Frictional heat is generated when the clutch starts to engage. As a result, the surface temperature increases rapidly due to the difference in speed between the driving and driven parts. The influence of the thickness of the frictional facing on the distribution of contact pressure in multi-disc clutches has been investigated using a numerical approach (the finite element method). The contact problem was analyzed for a multiple-disc dry clutch (piston, clutch discs, separators, and pressure plate). The results present the distribution of the contact pressure on all the friction-disc surfaces in the friction clutch system. Axisymmetric finite element models have been developed to …
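For reference, clutch thermal analyses commonly relate the generated heat flux to the contact pressure and sliding speed via the standard relation below; the symbols are generic assumptions and not necessarily this study's exact formulation.

% Standard heat-flux relation for a slipping friction pair
\[
  q(r,t) \;=\; \mu \, p(r,t) \, \omega_{\mathrm{rel}}(t) \, r ,
  \qquad
  \omega_{\mathrm{rel}}(t) = \omega_{\mathrm{driving}}(t) - \omega_{\mathrm{driven}}(t)
\]
% q         : frictional heat flux at radius r (W/m^2)
% mu        : coefficient of friction
% p         : contact pressure (the quantity mapped by the FE contact analysis)
% omega_rel : relative sliding speed between driving and driven parts (rad/s)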
Artificial roughness on the absorber plate of a solar air heater (SAH) is a popular technique for increasing its effective efficiency. This study investigated the effect of the geometrical parameters of discrete multi-arc ribs (DMAR), installed below the SAH absorber plate, on the effective efficiency. The effects of the major roughness factors, namely the number of gaps (Ng = 1-4), rib pitch (p/e = 4-16), rib height (e/D = 0.018-0.045), gap width (wg/e = 0.5-2), angle of attack (α = 30°-75°), and Reynolds number (Re = 2000-20000), on the performance of the SAH are studied. The performance of the SAH is evaluated using a top-down iterative technique. The results show that as Re rises, the effective efficiency of the SAH with DMAR first ascends to a specified value of …
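Effective efficiency in roughened-SAH studies is commonly defined by discounting the pumping power spent against rib-induced friction, as below (after Cortes and Piacentini); the exact form used in this work may differ.

% Effective efficiency as commonly defined in roughened-SAH studies
\[
  \eta_{\mathrm{eff}} \;=\; \frac{Q_u \;-\; P_m / C}{I \, A_p}
\]
% Q_u : useful heat gain of the air (W)
% P_m : mechanical pumping power spent overcoming friction losses (W)
% C   : conversion factor from mechanical to equivalent thermal energy (~0.2)
% I   : solar irradiance on the plate (W/m^2);  A_p : absorber-plate area (m^2)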