Bayes Classification and Entropy Discretization of Large Datasets using Multi-Resolution Data Aggregation

Safaa Alwajidi; Li Yang

doi:10.25046/aj050557

Details

Publication Date

Wed Jan 01 2020

Journal Name

Advances in Science, Technology and Engineering Systems Journal

Volume

5

Issue Number

5




Big data analysis has important applications in many areas such as sensor networks and connected healthcare. The high volume and velocity of big data bring many challenges to data analysis. One possible solution is to summarize the data and provide a manageable data structure that holds a scalable summarization of the data for efficient and effective analysis. This research extends our previous work on developing an effective technique to create, organize, access, and maintain summarizations of big data, and develops algorithms for Bayes classification and entropy discretization of large data sets using the multi-resolution data summarization structure. Bayes classification and data discretization play essential roles in many learning algorithms such as decision trees and nearest neighbor search. The proposed method can handle streaming data efficiently and, for entropy discretization, provide the optimal split value.
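As a rough illustration of entropy discretization over summarized data, the sketch below chooses the split boundary that minimizes the weighted class entropy using only per-bin class counts. The bin layout, the function names `entropy` and `best_split`, and the sample counts are assumptions made for illustration; the abstract does not describe the paper's multi-resolution structure or algorithms, which may differ.

```python
import math

def entropy(counts):
    """Shannon entropy in bits of a vector of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def best_split(bins):
    """Find the bin boundary with the lowest weighted class entropy.

    bins: list of (upper_edge, [count_class0, count_class1, ...]),
          sorted by upper_edge (a hypothetical aggregated summary).
    Returns (split_value, weighted_entropy).
    """
    n_classes = len(bins[0][1])
    total = [sum(b[1][k] for b in bins) for k in range(n_classes)]
    n = sum(total)
    left = [0] * n_classes
    best = (None, float("inf"))
    # Only bin edges are candidate splits: raw values are not stored.
    for edge, counts in bins[:-1]:
        left = [l + c for l, c in zip(left, counts)]
        right = [t - l for t, l in zip(total, left)]
        n_left, n_right = sum(left), sum(right)
        h = (n_left / n) * entropy(left) + (n_right / n) * entropy(right)
        if h < best[1]:
            best = (edge, h)
    return best

# Four aggregated bins with counts for two classes (made-up numbers).
bins = [(1.0, [5, 0]), (2.0, [4, 1]), (3.0, [0, 6]), (4.0, [0, 4])]
split, h = best_split(bins)
print(split, round(h, 4))
```

Because the running totals are updated incrementally, one pass over the bins is enough, and new streaming records only need to increment the counts of the bin they fall into.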


Publication Date

Fri Jan 31 2020

Journal Name

Iraqi Geological Journal

ESTIMATION OF SHEAR WAVE VELOCITY FROM WIRELINE LOGS DATA FOR AMARA OILFIELD, MISHRIF FORMATION, SOUTHERN IRAQ

Petrophysics

shear wave velocity

Multiple regressions

Acoustic properties

Compressional velocity

Rwaida K.


Shear wave velocity is an important feature in seismic exploration that can be used in reservoir development strategy and characterization. Its applications in petrophysics, seismic analysis, and geomechanics for predicting elastic and inelastic rock properties are essential elements of good stability and fracturing orientation, and of identifying matrix minerals and gas-bearing formations. However, shear wave velocity is usually obtained from core analysis, which is an expensive and time-consuming process, and the dipole sonic imager tool is not commonly available in all wells. In this study, a statistical method is presented to predict shear wave velocity from wireline log data. The model concentrated to predict shear wave velocity fr


Publication Date

Thu Jan 01 2015

Journal Name

Journal of Engineering

A Visual Interface Design for Evaluating the Quality of Google Map Data for some Engineering Applications

data quality

positional accuracy

VGI

formal data

Google Map Maker.

Mouayed Y.

Maythm

Luma Layth


Today, large amounts of geospatial data are available on the web from services such as Google Map (GM), OpenStreetMap (OSM), Flickr, Wikimapia and others. All of these services are called open source geospatial data. Geospatial data from different sources often have variable accuracy because of different data collection methods; therefore, data accuracy may not meet user requirements in various organizations. This paper aims to develop a tool to assess the quality of GM data by comparing it with formal data such as spatial data from the Mayoralty of Baghdad (MB). The tool was developed in the Visual Basic language and validated on two study areas in Baghdad, Iraq (Al-Karada and Al-Kadhumiyah). The positional accuracy was asses


Publication Date

Sat Feb 01 2020

Journal Name

IOP Conference Series: Materials Science and Engineering

Revealing the potentials of 3D modelling techniques; a comparison study towards data fusion from hybrid sensors

Abbas S.

Fanar M.


The vast advantages of the 3D modelling industry have urged competitors to improve capturing techniques and processing pipelines towards minimizing labour requirements, saving time and reducing project risk. When it comes to digital 3D documentation and conservation projects, laser scanning and photogrammetry are compared to choose between the two. Since both techniques have pros and cons, this paper examines the potential issues of the individual techniques in terms of time, budget, accuracy, density, methodology and ease of use. A terrestrial laser scanner and close-range photogrammetry are tested to document a unique invaluable artefact (Lady of Hatra) located in Iraq for future data fusion sc


Publication Date

Mon May 07 2018

Journal Name

Human and Ecological Risk Assessment: An International Journal

Multi-biomarker responses after exposure to organophosphates chlorpyrifos in the freshwater mussels Unio tigridis and snails Viviparous benglensis

Ali Abdulhamza

Adel M.

Ayad M. J.


Publication Date

Thu Nov 01 2018

Journal Name

International Journal of Biomathematics

A non-conventional hybrid numerical approach with multi-dimensional random sampling for cocaine abuse in Spain

Maha


This paper introduces a non-conventional approach with multi-dimensional random sampling to solve a cocaine abuse model with statistical probability. The mean Latin hypercube finite difference (MLHFD) method is proposed for the first time via hybrid integration of the classical numerical finite difference (FD) formula with the Latin hypercube sampling (LHS) technique to create a random distribution for the model parameters, which are dependent on time [Formula: see text]. The LHS technique gives the MLHFD method the advantage of producing fast variation of the parameters' values via a number of multidimensional simulations (100, 1000 and 5000). The generated Latin hypercube sample which is random or non-deterministic in nature is further integ
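The general idea of combining Latin hypercube sampling with a finite difference scheme and averaging the results can be sketched as follows. This is only a minimal illustration under stated assumptions: the cocaine abuse model is not given in this abstract, so a toy decay equation dy/dt = -k*y stands in for it, and the names `latin_hypercube` and `euler_decay`, the parameter ranges, and the step sizes are all hypothetical.

```python
import random

def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points with exactly one point per stratum in each dimension.

    bounds: list of (low, high) pairs, one per dimension.
    """
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n_samples equal-width strata.
        col = [low + (high - low) * (i + rng.random()) / n_samples
               for i in range(n_samples)]
        rng.shuffle(col)  # pair strata across dimensions at random
        columns.append(col)
    return list(zip(*columns))

def euler_decay(k, y0, dt, steps):
    """Forward finite difference for the toy model dy/dt = -k * y."""
    y = y0
    for _ in range(steps):
        y = y + dt * (-k * y)
    return y

rng = random.Random(0)
# Two uncertain inputs (assumed ranges): rate k and initial value y0.
samples = latin_hypercube(100, [(0.1, 0.5), (0.9, 1.1)], rng)
# Mean of the finite difference solutions at t = 1 over all samples.
mean_final = sum(euler_decay(k, y0, 0.01, 100) for k, y0 in samples) / len(samples)
print(mean_final)
```

Stratifying each parameter range guarantees that even a small sample covers the whole range, which is the usual motivation for preferring LHS over plain random sampling.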


Publication Date

Wed Jan 01 2020

Journal Name

IEEE Access

Smart Routing Management Framework Exploiting Dynamic Data Resources of Cross-Layer Design and Machine Learning Approaches for Mobile Cognitive Radio Networks: A Survey

Qusay


Publication Date

Sun Jun 05 2016

Journal Name

Baghdad Science Journal

Developing an Immune Negative Selection Algorithm for Intrusion Detection in NSL-KDD data Set

NSL-KDD

Self-NonSelf Theory

RNS

RRNS

Mafaz Mohsin Khalil

Alaa’ Hazim Jar


With the development of communication technologies for mobile devices and electronic communications, and the move toward e-government, e-commerce and e-banking, it has become necessary to protect these activities from intrusion and misuse, so it is important to design powerful and efficient systems for this purpose. In this paper, several variants of the immune negative selection algorithm are used for misuse-type network intrusion detection: negative selection with real values, negative selection with fixed-radius detectors, and negative selection with variable-sized detectors, where the algorithm generates a set of detectors to distinguish the self-samples. Practica
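A real-valued negative selection scheme with fixed-radius detectors can be sketched roughly as below. This is a generic textbook-style illustration, not the authors' implementation: the function names `train_detectors` and `is_anomalous`, the two-dimensional unit-square feature space, the radius, and the synthetic self samples are all assumptions, and no NSL-KDD data is used.

```python
import math
import random

def train_detectors(self_samples, n_detectors, radius, rng, dim=2, max_tries=10000):
    """Generate random detectors and keep those that match no self sample."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        # Negative selection: discard any candidate that covers a self sample.
        if all(math.dist(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors

def is_anomalous(x, detectors, radius):
    """A sample is flagged as non-self if it falls within any detector's radius."""
    return any(math.dist(x, d) <= radius for d in detectors)

rng = random.Random(42)
# Synthetic "self" (normal) samples clustered in one corner of the unit square.
self_samples = [(0.2 + 0.05 * rng.random(), 0.2 + 0.05 * rng.random())
                for _ in range(50)]
detectors = train_detectors(self_samples, 200, 0.1, rng)
flagged_self = [s for s in self_samples if is_anomalous(s, detectors, 0.1)]
print(len(detectors), len(flagged_self))
```

By construction no training self sample can be flagged, since every kept detector lies farther than the radius from all of them; how well unseen non-self samples are covered depends on the number of detectors and the radius.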


Publication Date

Sun Jan 01 2017

Journal Name

Statistical Applications in Genetics and Molecular Biology

Mixture model-based association analysis with case-control data in genome wide association studies

genome wide association studies

haplotype mixture model

odds ratios

testing for inheritance patterns

ALI

Jian


Multilocus haplotype analysis of candidate variants with genome wide association studies (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet the challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated d


Publication Date

Sat Dec 01 2007

Journal Name

Journal of Economics and Administrative Sciences

The Role of Data Mining in Increasing Organizational Performance (An Analytical Study in the Industrial Bank)

زكريا مطلك

داليا


Preface

Financial and banking organizations deal mostly and directly with customers, which requires them to collect enormous quantities of data about those customers. Together with the data that reaches them every day, this leaves them facing large piles of data that require tremendous effort to handle well and to use in ways that serve the organization.

Handling such data manually, without using modern techniques, distances the organization from


Publication Date

Sat Feb 01 2025

Journal Name

Journal Of Energy Storage

Massive energy reduction and storage capacity relative to PCM physical size by integrating deep RL clustering and multi-stage strategies into smart buildings to grid reliability

Jasim M.
