Everybody is connected with social media like (Facebook, Twitter, LinkedIn, Instagram…etc.) that generate a large quantity of data and which traditional applications are inadequate to process. Social media are regarded as an important platform for sharing information, opinion, and knowledge of many subscribers. These basic media attribute Big data also to many issues, such as data collection, storage, moving, updating, reviewing, posting, scanning, visualization, Data protection, etc. To deal with all these problems, this is a need for an adequate system that not just prepares the details, but also provides meaningful analysis to take advantage of the difficult situations, relevant to business, proper decision, Health, social media, science, telecommunications, the environment, etc. Authors notice through reading of previous studies that there are different analyzes through HADOOP and its various tools such as the sentiment in real-time and others. However, dealing with this Big data is a challenging task. Therefore, such type of analysis is more efficiently possible only through the Hadoop Ecosystem. The purpose of this paper is to analyze literature related analysis of big data of social media using the Hadoop framework for knowing almost analysis tools existing in the world under the Hadoop umbrella and its orientations in addition to difficulties and modern methods of them to overcome challenges of big data in offline and real –time processing. Real-time Analytics accelerates decision-making along with providing access to business metrics and reporting. Comparison between Hadoop and spark has been also illustrated.
In this study, we made a comparison between LASSO & SCAD methods, which are two special methods for dealing with models in partial quantile regression. (Nadaraya & Watson Kernel) was used to estimate the non-parametric part ;in addition, the rule of thumb method was used to estimate the smoothing bandwidth (h). Penalty methods proved to be efficient in estimating the regression coefficients, but the SCAD method according to the mean squared error criterion (MSE) was the best after estimating the missing data using the mean imputation method
Abstract
The research Compared two methods for estimating fourparametersof the compound exponential Weibull - Poisson distribution which are the maximum likelihood method and the Downhill Simplex algorithm. Depending on two data cases, the first one assumed the original data (Non-polluting), while the second one assumeddata contamination. Simulation experimentswere conducted for different sample sizes and initial values of parameters and under different levels of contamination. Downhill Simplex algorithm was found to be the best method for in the estimation of the parameters, the probability function and the reliability function of the compound distribution in cases of natural and contaminateddata.
... Show More
Today, the science of artificial intelligence has become one of the most important sciences in creating intelligent computer programs that simulate the human mind. The goal of artificial intelligence in the medical field is to assist doctors and health care workers in diagnosing diseases and clinical treatment, reducing the rate of medical error, and saving lives of citizens. The main and widely used technologies are expert systems, machine learning and big data. In the article, a brief overview of the three mentioned techniques will be provided to make it easier for readers to understand these techniques and their importance.
Measuring the efficiency of postgraduate and undergraduate programs is one of the essential elements in educational process. In this study, colleges of Baghdad University and data for the academic year (2011-2012) have been chosen to measure the relative efficiencies of postgraduate and undergraduate programs in terms of their inputs and outputs. A relevant method to conduct the analysis of this data is Data Envelopment Analysis (DEA). The effect of academic staff to the number of enrolled and alumni students to the postgraduate and undergraduate programs are the main focus of the study.
ان وضع معايير دولية محاسبية على شكل نماذج وارشادات عامة تؤدي باصحاب القرارات الاقتصادية استخدام معايير المحاسبة الدولية عند اعداد وتجهيز القوائم والبيانات المالية اصبح مطلب اساسي وضرورة ملحة لمختلف الاطراف في المجتمع الحالي فهذه المعايير قد اثمرت في معالجة الامور المحاسبية على الصعيد المحلي والاقليمي والدولي. وان عدد كبير من الدول اعتمدت هذه المعايير فقد تجاوزت 150 بلدا. مما نتج عنه ازالة الفوارق الكث
... Show MoreThe research deals with an analytical approach between new media and traditional one in the light of the changes imposed by technology, which has been able to change a number of common concepts in the field of communication and media. The researcher tries to find an analytical explanation of the relationship between technology by being an influential factor in building the information society, which is the basis of new media, and the technical output that influenced the forms of social relations and linguistic construction as a human communication tool. The research deals with an analytical approach between new media and traditional one in the light of the changes imposed by technology, which has been able to change a number of comm
... Show MoreThis paper deals to how to estimate points non measured spatial data when the number of its terms (sample spatial) a few, that are not preferred for the estimation process, because we also know that whenever if the data is large, the estimation results of the points non measured to be better and thus the variance estimate less, so the idea of this paper is how to take advantage of the data other secondary (auxiliary), which have a strong correlation with the primary data (basic) to be estimated single points of non-measured, as well as measuring the variance estimate, has been the use of technique Co-kriging in this field to build predictions spatial estimation process, and then we applied this idea to real data in th
... Show MoreBig data usually running in large-scale and centralized key management systems. However, the centralized key management systems are increasing the problems such as single point of failure, exchanging a secret key over insecure channels, third-party query, and key escrow problem. To avoid these problems, we propose an improved certificate-based encryption scheme that ensures data confidentiality by combining symmetric and asymmetric cryptography schemes. The combination can be implemented by using the Advanced Encryption Standard (AES) and Elliptic Curve Diffie-Hellman (ECDH). The proposed scheme is an enhanced version of the Certificate-Based Encryption (CBE) scheme and preserves all its advantages. However
... Show MoreThe current paper proposes a new estimator for the linear regression model parameters under Big Data circumstances. From the diversity of Big Data variables comes many challenges that can be interesting to the researchers who try their best to find new and novel methods to estimate the parameters of linear regression model. Data has been collected by Central Statistical Organization IRAQ, and the child labor in Iraq has been chosen as data. Child labor is the most vital phenomena that both society and education are suffering from and it affects the future of our next generation. Two methods have been selected to estimate the parameter
... Show MoreData scarcity is a major challenge when training deep learning (DL) models. DL demands a large amount of data to achieve exceptional performance. Unfortunately, many applications have small or inadequate data to train DL frameworks. Usually, manual labeling is needed to provide labeled data, which typically involves human annotators with a vast background of knowledge. This annotation process is costly, time-consuming, and error-prone. Usually, every DL framework is fed by a significant amount of labeled data to automatically learn representations. Ultimately, a larger amount of data would generate a better DL model and its performance is also application dependent. This issue is the main barrier for