The influx of data in bioinformatics is primarily in the form of DNA, RNA, and protein sequences. This condition places a significant burden on scientists and computers. Some genomics studies depend on clustering techniques to group similarly expressed genes into one cluster. Clustering is a type of unsupervised learning that can be used to divide unknown cluster data into clusters. The k-means and fuzzy c-means (FCM) algorithms are examples of algorithms that can be used for clustering. Consequently, clustering is a common approach that divides an input space into several homogeneous zones; it can be achieved using a variety of algorithms. This study used three models to cluster a brain tumor dataset. The first model uses FCM, which is used to cluster genes. FCM allows an object to belong to two or more clusters with a membership grade between zero and one and the sum of belonging to all clusters of each gene is equal to one. This paradigm is useful when dealing with microarray data. The total time required to implement the first model is 22.2589 s. The second model combines FCM and particle swarm optimization (PSO) to obtain better results. The hybrid algorithm, i.e., FCM–PSO, uses the DB index as objective function. The experimental results show that the proposed hybrid FCM–PSO method is effective. The total time of implementation of this model is 89.6087 s. The third model combines FCM with a genetic algorithm (GA) to obtain better results. This hybrid algorithm also uses the DB index as objective function. The experimental results show that the proposed hybrid FCM–GA method is effective. Its total time of implementation is 50.8021 s. In addition, this study uses cluster validity indexes to determine the best partitioning for the underlying data. Internal validity indexes include the Jaccard, Davies Bouldin, Dunn, Xie–Beni, and silhouette. Meanwhile, external validity indexes include Minkowski, adjusted Rand, and percentage of correctly categorized pairings. Experiments conducted on brain tumor gene expression data demonstrate that the techniques used in this study outperform traditional models in terms of stability and biological significance.
Abstract
This study aims to identify the extent to which the criteria of the American Council for Teaching Foreign Languages (ACTFL) are included in the English language books for the fifth and sixth graders. To achieve the objective of the study, a content analysis card was prepared, where the classification of language proficiencies was divided into five main levels (beginner, intermediate, advanced, superior, and distinguished) of the four language skills (listening, speaking, reading, and writing), The content analysis card consisted of (89) indicators distributed at the four levels of language skills as follows: Listening (17), speaking (33), reading (15), and writing (26). The study sample consisted of Engl
... Show MoreGivers of foreign Audit about Social Responsibility of Profit Organization. The recent time is charcterstically with big economic Organization activities, because there are many transactions between these Organizations and different financial markets development techniques.
This encourgage business men to increase their efforts for investment in these markets. Because the Accounting is in general terms it represents a language of these Unions Activities and translate them in to fact numbers, for that there is need for Accounting recording for certain of these Organizations behavior and their harmonization with their Objectives.
In this respect the Audit function comes to che
... Show MoreVisual analytics becomes an important approach for discovering patterns in big data. As visualization struggles from high dimensionality of data, issues like concept hierarchy on each dimension add more difficulty and make visualization a prohibitive task. Data cube offers multi-perspective aggregated views of large data sets and has important applications in business and many other areas. It has high dimensionality, concept hierarchy, vast number of cells, and comes with special exploration operations such as roll-up, drill-down, slicing and dicing. All these issues make data cubes very difficult to visually explore. Most existing approaches visualize a data cube in 2D space and require preprocessing steps. In this paper, we propose a visu
... Show MorePortable devices such as smartphones, tablet PCs, and PDAs are a useful combination of hardware and software turned toward the mobile workers. While they present the ability to review documents, communicate via electronic mail, appointments management, meetings, etc. They usually lack a variety of essential security features. To address the security concerns of sensitive data, many individuals and organizations, knowing the associated threats mitigate them through improving authentication of users, encryption of content, protection from malware, firewalls, intrusion prevention, etc. However, no standards have been developed yet to determine whether such mobile data management systems adequately provide the fu
... Show MoreThe using of the parametric models and the subsequent estimation methods require the presence of many of the primary conditions to be met by those models to represent the population under study adequately, these prompting researchers to search for more flexible parametric models and these models were nonparametric, many researchers, are interested in the study of the function of permanence and its estimation methods, one of these non-parametric methods.
For work of purpose statistical inference parameters around the statistical distribution for life times which censored data , on the experimental section of this thesis has been the comparison of non-parametric methods of permanence function, the existence
... Show MoreThis research aims to analyze and simulate biochemical real test data for uncovering the relationships among the tests, and how each of them impacts others. The data were acquired from Iraqi private biochemical laboratory. However, these data have many dimensions with a high rate of null values, and big patient numbers. Then, several experiments have been applied on these data beginning with unsupervised techniques such as hierarchical clustering, and k-means, but the results were not clear. Then the preprocessing step performed, to make the dataset analyzable by supervised techniques such as Linear Discriminant Analysis (LDA), Classification And Regression Tree (CART), Logistic Regression (LR), K-Nearest Neighbor (K-NN), Naïve Bays (NB
... Show MoreAbstract
The traffic jams taking place in the cities of the Republic of Iraq in general and the province of Diwaniyah especially, causes return to the large numbers of the modern vehicles that have been imported in the last ten years and the lack of omission for old vehicles in the province, resulting in the accumulation of a large number of vehicles that exceed the capacity of the city's streets, all these reasons combined led to traffic congestion clear at the time of the beginning of work in the morning, So researchers chose local area network of the main roads of the province of Diwaniyah, which is considered the most important in terms of traffic congestion, it was identified fuzzy numbers for
... Show More