The need for an efficient method to find the most relevant document for a given search query has become crucial due to the exponential growth in the number of documents readily available on the web. The vector space model (VSM), a widely used model in information retrieval, represents documents as vectors in space and assigns term weights via a popular weighting scheme known as term frequency-inverse document frequency (TF-IDF). In this work, a method is proposed to retrieve the most relevant documents by representing documents and queries as vectors of average term frequency-inverse sentence frequency (TF-ISF) weights instead of vectors of TF-IDF weights; two simple and effective similarity measures, Cosine and Jaccard, were used. Using the MS MARCO dataset (Microsoft-curated data of Bing queries), this article analyzes and assesses the retrieval effectiveness of the TF-ISF weighting scheme. The results show that the TF-ISF model with the Cosine similarity measure retrieves more relevant documents, and that it performs significantly better than the conventional TF-IDF technique on the MS MARCO data.
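The idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the paper's exact formulation: it assumes ISF(t) = log(N / n_t + 1), where N is the number of sentences in a document and n_t the number of sentences containing term t, and it averages per-sentence TF * ISF weights into one document vector.

```python
import math
from collections import Counter

def tf_isf(doc_sentences):
    """TF-ISF weights for one document (each sentence is a token list).

    Assumed formulation: weight(t) = mean over sentences of
    (term frequency in sentence) * log(N / n_t + 1), where n_t is the
    number of sentences containing t. The paper may differ in detail.
    """
    n = len(doc_sentences)
    sent_freq = Counter()
    for sent in doc_sentences:
        sent_freq.update(set(sent))        # count each term once per sentence
    weights = Counter()
    for sent in doc_sentences:
        tf = Counter(sent)
        for term, f in tf.items():
            weights[term] += (f / len(sent)) * math.log(n / sent_freq[term] + 1.0)
    return {t: w / n for t, w in weights.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Retrieval then reduces to building a vector for the query the same way (treating it as a single sentence) and ranking documents by `cosine(query_vec, doc_vec)`.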
Diabetes is an increasingly common chronic disease, affecting millions of people around the world. Timely diagnosis, prediction, treatment, and management of diabetes are essential. Machine learning-based prediction techniques for diabetes data analysis can help in the early detection and prediction of the disease and its consequences, such as hypo/hyperglycemia. In this paper, we explored a diabetes dataset collected from the medical records of one thousand Iraqi patients. We applied three classifiers: the multilayer perceptron, k-nearest neighbors (KNN), and random forest. We conducted two experiments: the first used all 12 features of the dataset, where the random forest outperformed the others with 98.8% accuracy; the second used only five attributes.
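To make the classification setup concrete, here is a minimal sketch of one of the three classifiers, KNN, in plain Python. The feature values and labels below are hypothetical stand-ins, not the Iraqi patient dataset, and real pipelines would add feature scaling and cross-validation.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points
    (Euclidean distance). A minimal sketch of the KNN classifier; the
    actual study also evaluated a multilayer perceptron and random forest.
    """
    dists = sorted((math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (glucose, BMI) records: 0 = non-diabetic, 1 = diabetic
X = [(85, 22), (90, 24), (180, 33), (170, 31)]
y = [0, 0, 1, 1]
```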
Shadow detection and removal is an important task when dealing with color outdoor images. Shadows are generated by a local and relative absence of light. Shadows are, first of all, a local decrease in the amount of light that reaches a surface; secondly, they are a local change in the amount of light reflected by a surface toward the observer. Most shadow detection and segmentation methods are based on image analysis. However, some factors will affect the detection result due to the complexity of the circumstances. In this paper, a segmentation-based test method is presented to detect shadows in an image, and a function concept is used to remove the shadow from the image.
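As shadows are primarily a local decrease in intensity, the simplest detection scheme is an intensity threshold followed by a corrective brightening. The sketch below is a deliberately crude illustration of that two-step idea, not the paper's actual segmentation test or removal function; real methods also exploit color and neighborhood cues.

```python
def detect_shadow(gray, thresh=0.4):
    """Mark pixels whose intensity (in [0, 1]) falls below `thresh` as
    shadow (1) and the rest as non-shadow (0). Threshold value is an
    illustrative assumption."""
    return [[1 if p < thresh else 0 for p in row] for row in gray]

def remove_shadow(gray, mask, gain=2.0):
    """Brighten shadow pixels by a constant gain, clipped to 1.0 --
    a crude stand-in for a proper shadow-removal function."""
    return [[min(1.0, p * gain) if m else p
             for p, m in zip(row, mrow)]
            for row, mrow in zip(gray, mask)]
```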
The penalized least squares method is a popular approach for dealing with high-dimensional data, where the number of explanatory variables is larger than the sample size. Its appealing properties are high prediction accuracy and the ability to perform estimation and variable selection at once. The penalized least squares method gives a sparse model, that is, a model with few variables, which can be interpreted easily. However, penalized least squares is not robust: it is very sensitive to the presence of outlying observations. To deal with this problem, a robust loss function can be used to obtain a robust penalized least squares method and a robust penalized estimator.
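The two ingredients above, sparsity from the penalty and robustness from the loss, can each be illustrated in a couple of lines. Assuming an L1 (lasso) penalty and an orthonormal design, the penalized estimate of each coefficient is a soft-thresholding of its least-squares estimate, which is exactly what sets small coefficients to zero; the Huber function is one common choice of robust loss. These are standard textbook forms, sketched here as illustrations rather than the paper's specific estimator.

```python
def soft_threshold(b, lam):
    """Lasso solution for one coefficient under an orthonormal design:
    shrink the least-squares estimate b toward zero by lam, setting
    coefficients with |b| <= lam exactly to zero (the source of sparsity)."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

def huber_loss(r, delta=1.345):
    """Huber loss: quadratic for small residuals, linear for large ones,
    so outlying observations have bounded influence on the fit."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)
```

Combining the two, minimizing the sum of Huber losses plus an L1 penalty, gives a robust penalized estimator of the kind the passage describes.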
Students’ feedback is crucial for educational institutions to assess the performance of their teachers. Most opinions are expressed in the students' native language, especially in South Asian regions. In Pakistan, people use Roman Urdu to express their reviews, and this extends to the education domain, where students use Roman Urdu to express their feedback. Handling qualitative opinions manually is a very time-consuming and labor-intensive process. Additionally, it can be difficult to determine sentence semantics in text written in a colloquial style like Roman Urdu. This study proposes an enhanced word embedding technique and investigates neural word embeddings (Word2Vec and GloVe) to determine which performs better.
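A common way such embeddings feed a sentiment classifier is to average the word vectors of a review into one fixed-length feature vector and compare vectors by cosine similarity. The sketch below illustrates that step with tiny hand-made 3-d vectors standing in for trained Word2Vec/GloVe embeddings; the words (acha "good", behtreen "excellent", bura "bad", ustad "teacher") are illustrative Roman Urdu tokens and the vector values are invented.

```python
import math

# Toy 3-d embeddings standing in for trained Word2Vec/GloVe vectors
# (illustrative values only; real embeddings have hundreds of dimensions).
EMB = {
    "acha":     [0.9, 0.1, 0.0],   # Roman Urdu: "good"
    "behtreen": [0.8, 0.2, 0.1],   # "excellent"
    "bura":     [-0.9, 0.1, 0.0],  # "bad"
    "ustad":    [0.0, 0.9, 0.2],   # "teacher"
}

def sentence_vector(tokens):
    """Average the vectors of known tokens: one fixed-length feature
    vector per review, ready for a downstream classifier."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(3)] if vecs else [0.0] * 3

def cos(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

With sensible embeddings, two positive reviews about a teacher end up closer to each other than a positive review is to a negative one.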
The research involves preparing gold nanoparticles (AuNPs) and studying the factors that influence the shape, size, and distribution ratio of the particles prepared according to the Turkevich method. These factors include reaction temperature, initial heating, concentration of gold ions, concentration and quantity of added citrate, reaction time, and order of reactant addition. The prepared gold nanoparticles were characterized by the following measurements: UV-Visible spectroscopy, X-ray diffraction, and scanning electron microscopy. The gold nanoparticles formed had an average size in the range of 20-35 nm. The amount of added citrate was varied and studied. In addition, the concentration of added gold ions was varied and the calibration curve was constructed.
The median filter is adopted to match the noise statistics of the degradation, seeking good-quality smoothed images. Two methods are suggested in this paper (a Pentagonal-Hexagonal mask and a Scan Window mask); the study involves a modified median filter for improving noise suppression, with the modification aimed at more reliable results. The modified median filter with the Pentagonal-Hexagonal mask was found to give better results, both qualitatively and quantitatively, than classical median filters and the other suggested method (the Scan Window mask), but at the cost of the time required. However, when the noise is of line type, the cross 3x3 filter is sometimes preferred to the Pentagonal-Hexagonal mask, with little variation between them. The Scan Window mask gave better results in terms of the time required.
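For reference, here is the classical baseline these masks modify: a 3x3 median filter that replaces each interior pixel with the median of its square neighborhood. This sketch is the standard filter only; the paper's Pentagonal-Hexagonal and Scan Window masks use differently shaped neighborhoods.

```python
def median_filter_3x3(img):
    """Classical 3x3 median filter on a 2-D list of pixel values:
    each interior pixel becomes the median of its 3x3 neighborhood,
    which suppresses impulse (salt-and-pepper) noise while preserving
    edges better than mean filtering. Borders are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = sorted(img[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = win[4]  # middle of the 9 sorted values
    return out
```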