When optimizing the performance of neural network-based chatbots, the choice of optimizer is one of the most important decisions. Optimizers control how model parameters such as weights and biases are adjusted to minimize a loss function during training. Adaptive optimizers such as ADAM have become a standard choice and are widely used because the magnitude of their parameter updates is invariant to rescaling of the gradient, but they often generalize poorly. Alternatively, Stochastic Gradient Descent (SGD) with Momentum and ADAMW, an extension of ADAM, offer several advantages. This study compares the effects of these optimizers on the chatbot CST dataset. The effectiveness of each optimizer is evaluated by its sparse categorical loss during training and its BLEU score at inference, using a neural generative attention-based model with an additive scoring function. Despite memory constraints that limited ADAMW to ten epochs, this optimizer showed promising results compared to configurations using early stopping techniques. SGD provided higher BLEU scores for generalization but was very time-consuming. The results highlight the importance of balancing optimization performance against computational efficiency, positioning ADAMW as a promising alternative when training efficiency and generalization are the primary concerns.
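The difference between the three optimizers named above can be illustrated with their textbook update rules for a single scalar parameter. This is a minimal sketch, not the study's training code: the function names and the hyperparameter defaults (`lr`, `momentum`, `beta1`, `beta2`, `eps`, `weight_decay`) are assumptions chosen for illustration.

```python
import math


def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad   # accumulate past gradients
    w = w - lr * velocity                   # step along the accumulated direction
    return w, velocity


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=False):
    """One Adam update (decoupled=False) or AdamW update (decoupled=True).

    t is the 1-based step counter used for bias correction.
    """
    if weight_decay and not decoupled:
        # Adam with L2 regularization: the penalty is folded into the gradient.
        grad = grad + weight_decay * w
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    w_new = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    if weight_decay and decoupled:
        # AdamW: the decay is applied to the weight directly, outside the
        # adaptive rescaling.
        w_new -= lr * weight_decay * w
    return w_new, m, v


# The first Adam step has nearly the same size for a gradient of 0.5 or 500,
# which is the scale invariance mentioned in the abstract.
small, _, _ = adam_step(1.0, 0.5, 0.0, 0.0, t=1)
large, _, _ = adam_step(1.0, 500.0, 0.0, 0.0, t=1)
print(small, large)
```

In this sketch the only difference between Adam and AdamW is where the weight decay enters: inside the gradient (and therefore rescaled by the second-moment estimate) or as a separate shrinkage of the weight.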
Abstract
The grey system model GM(1,1) is a time-series prediction model and the basis of grey theory. This research presents four methods for estimating the parameters of the GM(1,1) model: the accumulative method (ACC), the exponential method (EXP), the modified exponential method (Mod EXP), and Particle Swarm Optimization (PSO). The methods were compared by simulation, using the mean square error (MSE) and the mean absolute percentage error (MAPE) as the comparison criteria. The best of the four methods was identified and then applied to real data. This data represents the consumption rate of two types of oils a he
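For orientation, the GM(1,1) model and the two error criteria can be sketched as follows. Note that this sketch estimates the parameters with ordinary least squares, the textbook approach, and not with any of the four methods (ACC, EXP, Mod EXP, PSO) compared in the research; all function names and the sample series are hypothetical.

```python
import math


def fit_gm11(x0):
    """Estimate the GM(1,1) parameters (a, b) by ordinary least squares."""
    # Accumulated generating sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values z(k): mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Grey equation x0(k) + a*z(k) = b, i.e. y = -a*z + b (a straight line).
    z_mean = sum(z) / len(z)
    y_mean = sum(y) / len(y)
    szz = sum((zi - z_mean) ** 2 for zi in z)
    szy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = szy / szz
    return -slope, y_mean - slope * z_mean


def predict_gm11(x0, a, b, steps=0):
    """Fitted values for x0, plus `steps` forecasts beyond the sample."""
    n = len(x0) + steps
    c = x0[0] - b / a
    x1_hat = [c * math.exp(-a * k) + b / a for k in range(n)]
    # Inverse accumulation restores the original scale.
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


def mse(actual, fitted):
    return sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual)


def mape(actual, fitted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical series growing 10% per period, used only to exercise the code.
series = [100.0 * 1.1 ** k for k in range(6)]
a, b = fit_gm11(series)
fitted = predict_gm11(series, a, b)
print(a, b, mse(series, fitted), mape(series, fitted))
```

Replacing `fit_gm11` with another estimator while keeping `predict_gm11`, `mse` and `mape` fixed is one way such a method comparison could be organized.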
Electrical Discharge Machining (EDM) is a widespread nontraditional machining (NTM) process for manufacturing parts that have complicated geometry or are made of very hard metals, which are difficult to machine by traditional machining operations. EDM is a material removal (MR) process based on electrical discharge erosion. This paper discusses the optimal EDM parameters for high-speed steel (HSS) AISI M2 as the workpiece, using copper and brass as electrodes. The input parameters used in the experimental work are current (10, 24 and 42 A), pulse-on time (100, 150 and 200 µs), and pulse-off time (4, 12 and 25 µs), which affect the material removal rate (MRR), electrode wear rate (EWR) and wear ratio (WR). A
This research seeks to present a practical and theoretical framework on the topic "The effect of home economics in promoting high performance". The hypothesized study model was tested in the public education sector in Basra Governorate and covers a number of units of the Southern Technical University. A questionnaire and personal interviews were used to collect the study data, and the sample size was 114 employees. A number of statistical methods were used to test the study hypotheses. The results showed that there is a posi
Objective: to prepare educational units based on the magnet poles strategy for learning the spiking skill in volleyball, and to identify the effect of this strategy on learning the spiking skill among female students. Research methodology: An experimental design with two equivalent groups (experimental and control), tight control, and pre- and post-tests was adopted. The research community consists of fourth-grade middle school students at Basra Girls' Middle School (2024-2025), (90) students in total, naturally distributed across 4 sections. Sections (A-B) were selected by lottery, so that Section (A) represents the experimental group and Section (B) represents the control
This research studied the effect of magnetized water on concrete preparation and on the cement content of concrete mixtures, and examined the possibility of reducing the amount of cement needed to prepare one cubic meter of concrete by no more than 10% per mixture. In the experiments, standard cubes were prepared from concrete made with two kinds of water: magnetized water, prepared by passing tap water through systems of different magnetic strengths (6000 and 9000 Gauss), and ordinary water. The water velocity through the magnetic field that gave the highest compressive strength was up to 1 m/sec. To determine the best magnetic intensity, we examined The comp
Many economic entities working in various industrial fields suffer from weak use of modern administrative tools in their work. The most widely used tool for measuring the required procedures is the adoption of quality costs. In spite of the complexity and branching of operations in construction projects, the researcher was able to develop a structure for quality costs according to the traditional classification (prevention, appraisal, failure), which enables these costs to be calculated, the results analyzed, and standards set that can be applied in evaluating the strategic performance of the targeted project. The research also addressed the theoretical side of quality and its related costs in the construction sector, as well as strategic performance a
PVA, starch/PVA, and starch/PVA/sugar samples of different concentrations (10, 20, 30 and 40 % wt/wt) were prepared by the casting method. DSC analysis was carried out; the results showed only one glass transition temperature (Tg) for the samples involved, which suggests that the starch/PVA and starch/PVA/sugar blends are miscible. The miscibility is attributed to the hydrogen bonds between PVA and starch. This is in good agreement with the FTIR results. Tg and Tm decrease with starch and sugar content compared with those of PVA. The systematic decrease in ultimate strength as the starch and sugar ratio increases is attributed to PVA, which has more hydroxyl groups that make its ultimate strength higher than that for
Poly(vinyl alcohol) has been studied for its ability to form crystallites by annealing. Semicrystalline films of poly(vinyl alcohol) (PVA) were prepared by casting 11.5 wt.% and 13 wt.% aqueous PVA solutions onto glass slides, with annealing temperatures of 90-120 °C and durations of 15-60 minutes. This allowed the macromolecules to form crystallites: small regions of folded and compacted chains separated by amorphous regions, where a single PVA chain may pass through several of these crystallites. The degree of crystallinity of the PVA films (hydrogels) was determined by the density method, while the swelling behavior was characterized by determining water uptake, wet degree of crystallinity, gel fraction and solubilit
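The quantities named above are commonly computed from a few mass and density measurements. The following sketch uses the standard two-phase density relation and the usual mass-ratio definitions; the abstract does not state which exact formulas or reference densities the authors used, so the function names, the formulas' applicability to this study, and the numeric inputs are all assumptions.

```python
def crystallinity_from_density(rho, rho_amorphous, rho_crystalline):
    """Mass-fraction crystallinity from the two-phase density model.

    Follows from 1/rho = X/rho_crystalline + (1 - X)/rho_amorphous.
    """
    return (rho_crystalline * (rho - rho_amorphous)) / (
        rho * (rho_crystalline - rho_amorphous)
    )


def water_uptake(m_wet, m_dry):
    """Mass of absorbed water per unit mass of dry film."""
    return (m_wet - m_dry) / m_dry


def gel_fraction(m_dry_after_extraction, m_dry_initial):
    """Fraction of the initial dry mass that remains insoluble."""
    return m_dry_after_extraction / m_dry_initial


# Illustrative inputs only (densities in g/cm^3, masses in g); these are
# placeholder values, not measurements from the study.
x_c = crystallinity_from_density(rho=1.30, rho_amorphous=1.27, rho_crystalline=1.35)
print(x_c, water_uptake(m_wet=0.30, m_dry=0.10), gel_fraction(0.09, 0.10))
```

The reference densities of fully amorphous and fully crystalline PVA must be taken from the literature for real use, since the result is sensitive to them.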
The present study aims to determine the effect of instrumental enrichment on the acquisition of geographical concepts by first-grade intermediate school students. The study is restricted to first-grade intermediate school students in the Educational Directorate of Rusafa for the academic year (2013/2014). To achieve this objective, the following hypotheses were formulated:
There is no statistically significant difference at the level of (0.5) between the scores of the experimental group, who studied geography according to instrumental enrichment, and the scores of the control group, who studied geography according to traditional methods.
Abstract
The changes that have taken place in the business environment have had great effects on organizations of all kinds, especially banks, which require capable intellectual resources that can adapt to these changes. Accordingly, importance has been placed on intellectual capital, which has become one of the basic resources of organizations and one of the elements of success and growth, given the availability of expertise, skills and the capability to make essential changes in different processes through the introduction of innovations and creations that support banks' activities. Therefore, intellectual capital represents the more r