Substantiating Precise Analysis of Data To Evaluate Students Answer Scripts
ISSN No:-2456-2165
Abstract:- Handwriting recognition refers to interpreting and analyzing handwritten text. In recent years, there have been notable advancements in this field, especially in the context of computerized assessments. As online exams and digital education platforms continue to gain popularity, handwriting recognition plays a crucial role in evaluating students' written answers. Our proposed system automatically recognizes and scores handwritten responses on answer sheets by comparing them to the correct answers provided by a moderator. To achieve this, the system utilizes Optical Character Recognition (OCR) to convert the handwritten text images into computer-readable text. Additionally, BERT is employed to convert the text into embeddings, and cosine similarity is applied to these embeddings to produce a final matching confidence score.

Keywords:- OCR, Google Vision OCR, BERT, Cosine similarity.

I. INTRODUCTION

In universities, colleges, and schools, the most common form of assessment is the manual evaluation of written examinations attempted by students. A student's response is evaluated based on their understanding of language, concepts, and other relevant aspects. Professors encounter numerous challenges when manually grading handwritten answer booklets. Answer scripts in the form of OMR sheets, used for multiple-choice question papers, are easy to evaluate by computer; descriptive answer scripts, however, are trickier to evaluate owing to variations in handwriting and in the way students express ideas or keywords. Hence, the evaluation task demands a significant amount of time and labor.

Ganga Sanuvala et al. present a model for assessing descriptive responses in tests through a three-module evaluation system: 1) Scanning, wherein Optical Character Recognition (OCR) technology is employed to scan the page and retrieve student responses, which are kept in a dataset in the form of a text file. 2) Preprocessing, where NLP is utilized to extract a collection of distinct words corresponding to each sentence in the response by conducting a grammatical check, tokenizing the text, removing stop words, checking for synonyms and antonyms, and performing stemming. 3) Learning, comprising both training and testing. During the training phase, a model is constructed by acquiring knowledge from the scored-responses dataset and the answer key. The model is then employed to assess the ungraded responses during the testing stage: the ungraded responses are transformed into TF-IDF vectors, and cosine similarity matching is executed using the trained model to award scores [1].

Aqil M. Azm et al. developed a system that applies LSA (Latent Semantic Analysis) and RST (Rhetorical Structure Theory) in two stages: training and testing. In the training phase, pre-scored essays are used to train the Latent Semantic Analysis model; the training set consists of essays that were scored by human instructors. During the testing stage, a new essay is processed through several steps, including pre-processing, checking cohesion, counting spelling mistakes, comparing essay length, and applying RST. The essay's overall score is calculated as the weighted sum of the scores assigned to semantic or conceptual analysis, spelling mistakes, and writing style [2].

Muhammad Farrukh Bashir et al. presented a method that employs machine learning (ML) and natural language processing (NLP) techniques such as WMD (Word Mover's Distance), cosine similarity, and MNB (Multinomial Naive Bayes) to evaluate subjective answer responses. Assessment of responses relies on solution statements and relevant keywords, while an ML model is constructed to predict the grades of the replies. The comparison score is determined by assessing the solution sentence against each answer sentence using keyword weighting and similarity-distance calculations. Keyword-weight computation involves identifying keywords in both the solution sentence and the matching answer sentence; the keyword weight, which falls within the range of 0 to 1, is derived by computing the percentage of solution-sentence keywords present in the answer sentence and dividing it by 100. The similarity distance is computed using either the Word Mover's Distance (WMD) method or the Cosine Similarity (CSim) approach. The comparison score is computed by combining the similarity weight and the keyword weight when the similarity distance falls below a certain threshold; the keyword weight is considered only when at least 30% of the solution keywords are present in the answer sentence. The overall score is calculated by taking the average of the comparison scores for all solution sentences [3].
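The scoring step of Ganga Sanuvala et al.'s Learning module — TF-IDF vectors compared by cosine similarity — can be sketched in plain Python. The smoothed-IDF formula and the toy documents below are illustrative assumptions, not the paper's exact configuration:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF vectors (term -> weight) for a list of tokenized documents.
    Uses smoothed IDF, log((1 + N) / (1 + df)) + 1, so shared terms keep nonzero weight."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in tf.items()}
        vectors.append(vec)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

A graded answer would then receive, for example, `similarity * max_marks`; the exact mark-awarding rule is not specified in the summary above.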
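The weighted-sum scoring in Aqil M. Azm et al.'s system reduces to a linear combination of the component scores. The weights below are placeholders, since the actual weights are not given in the summary above:

```python
def overall_essay_score(semantic, spelling, style, weights=(0.6, 0.2, 0.2)):
    """Weighted sum of component scores, each assumed normalized to [0, 1].
    The default weights are illustrative placeholders, not the paper's values."""
    w_sem, w_spell, w_style = weights
    return w_sem * semantic + w_spell * spelling + w_style * style
```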
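The keyword-weight and comparison-score computation described by Muhammad Farrukh Bashir et al. can be sketched as follows. The 50/50 combination and the exact form of the threshold test are our reading of the description, not values confirmed by the paper:

```python
def keyword_weight(solution_keywords, answer_tokens):
    """Fraction (0..1) of solution-sentence keywords that appear in the answer sentence."""
    if not solution_keywords:
        return 0.0
    hits = sum(1 for k in solution_keywords if k in answer_tokens)
    return hits / len(solution_keywords)

def comparison_score(similarity, kw_weight, kw_threshold=0.30):
    """Combine the similarity weight (from WMD or cosine similarity) with the
    keyword weight; the keyword weight contributes only when at least 30% of
    the solution keywords are present. The equal 0.5/0.5 mix is an assumption."""
    if kw_weight >= kw_threshold:
        return 0.5 * similarity + 0.5 * kw_weight
    return similarity
```

The overall answer score would then be the mean of `comparison_score` over all solution sentences.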
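For the proposed system itself, the final matching-confidence step is cosine similarity over BERT embeddings. The sketch below assumes the embeddings have already been produced (e.g. a pooled BERT sentence vector for the OCR'd student answer and for the moderator's key) and rescales the score to [0, 1]; both the placeholder vectors and the rescaling are illustrative assumptions:

```python
import numpy as np

def match_confidence(student_emb, key_emb):
    """Cosine similarity between a student-answer embedding and the answer-key
    embedding, rescaled from [-1, 1] to a [0, 1] confidence score."""
    student = np.asarray(student_emb, dtype=float)
    key = np.asarray(key_emb, dtype=float)
    cos = float(student @ key / (np.linalg.norm(student) * np.linalg.norm(key)))
    return (cos + 1.0) / 2.0
```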