Subject A’s rapid score improvement and Subject B’s stability suggest that the AI grader concentrates on only certain aspects of speaking. Subject A is an international businessperson and educator who is accustomed to enunciating his words so that non-native speakers can understand him. In other tests, human graders rated his pronunciation, intonation, and fluency highly, yet this AI test did not on his first attempt. He therefore adjusted his speaking style, linking more words together and speaking faster, and his score gradually rose to nearly full marks. The rise was possible primarily because Subject A can speak two types of English: (1) intentionally enunciated speech and (2) native-speaker-like speech. The AI grader scored the second type much higher. It is worth noting that, in general, the faster a person speaks, the more the coherence of the content deteriorates. Subject A was no exception, yet his score still rose drastically.
Subject B was a young, ongoing learner of English who spoke with all her might every time she took the test. Her pronunciation, intonation, and speed remained essentially the same across attempts, as did her accuracy and content. After all, as every reader knows, it is virtually impossible to improve one’s speaking skills drastically in two days.
Although this was a small personal experiment, it suggested something essential about AI assessment: it is still in the developmental stage and has both advantages and disadvantages. Assessment robots have not yet reached the level we see in the movies.
Here we want to share what we found about AI assessment. (Please note that these are still only hypotheses that require further experiments to confirm.) We assume the AI grader measures the proximity of a test taker’s speech to native-speaker sound data drawn from large corpora amassed in English-speaking countries. It uses state-of-the-art speech recognition to transcribe test takers’ answers and then assesses their grammar and vocabulary. We also assume it measures response time and speaking speed.
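To make this hypothesis concrete, the sketch below shows what such a grading pipeline could look like in miniature. Everything here is an assumption: the weights, the 160-words-per-minute "native-like" target, and the idea that an upstream acoustic model supplies a 0–1 similarity to native-speaker reference data are all our own illustration, not the actual scoring formula of any real test.

```python
# A minimal sketch of the scoring pipeline we hypothesize an AI grader might use.
# All function names, weights, and thresholds are assumptions for illustration.

def speech_rate_score(word_count: int, seconds: float) -> float:
    """Score speaking speed as words per minute, capped at a hypothetical
    'native-like' rate of 160 wpm."""
    wpm = word_count / seconds * 60
    return min(wpm / 160.0, 1.0)

def acoustic_similarity_score(similarity: float) -> float:
    """Placeholder for the similarity (0-1) of the test taker's audio to
    native-speaker reference data, as produced by some upstream acoustic model."""
    return max(0.0, min(similarity, 1.0))

def grammar_score(errors: int, word_count: int) -> float:
    """Penalize grammar errors found in the transcribed answer, per word."""
    return max(0.0, 1.0 - 5.0 * errors / max(word_count, 1))

def overall_score(word_count: int, seconds: float,
                  similarity: float, errors: int) -> float:
    # Hypothetical weighting: acoustic proximity dominates, speed and grammar
    # follow, and coherence of content is entirely absent from the formula.
    return round(
        0.5 * acoustic_similarity_score(similarity)
        + 0.3 * speech_rate_score(word_count, seconds)
        + 0.2 * grammar_score(errors, word_count),
        3,
    )

# Two deliveries of the same one-minute answer with the same grammar errors:
# slow, enunciated speech vs. faster, linked, native-like speech.
enunciated = overall_score(word_count=120, seconds=60, similarity=0.6, errors=2)
linked = overall_score(word_count=150, seconds=60, similarity=0.9, errors=2)
```

Under these assumed weights, the faster, more native-like delivery outscores the enunciated one even though the content and error count are identical, which is consistent with what we observed with Subject A.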
The advantages are: (1) grading is 100 percent objective, with no room for human subjectivity; and (2) results arrive within minutes, which can open up many opportunities for educators. The disadvantages are: (1) the grader probably cannot assess the coherence and logic of speech; and (2) it does not reward enunciation or slow, deliberate speech, which is commonly appreciated in global communication.
All in all, given these advantages and disadvantages, it is crucial to know when to use AI graders and when to use human graders. For example, we would use AI graders to screen for a particular pronunciation and grammar level, and human graders to select a candidate for an academic or business endeavor.
Let us see how far AI will have come by 2030. Will it reach the level we see in the movies AI or Blade Runner?