I am immensely proud that almost 3 years of work has now been published in PNAS. At the center of this paper is the finding that we can measure whether people are paying attention online by using web cameras to track their eye movements. Before you say “that sounds creepy”: we can actually do this without sending any information about you to our server, thus preserving privacy. We can even predict how well students will do on a test based on the material presented in the video. We did this with over 1000 students using their own webcams to track their eye movements.
So where did this all start? I'll tell you about the journey we went through.
Just as Wikipedia has become a go-to source of information, so have the instructional videos often found on YouTube, everything from TED talks to videos about why stars are star-shaped. Online education has been growing rapidly, and you can now take everything from fully fledged college degrees to short courses on anything from language to knitting. When the NSF-funded project I was hired for at the City College of New York started 3 years ago, we wanted to look into online education. Specifically, we wanted to look into how this online format keeps students engaged, or maybe fails to, as the retention rates of students are staggeringly low. So how can we improve online learning? We believe one of the key challenges in education is the student-teacher interaction. A good teacher can gauge whether students are lost or simply not paying attention. Teachers react to this by changing teaching styles or giving students tasks that will hopefully engage them again. Some people are better at this than others, of course. I remember too many super boring teachers, where sleeping might have been the only way to get through it (don’t tell anyone), but I stayed awake of course 😉
So how do we build tools that measure the level of attention of students online, and that hopefully give both teachers and online education providers the possibility to react and make online education adaptive, just as a good teacher does in the classroom?
Working in the Department of Biomedical Engineering at CCNY, in the Neural Engineering group of Prof. Lucas Parra, our go-to tools stem from neuroscience. In the past we showed that the Inter-Subject Correlation (ISC) of EEG could actually measure the level of attention of students and in fact predict the students' test scores based on the short videos they watched (paper). But having to put on an EEG set every time you want to go to “class” might not be so practical. So we looked for inspiration in what teachers do in a classroom. Teachers often have an easy time seeing whether they have lost their students. They simply look at their students' eyes. If students are looking somewhere else, at their phone, out the window or at one of the other cute students in the class, then they are probably not paying attention to the teacher. And if they are not paying attention, then they probably won't remember much of what the teacher was saying.
With this knowledge we did a first study and measured students' eye movements as they watched 5 different short, informal instructional videos of the kind often found on YouTube. To then test whether students' eye movements actually changed when they weren’t paying attention, we had the students watch the videos again, but this time we asked them to count backwards in steps of 7 from a high prime number between 800 and 1000. The reason for the range was simply to remove any learning effect, as the students had to restart the counting every time they watched a new video for the second time.
We can see visually how different people's eye movements are depending on whether they are watching the video normally or being asked to count backwards. We can quantify how similar or dissimilar people's eye movements are using simple correlation. We compute how correlated each person's eye movements are with those of all the other people watching the video. This results in what we call Intersubject Correlation (ISC). Essentially, this is a number that quantifies how similar you are to a group of people who also watched the video. We can see that this ISC of eye movements drops significantly when people are asked to count, meaning this measure captures the modulation of attention.
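To make the ISC idea a bit more concrete, here is a minimal sketch in Python. It is not the code from the paper: the gaze traces are made-up numbers, it only looks at a single gaze coordinate per viewer, and the function names `pearson` and `intersubject_correlation` are just illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a completely flat trace as uncorrelated
    return cov / sqrt(var_x * var_y)


def intersubject_correlation(traces):
    """For each viewer, average the correlation with every other viewer."""
    scores = []
    for i, own in enumerate(traces):
        others = [pearson(own, other) for j, other in enumerate(traces) if j != i]
        scores.append(sum(others) / len(others))
    return scores


# Made-up horizontal gaze positions: one list per viewer, one value per time step.
attentive_a = [0.1, 0.4, 0.8, 0.6, 0.2, 0.3]
attentive_b = [0.2, 0.5, 0.9, 0.5, 0.1, 0.3]
distracted = [0.7, 0.2, 0.3, 0.9, 0.6, 0.1]

isc = intersubject_correlation([attentive_a, attentive_b, distracted])
print(isc)  # one ISC value per viewer; the distracted viewer scores lowest here
```

In this toy example the two viewers who follow the video in a similar way get a higher ISC than the one whose gaze wanders, which is the pattern described above for the counting condition.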
With this knowledge we asked the students questions about the content of the videos, and sure enough the Intersubject Correlation (ISC) is actually predictive of how much people remember.
Interestingly, we also found that pupil size synchronizes between people when they watch the videos. This might be due to the local luminance fluctuations in the video, but even if we regress these out we still see synchronization. This is really fascinating, and I feel it's worth a follow-up study.
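For readers wondering what “regressing out” luminance means, here is a minimal sketch, assuming a plain least-squares fit with a single luminance value per time step. The numbers are invented and the helper name `regress_out` is hypothetical; the actual analysis in the paper may well be more involved.

```python
def regress_out(signal, nuisance):
    """Return what is left of `signal` after a least-squares fit on `nuisance`."""
    n = len(signal)
    mean_s = sum(signal) / n
    mean_n = sum(nuisance) / n
    var_n = sum((v - mean_n) ** 2 for v in nuisance)
    if var_n == 0:
        # A constant nuisance signal explains nothing; just remove the mean.
        return [s - mean_s for s in signal]
    slope = sum((v - mean_n) * (s - mean_s) for v, s in zip(nuisance, signal)) / var_n
    return [(s - mean_s) - slope * (v - mean_n) for s, v in zip(signal, nuisance)]


# Invented data: screen luminance and one viewer's pupil size per time step.
luminance = [0.2, 0.8, 0.5, 0.9, 0.1, 0.4]
pupil = [4.1, 3.2, 3.8, 3.0, 4.4, 3.9]

residual = regress_out(pupil, luminance)
print(residual)  # pupil fluctuations that luminance does not linearly explain
```

The residual is, by construction, linearly uncorrelated with the luminance, so any synchronization that remains between viewers' residuals cannot be explained by this simple luminance model.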
After this initial study we wanted to see if the ISC of eye movements and pupil size would also be modulated by attention and predictive of students' test-taking performance for other video types and for both comprehension and recall questions. Sure enough, with 2 additional experiments we reproduced our findings.
In the end we can't ask students to buy an eye tracker or any other equipment in order to attend an online class. So we thought, “how can we make this technology available at home?” The simple answer is that we can track people's eye movements with something the majority of computers, smartphones and laptops have today, namely the webcam. We developed a website / online experimental framework to carry out these experiments (Elicit, GitHub).
We could now scale up the experiments, and we ended up having over 1,000 students watch 5 videos and answer questions about them. This is still early days in terms of the precision of the eye tracking data, but because we use correlation, simple things like offsets or spatial noise don’t affect the ISC that much. In order not to have to transmit the gaze position of all students to our server, we instead transmit a median gaze position computed across students who had already watched the video. This helps denoise the data and results in higher correlation values between the student and the reference.
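Here is a minimal sketch of the median-reference idea, assuming the comparison runs on the student's own machine and only uses one gaze coordinate. The traces are invented, `median_reference` is a hypothetical helper name, and this is an illustration in Python, not the code behind the website.

```python
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a completely flat trace as uncorrelated
    return cov / sqrt(var_x * var_y)


def median_reference(previous_traces):
    """Per-time-step median across viewers who already watched the video."""
    return [median(step) for step in zip(*previous_traces)]


# Invented gaze traces from three earlier viewers of the same video.
previous = [
    [0.1, 0.4, 0.8, 0.6, 0.2],
    [0.2, 0.5, 0.9, 0.5, 0.1],
    [0.1, 0.3, 0.7, 0.6, 0.3],
]
reference = median_reference(previous)

# A new student's trace, and the same trace with a constant calibration offset.
new_student = [0.15, 0.45, 0.85, 0.55, 0.2]
shifted_student = [v + 0.3 for v in new_student]

score = pearson(new_student, reference)
shifted_score = pearson(shifted_student, reference)
print(score, shifted_score)  # the two values agree up to rounding
```

The shifted trace gives the same correlation as the original one, which illustrates why a constant offset in a webcam-based gaze estimate does not hurt a correlation-based measure.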
We could now compare the wISC values (‘w’ means weighted using simple machine learning) with the test-taking performance, and sure enough we found a correlation between the test scores and the wISC.
This really opens up a new field in which we can explore curriculum development and potential interventions to improve learning, or gauge whether some educational material is better suited to some people than to others. The possibilities are endless.