New paper in PNAS Nexus: Cognitive processing of a common stimulus synchronizes brains, hearts, and eyes

When people engage in a shared experience, like watching a movie or listening to music, their behavioral, physiological and neural responses unfold in a very similar fashion. This is in large part due to the shared stimulus we are engaging with. Our bodies react to and process the incoming stimulus through our experience, and the same processing happens in everyone else engaging with that stimulus. Quite literally, our brainwaves, heart rate, gaze position, pupil size etc. go up and down in unison. But what causes this? And which bodily signals can we expect to react and synchronize in a similar fashion between people? In this new paper out in PNAS Nexus we propose that two things are needed for a given signal modality to synchronize between people: 1. there has to be a robust brain-body coupling with the signal in question, and 2. people have to be cognitively processing a common stimulus. An important part here is that we propose that people do not need to be co-present, i.e. watch a movie together, for their physiological signals to synchronize. We propose that it is the stimulus, and our cognitive processing of that stimulus, that makes us react in a similar way.

This is not to say that being together doesn’t have any effect on us; we know that all too well. We just went through a pandemic where many of us were craving social contact. Sometimes it’s better to watch a romantic movie by yourself, because you can cry it all out and not worry about who sees you (it’s a little sad that we haven’t gotten there yet as a society). Other times it’s great to watch action movies together, because when the hero wins, we can cheer together. So there are surely social effects when engaging with the world. But from a research perspective it’s important to establish whether synchronization between people can happen when people are in isolation, or whether social interaction is needed for it to occur. Personally I think that if we can tease those two contributions apart, we can learn a lot about social interactions and isolate what the social contribution is.

Back to our study: we carried out 3 experiments in total (around N=150 participants), each with a different purpose. The first was to have people watch short informative videos while we recorded a large number of signal modalities: EEG, heart rate (HR), gaze position, pupil size and respiration. You can imagine how many sensors these people had on them; it was pretty intense, but people were pretty casual about it. In this first study N=92 subjects watched 3 educational YouTube videos (about 10 min). We measured whether each of these signals would statistically synchronize between people, and we found that EEG, HR, gaze and pupil size did, but respiration did not. In fact we have measured this on much longer recordings too and never found anything meaningful for respiration, which surprises me because I would expect moments of suspense to have people gasping at the same time. I guess not! Mind you, the magnitudes of the synchronization are quite different: the eyes and pupils synchronize to a much higher degree than the other signals, which is likely driven by the visuals in the videos.
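
For readers who want to see what “synchronize between people” means in practice, here is a minimal sketch of inter-subject correlation (ISC) for a single modality, with a circular-shift permutation test as one common way to assess significance. The toy data, variable names and the shuffle test are illustrative assumptions, not the exact pipeline from the paper.

```python
import numpy as np

def isc(signals):
    """Mean pairwise Pearson correlation across subjects.
    signals: array of shape (n_subjects, n_samples) for one modality."""
    n = signals.shape[0]
    z = (signals - signals.mean(axis=1, keepdims=True)) / signals.std(axis=1, keepdims=True)
    corr = z @ z.T / signals.shape[1]                 # subject-by-subject correlation matrix
    return corr[np.triu_indices(n, k=1)].mean()       # average over unique pairs

def isc_null(signals, n_perm=1000, rng=np.random.default_rng(0)):
    """Null distribution from circularly shifting each subject's time series."""
    null = np.empty(n_perm)
    for i in range(n_perm):
        shifted = np.stack([np.roll(s, rng.integers(s.size)) for s in signals])
        null[i] = isc(shifted)
    return null

# toy example: 10 subjects, 600 samples of a shared slow signal plus individual noise
rng = np.random.default_rng(1)
shared = np.cumsum(rng.standard_normal(600))
data = shared + 5 * rng.standard_normal((10, 600))
observed = isc(data)
p = (isc_null(data) >= observed).mean()
print(f"ISC = {observed:.3f}, p ≈ {p:.3f}")
```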

So what is it that makes these signals synchronize? We first looked at the spectral properties of the synchronization; it could be that there is a pre-existing underlying signal, shared across all modalities, that we entrain to when we engage with a stimulus. But looking both at the power spectra of each modality and at the synchronization spectra, these are quite different. One thing they have in common is that they are all very slow, with fluctuations on the order of 10 seconds, but the shapes and peaks are quite different.
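
One hedged way to make “power spectra versus synchronization spectra” concrete: compute a Welch power spectrum per modality and, as a stand-in for the synchronization spectrum, the mean pairwise coherence across subjects. The sampling rate, window length and the use of coherence here are assumptions for illustration, not the paper’s exact definitions.

```python
import numpy as np
from scipy.signal import welch, coherence

fs = 10.0  # assumed sampling rate (Hz) for a slow physiological signal

def power_spectrum(signals, fs=fs):
    """Average Welch power spectrum across subjects (n_subjects, n_samples)."""
    f, pxx = welch(signals, fs=fs, nperseg=256)
    return f, pxx.mean(axis=0)

def sync_spectrum(signals, fs=fs):
    """Mean pairwise coherence as a frequency-resolved 'synchronization spectrum'."""
    n = signals.shape[0]
    coh_sum, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            f, cxy = coherence(signals[i], signals[j], fs=fs, nperseg=256)
            coh_sum, pairs = coh_sum + cxy, pairs + 1
    return f, coh_sum / pairs

# toy data: a shared ~10 s oscillation buried in individual noise
rng = np.random.default_rng(0)
shared = np.sin(2 * np.pi * 0.1 * np.arange(3000) / fs)
data = shared + rng.standard_normal((8, 3000))
f_p, pxx = power_spectrum(data)
f_c, coh = sync_spectrum(data)
print("peak power at", f_p[pxx.argmax()], "Hz; peak coherence at", f_c[coh.argmax()], "Hz")
```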

I have now done these experiments in quite a few constellations, with different types of videos, with music and with speech, and I have always wondered: how does the ISC of all these modalities relate to each other? Most are modulated by attention and predictive of memory, so one could expect that they are all basically measuring the same thing, or at least partially. So I compared the ISC of all the measured modalities to each other, and it turns out they are co-modulated: if the ISC is high for EEG then it’s also high for HR, pupil and gaze. So are they really just the same? Not quite, because this is at the scale of minutes, i.e. we compare ISC values computed over the entire 10 min of stimuli. If we reduce that scale down to 10 seconds, let a sliding window pass through the stimuli, compute ISC values for each window and correlate the time-resolved ISC between modalities, we get a different picture. The correlation between modalities drops, which to me shows that we might be capturing something very similar on long timescales, but that at shorter timescales the similarity weakens. This is a very interesting question which we might look into in the future. One could imagine that when people attentively engage with a stimulus, different stimulus properties “activate” different bodily responses.
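
A minimal sketch of the time-resolved analysis described above: ISC computed in a sliding window and then correlated across modalities. The 10 s window, the 1 s step and the toy data are assumptions; only the general recipe follows the text.

```python
import numpy as np

def pairwise_isc(window):
    """Mean pairwise correlation within one window (n_subjects, n_samples)."""
    z = (window - window.mean(1, keepdims=True)) / (window.std(1, keepdims=True) + 1e-12)
    c = z @ z.T / window.shape[1]
    return c[np.triu_indices(window.shape[0], k=1)].mean()

def time_resolved_isc(signals, fs, win_s=10.0, step_s=1.0):
    """ISC in sliding windows; returns one value per window position."""
    win, step = int(win_s * fs), int(step_s * fs)
    starts = range(0, signals.shape[1] - win + 1, step)
    return np.array([pairwise_isc(signals[:, s:s + win]) for s in starts])

# hypothetical multimodal recordings, resampled to a common rate and driven
# in part by the same slow stimulus-related component
fs = 20.0
rng = np.random.default_rng(2)
shared = np.cumsum(rng.standard_normal(12000))
eeg_like = shared + 10 * rng.standard_normal((12, 12000))   # stand-in per-subject traces
hr_like  = shared + 10 * rng.standard_normal((12, 12000))
isc_eeg = time_resolved_isc(eeg_like, fs)
isc_hr  = time_resolved_isc(hr_like, fs)
print("correlation of time-resolved ISC between modalities:",
      round(float(np.corrcoef(isc_eeg, isc_hr)[0, 1]), 2))
```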

Is there a common underlying factor driving the ISC? Well, we know that each of the modalities independently is modulated by attention and predictive of memory, which we reproduced in two additional experiments, and we know that they are correlated with each other. When we run a PCA on the ISC values of all modalities, we find that the first principal component loads on EEG, gaze, pupil and HR, and this component is predictive of memory as well. From a machine learning perspective it is essentially an unsupervised way of combining measures that is predictive of memory.
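
Here is a rough sketch, under stated assumptions, of the kind of PCA described above: per-subject ISC values for the four modalities go in as columns, the first principal component comes out as an unsupervised combined score, and that score can be related to memory. The simulated data and the plain Pearson correlation with memory are placeholders, not the paper’s numbers.

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import pearsonr

# toy data: rows = subjects, columns = ISC of EEG, gaze, pupil and HR,
# all partially driven by one shared latent "attentional" factor
rng = np.random.default_rng(3)
latent = rng.standard_normal(92)
isc = np.column_stack([latent + 0.5 * rng.standard_normal(92) for _ in range(4)])
memory = latent + rng.standard_normal(92)          # hypothetical memory scores

isc_z = (isc - isc.mean(0)) / isc.std(0)           # standardize each modality's ISC
pca = PCA(n_components=1)
pc1 = pca.fit_transform(isc_z).ravel()             # first principal component per subject
print("loadings (EEG, gaze, pupil, HR):", np.round(pca.components_[0], 2))  # sign is arbitrary
r, p = pearsonr(pc1, memory)
print(f"PC1 vs memory: r = {r:.2f}, p = {p:.3g}")
```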

Going back to our proposed theory, we have confirmed that we see synchronization despite the fact that all our subjects watched the videos in isolation. So social interaction is not a requirement, though again there could be additive effects. So when does synchronization between people occur? The first requirement in our theory was a robust connection between the brain and the physiological signal in question when we engage with a common stimulus. The latter part is really important, because of course the brain and body constantly communicate. To test this we compute what we call Within-Subject Correlation (WSC), i.e. we ask whether we can linearly combine the signals measured with scalp EEG to predict a given physiological signal, and then compute the correlation between that prediction and the measured signal. It turns out that all the signals for which we see ISC also have a robust WSC. This is not a binary measure, so we can expect some signals to have a weaker WSC and most likely a weaker ISC too.
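
The WSC idea can be sketched as a simple regression problem: learn a linear combination of EEG channels that predicts the physiological signal, then report the correlation of that prediction on held-out data. Ridge regression and the single train/test split are my assumptions here; the actual model in the paper (e.g. any temporal filtering) may differ.

```python
import numpy as np
from sklearn.linear_model import Ridge

def within_subject_correlation(eeg, physio, train_frac=0.5, alpha=1.0):
    """Predict a physiological signal from EEG channels with a linear model,
    and return the correlation of the prediction on held-out data.
    eeg: (n_samples, n_channels), physio: (n_samples,)"""
    split = int(train_frac * len(physio))
    model = Ridge(alpha=alpha).fit(eeg[:split], physio[:split])
    pred = model.predict(eeg[split:])
    return np.corrcoef(pred, physio[split:])[0, 1]

# toy data: 32 EEG channels partially driving a slow physiological signal
rng = np.random.default_rng(4)
eeg = rng.standard_normal((6000, 32))
weights = rng.standard_normal(32)
physio = eeg @ weights + 5 * rng.standard_normal(6000)
print("WSC ≈", round(within_subject_correlation(eeg, physio), 2))
```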

To further test our theory we looked at two signals we had not yet examined, namely head movements and saccade rate. Both seemed like good candidates to synchronize between people, so we were excited. We found both a positive and a negative result: saccade rate had a robust WSC, showed significant ISC across participants, was modulated by attention and was predictive of memory. That was not the case for head movements.

My take-home message here is that the ISC of these different modalities on longer timescales is remarkably similar, opening up experiments where we can measure attention without the more time-costly EEG setup. We can even combine measures of ISC and potentially gain a more robust predictor of memory and attention. That they are so similar is quite remarkable, because there isn’t an easy explanation as to why they should be. Many questions answered, and, as research goes, a whole new set of questions opened. Stay tuned for more work coming up.

New paper in Cell Reports: Conscious processing of narrative stimuli synchronizes heart rate between individuals

New paper out in Cell Reports on how our heart rate (HR) synchronizes between people when we listen to engaging auditory narratives and watch videos. This work started when I pushed for not only measuring EEG while our subjects watched short educational videos, but also recording eye tracking and ECG, amongst other modalities. The first idea was simply that if people are more aroused, i.e. have a higher HR, we would expect them to be more alert and pay more attention. However, that didn’t actually turn out to be the case. Another frequently used measure is heart rate variability (HRV), which is also known to be an indicator of alertness, but we didn’t see that these measures could reliably predict attentional state or how much people remembered from the narratives. Instead, the measure we also used for eye movements and pupil size, namely the Intersubject Correlation of heart rate, turned out to be a robust measure of whether people were paying attention and how much they remembered from the narratives we played for them.

Heart rate synchronizes between people when they listen to engaging stories and watch videos

So why would the heart change its rate similarly to other people’s? There could be many reasons really; it could be that the state of arousal we are in changes over time, in this case over many seconds. There are obviously many things happening to us that could cause these changes, but what is shared between people is the stimulus we experience, the story. So these stories could be slowly changing our state of arousal as we listen to them, but only if we are paying attention to the story, if we let the story engage us and move us. With covid upon us, I like the idea that something as simple as a story can literally make our hearts beat in a similar way. Even cooler is that this happens even when we sit at home and listen by ourselves: our hearts still beat the same way despite us being alone.


New paper in PNAS: Synchronized eye movements predict test scores in online video education

I am immensely proud that almost 3 years of work has now been published in PNAS. At the center of this paper is the idea that we can measure whether people are paying attention online by using web cameras to track people’s eye movements. Before you say “that sounds creepy”: we can actually do this without sending any information about you to our server, thus preserving privacy. We can even predict how well students will do on a test based on the material presented in the video. We did this with over 1,000 students using their own webcams to track their eye movements.

So where did this all start? I’ll tell you about the journey we went through.

Just as Wikipedia has become a go-to source of information, so have instructional videos, often found on YouTube: everything from TED talks to videos about why stars are star-shaped. Online education has been growing rapidly; you can take everything from fully fledged college degrees to short courses on anything from languages to knitting. When the NSF-funded project I was hired on at the City College of New York started 3 years ago, we wanted to look into online education. Specifically, we wanted to look into how this online format keeps students engaged, or perhaps fails to, as student retention rates are staggeringly low. So how can we improve online learning? We believe one of the key challenges in education is the student-teacher interaction. A good teacher can gauge whether students are lost or simply not paying attention, and reacts by changing teaching style or giving students tasks that will hopefully engage them again. Some people are better at this than others, of course; I remember too many super boring teachers, where sleeping might have been the only way to get through it (don’t tell anyone), but I stayed awake of course 😉

So how do we build tools that measure students’ level of attention online, tools that hopefully give both teachers and online education providers the ability to react and make online education adaptive, just as a good teacher does in the classroom?

Working in the Department of Biomedical Engineering at CCNY, in the Neural Engineering group of Prof. Lucas Parra, our go-to tools stem from neuroscience. In the past we showed that the Inter-Subject Correlation (ISC) of EEG could measure the level of attention of students and in fact predict their test scores on the short videos they watched (paper). But having to put on an EEG cap every time you want to go to “class” might not be so practical. So we looked for inspiration in what teachers do in a classroom. Teachers often have an easy time seeing whether they have lost their students: they simply look at their students’ eyes. If you are looking somewhere else, at your phone, out the window or at that cute student in class, then you are probably not paying attention to the teacher. And if you are not paying attention, you probably won’t remember much of what the teacher was saying.

With this knowledge we did a first study and measured students’ eye movements as they watched 5 short informal instructional videos of the kind often found on YouTube. To test whether students’ eye movements actually change when they aren’t paying attention, we had the students watch the videos again, but this time we asked them to count backwards in steps of 7 from a high prime number between 800 and 1000. The reason for using a range of numbers was simply to remove any learning effect, as the students had to restart the counting backwards every time they watched another video for the second time.

[Figure: example gaze traces in the attending condition vs. the distracted condition]

We can see visually how different people’s eye movements are depending on whether they are watching the video normally or being asked to count backwards. We can quantify how similar or dissimilar people’s eye movements are using simple correlation: we compute how correlated each person’s eye movements are with those of all the other people watching the video. This results in what we call Intersubject Correlation (ISC), essentially a number that quantifies how similar you are to a group of people who also watched the video. We can see that this ISC of eye movements drops significantly when people are asked to count, meaning the measure captures the attention modulation.
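
As a rough sketch of this ISC computation (assuming horizontal and vertical gaze traces are correlated separately and then averaged, which is an illustrative choice rather than the paper’s exact weighting):

```python
import numpy as np

def gaze_isc(gaze):
    """Per-subject ISC of gaze: correlate each subject's horizontal and vertical
    gaze traces with every other subject's, then average.
    gaze: array (n_subjects, n_samples, 2) with x and y position over time."""
    n = gaze.shape[0]
    isc = np.zeros(n)
    for i in range(n):
        rs = []
        for j in range(n):
            if j == i:
                continue
            for axis in (0, 1):  # horizontal and vertical components
                rs.append(np.corrcoef(gaze[i, :, axis], gaze[j, :, axis])[0, 1])
        isc[i] = np.mean(rs)
    return isc

# toy data: 20 viewers following a shared on-screen trajectory with individual noise
rng = np.random.default_rng(5)
t = np.linspace(0, 60, 1800)
target = np.stack([np.sin(t), np.cos(0.5 * t)], axis=1)
gaze = target + 0.3 * rng.standard_normal((20, 1800, 2))
print("mean ISC while attending:", gaze_isc(gaze).mean().round(2))
```
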
We then asked the students questions about the content of the videos, and sure enough the ISC of eye movements is predictive of how much people remember.

Interestingly, we also found that pupil size synchronizes between people when they watch the videos. This might be due to the local luminance fluctuations in the video, but even if we regress these out we still see synchronization. This is really fascinating and I feel it’s worth a follow-up study.
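
A hedged sketch of what “regressing out” luminance could look like: remove the part of each pupil trace that is linearly explained by a luminance regressor and compute ISC on the residuals. The single, unlagged luminance regressor is my assumption; in practice lagged regressors may be needed to capture the pupil’s delayed response.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def regress_out_luminance(pupil, luminance):
    """Remove the part of the pupil trace linearly explained by luminance;
    ISC can then be computed on the residuals.
    pupil: (n_samples,), luminance: (n_samples,) e.g. local luminance at gaze."""
    X = luminance.reshape(-1, 1)          # could be extended with lagged copies
    fit = LinearRegression().fit(X, pupil)
    return pupil - fit.predict(X)
```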

After this initial study we wanted to see if the ISC of eye movements and pupil size would also be modulated by attention and predictive of students’ test performance for other video types and question types (comprehension vs. recall). Sure enough, in two additional experiments we reproduced our findings.

In the end we can’t ask a student to buy an eye tracker or any other equipment in order to attend an online class. So we thought: “How can we make this technology available at home?” The simple answer is that we can track people’s eye movements with something the majority of computers, smartphones and laptops have today, namely the webcam. We developed a website / online experimental framework to carry out these experiments (Elicit, Github).

We could now scale up the experiments, and we ended up having over 1,000 students watch 5 videos and answer questions about them. Webcam eye tracking is still in its early stages in terms of precision, but because we use correlation, simple things like offsets or spatial noise don’t affect the ISC that much. Instead of transmitting every student’s gaze positions to our server, we transmit a median gaze position across students who had already watched the video. This helps denoise the data and results in higher correlation values between the student and the reference.
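
A minimal sketch of the median-reference idea, assuming the reference is sent to the client and the correlation is computed there (consistent with the privacy point above); the details of the real pipeline may differ.

```python
import numpy as np

def reference_gaze(previous_gaze):
    """Median gaze trace across students who already watched the video.
    previous_gaze: (n_students, n_samples, 2) -> (n_samples, 2)"""
    return np.median(previous_gaze, axis=0)

def isc_to_reference(student_gaze, reference):
    """Correlation of one student's gaze with the group median; under the
    privacy setup this can run client-side so raw gaze never needs uploading."""
    rs = [np.corrcoef(student_gaze[:, a], reference[:, a])[0, 1] for a in (0, 1)]
    return float(np.mean(rs))

# toy usage: a cohort and one new student all following the same shared trajectory
rng = np.random.default_rng(6)
shared = np.stack([np.sin(np.linspace(0, 30, 900)),
                   np.cos(np.linspace(0, 15, 900))], axis=1)
cohort = shared + 0.4 * rng.standard_normal((50, 900, 2))
new_student = shared + 0.4 * rng.standard_normal((900, 2))
ref = reference_gaze(cohort)
print("ISC vs median reference:", round(isc_to_reference(new_student, ref), 2))
```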

We could now compare the wISC values (‘w’ for weighted, using simple machine learning) with test performance, and sure enough we found a correlation between the test scores and the wISC.
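
The post only says that ‘w’ stands for weighted using simple machine learning, so the following is a speculative sketch of what such a weighting could look like: per-student ISC features combined by a cross-validated linear model to predict test scores. The features, model and validation scheme are all assumptions, not the paper’s actual wISC definition.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

# Illustrative only: per-student ISC features (e.g. horizontal gaze, vertical gaze,
# pupil) combined with learned weights to predict test scores.
rng = np.random.default_rng(7)
isc_features = rng.uniform(0, 0.6, size=(1000, 3))
test_scores = isc_features @ np.array([0.5, 0.3, 0.2]) + 0.1 * rng.standard_normal(1000)

model = LinearRegression()
wisc = cross_val_predict(model, isc_features, test_scores, cv=5)  # weighted ISC per student
print("correlation with test scores:", np.corrcoef(wisc, test_scores)[0, 1].round(2))
```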

This really opens up new ways to explore curriculum development and potential interventions to improve learning, or to gauge whether some educational material is better suited to some people than to others. The possibilities are endless.

New paper: Music synchronizes brainwaves across listeners with strong effects of repetition, familiarity and training

Our new paper on the effect of repetition in music is finally out. It has been a great joy to work with Lucas Parra and Elizabeth Margulis, merging knowledge from music cognition, neuroscience and data science.

As a musical piece is repeated, the listener’s “neural engagement” decreases for pieces composed in a familiar style, but pieces in an unfamiliar style keep the listener engaged. This effect is most pronounced for listeners who have some musical training. Musicians also seem to be significantly more engaged with the classical pieces than non-musicians are.

https://www.nature.com/articles/s41598-019-40254-w

New paper “Neural engagement with online educational videos predicts learning performance for individual students”

Our paper “Neural engagement with online educational videos predicts learning performance for individual students” is out in Neurobiology of Learning and Memory, Volume 155, November 2018, Pages 60-64. You can read the paper here.

This is an exciting result, showing that students with higher neural engagement while watching educational videos perform better on questionnaires about the video content presented after the video. Neural engagement is measured as the Inter-Subject Correlation of EEG signals recorded while students watch educational videos on topics within Science, Technology, Engineering and Mathematics (STEM). This is a very intuitive result: if students pay attention to the videos, they also do better on the test.

Future research will look into easier and more mobile methods of assessing engagement in online education, and will investigate whether our results generalize to other types of educational material and teaching styles. Stay tuned for updates.

1 Year in New York working in Parra lab at the City College of New York

In late 2017 I took the jump from working in safe little Denmark to working in New York City at the City College of New York, moving from a lab focused on machine learning and modelling to a lab that goes all the way from question to experimental design, data collection, modelling and answering those questions (the full cycle). It has been a welcome challenge.
I have been warmly welcomed by Prof. Lucas Parra and everyone in the lab. It has been a big change, both in working environment (things are different in the US) and in the field I am working in: from worrying about features, models etc. to now doing neural engineering with EEG, ECG and eye tracking.

I have been hired on the NSF-funded project “Assessing student attentional engagement from brain activity during STEM instruction”. Here we are exploring both the efficacy of different teaching styles in online education and different physiological measures for assessing students’ attentional engagement. It is a 3-year postdoc and I am looking forward to the challenge. I will keep you updated with progress 🙂

Talk at Nordic.ai: From cognition to recommendation

I have been invited to talk at the Nordic.ai “festival” on the 9th of March @Vega in Copenhagen, Denmark. The program indicates that it’s going to be 20+ variants of Machine Learning!!

To break up the stream of “how deep is your network”, I will talk a little about how we define the question that we want to solve using ML. Specifically, I will explore the different evolutionary and cognitive psychological mechanisms we can target to make better music services.
See you there 🙂