When people engage in a shared experience, like watching a movie or listening to music, our behavioral, physiological and neural responses behave in a very similar fashion. This is in large part due to the shared stimulus we are engaging with. Our bodies react to and process the incoming stimulus through our experience, and the same process happens in everyone else engaged with that stimulus. Quite literally, our brainwaves, heart rate, gaze position, pupil size etc. go up and down in unison. But what causes this? And which bodily signals can we expect to react and synchronize in a similar fashion between people? In this new paper out in PNAS Nexus we propose that two things are needed for a given signal modality to synchronize between people: 1) there has to be a robust brain-body coupling with the signal in question, and 2) people have to be cognitively processing a common stimulus. An important point here is that we propose people do not need to be co-present, i.e. watch a movie together, for their physiological signals to start synchronizing. We propose that it is the stimulus, and our cognitive processing of that stimulus, that makes us react in a similar way.
This is not to say that being together doesn’t have any effect on us; we know that all too well. We just went through a pandemic where many of us were craving social contact. Sometimes it’s better to watch a romantic movie by yourself, because you can cry it all out and not worry about who sees you (it’s a little sad that we haven’t gotten past that as a society). Other times it’s great to watch action movies together, because when the hero wins, we can cheer together. So there are surely social effects when engaging with the world. But from a research perspective, it’s important to test whether synchronization between people can happen when each person is in isolation, or whether social interaction is required for it to occur. Personally, I think that if we can tease those two contributions apart, we can learn a lot about social interactions and be able to isolate what that social contribution is.
Back to our study: we carried out three experiments in total (around N=150 participants), each with a different purpose. In the first, we asked people to watch short informative videos while we recorded a great number of signal modalities: EEG, heart rate (HR), gaze position, pupil size and respiration. You can imagine how many sensors these people had on them; it was pretty intense, but people were pretty casual about it. In this first study, N=92 subjects watched 3 educational YouTube videos (about 10 min). We measured whether these signals would statistically synchronize between people, and found that EEG, HR, gaze and pupil did, but respiration did not. In fact, we have measured this on much longer signals too and never found anything meaningful, which to me is a surprise, because I would expect moments of suspense to have people gasping at the same time. I guess not! Mind you, the magnitudes of the synchronization are pretty different: gaze and pupil synchronize to a much higher degree than the other signals, which is likely driven by the visuals in the videos.
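For readers who want the mechanics: for a one-dimensional signal like pupil size, inter-subject correlation (ISC) boils down to averaging the pairwise correlations between subjects' time-aligned traces. Below is a minimal sketch on synthetic data, not the paper's actual pipeline (which also has to handle multi-channel EEG and statistical testing); all signals and numbers here are invented for illustration.

```python
import numpy as np

def isc(signals):
    """Mean pairwise Pearson correlation across subjects.

    signals: array of shape (n_subjects, n_samples), one time series
    (e.g. pupil size) per subject, aligned to the same stimulus.
    """
    n_sub, n_samp = signals.shape
    # z-score each subject's trace so Pearson r reduces to a dot product
    z = (signals - signals.mean(1, keepdims=True)) / signals.std(1, keepdims=True)
    r = (z @ z.T) / n_samp                    # subject-by-subject correlation matrix
    return r[np.triu_indices(n_sub, k=1)].mean()  # average over unique pairs

# Toy check: subjects sharing a slow common component synchronize,
# subjects with only independent noise do not.
rng = np.random.default_rng(0)
t = np.linspace(0, 600, 6000)                 # 10 minutes at 10 Hz (made up)
common = np.sin(2 * np.pi * t / 10)           # shared ~10 s fluctuation
driven = common + 0.5 * rng.standard_normal((20, t.size))
noise = rng.standard_normal((20, t.size))
print(isc(driven), isc(noise))                # clearly positive vs near zero
```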
So what is it that makes these signals synchronize? We first looked at the spectral properties of the synchronization. It could be, for instance, that there is a pre-existing signal shared across all modalities that we entrain to when we engage with a stimulus. But looking both at the power spectra of each modality and at the synchronization spectra, these are pretty different. One thing they do have in common is that they all live at very slow timescales, in the range of 10 seconds, but the shapes and peaks are quite different.
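A common way to do this kind of spectral comparison is Welch's method. The sketch below uses two synthetic "modalities" that both fluctuate slowly (around the 10-second range) but peak at different frequencies, illustrating how one would see that the spectra are not identical; the sampling rate and signals are assumptions for illustration, not values from the paper.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(1)
fs = 10.0                                  # Hz, an assumed common sampling rate
t = np.arange(0, 600, 1 / fs)              # 10 minutes of signal

# Two toy "modalities": both slow, but with peaks at different frequencies
# (0.1 Hz = 10 s cycle vs 0.05 Hz = 20 s cycle), plus noise.
pupil = np.sin(2 * np.pi * 0.1 * t) + 0.3 * rng.standard_normal(t.size)
hr = np.sin(2 * np.pi * 0.05 * t) + 0.3 * rng.standard_normal(t.size)

# Welch power spectra; long segments give resolution at these slow frequencies
f, p_pupil = welch(pupil, fs=fs, nperseg=1024)
_, p_hr = welch(hr, fs=fs, nperseg=1024)

peak_pupil = f[np.argmax(p_pupil)]
peak_hr = f[np.argmax(p_hr)]
print(peak_pupil, peak_hr)                 # both slow, but distinct peaks
```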
I have now done these experiments in quite a few variations, with different types of videos, with music and with speech, and I always wondered: how does the inter-subject correlation (ISC) of all these modalities relate between them? Most are modulated by attention and predictive of memory, so one could expect that they are all basically measuring the same thing, at least partially. So I compared the ISC of all the measured modalities to each other, and it turns out they are co-modulated: if the ISC is high for EEG, then it’s also high for HR, pupil and gaze. So are they really just the same? Not quite, because this is at the scale of minutes, i.e. we compare ISC values computed over the entire 10 min of stimuli. If we reduce that scale down to 10 seconds, let a sliding window pass through the stimuli, compute ISC values for each window, and correlate the time-resolved ISC between modalities, we get a different picture. The correlation between modalities drops, which to me shows that we might be capturing something very similar on long timescales, while at shorter timescales the similarity falls apart. This is a very interesting question which we might look into in the future. One could imagine that when people are attentively engaging with a stimulus, different stimulus properties would “activate” different bodily responses.
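The windowed idea can be sketched like this: compute ISC in consecutive 10-second windows and then correlate the resulting ISC time courses between two modalities. In the simulation below, a shared slow component whose strength fluctuates over time (a stand-in for attentional engagement) drives two synthetic modalities, so their windowed ISCs co-modulate; everything here is simulated, and the paper's actual windowing details may differ.

```python
import numpy as np

def windowed_isc(signals, win):
    """ISC in consecutive non-overlapping windows.

    signals: (n_subjects, n_samples); win: window length in samples.
    Returns one mean pairwise correlation per window.
    """
    n_sub, n_samp = signals.shape
    out = []
    for start in range(0, n_samp - win + 1, win):
        seg = signals[:, start:start + win]
        z = (seg - seg.mean(1, keepdims=True)) / seg.std(1, keepdims=True)
        r = (z @ z.T) / win
        out.append(r[np.triu_indices(n_sub, k=1)].mean())
    return np.array(out)

rng = np.random.default_rng(2)
t = np.arange(6000) / 10.0                     # 10 minutes at 10 Hz (assumed)
common = np.sin(2 * np.pi * t / 10)            # shared ~10 s component
gains = rng.uniform(0.2, 2.0, size=60)         # attention-like strength per window
amp = np.repeat(gains, 100)                    # one gain per 10 s window
mod_a = amp * common + 0.5 * rng.standard_normal((15, t.size))
mod_b = amp * common + 0.5 * rng.standard_normal((15, t.size))

isc_a = windowed_isc(mod_a, win=100)           # 10 s windows
isc_b = windowed_isc(mod_b, win=100)
co_modulation = np.corrcoef(isc_a, isc_b)[0, 1]
print(co_modulation)                           # high: both track the shared driver
```

Dropping the shared driver (or giving each modality its own independent gains) is the corresponding control: the windowed ISC time courses then decorrelate.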
Is there a common underlying factor driving the ISC? Well, we know that each of the modalities independently is modulated by attention and predictive of memory, which we reproduced in two additional experiments, and we know that they are correlated with each other. When we run a principal component analysis (PCA) on the ISC values of all modalities, we find that the first principal component loads on EEG, gaze, pupil and HR, and that this component is predictive of memory as well. From a machine learning perspective, this is essentially an unsupervised way of combining measures that is predictive of memory.
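As an illustration of the idea (not the paper's exact analysis), here is a PCA over hypothetical per-subject ISC values for four modalities that are all driven by a single latent factor: the first component then loads on all four with the same sign and captures most of the variance, exactly the signature of a common underlying driver. The latent-factor model and all numbers are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
n_subjects = 90

# Hypothetical per-subject ISC values for four modalities, all reflecting
# one latent "engagement" factor plus modality-specific noise.
engagement = rng.standard_normal(n_subjects)
isc_vals = engagement[:, None] + 0.5 * rng.standard_normal((n_subjects, 4))
# columns: EEG, gaze, pupil, HR (in this toy setup)

pca = PCA(n_components=2).fit(isc_vals)
pc1_score = pca.transform(isc_vals)[:, 0]      # combined, unsupervised index

# With one shared driver, PC1 loads on all four modalities with the
# same sign and explains most of the variance.
print(pca.explained_variance_ratio_[0], pca.components_[0])
```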
Going back to our proposed theory: we have confirmed that we see synchronization despite the fact that all our subjects watched the videos in isolation. So social interaction is not a requirement, although there could still be additive effects. So when does the synchronization between people occur? The other requirement in our theory was a robust connection between the brain and the physiological signal in question while we engage with common stimuli. The latter point is really important, because of course the brain and body constantly communicate. To test this, we compute what we call within-subject correlation (WSC): we ask whether we can linearly combine the signals measured with scalp EEG to predict a given physiological signal, and then compute the correlation between that prediction and the actual signal. It turns out that all the signals for which we see ISC also have a robust WSC. This is not a binary measure, so we can expect some signals to have a weaker WSC, and most likely a weaker ISC too.
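A WSC-style analysis can be sketched as a regularized linear regression from EEG channels to the physiological signal, evaluated out of sample. The mixing model, the choice of ridge regression, the sampling rate and the channel count below are all assumptions for illustration; the paper's actual method may well differ.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
n_samples, n_channels = 6000, 32               # assumed sizes

# Hypothetical setup: 32 "EEG channels" that each partly reflect a slow
# physiological signal (e.g. pupil size) plus independent channel noise.
t = np.arange(n_samples) / 10.0
pupil = np.sin(2 * np.pi * t / 10) + 0.3 * rng.standard_normal(n_samples)
mixing = rng.standard_normal(n_channels)       # how strongly each channel couples
eeg = pupil[:, None] * mixing + rng.standard_normal((n_samples, n_channels))

# Learn the linear combination on the first half of the recording,
# evaluate on the second half: WSC is the correlation between the
# EEG-based prediction and the actual physiological signal.
half = n_samples // 2
model = Ridge(alpha=1.0).fit(eeg[:half], pupil[:half])
pred = model.predict(eeg[half:])
wsc = np.corrcoef(pred, pupil[half:])[0, 1]
print(wsc)                                     # robust brain-body coupling
```

Running the same pipeline with the coupling removed (`mixing = 0`) gives a WSC near zero, which is the kind of negative result one would expect for a signal like head movements in this framework.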
To test our theory further, we looked at two signals we had not yet examined: head movements and saccade rate. Both seemed like good candidates to synchronize between people, so we were excited. We found both a positive and a negative result: saccade rate had a robust WSC, showed significant ISC across participants, was modulated by attention and was predictive of memory. That, however, was not the case for head movements.
My take-home message is that the ISCs of these modalities on longer timescales are remarkably similar, opening up experiments where attention can be measured without the more time-costly EEG setup. We can even combine measures of ISC and potentially gain a more robust predictor of memory and attention. That they are so similar is quite remarkable, because there isn’t an easy explanation for why they should be. So many questions answered, and, as research goes, a whole new set of questions opened up. Stay tuned for more work coming up.