Science says: the human brain is hardwired to LOVE videos – how? Why?
Why do humans LOVE video content? What's the evolutionary advantage?
Especially in a world where social media addiction is real. Screen time keeps going up year over year. Yet every day we just… watch more video content.

And our reaction is “Oh, look, pretty pictures!”.
You could be walking through your living room, or someone else's phone could catch your eye on a train. A video playing around us is enough to make us stop what we’re doing.
Just sucks us in. Bypasses our willpower. Obliges us to watch along. Sometimes, even when it’s not content we’re personally interested in. What? How? Why!
That’s what this article will explore through a scientific lens.
Answering questions like:
- Why do words with visuals perform BETTER than either alone?
- What is cognitive load? And how does it affect our ability to watch videos?
- How did we end up this way? (thanks evolution!)
- What’s going on inside the brain when we see an instructional video?
So, using science, let’s deconstruct what’s going on with video.
But disclaimer: this is a highly speculative subject. A lot of the research is theoretical and done through inference. So take it with a grain of salt rather than literally.
These are widely used best practices, so take what works from them. Still, it’s the map, not the territory, so don’t lean on it too heavily or you might be in for a rough surprise.
Our eyes & movement detection: inseparable
The truth is: our eyes are coded to detect movement. Otherwise most of our ancestors would have been eaten by predators!

Humans have specialized cells in our eyes to detect colour and movement. It’s part of the reason we “can’t not look”. We’re so visual that it’s hard, maybe even impossible at times, to ignore what we see. Our eyes are on… pretty much all the time.
Some speculate it comes from prehistoric caveman days. If we see a sudden movement, our eyes are drawn to it, similar to how fruit displays bright colours to attract our attention.
It’s simple but it works.
Fruit wants us to eat it. What does it do? Evolve bright colours to grab our attention.
Predator wants to eat us. What do our eyes do? Evolve to detect fast and subtle movements.
This has kept humans alive for hundreds of thousands of years. And with each generation our eyes became better and better at detecting danger.
Fast forward to 2025, and at least part of the reason videos suck us in so much is this evolutionary path we took. Our eyes can’t resist moving images. It could even be potentially dangerous to ignore them.
So video is kind of biologically addictive, since your eyes want to know: what’s that thing moving so fast? Safe or dangerous?
It gets you engaged in the most natural way possible. You just kind of have to know “what’s going on” before you can look away.
So what exactly is getting us engaged? Well, let’s get into the theories:
Dual Coding theory: words or images? Both!
In 1971, Allan Paivio came up with a theory called Dual Coding about how the brain processes information.

He proposed that two systems work in unison:
- Verbal information such as words, language
- Nonverbal information such as images, spatial representations
So why is it called dual coding? For the best learning outcomes, Paivio suggests using both verbal and nonverbal communication, encoding information into the brain through two pathways.
This helps learning, memory and recall, because more of the brain is engaged with the content in front of it.
In practice this means using flowcharts or mind maps within content, creating visual representations of hierarchies and showing how concepts link together. Those students who scribble or doodle within their notes: it works.
As you can imagine, this is massive for video creators. It gives guidance on how to prepare, and on how to get the most out of words and visuals within a clip.
It’s not just words or visuals - it’s both. So before starting your video, compile all the visuals and talking points which can help you.
So this puts to rest the classic debate of “what’s most important in a video?” The simple answer is: it all works together.
Cognitive load theory: simpler is better
Within educational design there’s a theory which touches on how we learn. There’s a bit we can take away from this and apply to video.
Actually, research into education and learning is one of the best parallels for understanding video. Video research with a proper methodology is harder to come by.
Proposed by John Sweller, cognitive load theory says that our working memory is limited, and that how information is presented can influence the effort required to understand it.
The key concepts here are: working memory, cognitive load and long-term memory.
We all have a limited amount of working memory available. It’s why most of us can remember somewhere between 5 and 9 numbers. After that, our working memory stops, well – working.

Now, anything new has a “cognitive load” for us. This simply means that it takes mental effort to process and make sense of. Three parts make up this concept:
- Intrinsic load: how hard the material itself is for us, given what we already know
- Extraneous load: the extra effort caused by how the information is presented
- Germane load: the effort that goes into actually making sense of the material and storing it in long-term memory
Too much cognitive load and we feel overwhelmed. We can’t concentrate, and generally feel pretty miserable.
So how do we apply this to video production?
We remove the idea that “anything goes” and replace it with: simplicity.
This means brevity in our scripts. It means clarity in our visual representations. It means removing anything “extra” because it can be a barrier to understanding.
Apps like Hemingway Editor help writers keep their language clear and concise for exactly these reasons. Nothing quite like this exists for visuals, but the idea is similar: make it easier, not harder, for people to understand.
Now, questions such as “how simple?” and “what’s too simple?” are not easy to answer using a theory. You will have to play around in practice with how (and how much) to apply this.
This often involves leaving your ego at the door.
Now, for our next theory…
Multimedia learning principle: images + text = better
After dual coding theory was published, more scientists got curious about the question of “images and text, what’s better?”. And so, the research began.
Since then, it’s been shown that images + text can work better together than either alone. Within instructional learning design, the best-known summary of this research is Mayer's 12 Principles of Multimedia Learning.
Richard Mayer's research is great because it’s hard to find “video specific” studies. There are too many factors, like: what kind of video, style, context of consumption etc.
But Mayer was able to distill principles from educational research which we can APPLY to video production. Think of them as “best practices” for how to share information, which we can then apply to video to maximize our media.
Apart from validating dual coding theory, Mayer proposes great ground rules:
- Remove all fluff. Keep only what's coherent, what's absolutely necessary.
- Highlight the important. Human attention must be directed.
- Narration & graphics = enough. Careful with subtitles. They can steal attention.
- Group related concepts (visual and verbal) together.
- Segment your information. It helps with recall and retention.
Interestingly enough, many of these ideas parallel those of the Gestalt design movement.
Gestalt was criticized for being too philosophical, but decades later many of its ideas have been supported by real-world experiments. I highly suggest reading into Gestalt design to better understand how it relates to these studies.
Gestalt provides insight into HOW and WHY objects should be placed for maximum comprehensibility. This is more of an art than a science, but nonetheless there are principles developed for us to follow.
Mirror neurons: why you should ‘show not tell’
You know “monkey see, monkey do”?

It’s more than a cute phrase. There’s real truth behind it. Scientists noticed that humans and other species have something called “mirror neurons”.
What’s that? Well, neurons allow you to do everything you do. Yeah, everything. Talking, eating, sitting, walking, writing, thinking, breathing: it all requires neurons.
You can call humans neuron processing centers and you wouldn’t be that far away from the truth. We’re just neurons upon neurons of neurons.
Now, “mirror neurons” are special. How so?
As it turns out, when we watch someone perform an action, there’s a part of our brain that imagines ourselves doing it. It’s a mechanism we use to understand the intention behind other people’s actions. It’s how we learn by watching.
Literally “monkey see, monkey do”.
Ever notice how in extreme sports, or very technical ones, after someone accomplishes an impossible move, it becomes possible for many other athletes? This is mirror neurons in action. Seeing is believing.
And once you’ve seen it done, you can start to believe that you yourself could also do this.
In simple terms: show me how it’s done, show me it’s possible, and my brain will do the rest.
“The rest” means asking a series of repeating questions (why, how, what) until the answer reveals itself. Impressive machinery that we have inside of our head, ain’t it so?
Using science in your next video: how to
Obviously now you want to use science to biologically hack your viewers. Make a video so good, they can’t even look away. Can it be done? Yes.
Can it be done consistently? Not unless you’re a real professional.
The question is great: how to make videos people can’t stop watching? Those that quite literally steal their attention away?
The unpopular answer is: no one knows. There are too many variables.
Yet every content creator or marketer wants to tell you it’s simple and easy. How can this be true? Just consider this fact.
“Video” is a word. Videos can be radically different. From length, style, subject matter, to a million other variables which may or may not appeal to a certain audience.
Believe me. Nobody can tell you “what works” with scientific precision. At best, you get guidelines which take you in the “right” direction.
Some of the variables which make it hard:
- Your audience: their knowledge, education, behaviours and psychology create huge variations in what’s too much or too little.
- Your topic: the complexity, nuances, details, scope, etc.
- The visuals: what’s available? How easy is it to understand?
So while we know the principles remain the same, the actual application can be drastically different. And as of today, there’s no clear way to figure this out. This is the “art” side of video production.
Video is a complex medium. It’s multimedia. It combines voice, image and editing techniques to make something greater than the sum of its parts.
Creativity is the key in any video production process as it requires the content creator to figure out what they wish to achieve. Don’t search for shortcuts.
Let science guide you. Let it inform your options. Let it aid your decision making.
But don’t fall into the trap of searching for a “step by step guide to making the perfect video”. It doesn’t exist.
Stop waiting for video effectiveness proof: go find it
There’s a side to being human: we love certainty.
Yet, many of the best things in life are uncertain. And this love of certainty can really hold us back. Today it’s certain that video content works. It dominates consumption habits like nothing ever before. People literally can’t get enough.
Yet, exactly why, what and how it works is uncertain. So you have to figure it out for yourself, for your audience, subject and context.
Theories and principles must be applied. Depending on how you apply them: they may or may not work for you.
Science can’t answer enough of your questions. You can only find the answers you seek through experience: doing hard work, figuring it out and making mistakes in the process.
Then you will find what works. Then you will see the magic of video as a medium.
So use these theories to guide you forward but not as a crutch. It’s your responsibility to make sense of how, when and where to apply them.
Make more videos. Evaluate how they do. Use your own experience to test the science through personal experimentation and observation.
