Beyond decoding: How rhythm and prediction shape reading fluency
Western cultures have oversimplified rhythm
Have you ever noticed how rhythm shapes not just music, but the way we speak, read and even think?
This question has fascinated me for years, and as I explored recent research I realised that our understanding of rhythm in Western culture is far too narrow. We often reduce it to simple beats and patterns, yet rhythm is the invisible thread that connects movement, language and even our sense of time.
In Western traditions rhythm is often treated as a technical skill, something to be counted or measured. But in many ancient and non-Western cultures, rhythm is understood as a living system, one that reflects the cycles of nature, the pulse of community and the flow of emotion. When we strip rhythm of its context, we lose its richness and its power to connect us, not only to music but to each other and our humanity as well. For schools this misunderstanding of rhythm has real consequences. We often ask children to try harder, when the underlying timing system that supports fluency is not yet integrated.
This narrow view of rhythm has also shaped the way it’s studied scientifically, and this ripples out into the classroom. Some empirical studies have required children to chant and clap rhythmic patterns without any musical context. From an educational perspective, this raises questions about how well such tasks capture the complexity of rhythm for children. When we isolate rhythm from its musical or cultural environment, we risk missing the very essence of what makes it meaningful: its capacity to create connection, anticipation and flow.
Rhythm in language is contextual
This same reductionist approach also appears in linguistic studies, where researchers apply the Western concept of musical rhythm to language. Natural speech doesn’t follow such predictable patterns. It flows through syllables and phrases of varying lengths, shaped by emotion, intention and context. When we listen to computer-generated speech, we instantly sense its artificiality, because the rhythm lacks the subtle nuances that make human communication so characterful and vibrant.
The brain as a prediction machine
Neuroscience shows that the brain is not a passive receiver of information, but an active predictor. In each moment it generates expectations about timing, tone and meaning, adjusting in real time as new information arrives. This is a wonderful reminder that our brains are adapting all the time to the environment that we perceive.
Both music and language rely on rhythm as the basis for structure, and both are recursive. The flow of one phrase, whether musical or linguistic, springs from the one that preceded it, so each new musical or linguistic rhythm is generated from an existing structure. This relationship between the given and the new creates a logical sequence that unfolds over time.
Prediction is learned early as a survival strategy. As infants we learn to predict, in the broadest terms, what will happen next in our environment. For example, we learn that the words ‘cup’ and ‘water’ are associated with mother and the kitchen sink, whereas the words ‘book’ and ‘story’ are associated with father at bedtime. Our early experiences become internal schemas built on contextual cues of time, place and person. In conversation, context sparks possibilities for the next phrase, and our minds work with these ‘candidates’. It is easy to enjoy a conversation when the dialogue is rich with potential avenues for discussion. A ‘cliff-hanger’ at the end of a TV episode is an example of the power of context to spark speculative thinking. The need to find out what happens next shows the power of predictive processing and its connections with the reward network: there is no doubt that our attention has been captured. Prediction does not operate in isolation, however. It depends on timing.
Entrainment as shared timing
We are all familiar with the sensation of nodding along as we listen to a person speaking, or of realising that our attention has been captured by something that we have started to read. Entrainment is a broad term for this kind of natural phenomenon, in which separate rhythms fall into step. You may have seen video clips of metronomes that begin beating at different times, but soon become part of one larger field as they synchronize. This also applies to the movement patterns of animals, fish, insects and birds. Repeated actions tend to phase lock into the most sustainable and efficient pattern over time. The effect of entrainment adds a further dimension, because synchronizing with others makes the action even more efficient and sustainable. The fascinating aspect of this is that endless repetition breaks down unless variation is introduced. This is why novelty and variety are experienced as reward and relief from the monotony of pure repetition. In a coordinated activity such as working on an assembly line, entrainment is only sustainable when a degree of variety is part of the experience. Music was introduced into factories to provide an external timekeeper as well as some contextual variety.
Neural timing and the experience of flow
I will never forget how excited I felt twenty-five years ago, when I was writing my MA dissertation, to read that the timing of a musical phrase, a spoken utterance and a physical gesture would all fall within the same predictable time window, averaging about five seconds in duration (Gerstner and Fazio, 1995).
I was thrilled by the elegance of this idea, and also to discover later in that same paper that the interval between secretions of dopamine by the basal ganglia corresponded with this same time window. This was when I learned that the regular pulsing of dopamine frames our subjective experience of the present moment. How about you? Do you feel that your perceptual window of the present is approximately five seconds long? The article went on to demonstrate the existence of a temporal window in different mammals, very persuasively in my view, because there is variability between species. Returning to humans, what is interesting is that when we focus our attention on a task, time can drag or fly. If we are engrossed, our sense of the temporal window of the ‘present’ dilates. It is even possible to enter a flow state in which there is very little sense of time passing at all.
When children are not able to let their reading flow, they experience reading as many short, disconnected units of time, as if juddering or stalling rather than flowing. In my view, their reading can be supported and stabilized using rhythm within a high-quality musical context.
In a recent study, researchers were able to demonstrate that delta-band entrainment takes place in the early stage of processing by the auditory cortex. The delta-band frequency is relatively slow. Luo and Poeppel (2007) showed that this frequency synchronized with prosody, whereas Lamekina et al. (2024) showed that delta-band entrainment carried the contextual element of prosody ‘into the future’ and enhanced the predictive aspect of sound perception at the early stage of auditory processing. This finding mirrors earlier work, from some ten years before, on the role of delta-band entrainment in the predictive perception of tones. In practice, the prosodic qualities of speech are tonal, both in the intonation that shapes the utterance and in its expressive qualities.
The expressive aspect of prosody is an indicator of deeper engagement. The authors showed that brain areas in the left hemisphere associated with prediction were activated during entrainment of delta-band frequencies in the auditory cortex of the right hemisphere, underscoring the importance of deeper rhythmic structures in language processing, whether in dialogue or in reading.
Why fluency depends on timing, not just decoding
Why does fluency depend on anticipation and not only on decoding? Because fluent reading and speech are not only about recognizing symbols or sounds. These processes are more about predicting what comes next than we might realise. Our brains are constantly forecasting patterns of rhythm, syntax and meaning. This is why anticipation is required to create a feeling of flow, which is a consequence of stable timing. Decoding in the absence of flow, on the other hand, loses context and becomes a string of steps that never truly cohere. Rhythm and prediction allow the brain to entrain to the flow of language, so comprehension and expression become effortless.
What this means for the teaching of reading
This understanding has profound implications for how we teach fluency. Traditional methods focus on accuracy and repetition, training learners to process language as a sequence of units. But if fluency depends on rhythmic prediction, then teaching should focus on helping all children to feel the timing and flow of language. Activities that emphasize rhythm, movement and prosody can help to activate the brain’s predictive timing networks, making reading and speaking more natural and connected. In this way fluency becomes more about experiencing the living pulse of individual expression.
By bringing rhythm back to the heart of learning, we value timing, affect and expression as much as accuracy. In doing this, we move beyond fluency as performance and towards fluency as comprehension and engagement. Reading is a social act of synchrony between author and reader. Rhythm teaches us that meaning is not something we construct alone, but something we create together (with the author) in time.
From an instructional perspective, this suggests that fluency difficulties are not always a matter of insufficient practice or decoding accuracy, but of unstable timing and prediction. Interventions that strengthen rhythmic and prosodic sensitivity may therefore support reading fluency indirectly, by restoring flow rather than increasing effort.
REFERENCES
Gerstner, G. F. and Fazio, V. A. (1995) Evidence of a universal perceptual unit in mammals, Ethology, 101, 89-100.
Lamekina, Y., Titone, L., Burkhard, M. and Meyer, L. (2024) Speech prosody serves temporal prediction of language via contextual entrainment, The Journal of Neuroscience, 44(28), e1041232024.
Luo, H. and Poeppel, D. (2007) Phase patterns of neuronal responses reliably discriminate speech in human auditory cortex, Neuron, 54, 1001-1010.




