Review article

Exploring cognitive individuality and the underlying creativity in statistical learning and phase entrainment

Tatsuya Daikoku1[*],2,3, Kevin Kamermans1, Maiko Minatoya1

1Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan

2Centre for Neuroscience in Education, University of Cambridge, Cambridge, UK

3Center for Brain, Mind and KANSEI Sciences Research, Hiroshima University, Hiroshima, Japan

EXCLI J 2023;22:Doc828



Statistical learning starts at an early age and is intimately linked to brain development and the emergence of individuality. Through such a long period of statistical learning, the brain updates and constructs statistical models, with the model's individuality changing based on the type and degree of stimulation received. However, the detailed mechanisms underlying this process are unknown. This paper discusses three main aspects of statistical learning: 1) cognitive individuality based on the "reliability" of prediction, 2) the construction of information “hierarchy” through chunking, and 3) the acquisition of the “1-3 Hz rhythm” that is essential for early language and music learning. We developed a Hierarchical Bayesian Statistical Learning (HBSL) model that takes into account both reliability and hierarchy, mimicking the statistical learning processes of the brain. Using this model, we conducted a simulation experiment to visualize the temporal dynamics of perception and production processes through statistical learning. By modulating the sensitivity to sound stimuli, we simulated three cognitive models with different degrees of reliance on bottom-up sensory stimuli relative to top-down prior prediction: hypo-sensitive, normal-sensitive, and hyper-sensitive models. The results suggest that statistical learning plays a crucial role in the acquisition of 1-3 Hz rhythm. Moreover, the hyper-sensitive model quickly learned the sensory statistics but became fixated on its internal model, making it difficult to generate new information, whereas the hypo-sensitive model had lower learning efficiency but may be more likely to generate new information. Various individual characteristics may not necessarily confer an overall advantage over others, as there may be a trade-off between learning efficiency and the ease of generating new information.
This study has the potential to shed light on the heterogeneous nature of statistical learning, as well as the paradoxical phenomenon in which individuals with certain cognitive traits that impede specific types of perceptual abilities exhibit superior performance in creative contexts.

Keywords: phase entrainment, Bayesian, chunking, hierarchy, music, rhythm


Understanding cognitive individuality and its underlying creativity is crucial for advancing our understanding of human cognition. One critical cognitive function that contributes to language and music acquisition is known as “statistical learning” (Saffran et al., 1996[71]). Statistical learning is an innate function of the brain that allows individuals to learn the underlying structure of sensory input by detecting statistical patterns and regularities in the environment.

Recent studies have suggested that individual differences in statistical learning are linked to various cognitive abilities and developmental disorders such as autism spectrum disorder (ASD) and developmental dyslexia (Misyak and Christiansen, 2012[55]; Siegelman et al., 2017[73]; Palmer and Mattys, 2016[60]; Obeid et al., 2016[58]; for review, see Arciuli, 2017[3]; Saffran, 2018[70]). Despite the potential importance of statistical learning in comprehending individual cognitive differences and brain development, it remains unclear how such differences in cognitive abilities arise through statistical learning.

Here, we review neural and computational studies on how cognitive individuality emerges through statistical learning in the brain. Further, for constructive understanding, we conducted a simulation experiment to visualize the temporal dynamics of perception and production processes through statistical learning in different cognitive models. We utilized three models that have varying levels of sensitivity to sound stimuli: hypo-sensitive, normal-sensitive, and hyper-sensitive models. Considering that statistical learning is fundamental to brain development, we also discuss how typical versus atypical brain development influences the perception and production of information through statistical learning.

Statistical Learning and Its Predictive Processing

Auditory prediction and its individuality

Recently, a growing body of studies has tried to explain the neural and computational mechanisms of learning and generation of auditory structured information (such as music and language) based on the general principle of predictive processing in the brain (Vuust et al., 2022[83]). Predictive processing in the brain works to minimize the prediction error between the bottom-up sensory signals of sound stimuli from the external environment and the top-down predictive signals based on internal models (Friston, 2010[32], 2017[33]; Clark, 2013[16]). The "reliability" of the prior probability of top-down predictions is controlled by perceptual uncertainty. The brain learns to adapt continuously to the uncertain environment by reducing perceptual uncertainty and prediction errors.

Researchers have attempted to understand cognitive individuality from the perspective of predictive processing in the brain. For example, it has been explained by the dependence on top-down predictions based on the prior probability of internal models (hypo-/hyper-prior) and the dependence on bottom-up sensory signals from the external environment (hypo-/hyper-sensitive) (Pellicano and Burr, 2012[64]; Philippsen et al., 2022[66]). Intuitively, hyper-prior/hypo-sensitive individuals can be characterized as those with strong judgments based on past experiences and hypo-prior/hyper-sensitive as those who adapt quickly to new environments. Recent studies have suggested that such distinct dependence on prior prediction reflects the dynamics of brain development (Philippsen and Nagai, 2019[65]; Philippsen et al., 2022[66]). Neurotypical children tend to exhibit unstable dependence on prior predictions, but over time, they develop the ability to effectively combine sensory information with prior predictions. This enhances their resilience to disruptions in an uncertain environment. On the other hand, individuals with ASD may exhibit distinct patterns of development in predictive processing (see Table 1(Tab. 1); References in Table 1: Bijlenga et al., 2017[11]; Engel-Yeger et al., 2016[29]; Hallam et al., 2009[41]; Lam et al., 1990[50]; Lewy et al., 1985[54]; Nathan et al., 1999[57]; Panagiotidi et al., 2018[61]; Pellicano and Burr, 2012[64]; Philippsen et al., 2022[66]; Van de Cruys et al., 2019[80]; Ward et al., 2017[85]). That is, they tend to exhibit stronger dependence or reliance on prior predictions in certain situations (hyper-prior) (Philippsen and Nagai, 2019[65]), while in other circumstances, they may exhibit a weaker dependence or reliance on prior predictions (hypo-prior) or stronger reliance on sensory input (hyper-sensitive) (Sinha et al., 2014[74]; Thye et al., 2018[76]; Robertson and Baron-Cohen, 2017[69]). 
Thus, they tend to exhibit variability in their reliance on prior predictions, rather than a consistent pattern of either enhancement or decrease.

Statistical learning of structured sequence

Statistical learning is an essential cognitive function that is closely linked to brain development (Saffran, 2018[70]) and important for understanding individual differences in the perception and production of music and language within the framework of predictive processing. The basic mechanism involves calculating the statistical probability of environmental information (particularly the transition probability of sequential information) and the uncertainty of the probability distribution, and predicting future information based on an internal probabilistic model acquired through statistical learning. The transition probability is the conditional probability of an event $e_{n+1}$ given the preceding events $e_n$, based on Bayes' theorem: $P(e_{n+1} \mid e_n)$. The uncertainty is often calculated using information-theoretic measures such as the conditional entropy:

$$H(X_{i+1} \mid X_i) = -\sum_{x_i} P(x_i) \sum_{x_{i+1}} P(x_{i+1} \mid x_i) \log_2 P(x_{i+1} \mid x_i) \tag{1}$$

From the psychological standpoint, the formula can be construed as positing that the brain expects a forthcoming event $e_{n+1}$ based on the most recent preceding events $e_n$ in a given sequence.
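As an illustration of the two quantities above, the following minimal sketch estimates first-order transition probabilities from bigram counts and then computes the conditional entropy in bits. The function names and the toy melody are hypothetical and are not taken from the authors' scripts; plain relative frequencies are assumed.

```python
from collections import Counter, defaultdict
from math import log2


def transition_probabilities(sequence):
    """First-order transition probabilities P(next | current) from bigram counts."""
    pair_counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        pair_counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
        for current, counts in pair_counts.items()
    }


def conditional_entropy(sequence):
    """Conditional entropy (bits) of the next event given the current event."""
    contexts = sequence[:-1]  # every event that has a successor
    context_counts = Counter(contexts)
    total = len(contexts)
    probs = transition_probabilities(sequence)
    entropy = 0.0
    for current, row in probs.items():
        p_current = context_counts[current] / total
        # Each p * log2(p) is <= 0, so subtracting adds a non-negative amount.
        entropy -= p_current * sum(p * log2(p) for p in row.values())
    return entropy


# Hypothetical toy melody written as note names (illustration only).
melody = list("CDECDECDEG")
tp = transition_probabilities(melody)
h = conditional_entropy(melody)
print(tp["E"], round(h, 4))
```

In this toy sequence only the transitions out of E are uncertain, so the conditional entropy is small; a sequence with more evenly distributed continuations would yield a higher value.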

The prediction strategy and resulting impression can vary depending on uncertainty, even when the transition probabilities are identical. For example, a recent neural study has revealed that the brain strategically alters the “order” of transition probabilities, that is, the length of the preceding n events used as a reference for expectation, based on the uncertainty of sequential information (Daikoku and Yumoto, 2023[27]). Other evidence has shown that the preference for music stimuli can also be understood as a prediction process: it is represented by precision-weighted inverted U curves of the product of the transition probability and uncertainty (Vuust et al., 2012[82]; Vuust and Witek, 2014[84]; Koelsch et al., 2019[49]; Cheung et al., 2019[15]; Gold et al., 2019[34]). Thus, the prediction strategy (order of transition probability) and individual preference are formed by the integration of uncertainty into probability based on an individual's internal model.

Such uncertainty is not universally inherent in music or language per se. Rather, it is “perceptual” uncertainty that is shaped by an individual's auditory experience. For example, in the case of language, when a native speaker hears a particular word, the uncertainty in predicting the probable subsequent words is low, making prediction easier. Conversely, for non-native speakers, predicting the next word is more difficult due to higher uncertainty. This is a result of individuals constantly updating their internal models through extended periods of statistical learning, thereby generating an appropriate language probability model. Neural and behavioral studies have highlighted the impact of an individual's auditory experience and expertise on statistical learning abilities (Daikoku and Yumoto, 2020[26]), which can lead to familiarity with specific types of music or genres (Vuust et al., 2012[82]). Furthermore, in addition to perception and learning, past experiences with statistical learning play a crucial role in the development of individual traits related to music production (composition) and creativity (Daikoku and Yumoto, 2020[26]).

Brain's statistical learning is Bayesian inference but not maximum likelihood estimation

Importantly, auditory experience affects not only perceptual uncertainty, but also the "reliability" of probabilities. For instance, an A-to-B transition that occurs in (1) 9 out of 10 trials and one that occurs in (2) 90 out of 100 trials both have a transition probability of 90 %. However, the degree of reliability is higher in the latter case than in the former, because the estimate rests on ten times as many observations. Such reliability is useful for the brain to make judgments even for events with low transition probability. Comparing an event that occurs in 10 out of 100 trials with an event that occurs in 1 out of 10, the brain will recognize that the former is reliably unpredictable and can confidently use this information to make predictions.
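The contrast between 9 out of 10 and 90 out of 100 can be made concrete with a Beta posterior over the transition probability. The sketch below is an illustration only: it assumes a uniform Beta(1, 1) prior and takes reliability to be the inverse of the posterior variance, and the function name is hypothetical.

```python
def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of a probability under a Beta prior (assumed uniform)."""
    a = prior_a + successes
    b = prior_b + (trials - successes)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


# Same relative frequency, different amounts of experience.
mle_small, mle_large = 9 / 10, 90 / 100
mean_small, var_small = beta_posterior(9, 10)
mean_large, var_large = beta_posterior(90, 100)

# Reliability taken here as the inverse of the posterior variance.
reliability_small = 1.0 / var_small
reliability_large = 1.0 / var_large
print(mle_small == mle_large, reliability_small < reliability_large)
```

The maximum likelihood estimate is identical in both cases, whereas the posterior variance shrinks as observations accumulate, so the estimate based on 100 trials is the more reliable one.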

Neurophysiological studies have observed a gradual representation of statistical learning effects as the number of learning repetitions increases (Daikoku et al., 2015[25]), indicating that the brain's statistical learning is based on Bayesian inference, which gradually improves the reliability of probabilities with increasing experience rather than on maximum likelihood estimation, which does not vary with learning repetitions. Thus, the amount of learning (auditory experience) not only changes the uncertainty of the brain's internal model but also the reliability of probabilities, which can affect cognitive individuality and the way of predictions.

However, most studies of statistical learning have relied on maximum likelihood estimation based on Markov models or n-gram models that do not consider the "reliability" of probabilities, and thus have not taken into account the effect of the number of learning trials. Therefore, this study developed a novel model, referred to as the “Hierarchical Bayesian Statistical Learning (HBSL)” model, which incorporates the Bayesian reliability of probabilities into a Markov model. We then used this model to examine the learning process when a specific auditory stimulus sequence is repetitively learned.

It is of note that the reliability of probabilities is subject not only to the amount of learning (experience), but also to prediction biases. As mentioned above (section Auditory prediction and its individuality), cognitive individuality can be characterized in terms of strong or weak dependence on the prior probability of internal models, referred to as hyper-prior or hypo-prior, respectively, and strong or weak dependence on the auditory inputs, referred to as hyper-sensitive or hypo-sensitive, respectively (Pellicano and Burr, 2012[64]; Philippsen et al., 2022[66]). Thus, the reliability of prediction can vary depending on the way of prediction as well as on auditory experience. In summary, cognitive individuality is associated with a mixture of "dependency on prior prediction (or sensory signal)" and "amount of statistical learning".

Hierarchy of syntactic and rhythm structure, and phase entrainment

Statistical learning was originally derived from a hypothesis that explains the mechanism of chunking, which detects information units with high transition probabilities, such as words or phrases, from sequential information (Saffran et al., 1996[71]). Therefore, many previous studies have examined the neural and computational mechanisms of chunking through statistical learning. On the other hand, recent studies have proposed two types of “hierarchical” statistical learning systems (Altman, 2017[2]; Daikoku et al., 2021[24]). The first is a system based on the fundamental function of statistical learning, which groups pieces of information that have high transition probabilities and integrates them into a cohesive unit. The second is a system that arranges various chunked units to create a hierarchical syntactic structure (Figure 1(Fig. 1)). That is, statistical learning plays a crucial role in the acquisition of the hierarchy, which is an essential and unique feature of language and music (Patel, 2003[63]).

In particular, the hierarchical structure of auditory rhythms has been considered important for the acquisition of music and language (Goswami, 2017[38]). The hierarchy of rhythms refers to a structure in which the lower-level rhythms, such as those corresponding to syllables and musical notes (e.g., crotchets) around 4-12 Hz, are included in the higher-level rhythms around 1-3 Hz, which correspond to prosody, intonation, and long musical notes such as minims (Daikoku et al., 2022[22]). Furthermore, there are rhythms around 12-30 Hz that correspond to phonemes or sound onsets at even lower levels of the hierarchy. This can be visualized by analyzing the amplitude modulation (AM) envelope of sound waveforms (Figure 2(Fig. 2); Reference in Figure 2: Daikoku and Goswami, 2022[20]).

It is known that human auditory perception relies in part on phase entrainment of the AM rhythm patterns in sounds at different timescales simultaneously. Such a phase entrainment (also described as phase alignment, neural coupling, tracking, and synchronization) has been shown to contribute to parsing of the sound signal into units such as syllables and words (Poeppel, 2003[67]).

A recent study has shown that the acquisition of the slower rhythm (1-3 Hz), that is, phase entrainment to the 1-3 Hz rhythm (Attaheri et al., 2022[5]), is particularly important for early learning and development of language and music. Notably, evidence has also shown that the ability of 1-3 Hz phase entrainment is associated with statistical learning capacity (Assaneo et al., 2019[4]), and that neural oscillations synchronize with the statistical chunks acquired via statistical learning (Batterink and Paller, 2017[8]).

However, brain development can interfere with this function of phase entrainment through statistical learning (Smalle et al., 2022[75]). Individuals with developmental disorders such as Autism Spectrum Disorder (ASD) and developmental dyslexia, which is characterized by difficulties in reading, spelling, and impaired phonological processing (Ramus et al., 2003[68]; Vellutino et al., 2004[81]), exhibit decay of statistical learning and rhythm processing (Arciuli, 2017[3]; Saffran, 2018[70]; Goswami, 2019[39]). Therefore, statistical learning plays a critical role in brain development and the emergence of cognitive individuality. Over a prolonged period of statistical learning, the brain updates and constructs statistical models, with the model's individuality changing based on the type and degree of stimulation received. However, the detailed mechanisms underlying this process remain unknown.

To provide a constructive understanding of the potential relationship between statistical learning and the acquisition of 1-3 Hz rhythm, the next section presents a simulation experiment that visualizes the temporal dynamics of perception and production processes through statistical learning. We use a newly devised model, referred to as the HBSL model, which takes into account both reliability and hierarchy and thereby mimics the statistical learning processes of brains with different cognitive individuality. Three variants of the model differ in their dependence on bottom-up sensory stimuli relative to top-down prior prediction: hypo-sensitive, normal-sensitive, and hyper-sensitive models. Then, we discuss how atypical cognitive development and individuality (i.e., hypo- and hyper-sensitivity) influence perception and production through statistical learning.


Hierarchical Bayesian statistical learning model

This study developed a computational model that simulates the statistical learning processes of the brain, referred to as the HBSL model (Daikoku and Nagai, 2022[23]; Daikoku et al., 2021[24]) (Figure 1(Fig. 1)). The scripts of the model have been deposited to an external source. The model integrates Bayesian estimation with Markov processes, using a Dirichlet distribution as the prior distribution. It not only calculates the transition probabilities but also determines the "reliability of probabilities" from the inverse of the variance of the prior distribution of the transition probabilities. Using the normalized values of transition probability and reliability, the model chunks a transition pattern when the product "reliability * probability" is greater than a constant c. The constant can be chosen based on the sample length and the number of learning trials; here, we set c=5 given the sample length used in this experiment. Three models (hypo-, normal-, and hyper-sensitive) with different degrees of dependence on sensory signals were generated by manipulating the parameter vector of the Dirichlet distribution:

Hypo-sensitive: α = (α_1*0.25, ..., α_K*0.25)

Normal-sensitive: α = (α_1*1, ..., α_K*1)

Hyper-sensitive: α = (α_1*4, ..., α_K*4) (2)

where each α_i corresponds to the prior probability of category i. Specifically, for K categories, α is a K-dimensional vector of positive real numbers. Although the degree of updating transition probabilities remains constant among the three models, differences emerge in terms of the changes in the reliability of transition probability. In the hyper-sensitive model, the reliability of probabilities (the inverse of the variance of the prior distribution) varies easily depending on the sensory input, while in the hypo-sensitive model, the reliability of probabilities is less likely to vary even when a new input is provided (its scaling factor is sixteen times smaller than that of the hyper-sensitive model, 0.25 versus 4). The normal-sensitive model is an intermediate model between the hyper- and hypo-sensitive models in terms of sensitivity to sensory input.
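One way to read equation (2) is through the standard moments of a Dirichlet distribution: multiplying the whole parameter vector by a constant leaves the expected transition probabilities unchanged but changes their variance, and therefore the reliability defined as the inverse variance. The sketch below illustrates only this property. The parameter values and function name are hypothetical, and whether the deposited scripts apply the scaling factors in exactly this way is an assumption.

```python
def dirichlet_mean_and_reliability(alpha):
    """Component-wise mean and inverse variance of a Dirichlet(alpha) distribution."""
    alpha_0 = sum(alpha)
    means = [a / alpha_0 for a in alpha]
    variances = [
        a * (alpha_0 - a) / (alpha_0 ** 2 * (alpha_0 + 1)) for a in alpha
    ]
    reliabilities = [1.0 / v for v in variances]
    return means, reliabilities


# Hypothetical parameter vector for the transitions out of one context (K = 3).
base_alpha = [6.0, 3.0, 1.0]
scales = {"hypo": 0.25, "normal": 1.0, "hyper": 4.0}  # factors from equation (2)

results = {}
for name, scale in scales.items():
    scaled_alpha = [a * scale for a in base_alpha]
    results[name] = dirichlet_mean_and_reliability(scaled_alpha)

for name, (means, reliabilities) in results.items():
    print(name, [round(m, 3) for m in means], [round(r, 1) for r in reliabilities])
```

Under these assumptions the three variants share the same expected transition probabilities, while the reliability increases from the hypo- to the hyper-sensitive setting, so a fixed threshold on "reliability * probability" would be crossed after fewer observations in the hyper-sensitive case.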

Learning and production processes

We generated fifteen different models by manipulating the degree of dependence on sensory signals and the amount of learning. We used the MIDI data of the Japanese children's song "Yuuyake Koyake" as the training data, and repeated the learning of the song one to five times using each of the three models (hypo-, normal-, and hyper-sensitive). As a result, a total of fifteen models were generated, consisting of three degrees of dependence on sensory signals and five amounts of learning. We investigated how each of the hypo-, normal-, and hyper-sensitive models transforms its internal model over five trials of learning. Furthermore, using the probability distributions of these fifteen models, a hundred pieces of music were probabilistically generated for each model through an automatic composition process (Daikoku and Nagai, 2022[23]).
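The idea of generating pieces probabilistically from a learned distribution can be sketched as sampling from a first-order transition table. This is not the automatic composition process of Daikoku and Nagai (2022); the table, the note names, the piece length, and the function name below are hypothetical and serve only to show the principle.

```python
import random


def generate_sequence(transitions, start, length, rng):
    """Sample a sequence by repeatedly drawing the next event from P(next | current)."""
    sequence = [start]
    while len(sequence) < length:
        row = transitions.get(sequence[-1])
        if not row:  # no known continuation for this event: stop early
            break
        events = list(row)
        weights = [row[event] for event in events]
        sequence.append(rng.choices(events, weights=weights, k=1)[0])
    return sequence


# Hypothetical transition table standing in for a learned internal model.
toy_transitions = {
    "C": {"D": 1.0},
    "D": {"E": 1.0},
    "E": {"C": 0.7, "G": 0.3},
    "G": {"C": 1.0},
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
pieces = [generate_sequence(toy_transitions, "C", 16, rng) for _ in range(100)]
print("".join(pieces[0]))
```

A model with a sharper (more confident) transition table produces pieces that stay closer to the training material, whereas a flatter table produces more varied output.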

Comparison of internal representations in the model

We compared the total Bayesian surprise (or total prediction errors) that occurred during learning, measured by the Kullback-Leibler divergence between a distribution P(x) before learning an event ($e_n$) and a distribution Q(x) after learning the event ($e_{n+1}$), as well as the total number of chunks generated during 5 trials of statistical learning. The Kullback-Leibler divergence has often been used to measure prediction error or Bayesian surprise in the framework of predictive processing of the brain (Friston, 2010[32]; Baldi and Itti, 2010[7]; Itti and Baldi, 2009[44]). It measures the dissimilarity between two probability distributions. It represents how much information is lost when one probability distribution is replaced by another, and since it is non-negative, a small value indicates that the two distributions are similar. Specifically, it is calculated by taking the logarithm of the ratio of the two distributions at each point (equivalently, the difference between their logarithms) and then computing the weighted average with respect to one of the distributions. The Kullback-Leibler divergence between two probability distributions P(x) and Q(x) is calculated using the following formula:

$$D_{\mathrm{KL}}(P \parallel Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)} \tag{3}$$

Here, P(i) and Q(i) represent the probabilities of selecting the value i according to the probability distributions P and Q, respectively. In addition, we calculated the average probability distribution of the 100 songs generated by each model and compared the similarity of the models to the training data (i.e., original data) using t-distributed stochastic neighbor embedding (tSNE).
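The divergence in equation (3) can be sketched for a single learning step as follows. The pseudo-counts are hypothetical, the natural logarithm is assumed because the base is not specified in the text, and the helper names are illustrative rather than taken from the authors' scripts.

```python
from math import log


def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions over the same events."""
    return sum(p_i * log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0)


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical pseudo-counts for three possible next events.
counts_before = [3, 2, 1]
counts_after = [4, 2, 1]  # one more observation of the first event

p = normalize(counts_before)  # distribution before learning the event
q = normalize(counts_after)   # distribution after learning the event
surprise = kl_divergence(p, q)
print(round(surprise, 5))
```

Summing this quantity over all events in a learning trial gives a total Bayesian surprise for that trial; it shrinks as each new observation changes the distribution less.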

Comparison of acoustic properties of rhythm

We converted the MIDI data of the 100 songs generated by each model into WAV format and extracted the rhythm waveform (modulation wave) below 15 Hz using the Bayesian probabilistic amplitude demodulation (PAD) model (Turner and Sahani, 2011[78]). The acoustic signals were first normalized based on the z-score (mean = 0, SD = 1) in case the sound intensity influenced the spectrotemporal modulation feature. The spectrotemporal modulation of the signals was analyzed using PAD to derive the dominant AM patterns. Music and speech signals can be decomposed into slow-varying AM patterns and rapidly-varying carrier or frequency modulation (FM) patterns (Elliott and Theunissen, 2009[28]; Turner, 2010[77]; Daikoku et al., 2022[22]). AM patterns are responsible for fluctuations in sound intensity, which are considered to be a primary acoustic feature of perceived hierarchical rhythm. On the other hand, FM patterns reflect fluctuations in spectral frequency and noise. AM envelopes of speech signals can be separated from the FM structure by means of amplitude demodulation processes. The PAD model employs Bayesian inference to infer the modulators and carrier, and to identify the envelope that best fits the data and a priori assumptions. More specifically, amplitude demodulation is the process by which a signal ($y_t$) is decomposed into a slowly varying modulator ($m_t$) and a rapidly varying carrier ($c_t$):

$$y_t = m_t \, c_t \tag{4}$$

PAD employs amplitude demodulation as a process of both learning and inference. Learning involves the estimation of parameters that describe distributional constraints, such as the expected timescale of variation of the modulator. Inference involves estimating the modulator and carrier from the signals based on learned or manually defined parametric distributional constraints. This information is probabilistically encoded in the likelihood function $P(y_{1:T} \mid c_{1:T}, m_{1:T}, \theta)$, the prior distribution over the carrier $p(c_{1:T} \mid \theta)$, and the prior distribution over the modulators $p(m_{1:T} \mid \theta)$. Here, the notation $x_{1:T}$ represents all the samples of the signal $x$, ranging from 1 to a maximum value $T$. Each of these distributions depends on a set of parameters $\theta$, which control factors such as the typical timescale of variation of the modulator or the frequency content of the carrier. In more detail, the parametrized joint probability of the signal, carrier, and modulator is:

$$P(y_{1:T}, c_{1:T}, m_{1:T} \mid \theta) = P(y_{1:T} \mid c_{1:T}, m_{1:T}, \theta)\, p(c_{1:T} \mid \theta)\, p(m_{1:T} \mid \theta) \tag{5}$$

Bayes' theorem is applied for inference, forming the posterior distribution over the modulators and carriers, given the signal:

$$P(c_{1:T}, m_{1:T} \mid y_{1:T}, \theta) = \frac{P(y_{1:T}, c_{1:T}, m_{1:T} \mid \theta)}{P(y_{1:T} \mid \theta)} \tag{6}$$

The full solution to PAD is a distribution over the possible pairs of modulators and carriers. The most probable pair of modulator and carrier given the signal is returned:

$$m^{*}_{1:T},\, c^{*}_{1:T} = \underset{c_{1:T},\, m_{1:T}}{\arg\max}\; P(c_{1:T}, m_{1:T} \mid y_{1:T}, \theta) \tag{7}$$

PAD utilizes Bayesian inference to estimate the most suitable modulator (i.e., envelope) and carrier that best align with the data and a priori assumptions. The resulting solution takes the form of a probability distribution, which describes the likelihood of a specific setting of modulator and carrier given the observed signal. Thus, PAD summarizes the posterior distribution by returning the specific envelope and carrier with the highest posterior probability, thereby providing the best fit to the data.
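The decomposition in equation (4) can be illustrated on a synthetic signal. The sketch below is not PAD: it swaps in a much simpler rectify-and-smooth envelope estimate, with no Bayesian inference, purely to show what a slowly varying modulator and a rapidly varying carrier look like numerically. The sampling rate, the 2 Hz modulator, the 200 Hz carrier, and the function name are all hypothetical choices.

```python
import math


def rectify_and_smooth(signal, window):
    """Crude envelope: moving average of |signal|, rescaled by pi/2.

    The rescaling assumes a sinusoidal carrier, whose mean absolute value is 2/pi.
    """
    prefix = [0.0]  # prefix sums of the rectified signal for fast window means
    for value in signal:
        prefix.append(prefix[-1] + abs(value))
    half = window // 2
    envelope = []
    for t in range(len(signal)):
        lo = max(0, t - half)
        hi = min(len(signal), t + half)
        envelope.append((prefix[hi] - prefix[lo]) / (hi - lo) * math.pi / 2)
    return envelope


fs = 8000        # sampling rate in Hz (hypothetical)
duration = 2.0   # seconds
times = [n / fs for n in range(int(fs * duration))]

modulator = [1.0 + 0.5 * math.sin(2 * math.pi * 2.0 * t) for t in times]  # slow, positive
carrier = [math.sin(2 * math.pi * 200.0 * t) for t in times]              # fast
signal = [m * c for m, c in zip(modulator, carrier)]                      # y_t = m_t * c_t

estimate = rectify_and_smooth(signal, window=fs // 200)  # one carrier period
interior = slice(fs // 10, -(fs // 10))                  # skip edge effects
max_error = max(abs(e - m) for e, m in zip(estimate[interior], modulator[interior]))
print(round(max_error, 4))
```

Unlike this fixed-window estimate, PAD infers the modulator and carrier jointly under explicit priors, which is what allows the timescale of the envelope to be learned or constrained.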

PAD can be run recursively using different demodulation parameters each time, producing a cascade of amplitude modulators at different oscillatory rates to form an AM hierarchy. The positive slow envelope is modeled by applying an exponential nonlinear function to a stationary Gaussian process, resulting in a positive-valued envelope with a constant mean over time. The degree of correlation between points in the envelope can be constrained by the timescale parameters of variation of the modulator (i.e., envelope), which can either be manually entered or learned from the data.

In the present study, we manually entered the PAD parameters to produce the modulators at an oscillatory band level (i.e., <10 Hz) isolated from a carrier at a higher frequency rate (>10 Hz). The carrier reflects components, including noise and pitches, for which the frequencies are much higher than those of the core modulation bands. In each sample, the modulators (envelopes) were converted into the time-frequency domain using a scalogram (Figure 2(Fig. 2)). The scalograms depict the AM envelopes derived by recursive application of probabilistic amplitude demodulation. We then calculated the average power at each frequency and further averaged it over the 100 songs generated by each model.

Representation of Individual Difference

Learning process

The Hypo-sensitive model had the highest total Bayesian surprise or total prediction error (i.e., Kullback-Leibler divergence) during learning, followed by the Normal-sensitive model and the Hyper-sensitive model (Figure 3(Fig. 3), left). Furthermore, the Normal-sensitive and Hyper-sensitive models showed a gradual decrease in Bayesian surprise and an increase in chunking across learning trials, whereas the Hypo-sensitive model showed no decrease in Bayesian surprise or increase in chunking (Figure 3(Fig. 3)).

Production process

In terms of acoustic features of composed music after learning, both the Hyper-sensitive model and Normal-sensitive model showed a gradual increase in the 2 Hz rhythm, which corresponds to short phrases that are considered important in the initial learning of auditory sequences (such as music or language) (Figure 5(Fig. 5)). On the other hand, rhythms corresponding to notes or beats in the 3-5 Hz range gradually decreased with learning. In contrast, the Hypo-sensitive model showed a gradual decrease in the 2 Hz rhythm and a gradual increase in the 3-5 Hz rhythm with learning.

Regarding the probability distribution, the tSNE analysis showed that the similarity of the probability distribution of the composed music to that of the original music was highest for the Hyper-sensitive model, followed by the Normal-sensitive model and the Hypo-sensitive model, in that order (Figure 4(Fig. 4)).


Emergence of individuality through statistical learning

Statistical learning is a fundamental process for brain development and contributes to forming individual differences in perception and production (Siegelman et al., 2017[73]; Daikoku and Yumoto, 2020[26]; Daikoku, 2018[19]). Computational studies allow for the modeling of the brain's developmental processes and the emergence of individuality in predictive functions underlying statistical learning. In this study, we used a model that mimics the brain's statistical learning processes, including hierarchical structure building, to investigate how auditory cognitive individuality arises from statistical learning. Specifically, we conducted a simulation experiment to examine the contributions of two factors to auditory cognitive individuality: 1) sensitivity to sensory signals (hypo-, normal-, hyper-sensitive) and 2) amount of statistical learning (number of learning trials). Our results showed that various auditory cognitive individualities can arise depending on differences in both sensitivity to sensory signals and amount of statistical learning.

In particular, the normal- and hyper-sensitive models gradually reduced Bayesian surprise, increased the number of chunks (Figure 3(Fig. 3)), and generated 1-3 Hz rhythm (Figure 5(Fig. 5)) through learning. Moreover, the effects were more pronounced and emerged earlier in the hyper-sensitive model than in the normal-sensitive model. In contrast, in the hypo-sensitive model, neither the reduction of Bayesian surprise nor the chunking occurred through learning. In sum, the simulation experiment showed that the learning efficiency was highest for the hyper-sensitive model, lowest for the hypo-sensitive model, and intermediate for the normal-sensitive model. Due to its high sensitivity to sensory signals, the hyper-sensitive model may exhibit greater adaptability to external input information, potentially resulting in faster learning rates for external sensory statistics.

On the other hand, the hypo-sensitive model produced music with statistically different characteristics from those of the training data (i.e., original music), compared to the other models (Figure 4(Fig. 4)). That is, the statistical similarity between the generated music and the original music was highest for the hyper-sensitive model and lowest for the hypo-sensitive model. This suggests that the hypo-sensitive model, which showed poor learning efficiency for external information, may have difficulty in statistical learning and chunking, but may be more likely to generate new information. In contrast, the hyper-sensitive model, which can efficiently learn statistical regularities of external information, may have difficulty in generating new information. This suggests that different levels of dependence or reliance on sensory signals lead to differences in the internal model even for the same learning stimuli (Figure 6(Fig. 6)), and that these differences also influence performance in creative contexts.

From statistical learning to rhythm acquisition

This study demonstrated that statistical learning contributes to the acquisition of rhythms around 1-3 Hz (Figure 4). Furthermore, while the rhythms around 1-3 Hz gradually increased in the hyper-sensitive and normal-sensitive models, they gradually decreased in the hypo-sensitive model. Previous studies have shown that individuals with ASD, who also tend to exhibit hypo-sensitivity, produce speech with weak prosody and monotone intonation, which corresponds to a core 1-3 Hz phase component in the speech rhythm hierarchy (Kanner, 1943[48]). In addition, a previous study that examined the speech rhythm of individuals with ASD using the PAD model employed in this study revealed a decrease in power around the 1-3 Hz frequency range corresponding to the rhythm of prosody and intonation (Daikoku et al., 2022[22]). Taken together with the results of this study, it is possible that individuals with hypo-sensitivity have difficulty acquiring rhythms around 2 Hz.

A recent study has shown that the neural processing of slower rhythms, that is, oscillatory phase entrainment to 1-3 Hz rhythm (Attaheri et al., 2022[5]), is particularly important for early learning and development in language. Notably, evidence has also shown that the ability of 1-3 Hz phase entrainment is associated with statistical learning capacity (Assaneo et al., 2019[4]), and that neural oscillations synchronize with the statistical chunks acquired via statistical learning (Batterink and Paller, 2017[8]). However, brain development has been reported to interfere with this function of phase entrainment in statistical learning (Smalle et al., 2022[75]). Individuals with ASD and developmental dyslexia exhibit impairments in statistical learning as well as in rhythm processing (Arciuli, 2017[3]; Saffran, 2018[70]; Goswami, 2019[39]). Therefore, statistical learning may play a critical role in the acquisition and development of 1-3 Hz rhythm.

As stated in the Introduction section, two types of “hierarchical” statistical learning systems have been proposed (Altmann, 2017[2]; Daikoku et al., 2021[24]). The first is to chunk series of “local” information that have high transition probabilities. The second is to arrange these chunked units to create a “global” hierarchical syntactic structure (Figure 1). In our study, the first function corresponds to the acquisition of 3-5 Hz rhythm, while the second corresponds to the acquisition of 1-3 Hz rhythm (Figure 2). According to previous studies, individuals with ASD exhibit inconsistent responses to local deviants (information that induces prediction error). For example, in studies of ASD and mismatch negativity (MMN, an event-related response component in an EEG signal that occurs in response to deviant signals), some studies demonstrated weaker MMN in individuals with ASD than in typically developed individuals (Seri et al., 1999[72]; Abdeltawwab and Baz, 2014[1]; Bonnet-Brilhault et al., 2016[12]), while other studies detected larger MMN responses in ASD (Gomot et al., 2002[36], 2011[35]; Ferri et al., 2003[30]; Lepistö et al., 2005[53]; Green et al., 2020[40]). These findings suggest that individuals with ASD may exhibit either hypo-sensitivity or hyper-sensitivity to local sensory properties. However, studies on global predictive processing (e.g., hierarchical structure building) have consistently reported that individuals with ASD exhibit weak responses to global deviants (see Figure 1) (Goris et al., 2018[37]). This implies that individuals with ASD are hypo-sensitive to non-local statistics, while their sensitivity to local events depends on the type of stimuli (Ide et al., 2017[43]), representing either hypo- or hyper-sensitivity to local statistics.
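The first, “local” function (chunking items linked by high transition probabilities) can be sketched as follows. This is an illustrative simplification under stated assumptions, not the chunking procedure of HBSL: it uses first-order transition probabilities estimated from a single sequence, and the function names and the threshold of 0.8 are hypothetical choices.

```python
from collections import Counter


def transition_probabilities(sequence):
    """Estimate first-order transition probabilities P(next | current)."""
    pair_counts = Counter(zip(sequence, sequence[1:]))
    context_counts = Counter(sequence[:-1])
    return {
        (current, following): n / context_counts[current]
        for (current, following), n in pair_counts.items()
    }


def chunk_sequence(sequence, threshold=0.8):
    """Group adjacent items whose transition probability reaches `threshold`.

    A new chunk starts wherever the transition into the next item is less
    predictable than the (assumed) threshold.
    """
    if not sequence:
        return []
    probs = transition_probabilities(sequence)
    chunks = [[sequence[0]]]
    for previous, current in zip(sequence, sequence[1:]):
        if probs[(previous, current)] >= threshold:
            chunks[-1].append(current)
        else:
            chunks.append([current])
    return [tuple(chunk) for chunk in chunks]


# Two recurring units, "ABC" and "DEF", concatenated in a varying order.
stream = list("ABCDEFABCABCDEFDEFABC")
chunks = chunk_sequence(stream)
```

Within each unit the transitions are fully predictable, whereas the transitions between units are not, so the stream is segmented into the units "ABC" and "DEF". How these units are then ordered corresponds to the second, “global” level, which this sketch does not model.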

Balance of reliance on prior prediction and its development

Past studies suggest that as the brain develops, neurotypical individuals transition from relying heavily on sensory input statistics while giving less weight to prior predictions (known as hypo-prior or hyper-sensitive) to properly integrating sensory statistics with prior predictions (Philippsen and Nagai, 2019[65]). This helps individuals to become more resilient in uncertain environments. However, developmental disabilities, such as ASD, may result in different neural mechanisms underlying prior prediction (Nagai, 2019[56]; Lanillos et al., 2020[51]). Studies suggest that individuals with ASD have hyper-plasticity in short-term statistical learning, leading to a preference for recent sensory statistics over global (i.e., long-term) statistical structures (Sinha et al., 2014[74]; Saffran, 2018[70]). This means that individuals with ASD may rely heavily on sensory input while giving less weight to prior prediction (i.e., hypo-prior or hyper-sensitive) during statistical learning. It is worth noting that contrasting abnormalities in predictive function have been proposed for ASD, with some accounts suggesting a stronger reliance on prior predictions (i.e., hyper-prior) (Philippsen and Nagai, 2019[65]) rather than a weaker one (i.e., hypo-prior) (Pellicano and Burr, 2012[64]). Thus, the abnormality of prior prediction in ASD and other developmental disorders may be characterized by instability or variability rather than by either enhancement or decay in reliance on prior prediction (for a summary, see Table 1).

Several studies have also indicated that children with developmental language disorders, including developmental dyslexia, which is defined by difficulties in reading and spelling and by impaired phonological processing (Ramus et al., 2003[68]; Vellutino et al., 2004[81]), demonstrate a diminished capacity for statistical learning (Daikoku et al., 2023[21]). This suggests that individuals with developmental dyslexia may exhibit language impairment resulting from difficulties in detecting and utilizing the statistical regularities of language, leading to a potential hypo-sensitive characterization. Nevertheless, as noted for ASD, they may also display normal predictive processing or hyper-sensitivity to other sensory signals such as music and somatosensory signals. The possibility that predictive processing varies depending on the type of stimulus could be a critical issue for future research.

Such instability of reliance on prior prediction could also influence the precision of perceptual uncertainty, as precision is estimated as the inverse variance of the corresponding distribution (e.g., that of a sensory input or of the prior) (Koelsch et al., 2019[49]). Studies have indicated that individuals with ASD are susceptible to perceptual uncertainty (Boulter et al., 2014[14]; Lawson et al., 2014[52]; Van de Cruys et al., 2014[79]). The inability to tolerate uncertainty can be considered a key marker of generalized anxiety disorder (Freeston et al., 1994[31]). This trait may also be related to the heightened anxiety commonly observed in people with ASD and could have a negative effect on their creativity (Baas et al., 2008[6]). It has been suggested that individuals with ASD may experience increased anxiety levels when the level of uncertainty in their environment is high (Boulter et al., 2014[14]). In other words, the fear of uncertain situations can potentially limit the creative potential of individuals with ASD.
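The role of precision as inverse variance can be illustrated with the standard precision-weighted combination of a Gaussian prior and a Gaussian sensory observation. The sketch below is a textbook example rather than part of the simulation reported here, and equating a low sensory variance with “hyper-sensitivity” (and a high one with “hypo-sensitivity”) is an assumption made purely for illustration.

```python
def precision(variance):
    """Precision is the inverse of the variance."""
    return 1.0 / variance


def integrate(prior_mean, prior_var, sensory_mean, sensory_var):
    """Combine a Gaussian prior with a Gaussian observation.

    Returns the posterior mean and variance. Each source is weighted by
    its precision, so the more reliable source dominates the estimate.
    """
    pi_prior = precision(prior_var)
    pi_sensory = precision(sensory_var)
    posterior_precision = pi_prior + pi_sensory
    posterior_mean = (
        pi_prior * prior_mean + pi_sensory * sensory_mean
    ) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


# Prior prediction at 0.0, sensory evidence at 1.0 (arbitrary units).
# Assumed mapping: low sensory variance = "hyper-sensitive",
# high sensory variance = "hypo-sensitive".
hyper_mean, hyper_var = integrate(0.0, 1.0, 1.0, 0.1)
hypo_mean, hypo_var = integrate(0.0, 1.0, 1.0, 10.0)
```

With these values the estimate lies close to the sensory evidence when its precision is high and close to the prior prediction when its precision is low, which is the sense in which the balance between the two sources can differ between individuals.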

However, the unique features of predictive processing and statistical learning in ASD may not always result in negative outcomes and could have positive effects in certain situations. Several studies have reported that individuals with ASD sometimes exhibit superiority in certain abilities (Boucher et al., 2012[13]), such as mathematics, visual search skills (O'Riordan et al., 2001[59]), and music and art skills (Happé and Frith, 2009[42]; James, 2010[45]).

Thus, atypical brain development may display specific characteristics (rather than decay or facilitation) of predictive processing. It is assumed that these specificities of predictive processing, that is, hypo-/hyper-sensitivity and hypo-/hyper-priors, could impact statistical learning ability and (statistical) creativity.

Efficiency in learning or novelty in creation

A previous study has found that individuals with ASD were able to come up with more unconventional and uncertain ideas during divergent thinking tasks compared to typically developed individuals. However, the total number of ideas generated by individuals with ASD was fewer than that of typically developed individuals (Best et al., 2015[10]).

Neural evidence partially supports this finding and explains it by hypo-connectivity between the prefrontal cortex and other regions in the brains of individuals with ASD (Belmonte et al., 2004[9]; Just et al., 2004[46], 2012[47]; Courchesne and Pierce, 2005[18]; Green et al., 2020[40]). Prior predictions are mainly generated in the frontal regions and transmitted to sensory areas through synaptic connections (Cope et al., 2017[17]; Park et al., 2018[62]). The connectivity between these regions is critical for conveying prior predictions and creating plausible representations of sensory input. However, in individuals with ASD, the altered connectivity between these regions can lead to a modulation of prior predictions, resulting in the production of uncertain information, a state known as hypo-prior.

Previous studies have shown that neural entrainment induced by statistical learning is enhanced when the prefrontal cortex is temporarily disrupted using repetitive transcranial magnetic stimulation (rTMS). This suggests that the temporary disruption of prefrontal cortex function may have caused a hypo-prior or hyper-sensitive state in the brain, potentially resulting in improved statistical learning ability. Our simulation experiments have also shown that hyper-sensitivity leads to improved statistical learning ability in all three respects (reduction of prediction error, increase in the number of chunks, and acquisition of 1-3 Hz rhythm), which is consistent with the findings of these previous studies.

However, it is important to note that our simulation only controlled sensitivity (bottom-up processing), not the prior (top-down processing), and that the models repeatedly learned the same music. Therefore, in the hyper-sensitive model, the reliability of the internal model inevitably increases due to the repeated learning of the same information. This means that hyper-sensitivity “during learning” could lead to a kind of hyper-prior “during production”. Future studies need to investigate how efficiency in learning (perception) and novelty in creativity (production) are affected when learning various types of information or when controlling both sensitivity and the prior.
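The argument that repeated learning of the same material makes the internal model increasingly reliable can be illustrated with a toy count-based learner. This is a hedged sketch, not the HBSL model: the pseudo-count update, the `sensitivity` weight, and the use of the entropy of the predictive distribution as a proxy for how varied the generated output could be are all assumptions introduced for illustration.

```python
import math


def predictive_entropy(counts):
    """Shannon entropy (bits) of the distribution implied by pseudo-counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def learn_repeatedly(sequence, n_symbols, sensitivity, n_trials):
    """Learn the same `sequence` for `n_trials` and track predictive entropy.

    Each observation adds `sensitivity` (an assumed weight) to the
    pseudo-count of the observed symbol, starting from a flat prior.
    """
    counts = [1.0] * n_symbols
    entropies = []
    for _ in range(n_trials):
        for symbol in sequence:
            counts[symbol] += sensitivity
        entropies.append(predictive_entropy(counts))
    return entropies


# One short "piece" over four possible symbols, learned ten times.
piece = [0, 0, 0, 1]
hypo = learn_repeatedly(piece, n_symbols=4, sensitivity=0.25, n_trials=10)
normal = learn_repeatedly(piece, n_symbols=4, sensitivity=1.0, n_trials=10)
hyper = learn_repeatedly(piece, n_symbols=4, sensitivity=4.0, n_trials=10)
```

In this toy example, the learner with the largest weight ends with the lowest predictive entropy, meaning that symbols absent from the training piece become very unlikely to be produced, whereas the learner with the smallest weight retains a flatter distribution. This is only meant to illustrate the trade-off discussed above, not to reproduce the reported results.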

In summary, atypical alterations in prior prediction may give rise to specific cognitive individuality involved in perception and production (or learning and creation) through statistical learning. However, one form of individuality may not necessarily be favored over another, as the efficiency of learning and the ease of creating new information may be partially in a trade-off. This study suggests that simulation experiments using statistical learning may lead to a better understanding of the relationship between learning efficiency and creativity in learning systems that exhibit different levels of dependence on sensory signals. Further research on cognitive individuality may illuminate the potential diversity in human society.


This study suggests that hyper-sensitivity allows for efficient statistical learning of information but makes it difficult to generate new information, while hypo-sensitivity makes statistical learning difficult but may make it easier to generate new information. One set of individual characteristics may not necessarily be favored over another, as the efficiency of learning and the ease of generating new information may be partially in a trade-off. This study has the potential to shed light on the underlying factors contributing to the heterogeneous nature of the supposedly innate ability of statistical learning that all individuals possess, as well as the paradoxical phenomenon in which individuals with certain cognitive traits that impede specific types of perceptual abilities exhibit superior performance in creative contexts.



This research was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI (22KK0157; 21H05063; 22H05210), and the Japan Science and Technology Agency (JST) Moonshot Goal 9 (JPMJMS2296), Japan. The funding sources had no role in the decision to publish or in the preparation of the manuscript.

Conflict of interest

The authors declare no competing financial interests.

Authors' contribution

T.D. conceived the method of experiment and data analyses. T.D. analyzed the data and prepared the figures. K.K. and M.M. surveyed previous literature and compiled it into a table. T.D. wrote the original draft of the manuscript. K.K. and M.M. reviewed and edited the manuscript. All authors finalized the manuscript.

Data availability

The scripts for the computational model (Hierarchical Bayesian Statistical Learning: HBSL) and analysis and all data including results have been deposited to an external source (



1. Abdeltawwab MM, Baz H. Automatic pre-attentive auditory responses: MMN to tone burst frequency changes in autistic school-age children. J Int Adv Otol. 2015;11(1):36-41. doi: 10.5152/iao.2014.438
2. Altmann GT. Abstraction and generalization in statistical learning: implications for the relationship between semantic types and episodic tokens. Philos Trans R Soc Lond B Biol Sci. 2017;372(1711):20160060. doi: 10.1098/rstb.2016.0060
3. Arciuli J. The multi-component nature of statistical learning. Philos Trans R Soc Lond B Biol Sci. 2017;372(1711):20160058. doi: 10.1098/rstb.2016.0058
4. Assaneo MF, Ripollés P, Orpella J, Lin WM, de Diego-Balaguer R, Poeppel D. Spontaneous synchronization to speech reveals neural mechanisms facilitating language learning. Nat Neurosci. 2019;22:627-32. doi: 10.1038/s41593-019-0353-z
5. Attaheri A, Choisdealbha ÁN, Di Liberto GM, Rocha S, Brusini P, Mead N, et al. Delta- and theta-band cortical tracking and phase-amplitude coupling to sung speech by infants. Neuroimage. 2022;247:118698. doi: 10.1016/j.neuroimage.2021.118698
6. Baas M, De Dreu CK, Nijstad BA. A meta-analysis of 25 years of mood-creativity research: hedonic tone, activation, or regulatory focus? Psychol Bull. 2008;134:779-806. doi: 10.1037/a0012815
7. Baldi P, Itti L. Of bits and wows: A Bayesian theory of surprise with applications to attention. Neural Netw. 2010;23:649-66. doi: 10.1016/j.neunet.2009.12.007
8. Batterink LJ, Paller KA. Online neural monitoring of statistical learning. Cortex. 2017;90:31-45. doi: 10.1016/j.cortex.2017.02.004
9. Belmonte MK, Allen G, Beckel-Mitchener A, Boulanger LM, Carper RA, Webb SJ. Autism and abnormal development of brain connectivity. J Neurosci. 2004;24:9228-31. doi: 10.1523/JNEUROSCI.3340-04.2004
10. Best C, Arora S, Porter F, Doherty M. The relationship between subthreshold autistic traits, ambiguous figure perception and divergent thinking. J Autism Dev Disord. 2015;45:4064-73. doi: 10.1007/s10803-015-2518-2
11. Bijlenga D, Tjon-Ka-Jie JYM, Schuijers F, Kooij JJS. Atypical sensory profiles as core features of adult ADHD, irrespective of autistic symptoms. Eur Psychiatry. 2017;43:51-7. doi: 10.1016/j.eurpsy.2017.02.481
12. Bonnet-Brilhault F, Alirol S, Blanc R, Bazaud S, Marouillat S, Thépault RA, et al. GABA/Glutamate synaptic pathways targeted by integrative genomic and electrophysiological explorations distinguish autism from intellectual disability. Mol Psychiatry. 2016;21:411-8. doi: 10.1038/mp.2015.75
13. Boucher J, Mayes A, Bigham S. Memory in autistic spectrum disorder. Psychol Bull. 2012;138:458-96. doi: 10.1037/a0026869
14. Boulter C, Freeston M, South M, Rodgers J. Intolerance of uncertainty as a framework for understanding anxiety in children and adolescents with autism spectrum disorders. J Autism Dev Disord. 2014;44:1391-402. doi: 10.1007/s10803-013-2001-x
15. Cheung VKM, Harrison PMC, Meyer L, Pearce MT, Haynes JD, Koelsch S. Uncertainty and surprise jointly predict musical pleasure and amygdala, hippocampus, and auditory cortex activity. Curr Biol. 2019;29:4084-92.e4. doi: 10.1016/j.cub.2019.09.067
16. Clark A. Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behav Brain Sci. 2013;36:181-204. doi: 10.1017/S0140525X12000477
17. Cope TE, Sohoglu E, Sedley W, Patterson K, Jones PS, Wiggins J, et al. Evidence for causal top-down frontal contributions to predictive processes in speech perception. Nat Commun. 2017;8(1):2154. doi: 10.1038/s41467-017-01958-7
18. Courchesne E, Pierce K. Why the frontal cortex in autism might be talking only to itself: local over-connectivity but long-distance disconnection. Curr Opin Neurobiol. 2005;15:225-30. doi: 10.1016/j.conb.2005.03.001
19. Daikoku T. Musical creativity and depth of implicit knowledge: spectral and temporal individualities in improvisation. Front Comput Neurosci. 2018;12:89. doi: 10.3389/fncom.2018.00089
20. Daikoku T, Goswami U. Hierarchical amplitude modulation structures and rhythm patterns: Comparing Western musical genres, song, and nature sounds to Babytalk. PLoS One. 2022;17(10):e0275631. doi: 10.1371/journal.pone.0275631
21. Daikoku T, Jentschke S, Tsogli V, Bergström K, Lachmann T, Ahissar M, et al. Neural correlates of statistical learning in developmental dyslexia: An electroencephalography study. Biol Psychol. 2023;181:108592
22. Daikoku T, Kumagaya S, Ayaya S, Nagai Y. Phonological characteristics shared by questioner and responder: a comparison between individuals with and without autism spectrum disorder. PsyArXiv. January 11, 2022. doi:10.31234/
23. Daikoku T, Nagai Y, DAIKIN Ltd. Computational algorithm to support human creativity (PAT.T:2022-077559). In Japanese. 2022
24. Daikoku T, Wiggins GA, Nagai Y. Statistical properties of musical creativity: roles of hierarchy and uncertainty in statistical learning. Front Neurosci. 2021;15:640412. doi: 10.3389/fnins.2021.640412
25. Daikoku T, Yatomi Y, Yumoto M. Statistical learning of music- and language-like sequences and tolerance for spectral shifts. Neurobiol Learn Mem. 2015;118:8-19. doi: 10.1016/j.nlm.2014.11.001
26. Daikoku T, Yumoto M. Musical expertise facilitates statistical learning of rhythm and the perceptive uncertainty: A cross-cultural study. Neuropsychologia. 2020;146:107553. doi: 10.1016/j.neuropsychologia.2020.107553
27. Daikoku T, Yumoto M. Order of statistical learning depends on perceptive uncertainty. Curr Res Neurobiol. 2023;4:100080. doi: 10.1016/j.crneur.2023.100080
28. Elliott TM, Theunissen FE. The modulation transfer function for speech intelligibility. PLoS Comput Biol. 2009;5(3):e1000302. doi: 10.1371/journal.pcbi.1000302
29. Engel-Yeger B, Muzio C, Rinosi G, Solano P, Geoffroy PA, Pompili M, et al. Extreme sensory processing patterns and their relation with clinical conditions among individuals with major affective disorders. Psychiatry Res. 2016;236:112-8. doi: 10.1016/j.psychres.2015.12.022
30. Ferri R, Elia M, Agarwal N, Lanuzza B, Musumeci SA, Pennisi G. The mismatch negativity and the P3a components of the auditory event-related potentials in autistic low-functioning subjects. Clin Neurophysiol. 2003;114:1671-80. doi: 10.1016/s1388-2457(03)00153-6
31. Freeston MH, Rhéaume J, Letarte H, Dugas MJ, Ladouceur R. Why do people worry? Pers Individ Differ. 1994;17:791-802. doi: 10.1016/0191-8869(94)90048-5
32. Friston K. The free-energy principle: a unified brain theory? Nat Rev Neurosci. 2010;11:127-38. doi: 10.1038/nrn2787
33. Friston K, FitzGerald T, Rigoli F, Schwartenbeck P, Pezzulo G. Active inference: a process theory. Neural Comput. 2017;29(1):1-49. doi: 10.1162/NECO_a_00912
34. Gold BP, Pearce MT, Mas-Herrero E, Dagher A, Zatorre RJ. Predictability and uncertainty in the pleasure of music: a reward for learning? J Neurosci. 2019;39:9397-409. doi: 10.1523/JNEUROSCI.0428-19.2019
35. Gomot M, Blanc R, Clery H, Roux S, Barthelemy C, Bruneau N. Candidate electrophysiological endophenotypes of hyper-reactivity to change in autism. J Autism Dev Disord. 2011;41:705-14. doi: 10.1007/s10803-010-1091-y
36. Gomot M, Giard MH, Adrien JL, Barthelemy C, Bruneau N. Hypersensitivity to acoustic change in children with autism: electrophysiological evidence of left frontal cortex dysfunctioning. Psychophysiology. 2002;39:577-84. doi: 10.1017/S0048577202394058
37. Goris J, Braem S, Nijhof AD, Rigoni D, Deschrijver E, Van de Cruys S, et al. Sensory prediction errors are less modulated by global context in autism spectrum disorder. Biol Psychiatry Cogn Neurosci Neuroimaging. 2018;3:667-74. doi: 10.1016/j.bpsc.2018.02.003
38. Goswami U. A neural basis for phonological awareness? An oscillatory temporal-sampling perspective. Curr Dir Psychol Sci. 2017;27(1):56-63. doi: 10.1177/0963721417727520
39. Goswami U. A neural oscillations perspective on phonological development and phonological processing in developmental dyslexia. Lang Linguist Compass. 2019;13(5):e12328. doi: 10.1111/lnc3.12328
40. Green HL, Shuffrey LC, Levinson L, Shen G, Avery T, Randazzo Wagner M, et al. Evaluation of mismatch negativity as a marker for language impairment in autism spectrum disorder. J Commun Disord. 2020;87:105997. doi: 10.1016/j.jcomdis.2020.105997
41. Hallam KT, Begg DP, Olver JS, Norman TR. Abnormal dose-response melatonin suppression by light in bipolar type I patients compared with healthy adult subjects. Acta Neuropsychiatr. 2009;21:246-55. doi: 10.1111/j.1601-5215.2009.00416.x
42. Happé F, Frith U. The beautiful otherness of the autistic mind. Philos Trans R Soc Lond B Biol Sci. 2009;364:1346-50. doi: 10.1098/rstb.2009.0009
43. Ide M, Yaguchi A, Atsumi T, Yasu K, Wada M. [Hypersensitivity as extraordinary high temporal processing in individuals with autism-spectrum disorders]. Brain Nerve. 2017;69:1281-9. Japanese. doi: 10.11477/mf.1416200905
44. Itti L, Baldi P. Bayesian surprise attracts human attention. Vision Res. 2009;49:1295-306. doi: 10.1016/j.visres.2008.09.007
45. James I. Autism and art. Front Neurol Neurosci. 2010;27:168-73. doi: 10.1159/000311200
46. Just MA, Cherkassky VL, Keller TA, Minshew NJ. Cortical activation and synchronization during sentence comprehension in high-functioning autism: evidence of underconnectivity. Brain. 2004;127:1811-21. doi: 10.1093/brain/awh199
47. Just MA, Keller TA, Malave VL, Kana RK, Varma S. Autism as a neural systems disorder: a theory of frontal-posterior underconnectivity. Neurosci Biobehav Rev. 2012;36:1292-313. doi: 10.1016/j.neubiorev.2012.02.007
48. Kanner L. Autistic disturbances of affective contact. Nerv Child. 1943;2:217-50
49. Koelsch S, Vuust P, Friston K. Predictive processes and the peculiar case of music. Trends Cogn Sci. 2019;23:63-77. doi: 10.1016/j.tics.2018.10.006
50. Lam RW, Berkowitz AL, Berga SL, Clark CM, Kripke DF, Gillin JC. Melatonin suppression in bipolar and unipolar mood disorders. Psychiatry Res. 1990;33:129-34. doi: 10.1016/0165-1781(90)90066-e
51. Lanillos P, Oliva D, Philippsen A, Yamashita Y, Nagai Y, Cheng G. A review on neural network models of schizophrenia and autism spectrum disorder. Neural Netw. 2020;122:338-63. doi: 10.1016/j.neunet.2019.10.014
52. Lawson RP, Rees G, Friston KJ. An aberrant precision account of autism. Front Hum Neurosci. 2014;8:302. doi: 10.3389/fnhum.2014.00302
53. Lepistö T, Kujala T, Vanhala R, Alku P, Huotilainen M, Näätänen R. The discrimination of and orienting to speech and non-speech sounds in children with autism. Brain Res. 2005;1066:147-57. doi: 10.1016/j.brainres.2005.10.052
54. Lewy AJ, Nurnberger JI Jr, Wehr TA, Pack D, Becker LE, Powell RL, et al. Supersensitivity to light: possible trait marker for manic-depressive illness. Am J Psychiatry. 1985;142:725-7. doi: 10.1176/ajp.142.6.725
55. Misyak JB, Christiansen MH. Statistical learning and language: An individual differences study. Lang Learn. 2012;62(1):302-31. doi: 10.1111/j.1467-9922.2010.00626.x
56. Nagai Y. Predictive learning: its key role in early cognitive development. Philos Trans R Soc Lond B Biol Sci. 2019;374(1771):20180030. doi: 10.1098/rstb.2018.0030
57. Nathan PJ, Burrows GD, Norman TR. Melatonin sensitivity to dim white light in affective disorders. Neuropsychopharmacology. 1999;21:408-13. doi: 10.1016/S0893-133X(99)00018-4
58. Obeid R, Brooks PJ, Powers KL, Gillespie-Lynch K, Lum JA. Statistical learning in specific language impairment and autism spectrum disorder: a meta-analysis. Front Psychol. 2016;7:1245. doi: 10.3389/fpsyg.2016.01245
59. O'Riordan MA, Plaisted KC, Driver J, Baron-Cohen S. Superior visual search in autism. J Exp Psychol Hum Percept Perform. 2001;27:719-30. doi: 10.1037//0096-1523.27.3.719
60. Palmer SD, Mattys SL. Speech segmentation by statistical learning is supported by domain general processes within working memory. Q J Exp Psychol (Hove). 2016;69:2390-401. doi: 10.1080/17470218.2015.1112825
61. Panagiotidi M, Overton PG, Stafford T. The relationship between ADHD traits and sensory sensitivity in the general population. Compr Psychiatry. 2018;80:179-85. doi: 10.1016/j.comppsych.2017.10.008
62. Park H, Thut G, Gross J. Predictive entrainment of natural speech through two fronto-motor top-down channels. Lang Cogn Neurosci. 2018;35:739-51. doi: 10.1080/23273798.2018.1506589
63. Patel AD. Language, music, syntax and the brain. Nat Neurosci. 2003;6:674-81. doi: 10.1038/nn1082
64. Pellicano E, Burr D. When the world becomes 'too real': a Bayesian explanation of autistic perception. Trends Cogn Sci. 2012;16:504-10. doi: 10.1016/j.tics.2012.08.009
65. Philippsen A, Nagai Y. A predictive coding model of representational drawing in human children and chimpanzees. In: 2019 Joint IEEE 9th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp 171-6). New York, NY: IEEE, 2019
66. Philippsen A, Tsuji S, Nagai Y. Simulating developmental and individual differences of drawing behavior in children using a predictive coding model. Front Neurorobot. 2022;16:856184. doi: 10.3389/fnbot.2022.856184
67. Poeppel D. The analysis of speech in different temporal integration windows: cerebral lateralization as 'asymmetric sampling in time'. Speech Commun. 2003;41:245-55. doi: 10.1016/S0167-6393(02)00107-3
68. Ramus F, Rosen S, Dakin SC, Day BL, Castellote JM, White S, et al. Theories of developmental dyslexia: insights from a multiple case study of dyslexic adults. Brain. 2003;126:841-65. doi: 10.1093/brain/awg076
69. Robertson C, Baron-Cohen S. Sensory perception in autism. Nat Rev Neurosci. 2017;18:671-84. doi: 10.1038/nrn.2017.112
70. Saffran JR. Statistical learning as a window into developmental disabilities. J Neurodev Disord. 2018;10(1):35. doi: 10.1186/s11689-018-9252-y
71. Saffran JR, Aslin RN, Newport EL. Statistical learning by 8-month-old infants. Science. 1996;274:1926-8. doi: 10.1126/science.274.5294.1926
72. Seri S, Cerquiglini A, Pisani F, Curatolo P. Autism in tuberous sclerosis: evoked potential evidence for a deficit in auditory sensory processing. Clin Neurophysiol. 1999;110:1825-30. doi: 10.1016/s1388-2457(99)00137-6
73. Siegelman N, Bogaerts L, Frost R. Measuring individual differences in statistical learning: Current pitfalls and possible solutions. Behav Res Methods. 2017;49:418-32. doi: 10.3758/s13428-016-0719-z
74. Sinha P, Kjelgaard MM, Gandhi TK, Tsourides K, Cardinaux AL, Pantazis D, et al. Autism as a disorder of prediction. Proc Natl Acad Sci U S A. 2014;111:15220-5. doi: 10.1073/pnas.1416797111
75. Smalle EHM, Daikoku T, Szmalec A, Duyck W, Möttönen R. Unlocking adults' implicit statistical learning by cognitive depletion. Proc Natl Acad Sci U S A. 2022;119(2):e2026011119. doi: 10.1073/pnas.2026011119
76. Thye MD, Bednarz HM, Herringshaw AJ, Sartin EB, Kana RK. The impact of atypical sensory processing on social impairments in autism spectrum disorder. Dev Cogn Neurosci. 2018;29:151-67. doi: 10.1016/j.dcn.2017.04.010
77. Turner R. Statistical models for natural sounds. Ph.D. Dissertation. London: University College, 2010
78. Turner RE, Sahani M. Demodulation as probabilistic inference. IEEE Trans Audio Speech Lang Process. 2011;19:2398-411. doi: 10.1109/TASL.2011.2135852
79. Van de Cruys S, Evers K, Van der Hallen R, Van Eylen L, Boets B, de-Wit L, et al. Precise minds in uncertain worlds: predictive coding in autism. Psychol Rev. 2014;121:649-75. doi: 10.1037/a0037665
80. Van de Cruys S, Perrykkad K, Hohwy J. Explaining hyper-sensitivity and hypo-responsivity in autism with a common predictive coding-based mechanism. Cogn Neurosci. 2019;10:164-6. doi: 10.1080/17588928.2019.1594746
81. Vellutino FR, Fletcher JM, Snowling MJ, Scanlon DM. Specific reading disability (dyslexia): what have we learned in the past four decades? J Child Psychol Psychiatry. 2004;45(1):2-40. doi: 10.1046/j.0021-9630.2003.00305.x
82. Vuust P, Brattico E, Seppänen M, Näätänen R, Tervaniemi M. The sound of music: differentiating musicians using a fast, musical multi-feature mismatch negativity paradigm. Neuropsychologia. 2012;50:1432-43. doi: 10.1016/j.neuropsychologia.2012.02.028
83. Vuust P, Heggli OA, Friston KJ, Kringelbach ML. Music in the brain. Nat Rev Neurosci. 2022;23:287-305. doi: 10.1038/s41583-022-00578-5
84. Vuust P, Witek MA. Rhythmic complexity and predictive coding: a novel approach to modeling rhythm and meter perception in music. Front Psychol. 2014;5:1111. doi: 10.3389/fpsyg.2014.01111
85. Ward J, Hoadley C, Hughes JE, Smith P, Allison C, Baron-Cohen S, et al. Atypical sensory sensitivity as a shared feature between synaesthesia and autism. Sci Rep. 2017;7:41155. doi: 10.1038/srep41155

Figure 1: An example of hierarchical statistical learning of music. Reprinted from Daikoku et al. (2021). Misty by Erroll Garner, composed in 1954 and arranged by the authors for simplicity. The arrangement, chord names, and symbols are simplified (only major/minor, flat, and 7th notes) to highlight the two-five-one (II-V(7)-I) progression. For example, jazz music has general regularities in chord sequences such as the so-called “two-five-one (II-V-I) progression.” It is a succession of chords whose roots descend in fifths from the supertonic (II) to the dominant (V), and finally to the tonic (I). Such syntactic progressions frequently occur in music, and therefore, the statistics of the sequential information have high transitional probability and low uncertainty. Thus, once a person has learned the statistical characteristics, the progression can be chunked as a commonly used unit among improvisers. In contrast, the ways of combining the chunked units differ between musicians.

Figure 2: An example of rhythm hierarchy with a scalogram. Reprinted from the paper by Daikoku and Goswami (2022). Both language and music consist of hierarchical rhythmic structures that include rhythms around 2 Hz, as compared to other auditory stimuli.

Figure 3: Statistical learning effects in each trial of learning. The left is the total Bayesian surprise (or prediction error) in learning a piece of music, and the right is the number of chunks after statistical learning. Blue, black, and red represent the hypo-, normal-, and hyper-sensitive models, respectively.

Figure 4: Acoustic properties of composed music after each trial of statistical learning. Blue, black, and red represent the hypo-, normal-, and hyper-sensitive models, respectively.

Figure 5: Statistical properties of composed music after each trial of statistical learning. Blue, black, and red represent the hypo-, normal-, and hyper-sensitive models, respectively.

Figure 6: Graphical representation of the normal model, hypo- and hyper-sensitive models.


Table 1: Individual differences of prediction/sensitivity to external stimuli

[*] Corresponding Author:

Tatsuya Daikoku, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan; Phone Number: 03-5841-1656, eMail: