Introduction

The cultural importance of imaginary worlds in contemporary societies cannot be overstated. Star Wars is the most lucrative media franchise in history. Harry Potter is the best-selling book series of all time. Game of Thrones holds the all-time audience record for a TV series. One Piece is the best-selling manga series of all time. Avengers: Endgame is the highest-grossing movie of all time. And Zelda is the best-selling video game series in the history of video games. They all have something in common: their creators developed an imaginary environment, that is, a fictional environment that does not actually exist, that differs from the real world, and that consumers know to be partly or fully invented1,2,3,4. Moreover, such fictional stories with imaginary worlds appear better at building huge fan communities, as exemplified by the fact that many of them turn into highly profitable multimedia franchises (e.g., the Wizarding World, the Marvel Cinematic Universe1). In other words, modern culture, from movies to clothing to theme parks to educational tools, increasingly revolves around imaginary worlds. Why is there such a massive interest in fantasy worlds? What lies behind this anthropological phenomenon?

In a theoretical paper, we proposed that imaginary worlds in fictional stories artificially trigger the human preference for exploration5. This preference is best described as an evolved cognitive mechanism that processes cues of new or information-rich environments as inputs and delivers an adaptive approach behavior as the output: it makes people environmentally curious and prompts directed exploration (i.e., exploration that aims at seeking and acquiring information about the environments, thus reducing uncertainty, as opposed to exploitation or random exploration6,7,8,9,10,11,12,13,14,15). This mechanism of environmental curiosity was selected because it enhanced fitness in ancestral environments, motivating humans to discover new habitats, new cooperative or sexual partners, resources such as food or water, and new fitness-relevant information16, 17. It explains, for instance, universal walking patterns integrating multiple changes of directions (i.e., Lévy walks, observed in hunter-gatherers18).
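The distinction between exploitation, random exploration, and directed exploration can be illustrated with a minimal sketch in the style of the bandit tasks used in this literature. It is an illustration only, not the model used in the cited studies: the patch names, payoff estimates, visit counts, and the UCB-style uncertainty bonus are all hypothetical assumptions.

```python
import math
import random


def exploit(mean_reward):
    """Exploitation: pick the option with the highest estimated payoff."""
    return max(mean_reward, key=mean_reward.get)


def random_explore(mean_reward, rng):
    """Random exploration: pick any option, ignoring what is known about it."""
    return rng.choice(sorted(mean_reward))


def directed_explore(mean_reward, visits, bonus_weight=1.0):
    """Directed exploration: add an uncertainty bonus that shrinks as an
    option is visited more often, so poorly known options are favored
    because sampling them yields information (UCB-style, assumed here)."""
    total_visits = sum(visits.values())

    def score(option):
        bonus = math.sqrt(2 * math.log(total_visits) / visits[option])
        return mean_reward[option] + bonus_weight * bonus

    return max(mean_reward, key=score)


# Hypothetical environment: three patches with estimated payoffs and visit counts.
mean_reward = {"known_valley": 0.60, "nearby_forest": 0.55, "distant_island": 0.50}
visits = {"known_valley": 50, "nearby_forest": 20, "distant_island": 1}

print(exploit(mean_reward))                   # best-known payoff
print(directed_explore(mean_reward, visits))  # least-known patch
print(random_explore(mean_reward, random.Random(0)))
```

In this toy setting, exploitation returns the familiar patch, whereas directed exploration selects the barely visited one, because its uncertainty bonus outweighs its slightly lower estimated payoff.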

Because imaginary worlds are unknown fictional settings, they can be seen as new environments that consumers discover. As Tolkien put it himself, “part of the attraction of The Lord of the Rings”, and other fictions with imaginary worlds, relies on the “intrinsic feeling of reward” we experience when “viewing far off an unvisited island or the towers of a distant city” (letter to Colonel Worskett, 20 September 1963). This statement is very close to that of Shigeru Miyamoto, the creator of Zelda, who reported that he “wanted to create a game world that conveyed the same feeling you get when you are exploring a new city for the first time” (1989). Following the same intuition, we hypothesized that humans are fascinated by imaginary worlds for the same reasons, and under the same circumstances, as they are motivated to explore new and unfamiliar environments5. This is equivalent to saying that imaginary worlds constitute a superstimulus of explorable environments19, 20, or that imaginary worlds are part of the actual domain of the cognitive mechanism that evaluates landscapes (whose proper domain is constituted of real cues of explorable environments), just like masks are part of the actual domain of the cognitive mechanism of human face recognition21.

Let us note that we differentiate between spatial exploration and cognitive exploration, and symmetrically between environmental curiosity and other domain-specific forms of curiosity (e.g., curiosity for how tools causally work22; see also23), because we reason that different forms of curiosity should not be sensitive to the same cues to orient attention and behavior. Relying on some recent molecular evidence, Hills24 has argued that spatial exploration and cognitive exploration are linked at the phylogenetic level: “What was once foraging in a physical space for tangible resources became, over evolutionary time, foraging in cognitive space for information related to those resources”. Here, we focus on environmental curiosity, that is, on the specific curiosity for environments that prompts spatial exploration, while recognizing that exploratory preferences may exist as a more general cluster25, 26. For instance, there is experimental evidence that a preference for spatial exploration in a foraging task is associated with a preference for cognitive exploration in a problem-solving task27. In any case, there seems to be a cognitive mechanism that specifically processes landscapes17, 28, 29.

In our theoretical paper, we reviewed the experimental evidence for the universal association between, on the one hand, novelty and learning opportunity and, on the other hand, exploratory choices (in the absence of extrinsic reward). For instance, participants in unfamiliar lab settings choose more exploratory options30. Consistently, when asked to choose new environments to explore, people use cues of perceptual or epistemic novelty to make their choice31, 32. In the brain, this is supported by the findings that novelty cues and learning opportunities lead to dopamine enhancement in midbrain areas33,34,35,36,37,38,39,40. There is also strong experimental evidence that humans universally favor more mysterious and more explorable environments when asked to report their preferences for landscapes28, 41,42,43.

In this paper, we further test the exploration hypothesis. In Study 1, we first explore the importance of environmental curiosity in structuring the landscape of contemporary fiction. Fictional stories can trigger a range of human motivations, from romantic love (e.g., in romance novels) to threat detection (e.g., in horror films) to causal reasoning (e.g., in detective novels). However, not all motivations are equally important to fiction: typically, eating food or discovering new smells are important to humans, but marginal in fiction, for reasons beyond the scope of this paper. Consequently, while fictional stories about food or perfumes do exist, they are relatively uncommon and not generally considered as belonging to a distinct genre. By contrast, if exploration is one of the most important psychological factors in contemporary fiction, it should structure the landscape of contemporary fiction. When organized by semantic proximity, fictional stories with a strong emphasis on environmental curiosity, particularly those featuring imaginary worlds, should cluster together, and should cluster apart from other large, well-identified genres, such as romance or horror fiction. In Study 1, we also test whether fictional characters in stories with imaginary worlds navigate their environments more than characters in other stories. This prediction relies on the idea that the psychology of fictional characters should be crafted to be consistent with their fictional environments: their evaluation of resourceful or informative landscapes should motivate them to explore more.

In Studies 2 and 3, we test a set of empirical predictions derived from our theory. In our theoretical paper5, we predicted that if the exploration theory is valid, then the attraction to imaginary worlds would be intrinsically linked to the desire to explore novel environments and that both would be influenced by the same underlying factors. In other words, we predict that each source of variability that explains inter-individual differences in the sensitivity of environmental curiosity should also explain inter-individual differences in the consumption of fictional stories with imaginary worlds. Before testing this mapping between cognitive variability and cultural variability, we further explain the origins and the mechanisms underlying inter-individual variability in environmental curiosity.

When and why should organisms decide to explore or not to explore? This question is best known as the evolutionary exploration–exploitation trade-off (26, 44; see45, 46 for reviews), which is dealt with differently in different species according to their life history strategies (e.g.,47,48,49,50,51). Crucially, this trade-off is also dealt with differently between individuals from the same species, including Homo sapiens. This is best seen in the inter-individual differences in spatial abilities52. We contend that sources of adaptive variability explain the inter-individual differences in people’s motivation to explore. First, we review the literature in evolutionary psychology and associated empirical evidence showing that the sensitivity of human environmental curiosity varies according to adaptive sources of variability. We select four of them that seem to explain a significant part of the variance while acknowledging that other factors could be added to increase the explained variance.

The importance of exploratory preferences varies at the inter-individual level according to some fixed personality traits, which are thought of as evolutionary strategies of specialization to some ecological or social niches53,54,55: people are genetically hardwired to be more or less curious. This is captured by the existence of a genetically inherited personality trait often called Openness-to-experience56,57,58. It constitutes one of the five dimensions within the Big Five, the model of human personality57, 59. The five dimensions that compose it (i.e., Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism) have been designed to capture the universal variability of human personalities and behaviors: humans differ in the personality “scores” associated with each of these dimensions. The Big Five is considered the most widely accepted model of human personality today60,61,62,63,64.

The Openness trait is correlated with novelty-seeking behavior38, 56, 65, a preference for creativity66,67,68,69,70,71, spatial cognitive capacities72,73,74,75, a preference for using maps76, a preference to explore a system77, 78, and innovative deviations from observed demonstrations in learning tasks79. In other words, people higher in Openness-to-experience are overall more curious and explorative80. In the cultural domain, the Openness trait is correlated with the liking of adventure movies, fantasy movies, and science fiction movies81, the enjoyment of abstract art82, and the preference for jazz, blues, classical, rock, alternative, and folk music83,84,85. It also correlates with some cultural practices reported by people, such as going to the theatre, art galleries, or museums86, or seeking novel food87. We, therefore, predict that people higher in Openness should be, on average, fonder of fictional stories set in imaginary worlds.

The sensitivity of exploratory preferences varies according to the developmental stage of the individual. Evolutionary developmental psychology explains why: younger individuals have more to learn from the world, so it is more adaptive for them to explore their environments and try to reduce knowledge gaps88,89,90. A complementary explanation posits that the evolutionary costs associated with exploration (e.g., resource shortage risk) are outweighed by parental caregiving investments91, 92. This can be seen as an adaptive feedback loop or as an adaptive developmental division of labor93,94,95.

There is already much experimental evidence that children are indeed more curious and eager to explore than adults (see96, for a review). Children are more explorative than adults in foraging tasks97,98,99, in bandit tasks100, in explanation-seeking tasks101, 102, in search tasks103, in decision-making tasks104, 105, in problem-solving tasks106, in causal-learning tasks107, 108, and in change-detection and visual search tasks109. Importantly, such behavioral data from experimental research and computational modeling show that children are not merely prone to random sampling behavior: they show clear patterns of directed exploration110. Another study found that intellectual curiosity is negatively correlated with age, after controlling for education level, sex, and culture111. In a foraging task, adolescents explored more and more optimally than adults did112, consistent with other findings suggesting that adolescents too are more motivated to explore novel, although risky, scenarios than adults112,113,114,115,116,117. In accordance with such findings, Openness has been shown to decline with age across countries118,119,120,121,122. We, therefore, predict that younger people should be, on average, fonder of fictional stories set in imaginary worlds.

The sensitivity of exploratory preferences varies according to an individual’s biological sex123. Selection pressures for exploratory preferences and abilities have been stronger for males in many terrestrial vertebrates, and more specifically in many mammalian species, because of different mating patterns for access to mates (caused by differences in reproductive variance between the sexes): in this view, spatial exploration is thought of as a male reproductive strategy124,125,126. For instance, in humans, there is evidence that, among the Tsimane (i.e., a forager-horticulturalist people in Bolivia), males travel more than females, and even more so during periods of intensive mate search127, but not earlier in ontogeny75.

Another evolutionary rationale posits that, in humans specifically, exploratory preferences and abilities contributed differentially to the reproductive success of males and females because of the sexual division in foraging activities: males would have specialized in solving spatial problems associated with hunting (which requires a propensity to explore unfamiliar environments) while females would have specialized in solving spatial problems associated with plant gathering (which requires a propensity to learn and remember object locations128, 129). Both rationales can explain why, in humans, males develop higher spatial abilities specifically related to exploration than females (see130, 131 for meta-analyses, see132 for a review) and navigate in wider ranges than females (133, 134, but see135).

Lastly, another complementary hypothesis proposes that males and females evolved different cognitive preferences and skills related to inventorying and classifying features when exploring the physical world, partly because of a male specialization in tool-use: ‘systemizing’, the drive to explore and understand systems, would have had a greater impact on the reproductive success of males than that of females77, 136. There is evidence that males score higher in systemizing78, 137, 138, that fetal testosterone levels positively correlate with a highly restricted range of interests, which is a marker of both high-systemizing and high-functioning autism139, and that there are more males with either higher systemizing-quotient or autistic traits who are interested in non-social domains of knowledge, such as engineering, mathematics, or science78, 140,141,142.

Overall, three different (and non-mutually exclusive) evolutionary hypotheses propose that some adaptive challenges human males specifically faced during their evolution (i.e., searching for mates, hunting, or using tools) led males to be more systematically curious about their spatial (and non-social) environments while leading females to be more systematically curious about their social environments. This is consistent with the findings that (1) in modern societies (here the United States, with experimental evidence from 320,000 participants) males score higher in the personality trait Openness-to-experience143, and (2) in hunter-gatherer societies (here, the Hadza of Tanzania, with evidence from GPS data), males explore more land, follow more sinuous paths, walk further per day144, and also perform better in three tests of spatial ability145. We, therefore, predict that males should be, on average, fonder of fictional stories set in imaginary worlds.

Finally, exploratory preferences are hypothesized to vary according to the local ecology of an individual. Exploration is most valuable and adaptive in more affluent, safer, and therefore more predictable environments146,147,148. Why? In unsafe and poor ecologies, exploration is very risky, notably because if exploration does not pay off, one is left with nothing. Relatedly, the opportunity costs of exploration are higher in scarcity because one is better off exploiting one’s environment to provide for more pressing needs. Conversely, in more affluent, safer, and predictable ecologies, such risks are lower: notably, when surrounded by more resources, individuals can afford to lose some of them in the short term149. Therefore, exploration is best defined as a ‘venture behavior’, that is, a preference for a high variance of rewards over short-term gains (as opposed to ‘hazardous behavior’150). Since organisms evolved in changing environments151, selection pressure would have favored exploratory preferences that are highly flexible to the local ecology, with time horizon as the crucial mediator111, 149, 152.

More generally, phenotypic plasticity enables an organism to adapt to new situations and environments by changing its behavior, rather than having to wait for genetic adaptations to occur. This flexibility in behavior can be an advantage in unpredictable or changing environments, allowing organisms to survive and reproduce more effectively. The behavioral effect of the local ecological cues on exploratory preferences, curiosity, and spatial search strategies153 is observed in a wide range of species, such as orangutans154,155,156, honeybees157, parrots158, and chickadees159. It is parsimoniously hypothesized that it also applies to humans146, 160.

At the individual level, there is empirical evidence that people living in richer families score higher in Openness-to-experience161, 162 and that people with higher income at one stage of their life are less likely to decrease in Openness-to-experience later on163. In a foraging task, people with more adverse childhood experiences remain in patches longer and, thus, explore less164. At the level of societies, recent empirical studies show that, across the world, people living in more affluent countries exhibit higher levels of openness to change and new experiences165,166,167. Finally, a recent study shows that between-country differences in levels of causal learning and pretend play in children (i.e., the United States vs. Peru) are similar to within-country differences due to socio-economic status (i.e., mixed-SES United States vs. low-SES United States168). We, therefore, predict that people living in more affluent local ecologies should, on average, be fonder of fictional stories set in imaginary worlds.

We reviewed evolutionary rationales and empirical evidence showing how the human motivation to explore (i.e., environmental curiosity) adaptively varies according to people’s personality traits, age, sex, and ecological conditions (Fig. 1).

Figure 1

Proposed model of the computational architecture of environmental curiosity (in orange), with the actual domain of the cognitive mechanism (in green).

Our hypothesis, therefore, leads to fine-grained predictions based on the adaptive variability of environmental curiosity5, 20. If imaginary worlds do exploit this cognitive mechanism, the sources of its adaptive flexibility that we reviewed should account for the variability in the human fascination for imaginary worlds, across time and populations.

Results

Study 1: Unsupervised clustering of movies

Before testing predictions about the sources of variability of the preference for imaginary worlds, we directly investigate whether fictional stories with imaginary worlds are related to exploration. We test that (1) stories with imaginary worlds constitute a well-identified cluster in the global set of movies produced and that (2) this emerging cluster is related to environmental curiosity through exploration-related content. We independently use two machine-learning approaches. The first, a random-forest algorithm trained on the plot keywords of manually annotated movies, is designed to detect imaginary worlds in a sample of 9424 movies. This algorithm is successful in identifying movies set in imaginary worlds, with an out-of-bag error rate of 9.35%. In parallel, we combine natural language processing techniques (i.e., a Sentence-BERT (SBERT) transformer) and topic modeling methods to project those 9424 movies into a semantic latent space by embedding their plot summaries: the closer movie summaries are in their meaning and content, the closer the movies will be in this space. Seven clusters naturally emerged, and we extracted the most specific n-grams to describe their content (see Fig. 2).
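The idea that proximity in meaning becomes proximity in the latent space can be sketched as follows. This is an illustration only: the three-dimensional vectors are made-up stand-ins for the high-dimensional embeddings of plot summaries, the summary names are hypothetical, and cosine similarity is assumed as the proximity measure (it is a common choice, not necessarily the one used in the pipeline).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real sentence embeddings have hundreds of dimensions.
embeddings = {
    "dragon_quest_summary": [0.9, 0.1, 0.2],
    "alien_planet_summary": [0.8, 0.2, 0.3],
    "wedding_romance_summary": [0.1, 0.9, 0.2],
}


def nearest_neighbour(name):
    """Return the other summary whose embedding is most similar to `name`."""
    others = [other for other in embeddings if other != name]
    return max(
        others,
        key=lambda other: cosine_similarity(embeddings[name], embeddings[other]),
    )


print(nearest_neighbour("dragon_quest_summary"))
```

With these toy vectors, the two summaries featuring imaginary settings are each other's nearest neighbours, which is the property a clustering step then exploits.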

Figure 2

Unsupervised clustering of 9424 fictional movies. (A) Projection of movies in a semantic latent space, with 7 clusters, and the tagging of some movies from the imaginary-world cluster (cluster 1). (B) The 20 most specific n-grams for each cluster, and the names that we attributed to them based on the lists.

Combining both algorithmic methods, we show that at least one of the emerging clusters specifically embeds movies with imaginary worlds, as identified by the random-forest algorithm. First, we find a significant relationship between being associated with a specific cluster and being detected as a movie with an imaginary world (χ²(6, N = 9424) = 576.754, p < 0.001). We, therefore, reject the null hypothesis that the two variables are independent of each other: movies with imaginary worlds are not randomly distributed across all clusters (Fig. 3). We then perform the same analysis to show that one specific cluster (cluster 1, hereafter ‘imaginary world cluster’) specifically embeds movies with imaginary worlds (χ²(1, N = 9424) = 1542.759, p < 0.001). In fact, 71% of the movies with imaginary worlds detected by our algorithm belong to this specific cluster. Even if ‘only’ 30% of the movies in this cluster are detected as having imaginary worlds, this compares to only 2%, on average, in the other clusters. Note that our algorithm is conservative: it does miss a lot of imaginary worlds (i.e., false negatives), but it is unlikely to wrongly label a movie that is not set in an imaginary world as having one (i.e., false positives; see “Supplementary Materials”). In all, movies with imaginary worlds are similar enough in their content for an unsupervised algorithm to group them in one cluster, based only on plot summaries. This is also qualitatively observable in the n-grams that are most specific to this cluster, blending words related to multiple genres such as fantasy (e.g., ‘dragon’), science fiction (e.g., ‘alien’), dystopia (e.g., ‘survivor’), and more broadly related to the supernatural (e.g., ‘vampire’).
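For readers unfamiliar with the test of independence used here, the computation can be sketched as follows. The counts in the example are hypothetical and are not the counts of Fig. 3; the sketch assumes a plain Pearson chi-square statistic without continuity correction, and the p-value helper is valid only for one degree of freedom (i.e., a 2 × 2 table).

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts (rows: focal cluster vs. all other clusters;
# columns: imaginary world detected vs. not detected).
table = [[30, 70],
         [10, 190]]

statistic, df = chi_square_independence(table)
print(round(statistic, 3), df, p_value_df1(statistic) < 0.001)
```

The same function handles the 7 × 2 table of the first test (six degrees of freedom), although obtaining its p-value would require the general chi-square distribution rather than the one-degree-of-freedom shortcut used above.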

Figure 3

Contingency table of movies with and without imaginary worlds, and with and without exploration-related content, across cluster 1 and all other clusters.

Finally, we show that this imaginary-world cluster specifically embeds movies with exploration-related content, and significantly more so than any other cluster. Each movie summary is ascribed a binary variable of exploration-relatedness, based on the exact match between at least one word from an algorithmically generated list of exploration-related words and words from the movie summaries (see “Methods”). There is a significant relationship between being associated with a cluster and being associated with exploration-related content (χ²(6, N = 9424) = 75.035, p < 0.001). We, therefore, reject the null hypothesis that the two variables are independent of each other: movies with exploration-related content are not randomly distributed among all clusters (Fig. 3). We again perform the same analysis and show that the imaginary-world cluster specifically embeds movies with exploration-related terms in their plot summaries (χ²(1, N = 9424) = 73.946, p < 0.001).
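The binary coding of exploration-relatedness described above can be sketched as follows. The word list and the summaries are hypothetical placeholders (the actual list is algorithmically generated, see “Methods”), and the tokenization rule, lowercasing and splitting on non-letters, is an assumption made for this illustration.

```python
import re


def has_exploration_content(summary, exploration_words):
    """Return 1 if at least one word of the summary exactly matches a word
    from the exploration list, and 0 otherwise."""
    tokens = set(re.findall(r"[a-z]+", summary.lower()))
    return int(bool(tokens & exploration_words))


# Hypothetical list, for illustration only.
exploration_words = {"explore", "journey", "discover", "expedition"}

summaries = [
    "A crew sets out on an expedition to a distant moon.",
    "Two rivals fall in love at a wedding.",
    "The explorers return home.",  # no exact match: 'explorers' is not 'explore'
]

print([has_exploration_content(s, exploration_words) for s in summaries])
```

Because matching is exact, inflected or derived forms count only if they appear in the list themselves, which makes the coding conservative.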

Consistent with our general hypothesis, these results suggest that fictions with imaginary worlds resemble each other, at least in part because they are related to exploration. Note that this study also comes as an external validity test for the random-forest algorithm: the latter is successful in identifying movies with imaginary worlds that, in addition, resemble each other in terms of their content. We will use its tagging of movies with imaginary worlds in the next study.

Study 2: Demographic and psychological characteristics of individuals who ‘like’ movies with imaginary worlds on Facebook

We now turn to specific predictions about the variability of the fascination for imaginary worlds in stories, which we derived from the adaptive sources of variability of human environmental curiosity (see “Introduction”). We predicted that people higher in Openness-to-experience, younger people, males, and people living in affluent local environments would be more likely to enjoy fictional stories set in imaginary worlds. We used the Movie Personality Dataset (MPD), which aggregates averaged personality (i.e., Big Five) and demographic traits (i.e., sex, age) from the Facebook myPersonality Database (N = 3.5 million81). We coupled this dataset with the output of the random-forest algorithm, which efficiently identifies movies as being set in an imaginary world or not (see “Study 1”). First, we found that, as predicted, movies with imaginary worlds on Facebook are liked by an audience that is, on average, higher in Openness-to-experience than the audience of movies with no imaginary worlds (β = 0.12, p < 0.01, CI [0.02, 0.22], Cohen’s d = 0.24; Fig. 4). In other words, approximately 60% of movies with imaginary worlds have higher aggregated scores of Openness-to-experience than the mean of Openness-to-experience of movies with no imaginary world. Although we had no specific prediction derived from our hypothesis, we report the associations between the four other traits of the Big Five and the liking of imaginary worlds (Agreeableness: β = 0.08, p = 0.451, CI [−0.12, 0.27]; Conscientiousness: β = −0.29, p < 0.01, CI [−0.47, −0.10]; Extraversion: β = −0.68, p < 0.001, CI [−0.84, −0.54]; Neuroticism: β = 0.19, p < 0.05, CI [0.01, 0.37]). Second, we found that movies with imaginary worlds on Facebook are liked by an audience that is, on average, more composed of males than the audience of movies with no imaginary worlds (β = 0.44, p < 0.001, CI [0.31, 0.57], Cohen’s d = 0.68; Fig. 4). This means that there is a 68.5% chance that a movie with an imaginary world picked at random will have a higher percentage of males liking it on Facebook than a movie with no imaginary world picked at random. We found no significant association between the age of consumers and the presence of imaginary worlds in movies (β = −0.005, p = 0.33, CI [−0.015, 0.0051]; Fig. 4). Finally, with the full model, with all 3 variables of interest as explanatory variables, and the liking of movies with imaginary worlds as the outcome variable, we found significant coefficients with the predicted directions (Fig. 5).
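The two percentages given as plain-language readings of Cohen’s d can be recovered with the standard conversions sketched below. The sketch assumes normally distributed scores with equal variances in both groups, and it appears to be the conversion behind the reported figures: Cohen’s U3 (the share of one group above the mean of the other) for the 60% figure, and the probability of superiority (common-language effect size) for the 68.5% figure.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cohens_u3(d):
    """Share of group 1 scoring above the mean of group 2."""
    return normal_cdf(d)


def probability_of_superiority(d):
    """Chance that a random member of group 1 scores above a random
    member of group 2 (common-language effect size)."""
    return normal_cdf(d / math.sqrt(2.0))


u3_openness = cohens_u3(0.24)               # d reported for Openness
ps_sex = probability_of_superiority(0.68)   # d reported for sex

print(round(u3_openness, 3), round(ps_sex, 3))
```

Both values round to the percentages quoted in the text, which suggests that the first statement is a U3-type reading and the second a probability-of-superiority reading of the respective effect sizes.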

Figure 4

Distribution of the average scores of the Big Five personality traits (A–E), of the average sexes (F), and of the average ages (G) of movies with and without imaginary worlds. *p < 0.05; **p < 0.01; ***p < 0.001.

Figure 5

Model output of the Linear Probability Model explaining the presence of an imaginary world in a movie with 3 variables: the aggregated level of Openness-to-experience, sex, and age of the people who liked such movies on Facebook. *p < 0.05; **p < 0.01; ***p < 0.001.
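A Linear Probability Model is an ordinary least-squares regression whose outcome is binary, so that each coefficient reads as a change in the probability of the outcome. The sketch below illustrates this with a single predictor and made-up data; the model reported here has three predictors, and the values and variable names are hypothetical.

```python
def fit_linear_probability_model(x, y):
    """Ordinary least squares with one predictor and a 0/1 outcome.
    Returns (intercept, slope); the slope is the change in the predicted
    probability of y = 1 per one-unit increase in x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: standardized audience Openness per movie, and whether
# the movie was tagged as set in an imaginary world (1) or not (0).
openness = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
imaginary_world = [0, 0, 0, 1, 0, 1]

intercept, slope = fit_linear_probability_model(openness, imaginary_world)
print(round(intercept, 3), round(slope, 3))
```

One known limitation of this model family is that predicted probabilities can fall outside the 0 to 1 range for extreme predictor values, which is why coefficients are best read as average marginal effects.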

With computational methods, we provide observational evidence that people who like movies with imaginary worlds on Facebook are overall higher in Openness-to-experience. These results are in line with existing empirical evidence showing an association between Openness-to-experience and the consumption of or preference for specific genres often associated with imaginary worlds such as science fiction and fantasy81, 169,170,171,172,173. Also consistent with our predictions, we provide evidence that people who report liking movies with imaginary worlds are more likely to be males. We did not find any significant association between age and the liking of movies with imaginary worlds. This can be explained by the very restricted age range of Facebook users at the time the participants were interviewed (in 2009–2010): aggregated ages associated with each movie range from 17.9 to 32.7 years old. We would need a much larger range to assess the impact of age on the consumption of fictional stories with imaginary worlds. Note that evolutionary developmental psychology does not make any strong predictions about the change in the sensitivity of environmental curiosity within this specific life stage. Rather, it makes predictions about the difference in the sensitivity of environmental curiosity between this life stage, earlier ones, and older ones174. Further research should investigate the differences in cultural preferences between children and other life stages. Finally, let us note that, with this study, the results about the personality traits and biological sex of such audiences are generalizable only within the specific age range of this dataset.

Study 3: Demographic and psychological characteristics of individuals who self-report liking stories set in imaginary worlds

We now turn to experimental tests of the same predictions (all pre-registered). We asked participants to report their preferences for fictional stories with imaginary worlds using a questionnaire and asked them to respond to a range of psychometric questionnaires (see “Methods”). We predicted that people higher in Openness-to-experience, younger people, males, and people living in affluent local environments would be more likely to enjoy stories set in imaginary worlds. We take advantage of experimental paradigms to further test two other hypotheses. We test the presumably complementary ‘systemizing hypothesis’, which suggests that people enjoy imaginary worlds because they like to understand the ways newly presented imaginary worlds are structured and operate (i.e., because they are ‘higher systemizers’). We also test the alternative and widely spread ‘escapist hypothesis’ which posits that people enjoy imaginary worlds because they want to escape the difficulties of the real world: we look at whether people who report having more difficulties in life also report enjoying more fictional stories with imaginary worlds. We predicted that it would not be the case. We report the results in Fig. 6.

Figure 6

Summary of the predictions and results of the experimental study with the self-reporting paradigm, as pre-registered. We removed from the pre-registration 2 mediation tests that we could not perform (see Pre-registration).

First, we tested our core prediction about the association between the preference for imaginary worlds and environmental curiosity: people who score higher in the Curiosity and Exploration Inventory scale report enjoying more imaginary worlds in fictional stories (Fig. 7). Then, we tested predictions related to the adaptive variability of environmental curiosity and found that participants who report liking more imaginary worlds in stories are overall higher in Openness-to-experience, younger, and more likely to be males (Fig. 8). Although participants with higher socio-economic status scored significantly higher on the Curiosity and Exploration Inventory scale, they did not report enjoying more imaginary worlds. This is consistent with the hypothesis that phenotypic plasticity does impact exploratory preferences (which increase as the local ecology gets more affluent and predictable), but it suggests that this effect may not translate into the cultural domain. It might also be that the test of this prediction was affected by the reduction of the sample size due to participant exclusions.

Figure 7

The scores of reported preferences for imaginary worlds as a function of CEI-II scores.

Figure 8

Model output of a Linear Model with the reported preference for imaginary worlds as the dependent variable and the level of Openness-to-experience, sex, age, and SES of the participants as the independent variables. *p < 0.05; **p < 0.01; ***p < 0.001.

Let’s finally turn to the two other hypotheses that we tested. First, it does not seem that stories with imaginary worlds are enjoyed because they allow consumers to ‘escape’ the difficulties of the real world. We tested this prediction by looking at the correlation between a self-reported measure of well-being and the preference for imaginary worlds. We reasoned that, if this hypothesis were true, the more unhappy people are, the more they should like imaginary worlds. As predicted, this association turned out to be non-significant. Of course, more empirical tests should be run, but this first result suggests that the ‘escapist hypothesis’ is either false or incomplete. Finally, we tested the effect of levels of systemizing on the preference for imaginary worlds. This supplementary hypothesis was confirmed: the higher people are in systemizing, the higher they score on the Curiosity and Exploration Inventory scale and the more they report enjoying imaginary worlds. With a mediation analysis, we found that the systemizing quotient mediated 70% of the effect of sex on the preference for imaginary worlds: while this is not a causal paradigm, this result is consistent with the hypothesis that males enjoy more imaginary worlds in large part because they’re higher in systemizing (Fig. 9).

Figure 9

(A) Correlation between the preference for imaginary worlds and the Systemizing Quotient. (B) Correlation between exploratory preferences (CEI-II) and the Systemizing Quotient. (C) The association between sex and the scores of systemizing. (D) The association between sex and the reported preference for imaginary worlds. (E) Levels of systemizing mediate the effect of sex on the preference for imaginary worlds with an indirect effect of −0.17 (p < 0.001), leaving a non-significant direct effect (in parentheses on the center path, next to the total effect). The proportion of effects mediated is 70%.

This study replicates previous findings of sex differences in systemizing, with similar magnitude77, 78. We further demonstrate that this difference translates into the cultural domain, impacting the reported preference for imaginary worlds in fiction. Besides, we show that the psychological trait of systemizing is highly correlated with exploratory preferences, supporting the hypothesis that systemizing is subsumed under the drive to explore that we coined environmental curiosity175. Under our theoretical account, systemizing is the label for an extreme form of information processing that this mechanism of curiosity about the physical world can take, which is indeed modulated by sex differences because of ancestral selection pressures. Crucially, this account may explain why fans of imaginary worlds like to explore imaginary worlds in depth rather than exploring new worlds again and again176, and why fans of imaginary worlds end up remembering and storing huge amounts of information related to the imaginary worlds they like, for instance in Wikipedia-like online ‘Fandoms’ (e.g., the Star Wars online fandom aggregates more than 175,000 pages1).

Discussion

In all, we provide empirical evidence supporting the hypothesis that exploratory preferences explain why humans are fascinated by imaginary worlds in fictional stories. We reviewed the evolutionary psychological literature on environmental curiosity and exploratory preferences in humans. Then, we showed that fictions with imaginary worlds cluster together because of the semantic proximity of their summary plots, suggesting that they resemble each other in terms of their content. We showed that movies from this cluster are specifically associated with exploration-related content. As predicted by the exploration hypothesis5, we then provided evidence that the adaptive variability of the sensitivity of environmental curiosity reflects, and therefore likely explains, the variability of the preference for imaginary worlds in fiction.

Observational analyses of large cultural datasets showed that people who ‘like’ movies with imaginary worlds on Facebook are overall higher in Openness-to-experience, younger, and more likely to be males (when controlling for the two other variables). This dataset can be biased as it aggregates personality scores of people who decided to create a Facebook account and ‘like’ movies on Facebook. Therefore, we replicated such findings with experimental methods: we provided consistent evidence that participants who report enjoying imaginary worlds in movies, novels, and video games are overall higher on a scale of exploratory preferences, higher in Openness-to-experience, younger, higher in systemizing, and more likely to be males. We did not find an association between the socioeconomic status of the participants and their reported preference for imaginary worlds, although participants higher in socio-economic status scored significantly higher on the scale of exploratory preferences. The core prediction that, in synchrony, the socioeconomic level of people should impact the preference for fictional stories set in imaginary worlds, through a mediating effect of environmental curiosity, should be further tested with other datasets or experimental tests. Future research should also further study the causal impact of ecological conditions on the production of speculative fictional stories in diachrony, at the macro-level of societies.

Cognitive scientists have long argued that universal cognitive adaptations can explain the evolution, stabilization, and distribution of cultural traits21, 177,178,179. Here we demonstrate that the way such cognitive adaptations vary (between individuals, across ontogeny, and with changes in the local ecologies) can explain the variable appeal of such cultural traits to human cognition. More specifically in studying fictional stories, some researchers have focused on universal appeal for some content features, making some stories more popular than others (180,181,182,183, e.g., for romance184,185,186,187,188, e.g., for horror189). We contend that evolutionary psychology now provides predictions and powerful ways to interpret findings about the differences and changes in human cultural preferences (e.g., for romance158, 159, e.g., for horror160). Further research in fiction study could investigate the variability of many other preferences (and associated consummatory behavior) with such an evolutionary framework.

Beyond the field of entertainment, the success of imaginary worlds in modern societies reveals important changes in individual preferences and personality traits. Why would people come to enjoy stories with imaginary worlds now, and not before? Because we have provided empirical evidence that the appeal of imaginary worlds relies on exploratory preferences, the increasing success of fiction with imaginary worlds may reflect changes in human exploratory preferences. We proposed that humans universally become more curious and explorative as they live in more affluent ecologies, notably because the evolutionary costs of curiosity decrease in such environments. This hypothesis did not lead to significant results when comparing people’s preferences for imaginary worlds at different socio-economic levels. However, it could mean that people process other cues than sheer income to assess how well-off they are (e.g., cues at the country level, such as unemployment insurance). If our hypothesis is true, the economic growth of the last decades or even of the last centuries, in most human societies, likely fueled a bigger and bigger audience for stories set in imaginary worlds, and producers of fiction could therefore invest more and more in the creation and refinement of such worlds190.

It is worth noting that this hypothesis fits qualitative observations about the cultural evolution of imaginary worlds at the country level. Modern stories with imaginary worlds first became popular in the United Kingdom4, which was at the time the leading country in terms of GDP per capita191, and then mostly developed in the Euro-American sphere. By contrast, for most of the nineteenth and twentieth centuries, the popularity of imaginary worlds was rather limited in less economically developed countries. For instance, while Jules Verne was first translated into Chinese in the early twentieth century and inspired Chinese writers to write science-fiction and fantasy stories during the late Qing dynasty and early Republican era, stories set in imaginary worlds remained marginal in Chinese literature during the twentieth century192, 193. In East Asia, imaginary worlds started to become mainstream first in Japan in the 1950s194, 195, which had started its industrialization in the late nineteenth century, then in Hong Kong and Taiwan196, which had started to develop economically in the 1970s. During the same time, imaginary worlds were much less popular in mainland China192, 197, and they became mainstream in mainland China at the turn of the new millennium, that is, 20 years after the take-off of the Chinese economy196,197,199.

While future empirical research should thoroughly test this hypothesis, there are already several indications in favor of this idea. For instance, recent studies on the evolution of personality traits have shown an increase in Openness-to-experience in high-income countries both in Western200 and Eastern societies201. However, these studies are limited in terms of sample size, population diversity, and measurement. If we are right, the rise of imaginary worlds in all parts of the world would suggest that Openness-to-experience is rising in modern societies and that it has been rising for at least 150 years. That is, we could now use the evolution of the relative production of stories with imaginary worlds as a proxy for changes in human exploratory preferences. Our results can therefore contribute to the understanding of behavioral and cultural changes over the long run146, 166, 202.

Methods

Data

Extraction of existing data from the Internet or previous studies (Studies 1 & 2)

We use the Internet Movie Database (IMDb) to obtain metadata about 9424 movies, such as their genres, summary plots, and their keywords. In the Movie Personality Dataset (MPD), for each of the 846 movies, we have (1) the movie metadata, (2) the average personality traits, average age, and average sex of people who like it on Facebook, and (3) the presence (or not) of an imaginary world in it (see “Algorithmic methods”, below). Nave et al.81 built an important dataset that makes it possible to map the associations between movie characteristics and the characteristics common to people who like such movies. Note that because socio-demographic scores are aggregated, the sex variable associated with each movie becomes a continuous variable between 0 and 1 (as a percentage of males who liked it, where 1 would mean that all people who liked this movie on Facebook self-reported themselves as men).

Collection of original data through experimental designs (Study 3)

The design and predictions for this study were pre-registered (https://osf.io/8yj3v). All methods were carried out in accordance with relevant regulations and approved by the Conseil d’évaluation éthique pour les recherches en santé, CERES n°201,659. We recruited 350 participants from the online research participation platform Prolific (180 males, 165 females, 4 others, Mage = 46, SDage = 19.5). Participants confirmed their informed consent. We removed participants failing the attention check and participants failing to respond to the follow-up study, leaving a total sample of 230 participants (101 males, 127 females, 2 others, Mage = 48, SDage = 16.3, Rangeage: 19–82; we still ran the analyses that were possible without the follow-up study with the entire sample, after removal of participants who failed the attention check; see “Supplementary Materials”). Our pre-registered sample size was higher (319) for 95% power (with α = 0.2 and p < 0.05), but with 230 participants the statistical power level is above 80%. All methods were carried out in accordance with relevant guidelines and regulations, and informed consent was obtained from all participants.

In the first part of the experimental study, three paradigms aimed at capturing an individual score of preference for fictional stories with imaginary worlds. Since the last paradigm is the only one that provided results consistent with the large-scale observational study, we report detailed results only for this one (see “Supplementary Information” for the results of the other paradigms). While we do not deny that such failures to find results consistent with our predictions across all pre-registered paradigms weaken the significance of our findings, it is not surprising to us that the self-reporting method (efficiently used in similar research; e.g.,203, 204) has overall more predictive power than newly designed paradigms. Besides, when asked to choose between two movies or rate a movie summary, people likely use lots of cues to make their choice, such that the fact that it takes place in an imaginary world might not be decisive. Conversely, the last paradigm, where participants are straightforwardly asked whether they like movies, novels, or video games set in imaginary worlds, targets more precisely the content feature we want to study.

We believe that such paradigms have limitations that can explain such results. Our theory predicts that some content features go along better together, because they tap into psychological preferences that share common cognitive and neural bases, and therefore are present in the same people. This would mean that some of the randomly created movie plots would be ‘objectively’ better at appealing to a certain audience because by chance they would bring together locations and plots that are psychologically ‘consistent’. Some others are not. This creates a bias in the randomly created plots. Regarding the second paradigm, it is obvious that people take many elements into account when deciding which of two films they would prefer to watch. The presence of imaginary worlds might not make any difference for some people and might even be hard to detect with the cues we present them with. We now think that all content features of movies should be controlled for in such paradigms if we want to be precise about what drives people to consume and enjoy some stories.

Let’s note, however, that the three computed scores of preferences for imaginary worlds all significantly and positively correlate with each other. It suggests that the (self-reported) specific preference for imaginary worlds drives the actual choice of consumption of movies, while not driving it enough to provide significant results with our sample size (see “Supplementary Materials”). Further research should keep on trying to find more ecologically valid experimental paradigms to complement findings from self-reporting. We believe that the associations between horror movies and morbid curiosity204, 205 and between movies with imaginary worlds and environmental curiosity could serve as tests that new experimental paradigms are successful in capturing the preferences of consumers, before expanding the methodology.

For the self-reporting paradigm, we first created an 8-item scale. A factor analysis (KMO sampling adequacy = 0.62; see “Supplementary Materials”) indicated that two clusters of items emerged from the responses (χ²(13) = 60.19, p < 0.001). We removed the items that didn’t load onto factor 2, which was more specific to the preference for imaginary worlds (see “Supplementary Materials”). The 4-item scale showed near-acceptable reliability (α = 0.66). Here are the 4 items, which participants had to rate on a 7-point Likert scale from ‘I fully disagree’ to ‘I fully agree’: (1) ‘I like movies, novels and video games with more information about the world than about the characters’, (2) ‘I like movies, novels and video games in which the fictional characters explore their environment’, (3) ‘I like movies, novels and video games with novel and surprising technologies’, and (4) ‘I like movies, novels and video games which make me feel I am traveling in a foreign world’. The final individual score of preference for imaginary worlds is the mean of the ratings of these 4 items.
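The scoring and reliability steps described above can be illustrated with a minimal Python sketch. The responses are hypothetical toy data, not the study's data, and the function names `preference_score` and `cronbach_alpha` are illustrative. The sketch assumes the standard formula for Cronbach's α computed from sample variances; the software actually used for the reported α is not specified in the text.

```python
from statistics import mean, variance


def preference_score(ratings):
    """Individual score: mean of the item ratings (1 to 7 each)."""
    return mean(ratings)


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of participants, each a list of k item ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    item_variances = [variance(item) for item in zip(*responses)]
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five participants to the 4 retained items.
toy_responses = [
    [5, 6, 5, 7],
    [3, 4, 2, 4],
    [6, 6, 7, 6],
    [2, 3, 3, 2],
    [4, 5, 4, 5],
]

scores = [preference_score(r) for r in toy_responses]
alpha = cronbach_alpha(toy_responses)
print("scores:", scores)
print("alpha:", round(alpha, 3))
```

The toy items were chosen to be strongly consistent with each other, so the resulting α is much higher than the 0.66 reported for the actual scale.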

Then, the participants were also asked to respond to (1) the Big Five questionnaire BFI-10206, to measure the score of Openness-to-experience, (2) the Curiosity and Exploration Inventory-II207 to measure exploratory preferences, (3) the short Warwick-Edinburgh Mental Well-being Scale208 to assess well-being scores, (4) the 8-item version of the Systemizing Quotient209 to measure scores of systemizing, (5) the childhood and current Socio-Economic Status (as designed in210) as a proxy for the affluence of the local ecology, (6) their reported gender, and (7) their age.

Algorithmic methods

Random-forest algorithm (study 1 & 2)

The first step is to detect the presence of imaginary worlds in movies. We start by manually coding 385 movies randomly selected in the IMDb dataset, as being set in an imaginary world or not. We base this decision on one main criterion: whether or not the IMDb movie summary mentions a location that does not exist in the real world. Then, we extend this categorization to 9424 movies with a classification algorithm based on a random-forest method211 and trained on plot keywords (i.e., user-generated keywords associated with movies which describe “any notable object, concept, style or action that takes place during a title”). This algorithm is successful in identifying movies set in imaginary worlds with an out-of-bag error rate of 9.35%. Among the 328 movies annotated as not being set in an imaginary world, the random-forest algorithm miscategorized only 5 of them, and among the 57 movies that we manually annotated as being set in an imaginary world, it accurately finds 26 of them: the algorithm, therefore, underestimates the number of movies with imaginary worlds. To further assess the external validity of this predictive algorithm, we showed that movies identified as being set in an imaginary world by the algorithm were more likely to be classified in the science fiction and fantasy IMDb genres, two genres in which producers of fiction commonly classify fictions with imaginary worlds (see “Supplementary Materials” for the results).
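The counts reported above are enough to reconstruct the confusion matrix of the classifier on the 385 manually coded movies. The short Python sketch below does this arithmetic, assuming that the reported counts describe the same out-of-bag predictions as the reported error rate. The derived quantities (recall, specificity, precision) are not reported in the text and are computed here for illustration only.

```python
# Counts reported in the text for the 385 manually coded movies.
n_not_imaginary = 328    # annotated as NOT set in an imaginary world
n_imaginary = 57         # annotated as set in an imaginary world
false_positives = 5      # non-imaginary movies flagged as imaginary
true_positives = 26      # imaginary-world movies correctly found

# Remaining cells of the confusion matrix follow by subtraction.
true_negatives = n_not_imaginary - false_positives
false_negatives = n_imaginary - true_positives
total = n_not_imaginary + n_imaginary

error_rate = (false_positives + false_negatives) / total
recall = true_positives / n_imaginary             # share of imaginary worlds found
specificity = true_negatives / n_not_imaginary    # share of other movies kept out
precision = true_positives / (true_positives + false_positives)

print(f"error rate:  {error_rate:.2%}")
print(f"recall:      {recall:.2%}")
print(f"specificity: {specificity:.2%}")
print(f"precision:   {precision:.2%}")
```

The error rate computed this way, 36 errors out of 385 movies, matches the reported 9.35%. The low recall next to the high specificity makes the direction of the bias explicit: the classifier misses many imaginary-world movies but rarely flags a movie wrongly.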

Topic modeling method (study 1)

Independently from this first step, we use Natural Language Processing methods and Topic Modeling to project those 9424 movies into a semantic latent space. More specifically, we use SBert Transformer211,212,214, which has been trained on millions of common language corpora and can map words, sentences, and paragraphs to a multidimensional dense vector space (i.e., word embedding215, 216), and which achieves state-of-the-art performance on machine-learning tasks related to text understanding217. Such techniques allow us to define the semantic closeness of words, sentences, or paragraphs in an unsupervised fashion, by making the algorithm look at the contexts in which words are used in common language corpora. The underlying assumption is that words used in similar ways, at such a very large scale, have similar meanings. Here, we project movies into a semantic space using their movie description: the closer movie summaries are semantically, the closer movies will be in this space. Then, we use the K-Means algorithm to cluster this space into 7 clusters (with the elbow method218 determining the number of clusters beyond which additional clusters add little explained variation). For every cluster, we compute the 20 most specific n-grams using the chi-squared test, thus providing the words that most specifically describe the clusters. For these computations, we use the Python ‘bunkatech’ package (https://github.com/charlesdedampierre/BunkaTopics).
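The logic of K-Means clustering and of the elbow method can be illustrated with a small, self-contained Python sketch. It is not the ‘bunkatech’ implementation: it is a plain Lloyd-style K-Means with a deterministic initialisation, run on hypothetical two-dimensional points standing in for the high-dimensional movie embeddings. The function names and the toy data are assumptions made for illustration.

```python
from statistics import mean


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_inertia(points, k, n_iter=20):
    """Run a simple K-Means and return the within-cluster sum of squares."""
    # Deterministic initialisation (an assumption): k evenly spaced points.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(mean(axis) for axis in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Hypothetical embeddings: three well-separated groups of three points.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 5)}
for k, inertia in inertias.items():
    print(k, round(inertia, 2))
```

On these toy points the within-cluster sum of squares drops sharply up to three clusters and only marginally afterwards, which is the ‘elbow’ that the method looks for.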

Creation of an extended list of exploration-related terms

First, we manually create a list of 5 core terms directly related to exploration (i.e., ‘exploration’, ‘explorer’, ‘explorers’, ‘explores’, ‘exploring’). Then, we extend this list using again the algorithm Sbert Transformer, this time applied to the movie summaries. More specifically, we find the 20 words that are closest to each of the core terms in the dataset of the movie summaries itself (with no consideration of whether the movie is set in an imaginary world or not, or is part of the imaginary-world cluster or not), and then remove duplicates. We end up with 37 terms in this extended list of exploration-related words.
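The expansion procedure can be sketched in Python as a nearest-neighbour search followed by deduplication. The two-dimensional vectors below are hypothetical stand-ins for the SBert embeddings, the vocabulary is invented, and `expand_terms` is an illustrative name. The sketch also assumes that cosine similarity is the closeness measure and that the core terms themselves are kept in the extended list, neither of which is stated in the text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_terms(core_terms, embeddings, k):
    """Return the core terms plus the k nearest words of each, without duplicates."""
    extended = list(core_terms)
    for term in core_terms:
        candidates = [w for w in embeddings if w != term]
        candidates.sort(
            key=lambda w: cosine(embeddings[term], embeddings[w]), reverse=True
        )
        for word in candidates[:k]:
            if word not in extended:
                extended.append(word)
    return extended


# Hypothetical embeddings of a tiny vocabulary.
toy_embeddings = {
    "exploration": (1.0, 0.1),
    "exploring": (0.9, 0.2),
    "expedition": (0.8, 0.3),
    "voyage": (0.7, 0.4),
    "wedding": (0.1, 1.0),
    "lawyer": (0.0, 0.9),
}

extended_terms = expand_terms(["exploration", "exploring"], toy_embeddings, k=2)
print(extended_terms)
```

Because the neighbours of related core terms overlap heavily, the deduplicated list is much shorter than the number of core terms multiplied by k, which is the same pattern as 5 core terms with 20 neighbours each yielding 37 terms.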

Statistical models

Chi-2 tests of independence (study 1)

We combine both algorithmic methods (see “Algorithmic methods”): we use the chi-squared test of independence to check that at least one cluster which has emerged from the Topic Modeling method embeds more specifically movies with imaginary worlds, as identified by the random-forest algorithm. In other words, we look at the correspondence between two computationally designed features of 9424 movies: belonging to an emergent cluster and being detected as a movie with an imaginary world. We use the same test to check that the imaginary-world cluster embeds more specifically such movies with exploration-related terms. In other words, we look at the correspondence between two computationally designed features of movies: belonging to the imaginary-world cluster and the binary variable of exploration-relatedness.
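As an illustration of the test, the Python sketch below computes Pearson's chi-squared statistic for a contingency table crossing cluster membership with the presence of an imaginary world. The counts are hypothetical, the table is reduced to one cluster against all others (hence one degree of freedom), and no continuity correction is applied; whether the original analysis used a correction is not stated in the text.

```python
import math


def chi2_independence(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_p_value_1dof(stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts. Rows: movie in the cluster / not in the cluster.
# Columns: detected as imaginary world / not detected.
toy_table = [
    [30, 10],
    [20, 40],
]

stat, dof = chi2_independence(toy_table)
p_value = chi2_p_value_1dof(stat)
print(round(stat, 3), dof, p_value)
```

The same function accepts the full table of 7 clusters by 2 categories, in which case the statistic has 6 degrees of freedom and the p-value would need the general chi-squared distribution rather than the one-degree-of-freedom shortcut used here.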

Linear probability models (study 2)

To test the correlations between the appeal for movies with imaginary worlds and the average scores of Openness-to-experience, age, and sex, we use Linear Probability Models, with such scores as explanatory variables, and the binary variable of the presence or absence of an imaginary world as the outcome variable. Then, we use one Linear Probability Model with all the scores as explanatory variables and the binary variable of the presence or absence of an imaginary world as the outcome variable (see “Supplementary Materials”, Appendix B, for model assumptions check).
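A Linear Probability Model is an ordinary least squares regression fitted on a binary outcome, so that the slope reads as a change in the probability of the outcome. The minimal Python sketch below fits one with a single explanatory variable, using hypothetical toy values rather than the MPD data; `fit_lpm` is an illustrative name.

```python
from statistics import mean


def fit_lpm(x, y):
    """Ordinary least squares fit of a binary outcome y on one predictor x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: average Openness score of each movie's audience,
# and whether the movie is set in an imaginary world (1) or not (0).
openness = [0.0, 1.0, 2.0, 3.0]
imaginary_world = [0, 0, 1, 1]

intercept, slope = fit_lpm(openness, imaginary_world)
predictions = [intercept + slope * x for x in openness]
print(intercept, slope)
print(predictions)
```

In this toy fit, each additional point of Openness is associated with a higher predicted probability that the movie features an imaginary world. The fitted values also fall slightly outside the 0 to 1 range at both ends, a known limitation of Linear Probability Models and one reason why model assumptions need to be checked.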

Linear models and t-tests (study 3)

To test predictions with the data from the experiment, we use (1) linear models with the score of preference for imaginary worlds as the dependent variable, and, in turn, the score of the Curiosity and Exploration Inventory-II, the score of Openness-to-experience, the age, the socio-economic status, the Systemizing Quotient, and the Well-Being score as the independent variable, (2) linear models with the score of Curiosity and Exploration Inventory-II as the dependent variable, and in turn, the score of Openness-to-experience, the age, the socio-economic status, and the Systemizing Quotient as the independent variable, (3) t-tests with, in turn, the score of preference for imaginary worlds and the Systemizing Quotient as the dependent variable, and the sex as a binary independent variable. We also perform a mediation analysis with the R ‘Mediation’ package. Finally, we perform a non-preregistered linear model with the score of preference for imaginary worlds as the dependent variable and scores of Openness-to-experience, sex, age, and socio-economic status as the independent variables.
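The logic of the mediation analysis can be illustrated with a product-of-coefficients sketch in Python. It is a simplified stand-in, not the procedure of the R package, whose own estimation and inference steps are not reproduced here. The data are hypothetical, sex is coded 0/1 as an assumption, and the function name is illustrative.

```python
from statistics import mean


def mediation_effects(x, m, y):
    """Total, direct and indirect effects of x on y through m, using OLS."""
    x_bar, m_bar, y_bar = mean(x), mean(m), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    smm = sum((mi - m_bar) ** 2 for mi in m)
    sxm = sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(x, m))
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    smy = sum((mi - m_bar) * (yi - y_bar) for mi, yi in zip(m, y))

    a = sxm / sxx                      # path x -> m
    total = sxy / sxx                  # effect of x on y, ignoring m
    det = sxx * smm - sxm ** 2         # determinant of the two-predictor system
    direct = (smm * sxy - sxm * smy) / det   # effect of x on y, controlling for m
    b = (sxx * smy - sxm * sxy) / det        # path m -> y, controlling for x
    indirect = a * b
    return total, direct, indirect


# Hypothetical data: sex (0/1), systemizing score, preference for imaginary worlds.
sex = [0, 0, 0, 0, 1, 1, 1, 1]
systemizing = [6, 7, 8, 7, 4, 5, 5, 6]
preference = [5.5, 6.0, 6.5, 5.8, 4.0, 4.8, 4.5, 5.2]

total, direct, indirect = mediation_effects(sex, systemizing, preference)
proportion_mediated = indirect / total
print(total, direct, indirect, proportion_mediated)
```

With ordinary least squares, the total effect equals the sum of the direct and indirect effects, so the proportion mediated is the indirect effect divided by the total effect. This is how a figure such as the 70% reported for systemizing should be read.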