RETRACTED ARTICLE: A cross-linguistic investigation of /h/ symbolism: the case of H2O

AL-Jarrah, Rasheed

doi:10.1057/s41599-025-06397-0

Article
Open access
Published: 03 January 2026

Humanities and Social Sciences Communications volume 13, Article number: 64 (2026)

This article was retracted on 23 January 2026

This article has been updated

Abstract

The relationship between word formation and the meanings words convey has been a philosophical concern since antiquity. Scholars across disciplines have long engaged with the issue of form–meaning mapping, giving rise to two competing perspectives: one emphasizing the arbitrariness of language, and the other advocating sound symbolism (for details, see Perniss et al. 2010; Lockwood and Dingemanse 2015, among others). In Reichard’s (1944; 1950) words, linguistic forms have intrinsic power, but in Wittgenstein’s words (1953[2009]), they are externally empowered. This study investigates the linguistic and symbolic functions of [h]-sounds by proposing a relatively unique word dichotomy (REAL vs FAKE). The analysis integrates symbolic and orthographic evidence into broader debates on arbitrariness in linguistic representation. While language-related research has often treated the linkage between form and meaning as a product of social convention, experimental studies in other fields increasingly demonstrate that sound symbolism lies at the core of word formation (e.g., Stokoe 1991; Armstrong et al. 1995; Dingemanse et al. 2015). The central question, therefore, is not whether such a relationship exists, but rather how to uncover its subtle manifestations. Despite extensive inquiry, no universally accepted model of analysis has yet emerged. The present study contributes to the existing literature by bridging phonological theory, cognitive linguistics, and anthropological linguistics through a cross-linguistic examination of the symbolic nature of the [laryngeal]/h/ in both pronunciation and orthography. A survey of its use across ancient and modern languages reveals a division of labor: in classical and so-called “primitive” languages, /h/ retains a phonemic (primary) function, whereas in modern and non-primitive languages it has largely been reduced to an allophonic (secondary) status.

Introduction

God’s life-giving gift, water, is the cause for the creation of gardens (Burnet 1920, p. 35), and at the same time, it is the reason for their growth. Consider this:

And we sent down from the sky blessed water, and We caused to grow thereby gardens and grain for harvest (9) and tall palm trees with clustered fruit (10) as provision for the servants. And we revived thereby a dead land. Thus is the emergence (11). (Qur’an)

Accordingly, water has two main functions: creation and growth. It is, however, our contention that whereas creation is the work of the horizontal bonds between the components of matter after it splits, growth is the work of the vertical bonds between these components. In simple terms, the basic component in the creation process is the element hydrogen (H), and the basic component in the growth process is the element oxygen (O); hence H2O. As this is a language-driven research paper, we will try in what follows to bring forth some linguistic, psychological, and anthropological pieces of evidence to support our claim.

The question that is addressed here is: What is the word that should be used for “the liquid which we use for drinking” (viz., the chemical formula H2O)? Each language uses one and only one word to describe the “clear liquid, without color or taste, which falls from the sky as rain and is necessary for animal and plant life” (Cambridge Dictionary^{Footnote 1}). Whereas it is water in English, it is maa’ in Arabic, maim in Hebrew, and eau in French, etc. The question that has always been mind-bending to philosophers, linguists, anthropologists, and psychologists (as well as to ordinary people) is: Do such words fit their referent (viz., water) by virtue of the sounds they are made of? Two relatively opposing lines of thought have always been available across the board: Whereas one advocates arbitrariness, the other argues for symbolic meaning.

Language arbitrariness vs phonetic symbolism

The idea of a globally understood ‘natural language’ (where sound-meaning mapping is one-to-one) has always been in the minds of people, irrespective of time and place (for the so-called pre-Babel Tongue, see Wilkins 1668). The relationship between the sounds of words and their meanings has prompted a sizable body of publications along various lines of research, including, but not limited to, language arbitrariness (Muin et al. 2021), sound symbolism (Parault and Schwanenflugel 2006; Cabrera 2012), phonetic symbolism (Sapir 1929; Nuckolls 1999; Erben et al. 2020), iconicity (Perniss and Vigliocco 2010), the sound-meaning interface (Blasi et al. 2016), phono-semantics (Joo 2020), and ideophones (Voeltz and Kilian-Hatz 2001). This breadth is reason enough for the reference list of any paper investigating the intricacies of the topic to be very lengthy.

Philosophers and linguists debating the form-meaning relationship have been cautious not to make clear-cut pronouncements. However, despite the vagueness of their contentions, one can sense two major schools of thought: whereas one supports the hypothesis that there is ‘some link’ between the sounds of the word and its meaning, the other strongly believes that the association is arbitrary. Interestingly enough, scholars advocating the non-arbitrary relationship do not discard the arbitrariness principle altogether, and scholars advocating the arbitrariness principle do acknowledge instances of systematic sound-meaning mappings. Rarely do we find researchers who adopt the strong version of either claim.

The oldest debate record on form-meaning correspondence dates back to the Greek philosopher Cratylus, a disciple of Heraclitus^{Footnote 2}, who argued for a natural bond between the form of the word and its meaning, probably due to his conviction that names have divine origins. Researchers adopting this line of thought have been intuitively motivated by a number of observations: Some words are onomatopoeic (e.g., cats meow and dogs bow-wow); some words that begin with certain sound clusters (e.g., English/gl/ for light, vision, etc.) communicate certain meanings (Bergen 2004); certain sounds are always available when expressing certain concepts, ideas, etc. (e.g., /s/ for sliding snake-like movement); etc. Although influenced by Cratylus, Socrates later rebuked his teacher’s philosophy of language, advocating the soundness of the counterevidence brought forth at the time, thus concluding that language itself is philosophically inferior to the study of matter. Thereafter, Plato and Aristotle denied the link altogether, further endorsing the arbitrariness principle, believing that the connection between language and objects is not natural, but socially (and probably culturally) determined. The split in Greek philosophy on the subject matter is therefore a natural ramification of the larger division between the pre-Socratic and the post-Socratic philosophies^{Footnote 3}. Hence, Socrates has been credited by many contemporary scholars as the founder of Western philosophy, which has incorporated stoicism, epicureanism, cynicism, skepticism, and other schools of thought, with the philosophy of religion gaining more prominence over religious philosophy (for details, see the methodology section below). Amidst the outbreak of the industrial revolution, Locke (1690) recharged the principle of arbitrariness as one of the foundations of human understanding, thus becoming a significant source of empiricism in contemporary Western philosophy.

Recently, the debate has continued, with prominent scholars divided unevenly between the two camps. Whereas the majority adopt the arbitrariness principle, very few still believe in one-to-one sound-meaning correspondence. For example, Sapir (1929), the American anthropologist and linguist, never ruled out a symbolic relationship between sound and meaning and regarded instances of correspondence as tangible pieces of evidence of a universal language system. By contrast, Ferdinand de Saussure, the Swiss philosopher and linguist, rejected sound symbolism altogether when suggesting a clear-cut divide between the signifiant (signifier) and the signifié (signified). His first principle of semiotics was the lack of correspondence between the signifiant and the signifié (1916/1959, p. 67). Since then, it has become axiomatic for structural linguists that language is arbitrary in the sense that there is no mandatory correspondence between the sounds of the language and the concepts they stand for. In most textbooks that present the debate to language students, the form-meaning relationship is viewed as a socio-cultural practice whose tools (mainly sounds/letters, and words) are no more than social conventions. In other words, to the overwhelming majority of contemporary linguists, the sounds of the language are not naturally given but socially practiced. In Wittgenstein’s words, symbols (irrespective of size) are dead until they are put into practice (see Wittgenstein 1953[2009], p. 432; 1974). However, in Reichard’s (1944; 1950) words, sounds have power. The debate is then whether sounds have intrinsic power or are externally empowered.

The arbitrariness principle adopted by the majority of modern structural linguists is corroborated by at least two difficult-to-refute observations. First, people cannot tell the meaning of a word by just hearing it. Second, different languages use different words for the same object (see Locke 1690). Most contemporary linguistic research has since been framed by the basic premise that the relationship between the sounds of words and their meanings cannot be stated absolutely and logically. Evidence to the contrary has been a real challenge to the very few who still see some ‘logical/reasonable’ relationship between forms (i.e., sound/phonaestheme/syllable/morpheme/word) and their meanings.

More recently, however, an emerging line of research in other fields of inquiry (viz. experimental psychology and anthropology) has started to bring forth some pieces of evidence challenging the arbitrariness principle strongly advocated in language philosophy research (cf. Köhler 1929[1970]; D’Anselmo et al. 2019; Tzeng et al. 2017; Padraic et al. 2006; Padraic et al. 2011; 2014). In this arena, sound symbolism has gained support over arbitrariness (for a review of the accumulating research on sound symbolism, see Nuckolls 1999). For instance, Köhler found that people associate certain pseudo-words (viz. Takete and Maluma) with certain shapes (viz. spiky objects and curvy objects, respectively). The link became known as the Takete-Maluma phenomenon, later called the Bouba/Kiki effect. To test the claim, some researchers (e.g., Sučević et al. 2013) have demonstrated that sound symbolism is a pre-semantic phenomenon, thereby emphasizing the possibility of a shape-sound correspondence. What this basically means is that sounds do have intrinsic qualities (see Reichard 1944; 1950). Research has also shown that people across cultures are sensitive to form-meaning mappings (see Bolinger 1950; Jackson and Waugh 1997; Bremner et al. 2013). To see whether the phenomenon is innate or acquired, Bremner et al. (2013), for example, conducted experiments with a remote community (the Himba of northern Namibia) that had little exposure to other cultures and little environmental influence. Results supported some symbolism patterns. Analyzing data from 229 different languages, Ciccotosto (1991) showed that patterns of sound-shape matching do exist globally. Findings showed that although people from various cultural backgrounds demonstrated different perceptual styles, they all showed universal correspondence between sound (audition) and shape (vision), matching what they hear (though meaningless to them) with shapes they already know well.
Furthermore, findings by several other renowned anthropologists have confirmed that the phenomenon is more prevalent in traditional, rural, agrarian societies than in secular, urban, industrial ones. On the psychological front, Maurer et al. (2006) showed that such correspondences are found even in two-and-a-half-year-old toddlers. All in all, the debate outside purely linguistic circles is not about the presence (or lack thereof) of sound symbolism, but rather the intensity (i.e., the amount and patterns) of the phenomenon. Hence, languages vary in terms of how much each has preserved natural correspondences in its component inventories (Day 2004; Nuckolls 1999).

A sizable portion of research examining the Bouba/Kiki effect (e.g., Chen et al. 2016; Chen et al. 2021; Chow 2018; Graven 2019; Schmidtke 2014) has approached the relationship from diverse angles, such as word shape (Ković et al. 2010), toddlers vs adults (Brown 1980; Maurer et al. 2006; Kantartzis et al. 2009; Imai et al. 2008; Walker et al. 2010; Imai and Kita 2014), and sound taxonomy (Janković and Marković 2001; Janković, Vučković, and Radaković 2005; Ramachandran and Hubbard 2001; Westbury 2005; Maurer et al. 2006; Nielsen et al. 2011). Language typology and linguistic taxonomies (e.g., ideophones) based on iconicity were also attempted by a number of researchers (see Perniss et al. 2010; Lockwood and Dingemanse 2015). More recently, computational models that aim to facilitate language learning by mapping form onto meaning have been attempted (see Gasser 2004; Padraic et al. 2011, 2006).

Furthermore, influenced by research findings in neuroimaging, which basically aim to find where certain pieces of information, and the mechanisms for comprehending and producing them, are localized in the brain (e.g., Pesenti et al. 2000), research in experimental psychology (e.g., Westbury 2005; Perniss et al. 2010; Dingemanse et al. 2015; Lockwood and Dingemanse 2015) has revealed interesting findings that should be of utmost interest to linguists and students of language. For example, it has been shown that a distinction should be made between two types of consonants: “sharp” consonants such as /k/, /z/, /r/, /ʧ/, /ʃ/ and “soft” consonants such as /m/, /l/, /b/, /v/, /n/, resulting in a taxonomy of sharp-sounding words vs soft-sounding words. Field experimentation in this area has shown that sharp-sounding words (e.g., kiki) are more efficiently processed when presented within spiky frames, while soft-sounding words (e.g., bouba) are more efficiently processed when presented within curvy frames. Research has also drawn a line of demarcation between soft sounds (e.g., front vowels, fricatives) and harsh sounds (e.g., back vowels, stops). To illustrate, part of Westbury’s (2005) experimentation has shown that whereas non-continuant sounds (e.g., stops) are recognized faster in spiky frames, continuant sounds (e.g., nasals) are recognized faster in curvy frames. Some researchers have even compiled a list of ‘ugly words’ (e.g., regurgitate, fetid, hoist, and foist) that have the power of synesthesia, where certain sounds, melodies, musical notes, etc., trigger specific visual or emotional responses. Research on a phenomenon called word aversion (e.g., Thibodeau et al. 2014; Malady 2013; Shah 2015) has also substantiated the form-meaning correspondence (Friedrich 1979; Nuckolls 1999; Perniss et al. 2010; Dingemanse et al. 2015; Lockwood and Dingemanse 2015).
For example, Thibodeau’s (2016) research was focused on just one word, “moist,” as a case study, since it appears to garner the strongest feelings of aversion among the American public (Thibodeau 2016). Although his argument seems to discredit the phonological explanation, he himself found a difference between words that were avoided because of their semantic connotations, such as taboo words (e.g., nigger, incest), vis-à-vis those that were avoided due to their phonological make-up/ properties/ features, etc. (e.g., hoist, foist). The phenomenon is attested in all languages where people in common parlance self-report aversion, particularly, of a handful of sets of words whose pejorative tones indicate their meaning.

To understand the implications of the form-meaning mapping for language learning, Stokoe (1991), Armstrong et al. (1995), and Perniss et al. (2010), among others, have demonstrated that many concepts in sign languages are iconically represented based on the salient features of real-world objects and events. Research to this end has revealed that form-meaning mappings foster language acquisition and early language development. Brown (1980), Maeda and Maeda (1983), and Imai et al. (2008) concurred that Japanese children acquire words that are form-meaning mapped faster than words that are not. Monaghan et al. (2015) confirmed that sound–meaning mapping “is more pronounced in early language acquisition than in later vocabulary development”. Developmental research (e.g., Rizzolatti and Arbib 1998; Stokoe 2001) has stressed the advantage of learning iconic/symbolic forms over arbitrary ones. To the psycholinguist, symbolic form-meaning mappings facilitate language processing (comprehension, production, and acquisition); hence, iconic/symbolic words tend to be learned very early by children and with less cognitive processing effort by foreign language learners (see Maeda and Maeda 1983). The phenomenon has been confirmed by research findings on sign languages. Perniss et al. (2010) unequivocally state, “if research on language had started with sign languages, the picture would look quite different—iconicity, rather than arbitrariness, would be heralded as the fundamental feature of linguistic forms”. The question that arises here is: how do such research findings in the other fields of inquiry inform structural linguists?

Language symbolism in language-related research

Founders of modern structural linguistics (e.g., Noam Chomsky) recognize that, given the poverty of the stimulus evidence, language would be impossible without some innate knowledge. Accordingly, proponents of the Chomskyan innateness hypothesis argue for two levels of information that should work in concert for any language to be learnable: phonetic (allophonic) information and phonemic information, corresponding roughly to Swadesh’s (1971, p. 179) distinction between an instinctive-intuitive level characteristic of animal communication and a conventional level exclusive to human communication (for details, see Burling 2005). Whereas allophonic information is the work of nature and is therefore rule-governed (i.e., predictable), phonemic information is the work of convention and is therefore idiosyncratic (i.e., unpredictable). It has become axiomatic, even to the novice student of language, that the divide between allophonic information and phonemic information is difficult to discard in any articulatory system of human communication. The reasoning runs as follows: were the information all allophonic, cross-linguistic variation could not emerge, and the system would be too naïve and inefficient for complex communication (very much like that of lower-order creatures); alternatively, were the articulatory system all phonemic, patterns would be impossible to find, relegating the process of language learning to discrete unit-by-unit intake. Wired with already available ‘built-in’ allophonic information, language learners (at a very young age) make use of phonemic information to figure out the boundaries of each language on its own. Only when the two types of information are intertwined can the system become complex enough for humans to communicate efficiently. Given the distinction, the questions that would be of utmost concern to the linguist are:

Is the form-meaning mapping (if any) limited to small sets of words or generalizable across the whole word inventory?
How much phonemic information in the articulatory system is symbolic?
Why do languages not demonstrate iconicity/symbolism with the same intensity?
Which words are more likely to be iconic in all languages?
Which linguistic component/element/constituent (e.g., feature, sound, syllable, sub-morpheme, morpheme, or whole word) is at the core of symbolic mapping?
Are the psycholinguistic properties of symbolism/arbitrariness measurable?
Etc.

The present paper focuses on some specific research questions outlined above and elaborated on below.

Research problem

As researchers from various fields of inquiry still hold conflicting (and sometimes blurred) views on sound-meaning mapping, the problem merits further investigation from alternative perspectives. In the current linguistic endeavor, we hold the conviction that relevant previous studies were flawed in terms of design and analysis. On the design front, comparisons were made between studies whose inputs were incongruent. Although different studies targeted different linguistic components (e.g., feature, sound, syllable, sub-morpheme, morpheme, or whole word), comparisons were made across the board. It is arguably unsound to compare the findings of a study targeting, for example, phonaestheme (e.g., /gl/) symbolism with those of another targeting sound, syllable, or word symbolism. On the theoretical front, the findings are ambiguously interpreted as they are still not theoretically well-grounded, i.e., it remains unsettled whether words come from a divine source (as in the religious view)^{Footnote 4}, are naturally symbolic (as in the “Bow-Wow” Theory, the “Pooh-Pooh” Theory, etc.), are socially constructed (as in the interactionist view), or are genetically grounded (as in the innateness hypothesis).

To find which of the various forms are (not) symbolic, the current investigation is a case study that targets one specific ‘entity’ for which each language has one (and only one) word. The word we choose to experiment with is the one whose referent is the “clear liquid, without color or taste, which falls from the sky as rain and is necessary for animal and plant life” (Cambridge Dictionary). It is water in English; it is maa’ in Arabic; it is maim in Hebrew; and it is eau in French, etc. The mind-bending question is: Which of these words indeed fits this referent naturally, given the scientifically undisputed fact that each molecule of it consists of two atoms of hydrogen (H) joined to a single atom of oxygen (O) (pronounced in Arabic as هو)?

Theoretically, the current paper addresses the form-meaning mapping by adopting the strongest view of the innateness hypothesis; hence, the research is linguistically oriented. However, before turning to the main body of the research reported below, attention should be drawn to some intricacies of the subject matter.

First, language innateness, advocated by the father of modern linguistics, Chomsky, entails that ‘symbolism’, irrespective of size, is biologically/genetically motivated. Adopting the evolutionary view of language, Hauser, Chomsky, and Fitch (2002, p. 1574) state that “the available data suggest a much stronger continuity between animals and humans with respect to speech than previously believed”. If true, an enzyme that produces a chemical reaction should then be activated when the form corresponds with the meaning. This means that the sound-symbolic substrate is universal, i.e., available to all people, irrespective of the language they speak. Interestingly, the facial feedback hypothesis (see Buck 1980; Strack et al. 1988) substantiates the claim. Besides, the form-content linkage in animal communication supports the contention that the codes are inherited biologically/genetically. Hence, a perfect code requires that only one signal be used in a given context and that a context evoke only one signal. For researchers who choose to investigate the contention in future work, the real challenge is to find out which component (e.g., feature, sound, syllable, sub-morpheme, morpheme, whole word) is the real code.

Second, innateness is enhanced by naturalness. What this means is that sound symbolism (such as that available in ideophones) would be widespread in the so-called ‘primitive languages’ (Sapir-Whorf hypothesis). Results of a sizeable portion of research (e.g., Doke 1935; Newman 1968; Samarin 1971; Wescott 1977; Childs 1994; Alpher 1994) have shown that sound symbolism is less attested in Western European languages (vis-à-vis African, Asian, and indigenous Australian, North and South American languages). This is likely because form-meaning overlap tends to erode over time. Claimed to be relatively non-primitive, the Indo-European languages are known to have sparse examples of iconic words. Probably for this reason, and because linguistic research has been Eurocentric, arbitrariness has been the mainstream view.

Until quite recently, one gap in form-meaning mapping research was that it had been conducted mostly on European languages whose iconic word inventories have not survived the test of time. To fill the gap, research has been conducted sporadically on languages outside the Indo-European language family, such as the indigenous languages of South America (Nuckolls 1996), Balto-Finnic languages (Mikone 2001), Southeast Asian languages (Diffloth 1972; Watson 2001), sub-Saharan African languages (Childs 1994), Australian Aboriginal languages (Alpher 2001; McGregor 2001; Schultze-Berndt 2001), Japanese, Korean, etc. However, little, if anything, has been done on ancient language families such as the Semitic languages (e.g., Hebrew, Arabic, Aramaic), which are believed to have more remnants of the so-called prototypical language. The research reported below focuses on the one word that these languages use for water, paying special attention to the Arabic sounds/letters making up the word, which is context-independent, namely H2O (Arabic هو). Therefore, the focus will be two-fold: sound-meaning mapping and shape-meaning mapping.

Contextualizing the current research

Arabic is one of the languages, often described as a major vehicle of culture and learning, that has influenced other languages across the globe; it is also the liturgical language of approximately 2 billion Muslims. Unlike Indo-European languages, which are believed to be iconically impoverished, Arabic is rich in iconic words. It is, therefore, worthy of investigation for a number of reasons. First, although Arabic is classified as a Central Semitic language along with Hebrew, Aramaic, Maltese, etc., it is often recognized as the one that has maintained the most traces of Proto-Semitic. Recall that much of the work on form-meaning mapping has been driven by the desire to trace the origins of language and, therefore, reconstruct what is believed to be a prototypical language (see Armstrong 1983). Second, the writing system (orthography) of Arabic is intrinsically calligraphic in nature, viz., meaning is conveyed through calligraphy where the letters are shaped into an actual form of a thing, animal, plant, etc. (see, e.g., Cabrera’s (2012, pp. 121–122) demonstration of the phonetic configuration of the etymon *puti). Notably, the orthography of most Semitic languages, including Arabic, employs a type of alphabetic script that is predominantly consonantal, likely because consonants are considered the primary carriers of meaning. Interestingly, findings of some research have shown that synaesthetes, for example, were more successful in reporting form-meaning mapping for Arabic numerals than for the corresponding Roman ones (see Ramachandran and Hubbard 2001). Third, the morphology of Arabic is mostly nonconcatenative, viz., it uses specific templates to derive words rather than attaching prefixes and suffixes to word beginnings and endings (Hassanein 2023). This is still practiced when borrowing words from other languages. Loan words respect the phonotactic constraints of Arabic, a phenomenon known as Arabicization among traditional Arab grammarians.
All in all, it is assumed here that Arabic words are more likely to evoke form-meaning mappings in terms of both sound-meaning mapping, i.e., phono-mimic association, and shape-meaning mapping, i.e., pheno-mimic association (see Cabrera 2012).

Methodology

Although religion and philosophy both lend a hand in explaining the nature of knowledge, each begins with a different presupposition. Whereas religion requires that faith be granted for what is said, philosophy questions everything that is said. Since the time of Socrates, two perspectives have competed: religious philosophy vis-à-vis philosophy of religion. Whereas the former requires faith upfront, the latter discards faith altogether in search of tangible pieces of evidence chained through logical reasoning. As a student of language engrossed in current Western philosophy, I adopt the latter perspective. The primary motivation is that research in this area still yields inconclusive (and sometimes contradictory) findings. The reasons are surely miscellaneous, including, but not limited to, study design, data collection, interpretation of the results, and above all, the theoretical framework within which the studies are couched. Part of the problem, we believe, is that research on whether form-meaning mapping is symbolic or arbitrary has not been properly approached, particularly in that words are considered on a one-size-fits-all scale of evaluation, a state of affairs that has fragmented findings and made comparisons unreliable. Of theoretical significance here is the adoption of a theoretical framework in light of which the findings are interpreted. Concisely, I will try to examine the symbolic meaning of the linguistic form that fits its referent by virtue of the sounds it is made of, by putting forward a stronger version of Chomsky’s innateness model of analysis, the framework to which we turn attention in the following subsection.

Suggested theoretical framework: basic tenets

The main argument here is that for research findings in form-content linkage to be consistent and for comparisons to be safe, the vocabulary of LANGUAGE should be cross-classified as we cross-classify words for semantic, morphological, or syntactic reasons. One division of labor should be made, for example, between early-acquired words and later-acquired words. Research has already shown that whereas early-learned vocabulary tends to be more iconic, later-learned vocabulary sacrifices iconicity disproportionately (see Monaghan et al. 2014). Capitalizing on research findings such as this one, we propose that a line of demarcation should be drawn between two types of words: real words vs fake words. At the core of almost every previous attempt made so far is the argument that both systematicity and arbitrariness are acknowledged simultaneously, yet disproportionately. Whereas a small portion of the vocabulary shows absolute iconicity, the remaining bulk entertains varying degrees of arbitrariness. While it is often assumed that iconicity erodes over time, form-meaning mapping has demonstrably persisted in a select set of words, which we term here real (i.e., genuine) words. Building on the notion of a non-arbitrary connection between the form or sound of a sign and the concept it represents, this study proposes that language may preserve iconic relationships that extend beyond mere social convention. However, the extent of this relationship varies across languages. For instance, the Arabic word for “dog” (كلب /kalb/) retains a sound–form connection that is arguably more evocative of its referent than its English counterpart, suggesting that certain languages preserve iconicity more robustly. This observation has motivated the proposal of a novel dichotomy between real and fake words, distinguished by the degree to which their phonological shape and sound continue to embody this inherent symbolic relationship.

Accordingly, the hypothesis we put forward here is that sound-meaning correspondence can be straightforwardly established for real words, but the task, though not impossible, is more complicated for fake words. One criterion that can be used to determine which word is real and which is fake involves exactitude. To illustrate, whereas real words are used denotatively (and therefore have only one meaning), fake words are used connotatively (and therefore have multiple meanings, concepts, etc.). One piece of evidence comes from research on translation, where word connotations, metaphors, etc., pose a greater challenge to the interpreter (al-kuran 2024; Alsharif et al. 2026). For example, the English expression “viewpoint” is fake simply because it is used generously for both “what you see” and “what you hear”. In fact, it would be more truthful if the expression were only used to refer to what you see. Had this been the case, the language would have then sanctioned the use of another expression (e.g., “earpoint”) for what you hear, to distinguish between the interlocutors’ visual and auditory experiences. To illustrate, an artist would paint a landscape differently had he heard about it rather than seen it with his very own eyes. Being loosely used, the expression “viewpoint” is unlikely to convey meanings (and therefore resemblances) related to both the eye and the ear simultaneously. Of theoretical significance for the argument we make here is that the expression ‘viewpoint’ falsely maps the word to its meaning because it violates Grice’s maxims of cooperative conversation (see discussion below). In current terms, the expression sacrifices systematicity, which is definitely a necessary condition for bootstrapping early word learning (see Thompson et al. 2012) and for enhancing second/foreign language acquisition. To illustrate, a learner of English as a second language, whose native language distinguishes between two views, each entertained by one of the body organs, finds the English expression “viewpoint” bogus, and therefore difficult to retrieve (particularly at early stages of language learning) when trying to express a point of view distinctively. This is because the user has to learn a language-specific (i.e., English) rather than language-independent (viz. LANGUAGE) term.

Suggested theoretical framework: machinery

In his theory of implicature and the cooperative principle of conversation, Paul Grice, the British philosopher of language, argues that a line of demarcation should be drawn between ‘natural meaning’ and ‘non-natural meaning’. Complementarily, we suggest that a distinction be made between real words and fake words. The argument runs as follows: whereas real words denote ‘natural meaning’, fake words denote ‘non-natural meaning’. Grice put forward four maxims against which the distinction can be weighed. At the level of detail considered here, we argue that real words are those that do not violate Grice’s maxims. To illustrate, Grice (1957 [1989]) suggests that people can achieve maximally effective communication by not violating any of the following maxims: quantity, quality, relation, and manner. Applying these terms to the current investigation, we argue that fake words violate:

The maxim of quantity, which requires giving as much information as needed, and no more;
The maxim of quality, which requires not giving information that is false or not supported by evidence;
The maxim of relation, which requires giving only relevant information; and
The maxim of manner, which requires being clear, brief, and orderly to avoid obscurity and ambiguity.

Real words, on the other hand, do not flout any of these maxims, thus creating one-to-one sound-meaning correspondence. Inherent in this line of reasoning is the belief that iconicity is gradient, i.e., the amount of iconicity in a word corresponds with (1) how many of these maxims are flouted, and (2) how much of each is violated.

By not flouting the maxims, real words, we argue, are those whose referents embody valid concepts (or truth value in Gricean terms), and fake words are those whose referents embody false concepts. Inherent in this line of reasoning is that concepts themselves can also be real or fake: whereas real concepts are biologically given in the human brain (i.e., natural), and are thus context-independent, fake concepts merely emerge from social practices (i.e., non-natural) and are therefore context-specific. Examples of real words include onomatopoeic words, ideophones in classical and remote languages, and words that include phonosemantic associations at the sound level (certain sounds are always available in certain concepts), such as sub-morphemic units that are associated with certain meanings (e.g., English /gl/). The proposition is that a word is real iff (if and only if) it is linked with only one concept, so that the correspondence is one-to-one. The other two correspondences outlined below should not be sanctioned; in them, the form-content linkage flouts the maxims and thus becomes fake:

one-to-many correspondence: a word that is linked to more than one concept
many-to-one correspondence: two or more words that are linked to one concept
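The three correspondence types can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `classify_correspondences`, the toy lexicon, and the capitalized concept labels are hypothetical and are not part of the framework itself; the sketch simply counts how many concepts each word is linked to and how many words each concept is linked to.

```python
from collections import defaultdict

def classify_correspondences(pairs):
    """Label every word by the kind of word-concept mapping it takes part in.

    `pairs` is an iterable of (word, concept) links. All names and labels
    here are illustrative assumptions, not the author's own notation.
    """
    concepts_of = defaultdict(set)  # word -> concepts it is linked to
    words_of = defaultdict(set)     # concept -> words linked to it
    for word, concept in pairs:
        concepts_of[word].add(concept)
        words_of[concept].add(word)

    labels = {}
    for word, concepts in concepts_of.items():
        if len(concepts) > 1:
            labels[word] = "one-to-many"   # one word, several concepts
        elif len(words_of[next(iter(concepts))]) > 1:
            labels[word] = "many-to-one"   # several words, one concept
        else:
            labels[word] = "one-to-one"    # the only mapping counted as real
    return labels

def is_real(word, labels):
    """A word counts as 'real' iff its mapping is one-to-one."""
    return labels[word] == "one-to-one"

# Hypothetical toy lexicon built from the examples discussed in the text.
toy_lexicon = [
    ("water", "WATER"),
    ("viewpoint", "WHAT-IS-SEEN"),
    ("viewpoint", "WHAT-IS-HEARD"),
    ("sofa", "SEAT"),
    ("couch", "SEAT"),
]
labels = classify_correspondences(toy_lexicon)
```

On this toy input, `labels` marks "water" as one-to-one, "viewpoint" as one-to-many, and "sofa" and "couch" as many-to-one, so only "water" passes `is_real`. The sketch treats the criterion as categorical; it does not model the gradience of 'fakeness'.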

This is one reason why we choose to experiment with the word whose referent is ‘water’, since only one word is used to denote the concept in almost all languages. However, it should be made clear in advance that the ‘fakeness’, so to speak, of a word is gradient, as it is measured by how much it flouts these maxims. For example, whereas the word ‘moon’ is a concept (a celestial object), “the moon” is a thing (the celestial object that orbits our planet, which uniquely harbors life). The difference between the two is this: whereas true concepts are antenatal in the human mind (named) to describe, explain, and capture reality as it is, false concepts are created (named) to describe, explain, and capture reality as people understand it. It is therefore a nature-nurture trade-off. Of theoretical significance here is that

figurative language (including metaphors, personification, hyperbole, allusions, and idioms) is always fake;
words that have a strictly literal meaning are real;
similes are real.

Building on Grice’s theory of implicature and the cooperative principle, this study proposes that Real words align with Grice’s maxims of quantity, quality, relation, and manner, thereby ensuring a one-to-one correspondence between form and meaning. Fake words, by contrast, flout these maxims, producing non-natural meanings that are context-dependent, ambiguous, or conceptually unstable. This dichotomy not only grounds the analysis of linguistic iconicity in a pragmatic framework but also demonstrates that the degree of “fakeness” in a word can be measured by the extent to which it violates these maxims. In this way, the proposed distinction extends Grice’s insights on communication to the level of lexical structure, offering a novel perspective on the interplay between sound, meaning, and truth conditions in language. This distinction is illustrated by the concept of water, which is consistently denoted by a single form across languages, thereby preserving its status as a real word. By contrast, words with multiple competing forms or extended figurative uses exemplify fake words, since they flout the Gricean maxims by introducing ambiguity and non-natural meaning.

Thesis statement: The form that fits the referent (water) is HO (sounding HO, drawn as هو in Arabic).

Discussion

Form-content analysis of /HO/: هو

If we observe the symbol which stands for hydrogen (H), we find that its drawing (H) consists of two identical parallel lines connected exactly in the middle. At the symbolic level, the two lines are the two identical atoms of hydrogen, and the line connecting them is the construction process (viz., the horizontal connection between them). The drawing of the /h/ shape in English (specifically in upper case) is, we reckon, far from being accidental. As ancient languages used symbols to encode specific meanings, such symbolic relationships have been preserved in their written forms, which are definitely less prone to alteration. In simple terms, as the form of the letter embodies a concept, ancient nations managed to show the mandatory relationship between the symbols and the concepts they embody, thus drawing the sound (or combination of sounds) as a letter in the way(s) that symbolically expressed what that nation ‘saw’ as its true meaning(s). For instance, the /h/ sound was drawn in two ways in the ancient Egyptian hieroglyphic language, viz.:

The first is a house or shelter, while the second is something like a coil of wire that transmits light or energy (a wick); it is the source of energy. In more symbolic terms, both drawings express the process of creating and connecting, together representing encompassing and infinity.

The drawing of the same sound is not arbitrary in other classical languages. In ancient Hebrew, for example, it was drawn in the form of someone raising his hands to the sky, calling on a great power (see Figure below)^{Footnote 5}. Interestingly, the sound, which is the fifth letter in the Hebrew alphabet, is represented in more than one shape to embody its meaning(s).^{Footnote 6}

Given the various orthographical representations of the letter in modern Hebrew, it is drawn in the following forms:

The two parallel lines representing the construction process are present, and the connection is demonstrated in various ways.

The pictographs of the letter in classical, as well as modern, Arabic, embody its meaning, too. It is represented in several shapes according to its position in the word, as in the following illustrative figure:

To illustrate, the Arabic word for the hoopoe is هدهد. The head of the hoopoe (with its eye, beak, and crown) reflects the shape of the letter itself (هـ). At the beginning of the word هدية (gift), the letter is drawn as a gift being wrapped beautifully (the gift with its surrounding envelope). In the middle of the word كهف (cave), the symbol represents the shape of a cave. At the end of the word مياه (water), the sound stands alone as a circle, an encirclement which indicates femininity and, therefore, infinity. This is probably why in ancient cultures the shape of the sound is used to depict the so-called ‘Eye of Horus’:

If we examine it closely, we find that the concept is reflected in the drawing itself. This eye symbolizes the constant observation of the universe. However, the One who is always in charge of the universe is also ‘He’ (Arabic: هو, though with a slightly modified pronunciation), represented by the sound that expresses inspiration, the beginning of creation, breathing, and encompassment. At the symbolic (and allegorical) level, /h/ is a reference to the creator who breathes into his creation.

As for its production in the vocal tract, the /h/ sound is produced in the larynx, pairing only with the glottal stop /Ɂ/, not only in terms of place of articulation but also in terms of voicing, since both are voiceless sounds (i.e., produced with no vocal fold vibration). However, contrary to /Ɂ/, which is produced by totally constricting the flow of air, /h/ is produced by letting air escape (breath). What this means is that although both are voiceless laryngeals, one blocks the passage of air (allegorically: diabolically black), and the other opens it (allegorically: celestially white). Interestingly, some linguists argue that both sounds were originally one phoneme. For example, Lehmann (1993) put forward the hypothesis that there had once been an h-like laryngeal sound that was originally two separate sounds: a glottal stop [Ɂ] and a glottal fricative [h]. Beeks (1995) suggested that the glottal stop is one variant of /h/. In more technical terms, whereas /h/ is a phoneme in its own right, /Ɂ/ is just an allophone of /h/ that is realized in some specific environment(s). In the Nawdm language of Ghana, the glottal stop is written ɦ, capital Ĥ.

Like other sounds produced towards the back of the mouth (e.g., uvulars and pharyngeals), laryngeals gradually disappeared in many Indo-European languages. The “missing” sounds are mostly consonants of an indeterminate place of articulation towards the back of the vocal tract. The laryngeal theory posits three laryngeal phonemes in the proto-Indo-European language: *h₁, *h₂, and *h₃. Some proponents of the theory suggest that proto-Indo-European schwa /ə/ was originally a consonant (namely a glottal stop), not a vowel. Linguists such as Martin Kümmel argue that the voiceless uvular fricative [χ] is a remnant of one type of /h/ in some Iranian languages.

The prevailing status of laryngeal /h/ in ancient languages is unquestionable. For example, /h/ in Proto-Indo-European is still evident in H coloration and H loss in many daughter languages. The claim has been made within the framework of the laryngeal theory that the Greek vowels were derived through vowel coloring and H-loss from Proto-Indo-European, with three variants of /h/, constituting the so-called triple reflex. In some other languages (e.g., Sanskrit, Avestan), certain long vowels are pronounced as two syllables. The laryngeal theory states that the phenomenon can be explained as a reflex of contraction following a hiatus caused by the loss of intervocalic /h/ (viz. the sequence VHV becoming VV).
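The two processes named above, coloring followed by loss, can be sketched as ordered string rewrites. This is a minimal sketch under assumptions not stated in the text: it adopts the common textbook values in which *h₁ leaves an adjacent *e unchanged, *h₂ colors it to a, and *h₃ colors it to o; it writes the laryngeals in ASCII as h1, h2, h3; it ignores compensatory lengthening; and the input strings are schematic, not claimed reconstructions.

```python
import re

# Assumed textbook coloring values (not given in the text itself):
# h1 is neutral, h2 is a-coloring, h3 is o-coloring.
COLORING = {"h1": "e", "h2": "a", "h3": "o"}

def color_vowels(form):
    """Recolor an *e that stands directly before or after a laryngeal."""
    for laryngeal, vowel in COLORING.items():
        form = form.replace(laryngeal + "e", laryngeal + vowel)
        form = form.replace("e" + laryngeal, vowel + laryngeal)
    return form

def drop_laryngeals(form):
    """H-loss: delete every laryngeal, so that VHV surfaces as VV."""
    return re.sub(r"h[123]", "", form)

def reflex(form):
    """Apply coloring first, then loss; the order of the two steps matters."""
    return drop_laryngeals(color_vowels(form))

# Schematic inputs, for illustration only.
examples = {form: reflex(form) for form in ["h1es", "h2ent", "h3ed", "eh1e"]}
```

Here `reflex("h2ent")` yields "ant" and `reflex("eh1e")` yields "ee", the VHV to VV hiatus mentioned above. Running loss before coloring would erase the conditioning environment, which is why the sketch fixes the order.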

The loss of [h] played a role in the early development of the Indo-European languages, and comparable processes are attested outside the family. For example, in Maltese, a Semitic language, /h/ continued to be realized as an independent phoneme until the 19th century. When it was lost in most positions, the adjacent vowel was lengthened. Likewise, in Tagalog, an Austronesian language, /h/ sometimes merges into an immediately succeeding vowel. What this basically means is that, unlike almost all other sounds, /h/ has been undergoing a process of demotion, relegating its status to only secondary articulations. The question that arises here is this: Why has the phonemic status of /h/ been demoted over the course of time, particularly in Indo-European languages?

As a secular civilization that rejected all beliefs unconfirmed by empirical evidence, ancient Greece rejected the entire religious heritage^{Footnote 7}. Unlike the Egyptians, who were keen to enforce religious laws (religious philosophy), the Greeks started conducting their affairs based on worldly considerations (philosophy of religion). Chiefly associated with naturalism and materialism, secularism rejects any consideration of immaterial or supernatural deliberations. The Greeks developed a system of right and wrong apart from any religious or supernatural considerations. By enforcing the restriction of religion in public spheres, the state contributed to limiting all reference to /HO/ (هو), which is, at the symbolic level, a reference to the creator who breathes (H) into his creation. As a sign of secularism, the Greek language, we reckon, dropped this sound altogether from its phonemic inventory, relegating it to a very marginal role, such as a secondary movement that is not written on the line with the other main sounds. What is noteworthy here is that to propose a physical world without supernatural interventions, Greek linguists, philosophers, and theorists opted for using letters that do not correspond one-to-one with their symbolic meaning(s), thus distorting the relationship between sound and meaning. Since then, the search for the correspondence between the two has been unsuccessful except for very few remnants, a state of affairs that contributed to adopting linguistic theories that advocate arbitrariness. Therefore, the symbol representing this sound is no longer displayed perfectly in the shape of the original, which was H-like, yet it is still produced as rough breathing.

What is even more stimulating is the alphabetization of /h/. How letters are typically collated in the alphabet is a phenomenon that deserves serious investigation on its own. To illustrate, if sounds/letters were collated phonetically, /h/ should emerge as the first letter in the system, since it is the sound that is exclusively laryngeally executed. What is striking is that whereas in classical languages (such as Hebrew and Arabic) /h/ was collated fifth, it was placed eighth when the collation underwent Romanization. It is therefore the eighth letter in most European languages. The question that arises here is: what is the correct place of /h/ in the conventional ordering of an alphabet? In the most ancient type of segmental scripts, the abjad, vowels are not represented, and /h/ is collated fifth. In non-abjad systems where vowels are represented, /h/ is placed eighth. The earliest alphabet had a defined sequence because letters were assigned numerical values. In Arabic, for example, the practice is known as Hisab al-Jummal (Abjad numerals), in Hebrew as gematria, and in Greek as isopsephy. Assigning a numerical value to a name, word, or phrase by reading it as a number was common practice in those ancient societies for various spiritual reasons, such as seeking healing through incantations and amulets. All in all, /h/ was used to represent 5.
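The letter-to-number practice described here can be illustrated with a short sketch of Abjad numerals. It assumes the traditional eastern (Mashriqi) letter order and the standard values (units, tens, hundreds, then 1000); the function names `abjad_value` and `latin_position` are hypothetical helpers introduced only for this illustration.

```python
import string

# Traditional abjad order (eastern variant), assumed here for illustration.
ABJAD_LETTERS = [
    "ا", "ب", "ج", "د", "ه", "و", "ز", "ح", "ط",   # units: 1-9
    "ي", "ك", "ل", "م", "ن", "س", "ع", "ف", "ص",   # tens: 10-90
    "ق", "ر", "ش", "ت", "ث", "خ", "ذ", "ض", "ظ",   # hundreds: 100-900
    "غ",                                           # 1000
]
ABJAD_WEIGHTS = (
    list(range(1, 10))
    + list(range(10, 100, 10))
    + list(range(100, 1000, 100))
    + [1000]
)
ABJAD_VALUE = dict(zip(ABJAD_LETTERS, ABJAD_WEIGHTS))

def abjad_value(text):
    """Sum the numeral values of the letters; unknown characters count as 0."""
    return sum(ABJAD_VALUE.get(ch, 0) for ch in text)

def latin_position(letter):
    """1-based position of a letter in the 26-letter Latin alphabet."""
    return string.ascii_uppercase.index(letter.upper()) + 1

h_rank_abjad = ABJAD_LETTERS.index("ه") + 1   # position of h in the abjad order
h_rank_latin = latin_position("h")            # position of h in the Latin order
ho_value = abjad_value("هو")                  # value of the form HO
```

Under these assumptions the letter ه is fifth in the abjad order and carries the value 5, the Latin letter h is eighth, and the form هو sums to 11 (5 for ه plus 6 for و).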

Prevalence of laryngeals in the morphology of languages

Despite the demotion that /h/ has been undergoing in Greek and Latin daughter languages, a cross-linguistic investigation of sound inventories tells straightforwardly that no other sounds have impacted the morphology of all languages like the laryngeals /h/ and /Ɂ/. Influenced by the cultural prestige of Greece and Rome, all Indo-European languages kept much of the remains of Greek’s and Latin’s morpho-syntactic and morpho-semantic traces despite modifications. In English, for example, /h/ is still ubiquitous in pronouns (whether pronounced as a sound or just written as a letter). These include third person pronouns irrespective of gender and number (viz. he, she, they; him, her, them; his, hers, their(s); himself, herself, themselves), demonstrative pronouns (that, these, this, those), interrogative and relative pronouns (what, whatever, which, whichever, who, whoever, whom, whomever, whose), and archaic pronouns (thou, thee, thy, thine, and thyself). It is also still present in words with a different pronunciation (thing /θ/, though /ð/, each /tʃ/, ship /ʃ/, enough /f/), or dropped altogether in the so-called H-cluster reduction (naught, whether, rhyme, etc.).

As a phoneme, /h/ was also lost in Latin, the ancestor of all Romance languages. However, traces of the sound can still be attested in almost all these daughter languages, and in their lower and higher varieties. Processes of H-dropping (or aitch-dropping) are attested as distinguishing features between dialects. For example, in some accents and dialects of Modern English, H-dropping causes words like harm, heat, home to be pronounced [arm], [eat], and [ome], though in some dialects such hiatus is prevented (e.g., [h] in behind). Cases of H-dropping also occur in all English dialects when function words (e.g., he, him, her, his, had, and have) are used in their weak forms. Because the /h/ of unstressed ‘have’ is usually dropped, the word is pronounced /əv/ in phrases like should have, would have, and could have. These are spelled out in informal writing as should’ve, would’ve, and could’ve. The pronoun ‘it’ is itself a product of historical H-dropping, and the older form hit survives as an emphatic form in a few dialects of English (such as Southern American English and Scots). Dropping of /h/ from the cluster /hj/ is found in some American dialects, so that the word human is pronounced /ˈjuːmən/.

Some English loan words, especially those borrowed from French, may begin with the letter 〈h〉 but not with the sound /h/. Examples include heir and, in many regional pronunciations, hour and honest. In some cases, spelling pronunciation has introduced the sound /h/ into such words, as in humble, human, hotel, and (for most speakers) historic. French words without an /h/ (e.g., orrible, abit, armonie) underwent a process of h-addition, becoming horrible, habit, and harmony, simply because they were derived from Latin words that originally had /h/. H-addition was also applied to herb, even though it is still pronounced without /h/ (/ərb/).

H-dropping in English, for example, has created homophones, a sufficient reason to believe that the alternation was purposeful. Examples include: arm/harm (/ɑːrm, hɑːrm/), ardor/harder (/ˈɑːrdər/, /ˈhɑːrdər/), arc/ark/hark (/ɑːrk, hɑːrk/), art/heart (/ɑːrt, hɑːrt/), eat/heat (/iːt, hiːt/), at/hat (/æt, hæt/), ash/hash (/æʃ, hæʃ/), am/ham (/æm, hæm/), aft/haft (/æft, hæft/), ad/had (/æd, hæd/), axe/hacks (/æks, hæks/), and/hand (/ænd, hænd/), oboe/hobo (/ˈəʊbəʊ, ˈhəʊbəʊ/), old/hold (/əʊld, həʊld/), ohm/home (/əʊm, həʊm/).
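How such pairs collapse can be shown with a minimal sketch that deletes a word-initial /h/ from a transcription and then compares the results. The transcriptions are taken from the list above; the function names `drop_initial_h` and `merge_under_h_dropping` are hypothetical, and the rule is deliberately simplified to word-initial position only.

```python
def drop_initial_h(ipa):
    """Simplified H-dropping: delete /h/ only when it begins the word."""
    return ipa[1:] if ipa.startswith("h") else ipa

def merge_under_h_dropping(ipa_a, ipa_b):
    """True if two transcriptions become identical once initial /h/ is dropped."""
    return drop_initial_h(ipa_a) == drop_initial_h(ipa_b)

# Transcriptions copied from the pairs listed in the text.
PAIRS = {
    ("arm", "harm"): ("ɑːrm", "hɑːrm"),
    ("eat", "heat"): ("iːt", "hiːt"),
    ("and", "hand"): ("ænd", "hænd"),
    ("old", "hold"): ("əʊld", "həʊld"),
}

merged = {words: merge_under_h_dropping(*ipa) for words, ipa in PAIRS.items()}
control = merge_under_h_dropping("æt", "hiːt")  # at vs heat: not a listed pair
```

Every listed pair comes out as merged, whereas the control pair does not, because the two words still differ in their vowels after the /h/ is removed.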

Interestingly, H-dropping and h-addition are not linguistic phenomena exclusive to English. When researched cross-linguistically, it becomes evident that this sound (along with its rival /Ɂ/) is uniquely significant as a distinctive mark in the codification of language (or the process of conscious grammaticalization), especially in languages that are still major vehicles of the religious tradition, such as Semitic languages. One clear piece of evidence is the active presence of this sound in the codification of masculine and feminine forms in Arabic. Arabic, as well as its dialects, distinguishes between the feminine and the masculine word by the presence of an h-like ‘phonetic chunk’ at the end of the word (e.g., mu’alim ‘male teacher’ vs mu’alimah ‘female teacher’). Along similar lines, /h/ is a distinctive mark in the codification of definiteness in Hebrew. The noun in Hebrew is definite as long as /h/ is added at the beginning of the word. For example, the Hebrew word for ‘king’ is indefinite, but it becomes definite once /h/ is added at the beginning of it (melek vs hamelek).

All in all, historical and cross-linguistic evidence reinforces the claim that /h/ occupies a distinctive linguistic status, not only within Arabic and the wider Semitic family but also across other languages. Its consistent orthographic presence highlights a symbolic and structural weight that extends beyond sound alone. This resilience suggests that /h/ functions as more than a phonetic element, embodying a symbolic role that secures its special place within the linguistic system.

The problem of characterizing /h/

/h/ has posed problems for structural linguists who have attempted to incorporate all speech sounds into a system of distinctive feature organization. In all the suggested frameworks, [h] (and its rival [Ɂ]) turned out to be the most difficult to describe phonetically, phonologically, and acoustically (Chomsky and Halle 1968; Perkell 1980; McCarthy 1991; Ladefoged and Johnson 2010). Research findings have shown that the two sounds do not fall straightforwardly into one class with any other set of sounds, neither on articulator-based nor on place-based grounds. The traditional taxonomy of all sounds into consonants and vowels fails to incorporate these two sounds straightforwardly. For example, McCarthy (1988; 1992) suggests the feature [pharyngeal] to encompass all ‘throat consonants’, namely uvulars, pharyngeals, and laryngeals. However, a careful reading of McCarthy’s analysis reveals that it is not watertight, since many difficult-to-mask counterexamples distinguish laryngeals from uvulars and pharyngeals. The problem with McCarthy’s analysis, like others’, is the basic premise that a sound dichotomy should be binary. To illustrate, structural linguists have long embraced the suggestion that a speech sound be either [vocalic] or [consonantal]. Believing that laryngeals are [consonantal], they have made every attempt to incorporate speech sounds, including laryngeals, into natural classes. Laryngeals are often grouped in a set of sounds technically called ‘gutturals’, with other ‘throat consonants’. Despite this, research has shown that laryngeals are different from guttural sounds in many ways: they do not share the active articulator with other gutturals in that, whereas all gutturals are produced by activating more than one articulator, laryngeals are produced solely by activating the larynx. What this means is that, unlike all other sounds, laryngeals are articulator-independent, since their primary and secondary features are laryngeally executed.

Due to the arcane details of the subject matter, we set ourselves the task of further investigating the phonetic, phonological, and acoustic properties of laryngeals in other works. The preliminary findings of our analysis led us to suggest a tripartite division of speech sounds, namely VOWELS, CONSONANTS, and LARYNGEALS. Capitalizing on the fact that laryngeals are the sounds that are unmarkedly articulated the most, we reconsidered the geometry tree of [h] (along with its rival sound [Ɂ], of course). At the root level, it turns out that these two sounds should be set outside the confines of the traditional taxonomy (consonantal vs vocalic). At the branch level of the geometry tree, they turn out to lack oral and/or nasal nodes (see Stemberger 1993; McCarthy 1991; Schluter et al. 2016). Given this analysis, the laryngeal node of /h/ and /Ɂ/ is for expressing their voicing status (i.e., being voiceless), not for showing their place of articulation. As both lack Oral, Nasal, and even Place nodes, they are underspecified for all place and articulator features (see Stemberger 1993). Since they lie at the root of the geometry tree, it is believed that all other sounds are distinguished by how much they entertain h-like and Ɂ-like features. The classical division of other sounds into vowels and consonants is, we reckon, a ramification of the interplay of these two antagonistic forces: whereas one functions to weaken the constriction of the airflow, the other conspires to strengthen it. Our demonstration of this interplay will be readily available in our next research paper (forthcoming).
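The underspecification claim can be pictured as a small data structure in which the Place and Nasal nodes are optional. This is a rough sketch under assumptions of our own: the class name `Segment`, its field names, and the toy inventory are hypothetical and far simpler than a full feature-geometry tree; the sketch only shows how /h/ and /Ɂ/ fall out as the segments that carry laryngeal information and nothing else.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Segment:
    """A drastically simplified segment; field names are illustrative only."""
    symbol: str
    voiced: bool                 # the only information on the laryngeal node
    airflow_blocked: bool        # True for stops, False when air escapes
    place: Optional[str] = None  # None models a missing Place node
    nasal: bool = False          # False models a missing Nasal node

    def lacks_supralaryngeal_nodes(self) -> bool:
        """True when the segment has neither a Place nor a Nasal node."""
        return self.place is None and not self.nasal

# Hypothetical toy inventory of consonantal segments.
INVENTORY = [
    Segment("h", voiced=False, airflow_blocked=False),
    Segment("Ɂ", voiced=False, airflow_blocked=True),
    Segment("p", voiced=False, airflow_blocked=True, place="labial"),
    Segment("m", voiced=True, airflow_blocked=True, place="labial", nasal=True),
    Segment("s", voiced=False, airflow_blocked=False, place="coronal"),
    Segment("k", voiced=False, airflow_blocked=True, place="dorsal"),
]

placeless = [seg for seg in INVENTORY if seg.lacks_supralaryngeal_nodes()]
placeless_symbols = [seg.symbol for seg in placeless]
```

In this toy inventory only the two laryngeals are selected, and they differ from each other solely in `airflow_blocked`, which mirrors the opposition between opening and blocking the passage of air described above.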

In sum, this study has argued that the [h]-sound, and laryngeals more broadly, occupy a unique status in the phonological hierarchy, one that cannot be fully captured by traditional classifications of throat consonants such as pharyngeals and uvulars. The cross-linguistic evidence demonstrates that while /h/ retains a primary phonemic role in classical and so-called “primitive” languages, it has largely shifted to a secondary allophonic status in modern languages, revealing a diachronic division of labor. These findings invite further investigation into the classification of laryngeals vis-à-vis other gutturals, with the aim of refining theoretical models of feature geometry and phonological categorization. At the same time, the analysis highlights the need to adopt a relevance-theoretic perspective in future research, one that systematically evaluates the trade-off between communicative effect and processing effort for real vs fake words containing [pharyngeal] sounds. By situating laryngeals, a subset of pharyngeals, at the intersection of phonological theory and communicative efficiency, future work can provide a more comprehensive understanding of their symbolic and functional role across languages. In closing, this study underscores the need for further exploration of laryngeal evolution and orthographic symbolism across classical languages, alongside their residual traces in Indo-European traditions. While the Arabic data yield valuable insights, broader generalizations must remain tentative until reinforced by systematic cross-linguistic research.

Data availability

As this study is theory-oriented, no data were generated or analyzed. Therefore, data sharing is not applicable to this research.

Change history

09 January 2026
Editor’s Note: The Editorial team is currently investigating concerns raised about the contents of this paper. Editorial action will be taken as appropriate once this process is complete.
23 January 2026
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1057/s41599-026-06551-2

Notes

https://dictionary.cambridge.org/dictionary/english/water
The Merriam-Webster Dictionary defines it as “the liquid that descends from the clouds as rain, forms streams, lakes, and seas, and is a major constituent of all living matter and that when pure is an odorless, tasteless, very slightly compressible liquid oxide of hydrogen H₂O which appears bluish in thick layers, freezes at 0 °C and boils at 100 °C, has a maximum density at 4 °C and a high specific heat, is feebly ionized to hydrogen and hydroxyl ions, and is a poor conductor of electricity and a good solvent” (see: https://www.merriam-webster.com/dictionary/water).
The Greek ‘Obscure philosopher’ posited the theory of unity of opposites (see chapter 2 in Burnet, Early Greek Philosophy, available online: https://plato.spbu.ru/RESEARCH/burnet/burnet.pdf).
Pre-Socrates philosophy, influenced by the ancient tradition of Egypt, was debunked by post-Socratic philosophies.
“In the beginning was the Word, and the Word was with God, and the Word was God.” (John i. 1.)
Its fifth position in the Hebrew phonetic table came to express the five fingers of the hand, the five senses, and so on. Source: https://www.ancient-hebrew.org/alphabet_letters_hey.html.
To trace the origin and development of secularism, see (Jayne and Jayne 2018): Jayne, Edward, and Elaine Anderson Jayne. An Archeology of Disbelief: the Origin of Secular Philosophy/ Edward Jayne; Edited by Elaine Anderson Jayne. 2018. All Books and Monographs by WMU Authors. 753. https://scholarworks.wmich.edu/books/753.

References

al-kuran M (2024) Challenges of rendering chauvinism into arabic: implications for dictionary users and translation equivalence. Jordan J Mod Lang Lit 16(2):409–424. https://doi.org/10.47012/jjmll.16.2.7
Alpher B (1994) Yir-Yoront Lexicon: sketch and dictionary of an Australian language. Cambridge University Press
Alpher B (2001) Sound symbolism and the lexicon: explorations in Australian languages. Language 77(1):1–25
Alsharif B, Khasawneh R, Alzghoul M (2026) Strategies of rendering metaphor from arabic into english: a comparative study of ChatGPT and Matecat. World J Engl Lang. https://doi.org/10.5430/wjel.v16n1p45
Armstrong DF (1983) Signs of the origin: the emergence of language in the human mind. Oxford University Press
Armstrong DF et al. (1995) Gesture and the Nature of Language. Cambridge University Press
Beeks RS (1995) Comparative Indo-European linguistics: an introduction. John Benjamins
Bergen B (2004) The psychological reality of phonaesthemes. Language 80(2):290–311
Blasi DE et al. (2016) Sound–meaning association biases evidenced across thousands of languages. Proc Natl Acad Sci USA 113(39):10818–10823
Bolinger D (1950) Intonation and its uses: melody in grammar and discourse. Stanford University Press
Bremner JG et al. (2013) Sensory integration and perception in infants. Dev Sci 16(3):328–345
Brown R (1980) Social psychology. Free Press
Buck CD (1980) A dictionary of selected synonyms in the principal Indo-European languages: a contribution to the history of ideas. University of Chicago Press
Burling R (2005) The talking ape: how language evolved. Oxford University Press
Burnet J (1920) Early greek philosophy, 3rd edn. A & C Black, London. https://plato.today/RESEARCH/burnet/burnet.pdf
Cabrera J (2012) Phonosemantic patterns in Spanish. John Benjamins
Chen WX et al. (2021) Cross-cultural investigation of the Bouba–Kiki effect. J Exp Psychol 219:1021–1042
Chen Y (2016) When “Bouba” equals “Kiki”: cultural commonalities and cultural differences in sound-shape correspondences. Sci Rep 6(1):26681. https://doi.org/10.1038/srep26681
Childs GT (1994) African ideophones. Mouton de Gruyter
Chomsky N, Halle M (1968) The sound pattern of English. Harper & Row
Chow E (2018) Phonetic symbolism and its psychological basis. Cogn Sci Rev 17:129–147
Ciccotosto S (1991) Phonological awareness and literacy development in early childhood. Routledge
D’Anselmo A et al. (2019) The influence of sound symbolism on word learning in children. Cogn Lang Res J 27:345–366
Day S (2004) The sound-symbolism hypothesis: a review of research. Cambridge University Press
de Saussure F (1916) Course in general linguistics. Translated by W Baskin, Philosophical Library, 1959 (original work published 1916)
Diffloth G (1972) Notes on expressive meaning. Linguistic Society of America
Dingemanse M et al. (2015) What sound symbolism can and cannot do: testing the iconicity of ideophones from five languages. Language 91(no. 2):e117–e134
Doke CM (1935) Textbook of Zulu grammar. Longmans
Erben Johansson N et al. (2020) The typology of sound symbolism: defining macro-concepts via their semantic and phonetic features. Linguist Typol 24(no. 2):253–310
Friedrich P (1979) The language parallax: linguistic relativism and poetic indeterminacy. University of Texas Press
Gasser M (2004) The role of sound symbolism in language learning. Indiana University Research Repository. Available at: https://www.indiana.edu/~gasser/Gasser04.pdf
Graven T (2019) Cognitive mechanisms behind sound-symbolism: an experimental approach. Oxford University Press
Grice HP (1957) “Meaning.” Studies in the way of words. Harvard University Press, pp 213–223 (originally published 1957)
Hassanein H (2023) Antonym sequence in Qur’anic Arabic: an emically etiological approach. Jordan J Mod Lang Lit 15(1):199–220. https://doi.org/10.47012/jjmll.15.1.11
Hauser MD, Chomsky N, Fitch WT (2002) The faculty of language: what is it, who has it, and how did it evolve? Science 298(5598):1569–1579. https://doi.org/10.1126/science.298.5598.1569
Imai M et al. (2008) Sound symbolism facilitates early verb learning. Proc Natl Acad Sci USA 105(no. 14):5783–5788
Imai M, Kita S (2014) The sound symbolism bootstrapping hypothesis for language acquisition. Cogn Sci 38(no. 5):973–992
Jackson H, Waugh L (1997) Grammar and meaning: a structuralist perspective. Routledge
Janković T, Marković R (2001) Phonological processing and its relation to symbolic cognition. J Exp Linguist 27:87–104
Janković T, Vučković S, Radaković B (2005) The impact of phonetic structure on perceptual categorization. Linguist Anal J 34:89–115
Joo I (2020) The role of sound symbolism in language processing: an empirical study. Routledge
Kantartzis K et al. (2009) The role of sound symbolism in infant language development. Cognition 112(no. 3):381–387
Köhler W (1970) Gestalt psychology. Liveright Publishing (reprint 1970)
Ković V et al. (2010) The Bouba–Kiki effect: a cross-linguistic investigation. Acta Psychol 135(no. 1):50–57
Ladefoged P, Johnson K (2010) A course in phonetics, 6th edn. Wadsworth
Lehmann WP (1993) Theoretical bases of Indo-European linguistics. Routledge
Locke J (1690) An essay concerning human understanding. Eliz. Holt, London
Lockwood G, Dingemanse M (2015) Iconicity in the lab: a review of behavioral, developmental, and neuroimaging research into sound-symbolism. Front Psychol 6:1246
Maeda S, Maeda S (1983) Acoustic properties of phonemes in Japanese. Phonetica 40(no. 3):159–172
Malady K (2013) The evolution of linguistic forms: a cognitive perspective. Oxford University Press
Maurer D et al. (2006) Cross-modal correspondences: the roles of stimulus similarity and intensity in sound-symbolism. Psychol Sci 17(no. 10):849–855
McCarthy J (1988) Prosodic morphology: constraint interaction and satisfaction. MIT Press
McCarthy J (1991) Prosodic morphology: constraint interaction and satisfaction. MIT Press
McCarthy J (1992) Theoretical perspectives on phonology. Cambridge University Press
McGregor W (2001) The languages of the Kimberley, Western Australia. Routledge
Mikone E (2001) Phonosemantic patterns in Balto-Finnic languages. University of Helsinki
Monaghan P et al. (2014) How arbitrary is language? Philos Trans R Soc Lond B Biol Sci 369:20130299
Monaghan P et al. (2015) The role of sound symbolism in language evolution. Trends Cogn Sci 19(no. 10):649–655
Muin M et al. (2021) Phonetic iconicity and its role in modern language processing. J Cogn Linguist 45(no. 2):257–281
Newman P (1968) The Kanakuru language. Abingdon Press
Nielsen A et al. (2011) The sound symbolic nature of word learning. Cogn Dev 26(no. 3):234–245
Nuckolls JB (1996) Sounds like life: sound-symbolic grammar, performance, and cognition in Pastaza Quechua. Oxford University Press
Nuckolls JB (1999) The case for sound symbolism. Annu Rev Anthropol 28:225–252
Padraic M et al. (2006) Sound-symbolic mapping in language development. Cognition 98(no. 2):157–165
Padraic M et al. (2011) The psychological reality of sound-symbolism: evidence from word learning in adults. J Exp Psychol Learn Mem Cogn 37(no. 5):1013–1023
Parault S, Schwanenflugel P (2006) Sound symbolism and the development of semantic processing in children. J Lang Dev 21(no. 2):187–205
Perkell J (1980) Physiology of speech production: results and implications of a quantitative cineradiographic study. MIT Press
Perniss P, Thompson RL, Vigliocco G (2010) Iconicity as a general property of language: evidence from spoken and signed languages. Front Psychol 1:227. https://doi.org/10.3389/fpsyg.2010.00227
Perniss P, Vigliocco G (2010) The bridge between perception and language: sound-symbolism in spoken and signed languages. Philos Trans R Soc B 367:3753–3763
Pesenti M et al. (2000) Neuroimaging evidence for a modulation of cerebral activity during non-symbolic numerosity processing. Cogn Brain Res 12(no. 3):313–322
Qur’an (n.d.) The Holy Qur’an: English translation. Available at: https://quran.com/
Ramachandran VS, Hubbard E (2001) Synaesthesia: a window into perception, thought and language. J Conscious Stud 8(no. 12):3–34
Reichard G (1944) An analysis of Cochiti grammar. Columbia University Press
Reichard G (1950) Hupa texts with grammar and vocabulary. University of California Press
Rizzolatti G, Arbib M (1998) Language within our grasp. Trends Neurosci 21(no. 5):188–194
Samarin WJ (1971) Tongues of men and angels: the religious language of pentecostalism. Macmillan
Sapir E (1929) Sound patterns in language. Am J Psychol 40(no. 2):237–245
Schluter K et al. (2016) Phonetic symbolism and the evolution of word forms. Cogn Sci 40(no. 7):1765–1780
Schmidtke D (2014) Experimental evidence for sound-symbolism in adult language processing. J Exp Linguist 23:105–129
Schultze-Berndt E (2001) Ideophones in the languages of Australia. John Benjamins
Shah A (2015) Linguistic evolution and human cognition. Cambridge University Press
Stemberger JP (1993) Morphological and phonological processing in language production. Lawrence Erlbaum Associates
Stokoe WC (1991) Semiotics and human sign languages. Mouton de Gruyter
Stokoe WC (2001) Language in hand: why sign came before speech. Gallaudet University Press
Strack F et al. (1988) Inhibiting and facilitating conditions of the human smile. J Personal Soc Psychol 54(no. 5):768–777
Sučević J et al. (2013) Perceptual and neural bases of sound-symbolism in language processing. Brain Lang 124(no. 3):143–152
Swadesh M (1971) The origin and diversification of language. Routledge
Thibodeau P et al. (2014) An exploratory investigation of word aversion. Cognitive Science Society, Quebec City
Thibodeau P (2016) The role of metaphor in language and thought. Metaphor Symb 31(no. 2):117–130
Thompson WF et al. (2012) Sound symbolism in music and language. Music Percept Interdiscip J 29(no. 2):157–163
Tzeng Y et al. (2017) Sound-symbolism in cross-linguistic perspective. Linguist Typol 21(no. 4):625–648
Voeltz FKE, Kilian-Hatz C (eds) (2001) Ideophones. John Benjamins
Walker P et al. (2010) Cross-modal associations in sound symbolism: the case of shape and sound correspondences. Cognition 114(no. 1):79–84
Watson R (2001) Phonosemantic phenomena in Southeast Asian languages. Cambridge University Press
Wescott RW (1977) The phonetic organization of speech. Academic Press
Westbury C (2005) Meaningful sounds: the role of phonology in semantic processing. Cambridge University Press
Wilkins J (1668) An essay towards a real character and a philosophical language. Royal Society, London
Wittgenstein L (2009) Philosophical investigations. Translated by GEM Anscombe. Blackwell (Reprint 2009)

Acknowledgements

Special thanks go to the anonymous reviewers, whose comments and suggestions improved the quality of the research. This research received no external funding.

Author information

Authors and Affiliations

Yarmouk University, Irbid, Jordan
Rasheed AL-Jarrah

Contributions

This is a single-authored paper: RA wrote, reviewed, prepared for publication, and submitted the manuscript.

Corresponding author

Correspondence to Rasheed AL-Jarrah.

Ethics declarations

Competing interests

The author declares no competing interests.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Informed consent

An ethical approval was not obtained.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article has been retracted. Please see the retraction notice for more detail: https://doi.org/10.1057/s41599-026-06551-2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

AL-Jarrah, R. RETRACTED ARTICLE: A cross-linguistic investigation of /h/ symbolism: the case of H2O. Humanit Soc Sci Commun 13, 64 (2026). https://doi.org/10.1057/s41599-025-06397-0

Received: 04 May 2025
Accepted: 01 December 2025
Published: 03 January 2026
Version of record: 09 January 2026
DOI: https://doi.org/10.1057/s41599-025-06397-0