
Proto-Hebrew had a single vowel


The lack of vowels is a major insufficiency of written Hebrew. Vowels can often be reconstructed from syntax, but not always, as the debates about the vowelization of words in the Bible show. Besides, even where vowels can be reconstructed, omitting them is extremely unusual: Germanic languages, for example, could mostly be read without vowels, yet their vowels are written. Sumerian also recorded vowels implicitly in its syllables. People do not learn mathematics as a set of axioms only, but also as theorems, though in theory they could re-derive the theorems on demand. Similarly, we expect a system for writing a language to be generally sufficient “as is,” without the need to analyze syntax to decipher the vowels of every word. No writing system can be fully explicit; some processing is always required, and syntax must be considered. But a system that relies on such analysis for every word is clearly unsatisfactory.

Vowels could be reconstructed in IE-based languages even more easily than in Hebrew because of the redundancy of those languages. Yet writing systems other than Egyptian and West Semitic distinguish vowels even in shorthand writing. Omitting the vowels is odd, unless vocalization was inessential or unambiguous.

Proto-Hebrew was unambiguous: the language had a single vowel a, which later evolved into other vowels according to syntactical accent. Differentiation of vowels runs exactly along morphological differentiation.

The creators of proto-Hebrew used the vowel only to pronounce consonants. Those scholars understood that a short indeterminate sound would not do: r(e) is inconvenient, and d`b`r` sounds unpleasant. Using apostrophic sounds would have caused irregularities of the d`b``r type (debeer instead of debere). No such problem arises with long [a]; dabara is pleasant enough.

Sanskrit vocalizes each consonant with [a] without specifically marking that sound. Hebrew, likewise, need not mark vowels: the only existing vowel a was presumed for every consonant. When writing כ, the Hebrews actually meant [ca].

The CV structure provided for clear pronunciation of consonants, without the blurring found in CCV consonantal clusters or word-final consonants. Initially, each syllable was pronounced distinctly with its own accent (da-ba-rA, da.ba.ra). As people became more fluent in speech, they pronounced words as a whole, and the accent naturally shifted to the penultimate position (dabAra). The unaccented final vowel shortened to an indistinct shuruk (da-bAr-u) and was eventually lost (dabAr).
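The stages described above can be traced mechanically. The following is only an illustrative sketch, not a linguistic tool: the function name `stages`, the use of an uppercase letter for the accented vowel, and the plain-ASCII transliteration are assumptions made for the example. It assumes a consonantal skeleton of at least two consonants, each read with the inherent vowel a.

```python
def stages(skeleton: str, vowel: str = "a") -> list[str]:
    """Trace the proposed development of a CV word (hypothetical helper).

    Uppercase marks the accented vowel, following the essay's notation.
    """
    # Every consonant carries the single inherent vowel: d-b-r -> da, ba, ra.
    syllables = [consonant + vowel for consonant in skeleton]

    # Stage 1: each syllable pronounced distinctly.
    distinct = ".".join(syllables)

    # Stage 2: the word is pronounced as a whole; accent on the penultimate.
    accented = syllables[:]
    accented[-2] = accented[-2][:-1] + accented[-2][-1].upper()
    whole = "".join(accented)

    # Stage 3: the unaccented final vowel shortens to an indistinct u.
    shortened = whole[:-1] + "u"

    # Stage 4: the final vowel is lost.
    final = shortened[:-1]

    return [distinct, whole, shortened, final]


print(" > ".join(stages("dbr")))
```

For the skeleton dbr this follows the same path as the text: da.ba.ra, dabAra, dabAru, dabAr.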

The earliest Hebrew did not differentiate between parts of speech and contained only the AA-stem (dabar). Later, phrases expanded to verb and object, and syntactical accent appeared. A likely reason for the syntactical accent is the VSO pattern: stress in VSO, as in the constructus, tends toward the last word, and verbs have to be artificially accented to avoid their weakening and unclear pronunciation.[1] Syntactical accent had to be especially strong early in the language’s history, as the only way to differentiate between the same word denoting an action or an object. Accordingly, the second vowel, which already had the morphological accent, acquired an additional accent in verbs because of the phrase’s intonation: catAv – catA:v – catEv.[2] Thus tzere appeared.

Holam appeared in imperatives. They differ from other verb forms by their forceful pronunciation, an intonational accent. Thus, catAv – catA:v! – catOv! – c`tO:v! The very strong stress of imperatives greatly elongated the accented vowel and reduced the first vowel.

Long composite vowels do not survive in heavy or firmly closed syllables.[3] The unaspirated tav of verbal suffixes was pronounced with a leading stop, which firmly closed the preceding syllable: diber – dibar.ti, *canes – ni-cnas. Patah derives from tzere in heavy accented syllables.

Hirek appeared from tzere [ai] in unaccented syllables.[4] In hiphil: hragEsh – arAgish – hirgish.

The Masoretes mistook the very distinct vocal schwa at the beginning of a word, when followed by another schwa, for a short e sound. Recognizing that unaccented e should become hirek, they introduced hirek in word-initial double-schwa combinations. That Masoretic development runs contrary to the LXX evidence.

Segol appears in epenthesis when one vowel sound splits in two to break a cluster, cotevt (tzere) – cotevet, calb – celev, dvarnu (kamatz) – dvarenu.

Holam turns into shuruk in unaccented syllables: cahOl – cahOlah – cahO.llah (post-tonic gemination) – c`hu.llAh.

The use of the same mater lectionis for holam and shuruk shows that they were originally interchangeable, as in spoken Arabic today (Mohammed – Muhammad).

The Masoretes put a point mark in waw for u between the point mark positions for o (up) and i (down), corresponding to the u sound length between o and i. They similarly marked kubbutz with a middle dot between the dots for holam and hirek.

Differentiating long and short shuruk and hirek is unwarranted. Matres lectionis (iod, waw) and the dot marks are interchangeable: matres lectionis do not consistently denote long vowels, nor do the dot marks consistently denote short ones. The pedantic Masoretes, who invented a multitude of marks for every minute occasion, surely would not have mixed up long and short vowels; they simply did not distinguish them. The Masoretes left matres lectionis for hirek and shuruk where the Bible already had them, and entered dot marks elsewhere.

Similarly, holam is the same sound whether marked by waw or by a dot. The Masoretes marked holam with a dot, and retained waw where it was already present in the text as a mater lectionis. Likewise, kamatz and hey-kamatz are the same long a.

That hirek written as a dot represents a shorter sound than hirek written as iod is incidental. Hirek was marked with the mater lectionis iod before the Masoretes. They introduced hirek in double-schwa environments: d’v’rei – divrei, n’c’tav – nichtav. All such environments became closed syllables, and hirek there was distinctly short. Since the Masoretes did not add matres lectionis, they marked the new, incidentally shorter, hireks with a dot.


Long vowels exist only in accented, open or semi-open (word-final) syllables.[5]

In closed accented syllables, kamatz turns into patah: had.davar, hitlabbEsh.

Kamatz in closed unaccented syllables shortens to kamatz-hatuf (short o).[6]
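The rules for kamatz stated above reduce to a small decision table. The sketch below is a hedged illustration only: the function name `realize_kamatz` and its two boolean parameters are assumptions introduced for the example, and the treatment of every non-closed syllable (open or word-final) as keeping the long vowel is a simplification of the essay's wording.

```python
def realize_kamatz(closed: bool, accented: bool) -> str:
    """Return the proposed realization of kamatz (hypothetical helper).

    closed   -- True if the syllable is firmly closed
    accented -- True if the syllable carries the accent
    """
    if not closed:
        # Open or word-final (semi-open) syllable: the long [a] is kept.
        return "kamatz"
    if accented:
        # Closed accented syllable: kamatz turns into patah.
        return "patah"
    # Closed unaccented syllable: kamatz shortens to short o.
    return "kamatz-hatuf"


# Print the full decision table for the four syllable types.
for closed in (False, True):
    for accented in (False, True):
        print(closed, accented, realize_kamatz(closed, accented))
```

Keeping the two conditions as separate flags makes the table easy to compare against the individual statements in the text.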

Pronunciation of kamatz in an open syllable as kamatz-hatuf before another o sound (kamatz-hatuf or hatef-kamatz) is a phonetic phenomenon of assimilation, not a predictive grammatical rule.

Kamatz is pronounced as o in words like hochma because מ, נ, ר do not agglutinate with a neighboring consonant, and so force the syllabification: not ha-chmA, but hoc.mA.


Patah in 3s paal verbs might imitate other conjugations and stems, or distinguish the verbs from davar-type nouns. The patah might also appear because of the pronoun, which is always employed with 3s verbs. The secondary accent on the pronoun reduces the main accent: *catev – hU-catAv.


The differentiation of vowels by length reflects chanting; it could not have survived in speech for a millennium to be heard by the Masoretes.

Epenthesis was unnecessary in speech, and the Masoretes heard it in chanting, calb – celev, cotevt – cotevet.

Elongation of vowels characterizes chanting. More than two simple long vowel sounds are unpronounceable in Hebrew speech. Tzere and holam are long composite sounds ae and au, each longer than kamatz. Words with tzere and holam lack vocal force for kamatz, gadOl – [g’dOl]. In speech, kamatz in gadol is not nearly as long as in davar – only in chanting.

Elongation of vowels in pausal positions characterizes chanting, not speech.


Ashkenazic pronunciation of vowels is incorrect and results from a stress shift: davAr – dAvar (Germanized) – dA:var (the absence of post-tonic gemination, as in dAb.bar, caused elongation of the accented vowel in the open syllable) – dOv’r (the absence of post-tonic gemination caused reduction of the post-tonic vowel).

A post-tonic vowel not protected by gemination is normally reduced, and the expected form is dOv’r. Ashkenazim, faced with two identical kamatz signs in davar, had to pronounce two similar sounds. The sound-alike counterpart to long [o] was short [o], hence dov’r – dOivor (iod as a semi-stop prevents reduction of the second short [o] into schwa).

Pronunciation of kamatz as long [a] is clear from the kamatz’s association with hey (e.g., suffix ca - ch), which cannot possibly be read as [o]. That the Hebrew affix hey is aleph in Aramaic further supports reading kamatz as [a].



[1] The ayn-waw, ayn-iod, and ayn-ayn sub-classes of verbs show that the accent indeed pushes forward. The normally unaccented suffix is accented in those sub-classes (gartA), because the accent cannot stay on the first vowel in verbs.

In the VSO pattern, nouns follow verbs. If constructus-like concatenation took place, then catAv – davar – catA-vdavar, and the kamatz in the open accented syllable elongated to tzere.

The pausal form shows that the syntactical accent actually exists and causes accent shift or, where the shift is impossible, elongation.

[2] The form cotev has a short o, reduced from long a; the Masoretes mistook the hatef-kamatz sound for holam. Hebrew words generally have two long vowels, and the composite vowel ae (tzere) equals almost two long vowels, leaving little air for the first vowel of the word. To pronounce cotev with a clear long o, the o sound would have to be semi-stressed, which would produce post-tonic gemination: cOttEv.

[3] Word-final consonants are weak, and final syllables are not firmly closed, thus cotev, gadol.

[4] There are exceptions of necessity. There are the forms dvarה (kamatz, noun), shlemה (tzere, kamatz-tzere participle), gdolה (holam, verb-derived adjective), ctuvה (shuruk, passive participle), khulה (kubbutz, noun-derived adjective), ctivה (hirek, verb-derived noun), dabrה (patah, dabbar noun), catvה (kamatz, verb), dibrה (hirek), and also kirvה (hirek, noun), dubrה (shuruk, verb), and just about every other vowel. Tzere in shlemה form was preserved from reduction to hirek to distinguish the word from ctivה.

[5] Thus the epenthetic vowels before suffixes: tal-mi-de-cem, tal-mi-de-nu. Resyllabification is another option to preserve a long vowel: tal-mid – tal-mi-dhA.

[6] Kamatz+shuruk make long [o]. Patah+shuruk make short [o].

Syllables can be closed with a stop (gemination, dagesh hazak).

Accent could be lost for syntactic reasons, e.g. col is employed in constructus like col-dvarim, almost unaccented, and its kamatz sounds short [o], kamatz katan.