Morphology

Lexemes, as a set of different word forms with different inflectional affixes, but with the same stem, are normally referred to by a citation (default) word form called '''lemma'''. The lemma, more generally referred to as '''headword''', is essentially an abstract representation, subsuming all the formal lexical variations which may apply within the same lexeme. It is the word form which occurs at the beginning of a dictionary entry, and which is normally the singular, for nouns; the masculine singular, for adjectives; and the infinitive, for verbs.
 
 
== UNLarium hierarchy ==
 
 
The morphological categories presented above may largely coincide:
 
*In single-rooted lexemes without affixes ("now", "angry", "because", etc), the root, the stem and the lemma are the same word form;
 
*In single-rooted lexemes with inflectional affixes only ("table", "love"), the root, the stem and the lemma are frequently (but not always) the same word form;
 
*In lexemes with derivational affixes only ("beautiful", "basically", etc), the stem and the lemma are the same word form;
 
And so on.
 
 
In the UNLarium framework, in order to correctly identify the morphological structure in case of ambiguity, we proceed from the shortest ("root") to the longest ("word form"). Thus:
 
*If word form = lemma = stem = root, morphological structure = root;
 
*If word form = lemma = stem, morphological structure = stem;
 
*If word form = lemma, morphological structure = lemma;
 
*Otherwise, morphological structure = word form.
 
Accordingly:
 
*"now" = root
 
*"table" = root
 
*"beautiful" = stem
 
*"unfinished" = stem
 
*"be" = lemma
 
*"loves" = word form
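The decision procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the UNLarium itself: the function name `morphological_structure` and its parameters are hypothetical, and the caller is assumed to already know the lemma, stem and root of the word form. For "be", the sketch assumes that no single stem covers its suppletive forms, so `None` is passed as the stem; for "beautiful" and "unfinished", the roots are assumed to be "beauty" and "finish".

```python
from typing import Optional


def morphological_structure(
    word_form: str,
    lemma: Optional[str],
    stem: Optional[str],
    root: Optional[str],
) -> str:
    """Return the shortest category that the word form coincides with.

    The checks run from the shortest unit ("root") to the longest
    ("word form"), mirroring the order of the rules in the text.
    """
    if word_form == lemma == stem == root:
        return "root"
    if word_form == lemma == stem:
        return "stem"
    if word_form == lemma:
        return "lemma"
    return "word form"


# The examples from the text (roots and stems are assumptions, see above).
examples = {
    "now": morphological_structure("now", "now", "now", "now"),
    "table": morphological_structure("table", "table", "table", "table"),
    "beautiful": morphological_structure(
        "beautiful", "beautiful", "beautiful", "beauty"
    ),
    "unfinished": morphological_structure(
        "unfinished", "unfinished", "unfinished", "finish"
    ),
    "be": morphological_structure("be", "be", None, None),
    "loves": morphological_structure("loves", "love", "love", "love"),
}
```

Because the tests are ordered from the most specific equality to the least specific, a form that satisfies several rules is always assigned the shortest matching category.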
 
  
 

Revision as of 10:27, 12 January 2010

Morphology is the branch of linguistics that studies patterns of word formation within and across languages, and attempts to formulate rules that model the knowledge of the speakers of those languages.

Words, word forms and lexemes

There are several difficulties in arriving at a consistent use of the term "word" in relation to other categories of linguistic description, and several criteria (prosodical, morphological, syntactical) have been suggested for the identification of words in a language. One of the main difficulties concerns the use of the term "word" both as a class and as any of its elements. The forms "love", "loves", "loving" and "loved", for instance, may be considered to be different "words" of English or different forms (variants) of the same "word", depending on the case.

In order to avoid ambiguities, the UNLarium differentiates between these two senses of "word". The first sense, the one in which "love", "loves", "loving" and "loved" are different "words", is called a word form. Word forms are therefore "the physically definable units which one encounters in a stretch of writing (bounded by spaces) or speech (where identification is more difficult, but where there may be phonological clues to identify boundaries, such as a pause, or juncture features)" (Crystal, 2008, p. 522).

The second sense, the one in which "love", "loves", "loving" and "loved" are "the same word", is called a lexeme. The lexeme is an abstract underlying unit that corresponds to a set of different word forms reputed to be part of the same word class.

Morphemes

Different word forms are said to be part of the same lexeme if they share the same fundamental morphological identity. This means that word forms are analysed into smaller units, called morphemes. A morpheme is the smallest linguistic unit that has semantic meaning.

There are two main types of morphemes:

  • root (ROO) - The root is the primary unit of a word unit, which carries the most significant aspects of semantic content. Words may have one (“fire”, “man”, “round”, “table”, “blue”, “green”) or several roots (“fireman”, "grandmother");
  • affix (AFX) - The affix is a morpheme attached to the root to modify its meaning.

Affixes are divided into several categories, depending on their position and their role with reference to the root. The most important positional categories are:

  • prefix (PFX) - Appears at the front of the root (such as "un" in "undo", or "re" in "rewrite")
  • suffix (SFX) - Appears at the back of the root (such as "s" in "tables", or "er" in "writer")
  • infix (IFX) - Appears within the root (very rare in English, such as "ma" in "sophistimacated")
  • circumfix (CCX) - Appears at the front and at the back of the root (very rare in English, such as "a" + "ed" in "ascattered")
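As a rough illustration of the four positional categories, the sketch below compares the spelling of a word form with the spelling of its root. The function name `affix_position` is hypothetical, and the approach is a purely orthographic assumption: it does not handle spelling changes at morpheme boundaries (for example the dropped "e" in "write" + "er"), and it cannot tell a true circumfix from an independent prefix plus an independent suffix, since both leave material on both sides of the root.

```python
from typing import Tuple


def affix_position(word_form: str, root: str) -> Tuple[str, Tuple[str, ...]]:
    """Guess the positional category of the material added to a root.

    Returns a pair (category code, affix strings), using the codes
    PFX, SFX, IFX and CCX from the text.
    """
    if word_form == root:
        return ("none", ())

    # Root appears intact: look at what surrounds it.
    start = word_form.find(root)
    if start != -1:
        before = word_form[:start]
        after = word_form[start + len(root):]
        if before and after:
            return ("CCX", (before, after))
        if before:
            return ("PFX", (before,))
        return ("SFX", (after,))

    # Root is interrupted: try every split point for one inserted chunk.
    extra = len(word_form) - len(root)
    if extra > 0:
        for i in range(1, len(root)):
            if word_form[:i] == root[:i] and word_form[i + extra:] == root[i:]:
                return ("IFX", (word_form[i:i + extra],))

    return ("unknown", ())
```

Applied to the examples in the list, `affix_position("undo", "do")` yields the prefix "un", `affix_position("tables", "table")` the suffix "s", `affix_position("sophistimacated", "sophisticated")` the infix "ma", and `affix_position("ascattered", "scatter")` the pair "a" and "ed".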

As for their roles, there are two main types of affixes:

  • inflectional affix (IAX) - Assigns grammatical properties (such as number, gender, tense, person) to the root in order to form the different word forms of the same lexeme ("s" in "tables", "ed" in "loved", etc.)
  • derivational affix (DAX) - Forms a new lexeme by modifying the meaning (and sometimes the category) of the root ("un" in "unhappy", "ness" in "happiness").

Word forms (WFO) are, therefore, the combination of ROOTS + INFLECTIONAL AFFIXES + DERIVATIONAL AFFIXES. The combination of ROOTS + DERIVATIONAL AFFIXES (i.e., word forms without inflectional affixes) is normally referred to as stem or inflectional root.
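The relation between roots, stem and word form can be made concrete with a small data structure. The sketch below is an assumption for illustration only: the class `Analysis` and its field names are hypothetical, affixes are limited to prefixes and suffixes, and morphemes are joined by plain concatenation, so spelling changes such as "happy" + "ness" = "happiness" are not modelled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Analysis:
    """A morphological analysis: roots plus derivational and inflectional affixes."""

    roots: Tuple[str, ...]
    derivational_prefixes: Tuple[str, ...] = ()
    derivational_suffixes: Tuple[str, ...] = ()
    inflectional_suffixes: Tuple[str, ...] = ()

    @property
    def stem(self) -> str:
        # Stem = roots + derivational affixes (no inflectional affixes).
        parts = self.derivational_prefixes + self.roots + self.derivational_suffixes
        return "".join(parts)

    @property
    def word_form(self) -> str:
        # Word form = stem + inflectional affixes.
        return self.stem + "".join(self.inflectional_suffixes)


unhappy = Analysis(roots=("happy",), derivational_prefixes=("un",))
tables = Analysis(roots=("table",), inflectional_suffixes=("s",))
fireman = Analysis(roots=("fire", "man"))
```

Here `unhappy.stem` and `unhappy.word_form` coincide because no inflectional affix is present, while `tables.stem` is "table" and `tables.word_form` is "tables".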

Examples

| lexeme | word forms | root | derivational affixes | inflectional affixes | stem | lemma |
|--------|------------|------|----------------------|----------------------|------|-------|
| 1 | here | here | | | here | here |
| 2 | happy | happy | | | happy | happy |
| 3 | unhappy | happy | un- | | unhappy | unhappy |
| 4 | table, tables | table | | -s | table | table |
| 5 | happiness | happy | -ness | | happiness | happiness |
| 6 | love, loves, loving, loved | love | | -s, -ing, -ed | love | love |
| 7 | hermoso, hermosa, hermosos, hermosas (es = beautiful) | hermos- | | -o, -a, -s | hermos- | hermoso |
| 8 | unbreakableness | break | un-, -ness | | unbreakableness | unbreakableness |
| 9 | fireman, firemen | fire, man | | | fireman | fireman |
| 10 | part of speech, parts of speech | part, of, speech | | -s | part of speech | part of speech |