Base Form

Revision as of 11:41, 26 January 2010

Base Form, or simply BF, is the form used to generate all variants of a given lexeme.

The lemma is not always the most suitable form from which to generate the inflections of a given lexeme. Because it is itself a word form, it may already include inflectional material that has to be removed, as in "love" > "loving". The BF is the part of the word that is common to all its inflected variants ("lov", for instance), and it serves as the basis for generating the inflectional forms.

How to create a BF

The BF is the "longest common denominator" of all the possible variations of a lexeme, as in the following cases:

BF | inflections
lov | e, es, ed, ing
book | Ø, s
small | small
here | here
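
As a rough illustration of the table above, the following sketch reads "longest common denominator" as the longest common prefix of the inflected forms. That reading is an assumption that only fits purely suffixing paradigms, and the function name `split_base_form` is hypothetical rather than part of any UNL tool.

```python
import os


def split_base_form(forms):
    """Return (bf, endings), where bf is the longest prefix shared by all forms.

    Assumption: the paradigm is purely suffixing, so the shared part of the
    forms is a prefix. An empty ending is shown as "Ø", as in the table.
    """
    # os.path.commonprefix compares plain strings character by character.
    bf = os.path.commonprefix(forms)
    endings = [form[len(bf):] or "Ø" for form in forms]
    return bf, endings


print(split_base_form(["love", "loves", "loved", "loving"]))
print(split_base_form(["book", "books"]))
```

For "love" the shared prefix is "lov", with the endings "e", "es", "ed" and "ing"; for "book" it is the whole word, with a zero ending and "s".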




The BF is the same as the LRU, except in the case of multiword LRUs that involve discontinuity or infixation, i.e., where the variations cannot be generated by simple prefixation and/or suffixation rules. In these cases, the BF corresponds to the lemma of the longest common denominator of all the possible variations of the LRU.

Examples

  • house (simple LRU): BF=LRU
  • mouse (simple LRU with infixation: "mouse">"mice"): BF=LRU
  • coffee house (multiword LRU without infixation: "coffee house">"coffee houses"): BF=LRU
  • give in (multiword LRU with infixation: "give in">"gave in"): BF="give" LRU="give in"
  • behind <someone's> back (discontinuous multiword LRU without infixation: "behind my back", "behind his back", etc): BF="behind" LRU="behind back"
  • take <something> into account (discontinuous multiword LRU with infixation: "take it into account", "took that into account"): BF="take" LRU="take into account"

The use of BF

The use of BFs derives from a practical limitation rather than from a logical necessity. To be efficient and to avoid overloading the system, generation rules have to be as general and as few as possible, which considerably limits the possibility of creating infixation rules. The alternative is to reduce infixable compounds and complex LRUs to the longest common denominator (i.e., to “hyper-regularise” them) in order to treat infixation as a special case of prefixation or suffixation.
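
The idea can be sketched as follows: only the BF is inflected, and the rest of the LRU is attached afterwards as a fixed string, so no infixation rule is needed. This is a minimal sketch, not the actual rule format; the names `GIVE_FORMS` and `realise` and the hand-listed forms are assumptions made for illustration.

```python
# Hypothetical, hand-listed forms of the BF "give". A real system would
# produce them with its own inflection rules.
GIVE_FORMS = {
    "base": "give",
    "3sg": "gives",
    "past": "gave",
    "participle": "given",
    "gerund": "giving",
}


def realise(bf_form, remainder, filler=""):
    """Join an inflected BF, an optional filler for a discontinuous slot,
    and the invariable remainder of the LRU."""
    return " ".join(part for part in (bf_form, filler, remainder) if part)


# LRU "give in", BF "give": only the BF varies.
print(realise(GIVE_FORMS["past"], "in"))

# LRU "take into account", BF "take": the filler occupies the <something> slot.
print(realise("took", "into account", filler="that"))
```

Under these assumptions the two calls yield "gave in" and "took that into account": the change inside the multiword expression is handled entirely by inflecting the BF.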

In English, the use of BFs is limited to phrasal verbs (such as "give in" and "bring (sth) back") and verbal phrases ("play with fire"). The need for BFs is more noticeable in highly inflected languages, where compounds and complex LRUs may be reordered or infixed. Consider, for instance, the LRU “lingua” (= “language”) in Latin. As a case-inflecting language, Latin normally has 12 different forms for each noun:

case | singular | plural
nominative | lingua | linguae
vocative | lingua | linguae
accusative | linguam | linguas
genitive | linguae | linguarum
dative | linguae | linguis
ablative | lingua | linguis

For single-word LRUs, such as “lingua”, case inflection is relatively simple, because it is extremely regular and always corresponds to a suffix. In complex LRUs, however, the process can be considerably more complicated, because of infixation and agreement. For “lingua franca”, for instance, we again have 12 different forms, but generating them is no longer as simple as adding suffixes to the right of the LRU.

case | singular | plural
nominative | lingua franca | linguae francae
vocative | lingua franca | linguae francae
accusative | linguam francam | linguas francas
genitive | linguae francae | linguarum francarum
dative | linguae francae | linguis francis
ablative | lingua franca | linguis francis

To avoid listing all variations of “lingua franca” inside the UNLarium, or creating a very specific rule that would apply only in this case, we reduce “lingua franca” to “lingua” and create a special (subcategorization) rule for generating “franca” later on. The LRU will then be “lingua franca”, but the BF will be only “lingua”.
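
A minimal sketch of this reduction, assuming the endings shown in the two tables above: the BF "lingua" is inflected by an ordinary suffix rule, and "franca" is produced in a second step that copies the case and number of the BF. Modelling the subcategorization rule as a plain agreement step is an assumption, and the names `ENDINGS`, `inflect` and `inflect_lru` are hypothetical, not UNLarium syntax.

```python
# First-declension endings, read off the "lingua" table.
ENDINGS = {
    ("nominative", "singular"): "a", ("nominative", "plural"): "ae",
    ("vocative", "singular"): "a", ("vocative", "plural"): "ae",
    ("accusative", "singular"): "am", ("accusative", "plural"): "as",
    ("genitive", "singular"): "ae", ("genitive", "plural"): "arum",
    ("dative", "singular"): "ae", ("dative", "plural"): "is",
    ("ablative", "singular"): "a", ("ablative", "plural"): "is",
}


def inflect(word, case, number):
    """Replace the final -a of a citation form with the requested ending."""
    if not word.endswith("a"):
        raise ValueError(f"expected a form ending in -a, got {word!r}")
    return word[:-1] + ENDINGS[(case, number)]


def inflect_lru(bf, dependent, case, number):
    """Inflect the BF, then make the dependent word agree with it."""
    return f"{inflect(bf, case, number)} {inflect(dependent, case, number)}"


print(inflect("lingua", "accusative", "singular"))
print(inflect_lru("lingua", "franca", "genitive", "plural"))
```

With these endings the first call gives "linguam" and the second "linguarum francarum", matching the corresponding cells of the tables. Only one suffix rule is needed, and it is applied twice.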

Examples

Lexical Realisations | Lexical Realisation Unit (LRU) | Base Form (BF)
apple, apples | apple | apple
city, cities | city | city
glasses | glasses | glasses
rosa, rosae, rosam, rosas, rosarum, rosis | rosa | rosa
beautiful | beautiful | beautiful
hermoso, hermosa, hermosos, hermosas | hermoso | hermoso
sum, es, est, sumus, estis, sunt, eram, fui… | esse | esse
part of speech, parts of speech | part of speech | part of speech
skinhead, skinheads | skinhead | skinhead
give in, gives in, gave in, given in, … | give in | give
pars orationis, partes orationis, partem orationis, partis orationis, … | pars orationis | pars
bring [sth] back, brings [sth] back, bringing [sth] back, brought [sth] back, ... | bring back | bring
play with fire, plays with fire, playing with fire, ... | play with fire | play