The TEI provides the following attributes for linguistic analysis of <w> elements:
- lemma : provides a lemma (base form) for the word, typically uninflected and serving both as an identifier (e.g. in dictionary contexts, as a headword), and as a basis for potential inflections.
- pos : (part of speech) indicates the part of speech assigned to a token (i.e. information on whether it is a noun, adjective, or verb), usually according to some official reference vocabulary (e.g. for German: STTS, for English: CLAWS, for Polish: NKJP, etc.).
- msd : (morphosyntactic description) supplies morphosyntactic information for a token, usually according to some official reference vocabulary (e.g. for German: STTS-large tagset; for a feature description system designed as (pragmatically) universal, see Universal Features).
- join : provides information on whether the token in question is adjacent to another, and if so, on which side. The definition of this attribute is adapted from ISO MAF (Morpho-syntactic Annotation Framework), ISO 24611:2012.
(copied from TEI P5 17.4.2, which gives some further discussion of "lightweight linguistic annotation"; see also the attribute class att.linguistic for some examples)
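
To make the four attributes concrete, here is a minimal sketch of how they might look on tokens in a short English sentence. The tag values are assumptions chosen for illustration (CLAWS-style codes for pos, Universal Features strings for msd); they are not a proposal for the vocabulary the schema should use.

```xml
<!-- Illustrative only: all attribute values below are assumed examples -->
<s xmlns="http://www.tei-c.org/ns/1.0">
  <w lemma="the" pos="AT" msd="Definite=Def|PronType=Art">The</w>
  <w lemma="dog" pos="NN2" msd="Number=Plur">dogs</w>
  <!-- join="right": no whitespace between this token and the next one -->
  <w lemma="bark" pos="VVD" msd="Tense=Past|VerbForm=Fin" join="right">barked</w>
  <!-- join="left": the full stop attaches to the preceding token -->
  <pc join="left">.</pc>
</s>
```

If we adopt only a subset, lemma and pos alone would already cover the first two `<w>` elements here; msd and join add detail that not every project will need.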
There are other possibilities (e.g. those provided by [ISO 12620:2009](http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-att.datcat.html)), but these seem closest to what I think WG2 wants to produce.
Which of them would we like to see in the schema? Which others would we like to add?