Topic on Extension talk:WikibaseLexeme/Data Model

Yet another model, with ''vocable'' as central class

Psychoslave (talk | contribs)

Since the logomer proposal proved inadequate, and since I am still concerned about the lexeme-class-centric model, here is another model. This time I didn't come up with any fancy original neologism, and I used terms with existing ISO definitions wherever I found one. In all cases I gave online sources, but I also used some books to guide my thinking, especially Le dictionnaire de linguistique et des sciences du langage, Larousse, 2012 (ISBN 9782035888457, OCLC 835329846).


Definitions

Here are the definitions relevant to this newly proposed model.

concept
A unit of thought.
source
https://www.iso.org/obp/ui/#iso:std:iso:5963:ed-1:v1:en:term:3.2
translations
French: notion (Toute unité de pensée. Le contenu sémantique d'une notion peut être ré-exprimé en combinant d'autres notions qui peuvent être différentes d'une langue à l'autre.)
term
word or standalone expression for an entity that has linguistic, semantic and grammatical integrity
source
https://www.iso.org/obp/ui/#iso:std:iso:18542:-2:ed-1:v1:en:term:3.1.12
translations
French: terme (mot ou expression isolée pour une entité qui a une intégrité linguistique, sémantique et grammaticale)
sense
one of the meanings of a word
source
https://www.iso.org/obp/ui/#iso:std:iso-iec:tr:29127:ed-1:v1:en:term:2.8
entity extraction
process that seeks to locate, classify, and tag atomic elements in text into predefined categories
source
https://www.iso.org/obp/ui/#iso:std:iso-iec:tr:29127:ed-1:v1:en:term:2.2
word form (vocable)
contiguous or non-contiguous entity from a speech or text sequence identified as an autonomous lexical item
source
https://www.iso.org/obp/ui/#iso:std:iso:24615:-1:ed-1:v1:en:term:3.24
inflection
modification or marking of a lexeme that reflects its morpho-syntactic properties
source
https://www.iso.org/obp/ui/#iso:std:iso:24611:ed-1:v1:en:3.6
inflected form (form)
form that a word can take when used in a sentence or a phrase
Note 1 to entry: An inflected form of a word is associated with a combination of morphological features, such as grammatical number and case.
source
https://www.iso.org/obp/ui/#iso:std:iso:24611:ed-1:v1:en:3.7
word
lexeme that has, as a minimal property, a part of speech
source
https://www.iso.org/obp/ui/#iso:std:iso:24614:-1:ed-1:v1:en:term:2.23
lexeme
abstract unit generally associated with a set of forms sharing a common meaning
source
https://www.iso.org/obp/ui/#iso:std:iso:24614:-1:ed-1:v1:en:term:2.14
lexicalization
process of making a linguistic unit function as a word
Note 1 to entry: Such a linguistic unit can be a single morph, e.g. “laugh,” a sequence of morphs, e.g. “apple pie” or even a phrase, such as “kick the bucket”, that forms an idiomatic phrase.
source
https://www.iso.org/obp/ui/#iso:std:iso:24614:-1:ed-1:v1:en:term:2.15
manifestation
physical embodiment of a given concept
source
https://www.wikidata.org/wiki/Property:P1557
Inflectional paradigm
A class of words with similar inflection rules.
source
https://en.wikipedia.org/wiki/Inflection#Inflectional_paradigm
transliteration
representation of the graphic characters of a source script by the graphic characters of a target script
source
https://www.iso.org/obp/ui/#iso:std:iso:15919:ed-1:v1:en:term:4.7
transcription
representation of the sounds of a source language by graphic characters associated with a target language
source
https://www.iso.org/obp/ui/#iso:std:iso:15919:ed-1:v1:en:term:4.6
primary data (vocable)
electronic representation of language data
source
https://www.iso.org/obp/ui/#iso:std:iso:24612:ed-1:v1:en:term:2.1
annotation
linguistic information added to primary data
source
https://www.iso.org/obp/ui/#iso:std:iso:24612:ed-1:v1:en:term:2.3
representation
format in which the annotation is rendered, independent of its content
source
https://www.iso.org/obp/ui/#iso:std:iso:24612:ed-1:v1:en:term:2.4
observation
act of observing a property, with the goal of producing an estimate of the value of the property
source
https://www.iso.org/obp/ui/#iso:std:iso:28258:ed-1:v1:en:term:3.19
act of measuring or otherwise determining the value of a property
source
https://www.iso.org/obp/ui/#iso:std:iso:19156:ed-1:v1:en:term:4.11
method of data collection in which the situation of interest is watched and the relevant facts, actions and behaviours are recorded
source
https://www.iso.org/obp/ui/#iso:std:iso:16439:ed-1:v1:en:term:3.41
statement of fact made during an audit or review and substantiated by objective evidence
source
https://www.iso.org/obp/ui/#iso:std:iso:tr:15686:-11:ed-1:v1:en:term:3.1.75
instance of applying a measurement procedure to produce a value for a base measure
source
https://www.iso.org/obp/ui/#iso:std:iso-iec:25040:ed-1:v1:en:term:4.45
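
To make the relationships between a few of these definitions concrete, here is a minimal Python sketch. It only encodes what the quoted definitions say: a lexeme groups a set of forms sharing a meaning, a word is a lexeme with at least a part of speech, and an inflected form carries morphological features. It is not the vocable-centered model proposed below, and every class, field, and function name (InflectedForm, Lexeme, Word, forms_with) is an assumption made for illustration, as is the sample sense text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InflectedForm:
    """Form a word can take in a sentence, with its morphological features."""
    representation: str
    features: frozenset[str] = frozenset()


@dataclass
class Lexeme:
    """Abstract unit associated with a set of forms sharing a common meaning."""
    lemma: str
    forms: list[InflectedForm] = field(default_factory=list)
    senses: list[str] = field(default_factory=list)


@dataclass
class Word(Lexeme):
    """Lexeme that has, as a minimal property, a part of speech."""
    part_of_speech: str = ""

    def __post_init__(self) -> None:
        # Enforce the "minimal property" named in the definition of "word".
        if not self.part_of_speech:
            raise ValueError("a word needs at least a part of speech")


def forms_with(word: Word, feature: str) -> list[str]:
    """Return the written forms of `word` that carry the given feature."""
    return [f.representation for f in word.forms if feature in f.features]


# Illustrative data only: "laugh" is the single-morph example from the
# lexicalization note above; the sense text is invented for this sketch.
laugh = Word(
    lemma="laugh",
    part_of_speech="verb",
    forms=[
        InflectedForm("laugh", frozenset({"present"})),
        InflectedForm("laughs", frozenset({"present", "third person", "singular"})),
        InflectedForm("laughed", frozenset({"past"})),
    ],
    senses=["to express amusement with the voice"],
)
```

Calling forms_with(laugh, "present") then lists the forms tagged with that feature. Modelling Word as a subclass of Lexeme is just one reading of "lexeme that has, as a minimal property, a part of speech"; a flag or an optional field on Lexeme would express the same thing.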


The model

I used the PlantUML class diagram generator to produce this picture, so it may not look as nice, but it's far more flexible. In fact, if someone dared to install the corresponding MediaWiki extension, I could just copy and paste the text format here.

Note that vocable is used here as a mix of the two definitions above whose terms it follows in brackets, namely word form and primary data.

A vocable-centered data model