INTRODUCTION TO THE USAS CATEGORY SYSTEM
Dawn Archer, Andrew Wilson, Paul Rayson
October 2002

SEMANTIC TAGS

The semantic tags show semantic fields which group together word senses that are related by virtue of their being connected at some level of generality with the same mental concept. The groups include not only synonyms and antonyms but also hypernyms and hyponyms. Currently, the lexicon contains nearly 37,000 words and the template list contains over 16,000 multi-word units.

THE LEXICON

Each item has one syntactic tag and one or more semantic tags assigned to it. At present, in cases where a word has more than one syntactic tag, it is duplicated (i.e. each syntactic tag is given a separate entry). The semantic tags for each entry in the lexicon are arranged in approximate rank frequency order, most likely tag first, to assist in manual post-editing, and to allow for gross automatic selection of the common tag, subject to weighting by domain of discourse.

THE MULTI-WORD-UNIT (MWU) LIST

Each template consists of a pattern of words and syntactic tags, some using wildcards to enable tagging with inflectional variants and less strictly defined patterns. The semantic tags for each template are arranged in rank frequency order in the same way as the lexicon. Various types of MWUs are included: phrasal verbs (e.g. stubbed out), noun phrases (e.g. riding boots), proper names (e.g. United States of America), and true idioms (e.g. living the life of Riley).

THE TAGSET

The semantic tags are composed of:
1. an upper case letter indicating general discourse field.
2. a digit indicating a first subdivision of the field.
3. (optionally) a decimal point followed by a further digit to indicate a finer subdivision.
4. (optionally) one or more ‘pluses’ or ‘minuses’ to indicate a positive or negative position on a semantic scale.
5. (optionally) a slash followed by a second tag to indicate clear double membership of categories.
6. (optionally) a left square bracket followed by ‘i’ to indicate a semantic template (multi-word unit).

OTHER SYMBOLS UTILISED

% = rarity marker (1)
@ = rarity marker (2)
f = female
m = male
c = potential antecedents of conceptual anaphors (neutral for number)
n = neuter
i = indicates a semantic idiom

Antonymity of conceptual classifications is indicated by +/- markers on tags. Comparatives and superlatives receive double and triple +/- markers respectively.

Certain words and collocational units show a clear double (and in some instances, triple) membership of categories. Such cases are dealt with using slash tags, that is, all tags are indicated and separated by a slash (e.g. anti-royal = E2-/S7.1+, accountant = I2.1/S2mf, bunker = G3/H1 K5.1/W3, Admiral = G3/M4/S2mf S7.1+/S2mf, dowry = S4/I1/A9-).

The initial tagset was loosely based on Tom McArthur's Longman Lexicon of Contemporary English (McArthur, 1981) as this appeared to offer the most appropriate thesaurus-type classification of word senses for this kind of analysis. We have since considerably revised the tagset in the light of practical tagging problems met in the course of the research. The revised tagset is arranged in a hierarchy with 21 major discourse fields expanding into 232 category labels. The following table shows the 21 labels at the top level of the hierarchy.
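As an aside, the composition rules and the slash-tag examples given above can be sketched as a small parser. This is only an illustration under stated assumptions, not part of the USAS software: the names `TAG_RE`, `Tag`, `parse_tag` and `parse_entry` are hypothetical, the pattern assumes that +/- markers precede the letter and rarity symbols, it tolerates repeated decimal subdivisions (as in A1.1.1), and it reads space-separated groups such as `G3/H1 K5.1/W3` as alternative senses of one entry.

```python
import re
from typing import List, NamedTuple

# Assumed pattern built from the six composition rules and the symbol list;
# it is an illustrative reading, not an official USAS grammar.
TAG_RE = re.compile(
    r"(?P<field>[A-Z])"          # rule 1: upper case letter (discourse field)
    r"(?P<sub>\d+(?:\.\d+)*)"    # rules 2-3: subdivision, optional finer levels
    r"(?P<scale>\++|-+)?"        # rule 4: pluses or minuses on a semantic scale
    r"(?P<marks>[mfcn%@]*)"      # other symbols: gender, anaphor, rarity
    r"(?P<template>\[i\S*)?"     # rule 6: '[i' marks a multi-word unit
)


class Tag(NamedTuple):
    field: str         # e.g. "S"
    subdivision: str   # e.g. "7.1"
    scale: int         # +n for n pluses, -n for n minuses, 0 if unmarked
    markers: str       # e.g. "mf"
    is_template: bool  # True when the tag carries '[i'


def parse_tag(text: str) -> Tag:
    """Parse a single tag such as 'S7.1+' or 'S2mf'."""
    m = TAG_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"not a recognisable tag: {text!r}")
    signs = m["scale"] or ""
    scale = len(signs) if signs.startswith("+") else -len(signs)
    return Tag(m["field"], m["sub"], scale, m["marks"], m["template"] is not None)


def parse_entry(text: str) -> List[List[Tag]]:
    """Split an entry into senses (rule 5: slash-joined tags within each sense)."""
    return [[parse_tag(part) for part in sense.split("/")] for sense in text.split()]


# Example taken from the text: anti-royal = E2-/S7.1+
for tag in parse_entry("E2-/S7.1+")[0]:
    print(tag)
```

The field letter captured by the sketch is one of the 21 top-level labels tabulated next.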
A  general and abstract terms
B  the body and the individual
C  arts and crafts
E  emotion
F  food and farming
G  government and public
H  architecture, housing and the home
I  money and commerce in industry
K  entertainment, sports and games
L  life and living things
M  movement, location, travel and transport
N  numbers and measurement
O  substances, materials, objects and equipment
P  education
Q  language and communication
S  social actions, states and processes
T  time
W  world and environment
X  psychological actions, states and processes
Y  science and technology
Z  names and grammar

The following provides a brief explanation and prototypical examples* of each of the semantic categories (*by ‘prototypical examples’, we mean words that are assigned only the semantic tag under which they have been grouped).

A GENERAL & ABSTRACT TERMS

A1 General
NO ENTRIES. All entries are sub-classified into the following:

A1.1.1 General actions, making etc.
General/abstract terms relating to an activity/action (e.g. act, adventure, approach, arise); a characteristic/feature (e.g. absorb, attacking, automatically); a construction/craft and/or the action of constructing/crafting (e.g. arrange, assemble, bolts, boring, break)
PROTOTYPICAL EXAMPLES: ACTIVITY, ANTICS, BUBBLING, BUSY, CHISEL, COMPLY, CRANK, DISMANTLE, DREDGE, FERMENT
CHIP AWAY, CHOP DOWN, DRILL DOWN, GET ROUND TO

A1.1.2 Damaging and destroying
General/abstract terms depicting damage/destruction/demolition/pollution, etc.
PROTOTYPICAL EXAMPLES: ARMAGEDDON, BLEMISH, BREAKAGES, BULLDOZE, CONTAMINATING, CRACKED, DAMAGE, DESTROY
CLOG UP, CONK OUT, DRY ROT, RIP TO PIECES

A1.2 Suitability
General/abstract terms relating to appropriateness, suitability, aptness, etc.
PROTOTYPICAL EXAMPLES: APPROPRIATE (+), DIGRESS (-), INAPPROPRIATE (-), SUIT (+), RELEVANCE, UNSUITABLE (-)
BESIDE THE POINT (-), MAKE DO (+), WENT OFF AT A TANGENT (-)

A1.3 Caution
General/abstract terms relating to vigilance/care/prudence, or the lack of.
PROTOTYPICAL EXAMPLES: CAREFULLY (+), CAUTIOUSNESS (+), RASH (-), DELICATELY (+), FRUGAL (+), IMPULSIVE (-)
JUST IN CASE (+), LOOK BEFORE YOU LEAP (+), SLAP DASH (-)

A1.4 Chance, luck
General/abstract terms depicting likelihood/probability/providence, or the lack of.
PROTOTYPICAL EXAMPLES: ACCIDENT, CHANCE, COINCIDENTAL, FLUKE (+), FORTUNATE (+)
AT RANDOM, BAD LUCK (-), HARD LUCK (-), TAKE A GAMBLE

A1.5 Use
READY (ONLY ENTRY). All additional entries are sub-classified into the following:

A1.5.1 Using
General/abstract terms denoting use, or the lack of
PROTOTYPICAL EXAMPLES: DISUSE (-), DORMANT (-), UNUSED (-), USE, UTILISATION, UTILISE
IN USE (+), PUT TO USE, USE UP (+)

A1.5.2 Usefulness
General/abstract terms denoting usefulness, or the lack of
PROTOTYPICAL EXAMPLES: APPLICABLE (+), FUNCTIONAL (+), FUTILE (-), INEFFECTIVE (-), POINTLESS (-), USEFUL (+)
CAME IN HANDY (+), IT’S NO GOOD (-), NO POINT (-), NEITHER USE NOR ORNAMENT (-), WORTHWHILE (+)

A1.6 Physical/mental
General/abstract terms denoting (level of) practicality/abstraction
PROTOTYPICAL EXAMPLES: ABSTRACTION, HYPOTHETICAL, METAPHYSICS, NOTIONALLY, PRACTICALITIES, THEORISE
IN THEORY, IVORY TOWER

A1.7 Constraint
General/abstract terms denoting (level of) restriction/autonomy
PROTOTYPICAL EXAMPLES: ABANDON (-), ANARCHISTIC (-), BOUND (+), BRIDLE (+), CAGED (+), CAPTIVITY (+), CHAOTIC (-)
BEND THE RULES (-), BOUND UP (+), CLAMP DOWN (+), FENCE IN (+)

A1.8 Inclusion/Exclusion
General/abstract terms denoting (level of) inclusion/exclusion
PROTOTYPICAL EXAMPLES: EMBED (+), ENCOMPASS (+), INCLUDE (+), EXCEPT (-), ENSHRINED (+), EXCLUDE (-), INVOLVE (+), LUMP (+)
APART FROM (-), ASIDE FROM (-), EN SUITE (+), IN BUILT (+), OTHER THAN (-), TAKE ACCOUNT OF (+)

A1.9 Avoiding
General/abstract terms denoting (level of) avoidance/evasion, etc.
PROTOTYPICAL EXAMPLES: ABEYANCE, AVOIDANCE, COP-OUT, DODGE, EVADE, IDLE, IGNORE, INFRINGE, NEGLECT, REFRAIN, SHIRK
COP OUT, GIVE IT A MISS, GONE OUT THE WINDOW, HELD BACK

A2 Affect
NO ENTRIES. Entries are sub-classified into the following:

A2.1 Affect: Modify, change
General/abstract terms denoting (propensity for) change
PROTOTYPICAL EXAMPLES: ADAPT (+), ADJUSTABLE (+), ALTER (+), AMEND (+), BLENDING (+), CONVERTIBLE (+), DISTORTS (+), INCORRIGIBLE (-), IRREVERSIBLE (-), IRREVOCABLY (-)
BLOW HOT AND COLD (+), BURN YOUR BOATS (-), CHANGE DIRECTION (+), CANCEL EACH OTHER OUT (-), CHANGE TACK (+)

A2.2 Affect: Cause/Connected
General/abstract terms denoting causal relationship, or lack of
PROTOTYPICAL EXAMPLES: ARISING, ATTRIBUTABLE, CAUSE, CONSEQUENCE, DERIVE, DISCONNECTED (-), EFFECT, HINGED
BASED ON, BECAUSE OF, BRING ABOUT, BY VIRTUE OF

A3 Being
General/abstract terms relating to being/existing
PROTOTYPICAL EXAMPLES: CONSIST (+), EXIST (+), NONEXISTENT (-), NOTHINGNESS (-)
AMOUNT TO (+), CAME INTO BEING (+), NON EXISTENT (-)

A4 Classification
NO ENTRIES. All entries are sub-classified into the following:

A4.1 Generally kinds, groups, examples
General/abstract terms denoting types, groups, examples
PROTOTYPICAL EXAMPLES: ARCHETYPE, BRANDED, CATALOGUE, CATEGORY, CLASSED, CLASSIFY, DEFINITIVE, E.G., I.E., KIND
CASE IN POINT, IN THE REALMS OF, OF SORTS, SUBJECT AREA

A4.2 Particular/general; detail
General/abstract terms denoting (level of) generality/detail
PROTOTYPICAL EXAMPLES: CLEAR-CUT (+), DETAILED (+), PARTICULAR (+), GENERALISE (-), GENERIC (-), IN-DEPTH (+), VAGUE (-)
ATTENTION TO DETAIL (+), GET TO THE POINT (+), IN THE BROADEST TERMS (-), IN GENERAL (-), ON A BROAD SCALE (-), THE INS AND OUTS (+)

A5 Evaluation
PROTOTYPICAL EXAMPLES: QUALITATIVE (ONLY ENTRY)
Additional entries are sub-classified into the following:

A5.1 Evaluation: Good/bad
Evaluative terms depicting quality
PROTOTYPICAL EXAMPLES: ABYSMAL (-), ACE (+), ADVERSELY (-), ALRIGHT (+), APPALLING (-), ATROCIOUS (---), CREDITABLE (+)
GOT A LOT GOING FOR IT (+), AT HIS PEAK (+++), A CUT ABOVE (+), BELOW STANDARD (-), DOWN SIDE (-)

A5.2 Evaluation: True/false
Evaluative terms depicting truth
PROTOTYPICAL EXAMPLES: ALLEGEDLY, BUNKUM (-), CANDID (+), DECEIVE (-), DISHONEST (-), FACT (+), FALSE (-), HOAX (-)
A LOAD OF TRIPE (-), GOSPEL TRUTH (+), HAND ON HEART (+), HARD FACTS (+), LOAD OF COBBLERS (-), NON FICTION (+)

A5.3 Evaluation: Accuracy
Evaluative terms depicting accuracy
PROTOTYPICAL EXAMPLES: ACCURACY (+), APPROXIMATE (-), ERROR (-), FAULT (-), MIS-CUES (-), PRECISION (+), PROPER (+)
WORD PERFECT (+), FALSE MOVE (-), MAKE A MISTAKE (-), SLIP UP (-), SPOT ON (+), TO A TEE (+)

A5.4 Evaluation: Authenticity
Evaluative terms depicting authenticity
PROTOTYPICAL EXAMPLES: ARTIFICIAL (-), ASSUMED (-), COPIES (-), FAKE (-), GENUINELY (+), ORIGINAL (+), PUCKER (+), REAL (+)
CLOUD CUCKOO LAND (-), FANTASY WORLD (-), FOR REAL (+), IN ACTUAL FACT (+), MOCK UP (-)

A6 Comparing