WordNet lexicographer files
A database is generated only if there are no errors. Input files correspond to the syntactic categories implemented in WordNet: noun, verb, adjective and adverb.
Each input lexicographer file consists of a list of synonym sets (synsets) for one part of speech. Although the basic synset syntax is the same for all parts of speech, some parts of the syntax apply only to a particular part of speech. See wninput(5WN) for a description of the input file format. One or more input files, in any combination of syntactic categories, may be specified.
See lexnames(5WN) for a list of the lexicographer files used to build the complete WordNet database. See wndb(5WN) for a description of the database file formats.
So that's how you find relations between synsets. I guess the pointer symbols in the line for dog were just there to indicate which types of relations I could find for the word dog?
Isn't it redundant? Also, note that I didn't use the lexicographer files at all. I know that in data. As you can see, I could find hypernyms, and many other relations, just by looking at the index and data files; I didn't use any of the so-called lexicographer files.
During WordNet development, synsets are organized into forty-five lexicographer files based on syntactic category and logical groupings.
These groupings are a kind of flat clustering that runs parallel to the hierarchical hypernym-hyponym ontology.
The Lexicographer File Format section describes the syntax for entering a semantic pointer, and Word Syntax describes the syntax for entering a lexical pointer. Although there are many pointer types, only certain types of relations are permitted between synsets of each syntactic category.
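To make the point about reading relations straight from the data files concrete, here is a minimal Python sketch that splits one data-file line into its words and pointers and maps the lexicographer file number back to a file name. The field layout follows the description in wndb(5WN); the function name `parse_data_line`, the two partial lookup tables, and the sample line (including its offsets) are illustrative assumptions, not text copied from a WordNet release.

```python
# Partial lookup tables, for illustration only (see lexnames(5WN) and
# wninput(5WN) for the full lists).
LEX_FILE_NAMES = {3: "noun.Tops", 5: "noun.animal"}
POINTER_NAMES = {"@": "hypernym", "~": "hyponym"}

# Invented sample line in the data-file layout; the offsets are made up.
SAMPLE = (
    "00000002 05 n 02 dog 0 domestic_dog 0 "
    "002 @ 00000001 n 0000 ~ 00000003 n 0000 | a domesticated canine"
)


def parse_data_line(line):
    """Split one data-file line into words, pointers and gloss."""
    fields, _, gloss = line.partition("|")
    tok = fields.split()
    offset, lex_filenum, ss_type = tok[0], int(tok[1]), tok[2]
    w_cnt = int(tok[3], 16)          # word count is hexadecimal
    pos = 4
    words = []
    for _ in range(w_cnt):
        words.append(tok[pos])       # the word itself
        pos += 2                     # skip the lex_id that follows each word
    p_cnt = int(tok[pos])            # pointer count is decimal
    pos += 1
    pointers = []
    for _ in range(p_cnt):
        symbol, target, target_pos, src_tgt = tok[pos:pos + 4]
        pointers.append({
            "symbol": symbol,
            "relation": POINTER_NAMES.get(symbol, "other"),
            "target_offset": target,
            "target_pos": target_pos,
            # "0000" marks a semantic pointer between whole synsets
            "semantic": src_tgt == "0000",
        })
        pos += 4
    return {
        "offset": offset,
        "lex_file": LEX_FILE_NAMES.get(lex_filenum, "unknown"),
        "ss_type": ss_type,
        "words": words,
        "pointers": pointers,
        "gloss": gloss.strip(),
    }


synset = parse_data_line(SAMPLE)
for p in synset["pointers"]:
    print(synset["words"][0], p["relation"], p["target_offset"])
```

The lexicographer file is still visible here, but only as the number in the second field; that is why relations can be followed without opening the lexicographer files themselves.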
Many pointer types are reflexive, meaning that if a synset contains a pointer to another synset, the other synset should contain a corresponding reflexive pointer. Each verb synset contains a list of generic sentence frames illustrating the types of simple sentences in which the verbs in the synset can be used. For some verb senses, example sentences illustrating actual uses of the verb are provided. Whenever there is no example sentence, the generic sentence frames specified by the lexicographer are used.
The generic sentence frames are entered in a synset as a comma-separated list of integer frame numbers. The text of each generic frame, preceded by its frame number, is listed in wninput(5WN). Synsets are entered one per line, and each line is terminated with a newline character.
A line containing a synset may be as long as necessary, but no newlines can be entered within a synset. Within a synset, spaces or tabs may be used to separate entities. In the syntax descriptions, items enclosed in italicized square brackets are optional. The general synset syntax is: { words pointers ( gloss ) }. Synsets of this form are valid for all syntactic categories except verb, and are referred to as basic synsets. At least one word and a gloss are required to form a valid synset.
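As a rough illustration of the rules above, the following Python sketch parses a one-line basic synset of the shape { words pointers ( gloss ) } as described in wninput(5WN). It is a simplified reading, not the grinder's actual parser: it assumes each word is a token ending in a comma, each pointer is a token of the form target,symbol, and it ignores lexical pointers in square brackets, adjective markers and verb frames. The function name and the sample synset are hypothetical.

```python
import re


def parse_basic_synset(line):
    """Parse a simplified basic synset into words, pointers and gloss."""
    m = re.fullmatch(r"\{\s*(.*?)\s*\(\s*(.*?)\s*\)\s*\}", line.strip())
    if m is None:
        raise ValueError("not a one-line basic synset")
    body, gloss = m.groups()
    words, pointers = [], []
    for token in body.split():       # spaces or tabs separate entities
        head, comma, symbol = token.partition(",")
        if not comma:
            raise ValueError(f"unexpected token: {token!r}")
        if symbol:
            pointers.append((head, symbol))   # e.g. ("canine", "@")
        else:
            words.append(head)                # e.g. "dog"
    if not words or not gloss:
        raise ValueError("at least one word and a gloss are required")
    return {"words": words, "pointers": pointers, "gloss": gloss}


# Invented example synset, not taken from a real lexicographer file.
example = "{ dog, domestic_dog, canine,@ ( a domesticated canine ) }"
print(parse_basic_synset(example))
```

The validity check mirrors the stated requirement that a synset needs at least one word and a gloss; anything richer than this simplified shape is rejected or ignored rather than guessed at.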
Pointers entered following all the words in a synset represent semantic relations between all the words in the source and target synsets. Adjectives may be organized into clusters containing one or more head synsets and optional satellite synsets. Each adjective cluster is enclosed in square brackets, and may have one or more parts.
Each part consists of a head synset and optional satellite synsets that are conceptually similar to the head synset's meaning. Parts of a cluster are separated by one or more hyphens (-) on a line by themselves, with the terminating square bracket following the last synset.
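The cluster layout described above can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the cluster is taken to open with a square bracket before the first synset and close with one after the last, each synset sits on its own line, the first synset of each part is treated as the head, and synset contents are kept as opaque strings. The function name `split_cluster` and the sample cluster are invented for illustration; real files also use square brackets inside synsets for lexical pointers, which this sketch does not handle.

```python
import re

# Invented cluster with two parts, each a head followed by satellites.
SAMPLE_CLUSTER = """\
[{ WET, dry,! ( covered with liquid ) }
{ damp, WET,& ( slightly wet ) }
{ soggy, WET,& ( soaked through ) }
----
{ DRY, wet,! ( free from liquid ) }
{ arid, DRY,& ( lacking rainfall ) }]
"""


def split_cluster(text):
    """Split an adjective cluster into parts of head plus satellites."""
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("cluster must be enclosed in square brackets")
    parts, current = [], []
    for line in text[1:-1].splitlines():
        line = line.strip()
        if not line:
            continue
        if re.fullmatch(r"-+", line):   # a line of hyphens ends a part
            parts.append(current)
            current = []
        else:
            current.append(line)
    parts.append(current)
    return [{"head": p[0], "satellites": p[1:]} for p in parts if p]


for part in split_cluster(SAMPLE_CLUSTER):
    print(part["head"], "with", len(part["satellites"]), "satellite(s)")
```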