From: Stefan Benus
Sent: Friday, January 11, 2008 3:35 am
To: Adamantios Gafos
Subject: [Fwd: Re: electronic database of Hungarian]

--
Štefan Benuš
http://www1.cs.columbia.edu/~sbenus/
Department of English and American Studies
Constantine the Philosopher University
Štefánikova 67
94974 Nitra, Slovakia

----- Original Message -----
From: andras@kornai.com
Date: Sun, 21 Apr 2002 02:28:22 +0000 (GMT)
To: sb513@nyu.edu (Stefan Benus)
Cc: andras@kornai.com
Subject: Re: electronic database of Hungarian

Stefan Benus writes:
>
> Dear Andras,
> I have two quick questions about Szota1r, hope you can find time to
> answer. I wasn't able to get hold of Papp's dictionary yet but I would
> like to know if there is coding for compounds (in nouns mostly) and
> prefixes (in verbs). In short, those cases assumed not to be harmonizing.
> And, what does it mean if Frequency is 0? The work with the database is
> interesting.
> Thanks a lot,
> Stefan

Dear Stefan,

frequency 0 means that in a large frequency count (large at the time,
that is) the word did not appear, 1 means it appeared only once in the
(500k words) count, higher numbers mean increasingly higher adjusted
absolute frequency ranges, see FABSZ in the metadata I sent with it. For
the difference between absolute and adjusted frequency see Juilland and
Chang-Rodrigues' Spanish Frequency Dictionary.
The C code is for compounding, values are:

1  1 stem
2  2 stems
3  3 stems
4  4 stems
5  5 stems
6  preverb + 1 stem
7  preverb + 2 stems
8  preverb + 3 stems
9  none, other

There is also a D code for derived/non-derived status: the values are

0  no derivational suffix
1  derivational suffix at the end of the word
2  derivational suffix inside the word
3  both in the middle and at the end of the word
8  non-Hungarian suffix
9  non-forming suffix

"2" really means the situation when there is a compound XY, and X itself
is formed by a derivational suffix (which thus ends up in the middle of
the word). "9" is what people often call "phantom stems", e.g. in English
there is waiter, writer, carpenter, and there are verbs wait, write but no
verb *carpent.

These codes are not entirely reliable. Whatever test lists you come up
with you should have them vetted by native speakers first. To make full
use of the database, you should absolutely make an effort to obtain at
least the foreword to the Reverse-Alphabetised Dictionary -- if all else
fails, write to the Library of the Institute of Linguistics, Hungarian
Academy of Sciences and I'm sure they can send you a xerox copy. I would
send you one myself, except my own copy is in Hungary.

Best,
Andras
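
For anyone working from this message, the code values described above can be written down as lookup tables. The following is only a rough sketch in Python: the names (C_CODES, D_CODES, describe_frequency, is_candidate_simplex, describe_entry) are hypothetical and not part of the database, and since the message does not give the record layout, the functions take bare integer codes. Frequency values above 1 are not decoded here because the FABSZ ranges are in the metadata, not in this message.

```python
from typing import Dict

# C code: compounding, values as listed in the message.
C_CODES: Dict[int, str] = {
    1: "1 stem",
    2: "2 stems",
    3: "3 stems",
    4: "4 stems",
    5: "5 stems",
    6: "preverb + 1 stem",
    7: "preverb + 2 stems",
    8: "preverb + 3 stems",
    9: "none, other",
}

# D code: derived/non-derived status, values as listed in the message.
D_CODES: Dict[int, str] = {
    0: "no derivational suffix",
    1: "derivational suffix at the end of the word",
    2: "derivational suffix inside the word",
    3: "both in the middle and at the end of the word",
    8: "non-Hungarian suffix",
    9: "non-forming suffix",
}

# Assumption: "compound" is read here as any C code with two or more stems.
COMPOUND_C = frozenset({2, 3, 4, 5, 7, 8})
PREVERB_C = frozenset({6, 7, 8})


def describe_frequency(freq: int) -> str:
    """Gloss a frequency code; ranges above 1 are left to the FABSZ metadata."""
    if freq < 0:
        raise ValueError("frequency code must be non-negative")
    if freq == 0:
        return "not found in the frequency count"
    if freq == 1:
        return "found once in the 500k-word count"
    return "higher adjusted absolute frequency range (see FABSZ)"


def is_candidate_simplex(c: int, d: int) -> bool:
    """True for one stem, no preverb, and no derivational suffix."""
    return c == 1 and d == 0


def describe_entry(c: int, d: int, freq: int) -> Dict[str, object]:
    """Gloss one entry's C, D, and frequency codes."""
    return {
        "compounding": C_CODES.get(c, "unknown C code"),
        "derivation": D_CODES.get(d, "unknown D code"),
        "frequency": describe_frequency(freq),
        "compound": c in COMPOUND_C,
        "preverb": c in PREVERB_C,
    }
```

A filter like is_candidate_simplex(c, d) could be a first pass at setting aside compounds and preverb forms, but since the codes are not entirely reliable, any list it produces would still need vetting by native speakers.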