-region, -social group, -field of discourse, -medium, -attitude
Types of variation-
1 The first two types (region and social group) relate primarily to the language users and are relatively permanent for them
people use a regional variety because they live in a region or have once lived in a region
people use a social variety because of their affiliation with a social group
many people can communicate in more than one regional or social variety and can therefore switch varieties according to the situation. People can move to other regions or change their social affiliations, and may then adopt a new regional or social variety
2 The last three types (field of discourse, medium and attitude) relate to language use. People select the varieties according to the situation and the purpose of the communication
3 the field of discourse relates to the activity in which they are engaged
4 the medium may be spoken or written
5 the attitude expressed through language is conditioned by the relationship of the participants in the particular situation.
A common core or nucleus-is present in all the varieties, so that however esoteric a variety may be, it has running through it a set of lexico-grammatical characteristics that are present in all the others. It is that fact that justifies the application of the name “English” or of the name “Ukrainian” to all the varieties.
Dialects-Varieties according to region have a well-established label. Geographical dispersion is in fact the classic basis for linguistic variation, and in the course of time such dispersion may result in dialects becoming so distinct that we regard them as different languages. This stage was long ago reached with the Germanic dialects that are now Dutch, English, German, Swedish, etc., but it has not been reached with the dialects of English that have resulted from the regional separation of communities within the British Isles and elsewhere in the world.
Ukrainian dialects-
1. Pivnichne Narichchia (Northern Dialect). Spoken in Tchernigivsky, Zhytomyrsky, Rivnensky and Volynsky regions and in the northern part of Kyiv and Sumy regions. This dialect is close to the Ukrainian-Belarusian patois ['pætwɑː] and is subdivided into three patois: shidnopolisky govir (eastern woodlands patois), seredniopolisky govir (middle woodlands patois) and zakhidnopolisky govir (western woodlands patois).
2. Pivdenno-Skhidne Narichchia (South-Eastern Dialect). Spoken on the territory of Poltava, Kharkiv, Lugansk, Donetsk, Kherson regions, Crimea, south-eastern parts of Sumy, Kyiv, Kirovograd and Odesa regions, west of Cherkasy region and south of Mykolayiv region. The Middle Dnieper region patois, which forms the basis of the Ukrainian literary language, also belongs to this group. The dialect is also spoken by Ukrainian settlers in Kuban, Krasnodar, Stavropol and Povolzhye regions of the Russian Federation, the Far East, Siberia, Kazakhstan and Kyrgyzstan.
3. Pivdenno-Zakhidne Narichchia (South-Western Dialect). Spoken in Zakarpatsky, Ivano-Frankivsk, Lviv, Tchernivtsi, Khmelnytsk, Vinnytsia, Ternopil regions, the north-western part of Kirovograd and Odesa regions, south-western Kyiv region, northern Zhytomyr, western Tcherkasy and northern Mykolayiv regions. This dialect is also widespread in the bordering territories of Moldova, Romania, Hungary, the Slovak Republic and Poland.
Separate patois are spoken among the diaspora in Britain, Canada, the USA and other countries. They comprise numerous stylistically and grammatically different patois, which vary greatly within the group. Diaspora dialects are contrasted with the classical Ukrainian language, although their speakers consider them the truest form of Ukrainian.
Dialects in Britain
Dialects in US
Regional dialects-Craig M. Carver shows about two dozen dialect regions in the US, based mainly on vocabulary, in his American Regional Dialects. Peter Trudgill, in his Dialects of England, shows sixteen modern dialect regions in England, based on grammar, vocabulary, and accent (there are more in Wales, Scotland and Ireland).
Regional variations-Regional variation seems to be realized predominantly in phonology. We generally recognize a different dialect from a speaker’s pronunciation or accent before we notice that the vocabulary or LEXICON is also distinctive.
Social variation-is variation in speech according to educational and social status (sometimes - age and sex differences). There is an important polarity between uneducated and educated speech. Educated language naturally tends to be given the additional prestige of government agencies, the professions, the political parties, the press, the law court, and the pulpit – any institution which has to address itself to a public. It is codified in dictionaries and grammars and is taught at all levels of schooling. It is almost exclusively the language of printed matter. It comes to be referred to as standard English or literary Ukrainian.
varieties according to the field of discourse-The field of discourse is the type of activity engaged in through language. A speaker of a language has a repertoire of varieties according to field and switches to the appropriate one as occasion requires. The number of varieties that speakers command depends upon their profession, training and interests. Typically the switch involves turning to the particular set of lexical items habitually used for handling the field in question.
varieties according to medium-those conditioned by speaking and writing respectively. Since speech is the primary or natural medium for linguistic communication, it is reasonable to focus on the differences imposed on language when it has to be expressed in a graphic medium instead. As with varieties according to field, we are dealing here with two varieties that are in principle at the disposal of any user of a language as occasion may demand, irrespective of the variety of language they use as a result of region and education.
constraints: some field varieties are difficult to compose except in writing (legal statutes especially); other varieties are restricted to speech (a radio commentary on a football match will be phrased very differently from a newspaper report of the same game)
Varieties according to attitude-constitute, like field and medium varieties, a range of a language any section of which is in principle available at will to any individual speaker of a language, irrespective of the regional variant or national standard he may habitually use. This class of varieties is often called “stylistic”, but this term is used with several different meanings. In lexicology we are concerned with the choice of words that proceeds from our attitude to the hearer or reader, to the topic and to the purpose of our communication. We recognize a gradient in attitude between FORMAL (relatively stiff, cold, polite, impersonal) on the one hand and INFORMAL (relatively relaxed, warm, rude, friendly) on the other. Many sentences can be rated ‘more formal’ or ‘more informal’ in relation to each other, but it is useful to acknowledge an unmarked variety of English or Ukrainian, bearing no obvious colouring that has been induced by attitude. On each side of this NEUTRAL language we may usually distinguish words that are markedly formal or informal. But we must account also for the intimate, casual or hearty (often slangy) language used between very close friends, especially of a similar age, or members of a family.
stylistic differentiation of the vocabulary-If we follow the stylistic differentiation of the English vocabulary suggested by I. R. Galperin, we should distinguish:
1. Stylistically neutral words.
2. Literary-bookish words with the following subdivisions: -technical vocabulary, -barbarisms, -poetical words, -archaisms, -literary neologisms
archaism-is the deliberate use of an older form that has fallen out of current use. Archaisms are most frequently encountered in poetry, law and ritual writing and speech. Their deliberate use can be subdivided into literary archaisms, which seek to evoke the style of older speech and writing, and lexical archaisms, the use of words no longer in common use.
Archaisms are kept alive by ritual and literary uses and by the study of older literature. Should they remain recognised, they can be revived, as the word anent (‘regarding’) was in this past century. In English one indicator of a deliberately archaic style is the use of the second person singular pronoun thou and its related case and verb forms.
Ironically, the word thou fell out of English speech because it was thought abruptly colloquial, like French tu. Thou is now seen in current English usage only in literature that deliberately seeks to evoke an older style, though there are also some still-read works that use thou, especially religious texts
The word ye and its related forms are also indicative of archaism; in spoken English, however, it might be hard to tell the difference, especially if the speaker has an accent that seems strange to the listener.
Stylistically neutral layer-consists of words mostly of native origin and also comprises fully assimilated borrowings.
Such words are devoid of any emotive colouring and are used in their denotative meaning, e.g. table, street, sky, go, speak, long, easy, never, often, etc.
They are not fixed to any style: they can be used, and can dominate, in texts of any style.
They can name concrete objects, phenomena, abstract notions, features of objects and actions.
In groups of synonyms neutral words fulfil the function of the synonymic dominant.
Neutral words constitute the basis of both the English and the Ukrainian vocabulary.
Stylistically marked layer-
Literary-bookish words (“learned” words):
belong to the formal style, to the formal category of communication;
are more stable due to the traditions of the written type of speech;
are used in descriptive passages of fiction, scientific texts, radio and television announcements, official talks and documents, business correspondence, etc.;
mark the text as belonging to this or that style of written speech, but when used in colloquial speech or in informal situations they may create a comical effect;
are mostly of foreign origin and have polymorphemic structure, e.g. solitude, fascination, cordial, paternal, divergent, commence, assist, comprise, endeavor, exclude, heterogeneous, miscellaneous, hereby, thereby, herewith, wherein, etc.;
are not stylistically homogeneous. Besides general-literary (bookish) words, e.g. harmony, calamity, alacrity, etc., we may single out various specific subgroups, namely:
1) terms or scientific words, e.g. renaissance, genocide, teletype, etc.;
2) poetic words and archaisms, e.g. whilome – ‘formerly’, aught – ‘anything’, ere – ‘before’, albeit – ‘although’, fare – ‘walk’, tarry – ‘remain’, nay – ‘no’, etc.;
3) barbarisms and foreign words, e.g. bon mot – ‘a clever or witty saying’, apropos [ˌaprə'pəʊ, 'aprəpəʊ] – ‘with reference to; concerning’, faux pas [fəʊ 'pɑː] – ‘an embarrassing or tactless act or remark in a social situation’, etc.;
4) neologisms, e.g. teledish – ‘a dish-shaped aerial for receiving satellite television transmissions’, roam-a-phone – ‘a portable telephone’ (now – mobile phone), graviphoton – ‘a hypothetical particle’, etc.
Barbarisms-are words of foreign origin which have not been entirely assimilated into the English language. They bear the appearance of a borrowing and are felt as something alien to the native tongue, yet they have already become facts of the English language: they are, as it were, part and parcel of the English word-stock, though they remain on the outskirts of the literary vocabulary.
The word barbarism was originally used by the Greeks for foreign terms used in their language; it is etymologically rooted in barbaros, the babbling outsider unable to speak Greek.
Most barbarisms have corresponding English synonyms, e.g. chic [ʃiːk] – ‘stylish’; bon mot [bɒn 'məʊ] – ‘a clever witty saying’; en passant [ɒn pæˈsɑːnt; French ɑ̃ pasɑ̃] – ‘in passing’; ad infinitum – ‘to infinity’, and many other words and phrases.
It is very important for purely stylistic purposes to distinguish between barbarisms and foreign words proper. Foreign words, though used for certain stylistic purposes, do not belong to the English vocabulary. They are not registered by English dictionaries, except in a kind of addenda which gives the meanings of the foreign words most frequently used in literary English. Barbarisms are generally given in the body of the dictionary.
There are foreign words in the English vocabulary which fulfil a terminological function. Therefore, though they still retain their foreign appearance, they should not be regarded as barbarisms. Such words as solo, tenor, concerto, blitzkrieg (the blitz), luftwaffe and the like should also be distinguished from barbarisms. They differ not only in their functions but in their nature as well: they are terms.
Terminological borrowings have no synonyms; barbarisms, on the contrary, may have almost exact synonyms.
Some foreign words and phrases which were once used in literary English to express a concept non-existent in English reality have entered the class of barbarisms, and many of them have gradually lost their foreign peculiarities, become more or less naturalized and merged with the native English stock of words. Conscious, retrograde (directed or moving backwards), spurious (false or fake) and strenuous (requiring or using great effort or exertion) are words in Ben Jonson's play which were made fun of in the author's time as unnecessary borrowings from the French.
With the passing of time they have become common English literary words. They no longer raise objections on the part of English purists. The same can be said of the words scientific, methodical, penetrate, function, figurative, obscure, and many others, which were once barbarisms, but which are now lawful members of the common literary word-stock of the language.
Neologisms- newly coined lexical units or existing lexical units that acquire a new sense.
Neologism is any word which is formed according to the productive structural patterns or borrowed from another language and felt by the speakers as something new.
Examples: tape-recorder, supermarket, V-day (Victory day). Soviet space exploration gave rise to words that were new at the time: Sputnik, spaceship, space rocket.
Neologisms may be divided into: 1) Root words: Ex: jeep – a small light motor vehicle, zebra – street crossing place, etc.; 2) Derived words: Ex: collaborationist – one who, in occupied territory, works helpfully with the enemy, to accessorize – to provide with dress accessories; 3) Compounds: Ex: air-drop, microfilm-reader.
New words are as a rule monosemantic. Terms used in various fields of science and technology make up the greater part of neologisms. New words belong only to the notional parts of speech: nouns, verbs, adjectives, etc.
colloquial words-are characteristic of the informal style of spoken English. Colloquialisms are common sayings that people use in everyday speech, and some are very old expressions. They are expressions appropriate to informal, conversational occasions. For example, I felt “down in the dumps” is a colloquialism for feeling depressed or miserable. The etymology of the term “colloquialism” can be traced to the Latin word “colloqui”, which in turn is derived from “com” meaning “with” and “loqui” meaning “to speak”.
The term is used to refer to language that is normally used in casual conversation.
Authors and playwrights often use colloquial language, so colloquialisms are common in novels and plays, where they give an impression of actual or genuine talk.
Generally, colloquialisms are specific to a geographical region. They are used in “everyday” conversation and, increasingly, through informal online interactions.
An example of the regional specificity of colloquialisms is the term used when referring to “soft drinks”. In the Upper Midwestern United States and Canada, soft drinks are called “pop”, whilst in other areas, notably the Northeastern and far Western United States, they are referred to as “soda”. In some areas of Scotland, the term “ginger” is used.
Words that have a formal meaning can also have a colloquial meaning. For example, “kid” can mean “young goat” in formal usage and “child” in colloquial usage.
An example of a colloquialism and how it migrates to other areas is the Indian phrase, "Please do the needful", meaning, "Please do what is implied and/or expected". As the global workplace expands, this once regional phrase is now being used outside the area in which it originated.
One should distinguish between: literary colloquial words (which are used in everyday conversations by both educated and non-educated people) and non-literary colloquialisms (which include slang, jargonisms, professionalisms and vulgarisms)
slang- refers to informal (and often transient) lexical items used by a specific social group, for instance teenagers, soldiers, prisoners, or surfers. As a rule, their meanings are based on metaphor and often have ironic colouring, e.g. attic (“head”), beans (“money”), saucers (“eyes”), etc.
Such words are easily understood by all native speakers, if they are not specific for any social or professional group
Slang is not considered the same as colloquial speech, which is informal, relaxed speech used on occasion by any speaker. Slangisms are often used in colloquial speech, but not all colloquialisms are slangisms.
One method of distinguishing between a slangism and a colloquialism is to ask whether most native speakers know the word (and use it); if they do, it is a colloquialism.
Slang functions in two ways: 1) the creation of new language and new usage by a process of creative informal use and adaptation, 2) the creation of a secret language understood only by those within a group intended to understand it.
Slang is a type of sociolect aimed at excluding certain people from the conversation. Slang initially functions as encryption, so that the non-initiate cannot understand the conversation, or as a further way to communicate with those who understand it. Slang functions as a way to recognize members of the same group, and to differentiate that group from the society at large. Slang terms are often particular to a certain subculture, such as musicians, skateboarders, and drug users.
Jargon-words or phrases used by people in a particular job or group that can be difficult for others to understand. Jargon words are usually motivated and, like slang words, have metaphoric character, e.g. bird (“spacecraft”) /astronauts’ jargon/; to grab (“to make an impression on smb.”) /newspaper jargon/; grass, tea, weed (“narcotic”) /drug addicts’ jargon/, etc. Words such as “backup”, “chatroom” and “browser” are computer jargon. Jargon is often referred to as “technical language”. It makes communication quicker and easier among members of a group who understand it.
ecobabble – using the technical language of ecology to make the user seem ecologically aware
Eurobabble – the jargon of European community documents and regulations
gobbledygook – incomprehensible or pompous jargon of specialists
psychobabble – using language loaded with psychological terminology
technobabble - technical jargon from computing and other high-tech subjects
Vulgarism-Most dictionaries offer "obscene word or language" as a definition for vulgarism, but others have insisted that a vulgarism in English usage is different from obscenity or profanity, cultural concepts which connote offenses against the community.
The word derives from Latin vulgus, the "common folk", and has carried into English its original connotations linking it with the low and coarse motivations that were supposed to be natural to the commons, who were not moved by higher motives like fame for posterity and honor among peers, motives that were alleged to move the literate classes. Thus the concept of vulgarism carries cultural freight from the outset, and from some social perspectives it does not genuinely exist, or ought not to exist.
One kind of vulgarism, defined by the OED as "a colloquialism of a low or unrefined character," substitutes a coarse word where the context might lead the reader to expect a more refined expression: "the tits on Botticelli's Venus" is a vulgarism.
Lexicography is divided into two related disciplines:
Practical lexicography is the art or craft of compiling, writing and editing dictionaries.
Theoretical lexicography is the scholarly discipline of analyzing and describing the semantic, syntagmatic and paradigmatic relationships within the lexicon (vocabulary) of a language, developing theories of dictionary components and structures linking the data in dictionaries, the needs for information by users in specific types of situation, and how users may best access the data incorporated in printed and electronic dictionaries. This is sometimes referred to as 'metalexicography'.
a lexicographer-someone who writes or contributes to a dictionary or dictionaries. Lexicographers:
can be regarded as descriptive linguists: they empirically analyze and describe language with a traditional emphasis on individual items of vocabulary;
do not require linguistic knowledge alone, but according to the particular dictionary project may draw on other non-linguistic disciplines including…;
make knowledge about language available to various sectors of the wider public;
mediate between different kinds of language knowledge and different kinds of user needs.
The first step for a lexicographer: to analyze and compare various uses of words which have been registered in texts or stretches of speech in order to arrive at the systemic description of the word’s semantic structure, its different meanings
Lexicography and lexicology-have a common object of study, for they describe the vocabulary of a language.
The essential difference between them lies in the degree of systematization and completeness: 1) Lexicology aims at systematization, revealing characteristic features of words. 2) The field of lexicography is the semantic, formal, and functional description of all individual words. Dictionaries aim at a more or less complete description.
Lexicology is necessary and indispensable in so far as there is lexicography: we need theory as long as there is a tangible application of it. Lexicography is applied lexicology.
Subject-matter of lexicography: history of lexicography, dictionary criticism, use of dictionaries, compilation and structure.
Postulates of lexicography-1) lexicography is concerned with the description and explanation of the vocabulary of the language; 2) the basic unit of dictionary-making is the linguistic unit; 3) dictionaries may describe the whole vocabulary or concentrate on one or more aspects; 4) dictionary-making has to develop a metalanguage for handling information; 5) all dictionaries are motivated by the needs of the language-user whom they serve
Metalanguage - A set of symbols used in talking about language or describing natural languages, a universal semiotic code used for handling and presenting linguistic information in the entries of the dictionary
Dictionary: definition-A generic name for a kind of reference book listing words of a language.
The term dates back to the 16th century (from Lat. dictionarium – a collection of dictions, i.e. sayings or words; a medieval book containing lists of words and phrases, however organized).
A dictionary:
is regarded as the prototypical work of reference;
classifies and stores information in print or electronic form and has an access system or systems designed to allow users to retrieve the information in full or in part as readily as possible;
contains information that is essentially linguistic and may include material on the form, meaning, use, origin, and history of words and other lexical items; phonetic and grammatical information is word-related and thus essentially lexical.
Put very simply, a dictionary is a book or bank about words.
Linguistic dictionaries-linguistic or lexical information may be distinguished from extralinguistic or encyclopedic information.
The subject-matter of linguistic dictionaries is lexical units and their linguistic properties:
list of words, with definition, pronunciation, etymology , grammar as well as semantic and pragmatic characteristics or with their equivalents in another language (languages)
Linguistic dictionaries-can be divided into different categories by different criteria (Ladislav Zgusta, “Manual of Lexicography”).
The first criterion:
1) diachronic dictionaries – are primarily concerned with the history of a language and the development of words in the course of time
2) synchronic dictionaries – deal with language vocabulary at one stage of its development
Among diachronic dictionaries two types are distinguished: historical dictionaries (that register the changes that occur in the form and meaning of a word) and etymological dictionaries (that concentrate their attention on the origin of a word).
The second criterion: language coverage-
1) general dictionaries – represent the vocabulary as a whole with a degree of completeness depending on the scope of the book. These include all kinds of unabridged, semi-abridged and abridged dictionaries, depending on the number of items. The Oxford Dictionary is one of the largest dictionaries of this type.
2) restricted dictionaries – confined to a given type or variety of words, e.g. dictionaries of dialects, synonyms, idioms etc.
The third criterion: number of languages-monolingual (unilingual) dictionaries – the lexicon is described and defined by means of the same language; bilingual (interlingual) dictionaries; multilingual dictionaries
The fourth criterion: user orientation-The learner’s dictionary – a general synchronic monolingual dictionary, e.g. Advanced Learner’s Dictionary of Current English by A. S. Hornby
User-friendliness-
1) careful control over the language of definitions
2) the provision of information on the grammar of words
3) greater attention to lexical collocations (information about environments in which words tend to appear most frequently)
4) the development of strategies for aiding appropriate word choice (through usage notes, synonym sets or information about pragmatics)
Encyclopaedic dictionaries-(the biggest ones are called encyclopaedias) provide information about the extralingual world, dealing with concepts of a designative character (terms, events in history, names etc.). They deal with notions rather than words, covering the given conceptual area.
The Encyclopaedia Britannica (24 vols.)
The Encyclopaedia Americana (30 vols.)
Collier’s Encyclopaedia (24 vols.) intended for students and school teachers
Chambers’s Encyclopaedia (15 vols.) – a family type reference book
World Book Encyclopaedia for primary and secondary school
Everyman’s Encyclopaedia (12 vols.) for all-round use
Subject dictionaries of academic reference-take a middle position between dictionaries proper and encyclopaedias: on the one hand, they cover particular disciplines, and their entries deal with theories, issues and methods within those disciplines; on the other hand, they provide a whole range of terms introducing the reader to the vocabulary of the specialized field in question, e.g. R. Fowler, “A Dictionary of Modern Critical Terms”
Thesaurus-Ideographic dictionaries are designed for English-speaking writers, orators or translators seeking to express their ideas adequately. The Latin word thesaurus means treasury.
For dictionaries in which words and their definitions belong to the same language, the terms unilingual, monolingual and explanatory are used. These dictionaries provide information on all aspects of the lexical units entered: graphical, phonetic, grammatical, semantic, stylistic, etymological, etc.
Other types of dictionaries-Two languages are represented in bilingual or translation dictionaries. The aim of a translation dictionary is to help in translating from one language into another. Phraseological dictionaries accumulate vast collections of idiomatic or colloquial phrases, proverbs and other, usually image-bearing, word-groups with profuse illustrations. Dictionaries of slang contain elements from areas of substandard speech such as vulgarisms, jargonisms, taboo words, curse-words and colloquialisms.
Dictionaries of word-frequency-inform the user as to the frequency of occurrence of lexical units in speech or, to be more exact, in the corpus of the reading matter or in the stretch of oral speech on which the word-counts are based. Pronouncing dictionaries record contemporary pronunciation. They indicate variant pronunciations (which are numerous in some cases), as well as the pronunciation of different grammatical forms.
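The word-counts behind a frequency dictionary can be illustrated with a minimal Python sketch. The function name word_frequencies, the simple tokenizing rule and the two sample sentences are assumptions made for illustration only; real frequency dictionaries rely on far larger corpora and more careful tokenization and lemmatization.

```python
import re
from collections import Counter

def word_frequencies(texts):
    """Count how often each word form occurs across a list of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and treat runs of letters/apostrophes as words.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts

# A tiny invented "corpus of reading matter" (illustrative only).
sample_corpus = [
    "The match was a good match.",
    "A good report of the match appeared in the press.",
]

freq = word_frequencies(sample_corpus)
# List the word forms from most to least frequent.
for word, count in freq.most_common(4):
    print(word, count)
```

Note that the counts describe only the texts that were counted, which is why such a dictionary has to state the corpus on which its word-counts are based.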
Corpus linguistics-The study of language based on examples of "real life" language use stored in corpora, i.e. computerized databases created for linguistic research.
Corpus-A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis; a collection of linguistic data (usually contained in a computer database) used for research, scholarship, and teaching.
"Although the term corpus linguistics was apparently not in use until the 1980s, it is generally agreed that this sub-discipline of linguistics has been in existence longer - at least since the early 1960s" (Change in Contemporary English: A Grammatical Study, 2012).
Corpus (from Latin corpus – body)
The first systematically organized computer corpus-was the Brown University Standard Corpus of Present-Day American English (commonly known as the Brown Corpus), compiled in the 1960s by linguists Henry Kučera and W. Nelson Francis.
Characteristic features of corpora-collection of texts; naturally occurring/authentic texts; representative of a given language; collected according to specific criteria; stored in machine-readable format; used for linguistic analysis
Communication acts-Corpus linguistics assumes that language is a social phenomenon, to be observed and described above all in accessible empirical data (communication acts). Corpora are cross-sections through a universe of discourse which incorporates virtually all communication acts of any selected language community, be it monolingual (e.g., German or English), bilingual (e.g., English, Welsh) or multilingual (e.g., Western European).
When looking at language as a social phenomenon, we assume that meaning is expressed in texts.
The language community sets the conventions on the formal correctness of sentences and on their meaning. Those conventions are both implicit and dynamic; they are not engraved in stone like commandments.
Language as a social phenomenon manifests itself only in texts that can be observed, recorded, described and analyzed.
Most texts happen to be communication acts, that is, interactions between members of a language community.
An ideal universe of discourse-would be the sum of all communication acts ever uttered by members of a language community. Therefore, it has an inherent diachronic dimension and sets the conventions on the formal correctness of utterances.
Conventions and modifications-Any communication act may utilize syntactic structures in a new way, create new collocations, introduce new words or redefine existing ones. If those modifications are used in a sufficient number of other communication acts or texts, they may well result in the modification or amendment of an existing convention.
Corpus linguistics aims to reveal the conventions of a certain language community and their modifications on the basis of a relevant corpus.
In a corpus, words are embedded in their context. Corpus linguistics is, therefore, especially suited to describing gradual changes in meaning: it is the context which determines the concrete meaning in most areas of the vocabulary.
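The idea that context determines meaning is usually explored through a concordance, i.e. a listing of every occurrence of a word with a few words on either side (often called KWIC, "key word in context"). The following is only a minimal sketch: the function name `kwic`, the toy sentences and the three-word window are assumptions made for illustration, not part of any particular corpus tool.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    # Very rough tokenisation: lower-case alphabetic words only (an assumption).
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Invented toy "corpus": the same word form in two different contexts.
sample = ("She opened an account at the bank. "
          "They sat on the bank of the river.")

for left, kw, right in kwic(sample, "bank"):
    print(f"{left:>20} | {kw} | {right}")
```

Reading the two lines side by side shows how the neighbouring words ("account", "river") separate the two senses of the same form.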
Manageable corpus - The ideal universe of discourse would be far too large for linguistics to explore it in its entirety. It would have to be broken down into cross-sections with regard to the phenomena that we want to describe. There is no such thing as a ‘one-size-fits-all’ corpus. It is the responsibility of the linguist to limit the scope of the universe of discourse in such a way that it may be reduced to a manageable corpus, by means of parameters such as language (sociolect, terminology, jargon), time, region, situation, external and internal textual characteristics etc.
Notable English language corpora - include the following: the British National Corpus (BNC), the American National Corpus (ANC), the Corpus of Contemporary American English (COCA), the International Corpus of English (ICE)
BNC - a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of British English from the later part of the 20th century. The latest edition is the BNC XML Edition.
What sort of corpus the BNC is - Monolingual: deals with modern British English, not other languages used in Britain. However, non-British English and foreign language words do occur in the corpus. Synchronic: covers British English of the late 20th century, rather than the historical development which produced it. General: includes different styles and varieties, and is not limited to any particular subject field, genre or register; contains examples of both spoken and written language; avoids over-representing idiosyncratic texts.
The written part of the BNC - 90% of the corpus, includes extracts from regional and national newspapers, specialist periodicals and journals for all ages and interests, academic books and popular fiction, published and unpublished letters and memoranda, school and university essays, among many other kinds of text.
Sample: For written sources, samples of 45,000 words are taken from various parts of single-author texts. Shorter texts up to a maximum of 45,000 words, or multi-author texts such as magazines and newspapers, are included in full.
The spoken part of the BNC - 10% of the corpus, consists of orthographic transcriptions of unscripted informal conversations (recorded by volunteers selected from different ages, regions and social classes in a demographically balanced way) and spoken language collected in different contexts, ranging from formal business or government meetings to radio shows and phone-ins.
BNC - work on building the corpus began in 1991;
the second edition: BNC World (2001);
the third edition: BNC XML Edition (2007);
two sub-corpora with material from the BNC have been released separately:
the BNC Sampler (a general collection of one million written words, one million spoken);
the BNC Baby (four one-million word samples from four different genres).
The Open American National Corpus (OANC) - a massive electronic collection of American English, including texts of all genres and transcripts of spoken data produced from 1990 onward. All data and annotations are fully open and unrestricted for any use.
Available Data and Annotations
OANC: 15 million words of contemporary American English with automatically-produced annotations for a variety of linguistic phenomena
The Corpus of Contemporary American English (COCA) - is the largest freely available corpus of American English. The corpus was created by Mark Davies and it is used by tens of thousands of users every month (linguists, teachers, translators, and other researchers). COCA is also related to other large corpora.
COCA - The corpus contains more than 450 million words of text and is equally divided among spoken, fiction, popular magazines, newspapers, and academic texts. It includes 20 million words for each year from 1990 to 2012, and the corpus is also updated regularly.
Because of its design, it is perhaps the only corpus of English that is suitable for looking at current ongoing changes in the language.
You can easily carry out semantically-based queries of the corpus. For example, you can contrast and compare the collocates of two related words (little/small, democrats/republicans, men/women), to determine the difference in meaning or use between these words.
You can find the frequency and distribution of synonyms for nearly 60,000 words and compare their frequency in different genres, and also use these word lists as part of other queries. Finally, you can easily create your own lists of semantically-related words, and then use them directly as part of the query.
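The comparison of collocates for two near-synonyms can be imitated on a very small scale. The sketch below does not use the COCA interface or any real COCA data: the function `collocates`, the two-word window and the toy text are all assumptions made for illustration. It counts the words occurring near "little" and near "small" and then lists the collocates that appear with only one of them.

```python
from collections import Counter
import re

def collocates(text, node, window=2):
    """Count words found within `window` tokens on either side of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            start, end = max(0, i - window), i + window + 1
            for j in range(start, min(end, len(tokens))):
                if j != i:  # skip the node word itself
                    counts[tokens[j]] += 1
    return counts

# Invented toy text, not taken from any corpus.
toy = ("a little while later a small town appeared "
       "wait a little while for the small town bus "
       "a little bit of luck in a small business")

little = collocates(toy, "little")
small = collocates(toy, "small")

# Collocates that occur with only one of the two words.
print("only with 'little':", sorted(set(little) - set(small)))
print("only with 'small': ", sorted(set(small) - set(little)))
```

On real corpus data the raw counts would normally be replaced by an association measure, since very frequent words such as "a" co-occur with almost everything.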
The International Corpus of English (ICE) - began in 1990 with the primary aim of collecting material for comparative studies of English worldwide. Twenty-six research teams around the world are preparing electronic corpora of their own national or regional variety of English. Each ICE corpus consists of one million words of spoken and written English produced after 1989. For most participating countries, the ICE project is stimulating the first systematic investigation of the national variety.
Corpus lexicography - The process of compiling or revising a dictionary based on texts (of written and/or spoken language) collected in an electronic format (i.e., corpora).
British linguist John Sinclair (1933-2007), founder of the COBUILD project at the University of Birmingham, oversaw the production of the first strictly corpus-based dictionary, Collins COBUILD English Language Dictionary (1987).
COBUILD English Language Dictionary (1987) - a learner’s dictionary centered around a word’s use in context, created from an analysis of an evolving English textual corpus (the Bank of English, on which current editions of the COBUILD dictionary are based, was officially launched in 1991 and now includes 524 million words). This corpus evidence allows lexicographers to include frequency information as part of a word’s entry (helping learners concentrate on common words) and also to include sentences from the corpus that demonstrate a word’s common collocations, that is, the words and phrases that it frequently appears with.
The words in the COBUILD came from books, magazines, newspapers, pamphlets, leaflets, conversations, radio and television broadcasts.
The aim: to provide a fair representation of contemporary English.
Computer tools are applied, which affects the choice of words for the dictionary and the order of meanings in entries.
Longman Dictionary (LDOCE) - The corpora that make up the Longman Corpus Network made it possible to establish frequencies of usage and the most common constructions of words. All the definitions in the dictionary (1995) are presented in frequency order, with the most common meanings first.
Cambridge dictionaries - Behind CIDE is a corpus called International Cambridge Survey (covers instances of words within one hundred million items representing major varieties of English).
Innovation: how it solves the problem of polysemy.
Each entry represents one sense of the word, indicating the seme which makes the basis of this sense.
Parallel corpus - a corpus that contains a collection of original texts in language L1 and their translations into a set of languages L2 ... Ln. In most cases, parallel corpora contain data from only two languages. Closely related to parallel corpora are comparable corpora, which consist of texts from two or more languages which are similar in genre, topic, register etc. without, however, containing the same content.
Compilation of parallel corpora - The texts of a corpus are chosen according to specific criteria which depend on the purpose for which it is created. In particular, compilers have to decide whether to include a static or dynamic collection of texts, and entire texts or text samples. Questions of authorship, size, topic, genre, medium and style have to be considered as well. In any case, a corpus is intended to comply with the following requirements: (i) it should contain authentic (naturally occurring) language data; (ii) it should be representative, i.e. it should contain data from different types of discourse.
Alignment of a parallel corpus - In order to use a parallel corpus properly it is necessary to align the source text and its translation(s):
one has to identify the pairs or sets of sentences, phrases and words in the original text and their correspondences in the other languages.
Parallel text alignment is important because during the translation process sentences might be split, merged, deleted, inserted or reordered by the translator in order to create a natural translation in the target language.
In order to compare the original text and its translation(s), it is necessary to (re-)establish the correspondences between the texts. In the process of alignment, anchor points such as proper names, numbers, quotation marks etc. are often used as points of orientation. The degree of correspondence between the texts of a parallel corpus varies depending on the text type. For example, a fictional text may allow the translator greater freedom than a legal one.
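The use of anchor points can be sketched in a few lines. This is a deliberately naive illustration, not how production aligners work (those typically combine sentence length, position and bilingual lexicons): the functions `anchors` and `align`, the rule "numbers plus capitalised words that are not sentence-initial", and the English and German toy sentences are all assumptions invented for the example.

```python
import re

def anchors(sentence):
    """Collect candidate anchor points: numbers and non-initial capitalised words."""
    numbers = set(re.findall(r"\d+", sentence))
    words = sentence.split()
    names = {w.strip(".,;:!?\"'") for w in words[1:] if w[:1].isupper()}
    return numbers | names

def align(source, target):
    """Pair each source sentence with the target sentence sharing most anchors."""
    pairs = []
    for i, src in enumerate(source):
        src_anchors = anchors(src)
        scores = [len(src_anchors & anchors(tgt)) for tgt in target]
        best = max(range(len(target)), key=scores.__getitem__)
        # No shared anchor at all: leave the sentence unaligned.
        pairs.append((i, best if scores[best] > 0 else None))
    return pairs

# Invented toy data: the translator has reordered the two sentences.
source = ["Anna moved to Kyiv in 1991.", "She met Petro in 2007."]
target = ["Sie traf Petro 2007.", "Anna zog 1991 nach Kyiv."]

print(align(source, target))
```

Because the anchors survive translation unchanged, the reordered sentences are still matched; sentences without any shared anchor stay unaligned and would need another cue.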
Corpora in contrastive studies - Parallel corpora (i.e. multilingual corpora) are a valuable source of data and a principal reason for the revival of contrastive linguistics that took place in the 1990s. They:
give new insights into the languages compared that are likely to go unnoticed in studies of monolingual corpora;
can be used for a range of comparative purposes and increase our understanding of language-specific, typological and cultural differences, as well as of universal features;
illuminate differences between source texts and translations;
can be used for a number of practical applications, e.g. in lexicography, language teaching and translation
help translators to find translation equivalents between the source and the target language;
provide information on the frequency of words, specific uses of lexical items as well as collocational and syntactic patterns;
may help translators to develop systematic translation strategies for words or phrases which have no direct equivalent in the target language;
sets of possible translations can be identified and the translator can choose a translation strategy according to the specific register, topic and genre.
In recent times, parallel corpora have been increasingly used to develop resources for automatic translation systems.
Parallel corpora are used more and more to design corpus-based (bilingual) dictionaries.
General division of phrases - Various structural and semantic types of phrases are characterized by different degrees of stability in a language.
The most general division of phrases can be the following: free phrases versus set expressions.
If we proceed from the restrictions imposed upon the lexical filling of structural patterns, which are specific to every language, there are no absolutely free word-combinations. Restrictions depend on the ties existing in extra-linguistic reality and grammatical properties of
Types of word combinations - Description of different types of word combinations in any language means, in the first place, the study of their semantic structure displayed in the interrelation of the semantic content of the components.
The branch of lexicology which studies word-combinations or phrases may be termed phraseology.
The word “phraseology” itself has very different meanings in this country and in the UK or the USA. In our linguistic literature the term has come to be used for expressions where the meaning of one element is dependent on the other one irrespective of the structure and properties of the unit (V. V. Vinogradov).
Phraseology was singled out as a special branch of linguistics in the forties of the 20th century. The theory of phraseology was initiated in the research of O. Potebnia, I. Sreznevskyi, A. Shakhmatov and F. Fortunatov. Some of them had been influenced by the teaching of Ferdinand de Saussure. In the twenties Ye. Polivanov was most productive in the study of phraseology.
The forefather of the theory of phraseology is considered to be the Swiss linguist of French origin Charles Bally. In his works “Précis de stylistique” (1905) and “Traité de stylistique française” (1909) Ch. Bally introduced the definition of “phraseology” as “a chapter of stylistics”. He presents the following four groups of word-combinations:
1) free phrases – word combinations which are not stable and disintegrate after being formed;
2) usual phrases – word-combinations with a relatively free connection of their components, where some changes are admissible;
3) phraseological fusions – word-groups in which the two concepts blend together;
4) phraseological unities – combinations in which the words lose their meaning and express only one concept.
Expressiveness - Phraseological units can be classified according to the tropes forming their basis. The expressiveness is accounted for by an unusual combination of components due to which a unique meaning arises.
Phraseologisms can be based on such tropes as metaphors, hyperboles, metonymies, similes; many of them have developed from free word-combinations or denote specific notions of a certain culture, thus being realia.
V. Vinogradov - The turning point in the classification of phraseological units was V. Vinogradov’s research. His classification is synchronic. He developed some points first advanced by Charles Bally. Thanks to him phraseological units were defined as lexical complexes with specific semantic features.
1) Phraseological fusions (фразеологічні зрощення) – stable, indivisible word combinations the meaning of which cannot be deduced from the meaning of the words which make up the combination, e.g. пекти раків, собаку з’їсти, скакати в гречку, розводити антимонії, дати драла, врізати дуба, не до солі, точити ляси.
2) Phraseological unities (фразеологічні єдності) – semantically indivisible combinations whose semantic content is nevertheless partially motivated by the meaning of the words that make up the phraseological unit, e.g. закинути вудку, тягнути лямку, мілко плавати, покласти зуби на полицю, товкти воду у ступі.
3) Phraseological combinations (фразеологічні сполучення) – stable phrases in which one of the components has an independent meaning, which is concretized in permanent use with other words. For example, брати preserves its lexical meaning, and its combination with different nouns reveals the meaning of the phraseological unit, e.g. нічого в рот не брати, брати рушники, брати гору, брати близько до серця, брати на глум.
Contextological approach - It was pointed out by N. Amosova and A. Kunin that this classification, being developed for Russian phraseology, does not fit the specific features of English.
N. Amosova’s approach is contextological. She defines phraseological units as units of fixed context. Fixed context is defined as the context characterized by a specific and unchanging sequence of definite lexical components. Units of fixed context are subdivided into phrasemes and idioms.
Phrasemes are binary – one component has phraseologically bound meaning, the other serves as a determining context. For instance:
small talk, small hours
In idioms the new meaning is created by the whole, though every element may have its original meaning weakened or even completely lost. For instance:
in the nick of time – at the exact moment, red tape, to play with fire
R. Zorivchak - Every phraseological unit as a polylexeme construction consists of a combination of lexemes with a certain structure and grammatical features (the first sense layer).
The verbal image appears on the basis of the first sense layer and then is shaped into the phraseological meaning (the second sense layer).
The general meaning is one semantic whole, which is the result of the interrelationship of all individual semes. The verbal image is sometimes latent. This semantic structure is enriched by connotations.
Suggested classification: word combination - a unity of at least two notional words that do not present a structure of predication and come to life as the result of a realization of the compatible semantic components of these words. The material for the analysis of the semantic structures of word combinations is dictionary entries revealing the semantic content of the components.
Word-sense combinations - If the study of the corresponding dictionary entries shows that the cumulative sense of a word combination is formed as the result of combining the senses of its components, then such word combinations can be considered word-sense ones (WSC).
Sense indivisibility - can be defined as the impossibility of singling out, in the semantic structures of the separate components, those semantic features which can be singled out for the whole combination.
Quasi-idiomatic combinations (QIC) and idiomatic combinations (IC) are phraseological units.
A phraseological unit is a combination of two word forms which do not present a syntactic structure of predication and are characterized by sense indivisibility.
Sense indivisibility being the crucial factor for defining a phraseological unit, this factor can be viewed as a key one for understanding of the mechanism of phraseologisation. The process of phraseologisation of free phrases is based on the possibility of singling out a semantic feature or features of a word combination which cannot be found in its separate components.
Technique of the analysis - is based on studying the semantic features of the separate components of the combination and of the combination as a whole. The suggested technique makes it possible to single out three semantic types of word combinations: word-sense combinations, quasi-idiomatic combinations and idiomatic combinations.
Semantic process - These semantic features can make the basis of another sense or senses of this word combination. Thus we have to deal with the semantic process which is exactly the same as the process of the semantic development of separate words, namely, polysemy. In the process of language development the semantic feature which made the basis of the formation of a new sense of the word combination can get so “darkened” that formally equivalent units can become homonymous.
Anthropocentric spectrum - The expressive value of phraseological units is combined with distinctiveness and emotional colouring; they autonomise human features and behaviour, actualise norms of life etc. Both the English and Ukrainian languages abound in various phraseological units covering the whole anthropocentric spectrum: from the internal life of a person to social status. Phraseology reflects human psychology in all its representations and features, simulates probable variants of human behaviour, gives “recipes” for situations etc.
Contrastive analysis - We can discover a lot of similarities and differences in the phraseological treasures of both contrasted languages. Similarities can be explained either by a common source of origin (biblical expressions, expressions of Roman origin etc.) or by the universal character of some peculiarities of the natural world, human physiology etc.
Proverb - a saying of a didactic or advisory nature in which a generalization is given a special, often metaphoric expression: “A bird in the hand is worth two in the bush” (people say it when they think it is not worth giving up something you already have for only the possibility of getting something better).
Familiar quotations - are different from proverbs in their origin. They come from literature but by and by they become part of the language, so that many people using them do not even know that they are quoting, and very few could name the play of original usage even when they are aware of using a quotation from William Shakespeare.
Sources - phraseological units originate from various sources:
legends, traditions, religions, narratives and folk beliefs:
to beat the wind – to waste time, to be busy with vain work;
to show the white feather – to show timidity (a white feather in the tail of a fighting cock was a sign of bad breeding);
to lead apes in hell – to die as an unmarried woman (according to old English tales, old unmarried women were destined to lead apes in hell after their death);
realia:
blue stocking – learned woman (one of English admiral Boscawen’s literary meetings in the 18th century in London was called “the meeting of blue stockings”, because scientist Benjamin Stillingfleet came in blue stockings);
blue book – reference book that contains surnames of persons who occupy state posts in the USA;
to carry coals to Newcastle – to do something absurd (Newcastle is the centre of English coal industry);
personalities of writers, kings and scholars:
King Charles’ head – obsessive idea (according to Charles Dickens’ novel “David Copperfield”);
Queen Anne is dead – nothing new;
a Sherlock Holmes – a detective;
a Sally Lunn – sweet roll;
historical facts:
as well be hanged for a sheep as for a lamb – if one is to be hanged for stealing a lamb, one may as well steal a sheep (under an old English law, stealing a sheep was punishable by hanging);
the curse of Scotland – nine of diamonds in cards (the card is so named for its resemblance to the coat of arms (герб) of the Earl of Stair, who was hated in Scotland)
fables and fairy-tales:
Fortunatus’s purse – purse full of money;
the whole bag of tricks – very sly;
henpecked husband – a man habitually subdued by his wife;
a marriage portion – a bride’s dowry (посаг),
to marry a fortune – to take as a husband a rich and well-respected man,
Miss Right – smb.’s future wife,
Mr. Right – smb.’s future husband
seasons and weather:
rush season – period when people are especially busy doing something;
out of season – not available for sale, out of point, not in a proper place;
settled weather – period of calm weather, free from storms and extremes;