Linguistic expeditions






Study and documentation of endangered languages


Researchers: Tatiana Bagariatskaya, Elena Budianskaya, Olga Kazakevich, Tatiana Reutt, Svetlana Chlenova, Vera Khoruzhaya;

headed by Olga Kazakevich


This line of research has been pursued in the Laboratory since 1987, when Olga Kazakevich and Jane Anoshkina launched a project to create a full-text computer database of Selkup, a minority language of Siberia. This project (along with the project to create a computer database of the Chukotko-Kamchatkan languages at the Institute of Linguistic Studies in Saint Petersburg) marked the beginning of the use of computers in research into the unwritten and newly written languages of Russia.


In 2000 a special group for the documentation of and research into endangered languages was organized. Two research directions are currently being pursued:

1. Documentation of and research into endangered languages of Western and Central Siberia (Tatiana Bagariatskaya, Elena Budianskaya, Olga Kazakevich, Tatiana Reutt, Vera Khoruzhaya).

2. Documentation of and research into endangered languages of the Maluku Islands (Svetlana Chlenova). 


In the framework of Siberian studies  

linguistic expeditions to document the endangered languages of Siberia are regularly organized on the basis of the Laboratory. So far, the languages documented have been local dialects of Selkup, Ket, and Evenki. Since 2001, 10 expeditions have been organized. The fieldwork has been done in villages of the Krasnoselkup and Pur districts of the Yamalo-Nenets autonomous area, and of the Turukhansk and Evenki districts of the Krasnoyarsk territory. During these expeditions a substantial amount of linguistic data was collected using modern audio and video speech recording technologies. A standard procedure for collecting linguistic data in a situation of language shift has been developed (today practically all the autochthonous ethnic minorities of Siberia are shifting to Russian). The expeditions were supported by the Russian Foundation for Basic Research and by the Russian Foundation for the Humanities. Linguistics students often take part in the expeditions: in 2002, 2003, and 2005 these were students of the Department of Theoretical and Applied Linguistics, Lomonosov Moscow State University, and in 2005, 2006, and 2007 students of the Institute of Linguistics, Russian State University for the Humanities.


The processing of the collected data (supported at different stages by grants from the Russian Foundation for Basic Research, the Russian Foundation for the Humanities, and the Research Support Scheme of the Open Society Institute) is proceeding in several directions:

1) creation and enlargement of multimedia computer archives of Selkup, Ket, and Evenki speech, which include audio dictionaries, folklore texts, and life stories in audio, video, and graphic form in the local dialects of the surveyed villages; development of linguistic multimedia databases;

2) analysis of the linguistic situations in the surveyed villages;

3) instrumental analysis of segmental and supra-segmental phonetic characteristics of Selkup, Ket, and Evenki speech;

4) morphological indexation of Selkup, Ket, and Evenki texts;

5) analysis of structural changes that have developed in Selkup, Ket, and Evenki over the last century;

6) discourse analysis of the collected texts;

7) content analysis of the collected texts, which brings us into contact with folklore studies and ethnography on the one hand and with the history of Russia on the other;

8) research into the Selkup-Ket-Evenki language contacts;

9) processing of the recorded video materials: creation of a computer video archive and attempts to make video films about life in the surveyed villages and the use of minority languages there.


Some results

A multimedia database of Ket has been created (demo version).


Pilot versions of multimedia databases of Selkup and Evenki have been developed.

A description of the language situation among the Ket, Northern Selkup, and Western Evenki has been prepared and partly published.


Using the materials of our multimedia archives, instrumental research into Selkup, Ket, and Evenki phrasal intonation has been carried out.


A description of changes in the structure of some grammatical categories (number, mood, conjugation type) that took place in local dialects of Northern Selkup over the last century has been published.


Two video documentaries have been created from the material of the video archive: "Language as remembrance: the Ket of Sulomai" (2005) and "The Ket from the Lake Munduiskoye" (2008) (Olga Kazakevich, Aleksandr Chvyrev).


At present we are working on the project "Changing Russia reflected in life-stories of the Ket, Selkup, and Evenki", supported by the Russian Foundation for the Humanities (grant 07-04-00332a). The objective of the project is to process the life stories in Selkup, Ket, and Evenki local dialects from our archive, provide them with morphological glosses, carry out discourse and content analysis, and publish them both as a book and as an Internet resource (Olga Kazakevich, Elena Budianskaya).


We are also digitizing Selkup texts from the archive of Georgiy and Ekaterina Prokofiev, recorded in the 1920s and preserved in the Archive of the Peter the Great Museum of Anthropology and Ethnography in Saint Petersburg (Tatiana Reutt, Vera Khoruzhaya).


In the framework of Maluku studies

a digitized archive of field data on Maluku languages has been compiled and is being enlarged.


The Maluku Islands (Eastern Indonesia) remain a region populated by speakers of lesser-known Austronesian languages. The Maluku archive comprises field materials on 31 Malukan isolects. The data were obtained by Svetlana Chlenova and Michael Chlenov during their residence on Ambon island, the centre of Eastern Indonesia, in 1963-1965. For data collection, a questionnaire was developed containing a 517-item wordlist, 36 sentences, and paradigms of the verbs "to have" and "to eat". Besides linguistic information, the questionnaire included some metadata about each informant and the informant's native language.


These data, gathered in the mid-1960s, are an important source for Eastern Indonesian linguistic studies. They are especially valuable because they synchronically document the Moluccan linguistic situation before some of the indigenous languages under consideration became moribund or even extinct under the growing pressure of Indonesian.


The Maluku data became the basis for a series of linguistic and sociolinguistic publications[1].


At present our digitized Maluku archive comprises twelve wordlists and grammatical descriptions of six Moluccan languages, three of which (Dawera-Daweloor, Damar, and the Teun variety of Wetan) had never been analyzed previously (see Damar research at the Ethnologue website).