How To Use

On the main page, choose a variable from the dropdown menu and click the EXPLORE button.

The search results contain three sets of information.

First, the variable name is a hyperlink to the variable's page in the Ethnographic Atlas (EA) on D-Place, where you can see the variable description and categories.

Next, a map shows the global distribution for the variable.

Finally, at the bottom of the page, a table lists all the world’s languages. The search box can be used to search for the name of a language, country, or language ISO code. The table is sorted alphabetically by country name and then by language name. The table includes the following columns:

MATCHING_ID is a hyperlinked number that leads to detailed information about the contemporary language and the pre-industrial society matched to it.

Clicking on the number opens a tab with two columns.

In the left column, there is a button to explore the contemporary language on the Ethnologue, where you will find its alternative names, its dialects, the countries where it is spoken, the ethnic groups that speak it, and more.

In the right column, there is a button to explore the pre-industrial society which is matched to the contemporary language. Below each button, you will see the language family and genealogical branches of the contemporary language (left) and of the language spoken by the matched pre-industrial society (right). Clicking on them opens new tabs that show the contemporary or pre-industrial language in the Glottolog language tree. Note that for each variable, every contemporary language is matched to the linguistically closest pre-industrial society that has an observation for that variable.

At the bottom of the page, there is a map that shows the centroid location of the contemporary language and the location of the pre-industrial society at the time it was sampled by anthropologists.

COUNTRY and COUNTRY_ID give the name and ISO code of the country in which the contemporary language is spoken today.

LANGUAGE_NAME and LANGUAGE_ISO give the name and ISO code of the contemporary language as recorded in the Ethnologue. Clicking on the language name opens its entry page on the Ethnologue website, where you can find its alternative names, its dialects, and the other countries where it is spoken and ethnic groups that speak it.

ETHNOLOGUE_ID is a unique identifier of the contemporary language spoken in a given country. Note that some languages are spoken in multiple countries. This identifier can be used to link the data to polygon shapefiles of the world’s languages based on the Ethnologue. The polygon for each language shows the geographic distribution of people speaking the language today. The shapefile can be obtained from the World Language Mapping System.
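As an illustration of linking on ETHNOLOGUE_ID, the sketch below joins a few table rows to a polygon attribute table using only the Python standard library. Everything in it is hypothetical: the identifier values, the sample rows, and the polygon column names (ID, AREA_KM2) are made up for the example, and the key column in the actual shapefile may be named differently.

```python
import csv
import io

# Hypothetical rows standing in for the results table (all values are made up).
TABLE_CSV = """ETHNOLOGUE_ID,LANGUAGE_NAME,COUNTRY,VARIABLE_VALUE
id_001,Language A,Country X,3
id_002,Language A,Country Y,3
id_003,Language B,Country X,1
"""

# Hypothetical attribute table exported from a polygon shapefile.
# The column names "ID" and "AREA_KM2" are assumptions for this sketch.
POLYGON_CSV = """ID,AREA_KM2
id_001,1200
id_003,450
"""


def join_on_ethnologue_id(table_rows, polygon_rows, polygon_key="ID"):
    """Left-join table rows to polygon attributes on ETHNOLOGUE_ID.

    Rows without a matching polygon keep AREA_KM2 = None.
    """
    polygons = {row[polygon_key]: row for row in polygon_rows}
    joined = []
    for row in table_rows:
        match = polygons.get(row["ETHNOLOGUE_ID"])
        joined.append({**row, "AREA_KM2": match["AREA_KM2"] if match else None})
    return joined


table_rows = list(csv.DictReader(io.StringIO(TABLE_CSV)))
polygon_rows = list(csv.DictReader(io.StringIO(POLYGON_CSV)))
joined = join_on_ethnologue_id(table_rows, polygon_rows)

for row in joined:
    print(row["ETHNOLOGUE_ID"], row["LANGUAGE_NAME"], row["COUNTRY"], row["AREA_KM2"])
```

Because the identifier is specific to a language in a given country, a language spoken in two countries appears as two rows with two different identifiers, as Language A does in the made-up sample.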

LANGUAGE_GLOTTOLOG_ID is the unique identifier of the contemporary language in the Glottolog language database. Clicking on the identifier will open the language in the Glottolog language tree.

LANGUAGE_POPULATION presents the estimated population of each language in a country. The estimates are obtained by extracting the Gridded Population of the World raster data to the language polygons.

LANGUAGE_LATITUDE and LANGUAGE_LONGITUDE are coordinates of the centroid of the language, as recorded in the Ethnologue.

VARIABLE_VALUE and VARIABLE_LABEL show the value and category of the variable, based on the Ethnographic Atlas information about the pre-industrial society which is matched to the language. Note that values and categories of the variable can be found by clicking on the hyperlinked code at the top of the page.

SOCIETY_NAME, SOCIETY_ID, SOCIETY_LATITUDE, and SOCIETY_LONGITUDE present information from the Ethnographic Atlas on the name, unique id, and coordinates of the pre-industrial society.
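The two pairs of coordinates can be combined to see how far a language centroid lies from its matched society. The sketch below is not part of the site or of the underlying study; it uses the standard haversine great-circle formula with a mean Earth radius of 6371 km, and the sample row is made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Hypothetical table row; the coordinates are made up for illustration.
row = {
    "LANGUAGE_LATITUDE": "10.5",
    "LANGUAGE_LONGITUDE": "20.0",
    "SOCIETY_LATITUDE": "12.0",
    "SOCIETY_LONGITUDE": "21.5",
}

distance_km = haversine_km(
    float(row["LANGUAGE_LATITUDE"]),
    float(row["LANGUAGE_LONGITUDE"]),
    float(row["SOCIETY_LATITUDE"]),
    float(row["SOCIETY_LONGITUDE"]),
)
print(f"Language centroid to society: {distance_km:.1f} km")
```

Note that this is a purely geographic distance. It says nothing about how closely the two languages are related.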

SOCIETY_GLOTTOLOG_ID is the unique identifier, in the Glottolog language database, of the language spoken in the pre-industrial society. Clicking on the identifier will open the language in the Glottolog language tree.

N_NODES is a measure of genealogical distance: the number of nodes in the Glottolog language tree that lie between the contemporary language and the language spoken by the matched pre-industrial society. Therefore, N_NODES=0 indicates that the pre-industrial society spoke exactly the same language as the contemporary population to which it is matched. Note that to see the contemporary language and the language spoken in the EA society in the Glottolog tree, you can click on LANGUAGE_GLOTTOLOG_ID and SOCIETY_GLOTTOLOG_ID respectively.

N_BRANCH is the measure of genealogical distance used in our study: the number of branches in the language tree that we needed to go up in order to match the language to a pre-industrial society. See the paper in the About section for more details on the methodology.

MATCH_LEVEL indicates which society the variable value is taken from. Match level 1 indicates that the variable value is assigned from the linguistically closest society in the EA. Match level 2 indicates that the observation for the variable is missing for the linguistically closest EA society, and therefore, the variable value is assigned from the 2nd linguistically closest society in the EA. Similarly, match levels 3, 4, 5, … indicate that the variable value is assigned from the 3rd, 4th, 5th, … linguistically closest society because the observation is missing for the first 2, 3, 4, … linguistically closest societies.
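The match-level rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the code used in the study: the function name is hypothetical, and the input is assumed to be the variable's observations for the EA societies already ranked from linguistically closest to farthest, with None marking a missing observation.

```python
from typing import Optional, Sequence, Tuple


def assign_from_closest(
    ranked_values: Sequence[Optional[int]],
) -> Tuple[Optional[int], Optional[int]]:
    """Return (variable value, match level) for one language.

    ranked_values[i] is the observation for the (i + 1)-th linguistically
    closest society, or None if that society has no observation.
    Returns (None, None) when no society has an observation.
    """
    for level, value in enumerate(ranked_values, start=1):
        if value is not None:
            return value, level
    return None, None


# Closest society has an observation: match level 1.
print(assign_from_closest([5, 2]))
# Closest society is missing the variable, 2nd closest has it: match level 2.
print(assign_from_closest([None, 4, 7]))
# First three societies are missing the variable: match level 4.
print(assign_from_closest([None, None, None, 9]))
```

Since the ranking is scanned separately for each variable, the same language can have different match levels, and different matched societies, for different variables.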