Technical documentation Personality test
1. Introduction
2. Method
   2.1 Instruments used
3. Exploration and selection of norm data
   3.1 Age
   3.2 Sex
   3.3 Level of education
   3.4 Nationality
   3.5 Labour market position
   3.6 Work sector/industry
   3.7 Completion time
   3.8 Response variation
   3.9 Human response
4. Analysis
   4.1 Raw scores
   4.2 Correlations
   4.3 Reliability
   4.4 Construct validity / factor analysis
5. Norms
   5.1 Labour force
   5.2 Group differences
6. Conclusion
7. References


Introduction

This document provides insight into the psychometric properties of the Big Five personality test of 123test. This test, developed by 123test B.V., is an operationalisation of the Big Five personality theory.

The test measures the five main dimensions of personality and their 30 underlying facets. It is a scientific instrument with a high degree of reliability and high validity, and it uses a representative, recently assembled norm group.

This document describes the reliability, validity and norm groups of the test. It also discusses how the dimensions of the test vary as a function of level of education, gender and age.

Method


Since November 1, 2019, more than 500,000 responses of the Big Five Personality Test have been recorded on By analyzing the data of these anonymous respondents, we can form a good picture of this instrument.

Instruments used

The Big Five Personality Test is free to use at The Dutch equivalent of this questionnaire can be found at and can also be used free of charge.

Exploration and selection of norm data

In order to explore the gathered data, this chapter examines a number of background variables of the respondent in more detail. The selection criteria used for the final dataset are indicated for each component.

The complete dataset consists of 490,689 respondents. Based on criteria such as age, sex, educational level, nationality, labour market position, completion time and response variation, a final subset was selected on which the analyses are performed and from which the final norms are calculated.


Age

The age range of 18 to 67 years was chosen because this group best represents the working population of the Western world.


Sex

The sex of all respondents is known because this background question was mandatory. Notably, more women than men completed the test. Both sexes are included in the dataset, because together they best represent the labour population of the Western world.

Level of education

Because the Big Five Personality Test was developed specifically for people with an average to higher level of education, only a selection of education levels was included in the dataset. The blue-shaded education levels in the diagram are included in the dataset.


Nationality

The number of nationalities represented in the original dataset is very large: 217 countries, dependencies and territories were represented with more than 10 respondents.

Because the Big Five Personality Test was developed for the English-speaking market of the Western world, a selection of countries was made. Countries selected for the final dataset are shaded blue.

Labour market position

Respondents were asked about their labour market position. Only the labour market positions Salaried employment, Self-employed/Freelancer and Officially unemployed were retained in the dataset, because this group best represents the labour population of the Western world.
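The age and labour market criteria stated so far can be expressed as a simple predicate. The following is a minimal sketch, not the procedure actually used: the function name, the argument names and the inclusive age bounds are assumptions, and the education and country criteria are left out because the selected categories are only shown in the diagrams.

```python
# Labour market positions retained, spelled as in the text above.
INCLUDED_POSITIONS = {
    "Salaried employment",
    "Self-employed/Freelancer",
    "Officially unemployed",
}

# Age range from the text; treating both bounds as inclusive is an assumption.
MIN_AGE, MAX_AGE = 18, 67


def meets_demographic_criteria(age, labour_position):
    """Return True if a respondent falls inside the selected age range
    and reports one of the retained labour market positions."""
    return MIN_AGE <= age <= MAX_AGE and labour_position in INCLUDED_POSITIONS
```

A 35-year-old in salaried employment would pass this check, while a respondent younger than 18 or with any other labour market position would not.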

Work sector/industry

Respondents were asked to indicate the work sector in which they work. A choice could be made from the 23 work sectors used in the model of EurOccupations (2009). The distribution across sectors gives no reason to correct for this variable.

Completion time

The time taken to complete a questionnaire is a good indication of how seriously a respondent has completed it. Only responses with a completion time between 5 and 45 minutes were included in the final dataset.
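As an illustration, the completion time criterion can be sketched as a range check. The function name is hypothetical, and treating exactly 5 and exactly 45 minutes as included is an assumption, since the text does not say whether the bounds are inclusive.

```python
from datetime import timedelta

# Bounds taken from the text; inclusive bounds are an assumption.
MIN_DURATION = timedelta(minutes=5)
MAX_DURATION = timedelta(minutes=45)


def has_plausible_completion_time(duration):
    """Return True if the questionnaire was completed in 5 to 45 minutes."""
    return MIN_DURATION <= duration <= MAX_DURATION
```

Very short durations suggest the items were not read, and very long ones suggest an interrupted session, which is why both ends are cut off.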

Response variation

A respondent's response variation is a good indication of how seriously the respondent has completed the questionnaire. Only responses with a response variation of 5 were included in the final dataset. A response variation of 5 means that the respondent has used every answer option of the 5-point Likert scale at least once across all 120 items.
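Response variation as defined here is the number of distinct answer options a respondent used. A minimal sketch, assuming answers are coded as the integers 1 to 5 (the coding and the function names are assumptions):

```python
def response_variation(answers):
    """Number of distinct answer options used across all items."""
    return len(set(answers))


def uses_full_scale(answers, scale_points=5):
    """Return True if every option of the Likert scale was used at least once."""
    return response_variation(answers) == scale_points


# Two hypothetical respondents, each answering 120 items.
straight_liner = [3] * 120           # same option every time
varied = [1, 2, 3, 4, 5] * 24        # all five options used
```

Under this criterion the hypothetical `straight_liner` respondent would be excluded and `varied` would be kept.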

Human response

Online questionnaires can suffer from crawlers and bots that fill in the questionnaires automatically. By using a consistency measure, inconsistent responses can be excluded from the dataset.

The psychometric synonym consistency measure (Meade and Craig 2012) has been used to identify artificial and random responses. It is calculated by first selecting all item pairs that correlate above .60 across the entire dataset. In this dataset, 9 item pairs are selected. Next, the psychometric synonym score is calculated for each respondent; it is equal to the within-person correlation across the selected item pairs.

The cut-off value of 0.2 used by Meade and Craig (2012) was applied to filter artificial responses and responses with a random response pattern from the dataset. In the histogram below, the removed responses are shaded gray.
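The two steps of the psychometric synonym measure can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used for the analysis: all function names are hypothetical, responses are assumed to be lists of numeric item answers in a fixed item order, scores below the cut-off are assumed to be the ones removed, and a respondent whose score is undefined (no variance in their answers on the selected items) is assumed to be excluded.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None when either vector has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    ss_x = sum(d * d for d in dev_x)
    ss_y = sum(d * d for d in dev_y)
    if ss_x == 0 or ss_y == 0:
        return None
    return sum(a * b for a, b in zip(dev_x, dev_y)) / sqrt(ss_x * ss_y)


def select_synonym_pairs(responses, threshold=0.60):
    """Step 1: item index pairs correlating above the threshold
    across all respondents."""
    n_items = len(responses[0])
    columns = [[row[i] for row in responses] for i in range(n_items)]
    pairs = []
    for i, j in combinations(range(n_items), 2):
        r = pearson(columns[i], columns[j])
        if r is not None and r > threshold:
            pairs.append((i, j))
    return pairs


def synonym_score(answers, pairs):
    """Step 2: within-person correlation between the first and second
    members of the selected item pairs."""
    first = [answers[i] for i, _ in pairs]
    second = [answers[j] for _, j in pairs]
    return pearson(first, second)


def is_consistent(answers, pairs, cutoff=0.2):
    """Keep a respondent whose score reaches the cut-off (assumption:
    undefined scores are treated as inconsistent)."""
    score = synonym_score(answers, pairs)
    return score is not None and score >= cutoff
```

A respondent who answers highly correlated items in a similar way obtains a score close to 1, whereas random or automated answering tends to give a score near zero or below, which the cut-off removes.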


The final dataset includes 15,107 respondents.

Raw scores

This chapter presents the raw scores of all facets and factors. The x-axes have been omitted to prevent unwanted reuse of the norm data.


The histograms of the raw factor scores all show a normal distribution.