Chapter 12 Frequencies

12.1 Intro

Frequency tables are normally used to inspect the distribution of categorical (dichotomous, nominal or ordinal) variables.
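
As a rough illustration of what a frequency table contains, the sketch below counts the categories of a small, made-up variable using only base R. The vector `education` and its values are hypothetical and are not taken from the example dataset.

```r
# Hypothetical categorical variable (made-up values, for illustration only)
education <- factor(
  c("Lower", "Middle", "Higher", "Middle", "Higher", "Higher"),
  levels = c("Lower", "Middle", "Higher")
)

# Number of observations in each category
counts <- table(education)

# The same information as percentages, rounded to one decimal
percentages <- round(100 * prop.table(counts), 1)

# Put counts and percentages side by side in one table
frequencyTable <- cbind(Frequency = counts, Percent = percentages)
print(frequencyTable)
```

Each row is one category; the packages discussed in this chapter produce similar tables with additional columns and nicer formatting.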

12.1.1 Example dataset

This example uses the Rosetta Stats example dataset “pp15” (see Chapter 1 for information about the datasets and Chapter 3 for an explanation of how to load datasets).

12.1.2 Variable(s)

From this dataset, this example uses variable currentEducation_cat.

12.2 jamovi

In the Analyses tab there is a button with a bar chart icon, called ‘Exploration’. Open this menu and select Descriptives. Drag the variables you want to describe to the Variables box and tick the Frequency tables checkbox to obtain a frequency table.

12.3 R

Many packages can be used to create a frequency table. This chapter presents only two examples. Other packages include (but are not limited to):

  • summarytools

  • Deducer

  • janitor

  • questionr

  • sjmisc

    • If you read an SPSS dataset into R, consider using the “frq” command from the “sjmisc” package. It presents both values and value labels (similar to SPSS output).

Note: To use the following commands, it is necessary to install and load the packages first (see section 2.3.2). The example dataset is stored under the name dat (see section 3).

12.3.1 rosetta package

Use the following command (this requires the rosetta package to be installed, see section 2.3.2, and the example dataset to be stored under name dat, see section 3):

rosetta::freq(dat$currentEducation_cat);

To also order a barchart, use:

rosetta::freq(dat$currentEducation_cat, plot=TRUE);

To order frequencies for multiple variables simultaneously, use:

rosetta::frequencies(dat$currentEducation_cat,
                     dat$prevEducation_cat);
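
If the rosetta package is not available, a comparable result can be approximated in base R. The sketch below is only an assumption about what a minimal alternative could look like; it is not how rosetta::frequencies works internally. The data frame `exampleData` and its values are hypothetical stand-ins for the example dataset.

```r
# Hypothetical stand-in for the example dataset (made-up values)
exampleData <- data.frame(
  currentEducation_cat = factor(c("Lower", "Middle", "Higher", "Higher")),
  prevEducation_cat = factor(c("Lower", "Lower", "Middle", "Higher"))
)

# Apply table() to every column; the result is a named list
# containing one frequency table per variable
frequencyTables <- lapply(exampleData, table)
print(frequencyTables)
```

The list elements carry the variable names, so a single table can be retrieved with, for example, frequencyTables$prevEducation_cat.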

12.3.2 descr and kableExtra packages

The descr package is used to run the “descriptive statistics” for the variable (in this case, a frequency table). By default, the freq command in the descr package will also create a basic bar graph. The kableExtra package can be combined with many packages to create aesthetically pleasing tables.

kableExtra::kable_styling(
  knitr::kable(as.data.frame(descr::freq(dat$currentEducation_cat)),
               booktabs=T, digits=2));

In words:

  1. From the descr package, use the freq command on the currentEducation_cat variable from the dataset: descr::freq(dat$currentEducation_cat).
  2. To make the output aesthetically pleasing, we are going to create a kable and style it with the kableExtra package. The kable function expects a matrix or data frame, so we turn the frequency output into a data frame: as.data.frame().
  3. Now, let’s call the kable function from the knitr package: knitr::kable().
  4. And add some options: booktabs=T, which in LaTeX (PDF) output formats the table with the horizontal rules of the LaTeX booktabs package, and digits=2, which rounds numbers to 2 decimal places.
  5. Lastly, let’s apply kable_styling to make the kable aesthetically pleasing: kableExtra::kable_styling().
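
The same steps can also be written one at a time, which may make the nested command easier to follow. This is a sketch under a few assumptions: the descr, knitr and kableExtra packages are installed, the vector `education` is a hypothetical stand-in for dat$currentEducation_cat, and the output format is set explicitly to HTML, which the original command does not do.

```r
# Hypothetical stand-in for dat$currentEducation_cat (made-up values)
education <- factor(c("Lower", "Middle", "Higher", "Middle", "Higher", "Higher"))

# Step 1: frequency table; plot = FALSE suppresses the default bar graph
freqTable <- descr::freq(education, plot = FALSE)

# Step 2: convert the frequency output into a data frame
freqDf <- as.data.frame(freqTable)

# Steps 3 and 4: create the kable, rounding to 2 decimals
# (format = "html" is an assumption made for this sketch;
#  booktabs only affects LaTeX output)
basicKable <- knitr::kable(freqDf, format = "html", booktabs = TRUE, digits = 2)

# Step 5: apply the kableExtra styling
styledKable <- kableExtra::kable_styling(basicKable)
styledKable
```

Storing each intermediate result under its own name makes it possible to inspect, for example, freqDf before it is turned into a table.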

12.4 SPSS

Use the following command (this requires the dat dataset to be the active dataset, see 2.4.1):

FREQ VARIABLES=currentEducation_cat.

To also order a barchart, use:

FREQ VARIABLES=currentEducation_cat
  /BARCHART FREQ.

To order frequencies for multiple variables simultaneously, use:

FREQ VARIABLES=currentEducation_cat prevEducation_cat.

12.5 Read more

If you would like more background on this topic, you can read more in these sources:

References

Navarro, Danielle. 2018. Learning Statistics with R. 0.6 ed. New South Wales, Australia. https://learningstatisticswithr.com/.