Chapter 1: Basic Console Operations

This chapter will introduce you to some of the most fundamental functions in R programming such as writing values to variables and performing simple operations on these. For instance, we will learn how we can perform an operation on all items in a list with just a simple command.

The Console

First, let's have a look at the RStudio interface. It consists of a number of different windows. In this chapter we will be using the console window. I will introduce the other windows along the way as we need them.

Figure 1: The console window is the bigger window to the left.

We can think about the console window as a giant calculator. We can type in individual lines of code and execute them one by one. The console does not keep an editable record of your work: once you have run a command, you cannot go back and correct it. If you make a mistake, you simply rerun the procedure with the corrected code. This is actually not the way that we will be using R most of the time, but it is a good way to start. In the next chapter, we will start using the script editor, which allows us to write scripts made up of multiple lines of code and execute them all at once. These scripts can be modified, saved and rerun at a later point.

Once we get going with scripts, we will mostly use the console window when we need to test individual functions and small pieces of code before we add them to a script.

Simple Calculations In The Console

Now, let's get acquainted with some of the basic functionalities of the R programming language.
We can actually use the console like a calculator. Try to type some simple calculations and hit 'enter'.

  1 + 1
  [1] 2

We observe that the console returns the result 2 as expected. Let's try something slightly more sophisticated.

  1 + 4 * 5 / 2
  [1] 11

Works! Notice that we can also bracket part of a calculation, just as in old school math:

  (1 + 4) * 5 / 2
  [1] 12.5

This is useful. But still this is not the main way we are going to work when we conduct statistical analyses of experimental data. One of the most fundamental procedures in programming is the way in which we write values to 'variables' and manipulate these.

Writing And Operating On Variables

We can think about a variable as a box that we put something into.

Figure 2: Illustration of how we can think about values written into a variable

In the depiction above we have made a variable named box and put the value 42 into it. The name is arbitrary. We could really call it anything. Let's see what the example would look like in code:

  box = 42

Notice that I am using the equal sign = here to put a value into the variable. Often you will see that people use a little arrow symbol <- to do so. For simple assignments like this one, the two do the same thing. I prefer the equal sign because it is also used in other programming languages such as Python. Once we have defined our variable, R will remember it and associate box with the value 42. In fact, you might notice that the variable has appeared in the 'Environment' pane of RStudio in the upper right corner.

Figure 3: Variables will appear in the 'Environment' window
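
To see that the equal sign and the arrow behave the same way for a simple assignment, we can try both side by side. This is only a minimal sketch; the names box_a and box_b are made up for the illustration.

```r
# Assign the same value with each of the two assignment symbols
box_a = 42
box_b <- 42

# identical() checks whether two objects are exactly the same
identical(box_a, box_b)   # TRUE
```

Both variables should now show up in the 'Environment' pane with the value 42.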

Now we might want to open the box to remind ourselves what is inside. We do that simply by typing box in the console and hitting return:

  box
  [1] 42

We see that the output is 42. Now try to write a new variable called forest, this time with a character or string value. Remember to put quotes (single ' or double ") around string values. Otherwise R thinks that we are referring to a variable or command with that name:

  forest = 'moose'
  forest
  [1] "moose"

This works fine as well. Let's move on to try out some simple operations we can do with variables.

  box + 4
  [1] 46
  box + box
  [1] 84
  box / 2
  [1] 21

We notice that R treats our variable as a number and we can do simple math with it. However, if you print the content of the variable box again, you will notice that the value is still 42. If we want to update the content of our variable box by adding a value to it, we need to overwrite it.

  box = box + 4
  box 
  [1] 46

Notice that now we have updated the content of our box variable by overwriting the old box with the new box + 4. This is something we do all the time when we program data analyses.

Most often, we are not concerned about having variables in our environment even if we do not use them anymore. However, you can delete a variable using the rm() command (which is short for remove):

  rm(forest)
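
If you want to convince yourself that a variable is really gone, one option is the base R function exists(), which reports whether a name is currently defined. The sketch below recreates forest first so that it can be run on its own.

```r
# Create the variable and check that R knows about it
forest = 'moose'
exists('forest')   # TRUE

# Remove it and check again
rm(forest)
exists('forest')   # FALSE
```

Typing forest in the console after rm() would instead give an error saying that the object was not found.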

We can also write multiple values to a variable. First we will consider how we handle lists of items and later on two-dimensional data frames.

Writing Lists To Variables

If we want to write a list of values to a variable we need R's concatenate function c(). (Strictly speaking, R calls the result of c() a vector; in this chapter we will simply call it a list.) Let's create a variable called my_list and put some random numbers into it (notice that we cannot have white spaces in a variable name so I will use the underscore solution).

  my_list = c(12, 23, 34, 45, 56)
  my_list
  [1] 12 23 34 45 56

Simple list operations

The cool thing about operations on list variables is that it is super-easy to make changes to a number of values in a single procedure or command. For instance, we might want to add a constant, e.g. 2, to all the values in a list. Then we can use the same procedure as we did with the box variable above.

  my_list = my_list + 2
  my_list
  [1] 14 25 36 47 58

Notice that now we have added 2 to all values in the list - juhuuu! That was easy!
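
The same idea works for the other arithmetic operators as well. Here is a small sketch using a made-up variable called scores, so that my_list stays as it is.

```r
# A hypothetical list of values, used only for this illustration
scores = c(12, 23, 34, 45, 56)

# Multiply every value by 2 in one go
doubled = scores * 2
doubled            # 24 46 68 90 112

# Divide every value by 2 in one go
halved = scores / 2
halved             # 6.0 11.5 17.0 22.5 28.0
```

Notice that scores itself is unchanged here, because we wrote the results to new variables instead of overwriting it.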

Appending values to an existing list

Let's imagine that these values represent some kind of measurement from five experimental participants. Now we collect data from a new participant, e.g. the value 36, and we want to add this value to our list. We do that by concatenating a list out of our previous list and the new value.

  my_list = c(my_list, 36)
  my_list
  [1] 14 25 36 47 58 36
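
We are not limited to adding one value at a time. As a minimal sketch, with a made-up variable called readings, we can append several values in one command. Base R also has a function called append() that does the same job as c() in this simple case.

```r
# A hypothetical list with three measurements
readings = c(14, 25, 36)

# Append two new values at once with c()
readings = c(readings, 47, 58)
length(readings)   # 5

# append() gives the same kind of result
readings = append(readings, 36)
readings           # 14 25 36 47 58 36
```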

Deleting values from a list

Similarly, we can be in a situation where we need to delete individual values from a list. For this we will use indexing, which is covered in more detail below. There are two different kinds of situations that require different solutions. For instance, we might want to write a new variable, my_list2, that contains the same values as my_list except for all instances of a certain value, let's say 36. Then we will use the following method.

  my_list2 = my_list[my_list != 36]
  my_list2
  [1] 14 25 47 58

The code means something like "put inside my_list2 all items from my_list except for the value 36". Notice that in this particular case there were two items with the value 36 and they are both gone in my_list2.

However, we could also be in a different situation where we want to delete only the first instance of the value 36 and keep the last one. We will write the output to my_list3. Now we need to think slightly differently about it. The first instance of 36 is the 3rd value in our list and we can specify the operation with this position index.

  my_list3 = my_list[-3]
  my_list3
  [1] 14 25 47 58 36

The result is different from before. Now we only deleted the first instance of 36 (value number 3 in my_list) while we kept the last instance.
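
In a long list we may not know the position of the first 36 by heart. One possible way to look it up, shown here only as a sketch, is the base R function match(), which returns the position of the first matching value. The name first_pos is made up for the illustration.

```r
# The list as it looks at this point in the chapter
my_list = c(14, 25, 36, 47, 58, 36)

# Find the position of the first value equal to 36
first_pos = match(36, my_list)
first_pos          # 3

# Drop the value at that position and keep the rest
my_list3 = my_list[-first_pos]
my_list3           # 14 25 47 58 36
```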


Exercise 1

  1. Create a variable with a mixed list of values and string objects (words) like new_list = c(12, 'Bob', 24, 34)
  2. Try to multiply the variable by 2. What happens? Why?
  3. Use the command class(new_list) – what does it tell you?
  4. Create a new list variable with only numeric values
  5. Append four new values to the list
  6. Delete the first value from the list

List Commands

There are a number of functions in R that allow us to get summary properties of a list. This is particularly useful when we deal with large lists with hundreds or thousands of values. Here are some of the most basic ones.

Notice that I will use the hashtag # to write comments for each function. Whenever you put a hashtag in front of some text, you tell R that this text is not to be interpreted as commands.

  # get the length of a list (the number of elements)
  length(my_list)
  [1] 6

  # get the maximum value in the list
  max(my_list)
  [1] 58

  # get the minimum value in the list
  min(my_list)
  [1] 14

  # get the sum of values in the list
  sum(my_list)
  [1] 216

  # get the unique elements in the list (take away repetitions)
  unique(my_list)
  [1] 14 25 36 47 58

Exercise 2

  1. Use the new functions to calculate the mean value of my_list
  2. Subtract the mean from all values in my_list and overwrite your list with these new values
  3. Create a new list variable called My_SUM with the minimum, mean, and maximum value of my_list without typing any numbers (clue: you need to embed the functions in your list specification)

List Indexing

To "index" means to specify a position in a list. We use indexing when we want to pull out and inspect or operate on individual values in a list. For instance we might need to pull out a subset from a longer list, delete or correct individual values, etc.

Maybe you have already noticed that in the previous sections, we used two different kinds of brackets, regular soft parentheses (), and so-called hard-brackets []. Regular parentheses are used with functions. The hard ones are used for indexing.

First, let's make a new list that we call new_list.

  new_list = c(1.2, 2.0, 3.4, 0, 0, 0, 2.2, 1.9, 0, 3.5 , 3.6)
  new_list
  [1] 1.2 2.0 3.4 0.0 0.0 0.0 2.2 1.9 0.0 3.5 3.6

We might want to inspect a particular value in the list, for instance the third value. Then we can use hard brackets to index the 3rd position in the list.

  new_list[3]
  [1] 3.4

We can also pull out a range of values from one position index to another using the colon operator :.

  # pull out values 3 to 6 from new_list
  new_list[3:6]
  [1] 3.4 0.0 0.0 0.0

Or we can use our c() command to pull out values from particular positions.

  # pull out value number 2, 6, and 10 from new_list
  new_list[c(2, 6, 10)]
  [1] 2.0 0.0 3.5
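
A few details about position indexes are worth trying out for yourself, especially if you know other languages such as Python. The sketch below recreates new_list so that it can be run on its own.

```r
# The list used in this section
new_list = c(1.2, 2.0, 3.4, 0, 0, 0, 2.2, 1.9, 0, 3.5, 3.6)

# Positions are counted from 1, so this is the first value
new_list[1]                  # 1.2

# length() gives the position of the last value
new_list[length(new_list)]   # 3.6

# Asking for a position beyond the end gives NA (a missing value)
new_list[12]                 # NA
```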

Often we are in a situation where we have a lot of data, but we are particularly interested in only some of the values. For instance we might want to know how many zeros there are in a list. Then we can combine indexing with regular math expressions.

Notice that here we will use the double equal sign ==. While single equal signs = are used in order to write values to variables, we use the double equal signs == to actually mean 'equal to'.

  # pull out all items with the value 0
  new_list[new_list == 0]
  [1] 0 0 0 0

  # number of zeros?
  length(new_list[new_list == 0])
  [1] 4

We could also be interested in knowing how many non-zero elements there are in our list:

  # number of non-zero elements in list
  length(new_list[new_list != 0])
  [1] 7
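
It can help to look at what the expression inside the hard brackets produces by itself. As a sketch, with the made-up variable name is_zero, the comparison gives one TRUE or FALSE per item, and the brackets keep only the items marked TRUE.

```r
# The list used in this section
new_list = c(1.2, 2.0, 3.4, 0, 0, 0, 2.2, 1.9, 0, 3.5, 3.6)

# The comparison returns TRUE for every item equal to 0
is_zero = new_list == 0
is_zero

# TRUE counts as 1 when summed, so this is another way to count zeros
sum(is_zero)     # 4

# which() returns the positions of the TRUE values
which(is_zero)   # 4 5 6 9
```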

There is also a set of functions that allow us to inspect the beginning or end of a list. This could be especially useful if we are dealing with a long list.

  # inspect the first 3 values of a list
  head(new_list, 3)
  [1] 1.2 2.0 3.4

  # inspect last 3 values of a list
  tail(new_list, 3)
  [1] 0.0 3.5 3.6
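
The second number tells head() and tail() how many values to show. If we leave it out, both functions show six values. A small sketch, recreating new_list so that it can be run on its own:

```r
# The list used in this section
new_list = c(1.2, 2.0, 3.4, 0, 0, 0, 2.2, 1.9, 0, 3.5, 3.6)

# Without a second argument, head() shows the first 6 values
head(new_list)      # 1.2 2.0 3.4 0.0 0.0 0.0

# Ask for just the very last value
tail(new_list, 1)   # 3.6
```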

Exercise 3

  1. Use indexing to replace the 3rd value in your list with a different value
  2. Replace all values less than 3 with 0
  3. Use your new functions to write the first 2 and the last 2 values to a different list variable (clue: you need to embed functions in the c() command)
