Multiplication (and R data types)

This is a basic post about multiplication operations in R. We’re considering element-wise multiplication versus matrix multiplication. First let’s make some data:

# Make some data
a = c(1,2,3)
b = c(2,4,6)
c = cbind(a,b)
x = c(2,2,2)

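If we look at the output (c and x), we can see that c is a 3×2 matrix and x is a vector of length 3. Strictly speaking x has no dimensions, but R will treat it as a 1×3 (or 3×1) matrix when matrix multiplication needs it to, so I will also call it a 1×3 matrix.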

# View our data
c
##      a b
## [1,] 1 2
## [2,] 2 4
## [3,] 3 6
x
## [1] 2 2 2

In R the asterisk (*) is used for element-wise multiplication. Each element of the matrix is multiplied by the matching element of the vector. Because x has one value for each row of c, every element in row i is multiplied by x[i].

#These will give the same result
c*x
x*c

We can see that c*x and x*c give the same output: the vector x doubles matrix c.

#View our element-wise multiplication output
##      a  b
## [1,] 2  4
## [2,] 4  8
## [3,] 6 12

##      a  b
## [1,] 2  4
## [2,] 4  8
## [3,] 6 12
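
Because x is all 2s, the output above cannot show which element of the vector meets which element of the matrix. Here is a minimal sketch with a vector of different values. The names m and v are my own (chosen so the built-in c() function is not masked) and are not part of the original example.

```r
# Hypothetical names: m plays the role of c, v is a vector with distinct values
m <- cbind(a = c(1, 2, 3), b = c(2, 4, 6))
v <- c(1, 10, 100)

# v is recycled down each column, so row i of m is scaled by v[i]
scaled <- m * v
scaled
```

Row 1 is multiplied by 1, row 2 by 10, and row 3 by 100. If the vector length did not match the number of rows, R would keep recycling the vector down the columns, which is rarely what you want.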

In R percent signs combined with asterisks are used for matrix multiplication (%*%).

# This works (matrix multiplication)
x%*%c
##       a  b
## [1,] 12 24

If you dig back and remember your matrix multiplication, you’ll find that a 1×3 matrix times a 3×2 matrix gives a 1×2 matrix. It will have the same number of rows as the first matrix (x has 1 row) and the same number of columns as the second matrix (c has 2 columns). Now let’s try this with x and c reversed.

# This doesn't work. Incorrect dimensions.
c%*%x
## Error in c %*% x : non-conformable arguments

R gives us an error because the dimensions don’t line up. For matrix multiplication to work, the number of columns in the first matrix has to equal the number of rows in the second matrix. Here c has 2 columns, while x has either 1 row (as a 1×3 matrix) or 3 rows (as a 3×1 matrix), so neither orientation works.
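
If you do want the matrix on the left, one option is to transpose it first with t(). The sketch below uses my own names m and v in place of c and x, and checks the dimensions of each result.

```r
# Hypothetical names: m is the 3x2 matrix, v is the length-3 vector
m <- cbind(a = c(1, 2, 3), b = c(2, 4, 6))
v <- c(2, 2, 2)

row_result <- v %*% m      # v acts as 1x3: (1x3)(3x2) gives 1x2
col_result <- t(m) %*% v   # v acts as 3x1: (2x3)(3x1) gives 2x1

dim(row_result)
dim(col_result)
```

Both results hold the same two numbers (12 and 24); only the shape differs.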

The previous operations were done using R’s basic matrix and vector objects. We can confirm this using the functions class and typeof below:

# Get the data type
class(c)
typeof(c)
class(x)
typeof(x)

Here’s the output of those functions.

# The output
## [1] "matrix"
## [1] "double"
## [1] "numeric"
## [1] "double"

This shows us that our matrix c has the R class ‘matrix’ and is stored as type ‘double’, which means that it holds numbers (as opposed to something like ‘character’). (In R 4.0.0 and later, class(c) prints "matrix" "array", since a matrix is a two-dimensional array.) It also shows us that our vector x has the class ‘numeric’ and is also stored as ‘double’.
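
If you only need a yes/no answer, a couple of base R helpers are a little more direct than reading the class output. This is a small sketch using my own names m and v rather than c and x.

```r
# Hypothetical names: m is the matrix, v is the plain vector
m <- cbind(a = c(1, 2, 3), b = c(2, 4, 6))
v <- c(2, 2, 2)

m_is_matrix <- is.matrix(m)   # TRUE: m has a dim attribute of length 2
v_is_matrix <- is.matrix(v)   # FALSE: v has no dimensions
dim(m)                        # rows and columns of m
dim(v)                        # NULL for a plain vector
```

The NULL from dim(v) is the reason R is free to treat the vector as either a row or a column in matrix multiplication.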

Now, let’s say your data is in a data frame rather than a matrix. Let’s see what happens when we perform multiplication on data frames. Remember data frames in R can hold different types of data (numbers, letters, etc.), while matrices can only have one type of data.
***For more info about this see my post here titled CBIND2***
Let’s convert our matrices to data frames using the function data.frame.

c1 = data.frame(c)
x1 = data.frame(x)

Now let’s look at our data. Note that there is an extra column of numbers from 1 to 3 on the left of both c1 and x1. These are the row names of the data frame, which R prints alongside the data; they are not part of the data itself.

c1
##   a b
## 1 1 2
## 2 2 4
## 3 3 6

x1
##   x
## 1 2
## 2 2
## 3 2

And just to be thorough, let’s check the R data type, to make sure they are not matrices.

# Check the data type
class(c1)
typeof(c1)
class(x1)
typeof(x1)

Here’s the output of those data type checks. Notice that the class is now ‘data.frame’ instead of ‘matrix’ or ‘numeric’.

# The output
## [1] "data.frame"
## [1] "list"
## [1] "data.frame"
## [1] "list"

Now let’s try our simple element-wise multiplication again. You may have guessed it already, but these operations will no longer work.

# These both do not work
c1*x1
x1*c1

Here’s the output of the multiplication (i.e., the errors R provides).

## Error in Ops.data.frame(c1, x1) : 
##   ‘*’ only defined for equally-sized data frames

## Error in Ops.data.frame(c1, x1) : 
##   ‘*’ only defined for equally-sized data frames

According to the error R is providing, we can only multiply data frames of the same size (c1 is 3×2, while x1 is 3×1). So, let’s try this out by making some new data.

# Make some data
h=c(2,2)
k=c(4,4)
j=cbind(h,k)
l=j*2

df1 = data.frame(j)
df2 = data.frame(l)

Now let’s look at the data to see what we have

# View the new data frames
df1
##   h k
## 1 2 4
## 2 2 4

df2
##   h k
## 1 4 8
## 2 4 8

Finally, let’s multiply df1*df2 and see what happens.

# Data frame multiplication
df1*df2
##   h  k
## 1 8 32
## 2 8 32

R has done element-wise multiplication on the data frames. This makes sense since we use only the (*) command. If we try this again with the order of the data frames reversed, we will get the same answer.

# Reverse the order for multiplication
df2*df1
##   h  k
## 1 8 32
## 2 8 32
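
If your data frames are not the same size, you do not necessarily have to reshape them. One possible workaround, sketched below, is to pull the column out of the smaller data frame as a plain vector for element-wise multiplication, or to convert both data frames with as.matrix() for matrix multiplication. The data frames are rebuilt here so the snippet runs on its own.

```r
# Rebuild the 3x2 and 3x1 data frames from earlier in the post
c1 <- data.frame(a = c(1, 2, 3), b = c(2, 4, 6))
x1 <- data.frame(x = c(2, 2, 2))

# Element-wise: use the column x1$x (a plain vector) instead of the data frame x1
doubled <- c1 * x1$x
doubled

# Matrix multiplication: convert both data frames to matrices first
prod_mat <- t(as.matrix(x1)) %*% as.matrix(c1)   # (1x3)(3x2) gives 1x2
prod_mat
```

The first result is still a data frame; the second is a 1×2 matrix with the same values as x%*%c above.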

That’s all for now. Hopefully this shed more light onto the way R performs multiplication, especially based on the data type.


How to make a Function in R

This post is meant to show R users how to make their own functions. We’ll start with an easy example below.

Most of my posts provide R code that can be easily copied into R and replicated at home. This post will be a break from that process since functions require saving *.R files and calling them from other *.R files. Let’s begin.

First of all make a new R script file. This will become our function file. There is no difference between a script file and a function file in R. Both are *.R files.

We will make a simple function that multiplies a vector of data by 2. We start by defining our function using the function keyword.

#make a function
my_function<- function(x){
  x*2
  
}

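Now save this R file as “f_myfirstfunction.R” on your Desktop. Now let’s walk through the components of the function. We defined it as “my_function”. This is important as it is how we call the function. After that it’s function(x), which declares a single input called x, and then the body between the curly braces, x*2, whose value is what the function returns.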
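
The same pieces (a name, the arguments, and a body) scale up to slightly richer functions. As a sketch only, here is a variant with a second argument that has a default value and a basic input check. The name times_n and the argument n are my own inventions for illustration.

```r
# times_n is a hypothetical name; n = 2 is a default the caller can override
times_n <- function(x, n = 2) {
  # stop early with a readable message if x is not numeric
  if (!is.numeric(x)) {
    stop("x must be numeric")
  }
  return(x * n)
}

default_result <- times_n(c(1, 2, 3))          # uses the default n = 2
custom_result <- times_n(c(1, 2, 3), n = 10)   # overrides the default
```

The explicit return() is optional here, since R returns the value of the last expression in the body, but some people find it easier to read.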

Now we have to open a second R file. This will be the script file that we will use to call the function from. We’ll start this file by setting our working directory to the desktop with the functions getwd() and setwd. getwd() simply states your current working directory in R. setwd is used to change it to wherever you like.

#set the working directory
#rename "your User Name here" based on your user name
#example: owner, Emily, Bill
getwd()
setwd("C:/Users/your User Name here/Desktop")
#confirm the new working directory
getwd()

If you get the error “Error in setwd("C:/Users/your User Name here/Desktop"): cannot change the working directory”, that means some part of your file path is misspelled or does not exist. Fix the error and run the code again.

Now we need to make a vector of data, so let’s use the function seq which makes a sequence of values. We’ll save our vector as “data”.

#make some data
data<- seq(from=1, to=10, by=1)
data

Next we have to import the function that we made into the R working space. This is very easy once we have set the working directory. Simply use the call source.

#import the function
source("f_myfirstfunction.R")

I should point out that you need quotations around the R file name. Also, if this file is not saved on the Desktop (the location we set the working directory to), this will give an error “Error in file (….. cannot open the connection”. If this happens move your function file “f_myfirstfunction.R” to your working directory.
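
If you would rather get a clearer message than the connection error, you can check that the file exists before sourcing it. The sketch below is self-contained: it writes the function file to a temporary folder (an assumption made so the snippet runs anywhere, instead of relying on your Desktop) and then sources it.

```r
# Assumption: write the function file to a temporary folder for this demo
fn_path <- file.path(tempdir(), "f_myfirstfunction.R")
writeLines(c("my_function <- function(x) {", "  x * 2", "}"), fn_path)

# Only source the file if it is really there
if (file.exists(fn_path)) {
  source(fn_path)
} else {
  stop("Function file not found: ", fn_path)
}

result <- my_function(1:3)
result
```

Passing a full path to source(), as done here, also means the call does not depend on the working directory.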

Now we will use our awesome new function that we made to multiply the vector “data” by 2. Of course we could just code data*2, but that’s not the point. We’re learning how to write a function.

#call the function
my_function(data)

Awesome! You ran your first function! The R console will spit out the answer:
[1] 2 4 6 8 10 12 14 16 18 20
If we wanted to do something more useful with this output we should save it as a variable. Let’s use data2.

#call the function
data2 <- my_function(data)

This time we get no output from R, but if we type in the variable data2 we get our familiar output:
[1] 2 4 6 8 10 12 14 16 18 20

One important thing to remember when using functions in R is that it doesn’t matter what you save your function file as. When you call your function, you’re using the name defined within the function file code.

#rename the function call to 'times2'
times2<- function(x){
  x*2
}
#rename the function again
zzzzz<- function(x){
  x*2
}

This is the same function saved in file “f_myfirstfunction.R”, but the function name has been changed. Again the function name is what is called from R.

I’ve listed the full text of the script file “call function.R” and the function file “f_myfirstfunction.R” below.

Hope this helps! Happy function writing!

#"call function.R"
#set the working directory
getwd()
setwd("D:/D Documents/wordpress/practicalR/make a function")
getwd()

#make some data
data<- seq(from=1, to=10, by=1)
data

#import the function
source("f_myfirstfunction.R")

#call the function
my_function(data)

#call the function - save output as variable
data2 <- my_function(data)
 
#"f_myfirstfunction.R"
my_function<- function(x){
  x*2
}

Secret Santa Picker using R

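Here’s a quick post on making a secret santa picker using R. The code tries to stop a person from picking themselves; otherwise it’s no frills. One caveat: if the only name left for the last person is their own, that last pick comes out as NA, so rerun the script when that happens.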

#set the variable for the number of people
npeople=5

fam=matrix(ncol=1, nrow=npeople, NA)
fam[1,1]="name1"
fam[2,1]="name2"
fam[3,1]="name3"
fam[4,1]="name4"
fam[5,1]="name5"


fam2=matrix(ncol=1, nrow=npeople, NA)
names=c("name1","name2","name3","name4","name5")
for (i in 1:npeople){
  #pick the first name
  if (i==1){
    xx2=sample(names, (npeople-i+1), replace=FALSE)
  } else
    xx2=sample(xx2, (npeople-i+1), replace=FALSE)
  
  if (xx2[1]!=fam[i,1]){
    fam2[i,1]=xx2[1]
  } else{
    fam2[i,1]=xx2[2]}
    
  
  #set up the new matrix with one less name
  used=which(xx2==fam2[i])
  xx2[used]="zzzzz"
  xx2=sort(xx2)[1:(npeople-i)]
}

#add "has" to the matrix
has=matrix(ncol=1,nrow=npeople, "has")

#build the final matrices
final=cbind(fam,has,fam2)	
#the final results
final

     [,1]    [,2]  [,3]
[1,] "name1" "has" "name4"
[2,] "name2" "has" "name3"
[3,] "name3" "has" "name5"
[4,] "name4" "has" "name2"
[5,] "name5" "has" "name1"
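
For comparison, here is a shorter way to get the same kind of result. This is not the code from this post but an alternative sketch: shuffle the whole list at once and simply reshuffle until nobody has drawn their own name. The function name pick_santa is hypothetical.

```r
# pick_santa is a hypothetical helper, not the loop used in the post
pick_santa <- function(people) {
  # need at least two distinct names for a valid draw
  stopifnot(length(people) >= 2, anyDuplicated(people) == 0)
  repeat {
    picks <- sample(people)            # random shuffle of everyone
    if (all(picks != people)) break    # keep it only if nobody drew themselves
  }
  cbind(people, "has", picks)
}

fam <- c("name1", "name2", "name3", "name4", "name5")
final <- pick_santa(fam)
final
```

Because a whole shuffle is either accepted or thrown away, every person always ends up with a name, and no one can be left holding only their own.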

A high quality plot

I’ll keep this post short and sweet. Here’s some code to get a really nice looking plot in R. It has a high pixel count to produce a high resolution output that can be used in a word document. Because of this, the size of everything in the plot (axes, points, text, axis labels, etc) has to be increased. I have skipped my normal commentary and instead left comments in the code. If you have any questions, leave a comment. Hope this helps!!

##First let's grab a dataset from the R library
nottem
#if we look at the data type 
typeof(nottem)
class(nottem)
#We need to convert from a "time series" object to a "matrix" to plot it
temp = as.matrix(nottem)
data =  matrix(temp, ncol=12, byrow = TRUE)
colnames(data) = c("jan", "feb", "mar", "apr", "may", "jun",
                   "jul", "aug", "sep", "oct", "nov", "dec")
#This tells you the current working directory
#This is important because our plot will be saved there
getwd()
#Calculate the monthly means
monthlymean<-apply(data, 2, FUN=mean)
#Make boxplots of the observed monthly data 
png("nice-plot.png",width=2400,height=1600)
par(mfrow=c(1,1),mar=c(8,9,10,9))

boxplot(data,col="cornflowerblue",
          xaxt='n', xlab="", ylab="",
          main="", cex.main=2, cex.lab=3, cex.axis=3, outcex=2)
points(monthlymean, pch=24, col="black", bg="red", cex=4)

axis(side=1, at=1:12, labels=FALSE, tick=TRUE, cex.axis=3, tck=-0.01)
mtext(c("Jan","Feb","Mar","Apr","May","Jun","Jul",
        "Aug","Sep","Oct","Nov","Dec"),at=1:12,side=1,line=3,cex=3)
mtext("Temperature (Fahrenheit)",side=2,las=0,cex=3.5,line=4.5)

legend("topleft",pch=c(24), c("Mean Observed Temperature"),
       col=c("black"),pt.bg=c("red"), cex=3)
mtext("Average Monthly Temperatures at Nottingham", side=3, line=4.5, cex=4)
mtext("1920-1939", side=3, line=0.6, cex=3.5)
box(which="outer",col="black",lwd=1)
dev.off()
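
An alternative to scaling every text and point size by hand is to give png() a physical size and a resolution. The sketch below assumes an 8 × 5 inch figure at 300 dpi, which is my own choice, and writes to a temporary folder so it runs anywhere. It is a simplified version of the plot above, not a replacement for it.

```r
# Assumption: an 8 x 5 inch figure at 300 dpi suits the target document
out_file <- file.path(tempdir(), "nice-plot-300dpi.png")

# 20 years (rows) by 12 months (columns), same reshaping as above
temps <- matrix(as.numeric(nottem), ncol = 12, byrow = TRUE)

png(out_file, width = 8, height = 5, units = "in", res = 300)
boxplot(temps, col = "cornflowerblue", names = month.abb,
        ylab = "Temperature (Fahrenheit)")
points(colMeans(temps), pch = 24, col = "black", bg = "red")
dev.off()
```

With units and res set, the default text sizes stay in proportion to the figure, so most of the cex adjustments are no longer needed.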


Plotting 2

In this post we’ll go into more detail on plotting commands. We’ll use a scatterplot (X-Y plot) as our example plot. Again we’ll use the command plot.

##First let's make some data
#x holds even numbers and y holds odd numbers, matching the axis labels used below
x<-c(2,4,6,8,10,12)
y<-c(1,3,5,7,9,11)

plot(x,y)


Next let’s change the axis labels. To change the axis titles we’ll use the arguments xlab and ylab for the x-axis and y-axis, respectively. We add these within the parentheses of the plot function. Let’s make the x-axis “Even” and the y-axis “Odd”.

plot(x, y, xlab = "Even", ylab = "Odd")

Looks good! Now let’s change the x- and y-axis limits. We’ll use the commands xlim and ylim. In each case we give a lower and upper limit, so we need to concatenate them together with the c function. In our example we’ll set the x-axis from 0 to 15 using xlim = c(0, 15), and the y-axis from 1 to 20 using ylim = c(1, 20). Again these commands are added within the plot function.

plot(x, y, xlim = c(0, 15), ylim = c(1, 20), xlab = "Even", ylab = "Odd")

Next let’s add a title calling it “My Plot”. We’ll use the command main = “add your title here”.

plot(x, y, main = "My Plot", xlim = c(0, 15), ylim = c(1, 20), 
     xlab = "Even", ylab = "Odd")

Now, let’s spice up the colors of our plot. Let’s make the points red and bigger. We use the calls “col” and “cex” to adjust these items.

plot(x, y, col = "red", cex = 2, main = "My Plot", 
     xlim = c(0, 15), ylim = c(1, 20), xlab = "Even", 
     ylab = "Odd")

Now let’s make our points a little bit fancier. We can use the command pch to change the points from the standard hollow circle to a filled diamond (pch = 23). You can find a snapshot of the different pch symbols here. Since this is a filled symbol, the call col colors the outline and the call bg colors the fill of the symbol.

plot(x, y, pch = 23, bg = "yellow", col = "red", 
     cex = 2, main = "My Plot", xlim = c(0, 15), 
     ylim = c(1, 20), xlab = "Even", ylab = "Odd")

Finally let’s complete the plot by adding a legend. The legend is different than the previous calls. It goes outside of the plot() command. Add the legend() command on a second line. The first bit of code “topleft” adds the legend to the top left of the plot. The second bit of code calls the legend item by the name “my data”. The rest of the code defines the legend item as we added it into the plot. The exception is the call “pt.bg” which has to be used instead of just “bg”.

plot(x, y, pch = 23, bg = "yellow", col = "red", 
     cex = 2, main = "My Plot", xlim = c(0, 15), 
     ylim = c(1, 20), xlab = "Even", ylab = "Odd")
legend("topleft", "my data", pch=23, pt.bg="yellow", col="red")
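
Since the legend has to repeat the symbol settings from the plot, it is easy for the two to drift apart. One way to guard against that, sketched below, is to store the settings once and use them in both calls. The list name pt is my own, and pdf(NULL) is only there so the snippet can run without opening a plot window.

```r
x <- c(2, 4, 6, 8, 10, 12)
y <- c(1, 3, 5, 7, 9, 11)

# Hypothetical settings list shared by plot() and legend()
pt <- list(pch = 23, bg = "yellow", col = "red")

pdf(NULL)   # throwaway device; remove this line to see the plot on screen
plot(x, y, pch = pt$pch, bg = pt$bg, col = pt$col,
     cex = 2, main = "My Plot", xlim = c(0, 15),
     ylim = c(1, 20), xlab = "Even", ylab = "Odd")
lg <- legend("topleft", "my data", pch = pt$pch, pt.bg = pt$bg, col = pt$col)
dev.off()
```

Changing the symbol now means editing one line, and the legend follows automatically.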

That’s it for now. We’ll do some more plotting next time!

R Studio and Shiny

R Studio has released a web application framework that is run (nearly) entirely through R (R Studio). It’s called Shiny and it’s great! It easily lets you turn your R scripts into a webpage. This is great for teaching purposes, showing off some code, and publishing to the web.

R Studio has given its users everything they need to make a web app using templates they have provided.  Everything fits into one “.R” file for easy editing and publishing.
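
To give a feel for what a one-file app looks like, here is a minimal sketch. It is not my app and not one of the R Studio templates, just a small example that assumes the shiny package is installed. The input name n and the output name hist are arbitrary choices.

```r
library(shiny)   # assumes the shiny package is installed

# The user interface: one slider and one plot
ui <- fluidPage(
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# The server: redraws the histogram whenever the slider moves
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Normal sample", xlab = "Value")
  })
}

app <- shinyApp(ui = ui, server = server)
# runApp(app)   # uncomment to launch the app in a browser
```

The ui object describes what the page looks like, and the server function holds the R code that responds to the user.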

You can find the Shiny page here: http://shiny.rstudio.com/

Here’s a link to my Shiny app. This has 4 statistical distributions (normal, lognormal, weibull, exponential) and lets the user interact with the variables. The box plot and histogram of the data respond to the user controlled inputs.

Check it out here: My Shiny App
(Make sure to give it about 30 seconds to fully load for the first time.)


Unbroken – A Visual Guide via Google Earth

I recently finished reading Unbroken by Laura Hillenbrand. It was an immensely powerful story of the human spirit and will to survive and flourish. The only thing I felt was lacking in the book was a visual guide (i.e., map) of Louie’s crazy trip around the Pacific. Therefore I made this Google Earth kmz file which has pins for many of the significant locations in the book once Louie is sent to Honolulu. The kmz file can be viewed in Google Maps by clicking the link, with the option to download if interested (see link below). Here’s a list of the locations included:

KMZ Download (via google drive)

Honolulu
Midway Islands
Wake Atoll
Canton Island
Makin – Gilbert Islands
Tarawa – Gilbert Islands
Funafuti
Nauru
Palmyra Atoll
Flight path of the Green Hornet
Lost at Sea – Louie, Phil and Mac’s drifting journey
Wotje Atoll
Kwajalein Atoll – Execution Island
Ofuna Camp (very approximate)
Omori POW Camp
Naoetsu POW Camp