How To Find Mean of Multiple Columns in R Programming

Scenario

I usually write most of my blog posts based on practical constraints or on queries that I receive during my training or consulting assignments. This post is also based on a query raised by one of the participants during my training. He asked,

“Sir, I am used to Excel, so if I want to find the mean of 50 columns, I put in a formula and then just drag it. Can we do this in R?”

The question came out of the blue, so I had to think about how we could do that. Then I answered, “Yes, we can do that by using the apply family of functions.”

Let’s See the Data

We will use the same data that we used in my previous blog post, How to Deal with Missing Values.

The data is the survey data frame from the MASS package.

This data frame contains the responses of 237 Statistics students at the University of Adelaide to a number of questions.

library(MASS)
## Warning: package 'MASS' was built under R version 3.5.3
library(knitr)
## Warning: package 'knitr' was built under R version 3.5.3
kable(head(survey))
|Sex    | Wr.Hnd| NW.Hnd|W.Hnd |Fold    | Pulse|Clap    |Exer |Smoke | Height|M.I      |    Age|
|:------|------:|------:|:-----|:-------|-----:|:-------|:----|:-----|------:|:--------|------:|
|Female |   18.5|   18.0|Right |R on L  |    92|Left    |Some |Never | 173.00|Metric   | 18.250|
|Male   |   19.5|   20.5|Left  |R on L  |   104|Left    |None |Regul | 177.80|Imperial | 17.583|
|Male   |   18.0|   13.3|Right |L on R  |    87|Neither |None |Occas |     NA|NA       | 16.917|
|Male   |   18.8|   18.9|Right |R on L  |    NA|Neither |None |Never | 160.00|Metric   | 20.333|
|Male   |   20.0|   20.0|Right |Neither |    35|Right   |Some |Never | 165.00|Metric   | 23.667|
|Female |   18.0|   17.7|Right |L on R  |    64|Right   |Some |Never | 172.72|Imperial | 21.000|

 

How to do it

Now let’s say we want to find the mean of all the numeric variables, i.e. Wr.Hnd, NW.Hnd, Pulse, Height and Age.

First, let us create the labels that we want to give to the mean of each column. We will use the list function to make a named list and store it in the object list1.

list1 <- list(Writing_Hand = survey$Wr.Hnd, Non_writing_hand = survey$NW.Hnd, Pulse = survey$Pulse, Height = survey$Height, Age = survey$Age)

Now we will use the sapply function to find the mean of all these columns in a single command.

sapply(list1, mean, na.rm = TRUE)
##     Writing_Hand Non_writing_hand            Pulse           Height
##         18.66907         18.58263         74.15104        172.38086
##              Age
##         20.37451

It’s amazing. We can find the mean of multiple columns/variables with only a few lines of code.
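For the original 50-column scenario, typing every column into a list by hand may not scale well. One possible variation, sketched here under the assumption that you want the mean of every numeric column in the data frame, is to let R pick the columns with is.numeric. The object names numeric_cols and col_means are illustrative and not part of the approach above.

```r
library(MASS)  # provides the survey data frame

# Flag the numeric columns instead of listing them by hand
numeric_cols <- sapply(survey, is.numeric)

# Mean of every numeric column, ignoring missing values
col_means <- sapply(survey[numeric_cols], mean, na.rm = TRUE)
print(round(col_means, 2))

# colMeans() should agree when every selected column is numeric
print(isTRUE(all.equal(col_means, colMeans(survey[numeric_cols], na.rm = TRUE))))
```

Note that the result keeps the original column names (Wr.Hnd, NW.Hnd and so on) rather than custom labels, so the named list is still the way to go if you want friendlier names.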

You can also pass any other function to sapply. For example, if we want to find the standard deviation of the same columns, we can do it as follows:

sapply(list1, sd, na.rm = TRUE)
##     Writing_Hand Non_writing_hand            Pulse           Height
##         1.878981         1.967068        11.687157         9.847528
##              Age
##         6.474335
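If you want the mean and the standard deviation side by side, one option is to pass sapply a small function that returns several values at once. This is only a sketch: summarise_column is a hypothetical helper name, and the missing-value count is an extra added for illustration.

```r
library(MASS)  # provides the survey data frame

# Same named list as used earlier in this post
list1 <- list(Writing_Hand = survey$Wr.Hnd, Non_writing_hand = survey$NW.Hnd,
              Pulse = survey$Pulse, Height = survey$Height, Age = survey$Age)

# Hypothetical helper: returns several statistics for one column
summarise_column <- function(x) {
  c(mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    n_missing = sum(is.na(x)))
}

# sapply() binds the per-column results into a matrix:
# one row per statistic, one column per variable
stats <- sapply(list1, summarise_column)
print(round(stats, 2))
```

Because every call returns a vector of the same length, sapply simplifies the result to a matrix, so a single value can be read with something like stats["sd", "Pulse"].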

Thank you for reading the blog. Do comment with the topic you want me to write about next.
