How to deal with Missing Values in R Programming?

Scenario

One day I was running a training session on statistical techniques in R. We were covering descriptive statistics, and I wanted to show the participants how to calculate a mean using the survey data set from the MASS package.

Quick View of Data

library(MASS)   # provides the survey data set
library(knitr)  # provides kable()
kable(head(survey))
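| Sex    | Wr.Hnd | NW.Hnd | W.Hnd | Fold    | Pulse | Clap    | Exer | Smoke | Height | M.I      | Age    |
|--------|--------|--------|-------|---------|-------|---------|------|-------|--------|----------|--------|
| Female | 18.5   | 18.0   | Right | R on L  | 92    | Left    | Some | Never | 173.00 | Metric   | 18.250 |
| Male   | 19.5   | 20.5   | Left  | R on L  | 104   | Left    | None | Regul | 177.80 | Imperial | 17.583 |
| Male   | 18.0   | 13.3   | Right | L on R  | 87    | Neither | None | Occas | NA     | NA       | 16.917 |
| Male   | 18.8   | 18.9   | Right | R on L  | NA    | Neither | None | Never | 160.00 | Metric   | 20.333 |
| Male   | 20.0   | 20.0   | Right | Neither | 35    | Right   | Some | Never | 165.00 | Metric   | 23.667 |
| Female | 18.0   | 17.7   | Right | L on R  | 64    | Right   | Some | Never | 172.72 | Imperial | 21.000 |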

Find the Mean

The variable Wr.Hnd records the span of the writing hand (the distance from the tip of the thumb to the tip of the little finger of the spread hand) in centimetres. It is a continuous variable, so the mean is an appropriate measure of central tendency. So I tried the following command:

mean(survey$Wr.Hnd)
## [1] NA

I was surprised to see NA, since the variable is clearly continuous, as the quick view of the data above shows.

Suspecting that the package had not loaded properly, I reloaded MASS and repeated the same steps, but the result was still NA. By now I was feeling embarrassed.

Then I started looking at the individual values of that variable, and suddenly I found that observation no. 43 has the value NA. Because of that one missing value, the calculated mean was NA as well.

kable(survey[40:45, ])
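|    | Sex    | Wr.Hnd | NW.Hnd | W.Hnd | Fold   | Pulse | Clap    | Exer | Smoke | Height | M.I      | Age    |
|----|--------|--------|--------|-------|--------|-------|---------|------|-------|--------|----------|--------|
| 40 | Male   | 19.0   | 19.0   | Right | R on L | NA    | Neither | Freq | Occas | 171.00 | Metric   | 19.917 |
| 41 | Female | 17.5   | 16.0   | Right | L on R | NA    | Right   | Some | Never | 169.00 | Metric   | 17.500 |
| 42 | Female | 17.8   | 18.0   | Right | R on L | 72    | Right   | Some | Never | 154.94 | Imperial | 17.083 |
| 43 | Male   | NA     | NA     | Right | R on L | 60    | NA      | Some | Never | 172.00 | Metric   | 28.583 |
| 44 | Female | 20.1   | 20.2   | Right | L on R | 80    | Right   | Some | Never | 176.50 | Imperial | 17.500 |
| 45 | Female | 13.0   | 13.0   | NA    | L on R | 70    | Left    | Freq | Never | 180.34 | Imperial | 17.417 |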
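
Scanning values by eye works for a small data set, but R can also locate missing values directly. The sketch below is one possible way to do it, assuming the MASS package is installed. `anyNA()`, `is.na()`, `which()` and `colSums()` are base R functions; the names `missing_rows` and `na_per_column` are chosen here only for illustration.

```r
library(MASS)  # provides the survey data set

# Does the column contain any missing value at all?
anyNA(survey$Wr.Hnd)

# Row numbers of the missing values in Wr.Hnd
missing_rows <- which(is.na(survey$Wr.Hnd))
missing_rows

# Number of missing values in every column of the data frame
na_per_column <- colSums(is.na(survey))
na_per_column
```

Given the rows shown above, `missing_rows` should include observation 43, and `na_per_column` gives a quick overview of which other variables have gaps.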

Now I knew which observation had embarrassed me. But how should I deal with it? I could remove the observation, but it holds data for other variables, so removing it would mean losing information. A better way is to skip this observation only when calculating the mean of Wr.Hnd. So I added the following argument to the command:

mean(survey$Wr.Hnd, na.rm = TRUE)
## [1] 18.66907
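
What `na.rm = TRUE` does is easier to see on a tiny example. The vector `x` below is made up purely for illustration and is not part of the survey data; the sketch shows how a single NA propagates through `mean()` and how dropping it changes the result.

```r
# A small made-up vector with one missing value
x <- c(10, 20, NA, 30)

mean(x)                # NA: the missing value propagates into the result
mean(x, na.rm = TRUE)  # 20: NA is dropped first, then (10 + 20 + 30) / 3
mean(x[!is.na(x)])     # 20: the same thing done by hand

# Other base R summaries accept the same argument
sum(x, na.rm = TRUE)   # 60
sd(x, na.rm = TRUE)    # 10
```

Writing `TRUE` in full rather than `T` is a little safer, because `T` is an ordinary variable that can be reassigned, while `TRUE` is a reserved word.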

Wooooo!!! Now it’s giving the result by skipping just the NA values, without losing the other information.
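
To get a feel for how much information deleting rows would cost, the two approaches can be compared side by side. This is only a rough sketch, assuming the MASS package is installed; `na.omit()` is a base R function that drops every row containing an NA in any column, and the names `complete_rows` and `n_used` are illustrative.

```r
library(MASS)  # provides the survey data set

# Option 1: delete every row that has an NA in any column
complete_rows <- na.omit(survey)
nrow(survey)         # rows in the original data
nrow(complete_rows)  # rows left after deletion

# Option 2: keep all rows and skip NA only in the column being summarised
n_used <- sum(!is.na(survey$Wr.Hnd))
n_used                             # observations that enter the mean
mean(survey$Wr.Hnd, na.rm = TRUE)
```

Since several variables have missing values in different rows, `nrow(complete_rows)` is smaller than `n_used`: row-wise deletion throws away observations that have a perfectly valid Wr.Hnd value.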