How to Create a Professional Report
Report creation is a crucial part of data analysis and visualization. After applying many analytical tools and techniques to the data, there is always the question of how to present the results and in what order.
R offers an excellent, readily available solution. With the package described here, you can generate an attractive report and share it with your colleagues and peers.
Package – Power Weapon
More than 10,000 packages are available for R. They were developed for different purposes, and together they widen the scope of R and its applications in many areas.
Here we look at one of these packages: DataExplorer. It is a wonderful package for performing basic exploratory analysis on data. Let’s explore it.
Data
Here we use mtcars, a dataset that is built into R.
The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models).
How to Load Package and Data
We load the packages and the mtcars dataset using the following commands. I am storing the mtcars data in an object named data.
library(DataExplorer)
## Warning: package ‘DataExplorer’ was built under R version 3.5.3
library(knitr)
## Warning: package ‘knitr’ was built under R version 3.5.3
data <- mtcars
kable(summary(data))
|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
|  | Min. :10.40 | Min. :4.000 | Min. : 71.1 | Min. : 52.0 | Min. :2.760 | Min. :1.513 | Min. :14.50 | Min. :0.0000 | Min. :0.0000 | Min. :3.000 | Min. :1.000 |
|  | 1st Qu.:15.43 | 1st Qu.:4.000 | 1st Qu.:120.8 | 1st Qu.: 96.5 | 1st Qu.:3.080 | 1st Qu.:2.581 | 1st Qu.:16.89 | 1st Qu.:0.0000 | 1st Qu.:0.0000 | 1st Qu.:3.000 | 1st Qu.:2.000 |
|  | Median :19.20 | Median :6.000 | Median :196.3 | Median :123.0 | Median :3.695 | Median :3.325 | Median :17.71 | Median :0.0000 | Median :0.0000 | Median :4.000 | Median :2.000 |
|  | Mean :20.09 | Mean :6.188 | Mean :230.7 | Mean :146.7 | Mean :3.597 | Mean :3.217 | Mean :17.85 | Mean :0.4375 | Mean :0.4062 | Mean :3.688 | Mean :2.812 |
|  | 3rd Qu.:22.80 | 3rd Qu.:8.000 | 3rd Qu.:326.0 | 3rd Qu.:180.0 | 3rd Qu.:3.920 | 3rd Qu.:3.610 | 3rd Qu.:18.90 | 3rd Qu.:1.0000 | 3rd Qu.:1.0000 | 3rd Qu.:4.000 | 3rd Qu.:4.000 |
|  | Max. :33.90 | Max. :8.000 | Max. :472.0 | Max. :335.0 | Max. :4.930 | Max. :5.424 | Max. :22.90 | Max. :1.0000 | Max. :1.0000 | Max. :5.000 | Max. :8.000 |
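As a quick sanity check, a few of the values in the summary table can be recomputed directly with base R. This is only a minimal sketch; the three columns picked here are arbitrary choices for illustration.

```r
# Spot-check a few entries of the summary table using base R only.
data <- mtcars

mean_mpg  <- round(mean(data$mpg), 2)  # "Mean" row, mpg column
median_hp <- median(data$hp)           # "Median" row, hp column
max_wt    <- max(data$wt)              # "Max." row, wt column

print(c(mean_mpg = mean_mpg, median_hp = median_hp, max_wt = max_wt))
```

If the printed numbers match the table, the data object holds the expected copy of mtcars.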
Let’s create the report – use our power weapon
Now it’s time to create the report with just a single line of code. Let’s not wait; just create the report.
NOTE: This creates an HTML file (report.html in the working directory by default), which you can save and share.
# Build the HTML profiling report for the data
create_report(data)
# Embed a saved copy of the report in this post
shiny::includeHTML("MTCars Report Blog3.html")
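If you would rather choose the file name and folder yourself, create_report() accepts output_file and output_dir arguments. The sketch below is one way to use them. The file name mtcars_report.html and the temporary directory are assumptions made purely for illustration, and the report is only built when DataExplorer, rmarkdown, and pandoc are all available.

```r
data <- mtcars

# Hypothetical output location, chosen only for this example
out_dir     <- tempdir()
out_file    <- "mtcars_report.html"
report_path <- file.path(out_dir, out_file)

# create_report() renders an R Markdown template, so it needs
# DataExplorer, rmarkdown and pandoc to be installed
can_render <- requireNamespace("DataExplorer", quietly = TRUE) &&
  requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  DataExplorer::create_report(data, output_file = out_file, output_dir = out_dir)
  message("Report written to: ", report_path)
} else {
  message("Skipping report: DataExplorer, rmarkdown or pandoc is missing")
}
```

Writing to an explicit path makes it easier to find the file later when you want to share it.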
Data Profiling Report
- Basic Statistics
  - Raw Counts
  - Percentages
- Data Structure
- Missing Data Profile
- Univariate Distribution
  - Histogram
  - QQ Plot
- Correlation Analysis
- Principal Component Analysis
Basic Statistics
Raw Counts
| Name | Value |
|:--|:--|
| Rows | 32 |
| Columns | 11 |
| Discrete columns | 0 |
| Continuous columns | 11 |
| All missing columns | 0 |
| Missing observations | 0 |
| Complete Rows | 32 |
| Total observations | 352 |
| Memory allocation | 5.8 Kb |
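Most of these counts can be reproduced with base R, which is a handy way to see what each row means. This is a sketch, not the package's own implementation; within DataExplorer, introduce(data) reports comparable figures. Memory allocation is left out because it depends on how the object is stored.

```r
data <- mtcars

# Recompute the raw counts by hand
raw_counts <- c(
  rows                 = nrow(data),
  columns              = ncol(data),
  continuous_columns   = sum(vapply(data, is.numeric, logical(1))),
  all_missing_columns  = sum(vapply(data, function(x) all(is.na(x)), logical(1))),
  missing_observations = sum(is.na(data)),
  complete_rows        = sum(complete.cases(data)),
  total_observations   = nrow(data) * ncol(data)
)
print(raw_counts)
```

Total observations is simply rows times columns, which is where the 352 in the table comes from (32 × 11).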
Percentages
Data Structure
root (Classes ‘data.table’ and ‘data.frame’: 32 obs. of 11 variables)
- mpg (num)
- cyl (num)
- disp (num)
- hp (num)
- drat (num)
- wt (num)
- qsec (num)
- vs (num)
- am (num)
- gear (num)
- carb (num)
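The same structure can be checked from the console. Here is a minimal sketch with base R; the variable name col_classes is just an illustrative choice.

```r
data <- mtcars

# Compact overview: one line per column with its type and first values
str(data)

# Class of every column as a named character vector
col_classes <- vapply(data, function(x) class(x)[1], character(1))
print(col_classes)
```

Every column is numeric, which is why the report counts 11 continuous columns and no discrete ones.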
[Figures: Missing Data Profile; Univariate Distribution (Histogram, QQ Plot); Correlation Analysis; Principal Component Analysis]
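To get a feel for the numbers behind the Correlation Analysis and Principal Component Analysis sections, here is a rough base R equivalent. It is a sketch under my own assumptions (Pearson correlation and standardized columns for the PCA); the report may use different settings.

```r
data <- mtcars

# Pairwise Pearson correlations between all 11 numeric columns
cor_matrix <- cor(data)
print(round(cor_matrix["mpg", ], 2))  # how each variable relates to mpg

# PCA on standardized columns, since the variables use different units
pca <- prcomp(data, scale. = TRUE)

# Share of the total variance explained by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
```

For example, mpg and wt are strongly negatively correlated: heavier cars tend to get fewer miles per gallon.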
Conclusion and other features of the package
This is not the only feature the package has. It offers many other functions and data visualization capabilities:
- DataExplorer-package: Data Explorer
- configure_report: Configure report template
- create_report: Create report
- DataExplorer: Data Explorer
- dataexplorer: Data Explorer
- drop_columns: Drop selected variables
- dummify: Dummify discrete features to binary columns
- group_category: Group categories for discrete features
- introduce: Describe basic information
- plot_bar: Plot bar chart
- plot_boxplot: Create boxplot for continuous features
- plot_correlation: Create correlation heatmap for discrete features
- plot_density: Plot density estimates
- plot_histogram: Plot histogram
- plot_intro: Plot introduction
- plot_missing: Plot missing value profile
- plot_prcomp: Visualize principal component analysis
- plot_qq: Plot QQ plot
- plot_scatterplot: Create scatterplot for all features
- plot_str: Visualize data structure
- profile_missing: Profile missing values
- set_missing: Set all missing values to indicated value
- split_columns: Split data into discrete and continuous parts
- update_columns: Update variable types or values
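Several of the functions listed above can also be called one at a time when you do not need the full report. The sketch below tries a few of them on mtcars. The selection is mine, and the calls are guarded so the script still runs if DataExplorer is not installed.

```r
data <- mtcars

# Only call the package if it is actually installed
has_de <- requireNamespace("DataExplorer", quietly = TRUE)

if (has_de) {
  # One-row overview of the dataset (rows, columns, missing values, ...)
  overview <- DataExplorer::introduce(data)
  print(overview)

  # Individual plots that also appear in the report
  DataExplorer::plot_missing(data)      # missing value profile
  DataExplorer::plot_histogram(data)    # histogram of each continuous column
  DataExplorer::plot_correlation(data)  # correlation heatmap
} else {
  message("DataExplorer is not installed; skipping the examples")
}
```

Calling the functions individually is useful when you want a single plot for a slide or want to tweak one step without regenerating everything.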
You can explore these features with the following command after loading the package.
help("DataExplorer")
Please share your feedback in the comments section. Also let me know which topic you would like me to write about next.
