You will use the R language for the data analysis exercises in this assessment. These tasks will help you build your knowledge of data formats, storage, retrieval, and analysis techniques. You are required to work on the UCI Adult dataset, which is available on the Moodle site. Download the dataset into your working directory.
For each task, write R code, run it on the given dataset, and take a screenshot of the output. Save all R source code, output screenshots, and your analysis of the outputs in a single MS Word file. This Word file must be submitted as your report for marking. Number each task correctly so that it can be marked.
The data analysis tasks are as follows.
1. Write R code to load the Adult dataset into a variable called “adult”. Include a screenshot.
2. Write R code to see the number of variables and records in the given dataset.
3. Write R code using the tail() function to view the last 3 rows of the given dataset.
4. Write R code to generate a summary of the given dataset that includes the minimum, maximum, and mean. Write an explanation of the extracted results.
5. Write R code to check for missing values and create a heat map of the missing values. Write your explanation with a screenshot. (Hint: install the Amelia package and use its missmap() function.)
6. Write R code to generate a correlation matrix. Write an explanation of your outcome.
7. Write R code to inspect the country column and group the countries into continents where possible.
8. Write R code to generate a histogram of age coloured by income using the ggplot2 package. Write an explanation of the resulting graph.
9. Write R code to generate a histogram of hours worked per week.
10. Write R code to rename the country column to region, then create a bar plot of region with the fill colour defined by income class. (Hint: use the ggplot2 package.)
11. Write a conclusion on your overall analysis and data quality.
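
As a rough orientation for the kind of base R calls the tasks above rely on, the following is a minimal sketch only, not a model answer. It runs on a tiny made-up data frame rather than the real Adult file, and the column names (age, hours_per_week, country, income) and the continent lookup are assumptions for illustration; the actual dataset may name its columns differently. The Amelia and ggplot2 steps are left out because they require installed packages.

```r
# Hypothetical stand-in for the Adult data (six invented rows).
# In the assessment, "adult" would instead be read from the downloaded file.
adult <- data.frame(
  age            = c(39, 50, 38, 53, 28, NA),
  hours_per_week = c(40, 13, 40, 40, 40, 45),
  country        = c("United-States", "Cuba", "India", NA, "Mexico", "England"),
  income         = c("<=50K", "<=50K", ">50K", "<=50K", ">50K", ">50K"),
  stringsAsFactors = FALSE
)

# Number of records (rows) and variables (columns).
dims <- dim(adult)

# Last 3 rows.
last3 <- tail(adult, 3)

# Summary statistics (min, max, mean, ... for numeric columns).
print(summary(adult))

# Count of missing values per column.
na_counts <- colSums(is.na(adult))

# Correlation matrix over the numeric columns only, ignoring incomplete rows.
num_cols <- sapply(adult, is.numeric)
cor_mat <- cor(adult[, num_cols], use = "complete.obs")

# Assumed country-to-continent lookup; unmatched or missing countries stay NA.
continent_of <- c("United-States" = "North America", "Cuba" = "North America",
                  "Mexico" = "North America", "India" = "Asia",
                  "England" = "Europe")
adult$continent <- unname(continent_of[adult$country])

# Rename the country column to region.
names(adult)[names(adult) == "country"] <- "region"
```

With the real dataset, the same calls apply once "adult" has been loaded from the file in your working directory; the output and your interpretation of it are what the report should present.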