This project demonstrates the ability to collect, work with, and clean a dataset to prepare tidy data for later analysis. The dataset used is the Human Activity Recognition Using Smartphones Dataset, which includes data from 30 volunteers performing six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) captured via smartphone sensors.
- `run_analysis.R`: The main R script that processes the dataset.
- `tidy_data.txt`: The resulting tidy dataset.
- `CodeBook.md`: Documentation of variables, data, and transformations.
- Ensure all dataset files (e.g., `train/`, `test/`, `features.txt`, etc.) are in the working directory.
- Install the `dplyr` package in R: `install.packages("dplyr")`.
- Run the `run_analysis.R` script in R to generate `tidy_data.txt`.
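
The setup steps above can be checked from an R session before running the script. This is a minimal sketch, not part of the repository: `missing_inputs` is a hypothetical helper, and only the paths named in this README are checked (the files covered by "etc." are not).

```r
# Hypothetical helper: returns the entries of `paths` that do not exist under `dir`.
missing_inputs <- function(paths, dir = ".") {
  paths[!file.exists(file.path(dir, paths))]
}

# Only the names listed in this README; other dataset files are not checked here.
required <- c("train", "test", "features.txt")
missing <- missing_inputs(required)

if (length(missing) > 0) {
  message("Missing from working directory: ", paste(missing, collapse = ", "))
} else if (!requireNamespace("dplyr", quietly = TRUE)) {
  message("dplyr is not installed; run install.packages(\"dplyr\")")
} else if (file.exists("run_analysis.R")) {
  source("run_analysis.R")  # expected to write tidy_data.txt
}
```

If the check prints nothing and the script runs, `tidy_data.txt` should appear in the working directory.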
The run_analysis.R script performs the following steps:
- Merges the training and test sets into one dataset.
- Extracts only the mean and standard deviation measurements.
- Assigns descriptive activity names.
- Labels the dataset with descriptive variable names.
- Creates a tidy dataset with the average of each variable for each activity and subject.
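
The five steps can be illustrated on a toy example. This is a sketch under stated assumptions, not the contents of `run_analysis.R`: the data values are made up, `make_set` is a hypothetical helper, the renaming scheme in step 4 is an assumption, and base R's `aggregate` is used so the example runs without extra packages (a `dplyr` `group_by`/`summarise` pipeline would be an equivalent way to do step 5).

```r
# Toy stand-ins for the real files; all numeric values are made up for illustration.
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X")
activity_labels <- data.frame(id = 1:2, activity = c("WALKING", "SITTING"))

# Hypothetical helper: builds one set (train or test) with subject and activity columns.
make_set <- function(subject, activity_id, values) {
  x <- as.data.frame(matrix(values, ncol = length(features), byrow = TRUE))
  names(x) <- features
  data.frame(subject = subject, activity_id = activity_id, x, check.names = FALSE)
}
train <- make_set(c(1, 1, 1), c(1, 1, 2), c(1, 2, 9,  3, 4, 9,  5, 6, 9))
test  <- make_set(c(2, 2),    c(2, 2),    c(7, 8, 9,  9, 10, 9))

# 1. Merge the training and test sets.
merged <- rbind(train, test)

# 2. Keep only the mean() and std() measurements.
keep <- grepl("mean\\(\\)|std\\(\\)", names(merged))
merged <- merged[, c("subject", "activity_id", names(merged)[keep])]

# 3. Replace activity ids with descriptive activity names.
merged$activity <- activity_labels$activity[match(merged$activity_id, activity_labels$id)]
merged$activity_id <- NULL

# 4. Give the measurement columns descriptive names (assumed naming scheme).
is_measure <- !(names(merged) %in% c("subject", "activity"))
clean <- names(merged)[is_measure]
clean <- gsub("\\(\\)", "", clean)   # drop "()"
clean <- sub("^t", "time", clean)    # expand the leading "t" prefix
clean <- gsub("-", "_", clean)       # make names syntactic
names(merged)[is_measure] <- clean

# 5. Average each variable for each subject and activity.
tidy <- aggregate(. ~ subject + activity, data = merged, FUN = mean)
print(tidy)
```

The result has one row per subject and activity pair and one column per retained measurement, which is the shape described for `tidy_data.txt`.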
This repository contains a single script (run_analysis.R) that handles the entire data processing pipeline, producing a tidy dataset ready for analysis.