Introduction
In this post we'll describe how to use smartphone accelerometer and gyroscope data to predict the physical activities of the individuals carrying the phones. The data used in this post comes from the Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set distributed by the University of California, Irvine. Thirty individuals were tasked with performing various basic activities while wearing an attached smartphone that recorded movement using an accelerometer and gyroscope.
Before we begin, let's load the various libraries that we'll use in the analysis:
library(keras) # Neural Networks
library(tidyverse) # Data cleansing / Visualization
library(knitr) # Table printing
library(rmarkdown) # Misc. output utilities
library(ggridges) # Visualization
Activities dataset
The data used in this post come from the Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set (Reyes-Ortiz et al. 2016) distributed by the University of California, Irvine.
When downloaded from the link above, the data contains two different 'parts.' One has been pre-processed using various feature extraction techniques such as a fast Fourier transform; the other, the RawData section, simply supplies the raw X, Y, Z directions of the accelerometer and gyroscope readings. None of the standard noise filtering or feature extraction commonly applied to accelerometer data has been used. This is the data set we will work with.
The motivation for working with the raw data in this post is to help the code and concepts transfer to time-series data in less well-characterized domains. While a more accurate model could be built using the filtered and cleaned data provided, the filtering and transformation steps can vary greatly from task to task, requiring lots of manual effort and domain knowledge. One of the beautiful things about deep learning is that feature extraction is learned from the data, not from outside knowledge.
Activity labels
The data has integer encodings for the activities which, while not important to the model itself, are helpful for us to see. Let's load them first.
activityLabels <- read.table("data/activity_labels.txt",
                             col.names = c("number", "label"))
activityLabels %>% kable(align = c("c", "l"))
number | label |
1 | WALKING |
2 | WALKING_UPSTAIRS |
3 | WALKING_DOWNSTAIRS |
4 | SITTING |
5 | STANDING |
6 | LAYING |
7 | STAND_TO_SIT |
8 | SIT_TO_STAND |
9 | SIT_TO_LIE |
10 | LIE_TO_SIT |
11 | STAND_TO_LIE |
12 | LIE_TO_STAND |
Next, we load in the labels key for the RawData. This file is a list of all of the observations, or individual activity recordings, contained in the data set. The key for the columns is taken from the data README.txt:
Column 1: experiment number ID
Column 2: user number ID
Column 3: activity number ID
Column 4: label start point
Column 5: label end point
The start and end points are given as numbers of signal log samples (recorded at 50 Hz).
Let's take a look at the first 50 rows:
labels <- read.table(
  "data/RawData/labels.txt",
  col.names = c("experiment", "userId", "activity", "startPos", "endPos")
)

labels %>%
  head(50) %>%
  paged_table()
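Since these positions are sample indices at 50 Hz, it is straightforward to translate them into durations in seconds. For example (a small illustrative calculation, not part of the original analysis):
# How long is each labeled observation, in seconds, at 50 samples per second?
labels %>%
  mutate(duration_sec = (endPos - startPos + 1) / 50) %>%
  head()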
File names
Next, let's look at the actual files of the user data provided to us in RawData/.
dataFiles <- list.files("data/RawData")
dataFiles %>% head()
[1] "acc_exp01_user01.txt" "acc_exp02_user01.txt"
[3] "acc_exp03_user02.txt" "acc_exp04_user02.txt"
[5] "acc_exp05_user03.txt" "acc_exp06_user03.txt"
There is a three-part file naming scheme. The first part is the type of data the file contains: either acc for accelerometer or gyro for gyroscope. Next is the experiment number, and last is the user ID for the recording. Let's load these into a dataframe for ease of use later.
fileInfo <- data_frame(
  filePath = dataFiles
) %>%
  filter(filePath != "labels.txt") %>%
  separate(filePath, sep = '_',
           into = c("type", "experiment", "userId"),
           remove = FALSE) %>%
  mutate(
    experiment = str_remove(experiment, "exp"),
    userId = str_remove_all(userId, "user|\\.txt")
  ) %>%
  spread(type, filePath)
fileInfo %>% head() %>% kable()
experiment | userId | acc | gyro |
01 | 01 | acc_exp01_user01.txt | gyro_exp01_user01.txt |
02 | 01 | acc_exp02_user01.txt | gyro_exp02_user01.txt |
03 | 02 | acc_exp03_user02.txt | gyro_exp03_user02.txt |
04 | 02 | acc_exp04_user02.txt | gyro_exp04_user02.txt |
05 | 03 | acc_exp05_user03.txt | gyro_exp05_user03.txt |
06 | 03 | acc_exp06_user03.txt | gyro_exp06_user03.txt |
Reading and gathering data
Before we can do anything with the data provided, we need to get it into a model-friendly format. This means we want a list of observations, their class (or activity label), and the data corresponding to the recording.
To do this, we will scan through each of the recording files present in dataFiles, look up which observations are contained in the recording, extract those recordings, and return everything in an easy-to-model-with dataframe.
# Read contents of a single file into a dataframe with accelerometer and gyro data.
readInData <- function(experiment, userId){
  genFilePath = function(type) {
    paste0("data/RawData/", type, "_exp", experiment, "_user", userId, ".txt")
  }
  bind_cols(
    read.table(genFilePath("acc"), col.names = c("a_x", "a_y", "a_z")),
    read.table(genFilePath("gyro"), col.names = c("g_x", "g_y", "g_z"))
  )
}
# Function to read a given file and get the observations contained along
# with their classes.
loadFileData <- function(curExperiment, curUserId) {

  # load sensor data from file into dataframe
  allData <- readInData(curExperiment, curUserId)

  extractObservation <- function(startPos, endPos){
    allData[startPos:endPos, ]
  }

  # get observation locations in this file from labels dataframe
  dataLabels <- labels %>%
    filter(userId == as.integer(curUserId),
           experiment == as.integer(curExperiment))

  # extract observations as dataframes and save as a column in the dataframe.
  dataLabels %>%
    mutate(
      data = map2(startPos, endPos, extractObservation)
    ) %>%
    select(-startPos, -endPos)
}
# scan through all experiment and userId combos and gather data into a dataframe.
allObservations <- map2_df(fileInfo$experiment, fileInfo$userId, loadFileData) %>%
  right_join(activityLabels, by = c("activity" = "number")) %>%
  rename(activityName = label)

# cache work.
write_rds(allObservations, "allObservations.rds")
allObservations %>% dim()
Exploring the data
Now that we have all the data loaded along with the experiment, userId, and activity labels, we can explore the data set.
Length of recordings
Let's first take a look at the length of the recordings by activity.
allObservations %>%
  mutate(recording_length = map_int(data, nrow)) %>%
  ggplot(aes(x = recording_length, y = activityName)) +
  geom_density_ridges(alpha = 0.8)
The fact that there is such a difference in recording length between the different activity types requires us to be a bit careful with how we proceed. If we train the model on every class at once, we would have to pad every observation to the length of the longest one, which would leave the large majority of observations with a huge proportion of their data being nothing but padding zeros. Because of this, we will fit our model to just the group of activities with comparable recording lengths: the postural transitions STAND_TO_SIT, STAND_TO_LIE, SIT_TO_STAND, SIT_TO_LIE, LIE_TO_STAND, and LIE_TO_SIT.
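To make the gap in recording lengths concrete, here is a quick numeric complement to the ridge plot above (a small optional check, not part of the original walkthrough):
# Median and maximum recording length (in samples at 50 Hz) per activity.
allObservations %>%
  mutate(recording_length = map_int(data, nrow)) %>%
  group_by(activityName) %>%
  summarise(
    median_length = median(recording_length),
    max_length = max(recording_length)
  ) %>%
  arrange(desc(median_length))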
An interesting future direction would be to try another architecture, such as an RNN, that can handle variable-length inputs, and train it on all of the data. However, you would run the risk of the model simply learning that a long observation is most likely one of the four longest classes, which would not generalize to a scenario where you were running this model on a real-time stream of data.
Filtering activities
Based on our work above, let's subset the data to just the activities of interest.
desiredActivities <- c(
"STAND_TO_SIT", "SIT_TO_STAND", "SIT_TO_LIE",
"LIE_TO_SIT", "STAND_TO_LIE", "LIE_TO_STAND"
)
filteredObservations <- allObservations %>%
  filter(activityName %in% desiredActivities) %>%
  mutate(observationId = 1:n())
filteredObservations %>% paged_table()
So after our aggressive pruning of the data, we will still have a respectable amount of data left for our model to learn from.
Training/testing split
Before we go any further into exploring the data for our model, in an attempt to be as fair as possible with our performance measures, we need to split the data into a train and test set. Since each user performed all activities just once (with the exception of one who only did 10 of the 12 activities), splitting on userId ensures that our model sees entirely new people when we test it.
# get all users
userIds <- allObservations$userId %>% unique()

# randomly choose 24 (80% of the 30 individuals) for training
set.seed(42) # seed for reproducibility
trainIds <- sample(userIds, size = 24)

# set the rest of the users to the testing set
testIds <- setdiff(userIds, trainIds)

# filter data.
trainData <- filteredObservations %>%
  filter(userId %in% trainIds)

testData <- filteredObservations %>%
  filter(userId %in% testIds)
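As a quick sanity check on the split (an optional check, not from the original post), we can confirm that no user appears in both sets and see how many observations land in each:
length(intersect(trainIds, testIds))  # should be 0: no user is in both sets
nrow(trainData)                       # observations available for training
nrow(testData)                        # observations held out for testing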
Visualizing activities
Now that we have trimmed our data by removing activities and splitting off a test set, we can actually visualize the data for each class to see if there is any immediately discernible shape that our model may be able to pick up on.
First, let's unpack our data from its dataframe of one-row-per-observation to a tidy version of all the observations.
unpackedObs <- 1:nrow(trainData) %>%
  map_df(function(rowNum){
    dataRow <- trainData[rowNum, ]
    dataRow$data[[1]] %>%
      mutate(
        activityName = dataRow$activityName,
        observationId = dataRow$observationId,
        time = 1:n() )
  }) %>%
  gather(reading, value, -time, -activityName, -observationId) %>%
  separate(reading, into = c("type", "direction"), sep = "_") %>%
  mutate(type = ifelse(type == "a", "acceleration", "gyro"))
Now that we have an unpacked set of our observations, let's visualize them!
unpackedObs %>%
  ggplot(aes(x = time, y = value, color = direction)) +
  geom_line(alpha = 0.2) +
  geom_smooth(se = FALSE, alpha = 0.7, size = 0.5) +
  facet_grid(type ~ activityName, scales = "free_y") +
  theme_minimal() +
  theme(axis.text.x = element_blank())
So at least in the accelerometer data, patterns definitely emerge. One would imagine that the model may have trouble distinguishing between LIE_TO_SIT and LIE_TO_STAND, as they have a similar profile on average. The same goes for SIT_TO_STAND and STAND_TO_SIT.
Preprocessing
Before we can train the neural network, we need to take a couple of steps to preprocess the data.
Padding observations
First, we will decide what length to pad (and truncate) our sequences to by finding the 98th percentile of observation lengths. By not using the very longest observation length, we avoid letting extra-long outlier recordings mess up the padding.
padSize <- trainData$data %>%
  map_int(nrow) %>%
  quantile(p = 0.98) %>%
  ceiling()
padSize
98%
334
Now we simply need to convert our list of observations to matrices, then use the super handy pad_sequences() function in Keras to pad all observations and turn them into a 3D tensor for us.
convertToTensor <- . %>%
  map(as.matrix) %>%
  pad_sequences(maxlen = padSize)

trainObs <- trainData$data %>% convertToTensor()
testObs <- testData$data %>% convertToTensor()
dim(trainObs)
[1] 286 334 6
Wonderful, we now have our data in a nice neural-network-friendly format of a 3D tensor with dimensions (<num obs>, <sequence length>, <channels>).
One-hot encoding
There's one last thing we need to do before we can train our model, and that is to turn our observation classes from integers into one-hot, or dummy-encoded, vectors. Luckily, Keras has again supplied us with a very helpful function to do just this.
oneHotClasses <- . %>%
  {. - 7} %>%        # bring integers down to 0-5 from 7-12
  to_categorical()   # One-hot encode

trainY <- trainData$activity %>% oneHotClasses()
testY <- testData$activity %>% oneHotClasses()
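As a quick check (not in the original walkthrough), the encoded matrices should have one row per observation and six columns, one per transition class:
dim(trainY)  # expect: number of training observations x 6
dim(testY)   # expect: number of testing observations x 6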
Modeling
Architecture
Since we have temporally dense time-series data, we will make use of 1D convolutional layers. With temporally dense data, an RNN has to learn very long dependencies in order to pick up on patterns, whereas a CNN can simply stack a few convolutional layers to build pattern representations of substantial length. Since we are also simply looking for a single classification of activity for each observation, we can just use pooling to 'summarize' the CNN's view of the data into a dense layer.
In addition to stacking two layer_conv_1d() layers, we will use batch norm and dropout (the spatial variant (Tompson et al. 2014) on the convolutional layers and standard dropout on the dense layers) to regularize the network.
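To get a feel for how much of a recording the stacked convolutions see, here is a back-of-the-envelope receptive-field calculation (a rough sketch, not part of the original post; it assumes stride-1 convolutions like those used below):
# Receptive field of n stacked stride-1 conv layers sharing one kernel size.
convReceptiveField <- function(kernel_size, n_layers) {
  1 + n_layers * (kernel_size - 1)
}
convReceptiveField(kernel_size = 8, n_layers = 2)      # 15 time steps
convReceptiveField(kernel_size = 8, n_layers = 2) / 50 # ~0.3 seconds at 50 Hz
The global average pooling layer then aggregates these local pattern detections across the entire padded sequence.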
input_shape <- dim(trainObs)[-1]
num_classes <- dim(trainY)[2]
filters <- 24     # number of convolutional filters to learn
kernel_size <- 8  # how many time steps each conv layer sees.
dense_size <- 48  # size of our penultimate dense layer.
# Initialize model
model <- keras_model_sequential()
model %>%
  layer_conv_1d(
    filters = filters,
    kernel_size = kernel_size,
    input_shape = input_shape,
    padding = "valid",
    activation = "relu"
  ) %>%
  layer_batch_normalization() %>%
  layer_spatial_dropout_1d(0.15) %>%
  layer_conv_1d(
    filters = filters/2,
    kernel_size = kernel_size,
    activation = "relu"
  ) %>%
  # Apply average pooling:
  layer_global_average_pooling_1d() %>%
  layer_batch_normalization() %>%
  layer_dropout(0.2) %>%
  layer_dense(
    dense_size,
    activation = "relu"
  ) %>%
  layer_batch_normalization() %>%
  layer_dropout(0.25) %>%
  layer_dense(
    num_classes,
    activation = "softmax",
    name = "dense_output"
  )

summary(model)
______________________________________________________________________
Layer (type) Output Shape Param #
======================================================================
conv1d_1 (Conv1D) (None, 327, 24) 1176
______________________________________________________________________
batch_normalization_1 (BatchNo (None, 327, 24) 96
______________________________________________________________________
spatial_dropout1d_1 (SpatialDr (None, 327, 24) 0
______________________________________________________________________
conv1d_2 (Conv1D) (None, 320, 12) 2316
______________________________________________________________________
global_average_pooling1d_1 (Gl (None, 12) 0
______________________________________________________________________
batch_normalization_2 (BatchNo (None, 12) 48
______________________________________________________________________
dropout_1 (Dropout) (None, 12) 0
______________________________________________________________________
dense_1 (Dense) (None, 48) 624
______________________________________________________________________
batch_normalization_3 (BatchNo (None, 48) 192
______________________________________________________________________
dropout_2 (Dropout) (None, 48) 0
______________________________________________________________________
dense_output (Dense) (None, 6) 294
======================================================================
Total params: 4,746
Trainable params: 4,578
Non-trainable params: 168
______________________________________________________________________
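As a side note, the parameter counts in the summary can be reproduced by hand, which is a nice sanity check on the architecture (a small illustrative calculation, not from the original post):
# Hand-computing the parameter counts reported by summary(model):
8 * 6 * 24 + 24    # conv1d_1: kernel_size x 6 input channels x 24 filters + 24 biases = 1176
8 * 24 * 12 + 12   # conv1d_2: kernel_size x 24 channels x 12 filters + 12 biases = 2316
12 * 48 + 48       # dense_1: 12 pooled features x 48 units + 48 biases = 624
48 * 6 + 6         # dense_output: 48 units x 6 classes + 6 biases = 294
Each batch normalization layer adds four parameters per channel (scale, offset, and the two moving statistics); the moving statistics are not trained, which is where the 168 non-trainable parameters come from (2 x (24 + 12 + 48)).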
Training
Now we can train the model using our test and training data. Note that we use callback_model_checkpoint() to ensure that we save only the best variation of the model (desirable since at some point in training the model may begin to overfit or otherwise stop improving).
# Compile model
model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = "rmsprop",
  metrics = "accuracy"
)

trainHistory <- model %>%
  fit(
    x = trainObs, y = trainY,
    epochs = 350,
    validation_data = list(testObs, testY),
    callbacks = list(
      callback_model_checkpoint("best_model.h5",
                                save_best_only = TRUE)
    )
  )
The model is learning something! We get a respectable 94.4% accuracy on the validation data, which is not bad with six possible classes to choose from. Let's look into the validation performance a little deeper to see where the model is messing up.
Evaluation
Now that we have a trained model, let's investigate the errors it made on our testing data. We can load the best model from training based upon validation accuracy, and then look at each observation, what the model predicted, how high a probability it assigned, and the true activity label.
# dataframe to get labels onto one-hot encoded prediction columns
oneHotToLabel <- activityLabels %>%
  mutate(number = number - 7) %>%
  filter(number >= 0) %>%
  mutate(class = paste0("V", number + 1)) %>%
  select(-number)

# Load our best model checkpoint
bestModel <- load_model_hdf5("best_model.h5")

tidyPredictionProbs <- bestModel %>%
  predict(testObs) %>%
  as_data_frame() %>%
  mutate(obs = 1:n()) %>%
  gather(class, prob, -obs) %>%
  right_join(oneHotToLabel, by = "class")
predictionPerformance <- tidyPredictionProbs %>%
  group_by(obs) %>%
  summarise(
    highestProb = max(prob),
    predicted = label[prob == highestProb]
  ) %>%
  mutate(
    truth = testData$activityName,
    correct = truth == predicted
  )
predictionPerformance %>% paged_table()
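Before digging into individual predictions, we can pull the overall test-set accuracy straight out of this dataframe (a small check; the exact value will vary with the training run):
# Overall proportion of correctly classified test observations.
predictionPerformance %>%
  summarise(accuracy = mean(correct))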
First, let's look at how 'confident' the model was, split by whether the prediction was correct or not.
predictionPerformance %>%
  mutate(result = ifelse(correct, 'Correct', 'Incorrect')) %>%
  ggplot(aes(highestProb)) +
  geom_histogram(binwidth = 0.01) +
  geom_rug(alpha = 0.5) +
  facet_grid(result ~ .) +
  ggtitle("Probabilities associated with prediction by correctness")
Reassuringly, it seems the model was, on average, less confident about its classifications for the incorrect results than for the correct ones. (Although the sample size is too small to say anything definitively.)
Let's see which activities the model had the hardest time with using a confusion matrix.
predictionPerformance %>%
  group_by(truth, predicted) %>%
  summarise(count = n()) %>%
  mutate(good = truth == predicted) %>%
  ggplot(aes(x = truth, y = predicted)) +
  geom_point(aes(size = count, color = good)) +
  geom_text(aes(label = count),
            hjust = 0, vjust = 0,
            nudge_x = 0.1, nudge_y = 0.1) +
  guides(color = FALSE, size = FALSE) +
  theme_minimal()
We see that, as the initial visualization suggested, the model had a bit of trouble distinguishing between the LIE_TO_SIT and LIE_TO_STAND classes, along with SIT_TO_LIE and STAND_TO_LIE, which also have similar visual profiles.
Future directions
The most obvious future direction for this analysis would be to attempt to make the model more general by working with more of the supplied activity types. Another interesting direction would be to not separate the recordings into distinct 'observations' but instead keep them as one streaming set of data, much like a real-world deployment of a model would work, and see how well a model could classify streaming data and detect changes in activity.
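As a rough sketch of what that streaming setup could look like, the snippet below slides a fixed-length window over one raw recording and scores each window with the trained model. This is purely illustrative: it reuses readInData(), convertToTensor(), padSize, and bestModel from above, and since this particular model only knows the six transition classes, every window would be forced into one of them.
# Illustrative only: score a sliding window over a single raw recording.
streamData <- readInData("01", "01")   # one full raw recording (experiment 01, user 01)
windowStride <- 50                     # slide the window by one second (50 Hz)

# start index of every window that fits inside the recording
starts <- seq(1, nrow(streamData) - padSize, by = windowStride)

# turn each window into a padSize x 6 matrix and stack into a 3D tensor
windowTensor <- starts %>%
  map(~ streamData[.x:(.x + padSize - 1), ]) %>%
  convertToTensor()

# one row of class probabilities per window position
streamProbs <- bestModel %>% predict(windowTensor)
dim(streamProbs)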
Gal, Yarin, and Zoubin Ghahramani. 2016. “Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning.” In International Conference on Machine Learning, 1050–9.
Graves, Alex. 2012. “Supervised Sequence Labelling.” In Supervised Sequence Labelling with Recurrent Neural Networks, 5–13. Springer.
Kononenko, Igor. 1989. “Bayesian Neural Networks.” Biological Cybernetics 61 (5). Springer: 361–70.
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. 2015. “Deep Learning.” Nature 521 (7553). Nature Publishing Group: 436.
Reyes-Ortiz, Jorge-L, Luca Oneto, Albert Samà, Xavier Parra, and Davide Anguita. 2016. “Transition-Aware Human Activity Recognition Using Smartphones.” Neurocomputing 171. Elsevier: 754–67.
Tompson, Jonathan, Ross Goroshin, Arjun Jain, Yann LeCun, and Christoph Bregler. 2014. “Efficient Object Localization Using Convolutional Networks.” CoRR abs/1411.4280. http://arxiv.org/abs/1411.4280.