Exploratory_Analysis_of_Smartphone_Sensor_Data_for_Human_Activity_Recognition

Received 24 August 2023, accepted 6 September 2023, date of publication 12 September 2023,
date of current version 18 September 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3314651
Exploratory Analysis of Smartphone Sensor Data for Human Activity Recognition
S.M. MOHIDUL ISLAM AND KAMRUL HASAN TALUKDER
Computer Science and Engineering Discipline, Khulna University, Khulna 9208, Bangladesh
Corresponding author: S.M. Mohidul Islam ([email protected])
This work was supported in part by the Information and Communication Technology (ICT) Division of the Bangladesh Government
Ministry of Posts, Telecommunications, and Information Technology under the ICT Fellowship under Grant 22FS15324; and in part
by the Khulna University Administration (by paying the publication fee partially).
ABSTRACT Precise recognition of human activities in any smart environment, such as smart homes or smart healthcare centers, is vital for child care, elder care, disabled patient monitoring, self-management systems, safety, tracking healthcare functionality, etc. Automatic human activity recognition (HAR) based on smartphone sensor data is becoming increasingly widespread. However, understanding human activities from sensor data with machine learning is challenging, so the recognition accuracy of many state-of-the-art methods is relatively low, and improving it requires high computational overhead. The goal of this paper is to use exploratory data analysis (EDA) to address this problem; the resulting visualizations and dimensionality reductions assist in deciding which data mining techniques to use. The HAR method proposed in this paper, based on smartphone accelerometer and gyroscope data, EDA, and prediction models, is a high-precision method, and its highest accuracy is 97.12% on the HAR smartphone dataset. Two ensembles of heterogeneous models, stacking and voting, are used in this study to identify human activities of daily living (ADL). Three estimators are used for both stacked and voting generalization: Linear Discriminant Analysis, Linear Support Vector Machines, and Logistic Regression. The experimental results show that the generalization algorithms provide an automatic and precise HAR system and can serve as a decision-making tool to identify ADL in any smart environment.
INDEX TERMS Activities of daily living, exploratory data analysis, hard voting, heterogeneous model,
smartphone sensor, stacked generalization.
The associate editor coordinating the review of this manuscript and approving it for publication was Chao Tong.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 11, 2023
S.M. M. Islam, K. H. Talukder: Exploratory Analysis of Smartphone Sensor Data for HAR
I. INTRODUCTION
Human activity recognition (HAR) has emerged as an exciting interdisciplinary field of research at the intersection of computer science, signal processing, and machine learning. A HAR system can automatically detect, classify, and understand human activity. HAR systems enable intelligent systems that adapt to human actions, improve personalized services, and provide valuable insights for observing and improving individual health and performance. With the explosion of wearable devices, ubiquitous sensing technologies, and the growing need for smart environments, HAR has gained considerable attention as a vital issue in numerous practical domains, including healthcare, surveillance, sports analysis, robotics, and human-computer interaction [1], [2]. For example, in healthcare, HAR techniques can be employed to monitor patients' physical activities, facilitating the early detection of anomalies and the design of personalized treatment strategies.
Activities of daily living (ADL or ADLs) is a term used in healthcare to denote the fundamental activities people perform in their daily life, generally without the help of others, such as walking, running, standing, sitting, bathing, dressing, walking upstairs, walking downstairs, lying, brushing, etc. [2]. ADL is used as an indicator of an individual's functional status. A person who cannot accomplish necessary ADLs may have a worse quality of life or be at risk in their
present life situations; hence, they may need the help of other persons and/or mechanical devices [3].
Due to the widespread availability of smartphones in recent years, recognizing ADL using smartphone data has gained significant attention. With built-in sensors such as accelerometers, gyroscopes, magnetometers, and GPS, smartphones have become powerful tools for collecting data about human activities. Moreover, researchers have achieved improved recognition performance and better discrimination between similar activities by exploring the fusion of multiple sensors in smartphones [4], [5], [6], [7]. Sensor fusion techniques, such as feature-level fusion, early fusion, and late fusion, have been used to combine data from multiple sensors effectively.
In short, existing smartphone-based HAR works have established the potential of smartphones as trustworthy and handy tools to recognize human activities. Various data mining algorithms, sensor fusion approaches, and signal processing techniques have contributed considerably to improving performance and recognition accuracy.
Unlike existing works, which typically address only the activity identification part of HAR systems, this study proposes a new and detailed exploratory data analysis (EDA) method for visualizing data to separate human activities. Using smartphone sensors' data for ADLs and combining the results of multiple heterogeneous models, we achieve activity recognition effectively. Our contributions are as follows:
• In this study, a detailed exploratory data analysis is outlined to deal with the recognition of activities of daily living based on smartphone sensor data. As far as we know, no existing work has presented the analysis of HAR data in such detail. The detailed EDA highlights the internal characteristics of the data, so that the analyst's understanding of how to identify activities deepens, or changes, as the real distribution of the activity data is revealed.
• We use boxplots of the 'five number summary' to visualize the dispersion of data, and histograms and bars of probability distribution functions to find a starting point for differentiating the steady and moving activities in univariate analysis. Moreover, we apply kernel Principal Component Analysis (kPCA) as well as the T-distributed Stochastic Neighbor Embedding (t-SNE) manifold learning method to investigate the separability of data on all features. This detailed EDA assists in selecting a robust model for the HAR method.
• The HAR method proposed in this paper, based on smartphone sensor data, detailed EDA, and recognition from multiple heterogeneous models, is a new lightweight ensemble method. Although the machine learning method used here is lightweight, it is a high-precision method. This study attains higher prediction accuracy and lower training time in comparison with state-of-the-art shallow and deep models. Various model evaluation techniques are used to measure the performance of the method to validate the estimated results.
The rest of this paper is organized as follows. The second section reviews some state-of-the-art works in the field of HAR and the use of smartphone sensors to capture data. The third section presents a new and detailed exploratory data analysis for the selected dataset and details the modeling used to deal with sensor data for ADL recognition. The fourth section illustrates and describes the experimental results and discussion with estimators and their hyper-parameters. The fifth section summarizes the work of this study, draws a conclusion, and offers some directions for future study.
II. RELATED WORKS
In the past decades, HAR has become an active field of research, and many researchers have worked on HAR systems for building various HAR applications in smart environments. Generally, a HAR system consists of several common steps [7]: sensing activity data from environment or body sensors, pre-processing and labeling the activity data, segmentation using a sliding window, feature extraction from time and/or frequency domains, and modeling using shallow and/or deep learning methods with or without transfer learning. As modeling is an essential and significant part of the HAR system, the selection of a classification model has a prominent effect on the overall precision of the system.
In the literature, there are two types of HAR systems based on the classification algorithms used. One prominent approach is based on the use of shallow learning algorithms. These algorithms learn from labeled datasets where each activity is associated with a specific set of sensor data features. Features commonly used include time-domain features, frequency-domain features, statistical features, and spatial features. Researchers have employed classifiers such as decision trees [7], [8], k-nearest neighbors (k-NN) [8], random forests [7], [8], artificial neural networks (ANN) [8], [9], support vector machines (SVM) [7], [8], [9], [10], etc. to recognize activities such as walking, running, sitting, standing, and cycling. These approaches have demonstrated promising results, recognizing activities with high accuracy rates. Kong et al. [7] proposed a method based on six different shallow learning models and achieved the highest accuracy using linear SVC with the grid search method of tuning hyper-parameters. They present some data analysis like ours, but a detailed exploratory analysis of the smartphone sensor data is not provided. Moreover, although their methods provide somewhat higher accuracy, they require relatively higher training time. Masum et al. [8] captured data using a Xiaomi Redmi 4A smartphone, used PCA for selecting features, applied several mining algorithms including Dense Neural Network, decision tree, k-NN, random forests, and SVM, and achieved a highest accuracy of 94.38% on their prepared dataset. They compared the recognition results based on gender (male and female), which had not been compared in any former research, but their methods provided worse results for highly similar activities such as walking with walking downstairs
and/or walking upstairs. Khan et al. [9] acquired data using an LG Nexus 4 smartphone from five different phone positions on the body from 40 subjects, sampling at 6 different rates, and used data from 30 subjects for offline training and that of 10 subjects for real-time testing. They used kernel discriminant analysis to reduce class variance and ANN for modeling, and achieved a highest accuracy of 87.1%. They offered lightweight features that do not require higher sampling rates or longer time windows for their calculation and so help attain a fast response, but those features are not fully independent of the phone's position/orientation, such as the phone in the user's hands, in a carrier bag, in a coat's side pocket, etc. Moreover, their recognition accuracy is a bit lower than in many former works. Diney et al. [10] captured accelerometer data using an Android smartphone from a single subject and proposed an SVM model for the recognition of three activities of daily living. The authors developed a representation of the initially generated vectors as compact clusters but captured training and recognition data from only one subject, so it cannot be a general solution for HAR applications.
Another approach is based on the use of deep learning algorithms for HAR. Deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have shown remarkable performance in various recognition tasks including HAR. By leveraging the hierarchical representations learned from raw sensor data, deep learning models can automatically extract relevant features and capture complex temporal dependencies in activity sequences. Researchers have designed deep learning architectures for HAR [5], [6], [11], [12], [13], achieving state-of-the-art accuracy rates and robustness to different environments and user populations. Shi et al. [5] used the Boulic kinematic model to construct the dataset from body movement sensors and proposed a Deep Convolutional Generative Adversarial Network (DCGAN) and a deep CNN architecture pre-trained on ImageNet (VGG-16) for recognizing three types of walking activities based on moving speed. This method is suitable for expanding and enriching the training set to avoid overfitting and obtain better results even in the case of high similarity between activities such as fast-walking and really-fast-walking, but the downside is that the authors consider only three types of walking activities. Ravi et al. [6] proposed CNN models using three different regularizations for each of four different datasets: ActiveMiles, WISDM v1.1, Daphnet FoG, and Skoda, and achieved 95.1%, 98.2%, 91.7%, and 96.7% for recognizing 2, 6, 7, and 10 activities respectively. They achieved consistent accuracy for real-time classification on low-power devices using their more discriminative and sensor orientation/placement invariant features, but their precision and computational times are not better than some former state-of-the-art methods. Hammerla et al. [11] worked on three different datasets: Opportunity, PAMAP2, and Daphnet Gait, proposed five different deep models for each of the datasets, and obtained highest accuracies of 92.7%, 93.7%, and 76% for the three datasets respectively. The authors presented a unique regularization method, explored the influence of hyper-parameters, and offered recommendations for future researchers who may use deep learning models, but their suggested setting doesn't show consistent performance across all HAR benchmark datasets, and their guidelines are limited to a few deep models (DNN, CNN, and LSTM) and do not indicate whether they will work for other broadly used deep models such as Inception, Gated Recurrent Unit (GRU), etc. for HAR. Xu et al. [12] worked on 18 mid-level gesture activities from the Opportunity dataset, 18 lifestyle activities from the PAMAP2 dataset, and 6 activities of daily living from the dataset used in this study (HAR smartphone dataset) and achieved 94.6%, 93.5%, and 94.5% accuracy using Inception GoogLeNet and GRU for the three corresponding datasets. Though this method provides better generalization and more consistent performance than existing methods, it doesn't explore the class imbalance problem in the data for real-life HAR applications. Bhattacharya et al. [13] proposed Ensem-HAR, where CNN-Net, CNN-LSTM-Net, ConvLSTM-Net, and StackedLSTM-Net are used as base models and Random Forest is used as the meta-model for stacking. They implemented their method on three different datasets including the one used in this paper and obtained 95.05% accuracy on the HAR Smartphone dataset. Though their stacking of four deep learning-based models performs better than the other works to which it is compared, the cumulative training time of four different deep learning-based models is so high that it cannot be a typical method for real-time HAR applications.
In summary, although plenty of work has been done to boost and optimize the models in HAR methods, the following deficiencies remain:
(1) Human behavior in performing activities is not only habitual and impulsive; human beings may also perform some unrelated activities. Besides this, different users perform the same activity with some variation. Another challenge is to handle the speed of movement in moving activities. We use a new EDA method to deal with HAR, which can effectively and accurately separate the activities.
(2) There are a variety of machine learning techniques, so selecting the best machine learning model is a challenge. The HAR method based on EDA can find a robust classifier for the dataset to reduce error with low training time.
In this paper, the domain knowledge is enriched by exploratory analysis of the data, which in turn helps to select a robust model for the activity classification task. The selected model provides high-precision results by reducing associated HAR problems such as the confusion between highly similar activities like walking and walking downstairs. For evaluating the performance of the selected model, numerous comparative experiments are conducted and various performance metrics are used. The experimental outcomes show our
methodology outperforms the state of the art and reaches an accuracy of up to 97.12%.
III. METHODOLOGY
The methodology involves the following steps: data collection, data preprocessing and analysis (which includes data cleaning as well as exploratory data analysis to observe imbalance in the data and analysis of single and multiple variables), and finally modeling with sensor data. The conceptual figure of the proposed framework is shown in Fig. 1 and is described in the sections below.
FIGURE 1. Conceptual figure of the proposed framework.
A. HAR SMARTPHONE DATASET
The dataset we used in this paper was collected from the UCI machine learning repository [14] and prepared by Anguita et al. [4]. The data is acquired from 30 participants whose ages are between 19 and 48. Each person is asked to perform six activities of daily living: LAYING, SITTING, STANDING, WALKING, WALKING_DOWNSTAIRS, and WALKING_UPSTAIRS while wearing a waist-mounted Samsung Galaxy S II smartphone with embedded inertial sensors (accelerometer and gyroscope). Using its built-in accelerometer and gyroscope, the three-axis linear acceleration and three-axis angular velocity are captured respectively at a uniform rate of 50 Hz. Participants' activities are video recorded so that their activities can be labeled manually. The labeled data is randomly partitioned into train and test sets, where data from 21 participants are selected for generating the training data and the data from the remaining participants are selected for generating the test data.
The noise is removed from the raw sensor signals (acceleration and angular velocity) by applying a median filter and a third-order Butterworth low-pass filter with a corner frequency of 20 Hz, and the signals are then segmented into sliding windows of constant width of 2.56 seconds with 50% overlap, i.e. 128 readings/window. The raw acceleration signal of the sensor is separated into two components, body acceleration and gravity acceleration, by using another Butterworth low-pass filter with a cut-off frequency of 0.3 Hz, because gravitational force is assumed to have only low-frequency components.
From each segmented window, a feature vector is obtained by estimating variables from both time and frequency domains. The data of each feature are normalized and confined within [-1, 1]. Finally, each record of the dataset contains a 561-feature vector, its activity label, and an identifier of the user who carried out that experiment. Fig. 2 shows the total data in the dataset with their train and test split for each activity, and we see that about 70% of the total data is used for training and the rest is used for testing.
FIGURE 2. Splitting whole data into train and test sets.
B. DATA CLEANING
We perform initial data analysis for cleaning the feature data. From the data overview, we see that there is no outlier; all the values are bounded between −1 and 1. Again, we see that there are no duplicates in the train and test datasets. Moreover, we need not worry about null values because there is no missing value in the dataset, and we find no feature with irregular cardinality. However, the dataset contains some features that are irrelevant for machine learning modeling, so we remove those irrelevant features from the dataset.
C. EXPLORATORY DATA ANALYSIS
Getting to understand the data is called Exploratory Data Analysis or EDA. It is a statistical way of perceiving and interpreting the dataset. Usually, EDA comprises the following [15]:
1) Observing data imbalance among the various classes.
2) Univariate feature analysis of the dataset, noting the significance of a specific feature in classification using data visualization methods, usually histograms or boxplots.
3) Multivariate feature analysis of combined features of the dataset, which is usually done using pair plots or dimensionality reduction techniques like PCA or t-SNE.
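The first step in the list above, checking class balance, can be illustrated with a minimal sketch. It assumes that the "imbalance range" reported in this section means the gap between the largest and the smallest class share; that reading is an assumption, not a definition given in the paper. The helper names and the toy labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def class_shares(labels):
    """Return each label's share of the data, in percent."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def imbalance_range(shares):
    """Gap between the largest and smallest share (assumed definition)."""
    return max(shares.values()) - min(shares.values())


# Hypothetical toy labels, not the real HAR smartphone dataset.
labels = ["LAYING"] * 4 + ["SITTING"] * 3 + ["WALKING"] * 3
shares = class_shares(labels)
gap = imbalance_range(shares)
print(shares)
print(f"imbalance range: {gap:.1f} percentage points")
```

The same two helpers could be applied to subject identifiers instead of activity labels to obtain a subject-wise figure.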
1) OBSERVING IMBALANCE IN THE DATA
Fig. 3 below shows the data provided by each subject. Fig. 3(a) shows the percentage of data from each user, and from this figure we observe that each subject has almost the same amount of data. We have somewhat less data from user 8 (eight) compared to others, but that's acceptable. So, we should not worry about the difference between them. Fig. 3(b) shows this in more detail, where the data of each user is further broken down by the user's activities, and we see that each user performs each activity almost the same number of times, which means there is no significant gap in their readings.
FIGURE 3. Data provided by each subject.
The imbalance range of the percentages of the subjects is 1.74%. So, we conclude that subject-wise data as well as the activity-wise data of each subject is well balanced.
Fig. 4 below shows the number of data points for each of the six activities of daily living. Fig. 4(a) shows the percentage of data for each activity, and from this figure we observe that each activity has almost the same amount of data. We have somewhat fewer stair-walking data compared to others, but that's reasonable. So we should not be worried about the difference between them. Fig. 4(b) shows this in more detail, where the data of each activity is further broken down by user, and we see that each activity is performed by each subject almost the same number of times, which means there is no significant gap in their readings.
FIGURE 4. Data for each activity class.
The imbalance range of the percentages of the activities is 5.7%. So, we conclude that activity-wise data as well as subject-wise activity data is balanced enough.
2) UNIVARIATE FEATURE ANALYSIS
Analysis of a single dimension or feature is known as univariate analysis. We performed the following univariate analysis:
a: FEATURE/SENSOR IMPORTANCE FROM DOMAIN KNOWLEDGE
Stationary activities (lying, standing, and sitting) are those where there is no motion of an object. Moving activities (Walking, Walking Upstairs, and Walking Downstairs) are those where there is motion of an object. That means, in motionless activities, accelerometer information will not be very significant, whereas in motion activities accelerometer information will be useful. Fig. 5 shows the number of features in the dataset that come from the accelerometer and gyroscope sensors of the smartphone. As we see, most of the features are constructed from the accelerometer sensor, so moving activities will easily be distinguished.
FIGURE 5. Number of features from various sensors.
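A count like the one in Fig. 5 could be obtained by grouping feature names by sensor. The sketch below assumes that a name containing "Gyro" comes from the gyroscope and a name containing "Acc" (in any letter case) comes from the accelerometer; this naming convention is inferred from the feature names quoted in this paper and is an assumption. The function `sensor_of` is a hypothetical helper, and the short list stands in for the full set of feature names.

```python
from collections import Counter


def sensor_of(feature_name):
    """Guess the source sensor from a feature name (naming convention assumed)."""
    lowered = feature_name.lower()
    if "gyro" in lowered:
        return "gyroscope"
    if "acc" in lowered:
        return "accelerometer"
    return "other"


# A few feature names quoted in this paper; the real dataset has 561 features.
feature_names = [
    "tBodyACCMag-mean()",
    "tBodyACC-max()-X",
    "tGravityAcc-min()-X",
    "fBodyGyro-entropy()-Z",
]
counts = Counter(sensor_of(name) for name in feature_names)
print(counts)
```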
b: STATIC AND DYNAMIC ACTIVITIES ARE UTTERLY DIFFERENT
Fig. 6(a) shows the probability density functions (PDFs) for six ADLs based on the feature 'tBodyACCMag-mean()' (mean value for the magnitude of acceleration in the time domain for body motion). From the PDFs, we can observe the difference between motionless and motion activities. Following the PDF distribution of Fig. 6(a), we look closer by dividing the PDFs into two parts to distinguish the inactivity and motion curves, shown in Fig. 6(b) and Fig. 6(c) respectively. Comparing these two figures, we can find that motion activities are less intensive than motionless activities. Moreover, the ranges of feature data for both types of activities are very dissimilar.
FIGURE 6. Histogram and its nearby view for two types of activities.
c: BODY ACCELERATION CAN SEPARATE IT WELL
Fig. 7 shows the boxplots for six ADLs based on the feature 'tBodyACC-max()-X' (maximum value of acceleration along the X-dimension for body motion). From the boxplots, we can observe the difference between stationary and moving activities clearly by their dispersion. We can find that moving activities spread more than motionless activities. Moreover, we can separate those activities by simply using the following threshold statement: if (tBodyACC-max()-X < -0.75) then Activity = "Static" else Activity = "Dynamic".
Also, using the boxplot we can easily separate the WALKING_DOWNSTAIRS activity from the others: if (tBodyACC-max()-X > 0.25) then Activity = "WALKING_DOWNSTAIRS" else Activity = "others".
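The two threshold statements on 'tBodyACC-max()-X' can be written as small functions. This is only a sketch: the thresholds -0.75 and 0.25 are the ones stated in the text, while the function names and the sample values are hypothetical.

```python
STATIC_THRESHOLD = -0.75     # from the text: below this value the activity is static
DOWNSTAIRS_THRESHOLD = 0.25  # from the text: above this value, WALKING_DOWNSTAIRS


def static_or_dynamic(t_body_acc_max_x):
    """Apply the static/dynamic rule to a tBodyACC-max()-X value."""
    return "Static" if t_body_acc_max_x < STATIC_THRESHOLD else "Dynamic"


def downstairs_or_other(t_body_acc_max_x):
    """Apply the walking-downstairs rule to a tBodyACC-max()-X value."""
    if t_body_acc_max_x > DOWNSTAIRS_THRESHOLD:
        return "WALKING_DOWNSTAIRS"
    return "others"


# Hypothetical feature values, not taken from the dataset.
for value in (-0.93, -0.10, 0.40):
    print(value, static_or_dynamic(value), downstairs_or_other(value))
```

Both rules use a single feature, which is why they can be read directly off the boxplots.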
But still, about 15% of WALKING_DOWNSTAIRS observations are below 0.25 and are therefore misclassified, so this condition makes an error of 15% in walking downstairs classification.
FIGURE 7. Data dispersion of six ADLs for body-acceleration maximum value along X-dimension.
Our analysis shows that not only the body acceleration value itself but also the jerk and magnitude of the body acceleration can separate the activities. Jerk or jolt is the rate at which an object's acceleration changes with respect to time, and magnitude is the absolute change in motion regardless of the direction of movement [16]. We also observe that standard deviation values in both time and frequency domains, as well as entropy in the frequency domain, of the jerk of body acceleration along the X-dimension can separate the ADL. Last but not least, the mean value in the time domain of the magnitude of body acceleration can separate human activities well. These are illustrated in Fig. 19 to Fig. 22, shown in Appendix A.
d: GRAVITY ACCELERATION COMPONENTS ALSO MATTER
Gravity acceleration components can distinguish the laying activity from others. Fig. 8 shows the data spread using boxplots and the probability density function (PDF) for six ADLs based on the feature 'tGravityAcc-min()-X' (smallest value of gravity acceleration in the time domain along the X-axis). From both boxplots and PDF, we can observe that it perfectly distinguishes all data points belonging to the LAYING activity from other activities by just a single if-else statement: If (tGravityAcc-min()-X < 0.35) then Activity = "LAYING" else Activity = "others".
FIGURE 8. Boxplot and PDF of six ADLs for gravity-acceleration minimum value in the time domain along the X-axis.
Analysis shows that not only the gravity acceleration min in the time domain along the X-axis but also the gravity acceleration max, gravity acceleration mean, and gravity acceleration energy (sum of the squares divided by the number of values) in the same domain along the same axis can separate the laying activity well. Last but not least, the angle between
the X-axis and gravity acceleration mean can also separate the laying activity from others. These are illustrated in Fig. 23 to Fig. 26, shown in Appendix B.
e: ANGULAR VELOCITY FROM THE GYROSCOPE IS ALSO A FACTOR
Though accelerometer data is significant for distinguishing static and dynamic activities, our analysis shows that gyroscope data can in many cases discriminate them as well. Fig. 9 shows the boxplots for six ADLs based on the feature 'fBodyGyro-entropy()-Z' (entropy value of angular velocity from the gyroscope along the Z-dimension in the frequency domain for body motion). From the boxplots, we can see that moving activities can be clearly distinguished with a threshold value as follows: If (fBodyGyro-entropy()-Z > 0.04) then Activity = "Dynamic" else Activity = "Static".
FIGURE 9. Data spread of six ADLs for gyroscope-entropy value along X-dimension in the frequency domain for body motion.
3) MULTIVARIATE FEATURE ANALYSIS
Analyzing multiple features together is called multivariate analysis. Above, we performed analysis on a single feature; here we perform analysis over all 561 features (i.e. excluding the 'subject' and 'Activity' features) to investigate the separability of the data through visualization using two non-linear dimensionality reduction techniques: Kernel PCA and T-distributed Stochastic Neighbor Embedding.
a: INVESTIGATING THE SEPARABILITY OF DATA USING KPCA
kPCA is an extension of PCA that achieves non-linear dimensionality reduction through the use of kernels to decompose a multivariate dataset into a set of components that explain a maximum amount of the variance [17]. In PCA the number of components is bounded by the number of features, whereas in kPCA that number is bounded by the number of instances [18].
We have used polynomial as well as Radial Basis Function (RBF) kernels, and the resulting figures are shown in Fig. 10(a) for the polynomial kernel with degree 9 and in Fig. 10(b) for the RBF kernel with coefficient 0.05, respectively.
FIGURE 10. kPCA using two different kernels to separate the activities.
From both figures, we see that steady and moving activities can be separated very well. But the dynamic activities are not easily separable from each other, whereas the static activities are easily separable from each other with some
errors in standing and sitting. That means, kPCA is good for separating each static activity but not good for separating each dynamic activity.
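A two-component kPCA projection with the two kernels described in this subsection could be set up as in the following sketch, using Scikit-learn's `KernelPCA`. Mapping the reported "RBF kernel coefficient, 0.05" to the `gamma` parameter is an assumption, as is the choice of two components for plotting. The random matrix is a synthetic stand-in for the 561-feature data, so the output shows the mechanics only and not the separation seen in Fig. 10.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Synthetic stand-in for the feature matrix (values in [-1, 1], like the dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 20))

# Polynomial kernel of degree 9, as reported in the text.
poly_kpca = KernelPCA(n_components=2, kernel="poly", degree=9)
# RBF kernel; treating the reported coefficient 0.05 as gamma is an assumption.
rbf_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.05)

X_poly = poly_kpca.fit_transform(X)
X_rbf = rbf_kpca.fit_transform(X)
print(X_poly.shape, X_rbf.shape)
```

Each projected array has one row per instance and can be passed to a scatter plot colored by activity label.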
b: INVESTIGATING THE SEPARABILITY OF DATA USING T-SNE
t-SNE is another tool to observe the behavior of the data by projecting them from an extremely high dimensional space to a low dimensional space suitable for visualization. Though it projects data to a low dimensional space, it still retains much of the actual information [19]. It does this by converting affinities between data points to Gaussian joint probabilities in the original space, and it tries to minimize, by gradient descent, the Kullback-Leibler divergence between the joint probabilities of the embedding space and the original data. In the embedded space, affinities are represented by Student's t-distributions. This allows t-SNE to preserve the local structure, which means data that is close in the original space remains close in the embedding space, and the far remains far [20]. It's a powerful dimensionality reduction technique that reveals data that lie in multiple manifolds or clusters.
FIGURE 11. t-SNE with perplexity 80 for separating the activities.
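As a rough sketch, a t-SNE embedding with the settings reported in this subsection (perplexity 80, early exaggeration 12.0, learning rate 153.167, angle 0.5) could be produced with Scikit-learn's `TSNE`. The iteration count is left at the library default rather than set explicitly, and `random_state` is added only for repeatability; both choices are assumptions. The input is synthetic random data standing in for the 561-feature matrix, so no cluster structure should be expected in the result.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for the feature matrix; perplexity must be below n_samples.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 20))

tsne = TSNE(
    n_components=2,
    perplexity=80,
    early_exaggeration=12.0,
    learning_rate=153.167,
    angle=0.5,
    random_state=0,  # assumption: fixed seed so the sketch is repeatable
)
embedding = tsne.fit_transform(X)
print(embedding.shape)
```

Repeating the call with other perplexity values gives the kind of comparison the authors describe across several perplexities.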
Perplexity is the number of nearest neighbors of each point that t-SNE considers when producing conditional probabilities. The perplexity value affects the optimization of t-SNE and therefore the quality of the resulting embedding. That is why we analyze plots with different perplexities: 2, 5, 20, 30, 50, 60, 80, and 100. A higher perplexity considers a larger number of neighbors and ignores more local information in favor of the global structure of the data. Conversely, a lower perplexity considers fewer nearest neighbors and is thus less sensitive to global information in favor of the local neighborhood [20]. Four other factors control the performance of the resulting embedding: early exaggeration, learning rate, maximum number of iterations, and angle [20]. The resulting image for perplexity 80, with early exaggeration 12.0, learning rate 153.167, maximum number of iterations 1000, and angle 0.5, is shown in Fig. 11.

We select perplexity 80 because it balances attention between the local and global characteristics of the data better than the other perplexities. For clarification, the figures for four other perplexities (5, 20, 50, and 100) are shown in Fig. 27 to Fig. 30 in Appendix C.

In Fig. 11, we see the data points in two dimensions and observe their behavior. We can see the six activities in three folds/clusters. We again observe that all classes are fairly separable except the 'standing' and 'sitting' classes, because of similarities in their sensor values; this is expected because both are static actions. Other sensors, such as a heartbeat sensor, may help to discriminate between them because the heart rate differs between resting and standing poses. The laying activity occupies a totally different position. Walking, walking downstairs, and walking upstairs are somewhat similar, so they are clustered together but remain separable from each other. So, t-SNE is good for separating each of both types of activities, especially all dynamic and matting activities.

D. MODELING WITH SENSOR DATA
We have used two ensemble approaches in which the final prediction is obtained from multiple conceptually different, or heterogeneous, learning models: stacking and voting. In both approaches, we have combined the predictions of three classical linear machine learning models: Linear Discriminant Analysis (LDA), Linear Support Vector Machines (LSVM), and Logistic Regression (Logit). These heterogeneous models are applied with the same hyper-parameters in both the stacking and voting cases, but their learning and recognition strategies differ. Both stacking and voting classifiers improve generalizability or robustness over a single classifier [21]. Below, the three estimators used in both our stacked and voting generalizations are outlined with their hyper-parameters so that the results can be reproduced, and then the learning and recognition processes of both ensembles are delineated.

1) ESTIMATORS AND THEIR HYPER-PARAMETERS
The parameters that are not directly learned within models are called hyper-parameters. They are provided as arguments to the constructors of the model classes in Scikit-learn [22].

Linear Discriminant Analysis provides a linear decision boundary, generated by fitting a class-conditional Gaussian density to each class and applying Bayes' rule [23]. As the dataset contains six ADLs, the desired dimensionality here is five. We have used the least squares solution as the solver and automatic shrinkage as a form of regularization (to improve the estimation of covariance matrices) using the Ledoit-Wolf lemma [24].
VOLUME 11, 2023 99489

Support Vector Machines use only a small subset of the training data (called support vectors) in producing the decision boundary [25]. We have used linear support vector classification, i.e. SVM with a linear kernel, so the multiclass problem is handled according to the one-vs-the-rest scheme. The maximum number of iterations used here is 500, with a random state of 42.

Logistic Regression, or Logit regression, is a maximum-entropy log-linear classification model in which the probabilities describing the potential outcomes of a single trial are modeled using a logistic function [26]. We have used L2-regularized logistic regression with a maximum of 50 iterations, and the multiclass problem is handled using cross-entropy loss.

FIGURE 13. Training strategy of the meta-model.
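The three estimator settings described above, and the way Section D combines them, can be sketched with scikit-learn as follows. This is a hedged illustration rather than the authors' code: the synthetic six-class data stand in for the HAR features, any argument not reported in the text is left at the library default, and the ensemble wiring (LDA and LSVM as base estimators, Logit as meta-estimator with 5-fold cross-validation, standardization only for LSVM, hard voting) follows the description given in the two subsections that come next.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the sensor features (assumption: NOT the HAR dataset).
rng = np.random.default_rng(0)
n_classes, n_features, n_per_class = 6, 10, 50
centers = rng.normal(scale=10.0, size=(n_classes, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Hyper-parameters as reported in the text; everything else is a default.
lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
# LSVM is fed standardized inputs, so it is wrapped in a small pipeline.
lsvm = make_pipeline(StandardScaler(), LinearSVC(max_iter=500, random_state=42))
# The L2 penalty is the scikit-learn default for LogisticRegression.
logit = LogisticRegression(max_iter=50)

# Stacking: base estimators in the first layer, Logit as the meta-estimator
# fitted on 5-fold cross-validated predictions of the base estimators.
stacking = StackingClassifier(
    estimators=[("lda", lda), ("lsvm", lsvm)],
    final_estimator=logit,
    cv=5,
)

# Voting: all three estimators in one layer, combined by majority (hard) vote.
voting = VotingClassifier(
    estimators=[("lda", lda), ("lsvm", lsvm), ("logit", logit)],
    voting="hard",
)

scores = {}
for name, model in [("stacking", stacking), ("voting", voting)]:
    model.fit(X, y)  # the ensembles clone and fit their own estimator copies
    scores[name] = model.score(X, y)
    print(f"{name}: training accuracy = {scores[name]:.3f}")
```

Both ensemble classes clone the estimators they are given, so the same three objects can be reused in both without one fit affecting the other. The accuracies printed here describe only the synthetic data and say nothing about the results reported in this paper.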

The list of hyper-parameters for all the above baseline algorithms is shown in Table 1.

TABLE 1. Hyper-parameters of the baseline estimators.

2) LEARNING AND RECOGNITION PROCESS OF STACKING CLASSIFIER
Stacked generalization is a stack of estimators (base estimators) with a final estimator (meta-estimator), which reduces the bias of the individual estimators. This allows the different strengths of all individual predictors to be combined [27], [28]. In our proposed method, of the linear models mentioned above, the first two are used as base models and the third is used as a meta-model, stacked in two different layers, as shown in Fig. 12.

FIGURE 12. Proposed stacking model.

We first remove irrelevant features from both the HAR train and test datasets. During training, the whole training data are fed to the first base estimator, LDA, directly and to the second base estimator, LSVM, after standardization, whereas the Logit model is fitted on out-of-sample predictions through 5-fold cross-validation, i.e. using cross-validated predictions of the base models, to generalize and avoid over-fitting. The training strategy of the base-layer estimators is shown in (1) and (2), and that of the meta-estimator is shown in Fig. 13.

fit(LDA) = LDA_fit(x_train, y_train)    (1)
fit(LSVM) = LSVM_fit(x_train, y_train)    (2)

During recognition, the outputs of the base-layer estimators, LDA and LSVM, on the test data are stacked together in parallel, and the Logit estimator in the second layer uses those outputs as input to compute the final activity class of the stacking model. The recognition process of those estimators is shown in (3), (4), and (5), respectively.

predict(LDA) = LDA_predict(x_test, y_test)    (3)
predict(LSVM) = LSVM_predict(x_test, y_test)    (4)
predict(Logit) = Logit_predict(predict(LDA), predict(LSVM))    (5)

In the above equations (and in (6) to (12) below), x_train and x_test are the train and test data after removing irrelevant and target features from the training and test sets, respectively. Similarly, y_train and y_test are the target activity classes of those datasets. The fit and predict functions (in their various forms) represent learning and prediction by the corresponding models on the given data.

3) LEARNING AND RECOGNITION PROCESS OF VOTING CLASSIFIER
The voting generalization of the proposed method is a majority-rule classifier that combines the three heterogeneous linear models mentioned above and uses a majority vote (hard voting), i.e. the mode of the predicted labels, to recognize the human activity. This is useful for balancing out the weaknesses of each model [21]. All models (LDA, LSVM, and Logit) work in the same layer, as shown in Fig. 14.

After removing irrelevant features from the dataset, the same training data are fed to all estimators directly, except LSVM, to which the data are fed after standardization. That means all estimators are trained on the whole training data. In the recognition stage, each estimator predicts the activity, and the final output is selected based on the maximum vote. The whole

voting strategy is illustrated in the following equations, where (6), (7), and (8) show the training process of each estimator, (9), (10), and (11) show the prediction process of each estimator, and (12) shows the final recognition by the voting classifier.

fit(LDA) = LDA_fit(x_train, y_train)    (6)
fit(LSVM) = LSVM_fit(x_train, y_train)    (7)
fit(Logit) = Logit_fit(x_train, y_train)    (8)
predict(LDA) = LDA_predict(x_test, y_test)    (9)
predict(LSVM) = LSVM_predict(x_test, y_test)    (10)
predict(Logit) = Logit_predict(x_test, y_test)    (11)
predict(VOTING) = max(predict(LDA), predict(LSVM), predict(Logit))    (12)

where max represents the majority class of the predictions.

FIGURE 14. Proposed voting model.

IV. RESULT ANALYSIS
We have implemented our method using the Scikit-learn library for machine learning in Python 3.0. The following section describes our experimental results, followed by a comparison with the state of the art and a discussion of the results.

A. EXPERIMENTAL RESULTS
After developing the machine learning model by properly tuning the hyper-parameters of the estimators, the heat map of the confusion matrix is obtained. Fig. 15 shows the confusion matrix for both the stacking and voting classifiers. From the heat maps, we outline three remarks: (1) The correlation degree of the stacking model is higher than that of the voting classifier. (2) The correlation degree of inactivity is higher than that of moving activity. (3) The correlation degree among static activities is higher than the correlation of any static activity with any moving activity. This means that if a static activity is misclassified, it is misclassified as another static activity, not as a motion activity. The opposite is also true.

Besides this, the detailed classification reports, i.e. activity-wise precision, recall, F1-score, and support, and their weighted averages, are also obtained for both ensembles (Table 2 and Table 3). By analyzing these tables, we find that the difference between precision and recall is not large. This indicates that, for any human activity class, the error of misclassifying it as another class and the error of misclassifying another class as it are almost equal, and both are very low. The higher support of static activities compared with dynamic activities hints that people tend to be reluctant to perform exercise-like activities. The high average values of precision, recall, and F1-score show the robustness of the models, and we see that the stacking model is better than the voting model for the HAR smartphone dataset.

FIGURE 15. Confusion matrix for ensemble models.

Again, the learning accuracy and learning time, as well as the recognition accuracy and recognition time, of the proposed models are also found (Table 4). The training accuracy illustrates that the models are not fully over-fitted, which is why we get comparable prediction accuracy for both models. Comparing the

data within the table, we see that the stacking ensemble shows better results than the voting classifier in all cases except training time, though the training time is negligible. The proposed stacking classifier shows 97.12% accuracy.

TABLE 2. Classification report for stacking ensemble.
TABLE 3. Classification report for voting ensemble.
TABLE 4. Accuracy and required time for the ensembles.

Moreover, to clarify the performance of the proposed models, Cohen's kappa score, the Jaccard score, the Matthews correlation coefficient, the Hamming loss, and the zero-one loss are also obtained and shown in Fig. 16. These metrics also show the robustness of both models and, as before, show slightly better performance for the stacking model.

FIGURE 16. Five other classification performance metrics for proposed stacking and voting models.

B. COMPARISON AND DISCUSSION
Some state-of-the-art studies that worked on the same dataset have been compared with the results of the proposed study, as given in Table 5.

TABLE 5. Comparison with state-of-the-art.
FIGURE 17. Recognition accuracy of all considered models.
TABLE 6. Training time comparison with state-of-the-art.

Comparing the models developed in this study with similar studies in the literature, it is seen that the results of this study are pleasing, with an accuracy rate of 97.12% for the same

dataset. Again, as can be seen from the table, both of our ensemble methods outperform the state-of-the-art results. We also observe that many of the existing works in the above table use deep learning as a model, but our ensemble models show better results than them, so deep learning methods are not automatically the best choice every time when dealing with machine learning and sensory data models. Though our voting classifier outperforms them by a small margin, our stacking classifier outperforms them by a larger margin in recognizing human activities of daily living from the HAR smartphone dataset. This result of the stacking ensemble suggests that stacked generalization can be a good choice for future HAR systems.

To support the results and discussion of our study, we experimented with two other stacking classifiers: (1) using only the LSVM model as both base-layer estimators and meta-estimator, and (2) using LDA and LSVM as base estimators and, again, LSVM as meta-estimator. The results of all of our experimented methods are shown in Fig. 17.

FIGURE 18. Training and testing ROC curves of each activity class and their average for the proposed Stacking ensemble.
TABLE 7. Comparison of the proposed stacking model with its baselines.
FIGURE 19. Data dispersion of six ADLs for the body-acceleration-jerk standard deviation in the time domain along X-dimension.

From this figure, we see that the two newly considered stacking models show 96.91% and 96.84% accuracy, respectively. That means all three stacking methods (the two new ones and the proposed one) outperform both our voting model and all state-of-the-art methods mentioned in Table 5.

Again, among the existing methods listed in Table 5, the highest accuracy is achieved by W. Kong et al. [7], and only that paper mentioned the training time of

their used models. That is why the model training time of our proposed method is compared with that of [7], as shown in Table 6.

FIGURE 20. Data dispersion of six ADLs for body-acceleration-jerk standard deviation in frequency domain along X-dimension.
FIGURE 21. Data dispersion of six ADLs for body-acceleration-jerk entropy in frequency domain along X-dimension.
FIGURE 22. Data dispersion of six ADLs for the body-acceleration-magnitude mean in the time domain.
FIGURE 23. Data extent of six ADLs for gravity-acceleration maximum value in time domain along X-axis.

The authors of [7] developed six different models in their work, and among their models, Linear SVC with GridSearchCV shows the best accuracy. From the above table, we see that both of our ensemble models' training times are

less than that of their best model. Therefore, the above two comparison tables show the strength of our models in terms of accuracy and time complexity.

FIGURE 24. Data extent of six ADLs for gravity-acceleration mean in the time domain along the X-axis.
FIGURE 25. Data extent of six ADLs for gravity-acceleration energy in the time domain along the X-axis.
FIGURE 26. Data extent of six ADLs for the position of gravity-acceleration mean with X-axis.
FIGURE 27. t-SNE with perplexity 5 for separating the activities.

Moreover, the accuracy of our best model (stacking model) is compared with its baseline algorithms in Table 7.

The above table shows that the stacking model provides better results than its baseline methods, and so this table highlights the significance of organizing baseline models


according to the proposed stacking model framework, which performs better than our other proposals and the state-of-the-art methods mentioned.

Finally, the Receiver Operating Characteristic (ROC) curves for the training and test datasets of our best model (stacking model) are shown in Fig. 18. From the two plots of Fig. 18, we see that the area under the curve (AUC) for each activity class, and for their micro- and macro-averages, is 1 in all cases for both the training and test sets, except for the SITTING class in the test set, where it is 0.99, which is very close to 1. Therefore, these AUCs again illustrate the robustness of the proposed stacking model.

The above experimental results and comparisons demonstrate that the exploratory data analysis method has better generalization performance than the traditional data analysis method, and that, as estimators, both the proposed stacking and voting classifiers perform well, with stacked generalization performing better than voting regardless of the human activity or the evaluation strategy. These conclusions offer direction for the strategy of HAR methods using smartphone sensors in the future.

V. CONCLUSION AND FUTURE WORK
The proposed HAR method is a comparatively higher accuracy method. Additionally, the method of this study is lightweight, precise, and reasoning. Moreover, the exploratory data analysis outlined in this paper visualizes the detailed nature of activities, from which more new HAR systems can be developed. The experimental results show that our proposed activity visualization and identification method

has the potential to increase the performance of present HAR applications. Moreover, the solution can be extended to the classification of any multivariate time series data for other applications. As our experiment is carried out offline, our future target is to use our EDA to build and publish a real-time system to classify human activities. Again, as human beings may accomplish numerous activities at the same time, the future HAR hopes to be capable of recognizing parallel activities. Moreover, our future target is to extend HAR based on EDA to other human activities, human interactions, and relationships.

APPENDIX A
FIGURES MENTIONED IN THE SUBSECTION 'BODY ACCELERATION CAN SEPARATE IT WELL'
See Figs. 19–22.

APPENDIX B
FIGURES MENTIONED IN THE SUBSECTION 'GRAVITY ACCELERATION COMPONENTS ALSO MATTERS'
See Figs. 23–26.

APPENDIX C
FIGURES MENTIONED IN THE SUBSECTION 'INVESTIGATING THE SEPARABILITY OF DATA USING T-SNE'
See Figs. 27–30.

APPENDIX D
THE CODES OF THIS STUDY
The codes used to produce the outcomes of our study are available in the following Github link: https://github.com/SM-Mohidul-Islam/Exploratory-Analysis-of-HAR-Smartphone-Sensor-Data.

REFERENCES
[1] C. A. Ronao and S.-B. Cho, "Human activity recognition using smartphone sensors with two-stage continuous hidden Markov models," in Proc. 10th Int. Conf. Natural Comput. (ICNC), Xiamen, China, Aug. 2014, pp. 681–686.
[2] E. Gambi, G. Temperini, R. Galassi, L. Senigagliesi, and A. D. Santis, "ADL recognition through machine learning algorithms on IoT air quality sensor dataset," IEEE Sensors J., vol. 20, no. 22, pp. 13562–13570, Nov. 2020, doi: 10.1109/JSEN.2020.3005642.
[3] P. F. Edemekong, D. Bomgaars, S. Sukumaran, and S. B. Levy, Eds., Activities of Daily Living. Bethesda, MD, USA: StatPearls Publishing, 2020, Accessed: Mar. 15, 2023. [Online]. Available: https://www.ncbi.nlm.nih.gov/books/NBK470404/
[4] D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz, "A public domain dataset for human activity recognition using smartphones," in Proc. Esann, Bruges, Belgium, vol. 3, 2013, p. 3.
[5] X. Shi, Y. Li, F. Zhou, and L. Liu, "Human activity recognition based on deep learning method," in Proc. Int. Conf. Radar (RADAR), Brisbane, QLD, Australia, Aug. 2018, pp. 1–5.
[6] D. Ravi, C. Wong, B. Lo, and G.-Z. Yang, "Deep learning for human activity recognition: A resource efficient implementation on low-power devices," in Proc. IEEE 13th Int. Conf. Wearable Implant. Body Sensor Netw. (BSN), San Francisco, CA, USA, Jun. 2016, pp. 71–76.
[7] W. Kong, L. He, and H. Wang, "Exploratory data analysis of human activity recognition based on smart phone," IEEE Access, vol. 9, pp. 73355–73364, 2021, doi: 10.1109/ACCESS.2021.3079434.
[8] A. K. M. Masum, S. Jannat, E. H. Bahadur, M. G. R. Alam, S. I. Khan, and M. R. Alam, "Human activity recognition using smartphone sensors: A dense neural network approach," in Proc. 1st Int. Conf. Adv. Sci., Eng. Robot. Technol. (ICASERT), Dhaka, Bangladesh, May 2019, pp. 1–6.
[9] A. Khan, M. Siddiqi, and S.-W. Lee, "Exploratory data analysis of acceleration signals to select light-weight and accurate features for real-time activity recognition on smartphones," Sensors, vol. 13, no. 10, pp. 13099–13122, Sep. 2013, doi: 10.3390/s131013099.
[10] P. Dinev, I. R. Draganov, O. L. Boumbarov, and D. Brodić, "Preprocessing and clustering raw accelerometer data from smartphones for human activity recognition," in Proc. CEMA, Athens, Greece, 2017, pp. 20–24.
[11] N. Y. Hammerla, S. Halloran, and T. Plötz, "Deep, convolutional, and recurrent models for human activity recognition using wearables," in Proc. IJCAI, New York, NY, USA, 2016, pp. 1533–1540.
[12] C. Xu, D. Chai, J. He, X. Zhang, and S. Duan, "InnoHAR: A deep neural network for complex human activity recognition," IEEE Access, vol. 7, pp. 9893–9902, 2019, doi: 10.1109/ACCESS.2018.2890675.
[13] D. Bhattacharya, D. Sharma, W. Kim, M. F. Ijaz, and P. K. Singh, "Ensem-HAR: An ensemble deep learning model for smartphone sensor-based human activity recognition for measurement of elderly health monitoring," Biosensors, vol. 12, no. 6, p. 393, Jun. 2022, doi: 10.3390/bios12060393.
[14] Human Activity Recognition Using Smartphones Data Set. Accessed: Jan. 28, 2023. [Online]. Available: https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones
[15] Signal Processing With Machine Learning - Human Activity Recognition Part 1 - EDA. Accessed: Feb. 5, 2023. [Online]. Available: https://medium.com/analytics-vidhya/signal-processing-with-machine-learning-human-activity-recognition-part-i-eda-a1f3b0e91b63
[16] Step By Step All Classification Model for Beginners. Accessed: Feb. 7, 2023. [Online]. Available: https://www.kaggle.com/code/devson/stepbystep-all-classificationmodel-for-beginners
[17] B. Schölkopf, A. Smola, and K. R. Müller, "Kernel principal component analysis," in Proc. ICANN, Lausanne, Switzerland, 1997, pp. 583–588.
[18] S. Wold, K. Esbensen, and P. Geladi, "Principal component analysis," Chemometrics Intell. Lab. Syst., vol. 2, nos. 1–3, pp. 37–52, Aug. 1987, doi: 10.1016/0169-7439(87)80084-9.
[19] L. Van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res., vol. 9, no. 11, pp. 2579–2605, Nov. 2008.
[20] A. J. Izenman, "Introduction to manifold learning," Wiley Interdiscipl. Rev., Comput. Statist., vol. 4, no. 5, pp. 439–446, Sep. 2012, doi: 10.1002/wics.1222.
[21] T. G. Dietterich, "Ensemble methods in machine learning," in Proc. MCS, Berlin, Germany, 2000, pp. 1–15.
[22] Tuning the Hyper-Parameters of an Estimator. Accessed: Feb. 17, 2023. [Online]. Available: https://scikit-learn.org/stable/modules/grid_search.html
[23] R. H. Riffenburgh, "Linear discriminant analysis," Ph.D. dissertation, Dept. Stat., Virginia Polytech. Inst., Blacksburg, VA, USA, 1957.
[24] O. Ledoit and M. Wolf, "Honey, I shrunk the sample covariance matrix," J. Portfolio Manage., vol. 30, no. 4, pp. 110–119, Jun. 2004, doi: 10.3905/jpm.2004.110.
[25] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt, and B. Scholkopf, "Support vector machines," IEEE Intell. Syst. Appl., vol. 13, no. 4, pp. 18–28, Jul./Aug. 1998, doi: 10.1109/5254.708428.
[26] T. G. Nick and K. M. Campbell, "Logistic regression," in Topics in Biostatistics, W. T. Ambrosius, Ed. Totowa, NJ, USA: Humana Press, 2007, pp. 273–301.
[27] D. H. Wolpert, "Stacked generalization," Neural Netw., vol. 5, no. 2, pp. 241–259, Jan. 1992, doi: 10.1016/S0893-6080(05)80023-1.
[28] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, vol. 2, 2nd ed. New York, NY, USA: Springer, 2009, pp. 1–758.
[29] F. Ordóñez and D. Roggen, "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition," Sensors, vol. 16, no. 1, p. 115, Jan. 2016, doi: 10.3390/s16010115.
[30] J. Yang, M. N. Nguyen, P. P. San, X. Li, and S. Krishnaswamy, "Deep convolutional neural networks on multichannel time series for human activity recognition," in Proc. IJCAI, Buenos Aires, Argentina, vol. 15, 2015, pp. 3995–4001.

[31] F. Cruciani, A. Vafeiadis, C. Nugent, I. M. P. Cleland, K. Votis, and R. Hamzaoui, "Feature learning for human activity recognition using convolutional neural networks," CCF Trans. Pervasive Comput. Interact., vol. 2, no. 1, pp. 18–32, 2020, doi: 10.1007/s42486-020-00026-2.
[32] N. Nair, C. Thomas, and D. B. Jayagopi, "Human activity recognition using temporal convolutional network," in Proc. 5th Int. Workshop Sensor-Based Activity Recognit. Interact., Berlin, Germany, Sep. 2018, pp. 1–8.

S.M. MOHIDUL ISLAM was born in 1983. He received the B.Sc. and M.Sc. degrees (Hons.) in CSE, in 2007 and 2016, respectively. He is currently pursuing the Ph.D. degree in computer science and engineering with Khulna University, Bangladesh.
He achieved an ICT Fellowship from Bangladesh Government for the M.Sc. degree in engineering research. He joined Khulna University, as a Faculty Member, in 2008. He has published several research papers in international journals and conferences. His research interests include human activity recognition, data science, machine learning, and smart technology. He is a Life Member of the Engineer's Institution Bangladesh (IEB) and the Bangladesh Computer Society (BCS). He is the former Joint Secretary of the Khulna Region of Bangladesh Open Source Network (BdOSN). He is the former Joint Secretary and a current Branch Member of the Khulna Branch of the Bangladesh Computer Society.

KAMRUL HASAN TALUKDER received the Bachelor of Science degree (Hons.) in CSE, the M.Sc. degree in computer science from the National University of Singapore (NUS), in 2004, and the Doctor of Engineering (D.Eng.) degree from Hiroshima University, Japan, in 2008.
He was the Head of the Computer Science and Engineering Discipline for three years. He joined Khulna University, as a Faculty Member, in 2000. He was a Postdoctoral Fellow with the Japan Society for the Promotion of Science (JSPS), Hiroshima University, for two years. He is currently a Professor with the Computer Science and Engineering Discipline, Khulna University. He is also the Dean of the Science, Engineering, and Technology School, Khulna University. He has published more than 70 peer-reviewed research articles over the years. His research interests include image analysis, software engineering, networking, and the IoT.