Human Activity Recognition Synopsis
Automation systems have recently drawn much attention in both industrial and academic
research, and the Human Activity Recognition (HAR) system is one of them in the field of
computer vision. Interest is driven by mounting demand from numerous applications, such as
medical and healthcare systems, security, visual monitoring, video acquisition,
entertainment, and abnormal activity monitoring, which captures what is going on and
detects illegal or possibly harmful practices. In the area of entertainment, HAR helps to
improve the performance of Human Computer Interaction (HCI). HAR also plays a vital role in
healthcare systems, where recognizing the actions and behaviour of patients helps to
facilitate the rehabilitation process. Many scholars have applied HAR methods in their
studies, especially to abnormal activities in the home, general human activity, sports and
street activity, healthcare monitoring, and many other applications. In this review paper, the computer
vision-based technologies for recognizing human activity or abnormal behaviour with
demanding and computationally intelligent classification techniques, such as deep learning
and machine learning, are reviewed and discussed together with their challenges and future
possibilities. An action in a HAR mechanism may be observed by the human eye or by means of
some kind of visualization or sensing technology. For recognition to work properly, the
activity of the individual in the field of view must be monitored continuously. Based on the
body parts involved in the motion, human actions may be divided into four categories:
• Facial and walking: Focuses on the movement of a person’s face, or of other body parts
while walking, with no verbal contact required.
• Action: A series of gestures performed by a person, such as running, sitting, or
walking.
• Interaction: An activity that incorporates the individual actions executed by a person.
The interaction can be with a single other individual.
• Group Activity: A mix of human movements, behaviours, acts, or interactions. Several
performers can act at the same time, and at least two persons or objects are needed for the
interaction.
Problem Statement:
Existing System:
The most successful and popular vector-form feature is the histogram of oriented
gradients (HOG). HOG features have been shown to rely on the contrast of silhouette
contours against the background. Despite all the difficulties of human detection, a lot of
work has been done in recent years. Human detection involves two steps. First, features
are chosen, such as edge features, Haar features, and gradient orientation features.
Second, a classifier is designed, such as Nearest Neighbor, Neural Network, SVM, or
AdaBoost. High generalization ability and low classification complexity are two important
criteria for selecting a classifier. The linear support vector machine (SVM) and AdaBoost
are two widely used classifiers that satisfy these criteria. The traditional AdaBoost
approach for face detection has demonstrated both high recognition accuracy and fast
run-time performance. However, in most cases its classification accuracy is lower than
that of the first proposed algorithm based on HOG + SVM.
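The two steps described above (gradient orientation features, then a linear classifier) can be illustrated with a minimal sketch. It is a simplification made for illustration and not the full HOG pipeline: it builds the histogram for a single cell only, omits overlapping block normalization and vote interpolation between bins, and the weights passed to `linear_score` are hypothetical values standing in for what a trained linear SVM would supply.

```python
import math

def cell_histogram(patch, bins=9):
    """Orientation histogram of one grayscale cell (a list of pixel rows).

    Gradients are central differences on interior pixels. Each pixel votes
    into an unsigned orientation bin (0 to 180 degrees), weighted by its
    gradient magnitude.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against a rounded angle of exactly 180.
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

def l2_normalize(hist, eps=1e-6):
    """Scale the histogram to unit length so contrast changes matter less."""
    norm = math.sqrt(sum(v * v for v in hist) + eps * eps)
    return [v / norm for v in hist]

def linear_score(weights, bias, features):
    """Decision value of a linear classifier; positive means 'human'."""
    return sum(w * f for w, f in zip(weights, features)) + bias

# A vertical edge (dark left, bright right) puts all votes in the first bin.
vertical_edge = [[0, 0, 255, 255]] * 4
features = l2_normalize(cell_histogram(vertical_edge))

# Hypothetical weights; a real detector would learn them from labelled data.
weights = [1.0] + [0.0] * 8
print(linear_score(weights, -0.5, features) > 0)
```

A full detector concatenates such histograms over many cells of a detection window before scoring, which is why the feature is described as vector-form.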
to the video frames.
Dataset: The Kinetics dataset, which consists of 400 human activity classes, is used for
prediction and comparison of the input data. The Kinetics clips are taken from YouTube
recordings. The activities are human-focused and cover a wide range of classes, including
human-object interactions, for example mowing a lawn and washing dishes, and human
actions. Since the dataset is huge, downloading every clip would be a waste of time, given
that pre-trained models from the original authors are already available. Working with a
pre-trained model is easier and gives accurate results compared with training and tuning a
model separately.
Through programming in Python, the caption of the activity identified by the model can be
displayed on the video while the input file is processed. Simultaneously, the caption is
produced as speech.
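One way the captioning and speech steps could be organised is sketched below. The sketch makes several assumptions that are not taken from the text: `classify_clip` is a hypothetical callable wrapping the pre-trained model, the clip length of 16 frames is an arbitrary choice, and speech is triggered only when the caption changes so that the same activity is not spoken on every frame. The actual drawing and speech calls are left to whichever video and text-to-speech libraries are in use.

```python
from collections import deque

def annotate_stream(frames, classify_clip, clip_len=16):
    """Yield (frame, caption, speak) for every frame of a video stream.

    classify_clip: hypothetical callable mapping a list of clip_len frames
    to an activity label, for example a wrapper around a pre-trained model.
    speak is True only on frames where the caption changes.
    """
    window = deque(maxlen=clip_len)  # rolling buffer of the latest frames
    caption = None                   # no caption until the buffer is full
    for frame in frames:
        window.append(frame)
        speak = False
        if len(window) == clip_len:
            label = classify_clip(list(window))
            speak = label != caption
            caption = label
        yield frame, caption, speak

# Usage with stand-in data: integers play the role of frames, and a toy
# rule plays the role of the model.
def toy_model(clip):
    return "washing dishes" if clip[-1] < 6 else "mowing lawn"

for frame, caption, speak in annotate_stream(range(10), toy_model, clip_len=4):
    if speak:
        print(frame, caption)
```

In a real pipeline the loop body would draw `caption` onto `frame` before display and pass it to the speech engine whenever `speak` is true.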
system is during the ongoing worldwide Covid-19 pandemic, where intelligent tracking of mass
gatherings is of utmost importance to avoid community spread of the disease. As the system comes
with real-time human counting, the statistics it produces significantly reduce the task of the
governing body in identifying crowded places or streets.
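As an illustration of how per-frame counts could be turned into such statistics, the following sketch flags a location as crowded when the average count over recent frames reaches a threshold. The class name `CrowdMonitor`, the window length, and the threshold are hypothetical choices made for this example, and the people count per frame is assumed to come from the detection stage.

```python
from collections import deque

class CrowdMonitor:
    """Flag a location as crowded from a moving average of people counts.

    window and threshold are illustrative values, not taken from the system.
    """

    def __init__(self, window=30, threshold=10.0):
        self.counts = deque(maxlen=window)  # counts for the latest frames
        self.threshold = threshold

    def update(self, people_in_frame):
        """Record one frame's count and return True if the place is crowded."""
        self.counts.append(people_in_frame)
        average = sum(self.counts) / len(self.counts)
        return average >= self.threshold

monitor = CrowdMonitor(window=3, threshold=5)
flags = [monitor.update(count) for count in [2, 4, 9, 9]]
print(flags)  # [False, False, True, True]
```

Averaging over a window rather than reacting to a single frame keeps one missed or spurious detection from toggling the alert.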