Visual Assistant Project-Content Final
CHAPTER 1
INTRODUCTION
● Supervised Machine Learning:
Supervised machine learning algorithms apply what has been
learned in the past to new data, using labeled examples to predict future
events. Starting from the analysis of a known training dataset, the
learning algorithm produces an inferred function to make predictions
about the output values. After sufficient training, the system is able to
provide targets for any new input. The learning algorithm can also compare its
output with the correct, intended output and find errors in order to modify
the model accordingly.
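As a minimal sketch (not part of the project sources), the Java fragment below fits a one-variable linear model to labelled examples by repeatedly comparing its prediction with the intended output and correcting the parameters, which is the essence of supervised learning; the training values are invented for illustration.

// Supervised learning in miniature: fit y = w*x + b to labelled examples
// by gradient descent, correcting the model from its prediction errors.
public class SupervisedExample {
    public static void main(String[] args) {
        double[] inputs = {1, 2, 3, 4, 5};        // known training inputs
        double[] labels = {3, 5, 7, 9, 11};       // correct, intended outputs (y = 2x + 1)
        double w = 0.0, b = 0.0;                  // model parameters to be learned
        double learningRate = 0.01;
        for (int epoch = 0; epoch < 2000; epoch++) {
            for (int i = 0; i < inputs.length; i++) {
                double predicted = w * inputs[i] + b;   // model output
                double error = predicted - labels[i];   // compare with the intended output
                w -= learningRate * error * inputs[i];  // adjust the model accordingly
                b -= learningRate * error;
            }
        }
        // After sufficient training the model can provide targets for new inputs.
        System.out.printf("w=%.2f, b=%.2f, prediction for x=6: %.2f%n", w, b, w * 6 + b);
    }
}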
● Reinforcement Machine Learning:
Reinforcement learning algorithms learn by interacting with an
environment: the agent takes actions, receives rewards or penalties as
feedback, and gradually adjusts its behaviour to maximize the cumulative
reward, without being given explicit labeled examples.
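As a rough illustration (again not taken from the project code), the one-step Q-learning update below shows how such an agent adjusts its action values from the reward it receives; the states, actions and reward are hypothetical.

// One step of tabular Q-learning: nudge Q(state, action) towards
// reward + discounted best value achievable from the next state.
public class QLearningExample {
    public static void main(String[] args) {
        double[][] q = new double[4][2];   // action-value table for 4 states, 2 actions
        double alpha = 0.1;                // learning rate
        double gamma = 0.9;                // discount factor
        int state = 0, action = 1, nextState = 2;
        double reward = 1.0;               // feedback from the environment
        double maxNext = Math.max(q[nextState][0], q[nextState][1]);
        q[state][action] += alpha * (reward + gamma * maxNext - q[state][action]);
        System.out.println("Updated Q(0,1) = " + q[state][action]);
    }
}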
1.1.2 TensorFlow
TensorFlow can run on multiple CPUs and GPUs (with optional CUDA and SYCL extensions for general-
purpose computing on graphics processing units). TensorFlow is
available on 64-bit Linux, macOS, Windows, and mobile computing
platforms including Android and iOS.
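On Android, TensorFlow models are normally run through TensorFlow Lite. The sketch below shows one common way to memory-map a converted .tflite model from the app's assets and run it with the Interpreter API; the file name detect.tflite and the tensor shapes are assumptions made only for illustration, not the project's actual configuration.

import android.content.Context;
import android.content.res.AssetFileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import org.tensorflow.lite.Interpreter;

// Sketch: load a TensorFlow Lite model bundled in the assets folder and run inference.
public class TFLiteSketch {
    private final Interpreter interpreter;

    public TFLiteSketch(Context context) throws IOException {
        interpreter = new Interpreter(loadModelFile(context, "detect.tflite"));
    }

    // Memory-map the model file so it can be handed to the Interpreter.
    private static MappedByteBuffer loadModelFile(Context context, String name) throws IOException {
        AssetFileDescriptor fd = context.getAssets().openFd(name);
        try (FileInputStream stream = new FileInputStream(fd.getFileDescriptor())) {
            FileChannel channel = stream.getChannel();
            return channel.map(FileChannel.MapMode.READ_ONLY, fd.getStartOffset(), fd.getDeclaredLength());
        }
    }

    // Run one inference step; the buffer shapes depend on the actual model.
    public float[][] run(float[][][][] input) {
        float[][] output = new float[1][10];
        interpreter.run(input, output);
        return output;
    }
}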
Until around the end of 2014, the officially supported integrated
development environment (IDE) was Eclipse with the Android
Development Tools (ADT) plugin, although the IntelliJ IDEA IDE (all
editions) fully supports Android development out of the box. As of 2015,
Android Studio, made by Google and powered by IntelliJ, is the official
IDE; developers are free to use other tools, but Google made it clear
that ADT has been officially deprecated since the end of 2015 in order to focus on
Android Studio as the official Android IDE.[7] Additionally, developers
may use any text editor to edit Java and XML files and then use command-
line tools (the Java Development Kit and Apache Ant are required) to create,
build and debug Android applications, as well as control attached Android
devices (e.g., triggering a reboot or installing software packages
remotely).
Below are the basic requirements needed to get started with Android
Studio for building mobile applications.
Table 1.2 Android Phone Requirements
Research shows that object detection with the accuracy of a human eye
has not yet been achieved using cameras, and cameras cannot simply
replace the human eye. Detection refers to the identification of an
object or a person by a model trained for that purpose. Detection of images and
moving objects has been worked on extensively and has been integrated
into commercial, residential and industrial environments. However,
most existing strategies and techniques have serious limitations: heavy
demands on computational resources, poor analysis of the measured
training data, dependence on the motion of the objects, inability to
differentiate one object from another, and sensitivity to the speed
of movement and to lighting. Hence, there is a need to design, apply and
evaluate new detection techniques that tackle these limitations.
Chapter 4 lists the hardware and software requirements for this project. Chapter 5 explains the overall system
architecture and the modules it consists of; the system is also represented by
various UML diagrams. Chapter 6 describes the various techniques used in
the project and gives a general introduction to them. Chapter 7 deals with the
testing of the project and its results. Chapter 8 gives the conclusion
and the future enhancements to be done.
1.5 SUMMARY
In this chapter, the project domain of machine learning is
discussed, along with object detection and the types of machine learning.
The problem description gives a brief account of the main problem to be
resolved in this project. A description of the project is then discussed.
CHAPTER 2
LITERATURE SURVEY
1. Smart Vision using Machine Learning for Blind (2020), International
Journal of Advanced Science and Technology, Vol. 29, No. 5.
In this paper, the authors use hardware components such as a Raspberry Pi
(SoC) and a Pi camera, which make the project more accurate and effective. They
use a pre-trained model named SSD MobileNet V1, which performs very well
compared to other common deep learning models.
This paper provides a flexible and effective model to help the visually
challenged. The results obtained from this prototype were accurate and
reliable. Using TensorFlow, an advanced technique, the objects were trained and
given to the module; hence object detection and identification was easier.
This model will be helpful for visually challenged people in overcoming their
disability.
2. Cang Ye and Xiangfei Qian (2018), '3-D Object Recognition of a Robotic
Navigation Aid for the Visually Impaired', IEEE Transactions on
Neural Systems and Rehabilitation Engineering.
3. Endo, Y., Sato, K., Yamashita, A., & Matsubayashi, K. (2017).
Indoor positioning and obstacle detection for visually impaired
navigation system based on LSD-SLAM.
In the second step, the wall lines are extracted and incorporated into the
graph for 3-DOF SLAM to reduce X, Y, and yaw errors. The method
reduces the 6-DOF pose error and results in more accurate pose with less
computational time than the state-of-the-art planar SLAM methods. Based
on the PE method, a wayfinding system is developed for navigating a
visually impaired person in an indoor environment.
4. Jianhe Yuan, Wenming Cao, Zhihai He, Zhi Zhang and Zhiquan He (2018),
'Fast Deep Neural Networks With Knowledge Guided Training and
Predicted Regions of Interests for Real-Time Video Object Detection',
IEEE Access, Volume 6.
It has been recognized that deeper and wider neural networks are
continuously advancing the state-of-the-art performance of various computer
vision and machine learning tasks.[3] However, they often require large sets
of labeled data for effective training and suffer from extremely high
computational complexity, preventing them from being deployed in real-
time systems, for example vehicle object detection from vehicle cameras for
assisted driving. In this paper, we aim to develop a fast deep neural network
for real-time video object detection by exploring the ideas of knowledge-
guided training and predicted regions of interest. Specifically, we will
develop a new framework for training deep neural networks on datasets with
limited labeled samples using cross-network knowledge projection which is
able to improve the network performance while reducing the overall
computational complexity significantly. A large pre-trained teacher network
is used to observe samples from the training data.
detector, we identify the regions of interest that contain the target objects
with high confidence.
CHAPTER 3
SYSTEM OVERVIEW
3.1.1 Drawbacks
This model assists visually impaired people in their daily routine through a
mobile application. It helps the user by locating the objects in front
of them. The technologies used are image processing and machine
learning. The models are trained using Google's TensorFlow framework.
They are matched against the images from the real-time computer-vision feed.
The output of the above-mentioned process is a TalkBack-style spoken feedback feature.
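A minimal sketch of that spoken output, assuming an Android TextToSpeech engine is initialised by the activity (class and method names here are illustrative; the project's full listing is in Appendix 1):

import android.app.Activity;
import android.speech.tts.TextToSpeech;
import java.util.List;
import java.util.Locale;

// Sketch: announce the recognised object labels through Android text-to-speech.
public class SpokenFeedback {
    private TextToSpeech textToSpeech;

    public void init(Activity activity) {
        textToSpeech = new TextToSpeech(activity, status -> {
            if (status == TextToSpeech.SUCCESS) {
                textToSpeech.setLanguage(Locale.US);
            }
        });
    }

    // Speak out the objects recognised in the current camera frame.
    public void announce(List<String> labels) {
        String sentence = String.join(" and ", labels) + " detected.";
        textToSpeech.speak(sentence, TextToSpeech.QUEUE_FLUSH, null);
    }
}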
This includes the process of training the user to use the system efficiently. The user must not feel threatened by
the system, but must instead accept it as a necessity.
3.4 SUMMARY
This chapter described the existing system and its disadvantages,
which lead to the proposed system. It also discussed the various
feasibility aspects of the system.
CHAPTER 4
SYSTEM REQUIREMENTS
Software Requirement
4.3 SUMMARY
This chapter describes the hardware and software
requirements of the project.
CHAPTER 5
SYSTEM DESIGN
5.2.1 Image Recognition and Object Detection
This module captures the real-time image through the mobile
camera. The object in front of the blind person while walking is the
input data. This module is shown in Fig. 5.2.
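As an illustration of what this module does with each frame, the sketch below runs an SSD-style .tflite detector over a captured Bitmap using the TensorFlow Lite Task Library; the model file name and thresholds are assumptions, and the project's own appendix drives the lower-level Interpreter pipeline instead.

import android.content.Context;
import android.graphics.Bitmap;
import java.io.IOException;
import java.util.List;
import org.tensorflow.lite.support.image.TensorImage;
import org.tensorflow.lite.support.label.Category;
import org.tensorflow.lite.task.vision.detector.Detection;
import org.tensorflow.lite.task.vision.detector.ObjectDetector;

// Sketch: detect objects in one camera frame and log label, score and bounding box.
public class FrameDetector {
    private final ObjectDetector detector;

    public FrameDetector(Context context) throws IOException {
        ObjectDetector.ObjectDetectorOptions options =
                ObjectDetector.ObjectDetectorOptions.builder()
                        .setMaxResults(5)          // report at most five objects per frame
                        .setScoreThreshold(0.5f)   // drop low-confidence detections
                        .build();
        detector = ObjectDetector.createFromFileAndOptions(context, "detect.tflite", options);
    }

    public void detect(Bitmap frame) {
        List<Detection> results = detector.detect(TensorImage.fromBitmap(frame));
        for (Detection result : results) {
            Category top = result.getCategories().get(0);
            android.util.Log.d("FrameDetector",
                    top.getLabel() + " (" + top.getScore() + ") at " + result.getBoundingBox());
        }
    }
}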
Design deals with the various UML (Unified Modeling Language)
diagrams for the implementation of the project. Design is a meaningful
engineering representation of a thing that is to be built. Software design is a
process through which the requirements are translated into a representation of
the software. Design is the place where quality is rendered in software
engineering. UML has many types of diagrams, which are divided into two
categories: some types represent structural information, and the rest
represent general types of behavior, including a few that represent different
aspects of interactions.
A use case is a set of scenarios describing an interaction
between a user and a system. A use case diagram displays the
relationship among actors and use cases. The two main
components of a use case diagram are use cases and actors. An
actor represents a user or another system that will interact with
the system being modeled. The visually impaired user in Fig 5.4 is the
actor.
5.3.2 Class Diagram
5.3.3 Sequence Diagram
5.3.4 Activity Diagram
5.4 SUMMARY
This chapter gives the overall architecture of this project. It also
describes each module that the overall architecture is composed of. The
flow of the project is depicted by various UML diagrams.
CHAPTER 6
SYSTEM IMPLEMENTATION
● Java
Obstacles to development include the fact that Android does not use
established Java standards, that is, Java SE and ME. This prevents
compatibility between Java applications written for those platforms and
those written for the Android platform. Android reuses the Java
language syntax and semantics, but it does not provide the full class
libraries and APIs bundled with Java SE or ME.
versions, that compile the same code that Dalvik runs to Executable
and Linkable Format (ELF) executables containing machine code.
● XML
6.1.2 TensorFlow
6.2 SUMMARY
CHAPTER 7
SYSTEM TESTING
Unit tests ensure that each unique path of a business
process performs accurately to the documented specifications and
contains clearly defined inputs and expected results.
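A hedged example of such a test case follows: it feeds a defined input to a small helper that builds the spoken sentence from detected labels and checks the expected result. The helper is hypothetical and only mirrors the speak() logic shown in Appendix 1.

import static org.junit.Assert.assertEquals;
import java.util.Arrays;
import java.util.List;
import org.junit.Test;

// Illustrative unit test with a clearly defined input and expected result.
public class SentenceBuilderTest {

    // Hypothetical helper mirroring the sentence-building logic of speak().
    private String buildSentence(List<String> labels) {
        StringBuilder stringBuilder = new StringBuilder();
        for (int i = 0; i < labels.size(); i++) {
            stringBuilder.append(labels.get(i));
            if (i + 1 < labels.size()) {
                stringBuilder.append(" and ");
            }
        }
        stringBuilder.append(" detected.");
        return stringBuilder.toString();
    }

    @Test
    public void joinsLabelsAndAppendsDetected() {
        assertEquals("person and chair detected.", buildSentence(Arrays.asList("person", "chair")));
    }
}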
Headphones must be connected.
System testing ensures that the entire integrated software system
meets requirements. It tests a configuration to ensure known and
predictable results. An example of system testing is the configuration
oriented system integration test. System testing is based on process
descriptions and flows, emphasizing pre-driven process links and
integration points.
7.3 SUMMARY
In this chapter, the various tests performed are discussed, along with
the description and purpose of each test.
CHAPTER 8
8.1 CONCLUSION
8.2 FUTURE ENHANCEMENTS
● Facial recognition
● Processing speed of images
● Google Maps navigation
8.3 SUMMARY
In this chapter, the conclusion describes the improvement achieved by
this project.
APPENDIX 1
SAMPLE CODE
CameraActivity.java
/*
Main Function
*/
/*
Initializing Variables
*/
private static final Logger LOGGER = new Logger();
private static final int PERMISSIONS_REQUEST = 1;
private static final String PERMISSION_CAMERA =
Manifest.permission.CAMERA;
private static final String PERMISSION_STORAGE =
Manifest.permission.WRITE_EXTERNAL_STORAGE;
@Override
protected void onCreate(final Bundle savedInstanceState) {
LOGGER.d("onCreate " + this);
super.onCreate(null);
//Set Flag to keep the screen on.
getWindow().addFlags(WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON);
setContentView(R.layout.activity_camera);
return yuvBytes[0];
}
/**
* Callback for Camera2 API
* Initialize Module 1 for Object Detection
*/
@Override
public void onImageAvailable(final ImageReader reader) {
//We need wait until we have some size from onPreviewSizeChosen
if (previewWidth == 0 || previewHeight == 0) {
return;
}
if (rgbBytes == null) {
rgbBytes = new int[previewWidth * previewHeight];
}
try {
final Image image = reader.acquireLatestImage();
if (image == null) {
return;
}
if (isProcessingFrame) {
image.close();
return;
}
isProcessingFrame = true;
Trace.beginSection("imageAvailable");
final Plane[] planes = image.getPlanes();
fillBytes(planes, yuvBytes);
yRowStride = planes[0].getRowStride();
final int uvRowStride = planes[1].getRowStride();
final int uvPixelStride = planes[1].getPixelStride();
imageConverter =
new Runnable() {
@Override
public void run() {
ImageUtils.convertYUV420ToARGB8888(
yuvBytes[0],
yuvBytes[1],
yuvBytes[2],
previewWidth,
previewHeight,
yRowStride,
uvRowStride,
uvPixelStride,
rgbBytes);
}
};
processImage();
} catch (final Exception e) {
LOGGER.e(e, "Exception!");
Trace.endSection();
return;
}
Trace.endSection();
}
LOGGER.d("onStart " + this);
super.onStart(); }
@Override
public synchronized void onResume() {
LOGGER.d("onResume " + this);
super.onResume();
handlerThread = new HandlerThread("inference"); // background thread for inference
handlerThread.start();
handler = new Handler(handlerThread.getLooper());
}
@Override
public synchronized void onPause() {
LOGGER.d("onPause " + this);
if (!isFinishing()) {
LOGGER.d("Requesting finish");
finish();
}
handlerThread.quitSafely();
try {
handlerThread.join();
handlerThread = null;
handler = null;
} catch (final InterruptedException e) {
LOGGER.e(e, "Exception!");
}
super.onPause();
}
@Override
public synchronized void onStop() {
LOGGER.d("onStop " + this);
super.onStop();
}
@Override
public synchronized void onDestroy() {
LOGGER.d("onDestroy " + this);
super.onDestroy();
}
// Returns true if the app already holds the camera and storage permissions.
private boolean hasPermission() {
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
return checkSelfPermission(PERMISSION_CAMERA) == PackageManager.PERMISSION_GRANTED
&& checkSelfPermission(PERMISSION_STORAGE) == PackageManager.PERMISSION_GRANTED;
} else {
return true;
}
}
// Returns true if the device supports the required hardware level, or better.
private boolean isHardwareLevelSupported(
CameraCharacteristics characteristics, int requiredLevel) {
int deviceLevel =
characteristics.get(CameraCharacteristics.INFO_SUPPORTED_HARDWARE_LEVEL);
if (deviceLevel == CameraCharacteristics.INFO_SUPPORTED_HARDWARE_LEVEL_LEGACY) {
return requiredLevel == deviceLevel;
}
// deviceLevel is not LEGACY, can use numerical sort
return requiredLevel <= deviceLevel;
}
private String chooseCamera() {
final CameraManager manager = (CameraManager) getSystemService(Context.CAMERA_SERVICE);
try {
for (final String cameraId : manager.getCameraIdList()) {
final CameraCharacteristics characteristics =
manager.getCameraCharacteristics(cameraId);
// Skip cameras that do not expose a stream configuration map.
final StreamConfigurationMap map =
characteristics.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP);
if (map == null) {
continue;
}
return cameraId;
}
} catch (final CameraAccessException e) {
LOGGER.e(e, "Not allowed to access camera");
}
return null;
}
protected void setFragment() {
String cameraId = chooseCamera();
CameraConnectionFragment camera2Fragment =
CameraConnectionFragment.newInstance(
new CameraConnectionFragment.ConnectionCallback() {
@Override
public void onPreviewSizeChosen(final Size size, final int rotation) {
previewHeight = size.getHeight();
previewWidth = size.getWidth();
CameraActivity.this.onPreviewSizeChosen(size, rotation);
}},
this,
getLayoutId(),
getDesiredPreviewFrameSize());
camera2Fragment.setCamera(cameraId);
getFragmentManager()
.beginTransaction()
.replace(R.id.container, camera2Fragment)
.commit();
}
}}
// Store the latest recognitions and announce them.
if (currentRecognitions != null) {
currentRecognitions = recognitions;
speak();
}
// Inside speak(): join the recognised labels into a single spoken sentence.
if (i + 1 < currentRecognitions.size()) {
stringBuilder.append(" and ");
}}
stringBuilder.append(" detected.");
textToSpeech.speak(stringBuilder.toString(), TextToSpeech.QUEUE_FLUSH, null);
}
APPENDIX - 2
SAMPLE SCREENSHOTS
In this output, multiple objects are detected.
In this output, two persons are detected at a certain distance in front of
the user.
REFERENCES
1. Smart Vision using Machine Learning for Blind (2020), International Journal of
Advanced Science and Technology, Vol. 29, No. 5.
2. Cang Ye and Xiangfei Qian (2018), '3-D Object Recognition of a Robotic Navigation
Aid for the Visually Impaired', IEEE Transactions on Neural Systems and
Rehabilitation Engineering.
3. Endo, Y., Sato, K., Yamashita, A. and Matsubayashi, K. (2017), 'Indoor Positioning
and Obstacle Detection for Visually Impaired Navigation System Based on LSD-SLAM'.
4. Jianhe Yuan, Wenming Cao, Zhihai He, Zhi Zhang and Zhiquan He (2018), 'Fast Deep
Neural Networks With Knowledge Guided Training and Predicted Regions of Interests
for Real-Time Video Object Detection', IEEE Access, Volume 6.
WEB REFERENCES
5. https://www.expertsystem.com/machine-learning-definition/
6. https://en.wikipedia.org/wiki/TensorFlow
7. https://en.wikipedia.org/wiki/Android_software_development
8. https://github.com/tensorflow/models/tree/master/research/object_detection
9. https://www.tensorflow.org/