Thesis (Keshri, Ankit, Harsh)


Project Id: CSE-A-PROJECT-2023-24-05

Project Report
On
DESKTOP ASSISTANT

Submitted in Partial Fulfillment of the Requirement

For the Degree of

Bachelor of Technology

In

Computer Science and Engineering

By
Ankit Kumar Singh (2002160100020)
Harsh Kumar (2002160100046)
Keshri Nandan (2002160100060)
Under the Supervision
of
Prof. Suman Jha, Department of CSE

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


IIMT COLLEGE OF ENGINEERING, GREATER NOIDA

AFFILIATED TO
Dr A.P.J. ABDUL KALAM TECHNICAL UNIVERSITY,
UTTAR PRADESH, LUCKNOW
JUNE-2024
Page 1 of 59
IIMT COLLEGE OF ENGINEERING, GREATER NOIDA
[Department of Computer Science & Engineering]

TO WHOM IT MAY CONCERN

I hereby certify that Ankit Kumar Singh (2002160100020), Harsh Kumar (2002160100046),
and Keshri Nandan (2002160100060) are students of IIMT COLLEGE OF ENGINEERING,
GREATER NOIDA (UP), affiliated to Dr. A. P. J. Abdul Kalam Technical University,
Lucknow (UP), and have undergone the Major Project from October-2023 to June-2024
(Dissertation – I & II) at our organization to fulfill the requirements for the award of the
degree of Bachelor of Technology in Computer Science & Engineering. They worked on the
DESKTOP ASSISTANT project during this period under the supervision of Prof. Suman Jha.
During their tenure with us we found them sincere and hard working. We wish them great
success in the future.

(Prof. Badal Bhushan)
Project Supervisor, Department of CSE, IIMT College of Engineering, Gr. Noida

(Dr. Ajay Gupta)
HOD, Department of CSE, IIMT College of Engineering, Gr. Noida

DECLARATION BY STUDENT

I hereby declare that the work being presented in this report entitled “Desktop Assistant”,
an Exploring and Advertising Website with Special Reference to IIMT College of
Engineering & Technology, Greater Noida, is an authentic record of my own work carried
out under the supervision of Prof. Suman Jha. The matter embodied in this report has not
been submitted by me for the award of any other degree.

Dated:

Signature of student Signature of student Signature of student


Ankit Kumar Singh Harsh Kumar Keshri Nandan
Roll no. 2002160100020 Roll no. 2002160100046 Roll no. 2002160100060
Department of CSE Department of CSE Department of CSE

This is to certify that the above statement made by the candidate is correct to the best of my
knowledge.

(Prof. Suman Jha)


Project Supervisor
Department of Computer Science & Engineering

(Prof. Badal Bhushan) (Dr. Ajay Gupta)


Project Coordinator HOD, Dept of CSE
Department of CSE

ACKNOWLEDGEMENT

I am highly grateful to our Project Supervisor Prof. Suman Jha, Assistant Professor, CSE,
IIMT COLLEGE OF ENGINEERING, Greater Noida (UP), for providing this opportunity to
carry out the Major Project on “Desktop Assistant”, an Exploring and Advertising Website
with Special Reference to IIMT COLLEGE OF ENGINEERING. I would like to express
my gratitude to other faculty members of the CSE department for providing academic inputs,
guidance & encouragement throughout this period.

The author would like to express a deep sense of gratitude and thanks to Dr. Ajay Gupta,
HOD, Department of CSE, IIMT COLLEGE OF ENGINEERING, GREATER NOIDA
(UP), without whose permission, wise counsel and able guidance, it would not have been
possible to carry out my project in this manner.

The help rendered by Prof. Badal Bhushan, Project Co-ordinator, Department of CSE,
IIMT COLLEGE OF ENGINEERING, GREATER NOIDA (UP), for experimentation during
the dissertation is greatly acknowledged. Finally, I express my indebtedness to all who have
directly or indirectly contributed to the successful completion of my major project.

Signature of student Signature of student Signature of student


Ankit Kumar Singh Harsh Kumar Keshri Nandan
Roll no. 2002160100020 Roll no. 2002160100046 Roll no. 2002160100060
Department of CSE Department of CSE Department of CSE

ABSTRACT

The project aims to develop a personal virtual assistant for Windows-based systems. Jarvis
draws its inspiration from virtual assistants like Cortana for Windows and Siri for iOS. It
has been designed to provide a user-friendly interface for carrying out a variety of tasks
through well-defined commands. Users can interact with the assistant either through voice
commands or keyboard input. As a personal assistant, Jarvis assists the end user with
day-to-day activities such as general conversation, searching queries on Google or Yahoo,
searching for videos, playing songs, reporting live weather conditions, looking up word
meanings, searching for medicine details, and reminding the user about scheduled events
and tasks. The virtual assistant takes voice input through the microphone, converts it into
a form the computer can process, and returns the answers the user asked for. The assistant
connects to the World Wide Web to retrieve the results the user has requested. It accepts
voice input, responds by voice, and displays the response text on the screen. The main aim
of our virtual assistant is to help people work smarter by giving instant, computed results.

Keywords: Natural language processing, Natural language understanding, Natural language
generation

List of Tables
SL NO NAME OF THE TABLE TABLE NO.
1 LITERATURE SURVEY T-1
2 LITERATURE REVIEW T-2

List of Figures
SL NO NAME OF THE FIGURE FIGURE NO.
1 VIRTUAL ASSISTANT 1.1
2 SYSTEM ARCHITECTURE 5.1
3 USE CASE DIAGRAM 5.2
4 CLASS DIAGRAM 5.3
5 ACTIVITY DIAGRAM 5.4
6 SEQUENCE DIAGRAM FOR QUERY- RESPONSE 5.5(a)
7 SEQUENCE DIAGRAM FOR TASK EXECUTION 5.5(b)

List of Abbreviations
SL NO FULL FORM ABBREVIATION
1 UNIFIED MODELING LANGUAGE UML
2 USER INTERFACE UI
3 NATURAL LANGUAGE PROCESSING NLP
4 APPLICATION PROGRAMMING INTERFACE API
5 OPERATING SYSTEM OS
6 GRAPHICAL USER INTERFACE GUI
7 ARTIFICIAL INTELLIGENCE AI
8 INTERNET OF THINGS IOT

TABLE OF CONTENTS
Annexation
Certificate I
Declaration II
Acknowledgement III
Abstract IV
List of Tables V
List of Figures VI
Page No.
1 Introduction
1.1 Introduction 11

1.2 Objective 12

1.3 Motivation 13

1.4 Applicability 14

2 Literature Survey

2.1 Literature Survey 15

2.2 Literature review 18


2.3 Problem formulation 20
2.4 Methodology 21
2.5 Planning of work 22

2.6 Facilities required for proposed work 24
3. Project Analysis
3.1 Comparison efficiency with Base Paper 25
3.2 Explanation of the performance of result 27
3.3 Solution as per Proposed work 28
4 Preliminaries
4.1 Existing System 30
4.2 Proposed System 31
5. System Design
5.1 Architecture Diagram 33
5.2 UML Diagram 34
6. System Implementation
6.1 Modules 40
6.2 Module Description 40
7. System Specification

7.1 Software Requirements 42


7.2 Hardware Requirements 43
8. System Study
8.1 Feasibility Study 44
9. System Testing
9.1 TESTING 46
9.2 TYPES OF TESTING UNIT TESTING 46
9.3 TEST RESULTS 48
10 Conclusion & Recommendation
10.1 Conclusion 49

10.2 Limitations 50
10.3 Future Scope 50
Appendix - I 52
Appendix - II 55
Bibliography 58
References 59
Research Paper 61

Chapter 1

1.1 INTRODUCTION
1.2 OBJECTIVE
1.3 MOTIVATION
1.4 APPLICABILITY

1.1 INTRODUCTION
In today's digital age, voice assistants have emerged as a revolutionary technology that
simplifies human-computer interaction. Voice assistants are intelligent software programs
designed to understand and respond to human voice commands. They provide a convenient
and hands-free way for users to interact with their devices, access information, perform tasks,
and control various applications.
This paper aims to present the implementation of a desktop voice assistant, which offers a
wide range of functionalities and enhances the user's productivity and convenience. Unlike
traditional desktop applications that rely solely on graphical user interfaces (GUI), a voice
assistant enables users to interact with their computer systems through voice commands,
eliminating the need for physical input devices such as keyboards or mice.
The implementation of a desktop voice assistant involves various components, including
speech recognition, natural language processing (NLP), and machine learning algorithms.
The speech recognition component converts spoken language into text, enabling the system
to understand and interpret user commands accurately. The NLP component analyzes and
comprehends the user's intent, extracting relevant information from the input text. Machine
learning algorithms play a crucial role in training the system to improve its accuracy and
understand user preferences over time.
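
The flow described above (recognized text, then intent extraction, then task execution) can be illustrated with a minimal sketch. It assumes the speech has already been converted to text, and it stands in a few regular expressions for the NLP component. The intent names, the patterns, and the functions parse_intent and respond are hypothetical and are not taken from the project's code.

```python
import re
from datetime import datetime
from typing import Dict, Optional, Tuple

# Hypothetical intent patterns (an assumption for illustration only).
# A fuller assistant would likely replace these with an NLP library.
INTENT_PATTERNS: Dict[str, str] = {
    "open_app": r"^(?:open|launch|start)\s+(?P<arg>.+)$",
    "web_search": r"^(?:search for|search|google)\s+(?P<arg>.+)$",
    "play_music": r"^play\s+(?P<arg>.+)$",
    "tell_time": r"^what(?:'s| is) the time$",
}


def parse_intent(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Map transcribed text to (intent, argument); (None, None) if unknown."""
    cleaned = text.strip().lower().rstrip("?.!")
    for intent, pattern in INTENT_PATTERNS.items():
        match = re.match(pattern, cleaned)
        if match:
            # Patterns without an "arg" group yield None for the argument.
            return intent, match.groupdict().get("arg")
    return None, None


def respond(text: str) -> str:
    """Return the reply the assistant would speak or display.

    The handlers only describe the action; actually launching programs or
    opening a browser is left out of this sketch.
    """
    intent, arg = parse_intent(text)
    if intent == "open_app":
        return f"Opening {arg}"
    if intent == "web_search":
        return f"Searching the web for {arg}"
    if intent == "play_music":
        return f"Playing {arg}"
    if intent == "tell_time":
        return "The time is " + datetime.now().strftime("%H:%M")
    return "Sorry, I did not understand that."
```

Keeping intent parsing separate from task execution means the same handlers can serve both voice and keyboard input.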
The primary goal of this project is to create a robust and efficient desktop voice assistant
that provides seamless integration with the user's computer system. The voice assistant
should be capable of executing a wide range of tasks, such as retrieving information from
the web, scheduling appointments, sending emails, playing music, setting reminders, and
performing system operations like opening applications and managing files.
The implementation of a desktop voice assistant presents several challenges, including
ensuring accurate speech recognition, handling ambiguous user commands, and maintaining
privacy and security of user data. These challenges require careful consideration and the
adoption of suitable algorithms and techniques to achieve a reliable and user-friendly voice
assistant system.
By implementing a desktop voice assistant, this project aims to offer users a more intuitive
and efficient way to interact with their computers, ultimately enhancing their productivity
and user experience. The successful implementation of a robust and versatile desktop voice
assistant has the potential to revolutionize the way we interact with our digital devices,
making technology more accessible and user-centric.

In the dynamic landscape of technology, the demand for intelligent and intuitive solutions
to enhance productivity and user experience is ever-growing. In this context, our final year
project aims to develop an advanced Desktop Assistant, a cutting-edge application designed
to seamlessly integrate with users' daily workflows, providing assistance, automation, and
personalized experiences.

The desktop environment remains a central hub for various tasks, ranging from work-related
activities to personal organization. However, the sheer volume of information and the
complexity of modern applications often lead to challenges in managing and optimizing
these tasks efficiently. Our Intelligent Desktop Assistant seeks to address these challenges
by employing state-of-the-art technologies, including natural language processing, machine
learning, and user behavior analysis.

1.2 OBJECTIVE

The development of technology allows more advanced solutions to be introduced into everyday
life. This makes work less exhausting for employees and also increases work safety.
As technology develops day by day, people are becoming more dependent on it, and
one of the most widely used platforms is the computer. We all want to make the use of
computers more comfortable. The traditional way to give a command to a computer is
through the keyboard, but a more convenient way is to give the command by voice.
Voice input benefits not only typical users but also visually impaired users who cannot
give input using a keyboard. For this purpose, there is a need for a virtual assistant that
can not only take commands by voice but also execute the desired instructions and give
output either as voice or by other means.

A virtual assistant is software that can perform tasks and provide different services to
an individual according to the individual’s dictated commands. This is done through a
synchronous process of recognizing speech patterns and then responding via synthetic
speech. Through these assistants a user can automate tasks including, but not limited to,
mailing, task management, and media playback. The assistant understands natural-language
voice commands and completes the tasks for the user. It is typically a cloud-based program
that requires internet-connected devices and/or applications to work. The technologies that
power virtual assistants are machine learning, natural language processing, and speech
recognition platforms. The assistant uses sophisticated algorithms to learn from data input
and become better at predicting the end user's needs.
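
The recognize-then-respond cycle described above can be sketched as follows. The report does not name the libraries it uses, so the SpeechRecognition and pyttsx3 packages are assumptions here, as are the wake word "jarvis" and the function names. The third-party imports sit inside the functions so that the text-handling helper can be used without a microphone or speakers.

```python
from typing import Optional


def normalize_command(text: str, wake_word: str = "jarvis") -> str:
    """Lower-case the transcript and drop a leading wake word, if present."""
    words = text.lower().strip().split()
    if words and words[0].strip(",.!?") == wake_word:
        words = words[1:]
    return " ".join(words)


def listen_once(timeout: float = 5.0) -> Optional[str]:
    """Capture one utterance from the default microphone and transcribe it.

    Assumes the SpeechRecognition package (with PyAudio) is installed.
    Returns None when nothing intelligible was heard or the service failed.
    """
    import speech_recognition as sr  # assumed dependency, not named in the report

    recognizer = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source, duration=0.5)
            audio = recognizer.listen(source, timeout=timeout)
        # Sends the audio to Google's web speech service, so it needs internet.
        return recognizer.recognize_google(audio)
    except (sr.WaitTimeoutError, sr.UnknownValueError, sr.RequestError):
        return None


def speak(text: str) -> None:
    """Read the reply aloud using the offline pyttsx3 engine (assumed dependency)."""
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```

A caller would typically chain these as `heard = listen_once()`, then `normalize_command(heard)` when `heard` is not None, and finally `speak(...)` with the reply.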

1.3 MOTIVATION
The main purpose of this project is to build a program that can serve humans like a
personal assistant. This is an interesting concept, and many people around the globe
are working on it. Today, time and security are the two things to which people are most
sensitive: no one has time to waste, and nobody wants their security breached. This
project is mainly for those kinds of people.

This system is designed to be used efficiently on desktops. Virtual assistant software
improves user productivity by managing the user's routine tasks and by providing
information from online sources. This project was started on the premise that there is a
sufficient amount of openly available data and information on the web that can be used to
build a virtual assistant capable of making intelligent decisions for routine user
activities.

1.4 APPLICABILITY
The mass adoption of artificial intelligence in users’ everyday lives is also fuelling the
shift towards voice. The growing number of IoT devices, such as smart thermostats and
speakers, is giving voice assistants more utility in a connected user’s life. Smart speakers are the
number one way we are seeing voice being used.
Many industry experts even predict that nearly every application will integrate voice
technology in some way in the next 5 years. The use of virtual assistants can also enhance
the system of IoT (Internet of Things). Twenty years from now, Microsoft and its
competitors will be offering personal digital assistants that will offer the services of a full-
time employee usually reserved for the rich and famous.

FIGURE 1.1 VIRTUAL ASSISTANT

Chapter 2

2.1 Literature Survey


2.2 Literature Review
2.3 Problem formulation
2.4 Methodology
2.5 Planning of Work
2.6 Facilities required for proposed work

2.1 Literature Survey:

SL No. | Paper Title | Authors | Year | Technology
1 | Resources and components for Gujarati NLP systems | Nikita Desai, Nikhik Dabhi | 2022 | Language Processing Libraries, Sentiment Analysis Tools
2 | Arabic sentiment analysis using BERT model | Hasna Chouikhi, Hamza Chniter, Fethi Jarray | 2021 | Transformer Architecture, Attention Mechanism, Masked Language Model (MLM) Pre-training
3 | The Indian repository of resources for language technology | Narayan Choudhary | 2021 | Tokenization, Part-of-Speech Tagging, Stemming and Lemmatization
4 | Artificial intelligence in public health | Oliver Baclic, Matthew Tunic, Kelsy Young | 2020 | Early Detection and Diagnosis, Medical Imaging and Diagnostics, Drug Discovery and Development
5 | Improving the reliability of deep neural networks in NLP: A Review | Basemah Alshemali, Jugal Kalita | 2020 | Data Augmentation, Adversarial Training, Ensemble Methods
6 | LEGAL-BERT: the muppets straight out of law school | Ben Lutkevich | 2020 | Legal Document Analysis, Legal Question Answering, Named Entity Recognition (NER) in Legal Texts
7 | Transformer-xl: attentive language models beyond a fixed-length context | Zihang Dai, Zhilin Yang, Yiming Yang | 2019 | Segment-Level Recurrence Mechanism, Relative Positional Embeddings, Causal Self-Attention Masking
8 | Bert: Pre-training of deep bidirectional transformers for language understanding | Jacob Devlin, Ming Chang, Kenton Lee | 2018 | Bidirectional Context Understanding, Transformer Architecture, Masked Language Model (MLM) Pre-training, Large-Scale Pre-training, Fine-Tuning for Downstream Tasks
9 | An analysis of neural language modeling at multiple scales | Stephen Merity, Nitish Shirish Keskar, Richard Socher | 2018 | Syntax-Aware Models, Semantic Role Labeling (SRL), Graph-Based Representations
10 | Deep learning applied to NLP | Marc Moreno Lopez, Jugal Kalita | 2017 | Long Short-Term Memory (LSTM) Networks, Gated Recurrent Units (GRUs), Convolutional Neural Networks (CNNs)
11 | Neural network methods for natural language processing | Yoav Goldberg | 2017 | Feedforward Neural Networks (FNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM)
12 | Neural machine translation by jointly learning to align and translate | Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio | 2015 | Attention Mechanism, Alignment Model, Soft Attention, End-to-End Learning
13 | Searching better architectures for neural machine translation | Diksha Khurana, Aditya Koli, Kiran Khatter, Sukhdev Singh | 2015 | Sequence-to-Sequence (Seq2Seq) with Attention, T5 (Text-to-Text Transfer Transformer), MARIAN
14 | Event discovery in social media feeds | Edward Benson, Aria Haghighi, Regina Barzilay | 2011 | Data Collection and Pre-processing, Text Mining and NLP, Temporal Analysis, Real-time Monitoring
15 | A unified architecture for natural language processing | Ronan Collobert, Jason Weston | 2008 | Transformer Architecture, Modular Components, Bidirectional Context

TABLE 1 LITERATURE SURVEY

2.2 Literature Review:

SL No. | Paper Title | Year | Advantages | Disadvantages
1 | Resources and components for Gujarati NLP systems | 2022 | Cultural and Linguistic Representation, Improved NLP System Performance, Facilitating Research and Development, Language Preservation | Limited Availability of Resources, Quality and Size of Datasets, Lack of Standardization, Computational Challenges
2 | Arabic sentiment analysis using BERT model | 2021 | High Accuracy, Contextual Understanding, Transfer Learning, Handles Polysemy | Lack of Interpretability, Large Model Size, Domain-Specific Adaptation
3 | The Indian repository of resources for language technology | 2021 | Centralized Access, Facilitates Research, Resource Sharing, Standardization | Quality and Bias, Limited Coverage, Outdated Information, Copyright and Licensing Issues
4 | Artificial intelligence in public health | 2020 | Early Disease Detection, Efficient Data Analysis, Improved Surveillance, Personalized Medicine | Data Privacy and Security, Bias in Data and Algorithms, Interpretability and Transparency, Integration with Existing Systems
5 | Improving the reliability of deep neural networks in NLP: A review | 2020 | Reduced Uncertainty, Robustness to Noisy Data, Enhanced Model Performance, Increased Trustworthiness | Computational Resources, Overfitting Risks, Increased Complexity, Data Dependency
6 | LEGAL-BERT: the muppets straight out of law school | 2020 | Contextual Understanding, Improved Performance, Domain-Specific Features, Fine-Tuning Capabilities | Computational Resources, Interpretability, Data Requirements, Legal Complexity
7 | Transformer-xl: attentive language models beyond a fixed-length context | 2019 | Long-Term Dependency Handling, Reduced Memory Requirements, Improved Performance, Increased Context Length | Complexity, Resource Intensive, Training Time, Potential Overhead
8 | Bert: pre-training of deep bidirectional transformers for language understanding | 2018 | Contextualized Representations, Versatility, Transfer Learning, Large-Scale Pre-training | Computational Resources, Memory Requirements, Tokenization Limitations, Lack of Interpretability, Domain Specificity
9 | An analysis of neural language modeling at multiple scales | 2018 | Capturing Long-Term Dependencies, Hierarchical Representations, Improved Performance, Transfer Learning | Data Requirements, Computational Intensity, Interpretability, Overfitting
10 | Deep learning applied to NLP | 2017 | Representation Learning, End-to-End Models, State-of-the-Art Performance, Handling Complex Structures | Data Requirements, Fine-Tuning Challenges, Interpretability, Computational Resources
11 | Neural network methods for natural language processing | 2017 | Learning Complex Patterns, Feature Learning, Adaptability, End-to-End Learning | Data Dependency, Computational Intensity, Overfitting, Interpretability
12 | Neural machine translation by jointly learning to align and translate | 2015 | Attention Mechanism, Improved Handling of Long Sentences, Better Translation Quality, End-to-End Training | Computational Complexity, Data Requirements, Lack of Interpretability, Vulnerability to Noisy Training Data
13 | Searching better architectures for neural machine translation | 2015 | End-to-End Learning, Parameter Sharing, Handling Long Sequences, Context Awareness | Computational Resources, Memory Requirements, Tokenization Limitations, Lack of Interpretability, Domain Specificity
14 | Event discovery in social media feeds | 2011 | Real-Time Information, User Engagement, Wide Coverage, Language Variation | Noise and Irrelevance, Ambiguity and Contextual Understanding, Bias and Subjectivity, Privacy Concerns
15 | A unified architecture for natural language processing | 2008 | Unified Architecture, End-to-End Learning, Shared Representations, Efficiency | Complexity, Lack of Task-Specific Optimization, Data Requirements, Interference between Tasks

TABLE 2 LITERATURE REVIEW

2.3 Problem formulation

• Context Understanding:
Problem: Existing desktop assistants struggle to maintain context over extended
interactions, leading to misunderstandings of user commands and queries.
Objective: Develop a context-aware mechanism to improve the desktop assistant's ability
to understand and retain context during user interactions.

• Personalisation:
Problem: Desktop assistants often lack the ability to adapt to individual user preferences,
resulting in a generic user experience.
Objective: Implement machine learning algorithms to analyze user behavior and
preferences, enabling personalized responses and suggestions over time.

• Integration with External Services:
Problem: Some desktop assistants face challenges in seamlessly integrating with external
applications and services, limiting their overall utility.
Objective: Enhance API integration to broaden the range of external services the assistant
can interact with, improving its capabilities.

• Multi-modal Interaction:
Problem: Certain desktop assistants primarily rely on text-based interactions, neglecting
the benefits of incorporating multi-modal elements like speech and images.
Objective: Extend desktop assistant capabilities to support multi-modal inputs, providing
users with more versatile and natural interaction options.

• Security and Privacy Concerns:
Problem: Security and privacy issues may arise as desktop assistants handle sensitive
information, impacting user trust and engagement.
Objective: Implement robust security measures, including encryption and user-controlled
privacy settings, to address concerns and enhance user confidence.

• Usability and User Experience:
Problem: Some desktop assistants may have suboptimal user interfaces, response times,
or overall usability.
Objective: Enhance the desktop assistant's user interface, optimize response times, and
ensure intuitive interactions to improve overall usability.

• Adaptive Learning:
Problem: Desktop assistants may lack mechanisms for adaptive learning from user
feedback, hindering continuous improvement.
Objective: Develop adaptive learning mechanisms to enable the desktop assistant to learn
from user interactions and improve its performance over time.

• Compatibility and Interoperability:
Problem: Desktop assistants may lack compatibility with various operating systems,
applications, and devices.
Objective: Ensure compatibility and interoperability across diverse platforms, maximizing
accessibility for users.

• Error Handling and User Guidance:
Problem: Ineffective error handling may lead to user frustration, and there may be a lack
of clear guidance for users.
Objective: Implement robust error-handling mechanisms and provide clear user guidance
to enhance the overall user experience.

• User Trust and Transparency:
Problem: Lack of transparency in the operation of desktop assistants may impact user
trust.
Objective: Establish transparency in the operation of the desktop assistant and implement
features that build user trust regarding the handling and processing of their data.
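
As an illustration of the context-understanding objective listed above, one simple way to retain context is to keep a short window of recent turns together with the last entity mentioned, and to substitute that entity into follow-up commands such as "close it". This is a minimal sketch under that assumption; the class ConversationContext and its methods are hypothetical and do not describe the project's actual mechanism.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple


@dataclass
class ConversationContext:
    """Keeps the last few turns and the most recently mentioned entity."""

    max_turns: int = 5
    turns: Deque[Tuple[str, str]] = field(default_factory=deque)
    last_entity: Optional[str] = None

    def remember(self, user_text: str, reply: str,
                 entity: Optional[str] = None) -> None:
        """Store one exchange and forget the oldest once the window is full."""
        self.turns.append((user_text, reply))
        while len(self.turns) > self.max_turns:
            self.turns.popleft()
        if entity:
            self.last_entity = entity

    def resolve(self, user_text: str) -> str:
        """Replace 'it' or 'that' with the last remembered entity, if any."""
        if self.last_entity is None:
            return user_text
        words = [
            self.last_entity if word.lower() in {"it", "that"} else word
            for word in user_text.split()
        ]
        return " ".join(words)
```

For example, after `ctx.remember("open notepad", "Opening notepad", entity="notepad")`, a follow-up command can be rewritten with `ctx.resolve("close it")` before it is passed to intent parsing. Word-level substitution of this kind is deliberately crude and would mis-handle sentences where "that" is not a reference.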

2.4 Methodology
Project Phases:

a. Research and Literature Review:

• Conduct an in-depth literature review on inclusive design, accessibility, and
prompt engineering.

b. Requirement Analysis:

• Define specific requirements and objectives for the project based on literature
review findings.

c. System Design and Architecture:

• Develop the system design, including the architectural layout of the project.

2.5 Planning of Work

2.5.1. Development:

a. Front-End Development:

• Implement user interfaces using Tkinter.

b. Integration of AI and Front-End:

• Integrate AI-generated content into the front-end interfaces.


• Test the compatibility and effectiveness of the integrated system.

2.5.2. Evaluation and Testing:

a. Comprehensive Study and Comparison:

• Analyze and document the findings for later evaluation.


• Conduct a comprehensive study comparing accessibility features of popular apps with
those generated by our system.

b. User Testing:

• Engage users in testing the system.


• Collect feedback on usability, accessibility, and overall user experience.

2.5.3. Iterative Improvement:

a. Feedback Integration:

• Integrate user feedback into the system to address identified issues.


• Refine prompt engineering strategies based on user interactions.

b. Optimization and Scaling:

• Optimize the system for performance and scalability.


• Ensure that the system can handle varying loads and user interactions.

2.5.4. Documentation and Reporting:

a. Finalizing Reports:

• Compile and finalize documentation, including research findings, development
processes, and user testing results.
Conclusion: This structured plan ensures a systematic approach to the project, with defined
phases for research, development, evaluation, and iterative improvement. The timeline provides
flexibility for adjustments based on ongoing findings and feedback.

2.6 Facilities required for proposed work

2.6.1 Hardware:
a. Testing Devices: A range of devices for testing the developed interfaces (e.g.,
desktops, laptops, tablets, smartphones).

2.6.2 Software:
a. Development Environment: IDEs for React.js and Next.js development.
b. Testing and Accessibility Tools:
i. Testing frameworks for unit testing and integration testing.
ii. Accessibility testing tools to evaluate adherence to accessibility standards.

2.6.3 Data Resources:


a. User Interaction Data.
b. Capture and anonymize user interaction data for testing and feedback.
c. Ensure compliance with data privacy regulations.
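
The item on capturing and anonymizing user interaction data can be sketched as below. The record layout (user, command, timestamp), the secret key, and the function names are assumptions made for illustration. Keyed hashing of the user identifier is strictly pseudonymization, and the single e-mail pattern is only one example of the scrubbing that real compliance would require.

```python
import hashlib
import hmac
import re

# Simple e-mail pattern used to mask addresses inside logged commands.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def pseudonymize_user(user_id: str, secret_key: bytes) -> str:
    """Replace a user identifier with a keyed SHA-256 digest (first 16 hex chars)."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def scrub_text(text: str) -> str:
    """Mask e-mail addresses that appear in a logged command."""
    return EMAIL_RE.sub("[email]", text)


def anonymize_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of one interaction record that is safer to store for testing.

    Expects the hypothetical keys 'user', 'command' and 'timestamp'.
    """
    return {
        "user": pseudonymize_user(record["user"], secret_key),
        "command": scrub_text(record["command"]),
        "timestamp": record["timestamp"],
    }
```

Because the digest is keyed, the same user always maps to the same token, so feedback can still be grouped per user without storing the original identifier.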

2.6.4 Collaboration Tools:


a. Documentation and Project Management:
b. Document collaboration tools (e.g., Google Docs, Microsoft Office 365).

Conclusion: The proposed work requires a robust set of facilities encompassing hardware,
software, data resources, collaboration tools, and dedicated testing environments. Adequate
resources and tools will be essential for the successful execution and evaluation of the
project.

Chapter 3

Project Analysis

3.1 Comparison Efficiency with base paper

3.2 Explanation of the Performance of Result

3.3 Solution as per Proposed Work

3.1 Comparison Efficiency with base paper

The efficiency of the "Desktop Assistant" project can be evaluated by comparing its key
features, methodologies, and outcomes with those of the base paper.

Technological Approach:

Base Paper: The base paper utilizes technologies such as Python, Speech Recognition
APIs, and NLP libraries to develop a desktop assistant that aids in performing routine
tasks via voice commands.
Our Project: Similarly, our project utilizes Python, Speech Recognition APIs, and NLP
libraries. Additionally, it incorporates PyQt for GUI development and integrates third-
party services for tasks such as weather updates.

Scope and Objectives:

Base Paper: The base paper aims to investigate the effectiveness of desktop assistants in
automating routine tasks and improving user productivity.
Our Project: Our project aims to develop a desktop assistant capable of performing a
wide range of tasks, including providing weather updates, and executing system
commands, thereby enhancing user productivity and convenience.

Key Features:

Base Paper: The base paper focuses on voice-activated commands for basic tasks like
setting reminders, checking emails, and playing music.
Our Project: Our project focuses on an AI-driven desktop assistant that not only performs
basic tasks but also integrates advanced features like natural language understanding for
context-aware responses, a customizable user interface, and seamless integration with
various third-party services.

Conclusion:
In comparison to the base paper, the "Desktop Assistant" project offers a broader scope by
incorporating advanced features such as natural language understanding, a customizable
user interface, and extensive third-party integrations. While the base paper focuses on
improving existing functionalities, our project aims to develop a comprehensive and user-
centric desktop assistant with enhanced capabilities. Both projects contribute to the
advancement of productivity tools, albeit with different approaches and methodologies.

3.2 Explanation of the Performance of Result

3.2.1 Alignment with Objectives:

• The project successfully developed a desktop assistant that utilizes advanced AI
and NLP techniques to perform a variety of tasks, including providing weather
updates and executing system commands. This aligns with the objective of
leveraging AI-driven technologies to enhance user productivity and convenience.
• The project incorporates a user-centric approach by allowing users to interact with
the assistant using natural language, ensuring personalized and effective responses.
This aligns with the objective of implementing user-centric design principles.
• The desktop assistant developed in the project adheres to best practices for
software development, including user interface design, security measures, and
performance optimization. This ensures that the application is reliable and
accessible to all users, aligning with the objective of ensuring high-quality software
development standards.
• User testing and feedback mechanisms were employed throughout the
development process to validate the effectiveness of the desktop assistant in
meeting user needs. Insights gained from user feedback informed design decisions
and feature enhancements, ensuring continuous improvement and alignment with
the objective of evaluating the effectiveness of the developed solution.

3.2.2 Evaluation of Key Features:

3.2.2.1 AI-Driven Task Execution:
• Assess the performance and accuracy of AI-driven task execution, including
email management, calendar scheduling, weather updates, and system
command execution. Consider factors such as the relevance, quality, and
efficiency of task completion.
3.2.2.2 User Interface and Experience:
• Evaluate the user interface and experience of the desktop assistant. Assess its
effectiveness in providing an intuitive and seamless user experience, including
ease of use, customization options, and visual appeal.

3.2.2.3 Integration with Third-Party Services:
• Assess the integration of third-party services, such as email providers, calendar
services, and weather APIs. Evaluate the reliability, security, and functionality
of these integrations to ensure seamless operation of the desktop assistant.

3.2.3 Comparison with Expectations:
• Evaluate the overall performance of the desktop assistant against initial
expectations. Assess factors such as task execution speed, accuracy of natural
language understanding, user satisfaction, and system stability.

3.3 Solution as per Proposed Work

The envisioned solution, "Desktop Assistant," stands as a testament to the project's
commitment to leveraging advanced technologies to address productivity challenges
comprehensively. By integrating cutting-edge tools such as Python, Speech Recognition
APIs, NLP libraries, and PyQt, the project has materialized a versatile desktop application
capable of performing a wide range of tasks.

At its core, the solution revolves around empowering users with personalized and efficient
task management. Through the incorporation of AI-driven task execution techniques, users
can interact with the assistant using natural language, enabling the platform to perform
customized tasks ranging from email management and calendar scheduling to weather
updates and system commands. This user-centric approach not only enhances productivity
but also fosters a sense of convenience by catering to diverse user needs.

Furthermore, the solution emphasizes adherence to best practices in software development,
ensuring that the application is reliable, secure, and user-friendly. Features such as an
intuitive user interface, robust security measures, and continuous feedback mechanisms
uphold high standards of software quality, facilitating seamless interaction and trust for
users.

Additionally, the integration of third-party services enhances the functionality and
versatility of the desktop assistant, providing users with a comprehensive tool for
managing various aspects of their daily routines. This integration ensures that users can
access and manage information from multiple sources within a single platform.

In essence, the solution encapsulates the project's vision of creating a more productive and
convenient digital environment where users can efficiently manage their tasks. By
embracing innovative technologies, prioritizing user-centric design, and upholding high
standards of software development, the solution represents a significant step forward in the
quest for enhanced digital productivity.

Chapter 4

PRELIMINARIES

4.1 EXISTING SYSTEM


4.2 PROPOSED SYSTEM

4.1 EXISTING SYSTEM

The existing system describes an efficient approach to voice recognition. It overcomes
many of the drawbacks of earlier solutions and makes the virtual assistant more
efficient. It uses natural language processing to carry out the specified tasks, and it offers
functions such as managing the network connection and other activities through voice
commands alone, which reduces reliance on input devices such as the keyboard.
The system implements a virtual assistant for the desktop on top of existing APIs.
Voice commands are converted to text through the Google Speech API. The resulting
text is stored in a database for further processing, where it is matched against the
available commands. Once a matching command is found, its task is executed and the
result is returned as voice, as text, or through the user interface.
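
The lookup step described above can be sketched in a few lines of Python. This is a minimal illustration and not the existing system's code: the `COMMANDS` table, the two handler functions, and `match_command` are hypothetical names, and an in-memory dictionary stands in for the command database.

```python
# Hypothetical sketch: match recognized text against a table of known commands.
# A plain dictionary stands in for the command database mentioned in the text.

def tell_time():
    return "telling the time"

def open_browser():
    return "opening the browser"

# Key phrase -> handler. Both are illustrative, not taken from the project.
COMMANDS = {
    "what time": tell_time,
    "open browser": open_browser,
}

def match_command(text):
    """Return the handler whose key phrase occurs in the recognized text."""
    normalized = text.lower().strip()
    for phrase, handler in COMMANDS.items():
        if phrase in normalized:
            return handler
    return None  # no known command matched

def execute(text):
    """Run the matched handler, or report that nothing matched."""
    handler = match_command(text)
    if handler is None:
        return "command not recognized"
    return handler()
```

Substring matching is the simplest possible rule. It also shows how the first disadvantage listed below can arise: a sentence that contains two key phrases matches more than one command.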

4.1.1 DISADVANTAGES

• The proposed detection scheme can return two similar results, which may confuse
the user about which one is the desired output.

• Although the efficiency of the proposed module is high, each task takes longer to
complete, and the complexity of the algorithms makes them difficult to modify if
changes are needed in the future.

4.2 PROPOSED SYSTEM

1. QUERIES FROM THE WEB:


Making queries is an essential part of everyday life. We address this need by enabling
the voice assistant to search the web. The virtual assistant supports search engines such
as Google and displays the results by scraping the pages returned for the query.
2. ACCESSING NEWS:
Staying up to date is very important in the modern world, and news plays a crucial role
in keeping us updated. News keeps you informed and also helps in spreading
knowledge.

3. TO SEARCH SOMETHING ON WIKIPEDIA:


Wikipedia's purpose is to benefit readers by acting as a widely accessible and free
encyclopaedia; a comprehensive written compendium that contains information on all
branches of knowledge.

4. ACCESSING MUSIC PLAYLIST:


Music remains a main source of entertainment, and playing it is one of the most
frequently used functions of virtual assistants. You can play any song of your choice, or
you can play a random song with the help of Python's random module: every time you
give the command to play music, the virtual assistant plays a random song from the
song directory.
5. OPENING CODE EDITOR:
Virtual Assistant is capable of opening your code editor or IDE with a single voice
command.
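
The random-song behaviour described under item 4 could look roughly like the sketch below. It is an assumption-laden illustration: the function names, the `AUDIO_EXTENSIONS` tuple, and the idea of filtering by file extension are not taken from the project's source code.

```python
import os
import random

AUDIO_EXTENSIONS = (".mp3", ".wav", ".flac")  # assumed set of playable formats

def choose_song(filenames, rng=random):
    """Pick one audio file name at random, or return None if there is none."""
    songs = [name for name in filenames if name.lower().endswith(AUDIO_EXTENSIONS)]
    return rng.choice(songs) if songs else None

def pick_random_song(music_dir):
    """Return the full path of a random song in music_dir, or None."""
    name = choose_song(os.listdir(music_dir))
    return os.path.join(music_dir, name) if name else None
```

On Windows, the returned path could then be handed to `os.startfile`, which opens a file with its default application. Other platforms would need a different launcher.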

4.2.2 ADVANTAGES

• Platform independence
• Increased flexibility
• Saves time by automating repetitive tasks
• Accessibility options for users with mobility or visual impairments
• Reduces our dependence on screens
• Adds personality to our daily lives
• More human touch
• Coordination of IoT devices
• Accessible and inclusive
• Aids hands-free operation

Chapter 5

SYSTEM DESIGN

5.1 ARCHITECTURE DIAGRAM


5.2 UML DIAGRAM

5.1 ARCHITECTURE DIAGRAM


An architectural diagram is a diagram of a system that is used to abstract the overall
outline of the software system and the relationships, constraints, and boundaries between
components. It is an important tool as it provides an overall view of the physical
deployment of the software system and its evolution roadmap. An architecture description
is a formal description and representation of a system, organized in a way that supports
reasoning about the structures and behaviors of the system. After going through the above
process, we have successfully enabled the model to understand the features.

FIGURE 5.1 SYSTEM ARCHITECTURE DIAGRAM

5.2 UML DIAGRAM

The Unified Modeling Language (UML) is a general-purpose, developmental modeling
language in the field of software engineering that is intended to provide a standard way
to visualize the design of a system.
A UML diagram is a diagram based on UML whose purpose is to visually represent a
system along with its main actors, roles, actions, artifacts, or classes, in order to better
understand, alter, maintain, or document information about the system.
UML defines several models for representing systems:
1. The class model captures the static structure.
2. The state model expresses the dynamic behavior of objects.
3. The use case model describes the requirements of the user.
4. The interaction model represents the scenarios and message flows.
5. The implementation model shows the work units.

ADVANTAGES
• Widely used and flexible
• Reduces development time
• Provides a standard for software development
• Offers a large set of visual elements that are easy to construct and follow

5.2.1 USE CASE DIAGRAM

In UML, use-case diagrams model the behavior of a system and help to capture the
requirements of the system. Use-case diagrams describe the high-level functions and scope
of a system. These diagrams also identify the interactions between the system and its
actors. In this project there is only one actor, the user. The user issues a command to the
system. The system then interprets it and fetches an answer, and the response is sent
back to the user.

FIGURE 5.2 USE CASE DIAGRAM

5.2.2 CLASS DIAGRAM

A class diagram is a static diagram: it represents the static view of an application. Class
diagrams are used not only for visualizing, describing, and documenting different aspects
of a system but also for constructing executable code for the software application.
The User class has two attributes: the command that it sends as audio and the response
that it receives, which is also audio. It provides functions to listen to the user's command,
interpret it, and then reply or send back a response accordingly. The Question class holds
the command in string form, as interpreted by the Interpret class, and passes it to the
general, about, or search function depending on how it is identified. The Task class also
holds the interpreted command in string format.

FIGURE 5.3 CLASS DIAGRAM

5.2.3 ACTIVITY DIAGRAM

An activity diagram is a behavioral diagram: it depicts the behavior of a system. It
portrays the control flow from a start point to a finish point, showing the various
decision paths that exist while the activity is being executed.
Initially, the system is in idle mode. When it receives a wake-up call, it begins execution.
The received command is classified as either a question to be answered or a task to be
performed, and the corresponding action is taken. After the question has been answered
or the task has been performed, the system waits for another command. This loop
continues until the system receives a quit command.
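
The control flow in the activity diagram can be sketched as a small loop. The code below is an illustration under stated assumptions, not the project's implementation: the wake word `"assistant"`, the keyword-based `classify` rule, and the function names are all hypothetical, and a list of strings stands in for microphone input.

```python
# Hypothetical sketch of the activity loop: idle -> wake -> classify -> act -> wait.

WAKE_WORD = "assistant"  # assumed wake word
QUESTION_WORDS = ("what", "who", "when", "where", "why", "how")

def classify(command):
    """Label a command as a question or a task using a simple keyword rule."""
    words = command.lower().split()
    first_word = words[0] if words else ""
    return "question" if first_word in QUESTION_WORDS else "task"

def run_assistant(inputs):
    """Consume an iterable of utterances and return a log of the actions taken."""
    log = []
    awake = False
    for utterance in inputs:
        text = utterance.lower().strip()
        if not awake:
            # Idle mode: ignore everything until the wake word is heard.
            awake = text == WAKE_WORD
            continue
        if text == "quit":
            log.append("shutting down")
            break
        log.append(f"{classify(text)}: {text}")
    return log
```

For example, `run_assistant(["assistant", "What is Python", "quit"])` logs one question and then the shutdown. In a real assistant, each utterance would come from the speech recognizer instead of a list.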

FIGURE 5.4 ACTIVITY DIAGRAM

5.2.4 SEQUENCE DIAGRAM

A sequence diagram is a Unified Modeling Language (UML) diagram that
illustrates the sequence of messages between objects in an interaction. A
sequence diagram consists of a group of objects that are represented by
lifelines, and the messages that they exchange over time during the interaction.
The sequence diagram below shows how the answer to a question asked by the
user is fetched from the internet. The audio query is interpreted and sent to the
web scraper. The web scraper searches for and finds the answer, which is then
sent to the speaker, where it is spoken to the user.

FIGURE 5.5(a) SEQUENCE DIAGRAM FOR QUERY-RESPONSE

The user sends a command to the virtual assistant in audio form. The command is passed
to the interpreter, which identifies what the user has asked and directs it to the task
executor. If the task is missing some information, the virtual assistant asks the user for it.
The information received is passed back to the task, which is then carried out. After
execution, feedback is sent back to the user.
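
The "ask back for missing information" step can be sketched as follows. This is an illustrative example only: the `REQUIRED_FIELDS` table, the task names, and `execute_task` are hypothetical and are not taken from the project's code.

```python
# Hypothetical sketch: ask follow-up questions until a task has all required details.

REQUIRED_FIELDS = {
    # task name -> details the task needs (illustrative values)
    "send email": ["recipient", "message"],
    "set reminder": ["time"],
}

def execute_task(task, details, ask):
    """Fill in missing details by calling `ask`, then report the completed task.

    `ask` is any callable that takes a prompt string and returns the user's
    answer. In a voice assistant it would speak the prompt and listen.
    """
    if task not in REQUIRED_FIELDS:
        return f"unknown task: {task}"
    filled = dict(details)  # copy so the caller's dictionary is not modified
    for field in REQUIRED_FIELDS[task]:
        if field not in filled:
            filled[field] = ask(f"What is the {field}?")
    summary = ", ".join(f"{key}={filled[key]}" for key in REQUIRED_FIELDS[task])
    return f"done: {task} ({summary})"
```

Passing the prompt function as a parameter keeps the logic independent of the input channel: the built-in `input` works for a console test, and a speak-then-listen function would work for voice.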

FIGURE 5.6(b) SEQUENCE DIAGRAM FOR TASK EXECUTION

Chapter 6

SYSTEM IMPLEMENTATION

6.1 MODULES
6.2 MODULE DESCRIPTION

6.1 MODULES
• Pyttsx3
• Sapi5
• Speech recognition
• Pyaudio
• Wikipedia
• Webbrowser

6.2 MODULE DESCRIPTION

6.2.1 Pyttsx3 (Python Text to Speech)


Pyttsx3 is a Python library that converts text to speech. It is a cross-platform Python
wrapper for text-to-speech synthesis that supports common text-to-speech engines on
macOS, Windows, and Linux. It works with both Python 2.x and 3.x. Its main
advantage is that it works offline.
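
A minimal usage sketch follows. It assumes the pyttsx3 package is installed. The `speak` wrapper, its `engine` parameter (which lets a caller reuse an engine or substitute a stand-in object), and the rate of 170 are illustrative choices rather than the project's code.

```python
def speak(text, engine=None, rate=170):
    """Speak `text` aloud, creating a pyttsx3 engine if none is supplied."""
    if engine is None:
        import pyttsx3  # third-party package: pip install pyttsx3
        engine = pyttsx3.init()  # uses the platform's default driver
    engine.setProperty("rate", rate)  # speaking rate in words per minute
    engine.say(text)                  # queue the text
    engine.runAndWait()               # block until the queued speech is spoken
    return engine
```

Returning the engine allows it to be reused across calls, for example `engine = speak("Hello")` followed by `speak("How can I help?", engine=engine)`.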

6.2.2 Sapi5 (Speech Application Programming Interface)


The Speech Application Programming Interface (SAPI) is an API developed by
Microsoft to allow the use of speech recognition and speech synthesis within Windows
applications. To date, a number of versions of the API have been released, which have
shipped either as part of a Speech SDK or as part of the Windows OS itself.
Applications that use SAPI include Microsoft Office, Microsoft Agent, and Microsoft
Speech Server. Many versions (although not all) of the speech recognition and synthesis
engines are also freely redistributable. SAPI 5, released in 2000, was a completely new
interface. Since then, several sub-versions of this API have been released.
6.2.3 Speech recognition
Speech recognition is the process of converting spoken words to text. Python supports
many speech recognition engines and APIs, including Google Speech Engine, Google
Cloud Speech API, Microsoft Bing Voice Recognition, and IBM Speech to Text. Speech
recognition is an important feature in applications such as home automation and
artificial intelligence.
Recognizing speech requires audio input, and the SpeechRecognition library makes it
simple to retrieve this input. It is a library for performing speech recognition, with
support for several engines and APIs, both online and offline.
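
As an illustration, one utterance can be captured and transcribed as sketched below. The sketch assumes the SpeechRecognition and PyAudio packages are installed and a microphone is available. The function name `listen_once`, the five-second timeout, and the `"en-IN"` language code are assumptions made for this example.

```python
def listen_once(timeout=5, language="en-IN"):
    """Record one utterance from the default microphone and return it as text.

    Returns None if nothing was heard, the speech was unintelligible, or the
    recognition service could not be reached.
    """
    import speech_recognition as sr  # third-party: pip install SpeechRecognition PyAudio

    recognizer = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            # Sample background noise briefly so quiet speech is not cut off.
            recognizer.adjust_for_ambient_noise(source, duration=0.5)
            audio = recognizer.listen(source, timeout=timeout)
        # Sends the audio to Google's web speech API, so it needs internet access.
        return recognizer.recognize_google(audio, language=language).lower()
    except (sr.WaitTimeoutError, sr.UnknownValueError, sr.RequestError):
        return None
```

Returning `None` for every failure keeps the calling loop simple. A fuller implementation would probably distinguish "did not understand" from "no network" so the assistant can give a more useful spoken reply.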

6.2.4 Pyaudio
To access the microphone with SpeechRecognition, the PyAudio package must be
installed. PyAudio provides Python bindings for PortAudio, the cross-platform audio
I/O library. With PyAudio, you can easily use Python to play and record audio on a
variety of platforms.

6.2.5 Wikipedia
Wikipedia is a Python library that makes it easy to access and parse data from Wikipedia.
It retrieves article summaries, extracts data such as links and images from a page, and
more. The module gives developers code-level access to the entire Wikipedia reference.
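
A short sketch of fetching a summary with this library is shown below. It assumes the wikipedia package is installed. The wrapper name `wiki_summary`, its fallback messages, and the optional `wiki` parameter (which lets a stand-in object replace the real module) are illustrative.

```python
def wiki_summary(topic, sentences=2, wiki=None):
    """Return a short summary of `topic`, or a fallback message on failure."""
    if wiki is None:
        import wikipedia as wiki  # third-party package: pip install wikipedia
    try:
        return wiki.summary(topic, sentences=sentences)
    except wiki.exceptions.DisambiguationError as err:
        # The title matches several pages; offer the first few candidates.
        return f"'{topic}' is ambiguous; try one of: {', '.join(err.options[:3])}"
    except wiki.exceptions.PageError:
        return f"No Wikipedia page found for '{topic}'."
```

Handling the two exceptions matters for a voice assistant, because spoken queries are often short and ambiguous. Without the handlers, a query that matches several articles would raise an exception instead of producing a reply.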

6.2.6 Webbrowser
The webbrowser module provides a high-level interface to allow displaying Web-based
documents to users. Under most circumstances, simply calling the open() function from
this module will do the right thing.
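
For example, a spoken query could be turned into a web search as in the sketch below. The Google search URL format and the helper names `build_search_url` and `search_web` are assumptions for illustration. `webbrowser.open` and `urllib.parse.quote_plus` are standard-library functions.

```python
import webbrowser
from urllib.parse import quote_plus

def build_search_url(query):
    """Build a search URL for a query (URL format assumed for illustration)."""
    return "https://www.google.com/search?q=" + quote_plus(query.strip())

def search_web(query, opener=webbrowser.open):
    """Open the search results in the default browser and return the URL used."""
    url = build_search_url(query)
    opener(url)
    return url
```

`quote_plus` escapes spaces and special characters, so a query such as "C++ & Java" still produces a valid URL.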
Chapter 7

SYSTEM SPECIFICATIONS

7.1 HARDWARE REQUIREMENTS


7.2 SOFTWARE REQUIREMENTS

7.1 HARDWARE REQUIREMENTS

• Processor - Intel Pentium 4
• RAM - 512 MB
• Hard disk capacity - 80 GB
• Monitor type - 15-inch colour monitor
• CD drive type - 52x max
• Mouse
• Microphone
• Personal Computer / Laptop

7.2 SOFTWARE REQUIREMENTS

• Operating System - Windows
• Simulation Tools - Visual Studio Code
• Python - Version 3.9.6
• Packages -
1. Pyttsx3
2. Speech Recognition
3. Wikipedia
4. Pyaudio
5. Webbrowser

Chapter 8

SYSTEM STUDY

8.1 FEASIBILITY STUDY

8.1 FEASIBILITY STUDY

A feasibility study helps determine whether or not to proceed with the project. It is
essential to evaluate the costs and benefits of the proposed system. Three key
considerations involved in the feasibility analysis are:
• Economic feasibility
• Technical feasibility
• Social feasibility

8.1.1 ECONOMIC FEASIBILITY

Here, we compare the total cost and benefit of the proposed system with those of the
current system. For this project, the main cost is the cost of documentation. Users would
also have to pay for a microphone and speakers, but these are cheap and readily
available.

8.1.2 TECHNICAL FEASIBILITY


This involves identifying the technologies, both hardware and software, that the project
requires. To use the virtual assistant, the user must have a microphone to convey
commands and a speaker to hear what the system says. These are inexpensive nowadays
and most users already possess them.
In addition, the system needs an internet connection. This is rarely an issue today, as
almost every home or office has Wi-Fi.

8.1.3 SOCIAL FEASIBILITY

This aspect of the study checks the level of acceptance of the system by its users. It
includes the process of training users to use the system efficiently. Users must not feel
threatened by the system; instead, they must accept it as a necessity.
The level of acceptance depends on the methods that are employed to educate users
about the system and to make them familiar with it. Their confidence must be raised so
that they are also able to offer constructive criticism, which is welcomed, as they are the
final users of the system.

Chapter 9

SYSTEM TESTING

9.1 TESTING
9.2 TYPES OF TESTING
9.3 TEST RESULTS

9.1 TESTING
The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way to
check the functionality of components, sub-assemblies, assemblies, and/or a finished
product. It is the process of exercising software with the intent of ensuring that the
software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a specific
testing requirement.

9.2 TYPES OF TESTING


9.2.1 UNIT TESTING
Unit testing involves the design of test cases that validate that the internal program
logic is functioning properly and that program inputs produce valid outputs. It is the
testing of individual software units of the application, and it is done after the completion
of an individual unit, before integration. This is structural testing that relies on
knowledge of the unit's construction and is invasive. Unit tests perform basic tests at the
component level and test a specific business process, application, and/or system
configuration.
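
As an illustration of what such a test could look like in Python, the sketch below uses the standard `unittest` module. The `classify` function is a toy unit invented for this example. It is not the project's code, and the report does not state which test framework was used.

```python
import unittest

def classify(command):
    """Toy unit under test: label a command as a question or a task."""
    question_words = ("what", "who", "when", "where", "why", "how")
    words = command.lower().split()
    return "question" if words and words[0] in question_words else "task"

class ClassifyTests(unittest.TestCase):
    def test_question_is_detected(self):
        self.assertEqual(classify("What is the weather"), "question")

    def test_other_commands_are_tasks(self):
        self.assertEqual(classify("open notepad"), "task")

    def test_empty_input_is_a_task(self):
        self.assertEqual(classify(""), "task")
```

Saved in a file, these tests can be run with `python -m unittest`. Each test exercises one input class of a single unit, before that unit is integrated with speech input or output.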

9.2.2 INTEGRATION TESTING


Integration tests are designed to test integrated software components to determine if
they actually run as one program. Testing is event driven and is more concerned with the
basic outcome of screens or fields. Integration tests demonstrate that although the
components were individually satisfactory, as shown by successful unit testing, the
combination of components is correct and consistent. Integration testing is specifically
aimed at exposing the problems that arise from the combination of components.
9.2.3 FUNCTIONAL TESTING
Functional tests provide systematic demonstrations that the functions tested are available
as specified by the business and technical requirements, system documentation, and
user manuals. Functional testing is centred on the following items:
• Valid input: identified classes of valid input must be accepted.
• Invalid input: identified classes of invalid input must be rejected.
• Functions: identified functions must be exercised.
• Output: identified classes of application outputs must be exercised.

9.2.4 SYSTEM TESTING


System testing ensures that the entire integrated software system meets requirements.
It tests a configuration to ensure known and predictable results. System testing is based
on process descriptions and flows, emphasizing pre-driven process links and integration
points.

9.2.5 WHITE BOX TESTING


White box testing is testing in which the software tester has knowledge of the inner
workings, structure, and language of the software, or at least its purpose. It is used to
test areas that cannot be reached from a black box level.

9.2.6 BLACK BOX TESTING
Black box testing is testing the software without any knowledge of the inner
workings, structure, or language of the module being tested. Black box tests, like most
other kinds of tests, must be written from a definitive source document, such as a
specification or requirements document.

9.3 TEST RESULTS
All the test cases mentioned above passed successfully. No defects were encountered.

Chapter 10
CONCLUSION, LIMITATION AND FUTURE SCOPE

10.1 Conclusion
10.2 Limitation
10.3 Future Scope

10.1 Conclusion

The intelligent desktop assistant developed in this project represents a significant
advancement in human-computer interaction. By utilizing natural language processing,
machine learning, and user behavior analysis, the assistant addresses the complex needs of
modern users. Unlike traditional desktop applications that rely on graphical interfaces and
require significant user adaptation, this voice-controlled assistant provides an intuitive and
natural interaction method.

Key achievements include simplifying complex tasks through natural language
commands, thereby enhancing user convenience and productivity. The assistant enables
users to perform functions such as web searches, scheduling, emailing, and file
management through voice commands, reducing reliance on physical input devices.

Time efficiency is another critical aspect, with the assistant automating routine processes
and providing a seamless interface for more complex tasks, crucial in today's fast-paced
environment. The adaptive learning system, powered by machine learning algorithms,
ensures continuous improvement in user experience by personalizing responses based on
user interactions.

Security and privacy are prioritized with robust measures like end-to-end encryption and
user-controlled settings, ensuring user trust and confidence. The project also explores new
paradigms in user interface design, demonstrating the potential for desktop assistants to
transform digital interactions through personalized recommendations, intelligent task
prioritization, and integration with external services.

Overall, this project provides a holistic and enriching computing experience, addressing
the evolving needs of users and laying the groundwork for future innovations in intelligent
desktop assistants. It highlights the potential for these technologies to revolutionize
human-computer interaction.

10.2 Limitation

1. User Trust and Transparency: Ensuring transparency in the operations of the desktop
assistant is crucial to maintaining user trust, especially regarding data handling and
processing.
2. Integration with External Services: Challenges exist in seamlessly integrating the
assistant with a wide range of external applications and services, which can limit its utility.
3. Multi-modal Interaction: Extending capabilities to support multi-modal inputs, such as
speech and images, presents technical challenges that need to be addressed to provide a
more natural interaction.
4. Security and Privacy Concerns: Handling sensitive user information securely and
addressing privacy issues are ongoing challenges that impact user confidence and
engagement.
5. Usability and User Experience: Ensuring an optimal user interface, response times, and
overall usability are critical for user satisfaction, but can be difficult to consistently
achieve.
6. Adaptive Learning Mechanisms: Developing effective adaptive learning mechanisms to
continually improve the assistant based on user interactions is complex and resource-
intensive.

10.3 Future Scope

1. Enhanced Personalization: Implementing advanced machine learning algorithms to
further personalize user interactions and responses based on behavioral analysis and
preferences.

2. Broader API Integration: Expanding the range of external services and applications the
assistant can interact with through improved API integration, enhancing its overall
functionality.

3. Support for Multi-modal Inputs: Incorporating capabilities for multi-modal inputs, such
as speech and images, to provide a more versatile and natural interaction environment.

4. Improved Security Measures: Developing robust security measures, including
encryption and user-controlled privacy settings, to address user concerns and enhance
trust.

5. Optimization and Scaling: Continuously optimizing the system for performance and
scalability to handle varying loads and user interactions efficiently.

6. Adaptive Learning and Continuous Improvement: Further refining adaptive learning
mechanisms to enable the assistant to learn from user feedback and improve its
performance over time.

7. Cross-platform Compatibility: Ensuring compatibility and interoperability across
diverse operating systems, applications, and devices to maximize accessibility for users.

Appendix - I

Source Code

Appendix - II

Snapshots

International Journal of Scientific Research in Engineering and Management (IJSREM)
Volume: 08 Issue: 05 | May - 2024 SJIF Rating: 8.448 ISSN: 2582-3930

Desktop Assistant Based on NLP


Authors:
Ankit Kumar Singh , Harsh Kumar , Keshri Nandan

Guide Prof. : Mr.Badal Bhusan

Assistant Professor,

Department of Computer Science and Engineering ,


IIMT College of Engineering , Greater Noida , U.P

Abstract

Natural Language Processing (NLP) has emerged as a critical component of artificial intelligence, enabling
machines to comprehend and interact with human language. This research paper explores the current state of the art
in NLP, highlighting recent innovations, trends, and ongoing challenges. It delves into various applications of NLP,
discusses the datasets and models that drive advancements, and examines the evaluation metrics used to assess NLP
systems. Key innovations such as transformers, pre-trained language models, and transfer learning have
revolutionized the field, leading to significant improvements in performance across a variety of tasks. Additionally,
the paper addresses the growing emphasis on ethical AI and bias mitigation, as well as the integration of NLP with
other AI technologies to create multimodal systems. Applications of NLP in text classification, sentiment analysis,
machine translation, conversational agents, and information retrieval are thoroughly examined. The discussion
extends to the critical role of benchmark datasets and pre-trained models in driving progress. Furthermore, the paper
evaluates the effectiveness of various metrics used to measure the performance of NLP systems. Finally, the future
prospects and potential research directions are considered, highlighting the ongoing efforts to push the boundaries of
what NLP can achieve in an increasingly interconnected and data-driven world.

Keywords: Natural language processing · Natural language understanding · Natural language generation

Chapter 1
Introduction

In today's digital age, voice assistants have emerged as a revolutionary technology that simplifies human-computer
interaction. Voice assistants are intelligent software programs designed to understand and respond to human voice
commands. They provide a convenient and hands-free way for users to interact with their devices, access
information, perform tasks, and control various applications.

This paper aims to present the implementation of a desktop voice assistant, which offers a wide range of
functionalities and enhances the user's productivity and convenience. Unlike traditional desktop applications that
rely solely on graphical user interfaces (GUI), a voice assistant enables users to interact with their computer systems
through voice commands, eliminating the need for physical input devices such as keyboards or mice.

The implementation of a desktop voice assistant involves various components, including speech recognition, natural
language processing (NLP), and machine learning algorithms. The speech recognition component converts spoken
language into text, enabling the system to understand and interpret user commands accurately. The NLP component
analyzes and comprehends the user's intent, extracting relevant information from the input text. Machine learning
algorithms play a crucial role in training the system to improve its accuracy and understand user preferences over
time.

© 2024, IJSREM | www.ijsrem.com DOI: 10.55041/IJSREM34539 | Page 1



The primary goal of this project is to create a robust and efficient desktop voice assistant that provides seamless
integration with the user's computer system. The voice assistant should be capable of executing a wide range of
tasks, such as retrieving information from the web, scheduling appointments, sending emails, playing music, setting
reminders, and performing system operations like opening applications and managing files.

The implementation of a desktop voice assistant presents several challenges, including ensuring accurate speech
recognition, handling ambiguous user commands, and maintaining privacy and security of user data. These
challenges require careful consideration and the adoption of suitable algorithms and techniques to achieve a reliable
and user-friendly voice assistant system.

By implementing a desktop voice assistant, this project aims to offer users a more intuitive and efficient way to
interact with their computers, ultimately enhancing their productivity and user experience. The successful
implementation of a robust and versatile desktop voice assistant has the potential to revolutionize the way we
interact with our digital devices, making technology more accessible and user-centric.

In the dynamic landscape of technology, the demand for intelligent and intuitive solutions to enhance productivity
and user experience is ever-growing. In this context, our final year project aims to develop an advanced Desktop
Assistant, a cutting-edge application designed to seamlessly integrate with users' daily workflows, providing
assistance, automation, and personalized experiences.

The desktop environment remains a central hub for various tasks, ranging from work-related activities to personal
organization. However, the sheer volume of information and the complexity of modern applications often lead to
challenges in managing and optimizing these tasks efficiently. Our Intelligent Desktop Assistant seeks to address
these challenges by employing state-of-the-art technologies, including natural language processing, machine
learning, and user behavior analysis.

Chapter 2
Motivation

In the dynamic landscape of computing, the motivation behind undertaking the development of an intelligent
desktop assistant stems from the recognition of an increasingly complex digital environment that users navigate
daily. As technology evolves, so do the expectations of users regarding the efficiency, intuitiveness, and
personalisation of their computing experience. The conventional interfaces and tools often fall short in meeting
these demands, prompting the need for a sophisticated solution. The motivation for this project can be encapsulated
in several key aspects:

• Complexity of Tasks:

As computing tasks become more intricate and multifaceted, users find themselves grappling with the challenge of
managing and executing diverse operations on their desktops. The motivation lies in addressing this complexity by
providing a comprehensive solution that simplifies tasks and enhances overall productivity.

 Natural Interaction:

Traditional interfaces often necessitate a learning curve, requiring users to adapt to rigid command structures and
interfaces. The motivation behind this project is to create a desktop assistant that understands and responds to
natural language, fostering a more intuitive and human-like interaction between users and their computing
environment.

 Time Efficiency:

In the fast-paced digital era, time efficiency is paramount. The project is motivated by the desire to empower users
to accomplish tasks more quickly and effortlessly. By automating routine processes and providing a seamless
interface for complex operations, the intelligent desktop assistant aims to save users valuable time.

© 2024, IJSREM | www.ijsrem.com DOI: 10.55041/IJSREM34539 | Page 2


International Journal of Scientific Research in Engineering and Management (IJSREM)
Volume: 08 Issue: 05 | May - 2024 SJIF Rating: 8.448 ISSN: 2582-3930

• Adaptive and Learning Systems:

The assistant should not only respond to immediate user needs but also learn and evolve over time. Through machine
learning algorithms, the assistant can adapt to user preferences, thereby enhancing the user experience.

• Enhanced User Experience:

The ultimate motivation is to improve the overall user experience by providing a desktop assistant that goes beyond
basic functionalities. The project aims to integrate advanced features, such as personalized recommendations,
intelligent task prioritization, and integration with external services.

• Innovation in Human-Computer Interaction:

The development of an intelligent desktop assistant is an opportunity to contribute to the field of human-computer
interaction. The project is motivated by the aspiration to explore new paradigms in user interface design and to
create a more responsive and user-centric computing environment.

Conclusion

In essence, the motivation for this project lies in addressing the evolving needs of users in a technologically
advanced era, where the traditional boundaries between humans and computers are increasingly blurred. By
developing an intelligent desktop assistant, this project aims to empower users, streamline their interactions with
digital systems, and contribute to the ongoing evolution of user-centric computing environments.

Chapter 3
Literature Survey Related to Project

SL No. | Paper Title | Authors | Year | Technology
1 | Resources and components for Gujarati NLP systems | Nikita Desai, Nikhik Dabhi | 2022 | Language Processing Libraries, Sentiment Analysis Tools
2 | Arabic sentiment analysis using BERT model | Hasna Chouikhi, Hamza Chniter, Fethi Jarray | 2021 | Transformer Architecture, Attention Mechanism, Masked Language Model (MLM) Pre-training
3 | The Indian repository of resources for language technology | Narayan Choudhary | 2021 | Tokenization, Part-of-Speech Tagging, Stemming and Lemmatization
4 | Artificial intelligence in public health | Oliver Baclic, Matthew Tunic, Kelsy Young | 2020 | Early Detection and Diagnosis, Medical Imaging and Diagnostics, Drug Discovery and Development
5 | Improving the reliability of deep neural networks in NLP: A Review | Basemah Alshemali, Jugal Kalita | 2020 | Data Augmentation, Adversarial Training, Ensemble Methods
6 | LEGAL-BERT: the muppets straight out of law school | Ben Lutkevich | 2020 | Legal Document Analysis, Legal Question Answering, Named Entity Recognition (NER) in Legal Texts
7 | Transformer-XL: attentive language models beyond a fixed-length context | Zihang Dai, Zhilin Yang, Yiming Yang | 2019 | Segment-Level Recurrence Mechanism, Relative Positional Embeddings, Causal Self-Attention Masking
8 | BERT: Pre-training of deep bidirectional transformers for language understanding | Jacob Devlin, Ming Chang, Kenton Lee | 2018 | Bidirectional Context Understanding, Transformer Architecture, Masked Language Model (MLM) Pre-training, Large-Scale Pre-training, Fine-Tuning for Downstream Tasks
9 | An analysis of neural language modeling at multiple scales | Stephen Merity, Nitish Shirish Keskar, Richard Socher | 2018 | Syntax-Aware Models, Semantic Role Labeling (SRL), Graph-Based Representations
10 | Deep learning applied to NLP | Marc Moreno Lopez, Jugal Kalita | 2017 | Long Short-Term Memory (LSTM) Networks, Gated Recurrent Units (GRUs), Convolutional Neural Networks (CNNs)
11 | Neural network methods for natural language processing | Yoav Goldberg | 2017 | Feedforward Neural Networks (FNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM)
12 | Neural machine translation by jointly learning to align and translate | Dzmitry Bahdanau, KyungHyun Cho, Yoshua Bengio | 2015 | Attention Mechanism, Alignment Model, Soft Attention, End-to-End Learning
13 | Searching better architectures for neural machine translation | Diksha Khurana, Aditya Koli, Kiran Khatter, Sukhdev Singh | 2015 | Sequence-to-Sequence (Seq2Seq) with Attention, T5 (Text-to-Text Transfer Transformer), MARIAN
14 | Event discovery in social media feeds | Edward Benson, Aria Haghighi, Regina Barzilay | 2011 | Data Collection and Pre-processing, Text Mining and NLP, Temporal Analysis, Real-time Monitoring
15 | A unified architecture for natural language processing | Ronan Collobert, Jason Weston | 2008 | Transformer Architecture, Modular Components, Bidirectional Context

Chapter 4
Literature Review

Feature | Existing Problems in the Literature | Proposed Improvements
Context Understanding | Many desktop assistants struggle to understand and maintain context over extended conversations, leading to misinterpretation of user queries and commands. | Implement a context-aware mechanism that enables the desktop assistant to remember and reference previous interactions, providing a more coherent and accurate response to user inputs.
Personalization | Some desktop assistants may not sufficiently adapt to individual user preferences and behaviors, resulting in a generic user experience. | Integrate machine learning algorithms to analyze user interactions and preferences, allowing the desktop assistant to personalize its responses and suggestions over time.
Integration with External Applications | Certain desktop assistants may face challenges in seamlessly integrating with external applications and services, limiting their overall utility. | Enhance the assistant's capabilities by improving API integration, allowing users to perform actions and retrieve information from a wider range of external services and platforms.
Multi-Modal Interactions | Some desktop assistants primarily rely on text-based interactions, neglecting the potential benefits of incorporating multi-modal elements like speech, images, or gestures. | Extend the capabilities of the desktop assistant to support multi-modal inputs, enabling users to interact using speech, images, or other modalities for a more versatile and natural experience.
Security and Privacy | Security and privacy issues may arise as desktop assistants handle sensitive information, and users may be hesitant to fully engage with the assistant due to privacy concerns. | Implement robust security measures, including end-to-end encryption and user-controlled privacy settings, to address concerns and build trust among users regarding the handling of their data.
Comprehensive Study and Comparison | Lack of a comprehensive study and comparison. | Conduct a comparative study of accessibility and evaluate AI-generated content effectively.
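
The Personalization row above proposes analyzing user interactions to tailor suggestions over time. As a minimal sketch of that idea, the Python snippet below ranks commands by how often they are used. It deliberately substitutes a plain frequency count for the machine learning algorithms mentioned in the table, and the names UsageProfile, record, and suggest are hypothetical, chosen only for illustration.

```python
from collections import Counter


class UsageProfile:
    """Counts how often each command is used (hypothetical helper class)."""

    def __init__(self):
        self.counts = Counter()

    def record(self, command):
        """Register one use of a command."""
        self.counts[command] += 1

    def suggest(self, limit=3):
        """Return the most frequently used commands, most frequent first."""
        return [command for command, _ in self.counts.most_common(limit)]


profile = UsageProfile()
history = [
    "open browser", "play music", "open browser",
    "check mail", "open browser", "play music",
]
for command in history:
    profile.record(command)

top = profile.suggest(limit=2)
print(top)
```

A learned model could later replace the counter without changing how the rest of the assistant asks for suggestions.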

The main objectives of NLP include the interpretation, analysis, and manipulation of natural language data for an
intended purpose, using various algorithms, tools, and methods. However, many of the challenges involved depend on
the natural language data under consideration, which makes it difficult to achieve all the objectives with a single
approach. Therefore, the development of different tools and methods in NLP and related areas of study has received
much attention from researchers in the recent past. These developments are shown in the figure below.

Figure: Evolution of NLP

Chapter 5
Problem Formulation/Objectives

• Context Understanding:
Problem: Existing desktop assistants struggle to maintain context over extended interactions, leading to
misunderstandings of user commands and queries.
Objective: Develop a context-aware mechanism to improve the desktop assistant's ability to understand and retain
context during user interactions.

• Personalization:
Problem: Desktop assistants often lack the ability to adapt to individual user preferences, resulting in a generic user
experience.
Objective: Implement machine learning algorithms to analyze user behavior and preferences, enabling personalized
responses and suggestions over time.

• Integration with External Services:
Problem: Some desktop assistants face challenges in seamlessly integrating with external applications and services,
limiting their overall utility.
Objective: Enhance API integration to broaden the range of external services the assistant can interact with,
improving its capabilities.

• Multi-modal Interaction:
Problem: Certain desktop assistants primarily rely on text-based interactions, neglecting the benefits of
incorporating multi-modal elements like speech and images.
Objective: Extend desktop assistant capabilities to support multi-modal inputs, providing users with more versatile
and natural interaction options.

• Security and Privacy Concerns:
Problem: Security and privacy issues may arise as desktop assistants handle sensitive information, impacting user
trust and engagement.
Objective: Implement robust security measures, including encryption and user-controlled privacy settings, to
address concerns and enhance user confidence.

• Usability and User Experience:
Problem: Some desktop assistants may have suboptimal user interfaces, response times, or overall usability.
Objective: Enhance the desktop assistant's user interface, optimize response times, and ensure intuitive interactions
to improve overall usability.

• Adaptive Learning:
Problem: Desktop assistants may lack mechanisms for adaptive learning from user feedback, hindering continuous
improvement.
Objective: Develop adaptive learning mechanisms to enable the desktop assistant to learn from user interactions
and improve its performance over time.

• Compatibility and Interoperability:
Problem: Desktop assistants may lack compatibility with various operating systems, applications, and devices.
Objective: Ensure compatibility and interoperability across diverse platforms, maximizing accessibility for users.

• Error Handling and User Guidance:
Problem: Ineffective error handling may lead to user frustration, and there may be a lack of clear guidance for
users.
Objective: Implement robust error-handling mechanisms and provide clear user guidance to enhance the overall
user experience.

• User Trust and Transparency:
Problem: Lack of transparency in the operation of desktop assistants may impact user trust.
Objective: Establish transparency in the operation of the desktop assistant and implement features that build user
trust regarding the handling and processing of their data.
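
The Context Understanding objective in this chapter calls for a mechanism that retains context across interactions. One possible shape for such a mechanism is sketched below in Python: a bounded history of recent turns that is used to resolve a follow-up command such as "close it". The class ConversationContext and its methods are hypothetical names introduced for illustration, and the pronoun handling is intentionally simplistic; the paper does not specify its own design.

```python
from collections import deque


class ConversationContext:
    """Keeps the most recent turns so follow-up commands can be resolved."""

    def __init__(self, max_turns=5):
        # A deque with maxlen drops the oldest turn automatically.
        self.turns = deque(maxlen=max_turns)

    def remember(self, user_text, entity):
        """Store what the user said and the main entity it referred to."""
        self.turns.append({"text": user_text, "entity": entity})

    def last_entity(self):
        """Return the most recently mentioned entity, or None."""
        for turn in reversed(self.turns):
            if turn["entity"] is not None:
                return turn["entity"]
        return None

    def resolve(self, user_text):
        """Replace the bare pronoun "it" with the last remembered entity."""
        entity = self.last_entity()
        if entity is None:
            return user_text
        words = [entity if w.lower() == "it" else w for w in user_text.split()]
        return " ".join(words)


context = ConversationContext(max_turns=3)
context.remember("open notepad", "notepad")
resolved = context.resolve("close it")
print(resolved)
```

Bounding the history keeps memory use predictable; a fuller implementation would need real entity extraction rather than a caller-supplied entity.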

Chapter 6
Methodology/Planning of Work

1. Project Phases:
a. Research and Literature Review:
• Conduct an in-depth literature review on inclusive design, accessibility, and prompt engineering.
b. Requirement Analysis:
• Define specific requirements and objectives for the project based on the literature review findings.
c. System Design and Architecture:
• Develop the system design, including the architectural layout of the project.

2. Development:
a. Code Development:
• Implement the assistant in Python.
• Integrate machine learning algorithms.
b. Integration of AI:
• Integrate ChatGPT.
• Test the compatibility and effectiveness of the integrated system.
3. Evaluation and Testing:
a. Comprehensive Study and Comparison:
• Conduct a comprehensive study comparing accessibility features of popular apps with those generated by our
system.
• Analyze and document the findings for later evaluation.
b. User Testing:
• Engage users in testing the system.
• Collect feedback on usability, accessibility, and overall user experience.

4. Iterative Improvement:
a. Feedback Integration:
• Integrate user feedback into the system to address identified issues.
• Refine prompt engineering strategies based on user interactions.
b. Optimization and Scaling:
• Optimize the system for performance and scalability.
• Ensure that the system can handle varying loads and user interactions.
5. Documentation and Reporting:
a. Finalizing Reports:
• Compile and finalize documentation, including research findings, development processes, and user testing
results.
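
The Development phase above (Python implementation plus AI integration) could be organized around a small command dispatcher. The sketch below is one hedged illustration: recognized text is routed to a handler by its first word, and anything unrecognized falls through to a stub that stands in for the ChatGPT integration step. All function names and the COMMANDS table are hypothetical, and no real speech or language-model API is called.

```python
def open_application(argument):
    # Placeholder: a real handler might launch the program via subprocess.
    return f"Opening {argument}"


def search_web(argument):
    # Placeholder: a real handler might open a browser with the query.
    return f"Searching the web for {argument}"


def ask_language_model(text):
    # Hypothetical stub standing in for the ChatGPT integration step;
    # no external service is contacted here.
    return f"(fallback) I am not sure how to handle: {text}"


# Maps the first word of a command to the function that handles it.
COMMANDS = {
    "open": open_application,
    "search": search_web,
}


def handle_command(text):
    """Route recognized text to a handler; fall back for anything unknown."""
    words = text.strip().split(maxsplit=1)
    if not words:
        return "I did not catch that. Please repeat the command."
    keyword = words[0].lower()
    argument = words[1] if len(words) > 1 else ""
    handler = COMMANDS.get(keyword)
    if handler is None:
        return ask_language_model(text)
    return handler(argument)


print(handle_command("open notepad"))
print(handle_command("what is the weather"))
```

Keeping the handlers in a dictionary means new capabilities can be added during the iterative-improvement phase without touching the routing logic.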

Conclusion: This structured plan ensures a systematic approach to the project, with defined phases for
research, development, evaluation, and iterative improvement. The plan leaves room for adjustments based on
ongoing findings and feedback.

Chapter 7
Facilities Required for Proposed Work

1. Hardware:
a. Testing Devices:
• A range of devices for testing the developed interfaces (e.g., desktops, laptops, tablets, smartphones).
2. Software:
a. Development Environment:
• IDEs for Python development.
b. Testing and Accessibility Tools:
• Testing frameworks for unit testing and integration testing.
• Accessibility testing tools to evaluate adherence to accessibility standards.
3. Data Resources:
a. User Interaction Data:
• Capture and anonymize user interaction data for testing and feedback.
• Ensure compliance with data privacy regulations.
4. Collaboration Tools:
a. Documentation and Project Management:
• Document collaboration tools (e.g., Google Docs, Microsoft Office 365).
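
The Data Resources item above asks for user interaction data to be captured and anonymized. A minimal sketch of one way to do this in Python is shown below: the user identifier is replaced with a keyed SHA-256 digest before a record is stored. The function names, the record layout, and the key value are assumptions made for illustration. Strictly speaking this is pseudonymization, so whether it satisfies a given privacy regulation would still need to be checked.

```python
import hashlib
import hmac


def anonymize_user_id(user_id, secret_key):
    """Return a keyed SHA-256 digest so the raw identifier is never stored."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_event(event, secret_key):
    """Copy an interaction record, replacing the user id with its pseudonym."""
    cleaned = dict(event)
    cleaned["user"] = anonymize_user_id(event["user"], secret_key)
    return cleaned


SECRET_KEY = b"replace-with-a-private-key"  # illustrative value only
event = {"user": "user-001", "command": "open notepad"}
safe_event = anonymize_event(event, SECRET_KEY)
print(safe_event["command"], len(safe_event["user"]))
```

The same user always maps to the same pseudonym, so usage patterns can still be analyzed per user during testing.
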
Conclusion: The proposed work requires a robust set of facilities encompassing hardware, software, data
resources, collaboration tools, and dedicated testing environments. Adequate resources and tools will be
essential for the successful execution and evaluation of the project.

References

1. Sangpal, R., Gawand, T., Vaykar, S., & Madhavi, N. (2019, July). JARVIS: An interpretation of AIML with integration of
gTTS and Python. In 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control
Technologies (ICICICT) (Vol. 1, pp. 486-489). IEEE.

2. Othman, E. S. (2017). Voice Controlled Personal Assistant Using Raspberry Pi. International Journal of Scientific &
Engineering Research, 8(11), 1611-1615.

3. Mittal, Y., Toshniwal, P., Sharma, S., Singhal, D., Gupta, R., & Mittal, V. K. (2015, December). A voice controlled
multifunctional smart home automation system. In 2015 Annual IEEE India Conference (INDICON) (pp. 1-6). IEEE.

4. Pandey, A., Vashist, V., Tiwari, P., Sikka, S., & Makkar, P. Smart Voice Based Virtual Personal Assistants with Artificial
Intelligence.

5. Subhash, S., Srivatsa, P. N., Siddesh, S., Ullas, A., & Santhosh, B. (2020, July). Artificial Intelligence-based Voice
Assistant. In 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4) (pp. 593-
596). IEEE.

6. Rahul Kumar, Garima Sarupria, Varshil Panwala, Smit Shah, Nehal Shah (2020), Power Efficient Smart Home With Voice
Assistant, IEEE – 49239. Sivasubramanian A., Shastry P.N., Hong P.C. (eds) Futuristic.

7. Thomas C (2019) https://towardsdatascience.com/recurrent-neural-networks-and-natural-language-processing-73af640c2aa1.
Accessed 15 Dec 2021.

8. Srihari S (2010) Machine Learning: Generative and Discriminative Models.
http://www.cedar.buffalo.edu/wsrihari/CSE574/Discriminative-Generative.pdf. Accessed 31 May 2017.

9. Sakkis G, Androutsopoulos I, Paliouras G et al (2003) A memory-based approach to anti-spam filtering for mailing lists.
Inf Retr 6:49–73. https://doi.org/10.1023/A:1022948414856

10. Seal D, Roy UK, Basak R (2020) Sentence-level emotion detection from text based on semantic rules. In: Tuba M, Akashe
S, Joshi A (eds) Information and Communication Technology for Sustainable Development. Advances in Intelligent
Systems and Computing, vol 933. Springer, Singapore. https://doi.org/10.1007/978-981-13-7166-0_42

11. Newatia R (2019) https://medium.com/saarthi-ai/sentence-classification-using-convolutional-neural-networks-ddad72c7048c.
Accessed 15 Dec 2021.

12. Ochoa, A. (2016). Meet the Pilot: Smart Earpiece Language Translator. https://www.indiegogo.com/projects/meet-the-
pilot-smart-earpiece-language-translator-headphones-travel. Accessed April 10, 2017.

13. Ogallo, W., & Kanter, A. S. (2017). Using natural language processing and network analysis to develop a conceptual
framework for medication therapy management research. https://www.ncbi.nlm.nih.gov/pubmed/28269895?dopt=Abstract.
Accessed April 10, 2017.

14. Gao T, Dontcheva M, Adar E, Liu Z, Karahalios K. DataTone: managing ambiguity in natural language interfaces for data
visualization. UIST '15: Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology,
November 2015, 489–500. https://doi.org/10.1145/2807442.2807478

15. Elkan C (2008) Log-Linear Models and Conditional Random Fields. http://cseweb.ucsd.edu/welkan/250B/cikmtutorial.pdf.
Accessed 28 Jun 2017.

16. Choudhary N (2021) LDC-IL: the Indian repository of resources for language technology. Lang Resources & Evaluation
55:855–867. https://doi.org/10.1007/s10579-020-09523-3.
