Miniproject 6
Machine learning algorithms are also useful for predicting crop yield. Using past information on weather, temperature and a number of other factors, the yield can be estimated in advance. When crop producers know accurate yield information, losses are minimized. Machine learning is a fast-growing approach that is spreading into every sector and helping it make viable decisions so as to make the most of its applications.
The core objective of crop yield estimation is to achieve higher agricultural crop production, and many established models have been exploited to increase the yield of crop production. Nowadays, ML is being used worldwide [2] due to its efficiency in various sectors such as forecasting, fault detection and pattern recognition. ML algorithms also help to improve the crop yield production rate when there are losses under unfavourable conditions, and they are applied in the crop selection method to reduce losses in crop yield production irrespective of a distracting environment.
CHAPTER 1
INTRODUCTION
1.1 GENERAL
Tamil Nadu, the 7th largest state in India by area, has the 6th largest population. It is a leading producer of agricultural products, and agriculture is the main occupation of the people of Tamil Nadu. Agriculture has a sound tone in this competitive world. The Cauvery is the main source of water, and the Cauvery delta regions are called the rice bowl of Tamil Nadu. Rice is the major crop grown in Tamil Nadu; other crops such as paddy, sugarcane, cotton, coconut and groundnut are also grown, and bio-fertilizers are produced efficiently. In many areas, farming is the major source of occupation. Agriculture makes a dramatic impact on the economy of a country. Due to changes in natural factors, agricultural farming is degrading nowadays. Agriculture directly depends on environmental factors such as sunlight, humidity, soil type, rainfall, maximum and minimum temperature, climate, fertilizers and pesticides. Knowledge of proper crop harvesting is needed for agriculture to bloom. India has the following seasons:
1. Winter which occurs from December to March.
2. Summer season from April to June.
3. Monsoon or rainy season lasting from July to September.
4. Post-monsoon or autumn season occurring from October to November.
Due to the diversity of seasons and rainfall, assessment of suitable crops to cultivate is necessary. Farmers face major problems such as crop management, estimating the expected crop yield and obtaining a productive yield from the crops. Farmers and cultivators need proper assistance regarding crop cultivation, especially as many young people are now taking an interest in agriculture. The impact of the IT sector in addressing real-world problems is growing at a fast rate, and data in the field of agriculture is increasing day by day. With advancements in the Internet of Things, there are ways to capture huge volumes of agricultural data. There is a need for a system that can properly analyse agricultural data and extract useful information from this growing data. To gain insights from the data, the data has to be learnt from.
1.2 OBJECTIVE
In past years, prediction of crops was done according to the farmer's experience. Although the farmer's knowledge still holds, agricultural factors have changed to an astonishing level, so there is a need to bring engineering methods into crop prediction. Data mining plays a novel role in agricultural research [11]. Techniques in this field that predict from historical data include neural networks and K-nearest neighbours, whereas the k-means algorithm does not use historical labels but predicts by computing the centres of the samples and forming clusters. The computational cost of an algorithm remains a major issue.
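To make the contrast concrete, the following minimal sketch (not part of the original report) shows a K-nearest neighbours classifier learning from labelled historical records and a k-means model that only computes cluster centres; the rainfall and temperature values and crop labels are invented purely for illustration.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Toy historical records: [rainfall_mm, average_temperature_C] with invented values.
X = np.array([[900, 27], [950, 28], [400, 33], [420, 34], [880, 26], [410, 32]])
y = np.array(["rice", "rice", "groundnut", "groundnut", "rice", "groundnut"])

# Supervised: KNN predicts a crop for a new season from the labelled history.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[930, 27]]))      # likely ['rice']

# Unsupervised: k-means ignores the labels and only finds cluster centres.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)           # two centres in (rainfall, temperature) space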
The problem that the Indian agriculture sector is facing is the integration of technology to bring about the desired outputs. With the advent of new technologies and the overuse of non-renewable energy resources, patterns of rainfall and temperature have been disturbed. The inconsistent trends arising from the side effects of global warming make it difficult for farmers to clearly predict temperature and rainfall patterns, which affects their crop yield productivity; as crop yields decrease, Indian GDP also decreases. The main aim of this project is to help farmers cultivate a crop with maximum yield.
1.4 EXISTING SYSTEM
• The main challenge faced in the agriculture sector is the lack of knowledge about changing climatic variations. Each crop has its own suitable climatic features. This can be handled with the help of precision farming techniques, which not only maintain the productivity of crops but also increase the yield rate of production.
• The existing system which recommends crop yield is either hardware-based being costly to
maintain, or not easily accessible.
• Despite many solutions that have been recently proposed, there are still open challenges in
creating a user-friendly application with respect to crop recommendation.
1.5 PROPOSED SYSTEM
Farmers need assistance from recent technology to grow their crops, and proper crop predictions can be communicated to agriculturists in a timely manner. Many machine learning techniques have been used to analyse agricultural parameters, and some of these techniques in different aspects of agriculture are studied here.
Neural networks and soft computing techniques play a significant part in providing recommendations. By considering parameters such as production and season, more personalized and relevant recommendations can be given to farmers, enabling them to obtain a good volume of production.
The proposed model predicts the crop yield for the data sets of the given region.
Integrating agriculture and ML will contribute to more enhancements in the
agriculture sector by increasing the yields and optimizing the resources involved. The
data from previous years are the key elements in forecasting current performance.
The proposed system uses a recommender system to suggest the right time for applying fertilizers.
The methods in the proposed system include increasing the yield of crops, real-time analysis of crops, selecting efficient parameters, making smarter decisions and getting a better yield.
CHAPTER-2
LITERATURE SURVEY
3.1 GENERAL
These are the requirements for doing the project; without these tools and software, the project cannot be carried out. There are two kinds of requirements for the project:
1. Hardware Requirements.
2. Software Requirements.
Anaconda Prompt
Anaconda Prompt is a command-line interface that deals explicitly with the ML modules, and the Anaconda Navigator is available on Windows, Linux and macOS. Anaconda ships a number of IDEs that make coding easier, and the UI can also be implemented in Python.
Standard Used: ISO/IEC 27001
JUPYTER
Jupyter is an open-source web application that allows us to create and share documents containing live code, equations, visualisations and narrative text. It can be used for data cleaning and transformation, numerical simulation, statistical modelling, data visualization and machine learning.
Standard Used: ISO/IEC 27001
CHAPTER 4
SYSTEM DESIGN
4.1 GENERAL
Design engineering deals with the various UML (Unified Modelling Language) diagrams used for the implementation of the project. Design is a meaningful engineering representation of a thing that is to be built. Software design is a process through which the requirements are translated into a representation of the software. Design is the place where quality is rendered in software engineering, and it is the means to accurately translate customer requirements into a finished product.
4.3.1 INTRODUCTION:
UML stands for Unified Modelling Language. UML is a standardized, general-purpose modelling language in the field of object-oriented software engineering. The standard is managed, and was created, by the Object Management Group. The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
GOALS: The primary goals in the design of UML are as follows:
1. Provide users with a ready-to-use, expressive visual modelling language so that they can develop and exchange meaningful models.
2. Provide extensibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modelling language.
5. Encourage the growth of the object-oriented tools market.
6. Support higher-level development concepts such as collaborations, frameworks, patterns and components.
7. Integrate best practices.
4.3.2 USE CASE DIAGRAM
A use case diagram in the Unified Modelling Language (UML) is a type of behavioural
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented
as use cases), and any dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor. Roles of the actors
in the system can be depicted.
Fig 4.3.2: Use Case Diagram
In software engineering, a class diagram in the Unified Modelling Language (UML) is a type
of static structure diagram that describes the structure of a system by showing the system's
classes, their attributes, operations (or methods), and the relationships among the classes. It
explains which class contains which information. Class diagrams are the blueprints of your system or subsystem and are useful in many stages of system design.
Fig 4.3.3: Class Diagram
A UML Object diagram represents a specific instance of a class diagram at a certain moment
in time. When represented visually, you’ll see many similarities to the class diagram. An
Object diagram focuses on the attributes of a set of objects and how those objects relate to
each other.
Fig 4.3.5: Object Diagram
1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to represent a system in terms of the input data to the system, the various processing carried out on this data, and the output data generated by the system.
2. The data flow diagram (DFD) is one of the most important modelling tools. It is used
to model the system components. These components are the system process, the data
used by the process, an external entity that interacts with the system and the
information flows in the system.
3. DFD shows how the information moves through the system and how it is modified by
a series of transformations. It is a graphical technique that depicts information flow
and the transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of abstraction, and it may be partitioned into levels that represent increasing information flow and functional detail.
Fig 4.4: Data Flow Diagram
CHAPTER-5
IMPLEMENTATION
5.1 GENERAL
In this chapter, various supervised machine learning approaches are used. This section
provides a general description of these approaches.
5.2 METHODOLOGIES
1. Data collection
2. Dataset
3. Data pre-processing
4. Model selection
5. Performance analysis
6. Accuracy prediction
Data Collection:
This is the first real step towards the development of a machine learning model: collecting data. It is a critical step that determines how good the model will be; the more and better data we get, the better our model will perform. There are several techniques for collecting the data, such as web scraping and manual intervention. The dataset used for this crop recommendation in India was taken from an external source.
Dataset:
The dataset consists of individual records. There are 8 columns in the dataset, which are described below.
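As a hypothetical illustration of working with such a dataset (the file name crop_data.csv and the inspection calls are assumptions, not details from the report), the columns can be loaded and examined with pandas:

import pandas as pd

# Load the collected dataset; the file name is an assumption for illustration.
df = pd.read_csv("crop_data.csv")

print(df.shape)              # expected to report 8 columns
print(df.columns.tolist())   # names of the feature and label columns
print(df.head())             # a quick look at the first few records
print(df.describe())         # summary statistics of the numeric features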
Data Pre-processing:
Wrangle the data and prepare it for training. Clean whatever requires it (remove duplicates, correct errors, deal with missing values, and perform normalization and data type conversions, etc.).
Randomize data, which erases the effects of the particular order in which we collected
and/or otherwise prepared our data. Visualize data to help detect relevant relationships
between variables or class imbalances (bias alert!), or perform other exploratory analysis.
Split into training and evaluation sets.
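A minimal sketch of these pre-processing steps, assuming the DataFrame df loaded earlier and a target column named "label" (the column name is an assumption), could look like this:

from sklearn.model_selection import train_test_split

# Clean the data: drop duplicate rows and rows with missing values
# (imputation could be used instead of dropping).
df = df.drop_duplicates()
df = df.dropna()

# Separate the features from the target; "label" is an assumed column name.
X = df.drop(columns=["label"])
y = df["label"]

# Shuffle (randomize) and split into training and evaluation sets (80/20).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42)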
Model Selection:
Decision trees can handle high-dimensional data with good accuracy. The decision rules are generally in the form of if-then-else statements; the deeper the tree, the more complex the rules and the closer the model fits the data.
Before we dive deep, let's get familiar with some of the terminologies:
Instances: Refer to the vector of features or attributes that define the input space
Attribute: A quantity describing an instance
Concept: The function that maps input to output
Target Concept: The function that we are trying to find, i.e., the actual answer
Hypothesis Class: Set of all the possible functions
Sample: A set of inputs paired with a label, which is the correct output
Testing Set: Similar to the training set and is used to test the candidate concept and
determine its performance.
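A brief sketch of this model-selection step with a decision tree, reusing the training split from the pre-processing sketch above (scikit-learn is assumed as the library):

from sklearn.tree import DecisionTreeClassifier

# max_depth bounds how deep, and therefore how complex, the if-then-else rules grow.
tree = DecisionTreeClassifier(max_depth=5, random_state=42)
tree.fit(X_train, y_train)       # learn the decision rules from the training set
y_pred = tree.predict(X_test)    # apply the rules to unseen test samples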
Performance Analysis:
The performance was evaluated using metrics such as the Mean Squared Error (MSE). From the full dataset, only 8 features were chosen.
Accuracy Prediction:
We obtained an accuracy of 90.7% on the test set, which is the accuracy of the crop prediction.
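The following hedged sketch shows how such metrics can be computed with scikit-learn; the 90.7% figure above is the report's own result, and the numeric yield values below are invented purely to illustrate MSE:

from sklearn.metrics import accuracy_score, mean_squared_error

# Classification view: compare predicted crop labels with the true test labels.
acc = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {acc:.1%}")

# Regression view: if the target were a numeric yield, MSE would apply instead.
y_true_yield = [3.2, 2.8, 4.1]    # tonnes per hectare, invented values
y_pred_yield = [3.0, 3.1, 3.9]
print("MSE:", mean_squared_error(y_true_yield, y_pred_yield))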
5.2 PYTHON
Before we take a look at the details of various machine learning methods, let's start by
looking at what machine learning is, and what it isn't. Machine learning is often categorized
as a subfield of artificial intelligence, but I find that categorization can often be misleading at
first brush. The study of machine learning certainly arose from research in this context, but in
the data science application of machine learning methods, it's more helpful to think of
machine learning as a means of building models of data. Fundamentally, machine learning
involves building mathematical models to help understand data. "Learning" enters the fray
when we give these models tuneable parameters that can be adapted to observed data; in this
way the program can be considered to be "learning" from the data. Once these models have
been fit to previously seen data, they can be used to predict and understand aspects of newly
observed data. I'll leave to the reader the more philosophical digression regarding the extent
to which this type of mathematical, model based "learning" is similar to the "learning"
exhibited by the human brain. Understanding the problem setting in machine learning is
essential to using these tools effectively Applications of Machines Learning.
Machine Learning is the most rapidly growing technology and according to researchers we
are in the golden year of AI and ML. It is used to solve many real-world complex problems
which cannot be solved with a traditional approach.
Following are some real-world applications of ML
A random forest is a supervised machine learning algorithm that is constructed from decision
tree algorithms. This algorithm is applied in various industries such as banking and e-
commerce to predict behaviour and outcomes. This section provides an overview of the random forest algorithm and how it works. It presents the algorithm's features
and how it is employed in real-life applications. It also points out the advantages and
disadvantages of this algorithm.
Decision trees are the building blocks of a random forest algorithm. A decision tree is
a decision support technique that forms a tree-like structure. An overview of decision trees
will help us understand how random forest algorithms work.
A decision tree consists of three components: decision nodes, leaf nodes, and a root
node. A decision tree algorithm divides a training dataset into branches, which further
segregate into other branches. This sequence continues until a leaf node is attained. The leaf
node cannot be segregated further.
The nodes in the decision tree represent attributes that are used for predicting the
outcome. Decision nodes provide a link to the leaves. The following diagram shows the three
types of nodes in a decision tree.
Fig 5.4: Decision Tree Diagram
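As an illustrative sketch (assumed, not taken from the report), many such decision trees can be combined into a random forest with scikit-learn, reusing the earlier train/test split:

from sklearn.ensemble import RandomForestClassifier

# An ensemble of 100 decision trees; each tree votes and the majority wins.
forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
print("Test accuracy:", forest.score(X_test, y_test))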