Machine Learning
There are four basic steps for building a machine learning application (or model).
These are typically performed by data scientists working closely with the business
professionals for whom the model is being developed.
Step 1: Select and prepare a training data set
Training data is a data set representative of the data the machine learning model
will ingest to solve the problem it’s designed to solve. In some cases, the training
data is labeled data ‘tagged’ to call out features and classifications the model will
need to identify. Other data is unlabeled, and the model will need to extract those
features and assign classifications on its own.
In either case, the training data needs to be properly prepared: randomized,
deduplicated, and checked for imbalances or biases that could impact the training. It
should also be divided into two subsets: the training subset, which will be used to
train the application, and the evaluation subset, used to test and refine it.
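The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed workflow. The function names (prepare, label_counts), the 20% evaluation fraction, and the toy records are all assumptions made for the example.

```python
import random
from collections import Counter

def prepare(records, eval_fraction=0.2, seed=0):
    """Deduplicate, randomize, and split (features, label) records."""
    # dict.fromkeys drops exact duplicates while keeping first-seen order
    unique = list(dict.fromkeys(records))
    # Shuffle with a fixed seed so the split is reproducible
    random.Random(seed).shuffle(unique)
    n_eval = int(len(unique) * eval_fraction)
    evaluation, training = unique[:n_eval], unique[n_eval:]
    return training, evaluation

def label_counts(records):
    """Count records per label to expose class imbalance."""
    return Counter(label for _, label in records)

# Hypothetical toy data: (features, label) pairs, one exact duplicate
records = [
    ((5.1, 3.5), "A"),
    ((4.9, 3.0), "A"),
    ((5.1, 3.5), "A"),  # duplicate of the first record
    ((6.2, 2.9), "B"),
    ((5.9, 3.0), "B"),
    ((6.7, 3.1), "A"),
]

training, evaluation = prepare(records)
print(len(training), len(evaluation))
print(label_counts(training + evaluation))
```

On real data the imbalance check usually leads to a follow-up decision (resampling, reweighting, or collecting more data), which this sketch leaves out.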
Step 2: Choose an algorithm to run on the training data set
Again, an algorithm is a set of statistical processing steps. The type of algorithm
depends on the type (labeled or unlabeled) and amount of data in the training
data set and on the type of problem to be solved.
Common types of machine learning algorithms for use with labeled data include
the following:
• Regression algorithms: Linear and logistic regression are examples of
regression algorithms used to understand relationships in data. Linear
regression is used to predict the value of a dependent variable based on the
value of an independent variable. For example, a linear regression
algorithm could be trained to predict a salesperson’s annual sales (the
dependent variable) based on its relationship to the salesperson’s
education or years of experience (the independent variables). Logistic
regression can be used when the dependent variable is binary in nature: A
or B. A related supervised algorithm, the support vector machine, is
useful when dependent variables are more difficult to classify.
• Decision trees: Decision trees use classified data to make
recommendations based on a set of decision rules. For example, a decision
tree that recommends betting on a particular horse to win, place, or show
could use data about the horse (e.g., age, winning percentage, pedigree)
and apply rules to those factors to recommend an action or decision.
• Instance-based algorithms: A good example of an instance-based algorithm
is k-nearest neighbors (k-NN). It estimates how likely a data point is to be a
member of one group or another based on its proximity to other data
points.
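Two of the labeled-data algorithms above are small enough to sketch directly. The following is a minimal illustration in standard-library Python, not a production implementation: fit_line performs ordinary least squares for a single independent variable, and knn_predict labels a point by majority vote among its k nearest neighbors. The function names and all of the data (experience, sales figures, the "low" and "high" labels) are hypothetical.

```python
import math
from collections import Counter

def fit_line(xs, ys):
    """Ordinary least squares for one variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def knn_predict(points, labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Sort training indices by Euclidean distance to the query point
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: years of experience vs. annual sales (in thousands)
years = [1, 2, 3, 4, 5]
sales = [110, 130, 150, 170, 190]
slope, intercept = fit_line(years, sales)
predicted = slope * 6 + intercept  # predicted sales at 6 years of experience
print(slope, intercept, predicted)

# Hypothetical labeled points forming two well-separated groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["low", "low", "low", "high", "high", "high"]
print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (7, 8)))
```

Note that k-NN has no training step beyond storing the labeled points, which is why it is described as instance-based.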
Algorithms for use with unlabeled data include the following:
• Clustering algorithms: Think of clusters as groups. Clustering focuses on
identifying groups of similar records and labeling the records according to
the group to which they belong. This is done without prior knowledge
about the groups and their characteristics. Types of clustering algorithms
include K-means, TwoStep, and Kohonen clustering.
• Association algorithms: Association algorithms find patterns and
relationships in data and identify frequent ‘if-then’ relationships called
association rules. These are similar to the rules used in data mining.
• Neural networks: A neural network is an algorithm that defines a layered
network of calculations featuring an input layer, where data is ingested; at
least one hidden layer, where calculations are performed to draw different
conclusions about the input; and an output layer, where each conclusion is
assigned a probability. A deep neural network defines a network with
multiple hidden layers, each of which successively refines the results of the
previous layer. (For more, see the “Deep learning” section below.)
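As a rough illustration of two items in this list, the sketch below implements K-means clustering and a single forward pass through a tiny neural network, using only the Python standard library. Everything here is an assumption made for the example: the function names, the toy points, the fixed iteration count, and especially the network weights, which are made up and untrained. The ReLU and softmax functions are common choices for the hidden and output layers, not something the text specifies.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters without using any labels."""
    centroids = random.Random(seed).sample(points, k)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

def forward(x, w_hidden, w_out):
    """One pass through input -> hidden (ReLU) -> output (softmax)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]  # one probability per conclusion

# Hypothetical unlabeled points forming two well-separated groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, assignments = kmeans(points, k=2)
print(assignments)

# Hypothetical, untrained weights: 2 inputs, 3 hidden units, 2 outputs
w_hidden = [(0.5, -0.2), (0.1, 0.8), (-0.3, 0.4)]
w_out = [(0.7, -0.5, 0.2), (-0.6, 0.9, 0.1)]
probabilities = forward((0.5, -1.0), w_hidden, w_out)
print(probabilities)
```

The cluster numbers K-means returns are arbitrary identifiers; only the grouping is meaningful. Training the network, that is, adjusting the weights from data, is outside the scope of this sketch.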
Machine learning methods (also called machine learning styles) fall into three
primary categories.
As noted at the outset, machine learning is everywhere. Here are just a few
examples of machine learning you might encounter every day:
• Digital assistants: Apple Siri, Amazon Alexa, Google Assistant, and other
digital assistants are powered by natural language processing (NLP), a
machine learning application that enables computers to process text and
voice data and 'understand' human language the way people do. Natural
language processing also drives voice-driven applications like GPS and
speech recognition (speech-to-text) software.
• Recommendations: Deep learning models drive 'people also liked' and 'just
for you' recommendations offered by Amazon, Netflix, Spotify, and other
retail, entertainment, travel, job search, and news services.
• Contextual online advertising: Machine learning and deep learning models
can evaluate the content of a web page—not only the topic, but nuances
like the author's opinion or attitude—and serve up advertisements tailored
to the visitor's interests.
• Chatbots: Chatbots can use a combination of pattern recognition, natural
language processing, and deep neural networks to interpret input text and
provide suitable responses.
• Fraud detection: Machine learning regression and classification models
have replaced rules-based fraud detection systems, which have a high
number of false positives when flagging stolen credit card use and are
rarely successful at detecting criminal use of stolen or compromised
financial data.
• Cybersecurity: Machine learning can extract intelligence from incident
reports, alerts, blog posts, and more to identify potential threats, advise
security analysts, and accelerate response.
• Medical image analysis: The types and volume of digital medical imaging
data have exploded, leading to more available information for supporting
diagnoses but also more opportunity for human error in reading the data.
Convolutional neural networks (CNNs), recurrent neural networks (RNNs),
and other deep learning models have proven increasingly successful at
extracting features and information from medical images to help support
accurate diagnoses.
• Self-driving cars: Self-driving cars require a machine learning tour de
force—they must continuously identify objects in the environment around
the car, predict how they will change or move, and guide the car around
the objects as well as toward the driver's destination. Virtually every form
of machine learning and deep learning algorithm mentioned above plays
some role in enabling a self-driving automobile.
Machine learning and IBM Cloud
IBM Watson Machine Learning supports the machine learning lifecycle end to
end. It is available in a range of offerings that let you build machine learning
models wherever your data lives and deploy them anywhere in your hybrid
multicloud environment.
IBM Watson Machine Learning on IBM Cloud Pak for Data helps enterprise data
science and AI teams speed AI development and deployment anywhere, on a
cloud native data and AI platform. IBM Watson Machine Learning Cloud, a
managed service in the IBM Cloud environment, is the fastest way to move
models from experimentation on the desktop to deployment for production
workloads. For smaller teams looking to scale machine learning deployments, IBM
Watson Machine Learning Server offers simple installation on any private or
public cloud.
Links: https://www.ibm.com/cloud/learn/machine-learning
https://www.expert.ai/blog/machine-learning-definition/