Assignment 2
1. Build a logistic regression model on the training data set to identify good customers and
bad customers. A good customer is one who has never delayed the payment, whereas a bad
customer is one who has delayed the payment even once. Use the variables “AGE”,
“NOOFDEPE”, “MTHINCTH”, “SALDATFR”, “TENORYR”, “DWNPMFR”, “PROFBUS”,
“QUALHSC”, “QUAL_PG”, “SEXCODE”, “FULLPDC”, “FRICODE” and “WASHCODE” as predictors
in your logistic model. Clearly interpret the output of the model.
2. Judge the performance of the model on the validation data set. Is the performance of the
model satisfactory? Consider at least two criteria.
3. Include the variable “Region” as an additional predictor in your logistic model. Note that you
have to create appropriate dummy variables for “Region”. Does including “Region”
improve the performance of the model?
4. Suppose Auto Finance Ltd. provides loans for a 2-year period. The management of Auto
Finance Ltd. has estimated that the profit associated with a “True Positive” case is Rs. 6360.
Furthermore, they have estimated that the losses associated with a “False Negative” case and
a “False Positive” case are Rs. 12500 and Rs. 6360, respectively. Based on the confusion matrix
obtained for the validation data set, calculate the total profit for the company.
5. Can you suggest an alternative model? Is the alternative model better than the logistic
regression model?
6. How will the fitted model be helpful in taking managerial decisions?
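The profit calculation in question 4 amounts to weighting the confusion-matrix cells by the stated amounts. A minimal sketch in R follows. The cell counts are made up for illustration and are not taken from the data, `total_profit` is a hypothetical helper, and the sketch assumes that “True Negative” cases contribute nothing, which the assignment does not state.

```r
# Per-case amounts from question 4 (Rs.)
profit_tp <- 6360    # profit per "True Positive" case
loss_fn   <- 12500   # loss per "False Negative" case
loss_fp   <- 6360    # loss per "False Positive" case

# total_profit(): hypothetical helper.
# Assumption: "True Negative" cases contribute 0 to the total.
total_profit <- function(tp, fn, fp) {
  tp * profit_tp - fn * loss_fn - fp * loss_fp
}

# Made-up counts for illustration only
example_profit <- total_profit(tp = 100, fn = 10, fp = 20)
print(example_profit)
```

With the real data, the three counts would be read off the validation confusion matrix, after deciding which class is treated as “positive”.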
# Model
mod <- glm(DefaulterFlag ~ AGE + NOOFDEPE + MTHINCTH + SALDATFR + TENORYR + DWNPMFR +
             PROFBUS + QUALHSC + QUAL_PG + SEXCODE + FULLPDC + FRICODE + WASHCODE,
           data = d, family = binomial, subset = train)
summary(mod)
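Once the model is fitted, its validation performance (question 2) can be judged from a confusion matrix and the rates derived from it. The sketch below is one possible way to do this, not the required method. `classification_metrics` is a hypothetical helper, the 0.5 cutoff is an assumption, the example vectors are made up, and the outcome is assumed to be coded 0/1.

```r
# classification_metrics(): hypothetical helper, not part of the assignment code.
# actual: 0/1 vector of true classes; prob: predicted probabilities of class 1.
classification_metrics <- function(actual, prob, cutoff = 0.5) {
  pred <- ifelse(prob >= cutoff, 1, 0)
  # Fix both factor levels so the table is always 2 x 2
  cm <- table(Actual    = factor(actual, levels = c(0, 1)),
              Predicted = factor(pred,   levels = c(0, 1)))
  list(
    confusion   = cm,
    accuracy    = sum(diag(cm)) / sum(cm),
    sensitivity = cm["1", "1"] / sum(cm["1", ]),  # share of actual 1s predicted as 1
    specificity = cm["0", "0"] / sum(cm["0", ])   # share of actual 0s predicted as 0
  )
}

# Made-up values for illustration only
actual <- c(1, 1, 1, 0, 0, 0, 1, 0)
prob   <- c(0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.8, 0.3)
m <- classification_metrics(actual, prob)
print(m)

# With the assignment's objects the call might look like this
# (assuming `train` is an integer index vector and DefaulterFlag is coded 0/1):
# prob <- predict(mod, newdata = d[-train, ], type = "response")
# classification_metrics(d$DefaulterFlag[-train], prob)
```

Accuracy together with sensitivity or specificity would give the two criteria the question asks for. Other criteria are equally valid choices.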
2. You may have to define dummy variables for “Region” as follows. Note that the reference category is all
other regions. Include the dummy variables in your model.
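# Region Code
# Each dummy is 1 for the named region and 0 otherwise; all other regions form the reference.
# Columns are referenced through d$ so the code does not depend on attach(d).
d$AP2<-ifelse(d$Region=="AP2", 1, 0)
d$AP2<-as.factor(d$AP2)
d$Chennai<-ifelse(d$Region=="Chennai", 1, 0)
d$Chennai<-as.factor(d$Chennai)
d$KA1<-ifelse(d$Region=="KA1", 1, 0)
d$KA1<-as.factor(d$KA1)
d$KE2<-ifelse(d$Region=="KE2", 1, 0)
d$KE2<-as.factor(d$KE2)
d$TN1<-ifelse(d$Region=="TN1", 1, 0)
d$TN1<-as.factor(d$TN1)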
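As an alternative to hand-coded dummies, R can generate the same indicator columns from a single factor whose first level is the reference. The sketch below illustrates this on a toy data frame. `toy`, `RegionGrp`, the label "Other", and the region values "XX" and "YY" are all made up for illustration.

```r
# Toy data for illustration only; "XX" and "YY" stand in for unnamed regions
toy <- data.frame(Region = c("AP2", "Chennai", "KA1", "KE2", "TN1", "XX", "YY", "AP2"))

named <- c("AP2", "Chennai", "KA1", "KE2", "TN1")

# Collapse every unnamed region into one level, "Other", and list it first
# so that it becomes the reference level under R's default treatment contrasts
region_chr <- as.character(toy$Region)
toy$RegionGrp <- factor(ifelse(region_chr %in% named, region_chr, "Other"),
                        levels = c("Other", named))

# model.matrix() shows the dummy columns a model formula would create
X <- model.matrix(~ RegionGrp, data = toy)
print(X)
```

Under these assumptions, adding `+ RegionGrp` to the model formula would be equivalent to adding the five dummy variables separately.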