Project FRA Milestone1 JPY Nikita Chaturvedi 05.05.2022 Jupyter Notebook PDF


06/02/2022, 17:52 Project_FRA_Milestone1_Nikita Chaturvedi_05.05.2022 - Jupyter Notebook

Problem Statement
Businesses can default if they are unable to keep up with their debt obligations. A default lowers the
company's credit rating, which in turn reduces its chances of getting credit in the future, and the company
may have to pay higher interest on existing debts as well as on any new obligations. From an investor's point
of view, a company is worth investing in if it is capable of handling its financial obligations, can grow
quickly, and is able to manage the growth scale.

A balance sheet is a financial statement of a company that provides a snapshot of what a company owns,
owes, and the amount invested by the shareholders. Thus, it is an important tool that helps evaluate the
performance of a business.

The available data includes information from the financial statements of the
companies for the previous year (2015). The net worth of each company in the
following year (2016) is also provided and can be used to derive
the labeled field.

In [175]:

# Importing the libraries


import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns # for making plots with seaborn
color = sns.color_palette()
import sklearn.metrics as metrics
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import statsmodels.formula.api as SM
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
from sklearn.metrics import roc_auc_score,roc_curve,classification_report,confusion_

import warnings
warnings.filterwarnings("ignore")

Data Ingestion (Read Dataset):

In [2]:

Company = pd.read_csv('FRA Milestone 1.csv')

localhost:8888/notebooks/Downloads/Financial Risk Analytics (FRA)/Project FRA Milestone 1/Project_FRA_Milestone1_Nikita Chaturvedi_05.05.2022.ipynb 1/102



In [3]:

Company.head(10)
Capital [Latest] [Latest] [Latest] [Latest] (

27.48 -1,007.24 5,936.03 474.3 -1,076.34 40.5 ... 0 0 0 0 0

68.08 4,458.20 7,410.18 9,070.86 -1,098.88 486.86 ... -10.3 -39.74 -57.74 -57.74 -87.18

06.86 7,714.68 6,944.54 1,281.54 4,496.25 9,097.64 ... -5,279.14 -5,516.98 -7,780.25 -7,723.67 -7,961.51

23.49 2,353.88 2,326.05 1,033.69 -2,612.42 1,034.12 ... -3.33 -7.21 -48.13 -47.7 -51.58

70.83 4,675.33 5,740.90 1,084.20 1,836.23 4,685.81 ... -295.55 -400.55 -845.88 379.79 274.79

19.39 -1,824.75 694.64 0.02 -1,843.74 0 ... 0 0 0 0 0

31.57 1,536.08 2,567.65 949.98 804.82 834.86 ... -395.87 -987.73 -396.67 -672.36 -1,264.22

45.45 979.13 2,664.04 920.67 263.95 705.76 ... -447.24 -596.97 -456.4 -461.06 -610.8

60.94 -613.79 597.82 1,700.27 -1,121.96 117.67 ... 1.9 -20.43 -3.58 -3.58 -25.91

47.85 86.35 1,220.83 1,329.82 -390.53 2,536.78 ... 19.23 18.18 9.76 9.76 8.71

In [4]:

Company.tail(10)
Capital

3576 5455 Power Grid Corpn 43811.23 5,231.59 38,166.59 1,39,632.92 95,044.55 1,18,264.26 -10,923.29 12

3577 566 Tata Steel 46637.38 971.41 66,663.89 1,01,142.12 28,198.44 42,583.38 -3,727.04 12

3578 13569 Sardar Sar.Narm. 47261.30 42,263.46 44,129.73 46,810.68 2,636.27 3,746.17 665.73 1

3579 5554 Axis Bank 53164.91 474.1 44,676.51 4,61,977.78 4,02,200.22 4,497.01 0 3,58

3580 2806 Infosys 61082.00 574 48,068 48,098 0 12,869 28,721

3581 4987 HDFC Bank 72677.77 501.3 62,009.42 5,90,576 4,96,009.19 8,463.30 0 4,44

3582 502 Vedanta 79162.19 296.5 34,057.87 71,906.06 37,643.79 29,848.44 2,503.86 11

3583 12002 IOCL 88134.31 2,427.95 67,969.97 1,40,686.75 55,245.01 1,21,643.45 6,376.84 89

3584 12001 NTPC 91293.70 8,245.46 81,657.35 1,73,099.14 85,995.34 1,28,477.59 11,449.79 42

3585 15542 Bharti Airtel 111729.10 1,998.70 78,270.80 1,04,241 21,569.70 1,00,084.90 -12,145.30 11

Fixing Messy Column Names (containing spaces):

In [5]:

erc').str.replace('/','_by_').str.replace('&','and').str.replace('[','_').str.replace
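
The cell above is cut off in this export, so its full chain of replacements cannot be recovered. As a minimal sketch of the same kind of clean-up, the snippet below applies a short list of literal replacements to a set of column names. The helper `clean_column_names`, the replacement list, and the demo column names are assumptions made for illustration, not the notebook's actual code or the dataset's actual headers.

```python
import pandas as pd


def clean_column_names(columns: pd.Index) -> pd.Index:
    """Replace characters that are awkward in formulas with plain text.

    The replacement list is illustrative; the notebook's own list is truncated.
    """
    replacements = [
        (" ", "_"),
        ("%", "perc"),
        ("/", "_by_"),
        ("&", "and"),
    ]
    cleaned = columns.str.strip()
    for old, new in replacements:
        # regex=False treats each pattern as a literal string, which avoids
        # surprises with characters such as '[' that are special in regexes.
        cleaned = cleaned.str.replace(old, new, regex=False)
    return cleaned


# Hypothetical headers used only to demonstrate the helper.
demo = pd.DataFrame(
    columns=["Equity Paid Up", "Total Assets/Liabilities", "Margin %", "Curr Liab & Prov"]
)
demo.columns = clean_column_names(demo.columns)
print(list(demo.columns))
```

Passing `regex=False` is a deliberate choice here: it keeps every pattern literal, so the order of replacements is the only thing that affects the result.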

Checking Top 10 Rows Again :


In [6]:

Company.head(10)

Out[6]:

Co_Code Co_Name Networth_Next_Year Equity_Paid_Up Networth Capital_Employed Tota

0 16974 Hind.Cables -8021.60 419.36 -7,027.48 -1,007.24 5

1 21214 Tata Tele. Mah. -3986.19 1,954.93 -2,968.08 4,458.20 7

2 14852 ABG Shipyard -3192.58 53.84 506.86 7,714.68 6

3 2439 GTL -3054.51 157.3 -623.49 2,353.88 2

4 23505 Bharati Defence -2967.36 50.3 -1,070.83 4,675.33 5

5 2484 Usha Ispat -2519.40 179.35 -2,519.39 -1,824.75

6 23633 Hanung Toys -2125.05 30.82 -1,031.57 1,536.08 2

7 3226 K S Oils -2100.56 45.92 -1,945.45 979.13 2

8 1541 Quadrant Tele. -1695.75 61.23 -1,560.94 -613.79

9 2334 ITI -1677.18 288 -1,947.85 86.35 1

10 rows × 67 columns


In [7]:

Company.info()

<class 'pandas.core.frame.DataFrame'>

RangeIndex: 3586 entries, 0 to 3585

Data columns (total 67 columns):

# Column Non-Null Count Dtype

--- ------ -------------- -----

0 Co_Code 3586 non-null int64

1 Co_Name 3586 non-null object

2 Networth_Next_Year 3586 non-null float64

3 Equity_Paid_Up 3586 non-null object

4 Networth 3586 non-null object

5 Capital_Employed 3586 non-null object

6 Total_Debt 3586 non-null object

7 Gross_Block 3586 non-null object

8 Net_Working_Capital 3586 non-null object

9 Current_Assets 3586 non-null object

10 Current_Liabilities_and_Provisions 3586 non-null object

11 Total_Assets_by_Liabilities 3586 non-null object

12 Gross_Sales 3586 non-null object

13 Net_Sales 3586 non-null object

14 Other_Income 3586 non-null object

15 Value_Of_Output 3586 non-null object

16 Cost_of_Production 3586 non-null object

17 Selling_Cost 3586 non-null object

18 PBIDT 3586 non-null object

19 PBDT 3586 non-null object

20 PBIT 3586 non-null object

21 PBT 3586 non-null object

22 PAT 3586 non-null object

23 Adjusted_PAT 3586 non-null object

24 CP 3586 non-null object

25 Revenue_earnings_in_forex 3586 non-null object

26 Revenue_expenses_in_forex 3586 non-null object

27 Capital_expenses_in_forex 3586 non-null object

28 Book_Value_Unit_Curr 3586 non-null object

29 Book_Value_Adj_Unit_Curr 3582 non-null object

30 Market_Capitalisation 3586 non-null object

31 CEPS_annualised_Unit_Curr 3586 non-null object

32 Cash_Flow_From_Operating_Activities 3586 non-null object

33 Cash_Flow_From_Investing_Activities 3586 non-null object

34 Cash_Flow_From_Financing_Activities 3586 non-null object

35 ROG_Net_Worth_perc 3586 non-null object

36 ROG_Capital_Employed_perc 3586 non-null object

37 ROG_Gross_Block_perc 3586 non-null object

38 ROG_Gross_Sales_perc 3586 non-null object

39 ROG_Net_Sales_perc 3586 non-null object

40 ROG_Cost_of_Production_perc 3586 non-null object

41 ROG_Total_Assets_perc 3586 non-null object

42 ROG_PBIDT_perc 3586 non-null object

43 ROG_PBDT_perc 3586 non-null object

44 ROG_PBIT_perc 3586 non-null object

45 ROG_PBT_perc 3586 non-null object

46 ROG_PAT_perc 3586 non-null object

47 ROG_CP_perc 3586 non-null object

48 ROG_Revenue_earnings_in_forex_perc 3586 non-null object

49 ROG_Revenue_expenses_in_forex_perc 3586 non-null object

50 ROG_Market_Capitalisation_perc 3586 non-null object

51 Current_Ratio_Latest 3585 non-null object

52 Fixed_Assets_Ratio_Latest 3585 non-null object

53 Inventory_Ratio_Latest 3585 non-null object

54 Debtors_Ratio_Latest 3585 non-null object

55 Total_Asset_Turnover_Ratio_Latest 3585 non-null float64

56 Interest_Cover_Ratio_Latest 3585 non-null object

57 PBIDTM_perc_Latest 3585 non-null object

58 PBITM_perc_Latest 3585 non-null object

59 PBDTM_perc_Latest 3585 non-null object

60 CPM_perc_Latest 3585 non-null object

61 APATM_perc_Latest 3585 non-null object

62 Debtors_Velocity_Days 3586 non-null object

63 Creditors_Velocity_Days 3586 non-null object

64 Inventory_Velocity_Days 3483 non-null float64

65 Value_of_Output_by_Total_Assets 3586 non-null float64

66 Value_of_Output_by_Gross_Block 3586 non-null object

dtypes: float64(4), int64(1), object(62)

memory usage: 1.8+ MB

In [8]:

Company.dtypes.value_counts()

Out[8]:

object 62

float64 4

int64 1

dtype: int64

In [9]:

Company.shape
print('The number of rows of the dataframe is',Company.shape[0],'.')
print('The number of columns of the dataframe is',Company.shape[1],'.')

The number of rows of the dataframe is 3586 .

The number of columns of the dataframe is 67 .

The columns listed below are dropped because each is available both as a raw value and as a percentage or
ratio, and either form can be used. Here, we choose to drop the raw values and keep the percentage values:

1. Co_Name, as the company can also be identified from its company code.
2. Networth, as ROG_Net_Worth_perc is the percentage form of the value of a company as of 2015 (the
current year).
3. Capital_Employed, as ROG_Capital_Employed_perc is the percentage form of the total amount of capital
used by a company to generate profits.
4. Gross_Block, as ROG_Gross_Block_perc is the percentage form of the total value of all the assets that a
company owns, i.e., Gross_Block.
5. Gross_Sales, as ROG_Gross_Sales_perc is the percentage form of the grand total of sale transactions
within the accounting period, i.e., Gross_Sales.
6. Net_Sales, as ROG_Net_Sales_perc is the percentage form of gross sales minus returns, allowances, and
discounts, i.e., Net_Sales.
7. Cost_of_Production, as ROG_Cost_of_Production_perc is the percentage form of the costs incurred by a
business in manufacturing a product or providing a service, i.e., Cost_of_Production.
8. PBIDT, as ROG_PBIDT_perc is the percentage form of Profit Before Interest, Depreciation and Taxes, i.e., PBIDT.
9. PBDT, as ROG_PBDT_perc is the percentage form of Profit Before Depreciation and Tax, i.e., PBDT.
10. PBIT, as ROG_PBIT_perc is the percentage form of Profit Before Interest and Taxes, i.e., PBIT.
11. PBT, as ROG_PBT_perc is the percentage form of Profit Before Tax, i.e., PBT.
12. PAT, as ROG_PAT_perc is the percentage form of Profit After Tax, i.e., PAT.
13. CP, as ROG_CP_perc is the percentage form of commercial paper, a short-term debt instrument used to
meet short-term liabilities, i.e., CP.
14. Revenue_earnings_in_forex, as ROG_Revenue_earnings_in_forex_perc is the percentage form of revenue
earned in foreign currency, i.e., Revenue_earnings_in_forex.
15. Revenue_expenses_in_forex, as ROG_Revenue_expenses_in_forex_perc is the percentage form of expenses
due to foreign currency transactions, i.e., Revenue_expenses_in_forex.
16. Market_Capitalisation, as ROG_Market_Capitalisation_perc is the percentage form of the product of the
total number of a company's outstanding shares and the current market price of one share, i.e.,
Market_Capitalisation.

In [10]:

Company.drop(['Co_Name','Networth','Gross_Block','Gross_Sales','Net_Sales','Cost_of_
'PBIDT','PBDT','PBIT','PBT','PAT','CP','Revenue_earnings_in_forex',
'Revenue_expenses_in_forex','Market_Capitalisation','Capital_Employed']

In [11]:

Company.head()

Out[11]:

Co_Code Networth_Next_Year Equity_Paid_Up Total_Debt Net_Working_Capital Current_Asse

0 16974 -8021.60 419.36 5,936.03 -1,076.34 40

1 21214 -3986.19 1,954.93 7,410.18 -1,098.88 486.

2 14852 -3192.58 53.84 6,944.54 4,496.25 9,097.

3 2439 -3054.51 157.3 2,326.05 -2,612.42 1,034.

4 23505 -2967.36 50.3 5,740.90 1,836.23 4,685.

5 rows × 51 columns

Checking Shape of Data after Dropping Columns:

In [12]:

Company.shape
print('The number of rows of the dataframe after dropping certain columns is',Compan
print('The number of columns of the dataframe after dropping certain columns is',Com

The number of rows of the dataframe after dropping certain columns is 3586 .

The number of columns of the dataframe after dropping certain columns is 51 .

Checking Duplicated Values


In [13]:

# Check for Duplicate Values

dups = Company.duplicated()

Company[dups]

Out[13]:

Co_Code Networth_Next_Year Equity_Paid_Up Total_Debt Net_Working_Capital Current_Asset

0 rows × 51 columns

Checking Missing or Null Values


In [14]:

Company.isnull().sum()

Out[14]:

Co_Code 0

Networth_Next_Year 0

Equity_Paid_Up 0

Total_Debt 0

Net_Working_Capital 0

Current_Assets 0

Current_Liabilities_and_Provisions 0

Total_Assets_by_Liabilities 0

Other_Income 0

Value_Of_Output 0

Selling_Cost 0

Adjusted_PAT 0

Capital_expenses_in_forex 0

Book_Value_Unit_Curr 0

Book_Value_Adj_Unit_Curr 4

CEPS_annualised_Unit_Curr 0

Cash_Flow_From_Operating_Activities 0

Cash_Flow_From_Investing_Activities 0

Cash_Flow_From_Financing_Activities 0

ROG_Net_Worth_perc 0

ROG_Capital_Employed_perc 0

ROG_Gross_Block_perc 0

ROG_Gross_Sales_perc 0

ROG_Net_Sales_perc 0

ROG_Cost_of_Production_perc 0

ROG_Total_Assets_perc 0

ROG_PBIDT_perc 0

ROG_PBDT_perc 0

ROG_PBIT_perc 0

ROG_PBT_perc 0

ROG_PAT_perc 0

ROG_CP_perc 0

ROG_Revenue_earnings_in_forex_perc 0

ROG_Revenue_expenses_in_forex_perc 0

ROG_Market_Capitalisation_perc 0

Current_Ratio_Latest 1

Fixed_Assets_Ratio_Latest 1

Inventory_Ratio_Latest 1

Debtors_Ratio_Latest 1

Total_Asset_Turnover_Ratio_Latest 1

Interest_Cover_Ratio_Latest 1

PBIDTM_perc_Latest 1

PBITM_perc_Latest 1

PBDTM_perc_Latest 1

CPM_perc_Latest 1

APATM_perc_Latest 1

Debtors_Velocity_Days 0

Creditors_Velocity_Days 0

Inventory_Velocity_Days 103

Value_of_Output_by_Total_Assets 0

Value_of_Output_by_Gross_Block 0

dtype: int64


In [15]:

Company.isnull().sum().sum()
print("Number of missing values in dataset is",Company.isnull().sum().sum())

Number of missing values in dataset is 118
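
The per-column listing above is long because most columns have no missing values. As a rough sketch, the count can be reduced to only the affected columns, together with the share of rows each one represents. The helper name `missing_summary` and the toy frame are assumptions for illustration; on the notebook's data one would pass `Company` instead.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return missing-value counts and percentages for columns that have any."""
    counts = df.isnull().sum()
    summary = pd.DataFrame(
        {"missing": counts, "percent": (counts / len(df) * 100).round(2)}
    )
    # Keep only columns with at least one missing value, largest first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy data standing in for the real dataset.
toy = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, np.nan],
        "b": [1, 2, 3, 4],
        "c": [np.nan, 1.0, 2.0, 3.0],
    }
)
report = missing_summary(toy)
print(report)
```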

In [16]:

Company.dtypes.value_counts()

Out[16]:

object 46

float64 4

int64 1

dtype: int64

In [17]:

Company.head()

Out[17]:

Co_Code Networth_Next_Year Equity_Paid_Up Total_Debt Net_Working_Capital Current_Asse

0 16974 -8021.60 419.36 5,936.03 -1,076.34 40

1 21214 -3986.19 1,954.93 7,410.18 -1,098.88 486.

2 14852 -3192.58 53.84 6,944.54 4,496.25 9,097.

3 2439 -3054.51 157.3 2,326.05 -2,612.42 1,034.

4 23505 -2967.36 50.3 5,740.90 1,836.23 4,685.

5 rows × 51 columns

Data Insights:

Data consists of both categorical (object-type) and numerical variables.


After dropping the mentioned columns, there are a total of 3586 rows and 51 columns in the dataset. Out of 51,
46 columns are of object type, 1 column is of integer type and the remaining 4 are of float type.
Data contains 118 missing or null values.
Data does not contain any duplicated rows.
Column "Networth_Next_Year" can be used to derive the labeled field of the company in the following year
(2016). Hence, we will create a "default" variable that should take:

- Value of 1 when net worth next year is negative

- Value of 0 when net worth next year is positive
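
The rule in the two bullets above can be sketched with `numpy.where`. The toy values below are made up for illustration. The text does not say how a net worth of exactly zero should be labeled; this sketch assumes it is treated as non-default (0), which is a choice made here and not something the notebook states.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; the values are illustrative only.
toy = pd.DataFrame({"Networth_Next_Year": [-8021.60, 19.02, 0.0, 725.05]})

# 1 when next year's net worth is negative, otherwise 0.
# Assumption: a net worth of exactly zero is labeled 0.
toy["default"] = np.where(toy["Networth_Next_Year"] < 0, 1, 0)

print(toy)
print(toy["default"].value_counts())
```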

'Networth_Next_Year' is the target variable and all others are predictor variables.
From the data entries it can be observed that the 46 object-type columns are numerical in nature.
Hence, we will convert these object data types to numerical and then check the descriptive statistics of the
data.
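
The object columns shown in the earlier outputs hold numbers written as text with thousands separators (for example `5,936.03` or `1,39,632.92`). One possible way to turn such text into numeric values is to strip the commas and call `pandas.to_numeric`. This is a sketch of an alternative and not the conversion the notebook goes on to use; the helper name `to_number` and the toy series are assumptions for illustration.

```python
import pandas as pd


def to_number(series: pd.Series) -> pd.Series:
    """Parse text such as '1,39,632.92' into floats; unparseable entries become NaN."""
    cleaned = series.astype(str).str.replace(",", "", regex=False).str.strip()
    return pd.to_numeric(cleaned, errors="coerce")


# Toy values in the same textual style as the dataset's object columns.
toy = pd.Series(["5,936.03", "-1,076.34", "1,39,632.92", "0", None])
parsed = to_number(toy)
print(parsed)
```

With this approach the resulting column keeps the original magnitudes and signs, so summary statistics and outlier rules operate on the actual financial values.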


In [18]:

## Recheck the unique values


for column in Company.columns:
    if Company[column].dtype == 'object':
        print(column.upper(),': ',Company[column].nunique())
        print(Company[column].value_counts().sort_values())
        print('\n')

0.06 14
0.01 14
0.05 15
0.02 17
0 48
Name: Net_Working_Capital, Length: 2699, dtype: int64

CURRENT_ASSETS : 2775

15,248.91 1

13.16 1

11.31 1

13.29 1

266.02 1

..

0.08 16

0.02 18

0.01 19

0.03 20

0 27


In [19]:

Company.columns

Out[19]:

Index(['Co_Code', 'Networth_Next_Year', 'Equity_Paid_Up', 'Total_Debt',
       'Net_Working_Capital', 'Current_Assets',
       'Current_Liabilities_and_Provisions', 'Total_Assets_by_Liabilities',
       'Other_Income', 'Value_Of_Output', 'Selling_Cost', 'Adjusted_PAT',
       'Capital_expenses_in_forex', 'Book_Value_Unit_Curr',
       'Book_Value_Adj_Unit_Curr', 'CEPS_annualised_Unit_Curr',
       'Cash_Flow_From_Operating_Activities',
       'Cash_Flow_From_Investing_Activities',
       'Cash_Flow_From_Financing_Activities', 'ROG_Net_Worth_perc',
       'ROG_Capital_Employed_perc', 'ROG_Gross_Block_perc',
       'ROG_Gross_Sales_perc', 'ROG_Net_Sales_perc',
       'ROG_Cost_of_Production_perc', 'ROG_Total_Assets_perc',
       'ROG_PBIDT_perc', 'ROG_PBDT_perc', 'ROG_PBIT_perc', 'ROG_PBT_perc',
       'ROG_PAT_perc', 'ROG_CP_perc', 'ROG_Revenue_earnings_in_forex_perc',
       'ROG_Revenue_expenses_in_forex_perc', 'ROG_Market_Capitalisation_perc',
       'Current_Ratio_Latest', 'Fixed_Assets_Ratio_Latest',
       'Inventory_Ratio_Latest', 'Debtors_Ratio_Latest',
       'Total_Asset_Turnover_Ratio_Latest', 'Interest_Cover_Ratio_Latest',
       'PBIDTM_perc_Latest', 'PBITM_perc_Latest', 'PBDTM_perc_Latest',
       'CPM_perc_Latest', 'APATM_perc_Latest', 'Debtors_Velocity_Days',
       'Creditors_Velocity_Days', 'Inventory_Velocity_Days',
       'Value_of_Output_by_Total_Assets', 'Value_of_Output_by_Gross_Block'],
      dtype='object')

Running a For loop to separate Categorical and Numerical Columns:


In [20]:

cat=[]
num=[]
for i in Company.columns:
    if Company[i].dtype=="object":
        cat.append(i)
    else:
        num.append(i)
print("Categorical Columns:",cat)

print("/")
print("Numerical Columns:",num)

Categorical Columns: ['Equity_Paid_Up', 'Total_Debt', 'Net_Working_Capital',
'Current_Assets', 'Current_Liabilities_and_Provisions',
'Total_Assets_by_Liabilities', 'Other_Income', 'Value_Of_Output',
'Selling_Cost', 'Adjusted_PAT', 'Capital_expenses_in_forex',
'Book_Value_Unit_Curr', 'Book_Value_Adj_Unit_Curr',
'CEPS_annualised_Unit_Curr', 'Cash_Flow_From_Operating_Activities',
'Cash_Flow_From_Investing_Activities', 'Cash_Flow_From_Financing_Activities',
'ROG_Net_Worth_perc', 'ROG_Capital_Employed_perc', 'ROG_Gross_Block_perc',
'ROG_Gross_Sales_perc', 'ROG_Net_Sales_perc', 'ROG_Cost_of_Production_perc',
'ROG_Total_Assets_perc', 'ROG_PBIDT_perc', 'ROG_PBDT_perc', 'ROG_PBIT_perc',
'ROG_PBT_perc', 'ROG_PAT_perc', 'ROG_CP_perc',
'ROG_Revenue_earnings_in_forex_perc', 'ROG_Revenue_expenses_in_forex_perc',
'ROG_Market_Capitalisation_perc', 'Current_Ratio_Latest',
'Fixed_Assets_Ratio_Latest', 'Inventory_Ratio_Latest', 'Debtors_Ratio_Latest',
'Interest_Cover_Ratio_Latest', 'PBIDTM_perc_Latest', 'PBITM_perc_Latest',
'PBDTM_perc_Latest', 'CPM_perc_Latest', 'APATM_perc_Latest',
'Debtors_Velocity_Days', 'Creditors_Velocity_Days',
'Value_of_Output_by_Gross_Block']

Numerical Columns: ['Co_Code', 'Networth_Next_Year',
'Total_Asset_Turnover_Ratio_Latest', 'Inventory_Velocity_Days',
'Value_of_Output_by_Total_Assets']

In [23]:

, 'Interest_Cover_Ratio_Latest', 'PBIDTM_perc_Latest', 'PBITM_perc_Latest', 'PBDTM_p

Converting Categorical Variables to Numerical Variables:


In [24]:

for feature in Company_X:
    if Company[feature].dtype == 'object':
        print('\n')
        print('feature:',feature)
        print(pd.Categorical(Company[feature].unique()))
        print(pd.Categorical(Company[feature].unique()).codes)
        Company[feature] = pd.Categorical(Company[feature]).codes

feature: Book_Value_Adj_Unit_Curr

['-167.58', '-15.18', '94.14', '-39.64', '-212.89', ..., '209.35', '247.39', '114.87', '69.99', '195.8']

Length: 2964

Categories (2963, object): ['-0.01', '-0.02', '-0.03', '-0.05', ..., '99.12', '99.77', '997.59', '999.22']

[ 116 102 2931 ... 705 2597 1276]

feature: CEPS_annualised_Unit_Curr

['-22.09', '-0.02', '-148.31', '-43.08', '-159.5', ..., '104.9', '41.75', '39.03', '17.93', '51.79']

Length: 1900

Categories (1900, object): ['-0.01', '-0.02', '-0.03', '-0.04', ..., '94.92', '96.53', '986.67', '995.65']

[ 257 1 188 ... 1367 907 1572]
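
When reading the output above, it helps to see what `pd.Categorical(...).codes` returns for numbers stored as text. The categories are sorted as strings (note `'99.77'` listed before `'997.59'` above), and each code is the position of a value in that sorted list, with missing values mapped to -1. The small sketch below uses made-up values to show this; it illustrates pandas behaviour and is not part of the notebook.

```python
import pandas as pd

# Made-up numeric strings, including one missing entry.
values = pd.Series(["9", "10", "-0.01", "100", None])

categorical = pd.Categorical(values)
codes = categorical.codes

# Categories are ordered lexicographically, not numerically.
print(list(categorical.categories))
# Each code is the index of the value in that string ordering; missing -> -1.
print(list(codes))
```

In this toy case `"9"` receives a larger code than `"100"`, so the codes rank the text rather than the magnitudes. The -1 code for missing entries is also consistent with the minimum of -1.00 reported for `Book_Value_Adj_Unit_Curr` in the `describe()` output later in the notebook.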

Checking Changed Dtype Information


In [25]:

Company.info()

<class 'pandas.core.frame.DataFrame'>

RangeIndex: 3586 entries, 0 to 3585

Data columns (total 51 columns):

# Column Non-Null Count Dtype

--- ------ -------------- -----

0 Co_Code 3586 non-null int64

1 Networth_Next_Year 3586 non-null float64

2 Equity_Paid_Up 3586 non-null int16

3 Total_Debt 3586 non-null int16

4 Net_Working_Capital 3586 non-null int16

5 Current_Assets 3586 non-null int16

6 Current_Liabilities_and_Provisions 3586 non-null int16

7 Total_Assets_by_Liabilities 3586 non-null int16

8 Other_Income 3586 non-null int16

9 Value_Of_Output 3586 non-null int16

10 Selling_Cost 3586 non-null int16

11 Adjusted_PAT 3586 non-null int16

12 Capital_expenses_in_forex 3586 non-null int16

13 Book_Value_Unit_Curr 3586 non-null int16

14 Book_Value_Adj_Unit_Curr 3586 non-null int16

15 CEPS_annualised_Unit_Curr 3586 non-null int16

16 Cash_Flow_From_Operating_Activities 3586 non-null int16

17 Cash_Flow_From_Investing_Activities 3586 non-null int16

18 Cash_Flow_From_Financing_Activities 3586 non-null int16

19 ROG_Net_Worth_perc 3586 non-null int16

20 ROG_Capital_Employed_perc 3586 non-null int16

21 ROG_Gross_Block_perc 3586 non-null int16

22 ROG_Gross_Sales_perc 3586 non-null int16

23 ROG_Net_Sales_perc 3586 non-null int16

24 ROG_Cost_of_Production_perc 3586 non-null int16

25 ROG_Total_Assets_perc 3586 non-null int16

26 ROG_PBIDT_perc 3586 non-null int16

27 ROG_PBDT_perc 3586 non-null int16

28 ROG_PBIT_perc 3586 non-null int16

29 ROG_PBT_perc 3586 non-null int16

30 ROG_PAT_perc 3586 non-null int16

31 ROG_CP_perc 3586 non-null int16

32 ROG_Revenue_earnings_in_forex_perc 3586 non-null int16

33 ROG_Revenue_expenses_in_forex_perc 3586 non-null int16

34 ROG_Market_Capitalisation_perc 3586 non-null int16

35 Current_Ratio_Latest 3586 non-null int16

36 Fixed_Assets_Ratio_Latest 3586 non-null int16

37 Inventory_Ratio_Latest 3586 non-null int16

38 Debtors_Ratio_Latest 3586 non-null int16

39 Total_Asset_Turnover_Ratio_Latest 3585 non-null float64

40 Interest_Cover_Ratio_Latest 3586 non-null int16

41 PBIDTM_perc_Latest 3586 non-null int16

42 PBITM_perc_Latest 3586 non-null int16

43 PBDTM_perc_Latest 3586 non-null int16

44 CPM_perc_Latest 3586 non-null int16

45 APATM_perc_Latest 3586 non-null int16

46 Debtors_Velocity_Days 3586 non-null int16

47 Creditors_Velocity_Days 3586 non-null int16

48 Inventory_Velocity_Days 3483 non-null float64

49 Value_of_Output_by_Total_Assets 3586 non-null float64

50 Value_of_Output_by_Gross_Block 3586 non-null int16

dtypes: float64(4), int16(46), int64(1)

memory usage: 462.4 KB

In [26]:

Company.dtypes.value_counts()

Out[26]:

int16 46

float64 4

int64 1

dtype: int64


In [27]:

round(Company.describe(),2).T

Out[27]:

count mean std min 25% 50%

Co_Code 3586.0 16065.39 19776.82 4.00 3029.25 6077.50 24

Networth_Next_Year 3586.0 725.05 4769.68 -8021.60 3.98 19.02

Equity_Paid_Up 3586.0 963.22 604.30 0.00 399.25 1058.00

Total_Debt 3586.0 716.66 704.02 0.00 5.00 546.00

Net_Working_Capital 3586.0 1241.80 788.90 0.00 484.25 1205.50

Current_Assets 3586.0 1227.19 859.12 0.00 417.25 1193.00

Current_Liabilities_and_Provisions 3586.0 838.92 737.16 0.00 76.25 740.50

Total_Assets_by_Liabilities 3586.0 1543.59 918.59 0.00 747.00 1561.50 2

Other_Income 3586.0 237.34 320.10 0.00 10.00 53.00

Value_Of_Output 3586.0 1060.58 851.34 0.00 193.25 984.00

Selling_Cost 3586.0 218.16 326.97 0.00 0.00 16.00

Adjusted_PAT 3586.0 725.19 486.18 0.00 429.25 634.00

Capital_expenses_in_forex 3586.0 38.41 103.54 0.00 0.00 0.00

Book_Value_Unit_Curr 3586.0 1475.19 876.21 0.00 677.00 1441.50 2

Book_Value_Adj_Unit_Curr 3586.0 1439.54 859.66 -1.00 660.25 1397.50 2

CEPS_annualised_Unit_Curr 3586.0 766.75 526.91 0.00 464.00 582.00

Cash_Flow_From_Operating_Activities 3586.0 853.48 617.21 0.00 355.25 703.00

Cash_Flow_From_Investing_Activities 3586.0 830.13 534.97 0.00 271.25 1027.50

Cash_Flow_From_Financing_Activities 3586.0 926.98 562.65 0.00 425.25 1200.00

ROG_Net_Worth_perc 3586.0 1193.52 686.45 0.00 693.25 1083.50

ROG_Capital_Employed_perc 3586.0 1203.52 714.62 0.00 637.25 1114.50

ROG_Gross_Block_perc 3586.0 784.95 464.85 0.00 556.00 580.00

ROG_Gross_Sales_perc 3586.0 1283.22 734.54 0.00 747.25 1144.00

ROG_Net_Sales_perc 3586.0 1279.97 732.60 0.00 748.25 1138.50

ROG_Cost_of_Production_perc 3586.0 1291.87 730.64 0.00 740.25 1177.50

ROG_Total_Assets_perc 3586.0 1237.13 736.45 0.00 631.25 1154.00

ROG_PBIDT_perc 3586.0 1337.94 750.91 0.00 743.00 1245.00

ROG_PBDT_perc 3586.0 1345.10 752.49 0.00 745.25 1252.50

ROG_PBIT_perc 3586.0 1342.16 745.57 0.00 756.25 1247.00

ROG_PBT_perc 3586.0 1312.40 734.64 0.00 721.25 1209.50

ROG_PAT_perc 3586.0 1287.95 715.27 0.00 726.25 1180.00

ROG_CP_perc 3586.0 1331.98 748.07 0.00 739.25 1243.00

ROG_Revenue_earnings_in_forex_perc 3586.0 565.15 215.06 0.00 571.00 571.00

ROG_Revenue_expenses_in_forex_perc 3586.0 652.95 279.29 0.00 644.00 644.00

ROG_Market_Capitalisation_perc 3586.0 865.03 515.11 0.00 601.00 601.00

Current_Ratio_Latest 3586.0 249.97 249.97 -1.00 88.00 136.00

Fixed_Assets_Ratio_Latest 3586.0 328.16 352.03 -1.00 27.00 164.50

Inventory_Ratio_Latest 3586.0 514.77 504.85 -1.00 0.00 401.50

Debtors_Ratio_Latest 3586.0 574.38 491.33 -1.00 39.25 571.00

Total_Asset_Turnover_Ratio_Latest 3585.0 1.24 2.67 0.00 0.07 0.60

Interest_Cover_Ratio_Latest 3586.0 583.88 344.73 -1.00 372.00 471.00

PBIDTM_perc_Latest 3586.0 1125.01 675.97 -1.00 453.00 1059.50

PBITM_perc_Latest 3586.0 1131.02 642.01 -1.00 575.00 1078.50

PBDTM_perc_Latest 3586.0 1144.84 645.67 -1.00 619.00 1072.50

CPM_perc_Latest 3586.0 1086.45 602.02 -1.00 608.00 1016.00

APATM_perc_Latest 3586.0 1046.48 545.05 -1.00 754.00 911.50

Debtors_Velocity_Days 3586.0 249.99 194.35 0.00 60.25 255.50

Creditors_Velocity_Days 3586.0 227.90 172.04 0.00 59.00 237.00

Inventory_Velocity_Days 3483.0 79.64 137.85 -199.00 0.00 35.00

Value_of_Output_by_Total_Assets 3586.0 0.82 1.20 -0.33 0.07 0.48

Value_of_Output_by_Gross_Block 3586.0 346.93 353.00 0.00 46.00 181.50


In [28]:

continuous=Company.dtypes[(Company.dtypes=='int64')|(Company.dtypes=='float64')|(Com
data_plot=Company[continuous]

data_plot.boxplot(figsize=(20,10));
plt.xlabel("Continuous Variables")
plt.ylabel("Density")
plt.title("Figure: Boxplot of Continuous Data")

Out[28]:

Text(0.5, 1.0, 'Figure: Boxplot of Continuous Data')

Noticeably, there are outliers present in the data set. To confirm this, we will detect the outliers
and decide how they should be treated.

Outliers are detected using the IQR method, which defines a new range, called the decision range; any
data point lying outside this range is considered an outlier. The range is given below:

IQR = Q3 − Q1

Lower Bound = Q1 − 1.5*IQR

Upper Bound = Q3 + 1.5*IQR
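
As a small worked instance of the bounds above, the sketch below computes them for a five-value toy series using pandas' default (linear) quantile interpolation. The helper name `iqr_bounds`, the parameter `k`, and the toy data are assumptions for illustration.

```python
import pandas as pd


def iqr_bounds(series: pd.Series, k: float = 1.5):
    """Return (lower, upper) decision bounds: Q1 - k*IQR and Q3 + k*IQR."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


toy = pd.Series([1, 2, 3, 4, 100])
lower, upper = iqr_bounds(toy)

# For this series Q1 = 2 and Q3 = 4, so IQR = 2 and the bounds are -1 and 7.
outliers = toy[(toy < lower) | (toy > upper)]
print(lower, upper)
print(outliers.tolist())
```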

In [29]:

Q1 = Company.quantile(0.25)
Q3 = Company.quantile(0.75)
IQR = Q3 - Q1
UL = Q3 + 1.5*IQR
LL = Q1 - 1.5*IQR


In [30]:

((Company> UL)|(Company< LL)).sum()

Out[30]:

Co_Code 291

Networth_Next_Year 676

Equity_Paid_Up 0

Total_Debt 0

Net_Working_Capital 0

Current_Assets 0

Current_Liabilities_and_Provisions 0

Total_Assets_by_Liabilities 0

Other_Income 79

Value_Of_Output 0

Selling_Cost 168

Adjusted_PAT 0

Capital_expenses_in_forex 694

Book_Value_Unit_Curr 0

Book_Value_Adj_Unit_Curr 0

CEPS_annualised_Unit_Curr 0

Cash_Flow_From_Operating_Activities 0

Cash_Flow_From_Investing_Activities 0

Cash_Flow_From_Financing_Activities 0

ROG_Net_Worth_perc 0

ROG_Capital_Employed_perc 0

ROG_Gross_Block_perc 0

ROG_Gross_Sales_perc 0

ROG_Net_Sales_perc 0

ROG_Cost_of_Production_perc 0

ROG_Total_Assets_perc 0

ROG_PBIDT_perc 0

ROG_PBDT_perc 0

ROG_PBIT_perc 0

ROG_PBT_perc 0

ROG_PAT_perc 0

ROG_CP_perc 0

ROG_Revenue_earnings_in_forex_perc 1317

ROG_Revenue_expenses_in_forex_perc 1615

ROG_Market_Capitalisation_perc 0

Current_Ratio_Latest 160

Fixed_Assets_Ratio_Latest 0

Inventory_Ratio_Latest 0

Debtors_Ratio_Latest 0

Total_Asset_Turnover_Ratio_Latest 201

Interest_Cover_Ratio_Latest 0

PBIDTM_perc_Latest 0

PBITM_perc_Latest 0

PBDTM_perc_Latest 0

CPM_perc_Latest 0

APATM_perc_Latest 0

Debtors_Velocity_Days 0

Creditors_Velocity_Days 0

Inventory_Velocity_Days 262

Value_of_Output_by_Total_Assets 150

Value_of_Output_by_Gross_Block 0

dtype: int64


In [31]:

# Replace outliers with NaN values

Company[((Company> UL) | (Company< LL))]= np.nan


In [32]:

Company.isnull().sum()

Out[32]:

Co_Code 291

Networth_Next_Year 676

Equity_Paid_Up 0

Total_Debt 0

Net_Working_Capital 0

Current_Assets 0

Current_Liabilities_and_Provisions 0

Total_Assets_by_Liabilities 0

Other_Income 79

Value_Of_Output 0

Selling_Cost 168

Adjusted_PAT 0

Capital_expenses_in_forex 694

Book_Value_Unit_Curr 0

Book_Value_Adj_Unit_Curr 0

CEPS_annualised_Unit_Curr 0

Cash_Flow_From_Operating_Activities 0

Cash_Flow_From_Investing_Activities 0

Cash_Flow_From_Financing_Activities 0

ROG_Net_Worth_perc 0

ROG_Capital_Employed_perc 0

ROG_Gross_Block_perc 0

ROG_Gross_Sales_perc 0

ROG_Net_Sales_perc 0

ROG_Cost_of_Production_perc 0

ROG_Total_Assets_perc 0

ROG_PBIDT_perc 0

ROG_PBDT_perc 0

ROG_PBIT_perc 0

ROG_PBT_perc 0

ROG_PAT_perc 0

ROG_CP_perc 0

ROG_Revenue_earnings_in_forex_perc 1317

ROG_Revenue_expenses_in_forex_perc 1615

ROG_Market_Capitalisation_perc 0

Current_Ratio_Latest 160

Fixed_Assets_Ratio_Latest 0

Inventory_Ratio_Latest 0

Debtors_Ratio_Latest 0

Total_Asset_Turnover_Ratio_Latest 202

Interest_Cover_Ratio_Latest 0

PBIDTM_perc_Latest 0

PBITM_perc_Latest 0

PBDTM_perc_Latest 0

CPM_perc_Latest 0

APATM_perc_Latest 0

Debtors_Velocity_Days 0

Creditors_Velocity_Days 0

Inventory_Velocity_Days 365

Value_of_Output_by_Total_Assets 150

Value_of_Output_by_Gross_Block 0

dtype: int64


In [33]:

Company.isnull().sum().sum()
print("Number of missing values after replacing outliers with Nan values is",Company.isnull().sum().sum())

Number of missing values after replacing outliers with Nan values is 5717

In [34]:

Company.shape

print('The number of rows of the temporary dataframe created is',Company.shape[0],'


print('The number of columns of the temporary dataframe created is',Company.shape[1]

The number of rows of the temporary dataframe created is 3586 .

The number of columns of the temporary dataframe created is 51 .

The data had very few missing or null values to begin with, and roughly 1.6% of the data are outliers.

Because outliers are converted to missing values here, the total number of missing values after the conversion
is 5717 (total number of outliers + total number of pre-existing missing values).

Note: Before converting outliers to NaN values, the number of missing values present in the dataset was
118.

1.2 Missing Value Treatment

Visualizing Missing Values:


In [35]:

plt.figure(figsize = (12,8))
sns.heatmap(Company.isnull(), cbar = False, cmap = 'coolwarm', yticklabels = False)
plt.show()


A noticeable presence of missing values can be observed in some variables. Blue in the heatmap
indicates occupied cells, while red indicates missing values present in the data. A few
observations:

For variable "Networth_Next_Year" some values might be completely missing.


Most missing values are in variable "ROG-Revenue expenses in forex (%)", followed by "Revenue
expenses in forex" (which is expected, since ROG is the percentage representation of the revenue values).
Some missing values can also be observed in variables "Inventory Velocity (Days)", "Debtors
Ratio[Latest]", "ROG-Market Capitalisation (%)", "Capital_expenses_in_forex", "Selling_cost" and
"Other_Income".

Typically, if less than 30% of a column is missing and each row is at least 90% complete, we do not
drop the data. Here, we will first check the completeness of the data and then decide which
technique to use going forward.

In order to check the completeness of data at row level, we will look at total number of missing values in each
row.

Note: To find total number of missing values in each row , we will set axis as 1.

Since each row represents a company whose data we want to quantify, we choose to impute the
missing values instead of dropping them.

We target companies whose rows are at least 90% complete, i.e. we keep companies with 5 or
fewer missing values (out of 51 columns), to identify the reliable data up to this
point.

After filtering out these values shape of our data changes (before filtering; number of rows= 3586) to :

The number of rows of the temporary dataframe created is 3569 .

The number of columns of the temporary dataframe created is 51 .

This indicates that most of our data is still available.

Note: We have created a temporary dataframe that keeps only companies with at most 5 missing
values.
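The row-level completeness check can be sketched on a small made-up frame. The column names and the threshold of 1 are assumptions for illustration only (the notebook uses a threshold of 5 on 51 columns). The sketch counts missing values per row with axis=1 and also shows pandas' dropna(thresh=...), which keeps rows with at least that many non-null values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with 4 columns.
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
    "c": [7.0, np.nan, 9.0],
    "d": [1.0, 2.0, 3.0],
})

max_missing = 1  # illustrative threshold, not the notebook's value

# axis=1 sums across columns, giving one count per row.
missing_per_row = toy.isnull().sum(axis=1)
kept = toy[missing_per_row <= max_missing]

# Same filter expressed as "at least (n_columns - max_missing) non-null values".
kept_alt = toy.dropna(thresh=toy.shape[1] - max_missing)

print(missing_per_row.tolist())
print(kept.index.tolist())
```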

In [36]:

Company_temp = Company[Company.isnull().sum(axis = 1) <= 5]

In [37]:

Company_temp.shape

Out[37]:

(3569, 51)


In [38]:

Company.isnull().sum().sort_values(ascending = False)/Company.index.size

Out[38]:

ROG_Revenue_expenses_in_forex_perc 0.450363

ROG_Revenue_earnings_in_forex_perc 0.367262

Capital_expenses_in_forex 0.193530

Networth_Next_Year 0.188511

Inventory_Velocity_Days 0.101785

Co_Code 0.081149

Total_Asset_Turnover_Ratio_Latest 0.056330

Selling_Cost 0.046849

Current_Ratio_Latest 0.044618

Value_of_Output_by_Total_Assets 0.041829

Other_Income 0.022030

Cash_Flow_From_Financing_Activities 0.000000

Cash_Flow_From_Investing_Activities 0.000000

Cash_Flow_From_Operating_Activities 0.000000

Book_Value_Adj_Unit_Curr 0.000000

Book_Value_Unit_Curr 0.000000

ROG_Net_Worth_perc 0.000000

CEPS_annualised_Unit_Curr 0.000000

Adjusted_PAT 0.000000

ROG_Gross_Block_perc 0.000000

Value_Of_Output 0.000000

Total_Assets_by_Liabilities 0.000000

Current_Liabilities_and_Provisions 0.000000

Current_Assets 0.000000

Net_Working_Capital 0.000000

Total_Debt 0.000000

Equity_Paid_Up 0.000000

ROG_Capital_Employed_perc 0.000000

Value_of_Output_by_Gross_Block 0.000000

ROG_Gross_Sales_perc 0.000000

ROG_Net_Sales_perc 0.000000

Creditors_Velocity_Days 0.000000

Debtors_Velocity_Days 0.000000

APATM_perc_Latest 0.000000

CPM_perc_Latest 0.000000

PBDTM_perc_Latest 0.000000

PBITM_perc_Latest 0.000000

PBIDTM_perc_Latest 0.000000

Interest_Cover_Ratio_Latest 0.000000

Debtors_Ratio_Latest 0.000000

Inventory_Ratio_Latest 0.000000

Fixed_Assets_Ratio_Latest 0.000000

ROG_Market_Capitalisation_perc 0.000000

ROG_CP_perc 0.000000

ROG_PAT_perc 0.000000

ROG_PBT_perc 0.000000

ROG_PBIT_perc 0.000000

ROG_PBDT_perc 0.000000

ROG_PBIDT_perc 0.000000

ROG_Cost_of_Production_perc 0.000000

ROG_Total_Assets_perc 0.000000

dtype: float64

Dropping columns with more than 30% missing values


We sort the proportion of missing values, computed by dividing the number of missing values in each column
by the number of rows. We will eliminate any column where this proportion is more than 30%. Noticeably,
"ROG_Revenue_expenses_in_forex_perc" and "ROG_Revenue_earnings_in_forex_perc" are the only two
variables above 30%. Therefore, we can eliminate these columns.
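As an illustration of the 30% rule, the columns to drop can be selected programmatically rather than typed by hand. This is a sketch on a hypothetical toy frame; the column names are invented for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 60%, 20% and 0% missing respectively.
toy = pd.DataFrame({
    "mostly_missing": [np.nan, np.nan, 1.0, np.nan, 2.0],
    "some_missing": [1.0, np.nan, 3.0, 4.0, 5.0],
    "complete": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# The mean of a boolean mask is the share of missing values per column.
missing_share = toy.isnull().mean()

threshold = 0.30
cols_to_drop = missing_share[missing_share > threshold].index.tolist()
reduced = toy.drop(columns=cols_to_drop)

print(cols_to_drop)
print(reduced.columns.tolist())
```

Selecting by threshold keeps the rule and the dropped columns in sync if the data changes.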

In [39]:

Company_sub1 = Company.drop(['ROG_Revenue_expenses_in_forex_perc','ROG_Revenue_earnings_in_forex_perc'],
                            axis = 1)

In [40]:

Company_sub1.shape
print('The number of rows after dropping columns with more then 30% missing values i
print('The number of columns after dropping columns with more then 30% missing value

The number of rows after dropping columns with more then 30% missing values is 3586 .

The number of columns after dropping columns with more then 30% missing values is 49 .

The missing values are numeric. Hence, they can be imputed using the KNNImputer class from the
impute module of sklearn. This imputer uses the k-Nearest Neighbors method to replace the
missing values in the dataset, finding the nearest neighbors with a Euclidean distance metric
that ignores missing coordinates.

Another critical point is that the KNN Imputer is a distance-based imputation method, so it requires us to
bring our data to a common scale. Otherwise, the different scales of our variables will lead the KNN Imputer to
generate biased replacements for the missing values. Here, we will use Scikit-Learn's StandardScaler, which
standardizes each variable to zero mean and unit variance (it does not restrict values to the range 0 to 1;
that would be MinMaxScaler).

It is recommended that the data be split into response and predictor variables before scaling.

Imputation is done by estimating each missing value from the values of the 10 nearest neighbors for the same
variable, so that all missing values are replaced based on their nearest neighbors' values.
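The scale-then-impute idea can be sketched on a small made-up frame. The column names, values and n_neighbors=2 are assumptions for illustration (the notebook uses 10 neighbors). The sketch relies on StandardScaler ignoring NaNs when fitting and passing them through on transform, so scaling can happen before imputation.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame with two columns on very different scales.
toy = pd.DataFrame({
    "small_scale": [0.1, 0.2, np.nan, 0.4, 0.5, 0.6],
    "large_scale": [1000.0, 2000.0, 3000.0, np.nan, 5000.0, 6000.0],
})

# Standardize to zero mean and unit variance; NaNs stay NaN.
scaler = StandardScaler()
scaled = pd.DataFrame(scaler.fit_transform(toy),
                      columns=toy.columns, index=toy.index)

# Fill each missing cell with the mean of that column over the 2 nearest rows.
imputer = KNNImputer(n_neighbors=2)
imputed = pd.DataFrame(imputer.fit_transform(scaled),
                       columns=toy.columns, index=toy.index)

print(scaled.round(3))
print(imputed.round(3))
```

Standardized values are centered on 0 and can be negative, which is why the output is not confined to the range 0 to 1.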

Segregate the predictors and response

In [41]:

predictors = Company_sub1.drop('Networth_Next_Year', axis = 1)


response = Company_sub1['Networth_Next_Year']

Scale the predictors

In [42]:

from sklearn.preprocessing import StandardScaler


scaler = StandardScaler()
scaled_predictors = pd.DataFrame(scaler.fit_transform(predictors), columns = predict

In [43]:

Company_sub2 = pd.concat([scaled_predictors, response], axis = 1)


Imputing the remaining missing values

In [44]:

from sklearn.impute import KNNImputer

In [45]:

imputer = KNNImputer(n_neighbors=10)

In [46]:

Company_imputed = pd.DataFrame(imputer.fit_transform(Company_sub2), columns = Compan


In [47]:

Company_imputed.isnull().sum()

Out[47]:

Co_Code 0

Equity_Paid_Up 0

Total_Debt 0

Net_Working_Capital 0

Current_Assets 0

Current_Liabilities_and_Provisions 0

Total_Assets_by_Liabilities 0

Other_Income 0

Value_Of_Output 0

Selling_Cost 0

Adjusted_PAT 0

Capital_expenses_in_forex 0

Book_Value_Unit_Curr 0

Book_Value_Adj_Unit_Curr 0

CEPS_annualised_Unit_Curr 0

Cash_Flow_From_Operating_Activities 0

Cash_Flow_From_Investing_Activities 0

Cash_Flow_From_Financing_Activities 0

ROG_Net_Worth_perc 0

ROG_Capital_Employed_perc 0

ROG_Gross_Block_perc 0

ROG_Gross_Sales_perc 0

ROG_Net_Sales_perc 0

ROG_Cost_of_Production_perc 0

ROG_Total_Assets_perc 0

ROG_PBIDT_perc 0

ROG_PBDT_perc 0

ROG_PBIT_perc 0

ROG_PBT_perc 0

ROG_PAT_perc 0

ROG_CP_perc 0

ROG_Market_Capitalisation_perc 0

Current_Ratio_Latest 0

Fixed_Assets_Ratio_Latest 0

Inventory_Ratio_Latest 0

Debtors_Ratio_Latest 0

Total_Asset_Turnover_Ratio_Latest 0

Interest_Cover_Ratio_Latest 0

PBIDTM_perc_Latest 0

PBITM_perc_Latest 0

PBDTM_perc_Latest 0

CPM_perc_Latest 0

APATM_perc_Latest 0

Debtors_Velocity_Days 0

Creditors_Velocity_Days 0

Inventory_Velocity_Days 0

Value_of_Output_by_Total_Assets 0

Value_of_Output_by_Gross_Block 0

Networth_Next_Year 0

dtype: int64

Noticeably, missing values have been treated now.


1.3 Transform Target variable into 0 and 1

There is no target variable defined, but since the objective is to build a model that helps an investor decide which
company to invest in, the variable Networth_Next_Year can be transformed into the target variable (as
mentioned in the rubric as well).

We will now create a default variable that takes the following values:

1 when net worth next year is negative and 0 when net worth next year is positive.

If the company's Networth_Next_Year is positive, the company is expected to keep returning a good
investment for the investor and is coded as 0 (i.e., Non-Default).
If the company's Networth_Next_Year is negative, the company is unlikely to return a good
investment to the investor and is coded as 1 (i.e., Default).

Hence, creating a binary target variable using 'Networth_Next_Year'
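The coding rule can be illustrated on a few made-up net worth values (hypothetical, not rows from the dataset). Note that with a strict "> 0" test, a net worth of exactly zero is coded as 1; that boundary choice is an assumption of this sketch.

```python
import numpy as np
import pandas as pd

# Hypothetical net worth values, including a zero to show the boundary case.
toy = pd.DataFrame({"Networth_Next_Year": [-6.2, 43.9, 0.0, 8.5, -27.6]})

# 0 = Non-Default (positive net worth), 1 = Default (otherwise).
toy["default"] = np.where(toy["Networth_Next_Year"] > 0, 0, 1)

# Share of each class.
class_share = toy["default"].value_counts(normalize=True)

print(toy)
print(class_share)
```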

Creating a binary target variable using 'Networth_Next_Year'

In [48]:

Company_imputed['default'] = np.where((Company_imputed['Networth_Next_Year'] > 0), 0, 1)

Checking top 10 rows

In [49]:

Company_imputed[['default','Networth_Next_Year']].head(10)

Out[49]:

default Networth_Next_Year

0 1 -6.218

1 1 -23.782

2 0 43.906

3 1 -23.723

4 1 -12.392

5 1 -13.211

6 1 -7.314

7 0 8.508

8 1 -27.635

9 0 35.004


In [50]:

Company_imputed['default'].value_counts()

Out[50]:

0 3225

1 361

Name: default, dtype: int64

Checking proportion of default

In [51]:

Company_imputed['default'].value_counts(normalize = True)

Out[51]:

0 0.899331

1 0.100669

Name: default, dtype: float64

Noticeably, approximately 10% of the companies in the dataset are labeled as defaulters, and these are the
companies in which investors should probably avoid investing.

1.4 Univariate (4 marks) & Bivariate (6 marks) analysis with proper interpretation. (You may choose to include only those variables which were significant in the model building)

Univariate Analysis:

In [52]:

def univariateAnalysis_numeric(column, nbins):
    # Summary statistics for the column
    print("Description of " + column)
    print("-" * 76)
    print(Company_imputed[column].describe(), end=' ')

    # Histogram; histplot replaces the deprecated distplot and uses nbins
    plt.figure()
    print("Distribution of " + column)
    print("-" * 76)
    sns.histplot(Company_imputed[column], bins=nbins, kde=False, color='skyblue')
    plt.show()

    # Boxplot to show the spread and any remaining extreme values
    plt.figure()
    print("BoxPlot of " + column)
    print("-" * 76)
    ax = sns.boxplot(x=Company_imputed[column], color='b')
    plt.show()


In [53]:

Company_imputed_imp_features=pd.DataFrame(Company_imputed,columns=['Net_Working_Capital','Book_Value_Unit_Curr',
                     'ROG_Net_Worth_perc','ROG_Capital_Employed_perc','ROG_Total_Assets_perc','Current_Ratio_Latest',
                     'Fixed_Assets_Ratio_Latest','Inventory_Ratio_Latest','Debtors_Ratio_Latest',
                     'Total_Asset_Turnover_Ratio_Latest','Interest_Cover_Ratio_Latest',
                     'ROG_Market_Capitalisation_perc', 'ROG_Cost_of_Production_perc'])

In [54]:

Company_num = Company_imputed_imp_features.select_dtypes(include = ['int64','int16',


Company_cat=Company_imputed_imp_features.select_dtypes(["object"])
Categorical_column_list=list(Company_cat.columns.values)
Numerical_column_list = list(Company_num.columns.values)
Numerical_length=len(Numerical_column_list)
Categorical_length=len(Categorical_column_list)
print("Length of Numerical columns is :",Numerical_length)
print("Length of Categorical columns is :",Categorical_length)

Length of Numerical columns is : 13

Length of Categorical columns is : 0

In [55]:

for x in Numerical_column_list:
    univariateAnalysis_numeric(x,20)
pd.options.display.float_format = '{:.3f}'.format

[Output: summary statistics, distribution plot and boxplot for each column in Numerical_column_list, e.g. Book_Value_Unit_Curr]

Insights from Univariate Analysis:


Noticeably, even though we have treated outliers, some of the variables still indicate the presence of
outliers.
50% of the time, Equity paid up, i.e., the amount received by the company through the issue of
shares to the shareholders, is positive.
50% of the time, the company is in debt.


The majority of the time, i.e. 75% of the time, the company's net working capital, current assets, current
liabilities, total assets by liabilities, other income, value of output, selling cost, adjusted PAT,
Book_Value_Unit_Curr, Book_Value_Adj_Unit_Curr, CEPS_annualised_Unit_Curr,
Cash_Flow_From_Operating_Activities, Cash_Flow_From_Investing_Activities etc. are positive.
Companies are currently not making long-term investments in forex. They could
consider funding long-term forex investments to generate higher revenues.
Since companies are not investing in forex, most of the values are 0. Therefore, the boxplot for variable
Capital_expenses_in_forex is a line.
For variable "Inventory_Velocity_Days", the boxplot has just one whisker, due to the extreme skewness
of the data; there is also no value smaller than the median.

In [56]:

Numerical_column_list = list(Company_num.columns.values)
Numerical_column_list

Out[56]:

['Net_Working_Capital',

'Book_Value_Unit_Curr',

'ROG_Net_Worth_perc',

'ROG_Capital_Employed_perc',

'ROG_Total_Assets_perc',

'Current_Ratio_Latest',

'Fixed_Assets_Ratio_Latest',

'Inventory_Ratio_Latest',

'Debtors_Ratio_Latest',

'Total_Asset_Turnover_Ratio_Latest',

'Interest_Cover_Ratio_Latest',

'ROG_Market_Capitalisation_perc',

'ROG_Cost_of_Production_perc']

Bivariate/Multivariate Analysis:

Countplot of Target Variable:


In [57]:

# Count plot of the target variable 'default'.

sns.catplot('default', data=Company_imputed, kind='count',aspect=1.5, palette='mako'


plt.title("Figure: Countplot of Target Variable Default")

Out[57]:

Text(0.5, 1.0, 'Figure: Countplot of Target Variable Default')

The data has more Non-default companies, i.e., companies which are expected to have a positive Net
Worth next year (which is good for investors' decision making).

The strength of a company's balance sheet can be evaluated using the parameters listed below:

['Net_Working_Capital', 'Book_Value_Unit_Curr', 'ROG-Net_Worth_perc', 'ROG-Capital_Employed_perc',
'ROG-Total_Assets_perc', 'Current_Ratio[Latest]', 'Fixed_Assets_Ratio[Latest]',
'Inventory_Ratio[Latest]', 'Debtors_Ratio[Latest]', 'Total_Asset_Turnover_Ratio[Latest]',
'Interest_Cover_Ratio[Latest]', 'ROG-Market_Capitalisation_perc', 'ROG-Cost_of_Production_perc']

1. Net Working Capital: Measures a company's liquidity and short-term financial health. A company will have
negative NWC if its ratio of current assets to current liabilities is less than one.
2. Book Value (Unit Curr): A high book value per share (due to profits accumulated over the years) indicates a
strong company.
3. ROG-Net Worth (%): Companies with a low capital base (that don't need additional capital for growth) will
show a higher ratio.
4. ROG-Capital Employed (%): Captures the profit generated on total capital employed (including
debt). Companies with a low capital base (those that don't need additional capital for growth) will display a
higher ratio.
5. ROG-Total Assets (%): Captures the net profit generated on total assets.
6. Current Ratio[Latest]: Compares current assets with current liabilities. It helps us gauge the short-term
financial strength of a company.
7. Fixed Assets Ratio[Latest]: Reveals how efficient a company is at generating sales from its existing fixed
assets.
8. Inventory Ratio[Latest]: Shows how efficiently the company manages its inventory.
9. Debtors Ratio[Latest]: Shows how efficiently the company collects payments from its debtors
(receivables).
10. Total Asset Turnover Ratio[Latest]: Shows how efficiently the company manages its total assets.
11. Interest Cover Ratio[Latest]: Measures a company's ability to handle its outstanding debt.
12. ROG-Market Capitalisation (%): Company's worth as determined by the stock market.
13. ROG_Cost_of_Production_perc: Product costing is the process of tracking and studying all the various
expenses that are accrued in the production and sale of a product.


In [58]:

plt.figure(figsize=(25,10))
sns.boxplot(data=Company_imputed_imp_features)
plt.xlabel("Variables")
plt.xticks(rotation=90)
plt.ylabel("Value")
plt.title('Figure:Boxplot of few important features')

Out[58]:

Text(0.5, 1.0, 'Figure:Boxplot of few important features')

Insights:

Variables 'Current_Ratio[Latest]' and 'Total_Asset_Turnover_Ratio[Latest]' still have some extreme values.

Due to the extreme skewness of the data, variable Inventory ratio does not have a lower whisker.

Distribution Plot of Important Features:


In [59]:

# plotting multiple density plot

Company_imputed_imp_features.plot.kde(figsize = (20,10),
linewidth = 4)

Out[59]:

<AxesSubplot:ylabel='Density'>

In [60]:

# Skewness of Data

Company_imputed_imp_features.skew(axis = 0, skipna = True).sort_values(ascending=False)

Out[60]:

Current_Ratio_Latest 1.275

Total_Asset_Turnover_Ratio_Latest 1.075

Fixed_Assets_Ratio_Latest 0.889

ROG_Market_Capitalisation_perc 0.812

Interest_Cover_Ratio_Latest 0.739

Inventory_Ratio_Latest 0.405

Debtors_Ratio_Latest 0.229

Net_Working_Capital 0.175

ROG_Cost_of_Production_perc 0.115

ROG_Capital_Employed_perc 0.097

Book_Value_Unit_Curr 0.095

ROG_Total_Assets_perc 0.074

ROG_Net_Worth_perc 0.072

dtype: float64


Insights From Skewness and Distribution Plots of Important Features:


Skewness is more than 1 for variables "Current_Ratio[Latest]" and "Total_Asset_Turnover_Ratio[Latest]",
indicating that their distributions are highly skewed.
Data is moderately skewed for variables "Fixed_Assets_Ratio[Latest]", "ROG-Market_Capitalisation_perc"
and "Interest_Cover_Ratio[Latest]".
Other variables look fairly symmetrical.

Note: The rule of thumb is:

• If the skewness is between -0.5 and 0.5, the data are fairly symmetrical.
• If the skewness is between -1 and -0.5 or between 0.5 and 1, the data are moderately skewed.
• If the skewness is less than -1 or greater than 1, the data are highly skewed.
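The rule of thumb above can be written as a small helper. This is a sketch: the function name classify_skewness is hypothetical, and the three skewness values are copied from the output of In [60].

```python
def classify_skewness(skewness):
    """Label a skewness value using the rule of thumb above."""
    magnitude = abs(skewness)
    if magnitude <= 0.5:
        return "fairly symmetrical"
    if magnitude <= 1.0:
        return "moderately skewed"
    return "highly skewed"


# Three values taken from the skewness output of In [60].
reported_skewness = {
    "Current_Ratio_Latest": 1.275,
    "Fixed_Assets_Ratio_Latest": 0.889,
    "Net_Working_Capital": 0.175,
}

labels = {name: classify_skewness(value)
          for name, value in reported_skewness.items()}

for name, label in labels.items():
    print(f"{name}: {label}")
```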

In [61]:

plt.figure(figsize=(8,5))
sns.boxplot(Company_imputed["default"], Company_imputed['Current_Ratio_Latest'],data
plt.title("Figure: Plot of Default with Current_Ratio_Latest")
plt.show()


In [62]:

#boxplot_Total_Asset_Turnover_Ratio[Latest]

plt.figure(figsize=(8,5))
sns.boxplot(Company_imputed["default"], Company_imputed['Total_Asset_Turnover_Ratio_
plt.title('Figure: Boxplot of Default with Total_Asset_Turnover_Ratio[Latest]')
plt.show()


In [63]:

#boxplot_Fixed_Assets_Ratio[Latest]

plt.figure(figsize=(8,5))
sns.boxplot(Company_imputed["default"], Company_imputed['Fixed_Assets_Ratio_Latest']
plt.title('Figure: Boxplot of Default with Fixed_Assets_Ratio[Latest]')
plt.show()


In [64]:

#boxplot_ROG-Market_Capitalisation_perc
plt.figure(figsize=(8,5))
sns.boxplot(Company_imputed["default"], Company_imputed['ROG_Market_Capitalisation_p
plt.title('Figure: Boxplot of Default with ROG-Market_Capitalisation_perc')
plt.show()


In [65]:

#boxplot_Interest_Cover_Ratio[Latest]
sns.boxplot(Company_imputed["default"], Company_imputed['Interest_Cover_Ratio_Latest
plt.title('Figure: Boxplot of Default with Interest_Cover_Ratio[Latest]', fontsize=15
plt.show()


In [66]:

#boxplot_Inventory_Ratio[Latest]
sns.boxplot(Company_imputed["default"], Company_imputed['Inventory_Ratio_Latest'],da
plt.title('Figure: Boxplot of Default with Inventory_Ratio[Latest]', fontsize=15)
plt.show()

In [67]:

#boxplot_Debtors_Ratio[Latest]
sns.boxplot(Company_imputed["default"], Company_imputed['Debtors_Ratio_Latest'],data
plt.title('Figure: Boxplot of Default with Debtors_Ratio[Latest] ',fontsize=15)
plt.show()


In [68]:

#boxplot_Net_Working_Capital
sns.boxplot(Company_imputed["default"], Company_imputed['Net_Working_Capital'],data=
plt.title('Figure: Boxplot of Default with Net_Working_Capital', fontsize=15)
plt.show()

In [69]:

#boxplot_ROG-Cost_of_Production_perc
sns.boxplot(Company_imputed["default"], Company_imputed['ROG_Cost_of_Production_perc
plt.title('Figure: Boxplot of Default with ROG-Cost_of_Production_perc', fontsize=15
plt.show()


In [70]:

#boxplot_ROG-Capital_Employed_perc
sns.boxplot(Company_imputed["default"], Company_imputed['ROG_Capital_Employed_perc']
plt.title('Figure: Boxplot of Default with ROG-Capital_Employed_perc', fontsize=15)
plt.show()

In [71]:

#boxplot_Book_Value_Unit_Curr
sns.boxplot(Company_imputed["default"], Company_imputed['Book_Value_Unit_Curr'],data
plt.title('Figure: Boxplot of Default with Book_Value_Unit_Curr ', fontsize=15)
plt.show()


In [72]:

#boxplot_ROG-Total_Assets_perc
sns.boxplot(Company_imputed["default"], Company_imputed['ROG_Total_Assets_perc'],dat
plt.title('Figure: Boxplot of Default with ROG-Total_Assets_perc ', fontsize=15)
plt.show()


In [73]:

plt.figure(figsize = (12,8))
cor_matrix = Company_imputed.drop('default', axis = 1).corr()
sns.heatmap(cor_matrix, cmap = 'plasma', vmin = -1, vmax= 1)

Out[73]:

<AxesSubplot:>


1.5 Train Test Split


We split the data into Train and Test datasets in a ratio of 67:33, with the random_state fixed at 42 to ensure
uniformity across multiple systems, and stratify on default to make sure both train and test data have a similar
proportion of defaulters and non-defaulters. This is done because the dataset is imbalanced and has more
non-defaulters. Before the train-test split, we first separate the independent (X) and dependent (y) variables,
and then split them using train_test_split from sklearn.model_selection.
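A stratified split can be sketched on synthetic data as follows. The feature values and the 10% default rate are made up for illustration; test_size=0.33, random_state=42 and stratification on the target follow the description above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic data: 1000 rows, 10% defaulters (illustrative only).
rng = np.random.default_rng(0)
n_rows = 1000
X = pd.DataFrame({"feature": rng.normal(size=n_rows)})
y = pd.Series([1] * 100 + [0] * 900, name="default")

# stratify=y keeps the default rate (nearly) identical in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42, stratify=y
)

print("Train shape:", X_train.shape, "default rate:", round(y_train.mean(), 3))
print("Test shape:", X_test.shape, "default rate:", round(y_test.mean(), 3))
```

Without stratification, a random split of an imbalanced target can leave one part with noticeably fewer defaulters than the other.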

In [74]:

from sklearn.model_selection import train_test_split


from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

Splitting the data into train and test sets

In [75]:

predictors = Company_imputed.drop('default', axis = 1)


response = Company_imputed[['default']]

In [76]:

X_train, X_test, y_train, y_test = train_test_split(predictors, response,


test_size = 0.33, random_state =

In [77]:

print('Number of rows and columns of the training set for the independent variables:
print('Number of rows and columns of the training set for the dependent variable:',y
print('Number of rows and columns of the test set for the independent variables:',X_
print('Number of rows and columns of the test set for the dependent variable:',y_tes

Number of rows and columns of the training set for the independent var
iables: (2402, 49)

Number of rows and columns of the training set for the dependent varia
ble: (2402, 1)

Number of rows and columns of the test set for the independent variabl
es: (1184, 49)

Number of rows and columns of the test set for the dependent variable:
(1184, 1)


In [78]:

X_train.head()

Out[78]:

      Co_Code  Equity_Paid_Up  Total_Debt  Net_Working_Capital  Current_Assets  Current_Liabili
662    -0.489           0.157      -1.015               -0.855          -1.153
1373   -0.453          -0.371      -1.015               -0.019          -0.230
3268    1.264           0.011      -0.785               -0.942          -1.237
3246   -0.850           0.410      -0.833                1.674          -1.273
1456   -0.034          -0.328       1.471                1.639           1.713

5 rows × 49 columns

In [79]:

y_train.head()

Out[79]:

default

662 0

1373 0

3268 0

3246 0

1456 0

In [80]:

X_test.head()

Out[80]:

      Co_Code  Equity_Paid_Up  Total_Debt  Net_Working_Capital  Current_Assets  Current_Liabili
3163   -0.659          -0.626       0.449                0.710          -1.249
3133    1.463           1.598       1.082               -0.759          -0.589
937    -0.316           0.839      -1.015                0.795           0.637
196     0.968           0.278      -0.265               -1.209           1.401
2852   -0.533           0.018       1.323                1.690          -0.821

5 rows × 49 columns

In [81]:

y_test.head()

Out[81]:

default

3163 0

3133 0

937 0

196 1

2852 0

1.6 Build Logistic Regression Model (using statsmodel library) on most important variables on Train Dataset and choose the optimum cutoff. Also showcase your model building approach

Here, we will use a logistic regression model to evaluate the relationship between one binary dependent variable
and one or more independent variables. The model predicts the probability of occurrence of default
using a logit function.

Assumptions of Logistic Regression Model:

1. It assumes that there is minimal or no multicollinearity among the independent variables.
2. It assumes that the independent variables are linearly related to the log of odds.
3. It assumes a large sample for good prediction.
4. It assumes that the observations are independent of each other.
5. It assumes that there are no influential values (outliers) in the continuous predictors (independent variables).
6. Binary logistic regression requires the dependent variable to be binary, and ordinal logistic
regression requires the dependent variable to be ordered.

There are two common ways to fit a logistic regression model in Python:

1. statsmodels
2. scikit-learn

Here, we will use statsmodels, imported as statsmodels.api (sm) and statsmodels.formula.api (SM).

Note: statsmodels provides a Logit class for performing logistic regression. Logit() accepts y and X
as parameters and returns a Logit model object, which is then fitted to the data. The logit function is
simply the logarithm of the odds.

logit(x) = log(x / (1 – x))

The inverse of the logit function is the sigmoid function.
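
The relationship between the logit and sigmoid functions can be checked numerically. The sketch below uses only the Python standard library; the function names `logit` and `sigmoid` and the probability 0.25 are chosen for illustration.

```python
import math

def logit(p):
    """Log-odds of a probability p in the open interval (0, 1)."""
    return math.log(p / (1 - p))

def sigmoid(z):
    """Inverse of logit: maps any real number z back to a probability."""
    return 1 / (1 + math.exp(-z))

p = 0.25
z = logit(p)            # log(0.25 / 0.75), i.e. log(1/3)
print(z, sigmoid(z))    # sigmoid(z) recovers p up to floating-point error
```

A probability of 0.5 corresponds to log-odds of 0, which is why a linear predictor of zero maps to even odds.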

The logistic regression equation, from which we predict the probabilities and then go on to
predict a discrete target variable, is

$$y = \frac{1}{1 + e^{-z}}$$

Note: $z = \beta_0 + \sum_{i=1}^{n} \beta_i X_i$

In [82]:

import statsmodels.api as sm

In [ ]:

Creating logistic regression equation & storing it in f_1

model = SM.logit(formula='Dependent Variable ~ Σ Independent Variables(k)', data='Data Frame containing the required values').fit()

Splitting arrays or matrices into random train and test subsets. Model will be fitted on train set and
predictions will be made on the test set
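
The fit-on-train, predict-on-test workflow described above can be sketched as follows. This is a minimal illustration on synthetic data rather than the notebook's model: the columns `x1` and `x2`, the coefficients used to simulate `default`, the simple 400/200 row split, and the 0.5 cutoff are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as SM

# Synthetic data with a hypothetical data-generating process
rng = np.random.default_rng(1)
n = 600
demo = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
true_z = -2.0 + 1.0 * demo["x1"] - 0.5 * demo["x2"]
demo["default"] = (rng.random(n) < 1 / (1 + np.exp(-true_z))).astype(int)

# Plain positional split for brevity; rows are already in random order
train, test = demo.iloc[:400], demo.iloc[400:]

# Fit on the training rows only
model = SM.logit(formula="default ~ x1 + x2", data=train).fit(disp=0)

# Predict default probabilities for the unseen test rows
test_prob = model.predict(test)

cutoff = 0.5  # placeholder; the optimum cutoff would be tuned separately
test_pred = (test_prob > cutoff).astype(int)

print(model.params)
print(pd.crosstab(test["default"], test_pred,
                  rownames=["actual"], colnames=["predicted"]))
```

In the notebook's setting, the training frame would be Company_train and the prediction frame Company_test.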

In [83]:

# statsmodels requires the labelled data; therefore, concatenating the y label to the
Company_train = pd.concat([X_train,y_train], axis=1)
Company_test = pd.concat([X_test,y_test], axis=1)

In [84]:

Company_train.to_csv('Company_train.csv',index=False)
Company_test.to_csv('Company_test.csv',index=False)

In [85]:

Company_train["default"].value_counts()

Out[85]:

0 2176

1 226

Name: default, dtype: int64

Checking if dataset is balanced

In [86]:

Company_train.default.sum() / len(Company_train.default)

Out[86]:

0.09408825978351373
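
The proportion above follows directly from the class counts printed by value_counts(). A small arithmetic check, using only those two counts (the variable names are chosen for illustration):

```python
# Class counts taken from the value_counts() output above
non_defaulters, defaulters = 2176, 226

total = non_defaulters + defaulters
default_rate = defaulters / total

print(total)                                   # size of the training set
print(round(default_rate, 4))                  # share of defaulters
print(round(non_defaulters / defaulters, 1))   # non-defaulters per defaulter
```

With fewer than one defaulter for every nine non-defaulters, the training data is clearly imbalanced, so accuracy alone would be a misleading performance measure.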

In [87]:

Company_train.columns

Out[87]:

Index(['Co_Code', 'Equity_Paid_Up', 'Total_Debt', 'Net_Working_Capital',
       'Current_Assets', 'Current_Liabilities_and_Provisions',
       'Total_Assets_by_Liabilities', 'Other_Income', 'Value_Of_Output',
       'Selling_Cost', 'Adjusted_PAT', 'Capital_expenses_in_forex',
       'Book_Value_Unit_Curr', 'Book_Value_Adj_Unit_Curr',
       'CEPS_annualised_Unit_Curr', 'Cash_Flow_From_Operating_Activities',
       'Cash_Flow_From_Investing_Activities',
       'Cash_Flow_From_Financing_Activities', 'ROG_Net_Worth_perc',
       'ROG_Capital_Employed_perc', 'ROG_Gross_Block_perc',
       'ROG_Gross_Sales_perc', 'ROG_Net_Sales_perc',
       'ROG_Cost_of_Production_perc', 'ROG_Total_Assets_perc',
       'ROG_PBIDT_perc', 'ROG_PBDT_perc', 'ROG_PBIT_perc', 'ROG_PBT_perc',
       'ROG_PAT_perc', 'ROG_CP_perc', 'ROG_Market_Capitalisation_perc',
       'Current_Ratio_Latest', 'Fixed_Assets_Ratio_Latest',
       'Inventory_Ratio_Latest', 'Debtors_Ratio_Latest',
       'Total_Asset_Turnover_Ratio_Latest', 'Interest_Cover_Ratio_Latest',
       'PBIDTM_perc_Latest', 'PBITM_perc_Latest', 'PBDTM_perc_Latest',
       'CPM_perc_Latest', 'APATM_perc_Latest', 'Debtors_Velocity_Days',
       'Creditors_Velocity_Days', 'Inventory_Velocity_Days',
       'Value_of_Output_by_Total_Assets', 'Value_of_Output_by_Gross_Block',
       'Networth_Next_Year', 'default'],
      dtype='object')

In [ ]:

Model 1
Before starting model building, let's look at the problem of multicollinearity. Multicollinearity occurs when two or
more independent variables are highly correlated with one another in a regression model.

In [88]:

## Importing VIF
from statsmodels.stats.outliers_influence import variance_inflation_factor

def calc_vif(X):
    vif = pd.DataFrame()
    vif["variables"] = X.columns
    vif["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
    return vif
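
To make the VIF concrete: for a given column, it equals 1 / (1 - R²), where R² comes from regressing that column on all the other columns. The sketch below computes this with NumPy on synthetic data. The helper `vif_from_r2` and the columns a, b, c are hypothetical names for illustration. Unlike the statsmodels function used above, this helper adds an intercept itself; on zero-mean (standardized) columns the two approaches should give very similar values.

```python
import numpy as np

def vif_from_r2(X, i):
    """VIF of column i: regress it on the other columns plus an intercept."""
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    A = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - resid.var() / y.var()   # share of variance explained by the others
    return 1 / (1 - r2)

rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.1 * rng.normal(size=500)   # nearly a copy of a, so both get a high VIF
c = rng.normal(size=500)             # unrelated to a and b, so VIF stays near 1
X = np.column_stack([a, b, c])

print([round(vif_from_r2(X, i), 2) for i in range(X.shape[1])])
```

A VIF near 1 means the column carries information the other predictors do not, while a large VIF means it is almost a linear combination of them.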

In [89]:

calc_vif(X_train).sort_values(by='VIF', ascending = False)

Out[89]:

variables VIF

22 ROG_Net_Sales_perc 19.846

21 ROG_Gross_Sales_perc 19.749

13 Book_Value_Adj_Unit_Curr 5.579

12 Book_Value_Unit_Curr 5.537

46 Value_of_Output_by_Total_Assets 4.805

36 Total_Asset_Turnover_Ratio_Latest 4.405

40 PBDTM_perc_Latest 4.187

26 ROG_PBDT_perc 4.079

28 ROG_PBT_perc 4.036

41 CPM_perc_Latest 3.932

29 ROG_PAT_perc 3.477

27 ROG_PBIT_perc 3.386

30 ROG_CP_perc 3.279

25 ROG_PBIDT_perc 3.278

47 Value_of_Output_by_Gross_Block 3.051

39 PBITM_perc_Latest 3.038

33 Fixed_Assets_Ratio_Latest 3.035

38 PBIDTM_perc_Latest 2.713

42 APATM_perc_Latest 2.679

10 Adjusted_PAT 2.471

14 CEPS_annualised_Unit_Curr 2.155

19 ROG_Capital_Employed_perc 1.923

18 ROG_Net_Worth_perc 1.837

37 Interest_Cover_Ratio_Latest 1.781

9 Selling_Cost 1.754

24 ROG_Total_Assets_perc 1.743

35 Debtors_Ratio_Latest 1.737

34 Inventory_Ratio_Latest 1.619

7 Other_Income 1.603

15 Cash_Flow_From_Operating_Activities 1.555

48 Networth_Next_Year 1.457

5 Current_Liabilities_and_Provisions 1.444

3 Net_Working_Capital 1.428

8 Value_Of_Output 1.388

43 Debtors_Velocity_Days 1.387

4 Current_Assets 1.377

2 Total_Debt 1.371

23 ROG_Cost_of_Production_perc 1.363

32 Current_Ratio_Latest 1.308

20 ROG_Gross_Block_perc 1.306

45 Inventory_Velocity_Days 1.304

44 Creditors_Velocity_Days 1.268

17 Cash_Flow_From_Financing_Activities 1.180

16 Cash_Flow_From_Investing_Activities 1.177

31 ROG_Market_Capitalisation_perc 1.164

0 Co_Code 1.104

6 Total_Assets_by_Liabilities 1.094

1 Equity_Paid_Up 1.060

11 Capital_expenses_in_forex nan

Here, we see that the VIF is high for several variables. Hence, we drop variables with a VIF above 5
(very high correlation) and build our model.

In [94]:

f_1='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total_

In [95]:

model_1 = SM.logit(formula = f_1,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125498

Iterations 10

Checking the coefficients:

In [96]:

model_1.summary()

Out[96]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3553

Method: MLE Df Model: 32

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6157

Time: 12:31:57 Log-Likelihood: -450.04

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 4.424e-283

coef std err z P>|z| [0.025 0.975]

Intercept -5.6654 0.271 -20.926 0.000 -6.196 -5.135

Book_Value_Adj_Unit_Curr -1.2438 0.574 -2.168 0.030 -2.368 -0.120

Book_Value_Unit_Curr -1.6603 0.583 -2.848 0.004 -2.803 -0.518

Value_of_Output_by_Total_Assets 0.3727 0.162 2.301 0.021 0.055 0.690

Total_Asset_Turnover_Ratio_Latest -0.1217 0.148 -0.823 0.411 -0.412 0.168

PBDTM_perc_Latest 0.0182 0.232 0.078 0.938 -0.437 0.473

CPM_perc_Latest -0.3498 0.230 -1.523 0.128 -0.800 0.100

ROG_PBIT_perc 0.0020 0.114 0.017 0.986 -0.221 0.225

ROG_CP_perc 0.0284 0.115 0.246 0.805 -0.198 0.255

Value_of_Output_by_Gross_Block -0.4073 0.204 -1.998 0.046 -0.807 -0.008

Fixed_Assets_Ratio_Latest -0.0874 0.198 -0.442 0.658 -0.475 0.300

Adjusted_PAT -0.5003 0.154 -3.252 0.001 -0.802 -0.199

ROG_Capital_Employed_perc 0.3008 0.129 2.338 0.019 0.049 0.553

ROG_Net_Worth_perc -0.2211 0.127 -1.745 0.081 -0.469 0.027

Interest_Cover_Ratio_Latest -0.4189 0.150 -2.791 0.005 -0.713 -0.125

Selling_Cost 0.1371 0.134 1.020 0.308 -0.126 0.401

ROG_Total_Assets_perc -0.1902 0.117 -1.620 0.105 -0.420 0.040

Debtors_Ratio_Latest -0.2212 0.120 -1.837 0.066 -0.457 0.015

Inventory_Ratio_Latest -0.0744 0.119 -0.624 0.533 -0.308 0.159

Other_Income -0.1152 0.111 -1.040 0.298 -0.332 0.102

Cash_Flow_From_Operating_Activities -0.0088 0.110 -0.080 0.936 -0.224 0.207

Net_Working_Capital -0.3268 0.101 -3.227 0.001 -0.525 -0.128

Debtors_Velocity_Days 0.0331 0.103 0.321 0.748 -0.169 0.235

Total_Debt 0.6775 0.101 6.726 0.000 0.480 0.875

ROG_Cost_of_Production_perc -0.2281 0.098 -2.330 0.020 -0.420 -0.036

Current_Ratio_Latest -0.7208 0.129 -5.587 0.000 -0.974 -0.468

ROG_Gross_Block_perc 0.0438 0.114 0.384 0.701 -0.179 0.267

Inventory_Velocity_Days -0.0121 0.102 -0.118 0.906 -0.212 0.188

Creditors_Velocity_Days 0.0952 0.095 0.999 0.318 -0.092 0.282

Cash_Flow_From_Financing_Activities -0.0284 0.093 -0.304 0.761 -0.211 0.155

Cash_Flow_From_Investing_Activities 0.1930 0.098 1.965 0.049 0.001 0.385

ROG_Market_Capitalisation_perc -0.0354 0.095 -0.374 0.708 -0.221 0.150

Equity_Paid_Up -0.1523 0.088 -1.723 0.085 -0.325 0.021

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.
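
The coefficients in the summary are on the log-odds scale, so exponentiating them gives odds ratios. The sketch below converts three coefficients copied from the Model 1 table. The baseline calculation assumes the predictors are standardized, so that a value of 0 corresponds to an average company; that assumption is based on the scaled values shown by X_train.head() and is not stated by the summary itself.

```python
import math

# Coefficients copied from the Model 1 summary above (log-odds scale)
coefs = {
    "Total_Debt": 0.6775,
    "Current_Ratio_Latest": -0.7208,
    "Adjusted_PAT": -0.5003,
}

for name, beta in coefs.items():
    odds_ratio = math.exp(beta)   # multiplicative change in odds per unit increase
    print(f"{name}: odds ratio = {odds_ratio:.3f}")

# Baseline probability when every predictor equals 0 (assumes scaled inputs)
intercept = -5.6654
baseline_prob = 1 / (1 + math.exp(-intercept))
print(f"baseline default probability = {baseline_prob:.4f}")
```

An odds ratio above 1 (Total_Debt) means higher values raise the odds of default, while a ratio below 1 (Current_Ratio_Latest, Adjusted_PAT) means higher values lower them.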

Most of the coefficients have p-values greater than 0.05, so those variables are not statistically
significant at the 5% level (often a side effect of correlated predictors). They can be dropped,
retaining only the significant variables with p-values < 0.05.

The variables are eliminated one at a time: the variable with the highest p-value is removed first,
the logistic model is refitted, and the remaining p-values are checked again to see whether the other
variables contribute significantly.

Variable "ROG_PBIT_perc" has the highest p-value (0.986) and is insignificant; therefore, we need to
eliminate it.
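
The one-at-a-time elimination described above can also be automated in a loop. The sketch below is an illustration on synthetic data, not the notebook's procedure, which refits each model by hand: the helper `backward_eliminate`, the column names, and the simulated coefficients are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as SM

# Synthetic data: only x1 and x2 truly affect the (hypothetical) default flag
rng = np.random.default_rng(42)
n = 800
demo = pd.DataFrame(rng.normal(size=(n, 4)),
                    columns=["x1", "x2", "noise1", "noise2"])
z = -1.5 + 1.2 * demo["x1"] - 0.8 * demo["x2"]
demo["default"] = (rng.random(n) < 1 / (1 + np.exp(-z))).astype(int)

def backward_eliminate(data, response, predictors, alpha=0.05):
    """Drop the least significant predictor until all p-values are <= alpha."""
    predictors = list(predictors)
    while predictors:
        formula = f"{response} ~ " + " + ".join(predictors)
        model = SM.logit(formula=formula, data=data).fit(disp=0)
        pvalues = model.pvalues.drop("Intercept")
        worst = pvalues.idxmax()          # predictor with the highest p-value
        if pvalues[worst] <= alpha:
            return model, predictors      # everything left is significant
        predictors.remove(worst)          # remove it and refit
    return None, predictors

model, kept = backward_eliminate(
    demo, "default", ["x1", "x2", "noise1", "noise2"])
print(kept)
```

Refitting after every removal matters because p-values shift once a correlated variable leaves the model, as the CPM_perc_Latest row does between Model_2 and Model_3.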

Model_2

In [97]:

_Gross_Block_perc+Inventory_Velocity_Days+Creditors_Velocity_Days+Cash_Flow_From_Fina

In [98]:

model_2 = SM.logit(formula = f_2,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125498

Iterations 10

In [99]:

model_2.summary()

Out[99]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3554

Method: MLE Df Model: 31

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6157

Time: 12:38:57 Log-Likelihood: -450.04

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 6.430e-284

coef std err z P>|z| [0.025 0.975]

Intercept -5.6656 0.271 -20.931 0.000 -6.196 -5.135

Book_Value_Adj_Unit_Curr -1.2438 0.574 -2.169 0.030 -2.368 -0.120

Book_Value_Unit_Curr -1.6605 0.583 -2.849 0.004 -2.803 -0.518

Value_of_Output_by_Total_Assets 0.3730 0.161 2.317 0.021 0.057 0.689

Total_Asset_Turnover_Ratio_Latest -0.1218 0.148 -0.826 0.409 -0.411 0.167

PBDTM_perc_Latest 0.0182 0.232 0.078 0.937 -0.437 0.473

CPM_perc_Latest -0.3502 0.229 -1.530 0.126 -0.799 0.098

ROG_CP_perc 0.0297 0.089 0.332 0.740 -0.145 0.205

Value_of_Output_by_Gross_Block -0.4073 0.204 -1.997 0.046 -0.807 -0.008

Fixed_Assets_Ratio_Latest -0.0875 0.197 -0.443 0.658 -0.475 0.299

Adjusted_PAT -0.5002 0.154 -3.258 0.001 -0.801 -0.199

ROG_Capital_Employed_perc 0.3010 0.128 2.344 0.019 0.049 0.553

ROG_Net_Worth_perc -0.2210 0.127 -1.745 0.081 -0.469 0.027

Interest_Cover_Ratio_Latest -0.4187 0.150 -2.796 0.005 -0.712 -0.125

Selling_Cost 0.1372 0.134 1.021 0.307 -0.126 0.401

ROG_Total_Assets_perc -0.1902 0.117 -1.621 0.105 -0.420 0.040

Debtors_Ratio_Latest -0.2214 0.120 -1.840 0.066 -0.457 0.014

Inventory_Ratio_Latest -0.0744 0.119 -0.624 0.533 -0.308 0.159

Other_Income -0.1152 0.111 -1.040 0.298 -0.332 0.102

Cash_Flow_From_Operating_Activities -0.0089 0.110 -0.081 0.936 -0.224 0.207

Net_Working_Capital -0.3267 0.101 -3.227 0.001 -0.525 -0.128

Debtors_Velocity_Days 0.0331 0.103 0.321 0.748 -0.169 0.235

Total_Debt 0.6775 0.101 6.726 0.000 0.480 0.875

ROG_Cost_of_Production_perc -0.2281 0.098 -2.330 0.020 -0.420 -0.036

Current_Ratio_Latest -0.7206 0.129 -5.595 0.000 -0.973 -0.468

ROG_Gross_Block_perc 0.0437 0.114 0.384 0.701 -0.179 0.267

Inventory_Velocity_Days -0.0121 0.102 -0.118 0.906 -0.212 0.188

Creditors_Velocity_Days 0.0952 0.095 0.999 0.318 -0.092 0.282

Cash_Flow_From_Financing_Activities -0.0284 0.093 -0.304 0.761 -0.211 0.154

Cash_Flow_From_Investing_Activities 0.1930 0.098 1.965 0.049 0.001 0.386

ROG_Market_Capitalisation_perc -0.0354 0.095 -0.374 0.709 -0.221 0.150

Equity_Paid_Up -0.1523 0.088 -1.724 0.085 -0.325 0.021

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "PBDTM_perc_Latest" has the highest p-value (0.937) and is insignificant, therefore, we need to
eliminate it.
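
Two of the headline numbers in the Model_2 summary can be reproduced from its log-likelihoods. The sketch below recomputes the pseudo R-squared as 1 minus the ratio of the model log-likelihood to the null log-likelihood (McFadden's definition). It also shows that the "Current function value" printed during fitting appears to be the negative log-likelihood divided by the number of observations; that reading is inferred from the numbers and should be treated as an assumption.

```python
# Values copied from the Model_2 summary above
log_likelihood = -450.04   # "Log-Likelihood"
ll_null = -1171.0          # "LL-Null" (intercept-only model)
n_obs = 3586               # "No. Observations"

pseudo_r2 = 1 - log_likelihood / ll_null
avg_neg_loglik = -log_likelihood / n_obs

print(round(pseudo_r2, 4))        # compare with "Pseudo R-squ."
print(round(avg_neg_loglik, 6))   # compare with "Current function value"
```

Small differences in the last digit are expected because the summary rounds the log-likelihood to two decimals.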

Model_3

In [100]:

e_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total_Assets+Total_Asset_Turn

In [101]:

model_3 = SM.logit(formula = f_3,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125499

Iterations 10

In [102]:

model_3.summary()

Out[102]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3555

Method: MLE Df Model: 30

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6157

Time: 12:41:28 Log-Likelihood: -450.04

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 9.219e-285

coef std err z P>|z| [0.025 0.975]

Intercept -5.6647 0.270 -20.948 0.000 -6.195 -5.135

Book_Value_Adj_Unit_Curr -1.2426 0.574 -2.164 0.030 -2.368 -0.117

Book_Value_Unit_Curr -1.6612 0.584 -2.846 0.004 -2.805 -0.517

Value_of_Output_by_Total_Assets 0.3722 0.161 2.317 0.021 0.057 0.687

Total_Asset_Turnover_Ratio_Latest -0.1217 0.148 -0.825 0.409 -0.411 0.167

CPM_perc_Latest -0.3347 0.115 -2.908 0.004 -0.560 -0.109

ROG_CP_perc 0.0295 0.089 0.331 0.741 -0.145 0.205

Value_of_Output_by_Gross_Block -0.4062 0.203 -1.998 0.046 -0.805 -0.008

Fixed_Assets_Ratio_Latest -0.0877 0.197 -0.444 0.657 -0.475 0.299

Adjusted_PAT -0.4995 0.153 -3.259 0.001 -0.800 -0.199

ROG_Capital_Employed_perc 0.3009 0.128 2.343 0.019 0.049 0.553

ROG_Net_Worth_perc -0.2210 0.127 -1.744 0.081 -0.469 0.027

Interest_Cover_Ratio_Latest -0.4170 0.148 -2.816 0.005 -0.707 -0.127

Selling_Cost 0.1366 0.134 1.018 0.309 -0.126 0.399

ROG_Total_Assets_perc -0.1900 0.117 -1.620 0.105 -0.420 0.040

Debtors_Ratio_Latest -0.2207 0.120 -1.839 0.066 -0.456 0.015

Inventory_Ratio_Latest -0.0742 0.119 -0.622 0.534 -0.308 0.160

Other_Income -0.1153 0.111 -1.041 0.298 -0.332 0.102

Cash_Flow_From_Operating_Activities -0.0084 0.110 -0.076 0.939 -0.224 0.207

Net_Working_Capital -0.3266 0.101 -3.227 0.001 -0.525 -0.128

Debtors_Velocity_Days 0.0331 0.103 0.321 0.748 -0.169 0.235

Total_Debt 0.6771 0.101 6.729 0.000 0.480 0.874

ROG_Cost_of_Production_perc -0.2282 0.098 -2.331 0.020 -0.420 -0.036

Current_Ratio_Latest -0.7204 0.129 -5.594 0.000 -0.973 -0.468

ROG_Gross_Block_perc 0.0437 0.114 0.383 0.701 -0.180 0.267

Inventory_Velocity_Days -0.0117 0.102 -0.115 0.909 -0.212 0.188

Creditors_Velocity_Days 0.0948 0.095 0.997 0.319 -0.092 0.281

Cash_Flow_From_Financing_Activities -0.0286 0.093 -0.306 0.759 -0.211 0.154

Cash_Flow_From_Investing_Activities 0.1929 0.098 1.964 0.050 0.000 0.385

ROG_Market_Capitalisation_perc -0.0354 0.095 -0.374 0.708 -0.221 0.150

Equity_Paid_Up -0.1520 0.088 -1.722 0.085 -0.325 0.021

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Cash_Flow_From_Operating_Activities" has the highest p-value (0.939) and is insignificant;
therefore, we need to eliminate it.
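
The z, P>|z| and confidence-interval columns that drive each elimination can be derived from the coefficient and its standard error. The sketch below reproduces them for two rows of the Model_3 summary using a normal approximation. The helper `wald_summary` is a hypothetical name, and the results differ slightly from the table because the printed standard errors are rounded to three decimals.

```python
import math

def wald_summary(coef, std_err, z_crit=1.96):
    """Return the z statistic, two-sided p-value and 95% interval."""
    z = coef / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail area
    ci = (coef - z_crit * std_err, coef + z_crit * std_err)
    return z, p_value, ci

# (name, coef, std err) copied from the Model_3 summary above
rows = [
    ("Total_Debt", 0.6771, 0.101),
    ("Cash_Flow_From_Operating_Activities", -0.0084, 0.110),
]

for name, coef, se in rows:
    z, p, (lo, hi) = wald_summary(coef, se)
    print(f"{name}: z={z:.3f}, p={p:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

A coefficient whose interval contains zero, such as Cash_Flow_From_Operating_Activities, cannot be distinguished from no effect, which is why it is the next variable to be dropped.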

Model_4

In [103]:

f_4='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total_

In [104]:

model_4 = SM.logit(formula = f_4,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125500

Iterations 10

In [105]:

model_4.summary()

Out[105]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3556

Method: MLE Df Model: 29

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6157

Time: 12:45:17 Log-Likelihood: -450.04

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.299e-285

coef std err z P>|z| [0.025 0.975]

Intercept -5.6653 0.270 -20.954 0.000 -6.195 -5.135

Book_Value_Adj_Unit_Curr -1.2441 0.574 -2.167 0.030 -2.369 -0.119

Book_Value_Unit_Curr -1.6610 0.584 -2.845 0.004 -2.805 -0.517

Value_of_Output_by_Total_Assets 0.3722 0.161 2.317 0.020 0.057 0.687

Total_Asset_Turnover_Ratio_Latest -0.1221 0.147 -0.828 0.408 -0.411 0.167

CPM_perc_Latest -0.3351 0.115 -2.915 0.004 -0.560 -0.110

ROG_CP_perc 0.0298 0.089 0.334 0.738 -0.145 0.205

Value_of_Output_by_Gross_Block -0.4052 0.203 -1.996 0.046 -0.803 -0.007

Fixed_Assets_Ratio_Latest -0.0877 0.197 -0.444 0.657 -0.475 0.299

Adjusted_PAT -0.5011 0.152 -3.299 0.001 -0.799 -0.203

ROG_Capital_Employed_perc 0.3017 0.128 2.357 0.018 0.051 0.553

ROG_Net_Worth_perc -0.2206 0.127 -1.743 0.081 -0.469 0.027

Interest_Cover_Ratio_Latest -0.4172 0.148 -2.819 0.005 -0.707 -0.127

Selling_Cost 0.1358 0.134 1.015 0.310 -0.126 0.398

ROG_Total_Assets_perc -0.1904 0.117 -1.624 0.104 -0.420 0.039

Debtors_Ratio_Latest -0.2206 0.120 -1.838 0.066 -0.456 0.015

Inventory_Ratio_Latest -0.0746 0.119 -0.626 0.531 -0.308 0.159

Other_Income -0.1169 0.109 -1.076 0.282 -0.330 0.096

Net_Working_Capital -0.3267 0.101 -3.228 0.001 -0.525 -0.128

Debtors_Velocity_Days 0.0323 0.103 0.315 0.753 -0.169 0.233

Total_Debt 0.6765 0.100 6.743 0.000 0.480 0.873

ROG_Cost_of_Production_perc -0.2281 0.098 -2.331 0.020 -0.420 -0.036

Current_Ratio_Latest -0.7200 0.129 -5.596 0.000 -0.972 -0.468

ROG_Gross_Block_perc 0.0437 0.114 0.384 0.701 -0.179 0.267

Inventory_Velocity_Days -0.0119 0.102 -0.117 0.907 -0.212 0.188

Creditors_Velocity_Days 0.0946 0.095 0.995 0.320 -0.092 0.281

Cash_Flow_From_Financing_Activities -0.0271 0.091 -0.297 0.767 -0.206 0.152

Cash_Flow_From_Investing_Activities 0.1937 0.098 1.985 0.047 0.002 0.385

ROG_Market_Capitalisation_perc -0.0357 0.095 -0.377 0.706 -0.221 0.150

Equity_Paid_Up -0.1520 0.088 -1.724 0.085 -0.325 0.021

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Inventory_Velocity_Days" has the highest p-value (0.907) and is insignificant, therefore, we
need to eliminate it.
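
Each summary above ends with a quasi-separation warning. The toy example below illustrates the underlying issue using only the standard library; the six data points are made up for illustration. When a predictor separates the two classes perfectly, the log-likelihood keeps improving as the coefficient grows, so no finite maximum-likelihood estimate exists.

```python
import math

# Toy data: y is 1 exactly when x > 0, i.e. complete separation
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [0, 0, 0, 1, 1, 1]

def log_likelihood(beta):
    """Log-likelihood of a no-intercept logistic model with slope beta."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1 / (1 + math.exp(-beta * xi))
        total += math.log(p) if yi == 1 else math.log(1 - p)
    return total

# The log-likelihood climbs towards 0 without ever reaching a maximum
for beta in [1, 5, 10, 20]:
    print(beta, round(log_likelihood(beta), 6))
```

In the notebook's models the separation is only partial, which is why the fits still converge, but coefficients of the variables involved may be poorly identified.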

Model_5

In [106]:

f_5='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total_

In [107]:

model_5 = SM.logit(formula = f_5,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125502

Iterations 10

In [108]:

model_5.summary()

Out[108]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3557

Method: MLE Df Model: 28

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6157

Time: 12:46:52 Log-Likelihood: -450.05

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.805e-286

coef std err z P>|z| [0.025 0.975]

Intercept -5.6651 0.270 -20.956 0.000 -6.195 -5.135

Book_Value_Adj_Unit_Curr -1.2453 0.574 -2.170 0.030 -2.370 -0.121

Book_Value_Unit_Curr -1.6605 0.584 -2.845 0.004 -2.804 -0.517

Value_of_Output_by_Total_Assets 0.3729 0.161 2.323 0.020 0.058 0.688

Total_Asset_Turnover_Ratio_Latest -0.1229 0.147 -0.834 0.404 -0.412 0.166

CPM_perc_Latest -0.3355 0.115 -2.920 0.004 -0.561 -0.110

ROG_CP_perc 0.0299 0.089 0.335 0.737 -0.145 0.205

Value_of_Output_by_Gross_Block -0.4048 0.203 -1.995 0.046 -0.803 -0.007

Fixed_Assets_Ratio_Latest -0.0879 0.197 -0.445 0.656 -0.475 0.299

Adjusted_PAT -0.5008 0.152 -3.297 0.001 -0.799 -0.203

ROG_Capital_Employed_perc 0.3018 0.128 2.357 0.018 0.051 0.553

ROG_Net_Worth_perc -0.2198 0.126 -1.739 0.082 -0.468 0.028

Interest_Cover_Ratio_Latest -0.4158 0.148 -2.819 0.005 -0.705 -0.127

Selling_Cost 0.1335 0.132 1.009 0.313 -0.126 0.393

ROG_Total_Assets_perc -0.1904 0.117 -1.624 0.104 -0.420 0.039

Debtors_Ratio_Latest -0.2208 0.120 -1.840 0.066 -0.456 0.014

Inventory_Ratio_Latest -0.0765 0.118 -0.648 0.517 -0.308 0.155

Other_Income -0.1173 0.109 -1.080 0.280 -0.330 0.096

Net_Working_Capital -0.3284 0.100 -3.274 0.001 -0.525 -0.132

Debtors_Velocity_Days 0.0305 0.101 0.301 0.764 -0.168 0.229

Total_Debt 0.6751 0.100 6.780 0.000 0.480 0.870

ROG_Cost_of_Production_perc -0.2268 0.097 -2.333 0.020 -0.417 -0.036

Current_Ratio_Latest -0.7196 0.129 -5.595 0.000 -0.972 -0.468

ROG_Gross_Block_perc 0.0436 0.114 0.383 0.702 -0.180 0.267

Creditors_Velocity_Days 0.0946 0.095 0.995 0.320 -0.092 0.281

Cash_Flow_From_Financing_Activities -0.0271 0.091 -0.297 0.767 -0.206 0.152

Cash_Flow_From_Investing_Activities 0.1942 0.097 1.992 0.046 0.003 0.385

ROG_Market_Capitalisation_perc -0.0350 0.094 -0.371 0.711 -0.220 0.150

Equity_Paid_Up -0.1519 0.088 -1.722 0.085 -0.325 0.021

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Debtors_Velocity_Days" has the highest p-value (0.764) and is insignificant, therefore, we need
to eliminate it.

Model_6

In [109]:

atest+Selling_Cost+ROG_Total_Assets_perc+Debtors_Ratio_Latest+Inventory_Ratio_Latest+

In [110]:

model_6 = SM.logit(formula = f_6,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125514

Iterations 10

In [111]:

model_6.summary()

Out[111]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3558

Method: MLE Df Model: 27

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6156

Time: 12:48:56 Log-Likelihood: -450.09

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 2.556e-287

coef std err z P>|z| [0.025 0.975]

Intercept -5.6650 0.270 -20.958 0.000 -6.195 -5.135

Book_Value_Adj_Unit_Curr -1.2442 0.573 -2.173 0.030 -2.367 -0.122

Book_Value_Unit_Curr -1.6632 0.582 -2.856 0.004 -2.805 -0.522

Value_of_Output_by_Total_Assets 0.3746 0.160 2.338 0.019 0.061 0.689

Total_Asset_Turnover_Ratio_Latest -0.1260 0.147 -0.857 0.391 -0.414 0.162

CPM_perc_Latest -0.3343 0.115 -2.912 0.004 -0.559 -0.109

ROG_CP_perc 0.0288 0.089 0.323 0.747 -0.146 0.203

Value_of_Output_by_Gross_Block -0.4037 0.202 -1.994 0.046 -0.801 -0.007

Fixed_Assets_Ratio_Latest -0.0879 0.197 -0.446 0.655 -0.474 0.298

Adjusted_PAT -0.4992 0.152 -3.289 0.001 -0.797 -0.202

ROG_Capital_Employed_perc 0.3003 0.128 2.347 0.019 0.050 0.551

ROG_Net_Worth_perc -0.2194 0.126 -1.736 0.083 -0.467 0.028

Interest_Cover_Ratio_Latest -0.4169 0.147 -2.829 0.005 -0.706 -0.128

Selling_Cost 0.1320 0.132 0.998 0.318 -0.127 0.391

ROG_Total_Assets_perc -0.1880 0.117 -1.607 0.108 -0.417 0.041

Debtors_Ratio_Latest -0.2121 0.116 -1.825 0.068 -0.440 0.016

Inventory_Ratio_Latest -0.0725 0.117 -0.618 0.537 -0.302 0.157

Other_Income -0.1157 0.108 -1.067 0.286 -0.328 0.097

Net_Working_Capital -0.3230 0.099 -3.275 0.001 -0.516 -0.130

Total_Debt 0.6751 0.100 6.781 0.000 0.480 0.870

ROG_Cost_of_Production_perc -0.2275 0.097 -2.341 0.019 -0.418 -0.037

Current_Ratio_Latest -0.7201 0.129 -5.602 0.000 -0.972 -0.468

ROG_Gross_Block_perc 0.0424 0.114 0.373 0.709 -0.180 0.265

Creditors_Velocity_Days 0.1012 0.092 1.095 0.274 -0.080 0.282

Cash_Flow_From_Financing_Activities -0.0282 0.091 -0.309 0.757 -0.207 0.150

Cash_Flow_From_Investing_Activities 0.1931 0.097 1.983 0.047 0.002 0.384

ROG_Market_Capitalisation_perc -0.0355 0.094 -0.376 0.707 -0.221 0.150

Equity_Paid_Up -0.1527 0.088 -1.732 0.083 -0.325 0.020

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Cash_Flow_From_Financing_Activities" has the highest p-value (0.757) and is insignificant;
therefore, we need to eliminate it.

Model_7

In [112]:

of_Production_perc+Current_Ratio_Latest+ROG_Gross_Block_perc+Creditors_Velocity_Days+

In [113]:

model_7= SM.logit(formula = f_7,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125528

Iterations 10

In [114]:

model_7.summary()

Out[114]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3559

Method: MLE Df Model: 26

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6156

Time: 12:50:33 Log-Likelihood: -450.14

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 3.561e-288

coef std err z P>|z| [0.025 0.975]

Intercept -5.6640 0.270 -20.966 0.000 -6.194 -5.135

Book_Value_Adj_Unit_Curr -1.2437 0.573 -2.171 0.030 -2.366 -0.121

Book_Value_Unit_Curr -1.6631 0.582 -2.855 0.004 -2.805 -0.521

Value_of_Output_by_Total_Assets 0.3764 0.160 2.352 0.019 0.063 0.690

Total_Asset_Turnover_Ratio_Latest -0.1269 0.147 -0.864 0.388 -0.415 0.161

CPM_perc_Latest -0.3329 0.115 -2.902 0.004 -0.558 -0.108

ROG_CP_perc 0.0301 0.089 0.338 0.735 -0.144 0.204

Value_of_Output_by_Gross_Block -0.4027 0.202 -1.993 0.046 -0.799 -0.007

Fixed_Assets_Ratio_Latest -0.0909 0.196 -0.463 0.644 -0.476 0.294

Adjusted_PAT -0.4971 0.152 -3.280 0.001 -0.794 -0.200

ROG_Capital_Employed_perc 0.2932 0.126 2.330 0.020 0.047 0.540

ROG_Net_Worth_perc -0.2171 0.126 -1.721 0.085 -0.464 0.030

Interest_Cover_Ratio_Latest -0.4172 0.147 -2.832 0.005 -0.706 -0.128

Selling_Cost 0.1297 0.132 0.982 0.326 -0.129 0.388

ROG_Total_Assets_perc -0.1903 0.117 -1.629 0.103 -0.419 0.039

Debtors_Ratio_Latest -0.2107 0.116 -1.814 0.070 -0.438 0.017

Inventory_Ratio_Latest -0.0728 0.117 -0.621 0.535 -0.303 0.157

Other_Income -0.1151 0.108 -1.061 0.289 -0.328 0.097

Net_Working_Capital -0.3239 0.099 -3.285 0.001 -0.517 -0.131

Total_Debt 0.6756 0.100 6.788 0.000 0.481 0.871

ROG_Cost_of_Production_perc -0.2260 0.097 -2.329 0.020 -0.416 -0.036

Current_Ratio_Latest -0.7187 0.128 -5.594 0.000 -0.970 -0.467

ROG_Gross_Block_perc 0.0410 0.114 0.360 0.719 -0.182 0.264

Creditors_Velocity_Days 0.1032 0.092 1.119 0.263 -0.078 0.284

Cash_Flow_From_Investing_Activities 0.1917 0.097 1.971 0.049 0.001 0.382

ROG_Market_Capitalisation_perc -0.0344 0.094 -0.365 0.715 -0.219 0.150

Equity_Paid_Up -0.1531 0.088 -1.738 0.082 -0.326 0.020

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "ROG_CP_perc" has the highest p-value (0.735) and is insignificant; therefore, we eliminate it.
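The same "fit, find the highest p-value, drop, refit" step is repeated by hand for every model that follows. A minimal sketch of that loop is given here, assuming a stopping threshold of 0.05 (the notebook does not state one). The names `backward_eliminate` and `statsmodels_pvalues` are hypothetical helpers, not part of the original notebook.

```python
from typing import Callable, Dict, List, Tuple


def backward_eliminate(
    predictors: List[str],
    fit_pvalues: Callable[[List[str]], Dict[str, float]],
    alpha: float = 0.05,  # assumed threshold; the notebook does not state one
) -> Tuple[List[str], List[Tuple[str, float]]]:
    """Refit and drop the predictor with the highest p-value until all are <= alpha."""
    kept = list(predictors)
    dropped: List[Tuple[str, float]] = []
    while kept:
        pvalues = fit_pvalues(kept)
        # Only predictors are candidates; the intercept is never dropped.
        worst = max(kept, key=lambda name: pvalues[name])
        if pvalues[worst] <= alpha:
            break
        kept.remove(worst)
        dropped.append((worst, pvalues[worst]))
    return kept, dropped


def statsmodels_pvalues(data, target: str = "default"):
    """Return a fit_pvalues callable backed by a statsmodels logit fit (hypothetical wrapper)."""
    import statsmodels.formula.api as SM  # lazy import, same alias as the notebook

    def fit(columns: List[str]) -> Dict[str, float]:
        formula = target + "~" + "+".join(columns)
        result = SM.logit(formula=formula, data=data).fit(disp=0)
        return result.pvalues.drop("Intercept").to_dict()

    return fit
```

A call such as `backward_eliminate(columns, statsmodels_pvalues(Company_imputed))` would return the surviving predictors and the order in which the others were dropped. Because the model is refitted after every removal, p-values are recomputed each round, as in the manual steps.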

Model_8

In [115]:

s+Total_Asset_Turnover_Ratio_Latest+CPM_perc_Latest+Value_of_Output_by_Gross_Block+ F

In [116]:

model_8= SM.logit(formula = f_8,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125544

Iterations 10

In [117]:

model_8.summary()

Out[117]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3560

Method: MLE Df Model: 25

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6155

Time: 12:52:13 Log-Likelihood: -450.20

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 4.908e-289

coef std err z P>|z| [0.025 0.975]

Intercept -5.6667 0.270 -20.975 0.000 -6.196 -5.137

Book_Value_Adj_Unit_Curr -1.2437 0.572 -2.174 0.030 -2.365 -0.123

Book_Value_Unit_Curr -1.6651 0.582 -2.862 0.004 -2.806 -0.525

Value_of_Output_by_Total_Assets 0.3767 0.160 2.350 0.019 0.063 0.691

Total_Asset_Turnover_Ratio_Latest -0.1266 0.147 -0.861 0.389 -0.415 0.162

CPM_perc_Latest -0.3306 0.114 -2.889 0.004 -0.555 -0.106

Value_of_Output_by_Gross_Block -0.4034 0.202 -1.995 0.046 -0.800 -0.007

Fixed_Assets_Ratio_Latest -0.0897 0.196 -0.457 0.648 -0.475 0.295

Adjusted_PAT -0.4958 0.151 -3.274 0.001 -0.793 -0.199

ROG_Capital_Employed_perc 0.2937 0.126 2.333 0.020 0.047 0.540

ROG_Net_Worth_perc -0.2132 0.126 -1.698 0.089 -0.459 0.033

Interest_Cover_Ratio_Latest -0.4153 0.147 -2.821 0.005 -0.704 -0.127

Selling_Cost 0.1272 0.132 0.965 0.335 -0.131 0.386

ROG_Total_Assets_perc -0.1894 0.117 -1.623 0.105 -0.418 0.039

Debtors_Ratio_Latest -0.2126 0.116 -1.831 0.067 -0.440 0.015

Inventory_Ratio_Latest -0.0735 0.117 -0.627 0.530 -0.303 0.156

Other_Income -0.1148 0.108 -1.059 0.289 -0.327 0.098

Net_Working_Capital -0.3227 0.098 -3.277 0.001 -0.516 -0.130

Total_Debt 0.6774 0.099 6.816 0.000 0.483 0.872

ROG_Cost_of_Production_perc -0.2256 0.097 -2.325 0.020 -0.416 -0.035

Current_Ratio_Latest -0.7197 0.128 -5.601 0.000 -0.971 -0.468

ROG_Gross_Block_perc 0.0408 0.114 0.359 0.720 -0.182 0.264

Creditors_Velocity_Days 0.1025 0.092 1.111 0.266 -0.078 0.283

Cash_Flow_From_Investing_Activities 0.1935 0.097 1.991 0.046 0.003 0.384

ROG_Market_Capitalisation_perc -0.0338 0.094 -0.359 0.720 -0.219 0.151

Equity_Paid_Up -0.1542 0.088 -1.751 0.080 -0.327 0.018

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "ROG_Gross_Block_perc" has the highest p-value (0.720, tied with "ROG_Market_Capitalisation_perc" at three decimal places) and is insignificant; therefore, we eliminate it.
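The header statistics in the Model_8 summary can be cross-checked by hand. The sketch below assumes that "Pseudo R-squ." is McFadden's pseudo R-squared and that "Current function value" is the average negative log-likelihood per observation. The inputs are the rounded figures printed in the summary, so agreement is only to the printed precision.

```python
# Values copied from the Model_8 summary output.
n_obs = 3586
log_likelihood = -450.20
ll_null = -1171.0

# McFadden's pseudo R-squared: 1 - LL_model / LL_null.
pseudo_r2 = 1.0 - log_likelihood / ll_null

# Average negative log-likelihood per observation, as reported by the optimizer.
current_function_value = -log_likelihood / n_obs

print(f"Pseudo R-squ.: {pseudo_r2:.4f}")
print(f"Current function value: {current_function_value:.4f}")
```

Because the null log-likelihood is the same for every model, the small changes in pseudo R-squared from one model to the next track the change in log-likelihood directly.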

Model_9

In [118]:

l+Total_Debt+ROG_Cost_of_Production_perc+Current_Ratio_Latest+Creditors_Velocity_Days

In [119]:

model_9= SM.logit(formula = f_9,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125562

Iterations 10

In [120]:

model_9.summary()

Out[120]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3561

Method: MLE Df Model: 24

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6155

Time: 12:53:34 Log-Likelihood: -450.26

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 6.673e-290

coef std err z P>|z| [0.025 0.975]

Intercept -5.6660 0.270 -20.975 0.000 -6.195 -5.137

Book_Value_Adj_Unit_Curr -1.2395 0.573 -2.162 0.031 -2.363 -0.116

Book_Value_Unit_Curr -1.6645 0.583 -2.854 0.004 -2.808 -0.522

Value_of_Output_by_Total_Assets 0.3767 0.160 2.349 0.019 0.062 0.691

Total_Asset_Turnover_Ratio_Latest -0.1251 0.147 -0.851 0.395 -0.413 0.163

CPM_perc_Latest -0.3274 0.114 -2.867 0.004 -0.551 -0.104

Value_of_Output_by_Gross_Block -0.3987 0.202 -1.979 0.048 -0.794 -0.004

Fixed_Assets_Ratio_Latest -0.0889 0.196 -0.453 0.650 -0.473 0.296

Adjusted_PAT -0.4961 0.152 -3.274 0.001 -0.793 -0.199

ROG_Capital_Employed_perc 0.2970 0.125 2.367 0.018 0.051 0.543

ROG_Net_Worth_perc -0.2144 0.126 -1.707 0.088 -0.460 0.032

Interest_Cover_Ratio_Latest -0.4150 0.147 -2.820 0.005 -0.703 -0.127

Selling_Cost 0.1310 0.131 0.997 0.319 -0.127 0.389

ROG_Total_Assets_perc -0.1859 0.116 -1.600 0.110 -0.414 0.042

Debtors_Ratio_Latest -0.2127 0.116 -1.831 0.067 -0.440 0.015

Inventory_Ratio_Latest -0.0731 0.117 -0.624 0.533 -0.303 0.157

Other_Income -0.1142 0.108 -1.054 0.292 -0.327 0.098

Net_Working_Capital -0.3225 0.099 -3.274 0.001 -0.516 -0.129

Total_Debt 0.6770 0.099 6.815 0.000 0.482 0.872

ROG_Cost_of_Production_perc -0.2226 0.097 -2.303 0.021 -0.412 -0.033

Current_Ratio_Latest -0.7212 0.128 -5.615 0.000 -0.973 -0.469

Creditors_Velocity_Days 0.1030 0.092 1.118 0.264 -0.078 0.284

Cash_Flow_From_Investing_Activities 0.1864 0.095 1.961 0.050 6.43e-05 0.373

ROG_Market_Capitalisation_perc -0.0365 0.094 -0.389 0.697 -0.221 0.148

Equity_Paid_Up -0.1551 0.088 -1.762 0.078 -0.328 0.017

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "ROG_Market_Capitalisation_perc" has the highest p-value (0.697) and is insignificant; therefore, we eliminate it.
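Every summary so far ends with a quasi-separation warning that quotes a fraction of "perfectly predicted" observations. A rough way to reproduce such a fraction by hand is to count the observations whose fitted probability is practically equal to the observed label. The tolerance of 1e-4 and the helper name `fraction_perfectly_predicted` are assumptions made for this sketch, not values taken from the notebook.

```python
from typing import Sequence


def fraction_perfectly_predicted(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    tol: float = 1e-4,  # assumed tolerance for "perfectly predicted"
) -> float:
    """Share of observations whose fitted probability lies within tol of the label."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    close = sum(1 for y, p in zip(y_true, y_prob) if abs(y - p) < tol)
    return close / len(y_true)
```

With a fitted model this could be called as `fraction_perfectly_predicted(Company_imputed['default'], model_9.predict())`, assuming `predict()` with no arguments returns the in-sample fitted probabilities. A large share suggests that some coefficients and standard errors may be poorly identified, so the p-values used for elimination should be read with some caution.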

Model_10

In [121]:

ncome+ Net_Working_Capital+Total_Debt+ROG_Cost_of_Production_perc+Current_Ratio_Lates

In [122]:

model_10= SM.logit(formula = f_10,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125583

Iterations 10

In [123]:

model_10.summary()

Out[123]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3562

Method: MLE Df Model: 23

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6154

Time: 12:54:57 Log-Likelihood: -450.34

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 8.978e-291

coef std err z P>|z| [0.025 0.975]

Intercept -5.6620 0.270 -20.985 0.000 -6.191 -5.133

Book_Value_Adj_Unit_Curr -1.2322 0.573 -2.150 0.032 -2.356 -0.109

Book_Value_Unit_Curr -1.6718 0.583 -2.866 0.004 -2.815 -0.529

Value_of_Output_by_Total_Assets 0.3735 0.160 2.333 0.020 0.060 0.687

Total_Asset_Turnover_Ratio_Latest -0.1246 0.147 -0.847 0.397 -0.413 0.164

CPM_perc_Latest -0.3280 0.114 -2.870 0.004 -0.552 -0.104

Value_of_Output_by_Gross_Block -0.3990 0.202 -1.979 0.048 -0.794 -0.004

Fixed_Assets_Ratio_Latest -0.0874 0.196 -0.445 0.656 -0.472 0.297

Adjusted_PAT -0.4969 0.151 -3.284 0.001 -0.793 -0.200

ROG_Capital_Employed_perc 0.2978 0.125 2.374 0.018 0.052 0.544

ROG_Net_Worth_perc -0.2137 0.125 -1.703 0.088 -0.460 0.032

Interest_Cover_Ratio_Latest -0.4146 0.147 -2.817 0.005 -0.703 -0.126

Selling_Cost 0.1248 0.131 0.956 0.339 -0.131 0.381

ROG_Total_Assets_perc -0.1872 0.116 -1.612 0.107 -0.415 0.040

Debtors_Ratio_Latest -0.2135 0.116 -1.840 0.066 -0.441 0.014

Inventory_Ratio_Latest -0.0739 0.117 -0.630 0.529 -0.304 0.156

Other_Income -0.1151 0.108 -1.062 0.288 -0.327 0.097

Net_Working_Capital -0.3188 0.098 -3.252 0.001 -0.511 -0.127

Total_Debt 0.6741 0.099 6.808 0.000 0.480 0.868

ROG_Cost_of_Production_perc -0.2210 0.097 -2.288 0.022 -0.410 -0.032

Current_Ratio_Latest -0.7247 0.128 -5.652 0.000 -0.976 -0.473

Creditors_Velocity_Days 0.1018 0.092 1.105 0.269 -0.079 0.282

Cash_Flow_From_Investing_Activities 0.1878 0.095 1.976 0.048 0.002 0.374

Equity_Paid_Up -0.1569 0.088 -1.785 0.074 -0.329 0.015

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Fixed_Assets_Ratio_Latest" has the highest p-value (0.656) and is insignificant; therefore, we eliminate it.
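The z, P>|z| and confidence-interval columns in the summary follow from the coefficient and its standard error. The sketch below recomputes them for the Fixed_Assets_Ratio_Latest row of Model_10, assuming the usual Wald test with a standard normal reference distribution. The inputs are the rounded table values, so the results agree only to about three decimals.

```python
import math

# Fixed_Assets_Ratio_Latest row copied from the Model_10 summary.
coef, std_err = -0.0874, 0.196

# Wald z statistic: estimate divided by its standard error.
z = coef / std_err

# Two-sided p-value under a standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2.0))

# 95% confidence interval; 1.959964 is the 97.5th percentile of the standard normal.
z_crit = 1.959964
ci_low = coef - z_crit * std_err
ci_high = coef + z_crit * std_err

print(f"z = {z:.3f}, p = {p_value:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval straddles zero, which is another way of seeing why this predictor is treated as insignificant.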

Model_11

In [124]:

nover_Ratio_Latest+CPM_perc_Latest+Value_of_Output_by_Gross_Block+ Adjusted_PAT+ROG_C

In [125]:

model_11= SM.logit(formula = f_11,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125611

Iterations 10

In [126]:

model_11.summary()

Out[126]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3563

Method: MLE Df Model: 22

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6153

Time: 12:57:12 Log-Likelihood: -450.44

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.209e-291

coef std err z P>|z| [0.025 0.975]

Intercept -5.6618 0.270 -20.970 0.000 -6.191 -5.133

Book_Value_Adj_Unit_Curr -1.2280 0.575 -2.137 0.033 -2.354 -0.102

Book_Value_Unit_Curr -1.6779 0.585 -2.870 0.004 -2.824 -0.532

Value_of_Output_by_Total_Assets 0.3714 0.160 2.324 0.020 0.058 0.685

Total_Asset_Turnover_Ratio_Latest -0.1302 0.147 -0.888 0.374 -0.417 0.157

CPM_perc_Latest -0.3254 0.114 -2.852 0.004 -0.549 -0.102

Value_of_Output_by_Gross_Block -0.4674 0.132 -3.532 0.000 -0.727 -0.208

Adjusted_PAT -0.4960 0.151 -3.276 0.001 -0.793 -0.199

ROG_Capital_Employed_perc 0.2958 0.125 2.362 0.018 0.050 0.541

ROG_Net_Worth_perc -0.2120 0.125 -1.691 0.091 -0.458 0.034

Interest_Cover_Ratio_Latest -0.4202 0.147 -2.866 0.004 -0.708 -0.133

Selling_Cost 0.1241 0.131 0.950 0.342 -0.132 0.380

ROG_Total_Assets_perc -0.1864 0.116 -1.607 0.108 -0.414 0.041

Debtors_Ratio_Latest -0.2167 0.116 -1.874 0.061 -0.443 0.010

Inventory_Ratio_Latest -0.0739 0.117 -0.631 0.528 -0.303 0.156

Other_Income -0.1147 0.108 -1.058 0.290 -0.327 0.098

Net_Working_Capital -0.3192 0.098 -3.258 0.001 -0.511 -0.127

Total_Debt 0.6755 0.099 6.831 0.000 0.482 0.869

ROG_Cost_of_Production_perc -0.2204 0.097 -2.283 0.022 -0.410 -0.031

Current_Ratio_Latest -0.7266 0.128 -5.669 0.000 -0.978 -0.475

Creditors_Velocity_Days 0.0997 0.092 1.083 0.279 -0.081 0.280

Cash_Flow_From_Investing_Activities 0.1880 0.095 1.978 0.048 0.002 0.374

Equity_Paid_Up -0.1567 0.088 -1.783 0.075 -0.329 0.016

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Inventory_Ratio_Latest" has the highest p-value (0.528) and is insignificant; therefore, we eliminate it.

Model_12

In [127]:

Interest_Cover_Ratio_Latest+Selling_Cost+ROG_Total_Assets_perc+Debtors_Ratio_Latest+O

In [128]:

model_12= SM.logit(formula = f_12,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125666

Iterations 10

In [129]:

model_12.summary()

Out[129]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3564

Method: MLE Df Model: 21

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6152

Time: 12:58:51 Log-Likelihood: -450.64

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.755e-292

coef std err z P>|z| [0.025 0.975]

Intercept -5.6604 0.270 -20.975 0.000 -6.189 -5.131

Book_Value_Adj_Unit_Curr -1.2303 0.574 -2.143 0.032 -2.356 -0.105

Book_Value_Unit_Curr -1.6766 0.585 -2.868 0.004 -2.822 -0.531

Value_of_Output_by_Total_Assets 0.3541 0.157 2.250 0.024 0.046 0.662

Total_Asset_Turnover_Ratio_Latest -0.1431 0.145 -0.985 0.325 -0.428 0.142

CPM_perc_Latest -0.3288 0.114 -2.881 0.004 -0.553 -0.105

Value_of_Output_by_Gross_Block -0.4640 0.132 -3.504 0.000 -0.724 -0.204

Adjusted_PAT -0.4992 0.151 -3.299 0.001 -0.796 -0.203

ROG_Capital_Employed_perc 0.2950 0.125 2.359 0.018 0.050 0.540

ROG_Net_Worth_perc -0.2091 0.125 -1.671 0.095 -0.454 0.036

Interest_Cover_Ratio_Latest -0.4185 0.146 -2.859 0.004 -0.705 -0.132

Selling_Cost 0.1183 0.131 0.905 0.365 -0.138 0.374

ROG_Total_Assets_perc -0.1825 0.116 -1.577 0.115 -0.409 0.044

Debtors_Ratio_Latest -0.2333 0.113 -2.069 0.039 -0.454 -0.012

Other_Income -0.1198 0.108 -1.108 0.268 -0.332 0.092

Net_Working_Capital -0.3211 0.098 -3.281 0.001 -0.513 -0.129

Total_Debt 0.6698 0.098 6.808 0.000 0.477 0.863

ROG_Cost_of_Production_perc -0.2213 0.097 -2.291 0.022 -0.411 -0.032

Current_Ratio_Latest -0.7224 0.128 -5.644 0.000 -0.973 -0.472

Creditors_Velocity_Days 0.0973 0.092 1.059 0.290 -0.083 0.277

Cash_Flow_From_Investing_Activities 0.1927 0.095 2.033 0.042 0.007 0.378

Equity_Paid_Up -0.1582 0.088 -1.802 0.072 -0.330 0.014

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Selling_Cost" has the highest p-value (0.365) and is insignificant; therefore, we eliminate it.

Model_13

In [130]:

f_13='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total

In [131]:

model_13= SM.logit(formula = f_13,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125780

Iterations 10

In [132]:

model_13.summary()

Out[132]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3565

Method: MLE Df Model: 20

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6148

Time: 13:00:29 Log-Likelihood: -451.05

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 3.049e-293

coef std err z P>|z| [0.025 0.975]

Intercept -5.6558 0.270 -20.942 0.000 -6.185 -5.126

Book_Value_Adj_Unit_Curr -1.2200 0.571 -2.137 0.033 -2.339 -0.101

Book_Value_Unit_Curr -1.6763 0.581 -2.884 0.004 -2.816 -0.537

Value_of_Output_by_Total_Assets 0.3653 0.157 2.331 0.020 0.058 0.672

Total_Asset_Turnover_Ratio_Latest -0.1379 0.145 -0.953 0.340 -0.421 0.146

CPM_perc_Latest -0.3304 0.114 -2.894 0.004 -0.554 -0.107

Value_of_Output_by_Gross_Block -0.4651 0.132 -3.521 0.000 -0.724 -0.206

Adjusted_PAT -0.4753 0.149 -3.186 0.001 -0.768 -0.183

ROG_Capital_Employed_perc 0.2915 0.125 2.331 0.020 0.046 0.537

ROG_Net_Worth_perc -0.2247 0.124 -1.812 0.070 -0.468 0.018

Interest_Cover_Ratio_Latest -0.4158 0.146 -2.846 0.004 -0.702 -0.129

ROG_Total_Assets_perc -0.1807 0.116 -1.560 0.119 -0.408 0.046

Debtors_Ratio_Latest -0.2186 0.111 -1.962 0.050 -0.437 -0.000

Other_Income -0.0877 0.102 -0.857 0.391 -0.288 0.113

Net_Working_Capital -0.3152 0.097 -3.235 0.001 -0.506 -0.124

Total_Debt 0.6717 0.098 6.832 0.000 0.479 0.864

ROG_Cost_of_Production_perc -0.2223 0.097 -2.299 0.021 -0.412 -0.033

Current_Ratio_Latest -0.7291 0.128 -5.703 0.000 -0.980 -0.479

Creditors_Velocity_Days 0.1027 0.092 1.120 0.263 -0.077 0.282

Cash_Flow_From_Investing_Activities 0.1928 0.095 2.039 0.041 0.008 0.378

Equity_Paid_Up -0.1569 0.088 -1.790 0.073 -0.329 0.015

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Other_Income" has the highest p-value (0.391) and is insignificant; therefore, we eliminate it.
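Each elimination can also be checked with a likelihood-ratio test between the two nested models. The sketch below compares Model_12 (with Selling_Cost) and Model_13 (without it), using the rounded log-likelihoods from their summaries. The helper name `lr_test_one_df` is hypothetical, and the closed form used for the p-value only holds when exactly one parameter is dropped.

```python
import math
from typing import Tuple


def lr_test_one_df(ll_full: float, ll_reduced: float) -> Tuple[float, float]:
    """Likelihood-ratio statistic and p-value for dropping a single parameter."""
    statistic = 2.0 * (ll_full - ll_reduced)
    if statistic < 0:
        raise ValueError("the reduced model cannot fit better than the full model")
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Log-likelihoods copied from the Model_12 and Model_13 summaries.
stat, p = lr_test_one_df(ll_full=-450.64, ll_reduced=-451.05)
print(f"LR statistic = {stat:.2f}, p-value = {p:.3f}")
```

The resulting p-value is close to the Wald p-value reported for Selling_Cost in Model_12, which is expected when the sample is large and only one term is removed.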

Model_15

In [133]:

s+Total_Asset_Turnover_Ratio_Latest+CPM_perc_Latest+Value_of_Output_by_Gross_Block+ A

In [134]:

model_15= SM.logit(formula = f_15,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125883

Iterations 10

In [135]:

model_15.summary()

Out[135]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3566

Method: MLE Df Model: 19

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6145

Time: 13:03:51 Log-Likelihood: -451.42

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 4.976e-294

coef std err z P>|z| [0.025 0.975]

Intercept -5.6461 0.269 -20.990 0.000 -6.173 -5.119

Book_Value_Adj_Unit_Curr -1.2055 0.572 -2.108 0.035 -2.326 -0.085

Book_Value_Unit_Curr -1.6875 0.583 -2.895 0.004 -2.830 -0.545

Value_of_Output_by_Total_Assets 0.3565 0.157 2.275 0.023 0.049 0.664

Total_Asset_Turnover_Ratio_Latest -0.1269 0.144 -0.880 0.379 -0.410 0.156

CPM_perc_Latest -0.3289 0.114 -2.887 0.004 -0.552 -0.106

Value_of_Output_by_Gross_Block -0.4692 0.132 -3.549 0.000 -0.728 -0.210

Adjusted_PAT -0.4977 0.147 -3.388 0.001 -0.786 -0.210

ROG_Capital_Employed_perc 0.3005 0.125 2.413 0.016 0.056 0.545

ROG_Net_Worth_perc -0.2228 0.124 -1.791 0.073 -0.467 0.021

Interest_Cover_Ratio_Latest -0.4170 0.146 -2.859 0.004 -0.703 -0.131

ROG_Total_Assets_perc -0.1802 0.116 -1.555 0.120 -0.407 0.047

Debtors_Ratio_Latest -0.2241 0.111 -2.016 0.044 -0.442 -0.006

Net_Working_Capital -0.3208 0.097 -3.304 0.001 -0.511 -0.131

Total_Debt 0.6558 0.096 6.804 0.000 0.467 0.845

ROG_Cost_of_Production_perc -0.2176 0.097 -2.255 0.024 -0.407 -0.028

Current_Ratio_Latest -0.7153 0.127 -5.651 0.000 -0.963 -0.467

Creditors_Velocity_Days 0.0928 0.091 1.019 0.308 -0.086 0.271

Cash_Flow_From_Investing_Activities 0.1851 0.094 1.965 0.049 0.001 0.370

Equity_Paid_Up -0.1541 0.088 -1.759 0.079 -0.326 0.018

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Total_Asset_Turnover_Ratio_Latest" has the highest p-value (0.379) and is insignificant; therefore, we eliminate it.

Model_16

In [136]:

'default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total_Asse

In [137]:

model_16= SM.logit(formula = f_16,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.125992

Iterations 10

In [138]:

model_16.summary()

Out[138]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3567

Method: MLE Df Model: 18

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6142

Time: 13:05:36 Log-Likelihood: -451.81

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 8.090e-295

coef std err z P>|z| [0.025 0.975]

Intercept -5.6564 0.269 -21.017 0.000 -6.184 -5.129

Book_Value_Adj_Unit_Curr -1.1990 0.573 -2.094 0.036 -2.321 -0.077

Book_Value_Unit_Curr -1.6921 0.584 -2.899 0.004 -2.836 -0.548

Value_of_Output_by_Total_Assets 0.2756 0.127 2.178 0.029 0.028 0.524

CPM_perc_Latest -0.3329 0.114 -2.927 0.003 -0.556 -0.110

Value_of_Output_by_Gross_Block -0.4767 0.132 -3.610 0.000 -0.736 -0.218

Adjusted_PAT -0.5025 0.147 -3.420 0.001 -0.791 -0.215

ROG_Capital_Employed_perc 0.3068 0.124 2.468 0.014 0.063 0.550

ROG_Net_Worth_perc -0.2252 0.124 -1.810 0.070 -0.469 0.019

Interest_Cover_Ratio_Latest -0.4303 0.145 -2.961 0.003 -0.715 -0.145

ROG_Total_Assets_perc -0.1817 0.116 -1.570 0.117 -0.409 0.045

Debtors_Ratio_Latest -0.2327 0.111 -2.102 0.036 -0.450 -0.016

Net_Working_Capital -0.3301 0.096 -3.421 0.001 -0.519 -0.141

Total_Debt 0.6586 0.096 6.843 0.000 0.470 0.847

ROG_Cost_of_Production_perc -0.2163 0.096 -2.244 0.025 -0.405 -0.027

Current_Ratio_Latest -0.7130 0.127 -5.629 0.000 -0.961 -0.465

Creditors_Velocity_Days 0.0826 0.090 0.915 0.360 -0.094 0.259

Cash_Flow_From_Investing_Activities 0.1832 0.094 1.951 0.051 -0.001 0.367

Equity_Paid_Up -0.1526 0.087 -1.745 0.081 -0.324 0.019

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Creditors_Velocity_Days" has the highest p-value (0.360) and is insignificant; therefore, we eliminate it.

Model_17

In [139]:

f_17='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total

In [140]:

model_17= SM.logit(formula = f_17,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.126109

Iterations 10

In [141]:

model_17.summary()

Out[141]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3568

Method: MLE Df Model: 17

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6138

Time: 13:07:28 Log-Likelihood: -452.23

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.311e-295

coef std err z P>|z| [0.025 0.975]

Intercept -5.6456 0.268 -21.070 0.000 -6.171 -5.120

Book_Value_Adj_Unit_Curr -1.2016 0.569 -2.112 0.035 -2.317 -0.087

Book_Value_Unit_Curr -1.6802 0.579 -2.900 0.004 -2.815 -0.545

Value_of_Output_by_Total_Assets 0.2833 0.126 2.250 0.024 0.037 0.530

CPM_perc_Latest -0.3309 0.114 -2.915 0.004 -0.553 -0.108

Value_of_Output_by_Gross_Block -0.4730 0.132 -3.594 0.000 -0.731 -0.215

Adjusted_PAT -0.4986 0.147 -3.389 0.001 -0.787 -0.210

ROG_Capital_Employed_perc 0.3032 0.124 2.438 0.015 0.059 0.547

ROG_Net_Worth_perc -0.2260 0.125 -1.813 0.070 -0.470 0.018

Interest_Cover_Ratio_Latest -0.4379 0.145 -3.015 0.003 -0.723 -0.153

ROG_Total_Assets_perc -0.1815 0.116 -1.567 0.117 -0.409 0.046

Debtors_Ratio_Latest -0.2196 0.110 -2.003 0.045 -0.434 -0.005

Net_Working_Capital -0.3232 0.096 -3.364 0.001 -0.512 -0.135

Total_Debt 0.6715 0.095 7.050 0.000 0.485 0.858

ROG_Cost_of_Production_perc -0.2156 0.096 -2.237 0.025 -0.405 -0.027

Current_Ratio_Latest -0.7169 0.126 -5.672 0.000 -0.965 -0.469

Cash_Flow_From_Investing_Activities 0.1752 0.093 1.874 0.061 -0.008 0.358

Equity_Paid_Up -0.1542 0.088 -1.762 0.078 -0.326 0.017

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Equity_Paid_Up" is insignificant at the 5% level (p-value 0.078), so we eliminate it next. Note that "ROG_Total_Assets_perc" actually has the highest p-value in this model (0.117); it is retained for now.

Model_18

In [142]:

f_18='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total

In [143]:

model_18= SM.logit(formula = f_18,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.126544

Iterations 10

In [144]:

model_18.summary()

Out[144]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3569

Method: MLE Df Model: 16

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6125

Time: 13:09:27 Log-Likelihood: -453.79

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 6.370e-296

coef std err z P>|z| [0.025 0.975]

Intercept -5.6166 0.266 -21.127 0.000 -6.138 -5.096

Book_Value_Adj_Unit_Curr -1.2287 0.587 -2.094 0.036 -2.378 -0.079

Book_Value_Unit_Curr -1.6251 0.595 -2.733 0.006 -2.791 -0.460

Value_of_Output_by_Total_Assets 0.2828 0.126 2.248 0.025 0.036 0.529

CPM_perc_Latest -0.3348 0.113 -2.958 0.003 -0.557 -0.113

Value_of_Output_by_Gross_Block -0.4677 0.131 -3.565 0.000 -0.725 -0.211

Adjusted_PAT -0.4995 0.147 -3.391 0.001 -0.788 -0.211

ROG_Capital_Employed_perc 0.2927 0.124 2.370 0.018 0.051 0.535

ROG_Net_Worth_perc -0.2120 0.124 -1.703 0.089 -0.456 0.032

Interest_Cover_Ratio_Latest -0.4334 0.145 -2.985 0.003 -0.718 -0.149

ROG_Total_Assets_perc -0.1752 0.115 -1.522 0.128 -0.401 0.050

Debtors_Ratio_Latest -0.2186 0.110 -1.995 0.046 -0.433 -0.004

Net_Working_Capital -0.3258 0.096 -3.394 0.001 -0.514 -0.138

Total_Debt 0.6591 0.095 6.973 0.000 0.474 0.844

ROG_Cost_of_Production_perc -0.2152 0.096 -2.235 0.025 -0.404 -0.027

Current_Ratio_Latest -0.7113 0.126 -5.650 0.000 -0.958 -0.465

Cash_Flow_From_Investing_Activities 0.1765 0.093 1.893 0.058 -0.006 0.359

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "ROG_Net_Worth_perc" is insignificant at the 5% level (p-value 0.089), so we eliminate it next. Note that "ROG_Total_Assets_perc" actually has the highest p-value in this model (0.128); it is retained for now.
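The coefficients in these summaries are on the log-odds scale, so exponentiating them gives odds ratios that are easier to read. The sketch below does this for three coefficients copied from the Model_18 summary. The unit of "a one-unit increase" depends on how the predictors were scaled earlier in the notebook, which is not shown in this part, so the interpretation is stated per unit of the predictor as it enters the model.

```python
import math

# Coefficients copied from the Model_18 summary (log-odds scale).
coefficients = {
    "Total_Debt": 0.6591,
    "Current_Ratio_Latest": -0.7113,
    "Adjusted_PAT": -0.4995,
}

# Odds ratio for a one-unit increase in each predictor, holding the others fixed.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: odds ratio {ratio:.3f} ({direction} the odds of default)")
```

A ratio above 1 (as for Total_Debt) means the odds of default increase with the predictor, while a ratio below 1 (as for Current_Ratio_Latest) means they decrease.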

Model_19

In [145]:

t_by_Total_Assets+CPM_perc_Latest+Value_of_Output_by_Gross_Block+ Adjusted_PAT+ROG_C

In [146]:

model_19= SM.logit(formula = f_19,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.126952

Iterations 10

In [147]:

model_19.summary()

Out[147]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3570

Method: MLE Df Model: 15

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6112

Time: 13:11:13 Log-Likelihood: -455.25

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 2.732e-296

coef std err z P>|z| [0.025 0.975]

Intercept -5.6353 0.267 -21.079 0.000 -6.159 -5.111

Book_Value_Adj_Unit_Curr -1.1900 0.576 -2.067 0.039 -2.319 -0.061

Book_Value_Unit_Curr -1.6867 0.586 -2.877 0.004 -2.836 -0.538

Value_of_Output_by_Total_Assets 0.2795 0.125 2.233 0.026 0.034 0.525

CPM_perc_Latest -0.3425 0.113 -3.041 0.002 -0.563 -0.122

Value_of_Output_by_Gross_Block -0.4759 0.131 -3.639 0.000 -0.732 -0.220

Adjusted_PAT -0.5869 0.139 -4.225 0.000 -0.859 -0.315

ROG_Capital_Employed_perc 0.2332 0.118 1.979 0.048 0.002 0.464

Interest_Cover_Ratio_Latest -0.4570 0.144 -3.166 0.002 -0.740 -0.174

ROG_Total_Assets_perc -0.1859 0.115 -1.623 0.104 -0.410 0.039

Debtors_Ratio_Latest -0.2163 0.109 -1.982 0.048 -0.430 -0.002

Net_Working_Capital -0.3136 0.096 -3.282 0.001 -0.501 -0.126

Total_Debt 0.6640 0.094 7.052 0.000 0.479 0.849

ROG_Cost_of_Production_perc -0.2263 0.096 -2.359 0.018 -0.414 -0.038

Current_Ratio_Latest -0.7206 0.126 -5.723 0.000 -0.967 -0.474

Cash_Flow_From_Investing_Activities 0.1809 0.093 1.943 0.052 -0.002 0.363

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Cash_Flow_From_Investing_Activities" is insignificant at the 5% level (p-value 0.052), so we eliminate it next. Note that "ROG_Total_Assets_perc" actually has the highest p-value in this model (0.104); it is retained for now.

Model_21

In [148]:

ver_Ratio_Latest+ROG_Total_Assets_perc+Debtors_Ratio_Latest+Net_Working_Capital+Tota

In [149]:

model_21= SM.logit(formula = f_21,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.127482

Iterations 10

In [150]:

model_21.summary()

Out[150]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3571

Method: MLE Df Model: 14

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6096

Time: 13:15:17 Log-Likelihood: -457.15

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.748e-296

coef std err z P>|z| [0.025 0.975]

Intercept -5.6487 0.268 -21.056 0.000 -6.175 -5.123

Book_Value_Adj_Unit_Curr -1.1779 0.574 -2.053 0.040 -2.302 -0.054

Book_Value_Unit_Curr -1.7273 0.585 -2.952 0.003 -2.874 -0.580

Value_of_Output_by_Total_Assets 0.2483 0.124 1.996 0.046 0.004 0.492

CPM_perc_Latest -0.3525 0.112 -3.150 0.002 -0.572 -0.133

Value_of_Output_by_Gross_Block -0.4640 0.130 -3.573 0.000 -0.719 -0.210

Adjusted_PAT -0.5701 0.138 -4.127 0.000 -0.841 -0.299

ROG_Capital_Employed_perc 0.2259 0.117 1.933 0.053 -0.003 0.455

Interest_Cover_Ratio_Latest -0.4618 0.144 -3.208 0.001 -0.744 -0.180

ROG_Total_Assets_perc -0.2086 0.113 -1.843 0.065 -0.430 0.013

Debtors_Ratio_Latest -0.2378 0.109 -2.187 0.029 -0.451 -0.025

Net_Working_Capital -0.3170 0.096 -3.315 0.001 -0.504 -0.130

Total_Debt 0.6544 0.094 6.978 0.000 0.471 0.838

ROG_Cost_of_Production_perc -0.2235 0.096 -2.340 0.019 -0.411 -0.036

Current_Ratio_Latest -0.7155 0.126 -5.690 0.000 -0.962 -0.469

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "ROG_Total_Assets_perc" has the highest p-value (0.065) and is insignificant, therefore, we
need to eliminate it.
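
As a cross-check on the p-value rule, the log-likelihoods printed in the summaries can be compared directly. The sketch below is not part of the original notebook: it recomputes McFadden's pseudo R-squared and the likelihood-ratio statistic for dropping ROG_Total_Assets_perc, using the Log-Likelihood and LL-Null figures reported for model_21 and model_22. The helper names are hypothetical, and 3.841 is the 5% critical value of a chi-square distribution with one degree of freedom.

```python
# Log-likelihoods copied from the statsmodels summaries in this notebook.
LL_NULL = -1171.0      # intercept-only model (LL-Null)
LL_MODEL_21 = -457.15  # 14 predictors, includes ROG_Total_Assets_perc
LL_MODEL_22 = -458.85  # 13 predictors, ROG_Total_Assets_perc dropped

CHI2_CRIT_DF1_5PCT = 3.841  # 95th percentile of chi-square with 1 df


def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - LL_model / LL_null."""
    return 1.0 - ll_model / ll_null


def lr_statistic(ll_full, ll_reduced):
    """Likelihood-ratio statistic for two nested models."""
    return 2.0 * (ll_full - ll_reduced)


r2_21 = mcfadden_r2(LL_MODEL_21, LL_NULL)
r2_22 = mcfadden_r2(LL_MODEL_22, LL_NULL)
lr = lr_statistic(LL_MODEL_21, LL_MODEL_22)

print(f"pseudo R-squared, model_21: {r2_21:.4f}")
print(f"pseudo R-squared, model_22: {r2_22:.4f}")
print(f"LR statistic for the dropped variable: {lr:.2f}")
print("drop is supported at the 5% level:", lr < CHI2_CRIT_DF1_5PCT)
```

The pseudo R-squared values reproduce the 0.6096 and 0.6082 shown in the summaries, and the statistic of about 3.40 stays below 3.841, so by this test the smaller model is not significantly worse.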

Model_22


In [151]:

f_22='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total

In [152]:

model_22= SM.logit(formula = f_22,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.127957

Iterations 10


In [153]:

model_22.summary()

Out[153]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3572

Method: MLE Df Model: 13

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6082

Time: 13:17:48 Log-Likelihood: -458.85

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 8.838e-297

coef std err z P>|z| [0.025 0.975]

Intercept -5.6362 0.267 -21.106 0.000 -6.160 -5.113

Book_Value_Adj_Unit_Curr -1.2116 0.579 -2.091 0.037 -2.347 -0.076

Book_Value_Unit_Curr -1.7024 0.590 -2.887 0.004 -2.858 -0.547

Value_of_Output_by_Total_Assets 0.2341 0.124 1.883 0.060 -0.010 0.478

CPM_perc_Latest -0.3627 0.111 -3.259 0.001 -0.581 -0.145

Value_of_Output_by_Gross_Block -0.4598 0.130 -3.539 0.000 -0.714 -0.205

Adjusted_PAT -0.5876 0.137 -4.276 0.000 -0.857 -0.318

ROG_Capital_Employed_perc 0.1159 0.100 1.159 0.246 -0.080 0.312

Interest_Cover_Ratio_Latest -0.4555 0.144 -3.173 0.002 -0.737 -0.174

Debtors_Ratio_Latest -0.2257 0.108 -2.091 0.037 -0.437 -0.014

Net_Working_Capital -0.3143 0.095 -3.292 0.001 -0.501 -0.127

Total_Debt 0.6533 0.094 6.977 0.000 0.470 0.837

ROG_Cost_of_Production_perc -0.2366 0.095 -2.482 0.013 -0.423 -0.050

Current_Ratio_Latest -0.7078 0.125 -5.649 0.000 -0.953 -0.462

Possibly complete quasi-separation: A fraction 0.18 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "ROG_Capital_Employed_perc" has the highest p-value (0.246) and is insignificant, therefore,
we need to eliminate it.

Model_23

In [154]:

f_23='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+Value_of_Output_by_Total


In [155]:

model_23= SM.logit(formula = f_23,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.128145

Iterations 10

In [156]:

model_23.summary()

Out[156]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3573

Method: MLE Df Model: 12

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6076

Time: 13:19:12 Log-Likelihood: -459.53

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.551e-297

coef std err z P>|z| [0.025 0.975]

Intercept -5.6167 0.265 -21.160 0.000 -6.137 -5.096

Book_Value_Adj_Unit_Curr -1.2181 0.585 -2.083 0.037 -2.364 -0.072

Book_Value_Unit_Curr -1.6831 0.594 -2.834 0.005 -2.847 -0.519

Value_of_Output_by_Total_Assets 0.2354 0.124 1.896 0.058 -0.008 0.479

CPM_perc_Latest -0.3613 0.111 -3.244 0.001 -0.580 -0.143

Value_of_Output_by_Gross_Block -0.4514 0.130 -3.472 0.001 -0.706 -0.197

Adjusted_PAT -0.5518 0.133 -4.136 0.000 -0.813 -0.290

Interest_Cover_Ratio_Latest -0.4438 0.143 -3.106 0.002 -0.724 -0.164

Debtors_Ratio_Latest -0.2239 0.108 -2.074 0.038 -0.436 -0.012

Net_Working_Capital -0.3143 0.095 -3.296 0.001 -0.501 -0.127

Total_Debt 0.6546 0.094 6.991 0.000 0.471 0.838

ROG_Cost_of_Production_perc -0.2204 0.094 -2.338 0.019 -0.405 -0.036

Current_Ratio_Latest -0.6974 0.124 -5.607 0.000 -0.941 -0.454

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Variable "Value_of_Output_by_Total_Assets" has the highest p-value (0.058) and is insignificant,
therefore, we need to eliminate it.


Model_24

In [157]:

f_24='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+CPM_perc_Latest+Value_of

In [158]:

model_24= SM.logit(formula = f_24,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.128643

Iterations 10

In [159]:

model_24.summary()

Out[159]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3574

Method: MLE Df Model: 11

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6061

Time: 13:22:52 Log-Likelihood: -461.31

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 7.852e-298

coef std err z P>|z| [0.025 0.975]

Intercept -5.5890 0.264 -21.132 0.000 -6.107 -5.071

Book_Value_Adj_Unit_Curr -1.2287 0.588 -2.090 0.037 -2.381 -0.076

Book_Value_Unit_Curr -1.6853 0.597 -2.822 0.005 -2.856 -0.515

CPM_perc_Latest -0.3612 0.111 -3.256 0.001 -0.579 -0.144

Value_of_Output_by_Gross_Block -0.3606 0.117 -3.071 0.002 -0.591 -0.130

Adjusted_PAT -0.5471 0.133 -4.108 0.000 -0.808 -0.286

Interest_Cover_Ratio_Latest -0.3882 0.139 -2.799 0.005 -0.660 -0.116

Debtors_Ratio_Latest -0.1332 0.096 -1.388 0.165 -0.321 0.055

Net_Working_Capital -0.3034 0.095 -3.199 0.001 -0.489 -0.117

Total_Debt 0.6619 0.093 7.092 0.000 0.479 0.845

ROG_Cost_of_Production_perc -0.2104 0.094 -2.238 0.025 -0.395 -0.026

Current_Ratio_Latest -0.7062 0.123 -5.719 0.000 -0.948 -0.464

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.


Variable "Debtors_Ratio_Latest" has the highest p-value (0.165) and is insignificant, therefore, we need
to eliminate it.

Model_25

In [160]:

f_25='default~Book_Value_Adj_Unit_Curr+Book_Value_Unit_Curr+CPM_perc_Latest+Value_of

In [161]:

model_25= SM.logit(formula = f_25,data=Company_imputed).fit()

Optimization terminated successfully.

Current function value: 0.128912

Iterations 10


In [162]:

model_25.summary()

Out[162]:

Logit Regression Results

Dep. Variable: default No. Observations: 3586

Model: Logit Df Residuals: 3575

Method: MLE Df Model: 10

Date: Sun, 06 Feb 2022 Pseudo R-squ.: 0.6052

Time: 13:25:33 Log-Likelihood: -462.28

converged: True LL-Null: -1171.0

Covariance Type: nonrobust LLR p-value: 1.680e-298

coef std err z P>|z| [0.025 0.975]

Intercept -5.5826 0.264 -21.167 0.000 -6.099 -5.066

Book_Value_Adj_Unit_Curr -1.2280 0.596 -2.059 0.040 -2.397 -0.059

Book_Value_Unit_Curr -1.6870 0.605 -2.791 0.005 -2.872 -0.502

CPM_perc_Latest -0.3632 0.111 -3.283 0.001 -0.580 -0.146

Value_of_Output_by_Gross_Block -0.3771 0.118 -3.206 0.001 -0.608 -0.147

Adjusted_PAT -0.5628 0.133 -4.238 0.000 -0.823 -0.303

Interest_Cover_Ratio_Latest -0.4170 0.137 -3.037 0.002 -0.686 -0.148

Net_Working_Capital -0.3206 0.094 -3.407 0.001 -0.505 -0.136

Total_Debt 0.6412 0.092 6.982 0.000 0.461 0.821

ROG_Cost_of_Production_perc -0.2192 0.094 -2.338 0.019 -0.403 -0.035

Current_Ratio_Latest -0.6852 0.122 -5.604 0.000 -0.925 -0.446

Possibly complete quasi-separation: A fraction 0.17 of observations can be

perfectly predicted. This might indicate that there is complete

quasi-separation. In this case some parameters will not be identified.

Now all the variables are significant (every p-value is below 0.05), so we don't need to eliminate any more
variables. Over the successive iterations, the following variables were removed:

ROG_PBIT_perc, PBDTM_perc_Latest, Cash_Flow_From_Operating_Activities, Inventory_Velocity_Days,
Debtors_Velocity_Days, Cash_Flow_From_Financing_Activities, ROG_CP_perc, ROG_Gross_Block_perc,
ROG_Market_Capitalisation_perc, Fixed_Assets_Ratio_Latest, Inventory_Ratio_Latest, Selling_Cost,
Other_Income, Total_Asset_Turnover_Ratio_Latest, Creditors_Velocity_Days, Equity_Paid_Up,
ROG_Net_Worth_perc, Cash_Flow_From_Investing_Activities, ROG_Total_Assets_perc,
ROG_Capital_Employed_perc, Value_of_Output_by_Total_Assets, Debtors_Ratio_Latest
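
The manual refit-and-drop cycle used above can be written as a loop. The following is a minimal sketch rather than the notebook's own code: backward_eliminate, statsmodels_pvalues and candidate_columns are hypothetical names, the 0.05 cut-off is the significance level implied by the eliminations above, and the statsmodels helper assumes the same SM alias and Company_imputed frame used in the earlier cells.

```python
def backward_eliminate(response, predictors, pvalues_for, alpha=0.05):
    """Drop the predictor with the largest p-value until all are <= alpha.

    pvalues_for(formula) must return a dict {predictor: p-value}.
    Returns the surviving predictors and the removed ones, in removal order.
    """
    kept = list(predictors)
    removed = []
    while kept:
        formula = f"{response}~" + "+".join(kept)
        pvals = pvalues_for(formula)
        worst = max(kept, key=lambda name: pvals[name])
        if pvals[worst] <= alpha:
            break  # every remaining predictor is significant
        kept.remove(worst)
        removed.append(worst)
    return kept, removed


def statsmodels_pvalues(data):
    """Build a pvalues_for callable backed by statsmodels (assumed installed)."""
    import statsmodels.formula.api as SM  # same alias as the notebook

    def pvalues_for(formula):
        fit = SM.logit(formula=formula, data=data).fit(disp=0)
        # The intercept is never a candidate for removal.
        return fit.pvalues.drop("Intercept").to_dict()

    return pvalues_for


# Usage against the notebook's data (candidate_columns is a hypothetical list):
# kept, removed = backward_eliminate(
#     "default", candidate_columns, statsmodels_pvalues(Company_imputed))
```

Passing the p-value lookup in as a callable keeps the elimination rule separate from the model fitting, so the same loop can be exercised with a stub before it is run on the real data.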

Variables used for Statistical Modelling are :


Book_Value_Adj_Unit_Curr, Book_Value_Unit_Curr, CPM_perc_Latest, Value_of_Output_by_Gross_Block,
Adjusted_PAT, Interest_Cover_Ratio_Latest, Net_Working_Capital, Total_Debt, ROG_Cost_of_Production_perc
and Current_Ratio_Latest.
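
Logit coefficients are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. The sketch below applies this to the model_25 coefficients printed in Out[162]; it is an illustration added here, not a cell from the notebook. Whether a one-unit change means one raw unit or one standard deviation depends on how Company_imputed was scaled, which this section does not show.

```python
import math

# Coefficients copied from the model_25 summary (Out[162]).
coefs = {
    "Book_Value_Adj_Unit_Curr": -1.2280,
    "Book_Value_Unit_Curr": -1.6870,
    "CPM_perc_Latest": -0.3632,
    "Value_of_Output_by_Gross_Block": -0.3771,
    "Adjusted_PAT": -0.5628,
    "Interest_Cover_Ratio_Latest": -0.4170,
    "Net_Working_Capital": -0.3206,
    "Total_Debt": 0.6412,
    "ROG_Cost_of_Production_perc": -0.2192,
    "Current_Ratio_Latest": -0.6852,
}

# exp(coefficient) = multiplicative change in the odds of default
# for a one-unit increase in the predictor, other predictors held fixed.
odds_ratios = {name: math.exp(beta) for name, beta in coefs.items()}

for name, ratio in sorted(odds_ratios.items(), key=lambda kv: kv[1]):
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name:32s} odds ratio {ratio:.3f} ({direction} the odds of default)")
```

Total_Debt is the only retained predictor with an odds ratio above 1 (about 1.90); an increase in any of the other nine lowers the estimated odds of default.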

1.7 Validate the Model on Test Dataset and state the performance
metrics. Also state interpretation from the model

Now we will look at the predicted probability values.

Prediction on the Data Model:

In [172]:

y_prob_pred_train = model_25.predict(Company_train)

pd.DataFrame(y_prob_pred_train).head()

Out[172]:

662 0.000

1373 0.001

3268 0.003

3246 0.002

1456 0.003

In [173]:

y_prob_pred_test = model_25.predict(Company_test)

pd.DataFrame(y_prob_pred_test).head()
...

Let us now see the predicted classes on Train Data.

In [174]:

y_class_pred=[]
for i in range(0,len(y_prob_pred_train)):
    if np.array(y_prob_pred_train)[i]>0.5:
        a=1
    else:
        a=0
    y_class_pred.append(a)

Model Evaluation on the Training Data


Let us now check the confusion matrix and the classification report followed by the AUC and the AUC-ROC
curve.


In [178]:

sns.heatmap((metrics.confusion_matrix(Company_train['default'],y_class_pred)),annot=
,cmap='Blues');
plt.xlabel('Predicted Label');
plt.ylabel('Actual Label',rotation=90);
plt.title('Figure: Confusion Matrix of Train Data');

In [179]:

print(metrics.classification_report(Company_train['default'],y_class_pred,digits=3))

              precision    recall  f1-score   support

           0      0.970     0.980     0.975      2176
           1      0.785     0.712     0.747       226

    accuracy                          0.955      2402
   macro avg      0.878     0.846     0.861      2402
weighted avg      0.953     0.955     0.954      2402

Overall, about 95% of the model's predictions on the train data were correct (accuracy 0.955).

About 71% of the companies that actually defaulted were correctly identified as defaulters by the model (recall 0.712 for class 1).
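
The text above mentions the AUC, but the corresponding cell is not shown in this section. As a sketch of what that number measures, the block below computes AUC as the probability that a randomly chosen defaulter receives a higher predicted probability than a randomly chosen non-defaulter, with ties counted as half. The labels and scores are made-up illustration values, not notebook output; on the real data, roc_auc_score(Company_train['default'], y_prob_pred_train), using the function already imported at the top of the notebook, measures the same quantity.

```python
def auc_by_pairs(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.02, 0.40, 0.35, 0.90, 0.10, 0.76]

print(f"AUC = {auc_by_pairs(y_true, y_score):.3f}")
```

Unlike accuracy and recall, this measure does not depend on the 0.5 cut-off, which makes it a useful complement when the classes are as imbalanced as they are here.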

Now, let us see the predicted probability values on test dataset


In [180]:

y_prob_pred_test = model_25.predict(Company_test)

pd.DataFrame(y_prob_pred_test).head()

Out[180]:

3163 0.001

3133 0.000

937 0.159

196 0.764

2852 0.000

Let us now see the predicted classes on Test Data.

In [181]:

y_class_pred=[]
for i in range(0,len(y_prob_pred_test)):
    if np.array(y_prob_pred_test)[i]>0.5:
        a=1
    else:
        a=0
    y_class_pred.append(a)
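
The loop above is equivalent to a single thresholding step. A minimal sketch, using the five test probabilities printed in Out[180] as input; the function name to_class_labels and its threshold parameter are illustrative and not part of the notebook.

```python
def to_class_labels(probabilities, threshold=0.5):
    """Return 1 where the probability is strictly above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in probabilities]


# First five predicted probabilities on the test data (Out[180]).
head_probs = [0.001, 0.000, 0.159, 0.764, 0.000]

labels = to_class_labels(head_probs)
print(labels)  # [0, 0, 0, 1, 0]
```

On the full pandas Series the same result follows from (y_prob_pred_test > 0.5).astype(int). Lowering the threshold below 0.5 would flag more companies as defaulters, trading precision for recall.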

Model Evaluation on the Test Data


Let us now check the confusion matrix and the classification report followed by the AUC and the AUC-ROC
curve.


In [182]:

sns.heatmap((metrics.confusion_matrix(Company_test['default'],y_class_pred)),annot=T
,cmap='Blues');
plt.xlabel('Predicted Label');
plt.ylabel('Actual Label',rotation=90);
plt.title('Figure: Confusion Matrix of Test Data');

In [183]:

print(metrics.classification_report(Company_test['default'],y_class_pred,digits=3))

              precision    recall  f1-score   support

           0      0.974     0.974     0.974      1049
           1      0.800     0.800     0.800       135

    accuracy                          0.954      1184
   macro avg      0.887     0.887     0.887      1184
weighted avg      0.954     0.954     0.954      1184

Overall, about 95% of the model's predictions on the test data were correct (accuracy 0.954).

80% of the companies that actually defaulted were correctly identified as defaulters by the model (recall 0.800 for class 1).
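
The percentages in the report can be traced back to confusion-matrix counts. The heatmap values are not reproduced in this section, so the counts below are reconstructed from the rounded test-data report (support 1049 and 135, precision and recall 0.800 for class 1) and should be read as an inference rather than as notebook output. The sketch recomputes accuracy, precision and recall from them.

```python
# Counts inferred from the rounded test-data classification report
# (an assumption: the notebook's heatmap values are not shown here).
tn, fp = 1022, 27   # actual non-defaulters (support 1049)
fn, tp = 27, 108    # actual defaulters (support 135)

total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision_default = tp / (tp + fp)   # of predicted defaulters, share correct
recall_default = tp / (tp + fn)      # of actual defaulters, share caught
recall_non_default = tn / (tn + fp)  # of actual non-defaulters, share cleared

print(f"accuracy             {accuracy:.3f}")
print(f"precision (default)  {precision_default:.3f}")
print(f"recall (default)     {recall_default:.3f}")
print(f"recall (non-default) {recall_non_default:.3f}")
```

These counts reproduce every figure in the report to three decimals, which shows that accuracy is dominated by the 1049 non-defaulters while recall for class 1 depends only on the 135 actual defaulters.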

Some interpretation of the model:

1) Of the many candidate variables, only 10 remain statistically significant in the final logistic regression
(model_25) and contribute to a company being predicted as a defaulter or not.

2) On the test data, the model correctly identifies 80% of the companies that actually default (recall 0.800);
the remaining 20% are missed.

3) Precision for the default class is also 0.800, so in about 20% of cases a company that is predicted as a
defaulter does not actually default. From an investor's point of view this is acceptable: it is fine to pass
over a company that would not have defaulted.

4) Precision for the default class is lower than for the non-default class (0.800 versus 0.974 on the test
data); still, four out of five companies the model flags as defaulters are genuine defaulters.
