Investigation of Household Debt through Multilevel Multivariate Analysis: Case of a Developing Country

This study investigates the relationships between socioeconomic and demographic characteristics and households’ debt decision and demand for debt. We used six survey rounds of the Pakistan Household Integrated Economic Survey (HIES), covering 2001 to 2014. HIES is a nationally representative dataset collected by the Pakistan Bureau of Statistics. Multilevel models were used to investigate these relationships, with households nested in primary sampling units (PSUs) and PSUs nested in provinces. The decision to take household debt varies 22% at the PSU level and 18% at the provincial level due to unobserved variables. We found that households having higher financial assets, higher income and larger household sizes tend to have a higher percentage of debt. The amount of debt also increases with education and age. In the case of demand for debt, the variation is 12% at the provincial level. The literature on household debt decisions in Pakistan often ignores geographical differences (region- or province-specific studies). Considering the socioeconomic characteristics that condition the use of credit is of great importance in guiding policy design and interventions that aim to improve financial inclusion.


INTRODUCTION
Debt can be a blessing or a curse depending on situations and contexts. Debt usually carries a negative connotation and is commonly associated with stress, depression and a decrease in wellbeing (O'Neill, Prawitz, Sorhaindo, Kim, & Garman, 2006). Bertola, Disney, and Grant (2006, p. 1), however, propose that debt can be desirable. In fact, the opportunity to take debt provides the ability to enhance economic welfare. Debt can yield positive outcomes if it is handled carefully and is not linked to too low a level of consumption in the future. The lack of capacity to borrow can reduce the welfare of society (Tsai, Dwyer, & Tsay, 2016).
In this changing economic world, rising household debt is becoming one of the major problems to address, but in the developing world there are countries where efforts are being made to increase access to and use of a broad range of financial services, including household debt. Efforts to enhance accessibility to credit, especially among poor households, are seen as a measure to improve their standard of living. Households in Pakistan are, on average, averse to debt and remain outside the debt market. Only 14% of Pakistani people use services offered by formal financial institutions, whereas 50.5% have access to finance, including access to a semiformal sector which includes shopkeepers and money lenders (Mundial, 2008). According to a report published by the World Bank, 50% of the population in Pakistan do not use any formal or informal financial service and 19% of the population have voluntarily left the market (Nenova, Niang, & Ahmad, 2009). One of the cited reasons for the lack of borrowing is narrow access to financial services (Ahmed, 2016). In efforts to increase access to household debt, the hurdles can be socioeconomic and demographic factors which condition the use of credit (Chen & Jin, 2016). This paper focuses on socioeconomic and demographic characteristics that can affect the decision to borrow as well as the amount of household debt.
Studies on low household debt in Pakistan are still lacking because financial inclusion was not in focus until May 2015, when the government realised the seriousness of the issue and launched a financial inclusion strategy program. Debt provision was one of the means of increasing financial inclusion. The issue of debt has been explored before (Ahmed, 2016; Adnan, 2005; Gul & Abbas, 2007), but those studies rely on small-scale primary data. This study investigates the issue using recent nationally representative data from the Household Integrated Economic Survey (HIES) for 2001-2014, providing information about the socioeconomic and demographic characteristics which affect the household debt decision and the amount of debt.
HIES adopts a two-stage survey design in conducting its large-scale survey to reduce data collection costs. The enumeration blocks were identified from a map containing cities and villages. At the first stage, primary sampling units (PSUs) containing enumeration blocks were selected. At the second stage, households were selected from PSUs using probability proportional to size (PPS), with household size used as the measure of size (MOS). Such a multistage design produces data in which people living in the same PSU can be expected to be more alike than people from different PSUs. This design effect may produce incorrect estimates of standard errors and may lead to Type I error (finding a significant relationship where none exists). Multilevel models correct for such design effects (Arpino & Aassve, 2014; Grilli & Rampichini, 2015; O'Loughlin, 2004). Multilevel models can be used in two ways. One is simply to account for the random variations in order to get efficient estimates. The other is to focus on the random variations themselves and see how they change with the addition of variables. For data from a multistage design, the use of a multilevel model is preferred (Stoker & Bowers, 2002).
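The size of such a design effect can be illustrated with Kish's approximation for equal-sized clusters. This is a standard survey-sampling result and not part of the original analysis; the cluster size m and the within-PSU intraclass correlation ρ used below are illustrative assumptions only.

```latex
\mathrm{deff} = 1 + (m - 1)\,\rho,
\qquad \text{e.g. } m = 12,\ \rho = 0.2
\;\Rightarrow\; \mathrm{deff} = 1 + 11 \times 0.2 = 3.2
```

Under these assumed values, standard errors computed as if the households were a simple random sample would be too small by a factor of about √3.2 ≈ 1.79, which is how spurious significance can arise when clustering is ignored.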
One concern over the use of multilevel models is the need to have some minimum number of groups or number of observations per group. It is important to note that when random variations are zero, multilevel models reduce to classical models. Thus, even if the number of groups is small and eventually leads to small inter-group variations, multilevel models do as well as classical models. When the number of groups is too small to estimate random variations, multilevel models usually add little information about variation, but they give at least as much information as classical regression. Therefore, even if we have only two groups, such as male and female, a multilevel model still possesses some advantages in making predictions about the groups. According to Gelman and Hill (2007, p. 275), two observations per group are adequate to fit a multilevel model.

Determinants of Household Debt
There are different factors that affect the amount of debt or the debt behaviour of households. The accumulation of debt over a lifetime is affected by age, as explained by the life cycle hypothesis and the permanent income hypothesis. Modigliani (1986) proposed the life cycle income hypothesis (LCIH), according to which a person seeks to smooth consumption over his or her lifetime with the support of savings and debt. As far as age is concerned, debt is required in early career life when a person has less income and expenses are greater. This increase in debt may continue until mid-career and middle age, when a person begins to have a higher income. Age was found to be a significant factor explaining the variation in the amount of debt in many empirical studies (Del-Río & Young, 2005; Fabbri & Padula, 2004; Magri, 2002). However, some empirical evidence points the other way. Yilmazer and DeVaney (2005), for instance, observed that older US residents do not reduce savings but decrease their consumption spending.
In addition to age, household size has also been identified as an important determinant of the amount of household debt. The Household Consumption and Financial Survey of the USA reported that larger household size leads to higher household expenditure and eventually increases debt. Similar findings were recorded by other authors (Livingstone & Lunt, 1992; Togba, 2012). Other demographic factors considered by past researchers are gender (Crook, 2006) and marital status (Del-Río & Young, 2005). Married couples are more prone to debt. A study conducted in Italy for the period 1989-1998 revealed that marital status is a significant positive predictor of household debt (Magri, 2002).
An important association was found between household debt and education (Godwin, 1998; Kim & DeVaney, 2001). Usually, better-educated individuals have better job prospects, which lead to higher expected income, and hence employed persons with higher expected income have better access to loans (Crook, 2006; Del-Río & Young, 2005). Employment status is also found to be associated with household debt (Tudela & Young, 2005). When a person loses his or her job, he or she may be forced into debt. Employed persons are more inclined towards debt as they can easily apply for loans in the formal sector (Crook, 2006). Income also plays an important role in the demand for debt (Crook, 2006; Del-Río & Young, 2005; Petrides & Karagrigoriou, 2008). Individuals with higher income are eligible for larger amounts of debt (Petrides & Karagrigoriou, 2008). Another important factor which explains the ability of an individual to obtain debt is the amount of financial assets. There is a direct relationship between debt and financial assets (Yilmazer & DeVaney, 2005). At the same time, it is also found that individuals who do not possess financial assets tend to have more unsecured debt. However, the direction of the relationship between financial assets and debt is unclear: Banks, Smith, and Wakefield (2002) found a negative relationship while Brown and Taylor (2008) found a positive relationship between the two variables.
In addition, each geographical area has its own dominant factors and unique characteristics affecting debt behaviour (Stone & Maury, 2006). Several variables have been identified as affecting household debt, but the generalizability of the results is limited because characteristics vary from one area to another. The lack of a comprehensive database has generally limited research on household debt behaviour (Ladas, Garibaldi, Scarpel, & Aickelin, 2014). There remain national and cultural differences between households of different countries in how they manage their finances (Nicolini, Cude, & Chatterjee, 2013). These geographical differences call for a multilevel modelling approach that takes into account variation across geographical areas. The literature on Pakistan lacks multilevel analysis (region-specific studies), which has recently become popular for community and population analysis. The use of surveys with a stratified sampling design also necessitates multilevel analysis in order to obtain efficient estimates.
A multilevel model is used to determine effects within groups and between groups. Household debt has been investigated through multilevel analysis, but there is a gap in studies in the context of Pakistan. Temporal effects in debt behaviour were investigated through multilevel analysis by Allison, Davis, Short, and Webb (2015). Multilevel mixed-effect binary logistic regression has been used to investigate the decisions of households to take debt based on their different characteristics. Multilevel mixed-effect binary logistic regression is suited to binary outcomes. Vella and Verbeek (1998) used binary logistic regression to model the union decisions to raise the wage of union members.
As mentioned earlier, provincial variations are present in the country in respect of economic opportunities, outreach, income distribution and culture, so it is important to investigate debt variations across regions and provinces through multilevel analysis. This paper will be helpful in providing information about which groups of people obtain a certain range of debt in the context of Pakistan. It discusses how the demand for debt and the decision to take debt can be affected by socioeconomic and demographic characteristics, taking into account the structural differences present across areas and provinces. The research can identify the need for different strategies to disburse household debt in different provinces if structural differences exist between them.
Increasing debt opportunities and their acceptance matters in a developing country like Pakistan because of its prevailing poverty rate. One of the Sustainable Development Goals is to eradicate poverty, and microfinance, credit or debt is considered an important tool in achieving this goal (Littlefield, Morduch, & Hashemi, 2003), as evidenced in Bangladesh. Usually, a lack of demand is created when debt is not targeted at the right market. This lack of demand leads to the shutdown of the formal credit market and eventually increases poverty. There is a need to provide access to debt to the economically and socioeconomically disadvantaged. This will also help to increase financial inclusion. The specific research questions are:
1. Are there any structural features of the provinces in a country, such as differences in household debt decision and demand for debt between different areas, which should be taken into account before looking at the effects of demographic and socioeconomic characteristics?
2. How are the decision to take debt and the demand for debt affected by socioeconomic and demographic characteristics?
3. How much do the decision and demand for debt vary across areas (PSUs and provinces)?

DATA
The details of the data can be seen at http://www.pbs.gov.pk/content/household-integrated-economic-survey-hies-2013-14. The original data were purchased from the Pakistan Bureau of Statistics on request for research purposes. The data are disaggregated at the provincial level. The Pakistan Bureau of Statistics (PBS) developed its own sampling frame in which cities and towns were divided into enumeration blocks identified on a map. A two-stage stratified sample design was used for this survey. At stage 1, primary sampling units (PSUs) were selected. PSUs consist of enumeration blocks in urban areas and villages in rural areas. In urban areas, PSUs were selected from each stratum using the probability proportional to size (PPS) sampling technique, with the number of households as the measure of size (MOS). The same PPS method is used in the rural domain, taking the population of villages as the MOS.
At stage 2, a systematic sampling scheme with a random start is used to select between 12 and 16 households from each enumeration block. Data for HIES were collected based on this sampling frame designed by PBS. PBS used an integrated questionnaire to collect nationally representative data at both household and individual level, covering the broad topics of demographics, health, education, savings, liabilities and consumption.
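The two sampling stages described above can be sketched in a few lines of Python. This is only an illustration of PPS selection followed by systematic sampling with a random start; the function names, the PSU sizes and the use of systematic PPS (rather than whatever exact PPS procedure PBS applied) are assumptions made for the example.

```python
import random


def pps_systematic(sizes, n, rng):
    """Stage 1 sketch: pick n units with probability proportional to size.

    Uses systematic PPS: lay the units end to end on a line of length
    sum(sizes) and take n equally spaced points from a random start.
    A unit larger than the sampling step can be hit more than once.
    """
    total = sum(sizes)
    step = total / n
    start = rng.uniform(0, step)
    chosen, cumulative, idx = [], 0.0, 0
    for k in range(n):
        point = start + k * step
        # Advance until the current unit's interval contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen


def systematic_sample(units, n, rng):
    """Stage 2 sketch: systematic sample of n units with a random start."""
    interval = len(units) / n
    start = rng.uniform(0, interval)
    last = len(units) - 1
    return [units[min(int(start + k * interval), last)] for k in range(n)]


rng = random.Random(0)
psu_sizes = [120, 300, 80, 500, 200]          # hypothetical household counts
selected_psus = pps_systematic(psu_sizes, 2, rng)
households = systematic_sample(list(range(300)), 12, rng)  # 12 of 300 listed
```

Larger PSUs are more likely to be drawn at stage 1, while a fixed number of households per PSU is taken at stage 2, which is what makes households within a PSU a natural second level in the analysis.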

Data Screening
We pooled the data for six survey rounds and selected the questions representing the required variables. The conceptual model contained both demographic and socioeconomic variables. Information on household debt, income and financial assets was given at the household level, whereas information on age, education, gender and marital status was provided for each household member. It was logical to take the personal characteristics, such as age and education, of the household head rather than of any household member who may not even be legally eligible to apply for a loan from formal institutions, and to match them to the information on household debt, income and assets. So only the data for household heads were retained. The household head is the one who is identified as head by the household members. There is no clear definition of head of household, but it is operationally defined as the one who is the main financial contributor, the decision maker or the eldest member. This definition has been adopted from Nenova (2009). For the collection of HIES data, PBS defines the head as follows: "If a person lives alone, that person is considered as the head of the household. If a group of persons live and eat together as defined above, the head of the household is that person who is considered as the head by the household members. In practice, when the husband, wife, married and unmarried children form a single household, the husband is generally reported as the "head". When parents, brothers and sisters comprise a household, either a parent or the eldest brother or sister is generally reported as the head of the household. When a household consists of several unrelated persons either the respondent or the eldest household member is selected as the "head". In special dwelling units, the resident person-in-charge (e.g. manager) may be reported as the "head"" (GoP, 2013, p. 8).
We used household head characteristics as determinants of household debt behaviour. The merged data contained 42,147 observations, and missing values amounted to 0.01%. The software automatically drops household heads for whom any information is missing. The valid cases used for the analysis of the household debt decision numbered 41,592, obtained after pooling six survey rounds from 2001 to 2014. To investigate the factors explaining the variation in the amount of debt, 11,159 household heads who had taken debt were used. The number of PSUs was 3,515 and households belonged to 4 provinces. Prior inspection of the data did not show any unusual values. However, at a later stage, after fitting the final model, the residual plot will be scrutinized for confirmation.
Figure 1 shows the different levels of the data, which are province, PSU and household. These levels mandated the use of multilevel models. Finding variations in characteristics at each level may give interesting results and important policy insights.

METHODS
The analysis is divided into two parts: the first investigates the household debt decision and the second analyses the amount of household debt. The structural features have been investigated at the respective stages. To investigate the household debt decision, the outcome has been coded as 1 and 0, where 1 indicates the presence of household debt and 0 its absence. The focus of the study is to investigate the factors affecting the household decision to take debt while accounting for structural differences.
Employment has been recoded so that 1 indicates paid employment and 0 indicates other employment categories, which include employers, the self-employed and people employed in the agricultural sector. Around 60% of those employed are in paid employment. Education and age are both measured in years. In the analysis, however, age was divided by 10 and education by 5 in order to see the effect of a 10-year and a 5-year change respectively. In 10 years, a person may typically become legally eligible to take debt, reach mid-career or retire, which can substantially affect the debt decisions of households. A five-year change in education means that a household head may complete a degree or pass one of the stages of education, namely primary, secondary, higher secondary, degree or postgraduate, and move to the next. The purpose was also to make the estimated coefficients interpretable.
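The recoding described above can be sketched as follows. The function name, the dictionary keys and the label "paid employee" are hypothetical choices made for this example and are not taken from the HIES codebook.

```python
def recode_head(age_years, education_years, employment_status):
    """Rescale age and education and dichotomise employment.

    Dividing age by 10 and education by 5 means a one-unit change in the
    rescaled variable corresponds to 10 years of age or 5 years of schooling.
    """
    return {
        "age10": age_years / 10,
        "edu5": education_years / 5,
        # 1 = paid employment; 0 = employer, self-employed or agriculture
        "paid_employee": 1 if employment_status == "paid employee" else 0,
    }


example = recode_head(45, 10, "paid employee")
```

A coefficient estimated on the rescaled age variable is simply ten times the per-year coefficient, so the rescaling changes the unit of interpretation and not the fit of the model.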
Household size has been recoded into three categories: small, medium and large. Small households have 1-5 members, medium households 6-8 and large households more than 8. The average household size in Pakistan was 6.41 in 2013/14, whereas the average household size increases to 8 among the poorest. The average household size in Punjab is 6.16 (GoP, 2013). We took the average household size of 6-8 as the reference for the medium category, whereas households smaller than this were categorised as small. Households larger than the average were categorised as large. The categories were carefully made exhaustive and mutually exclusive.
Income was divided into three categories according to tertiles of the data. We have pooled the data for six rounds conducted over 13 years. The income categories were PKR 1-65,000 (category 1), PKR 65,001-145,000 (category 2) and greater than PKR 145,000 (category 3). For currency reference, we use an exchange rate of 1 USD = PKR 115.
Financial assets were measured by the market value of financial assets reported by households, which we then recoded into three categories according to tertiles of the data: assets worth PKR 1-195,000 (category 1), PKR 195,001-600,000 (category 2) and greater than PKR 600,000 (category 3). Gender is coded as 1 for male and 0 for female. Marital status is a binary variable where 1 represents married and 0 otherwise. Region represents urban (1) and rural (0) areas, whereas province includes Punjab (1), Sindh (2), Balochistan (3) and Khyber Pakhtunkhwa (KPK) (4). While investigating demand for debt, household debt is measured by the net amount of debt owed by households. In this analysis, we excluded the households which have not taken debt. The number of observations was 11,592 households after eliminating those without any debt. The same independent variables used to investigate the debt decision are considered in the analysis of the amount of debt. The categories used for the variables are also the same except for income, where the middle income category is merged with the lower income category due to the change in the distribution of income after the elimination of the households mentioned above. The tertiles of financial assets and income in the data of people who have taken debt are also close to the values identified previously, so the ranges of the categories have not been changed. Also, by keeping the categories the same, we can easily compare the effect of a range of income on the household debt decision and on the demand for debt.
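The three-way categorisations of household size, income and financial assets all follow the same pattern of inclusive upper bounds, which can be sketched as below. The cut points are the ones reported in the text; the helper name and constant names are hypothetical.

```python
import bisect

# Inclusive upper bounds of categories 1 and 2; anything above is category 3.
HOUSEHOLD_SIZE_BOUNDS = [5, 8]             # members
INCOME_BOUNDS = [65_000, 145_000]          # PKR per year
ASSET_BOUNDS = [195_000, 600_000]          # PKR


def categorise(value, upper_bounds):
    """Return the 1-based category of value given inclusive upper bounds."""
    return bisect.bisect_left(upper_bounds, value) + 1


size_category = categorise(7, HOUSEHOLD_SIZE_BOUNDS)       # medium household
income_category = categorise(145_001, INCOME_BOUNDS)       # highest tertile
asset_category = categorise(195_000, ASSET_BOUNDS)         # lowest tertile
```

Because each bound belongs to the lower category and every value maps to exactly one category, the resulting groups are exhaustive and mutually exclusive, as the text requires.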
In order to examine whether geographical variation is important in explaining household debt behaviour, we test the null hypothesis that there is no higher-level variation and compare the three-level model to a single-level model. The household, primary sampling unit (PSU) and province variances are denoted by $\sigma_e^2$, $\sigma_u^2$ and $\sigma_v^2$ respectively. At this stage, we do not add any predictor variable and examine the random variations of the dependent variable in an empty model. The following hypothesis is tested:
$$H_0: \sigma_v^2 = \sigma_u^2 = 0 \quad \text{(there is no provincial and PSU level variance)}$$
If $\sigma_v^2 = 0$ there would be no variation between provinces, and if $\sigma_u^2 = 0$ there would be no variation between PSUs; the multilevel model would then reduce to the classical model. To test the hypothesis, a likelihood ratio test is conducted, where the likelihood ratio (LR) statistic is calculated as
$$LR = -2\,(\ln L_{\text{single}} - \ln L_{\text{multilevel}})$$
where $L_{\text{single}}$ and $L_{\text{multilevel}}$ are the maximised likelihoods of the single-level and three-level models. The value of the LR statistic is then compared to the chi-squared distribution with the corresponding degrees of freedom. If the LR statistic is greater than the chi-squared critical value, the null hypothesis is rejected and the multilevel model is considered a better fit to the data than the single-level model.
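A minimal sketch of this likelihood ratio comparison is given below, assuming the usual form of the statistic, minus twice the difference between the single-level and multilevel log-likelihoods, and two degrees of freedom (one for each variance component set to zero). The log-likelihood values are hypothetical and are not results from the paper.

```python
import math


def lr_test_df2(loglik_single, loglik_multilevel):
    """Likelihood ratio test of a single-level against a three-level model.

    Returns the LR statistic and its p-value from a chi-squared
    distribution with 2 degrees of freedom, whose survival function
    has the closed form exp(-x / 2).
    """
    lr = -2.0 * (loglik_single - loglik_multilevel)
    p_value = math.exp(-lr / 2.0)
    return lr, p_value


# Hypothetical log-likelihoods, for illustration only.
lr_stat, p_value = lr_test_df2(loglik_single=-100.0, loglik_multilevel=-95.0)
reject_null = p_value < 0.05
```

Because variances cannot be negative, the null hypothesis lies on the boundary of the parameter space and the plain chi-squared p-value tends to be conservative; a rejection obtained this way is therefore, if anything, on the safe side.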
After fitting the multilevel model, it is important to interpret the random variation. The total random variation for a household is
$$\sigma_v^2 + \sigma_u^2 + \sigma_e^2$$
In a logit model, the level 1 residuals of the latent variable y, with variance $\sigma_e^2$, are assumed to follow a standard logistic distribution, which has variance $\pi^2/3 \approx 3.29$ (Fielding, 2004).
The province-level variance partition coefficient (VPC) is calculated as:
$$VPC_v = \frac{\sigma_v^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2}$$
The PSU-level variance partition coefficient is calculated as:
$$VPC_u = \frac{\sigma_u^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2}$$
The intraclass correlation (ICC) between different households within the same province but different PSUs is calculated as:
$$ICC_{\text{province}} = \frac{\sigma_v^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2}$$
The intraclass correlation (ICC) between different households within the same province and the same PSU is calculated as:
$$ICC_{\text{PSU}} = \frac{\sigma_v^2 + \sigma_u^2}{\sigma_v^2 + \sigma_u^2 + \sigma_e^2}$$
After confirming the fit of the higher-level model, predictor variables are added to the model. In the fixed part, the significance of the variables is checked and interpreted in the classical way, whereas the random part is interpreted using VPCs and ICCs. While testing the structural differences in household demand, we use the same procedure described above, but when calculating the VPC we do not replace $\sigma_e^2$ by 3.29. After confirming the multilevel fit, we use the multilevel analysis to find the effects of socioeconomic and demographic characteristics on household debt.
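The variance shares and intraclass correlations described above can be computed with a short helper. This sketch assumes the standard three-level definitions, in which each share is a variance component divided by the total variance, and it uses π²/3 as the household-level variance in the logit case; the variance values passed in the example are hypothetical.

```python
import math

LOGISTIC_VARIANCE = math.pi ** 2 / 3  # about 3.29, level 1 variance in a logit model


def variance_shares(var_province, var_psu, var_household=LOGISTIC_VARIANCE):
    """Return VPCs and ICCs for a three-level random intercept model."""
    total = var_province + var_psu + var_household
    return {
        "vpc_province": var_province / total,
        "vpc_psu": var_psu / total,
        # Two households in the same province but different PSUs
        # share only the province effect.
        "icc_same_province": var_province / total,
        # Two households in the same PSU share both higher-level effects.
        "icc_same_psu": (var_province + var_psu) / total,
    }


# Hypothetical variance components, for illustration only.
shares = variance_shares(var_province=1.0, var_psu=2.0, var_household=1.0)
```

For the linear model of the amount of debt, the estimated household-level variance would be passed as `var_household` instead of relying on the logistic default.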
Multilevel binary logistic regression has been used because of the clustered nature of the data and the binary outcome. Multilevel binary logistic regression does not require the linearity assumption or homogeneous residuals, but it requires independence of observations. Heteroscedasticity may be present in the data, allowing for the presence of random effects; however, the residual errors at the higher levels should be homoscedastic. The econometric model used, written in terms of the latent variable $y^*_{ijk}$ underlying the observed decision $y_{ijk}$, is as follows:
$$y^*_{ijk} = \beta_0 + \beta X_{ijk} + v_k + u_{jk} + e_{ijk}$$
$n_{ijk}$ is the number of observations of households of Pakistan (i) from PSU j and province k. $X_{ijk}$ is the set of predictors for household debt. $v_k$ is the effect of the province. $u_{jk}$ is the effect of PSUs within a province. $e_{ijk}$ is the error term of the model. $\beta_0$ is the overall intercept at the national level. $y_{ijk}$ takes the value 1 when the household has taken debt and 0 when it has not.
$$i = 1, \dots, n_{jk}; \quad j = 1, \dots, J_k; \quad k = 1, \dots, K$$
where $n_{jk}$ is the total number of households in PSU j of province k, $J_k$ is the number of PSUs in province k and K is the total number of provinces. $e_{ijk}$ is the difference between household i's observed debt decision and PSU j's mean. The province-level variance $\sigma_v^2$ measures the difference in household debt decision between provinces. The PSU-level variance $\sigma_u^2$ measures the difference in household debt decision between PSUs, whereas $\sigma_e^2$ measures the difference in household debt decision between households.
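To make the role of the province and PSU effects concrete, the sketch below shows how a fixed part and two random intercepts would combine into a probability of holding debt under a logit link. The function name, the coefficient values and the effect sizes are all hypothetical; they are not estimates from this study.

```python
import math


def debt_probability(intercept, coefs, x, v_province=0.0, u_psu=0.0):
    """Probability of holding debt under a three-level logit model.

    The linear predictor adds the fixed part (intercept + coefs . x)
    to the province effect and the PSU-within-province effect.
    """
    eta = intercept + sum(b * xi for b, xi in zip(coefs, x)) + v_province + u_psu
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical household: same covariates, two different PSUs.
x = [4.5, 2.0]            # age / 10 and education / 5
coefs = [-0.05, -0.10]    # illustrative values only
p_average_psu = debt_probability(-0.5, coefs, x)
p_high_debt_psu = debt_probability(-0.5, coefs, x, u_psu=1.0)
```

Two otherwise identical households can thus have quite different predicted probabilities purely because of where they live, which is the variation the PSU-level and province-level variances summarise.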
Multilevel mixed-effect linear regression allows for the presence of heteroscedasticity in the data and helps in modelling the random variance (Bullen, Jones, & Duncan, 1997), but higher-level residuals should be homoscedastic. This has been checked after fitting the multilevel model and obtaining the higher-level residuals. The econometric model used is as follows:
$$y_{ijk} = \beta_0 + \beta X_{ijk} + v_k + u_{jk} + e_{ijk}$$
where $y_{ijk}$ is the amount of debt, $X_{ijk}$ is the set of predictors, $v_k$ is the province effect, $u_{jk}$ is the PSU effect and $e_{ijk}$ is the household-level error term. STATA was used for the analysis.

ESTIMATION RESULTS
In the sample of 41,592, only 27% of the household heads have taken debt. The mean age of household heads is around 44 years, and they have a secondary level of education on average. The mean household size is 7 members. Their average annual income is around PKR 230,000. Most of the people living in rural areas have taken debt. Household size has been calculated as the number of family members permanently living in one house. The majority of the household heads are male and most of them are married. When the sample is reduced to only those who have taken debt, for the analysis of demand for debt, the mean age, household size, education and the distribution of region, employment, gender and marital status remain almost the same. The descriptive statistics are detailed in Table 1.
The household debt amount is transformed into log form so that the large standard deviations can be reduced. The amount of household debt in log form ranges between 6 and 16, whereas the original variable has a much larger range of variation. Mean household debt in the sample is around PKR 0.1 million. In 2006, the State Bank of Pakistan relaxed the strict eligibility criteria for loan applications in order to attract more borrowers. The minimum wage required to apply for a loan of PKR 100,000 to PKR 150,000 (USD 962-USD 1,442 at PKR 104/USD) was set at PKR 12,500 per month. Since respondents are largely male and married, gender and marital status failed to explain the variation in the debt decision and are thus excluded from the analysis.
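To give a sense of what the log scale implies, the reported range can be converted back to rupees. The calculation below assumes that the natural logarithm was used, which the text does not state explicitly.

```latex
e^{6} \approx 403 \ \text{PKR}, \qquad
e^{16} \approx 8.9 \times 10^{6} \ \text{PKR}, \qquad
\ln(100{,}000) \approx 11.5
```

Under that assumption, debts in the sample span roughly PKR 400 to PKR 8.9 million, and the mean debt of about PKR 0.1 million sits near the middle of the 6 to 16 log range.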
Figure 2 shows the variation by province in households which have taken debt. The largest percentage is from Punjab, followed by Khyber Pakhtunkhwa. Punjab is the richest of all provinces. Households there may have taken debt for education, investment and to supplement their income. Meanwhile, the situation of Khyber Pakhtunkhwa has worsened due to terrorist attacks, and it is the poorest of all provinces. People there might have taken debt out of need. The same pattern was seen for demand for debt, as shown in Figure 3. The percentage of the amount of debt is higher in Punjab. Debt remained low for Balochistan. However, varying percentages have been seen for Khyber Pakhtunkhwa.

The Decision of Taking Household Debt
Multilevel mixed-effect binary logistic regression is used to examine the effect of characteristics on the decision to take debt. To see the structural differences in the household debt decision, we examine the random variations by fitting a multilevel model. We first fit a three-level model to the data without adding explanatory variables, where the first level is the household, the second level is the PSU and the third level is the province. The likelihood ratio test suggests that the three-level model is preferred to the single-level model.
There is 16% random variation in the household debt decision at the province level and around 37% at the PSU level, as shown in Table 2. Variation of 10% is usually considered considerable in textbooks on multilevel modelling, but there is no specific benchmark as it varies according to the study and field.
Researchers use the random variation found in previous studies on their specific topic as a benchmark. We failed to find any reference study for determining random provincial or PSU level variation in household debt. The likelihood ratio test is considered important for choosing between single-level and higher-level models, and it suggests the use of the higher-level model. The intraclass correlation coefficient shows that the correlation between two household heads in the same province but different PSUs is 16.3%, whereas in the same province and same PSU it is 53.3%. We have also confirmed through a likelihood ratio test that the three-level model is preferred to the two-level model.
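The reported shares can be cross-checked against each other. The calculation below assumes the standard three-level definitions, in which the within-PSU correlation is the sum of the province and PSU shares, and a household-level variance of 3.29; the implied variance components are back-calculated for illustration and are not taken from Table 2.

```latex
\begin{aligned}
ICC_{\text{PSU}} - ICC_{\text{province}} &= 0.533 - 0.163 = 0.370 \quad (\text{PSU share, about } 37\%)\\
1 - ICC_{\text{PSU}} &= 0.467 \quad (\text{household share})\\
\sigma_v^2 + \sigma_u^2 + \sigma_e^2 &\approx 3.29 / 0.467 \approx 7.04\\
\sigma_v^2 &\approx 0.163 \times 7.04 \approx 1.15, \qquad
\sigma_u^2 \approx 0.370 \times 7.04 \approx 2.61
\end{aligned}
```

The difference between the two correlations reproduces the 37% PSU-level share reported above, so the percentages in this section are internally consistent under these assumptions.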
We have added level 1 explanatory variables, interactions and level 2 variables to the model. As we have pooled the six survey rounds, we have also added time dummies in order to see any change in the intercept over the years, which can be seen in Table 3.
In Model 3, we examined the role of the interaction between education and income, as the literature says that better education may be related to better income (Gregorio & Lee, 2002; Thurow, 1972). We found the interaction term to be insignificant. Each PSU is located in either an urban or a rural region. We also added the level 2 variable of region and found that the PSU level variance decreased by around 2%, as region explains it. We also added random slopes, but the variance of the random slope is very small, not even around 1%. The likelihood ratio test suggests that Model 3 is the preferred model.

Fixed Effects
For the final Model 3, odds ratios are also shown. The fixed effects show that with every 10-year increase in age, the odds of taking household debt decrease. There is a culture of depending on children in later life. In a patriarchal society like Pakistan, some people bring up their sons considering them as assets who will repay their efforts sooner or later. Older people can depend on their children rather than on debt to meet their expenditures. This might be the reason why the data reveal that with every 10-year increase in age, the odds of having debt decrease. Similar results were found in another study of age cohorts (Li & Goodman, 2015). The result is supported by the literature: Duca and Rosenthal (1993) looked at the relationship between debt and age, which turned out to be negative. These results are not in line with the mainstream findings in the literature (Fabbri & Padula, 2004; Yilmazer & DeVaney, 2005), where debt increases with age.
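Odds ratios such as those reported for Model 3 are obtained by exponentiating the logit coefficients. The sketch below shows the conversion for a hypothetical coefficient on age measured in units of 10 years; the value -0.05 is an illustrative assumption and not an estimate from Table 3.

```python
import math


def odds_ratio(coef):
    """Odds ratio implied by a logit coefficient."""
    return math.exp(coef)


def percent_change_in_odds(coef):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (math.exp(coef) - 1.0) * 100.0


age10_coef = -0.05  # hypothetical coefficient per 10 years of age
age_or = odds_ratio(age10_coef)
age_change = percent_change_in_odds(age10_coef)
```

An odds ratio below 1 corresponds to a negative coefficient, so with this illustrative value each additional 10 years of age would lower the odds of holding debt by a little under 5%.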
A 5-year increase in education decreases the odds of taking household debt. Usually, better-educated individuals have better jobs, which lead to higher income, and hence employed persons with high income can apply for debt easily (Crook, 2006). The results show that the effect of education on the decision to take household debt is not moderated by income. The interaction between education and income did not turn out to be statistically significant, which shows that education and income do not alter each other's relationship with household debt. A person having higher education may decide to take debt as compared to those having less education, but his decision will not be affected by his income, and vice versa.
Household size is also significant in determining entry into debt. A household size of 1-5 is taken as the reference category. The odds of having debt are greater for small households than for medium-sized households. Larger households (more than 8 persons) have higher odds of having debt than smaller households (between 1 and 5 persons). In general, when household size increases, household expenditure increases and eventually the odds of taking debt also increase. This finding is in line with the literature; Magri (2002) found the same result.
An important association was found between financial assets and debt: larger financial assets decrease the odds of having debt. A negative relationship between the amount of debt and financial assets was also found in the literature (Banks et al., 2002). People with an annual income greater than PKR 145,000 have higher odds of having debt. They can more easily apply for formal loans, and for larger amounts, because a monthly salary of PKR 12,500 is required to apply for a formal loan. The odds of having debt thus increase with income, which was also found in earlier investigations (Crook, 2006; Petrides & Karagrigoriou, 2008).
We found that being a paid employee decreases the odds of having debt compared with other employment categories. This means that persons who are in agricultural employment, are self-employed or are employers have higher odds of taking debt.
The odds of having debt are lower in urban areas. Living in an urban area may offer better opportunities and earning prospects, so a person may not need to take debt. In sum, persons who live in rural areas, are less educated, are younger, and have large households, small financial assets and high income have high odds of having debt. We consider Model 2 as the predictive model for household debt decisions.

Random Effects
Model 3 suggests that the decision to take household debt varies by 22% at the PSU level and 18% at the provincial level. This random variation is due to unobserved characteristics. PSUs are enumeration blocks consisting of cities and villages. Our main concern is the provincial variation because debt strategies and policies can be designed for each province rather than for each enumeration block. The provincial variation may reflect the unique profile of each province. Khyber Pakhtunkhwa, for example, is an economically depressed and underdeveloped province that has been more affected by terrorism than the other provinces. Terrorism has worsened its economic condition and, consequently, its demand for debt differs (Nouman & Khan, 2010).
Ownership of land is important in agrarian countries like Pakistan because land is the primary productive asset and source of income. The average land holding was found to be highest in Balochistan, followed by Punjab and Sindh. The unequal distribution of land in Pakistan has resulted in different needs and demand for credit (Anwar, Qureshi, Ali, & Ahmad, 2004). In the last few years, the incidence of natural calamities increased in Sindh, which led to credit needs different from those of other provinces. Differences in cultural profile, income resource distribution, social class inequality, agricultural contribution, natural hazards and security concerns may lead to random variation in entry into the debt market and in debt demand.
We also plotted the random residuals and found variation in the residuals at the PSU level. The caterpillar plot in Appendix Figure 1 shows the PSU effects in rank order, with the vertical range showing the 95% confidence interval. Almost all of the residual ranges touch the red line, which suggests that the random residuals are similar and not heteroscedastic. We take the homoscedasticity of the random residuals at the higher level to indicate that residuals at all levels are homoscedastic, although variation between individual PSUs is present and has been estimated by the model.
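The PSU-level and province-level shares reported for Model 3 are variance partition coefficients. The text does not state how they were computed; a common convention for a three-level random-intercept logit model, assumed here, is the latent-variable approach, in which the household-level residual variance is fixed at $\pi^2/3$. The symbols $\sigma^2_v$ (province-level variance) and $\sigma^2_u$ (PSU-level variance) are notation introduced for this sketch:

```latex
\[
\mathrm{VPC}_{\text{province}} = \frac{\sigma^2_v}{\sigma^2_v + \sigma^2_u + \pi^2/3},
\qquad
\mathrm{VPC}_{\text{PSU}} = \frac{\sigma^2_u}{\sigma^2_v + \sigma^2_u + \pi^2/3}.
\]
```

Under this convention, each share is read as the proportion of the total latent variance in the debt decision that lies at the given level.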

Demand for Debt
To investigate the demand for debt, we used multilevel mixed-effects linear regression. Only households with debt are considered in this analysis.
The amount of debt was used as the measure of demand for debt; households that did not report the amount of debt were excluded. To check that the sample used at this stage does not suffer from selection bias, we estimated a Heckman (two-step) selection model, which both tests for and corrects sample selection bias. In the first stage, the probability of taking debt is estimated from its determinants. In the second stage, where households with debt form a non-randomly selected sample, a transformation of the individual probabilities is included as a determinant. Rho is the estimated correlation between the error terms of the two equations. Lambda is the estimated coefficient of the inverse Mills ratio, which equals the product of rho and the standard deviation of the error term in the debt amount equation. The insignificance of lambda shows that there is no selectivity bias in the sample used at this stage. The result is shown in Appendix Table 2. No evidence of multicollinearity was found among the independent variables, as shown in Appendix Table 1.
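The second-stage regressor in the two-step procedure is the inverse Mills ratio evaluated at each household's first-stage probit index. A minimal sketch of that transformation, using only the Python standard library, is given here; the function name and the example index values are illustrative assumptions and are not taken from the study.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def inverse_mills_ratio(index: float) -> float:
    """Inverse Mills ratio phi(z) / Phi(z) for a first-stage probit index z."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)

# Hypothetical first-stage probit indices for three households (assumed values).
probit_indices = [-1.0, 0.0, 1.5]
imr = [inverse_mills_ratio(z) for z in probit_indices]

# Households with a low predicted probability of borrowing get a larger
# correction term; the term shrinks towards 0 as the probability approaches 1.
for z, lam in zip(probit_indices, imr):
    print(f"index={z:+.1f}  inverse Mills ratio={lam:.4f}")
```

In the second stage, this term enters the debt amount equation as an additional regressor, and its coefficient is the lambda described above.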
To examine structural differences in household demand for debt, we fitted a multilevel model to the data. Without any explanatory variables, we fitted a three-level model with province as the highest level, PSU as the second level and household as level 1. The likelihood ratio test shows that the three-level model is preferred to a single-level model.
Table 4 shows that the variation in household debt is 12.5% at the provincial level and 21.2% at the PSU level. As we could not find previous studies in the context of Pakistan that report random variation in household debt, we cannot compare the magnitude of this variation. Variation of 10% is usually considered substantial, but there is no specific benchmark, as it varies by study and field. The correlation between two household heads in the same province but different PSUs is 12.5%, whereas for those in the same province and the same PSU it is 33.8%.
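The two correlations reported here follow from the variance shares. The sketch below reproduces the arithmetic under the assumption of a three-level random-intercept model, treating the reported shares as variance components on a scale where the total variance is 1; the household-level share is the implied remainder rather than a figure stated in the text, and the function name is hypothetical.

```python
def intraclass_correlations(var_province: float, var_psu: float, var_household: float):
    """Return the (same province / different PSU, same province / same PSU)
    correlations for a three-level random-intercept model."""
    total = var_province + var_psu + var_household
    same_province_diff_psu = var_province / total
    same_province_same_psu = (var_province + var_psu) / total
    return same_province_diff_psu, same_province_same_psu

# Variance shares from Table 4; the household share is the implied remainder.
share_province = 0.125
share_psu = 0.212
share_household = 1.0 - share_province - share_psu

rho_province, rho_psu = intraclass_correlations(share_province, share_psu, share_household)
print(f"Same province, different PSU: {rho_province:.3f}")
print(f"Same province, same PSU:      {rho_psu:.3f}")
```

With the rounded shares, the second value comes to about 33.7%; the 33.8% reported in the text presumably reflects unrounded variance estimates.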
We therefore prefer the three-level model to a two-level model. Table 5 shows the results of the multilevel mixed-effects linear regression for the three-level model.
Model 1 includes the explanatory variables, Model 2 adds the interaction terms and Model 3 adds the level 2 variable. On the basis of the likelihood ratio test, we select Model 3 as our final model.
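The likelihood ratio comparisons used for model selection can be summarised in one expression. Writing $\ell_0$ and $\ell_1$ for the maximised log-likelihoods of the smaller (nested) model and the larger model, a notation introduced here and not used in the original text, the test statistic is:

```latex
\[
\mathrm{LR} = -2\,(\ell_0 - \ell_1) \;\sim\; \chi^2_{q},
\]
```

where $q$ is the number of additional parameters in the larger model. This is the standard asymptotic form; the text does not report which reference distribution was used, and when the added parameters are variance components the $\chi^2_q$ approximation tends to be conservative.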

Fixed Effects
With every 10-year increase in age, the percentage amount of debt increases, because skills, experience and income improve with age and make households eligible for more debt. This finding is in line with the literature. The LCIH also supports an increase in debt with age (Modigliani, 1986). Other researchers also found that debt increases with age (Fabbri & Padula, 2004; Magri, 2002; Yilmazer & DeVaney, 2005).
Household size is also an important factor affecting the amount of debt: the percentage of debt was higher for larger households. As household size increases with income held constant, it becomes difficult to manage expenditures, and debt may be required (Livingstone & Lunt, 1992; Magri, 2002; Togba, 2012). Our results are consistent with this: larger households have more expenditures from a limited income and may therefore need debt.
Our results also show that households with higher income have a higher percentage of debt. Earlier research showed that income plays an important role in affecting debt. High income may make people eligible to apply for larger amounts of debt from formal institutions (Crook, 2006; Del-Río & Young, 2005; Petrides & Karagrigoriou, 2008).
We also found that paid employees have less debt than others, which may be due to their continuous stream of income. Those who are in the agricultural sector or are employers tend to have more debt. Employment thus plays an important role in affecting the amount of debt; employed persons may have more debt (Crook, 2006; Tudela & Young, 2005).
Results show a positive association between education and debt amount. Education affects household debt positively because better education offers prospects of better earnings and a good understanding of financial options. Individuals use loans to fulfil their desires in anticipation of future income, which will be used to repay the debt (Kim & DeVaney, 2001). The interaction between education and income is insignificant, which means that the relationship of education with household debt is not moderated by income.
The data show that household debt is positively affected by household resources such as income and assets, as these resources give better eligibility for and access to credit. Education also significantly and positively affects the amount of household debt. People with higher financial assets and higher income have a higher percentage of debt. People with higher financial assets have lower odds of taking debt than those with lower financial assets, but if they do take debt, their percentage amount of debt is higher. People with more assets treat them as a backup for emergencies and draw on them in case of need instead of taking a loan, whereas people with higher income often have better eligibility for a larger amount of debt, which they can easily repay, so they take more debt than those with lower income. These findings also conform to the literature (Leonard & Di, 2014).
People in urban areas have a higher percentage of debt than people in rural areas. It is worth noting that people in rural areas have higher odds of deciding to take debt, but the percentage of debt is approximately 6% higher in urban areas. This may be because of better access to debt in urban areas. The year dummies show a significant increase in the amount of debt over the years, except for 2005/06, yet the average amount of debt over the years remained PKR 92,955. This shows that the debt amount has increased over the years, but the increase was not very large.
We added an interaction term between education and income, but it did not turn out to be significant. Because each PSU is located in either an urban or a rural area, we also included region in the model, but it did not decrease the PSU-level variance.

Random Effects
There is 12% variation in the random intercepts of household debt at the provincial level. Figure 5 shows that the random intercept of household debt for Punjab is positive, while those of the other provinces are negative. Some factors differ across provinces and affect household debt differently. Further research is required to identify the province-specific factors that affect household debt. Different policies for each province may be more effective for financial inclusion or debt provision than a general policy for the whole country.
The model was found to be robust, as the residuals were normal and homoscedastic. We also plotted the random residuals and found variation in the residuals at the PSU level. Appendix Figure 2 shows that almost all of the residual ranges touch the red line, which suggests that the random residuals are not heteroscedastic and that the estimates will be efficient. We take this homoscedasticity of the random residuals to indicate that the residuals overall are homoscedastic. Normal residuals are shown in Appendix Figure 3.

CONCLUSION
As the focus of research on household debt has shifted from solely economic factors to a wider range of characteristics affecting debt behaviour, including age, gender and education, there is a need to conduct such a study in each country to identify debt behaviour for policy making, as the results of a single country cannot be generalised to others. Unique and situational factors in every area determine the debt behaviour of households. Unfortunately, not every country has a comprehensive survey database for such a study, so there is much room for research in this area. In Pakistan, borrowing is not very common. Some unobserved structural features of the provinces lead to 18% variation in the household debt decision and 12% variation in household demand for debt. Evidence from the HIES shows that only people with certain characteristics enter the debt market. The mean amount of debt is not very high, which may signal that households do not borrow for investment purposes. Earlier literature in the context of Pakistan usually revolved around access, outreach and financial stability on both the demand and supply sides. We argue that the debate goes beyond access and outreach; the underlying issue is more basic.
The odds of having debt decrease with living in an urban area and with higher education. Both ensure better opportunities and earning prospects, so a person may not need to take debt. The findings on household size are in line with the literature; Magri (2002) found the same results. The odds of having debt increase with income, which was also found in earlier investigations (Crook, 2006; Petrides & Karagrigoriou, 2008), but decrease with financial assets. When actual income is high, people may decide to take more debt because they are eligible for larger loans, which can be used for investment purposes. We found that people with high financial assets, higher income and larger households, and those who are not paid employees, tend to have a higher percentage of debt. The amount of debt also increases with education and age. Demand for debt and household debt decisions vary randomly at the provincial level, which calls for a much deeper exploration of the factors affecting them in each province. This also signals that a national policy on household debt may give different results in each province.
There is a need to understand our market base on the demand side. Only then can institutions whose sole purpose is to provide financing to the poor at affordable rates work efficiently. We can now answer the questions raised in the introduction. On average, people who want to supplement their income enter the debt market; they borrow for convenience. People taking on debt have an average household size of 7. Larger households mean higher expenditures, so people might take on debt to finance consumption whenever required. Following the literature, we related higher income to higher education, but the interaction term was insignificant. Currently, financial institutions are not readily offering products for this demand base. Some households are not eligible to apply for a loan from formal institutions, and some do not want to apply for one (voluntary financial exclusion). There is a need to develop sustainable loans for the identified demand base. Financial institutions usually ignore this demand base, considering these households marginally poor and lending to them risky.
The study has generated some important implications that may be of interest to policy makers. It showed a low prevalence of credit among households. Household debt is positively related to paid employment, income and financial assets. Relaxing the terms of debt for those who are agriculturally employed or self-employed, or who have low income and financial assets, can increase the number of borrowers and help to eradicate poverty. Provision of formal credit on easy terms is very important for increasing financial inclusion. To improve the credit inclusivity necessary to eradicate poverty, financial institutions and governments should offer a broad range of credit products. Credit products should focus on people who are employed in the agricultural sector, live in rural areas, are young or less educated, and have fewer financial assets and large households, because they want to enter the debt market. Hence, success lies in providing sustainable financial services that can help raise the standard of living of the poor. People take loans from the informal sector, which lacks the capacity to grow sustainably. If the capacity of banks is built to provide easy, need-based loans whose operation is close to that of the informal sector, the debt market base can be extended and people will be more receptive towards banks. However, each province requires a policy tailored to it in order to achieve effective results for financial inclusion. Where households are sceptical of conventional banks, inclusion can be improved by emphasising Islamic finance.
In this research, we found provincial random variation in demand for debt and in the household debt decision, which opens the door for further investigation by other researchers. If province-specific factors other than those identified in this research are explored, specific provincial policies may help to achieve the agenda of debt provision and so increase financial inclusion effectively. This research also signals that any universal policy for the whole country may affect household debt differently in each province. A limitation of the data used in this study is that they do not cover psychological aspects of debt behaviour, which need to be explored. Psychological factors may help explain the random variation at the provincial level.

Figure 1: Different levels of the data.
, education, gender and employment status are the characteristics of the household head.

Figure 2: Percentage of households, deciding to take debt, by province over the years.

Figure 3: Percentage of the amount of household debt taken by household, by provinces over the years.

Figure 5: Random intercept of household debt at a provincial level.

Figure 2: Caterpillar plot of residuals at PSU level.

Figure 3: Normal quantile plot of residuals at PSU level.

Table 3: Multilevel Mixed Effect Binary Logistic Regression Showing the Effect of Interactions and Level Two Variable on the Decision of Debt
Dependent variable = Household debt (1 & 0). *** p<0.01, ** p<0.05, * p<0.1.