Issues in Statistical Modelling of Human Capital and Economic Growth Nexus

The relationship between human capital and growth has been the subject of considerable debate in the economic literature. Empirical growth models are beset with problems ranging from theoretical frameworks and statistical modelling to estimation procedures. Because no precise measure of human capital is available, theoretical predictions often fail when confronted with empirical data. This paper addresses four main questions that have figured prominently in this debate: Is there a direct interplay between human capital and growth? Are parametric techniques less capable than semi-parametric techniques of capturing non-linear aspects of the human capital-growth relationship? Are estimates of human capital sensitive to the proxy used for the human capital variables? Are estimates of human capital sensitive to estimation techniques? Our findings reveal that human capital has a well-established role in accelerating growth through both its ‘level effects’ and ‘rate effects’. The results are not sensitive to the definition of the education variable but are rather technique dependent. In contrast to the parametric models, the semi-parametric model provides sufficient evidence of non-linearity in the human capital-growth relationship.


INTRODUCTION
The impact of human capital on economic growth has been a contested point in the economic literature, and the theoretical literature and the empirical evidence diverge on it. Theoretically, the impact of human capital on economic growth has been explained in two distinct ways. The first is the 'level effect', which posits a direct relationship between economic growth and human capital [Mankiw, et al. (1992); Islam (1995); Barro (2001); Bassanini and Scarpetta (2001); Freire-Seren (2001); Agiomirgianakis, et al. (2002)]. The second, the 'rate effect', maintains that human capital influences technological progress, thereby indirectly facilitating economic growth through the adoption and generation of new technologies [Lucas (1988); Romer (1990); Black and Lynch (1996); Edwards (1997); Maudas, et al. (1999); Loof and Anderson (2008)].
When the theoretical growth models are taken to empirical studies, they run into problems ranging from statistical modelling to estimation procedures. Statistical growth models assume a linear relationship between human capital and economic growth, an assumption that is increasingly questioned in the recent growth literature. Because the effect of human capital is nonlinear, linear growth models are incapable of capturing the human capital-growth association [Liu and Stengos (1999); Krueger and Lindahl (2001); Kalaitzidakis, et al. (2001); Kourtellos (2002); Mamuneas, et al. (2006)].
The estimation of these statistical growth models gives rise to another set of complications. In most of the research literature, education has been treated as the sole determinant of human capital. The importance of education notwithstanding, a healthier workforce is better able to learn, invent, and implement new technologies. Health status, which stands out as a crucial component of human capital, has nevertheless usually been overlooked in growth studies. Moreover, the proxy employed for human capital further complicates matters. The commonly used proxies, such as the literacy rate, enrollment ratios, and educational attainment, have inherent limitations that lead to imprecise estimates. The choice of a suitable estimation technique is also critical, as the relative strengths and weaknesses of the technique employed affect the resulting human capital coefficients. As a result, empirical growth models frequently produce inaccurate estimates that are inconsistent with the prior expectations from theoretical frameworks.
In panel data models, the choice of estimation technique is crucial for the validity of estimates [Wane (2004)], and the results of growth models are often sensitive to the estimation technique used [Sturm and de Haan (2005); Butkiewicz and Yanikkaya (2006); Biesebroeck (2008); Erdal and Yenipazarli (2013)].
The Fixed Effects (FE) model suffers from attenuation bias in the estimated coefficients, generating considerably smaller and often insignificant estimates [McKinnish (2000)]. If all regressors are exogenous, however, FE generates efficient and consistent estimates, whereas Generalized Method of Moments (GMM) estimates are consistent but inefficient [Gyimah-Brempong, et al. (2012)]. The estimated labour coefficient is significantly lower for parametric models than for OLS when simultaneity issues and productivity are accounted for. Capital coefficients, on the other hand, are not much affected but can be over- or under-estimated relative to OLS, with GMM capital coefficient estimates being higher [Biesebroeck (2008)].
National accounts and other data from developing countries are affected by measurement errors, raising concerns about growth model estimates [Barro (2000)]. Finding the most suitable technique, with efficient and robust estimators, therefore becomes more important for at least two reasons. First, a regressor may be a poor measure of the variable of interest even if it is not itself measured with error. Second, estimating and comparing different estimation techniques helps to identify the degree of measurement error in an independent variable [McKinnish (2000)].
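The attenuation concern raised above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the paper's data or estimator: the true slope, the noise levels, and the helper name `ols_slope` are all illustrative assumptions.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)          # fixed seed so the run is reproducible
n, true_beta = 5000, 1.0        # hypothetical sample size and true slope

x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [true_beta * x + rng.gauss(0, 0.5) for x in x_true]

# Observed regressor = true regressor + classical measurement error.
x_noisy = [x + rng.gauss(0, 1) for x in x_true]

slope_clean = ols_slope(x_true, y)    # close to the true slope
slope_noisy = ols_slope(x_noisy, y)   # shrunk towards zero

print(f"slope with clean regressor: {slope_clean:.3f}")
print(f"slope with noisy regressor: {slope_noisy:.3f}")
```

With equal variances for the true regressor and the error, as assumed here, the noisy slope is expected to be roughly half the true slope, which is the kind of downward bias that motivates comparing estimators.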
In the light of the above discussion, the following questions stand out in the context of the human capital-growth relationship: a. Is there a direct interplay between human capital and growth? b. Are parametric techniques less capable than semi-parametric techniques of capturing nonlinear aspects of the human capital-growth relationship? c. Are estimates of human capital sensitive to the proxy used for the human capital variables? d. Are estimates of human capital sensitive to estimation techniques?
The objective of this paper is to resolve the dilemma of incongruity between theoretical and empirical evidence on human capital-growth relationship. The paper specifically aims at dealing with statistical modelling of growth frameworks. In order to provide a comprehensive analysis, human capital has been decomposed into education and health variables.
The present study contributes to the debate on human capital and economic growth. Human capital is a fundamental element of economic growth models. Empirical problems with the human capital variable not only yield imprecise human capital estimates but also affect the reliability of the other estimates in the growth model. Our work is distinguished from earlier work on the topic in three important ways. First, we bring together the empirical evidence from the most commonly employed estimation techniques, applying them all to a uniform data set and observing how the patterns change with the technique. Second, in the theoretical modelling, we incorporate not only education as a proxy for human capital but also a measure of health, namely life expectancy, to account for the multidimensional nature of human capital, an angle that is much under-reported in the human capital-growth nexus both theoretically and empirically. Finally, we provide empirical evidence on both the level effect and the rate effect by separately modelling the effect of human capital proxies on growth and on total factor productivity, in keeping with the theoretical work that addresses both lines in the growth literature.
The paper is organized as follows: section 2 presents the empirical methodology used in the paper. Section 3 discusses data and variable construction. Results are reported and analyzed in section 4. Finally, section 5 concludes the paper.

MODEL SPECIFICATION
Theoretical growth literature describes the impact of human capital on economic growth in two different ways: the 'level effect', which posits a direct relationship between economic growth and human capital [Mankiw, et al. (1992); Islam (1995); Barro (2001); Agiomirgianakis, et al. (2002)], and the 'rate effect', which suggests that human capital acts as the engine of technological progress via its positive spillovers and its vital contribution to research and development, thereby indirectly facilitating economic growth [Lucas (1988); Romer (1990); Edwards (1997); Maudas, et al. (1999); Loof and Anderson (2008)]. Keeping the two perspectives in mind, we examine the effect of human capital proxies related to education and health both directly on growth and indirectly on growth through total factor productivity. The model specifications adopted for these two lines of research are described below.

Modelling Impact of Human Capital on Output Growth
The empirical model employed in the study to investigate the impact of human capital on growth is a modification of the parsimonious model presented by Mamuneas, et al. (2006), whereby our starting point is a general production function describing the state of technology of a country i at time t. Further assuming that human capital is a function of education and health status, the production function can be written in terms of these two components.

Modelling Impact of Human Capital on Total Factor Productivity (TFP) Growth
To model the indirect effect on growth via Total Factor Productivity (TFP), the baseline TFP equation is a parsimonious growth accounting framework that contains only the traditional inputs, with some extensions. Models such as Equation A-16 typically suffer from parameter homogeneity: the estimated parameters capture the mean contribution of factor inputs, so the contribution of each input is assumed to be the same across time and countries. To deal with this issue, an index of TFP growth is constructed for the panel in which the parameters of factor inputs are allowed to vary not only across countries but also across time. Hence, an index of TFP growth for country i in year t is defined. Using the condition described in Equation B-3, Equation B-2 can then be rewritten. Two control variables, trade openness and democracy, are also introduced in the empirical model in view of their importance for the economic uplift of a country.
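As an illustration of a TFP growth index whose input weights vary by country and year, the following is a minimal growth-accounting sketch. It is not the paper's index: the constant-returns assumption, the function name `tfp_growth`, and all numbers in `panel` are hypothetical.

```python
def tfp_growth(g_output, g_capital, g_labour, capital_share):
    """Growth-accounting residual with an observation-specific capital share.

    Assumes constant returns to scale, so the labour share is
    1 - capital_share (an assumption of this sketch).
    """
    return (g_output
            - capital_share * g_capital
            - (1.0 - capital_share) * g_labour)

# Hypothetical records: (country, year, output growth, capital growth,
# labour growth, capital share). The share differs across rows, so the
# input weights vary across both countries and time.
panel = [
    ("A", 1971, 0.05, 0.06, 0.02, 0.40),
    ("A", 1972, 0.03, 0.04, 0.02, 0.35),
    ("B", 1971, 0.04, 0.05, 0.03, 0.50),
]

tfp_index = {
    (country, year): tfp_growth(gy, gk, gl, share)
    for country, year, gy, gk, gl, share in panel
}

for key, value in tfp_index.items():
    print(key, round(value, 4))
```

The resulting series can then serve as the dependent variable in a regression on the human capital proxies and controls.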

Control Variables
In the current research we have kept the model parsimonious, focusing only on fundamental growth variables, namely growth in physical capital, effective labour, and human capital proxies in the form of education and health indicators, with only two control variables: trade openness and democracy. These control variables have been added to account for the impact of integration via trade and of the degree of political freedom. There are other important macroeconomic variables that could serve as further controls, such as debt, a proxy for financial institutions, government expenditure variables, measures of economic institutions, and demographic variables, to name a few. However, given that human capital accumulation may itself respond to all of these variables, we chose to work with the simplest baseline model with minimal controls. Hence, in this baseline work, the analysis is restricted to these two basic controls.
Openness: Trade policies that make a country more open towards international trade along with stimulating human capital accumulation foster greater economic growth. As pointed out by Miller and Upadhyay (2002), greater outward orientation enhances efficiency in the use of resources and following the principles of comparative advantage, promotes production specialization in certain industries. The increase in exports relaxes the foreign exchange constraint and a large inflow of important inputs in the production is facilitated through imports. The countries with increased trade openness as a result experience faster economic growth.
Democracy: The role of democratic institutions in a country's economic growth is considerably emphasized in the recent growth literature. Democracy facilitates better building of economic, social, and legal institutions which have a vital role in a country's progress. Besides the direct role of democracy in the growth process, studies like Baun and Lake (2003) have also explained the indirect impact of democracy on economic growth via secondary education and life expectancy.
Adding these controls to Equations A-16 and B-4 gives the final equations for estimation. The fundamental growth variables enter as growth rates, while the controls enter in levels, since their growth over time shows only marginal variation. In light of the literature reviewed above, these equations are estimated with a variety of techniques that address heterogeneity in the sample, endogeneity of key variables, and non-linearity in the relationship between human capital proxies and growth: Common Effects (CE), Fixed Effects (FE), Random Effects (RE), Two Stage Least Squares (2SLS), Generalized Method of Moments (GMM), and a partially linear semi-parametric estimator.
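To make the difference between the Common Effects (pooled OLS) and Fixed Effects estimators concrete, here is a minimal sketch on a synthetic panel with a single regressor. The data-generating process, the true slope, and the helper names are assumptions made for illustration; they are not the paper's specification.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean (the 'within' transformation)."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

rng = random.Random(1)
n_countries, n_years, true_beta = 32, 30, 0.5   # hypothetical setup

groups, x, y = [], [], []
for country in range(n_countries):
    alpha = rng.gauss(0, 1)                # unobserved country effect
    for _ in range(n_years):
        xi = alpha + rng.gauss(0, 1)       # regressor correlated with alpha
        groups.append(country)
        x.append(xi)
        y.append(alpha + true_beta * xi + rng.gauss(0, 0.5))

beta_pooled = ols_slope(x, y)              # ignores country effects
beta_fe = ols_slope(demean_by_group(x, groups),
                    demean_by_group(y, groups))

print(f"pooled OLS slope:    {beta_pooled:.3f}")
print(f"fixed effects slope: {beta_fe:.3f}")
```

In this setup the pooled slope absorbs the country effect and overstates the true coefficient, while the within transformation removes it. The direction of the gap depends on the assumed correlation between the regressor and the country effect.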

DATA AND VARIABLE CONSTRUCTION
The present study is based on a balanced panel data set covering the period 1970-2000 for 32 developing countries, i.e., thirty observations for each country in the sample, adding up to a total of 960 observations. Countries were selected on the basis of data availability. Of the 32 countries, four belong to South Asia, four to the Middle East and North Africa, nine to Sub-Saharan Africa, eleven to Latin America, and four to the East Asia and Pacific region. The list of countries included in the sample, along with summary statistics of the key variables, is given in Appendix 1. Variable construction and data sources for each variable are reported in Table 1.
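The sample dimensions can be checked with a few lines of arithmetic. The sketch below assumes that the thirty observations per country arise because the variables enter as growth rates, so that first-differencing the 31 annual levels from 1970 to 2000 drops one year; that reading is an assumption of this example.

```python
# Annual levels from 1970 to 2000 inclusive.
years = list(range(1970, 2001))
n_countries = 32

# Assumption: growth rates are first differences, which lose one year.
growth_obs_per_country = len(years) - 1
total_obs = n_countries * growth_obs_per_country

# Regional composition as stated in the text.
regions = {
    "South Asia": 4,
    "Middle East and North Africa": 4,
    "Sub-Saharan Africa": 9,
    "Latin America": 11,
    "East Asia and Pacific": 4,
}

print(len(years), growth_obs_per_country, total_obs)
print(sum(regions.values()))
```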
Drawing on several data sets was necessary because no single source could provide complete information for all variables and all time periods.
However, to avoid inconsistencies, the key variables (capital, labour, education, and the health proxy) have been modelled as growth rates rather than levels, so that inconsistencies arising from varying units of analysis across data sets can be controlled.

EMPIRICAL RESULTS
Table 2 shows that education enhances output growth: education leads to skill development and increased productivity of the labour force, contributing positively to the growth process. The literacy rate has been adopted as the proxy for education in Table 2, as it offered the smoothest data for all countries in our sample among the possible educational indicators. The literacy rate, however, is considered a poor proxy for education, since it captures only the very first part of investment in education and neglects the larger part that lies above the attainment of basic literacy. Hence, the analysis is repeated using the most commonly used proxy for education, i.e., the mean years of schooling, and the results are reported in Table 3.
* Education and health indicators were chosen in accordance with the availability of the smoothest data for our panel of countries and time period. Other options for the health indicator (hospital beds per 1000 people, health expenditure as % of GDP, infant mortality rate) were not employed because of missing observations.
Though it remains mostly insignificant, the magnitude of the educational growth parameter falls considerably with the average years of schooling indicator. A likely reason is that this measure carries substantial noise arising from inconsistencies in the primary data used in its construction, which tends to introduce a downward bias in the estimated coefficient.
It can be shown that, as with the educational component of human capital, the coefficient of health status corresponds to the respective output elasticity. Like the educational component, the health component also makes a substantial positive contribution to growth, although smaller in magnitude than that of education. The health effect is particularly important in developing countries, where a notably large share of the workforce is employed in manual labour and better health status therefore ensures less absenteeism from work.
Better health in terms of higher life expectancy tends to encourage the growth process by providing incentives for investing in other forms of human capital. The household savings are likely to increase in view of greater life expectancy, which also supplements the domestic and foreign investment; thereby accelerating the growth process.
Results from Tables 2 and 3 reveal that, though they remain positive, the human capital coefficients vary in significance with the estimation technique. In the panel data models, both the education and health coefficients are insignificant, primarily because of endogeneity.
The use of 2SLS, which deals with the endogeneity issue, turns one of the human capital coefficients significant. The more technically advanced GMM technique turns both human capital coefficients significant.
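The way 2SLS addresses endogeneity can be sketched with one endogenous regressor and one instrument. The example below uses synthetic data; the instrument, the common shock, and the true slope are hypothetical choices for illustration, and the paper's actual instruments are not reproduced here.

```python
import random

def ols_fit(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

rng = random.Random(2)
n, true_beta = 5000, 0.5                   # hypothetical setup

z, x, y = [], [], []
for _ in range(n):
    zi = rng.gauss(0, 1)                   # instrument: moves x, not y directly
    shock = rng.gauss(0, 1)                # unobserved shock hitting x and y
    xi = zi + shock + rng.gauss(0, 0.5)    # endogenous regressor
    z.append(zi)
    x.append(xi)
    y.append(true_beta * xi + shock + rng.gauss(0, 0.5))

# Naive OLS is biased because x is correlated with the error term.
_, beta_ols = ols_fit(x, y)

# Stage 1: project the endogenous regressor on the instrument.
a0, a1 = ols_fit(z, x)
x_hat = [a0 + a1 * zi for zi in z]

# Stage 2: regress the outcome on the fitted values from stage 1.
_, beta_2sls = ols_fit(x_hat, y)

print(f"OLS slope:  {beta_ols:.3f}")
print(f"2SLS slope: {beta_2sls:.3f}")
```

Standard errors from a hand-rolled second stage are not valid as computed by OLS; a dedicated IV routine would be needed for inference.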
Tables 4 and 5 depict a positive relationship between education and TFP growth. Education develops skills among the labour force, thereby increasing its productivity, and an educated labour force is better able to innovate, use, and adapt new technologies. The estimated parameter of the mean years of schooling variable is again lower in magnitude than that of the literacy rate indicator and is insignificant in all panel data regressions. Health also has a positive association with TFP growth, but under the panel models this effect is insignificant. Under 2SLS, as shown in Table 4 with the literacy rate as the educational variable, accounting for the endogeneity of the health variable makes the coefficient of education positive and significant. On the other hand, when mean years of schooling are used as the educational indicator in Table 5, dealing with health endogeneity turns the health variable significant as well. These results indicate that, once endogeneity in the human capital components is accounted for, either of these components is likely to have a significant positive impact on growth, and that the result is sensitive to the definition of the educational measure used in the regression. Also, compared with the 2SLS technique, which captures only the significantly positive impact of education, the GMM approach indicates that both components of human capital exert a significant impact on growth. With both proxies of the education variable, the regression estimates give evidence of a strong impact of education and health status on TFP growth.
The impact of human capital on economic growth can also be nonlinear. To account for possible non-linearities in the human capital-output growth and human capital-TFP growth relationships, we included square and cubic terms of the human capital components (education and health) in various specifications of Equations A-16 and B-4. The results are reported in Appendices 2 and 3. Interaction terms relating the human capital components to democracy and trade openness are also incorporated in the alternative specifications, to explore the indirect impact of the human capital components on the growth of output and of TFP via democratic institutions and outward orientation in the economy. Neither the interaction terms nor the power terms reflect any non-linear trend in the human capital-output growth and human capital-TFP growth relationships.
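The power and interaction terms described above amount to adding extra columns to the regressor set. The sketch below shows one way to build them for a single observation; the function name, the column names, and the input values are hypothetical.

```python
def augmented_terms(hc, democracy, openness):
    """Level, squared, cubic and interaction terms for one human capital
    component (education or health). All names are illustrative."""
    return {
        "hc": hc,
        "hc_sq": hc ** 2,
        "hc_cu": hc ** 3,
        "hc_x_democracy": hc * democracy,
        "hc_x_openness": hc * openness,
    }

# One hypothetical observation.
row = augmented_terms(hc=2.0, democracy=3.0, openness=0.5)
print(row)
```

In a parametric specification of this kind, significant coefficients on the squared or cubic columns would be read as evidence of curvature.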

Semi-parametric Approach
The parametric model may lead to specification bias in case the functional form of the variables is not correctly modeled. An appropriate approach, in this case, would be to use a semi-parametric formulation in which the human capital components are estimated in a data driven non-parametric way. A partially linear semi-parametric model is used for estimation where capital growth, labour growth, democracy, and openness variables constitute the linear parametric part, while human capital is essentially employed as the non-parametric element in the model.
We first discuss the cases in which the human capital variable comprises only one component. Appendices 4 and 5, specifications 3 and 11, report the results for the parametric components when the literacy rate and health status, respectively, are used as the indicator of human capital. A second-order Gaussian kernel is used as the multivariate kernel estimator, and cross-validation is used for bandwidth selection. When the literacy rate is employed as the only indicator of human capital, the corresponding graph depicts a slightly downward sloping curve, indicating that studies which consider only the educational component of human capital are likely to obtain negative coefficients on the human capital variable. The health component, on the other hand, reveals a non-linear relationship with growth when employed as the single indicator of human capital.
We then modelled the education and health variables together as the non-parametric components; the results are reported in Appendices 4 and 5, specifications 4-6. The corresponding graphs indicate that when both the education and health components of human capital are included in the estimation model, the education component, which initially suggested a negative and somewhat linear relationship with growth, exhibits non-linear trends along with the health component. The coefficients of the parametric variables do not change their signs or significance in this case.
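A partially linear model of this kind can be sketched with a double-residual (Robinson-type) procedure: smooth the outcome and the linear regressor on the non-parametric variable, then run OLS on the residuals. The example below is a simplified illustration on synthetic data with one linear regressor and one non-parametric variable. It uses a fixed, hand-picked bandwidth instead of the cross-validation mentioned above, and the function names and data-generating process are assumptions.

```python
import math
import random

def nw_smooth(z_eval, z, v, h):
    """Nadaraya-Watson estimate of E[v | z = z_eval], Gaussian kernel."""
    w = [math.exp(-0.5 * ((z_eval - zj) / h) ** 2) for zj in z]
    return sum(wi * vi for wi, vi in zip(w, v)) / sum(w)

rng = random.Random(3)
n, true_beta, h = 400, 0.5, 0.3   # h: fixed bandwidth (assumption)

# z plays the role of the human capital variable (non-parametric part);
# x plays the role of a linear regressor such as capital growth.
z = [rng.uniform(-2, 2) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [true_beta * xi + math.sin(2 * zi) + rng.gauss(0, 0.3)
     for xi, zi in zip(x, z)]

# Step 1: remove the part of y and x that is explained by z.
y_res = [yi - nw_smooth(zi, z, y, h) for yi, zi in zip(y, z)]
x_res = [xi - nw_smooth(zi, z, x, h) for xi, zi in zip(x, z)]

# Step 2: OLS without intercept on the residuals estimates beta.
beta_hat = (sum(a * b for a, b in zip(x_res, y_res))
            / sum(a * a for a in x_res))

# Step 3: smooth y - beta_hat * x on z to recover the unknown function.
partial = [yi - beta_hat * xi for xi, yi in zip(x, y)]

def g_hat(z0):
    """Estimated non-parametric component evaluated at z0."""
    return nw_smooth(z0, z, partial, h)

print(f"beta_hat: {beta_hat:.3f}")
print(f"g_hat(pi/4): {g_hat(math.pi / 4):.3f}")
```

Plotting `g_hat` over a grid of values would give a curve analogous to the graphs discussed in the text, with its shape left to the data instead of being imposed by a polynomial.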
A model specification test has been conducted as proposed by Li and Wang (1998), in which the general linear model is tested against the semi-parametric formulation. The test statistic is insignificant in all cases, implying that the null hypothesis of a linear parametric model cannot be rejected.
The interaction terms of the human capital components with democracy and openness are used in the semi-parametric formulations, which allow the impact of the human capital components on growth to be non-linear. The corresponding graphs clearly depict that the human capital components affect output growth and TFP growth in a non-linear manner. However, the exact nature of the non-linear relationship is not evident.

CONCLUSION
For the last few decades, researchers have contributed tremendously to the field of economic growth. There has been a continuous inflow of research papers right from the theoretical models of growth to their implications for the real world data. Among the different variables that gained the researchers' attention towards their impact on growth, human capital is undeniably the most significant one.
The empirical growth literature has yet to settle whether human capital-growth linkages accord with the theoretical growth models, which assign a fundamental role to human capital investments. This study has been conducted to resolve the incongruity between theoretical and empirical evidence on the human capital-growth relationship. First, it examines whether human capital, in terms of education and health indicators, has a direct impact on output growth or an indirect effect through total factor productivity growth. It then explores the sensitivity of the human capital estimates to different econometric estimation techniques and to the proxy for education. It also investigates whether human capital-growth linkages are linear or non-linear, using a sample of selected developing countries. In view of the importance of health status in the context of developing countries, human capital is decomposed into education and health measures.
The findings reveal that the human capital has a well-established role in accelerating growth through both its 'level effects' and 'rate effects'. All the estimation techniques used in the analysis confirm the positive impact of human capital components on growth. However, the significance of the coefficients of human capital components varies with the estimation technique. In the panel data models, the coefficients of both components of human capital are insignificant. The 2SLS technique was applied in response to the endogeneity detected in the health variable, which turned the coefficient of one of the human capital components significant, depending upon the indicator of the educational variable. The application of the GMM technique turned the coefficients of both human capital components significant. Thus, the estimates of human capital components become accurate and sound with the incorporation of theoretically sound estimation techniques that could deal with the more complex empirical growth issues.
Regarding the linear and non-linear connections of human capital with growth, the findings reveal that the parametric models perform well in the case of a linear relationship between human capital and growth. The parametric models considered in the present study are unable to reveal non-linearity in the human capital-growth associations. The semi-parametric model, on the other hand, does indicate the existence of non-linear linkages between human capital and growth, though it does not depict the true non-linear functional form. In a nutshell, human capital coefficients in growth models are technique dependent. Furthermore, the choice of human capital proxies creates only minor differences in the estimation results. However, our findings do not undermine the need to improve human capital proxies based on education † and health ‡ indicators. Rather, the analysis shows that, besides improving the quality of such proxies in line with their relevance to the growth process, research on statistical issues such as endogeneity and functional form misspecification for these proxies, and the heterogeneity of the countries assessed, should not be overlooked.
† A few examples could be to extend the notion of education beyond formal to non-formal avenues of skill formation, such as technical training or on-the-job training, besides enriching data on educational expenditures or on the composition of human capital formed through formal education, as in Cooray (2009) and Hawkes and Ugur (2012).
‡ A few possible extensions of health proxies could include acquiring data on hospital beds per 1000 people, health expenditure as % of GDP, infant mortality rate, or average visits to hospital per person, etc.
Notes: 1) The results are robust to White heteroscedasticity.
2) Values in parentheses are standard errors (SE).
