Relationship between Pretreatment Serum Albumin Levels and the Risk of Malignant Pleural Mesothelioma

Background: Malignant pleural mesothelioma (MPM) is a very rare and aggressive form of cancer. Pretreatment serum albumin (SA), the main circulating protein in blood, was recently found to be a significant prognostic factor for MPM patients. The objective of this article is to show the relationship between pretreatment SA levels and the risk of MPM. Methods: The generalized additive model (GAM), an advanced regression method, is used to find the mathematical relationship between the response variable (SA) and the cofactors. Results: The main determinants of SA are identified. Asbestos exposure, hemoglobin and disease diagnosis status (patients having MPM) are the factors significantly associated with SA, whereas duration of asbestos exposure, duration of disease symptoms, total protein (TP), pleural lactic dehydrogenase (PLD), pleural protein (PP), pleural glucose (PG) and C-reactive protein (CRP) are the significant continuous variables for SA. The non-parametric part of the model shows that lactate dehydrogenase (LDH) and glucose level are the significant smoothing terms. In addition, second- and third-order interactions between cofactors are highly significant for SA. Conclusions: The findings suggest that serum albumin may act as a very significant prognostic factor for MPM, and this is established here through mathematical modeling. A few of the findings already exist in the MPM research literature, whereas others are completely new.


INTRODUCTION
Malignant pleural mesothelioma (MPM) is a very aggressive tumor. Mesothelioma originates from the pleura, pericardium, peritoneum or tunica vaginalis and has been recognized since the early 1960s to be strongly related to asbestos exposure [1]; it may also be related to previous simian virus 40 (SV40) infection, radiation and, quite possibly, genetic predisposition [2,3]. The incidence of MPM is extremely high in some Turkish villages where there is low-level environmental exposure to erionite, a naturally occurring fibrous mineral that belongs to a group of minerals called zeolites. Environmental asbestos exposure and MPM are among the major public health problems of Turkey. Molecular mechanisms can also be implicated in the development of mesothelioma [4]. Rural living is associated with the development of mesothelioma [5][6][7]. Soil mixtures containing asbestos, known as 'white-soil' or 'corak', can be found in Anatolia, Turkey, and as 'Luto' in Greece [7][8][9][10][11]. MPM is a fatal cancer of increasing incidence associated with asbestos exposure [12]. MPM is resistant to the common tumor-directed therapies, but individual patients might respond to chemotherapy, radiotherapy or immunotherapy, and selected patients might benefit from radical surgery and multimodality treatment [13]. MM is a rare disease with an incidence rate of 1-2 per million/year [14] in the general population. In industrialized countries, the rate ranges from 1 to 5 per million/year for women and 10-30 per million/year for men [15][16][17]. The higher incidence rates in industrialized countries may be due to asbestos exposure [11].
*Address correspondence to this author at the Department of Mathematics, NSHM Knowledge Campus, Durgapur, West-Bengal, PIN-713212, India; Mob: +919775745309/+918918060382; E-mail: sabyasachi99@gmail.com
MPM is responsible for approximately 15,000-20,000 deaths annually worldwide [4]. An estimated 1000 patients have MPM in Turkey per year. The annual incidence of pleural mesothelioma was 22.4/1,000,000 in Anatolia [18].
Most of the work using this MPM dataset has been diagnostic and based on various classifiers [23][24][55]. The objective was to classify or diagnose the disease with the minimum misclassification rate. Diagnosis usually begins when a patient visits the doctor to have symptoms checked. Patients may present with shortness of breath, pain in the chest or back, painful and persistent coughing, or any number of other symptoms, none of which immediately alerts the doctor to a diagnosis of mesothelioma [19]. Several studies have examined the epidemiology and clinical features of MPM in the south east of Turkey [20][21][22]. There are also many studies on MPM diagnosis using artificial intelligence techniques such as probabilistic neural networks (PNNs), learning vector quantization (LVQ) [23], artificial immune systems (AIS) and multi-layer neural networks (MLNN) [24] with prognostic data. MPM is a very rare, malignant and fatal disease with a poor prognosis.
Serum albumin (SA), the main circulating protein in blood, is a prognostic factor for MPM patients. This finding was recently established by a team of Chinese researchers, whose report shows that this abundant protein may offer one of the simplest ways to predict mesothelioma prognosis [25]. Human serum albumin, or simply serum albumin, constitutes about half of serum protein. It is produced in the liver and is soluble and monomeric. Albumin transports hormones, fatty acids and other compounds, buffers pH, and maintains oncotic pressure, among other functions. It is synthesized in the liver as preproalbumin, which has an N-terminal peptide that is removed before the nascent protein is released from the rough endoplasmic reticulum. The product, proalbumin, is in turn cleaved in the Golgi vesicles to produce the secreted albumin. The reference range for albumin concentration in serum is approximately 35-50 g/L (3.5-5.0 g/dL). A lower-than-normal level of blood albumin may be a sign of many diseases, such as liver and kidney disease, and it is now also a prognostic factor for MPM. Albumin has a serum half-life of approximately 20 days and a molecular mass of 66.5 kDa [26].
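The unit relationship and the reference range quoted above amount to simple arithmetic. The following is a minimal Python sketch of that arithmetic; the function names and the three labels are assumptions made for illustration, and the bounds are the approximate 35-50 g/L range given in the text, not a clinical decision rule.

```python
# Illustrative helpers; names and labels are assumptions, not part of the dataset.
REFERENCE_RANGE_G_PER_L = (35.0, 50.0)  # approximate range quoted in the text


def g_per_dl_to_g_per_l(value_g_per_dl: float) -> float:
    """Convert a concentration from g/dL to g/L (1 dL = 0.1 L)."""
    return value_g_per_dl * 10.0


def albumin_status(albumin_g_per_l: float) -> str:
    """Label a serum albumin value relative to the quoted reference range."""
    low, high = REFERENCE_RANGE_G_PER_L
    if albumin_g_per_l < low:
        return "below range"
    if albumin_g_per_l > high:
        return "above range"
    return "within range"
```

A value recorded in g/dL would first be converted with `g_per_dl_to_g_per_l` and then passed to `albumin_status`.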
This article aims to explore the relationship between SA and the biochemical and demographic parameters in the dataset of MPM patients. SA plays the role of the response variable, and the other factors and variables are the possible cofactors. SA is a positive, heterogeneous and non-normally distributed continuous random variable, and such variables are generally modeled through either the gamma or the log-normal distribution. It has also been observed that a few biochemical parameters are non-linearly associated with SA. It is therefore better to use a generalized additive model (GAM) in place of an ordinary regression such as multiple regression or a generalized linear model (GLM) [27]. Joint GLMs can also handle this type of positive, non-normal, heterogeneous data, but this article prefers the GAM because of the method's flexibility and efficiency in complex data analysis [28][29][30].
In the statistical analysis of clinical trials and observational studies, the identification and adjustment of prognostic factors is an important step towards a valid outcome. The failure to consider important prognostic variables, particularly in observational studies, can lead to errors in estimating treatment differences. In addition, incorrect modeling of prognostic factors can result in the failure to identify nonlinear trends or threshold effects on survival. This article describes flexible statistical methods that may be used to identify and characterize the effect of potential prognostic factors on disease endpoints. These methods are called 'Generalized Additive Models' (GAM) [31][32][33].
The major objective of this study is to explore the relationship between SA and the other biomedical parameters of MPM patients. Many authors have used various classification techniques on this dataset for MPM diagnosis [23,24], but advanced regression or probabilistic modeling techniques have probably not been applied to it under a proper modeling scheme.

Material
The research reported here uses patients' hospital reports from the Faculty of Medicine, Dicle University. A special characteristic of this diagnosis study is that it uses a real dataset taken from patient reports at this hospital [24]. Data from three hundred and twenty-four (324) MM patients who were diagnosed and treated were used. The patient files were investigated and analyzed retrospectively. In the dataset, all samples have 35 features, because under the doctors' guidance this set is more effective than other subsets of factors. These features are age, gender, city, asbestos exposure, type of MM, duration of asbestos exposure, diagnosis method, keep side, cytology, duration of symptoms, dyspnoea, ache on chest, weakness, habit of cigarette, performance status, white blood cell count (WBC), hemoglobin (HGB), platelet count (PLT), sedimentation, blood lactic dehydrogenase (LDH), alkaline phosphatase (ALP), total protein, albumin, glucose, pleural lactic dehydrogenase, pleural protein, pleural albumin, pleural glucose, dead or not, pleural effusion, pleural thickness on tomography, pleural level of acidity (pH), C-reactive protein (CRP) and class of diagnosis. Diagnostic tests of each patient were recorded. Table 1 shows the detailed descriptions of the variables and their descriptive statistics. The present study is based on the dataset collected from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/ Mesothelioma).
Normal pleural protein count is less than 1-2 g/dL.

Methods
In this article, an advanced regression technique, the generalized additive model (GAM) [31,32,34], is used to find the association between serum albumin and the other parameters (biochemical, demographic and others) in the MPM dataset. It has recently been established that SA is an important prognostic factor for MPM [25]. The article seeks to make explicit which factors influence SA the most, whether negatively or positively.
The best GAM can be selected through model checking criteria, namely the R-square value, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) or the Generalized Cross Validation (GCV) value, together with regression diagnostic plots such as the normal probability plot and the plot of residuals against fitted values [31,32,34]. Whether a cofactor is significant is judged through its p-value, and the approximate significance of the smooth terms is judged in the same way. For this MPM dataset, serum albumin (SA) is taken as the response variable (Y), and age, gender, city, asbestos exposure, type of MM, duration of asbestos exposure, diagnosis method, keep side, cytology, duration of symptoms, dyspnoea, ache on chest, weakness, habit of cigarette, performance status, white blood cell count (WBC), hemoglobin (HGB), platelet count (PLT), sedimentation, blood lactic dehydrogenase (LDH), alkaline phosphatase (ALP), total protein, glucose, pleural lactic dehydrogenase, pleural protein, pleural albumin, pleural glucose, dead or not, pleural effusion, pleural thickness on tomography, pleural level of acidity (pH), C-reactive protein (CRP) and class of diagnosis are the cofactors (X_i's). Including class of diagnosis, there are thirty-five (35) parameters in total, of which eighteen (18) are categorical and seventeen (17) are continuous.

Generalized Additive Model (GAM)
The GAM [31,32,34] is an extension of the generalized linear model (GLM) [27] in which the modeling of the mean function relaxes the assumption of linearity, although additivity of the mean function in the covariates is still assumed. Whilst the mean functions of some covariates may be assumed to be linear, the non-linear mean functions are modeled using smoothing methods such as kernel smoothers, lowess, smoothing splines or regression splines. In general, the model has the following structure:

$$g(\mu) = \beta_0 + \sum_{j=1}^{p} f_j(X_j),$$

where $\mu = E(Y)$ for a response variable $Y$ with some exponential family distribution, $g$ is the link function, $\beta_0$ is an intercept, and the $f_j$ are smooth functions of the covariates $X_j$ for $j = 1, 2, \ldots, p$.
GAMs provide more flexibility than GLMs, as they relax the hypothesis of linear dependence between the covariates and the expected value of the response variable. The main difficulty with GAMs lies in the estimation of the smooth functions f_j, and there are different ways to address it. One of the most common is based on splines, which allow GAM estimation to be reduced to the GLM context [27]. Smoothing splines [37,38] use as many knots as there are unique values of the covariate X_j and control the model's smoothness by adding a penalty to the least squares fitting objective [35][36][37][38].
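The penalty mentioned above can be made concrete for a single covariate. The criterion below is the form in which the cubic smoothing spline objective is commonly written; the observations $(x_i, y_i)$, $i = 1, \ldots, n$, and the smoothing parameter $\lambda \ge 0$ are notation assumed here for illustration.

```latex
\min_{f} \; \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2
  \; + \; \lambda \int \bigl( f''(x) \bigr)^2 \, dx
```

The first term rewards closeness to the data and the second penalizes curvature. As $\lambda$ tends to 0 the fit approaches an interpolating curve, and as $\lambda$ grows the fit approaches a straight line.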
Generalized additive models can be used in virtually any setting where linear models are used. In a GAM, the linear component of the model is replaced with an additive component [30]. In other words, the purpose of generalized additive models is to maximize the quality of prediction of a dependent variable Y from various distributions by estimating unspecified (non-parametric) functions of the covariates X_j, which are connected to the dependent variable via the link function g.
A unique aspect of generalized additive models is the non-parametric functions f_j of the covariates X_j.
Specifically, instead of some kind of simple or complex parametric functions, Hastie and Tibshirani (1990) discuss various general scatterplot smoothers that can be applied to the X variable values, with the target criterion to maximize the quality of prediction of the (transformed) Y variable values. One such scatterplot smoother is the cubic smoothing splines smoother, which generally produces a smooth generalization of the relationship between the two variables in the scatterplot. Computational details regarding this smoother can be found in Hastie and Tibshirani (1990; see also Schimek, 2000).
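How scatterplot smoothers combine into an additive fit can be sketched with the backfitting idea. This is a hedged illustration, not the procedure used for the analysis reported here: it uses an identity link rather than the gamma model with log link, and it substitutes a nearest-neighbour running mean for the cubic smoothing spline to stay dependency-free. All function and parameter names are hypothetical.

```python
def running_mean_smooth(x, r, half_width=5):
    """Scatterplot smoother: average r over the nearest neighbours in x order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    smoothed = [0.0] * len(x)
    for pos, i in enumerate(order):
        lo = max(0, pos - half_width)
        hi = min(len(order), pos + half_width + 1)
        window = [r[order[j]] for j in range(lo, hi)]
        smoothed[i] = sum(window) / len(window)
    return smoothed


def backfit(y, covariates, n_iter=20, half_width=5):
    """Fit y ~ alpha + sum_j f_j(x_j) by backfitting (identity link)."""
    n = len(y)
    alpha = sum(y) / n
    f = [[0.0] * n for _ in covariates]
    for _ in range(n_iter):
        for j, xj in enumerate(covariates):
            # Partial residuals: remove the intercept and all other smooths.
            partial = [
                y[i] - alpha - sum(f[m][i] for m in range(len(f)) if m != j)
                for i in range(n)
            ]
            fj = running_mean_smooth(xj, partial, half_width)
            centre = sum(fj) / n  # centre each smooth for identifiability
            f[j] = [v - centre for v in fj]
    return alpha, f


def fitted_values(alpha, f):
    """Add the intercept and the smooth terms for every observation."""
    return [alpha + sum(fj[i] for fj in f) for i in range(len(f[0]))]
```

Each pass smooths the partial residuals of one covariate while holding the other smooths fixed, which is why non-linear shapes in each covariate can be recovered without specifying a parametric form.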
GAM regression is applied to this MPM dataset. All statistical and data-analytic work, mainly GAM regression, is performed in the R statistical software [34].

RESULTS
This section treats serum albumin (SA) as the dependent or response variable and the remaining variables as independent variables or cofactors. SA is a positive-valued, non-normally distributed, heterogeneous (non-constant variance) continuous variable. It has been modeled through a gamma-distributed generalized additive model with log link. The relationship between SA and the other cofactors is very complicated. The best GAM is identified through the GCV value (Table 2) along with the model checking criteria (Figures 1 and 2). The adjusted R-square value and the percentage of deviance explained by the model are also important in choosing the best model, but a good R-square value alone may not be adequate for determining the best model [39]. GAM estimation has two parts: parametric estimation for the cofactors that enter the model parametrically, and non-parametric estimation for the smoothed cofactors. Through the non-parametric smoothing part, the GAM tries to control the heterogeneity and the non-linearity (complexity) of the relationship between the response variable and the cofactors [30]. Table 2 shows the estimation results for the model. To find the true relationship between SA and the other cofactors, second- and third-order interaction effects had to be considered in the model. Interaction effects are widely used in regression and the design of experiments; an interaction means that cofactors have a joint effect on the response variable. They are also highly relevant in the analysis of medical data, because two or three biomedical parameters may jointly influence the response variable [28,29]. Some insignificant effects are retained in the model in order to respect the marginality rule, namely that when an interaction term is significant, all related lower-order interactions and main effects should be included in the model [40,41].
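The marginality rule stated above can be expressed as a small set operation: every interaction brings all of its lower-order terms with it. The sketch below is illustrative only; the function name is hypothetical, and terms are represented as tuples of cofactor abbreviations.

```python
from itertools import combinations


def marginal_closure(terms):
    """Return the given terms plus every lower-order term they imply."""
    closed = set()
    for term in terms:
        factors = tuple(sorted(term))
        for order in range(1, len(factors) + 1):
            closed.update(combinations(factors, order))
    # Sort by interaction order, then alphabetically, for a stable listing.
    return sorted(closed, key=lambda t: (len(t), t))
```

For example, `marginal_closure([("TP", "PG", "CRP")])` returns seven terms: the three main effects, the three second-order interactions and the third-order interaction itself.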
This article treats p-values up to approximately the 10% level as significant, and p-values above 10% and up to approximately 20% as partially significant [28-30, 40, 41].
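The labelling convention above can be written as a short function. The text describes the cut-offs as approximate, so treating 0.10 and 0.20 as hard boundaries is an assumption of this sketch, as are the function name and the label strings.

```python
def significance_label(p_value: float) -> str:
    """Apply the cut-offs stated in the text: <= 0.10 and (0.10, 0.20]."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie in [0, 1]")
    if p_value <= 0.10:
        return "significant"
    if p_value <= 0.20:
        return "partially significant"
    return "not significant"
```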
To examine the fit of the GAM (Table 2), four model checking plots are shown in Figure 1. The first plot shows the theoretical quantiles against the deviance residuals, the second shows the linear predictor against the residuals, the third is a histogram of the residuals, and the fourth shows the fitted values against the response values. All four plots suggest that the fitted model is adequate for this analysis; in particular, the histogram of residuals is almost normally distributed, which is an indicator of good fit. Figures 2 and 3 show two further diagnostic plots, namely the normal probability plot and the absolute residuals plot. Figure 2 displays the normal probability plot of the fitted GAM (Table 2), which shows no lack of fit due to outliers or variables, as there are no large gaps in the figure; only the lower and upper parts show small deviations, due to the complexity of the relationship. In Figure 3 the absolute residual values are plotted against the fitted values. The running means form an almost flat line, indicating that the variance is constant for the fitted model. The GAM also has a non-parametric part that estimates smoothing terms to improve the model fit, with a graphical display in which each smoothed variable is plotted against its estimated smooth, along with the estimated degrees of freedom. Figure 4 shows the smooth of LDH with its 95% confidence interval, which indicates that the smooth curve declines after LDH crosses a certain value. Figure 5 shows the smooth of glucose with its 95% confidence interval. This smooth curve is non-linear in the glucose value.
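The deviance residuals that appear in these diagnostic plots have a closed form for the gamma family. The sketch below uses the standard gamma unit deviance; the function names are hypothetical, and `y` and `mu` stand for an observed SA value and its fitted mean.

```python
import math


def gamma_unit_deviance(y: float, mu: float) -> float:
    """Unit deviance of the gamma family: 2 * ((y - mu) / mu - ln(y / mu))."""
    if y <= 0 or mu <= 0:
        raise ValueError("gamma responses and fitted means must be positive")
    return 2.0 * ((y - mu) / mu - math.log(y / mu))


def gamma_deviance_residual(y: float, mu: float) -> float:
    """Signed square root of the unit deviance."""
    d = max(gamma_unit_deviance(y, mu), 0.0)  # guard against rounding below 0
    return math.copysign(math.sqrt(d), y - mu)
```

A residual is zero when the observation equals its fitted mean, positive when the observation lies above it and negative when it lies below, which is what the quantile and histogram plots summarize.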

Interpretations of Serum albumin data analysis
The results and interpretation of the parametric estimation of cofactors from Table 2 are described as follows.
ii. In this GAM fitted model, the factor hemoglobin has a positive significant association with SA (p-value 0.006), which indicates that patients with hemoglobin above the normal range have higher SA values than the rest.
iii. Serum albumin (SA) has a partially significant positive association with the factor diagnosis class of MPM (p-value 0.07). Patients who have MPM have lower SA values than others. This finding suggests that patients with lower SA levels have a higher chance of MPM. Put more plainly, non-MPM patients have higher SA than MPM patients.
v. Duration of asbestos exposure has a negative significant association with SA (p-value 0.038). This indicates that SA decreases as the duration of asbestos exposure increases.
vi. SA is negatively and significantly associated with the duration of symptoms (p-value 0.05), which signifies that SA is reduced in a patient who has had MPM symptoms for a long time.
vii. Total protein (TP) has a highly significant positive association with SA (p-value <0.001). It indicates that SA increases as the value of TP in blood increases.
viii. In this GAM fitted model, pleural glucose (PG) has a highly significant positive association with SA (p-value <0.001), which indicates that SA increases as PG increases.
ix. C-reactive protein (CRP) has a highly significant positive association with SA (p-value <0.001). SA increases as the value of CRP increases.
x. The joint interaction effect (TP * CRP) of total protein (TP) with C-reactive protein (CRP) has a highly significant negative association with SA (p-value <0.001). Although TP and CRP are each positively associated with SA, their joint effect is found to be negative.
The results and interpretation of the non-parametric estimation of smoothing terms from Table 2 are described as follows.
xviii. The lower part of Table 2 shows the non-parametric estimation of the smoothing terms, namely blood lactic dehydrogenase (LDH) and glucose. Both cofactors enter the gamma-distributed GAM as smooth terms. F-test statistics are used to test the non-parametric smoothness of these cofactors. The smooth of LDH is significant (p-value 0.02) and that of glucose is highly significant (p-value <0.001).
xix. Table 2 also shows that the fitted GAM has an adjusted R-square value of approximately 0.60, with 65% of the deviance explained. The GCV (generalized cross validation) score is 0.0230, which is very low compared with the other models.
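The two summary quantities reported in the last item can be computed from a model's deviance. The sketch below uses a commonly used form of the GCV score and the usual definition of deviance explained; the exact expression applied by the fitting software may differ in detail, and the function and argument names are assumptions.

```python
def gcv_score(deviance: float, n_obs: int, edf: float) -> float:
    """GCV score n * D / (n - edf)^2 for a model with deviance D."""
    if edf >= n_obs:
        raise ValueError("effective degrees of freedom must be below n_obs")
    return n_obs * deviance / (n_obs - edf) ** 2


def deviance_explained(null_deviance: float, residual_deviance: float) -> float:
    """Proportion of the null deviance accounted for by the model."""
    return 1.0 - residual_deviance / null_deviance
```

Here `edf` is the total effective degrees of freedom of the model, counting both the parametric coefficients and the smooth terms. Among candidate models fitted to the same data, a lower GCV score is preferred.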
From Table 2, the final selected gamma-distributed GAM for serum albumin (y) is given by equation (2), in which '#' denotes the value of an estimate whose first three decimal places are zeros, 'f' denotes the smoothing function, and Z = ln(y), where 'ln' is the natural logarithm (base e) and y is the response variable serum albumin.
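For orientation, a gamma GAM with log link that contains parametric main effects, second- and third-order interactions, and smooth terms in LDH and glucose takes the general form below. The symbols $\beta_0$, $\beta_k$, $\gamma_{kl}$ and $\delta_{klm}$ are generic placeholders assumed here for illustration; they are not the estimates of Table 2, and only the interactions retained in the fitted model would appear in the sums.

```latex
\ln(\mu) = \beta_0
  + \sum_{k} \beta_k x_k
  + \sum_{k<l} \gamma_{kl}\, x_k x_l
  + \sum_{k<l<m} \delta_{klm}\, x_k x_l x_m
  + f_1(\mathrm{LDH}) + f_2(\mathrm{Glucose}),
\qquad \mu = E(y)
```

Because the link is logarithmic, a predicted SA value is recovered by exponentiating the right-hand side.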

DISCUSSION
Yao et al. (2014) [25] sought to establish that serum albumin (SA) is a significant prognostic factor for MPM patients. MPM is a highly aggressive malignancy with a very short median survival of approximately 9-12 months [2]. No universally accepted standard therapy has yet been developed, and conventional medical and surgical therapies are not yet fully effective. Identifying the risk or prognostic factors for MPM is therefore a very important clinical and medical research problem. This is the motivation of the present article, in which we try to find the relationship between SA and the other cofactors described in the MPM dataset. A probabilistic modeling approach is taken, using the generalized additive model (GAM) with gamma distribution and log link assumptions [32][33][34]. Regression analysis requires a response or dependent variable, but the present MPM dataset does not designate a continuous response variable. Here SA serves the purpose. Yao et al. (2014) showed that the pretreatment serum albumin level is an independent prognostic indicator of overall survival (OS) for MPM patients [25]. They also reported that patients with hypoalbuminaemia (albumin level ≤ 35 g/l) had significantly worse survival than those with a normal albumin level. Beyond MPM, SA has been shown to be an independent prognostic factor in several cancers [42][43][44][45].
The report also emphasized the prognostic role of SA in MPM because it is a simple, inexpensive and commonly performed laboratory test. SA is measured as part of liver function tests, which are routinely performed for patients.
For these reasons we set out to find the determinants of SA, that is, the factors responsible for decreasing as well as increasing SA. A factor with a negative effect on SA corresponds to a decrease in the SA value, whereas a factor with a positive effect corresponds to an increase. Once these risk factors are identified, it could become easier for medical and clinical researchers to develop standard therapies for MPM. The major objective of the present article is thus to develop a probabilistic model using the biomedical and demographic parameters. The present work reports important findings on the cofactors that have main and interaction effects on SA.
The present results show that exposure to asbestos during a patient's life (the factor "asbestos exposure") has a significant effect on SA. It is well known that asbestos exposure is one of the major causes of MPM, and the model reflects this. A hemoglobin level above the normal range is also identified as a significant factor in the SA model: patients with hemoglobin above the normal range have higher SA values than others. It is also found that patients who do not have MPM have higher SA values than MPM patients. These findings support much medical research and clinical opinion [25][46][47][48]. Age is not a significant factor in SA modeling, and it is also not significant for MPM. Duration of asbestos exposure, a continuous variable measured in years in this study, is the most important hallmark and the leading cause of MPM [46,47]. Here the duration of asbestos exposure is negatively and significantly associated with SA: a longer duration of exposure reduces the SA value, which indirectly indicates a higher chance of MPM. This finding is important because it statistically (or mathematically) supports the view that the duration of asbestos exposure is one of the major causes of MPM. Similarly, the duration of disease symptoms is negatively and significantly associated with SA, which means that SA decreases as the duration of symptoms increases, and Yao et al. (2014) show that a lower SA level is an important prognostic factor for MPM.
Our model therefore strongly supports the earlier findings on MPM using the GAM regression technique [30][32][33]. Total protein, also known as serum total protein, is a biochemical test that measures the total amount of protein in serum. The reference range for total protein is typically 6.0-8.0 g/dl. Concentrations below the reference range usually reflect a low albumin concentration and may indicate a liver or kidney disorder. Elevated total protein may indicate inflammation or infections, such as viral hepatitis B or C or HIV, and bone marrow disorders. There is no such evidence of a relationship between MPM and total protein, but the present model shows a strong positive association between SA and total protein. This means that an increase in total protein in serum is associated with an increase in SA.
In the present work, pleural lactic dehydrogenase (PLD), pleural glucose (PG) and pleural protein (PP) are found to be highly positively associated with SA, which indicates that SA increases as these pleural fluid measures increase.
Medical research indicates that a low level of pleural glucose can be linked to infection or malignancy [49,50].
This means that a patient with normal or slightly higher than normal PG has a smaller chance of infection or malignancy, and our study shows that an increase in PG is associated with an increase in SA. Patients with a normal SA value have a smaller chance of having MPM.
The upper limit of normal PLD is 200 IU/L. A high LD indicates that the pleural fluid is an exudate, while a low level indicates that it is a transudate. The normal pleural protein (PP) count is less than 1-2 g/dL. Pleural effusions are classified as transudates or exudates on the basis of the fluid protein level: classically, a pleural fluid protein level >30 g/l is an exudate and <30 g/l is a transudate, in the context of a normal serum protein level [51]. It is thus clinically established that these pleural fluid measures are very sensitive, and a small deviation from their normal ranges is associated with various diseases, including malignancy. The present study shows a mathematical relationship between these pleural fluid measures and SA, which can help in maintaining the normal level of each of these biomedical parameters.
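The classical protein cut-off quoted above translates into a simple rule. The sketch below is illustrative only; the function name and labels are assumptions, it presumes a normal serum protein level as the text requires, and it is not a substitute for clinical criteria.

```python
def classify_effusion(pleural_protein_g_per_l: float) -> str:
    """Apply the classical 30 g/L protein cut-off quoted in the text."""
    if pleural_protein_g_per_l < 0:
        raise ValueError("a protein concentration cannot be negative")
    if pleural_protein_g_per_l > 30.0:
        return "exudate"
    if pleural_protein_g_per_l < 30.0:
        return "transudate"
    return "indeterminate"  # the quoted rule does not cover exactly 30 g/L
```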
C-reactive protein (CRP) has a positive significant association with SA, as shown in this article. This indicates that SA increases as the CRP level increases. Earlier researchers found that CRP is an acute phase reactant that is significantly elevated in patients with metastatic disease across a variety of solid organ and hematological malignancies, including malignant pleural mesothelioma (MPM) [52].
In a retrospective study of 115 patients with a pathologically confirmed diagnosis of MPM, elevated CRP (≥1 mg/dL) was shown to be an independent indicator of poor prognosis (HR=2.07; 95% CI: 1.23-3.46; P=0.001) [53]. To our knowledge, the mathematical relationship between CRP and SA found in this article is new in the literature. An interesting result is that the interactions of CRP with total protein and with pleural glucose have highly significant negative associations with SA. That is, the joint effect of CRP and TP is negatively and significantly associated with SA: if both increase jointly, SA decreases. The same holds for CRP and PG. The important conclusion is that CRP individually has a positive effect on SA, but in joint interaction it has a negative effect on SA.
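A positive main effect combined with a negative interaction means that the slope of one cofactor depends on the level of its partner. Under a log link with terms b_crp*CRP + b_tp_crp*TP*CRP, the slope of ln(mu) in CRP is b_crp + b_tp_crp*TP. The sketch below illustrates this; the function names are hypothetical and any coefficient values passed in are placeholders, not the estimates of Table 2.

```python
def crp_slope_on_log_scale(b_crp: float, b_tp_crp: float, tp: float) -> float:
    """d ln(mu) / d CRP for ln(mu) = ... + b_crp*CRP + b_tp_crp*TP*CRP."""
    return b_crp + b_tp_crp * tp


def sign_change_point(b_main: float, b_interaction: float) -> float:
    """Value of the partner covariate at which the slope above equals zero."""
    if b_interaction == 0:
        raise ValueError("no interaction, so the slope never changes sign")
    return -b_main / b_interaction
```

With a positive `b_crp` and a negative `b_tp_crp`, the slope is positive for TP below the sign change point and negative above it. The main-effect coefficient alone therefore describes the slope only when the partner covariate is zero.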
The same happens for total protein: as a main effect it has a positive significant association with SA, but its joint interactions with PG, PP and PLD have negative significant associations with SA, which means that SA decreases if these increase jointly. A decrease in SA from its normal range may play a very important role for MPM patients [25]. The joint interaction effects of pleural glucose (PG) with pleural protein (PP) and with PLD are both highly negatively and significantly associated with SA, but the joint effect of PG and pleural albumin (PA) is positively significant, which means that SA increases if both of these factors increase. As main effects, PG is positively significant but PA is not significant. These joint interaction effects are therefore important factors in MPM treatment or prognosis, and they are new in the medical research literature from a mathematical modeling perspective.
Another important finding of this work is the third-order interaction effects, which are very difficult to interpret. Four third-order interaction effects occur in this GAM: i) TP, PG and CRP; ii) TP, PG and PP; iii) PP, PG and PA; iv) TP, PLD and PG. They point to important relationships between SA and these factors, but the relationships are too complex to interpret. The third-order interaction effects (i), (ii) and (iv) are positively and significantly associated with SA, whereas (iii) is negatively and significantly associated with SA.
Another major part is the incorporation of smoothing terms in the model, which helps the model fit well. It also gives stable estimates of the parameters (see the standard errors of the estimates in Table 2) and removes the heteroscedasticity (non-constant variance of the response). This part shows that lactate dehydrogenase (LDH) and glucose are the significant smoothing terms, which have a non-linear relationship with SA (Table 2 and Figure 3a and b). LDH is a protein that helps to produce energy in the body. An LDH test measures the amount of LDH in the blood, and the normal range is 105 to 333 IU/L. LDH is found in many body tissues such as the heart, liver, kidney, skeletal muscle, brain, blood cells and lungs. High LDH levels were found to be prognostic indicators in mesothelioma [54].
Our work shows (Figure 4) that a high level of LDH (more than 600 IU/L) is associated with a decrease in SA.
To our knowledge, these are the most fundamental findings of the present work, and they have not been reported before by any researcher. They can now be verified by medical researchers and clinical practitioners.

CONCLUSION
This article has tried to find the relationship between serum albumin (SA) and the other cofactors based on a well-known malignant pleural mesothelioma (MPM) dataset (see the Material section). Serum albumin is treated here as a response variable with an assumed gamma distribution. The reason for taking SA as the response variable is that the pretreatment serum albumin level is an independent prognostic indicator of overall survival (OS) for MPM patients [25]. We tried to model SA, which is a continuous random variable with non-constant variance and a non-normal distribution. To do so we used the generalized additive model (GAM) with a gamma distributional assumption and the logarithm as the link function. The variable descriptions and the fitted results are presented in Tables 1 and 2 respectively. The model checking plots and the other relevant plots, such as the normal probability plot, the absolute residual plot and the smoothing term plots, are presented in Figures 1, 2 and 3 respectively.
The reported results (Table 2), though not completely conclusive, are revealing, and the determinants of SA are derived satisfying the following regression analysis criteria. First, the determinants are selected based on analyses of the fitted GAM. Second, the final model is selected based on the GCV value. Third, the final model is justified by the GAM diagnostic plots [32][33][34]. Fourth, the standard errors of the estimates are very small, indicating that the estimates are stable [39,41]. Fifth, the final model of SA is selected by locating the appropriate statistical distribution, identified here as the gamma distribution. For further details, see references [28][29][30].
To the best of our knowledge, the present models (Results and Discussion sections) can be considered among the best probabilistic models under a regression framework. They may assist researchers and medical practitioners in developing standard treatment therapies and in making decisions using an individual MPM patient's risk factors. The results lead to many interesting conclusions, and these findings may help medical practitioners to provide better treatment. Asbestos exposure, hemoglobin and disease diagnosis status are the significant categorical variables for serum albumin, whereas duration of asbestos exposure, duration of symptoms, total protein, PLD, PP, PG and CRP are the significant continuous variables for SA. The non-parametric part of the model shows that LDH and glucose level are the significant smoothing terms. In addition, the parametric part shows that second- and third-order interactions of biochemical parameters are highly significant for SA. Most of these findings are partially or completely new in the MPM research literature.
Finally, taking all the relevant results of this work into consideration, it can be concluded that serum albumin may play a very significant role as a prognostic factor for MPM, not only from a clinical perspective but also on mathematical grounds. The SA value can be predicted using the fitted model presented here (equation (2)), and this probabilistic model gives MPM research a strong platform.