ISSN (on-line): 1806-3756 | ISSN (printed): 1806-3713

Publication continuous and bimonthly

ORIGINAL ARTICLE

Analysis and validation of probabilistic models for predicting malignancy in solitary pulmonary nodules in a population in Brazil

Cromwell Barbosa de Carvalho Melo; João Aléssio Juliano Perfeito; Danilo Félix Daud; Altair da Silva Costa Júnior; Ilka Lopes Santoro; Luiz Eduardo Villaça Leão

**Introduction**

Pulmonary nodules have always represented a major diagnostic challenge, which is cause for justified concern given the incidence of malignant (metastatic or primary) lung tumors. In recent decades, there has been an increase in the incidence of, and consequently, in the mortality from, primary lung cancer, concomitantly with advances in imaging techniques, which have resulted in increased detection of pulmonary nodules. In this context, the finding of a solitary pulmonary nodule (SPN) has become crucial for the early detection of primary lung cancer, which, according to data from the Brazilian National Ministry of Health Mortality Database, is the leading cause of cancer death, surpassing the number of deaths from prostate and breast cancer when gender is not taken into account.

An SPN is defined as a more or less spherical lung opacity that is less than 3 cm in diameter. It usually has well-defined margins, is completely surrounded by lung parenchyma, and is without other radiological abnormalities, such as atelectasis and mediastinal lymph node enlargement.(1,2)

Several ways to estimate the malignant potential of SPNs have been devised. Among the most widespread are two mathematical models based on multivariate analysis of the clinical characteristics of patients with SPNs and the radiological characteristics of the nodules: one published by Swensen et al.(3) in 1997 and the other published by Gould et al.(4) in 2007. In those two studies, the authors developed mathematical formulas to calculate the probability of SPN malignancy in order to provide guidance for attending physicians. Both probabilistic models have been extensively tested and validated, especially in populations in the USA and Europe.(3-5) In a study conducted in the Philippines, however, the models could not be validated, a finding attributed to the high prevalence of tuberculosis in that population.(6)

Since, to date, there have been no studies in Brazil aimed at evaluating these models in a population in the country, the objective of the present study was to analyze clinical and radiological variables that influence the pathological diagnosis of SPN and to compare and validate the two aforementioned mathematical models(3,4) for calculating the probability of SPN malignancy in patients with SPN in Brazil.

**Methods**

This was a retrospective study involving all patients who underwent resection of SPN at the Hospital São Paulo, located in the city of São Paulo, Brazil, between 2000 and 2009. The study was based on data from medical charts. We studied the following variables: gender; age; presence of systemic comorbidities; history of malignancy prior to the diagnosis of SPN; histopathological diagnosis of SPN (malignant disease vs. benign disease); smoking status (current smokers and former smokers); smoking history (in pack-years); and number of years since smoking cessation. In addition, we studied CT features of SPNs, including presence of spiculated margins, maximum transverse diameter (in mm), and anatomical location, as described in CT reports.

After data collection, we used the following inclusion criteria: having a confirmed diagnosis of SPN; having undergone surgical resection of SPN; having a medical chart containing the data needed for the analysis; and having a pathological diagnosis of SPN.

Of the 127 patients who were initially screened for inclusion in the study, 110 met the aforementioned criteria. The main reason for exclusion was having an incomplete medical chart, followed by having been diagnosed with multiple pulmonary nodules.

We determined the probability of SPN malignancy by using the mathematical models developed in the aforementioned studies and applying the equations defined by the authors.

Equations developed by Swensen et al.(3):

$$\text{Probability of malignancy} = \frac{e^{x}}{1 + e^{x}} \tag{1}$$

$$x = -6.8272 + (0.0391 \times \text{age}) + (0.7917 \times \text{smoke}) + (1.3388 \times \text{cancer}) + (0.1274 \times \text{diameter}) + (1.0407 \times \text{spiculation}) + (0.7838 \times \text{location}) \tag{2}$$

where age is the age of the patient in years; smoke = 1 if the patient is a current or former smoker (otherwise, smoke = 0); cancer = 1 if the patient has a history of an extrathoracic cancer that was diagnosed more than five years ago (otherwise, cancer = 0); diameter is the diameter of the nodule in mm; spiculation = 1 if the edge of the nodule has spicules (otherwise, spiculation = 0); and location = 1 if the nodule is located in an upper lobe (otherwise, location = 0).
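As a minimal sketch, equations (1) and (2) above can be evaluated in a few lines of Python. The function name `swensen_probability`, its argument names, and the example patient are illustrative assumptions; only the coefficients come from the equations as printed in this article.

```python
import math


def swensen_probability(age, smoker, prior_cancer, diameter_mm, spiculated, upper_lobe):
    """Probability of malignancy from equations (1) and (2) of Swensen et al.

    age is in years and diameter_mm in millimetres; the remaining arguments
    are booleans (True = 1, False = 0), following the definitions in the text.
    """
    x = (
        -6.8272
        + 0.0391 * age
        + 0.7917 * int(smoker)
        + 1.3388 * int(prior_cancer)
        + 0.1274 * diameter_mm
        + 1.0407 * int(spiculated)
        + 0.7838 * int(upper_lobe)
    )
    # e^x / (1 + e^x), written in the equivalent form 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical patient (not taken from the study sample): 65-year-old smoker,
# no prior cancer, 20 mm spiculated nodule located in an upper lobe.
example_swensen = swensen_probability(65, True, False, 20, True, True)
print(f"{example_swensen:.3f}")
```

For this hypothetical patient, x is approximately 0.88, which corresponds to a probability of roughly 0.71.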

Equations developed by Gould et al.(4):

$$\text{Probability of malignant SPN} = \frac{e^{x}}{1 + e^{x}} \tag{1}$$

$$x = -8.404 + (0.779 \times \text{age}/10) + (2.061 \times \text{smoke}) + (0.112 \times \text{diameter}) - (0.567 \times Y) \tag{2}$$

where age is age in years; smoke is 1 if a current or former smoker (otherwise 0); diameter is the largest diameter of the nodule in mm; and Y is the number of years since quitting smoking divided by 10.
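The model of Gould et al. can be sketched in the same way. This sketch assumes that the age coefficient (0.779) applies to age in decades, i.e., to age in years divided by 10, which is how the model was originally published and which parallels the definition of Y. The function and argument names and the example patient are hypothetical.

```python
import math


def gould_probability(age, smoker, diameter_mm, years_since_quit=0.0):
    """Probability of malignant SPN from the model of Gould et al.

    age and years_since_quit are in years, diameter_mm in millimetres.
    Assumption: the age term is scaled per decade (age / 10), like the
    years-since-quitting term Y.
    """
    x = (
        -8.404
        + 0.779 * (age / 10.0)
        + 2.061 * int(smoker)
        + 0.112 * diameter_mm
        - 0.567 * (years_since_quit / 10.0)
    )
    # e^x / (1 + e^x), written in the equivalent form 1 / (1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical patient (not taken from the study sample): 65-year-old
# current smoker with a 20 mm nodule.
example_gould = gould_probability(65, True, 20)
print(f"{example_gould:.3f}")
```

Under these assumptions, x is approximately 0.96 for the hypothetical patient, a probability of roughly 0.72; a longer time since quitting lowers the estimate.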

For the statistical analysis, we used the Statistical Package for the Social Sciences, version 13.0 for Windows (SPSS Inc., Chicago, IL, USA), and the Statistical Package for the Social Sciences, version 20.0 for Mac. We also used BioEstat, version 5.0 for Windows, for complementary analyses and for constructing the ROC curves.

In order to determine possible differences among the groups studied, we used the Student's t-test for parametric variables, Pearson's chi-square test for nonparametric variables, and Fisher's exact test for dichotomous variables.(7) The level of significance was set at 5% for all statistical tests.

Diagnostic performance and the best cut-off point for both mathematical models were determined by analysis of the ROC and two-graph ROC curves.(8) For the cut-off points, we calculated the sensitivity, specificity, accuracy, negative predictive value, and positive predictive value of the models. We also calculated the area under the curve and compared the models.
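The performance measures listed above all follow from the 2 × 2 table obtained at a given cut-off point. The sketch below is illustrative only: the function name, the rule that a nodule is classified as positive when its predicted probability is at or above the cut-off, and the toy data are assumptions, not values from the study.

```python
def diagnostic_metrics(probabilities, is_malignant, cutoff):
    """Sensitivity, specificity, accuracy, PPV, and NPV at one cut-off point.

    A nodule is classified as positive when its predicted probability
    is greater than or equal to the cut-off.
    """
    tp = fp = tn = fn = 0
    for p, malignant in zip(probabilities, is_malignant):
        positive = p >= cutoff
        if positive and malignant:
            tp += 1
        elif positive and not malignant:
            fp += 1
        elif not positive and malignant:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy data (hypothetical): predicted probabilities and pathological diagnosis.
toy_probabilities = [0.05, 0.20, 0.35, 0.50, 0.70, 0.90]
toy_malignant = [False, False, True, False, True, True]
toy_metrics = diagnostic_metrics(toy_probabilities, toy_malignant, cutoff=0.50)
print(toy_metrics)
```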

**Results**

We evaluated 110 patients. Of those, 59 were male and 51 were female. We found no significant association between gender and the diagnosis of SPN malignancy. The same was true for presence of comorbidities, history of malignancy prior to the diagnosis of SPN, and smoking status. Neither the number of years since smoking cessation nor smoking history in pack-years had any influence on the pathological diagnosis of SPN. The only clinical characteristic that was significantly associated with SPN malignancy was age (p = 0.006), when it was stratified into groups, with increasing ORs, culminating in an OR of 5.70 for the > 70 year age group (Table 1).

Among the radiological characteristics, presence of spiculated margins (p = 0.001) and lesion diameter (p = 0.001) were significantly associated with SPN malignancy. Stratification of this analysis by lesion diameter revealed increasing ORs, with SPNs of 20.1-30 mm in diameter reaching an OR of 2.64 (Table 1).
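For readers who wish to reproduce this kind of stratified analysis, the sketch below computes an OR and an approximate 95% CI from a 2 × 2 table. The log-based (Woolf) interval is an assumption, since the article does not state how the intervals were obtained, and the counts are hypothetical rather than taken from Table 1.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate CI from a 2 x 2 table.

    a: exposed and malignant      b: exposed and benign
    c: unexposed and malignant    d: unexposed and benign
    All four counts must be greater than zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation (assumed method).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
toy_or, toy_or_lower, toy_or_upper = odds_ratio_with_ci(20, 10, 10, 20)
print(f"OR = {toy_or:.2f} (95% CI: {toy_or_lower:.2f}-{toy_or_upper:.2f})")
```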

After calculating the probability of SPN malignancy with the mathematical model of Swensen et al.,(3) we constructed a ROC curve, the area under the ROC curve (AUC) being 0.79 ± 0.44 (95% CI: 0.70-0.88; Figure 1). The construction of a two-graph ROC curve allowed us to determine an optimal cut-off point in relation to the various cut-off points along the ROC curve (Figure 2), with higher sensitivity and specificity being obtained below 15% or above 66.5% (a yield higher than 95%; Table 2).
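A two-graph ROC analysis tracks sensitivity and specificity separately as the cut-off point varies. One way to sketch the search for the two high-yield cut-off points is shown below. The function names, the 95% target, the positive-if-at-or-above-cut-off rule, and the toy data are assumptions for illustration; they do not reproduce the BioEstat procedure or the study data.

```python
def sensitivity_specificity(probabilities, is_malignant, cutoff):
    """Sensitivity and specificity when 'positive' means probability >= cutoff.

    Both malignant and benign cases must be present in the data.
    """
    tp = fn = tn = fp = 0
    for p, malignant in zip(probabilities, is_malignant):
        if malignant:
            if p >= cutoff:
                tp += 1
            else:
                fn += 1
        else:
            if p >= cutoff:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def high_yield_cutoffs(probabilities, is_malignant, target=0.95):
    """Return (lower, upper) cut-off points among the observed probabilities.

    lower: largest cut-off at which sensitivity is still >= target.
    upper: smallest cut-off at which specificity is already >= target.
    Either value is None when no observed probability meets the target.
    """
    candidates = sorted(set(probabilities))
    rule_out = [c for c in candidates
                if sensitivity_specificity(probabilities, is_malignant, c)[0] >= target]
    rule_in = [c for c in candidates
               if sensitivity_specificity(probabilities, is_malignant, c)[1] >= target]
    lower = max(rule_out) if rule_out else None
    upper = min(rule_in) if rule_in else None
    return lower, upper


# Toy data (hypothetical): ten nodules, five malignant and five benign.
toy_p = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
toy_m = [False, False, True, False, False, True, False, True, True, True]
toy_cutoffs = high_yield_cutoffs(toy_p, toy_m)
print(toy_cutoffs)
```

Probabilities between the two returned cut-off points correspond to the intermediate zone, in which neither sensitivity nor specificity reaches the target.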

For the model of Gould et al.,(4) we obtained an AUC of 0.69 ± 0.50 (95% CI: 0.59-0.79; Figure 3). By analyzing the two-graph ROC curve, we observed the behavior of the various cut-off points in relation to sensitivity and specificity; for a maximum yield (greater than 95%), the calculated cut-off points were below 8.5% and above 82.3% (Table 2).
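The AUC can be read as the probability that a randomly chosen malignant nodule receives a higher predicted probability than a randomly chosen benign one. The sketch below computes the AUC in that way and estimates its standard error with the formula of Hanley and McNeil.(8) Whether the statistical software used in this study applied exactly this estimator is an assumption, and the scores are toy values, not study data.

```python
import math


def auc_mann_whitney(malignant_scores, benign_scores):
    """AUC as the fraction of malignant/benign pairs ranked correctly.

    Ties between a malignant and a benign score count as half a win.
    """
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


def hanley_mcneil_se(auc, n_malignant, n_benign):
    """Standard error of the AUC according to Hanley and McNeil (1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_malignant - 1) * (q1 - auc ** 2)
        + (n_benign - 1) * (q2 - auc ** 2)
    ) / (n_malignant * n_benign)
    return math.sqrt(variance)


# Toy scores (hypothetical), five malignant and five benign nodules.
toy_malignant_scores = [0.20, 0.60, 0.80, 0.90, 0.95]
toy_benign_scores = [0.05, 0.10, 0.30, 0.40, 0.70]
toy_auc = auc_mann_whitney(toy_malignant_scores, toy_benign_scores)
toy_se = hanley_mcneil_se(toy_auc, len(toy_malignant_scores), len(toy_benign_scores))
# Approximate 95% CI under a normal approximation: AUC +/- 1.96 * SE
toy_ci = (toy_auc - 1.96 * toy_se, toy_auc + 1.96 * toy_se)
print(f"AUC = {toy_auc:.2f}, SE = {toy_se:.3f}")
```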

**Discussion**

The diagnosis of SPN remains a major challenge in medical practice. In the present study, we evaluated a sample of patients who had undergone surgical resection of SPN in order to establish a definitive pathological diagnosis and identified three independent risk factors for SPN malignancy: advanced age; presence of spiculated margins; and SPN diameter. The other clinical and radiological characteristics of the patients with SPN showed no significant associations with SPN malignancy in our sample.

In several recent studies, age has been reported to be one of the major risk factors for SPN malignancy.(3,4,9,10) Stratification by age revealed a statistically significant association between age and malignancy, as well as increasing ORs. This finding corroborates current findings demonstrating that older individuals, especially those over 50 years of age, are at a higher risk for malignant SPN.

Spiculated (corona radiata) margins are predictive of SPN malignancy, the positive predictive value being as high as 94%, whereas lobulated margins have a positive predictive value for malignancy of up to 80%.(11-13) This was also true in the present study, in which we found that two thirds of the malignant lesions had irregular, spiculated edges or irregular, lobulated edges, a finding that was statistically significant (p = 0.001).

The mean lesion diameter is also an important risk factor for malignancy, especially when it increases and approaches 30 mm. Numerous studies have confirmed this finding, always associating lesion growth with its malignant potential. Nodules of more than 20 mm in diameter have a greater than 50% chance of being diagnosed as malignant.(14,15) This is consistent with the findings of the present study, in which we found a significant association between lesion diameter and malignancy when we compared the mean lesion diameters among the stratified groups, stratification having revealed increasing ORs.

We also evaluated two mathematical models for predicting the likelihood of SPN malignancy. Although both models are widely disseminated, we found no studies investigating either model in a population in Brazil. One group of authors recently tested the model of Swensen et al.(3) in a population in the Philippines and found that the model was not valid as a predictor of malignancy, a finding that was associated with the high rate of tuberculosis in the study population.(6) In our study population, both models proved effective in predicting the malignant potential of SPNs, the model of Swensen et al.(3) being more accurate than that of Gould et al.(4)

The results obtained with the equations developed by Swensen et al.(3) showed that the AUC found in the present study (0.79 ± 0.44; 95% CI: 0.70-0.88) was nearly identical to the values reported by other groups of researchers, who validated both probabilistic models in similar studies.(16,17) This AUC indicates that the model showed good accuracy (AUC > 0.70), which supports its use as a diagnostic test, as has been proposed.(18) By analyzing the ROC and two-graph ROC curves, we observed the behavior of the various cut-off points. The values at the ends of the curves, i.e., the cut-off points below 15.0% and above 66.5%, had the highest yield: a sensitivity of 95.9% and a specificity of 35.3% for the cut-off point of 15.0%, and a sensitivity of 36.7% and a specificity of 94.1% for the cut-off point of 66.5%. In brief, for patients in whom the probability of malignancy was below 15.0%, sensitivity was so high (i.e., so few malignant nodules fell below that cut-off point) that, in theory, treatment could have been withheld in our sample; below 11.0%, sensitivity was 100%, i.e., there were no false negatives. At the other end of the curve, for patients in whom the probability of malignancy was above 66.5%, sensitivity was 36.7% and specificity was 94.1%, values that allow referral for surgical resection of SPN; specificity increases beyond that point, reaching 100% above the cut-off point of 80.5% (i.e., no false positives).

For patients with an intermediate probability of malignancy (i.e., between 15.0% and 66.5%), the model was ineffective in predicting SPN malignancy and was therefore an unreliable diagnostic test. For such patients, further tests, including positron emission tomography and biopsy (transbronchial or transthoracic), are necessary.
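The three zones described above can be summarized as a simple rule. The sketch below uses the cut-off points found in this sample for the model of Swensen et al. (15.0% and 66.5%) as default values; the function name and zone labels are hypothetical, and the rule is meant only to illustrate the reasoning, not to serve as clinical guidance.

```python
def probability_zone(probability, low_cutoff=0.15, high_cutoff=0.665):
    """Classify a predicted probability of malignancy into one of three zones.

    Defaults are the cut-off points reported in this sample for the model of
    Swensen et al.; other samples or models would need their own values.
    """
    if probability < low_cutoff:
        return "low"  # high sensitivity: few malignant nodules fall here
    if probability > high_cutoff:
        return "high"  # high specificity: few benign nodules fall here
    return "indeterminate"  # further tests (e.g., PET, biopsy) are needed


for value in (0.08, 0.40, 0.75):
    print(value, probability_zone(value))
```

Passing `low_cutoff=0.0836` and `high_cutoff=0.823` would apply the same rule with the cut-off points reported for the model of Gould et al.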

For the model of Gould et al.,(4) our analysis of the ROC curve revealed an AUC of 0.69 ± 0.50 (95% CI: 0.59-0.79) and an accuracy of 66.40%. Reliable cut-off points were obtained only with values below 8.36% (sensitivity of 94.4% and specificity of 21.4%) and values above 82.3% (sensitivity of 13.0% and specificity of 94.6%). Therefore, for cases within the range between these cut-off points, it is impossible to draw reliable conclusions based on the model, and further investigation is necessary.

For our sample, we found that the mathematical model of Swensen et al.(3) was superior to that of Gould et al.(4): through analysis of superimposed ROC curves, we observed a greater AUC for the former, as well as a narrower range between the reliable cut-off points. This behavior demonstrated that, in our sample, the mathematical model proposed by Swensen et al.(3) had a higher diagnostic accuracy. Recently, in a study conducted in the USA, the two models were compared and were found to have very similar behaviors,(16) a finding that is in disagreement with ours.

In conclusion, of the clinical and radiological characteristics related to SPNs, three showed a statistically significant association with SPN malignancy: advanced age; presence of spiculated margins on chest CT; and greater maximum SPN diameter.

By comparing the model of Swensen et al.(3) and that of Gould et al.,(4) we found that the former had a higher yield, with higher sensitivity, specificity, and accuracy.

**References**

1. Varoli F, Vergani C, Caminiti R, Francese M, Gerosa C, Bongini M, et al. Management of solitary pulmonary nodule. Eur J Cardiothorac Surg. 2008;33(3):461-5. PMid:18203611. http://dx.doi.org/10.1016/j.ejcts.2007.12.004

2. Ost D, Fein AM, Feinsilver SH. Clinical practice. The solitary pulmonary nodule. N Engl J Med. 2003;348(25):2535-42. PMid:12815140. http://dx.doi.org/10.1056/NEJMcp012290

3. Swensen SJ, Silverstein MD, Ilstrup DM, Schleck CD, Edell ES. The probability of malignancy in solitary pulmonary nodules. Application to small radiologically indeterminate nodules. Arch Intern Med. 1997;157(8):849-55. PMid:9129544. http://dx.doi.org/10.1001/archinte.1997.00440290031002

4. Gould MK, Ananth L, Barnett PG; Veterans Affairs SNAP Cooperative Study Group. A clinical model to estimate the pretest probability of lung cancer in patients with solitary pulmonary nodules. Chest. 2007;131(2):383-8. PMid:17296637 PMCid:3008547. http://dx.doi.org/10.1378/chest.06-1261

5. Schultz EM, Sanders GD, Trotter PR, Patz EF Jr, Silvestri GA, Owens DK, et al. Validation of two models to estimate the probability of malignancy in patients with solitary pulmonary nodules. Thorax. 2008;63(4):335-41. PMid:17965070 PMCid:2882437. http://dx.doi.org/10.1136/thx.2007.084731

6. Rafanan AL, Ceniza SV, Canete MT. Two commonly used prediction models (Mayo and VA) to estimate the probability of malignancy in patients with solitary pulmonary nodules are not applicable in a country with a high prevalence of tuberculosis. Chest. 2010;138(4_MeetingAbstracts):250A-250A. doi:10.1378/chest.10657

7. Ebraim GJ, Sullivan KR. Mother and Child Health Research Methods. London: Book-Aid; 1995.

8. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29-36. PMid:7063747.

9. Gould MK, Fletcher J, Iannettoni MD, Lynch WR, Midthun DE, Naidich DP, et al. Evaluation of patients with pulmonary nodules: when is it lung cancer?: ACCP evidence-based clinical practice guidelines (2nd edition). Chest. 2007;132(3 Suppl):108S-130S.

10. Clements WM, DeRosimo JF, Reed CE. Solitary pulmonary nodule. In: Shields TW, LoCicero J, Reed CE, Feins RH, editors. General Thoracic Surgery. Philadelphia: Lippincott Williams & Wilkins; 2009. p. 1205-11.

11. Soubani AO. The evaluation and management of the solitary pulmonary nodule. Postgrad Med J. 2008;84(995):459-66. PMid:18940947. http://dx.doi.org/10.1136/pgmj.2007.063545

12. Stark P. Computed tomographic and positron emission tomographic scanning of pulmonary nodules. In: UpToDate, Basow DS, editor, UpToDate: Waltham, MA; 2012.

13. Seemann MD, Seemann O, Luboldt W, Bonél H, Sittek H, Dienemann H, et al. Differentiation of malignant from benign solitary pulmonary lesions using chest radiography, spiral CT and HRCT. Lung Cancer. 2000;29(2):105-24. http://dx.doi.org/10.1016/S0169-5002(00)00104-5

14. Henschke CI, Yankelevitz DF, Naidich DP, McCauley DI, McGuinness G, Libby DM, et al. CT screening for lung cancer: suspiciousness of nodules according to size on baseline scans. Radiology. 2004;231(1):164-8. PMid:14990809. http://dx.doi.org/10.1148/radiol.2311030634

15. MacMahon H, Austin JH, Gamsu G, Herold CJ, Jett JR, Naidich DP, et al. Guidelines for management of small pulmonary nodules detected on CT scans: a statement from the Fleischner Society. Radiology. 2005;237(2):395-400. PMid:16244247. http://dx.doi.org/10.1148/radiol.2372041887

16. Schultz EM, Sanders GD, Trotter PR, Patz EF Jr, Silvestri GA, Owens DK, et al. Validation of two models to estimate the probability of malignancy in patients with solitary pulmonary nodules. Thorax. 2008;63(4):335-41. PMid:17965070 PMCid:2882437. http://dx.doi.org/10.1136/thx.2007.084731

17. Herder GJ, van Tinteren H, Golding RP, Kostense PJ, Comans EF, Smit EF, et al. Clinical prediction model to characterize pulmonary nodules: validation and added value of 18F-fluorodeoxyglucose positron emission tomography. Chest. 2005;128(4):2490-6. PMid:16236914. http://dx.doi.org/10.1378/chest.128.4.2490

18. Martinez EZ, Louzada-Neto F, Pereira BB. A curva ROC para testes diagnósticos. Cad Saude Coletiva (Rio J.). 2003;11(1):7-31.

Study carried out at the Universidade Federal de São Paulo/Escola Paulista de Medicina - UNIFESP/EPM, Federal University of São Paulo/Paulista School of Medicine - São Paulo, Brazil.

Correspondence to: Cromwell Barbosa de Carvalho Melo. Rua Napoleão de Barros, 715, 4º andar, Disciplina de Cirurgia Torácica. Vila Clementino, CEP 04023-002, São Paulo, SP, Brasil.

Tel. 55 11 5576-4295. E-mail: cromwellmelo@hotmail.com

Financial support: None.

Submitted: 15 May 2012. Accepted, after review: 14 August 2012.
