Generation of an algorithm based on minimal gene sets to clinically subtype triple negative breast cancer patients

BMC Cancer. 2016 Feb 23;16:143. doi: 10.1186/s12885-016-2198-0.

Abstract

Background: Recently, a gene expression algorithm, TNBCtype, was developed that can divide triple-negative breast cancer (TNBC) into molecularly defined subtypes. The algorithm has the potential to provide predictive value for TNBC subtype-specific response to various treatments. When TNBCtype was used in a retrospective analysis of neoadjuvant clinical trial data from TNBC patients, TNBC subtype and pathologic complete response to neoadjuvant chemotherapy were significantly associated. Herein we describe an expression algorithm reduced to 101 genes that can subtype TNBC tumors similarly to the original 2188-gene expression algorithm and predict patient outcomes.

Methods: The new classification model was built using the same expression data sets used for the original TNBCtype algorithm. Gene set enrichment followed by shrunken centroid analysis was used for feature reduction; elastic-net regularized linear modeling was then used to identify the genes for a centroid model, comprising 101 genes, that classifies all subtypes. Both this new "lean" algorithm and the original 2188-gene model were applied to an independent clinical trial cohort of 139 TNBC patients, treated initially with neoadjuvant doxorubicin/cyclophosphamide and then randomized to receive either paclitaxel or ixabepilone, to determine the association of pathologic complete response with each subtype.
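To make the idea of a centroid model concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a sample is assigned to the subtype whose centroid correlates best with its expression profile, and that a case is flagged as ambiguous when a second subtype also correlates above a cutoff. The centroid values, the four-gene profile, the "other" placeholder subtype, and the `min_corr` cutoff are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def classify(sample, centroids, min_corr=0.5):
    """Assign the subtype with the highest centroid correlation.

    Returns (best_subtype, ambiguous, scores). `ambiguous` is True when
    the runner-up subtype also reaches `min_corr` (a hypothetical cutoff).
    """
    scores = {name: pearson(sample, c) for name, c in centroids.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    ambiguous = len(ranked) > 1 and scores[ranked[1]] >= min_corr
    return best, ambiguous, scores


# Toy centroids over four hypothetical genes (invented values, not study data).
centroids = {
    "BL1": [2.0, 1.0, -1.0, -2.0],
    "BL2": [-1.0, 2.0, 1.0, -2.0],
    "other": [-2.0, -1.0, 1.0, 2.0],  # placeholder for any further subtype
}
sample = [1.8, 0.9, -1.2, -1.5]

subtype, ambiguous, scores = classify(sample, centroids)
print(subtype, ambiguous)
```

In a real setting the profile would span all genes in the model, and both the correlation measure and the ambiguity rule would follow the published method rather than the assumptions made here.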

Results: The new 101-gene expression model reproduced the classification provided by the 2188-gene algorithm. When cases with significant correlations to multiple subtypes were excluded, it was highly concordant both in the same set of seven TNBC cohorts used to generate the TNBCtype algorithm (87%) and in the independent clinical trial cohort (88%). Analysis of clinical responses across both neoadjuvant treatment arms found BL2 to be significantly associated with poor response (odds ratio (OR) = 0.12, p = 0.03 for the 2188-gene model; OR = 0.23, p < 0.03 for the 101-gene model). Additionally, while the BL1 subtype trended towards significance in the 2188-gene model (OR = 1.91, p = 0.14), the 101-gene model demonstrated a significant association with improved response in patients with the BL1 subtype (OR = 3.59, p = 0.02).
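The two kinds of figures reported above, concordance between models and odds ratios for response, can be illustrated with a short sketch. It assumes a simple 2x2 odds ratio with a Wald 95% confidence interval; the abstract does not state how the published odds ratios and p-values were estimated, so this is not necessarily the authors' method. All labels and counts below are invented for illustration and are not data from the study.

```python
from math import exp, log, sqrt


def concordance(labels_a, labels_b, ambiguous):
    """Fraction of non-ambiguous cases on which two models agree."""
    kept = [(a, b) for a, b, amb in zip(labels_a, labels_b, ambiguous) if not amb]
    return sum(a == b for a, b in kept) / len(kept)


def odds_ratio(a, b, c, d):
    """Odds ratio and Wald 95% CI from a 2x2 table.

    a: responders in the subtype      b: non-responders in the subtype
    c: responders in other subtypes   d: non-responders in other subtypes
    """
    estimate = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(estimate) - 1.96 * se)
    upper = exp(log(estimate) + 1.96 * se)
    return estimate, lower, upper


# Hypothetical subtype calls from two models for five samples.
full_model = ["BL1", "BL2", "BL1", "BL2", "BL1"]
lean_model = ["BL1", "BL2", "BL2", "BL2", "BL2"]
ambiguous = [False, False, True, False, False]  # third case is excluded
agreement = concordance(full_model, lean_model, ambiguous)

# Hypothetical counts: 2 of 12 in-subtype patients respond vs 20 of 40 others.
estimate, lower, upper = odds_ratio(2, 10, 20, 20)
print(agreement, round(estimate, 2))
```

An odds ratio below 1, as in this toy table, corresponds to lower odds of response in the subtype than in the remaining patients, which is the direction reported for BL2.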

Conclusions: These results demonstrate that a model using small gene sets can recapitulate the TNBC subtypes identified by the original 2188-gene model and, in the case of standard chemotherapy, retains the ability to predict therapeutic response.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms
  • Female
  • Gene Expression*
  • Humans
  • Models, Genetic
  • Neoadjuvant Therapy
  • Predictive Value of Tests
  • Prognosis
  • Retrospective Studies
  • Treatment Outcome
  • Triple Negative Breast Neoplasms / drug therapy
  • Triple Negative Breast Neoplasms / genetics*
  • Triple Negative Breast Neoplasms / pathology*