
Table 4 Balanced error rate for classification using random forest model and stratified by follow-up time from sample collection to colorectal cancer diagnosis of cases

From: Untargeted plasma metabolomics and risk of colorectal cancer—an analysis nested within a large-scale prospective cohort

 

| Outcome | < 5 years | 5–9 years | 10–15 years | > 15 years | All samples |
| --- | --- | --- | --- | --- | --- |
| Three-level outcomes | | | | | |
| Location (proximal, distal, rectal)^a | 0.72 | 0.63 | 0.66 | 0.59 | 0.62 |
| KRAS/BRAF (KRAS, BRAF, both wt)^a | 0.68 | 0.67 | 0.70 | 0.64 | 0.67 |
| Two-level outcomes | | | | | |
| Stage (stages I–II and stages III–IV)^b | 0.50 | 0.41 | 0.44 | 0.46 | 0.49 |
| KRAS (mutation, wild type)^b | 0.50 | 0.47 | 0.42 | 0.44 | 0.50 |
| BRAF (mutation, wild type)^b | 0.50 | 0.50 | 0.52 | 0.51 | 0.50 |
| MSI (MSI, MSS)^b | 0.50 | 0.51 | 0.50 | 0.50 | 0.50 |

wt, wild type; MSI, microsatellite instability; MSS, microsatellite stable. None of the potential confounders (body mass index, smoking status, education level, diabetes, alcohol intake, and recreational physical activity) was selected in the variable selection step of the random forest models, with the exception of body mass index in the tumor location analysis restricted to samples taken < 5 years prior to diagnosis.

^a Balanced error rate (BER) for a three-class problem; the expected BER by chance is 0.67.

^b Balanced error rate (BER) for a two-class problem; the expected BER by chance is 0.50.
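
The chance-level values quoted in the footnote follow from how a balanced error rate is usually defined. The sketch below assumes the common definition (the mean of the per-class error rates, with every class weighted equally); the table does not spell out the exact formula used in the paper, and the function names and toy labels are hypothetical illustrations, not study data. Under this definition, a classifier that carries no information about a K-class outcome has an expected BER of 1 - 1/K.

```python
from collections import Counter


def balanced_error_rate(y_true, y_pred):
    """Mean of the per-class error rates, each class weighted equally.

    Assumes the usual BER definition; not taken from the paper itself.
    """
    totals = Counter(y_true)
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return sum(errors[c] / totals[c] for c in totals) / len(totals)


def chance_ber(n_classes):
    """Expected BER of an uninformative classifier on n_classes classes."""
    return 1 - 1 / n_classes


# Toy, made-up labels: an imbalanced two-class outcome where the model
# always predicts the majority class.
y_true = ["wt"] * 90 + ["mutation"] * 10
y_pred = ["wt"] * 100

ber_majority = balanced_error_rate(y_true, y_pred)  # (0/90 + 10/10) / 2
two_class_chance = chance_ber(2)
three_class_chance = round(chance_ber(3), 2)
```

In this toy case the plain error rate would be only 0.10, while the BER equals the two-class chance level, which is why values near 0.50 (two-level outcomes) or 0.67 (three-level outcomes) in the table indicate no predictive signal.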