Fig. 3
From: Measurement and application of patient similarity in personalized predictive modeling based on electronic medical records

Predictive performance of random forest (RF), logistic regression (LR), and k-nearest neighbor (kNN) models. For simplicity, the figure shows only the performance of models built on 2% (200 samples) to 30% (3000 samples) of the 10,000 training sample candidates. Blue, cyan, and dark red lines represent RF, kNN, and LR models, respectively. Lines with dot, triangle, and cross markers represent models built on randomly selected samples, and on the most similar samples when the similarity of the disease diagnosis feature was calculated using ICD-10 codes and CCS codes, respectively.
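
The two sampling strategies compared in the caption (random selection versus selection of the most similar patients) can be illustrated with a minimal sketch. This is not the article's method: cosine similarity over numeric feature vectors is an assumed stand-in for the patient similarity measure, and the names `cosine_similarity`, `select_most_similar`, and `select_random` are hypothetical. The toy data are invented solely to show the mechanics.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors.

    Assumption: the article's patient similarity measure is not given
    in this caption, so cosine similarity is used as a placeholder.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_most_similar(candidates, index_patient, n_samples):
    """Indices of the n_samples candidates most similar to index_patient."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine_similarity(candidates[i], index_patient),
        reverse=True,
    )
    return ranked[:n_samples]


def select_random(candidates, n_samples, seed=0):
    """Indices of n_samples candidates drawn uniformly without replacement."""
    rng = random.Random(seed)
    return rng.sample(range(len(candidates)), n_samples)


# Toy example (hypothetical feature vectors, not data from the article).
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
index_patient = [1.0, 0.0]

similar_idx = select_most_similar(candidates, index_patient, 2)
random_idx = select_random(candidates, 2)
```

In a setup like the one the caption describes, the selected indices would define the training subset for each model (RF, LR, or kNN), with `n_samples` varied from 200 to 3000 out of the 10,000 candidates.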