
Table 1 The basic characteristics of samples in the test set and training set

From: Measurement and application of patient similarity in personalized predictive modeling based on electronic medical records

| Characteristic | Test set (n = 6490) | Training set (n = 10,000) | P value# |
| --- | --- | --- | --- |
| Male gender, n (%) | 4387 (67.6%) | 6838 (68.4%) | 0.282 |
| Age (years), mean ± SD | 60.1 ± 14.7 | 60.1 ± 15.0 | 0.967 |
| Myocardial infarction, n (%) | 443 (6.8%) | 656 (6.6%) | 0.615 |
| Congestive heart failure, n (%) | 507 (7.8%) | 795 (8.0%) | 0.642 |
| Chronic obstructive pulmonary disease, n (%) | 288 (4.4%) | 467 (4.7%) | 0.368 |
| Mild liver disease, n (%) | 799 (12.3%) | 1301 (13.0%) | 0.188 |
| Hypertension, n (%) | 3501 (53.9%) | 5389 (53.9%) | 0.950 |
| Coronary heart disease, n (%) | 2206 (34.0%) | 3331 (33.3%) | 0.366 |
| Serum glucose (mmol/L), mean ± SD | 6.6 ± 2.9 | 6.7 ± 2.9 | 0.793 |
| Abnormal urine glucose, n (%) | 1222 (18.8%) | 1884 (18.8%) | 0.987 |

  1. #P values are from Pearson’s χ² test for nominal variables and the t-test for scale (continuous) variables
  2. SD: standard deviation
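
As an illustration of the Pearson’s χ² test named in the footnote, the following minimal sketch applies it to the male gender row of the table. It assumes a plain 2 × 2 test without continuity correction; the table does not say which software or settings the authors used, and the function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]].

    Assumption: this is an illustrative implementation, not the authors' code.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Male gender row: 4387 of 6490 (test set) vs 6838 of 10,000 (training set).
chi2, p = pearson_chi2_2x2(4387, 6490 - 4387, 6838, 10000 - 6838)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under these assumptions the p value lands near, though not exactly on, the reported 0.282; the small gap may come from details of the authors' procedure that the table does not state. Either way it is well above 0.05, consistent with the two sets being comparable on this characteristic.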