Table 6.

Comparison (Paired t-test) of Note Types for Perceived Workload (NASA-TLX) and Usability (SUS)

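| Measure | Note A, Mean (SD) | Note B, Mean (SD) | Note C, Mean (SD) | Note D, Mean (SD) | A Versus B, P-Value | A Versus C, P-Value | A Versus D, P-Value |
|---|---|---|---|---|---|---|---|
| TLX: Mental | 3.25 (1.69) | 3.00 (1.32) | 2.50 (1.83) | 2.56 (1.36) | .52 | .21 | .21 |
| TLX: Physical | 1.75 (1.13) | 1.56 (0.81) | 1.44 (0.73) | 1.38 (0.72) | .38 | .33 | .23 |
| TLX: Timing | 4.25 (1.57) | 4.31 (1.49) | 3.25 (1.95) | 3.50 (1.71) | .85 | .12 | .18 |
| TLX: Performance | 3.81 (1.47) | 3.63 (1.45) | 2.69 (1.70) | 3.31 (1.54) | .66 | **.048** | .43 |
| TLX: Effort | 3.94 (1.77) | 3.50 (1.32) | 2.88 (2.00) | 2.69 (1.74) | .30 | .13 | **.043** |
| TLX: Frustration | 3.31 (1.78) | 2.69 (1.54) | 2.75 (1.98) | 2.63 (1.75) | **.01** | .41 | .28 |
| TLX: Overall | 3.39 (1.21) | 3.11 (0.86) | 2.58 (1.51) | 2.68 (1.14) | .24 | .11 | .10 |
| SUS* | 58.50 (22.22) | 74.83 (15.10) | 81.83 (21.9) | 77.50 (26.86) | **.007** | **.005** | **.009** |
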
  • * One subject had missing data for note A and was excluded from the SUS-specific analysis. Bold indicates significant mean differences (P < .05).

  • Note models: A (traditional SOAP), B (two-column APSO), C (collapsible APSO), D (two-column collapsible APSO).

  • NASA-TLX contains 6 subscale items and an overall mean subscale score, each on a 7-point Likert scale in which lower TLX scores indicate less workload.

  • System Usability Scale (SUS) is reported as a raw score (scale of 0 to 100) in which higher values indicate better usability.

  • TLX, Task Load Index; APSO, assessment, plan, subjective, objective; SOAP, subjective, objective, assessment, plan.
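
The two computations behind Table 6 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code. The first part assumes that the TLX overall score is the unweighted mean of the six subscales, as the table note suggests, and recomputes it from the subscale means reported in the table; because those means are already rounded, the result can differ from the reported overall value by about 0.01. The second part computes a paired t statistic from per-subject scores. The per-subject data are not given in the source, so the scores in the demonstration and the names `overall_tlx` and `paired_t` are hypothetical.

```python
import math
import statistics

# Subscale means copied from Table 6, in the order:
# Mental, Physical, Timing, Performance, Effort, Frustration.
SUBSCALE_MEANS = {
    "A": [3.25, 1.75, 4.25, 3.81, 3.94, 3.31],
    "B": [3.00, 1.56, 4.31, 3.63, 3.50, 2.69],
    "C": [2.50, 1.44, 3.25, 2.69, 2.88, 2.75],
    "D": [2.56, 1.38, 3.50, 3.31, 2.69, 2.63],
}

# Overall TLX means as reported in Table 6.
REPORTED_OVERALL = {"A": 3.39, "B": 3.11, "C": 2.58, "D": 2.68}


def overall_tlx(subscale_means):
    """Unweighted mean of the subscale scores (assumed definition)."""
    return statistics.fmean(subscale_means)


def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for paired samples x and y.

    t = mean(d) / (sd(d) / sqrt(n)), where d holds the per-subject
    differences x_i - y_i and sd is the sample standard deviation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.fmean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    for note, means in SUBSCALE_MEANS.items():
        print(note, round(overall_tlx(means), 3), "reported:", REPORTED_OVERALL[note])

    # Hypothetical per-subject scores for two notes (NOT study data).
    note_a = [4, 3, 5, 2, 4]
    note_b = [3, 3, 4, 1, 2]
    t, df = paired_t(note_a, note_b)
    print("t =", round(t, 3), "df =", df)
```

Converting the t statistic to a two-sided P-value requires the t distribution with n - 1 degrees of freedom; a library routine such as SciPy's `scipy.stats.ttest_rel` performs both steps when that dependency is available.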