Unpaired t-test

Alpha: 0.05
Hypothesized Mean Difference: 0

| Statistic | Baseline Models | General-purpose agent approach |
| --- | --- | --- |
| Mean | 57.58 | 68.39 |
| Variance | 73.72605042 | 29.6 |
| Observations | 36 | 36 |

| Statistic | Value |
| --- | --- |
| Observed Mean Difference | −10.81 |
| Standard Error of Difference | 1.677 |
| df | 70 |
| t Stat | −6.507 |
| P (T ≤ t) one-tail | 0.0000000334 |
| t Critical one-tail | 1.67202883 |
| P (T ≤ t) two-tail | 0.0001 |
| t Critical two-tail | 2.002465403 |
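As a rough illustration of how the quantities in this table relate to one another, the sketch below recomputes the mean difference, standard error, degrees of freedom, and t statistic from the summary statistics alone. It assumes a pooled-variance (Student's) unpaired t-test, which is consistent with the reported df of 70 (36 + 36 − 2) but is not stated explicitly in the table. The function name `pooled_t_test` and its parameters are hypothetical. Because the inputs are the rounded values shown above, the recomputed standard error and t statistic may differ slightly from the reported 1.677 and −6.507.

```python
import math


def pooled_t_test(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """Unpaired t-test from summary statistics.

    Assumption: equal population variances (pooled-variance form).
    Returns (observed difference, standard error, df, t statistic).
    """
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two sample means.
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = mean1 - mean2
    t_stat = (diff - hypothesized_diff) / se
    return diff, se, df, t_stat


# Summary statistics copied from the table (baseline first, agent approach second).
diff, se, df, t_stat = pooled_t_test(57.58, 73.72605042, 36, 68.39, 29.6, 36)

# Two-tailed critical value as reported in the table, not recomputed here.
t_critical_two_tail = 2.002465403
significant = abs(t_stat) > t_critical_two_tail

print(f"difference = {diff:.2f}, SE = {se:.3f}, df = {df}, t = {t_stat:.3f}")
print(f"|t| exceeds the two-tailed critical value: {significant}")
```

Under these assumptions, the magnitude of the t statistic is well above the two-tailed critical value, which agrees with the small p-values reported in the table.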