Accuracy: The overall accuracy is the percentage of correctly classified outcomes.
Attributable Risk: The difference in the rate of a condition between the exposed and unexposed populations. This difference is attributed to the exposure (Table 1). In the notation of Table 1, attributable risk = a / (a + b) - c / (c + d).
Table 1

| | Disease (cases) | No Disease (controls) | Row Totals |
|---|---|---|---|
| Exposure (Treatment) | a | b | a + b |
| No Exposure (No Treatment) | c | d | c + d |
| Column Totals | a + c | b + d | a + b + c + d |
In epidemiology, attributable risk is the difference in the rate of a disease/outcome for exposed and unexposed populations. This calculation helps illustrate whether an exposure is related to the particular disease/outcome.
Ex. Cohort Study of Smoking and Coronary Heart Disease (CHD) among Medicaid Recipients

| | Developed CHD | Did not develop CHD | Total | Incidence per 1,000 per Year |
|---|---|---|---|---|
| Smoked cigarettes | 115 | 3,000 | 3,115 | 36.9 |
| Did not smoke cigarettes | 132 | 5,200 | 5,332 | 24.8 |
Incidence among Exposed (Smokers) = 115 / 3,115 = 36.9 per 1,000 per year
Incidence among Unexposed (Nonsmokers) = 132 / 5,332 = 24.8 per 1,000 per year
The incidence in the exposed group that is attributable to the exposure is calculated as follows:
Attributable risk = 36.9 - 24.8 = 12.1 per 1,000 per year
The proportion of the total incidence in the exposed group that is attributable to the exposure is calculated by:
Attributable proportion = (36.9 - 24.8) / 36.9 = 0.328, or 32.8%
Among Medicaid recipients, 32.8% of the morbidity from CHD among smokers may be attributable to smoking.
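The attributable risk calculation can be sketched in a few lines of Python. This is only an illustration: the function name and argument names are assumptions made here, and the counts are those of the smoking and CHD example.

```python
def attributable_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return (risk difference, attributable proportion among the exposed)."""
    incidence_exposed = exposed_cases / exposed_total
    incidence_unexposed = unexposed_cases / unexposed_total
    # Attributable risk: incidence in the exposed minus incidence in the unexposed.
    risk_difference = incidence_exposed - incidence_unexposed
    # Share of the exposed group's incidence that the difference accounts for.
    attributable_proportion = risk_difference / incidence_exposed
    return risk_difference, attributable_proportion


# Counts from the smoking and CHD cohort example.
ar, ap = attributable_risk(115, 3115, 132, 5332)
print(f"Attributable risk: {ar * 1000:.1f} per 1,000 per year")
print(f"Attributable proportion among smokers: {ap:.1%}")
```

Because the function works from unrounded incidences, its results can differ in the last digit from figures computed by hand from the rounded incidences of 36.9 and 24.8.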
Case-Control: In an observational case-control study, the researcher looks through extant data and randomly selects cases and controls. Because the researcher specifically looks for and includes cases and controls, the proportion of cases in the sample is predetermined. Thus, one cannot estimate a risk ratio. After selection based only on case-control status, each person's exposure is determined.
Cross-sectional: In a cross-sectional study, data are collected at a single point in time. When the design is repeated over multiple time periods (usually at regular intervals), the data are not collected from the same sampling units each time.
Cumulative incidence: The number of new cases in a specific time interval divided by the number of persons at risk.
Ex. In 2010, the population of women ages 35-49 who were breast cancer free was 135,000. If 1,000 of those women develop breast cancer over 1 year of observation, the cumulative incidence of breast cancer is 1,000 / 135,000 = 7.41 breast cancer cases per 1,000 women at risk (0.741%).
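As a minimal sketch, the same arithmetic can be written in Python. The function name and the `per` scaling argument are illustrative assumptions, not part of any standard library.

```python
def cumulative_incidence(new_cases, persons_at_risk, per=1000):
    """New cases during the interval, expressed per `per` persons at risk."""
    return new_cases / persons_at_risk * per


# Figures from the breast cancer example.
cases_per_1000 = cumulative_incidence(1_000, 135_000)
print(f"{cases_per_1000:.2f} cases per 1,000 persons at risk")
```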
Ecological Study: A study for which the unit of analysis is the population rather than the individual. This type of study might compare outcomes, for example, in different countries.
Effectiveness: RCTs are sometimes designed to investigate whether there is evidence in favor of or against a drug, device, or intervention when recruiting relatively unselected participants under flexible conditions. These trials focus on general practice.
Efficacy: RCTs are sometimes designed to investigate whether there is evidence in favor of or against a drug, device, or intervention when recruiting highly selected participants under highly controlled conditions.
Efficiency: The efficiency of a test is the percentage of times that the test gives the correct answer out of the total number of tests (Table 2). In the notation of Table 2, efficiency = (a + d) / (a + b + c + d).
Table 2

| True Status (D) | Test Result (T): Positive (+) | Test Result (T): Negative (-) |
|---|---|---|
| Disease (+) | a (True Positive) | b (False Negative) |
| No Disease (-) | c (False Positive) | d (True Negative) |
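A small Python sketch of the efficiency calculation follows. The cell counts are hypothetical values chosen for illustration, and the argument names a, b, c, d follow the cells of Table 2.

```python
def efficiency(a, b, c, d):
    """Share of all tests that are correct: true positives plus true negatives over the total."""
    return (a + d) / (a + b + c + d)


# Hypothetical counts: 90 true positives, 10 false negatives,
# 30 false positives, 870 true negatives.
eff = efficiency(a=90, b=10, c=30, d=870)
print(f"Efficiency: {eff:.1%}")
```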
Hazard ratio (HR): The ratio of two hazard rates corresponding to two conditions (e.g., male versus female, or exposed versus unexposed). The hazard rate is the rate of events at time t conditioned on not having the event before time t.
Incidence rate: The measure of the risk of occurrence of a specific outcome in a specific time interval.
Ex. In 2010, the average Medicaid population was 972,000 and there were 3,500 deaths over 1 year of observation; the incidence rate of death is 3,500 / 972,000 = 3.60 deaths per 1,000 person-years.
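The person-years denominator can be made explicit in a short Python sketch. It assumes, as the example does, that person-time can be approximated by the average population multiplied by the years of observation; the function and argument names are illustrative.

```python
def incidence_rate(events, average_population, years=1.0, per=1000):
    """Events per `per` person-years, approximating person-time as population x years."""
    person_years = average_population * years
    return events / person_years * per


# Figures from the Medicaid mortality example.
deaths_per_1000_py = incidence_rate(3_500, 972_000)
print(f"{deaths_per_1000_py:.2f} deaths per 1,000 person-years")
```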
Incidence rate ratio (IRR): The ratio of two incidence rates which is used for comparison in regression models of count (incidence) outcomes.
Longitudinal: In a longitudinal study, data for each sampling unit (typically a person in health studies) are collected repeatedly over time. The measurements may or may not be taken at regular time intervals.
Negative Predictive Value (NPV): The negative predictive value is the probability that predicted noncases (negative test results) really are noncases (Table 2). In the notation of Table 2, NPV = d / (b + d).
Nested case-control: A case-control study taken from within a (larger) panel study.
Odds: The ratio of the probability of success to the probability of failure.
Example: The odds of disease are the ratio of the probability of having the disease to the probability of not having it, computed separately for exposed and unexposed persons (Table 1).
Let p = probability of an event
1 - p = probability of that event not occurring
Odds = p / (1 - p)
The odds of disease among those who have been exposed: [a / (a + b)] / [b / (a + b)] = a / b
The odds of disease among those who were not exposed: [c / (c + d)] / [d / (c + d)] = c / d
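The link between probabilities and odds can be sketched in Python as follows. Both function names are assumptions made for illustration; the counts reuse the smokers row of the smoking and CHD example (115 cases, 3,000 noncases).

```python
def odds_from_probability(p):
    """Odds = p / (1 - p); undefined when p equals 1."""
    return p / (1 - p)


def odds_from_counts(events, non_events):
    """Odds computed directly from counts, e.g. a / b in Table 1."""
    return events / non_events


# Both routes give the same odds of disease among the exposed.
odds_exposed = odds_from_counts(115, 3_000)
odds_via_probability = odds_from_probability(115 / 3_115)
print(f"{odds_exposed:.4f} vs {odds_via_probability:.4f}")
```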
Odds Ratio (OR): The ratio of two odds. The odds of cancer for males versus the odds of cancer for females is the odds ratio of cancer for males versus females. This measure relates the relative ratio of success to failure for one condition versus another. Generally, people are better able to think in terms of relative risk than they are in terms of relative odds (Table 1).
The odds ratio compares the odds of disease in exposed versus nonexposed persons:
OR = (a / b) / (c / d) = ad / bc
Interpretation of OR:
OR < 1: lower odds of disease for exposed individuals ("exposure is protective or negatively associated with disease")
OR = 1: no difference in odds of disease
OR > 1: increased odds of disease for exposed individuals ("exposure is positively associated with disease")
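A minimal Python sketch of the odds ratio and its interpretation is given below. The function names are illustrative, the arguments follow the cells a, b, c, d of Table 1, and the counts are borrowed from the smoking and CHD example purely for demonstration.

```python
def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed (a / b) over odds in the unexposed (c / d)."""
    return (a * d) / (b * c)


def interpret_or(value):
    """Map an odds ratio onto the three interpretation cases."""
    if value < 1:
        return "exposure is negatively associated with disease"
    if value > 1:
        return "exposure is positively associated with disease"
    return "no difference in odds of disease"


or_smoking = odds_ratio(a=115, b=3_000, c=132, d=5_200)
print(f"OR = {or_smoking:.2f}: {interpret_or(or_smoking)}")
```

In practice a confidence interval for the odds ratio would also be examined before drawing a conclusion; that step is outside this sketch.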
Panel study: A form of a longitudinal study (sometimes called a cohort study) in which groups are followed over time. The groups are formed to differ only on certain key variables.
Positive Predictive Value (PPV): The positive predictive value (also known as the precision) is the probability that predicted cases (positive test results) really are cases (Table 2). In the notation of Table 2, PPV = a / (a + c).
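The two predictive values can be computed together, as in this Python sketch. The counts are hypothetical, and the argument names follow the cells of Table 2 (a true positives, b false negatives, c false positives, d true negatives).

```python
def predictive_values(a, b, c, d):
    """Return (PPV, NPV) from the four cells of a 2x2 test-result table."""
    ppv = a / (a + c)  # among positive tests, the share that are true cases
    npv = d / (b + d)  # among negative tests, the share that are true noncases
    return ppv, npv


# Hypothetical counts for illustration only.
ppv, npv = predictive_values(a=90, b=10, c=30, d=870)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Both values depend on how common the condition is in the tested population, so figures from one sample do not automatically carry over to a population with a different prevalence.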
Power: The power of a test (1 - β) is the probability that it will reject the null hypothesis when the null hypothesis is not true (Table 3).
Table 3

| Decision | Reality: H0 is true | Reality: H0 is false |
|---|---|---|
| Reject H0 | Type I error (α) | Correct decision |
| Fail to Reject H0 | Correct decision | Type II error (β) |
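To make the relationship between α, β, and power concrete, here is a sketch for one simple case: a one-sided z-test of a mean with known standard deviation. The choice of test, the function name, and the argument names are assumptions made for illustration; other tests need other formulas.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mean = 0 against H1: mean = effect (> 0)."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)  # rejection cutoff under H0
    shift = effect * n ** 0.5 / sigma           # mean of the test statistic under H1
    beta = std_normal.cdf(critical_z - shift)   # P(fail to reject H0 | H1 is true)
    return 1 - beta


# Power grows with the sample size for a fixed effect.
for n in (10, 25, 50):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

When the effect is zero, the function returns α, which matches Table 3: with H0 true, the only way to reject is a Type I error.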
Prevalence rate: The total number of existing cases at a specific time divided by the total population at that time. This measures how common a disease/outcome is at a point in time.
Ex. In 2010, there were 7,500 children (ages birth to 18 years) who had paid claims associated with a primary diagnosis of obesity in the Medicaid population in SC. The total population of children ages birth to 18 years was 125,000; therefore, the prevalence of obesity in children in the Medicaid population in SC is 7,500 / 125,000 = 6%.
Prospective Cohort: In a prospective cohort study, the researcher identifies a cohort of persons based on whether they were exposed (and none of them are already cases). Then, the entire cohort is followed over time where some proportion of the population will become cases.
Randomized controlled trial: The preferred design for a clinical trial used to examine the efficacy of a drug/intervention/device. In RCTs, subjects are first accepted into the study and then assigned to one of the treatment arms. RCTs are typically broken into several levels of investigation (especially when the focus is on a drug).
Randomized clinical trials: A randomized clinical trial (RCT) is one in which subjects, after satisfying the eligibility criteria, are randomly assigned to one of the treatment groups under study. This randomization helps balance known and unknown prognostic factors.
Relative risk (RR): This is the risk of an event (e.g., a health outcome) relative to whether there was exposure. For example, one might be interested in the risk of cancer relative to smoking; the RR is the proportion of the exposed sample that develops into a case divided by the proportion of the unexposed sample that develops into a case (Table 1). In the notation of Table 1, RR = [a / (a + b)] / [c / (c + d)].
Interpretation of Relative Risk:
RR > 1: Positive association between exposure and disease; the exposed group has higher incidence than the nonexposed group
RR = 1: No association between exposure and disease; incidence rates are identical between groups
RR < 1: Negative association between exposure and disease; the nonexposed group has higher incidence than the exposed group
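The relative risk calculation can be sketched in Python using the counts from the smoking and CHD cohort example. The function name is an illustrative assumption, and the arguments follow the cells a, b, c, d of Table 1.

```python
def relative_risk(a, b, c, d):
    """Incidence in the exposed, a / (a + b), over incidence in the unexposed, c / (c + d)."""
    incidence_exposed = a / (a + b)
    incidence_unexposed = c / (c + d)
    return incidence_exposed / incidence_unexposed


# Smokers: 115 cases, 3,000 noncases. Nonsmokers: 132 cases, 5,200 noncases.
rr = relative_risk(a=115, b=3_000, c=132, d=5_200)
print(f"RR = {rr:.2f}")
```

A value above 1 here points to a positive association between smoking and CHD in this cohort.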
Retrospective Cohort: In a retrospective cohort study, the researcher identifies a cohort of persons from existing records based on whether they were exposed. The researcher then looks to see whether those persons subsequently became cases.
Receiver Operating Characteristic (ROC) Curve: The ROC curve is a graph of the true positive rate (sensitivity) versus the false positive rate (one minus specificity) at various threshold settings. A predictive model that performs no better than chance will have an area under the ROC curve of one half, and a perfect predictive model will have an area under the curve of one.
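The construction of the curve and its area can be sketched in plain Python. This is a simplified illustration, not a production implementation: the function names are assumptions, labels are coded 1 for cases and 0 for noncases, a score at or above the threshold counts as a predicted case, and the data must contain at least one case and one noncase.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(p and y == 1 for p, y in zip(predicted, labels))
        fp = sum(p and y == 0 for p, y in zip(predicted, labels))
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Scores that separate cases from noncases perfectly.
perfect = area_under_curve(roc_points([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))
# Identical scores carry no information about case status.
uninformative = area_under_curve(roc_points([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
print(perfect, uninformative)
```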
Sensitivity: The number of true positives divided by the sum of the number of true positives and the number of false negatives. This is the proportion of items that really are positive which are classified as positive (Table 2). In the notation of Table 2, sensitivity = a / (a + b).
Specificity: The number of true negatives divided by the sum of the number of true negatives and the number of false positives. This is the proportion of items that really are negative which are classified as negative (Table 2). In the notation of Table 2, specificity = d / (c + d).
Threshold setting: The threshold setting is the value that classifies a continuous measure into two categories (noncase and case).
Type I Error: In a statistical hypothesis test, there is a null hypothesis (H0) assumed to be true. Evidence is assessed to determine whether the null hypothesis should be rejected in favor of the alternative hypothesis. A Type I error occurs when we reject the null hypothesis when the null hypothesis is actually true. The probability of a Type I error (α) is also referred to as the significance level (Table 3).
Type II Error: In a statistical hypothesis test, there is a null hypothesis (H0) assumed to be true. Evidence is assessed to determine whether the null hypothesis should be rejected in favor of the alternative hypothesis. Type II error (β) occurs when we fail to reject the null hypothesis when the null hypothesis is actually false (Table 3).
Youden's J: A summary of 2x2 tables, this statistic is the sensitivity plus the specificity minus 1. This statistic can be used to choose the optimal threshold setting for classifying predicted values into noncases and cases.
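One way to use Youden's J for threshold selection is sketched below in Python. The function names are illustrative assumptions, labels are coded 1 for cases and 0 for noncases, a score at or above the threshold counts as a predicted case, and the data must contain at least one case and one noncase.

```python
def youden_j(tp, fn, fp, tn):
    """Sensitivity plus specificity minus 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_threshold(scores, labels):
    """Return the (threshold, J) pair with the largest J; ties keep the lowest threshold."""
    best = None
    for threshold in sorted(set(scores)):
        pairs = list(zip(scores, labels))
        tp = sum(s >= threshold and y == 1 for s, y in pairs)
        fn = sum(s < threshold and y == 1 for s, y in pairs)
        fp = sum(s >= threshold and y == 0 for s, y in pairs)
        tn = sum(s < threshold and y == 0 for s, y in pairs)
        j = youden_j(tp, fn, fp, tn)
        if best is None or j > best[1]:
            best = (threshold, j)
    return best


# Hypothetical predicted values for two noncases and two cases.
threshold, j = best_threshold([0.2, 0.3, 0.8, 0.9], [0, 0, 1, 1])
print(threshold, j)
```

Maximizing J weights sensitivity and specificity equally; when false positives and false negatives carry different costs, a different criterion may be more appropriate.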