# 2 Bi-variate Statistics: Basics

# Terminology: Explanatory/Response or Independent/Dependent

All of the discussion so far has been for studies which have a single variable. We may collect the values of this variable for a large population, or at least the largest sample we can afford to examine, display the resulting data in a variety of graphical ways, and summarize it in a variety of numerical ways. But in the end all this work can only show a single characteristic of the individuals. If, instead, we want to study a *relationship*, we need to collect at least two variables and develop methods of descriptive statistics which show the relationships between the values of these variables.

A relationship in data requires at least two variables. More complex relationships can involve more, but in this chapter we will start the project of understanding *bivariate data*: data where we make two observations for each individual, so that we have exactly two variables.
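As a minimal sketch of what "two observations for each individual" means in practice, a bivariate dataset can be stored as a list of pairs. The data below (hours studied and exam score for five students) and all variable names are hypothetical, invented only for illustration. The sketch also shows why the single-variable summaries from earlier cannot detect a relationship: re-pairing the values leaves every univariate summary unchanged.

```python
# Hypothetical bivariate data: one (hours studied, exam score) pair per student.
observations = [
    (2.0, 65),
    (3.5, 70),
    (5.0, 78),
    (6.5, 85),
    (8.0, 90),
]

# Split the pairs into two parallel lists, one per variable.
hours = [x for x, _ in observations]
scores = [y for _, y in observations]

# Univariate summaries describe each variable on its own.
mean_hours = sum(hours) / len(hours)
mean_scores = sum(scores) / len(scores)

# Re-pair the same values in the opposite order: the individual variables
# (and so their means) are unchanged, but the relationship is reversed.
repaired = list(zip(hours, sorted(scores, reverse=True)))

print("mean hours:", mean_hours)
print("mean score:", mean_scores)
print("original pairs: ", observations)
print("re-paired values:", repaired)
```

The pairing itself carries the information about the relationship, which is why the methods in this chapter work with the pairs rather than with each list separately.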

If there is a relationship between the two variables we are studying, the most we could hope for is that the relationship arises because one of the variables *causes* the other. In this situation, we have special names for these variables: the one doing the causing is the *explanatory* (or *independent*) variable, and the one being affected is the *response* (or *dependent*) variable.

# Scatterplots

When we have bivariate data, the first thing we should always do is draw a graph of this data, to get some feeling about what the data is showing us and what statistical methods it makes sense to try to use. The way to do this is as follows.
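The sketch below illustrates the usual scatterplot convention: each individual becomes one point, with the value of the first (explanatory) variable giving its horizontal position and the value of the second (response) variable giving its vertical position. It assumes that convention, and it draws a crude character grid using only the Python standard library; in practice a plotting library would normally be used. The function name `text_scatterplot` and the data are hypothetical.

```python
def text_scatterplot(xs, ys, width=40, height=12):
    """Return a crude character-grid scatterplot of the points (xs[i], ys[i])."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two equally long, non-empty lists")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when all values of a variable are equal.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        # Row 0 is the top of the plot, so larger y values get smaller row numbers.
        row = (height - 1) - round((y - y_min) / y_span * (height - 1))
        grid[row][col] = "*"

    lines = ["|" + "".join(r) for r in grid]  # vertical axis on the left
    lines.append("+" + "-" * width)           # horizontal axis at the bottom
    return "\n".join(lines)


# Hypothetical data: hours studied (explanatory) and exam score (response).
hours = [2.0, 3.5, 5.0, 6.5, 8.0]
scores = [65, 70, 78, 85, 90]
plot = text_scatterplot(hours, scores)
print(plot)
```

For this made-up data the points run from the lower left to the upper right of the grid, the kind of visual pattern a scatterplot is meant to reveal before any numerical method is applied.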

# Correlation

As before…

# Exercises

Here we go…