I WILL HAVE OFFICE HOURS MONDAY AND WEDNESDAY AFTERNOON. MARIA WILL HAVE OFFICE HOURS TUESDAY AFTERNOON. OR YOU MAY EMAIL US. HOWEVER, PLEASE
DO NOT E-MAIL AFTER 8 PM TUESDAY NIGHT.
IF YOU E-MAIL ME WEDNESDAY MORNING I
WILL NOT HAVE ANY TIME TO RESPOND TO YOU.
|
This assignment is worth 5 PERCENT toward
your final grade.
Remember! I use plus and minus grading
on all assignments and for the final grade.
We are at least as interested in how you arrive at your answer as in what your answer is.
This assignment is a good example. Let's suppose you chose the correlation coefficient Tau-b as the most appropriate in the cross-tabulation table to examine the association between year and the number of adults in the household.
You lost one point, because although Tau-b is a VALID measure in this case, it is not the MOST APPROPRIATE measure when you have an interval independent variable and a ratio dependent variable. Eta is a more powerful measure in this case than the Taus. Tau-b is used when at least one variable is ordinal, so it CAN be used, it's just that Eta is a better choice if the second variable is numeric.
ETA is ALSO a correlation coefficient.
You could also use r (Pearson's product-moment correlation)
because year is interval-level (see below). The only problem is that with
only two values (NOT "variables") of the independent variable
"year" we don't know if the relationship resembles a straight line
or not.
Eta is probably the BEST
choice (but you did receive full credit this time for r).
But suppose you did choose Tau-b. We next looked to see if you found its correct value of -0.11 in the output and whether you compared Tau-b (and not some other correlation coefficient) with Eta in the difference of means test. We expected you to say the value of the Tau was "weak" and not "very weak" (as was the case for eta or r).
We were looking
for consistency between your choice of the best correlation coefficient
and your subsequent choices.
What about year? Year is interval because there is no fixed zero (the Hebrew calendar, for example, has a different starting point than the Gregorian calendar used in Western Europe) and there is an equal interval: one year.
Decide on the level of data based on the properties of the category system, not how many categories are in the sampled variable. "Year" is interval because it has an equal numeric unit, one year. It did not become suddenly ordinal because we only used two survey years for this exercise.
In either the case of "year" or "adults"
we judge the level of data measurement FROM PROPERTIES OF THE CATEGORY
SYSTEM, not from some particular empirical sample distribution.
You realized that you probably had a REAL association (in the population).
You
based this decision on the probability or p-level value or the statistical
significance of the Chi-square (or later, the F).
You DO NOT use the value of the Chi-square itself or the value of the correlation coefficient. Both are SAMPLE VALUES. No matter what they may appear to be, sample values may not be different from a population null hypothesis of 0 or no association.
The level of statistical significance was less than .01 or p < .01.
Because you had two zeroes for the probability
level, the statement p = .01 is false. The probability is less than one
chance in 100, but we do not know how much less from the program output.
|
The computer output for the table itself stopped at two decimal places, therefore p < .0001 or p < .001 was inappropriate for the tabular portion unless you (1) looked up the value of the chi-square in a probability table and explicitly referenced this (a few people did) OR (2) explicitly referenced the difference of means test and its corresponding eta (a few people did).
You received full credit despite this error on the exercise because this was your first encounter. You will NOT receive full credit for an incorrect probability level on Exam 2. One chance in 100 is not the same as one chance in 1,000 or one chance in 10,000.
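The rule about printed decimal places can be written out as a short sketch. The function name `report_p` and the sample strings are hypothetical; the only assumption is that the program prints the probability as a fixed number of decimal places.

```python
def report_p(printed: str) -> str:
    """Turn a probability as printed in program output into a defensible statement.

    `printed` is the probability exactly as it appears on the output,
    e.g. ".00" or "0.0000" (hypothetical examples).
    """
    text = printed.strip()
    decimals = text.split(".", 1)[1] if "." in text else ""
    if decimals and float(text) == 0.0:
        # A row of zeroes only tells us p is below the smallest value
        # the printout could have displayed with that many decimals.
        bound = "." + "0" * (len(decimals) - 1) + "1"
        return f"p < {bound}"
    # Otherwise the printed digits are the value itself.
    return f"p = {text.lstrip('0')}"


print(report_p(".00"))     # two decimal places on the table output
print(report_p("0.0000"))  # four decimal places on the difference of means output
print(report_p("0.03"))
```

Two printed zeroes justify only p < .01; claiming p < .0001 needs four printed zeroes or an outside source such as a probability table.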
You correctly calculated or identified the value of Eta, which was 0.10.
You correctly identified the strength of Eta as very weak (or
Tau-b, at -0.11, as weak).
ABOUT CORRELATION COEFFICIENT DIRECTION:
Correlation
coefficients that are the square root of some entity, such as phi or eta
are positive by default and definition. They cannot be negative. Two variables
must be at least ordinal and the correlation coefficient must be at least
ordinal for a correlation to have a positive or negative direction.
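A small sketch may help show why Eta has no sign while Tau-b does. The household counts below are made up for illustration (they are NOT the assignment data), and the functions are a plain textbook-style calculation, not the program's own routine.

```python
from itertools import combinations
from math import sqrt


def eta(groups):
    """Correlation ratio: square root of (between-group SS / total SS)."""
    values = [v for g in groups.values() for v in g]
    grand = sum(values) / len(values)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values()
    )
    return sqrt(ss_between / ss_total)  # a square root, so never negative


def tau_b(x, y):
    """Tau-b from concordant and discordant pairs, corrected for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: counted nowhere
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + tied_x_only) * (untied + tied_y_only)
    )


# Hypothetical number of adults per household in two survey years.
adults = {1980: [2, 2, 3, 2, 1, 3], 2002: [1, 2, 2, 1, 2, 1]}
years = [yr for yr, g in adults.items() for _ in g]
counts = [v for g in adults.values() for v in g]

print("eta   =", round(eta(adults), 2))           # positive by construction
print("tau-b =", round(tau_b(years, counts), 2))  # sign shows the direction
```

With these made-up numbers the later year has fewer adults, so Tau-b comes out negative, while Eta only reports how much the group means differ.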
Question 7 is about statistical significance. At this point, you looked at the value of the t (the square root of the F-ratio) to see whether the t (F) is significantly different from zero. In this example, t = 6.48 (approximately). Its level of statistical significance was p < .0001. The program output gave you four decimal places for the probability level, so use four (not 2 or 3). You want to consider the probability level for this question.
You
identified both the t-value and its level of statistical significance correctly.
If the t-value (F-ratio) had been close to zero, you would have concluded that the 1980 and 2002 means on the number of adults in the household were the same (this is equivalent to saying that there is NO DIFFERENCE or a ZERO DIFFERENCE between the 1980 and 2002 population means).
You cannot always use a cutoff t value of |1.96|. This will not hold in smaller samples (e.g., under 120). In smaller samples, the t distribution looks "flatter" than the normal distribution and you will need larger t values to reach your specified alpha level.
If the t-value (in absolute terms) is considerably larger than zero and falls in the "tail" of the t distribution, you reject the null hypothesis of "no difference" or "zero difference" between the 1980 and 2002 means and accept the alternative hypothesis that there is "some difference" (unknown, just not zero) across time in the number of household adults.
Do NOT answer the question about strength of the difference here. The t or F value is a poor indicator of strength anyway, partly because it is influenced by sample size as much as the group differences; partly because it can theoretically go to infinity; but largely because it only tests whether the difference between the 1980 and 2002 means is zero or not.
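The statement that t is the square root of the F-ratio can be checked by hand for two groups. This is a minimal sketch with made-up household counts (not the assignment data), assuming the usual pooled-variance t and a one-way analysis of variance with two groups.

```python
from math import sqrt


def pooled_t(a, b):
    """Two-group t statistic using the pooled within-group variance."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (mean_a - mean_b) / sqrt(pooled_var * (1 / na + 1 / nb))


def anova_f(a, b):
    """One-way F-ratio for the same two groups (1 and N - 2 degrees of freedom)."""
    values = list(a) + list(b)
    grand = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in (a, b))
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in (a, b) for v in g)
    return (ss_between / 1) / (ss_within / (len(values) - 2))


# Hypothetical number of adults per household in each survey year.
adults_1980 = [2, 2, 3, 2, 1, 3]
adults_2002 = [1, 2, 2, 1, 2, 1]

t = pooled_t(adults_1980, adults_2002)
f = anova_f(adults_1980, adults_2002)
print("t   =", round(t, 3))
print("t^2 =", round(t ** 2, 3), " F =", round(f, 3))  # the two agree
```

Both statistics answer the same yes-or-no question (is the difference zero?), which is why neither one should be read as a measure of strength.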
You
identified the level of significance as p < .0001 (there was the "row
of four zeroes" here).
You
identified the eta as about 0.10 and you correctly identified the strength
as VERY weak.
If you chose the difference of means test for its economy and ease of interpretation
for your reader, we expected you to choose eta in question 2 (not tau or
phi which are for lower levels of data).
Those who liked the difference of means noted that:
Some of your output was missing.
You misidentified the value of Eta or Tau.
You confused the value of the correlation coefficient with the probability
level.
You misidentified the measurement level of each variable in the assignment
at some point.
You
misidentified the "number of adults" as the independent variable. There
is no way the number of adults in the household will influence what year
it is!
NOTE ON COEFFICIENTS: You may use a symmetric correlation coefficient
with an asymmetric relationship (with two numeric variables and lots of
values, r will be your best choice, for example). However, it is generally
inappropriate to use an asymmetric coefficient with a symmetric relationship.
You used a sample value, such as a t, Chi-Square or a correlation coefficient
to decide if you had statistically significant results. The FIRST thing
you must do is decide whether the sample results reflect a population value
of zero, or whether you reject the null hypothesis and decide that the
population parameter is not zero (no matter how small).
You
decided the sample results were not statistically significant because the
value of the correlation coefficient was weak or very weak. Statistical
significance in the bivariate case means the association is not zero, even
if it is quite weak. Substantive or practical significance is assessed
with the strength of the correlation coefficient or the effect size of
the difference across means.
|
You misapplied levels of statistical significance
to the strength of a correlation.
Eta was BOTH simultaneously very weak AND it was highly statistically significant in these analyses. This is mixing up statistical significance with practical (substantive) significance.
The test for statistical significance asked "is Eta reliably different from zero?" The answer was a resounding YES! If Eta were really zero in the population, you would only get a Chi-Square or F-ratio as extreme or large as our results by accident in less than 1 in 100 or 1 in 10,000 samples, depending on the table or the difference of means test. That's REALLY rare! So you reject the null hypothesis that Eta = 0 and say that your results are "statistically significant." (And you do have a less than 1/10,000 chance in the case of the t-test that you are wrong and Eta really is zero.)
But you didn't test what Eta WAS. You only rejected what it probably WAS NOT (zero). When you next looked at the actual value of Eta, it was just 0.10 and this is very weak. We could reliably identify Eta (or tau) as statistically different from zero because the sample is so large and the results don't vary much. Our sample estimate is probably a good estimate of the population Eta. But reliably different from zero doesn't make for strong. And, in this case, Eta is reliably very weak.
Be prepared for this combination of results (weak strength but statistically significant) in large samples. In small samples, the opposite occurs. You get results that look "as if they were moderate to strong". In fact, if these results are not statistically significant, it doesn't matter what they appear to be, because no statistical significance means they really do not differ from zero in the population.
It's easy to mix things up because correlations can vary from zero to one and thus resemble a numerical fraction just like probability levels do. KNOW WHERE TO LOOK. KEEP THE QUESTION ORDER IN SEQUENCE. Answer question 1 (do I even have any relationship?) FIRST! If you do have a relationship, your results are statistically significant, no matter how tiny they may be.
NORMALLY, WE WANT TO DISCUSS RESULTS THAT ARE BOTH STATISTICALLY AND SUBSTANTIVELY SIGNIFICANT.
TIP: Check the studies that you read. If they never discuss the STRENGTH of results, you should be at least a little suspicious. Either the analyst didn't know to check the strength of the results (shame!) or the results were weak and they didn't want to tell you so.
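To see how sample size drives this, here is a rough sketch. It assumes the two-group difference of means case, where t squared equals Eta squared times (N - 2), divided by (1 - Eta squared). The sample sizes are hypothetical and are NOT the sizes used in the assignment.

```python
from math import sqrt


def t_from_eta(eta, n):
    """t implied by a given Eta when two groups contain n cases in total.

    Uses the two-group identity t^2 = eta^2 * (n - 2) / (1 - eta^2).
    """
    return eta * sqrt((n - 2) / (1 - eta ** 2))


# Same very weak Eta, very different sample sizes (hypothetical).
print("eta = .10, N = 4000:", round(t_from_eta(0.10, 4000), 2))
print("eta = .10, N =   50:", round(t_from_eta(0.10, 50), 2))

# A moderate-looking Eta in a tiny sample (hypothetical).
print("eta = .50, N =   12:", round(t_from_eta(0.50, 12), 2))
```

With thousands of cases, an Eta of 0.10 produces a t far out in the tail. With 50 cases the same Eta gives a t below 1. An Eta of 0.50 with only 12 cases still falls short of the two-tailed .05 cutoff for 10 degrees of freedom, which is 2.228. The strength is the same in the first two lines; only the evidence that it differs from zero has changed.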
|
You
said that substantively weak results or effects were not statistically
significant.
You picked Phi or tau because it was bigger in size than Eta. Do
not pick a correlation coefficient just because it is "the biggest one."
If this were the case, we would simply pick gamma
for everything because it is nearly always the largest coefficient, although
gamma is usually a poor choice because of the way it is calculated. Further,
it means you will bounce around from analysis to analysis using inconsistent
coefficients because they are the largest in size in each case. And, of
course, those variations in size could (once again) simply be sampling
error or variability.
DO REVIEW THESE POINTS (AS WELL AS WHY
ETA WAS THE BEST CHOICE IN THE TABULAR DISPLAY TOO) BECAUSE THEY WILL APPEAR
ON EXAM 2.
Susan Carol Losh October
24 2004