NEW DUE DATE ASSIGNMENT 5. NOVEMBER 29 (MONDAY) BY CLASS.
 

ASSIGNMENT 5 SPECIFICATIONS
COMING SOON


 
 

EDF 5400 INTRODUCTORY STATISTICS
FALL 2004

ASSIGNMENT 4: CROSSTABULATIONS INCLUDING A CONTROL VARIABLE 
ISSUES OF CAUSALITY IN NON-EXPERIMENTAL DATA

GENERAL FEEDBACK ASSIGNMENT 4
REVIEW ASSIGNMENT 4 SPECIFICATIONS HERE

This assignment is worth 5 PERCENT toward your final grade.
Remember! I use plus and minus grading on assignments and for the final grade.


This Feedback page is generic. If you feel it does not address the score on your paper, please make an appointment and we will go over your paper.

We do not go over PERSONAL papers during class or break. However, I will discuss them after class, and we can discuss them during office hours or through an appointment.

Maria and I are at least as interested in how you arrive at your answer as what your answer is.

This assignment is a good example. Let's suppose you chose the correlation coefficient phi (Cramer's V) to examine the association between degree and study year.

You received partial credit if you did. You COULD use V because you can use this correlation coefficient with any kind of data. But tau-beta was the BETTER choice in this case because (a) year is interval and degree is ordinal and (b) with only two categories in the independent variable (year), you can't tell if the relationship is nonlinear, so you may as well go with a higher level correlation coefficient.

However, if you DID choose Cramer's V, we wanted to make sure you were consistent, and that you subsequently chose the correct numeric values and strengths that belonged with V and not some other correlation coefficient. This became important because the correlation between year and degree for men was weak if you used tau-beta but moderate if you used V (the two were very close in magnitude, with the tau just a tad below the moderate level).
 

 
Phi and Cramer's V are identical if either the row or column variable has only two values. In that case, the formula collapses down to the formula for Phi. If the table is 3 X 3 or larger, Cramer's V is often slightly different from Phi.
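To see why the two coefficients coincide when one variable has only two values, here is a minimal sketch using the usual textbook definitions (Phi as the square root of chi-square over n, and V dividing further by the smaller of rows minus 1 and columns minus 1). The function names and the counts in both tables are made up for illustration; they are not the assignment data.

```python
import math

def chi_square(table):
    """Pearson chi-square for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def phi(table):
    """Phi: square root of chi-square divided by the sample size."""
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / n)

def cramers_v(table):
    """Cramer's V: like Phi, but also divides by (smaller dimension - 1)."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

# Hypothetical counts, for illustration only.
two_by_three = [[30, 20, 10], [10, 20, 30]]
three_by_three = [[30, 10, 10], [10, 30, 10], [10, 10, 30]]

print(phi(two_by_three), cramers_v(two_by_three))      # the two agree
print(phi(three_by_three), cramers_v(three_by_three))  # the two differ
```

In the first table one variable has only two values, so the extra divisor is 1 and V reduces to Phi. In the second table the divisor is 2, so V comes out smaller than Phi.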

IMPORTANT: A variable has different categories or VALUES. The terms categories and values are interchangeable. That's week one material.

If you consistently said how many variables a variable had, you lost credit, because you were warned on both Assignment 3 and Exam 2 that this is the time to stop. Statements such as the following are false and meaningless:

FALSE: "You can't tell the form of the relationship because the independent variable had only two variables."

FALSE: "You can't tell the form of the relationship because there were only two variables." (Yes, you can.)

Despite the "AQ" (Anxiety Quotient) on this exercise, most people did quite well. The median score was 19/20, the mean was 18.2 points (s = 2.5), and the IQR was from 18-20. Really can't do better than that! Especially since all these measures should be very familiar by now.

I know it sounds trite to say live with some anxiety; however, most of us get nervous when we learn new material. There is that ghastly feeling of not quite having one's feet on the floor. But, as you know by this time, such a feeling dissipates with practice. This will happen with basic regression, too.

   If you scored below 16 on this assignment, you are IN TROUBLE and need some extra help. Maria has been absolutely terrific working one-on-one with students (thank you, Maria!) and has office hours in the LRC Tuesday and Thursday 3:30-5:15.
 
 

The 18-20 point paper


You correctly identified sex as nominal and degree as ordinal. A few people thought degree was numeric. Not so. The units are irregular: for example, there were 11 years in the first category but only 1 in the second.
You recognized that the values of Phi and Cramer's V in this case were identical because the table was 2 X 2.


You realized that you probably had a REAL association (in the population).
The level of statistical significance for the Chi-Square was less than .01 or p < .01.
 
 
Almost everyone this time around was comfortable with the probability levels, and this was true for ALL FOUR of the tables that you considered. Congratulations!

You used Chi-square to determine the significance level, not the actual value of Phi (or Tau-beta). Chi-square is an inference measure. In sample data, it is easily possible to have a Phi or Tau that "looks real" or nonzero but which is a sampling accident instead because the Chi-square did not reach the .05 level of statistical significance (usually but not always this occurs with a small sample). 

You do NOT directly generalize from a sample correlation coefficient to the population correlation without first ascertaining, through Chi-Square (or an F or t-test), that the correlation is really different from zero in the population.
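A small sketch can make this concrete. The two 2 X 2 tables below are invented for illustration (they are not the assignment data): both have exactly the same Phi, but only the larger sample reaches statistical significance. The function name is hypothetical, no continuity correction is applied, and the p-value shortcut is valid only for 1 degree of freedom.

```python
import math

def phi_and_p(a, b, c, d):
    """Phi and the chi-square p-value for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Shortcut chi-square formula for a 2 x 2 table (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    # With 1 degree of freedom, the upper-tail chi-square probability
    # equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return phi, p

small = phi_and_p(6, 4, 4, 6)          # hypothetical sample, n = 20
large = phi_and_p(150, 100, 100, 150)  # same proportions, n = 500

print(small)  # same phi as below, but p is well above .05
print(large)  # same phi as above, but p is well below .01
```

The strength of the association (Phi) is identical in both tables. Only the Chi-square probability tells you whether the sample result is likely to be a sampling accident.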

If you said the association was real because PHI or TAU-BETA looked different from zero, there will be many dissenting letters in the Tallahassee Democrat.

You understood that you could not determine the form of the correlation between sex and degree because sex (1) is a nominal variable and (2) has only two categories or values (NOT "VARIABLES"). Form cannot be determined under these circumstances.

You correctly reported the numeric value of the correlation (= .065 or .07) and identified its strength as VERY WEAK.



Year is interval and degree level is ordinal. Year had only 2 values, so you could not ascertain any nonlinear trends, and the relationship is asymmetric (no way can your degree level affect what year it is). Given all this, you realized that you could use tau-beta in this case. Tau-beta is the strongest and most appropriate correlation coefficient to use under these circumstances.
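For anyone curious how a tau of this kind is built from a crosstabulation, here is a minimal sketch assuming the standard Kendall's tau-b formula: concordant minus discordant pairs, divided by the square root of (pairs not tied on the row variable) times (pairs not tied on the column variable). The function name and the counts are hypothetical, and both the rows and the columns must be listed in their natural order.

```python
import math

def tau_b(table):
    """Kendall's tau-b for a table whose rows and columns are both ordered."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a LATER row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:    # higher on both variables: concordant
                        concordant += table[i][j] * table[k][m]
                    elif m < j:  # higher on one, lower on the other
                        discordant += table[i][j] * table[k][m]
    n = sum(sum(row) for row in table)
    total_pairs = n * (n - 1) // 2
    # Pairs tied on the row variable and on the column variable.
    tied_rows = sum(t * (t - 1) // 2 for t in (sum(r) for r in table))
    tied_cols = sum(t * (t - 1) // 2 for t in (sum(c) for c in zip(*table)))
    return (concordant - discordant) / math.sqrt(
        (total_pairs - tied_rows) * (total_pairs - tied_cols))

# Hypothetical counts: 2 ordered rows by 3 ordered columns.
example = [[30, 20, 10], [10, 20, 30]]
print(tau_b(example))
```

Unlike Phi or V, this coefficient carries a sign: reversing the order of the columns flips it from positive to negative, which is why it only makes sense when both variables are at least ordinal.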

You could use Phi (you lost 1 point if you did) because you can use Phi with any level of data, or with nonlinear relationships. It's just not the BEST correlation under these circumstances.

No one identified Gamma as the best coefficient. Thank you!

You used the CHI-SQUARE value and its accompanying probability level (p < .01) to ascertain that the relationship was real. You did not use the sample value of Tau-b (or Phi) to ascertain if the sample relationship was real or a sampling accident.

You recognized that you could not identify the form of the relationship because you only had two categories on the independent variable, year. You need at least three categories (or VALUES) to identify form.

When you examined the correlations (Phi or Tau-B) separately for women and men, you noted that:

What are your causal conclusions about the type of relationship considering all three variables together?
In the box below is a message repeated from Assignment 4 that was placed right below question 15:
 
 
Remember, you can make ONLY ONE CHOICE from among the outcomes immediately above. 

All three correlations between degree and year were easily within 0.10 of one another, for everyone, for women and for men.

PLUS sex very weakly (but definitely) predicted degree.



Interpretation? JUDGEMENT CALL!! This is NOT an interaction effect. The correlations between year and degree are within |.10| for women and men.

It is not appropriate to say that one subtable correlation is bigger (e.g., for women) than the other because you do NOT have an interaction effect. Basically the correlation between year and degree is about the same for both sexes.
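The reasoning on this page can be written out as a rough decision sketch. This is NOT the full list of outcomes from question 15; it covers only the cases discussed here. The function name, the `tolerance` parameter, and the strength labels are hypothetical, and apart from the 0.25 for men the correlations in the usage line are invented.

```python
def classify(r_total, r_women, r_men, control_strength, tolerance=0.10):
    """Rough sketch of the three-variable reasoning discussed on this page.

    r_total, r_women, r_men: the year-degree correlation for everyone,
        for women, and for men.
    control_strength: the strength label you already assigned to the
        sex-degree correlation, e.g. "none", "very weak", "weak".
    """
    values = [r_total, r_women, r_men]
    if max(values) - min(values) > tolerance:
        # The correlations are not all within .10 of one another.
        return "possible interaction or another outcome (not covered here)"
    if control_strength == "none":
        return "extraneous"
    if control_strength == "very weak":
        return "judgment call: extraneous or joint"
    return "joint"

# Illustrative values only; 0.25 for men is the one figure from this page.
print(classify(0.28, 0.30, 0.25, "very weak"))
```

The order of the checks mirrors the argument: first rule out an interaction by comparing the correlations, and only then look at how strongly the control variable predicts the dependent variable.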



 
YOU LOST CREDIT IF

You kept confusing variables (an entity that varies) with values (the scores or categories that a variable takes on).

You said you could determine the form of the relationship (monotonic, linear or nonlinear) with a nominal variable or a variable that had only two categories or values.

You thought Chi-Square was a correlation coefficient. Or, you thought a correlation coefficient was an inference measure.

Level of data is the first thing we examine when choosing a statistic. For example, you couldn't use Pearson's r here because degree was ordinal. You couldn't use eta (a few people tried) because the dependent variable degree was ordinal and eta requires a NUMERIC dependent variable.

You did not recognize the joint effects in question 15. Remember, in an extraneous relationship, the control variable is not related to the dependent variable (or is related but the effect is so trivial that it is VERY weak--were it weak instead, this would definitely have been a JOINT relationship).

You never mentioned the relationship between the control variable gender and degree in deciding between a joint and extraneous relationship. This correlation is what makes the difference.

You said the 0.25 tau-beta between year and degree for men was moderate. It is not; it is weak. Please review THE CHART.
 
POSITIVE POINTS

Almost everyone has become a real "pro" at disentangling statistical significance from the strength of a correlation.

Everyone could locate the appropriate output for the total sample and for each gender subgroup.

Almost everyone could correctly identify the strength of their chosen correlation coefficients.

Most students recognized that there was a joint (or extraneous) effect among gender, degree level, and year and could tell us why.
 
 
PLEASE STUDY YOUR ASSIGNMENT. COMMENTS ARE ON THEM AS APPROPRIATE.
There will be a comparable problem on Exam Three



Susan Carol Losh November 15 2004