Gradstats Lab Week 8

class: center, middle, inverse, title-slide

# Gradstats Lab Week 8
## Analysis Examples
### Chris Mellinger
### UC Boulder
### 2016/12/12 (updated: 2020-03-05)

---

# Study Design

.pull-left[
Example credit: Richard Border

Workplace harassment study.

Four men and four women are shown four different videos each and then rate the severity of harassment occurring in each.

Each video either had a male or female target (person being harassed), and each depicted either sexual harassment or cruel denigration.
]
.pull-right[

```r
kable(head(d), format='html') # remember you shouldn't add this argument usually
```

<table>
 <thead>
  <tr>
   <th style="text-align:right;"> subject </th>
   <th style="text-align:left;"> gender </th>
   <th style="text-align:right;"> ms </th>
   <th style="text-align:right;"> fs </th>
   <th style="text-align:right;"> mc </th>
   <th style="text-align:right;"> fc </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> M </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:right;"> 7 </td>
   <td style="text-align:right;"> 6 </td>
   <td style="text-align:right;"> 9 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> M </td>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:right;"> 8 </td>
   <td style="text-align:right;"> 7 </td>
   <td style="text-align:right;"> 8 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> M </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:right;"> 6 </td>
   <td style="text-align:right;"> 7 </td>
   <td style="text-align:right;"> 8 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:left;"> M </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 9 </td>
   <td style="text-align:right;"> 6 </td>
   <td style="text-align:right;"> 9 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:left;"> F </td>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:right;"> 7 </td>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:right;"> 7 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 6 </td>
   <td style="text-align:left;"> F </td>
   <td style="text-align:right;"> 7 </td>
   <td style="text-align:right;"> 7 </td>
   <td style="text-align:right;"> 8 </td>
   <td style="text-align:right;"> 8 </td>
  </tr>
</tbody>
</table>
]

---

# Study Design: Exercise

.pull-left[
Take 60 seconds alone and:

- Identify sources of dependence
- Identify each independent variable (IV)
- Identify which IVs the dependent units are crossed with and which IVs they are nested under
- Alternatively: identify which IVs vary between units and which vary within units

Then we'll take 2 minutes to check your answers with a partner.
]
.pull-right[

```r
kable(head(d), format='html')
```
]

---

# Exercise Answers

Units of dependence are participants.

Three IVs:
- **Gender of participant.** Varies BETWEEN participants. Participants are NESTED UNDER levels of gender (they each only experience being one gender).
- **Gender of target.** Varies WITHIN participants. Participants are CROSSED WITH levels of target gender (they experience both levels of target gender).
- **Harassment type.** Varies WITHIN participants. Participants are CROSSED WITH levels of harassment type (they see both cruel and sexual harassment).

---

# Analysis Strategy

We have IVs that vary both within and between participants.

For IVs varying within:
- Compute averages and differences, depending on our research questions
- Interpret intercepts!!

For IVs varying between (business as usual):
- Construct codes to ask our RQs.
- Interpret slopes accordingly.
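
Why interpret intercepts? A minimal sketch, using only the `ms` and `mc` ratings from the six rows printed by `head(d)` (not the full data set, so the numbers will not match the later slides). With a difference score as the outcome and no predictors, the intercept is the mean difference, and its t test matches a paired t test.

```r
# First six participants' ratings, copied from the head(d) table
ms <- c(3, 4, 3, 1, 5, 7)  # male target, sexual harassment
mc <- c(6, 7, 7, 6, 4, 8)  # male target, cruel harassment

# Within-participant difference score: cruel minus sexual
diff_score <- mc - ms

# Intercept-only model: the intercept estimates the mean difference
fit <- lm(diff_score ~ 1)
intercept_est <- coef(fit)[["(Intercept)"]]
intercept_t <- summary(fit)$coefficients["(Intercept)", "t value"]

# The same test written as a paired t test
paired <- t.test(mc, ms, paired = TRUE)

c(intercept = intercept_est, mean_diff = mean(diff_score))
c(lm_t = intercept_t, paired_t = unname(paired$statistic))
```

Both pairs of numbers should agree, which is why the intercept is the parameter of interest for within-participant effects.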

---

# Analysis Prep Exercise

Take 2 minutes on your own and write down the difference scores and contrast codes we want.

Then check your answers with a friend.

---

# Difference Scores

For a full analysis, we need

1. Main effect of target gender
2. Main effect of harassment type
3. Interaction between the two

One heuristic is that our difference scores can follow the sign of the contrast codes we would write.

.pull-left[

Contrast codes:

<table>
 <thead>
<tr>
<th style="border-bottom:hidden" colspan="1"></th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Male</div></th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Female</div></th>
</tr>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Sex. </th>
   <th style="text-align:right;"> Cru. </th>
   <th style="text-align:right;"> Sex. </th>
   <th style="text-align:right;"> Cru. </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> targetGender </td>
   <td style="text-align:right;"> -1 </td>
   <td style="text-align:right;"> -1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> type </td>
   <td style="text-align:right;"> -1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> -1 </td>
   <td style="text-align:right;"> 1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> tgxt </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> -1 </td>
   <td style="text-align:right;"> -1 </td>
   <td style="text-align:right;"> 1 </td>
  </tr>
</tbody>
</table>

]
.pull-right[

Difference scores:

```r
d <- within(d, {
  tgender <- -ms - mc + fs + fc
  type <-    -ms + mc - fs + fc
  gxt <-      ms - mc - fs + fc
  
  # We also want an average 
  avg <- (ms + mc + fs + fc)/4
})
```
]
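
The heuristic can also be written as a matrix product. This is a sketch only: `d6` holds just the six rows shown by `head(d)`, and the names `d6`, `codes`, and `scores` are made up for illustration. Each difference score is the row of ratings weighted by one row of contrast codes.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  subject = 1:6,
  gender  = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7),
  fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8),
  fc = c(9, 8, 8, 9, 7, 8)
)

# Contrast codes, columns ordered ms, mc, fs, fc as in the table
codes <- rbind(
  tgender = c(-1, -1,  1,  1),
  type    = c(-1,  1, -1,  1),
  gxt     = c( 1, -1, -1,  1)
)
colnames(codes) <- c("ms", "mc", "fs", "fc")

# Each difference score is the ratings weighted by one row of codes
ratings <- as.matrix(d6[, colnames(codes)])
scores <- ratings %*% t(codes)

# The codes are mutually orthogonal: off-diagonal cross-products are zero
crossprods <- codes %*% t(codes)

scores
crossprods
```

The columns of `scores` reproduce the `tgender`, `type`, and `gxt` formulas for these six participants.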

---

# Contrast Codes

Gender is easy to code.

```r
d <- within(d, {
  genderC <- -.5 * (gender=="M") + .5 * (gender=="F")
})
```
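
What do the -.5/+.5 codes buy us? A small sketch on the six rows shown by `head(d)` (so the estimates differ from the full-data models; `d6` is a made-up name). With these codes the slope is the women-minus-men difference in means, and the intercept is the unweighted mean of the two group means.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  subject = 1:6,
  gender  = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7),
  fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8),
  fc = c(9, 8, 8, 9, 7, 8)
)
d6$avg <- with(d6, (ms + mc + fs + fc) / 4)
d6$genderC <- -.5 * (d6$gender == "M") + .5 * (d6$gender == "F")

fit <- lm(avg ~ genderC, data = d6)
group_means <- tapply(d6$avg, d6$gender, mean)

# Slope: women's mean minus men's mean
c(slope = coef(fit)[["genderC"]],
  diff  = group_means[["F"]] - group_means[["M"]])

# Intercept: unweighted mean of the two group means
c(intercept     = coef(fit)[["(Intercept)"]],
  mean_of_means = mean(group_means))
```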

---

# Build Models

For each within-participants effect, we use its corresponding difference score as an outcome.

For each between-participants effect, we use its contrast code set as predictors.

For the next four models, we will interpret the intercept and slope (we will interpret even without significance as an exercise).
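
The recipe can be sketched as a loop. Assumptions: this uses base R's `lm()` and `coef()` rather than `mcSummary()`, and it runs on only the six rows shown by `head(d)`, so the estimates will not match the output on the following slides. The names `d6`, `outcomes`, and `coef_table` are illustrative.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  subject = 1:6,
  gender  = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7),
  fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8),
  fc = c(9, 8, 8, 9, 7, 8)
)

# Within-participant outcomes and the between-participant contrast code
d6 <- within(d6, {
  tgender <- -ms - mc + fs + fc
  type    <- -ms + mc - fs + fc
  gxt     <-  ms - mc - fs + fc
  avg     <- (ms + mc + fs + fc) / 4
  genderC <- -.5 * (gender == "M") + .5 * (gender == "F")
})

# One model per outcome, always with the same predictor
outcomes <- c("avg", "tgender", "type", "gxt")
fits <- lapply(outcomes, function(y) {
  lm(reformulate("genderC", response = y), data = d6)
})
names(fits) <- outcomes

# Rows: outcomes. Columns: intercept and genderC slope.
coef_table <- t(sapply(fits, coef))
coef_table
```

In each row, the intercept speaks to the within-participant effect and the slope to its interaction with participant gender.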

---

# Main Effect of Participant Gender

```r
mcSummary(lm(avg ~ genderC, data = d))
```

```
## $call
## lm(formula = avg ~ genderC, data = d)
## 
## $anova
##               SS df    MS EtaSq     F     p
## Model      0.031  1 0.031 0.016 0.095 0.768
## Error      1.969  6 0.328    NA    NA    NA
## Corr Total 2.000  7 0.286    NA    NA    NA
## 
## $extras
##        RMSE AdjEtaSq
## Model 0.573   -0.148
## 
## $coefficients
##               Est StErr      t  SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept) 6.375 0.203 31.478 325.125 0.994  NA  5.879   6.871 0.000
## genderC     0.125 0.405  0.309   0.031 0.016  NA -0.866   1.116 0.768
```

???

Intercept: average harassment rating across participants and vignettes.
Slope: Main effect of participant gender. No evidence participant gender mattered, but directionally women rated videos as having more harassment than men.

---

# Target Gender Model

```r
mcSummary(lm(tgender ~ genderC, data = d))
```

```
## $call
## lm(formula = tgender ~ genderC, data = d)
## 
## $anova
##              SS df     MS EtaSq     F     p
## Model      12.5  1 12.500 0.184 1.351 0.289
## Error      55.5  6  9.250    NA    NA    NA
## Corr Total 68.0  7  9.714    NA    NA    NA
## 
## $extras
##        RMSE AdjEtaSq
## Model 3.041    0.048
## 
## $coefficients
##              Est StErr      t SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept)  5.5 1.075  5.115  242.0 0.813  NA  2.869   8.131 0.002
## genderC     -2.5 2.151 -1.162   12.5 0.184  NA -7.762   2.762 0.289
```

???

Intercept: the female-target-minus-male-target difference is positive; thus, participants rated harassment as worse when women were the targets than when men were the targets. Main effect of target gender.

Slope: Not significant. But directionally, this shows that the target-gender difference was lower for women than for men. That is, men showed a greater tendency to rate harassment as worse when women were targets than when men were targets.

---

# Harassment Type Model

```r
mcSummary(lm(type ~ genderC, data = d))
```

```
## $call
## lm(formula = type ~ genderC, data = d)
## 
## $anova
##              SS df     MS EtaSq     F     p
## Model      40.5  1 40.500 0.779 21.13 0.004
## Error      11.5  6  1.917    NA    NA    NA
## Corr Total 52.0  7  7.429    NA    NA    NA
## 
## $extras
##        RMSE AdjEtaSq
## Model 1.384    0.742
## 
## $coefficients
##              Est StErr      t SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept)  2.5 0.489  5.108   50.0 0.813  NA  1.302   3.698 0.002
## genderC     -4.5 0.979 -4.597   40.5 0.779  NA -6.895  -2.105 0.004
```

???

Intercept: Main effect of harassment type. Cruel harassment was rated as worse than sexual harassment.

Slope: Interaction between harassment type and participant gender. Men showed a greater tendency than women to rate cruel harassment type as worse.

---

# Type by Target-Gender Model

```r
mcSummary(intMod <- lm(gxt ~ genderC, data = d))
```

```
## $call
## lm(formula = gxt ~ genderC, data = d)
## 
## $anova
##              SS df    MS EtaSq     F     p
## Model       4.5  1 4.500 0.132 0.915 0.376
## Error      29.5  6 4.917    NA    NA    NA
## Corr Total 34.0  7 4.857    NA    NA    NA
## 
## $extras
##        RMSE AdjEtaSq
## Model 2.217   -0.012
## 
## $coefficients
##              Est StErr      t SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept) -2.0 0.784 -2.551   32.0 0.520  NA -3.918  -0.082 0.043
## genderC      1.5 1.568  0.957    4.5 0.132  NA -2.337   5.337 0.376
```

---

# Type by Target-Gender Interpretation

This one's hard, so let's do it more thoroughly.

```r
mcSummary(intMod)$coefficients
```

```
##              Est StErr      t SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept) -2.0 0.784 -2.551   32.0 0.520  NA -3.918  -0.082 0.043
## genderC      1.5 1.568  0.957    4.5 0.132  NA -2.337   5.337 0.376
```

So the intercept is **negative** in sign and significant! Let's decompose.

---

# Type by Target-Gender Interpretation

Here was our difference score:

```r
d <- within(d, {
  tgender <- -ms - mc + fs + fc
  type <-   -ms + mc - fs + fc
* gxt <-     ms - mc - fs + fc
})
```

Another way to write this:

`$$(men - women)_{sexual} - (men - women)_{cruel}$$`

So, our negative coefficient says that the male-minus-female target difference is *more negative* when sexual harassment is viewed than when cruel harassment is viewed.

This is hard to interpret, so let's compute simple effects and graph our results.

---

### Simple Effects

We want to understand the type by target-gender interaction.

```r
d <- within(d, {
  # harassment-type effect (cruel minus sexual) for male targets
  type_m <- mc - ms
  # harassment-type effect (cruel minus sexual) for female targets
  type_f <- fc - fs
})
```
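
How do the simple effects relate to the earlier difference scores? A quick check on the six rows shown by `head(d)` (`d6` is a made-up name): the two simple effects sum to `type`, and their difference is `gxt`. Because the models all use the same predictor, the same identities should hold for their intercepts and slopes.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  subject = 1:6,
  gender  = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7),
  fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8),
  fc = c(9, 8, 8, 9, 7, 8)
)

d6 <- within(d6, {
  type   <- -ms + mc - fs + fc
  gxt    <-  ms - mc - fs + fc
  type_m <- mc - ms   # cruel minus sexual, male targets
  type_f <- fc - fs   # cruel minus sexual, female targets
})

# Main effect of type = sum of the two simple effects
sum_matches <- with(d6, all(type == type_m + type_f))

# Interaction = female-target effect minus male-target effect
diff_matches <- with(d6, all(gxt == type_f - type_m))

c(sum_matches = sum_matches, diff_matches = diff_matches)
```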

---

### Simple Effects

```r
mcSummary(lm(type_m ~ genderC, data = d))$coefficients
```

```
##               Est StErr      t SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept)  2.25 0.489  4.597   40.5 0.779  NA  1.052   3.448 0.004
## genderC     -3.00 0.979 -3.065   18.0 0.610  NA -5.395  -0.605 0.022
```

```r
mcSummary(lm(type_f ~ genderC, data = d))$coefficients
```

```
##               Est StErr      t SSR(3) EtaSq tol CI_2.5 CI_97.5     p
## (Intercept)  0.25 0.433  0.577    0.5 0.053  NA -0.810   1.310 0.585
## genderC     -1.50 0.866 -1.732    4.5 0.333  NA -3.619   0.619 0.134
```

Which parameter are we interpreting?

Intercepts! Larger (and significant) for male targets, smaller (and non-significant) for female targets.

Sexual harassment is seen as worse for women than men. This is less true of cruel harassment; it's seen as bad for both.

Equivalently: harassment is seen as worse when it is cruel than when it is sexual, but this is more true for targets who are male than female.

---

Plot setup code.

```r
yMaleM <- c(
  mean(d$ms[d$gender=="M"]),
  mean(d$mc[d$gender=="M"])
)
yFemaleM <- c(
  mean(d$fs[d$gender=="M"]),
  mean(d$fc[d$gender=="M"])
)

yMaleF <- c(
  mean(d$ms[d$gender=="F"]),
  mean(d$mc[d$gender=="F"])
)
yFemaleF <- c(
  mean(d$fs[d$gender=="F"]),
  mean(d$fc[d$gender=="F"])
)

xs <- c(
  -1, +1
)

mCol <- "#1f78b4"
fCol <- "#b2df8a"
```
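
The same cell means can be computed in one call. A sketch using base R's `aggregate()` on the six rows shown by `head(d)`, so the means differ from those in the full data; `d6` and `cell_means` are illustrative names.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  subject = 1:6,
  gender  = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7),
  fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8),
  fc = c(9, 8, 8, 9, 7, 8)
)

# Mean rating in each video condition, split by participant gender
cell_means <- aggregate(cbind(ms, mc, fs, fc) ~ gender, data = d6, FUN = mean)
rownames(cell_means) <- cell_means$gender

# Pull out the same vectors the setup code builds by hand
yMaleM   <- unlist(cell_means["M", c("ms", "mc")])
yFemaleM <- unlist(cell_means["M", c("fs", "fc")])
yMaleF   <- unlist(cell_means["F", c("ms", "mc")])
yFemaleF <- unlist(cell_means["F", c("fs", "fc")])

cell_means
```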

---

.pull-left[

```r
#Plot for harassment ratings by male participants
#Creating empty plot
plot(xs, yMaleM, type = 'n', 
     xlim = c(-1.5, 1.5), 
     ylim = c(1.5, 10),
     ylab = "Harassment Rating", 
     xlab = "Harassment Type",
     main = "Harassment Ratings 
        by Male Participants",
     axes = F)
```

Notice the `type='n'` argument. This makes R set up a plot without drawing any of the data yet. We will add it later.

Also notice the `axes = F` argument. This turns off the default axes (which would be numbers) so that we can construct our own in a specific way later.
]
.pull-right[
![](AnalysisExamples_files/figure-html/unnamed-chunk-17-1.png)
]

---
.pull-left[

```r
#Making axis labels
axis(side = 1, at = c(-1, 1), 
     labels = c("Sexual", "Cruel"))
axis(side = 2)
```
]

.pull-right[
![](AnalysisExamples_files/figure-html/unnamed-chunk-19-1.png)
]

---

.pull-left[

```r
#Making axis labels
axis(side = 1, at = c(-1, 1), 
     labels = c("Sexual", "Cruel"))
axis(side = 2)
```

```r
#Plot lines
lines(xs, yMaleM, col = mCol, lwd = 3)
lines(xs, yFemaleM, col = fCol, lwd = 3)
```
]
.pull-right[
![](AnalysisExamples_files/figure-html/unnamed-chunk-22-1.png)
]

---

.pull-left[

```r
#Making axis labels
axis(side = 1, at = c(-1, 1), 
     labels = c("Sexual", "Cruel"))
axis(side = 2)
```

```r
#Plot lines
lines(xs, yMaleM, col = mCol, lwd = 3)
lines(xs, yFemaleM, col = fCol, lwd = 3)
```

```r
#Legend
legend(x = 'bottomright', 
       legend = c("Male Target", 
                  "Female Target"), 
       lty=1,
       col=c(mCol, fCol))
```
]
.pull-right[
![](AnalysisExamples_files/figure-html/unnamed-chunk-26-1.png)
]

---

.pull-left[

```r
#Plot for ratings by female participants
#Creating empty plot
plot(xs, yFemaleF, type = 'n', 
     xlim = c(-1.5, 1.5), 
     ylim = c(1.5, 10),
     ylab = "Harassment Rating", 
     xlab = "Harassment Type",
     main = "Harassment Ratings 
        by Female Participants",
     axes = F)
#Axis labels
axis(side = 1, at = c(-1, 1), 
     labels = c("Sexual", "Cruel"))
axis(side = 2)

#Plot lines
lines(xs, yMaleF, col = mCol, 
      lwd = 3)
lines(xs, yFemaleF, col = fCol, 
      lwd = 3)

#legend
legend(x = 'bottomright', 
       legend = c("Male Target", 
                  "Female Target"), 
       lty=1,
       col=c(mCol, fCol))
```
]
.pull-right[
![](AnalysisExamples_files/figure-html/unnamed-chunk-28-1.png)

]

---

```r
op <- par(mfrow=c(2,1))
ylims <- c(1.5, 10)

plot(xs, yMaleM, type = 'n', xlim = c(-1.5, 1.5), ylim = ylims,
     ylab = "Harassment Rating", xlab = "Harassment Type",
     main = "Harassment Ratings by Male Participants",
     axes = F)
axis(side = 1, at = c(-1, 1), labels = c("Sexual", "Cruel"))
axis(side = 2)
lines(xs, yMaleM, col = mCol, lwd = 3)
lines(xs, yFemaleM, col = fCol, lwd = 3)
legend(x = 'bottomright', legend = c("Male Target", "Female Target"), lty=1, 
       col=c(mCol, fCol))

plot(xs, yFemaleF, type = 'n', xlim = c(-1.5, 1.5), ylim = ylims,
     ylab = "Harassment Rating", xlab = "Harassment Type",
     main = "Harassment Ratings by Female Participants",
     axes = F)
axis(side = 1, at = c(-1, 1), labels = c("Sexual", "Cruel"))
axis(side = 2)
lines(xs, yMaleF, col = mCol, lwd = 3)
lines(xs, yFemaleF, col = fCol, lwd = 3)
legend(x = 'bottomright', legend = c("Male Target", "Female Target"), lty=1, 
       col=c(mCol, fCol))

par(op)
```
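
The two panels share almost all of their code, so one option is a small helper. This is only a sketch: `plot_panel()` is a hypothetical function, not part of the original lab code. It labels the x-axis by harassment type because the tick labels are Sexual and Cruel, and it is run here on means from the six rows shown by `head(d)`.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  gender = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7), fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8), fc = c(9, 8, 8, 9, 7, 8)
)

# Hypothetical helper: draws one panel of the interaction plot
plot_panel <- function(yMale, yFemale, title, xs = c(-1, 1),
                       mCol = "#1f78b4", fCol = "#b2df8a") {
  plot(xs, yMale, type = "n", xlim = c(-1.5, 1.5), ylim = c(1.5, 10),
       ylab = "Harassment Rating", xlab = "Harassment Type",
       main = title, axes = FALSE)
  axis(side = 1, at = xs, labels = c("Sexual", "Cruel"))
  axis(side = 2)
  lines(xs, yMale, col = mCol, lwd = 3)    # male targets
  lines(xs, yFemale, col = fCol, lwd = 3)  # female targets
  legend("bottomright", legend = c("Male Target", "Female Target"),
         lty = 1, col = c(mCol, fCol))
  invisible(NULL)
}

# Condition means for male and female participants
m <- colMeans(d6[d6$gender == "M", c("ms", "mc", "fs", "fc")])
f <- colMeans(d6[d6$gender == "F", c("ms", "mc", "fs", "fc")])

op <- par(mfrow = c(2, 1))
plot_panel(m[c("ms", "mc")], m[c("fs", "fc")],
           "Harassment Ratings by Male Participants")
plot_panel(f[c("ms", "mc")], f[c("fs", "fc")],
           "Harassment Ratings by Female Participants")
par(op)
```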

---

![](AnalysisExamples_files/figure-html/unnamed-chunk-30-1.png)

---

### Type by T-Gender by P-Gender Interpretation

```r
mcSummary(intMod)$coefficients
```

The `genderC` slope is not significant, but we want to understand it.

First, NAME the 2-way target-gender by type effect: **the harassment dichotomy**

Then, let's understand how it varies by participant gender.

Positive coefficient: the harassment dichotomy is *more positive* for women than for men.

Since our baseline effect is negative in sign, the harassment dichotomy effect is less true for women than for men.

So: men see sexual harassment directed at women as especially severe, but cruel harassment is bad for both. Women don't see so much difference; harassment is rated worse when directed at women than at men, but it does not matter so much what kind of harassment is occurring.

---

# Write Up

Severity ratings were submitted to a 2 (target gender: male vs. female) x 2 (harassment type: sexual vs. cruel) x 2 (participant gender: male vs. female) ANOVA with the first two factors varying within participants and the last varying between participants. There was no evidence that ratings varied as a function of participant gender, `$t(6) = 0.31$`, `$PRE = .016$`, `$p=.77$`. A main effect of target gender showed that harassment was rated as more severe when women were targets than when men were targets, `$t(6)=5.12$`, `$PRE=.813$`, `$p=.002$`. Participants rated cruel harassment as significantly more severe than sexual harassment on average, `$t(6)=5.11$`, `$PRE=.813$`, `$p=.002$`. These main effects were qualified by a significant harassment-type by target-gender interaction, `$t(6) = -2.55$`, `$PRE=.520$`, `$p=.043$`. For male targets, cruel harassment was rated as significantly more severe than sexual harassment, `$t(6)=4.60$`, `$PRE=.779$`, `$p=.004$`. In contrast, no harassment-type effect emerged for female targets, `$t(6)=0.58$`, `$PRE=.053$`, `$p=.585$`.

These results suggest that harassment type matters, but only for male targets. Men who are sexually harassed are perceived as less victimized than men who are subjected to cruel treatment, but both kinds of harassment are rated as similarly severe for women.

Remember that all these data are fake, so don't read into this.
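
Where do the PRE values come from? For a single-df test they can be recovered from the t statistic and the error degrees of freedom, `$PRE = t^2 / (t^2 + df)$`. A quick check against the values reported on the slides (the vector names are illustrative):

```r
# t statistics as reported in the model output on the slides
t_vals <- c(
  participant_gender  =  0.309,
  target_gender       =  5.115,
  harassment_type     =  5.108,
  type_by_target      = -2.551,
  type_male_targets   =  4.597,
  type_female_targets =  0.577
)
df_error <- 6  # error df for every model above

# PRE (eta-squared) for a single-df test
pre <- t_vals^2 / (t_vals^2 + df_error)
round(pre, 3)

# EtaSq values reported in the same output, for comparison
reported <- c(.016, .813, .813, .520, .779, .053)
max(abs(pre - reported))
```

The largest discrepancy should be rounding error only.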

---

Here is some code for a ggplot version. We need the `reshape2` package for `melt()`.

```r
library(reshape2)
```

```
## 
## Attaching package: 'reshape2'
```

```
## The following object is masked from 'package:tidyr':
## 
##     smiths
```

---

```r
#transforming the data into long form
names(d)
dlong <- melt(d,
              id.vars = c("subject", "gender"), #sample unit, between factor
              measure.vars = c("ms", "fs", "mc", "fc"),
              variable.name = "condition",
              value.name = "rating")
dlong$target_gender <- substr(dlong$condition, 1, 1) #target gender: "m" or "f"
dlong$target_condition <- substr(dlong$condition, 2, 2) #harassment type: "s" or "c"
dlong$subject_gender <- dlong$gender

#ggplot line graph
#(no summary function is supplied, so stat_summary() defaults to mean_se())
ggplot(dlong, aes(target_gender, rating,
                  group = target_condition, color = target_condition)) +
  stat_summary(geom = 'line') +
  stat_summary(geom = 'linerange') +
  facet_grid(subject_gender ~ ., labeller = label_both) +
  theme(legend.position = 'bottom')

#ggplot bar plot

# x: within factor (condition: target gender x harassment type)
# fill: within factor (target gender)
# y: DV (mean rating)
# Note: `fun` needs ggplot2 >= 3.3.0; older versions use `fun.y`
p <- ggplot(dlong) +
  geom_bar(aes(x = condition, y = rating, fill = target_gender),
           position = "dodge", stat = "summary", fun = "mean")
p + theme_classic()
```
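
If `reshape2` is not available, the same long format can be built in base R. A sketch using `stack()` on the six rows shown by `head(d)`; the names `d6` and `dlong6` are illustrative, and this is an alternative to the `melt()` call, not the lab's own code.

```r
# First six rows of d, copied from the head(d) table
d6 <- data.frame(
  subject = 1:6,
  gender  = c("M", "M", "M", "M", "F", "F"),
  ms = c(3, 4, 3, 1, 5, 7),
  fs = c(7, 8, 6, 9, 7, 7),
  mc = c(6, 7, 7, 6, 4, 8),
  fc = c(9, 8, 8, 9, 7, 8)
)

# stack() puts the four rating columns end to end (values + ind),
# so the id columns are repeated once per rating column
measures <- c("ms", "fs", "mc", "fc")
dlong6 <- data.frame(
  subject = rep(d6$subject, times = length(measures)),
  gender  = rep(d6$gender, times = length(measures)),
  stack(d6[measures])
)
names(dlong6)[names(dlong6) == "values"] <- "rating"
names(dlong6)[names(dlong6) == "ind"] <- "condition"

# Split the condition label into its two within-participant factors
dlong6$target_gender    <- substr(as.character(dlong6$condition), 1, 1)
dlong6$target_condition <- substr(as.character(dlong6$condition), 2, 2)

# Cell means by participant gender and condition
cell_means <- tapply(dlong6$rating,
                     list(dlong6$gender, dlong6$condition), mean)
cell_means
```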

---

```
##  [1] "subject" "gender"  "ms"      "fs"      "mc"      "fc"      "avg"    
##  [8] "gxt"     "type"    "tgender" "genderC" "type_f"  "type_m"
```

```
## No summary function supplied, defaulting to `mean_se()
## No summary function supplied, defaulting to `mean_se()
## No summary function supplied, defaulting to `mean_se()
## No summary function supplied, defaulting to `mean_se()
```

![](AnalysisExamples_files/figure-html/unnamed-chunk-34-1.png)![](AnalysisExamples_files/figure-html/unnamed-chunk-34-2.png)