# RStats

Statistics in RStudio

Some commonly used statistics calculations for use in RStudio.

**Load data**

We’ll load some data on Tasmanian devils that records devil facial tumour disease (DFTD, a cancer) status and *Trypanosoma* presence. Both are coded so that 0 is negative and 1 is positive.

```
library(readr)
# Load data
tasdevil <- read_csv("data/tasdevil-parasite.csv")
```

## Basics

Structure of the data set

`str(tasdevil)`

How many had DFTD

`sum(tasdevil$DFTDStatus)`
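
Because the status columns are coded 0/1, the sum counts the positives and the mean gives the prevalence. Here is a minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not the real `tasdevil` data):

```r
# Hypothetical toy data standing in for tasdevil (values are invented)
toy <- data.frame(
  Sex        = c("F", "F", "M", "M", "M"),
  DFTDStatus = c(0, 1, 1, 0, 1)
)

n_positive <- sum(toy$DFTDStatus)    # number of positive animals
prevalence <- mean(toy$DFTDStatus)   # proportion of positive animals

table(toy$DFTDStatus)                # counts of 0s and 1s
n_positive
prevalence
```

If the real column contains missing values, `sum()` and `mean()` return `NA` unless you add `na.rm = TRUE`.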

## Proportion and CIs

**Load libraries**

```
# Install first if you need
#
# install.packages("DescTools")
# install.packages("PropCIs")
# install.packages("binom")
# install.packages("rcompanion")
# install.packages("tidyverse")
library(DescTools)
library(PropCIs)
library(binom)
library(rcompanion)
library(tidyverse)
```

Calculate *Trypanosoma* prevalence for males and females, with 95% CIs, so the two sexes can be compared. Because the data are coded 0/1, the group mean is the prevalence.

```
groupwiseMean(TrypStatus ~ Sex,
data = tasdevil,
conf = 0.95,
digits = 3)
```
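
As far as we can tell from the `rcompanion` documentation, the `Trad.lower` and `Trad.upper` columns are the traditional t-based interval for a mean. The sketch below reproduces that calculation by hand in base R on invented values (`x` is hypothetical), so treat it as an illustration rather than the package's exact code.

```r
# Hypothetical 0/1 infection status for one group (invented values)
x <- c(1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

n    <- length(x)
m    <- mean(x)                 # prevalence
se   <- sd(x) / sqrt(n)         # standard error of the mean
crit <- qt(0.975, df = n - 1)   # t critical value for a 95% CI

ci_manual <- c(lower = m - crit * se, upper = m + crit * se)
ci_manual

# The same interval from base R's one-sample t-test
ci_ttest <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)
ci_ttest
```

A t-based interval can run below 0 or above 1 when the sample is small or the prevalence is extreme, which is one reason to prefer a binomial method for proportions.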

We could plot the values above using this:

```
# Save values to a data.frame
CI <- groupwiseMean(TrypStatus ~ Sex,
                    data = tasdevil,
                    conf = 0.95,
                    digits = 3)
# Plot with ggplot(); qplot() is deprecated in recent versions of ggplot2
# width is a fixed setting, so it goes outside aes()
ggplot(CI, aes(x = Sex, y = Mean, shape = Sex)) +
  geom_point(size = 2.5) +
  geom_errorbar(aes(ymin = Trad.lower, ymax = Trad.upper),
                width = 0.15) +
  theme_bw() +
  ylim(0, 1)
```

Calculate *Trypanosoma* prevalence for males and females at each of the 4 sites, with 95% CIs

```
groupwiseMean(TrypStatus ~ Sex + Site_code,
data = tasdevil,
conf = 0.95,
digits = 3)
```
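
Splitting by two factors can leave very few animals in each group, so it helps to check the group sizes alongside the prevalences. Here is a base R sketch on invented data (`toy`, its site labels and its values are all hypothetical):

```r
# Hypothetical toy data: two sexes at two sites (values are invented)
toy <- data.frame(
  Sex        = c("F", "F", "M", "M", "F", "M", "F", "M"),
  Site_code  = c("A", "A", "A", "A", "B", "B", "B", "B"),
  TrypStatus = c(1, 0, 1, 1, 0, 0, 1, 0)
)

# Prevalence (mean of the 0/1 column) for each Sex x Site combination
prev <- aggregate(TrypStatus ~ Sex + Site_code, data = toy, FUN = mean)

# Number of animals behind each prevalence estimate
size <- aggregate(TrypStatus ~ Sex + Site_code, data = toy, FUN = length)

prev
size
```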

#### Simple stuff

If you only have summary counts, for example 7 positives out of a sample size of 21, here are some simple bits of code. The first tests whether the observed proportion differs from 0.5.

```
binom.test(7, 21,
0.5,
alternative="two.sided",
conf.level=0.95)
```
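
`binom.test()` returns a list, so the individual pieces can be pulled out for reporting. Here is a short sketch using the same counts (`res` is just an illustrative name):

```r
# Exact binomial test: 7 positives out of 21, tested against a proportion of 0.5
res <- binom.test(7, 21, p = 0.5,
                  alternative = "two.sided",
                  conf.level = 0.95)

res$estimate   # observed proportion (7/21)
res$p.value    # two-sided p-value against p = 0.5
res$conf.int   # exact (Clopper-Pearson) 95% confidence interval
```

The 0.5 is only the null value being tested. It changes the p-value but not the estimate or the confidence interval.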

Now we’ll calculate the 95% CIs using the Jeffreys method.

```
BinomCI(7, 21,
conf.level=0.95,
method="jeffreys")
```
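
The Jeffreys interval is the equal-tailed interval of a Beta(x + 0.5, n - x + 0.5) distribution, so it can be sketched in base R with `qbeta()`. This is a hand calculation for counts away from 0 and n, not the `DescTools` source code, and the variable names are only illustrative.

```r
x <- 7    # number of positives
n <- 21   # sample size

# Equal-tailed 95% interval from the Beta(x + 0.5, n - x + 0.5) distribution
jeffreys <- qbeta(c(0.025, 0.975),
                  shape1 = x + 0.5,
                  shape2 = n - x + 0.5)

c(estimate = x / n, lower = jeffreys[1], upper = jeffreys[2])
```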

### Odds ratio & Relative risk

Using the `epitools` package (see the package manual).

*Reminder:* If you need more information on the tests, use the help command in the console (e.g. `?riskratio`, `?oddsratio`).

**Library**

```
library(epitools)
# if you don't have this package, first install using `install.packages("epitools")`
```

Create a simple 2 x 2 table of counts (stored here as a matrix). In this case we’ll test the effect of sex on parasite presence with a simple positive/negative summary. Of course, if you have your raw data in a spreadsheet you could make your own by summarising the relevant information into a table. (Need help tidying and summarising your data? Check out this tutorial to get you hooked on the `dplyr` and `tidyr` packages.)

```
factor1 <- c("Female", "Male")
factor2 <- c("Positive", "Negative")
dat <- matrix(c(16, 30, 15, 34), nrow = 2, ncol = 2, byrow = TRUE)
dimnames(dat) <- list("Sex" = factor1, "Parasite present" = factor2)
```

Your table should look like this:

`dat`
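
If you are starting from raw 0/1 data rather than counts, a table of the same shape can be built with `table()`. Here is a sketch on invented data (`raw` and its columns are hypothetical); the factor levels are set explicitly so the rows and columns come out in the same order as `dat`.

```r
# Hypothetical raw data: one row per animal (values are invented)
raw <- data.frame(
  Sex    = c("Female", "Female", "Male", "Male", "Male", "Female"),
  Status = c(1, 0, 1, 0, 0, 1)
)

# Fix the level order so rows are Female, Male and columns are Positive, Negative
raw$Sex      <- factor(raw$Sex, levels = c("Female", "Male"))
raw$Parasite <- factor(raw$Status, levels = c(1, 0),
                       labels = c("Positive", "Negative"))

tab <- table(raw$Sex, raw$Parasite, dnn = c("Sex", "Parasite present"))
tab
```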

Now let’s calculate our **odds ratio**

`oddsratio(dat)`

and **relative risk**

`riskratio(dat)`
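
It is worth checking the direction of these ratios against a hand calculation. As far as we understand the `epitools` documentation, both functions treat the first row and the first column as the reference levels, so with `Positive` in the first column the reported ratios may describe the `Negative` outcome unless the table is reordered (both functions have a `rev` argument for this). The base R sketch below computes the male versus female ratios for a *positive* result directly from the counts; the variable names are only illustrative.

```r
# Same counts as above: rows are Sex, columns are parasite status
dat <- matrix(c(16, 30, 15, 34), nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("Female", "Male"),
                              "Parasite present" = c("Positive", "Negative")))

# Risk of being positive in each sex, and the male:female risk ratio
risk <- dat[, "Positive"] / rowSums(dat)
rr_male_vs_female <- risk["Male"] / risk["Female"]

# Odds of being positive in each sex, and the male:female odds ratio
odds <- dat[, "Positive"] / dat[, "Negative"]
or_male_vs_female <- odds["Male"] / odds["Female"]

rr_male_vs_female
or_male_vs_female
```

The default estimate from `oddsratio()` is a median-unbiased one, so it can differ slightly from this simple cross-product value.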

## Modelling

Manual here and webpage