As always it’s only fair links go first
Some commonly used statistics calculations for use in RStudio.
Load data
We’ll load some data on Tasmanian devils and presence of DFTD (cancer) and Trypanosoma presence. The data is coded so that 0 is negative and 1 is positive.
library(readr)
# Load data
tasdevil <- read_csv("data/tasdevil-parasite.csv")
Sructure of data set
str(tasdevil)
## tibble [95 × 17] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ Microchip : num [1:95] 9.82e+14 9.82e+14 9.82e+14 9.82e+14 9.82e+14 ...
## $ Site_name : chr [1:95] "Takone" "Takone" "Takone" "Takone" ...
## $ Site_code : chr [1:95] "TKN" "TKN" "TKN" "TKN" ...
## $ Sex : chr [1:95] "Male" "Female" "Male" "Male" ...
## $ YOB : num [1:95] 2017 2016 2017 2016 2013 ...
## $ BloodNumber : chr [1:95] "TKN-215" "TKN-178" "TKN-213" "TKN-192" ...
## $ TrappingDate: chr [1:95] "May-18" "May-18" "May-18" "May-18" ...
## $ DFTDScore : num [1:95] 1 5 1 5 1 1 1 5 1 1 ...
## $ DFTDStatus : num [1:95] 0 1 0 1 0 0 0 1 0 0 ...
## $ Blood_lid_id: chr [1:95] "A1" "A10" "A2" "A3" ...
## $ TrypStatus : num [1:95] 0 0 1 0 0 1 0 0 0 0 ...
## $ TrypCope : num [1:95] 0 0 1 0 0 1 0 0 0 0 ...
## $ TrypNov : num [1:95] 0 0 0 0 0 0 0 0 0 0 ...
## $ TrypSpp : chr [1:95] "Neg" "Neg" "T.cope" "Neg" ...
## $ TickStatus : num [1:95] 1 1 1 1 1 1 1 0 1 1 ...
## $ TickNos : num [1:95] 1 1 1 1 1 1 1 0 1 1 ...
## $ TickSpp. : chr [1:95] "ITaF" "ITaF" "ITaF" "ITaF" ...
## - attr(*, "spec")=
## .. cols(
## .. Microchip = col_double(),
## .. Site_name = col_character(),
## .. Site_code = col_character(),
## .. Sex = col_character(),
## .. YOB = col_double(),
## .. BloodNumber = col_character(),
## .. TrappingDate = col_character(),
## .. DFTDScore = col_double(),
## .. DFTDStatus = col_double(),
## .. Blood_lid_id = col_character(),
## .. TrypStatus = col_double(),
## .. TrypCope = col_double(),
## .. TrypNov = col_double(),
## .. TrypSpp = col_character(),
## .. TickStatus = col_double(),
## .. TickNos = col_double(),
## .. TickSpp. = col_character()
## .. )
How many had DFTD
sum(tasdevil$DFTDStatus)
## [1] 21
Load libraries
# Install first if you need
#
# install.packages("DescTools")
# install.packages("PropCIs")
# install.packages("binom")
# install.package("rcompanion")
# install.package("tidyverse")
library(DescTools)
library(PropCIs)
library(binom)
library(rcompanion)
library(tidyverse)
Calculate different in Trypanosoma prevalence between males and females with 95% CIs
groupwiseMean(TrypStatus ~ Sex,
data = tasdevil,
conf = 0.95,
digits = 3)
## Sex n Mean Conf.level Trad.lower Trad.upper
## 1 Female 46 0.348 0.95 0.205 0.491
## 2 Male 49 0.306 0.95 0.172 0.440
We could plot the values above using this…
#save values to a data.frame
CI <- groupwiseMean(TrypStatus ~ Sex,
data = tasdevil,
conf = 0.95,
digits = 3)
#plot
qplot(x= Sex,
y = Mean,
data = CI,
shape= Sex) +
geom_point(size=2.5) +
geom_errorbar(aes(
ymin = Trad.lower,
ymax = Trad.upper,
width = 0.15)) + theme_bw() + ylim(0,1)
Calculate different in Trypanosoma prevalence between males and females and 4 different sites with 95% CIs
groupwiseMean(TrypStatus ~ Sex + Site_code,
data = tasdevil,
conf = 0.95,
digits = 3)
## Sex Site_code n Mean Conf.level Trad.lower Trad.upper
## 1 Female BRI 16 0.438 0.95 0.1640 0.711
## 2 Female FNP 12 0.000 0.95 0.0000 0.000
## 3 Female TKN 10 0.200 0.95 -0.1020 0.502
## 4 Female WPP 8 0.875 0.95 0.5790 1.170
## 5 Male BRI 15 0.267 0.95 0.0132 0.520
## 6 Male FNP 15 0.200 0.95 -0.0293 0.429
## 7 Male TKN 13 0.308 0.95 0.0174 0.598
## 8 Male WPP 6 0.667 0.95 0.1250 1.210
If you need here are some simple bits of code where you have basic numbers such as…7 positive out of sample size of 21.
binom.test(7, 21,
0.5,
alternative="two.sided",
conf.level=0.95)
##
## Exact binomial test
##
## data: 7 and 21
## number of successes = 7, number of trials = 21, p-value = 0.1892
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.1458769 0.5696755
## sample estimates:
## probability of success
## 0.3333333
Now we’ll calculate the 95% CIs using the Jeffreys method.
BinomCI(7, 21,
conf.level=0.95,
method="jeffreys")
## est lwr.ci upr.ci
## [1,] 0.3333333 0.1632227 0.5459686
Using epitools - manual here
Reminder: If you need more information on the tests use the help command in the console (e.g. ?riskratio
, ?oddsratio
).
Library
library(epitools)
# if you don't have this package, first install using `install.packages("epitools")`
Create a simple dataframe. In this case we’ll test effect of gender on parasite presence with a simple positive/negative summary. Of course if you have a your raw data in a spreadsheet you could make your own by summarising the releavnt information into a dataframe. (Need help tidying and summarising your data…check out this tutorial to check you hooked on the dplyr
and tidyr
packages
factor1 <- c("Female", "Male")
factor2 <- c("Positive", "Negative")
dat <- matrix(c(16, 30, 15, 34), nrow = 2, ncol = 2, byrow = TRUE)
dimnames(dat) <- list("Sex" = factor1, "Parasite present" = factor2)
Your dataframe should look like this
dat
## Parasite present
## Sex Positive Negative
## Female 16 30
## Male 15 34
Now lets calculate our odds ratio
oddsratio(dat)
## $data
## Parasite present
## Sex Positive Negative Total
## Female 16 30 46
## Male 15 34 49
## Total 31 64 95
##
## $measure
## odds ratio with 95% C.I.
## Sex estimate lower upper
## Female 1.000000 NA NA
## Male 1.205696 0.506045 2.888357
##
## $p.value
## two-sided
## Sex midp.exact fisher.exact chi.square
## Female NA NA NA
## Male 0.6721372 0.8269438 0.6648308
##
## $correction
## [1] FALSE
##
## attr(,"method")
## [1] "median-unbiased estimate & mid-p exact CI"
and relative risk
riskratio(dat)
## $data
## Parasite present
## Sex Positive Negative Total
## Female 16 30 46
## Male 15 34 49
## Total 31 64 95
##
## $measure
## risk ratio with 95% C.I.
## Sex estimate lower upper
## Female 1.000000 NA NA
## Male 1.063946 0.8030739 1.409559
##
## $p.value
## two-sided
## Sex midp.exact fisher.exact chi.square
## Female NA NA NA
## Male 0.6721372 0.8269438 0.6648308
##
## $correction
## [1] FALSE
##
## attr(,"method")
## [1] "Unconditional MLE & normal approximation (Wald) CI"
Manual here and webpage
Work by Siobhon Egan
siobhon.egan@murdoch.edu.au