Durga
is an R package that aims to simplify the
estimation and visualisation of one type of effect size: differences in
group means. The package consists of two main functions and several
auxiliary functions. The DurgaDiff
function estimates raw
or standardised mean differences, while DurgaPlot
will
create a plot for visualising both the original data and the effect
size, and provides an extensive range of display options. The function
DurgaBrackets
can add confidence brackets to an
existing Durga plot. Since Durga
is implemented in base R,
Durga plots can be positioned using the standard R plotting layout
tools: par(mfrow = c(...))
, layout()
and
split.screen()
.
This vignette shows some example Durga plots. Each plot is preceded by the code used to generate it, which can be used to learn how to create the plot.
Before plotting your data, you must estimate the differences in group
means by calling DurgaDiff
. Your raw data should be in a
data frame (or similar) in either long or wide format.
A long format data frame contains one observation per row, a column for
the measurement or value being analysed, and another column for the
group or treatment. In addition, paired or repeated measures data
requires an ID column to match measurements for the same
specimen/individual. For example, within the insulin
data
set, the value being measured - blood sugar level - is recorded in the
column sugar
(specified in the DurgaDiff
call
by passing the argument data.col = "sugar"
); treatments are
"before"
and "after"
in the
treatment
column (group.col = "treatment"
);
and the identity of the individual (rabbit) being tested is given in the
id
column (id.col = "id"
). The blood sugar
level of each rabbit was tested before and after treatment with insulin,
so for each rabbit, there is a row with the before
sugar
level, and another with the after
sugar level. Both long
and wide format can be used to represent paired and unpaired data,
however long format is most appropriate for unpaired data.
## sugar treatment id experimenter_time time
## 1 0.111 before 1 ECN FEB 16 NA
## 2 0.139 before 2 ECN APR 22 NA
## 3 0.129 before 3 ECN APR 22 NA
## 53 0.045 after 1 ECN FEB 16 35
## 54 0.075 after 2 ECN APR 22 45
## 55 0.056 after 3 ECN APR 22 60
# Calculate differences in group means
d <- DurgaDiff(insulin, data.col = "sugar", group.col = "treatment", id.col = "id")
d
## Bootstrapped effect size
## sugar ~ treatment
## Groups:
## mean median sd se CI.lower CI.upper n
## after 0.07038462 0.071 0.01785250 0.002475697 0.06545157 0.07519795 52
## before 0.12726923 0.127 0.01522397 0.002111185 0.12314735 0.13144397 52
## Paired Mean of differences (R = 1000, bootstrap CI method = bca):
## before - after: 0.0568846, 95% CI [0.0511501, 0.0629732]
When your data set is in long format, you may alternatively identify
your data with a formula. The formula interface allows a little more
flexibility when specifying your model, but the end results are
equivalent. The formula must be specified as
<data variable> ~ <group variable>
. The data
table must be in long format. For paired data, specify the
id
column using the id.col
argument.
## Bootstrapped effect size
## sugar ~ treatment
## Groups:
## mean median sd se CI.lower CI.upper n
## after 0.07038462 0.071 0.01785250 0.002475697 0.06549376 0.07500021 52
## before 0.12726923 0.127 0.01522397 0.002111185 0.12321154 0.13190010 52
## Paired Mean of differences (R = 1000, bootstrap CI method = bca):
## before - after: 0.0568846, 95% CI [0.0510858, 0.062979]
DurgaDiff also accepts data in wide format. A wide format
data frame contains a measurement column for each group. For example, an
experiment with a control and a treatment might comprise a column named
control
and another named treatment
. To pass
wide format data to DurgaDiff
, do not specify the
data.col
or group.col
arguments. Instead, you
must identify the groups in your data with the groups
argument. Each group must be the name of a column in the data frame.
Wide format may be a natural way to represent paired data, as each row
represents a specimen. Wide format is not as appropriate for unpaired
data, as a single row will contain measurements from two (or more)
unrelated specimens, although Durga will handle it. To use wide format
with unpaired data, specify id.col = NULL
. If your groups
have different sample sizes, you will also have to specify
na.rm = TRUE
.
## sugar.before sugar.after time experimenter date
## 1 0.111 0.045 35 ECN FEB 16
## 2 0.139 0.075 45 ECN APR 22
## 3 0.129 0.056 60 ECN APR 22
## 4 0.132 0.067 35 ECN APR 25
## 5 0.145 0.060 90 FGB APR 6
## 6 0.151 0.071 120 FGB APR 10
# Calculate differences in group means
(d <- DurgaDiff(insulin.wide, groups = c("sugar.before", "sugar.after")))
## Bootstrapped effect size
## value ~ group
## Groups:
## mean median sd se CI.lower CI.upper n
## sugar.before 0.12726923 0.127 0.01522397 0.002111185 0.12278846 0.13105769 52
## sugar.after 0.07038462 0.071 0.01785250 0.002475697 0.06526918 0.07448683 52
## Paired Mean of differences (R = 1000, bootstrap CI method = bca):
## sugar.after - sugar.before: -0.0568846, 95% CI [-0.0628107, -0.0511401]
Durga can estimate unstandardised or various standardised group
differences, which are specified using the DurgaDiff parameter
effect.type
. Chapter 11, Cohen’s d in Cumming (2012) is a
good reference for understanding standardised effect sizes and the
importance of choosing the appropriate standardiser. The online help for
DurgaDiff summarises the effect types that are built-in to Durga.
The simplest plot to create simply uses all default values. This plot shows raw data points displayed as individual data points, the distribution of data points within each group shown as half violins, and effect size to the right (for a single group comparison) or beneath the group data (for multiple comparisons).
The effect size is an estimate of the difference in means between two groups (or a similar statistic). It is shown with a symbol (a triangle by default) at the estimated value, and a thick line showing the confidence interval of the estimate. A filled half-violin shows the distribution of bootstrapped effect sizes, truncated at the top and bottom so it does not extend past the confidence interval.
If the data are paired (aka repeated measures; id.col
was specified in the DurgaDiff
call), then pair lines are
drawn to connect measurements for the same individual/specimen.
A DurgaPlot
data visualisation is a combination of one
or more visual components: data points
; violin
plots; bar
charts; box
plots; repeated measure
paired
lines; a mean or median indicator
(central.tendency
); and error bars
(error.bars
). Each component is controlled by a set of
parameters to the DurgaPlot
function. The names of the
parameters indicate the component they apply to, so to adjust the violin
plot component, read the documentation about all parameters with prefix
violin
. Effect size display is controlled by parameters
with prefix ef.size
. Parameters have sensible default
values, so for example, if you display bar charts
(bar = TRUE
), violin plots and the central tendency
indicator will be turned off. Obviously, you are able to override the
defaults if you wish. Refer to help for the DurgaPlot
function for full details on all arguments.
# Show two plots in the same figure, and increase margins so they don't look crowded
par(mfrow = c(1, 2), mar = c(5, 5, 4, 5) + 0.1)
DurgaPlot(d,
# Display box plots
box = TRUE,
# Don't display data points
points = FALSE,
# Don't display paired-value lines
paired = FALSE,
main = "Box plot")
par(mar = c(5, 5, 4, 1) + 0.1)
DurgaPlot(d,
# Display bar charts
bar = TRUE,
# Position effect size below
ef.size.position = "below",
# No paired-value lines
paired = FALSE,
main = "Bar chart")
When plotted, the default display order for groups is their natural
order, i.e. alphabetical order for character data, and numeric order for
factors. You can, however, specify the display order with the
groups
parameter to either DurgaDiff
or
DurgaPlot
. Not all groups in the data need be included in
the analysis or plot; missing groups will be excluded from the analysis
and/or plot. To assign display labels to the groups, use a named vector
to list the groups. The names will be used to label the groups on the
plot.
In the damselfly data set, the groups are juvenile
and
adult
. The example below demonstrates how to change the
displayed group names as well as the display order of the groups.
# No title on this plot, so reduce the top margin size
par(mar = c(5, 4, 1, 1) + 0.1)
# Immature males are yellow, mature are red. Display juveniles first, and label
# with colour rather than developmental stage
d <- DurgaDiff(mass ~ maturity, damselfly, na.rm = TRUE,
groups = c("Yellow" = "juvenile", "Red" = "adult"))
DurgaPlot(d,
# Custom y-axis label
left.ylab = "Body mass (mg)",
# Specify the colours to (approximately) match the data
group.colour = c("orange", "red"),
box = TRUE, box.fill = FALSE)
If you need greater control over the display of the group labels, for
example to display in italics or bold, set the group.names
field in the DurgaDiff object. It must be a vector of characters or
expressions, one for each group.
# No title on this plot, so reduce the top margin size
par(mar = c(5, 4, 1, 1) + 0.1)
# Immature males are yellow, mature are red. Display juveniles first, and label
# with colour rather than developmental stage
d <- DurgaDiff(mass ~ maturity, damselfly, na.rm = TRUE,
groups = c("Juveniles" = "juvenile", "Adults" = "adult"))
d$group.names <- c(expression(italic("Juveniles")), expression(bold("Adults")))
DurgaPlot(d,
# Custom y-axis label
left.ylab = "Body mass (mg)",
# Specify the colours to (approximately) match the data
group.colour = c("orange", "red"),
box = TRUE, box.fill = FALSE)
Colours can be specified for each group (group.colour
;
demonstrated above), or separately for individual components. The
master argument for each component type controls whether the
component is displayed and optionally its colour. So, to display blue
violins, specify violin = "blue"
. To use the default
colour, specify violin = TRUE
. Colours can be specified as
the name of an RColorBrewer
palette (e.g. group.colour = "Dark2"
), or as one or more R
colours.
Colours for data points receive special treatment; points can be
coloured individually or by group. If there are as many colours as data
points, the points are coloured individually. Otherwise, colours are
applied to groups (excess colours are ignored). The default point symbol
(pch
) is 19.
The convenience function DurgaTransparent
adds
transparency to a colour, so DurgaTransparent("blue", 0.8)
is a nearly-transparent blue colour.
par(mfrow = c(1, 2), mar = c(11, 4, 3, 6) + 0.1)
# Only show 2 groups even though there are 3 in the data
d <- DurgaDiff(iris, data.col = "Sepal.Length", group.col = "Species",
groups = c("versicolor", "setosa"))
DurgaPlot(d,
# Custom colours for points - the points are partially transparent
points = c(DurgaTransparent("coral", .6), DurgaTransparent("darkturquoise", .8)),
# Different colurs for violin (border) and violin fill
violin = c("red", "blue"), violin.fill = c("#f8a8c840", "#a8c8f840"),
main = "Data subset; Two groups")
par(mar = c(5, 4, 3, 1) + 0.1)
d <- DurgaDiff(iris, data.col = "Sepal.Length", group.col = "Species")
# Define colours for the 3 species
bgCols <- c(DurgaTransparent("red", 0.5),
DurgaTransparent("blue", 0.5),
DurgaTransparent("green", 0.5))
DurgaPlot(d,
# Make all boxes the same colour
box = "grey50", box.fill = "grey90",
# Black for border/outline colour of points
points = "black",
# Additional data point parameters: small squares filled with colour for species
points.params = list(pch = 22, cex = 0.8, bg = bgCols[iris$Species]),
main = "Coloured point fill")
The central tendency indicator displays either the mean or median of
each group (controlled by the central.tendency.type
argument), and is shown by default when neither boxes nor bars are
shown.
Error bars on each group may show: 95% confidence intervals of the
mean; standard deviation; or standard error (specified by setting the
error.bars.type
parameter), and may be plotted regardless
of what other components are shown.
par(mfrow = c(1, 2))
d <- DurgaDiff(iris, data.col = "Sepal.Length", group.col = "Species")
# Only show error bars (95% CI) and central tendency
DurgaPlot(d, error.bars.type = "CI",
# Custom colour for central tendency
central.tendency = rainbow(3),
points = FALSE, violin = FALSE, main = "Mean and 95% CI")
par(mar = c(5, 2.4, 4, 2) + 0.1)
DurgaPlot(d, bar = TRUE, error.bars.type = "SD", points = FALSE,
left.ylab = "", ef.size.label = "",
main = "Bar chart with std. deviation")
The default behaviour of DurgaDiff
is to calculate
differences between all pairs of groups; each difference is a
contrast and has an estimated effect size with confidence
interval. When plotted, each of multiple effect sizes is positioned
below the first group in the contrast. This can cause layout problems
when one group participates in multiple contrasts. For example, the iris
dataset (available within standard R) contains three species:
setosa
, versicolor
and virginica
,
hence three possible contrasts: virginica - versicolor
,
virginica - setosa
and versicolor - setosa
. If
all three effect sizes are plotted, two of them
(virginica - versicolor
and
virginica - setosa
) will be positioned at the same
horizontal location, resulting in superimposed text and an illegible
graph (panel a)
below)!
There are several ways around this problem:
ef.size.dx
; exactly how to do this is left as an
exercise for the reader);There is no single best solution to this problem. The appropriate method depends on your data and the information that you wish to communicate.
To avoid this problem in default plots, the default behaviour of
DurgaPlot
is to assume that the first group is a control,
and then show the differences between all subsequent groups and the
control (method 1.). This behaviour prevents effect
sizes from overwriting each other, but may exclude pertinent data from
the plot (panel b)
below).
To change the default contrast selection, specify the
contrasts
argument to either DurgaDiff
or
DurgaPlot
. Contrasts are specified as
"<group> - <group>, <group> - <group>"
,
so to apply method 2. to the iris
dataset,
you could specify
contrasts = c("virginica - setosa", "versicolor - setosa", "setosa - versicolor")
.
This differs from the default behaviour by also plotting the contrast
"setosa - versicolor"
. Note that if the contrast
"versicolor - setosa"
were specified rather than
"setosa - versicolor"
, the effect size would be shown at
the same location as the "versicolor - setosa"
effect size,
hence failing to solve the problem.
par(mfrow = c(1, 2))
di <- DurgaDiff(iris, "Sepal.Length", "Species")
DurgaPlot(di, contrasts = "*", main = "Problematic effect sizes")
mtext("a)", cex = 1.5, col = "brown", font = 2, adj = -0.3, line = 1.4)
DurgaPlot(di, main = "Default effect sizes (subset)")
mtext("b)", cex = 1.5, col = "brown", font = 2, adj = -0.3, line = 1.4)
When there are too many group comparisons for the effect sizes to be
clearly displayed, you can choose to display confidence brackets above
the plot by calling DurgaBrackets
after calling
DurgaPlot
. Confidence brackets communicate less information
than the standard effect size visualisation, but may be a reasonable
compromise for many group comparisons.
You must manually ensure there is sufficient space for the brackets
in the plot (see DurgaBrackets
for details). In the example
below, to create space above the plot, we increase the top margin size
and prevent the frame from being drawn around the plot
(frame.plot = FALSE
).
The default symbology for the brackets are light grey if the
confidence interval includes 0. If the confidence interval does not
include zero, text is black by default, and the default bracket line
colour is the colour of the group with the higher central tendency.
Arguments to DurgaBrackets
can be used to override the
defaults.
d <- DurgaDiff(petunia, 1, 2)
# Use the top margin for brackets
op <- par(mar = c(2, 4, 4, 1) + 0.1)
# Custom group colours
cols = rainbow(3)
# Save return value from DurgaPlot
p <- DurgaPlot(d,
# Don't draw effect size because we will draw brackets
ef.size = FALSE,
# Don't draw frame because brackets will appear in the upper margin
frame.plot = FALSE,
# Custom group colours
group.colour = cols)
# Customise colours by drawing brackets with the colour of the taller group.
# Get the taller group from each difference
tallerIdx <- sapply(d$group.differences, function(diff) {
# If the group difference > 0, the first group is taller
ifelse(diff$t0 > 0, diff$groupIndices[1], diff$groupIndices[2])
})
# Draw the brackets
DurgaBrackets(p, labels = "level CI", br.col = cols[tallerIdx])
Each graphical group component (points, violin plot, box plot, bar
chart, central tendency indicator, error bar and axis label) is
positioned or centred horizontally on x-axis locations from 1 to
n, where n is the number of groups. Group horizontal
locations can be shifted with the group.dx
parameter. For
example, group.dx = 0.1
would shift all groups to the right
by 1/10th of the distance between groups. Values are repeated as
necessary, so group.dx = c(0.1, -0.1)
will move each pair
of groups closer together.
group.dx
shifts all components equally, however, to
provide fine-grained control, each component can be shifted
independently. For example, to shift all violins left and all
points right, specify
violin.dx = -0.12, points.dx = 0.1
.
par(mfrow = c(1, 2), mar = c(3, 4, 3, 1) + 0.1)
# Construct some data with multiple groups
set.seed(1) # Ensure we always get the exact same data set
multiGroupData <- data.frame(Measurement = unlist(lapply(c(10, 12, 8, 8.5, 9, 9.1),
function(m) rnorm(40, m))),
Group = rep(paste0("g", 1:6), each = 40),
Id = rep(1:6, 40))
d <- DurgaDiff(multiGroupData, 1, 2)
DurgaPlot(d,
# Visually group each pair
group.dx = c(0.18, -0.18),
# Show box plots
box = TRUE,
# Make the boxes narrow
box.params = list(boxwex = 0.4),
# Don't display effect size or data points
ef.size = FALSE, points = FALSE,
xlim = c(0.8, 6.2), # Reduce white space on plot left and right
main = "Grouped into pairs")
DurgaPlot(d,
# Shift violins left
violin.dx = -0.12,
# Shift points right
points.dx = 0.1,
# Make points small and jittered
points.params = list(cex = 0.5), points.method = "jitter",
# No central tendency indicator
central.tendency = FALSE,
ef.size = FALSE, main = "Component shifts")
group.dx
is implemented very simply as a default value
for various other *.dx
arguments. That means that values
such as violin.dx
are not added to group.dx
;
rather they replace it. Under some circumstances, it can be more
convenient if *.dx
values are added to
group.dx
. You can achieve this in your own code by storing
group.dx
in a variable, then adding relative offsets for
other arguments.
# Example snippet, make *.dx arguments relative to group.dx
group.dx <- c(0.75, 0) # Shift group 1 right, leave group 2 unchanged
DurgaPlot(..., group.dx = group.dx,
central.tendency.dx = group.dx + 0.1, # Shift mean & error bars right
violin.dx = group.dx + c(-0.4, -0.5), # Shift violins left
ef.size.dx = -0.5)
By default, effect sizes are “mean” (aka unstandardised), which means
they are in the same units as the data. However, DurgaDiff
can estimate standardised effect sizes - either Cohen’s d or Hedges’ g -
which allow comparisons of effect sizes across data in different
units.
You can override the y-axis labels to aid interpretation of effect
sizes. DurgaPlot
does not provide text labels for
standardised effect sixes by default because effect size interpretation
depends on context.
par(mar = c(5, 9, 3, 1) + 0.1)
# Define effect size labels and their positions on the y-axis
effectSizes <- c("No effect" = 0, "Large positive effect" = 0.8, "Huge positive effect" = 2)
# Give the groups friendly names
groups <- c("Self" = "self_fertilised", "Westerham" = "westerham_cross", "Inter" = "inter_cross")
# Calculate bias-corrected Cohen's d, named Hedges' g, rather than unstandardised difference in means
d <- DurgaDiff(petunia, 1, 2, effect.type = "hedges g", groups = groups)
DurgaPlot(d, violin = FALSE, main = "Hedges' g, labelled effect size",
points.method = "jitter",
# Use our ticks and labels instead of default
ef.size.ticks = effectSizes,
# Horizontal tick labels
ef.size.params = list(las = 1),
# Don't label the effect size y-axis because it won't fit with the long tick labels
ef.size.label = "")
Durga provides a fixed set of effect type statistics that can be used, however sometimes a different statistic is required. For example, we may wish to compare measurements of some organ or structure to know whether they have increased or decreased in size, and by how much, in which case we may decide to use log2 ratio as our statistic. log2 ratio has a straightforward interpretation: 0 means no difference, 1 means group 1 is double the size of group 2 (\(2^1\)), 2 means group 1 is four-times the size (\(2^2\)), -1 means group 2 is double the size of group 1 (\(-2^1\)), and so on.
# Define a function that calculates the statistic for two vectors
log2Ratio <- function(x1, x2) log2(mean(x2) / mean(x1))
# Generate some random data
set.seed(1)
data <- data.frame(species = rep(c("Sp1", "Sp2"), each = 10),
volume = c(rnorm(10, mean = 5, sd = 2), rnorm(10, mean = 10, sd = 2)))
# Apply log2Ratio as the effect type
d <- DurgaDiff(data, "volume", "species", effect.type = log2Ratio)
# Apply custom labels to describe the effect type, since it is not Sp2 - Sp1
d$group.differences[[1]]$label.plot <- expression("log"[2] ~ italic("Sp 2") * ":" * italic("Sp 1"))
# Plot, and set the y-axis label for effect size
DurgaPlot(d, ef.size.label = expression("log"[2]~"ratio"))
# It is also possible to set the label to be used for printing to the console
d$group.differences[[1]]$label.print <- "log2 Sp 2:Sp 1"
d
## Bootstrapped effect size
## volume ~ species
## Groups:
## mean median sd se CI.lower CI.upper n
## Sp1 5.264406 5.513151 1.561172 0.4936859 4.433125 6.231468 10
## Sp2 10.497690 10.983745 2.139030 0.6764206 8.796348 11.489956 10
## Unpaired Custom effect type (R = 1000, bootstrap CI method = bca):
## log2 Sp 2:Sp 1: 0.995729, 95% CI [0.69323, 1.27465]
Durga implements a variety of standardisers, but what if you wish to
calculate an effect size that is not implemented by Durga? The following
example shows one way to use the cohens_d
function from the
effectsize
R package with Durga.
# Create a new function that calls effectsize::cohens_d and returns just the calculated statistic
myCohensAv <- function(x1, x2) {
# Note that we swap the arguments because Durga expects the calculation to be x2 - x1
cd <- effectsize::cohens_d(x2, x1, pooled_sd = FALSE)
# Extract and return just the Cohen's d value
cd$Cohens_d
}
# Pass it in to DurgaDiff as the effect.type
d <- DurgaDiff(petunia, 1, 2, effect.type = myCohensAv)
# Specify the effect type label
DurgaPlot(d, ef.size.label = expression("Cohen's d"))
This example shows the use of custom group order and display names, contrasts between relevant pairs of groups rather than between all combinations of groups, and various custom plotting options.
par(mar = c(5, 4, 3, 1) + 0.1)
# Define display order and display pretty group names
groups = c("Control 1" = "g1", "Treatment 1" = "g2",
"Control 2" = "g3", "Treatment 2" = "g4",
"Control 3" = "g5", "Treatment 3" = "g6")
d <- DurgaDiff(multiGroupData, 1, 2, 3, groups = groups,
contrasts = "g2 - g1, g4 - g3, g6 - g5")
# Reduce text size
par(cex = 0.7)
DurgaPlot(d, box = TRUE,
# Group into pairs
group.dx = c(0.15, -0.15),
# Don't display points
points = FALSE,
# Adjust plot width to reduce left/right wasted space
xlim = c(1, 6),
# Don't display pair lines
paired = FALSE,
# Hide the boxplot median line (medcol = NA) and make the boxes narrow (boxwex = 0.4)
box.params = list(medcol = NA, boxwex = 0.4),
# Display mean and error bar
central.tendency = "#8060b0a0", # Display and colour central tendency
central.tendency.symbol = "segment",
central.tendency.width = 0.07,
error.bars = "#8060b0a0", # Same colour for error bars
# Don't display the effect size violin
ef.size.violin = FALSE,
main = "Group offsets"
)
Any unknown parameters to DurgaPlot
are passed on to the
plot function, which allows some control over additional aspects of the
plot. The labels underneath effect sizes can be controlled by passing a
named vector for the DurgaPlot
contrasts
parameter.
# Add in some fake sex data to the insulin data set
data <- cbind(insulin, Sex = sample(c("Male", "Female"), nrow(insulin), replace = TRUE))
# Thin the data so that individual symbols are visible.
# Obviously these data manipulations are for plot demonstration purposes only
data <- data[data$id %in% 1:15, ]
d <- DurgaDiff(data, "sugar", "treatment", "id",
groups = c("Before insulin" = "before", "After insulin" = "after"), na.rm = TRUE)
par(mar = c(2, 4, 4, 1))
# Use different colours to show sex
cols <- RColorBrewer::brewer.pal(3, "Set1")
DurgaPlot(d,
# Don't draw containing box
frame.plot = FALSE,
# Relabel the contrasts
contrasts = c("Blood sugar change" = "after - before"),
left.ylab = "Blood sugar level",
# Customise the violins
violin.shape = c("left", "right"), violin.dx = c(-0.055, 0.055), violin.width = 0.3,
# Customise the points to display sex
points = "black",
points.params = list(bg = ifelse(data$Sex == "Female", cols[1], cols[2]),
pch = ifelse(data$Sex == "Female", 21, 24)),
# Customise the effect size
ef.size.pch = 19,
ef.size.violin = "blue",
ef.size.violin.shape = "full",
# Make mean lines match group colour
ef.size.line.col = RColorBrewer::brewer.pal(3, "Set2")[2:1],
ef.size.line.lty = 3,
central.tendency = FALSE,
main = "Effects of insulin on blood sugar")
# Add a legend
legend("topright", c("Female", "Male"), pch = c(21, 24), pt.bg = cols)
Below is an alternative representation for the insulin dataset,
showing overlayed rain cloud plots. Positioning is achieved by manually
adjusting the relative positions of the various components, using the
appropriate *.dx
parameters.
d <- DurgaDiff(insulin, "sugar", "treatment", "id",
groups = c("Before insulin" = "before", "After insulin" = "after"), na.rm = TRUE)
# Change effect size tick label
d$group.differences[[1]]$label.plot <- "Change in sugar"
# Horizontal axis tick labels, move y-axis labels outwards to make space for the
# horizontal ticks, and ensure the margins are large enough to show them
par(las = 1, mgp = c(4, 1, 0), mar = c(3, 5.5, 4, 5.5) + 0.1)
cols <- RColorBrewer::brewer.pal(3, "Set2")
colsTr <- DurgaTransparent(cols, 0.7)
DurgaPlot(d, paired = FALSE,
# Move 2nd group to be at the same position as 1st
group.dx = c(0.2, -0.8),
# X-axis ticks are meaningless since both groups are at the same location
x.axis = FALSE,
# Adjust plot width and spacing
xlim = c(0.7, 2.1),
violin = TRUE, violin.shape = "right", violin.fill = colsTr,
# No need to specify violin position because violin.dx defaults to group.dx
# Bumpy violins
violin.adj = 1,
# Box plots are black outlines
box = "black", box.dx = c(0.06, -0.895), box.fill = colsTr,
# Don't show outliers
box.outline = FALSE,
# Thin boxes, solid thick lines
box.params = list(boxwex = 0.06, lty = 1, lwd = 2,
# normal line weight for median line
medlwd = 2,
# no end of whisker line
staplecol = NA),
# Points as solid colours, slightly smaller than default
points = cols, points.params = list(pch = 16, cex = 0.8),
points.method = "jitter", points.dx = c(-0.14, -1.14), points.spread = 0.09,
# Move effect size left to near where 2nd group would have been
ef.size.dx = -1.2,
left.ylab = "Blood sugar",
main = "Rain cloud with effect size"
)
legend("topright", c("Before insulin", "After insulin"), col = cols, fill = colsTr, bty = "n")
The source code for Durga is available at github.com/KhanKawsar/EstimationPlot. We welcome suggestions for improvements or reports of problems. Let us know by logging an issue on GitHub.
The Durga GitHub page has additional information such as how to install the latest development version of Durga.
Cumming, G. (2012). Understanding the new statistics : effect sizes, confidence intervals, and meta-analysis (1st ed.). New York: Routledge.
Khan, M. K., & McLean, D. J. (2023). Durga: An R package for effect size estimation and visualisation. bioRxiv, 2023.2002.2006.526960. doi:10.1101/2023.02.06.526960