
ggpubr Package Tutorial Series (Part 1)

Source: 哗拓教育

ggpubr: 'ggplot2' Based Publication Ready Plots

ggpubr is a visualization package built on ggplot2 that can produce publication-ready plots with a single function call.

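1. Installing and loading the package

ggpubr can be installed from either CRAN (release version) or GitHub (development version).

# Option 1: install the release version from CRAN
install.packages("ggpubr")

# Option 2: install the development version from GitHub
if(!require(devtools)) install.packages("devtools")
devtools::install_github("kassambara/ggpubr")

# Load the package (needed only once per session)
library(ggpubr)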

2. Drawing common basic plots

Distribution plots (Distribution)


01. Density plot with mean lines and a marginal rug

# Build the dataset
set.seed(1234)
df <- data.frame(sex = factor(rep(c("F", "M"), each = 200)),
                 weight = c(rnorm(200, 55), rnorm(200, 58)))

# Preview the data
head(df)

# Draw the density plot
# rug = TRUE adds a marginal rug; add accepts "mean" or "median"
ggdensity(df, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
          palette = c("#00AFBB", "#E7B800"))

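Figure 1. Density plot of weight for each sex. The x-axis shows weight and the y-axis shows the density, which is estimated automatically. The rug along the x-axis shows where the individual samples fall. Both the outline color and the fill are mapped to sex, and palette sets custom colors, which gives a clean, readable result.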
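
To make concrete what add = "mean" marks on the plot, the group statistics can be computed by hand in base R. This is a minimal sketch that assumes the same simulated data as above; group_means and group_medians are illustrative names, not part of ggpubr.

```r
# Recreate the simulated data (same seed and parameters as in the tutorial)
set.seed(1234)
df <- data.frame(sex = factor(rep(c("F", "M"), each = 200)),
                 weight = c(rnorm(200, 55), rnorm(200, 58)))

# Per-group mean and median: the positions that add = "mean" or
# add = "median" would mark with a vertical line
group_means   <- tapply(df$weight, df$sex, mean)
group_medians <- tapply(df$weight, df$sex, median)

print(round(group_means, 2))
print(round(group_medians, 2))
```

Because each group is drawn from a normal distribution, the mean and median of a group should sit close together, so the two options give nearly identical lines for this dataset.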

02. Histogram with mean lines and a marginal rug

gghistogram(df, x = "weight", add = "mean", rug = TRUE, color = "sex", fill = "sex",
            palette = c("#00AFBB", "#E7B800"))

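Figure 2. Histogram with mean lines and a marginal rug. Compared with Figure 1, the y-axis shows the raw count in each bin instead of the density.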
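
The relationship between counts and density can be checked in base R. The sketch below is only an illustration of the scaling: it uses a histogram-based density, whereas the density plot in Figure 1 uses a smoothed kernel estimate. The choice of breaks = 30 is arbitrary.

```r
set.seed(1234)
weight <- c(rnorm(200, 55), rnorm(200, 58))

# Bin the data without drawing anything
h <- hist(weight, breaks = 30, plot = FALSE)

bin_width <- diff(h$breaks)   # width of each bin
n <- length(weight)           # total number of observations

# density = count / (n * bin width), so the bar areas sum to 1
manual_density <- h$counts / (n * bin_width)

print(sum(h$counts))                         # every observation lands in a bin
print(all.equal(manual_density, h$density))  # matches hist()'s own density
print(sum(h$density * bin_width))            # total area under the bars
```

Counts depend on sample size and bin width, while density does not, which is why density is the better scale when comparing groups of different sizes.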

Box plots and violin plots (boxplot/violinplot)


01. Box plot + shapes by group

# Load the ToothGrowth dataset
data("ToothGrowth")
df1 <- ToothGrowth
head(df1)

# add = "jitter" overlays jittered points; point shape is mapped to dose
p <- ggboxplot(df1, x = "dose", y = "len", color = "dose",
               palette = c("#00AFBB", "#E7B800", "#FC4E07"),
               add = "jitter", shape = "dose")
p

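Figure 3. Box plot colored by group. Giving the sample points a different shape per group makes it easier still to tell groups or batches apart.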

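02. Box plot + shapes by group + statistics

# Add p-values for comparisons between groups; the pairs to compare can be customized
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
p + stat_compare_means(comparisons = my_comparisons) +   # pairwise comparisons
  stat_compare_means(label.y = 50)                       # global p-value, placed at y = 50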

Figure 4. stat_compare_means adds brackets connecting the compared groups, along with their p-values.
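
The p-values on the figure can be cross-checked in base R. As far as the ggpubr documentation describes it, stat_compare_means defaults to the Wilcoxon test for two-group comparisons and to the Kruskal-Wallis test when more than two groups are compared; the sketch below assumes those defaults. The values it prints may differ slightly from the plot depending on how ties and the continuity correction are handled.

```r
data("ToothGrowth")
df1 <- ToothGrowth
df1$dose <- factor(df1$dose)

# Global test across all three dose groups (Kruskal-Wallis)
global <- kruskal.test(len ~ dose, data = df1)

# Pairwise Wilcoxon rank-sum tests for the same pairs as my_comparisons
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
pairwise_p <- sapply(my_comparisons, function(pair) {
  pair_df <- droplevels(df1[df1$dose %in% pair, ])
  # exact = FALSE: the data contain ties, so use the normal approximation
  wilcox.test(len ~ dose, data = pair_df, exact = FALSE)$p.value
})
names(pairwise_p) <- sapply(my_comparisons, paste, collapse = " vs ")

print(global$p.value)
print(pairwise_p)
```

Note that these pairwise p-values are unadjusted; with many comparisons a correction such as p.adjust(pairwise_p, method = "holm") is worth considering.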

03. Violin plot with an inner box plot + significance stars

# label = "p.signif" shows significance stars instead of numeric p-values
ggviolin(df1, x = "dose", y = "len", fill = "dose",
         palette = c("#00AFBB", "#E7B800", "#FC4E07"),
         add = "boxplot", add.params = list(fill = "white")) +
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") +
  stat_compare_means(label.y = 50)

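Figure 5. ggviolin draws the violin plot, and add = "boxplot" places a box plot inside each violin. Setting label = "p.signif" in stat_compare_means labels the comparison brackets with significance stars instead of numeric p-values.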
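
The star labels correspond to p-value ranges. The helper below is a hypothetical base R function (p_to_stars is not part of ggpubr) that sketches the cutoffs listed in the ggpubr documentation: ns for p > 0.05, * for p <= 0.05, ** for p <= 0.01, *** for p <= 0.001 and **** for p <= 0.0001.

```r
# Hypothetical helper: map p-values to significance symbols
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(0, 1e-4, 1e-3, 1e-2, 0.05, 1),   # right-closed intervals (a, b]
    labels = c("****", "***", "**", "*", "ns"),
    include.lowest = TRUE                       # so that p = 0 is also covered
  ))
}

p_values <- c(0.2, 0.03, 0.005, 0.0005, 0.00001)
print(data.frame(p = p_values, label = p_to_stars(p_values)))
```

Stars are compact but discard information, so reporting the numeric p-value alongside them is often the safer choice in a manuscript.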

Bar plots (barplot)


data("mtcars")
df2 <- mtcars
df2$cyl <- factor(df2$cyl)
df2$name <- rownames(df2)   # add a name column
head(df2[, c("name", "wt", "mpg", "cyl")])

ggbarplot(df2, x = "name", y = "mpg", fill = "cyl", color = "white",
          palette = "npg",          # Nature Publishing Group color palette
          sort.val = "desc",        # sort values in descending order
          sort.by.groups = FALSE,   # do not sort within groups
          x.text.angle = 60)        # rotate x-axis labels by 60 degrees

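Figure 6. Bar plot of fuel economy (mpg) for each car, filled by cyl group and sorted by value in descending order. The fill uses the "npg" (Nature Publishing Group) palette. Palettes from the ggsci package are supported, for example "npg", "aaas", "lancet", "jco", "ucscgb", "uchicago", "simpsons" and "rickandmorty".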

# Sort within groups
ggbarplot(df2, x = "name", y = "mpg", fill = "cyl", color = "white",
          palette = "aaas",        # Science (AAAS) color palette
          sort.val = "asc",        # ascending order, as opposed to "desc"
          sort.by.groups = TRUE,   # sort within each group
          x.text.angle = 60)       # rotate x-axis labels by 60 degrees

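Figure 7. The same plot as Figure 6 with the "aaas" (Science) palette, sorted in ascending order within each group, and with the x-axis labels rotated by 60 degrees to prevent overlap.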
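
The two orderings can be reproduced by hand with order() in base R, which makes the difference between them explicit. This sketch only mimics the resulting bar order; it does not claim to show how ggpubr sorts internally.

```r
data("mtcars")
df2 <- mtcars
df2$cyl <- factor(df2$cyl)
df2$name <- rownames(df2)

# Global descending order by value
# (the order produced by sort.val = "desc", sort.by.groups = FALSE)
global_desc <- df2[order(-df2$mpg), c("name", "mpg", "cyl")]

# Ascending order within each cyl group
# (the order produced by sort.val = "asc", sort.by.groups = TRUE)
grouped_asc <- df2[order(df2$cyl, df2$mpg), c("name", "mpg", "cyl")]

print(head(global_desc, 3))
print(head(grouped_asc, 3))
```

In the grouped version the cyl groups stay contiguous, so bars of the same color sit next to each other; in the global version colors interleave wherever the groups overlap in mpg.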

Deviation graphs (Deviation graphs)


A deviation graph shows how far each value deviates from a reference value.

# z-score standardization: subtract the mean, then divide by the standard deviation
df2$mpg_z <- (df2$mpg - mean(df2$mpg)) / sd(df2$mpg)
df2$mpg_grp <- factor(ifelse(df2$mpg_z < 0, "low", "high"), levels = c("low", "high"))
head(df2[, c("name", "wt", "mpg", "mpg_grp", "cyl")])

ggbarplot(df2, x = "name", y = "mpg_z", fill = "mpg_grp", color = "white",
          palette = "jco", sort.val = "asc", sort.by.groups = FALSE,
          x.text.angle = 60, ylab = "MPG z-score", xlab = FALSE, legend.title = "MPG Group")

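Figure 8. Bar plot of z-scores, that is, each raw value minus the mean, divided by the standard deviation. The plot uses the "jco" (Journal of Clinical Oncology) palette and is sorted in ascending order across all bars rather than within groups.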
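
The z-score formula can be verified against base R's scale(), which performs the same centering and scaling. A minimal sketch on the mtcars data; the variable names are illustrative.

```r
data("mtcars")
mpg <- mtcars$mpg

# Manual z-score: subtract the mean, divide by the standard deviation
mpg_z <- (mpg - mean(mpg)) / sd(mpg)

# scale() returns a one-column matrix, so convert it back to a plain vector
mpg_z_scaled <- as.numeric(scale(mpg))

# Same low/high split as in the tutorial: below the mean is "low"
mpg_grp <- factor(ifelse(mpg_z < 0, "low", "high"), levels = c("low", "high"))

print(all.equal(mpg_z, mpg_z_scaled))
print(round(c(mean = mean(mpg_z), sd = sd(mpg_z)), 3))
print(table(mpg_grp))
```

After standardization the values have mean 0 and standard deviation 1, so the sign of each bar tells whether a car is above or below the average mpg.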

# Swap the axes
# rotate = TRUE exchanges the x and y axes
ggbarplot(df2, x = "name", y = "mpg_z", fill = "mpg_grp", color = "white",
          palette = "jco", sort.val = "desc", sort.by.groups = FALSE,
          x.text.angle = 90, ylab = "MPG z-score", xlab = FALSE,
          legend.title = "MPG Group", rotate = TRUE, ggtheme = theme_minimal())

Figure 9. rotate = TRUE flips the axes, instantly turning the vertical bar plot into a horizontal one.

Lollipop charts (Lollipop chart)


A lollipop chart can be used in place of a bar plot to display the same data.

ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "ascending",   # sort values in ascending order
           add = "segments",        # draw the stem from y = 0 to each dot
           ggtheme = theme_pubr())

Figure 10. Too many bar plots become monotonous; a lollipop chart adds some variety.

Setting additional parameters

ggdotchart(df2, x = "name", y = "mpg", color = "cyl",
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),
           sorting = "descending", add = "segments", rotate = TRUE,
           group = "cyl", dot.size = 6,
           label = round(df2$mpg),   # print the rounded mpg value inside each dot
           font.label = list(color = "white", size = 9, vjust = 0.5),
           ggtheme = theme_pubr())

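Figure 11. The lollipop chart with a few adjustments: rotate = TRUE swaps the axes, dot.size = 6 sets the size of the dots, label = round() prints the value inside each dot, and font.label controls the label font.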

Lollipop deviation chart


ggdotchart(df2, x = "name", y = "mpg_z",
           color = "cyl",                                     # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),      # Custom color palette
           sorting = "descending",                            # Sort values in descending order
           add = "segments",                                  # Add segments from y = 0 to dots
           add.params = list(color = "lightgray", size = 2),  # Change segment color and size
           group = "cyl",                                     # Order by groups
           dot.size = 6,                                      # Large dot size
           label = round(df2$mpg_z, 1),                       # Add mpg z-scores as dot labels, rounded to one decimal
           font.label = list(color = "white", size = 9, vjust = 0.5),  # Adjust label parameters
           ggtheme = theme_pubr()                             # ggplot2 theme
           ) +
  geom_hline(yintercept = 0, linetype = 2, color = "lightgray")   # Dashed reference line at zero

Figure 12. As with the deviation bar plot, z-scores are plotted in place of the raw values.

Cleveland dot plots


ggdotchart(df2, x = "name", y = "mpg",
           color = "cyl",                                 # Color by groups
           palette = c("#00AFBB", "#E7B800", "#FC4E07"),  # Custom color palette
           sorting = "descending",                        # Sort values in descending order
           rotate = TRUE,                                 # Rotate vertically
           dot.size = 2,                                  # Dot size
           y.text.col = TRUE,                             # Color y text by groups
           ggtheme = theme_pubr()                         # ggplot2 theme
           ) +
  theme_cleveland()                                       # Add dashed grids

Figure 13. theme_cleveland() applies the Cleveland dot plot style.

3. Common plotting functions and parameters

Basic plotting functions

Basic parameters
