Funnel Plots for Proportion Data

Overview

The funnelR package provides a flexible framework for creating funnel plots for proportion data. A funnel plot is a useful visualization for analyzing unit-level performance relative to some criterion. It makes it easy to identify units that are In Control or Extreme relative to a benchmark at a specified level of confidence (e.g. 95%).

Framed this way, a funnel plot can be applied in any number of fields of study to monitor and identify units that deviate from what is considered typical. For example, it could be used to differentiate schools that are high, average, or low performing on a standardized test relative to a national or state benchmark. From a quality improvement point of view, it might help identify which hospitals have extreme mortality or surgical complication rates relative to a benchmark prescribed by a government body.

The funnelR package provides many options for specifying the elements of a funnel plot, including user-defined control limits, benchmarks, and estimation methods. It can also write scored results (i.e. a variable that records whether a unit is In Control or Extreme according to the specifications of the funnel plot) to your sample data set. This variable might then be used in further analysis, such as cross-tabulations (e.g. as a stratification variable) or regression modeling (e.g. as a covariate).

While many flavors of funnel plots exist (rates, ratios, etc.), the current package covers funnel plots for proportion data that are assumed to be binomially distributed. The interested reader is referred to Spiegelhalter (2005) for further details.

Data for Examples

To use the funnelR package, your sample data must follow some basic conventions:

One observation per row.
The numerator variable must be named n.
The denominator variable must be named d.
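
As a minimal sketch of preparing data to meet these conventions, the base R snippet below renames two columns to n and d and then runs a few sanity checks. The column names satisfied and patients and the helper check_funnel_input() are hypothetical, introduced only for illustration; they are not part of funnelR.

```r
# Hypothetical raw data: one row per physician, with non-standard column names.
raw <- data.frame(id        = c(1, 2, 3),
                  satisfied = c(130, 65, 155),   # would-be numerator
                  patients  = c(150, 200, 300))  # would-be denominator

# Rename the columns to the names the package expects.
names(raw)[names(raw) == "satisfied"] <- "n"
names(raw)[names(raw) == "patients"]  <- "d"

# Illustrative helper (not part of funnelR): basic sanity checks on the input.
check_funnel_input <- function(df) {
  stopifnot(is.data.frame(df),
            all(c("n", "d") %in% names(df)),
            is.numeric(df$n), is.numeric(df$d),
            all(df$d > 0),                    # denominators must be positive
            all(df$n >= 0 & df$n <= df$d))    # a proportion lies in [0, 1]
  invisible(TRUE)
}

check_funnel_input(raw)
```

The checks stop with an error if a column is missing or a numerator exceeds its denominator, which is usually easier to debug than a malformed plot.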

The following sample data set will be used for illustrating the features of the package.

id: Physician ID.
sex: Physician Sex.
n: Number of patients who rated their recent care as satisfactory.
d: Total number of patients under the care of the physician.

my_data  <- data.frame(id=c(1,2,3,4,5,6,7,8,9,10),
                       sex=c('M','F','M','F','F','M','F','M','F','M'), 
                       n=c(130,65,155,125,19,185,82,77,50,80), 
                       d=c(150,200,300,250,50,220,100,90,400,425)
                       )
knitr::kable(my_data)

id	sex	n	d
1	M	130	150
2	F	65	200
3	M	155	300
4	F	125	250
5	F	19	50
6	M	185	220
7	F	82	100
8	M	77	90
9	F	50	400
10	M	80	425

Example 1

Let’s model the sample data using a funnel plot. This analysis might offer some insight into which physicians are receiving satisfactory ratings.

Consider a fictitious benchmark of 50% as the norm. We can draw a funnel plot with 80% and 95% confidence limits and see where each physician falls. For this example we will use the exact method. Note that the step must be an integer for the exact method.

library(funnelR)

my_limits   <- fundata(input=my_data, 
                      benchmark=0.50, 
                      alpha=0.80, 
                      alpha2=0.95, 
                      method='exact', 
                      step=1)

my_plot     <- funplot(input=my_data, 
                       fundata=my_limits)

my_plot
#> Warning: Use of `fundata$d` is discouraged.
#> ℹ Use `d` instead.
#> Warning: Use of `fundata$up` is discouraged.
#> ℹ Use `up` instead.
#> Warning: Use of `fundata$d` is discouraged.
#> ℹ Use `d` instead.
#> Warning: Use of `fundata$lo` is discouraged.
#> ℹ Use `lo` instead.
#> Warning: Use of `fundata$d` is discouraged.
#> ℹ Use `d` instead.
#> Warning: Use of `fundata$up2` is discouraged.
#> ℹ Use `up2` instead.
#> Warning: Use of `fundata$d` is discouraged.
#> ℹ Use `d` instead.
#> Warning: Use of `fundata$lo2` is discouraged.
#> ℹ Use `lo2` instead.
#> Warning: Use of `fundata$benchmark` is discouraged.
#> ℹ Use `benchmark` instead.
#> Warning: Use of `input$d` is discouraged.
#> ℹ Use `d` instead.
#> Warning: Use of `input$n` is discouraged.
#> ℹ Use `n` instead.
#> Warning: Use of `input$d` is discouraged.
#> ℹ Use `d` instead.
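
To give a feel for what exact limits look like, here is a rough base R sketch that takes binomial quantiles under the benchmark and converts them to proportions. It is an assumption-laden illustration, not the package's implementation: fundata may compute its limits differently (for example with interpolation), and the helper name exact_limits() is hypothetical.

```r
benchmark <- 0.50
d <- seq(50, 450, by = 1)   # integer denominators, mirroring step = 1

# Two-sided limits at a given confidence level: binomial quantiles of the
# count under the benchmark, divided by the denominator.
exact_limits <- function(d, p, level) {
  tail_prob <- (1 - level) / 2
  data.frame(d  = d,
             lo = qbinom(tail_prob,     size = d, prob = p) / d,
             up = qbinom(1 - tail_prob, size = d, prob = p) / d)
}

lim80 <- exact_limits(d, benchmark, 0.80)
lim95 <- exact_limits(d, benchmark, 0.95)
head(lim95, 3)
```

Because a binomial count is only defined for a whole number of trials, a grid of integer denominators is the natural choice in this sketch. The resulting limits are step-shaped rather than smooth.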

Example 2

Let’s repeat Example 1, but set the method to approximate. We will need to set the step parameter to something reasonably small to produce the two sets of smooth confidence limits.

my_limits2   <- fundata(input=my_data, 
                        benchmark=0.50, 
                        alpha=0.80, 
                        alpha2=0.95, 
                        method='approximate', 
                        step=0.5)

my_plot2     <- funplot(input=my_data, 
                        fundata=my_limits2)

my_plot2
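
For intuition about why a small step gives smooth curves, the sketch below computes limits from the standard normal approximation to the binomial on a fine grid of denominators. Whether fundata uses exactly this formula is an assumption, and approx_limits() is a hypothetical helper name.

```r
benchmark <- 0.50
d <- seq(50, 450, by = 0.5)   # fine grid, mirroring step = 0.5

# Normal-approximation limits: benchmark +/- z * standard error.
approx_limits <- function(d, p, level) {
  z  <- qnorm(1 - (1 - level) / 2)   # two-sided critical value
  se <- sqrt(p * (1 - p) / d)        # standard error of a proportion
  data.frame(d = d, lo = p - z * se, up = p + z * se)
}

lim80 <- approx_limits(d, benchmark, 0.80)
lim95 <- approx_limits(d, benchmark, 0.95)

# The band narrows as the denominator grows, which gives the funnel shape.
range(lim95$up - lim95$lo)
```

Unlike binomial quantiles, this formula is defined for any positive denominator, so the grid does not have to be restricted to integers.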

Example 3

As previously mentioned, the funnelR package is capable of scoring your sample data. Scoring here means adding a variable to your sample data that records whether each observation is In Control or Extreme according to the specifications of the funnel plot. This can be useful in further analyses of your data (e.g. as a stratification variable).

We’ll score the sample data according to the specifications in Example 2.

my_score <- funscore(input=my_data, 
                     benchmark=0.50, 
                     alpha=0.80, 
                     alpha2=0.95, 
                     method='approximate')

knitr::kable(my_score)

id	sex	n	d	r	z	score	score2
1	M	130	150	0.8666667	8.9814624	Extreme	Extreme
2	F	65	200	0.3250000	4.9497475	Extreme	Extreme
3	M	155	300	0.5166667	0.5773503	In Control	In Control
4	F	125	250	0.5000000	0.0000000	In Control	In Control
5	F	19	50	0.3800000	1.6970563	Extreme	In Control
6	M	185	220	0.8409091	10.1129979	Extreme	Extreme
7	F	82	100	0.8200000	6.4000000	Extreme	Extreme
8	M	77	90	0.8555556	6.7461923	Extreme	Extreme
9	F	50	400	0.1250000	15.0000000	Extreme	Extreme
10	M	80	425	0.1882353	12.8543881	Extreme	Extreme

The variables score and score2 correspond to the parameters alpha and alpha2, respectively.
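
The scored columns can be cross-checked by hand. The base R sketch below assumes that r is the observed proportion, that z is the absolute distance from the benchmark in standard-error units, and that a unit is Extreme when z exceeds the two-sided normal critical value. These formulas are an assumption: they reproduce the values in the table above, but they are not taken from the package source.

```r
my_data <- data.frame(id  = 1:10,
                      sex = c("M", "F", "M", "F", "F", "M", "F", "M", "F", "M"),
                      n   = c(130, 65, 155, 125, 19, 185, 82, 77, 50, 80),
                      d   = c(150, 200, 300, 250, 50, 220, 100, 90, 400, 425))
benchmark <- 0.50

scored <- my_data
scored$r <- scored$n / scored$d                       # observed proportion
scored$z <- abs(scored$r - benchmark) /
            sqrt(benchmark * (1 - benchmark) / scored$d)  # absolute z-score

# Compare z with the two-sided critical values for 80% and 95% limits.
scored$score  <- ifelse(scored$z > qnorm(1 - (1 - 0.80) / 2), "Extreme", "In Control")
scored$score2 <- ifelse(scored$z > qnorm(1 - (1 - 0.95) / 2), "Extreme", "In Control")

# Example of a follow-up analysis: cross-tabulate sex by the 95% score.
ct <- table(sex = scored$sex, score2 = scored$score2)
ct
```

Physician 5 illustrates the difference between the two limits: a z of about 1.70 lies outside the 80% limits but inside the 95% limits.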

We can take the analysis one step further and produce a funnel plot colored by score2.

my_plot3 <- funplot(input=my_score, 
                    fundata=my_limits2, 
                    byvar="score2")

my_plot3

Finally, since sex is also present in the sample data set, we can color the funnel plot by it as well.

my_plot4 <- funplot(input=my_score, 
                    fundata=my_limits2, 
                    byvar="sex")

my_plot4

Example 4

The funplot function is essentially a wrapper around ggplot2 that returns a base funnel plot as a ggplot object. You can leverage your existing ggplot2 knowledge to customize the funnel plot.

We will produce a customized funnel plot using the specifications from Example 2. This time, we will add the following features:

Customize the axis text.
Add a secondary benchmark as a reference.
Change the plot theme.
Change the colors of the points.
Label each point by the id variable.

library(ggplot2)

my_plot4_mod <- my_plot4 +
                labs(x="Physician practice size", y="Proportion (%) of satisfied patients") +
                # linewidth replaces size for lines as of ggplot2 3.4.0
                geom_hline(yintercept=0.40, colour="darkred", linetype=6, linewidth=1) +
                theme_minimal() +
                scale_colour_manual(values=c("green","darkgreen")) +
                geom_text(aes(label=id), colour="black", size=4, nudge_x=10)

my_plot4_mod
