python matplotlib statistics seaborn visualization

Customizing p-value thresholds for "star" text format in statannotations

The statannotations package annotates plots (a seaborn boxplot or strip plot, for example) with the statistical significance of comparisons between pairs of data. These annotations can be in "star" text format, where one or more stars appear on top of the bar drawn between a pair of data.

Is there any way to customize the thresholds for the stars? I want 0.0001 to be the threshold for one star * instead of 0.05, 0.00001 for two stars **, and 0.000001 for three stars ***.

The example figure was generated with example code from the statannotations GitHub page:

from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)
annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

With verbose set to 2, the output also lists the thresholds used to decide how many stars appear above the bars:

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 1.00e-02 < p <= 5.00e-02
      **: 1.00e-03 < p <= 1.00e-02
     ***: 1.00e-04 < p <= 1.00e-03
    ****: p <= 1.00e-04

I want to pass something like a dictionary of {p-value threshold: number of stars} to Annotator, but I don't know which parameter accepts it.

Solution

In the statannotations repository, specifically in the file Annotator.py, we find self._pvalue_format = PValueFormat(). This suggests that the p-value format can be modified through that attribute. The PValueFormat class has the following configurable parameters:

CONFIGURABLE_PARAMETERS = [
    'correction_format',
    'fontsize',
    'pvalue_format_string',
    'simple_format_string',
    'text_format',
    'pvalue_thresholds',
    'show_test_name'
]

For completeness, here is a modified version of your code. It prints the p-value format configuration before and after the thresholds are changed, and the stars in the saved image change accordingly.

# ! pip install statannotations
from smartprint import smartprint as sprint
from statannotations.Annotator import Annotator
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = sns.load_dataset("tips")
x = "day"
y = "total_bill"
order = ['Sun', 'Thur', 'Fri', 'Sat']
ax = sns.boxplot(data=df, x=x, y=y, order=order)
annot = Annotator(ax, [("Thur", "Fri"), ("Thur", "Sat"), ("Fri", "Sun")], data=df, x=x, y=y, order=order)

print("Before hardcoding pvalue thresholds ")
sprint(annot.get_configuration()["pvalue_format"])


annot.configure(test='Mann-Whitney', text_format='star', loc='outside', verbose=2)
# Overwrite the thresholds directly on the (private) PValueFormat instance.
# Each entry is [upper bound for p, label].
annot._pvalue_format.pvalue_thresholds = [[0.01, '****'], [0.03, '***'], [0.2, '**'], [0.6, '*'], [1, 'ns']]
annot.apply_test()
ax, test_results = annot.annotate()
plt.savefig('example_non-hue_outside.png', dpi=300, bbox_inches='tight')

print("After hardcoding pvalue thresholds ")
sprint(annot.get_configuration()["pvalue_format"])

Output:

Before hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.0001, '****'],
                       [0.001, '***'],
                       [0.01, '**'],
                       [0.05, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

p-value annotation legend:
      ns: p <= 1.00e+00
       *: 2.00e-01 < p <= 6.00e-01
      **: 3.00e-02 < p <= 2.00e-01
     ***: 1.00e-02 < p <= 3.00e-02
    ****: p <= 1.00e-02

Thur vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:6.477e-01 U_stat=6.305e+02
Thur vs. Sat: Mann-Whitney-Wilcoxon test two-sided, P_val:4.690e-02 U_stat=2.180e+03
Sun vs. Fri: Mann-Whitney-Wilcoxon test two-sided, P_val:2.680e-02 U_stat=9.605e+02
After hardcoding pvalue thresholds 
Dict: annot.get_configuration()["pvalue_format"]
Key: Value

{'correction_format': '{star} ({suffix})',
 'fontsize': 'medium',
 'pvalue_format_string': '{:.3e}',
 'pvalue_thresholds': [[0.01, '****'],
                       [0.03, '***'],
                       [0.2, '**'],
                       [0.6, '*'],
                       [1, 'ns']],
 'show_test_name': True,
 'simple_format_string': '{:.2f}',
 'text_format': 'star'}

Image: the boxplot annotated with the new thresholds.
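To see why each comparison gets the label it does, the legend printed above can be applied by hand. The following is a minimal sketch, not the library's own code: label_for is a hypothetical helper that assumes each label applies to p-values up to and including its threshold, as the printed legend suggests. The p-values are copied from the verbose output.

```python
# Thresholds copied from the modified configuration above: [upper bound, label].
thresholds = [[0.01, "****"], [0.03, "***"], [0.2, "**"], [0.6, "*"], [1, "ns"]]


def label_for(p_value, pvalue_thresholds):
    """Return the label of the smallest upper bound that p_value does not exceed.

    Illustrative helper only; it is not part of statannotations.
    """
    for upper, label in sorted(pvalue_thresholds, key=lambda pair: pair[0]):
        if p_value <= upper:
            return label
    raise ValueError(f"p-value {p_value} exceeds every threshold")


# p-values taken from the verbose output above.
reported = {
    "Thur vs. Fri": 6.477e-01,
    "Thur vs. Sat": 4.690e-02,
    "Sun vs. Fri": 2.680e-02,
}

labels = {pair: label_for(p, thresholds) for pair, p in reported.items()}
for pair, label in labels.items():
    print(f"{pair}: {label}")
```

Under these thresholds, Thur vs. Fri falls in the ns band, Thur vs. Sat gets two stars, and Sun vs. Fri gets three.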

Edit: Based on Bonlenfum's comment, the thresholds can also be changed by simply passing pvalue_thresholds as a keyword argument to .configure, as shown below:

annot.configure(test='Mann-Whitney', text_format='star', loc='outside',
                verbose=2,
                pvalue_thresholds=[[0.01, '****'], [0.03, '***'], [0.2, '**'],
                                   [0.6, '*'], [1, 'ns']])
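Since the question asks for a dictionary of thresholds, here is a rough sketch of converting one into the list-of-pairs shape used above, with the values requested in the question (0.0001, 0.00001, 0.000001). The helper to_pvalue_thresholds is hypothetical, and the sketch assumes the library accepts any list shaped like the defaults, ordered from the smallest threshold to a final catch-all entry for 'ns'.

```python
# Desired mapping from the question: p-value upper bound -> label.
desired = {1e-4: "*", 1e-5: "**", 1e-6: "***"}


def to_pvalue_thresholds(mapping, ns_label="ns"):
    """Build a [[threshold, label], ...] list, smallest threshold first,
    ending with a catch-all [1, ns_label] entry.

    Hypothetical helper; it is not part of statannotations.
    """
    pairs = [[threshold, label] for threshold, label in sorted(mapping.items())]
    return pairs + [[1, ns_label]]


pvalue_thresholds = to_pvalue_thresholds(desired)
print(pvalue_thresholds)

# Assumed usage, mirroring the configure call shown above:
# annot.configure(test='Mann-Whitney', text_format='star', loc='outside',
#                 verbose=2, pvalue_thresholds=pvalue_thresholds)
```

With this list, a comparison would need p <= 0.0001 to receive any star at all, so with the p-values reported in the example output all three pairs would be expected to show as ns.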