API Reference
Calculator
- class py50.Calculator(data: DataFrame, name_col: str = None, concentration_col: str = None, response_col: str | list = None)[source]
- __init__(data: DataFrame, name_col: str = None, concentration_col: str = None, response_col: str | list = None)[source]
- calculate_absolute_ic50(name_col: str = None, concentration_col: str = None, response_col: str | list = None, input_units: str = None, verbose: bool = None)[source]
Calculate the absolute IC50. This method performs the calculations previously done in absolute_calculation(); the dictionary results are converted into a pandas DataFrame.
- Parameters:
name_col – str Name column from DataFrame
concentration_col – str Concentration column from DataFrame
response_col – Union[str, list] Response column from DataFrame. Can be a single column (i.e. already a calculated average) or a list of columns to be averaged. The columns will be averaged internally within the function.
input_units – str Units of input dataset. Default is nM.
verbose – bool Output drug concentration units.
- Returns:
DataFrame generated from the list from the absolute_calculation method
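The response_col description above says that a list of replicate columns is averaged internally. The sketch below illustrates that idea with plain Python only; it is not py50's implementation. The helper name `average_replicates` and the column names are hypothetical.

```python
from statistics import mean

def average_replicates(rows, response_cols):
    """Average one or more response columns for each row.

    rows: list of dicts, one per concentration (hypothetical layout).
    response_cols: a single column name or a list of replicate column names.
    """
    if isinstance(response_cols, str):
        # A single column is treated as an already-averaged response.
        response_cols = [response_cols]
    return [mean(row[col] for col in response_cols) for row in rows]

# Two concentrations, each measured in two replicate columns.
table = [
    {"conc": 1, "rep1": 90, "rep2": 95},
    {"conc": 10, "rep1": 40, "rep2": 50},
]
averaged = average_replicates(table, ["rep1", "rep2"])
print(averaged)
```

Passing a single column name returns that column unchanged, which mirrors the "already a calculated average" case described for response_col.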
- calculate_ic50(name_col: str = None, concentration_col: str = None, response_col: str | list = None, input_units: str = None, verbose: bool = None)[source]
Calculate the relative IC50. This method performs the calculations previously done in relative_calculation(); the dictionary results are converted into a pandas DataFrame.
- Parameters:
name_col – str Name column from DataFrame
concentration_col – str Concentration column from DataFrame
response_col – Union[str, list] Response column from DataFrame. Can be a single column (i.e. already a calculated average) or a list of columns to be averaged. The columns will be averaged internally within the function.
input_units – str Units of input dataset. Default is nM.
verbose – bool Output drug concentration units.
- Returns:
DataFrame generated from the list from the relative_calculation method
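The difference between relative and absolute IC50 can be illustrated with a short sketch. This reference does not state which curve model py50 fits, so the example assumes a common four-parameter logistic (4PL) form; the function names and parameter values are hypothetical. Under that assumption, the relative IC50 is the midpoint parameter of the curve, while the absolute IC50 is the concentration at which the fitted response crosses 50%.

```python
def four_pl(x, top, bottom, ic50, hill):
    """Assumed 4PL model: response falls from top to bottom as x increases."""
    return bottom + (top - bottom) / (1.0 + (x / ic50) ** hill)

def absolute_ic50(top, bottom, ic50, hill, target=50.0):
    """Concentration at which the assumed 4PL curve equals `target`.

    Returns None when the curve never reaches the target response.
    """
    if not min(top, bottom) < target < max(top, bottom):
        return None
    # Solve target = bottom + (top - bottom) / (1 + (x / ic50) ** hill) for x.
    ratio = (top - bottom) / (target - bottom) - 1.0
    return ic50 * ratio ** (1.0 / hill)

# Hypothetical fit: the curve plateaus at 20%, so the two values differ.
relative = 10.0  # midpoint parameter of the fit (relative IC50)
absolute = absolute_ic50(top=100.0, bottom=20.0, ic50=relative, hill=1.0)
print(relative, absolute)
```

When the curve spans the full 0 to 100% range, the two values coincide; they diverge when the lower or upper plateau is incomplete.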
- calculate_pic50(name_col: str = None, concentration_col: str = None, response_col: str | list = None, input_units: str = None, verbose: bool = None)[source]
Convert IC50 into pIC50 values. The calculation is performed using absolute_calculation. As such, two columns will be appended: relative pIC50 and absolute pIC50. The conversion is performed by converting the IC50 values from nM to M and then taking the negative base-10 logarithm of the result.
- Parameters:
name_col – str Name column from DataFrame
concentration_col – str Concentration column from DataFrame
response_col – Union[str, list] Response column from DataFrame. Can be a single column (i.e. already a calculated average) or a list of columns to be averaged. The columns will be averaged internally within the function.
input_units – str Units of input dataset. Default is nM.
verbose – bool Output drug concentration units.
- Returns:
DataFrame from calculate_absolute_ic50 along with the pIC50 values
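The conversion described above (nM to M, then the negative base-10 logarithm) can be sketched as follows. This is a minimal standalone illustration, not py50's code; the helper `ic50_to_pic50` and the `UNIT_TO_MOLAR` table are hypothetical names.

```python
import math

# Hypothetical lookup: multiplier that converts each unit to molar (M).
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "µM": 1e-6, "nM": 1e-9}

def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 expressed in mol/L)."""
    if ic50 <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50 * UNIT_TO_MOLAR[unit])

# An IC50 of 10 nM is 1e-8 M, so its pIC50 is approximately 8.
print(ic50_to_pic50(10, "nM"))
```

A higher pIC50 corresponds to a lower IC50, that is, a more potent compound.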
- show(rows: int = None)[source]
Show the DataFrame.
- Parameters:
rows – int Number of rows to display. If None, 5 rows are shown.
- Returns:
DataFrame
PlotCurve
- class py50.PlotCurve(data: DataFrame, name_col: str | None = None, concentration_col: str | None = None, response_col: str | list | None = None)[source]
Bases: object
- __init__(data: DataFrame, name_col: str | None = None, concentration_col: str | None = None, response_col: str | list | None = None)[source]
- curve_plot(concentration_col: str = None, response_col: str | list = None, name_col: str = None, query: str = None, title: str = None, titlesize: int = 16, xlabel: str = None, ylabel: str = None, axis_fontsize: int = 14, conc_unit: str = 'nM', xscale: str = 'log', xscale_ticks: tuple = None, ymax: int = None, ymin: int = None, line_color: str = 'black', line_width: int = 1.5, errorbar: str = 'sd', marker: bool = None, markersize: int = 8, legend: bool = False, legend_loc: str = 'best', box: bool = False, box_color: str = 'gray', box_intercept: int = 50, conc_target: int = None, hline: int = None, hline_color: str = 'gray', vline: int = None, vline_color: str = 'gray', figsize: tuple = (6.4, 4.8), savepath: str = None, verbose: bool = None, **kwargs)[source]
Generate a dose-response curve for a single drug target. Because a data table can contain multiple drugs, the user must specify the target to plot.
- Parameters:
concentration_col – str Concentration column from DataFrame.
response_col – Union[str, list] Response column from DataFrame.
name_col – str Column containing drug name for plotting.
query – str Draw a curve for a specific query in the dataset. Only needed when response_col is a list.
title – str Title of the figure.
titlesize – int Modify plot title font size.
xlabel – str Title of the X-axis.
ylabel – str Title of the Y-axis.
axis_fontsize – int Modify axis label font size.
conc_unit – str Input unit of concentration. Can accept nanomolar (nM) and micromolar (uM or µM). If the units are different, for example in the DataFrame units are in nM, but the units for the graph are µM, the units from the DataFrame will be converted to match the conc_unit input. The final plot will scale based on the conc_unit input. By default, it will assume input concentration will be in nM.
xscale – str Set the scale of the X-axis as logarithmic or linear. It is logarithmic by default.
xscale_ticks – tuple Set the scale of the X-axis
ymax – int Give a set maximum limit for the Y-Axis
ymin – int Give a set minimum limit for the Y-Axis
line_color – str Set the color of the curve. Defaults to ‘black’. Accepts a color name or a hex code.
line_width – int or float Set width of lines in plot.
errorbar – str Set the type of seaborn errorbar to use. Defaults to ‘sd’.
marker – Optional, list Takes a list of point markers.
markersize – int Set the marker size.
legend – Optional, bool Denotes a figure legend.
legend_loc – str Determine legend location. Default is best. Matplotlib options can be found here https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.legend.html
box – Optional, bool Draw a box to highlight a specific location. If box = True, then box_color, box_intercept, and conc_target must also be given.
box_color – str Set color of box. Default is gray.
box_intercept – int Set horizontal location of box. By default, it is set at 50% of the Y-axis.
conc_target – int Set vertical location of the box. By default, this is set to None. For example, if box_intercept is set to 50%, then conc_target must be the Absolute IC50 value. If conc_target is given, it will override box_intercept and the response data will move accordingly. The number must be in the same unit as the X-axis, i.e., if the axis is in µM, then conc_target should be in µM, and vice versa.
hline – int or float Draw a horizontal line that stretches across the length of the plot. Optional; defaults to None.
hline_color – str Set color of horizontal line. Default color is gray.
vline – int or float Draw a vertical line that stretches across the height of the plot. Optional; defaults to None.
vline_color – str Set color of vertical line. Default color is gray.
figsize – tuple Set figure size.
savepath – str File path for save location.
verbose – bool Output information about the plot.
- Returns:
Figure
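The conc_unit description above says that concentrations stored in one unit are converted to match the unit requested for the plot. A minimal sketch of that kind of conversion is shown below. It is an illustration only, not py50's internal code; `convert_concentration` and `NM_PER_UNIT` are hypothetical names.

```python
# Hypothetical lookup: how many nM make up one of each supported unit.
NM_PER_UNIT = {"nM": 1.0, "uM": 1000.0, "µM": 1000.0}

def convert_concentration(values, from_unit="nM", to_unit="nM"):
    """Convert a list of concentrations between nM and µM."""
    return [v * NM_PER_UNIT[from_unit] / NM_PER_UNIT[to_unit] for v in values]

# Data recorded in nM, plotted on a µM axis.
doses_nm = [1, 10, 100, 1000, 10000]
doses_um = convert_concentration(doses_nm, from_unit="nM", to_unit="µM")
print(doses_um)
```

Values passed to conc_target, hline, or vline would need the same treatment by hand, since they are expected in the unit of the plotted axis.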
- grid_curve_plot(concentration_col: str = None, response_col: str = None, name_col: str = None, column_num: int = 2, title: str = None, titlesize: int = 20, xlabel: str = None, ylabel: str = None, conc_unit: str = 'nM', xscale: str = 'log', xscale_ticks: tuple = None, ymax: int = None, ymin: int = None, line_color: list = ('#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7'), line_width: int = 1.5, box: bool = False, box_color: str = 'gray', box_intercept: int = 50, hline: int = None, hline_color: str = 'gray', vline: int = None, vline_color: str = 'gray', figsize: tuple = (8.4, 4.8), savepath: str = None, verbose: bool = None, **kwargs)[source]
Generate dose-response curves for multiple drugs. Each curve is placed in its own plot, and the plots are arranged in a grid.
- Parameters:
concentration_col – str Concentration column from DataFrame
response_col – str Response column from DataFrame
name_col – str Name column from DataFrame
column_num – int Set number of column grid
title – str Title of the figure
titlesize – int Modify plot title font size
xlabel – str Title of the X-axis
ylabel – str Title of the Y-axis
ymax – int Give a set maximum limit for the Y-Axis
ymin – int Give a set minimum limit for the Y-Axis
conc_unit – str Unit of concentration for the X-axis. Input concentrations are assumed to be in nM. If conc_unit is given as µM, they are converted to µM; if conc_unit is given as nM (the default), no conversion is performed.
xscale – str Set the scale of the X-axis as logarithmic or linear. It is logarithmic by default.
xscale_ticks – tuple Set the scale of the X-axis
line_color – list Takes a list of colors. By default, it uses the CBPALETTE. List can contain name of colors or colors in hex code.
line_width – int Set width of lines in plot.
errorbar – str Set the type of seaborn errorbar to use. Defaults to ‘sd’.
box – bool Draw a box to highlight a specific location. If box = True, then box_color and box_intercept must also be given.
box_color – str Set color of box. Default color is gray.
box_intercept – int Set horizontal location of box. By default, it is set at Absolute IC50.
hline – int or float Draw a horizontal line that stretches across the length of the plot. Optional; defaults to None.
hline_color – str Set color of horizontal line. Default color is gray.
vline – int or float Draw a vertical line that stretches across the height of the plot. Optional; defaults to None.
vline_color – str Set color of vertical line. Default color is gray.
figsize – tuple Set figure size for subplot.
savepath – str File path for save location.
verbose – bool Output information about the plot.
- Returns:
Figure
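The column_num parameter fixes the number of grid columns, so the number of rows follows from how many drugs are plotted. The sketch below shows one way to derive the grid shape and the cell for each plot. It assumes a row-major fill, which this reference does not confirm; `grid_shape` and `grid_position` are hypothetical helpers.

```python
import math

def grid_shape(n_plots, column_num=2):
    """Return (rows, columns) needed to hold n_plots subplots."""
    if n_plots < 1 or column_num < 1:
        raise ValueError("n_plots and column_num must be positive")
    return math.ceil(n_plots / column_num), column_num

def grid_position(index, column_num=2):
    """Return the (row, column) cell for the plot at `index`, filling row by row."""
    return divmod(index, column_num)

# Five drugs in a two-column grid need three rows; the last cell stays empty.
shape = grid_shape(5, column_num=2)
last_cell = grid_position(4, column_num=2)
print(shape, last_cell)
```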
- multi_curve_plot(concentration_col: str = None, response_col: str = None, name_col: str = None, title: str = None, titlesize: int = 12, xlabel: str = None, ylabel: str = None, conc_unit: str = 'nM', xscale: str = 'log', xscale_ticks: tuple = None, ymax: int = None, ymin: int = None, axis_fontsize: int = 10, line_color: list = ('#000000', '#E69F00', '#56B4E9', '#009E73', '#F0E442', '#0072B2', '#D55E00', '#CC79A7'), marker: list = ('o', '^', 's', 'D', 'v', '<', '>', 'p'), markersize: int = 8, line_width: int = 1.5, errorbar: str = 'sd', legend: bool = False, legend_loc: str = 'best', box_target: str = None, box_color: str = 'gray', box_intercept: int = 50, hline: int = None, hline_color: str = 'gray', vline: int = None, vline_color: str = 'gray', figsize: tuple = (6.4, 4.8), savepath: str = None, verbose: bool = None, **kwargs)[source]
Generate a dose-response plot for multiple drug targets. Curves will be placed into a single plot.
- Parameters:
concentration_col – str Concentration column from DataFrame
response_col – str Response column from DataFrame
name_col – str Column containing name of drug from DataFrame
title – str Title of the figure
titlesize – int Modify plot title font size
xlabel – str Title of the X-axis
ylabel – str Title of the Y-axis
conc_unit – str Unit of concentration for the X-axis. Input concentrations are assumed to be in nM. If conc_unit is given as µM, they are converted to µM; if conc_unit is given as nM (the default), no conversion is performed.
xscale – str Set the scale of the X-axis as logarithmic or linear. It is logarithmic by default.
xscale_ticks – tuple Set the scale of the X-axis
ymax – int Give a set maximum limit for the Y-Axis
ymin – int Give a set minimum limit for the Y-Axis
axis_fontsize – int Modify axis label font size
line_color – list Takes a list of colors. By default, it uses the CBPALETTE. List can contain name of colors or colors in hex code.
markersize – int Set the marker size.
line_width – int or float Set width of lines in plot.
errorbar – str Set the type of seaborn errorbar to use. Defaults to ‘sd’.
marker – list Takes a list for point markers. Marker options can be found here: https://matplotlib.org/stable/api/markers_api.html
legend – bool Denotes a figure legend.
legend_loc – str Determine legend location. Matplotlib options can be found here https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.legend.html
box_target – str Draw a box to highlight a specific drug curve. Must use specific drug name.
box_color – str Set color of box. Default color is gray.
box_intercept – int Set horizontal location of box. By default, it is set at Absolute IC50.
hline – int or float Draw a horizontal line that stretches across the length of the plot. Optional; defaults to None.
hline_color – str Set color of horizontal line. Default color is gray.
vline – int or float Draw a vertical line that stretches across the height of the plot. Optional; defaults to None.
vline_color – str Set color of vertical line. Default color is gray.
figsize – tuple Set figure size.
savepath – str File path for save location.
verbose – bool Output information about the plot.
- Returns:
Figure
Stats
- class py50.Stats(data)[source]
Class containing wrappers for the pingouin module. The functions output data as a pandas DataFrame, in the format needed for plotting with the functions in class Plots(). They can also be used on their own to produce a single DataFrame for export as a csv or xlsx file using pandas.
- static explain_significance()[source]
Print out a DataFrame containing explanations for star values. This is used for reference. See [GraphPad](https://www.graphpad.com/support/faq/what-is-the-meaning-of--or--or--in-reports-of-statistical-significance-from-prism-or-instat/)
- Returns:
pandas.DataFrame
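For orientation, the star notation can be sketched as a simple threshold lookup. The cutoffs below follow the convention GraphPad describes for Prism (ns, *, **, ***, ****); whether py50 uses exactly these thresholds is an assumption here, and `p_to_stars` is a hypothetical helper rather than a py50 function.

```python
def p_to_stars(p):
    """Map a p-value to a significance label using GraphPad-style cutoffs."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    # Check the strictest threshold first so the most stars win.
    for threshold, label in ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")):
        if p <= threshold:
            return label
    return "ns"

labels = [p_to_stars(p) for p in (0.2, 0.03, 0.004, 0.0005, 0.00001)]
print(labels)
```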
- get_anova(value_col=None, group_col=None, **kwargs)[source]
One-way and N-way ANOVA.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String or list of strings Name of column containing the grouping variable.
kwargs – optional Other options available with [pingouin.anova()](https://pingouin-stats.org/build/html/generated/pingouin.anova.html)
- Returns:
Pandas.DataFrame
- get_cochran(value_col=None, group_col=None, subgroup_col=None)[source]
Calculate the Cochran Q test. This is used when the dependent variable, or value_col, is binary. For details between groups, a post hoc test will be needed.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the within factor.
subgroup_col – String Name of column containing the subject identifier.
- Returns:
Pandas.DataFrame
- get_friedman(group_col=None, value_col=None, subgroup_col=None, method='chisq')[source]
Calculate the Friedman test. Determines if distributions of two or more paired samples are equal. For details between groups, a post hoc test (get_pairwise_tests(parametric=False)) will be needed.
- Parameters:
value_col – String Name of column containing the dependent variable
group_col – String Name of column containing the between-subject factor.
subgroup_col – String Name of column containing the subject/rater identifier
method – String Statistical test to perform. Must be ‘chisq’ (chi-square test) or ‘f’ (F test). See Pingouin documentation for further details
- Returns:
Pandas.DataFrame
- get_gameshowell(value_col=None, group_col=None, effsize='hedges')[source]
Pairwise Games-Howell post-hoc test
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the between factor.
effsize – String or None Effect size. Additional methods can be found with [pingouin.pairwise_gameshowell()](https://pingouin-stats.org/build/html/generated/pingouin.pairwise_gameshowell.html)
- Returns:
Pandas.DataFrame
- get_homoscedasticity(value_col=None, group_col=None, method='levene', **kwargs)[source]
Test for equality of variances (homoscedasticity).
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the grouping variable.
method – String Statistical test. ‘levene’ (default). Additional tests can be found with [pingouin.homoscedasticity()](https://pingouin-stats.org/build/html/generated/pingouin.homoscedasticity.html#pingouin.homoscedasticity)
kwargs – optional Other options available with pingouin.homoscedasticity()
- Returns:
Pandas.DataFrame
- get_kruskal(value_col=None, group_col=None, detailed=False)[source]
Calculate Kruskal-Wallis H-test for independent samples.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the between factor.
detailed – Boolean Output additional details from Kruskal-Wallis H-test.
- Returns:
Pandas.DataFrame
- get_mannu(value_col=None, group_col=None, subgroup_col=None, alternative='two-sided', **kwargs)[source]
Calculate Mann-Whitney U Test. This is a non-parametric version of the independent T-test.
- Parameters:
self – pandas.DataFrame Input DataFrame.
value_col – String Columns containing values for testing.
group_col – String Column containing group name.
subgroup_col – String Column containing subgroup name.
alternative – String Defines the alternative hypothesis, or tail of the test. Must be one of “two-sided” (default), “greater” or “less”.
kwargs – Optional Other options available with [pingouin.mwu()](https://pingouin-stats.org/build/html/generated/pingouin.mwu.html)
- Returns:
Pandas.DataFrame
- get_mixed_anova(value_col=None, group_col=None, within_subject_col=None, subject_col=None, **kwargs)[source]
Mixed-design ANOVA.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the between factor.
within_subject_col – String Name of column containing the within-subject factor (repeated measurements).
subject_col – String Name of column containing the between-subject identifier.
kwargs – optional Other options available with [pingouin.mixed_anova()](https://pingouin-stats.org/build/html/generated/pingouin.mixed_anova.html)
- Returns:
Pandas.DataFrame
- get_normality(value_col=None, group_col=None, method='shapiro', **kwargs)[source]
Test the normality of the dataset.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the grouping variable.
method – String Normality test. ‘shapiro’ (default). Additional tests can be found with [pingouin.normality()](https://pingouin-stats.org/build/html/generated/pingouin.normality.html)
kwargs – optional Other options available with pingouin.normality()
- Returns:
Pandas.DataFrame
- static get_p_matrix(data, test=None, group_col1=None, group_col2=None, order=None)[source]
Convert dataframe of statistic results into a matrix. Group columns must be indicated. Group 2 is optional and depends on test used (i.e. pairwise vs Mann-Whitney U). Final DataFrame output can be used with the Plots.p_matrix() function to generate a heatmap of p-values.
- Parameters:
data – pandas.DataFrame Input DataFrame. Must be of already computed test results.
group_col1 – String Name of column containing the group
group_col2 – String Name of column containing the second group. This variable is optional.
test – String Name of the test used to calculate statistics.
order – List or String == “alpha” Reorder the groups for the final table. If input is string “alpha”, the order of the groups will be alphabetized.
- Returns:
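The reshaping that get_p_matrix describes, turning a table of pairwise results into a square matrix of p-values, can be sketched with plain Python. This is an illustration of the idea only, not py50's implementation; `pairwise_to_matrix`, the tuple layout, and the choice of 1.0 on the diagonal are assumptions.

```python
def pairwise_to_matrix(rows, order=None):
    """Build a symmetric p-value matrix from (group_a, group_b, p_value) tuples.

    rows: list of tuples, one per pairwise comparison.
    order: None or "alpha" for alphabetical labels, or an explicit list
           that names every group.
    """
    groups = {g for a, b, _ in rows for g in (a, b)}
    labels = sorted(groups) if order in (None, "alpha") else list(order)
    # Diagonal set to 1.0 (a group compared with itself); missing pairs stay None.
    matrix = {r: {c: (1.0 if r == c else None) for c in labels} for r in labels}
    for a, b, p in rows:
        matrix[a][b] = p
        matrix[b][a] = p
    return labels, matrix

results = [("Drug A", "Drug B", 0.03), ("Drug A", "Drug C", 0.4), ("Drug B", "Drug C", 0.001)]
labels, matrix = pairwise_to_matrix(results, order="alpha")
for row in labels:
    print(row, [matrix[row][col] for col in labels])
```

A matrix in this shape is what a heatmap function expects: one row and one column per group, with each cell holding the p-value for that pair.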
- get_pairwise_mixed(value_col=None, group_col=None, within_subject_col=None, subject_col=None, parametric=True, **kwargs)[source]
Posthoc test for mixed ANOVA.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String or list with 2 elements Name of column containing the between-subject factors.
within_subject_col – String or list with 2 elements Name of column containing the within-subject identifier.
subject_col – String Name of column containing the subject identifier. This is mandatory if subgroup_col is used.
parametric – Boolean If True (default), use the parametric ttest() function. If False, use [pingouin.wilcoxon()](https://pingouin-stats.org/build/html/generated/pingouin.wilcoxon.html#pingouin.wilcoxon) or [pingouin.mwu()](https://pingouin-stats.org/build/html/generated/pingouin.mwu.html#pingouin.mwu) for paired or unpaired samples, respectively.
kwargs – dict Additional keyword arguments that are passed to [pingouin.pairwise_tests()](https://pingouin-stats.org/build/html/generated/pingouin.pairwise_tests.html#pingouin.pairwise_tests).
- Returns:
pandas.DataFrame
- get_pairwise_rm(value_col=None, group_col=None, within_subject_col=None, subject_col=None, parametric=True, **kwargs)[source]
Posthoc test for repeated measures.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String or list with 2 elements Name of column containing the between-subject factors.
within_subject_col – String or list with 2 elements Name of column containing the within-subject identifier.
subject_col – String Name of column containing the subject identifier. This is mandatory if subgroup_col is used.
parametric – Boolean If True (default), use the parametric ttest() function. If False, use [pingouin.wilcoxon()](https://pingouin-stats.org/build/html/generated/pingouin.wilcoxon.html#pingouin.wilcoxon) or [pingouin.mwu()](https://pingouin-stats.org/build/html/generated/pingouin.mwu.html#pingouin.mwu) for paired or unpaired samples, respectively.
kwargs – dict Additional keyword arguments that are passed to [pingouin.pairwise_tests()](https://pingouin-stats.org/build/html/generated/pingouin.pairwise_tests.html#pingouin.pairwise_tests).
- Returns:
pandas.DataFrame
- get_pairwise_tests(value_col=None, group_col=None, within_subject_col=None, subject_col=None, parametric=True, **kwargs)[source]
Posthoc test for parametric or nonparametric statistics. By default, the parametric parameter is set as True.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String or list with 2 elements Name of column containing the between-subject factors.
within_subject_col – String or list with 2 elements Name of column containing the within-subject identifier.
subject_col – String Name of column containing the subject identifier. This is mandatory if subgroup_col is used.
parametric – Boolean If True (default), use the parametric ttest() function. If False, use [pingouin.wilcoxon()](https://pingouin-stats.org/build/html/generated/pingouin.wilcoxon.html#pingouin.wilcoxon) or [pingouin.mwu()](https://pingouin-stats.org/build/html/generated/pingouin.mwu.html#pingouin.mwu) for paired or unpaired samples, respectively.
kwargs – dict Additional keyword arguments that are passed to [pingouin.pairwise_tests()](https://pingouin-stats.org/build/html/generated/pingouin.pairwise_tests.html#pingouin.pairwise_tests).
- Returns:
pandas.DataFrame
- get_rm_anova(value_col=None, within_subject_col=None, subject_col=None, correction='auto', detailed=False, effsize='ng2')[source]
One-way and two-way repeated measures ANOVA.
- Parameters:
value_col – String Name of column containing the dependent variable.
within_subject_col – String Name of column containing the within factor.
subject_col – String Name of column containing the subject identifier.
correction – String or Boolean If True, also return the Greenhouse-Geisser corrected p-value.
detailed – Boolean If True, return full ANOVA table.
effsize – String Effect size.
- Returns:
Pandas.DataFrame
- get_tukey(value_col=None, group_col=None, effsize='hedges')[source]
Pairwise Tukey post-hoc test.
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the between factor.
effsize – String or None Effect size. Additional methods can be found with [pingouin.pairwise_tukey()](https://pingouin-stats.org/build/html/generated/pingouin.pairwise_tukey.html)
- Returns:
Pandas.DataFrame
- get_welch_anova(value_col=None, group_col=None)[source]
One-way Welch ANOVA
- Parameters:
value_col – String Name of column containing the dependent variable.
group_col – String Name of column containing the grouping variable.
- Returns:
Pandas.DataFrame
- get_wilcoxon(value_col=None, group_col=None, subgroup_col=None, alternative='two-sided', **kwargs)[source]
Calculate the Wilcoxon test. This is a non-parametric version of the paired T-test. Each group must contain the same number of data points for the test to work.
- Parameters:
value_col – String Columns containing values for testing.
group_col – String Column containing group name.
subgroup_col – String Column containing subgroup name.
alternative – String Defines the alternative hypothesis, or tail of the test. Must be one of “two-sided” (default), “greater” or “less”.
kwargs – Optional Other options available with [pingouin.wilcoxon()](https://pingouin-stats.org/build/html/generated/pingouin.wilcoxon.html)
- Returns:
Pandas.DataFrame
Plots
- class py50.Plots(data)[source]
- barplot(test=None, group_col=None, value_col=None, group_order=None, subgroup_col=None, subject_col=None, within_subject_col=None, pairs=None, pvalue_label=None, hide_ns=False, palette=None, orient='v', loc='inside', errorbar='sd', capsize=0.1, return_df=None, **kwargs)[source]
Draw a barplot from the input DataFrame.
- Parameters:
test – String Name of test for calculations. Names must match the test names from the py50.Stats()
group_col – String Name of column containing groups. This should be the between-subject factor, depending on the selected test.
value_col – String Name of the column containing the values. This is the dependent variable.
group_order – List. Place the groups in a specific order on the plot.
subgroup_col – String Name of the column containing the subgroup for the group column. This is associated with the hue parameters in Seaborn.
subject_col – String Name of the column containing the subject column.
within_subject_col – String Name of the column containing the within subject column.
pairs – List A list containing specific pairings for annotation on the plot.
pvalue_label – List A list containing specific p-value labels. Its order and length must match the pairs list.
hide_ns – bool Automatically hide groups with no significance from plot.
palette – String or List. Color palette used for the plot. Can be given as common color name or in hex code.
orient – String Orientation of the plot. Only “v” (vertical) and “h” (horizontal) are supported.
loc – String Set location of annotations. Only “inside” or “outside” are supported.
errorbar – String Set confidence interval on plot.
capsize – Int or float Set cap size on plot.
return_df – Boolean Returns a DataFrame of calculated results. If pairs used, only return rows with annotated pairs.
- Returns:
- boxenplot(test=None, group_col=None, value_col=None, group_order=None, subgroup_col=None, subject_col=None, within_subject_col=None, pairs=None, pvalue_label=None, hide_ns=False, palette=None, orient='v', loc='inside', return_df=None, **kwargs)[source]
Draw a boxenplot from the input DataFrame.
- Parameters:
test – String Name of test for calculations. Names must match the test names from the py50.Stats()
group_col – String Name of column containing groups. This should be the between-subject factor, depending on the selected test.
value_col – String Name of the column containing the values. This is the dependent variable.
group_order – List. Place the groups in a specific order on the plot.
subgroup_col – String Name of the column containing the subgroup for the group column. This is associated with the hue parameters in Seaborn.
subject_col – String Name of the column containing the subject column.
within_subject_col – String Name of the column containing the within subject column.
pairs – List A list containing specific pairings for annotation on the plot.
pvalue_label – List A list containing specific p-value labels. Its order and length must match the pairs list.
hide_ns – bool Automatically hide groups with no significance from plot.
palette – String or List. Color palette used for the plot. Can be given as common color name or in hex code.
orient – String Orientation of the plot. Only “v” (vertical) and “h” (horizontal) are supported.
loc – String Set location of annotations. Only “inside” or “outside” are supported.
return_df – Boolean Returns a DataFrame of calculated results. If pairs used, only return rows with annotated pairs.
- Returns:
- boxplot(test=None, group_col=None, value_col=None, group_order=None, subgroup_col=None, subject_col=None, within_subject_col=None, pairs=None, pvalue_label=None, hide_ns=False, palette=None, orient='v', loc='inside', whis=1.5, return_df=None, **kwargs)[source]
Draw a boxplot from the input DataFrame.
- Parameters:
test – String Name of test for calculations. Names must match the test names from the py50.Stats()
group_col – String Name of column containing groups. This should be the between-subject factor, depending on the selected test.
value_col – String Name of the column containing the values. This is the dependent variable.
group_order – List. Place the groups in a specific order on the plot.
subgroup_col – String Name of the column containing the subgroup for the group column. This is associated with the hue parameters in Seaborn.
subject_col – String Name of the column containing the subject column.
within_subject_col – String Name of the column containing the within subject column.
pairs – List A list containing specific pairings for annotation on the plot.
pvalue_label – List A list containing specific p-value labels. Its order and length must match the pairs list.
hide_ns – bool Automatically hide groups with no significance from plot.
palette – String or List. Color palette used for the plot. Can be given as common color name or in hex code.
orient – String Orientation of the plot. Only “v” (vertical) and “h” (horizontal) are supported.
loc – String Set location of annotations. Only “inside” or “outside” are supported.
whis – Int or float Set length of whiskers on plot.
return_df – Boolean Returns a DataFrame of calculated results. If pairs used, only return rows with annotated pairs.
- Returns:
Fig
- ci_plot(data: Optional = None, value_col: str = None, group_col: str = None, alpha: float = 0.05, title: str = 'Tukey HSD Confidence Intervals', xlabel: str = None, ylabel: str = None, linewidth: float = 1.5, figsize: tuple = (8, 6), return_stats: bool = False)[source]
Generate a confidence interval plot. The plot utilizes the Tukey Honest Significant Difference (HSD) test and is a wrapper for statsmodels (https://www.statsmodels.org/dev/index.html). ANOVA will also be calculated and its p-value will be plotted alongside the title.
- Parameters:
data – Optional Input dataset.
value_col – str Name of the column containing the dependent variable.
group_col – str Name of the column containing the groups.
alpha – float The significance level for the test.
title – str Set the title for the figure. Defaults to “Tukey HSD Confidence Intervals”.
xlabel – str Set the label for the x-axis. If None is given, defaults to the value_col input.
ylabel – str Set the label for the y-axis. If None is given, defaults to the group_col input.
linewidth – float Set the width of the lines.
figsize – tuple Set the figure size. Defaults to (8,6).
return_stats – bool Whether to return the Tukey HSD test.
- Returns:
- distribution(val_col=None, type='histplot', **kwargs)[source]
- Parameters:
self – Pandas.Dataframe Input data.
val_col – String The name of the column containing the dependent variable.
type – String The type of figure drawn. For distribution, only “histplot” and “qqplot” are supported.
kwargs – Optional keyword arguments for seaborn or pg.qqplot.
- Returns:
figure
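Example (illustrative, not part of py50): to show what a “qqplot” compares, the sketch below pairs sorted sample values with standard-normal quantiles using only the standard library. The helper name `normal_qq_points` and the plotting positions (i - 0.5) / n are assumptions. The plotting library py50 delegates to may use a different convention.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Pair each sorted sample value with a standard-normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Plotting positions (i - 0.5) / n keep probabilities strictly in (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical sample of three values.
points = normal_qq_points([3.0, 1.0, 2.0])
```

If the sample is roughly normal, the resulting (theoretical, observed) points fall close to a straight line.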
- p_matrix(data=None, cmap=None, title=None, titlesize=14, linewidths=0.01, linecolor='gray', **kwargs)[source]
Wrapper function for the scikit-posthocs heatmap.
- Parameters:
data – pandas.DataFrame Input table. Must be a matrix calculated using stats.get_p_matrix(). Optional.
cmap – List A list of colors. Can be color names or hex codes.
title – String Input title for figure.
titlesize – Int Set the font size of the figure title.
linewidths – Int Set line width of figure.
linecolor – String Set line color. Can be color name or hex code.
kwargs – Optional Keyword arguments associated with [scikit-posthocs](https://scikit-posthocs.readthedocs.io/en/latest/)
- Returns:
Pyplot figure
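Example (illustrative, not part of py50): p_matrix expects a square table of pairwise p-values. The sketch below shows the general shape of such a table by building a symmetric group-by-group mapping from pairwise results. The helper `build_p_matrix`, the (group, group, p) tuple layout, and the diagonal value of 1.0 are assumptions. The actual output of stats.get_p_matrix() may differ.

```python
def build_p_matrix(pairwise, diagonal=1.0):
    """Build a symmetric group-by-group matrix from (a, b, p) tuples."""
    # Collect group names, preserving first-seen order.
    groups = list(dict.fromkeys(g for a, b, _ in pairwise for g in (a, b)))
    # Start with the diagonal filled and every other cell empty (None).
    matrix = {
        row: {col: (diagonal if row == col else None) for col in groups}
        for row in groups
    }
    # Mirror each p-value across the diagonal so the matrix is symmetric.
    for a, b, p in pairwise:
        matrix[a][b] = p
        matrix[b][a] = p
    return groups, matrix


# Hypothetical pairwise results for three groups.
results = [("ctrl", "low", 0.20), ("ctrl", "high", 0.001), ("low", "high", 0.03)]
groups, matrix = build_p_matrix(results)
```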
- stripplot(test=None, group_col=None, value_col=None, group_order=None, subgroup_col=None, subject_col=None, within_subject_col=None, pairs=None, pvalue_label=None, hide_ns=False, palette=None, orient='v', loc='inside', return_df=None, **kwargs)[source]
Draw a stripplot from the input DataFrame.
- Parameters:
test – String Name of test for calculations. Names must match the test names from py50.Stats().
group_col – String Name of the column containing the groups. This corresponds to the “between” factor, depending on the selected test.
value_col – String Name of the column containing the values. This is the dependent variable.
group_order – List. Place the groups in a specific order on the plot.
subgroup_col – String Name of the column containing the subgroup for the group column. This is associated with the hue parameters in Seaborn.
subject_col – String Name of the column containing the subject column.
within_subject_col – String Name of the column containing the within subject column.
pairs – List A list containing specific pairings for annotation on the plot.
pvalue_label – List. A list containing specific p-value labels. Its order and length must match the pairs list.
hide_ns – bool Automatically hide groups with no significance from plot.
palette – String or List. Color palette used for the plot. Can be given as common color names or hex codes.
orient – String Orientation of the plot. Only “v” (vertical) and “h” (horizontal) are supported.
loc – String Set location of annotations. Only “inside” or “outside” are supported.
return_df – Boolean Returns a DataFrame of calculated results. If pairs are used, only rows with annotated pairs are returned.
- Returns:
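Example (illustrative, not part of py50): the pairs and pvalue_label parameters must line up one to one. The sketch below generates candidate pairs and checks a label list against them before plotting. The helper names and the use of tuples for each pairing are assumptions. py50's own validation, if any, is not shown in this reference.

```python
from itertools import combinations


def all_pairs(groups):
    """Return every unordered pairing of the given group names."""
    return list(combinations(groups, 2))


def match_labels(pairs, pvalue_label):
    """Zip pairs with labels, refusing lists of different lengths."""
    if len(pairs) != len(pvalue_label):
        raise ValueError(
            f"{len(pvalue_label)} labels given for {len(pairs)} pairs"
        )
    return list(zip(pairs, pvalue_label))


# Hypothetical group names and labels.
pairs = all_pairs(["ctrl", "low", "high"])
labeled = match_labels(pairs, ["ns", "**", "*"])
```

Checking the lengths up front gives a clearer error than a mislabeled or partially annotated figure.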
- swarmplot(test=None, group_col=None, value_col=None, group_order=None, subgroup_col=None, subject_col=None, within_subject_col=None, pairs=None, pvalue_label=None, hide_ns=False, palette=None, orient='v', loc='inside', return_df=None, **kwargs)[source]
Draw a swarm plot from the input DataFrame.
- Parameters:
test – String Name of test for calculations. Names must match the test names from py50.Stats().
group_col – String Name of the column containing the groups. This corresponds to the “between” factor, depending on the selected test.
value_col – String Name of the column containing the values. This is the dependent variable.
group_order – List. Place the groups in a specific order on the plot.
subgroup_col – String Name of the column containing the subgroup for the group column. This is associated with the hue parameters in Seaborn.
subject_col – String Name of the column containing the subject column.
within_subject_col – String Name of the column containing the within subject column.
pairs – List A list containing specific pairings for annotation on the plot.
pvalue_label – List. A list containing specific p-value labels. Its order and length must match the pairs list.
hide_ns – bool Automatically hide groups with no significance from plot.
palette – String or List. Color palette used for the plot. Can be given as common color names or hex codes.
orient – String Orientation of the plot. Only “v” (vertical) and “h” (horizontal) are supported.
loc – String Set location of annotations. Only “inside” or “outside” are supported.
return_df – Boolean Returns a DataFrame of calculated results. If pairs are used, only rows with annotated pairs are returned.
- Returns:
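Example (illustrative, not part of py50): hide_ns removes non-significant comparisons from the annotations. The sketch below shows one common way to turn p-values into star labels and drop the “ns” entries. The thresholds, the helper names, and the 0.05 cutoff are conventional assumptions and are not taken from py50's source.

```python
def significance_label(p):
    """Map a p-value to a star label using common (assumed) thresholds."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


def drop_non_significant(pairs, pvalues, alpha=0.05):
    """Keep only (pair, label) entries whose p-value is at most alpha."""
    return [
        (pair, significance_label(p))
        for pair, p in zip(pairs, pvalues)
        if p <= alpha
    ]


# Hypothetical comparisons and p-values.
pairs = [("ctrl", "low"), ("ctrl", "high"), ("low", "high")]
kept = drop_non_significant(pairs, [0.20, 0.0005, 0.03])
```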
- violinplot(test=None, group_col=None, value_col=None, group_order=None, subgroup_col=None, subject_col=None, within_subject_col=None, pairs=None, pvalue_label=None, hide_ns=False, palette=None, orient='v', loc='inside', return_df=None, **kwargs)[source]
Draw a violinplot from the input DataFrame.
- Parameters:
test – String Name of test for calculations. Names must match the test names from py50.Stats().
group_col – String Name of the column containing the groups. This corresponds to the “between” factor, depending on the selected test.
value_col – String Name of the column containing the values. This is the dependent variable.
group_order – List. Place the groups in a specific order on the plot.
subgroup_col – String Name of the column containing the subgroup for the group column. This is associated with the hue parameters in Seaborn.
subject_col – String Name of the column containing the subject column.
within_subject_col – String Name of the column containing the within subject column.
pairs – List A list containing specific pairings for annotation on the plot.
pvalue_label – List. A list containing specific p-value labels. Its order and length must match the pairs list.
hide_ns – bool Automatically hide groups with no significance from plot.
palette – String or List. Color palette used for the plot. Can be given as common color names or hex codes.
orient – String Orientation of the plot. Only “v” (vertical) and “h” (horizontal) are supported.
loc – String Set location of annotations. Only “inside” or “outside” are supported.
return_df – Boolean Returns a DataFrame of calculated results. If pairs are used, only rows with annotated pairs are returned.
- Returns:
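Example (illustrative, not part of py50): with return_df and pairs both set, only rows for the annotated pairs are returned. The sketch below mimics that filtering on plain dictionaries. The column names "A", "B", and "p" and the helper `rows_for_pairs` are hypothetical, and treating ("low", "ctrl") the same as ("ctrl", "low") is an assumption about how pairs are matched.

```python
def rows_for_pairs(rows, pairs):
    """Keep result rows whose two groups form one of the requested pairs."""
    # frozenset makes the comparison independent of the order within a pair.
    wanted = {frozenset(pair) for pair in pairs}
    return [row for row in rows if frozenset((row["A"], row["B"])) in wanted]


# Hypothetical post-hoc results, one row per comparison.
results = [
    {"A": "ctrl", "B": "low", "p": 0.20},
    {"A": "ctrl", "B": "high", "p": 0.001},
    {"A": "low", "B": "high", "p": 0.03},
]
subset = rows_for_pairs(results, [("high", "ctrl")])
```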