Statistical comparisons

lisa.stats.series_mean_stats(series, kind, confidence_level=0.95)[source]

Compute the mean along with a confidence interval based on the T-score.

Returns:

A tuple with:

  1. The mean

  2. The standard deviation, or its equivalent

  3. The standard error of the mean, or its equivalent (Harmonic Standard Error, Geometric Standard Error).

  4. The interval, as a 2-tuple of +/- values

Parameters:
  • kind (str) – Kind of mean to use: 'arithmetic', 'harmonic' or 'geometric'.

  • confidence_level (float) – Confidence level of the confidence interval.
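For the arithmetic case, the T-score interval can be sketched directly with scipy. This is an illustration only, not the LISA implementation; series_mean_stats additionally handles the harmonic and geometric equivalents:

```python
import numpy as np
from scipy import stats

def t_mean_ci(values, confidence_level=0.95):
    # Arithmetic mean with a two-sided T-score confidence interval.
    # Sketch of the arithmetic case only (hypothetical helper).
    values = np.asarray(values, dtype=float)
    n = len(values)
    mean = values.mean()
    std = values.std(ddof=1)               # sample standard deviation
    sem = std / np.sqrt(n)                 # standard error of the mean
    t = stats.t.ppf((1 + confidence_level) / 2, df=n - 1)
    return mean, std, sem, (t * sem, t * sem)  # interval as +/- values

mean, std, sem, (minus, plus) = t_mean_ci([42, 43, 41, 44, 42])
```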

lisa.stats.guess_mean_kind(unit, control_var)[source]

Guess which kind of mean should be used to summarize results in the given unit.

Returns:

'arithmetic' if an arithmetic mean should be used, or 'harmonic'. Use of the geometric mean cannot be inferred by this function.

Parameters:
  • unit (str) – Unit of the values, e.g. 'km/h'.

  • control_var (str) – Control variable, i.e. variable that is fixed during the experiment. For example, in a car speed experiment, the control variable could be the distance (fixed distance), or the time. In that case, we would have unit='km/h' and control_var='h' if the time was fixed, or control_var='km' if the distance was fixed.
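The car example above shows why the control variable matters: with a fixed distance, the correct summary of 'km/h' values is the harmonic mean, since the arithmetic mean over-weights the fast laps. A self-contained check (not LISA code):

```python
import numpy as np
from scipy.stats import hmean

# Two laps over the same fixed distance (control_var='km'), in 'km/h'.
speeds = [60, 120]
fixed_distance = 1.0  # km per lap

total_distance = fixed_distance * len(speeds)
total_time = sum(fixed_distance / v for v in speeds)   # hours
true_average_speed = total_distance / total_time       # 80 km/h, not 90

assert np.isclose(hmean(speeds), true_average_speed)
assert np.mean(speeds) == 90  # the arithmetic mean overestimates
```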

class lisa.stats.Stats(df, value_col='value', ref_group=None, filter_rows=None, compare=True, agg_cols=None, mean_ci_confidence=None, stats=None, stat_col='stat', unit_col='unit', ci_cols=('ci_minus', 'ci_plus'), control_var_col='fixed', mean_kind_col='mean_kind', non_normalizable_units={'pval'})[source]

Bases: Loggable

Compute the statistics on an input pandas.DataFrame in “database” format.

Parameters:
  • df (pandas.DataFrame) –

    Dataframe in database format, i.e. meaningless index, and values in a given column with the other columns used as tags.

    Note

    Redundant tag columns (i.e. columns that are all equal) will be removed from the dataframe.

  • value_col (str) – Name of the column containing the values.

  • ref_group (dict(str, object)) –

    Reference group used to compare the other groups against. Its format is dict(tag_column_name, tag_value). The comparison will be made on subgroups built out of all the other tag columns, with the reference subgroups being the ones matching that dictionary. If the tag value is None, the key will only be used for grouping in graphs. Comparison will add the following statistics:

    • A 2-sample Kolmogorov-Smirnov test 'ks2samp_test' column. This test is non-parametric and checks for a difference in distributions. The only assumption is that the distribution is continuous, which should suit almost all use cases.

    • Most statistics will be normalized against the reference group as a difference percentage, except for a few non-normalizable values.

    Note

    The group referenced must exist, otherwise unexpected behaviours might occur.

  • filter_rows (dict(object, object) or None) – Filter the given pandas.DataFrame with a dict of {column: value} that rows have to match to be selected.

  • compare (bool) – If True, normalize most statistics as a percentage of change compared to ref_group.

  • agg_cols (list(str)) –

    Columns to aggregate on. In a sense, the given columns will be treated like a compound iteration number. Defaults to:

    • iteration column if available, otherwise

    • All the tag columns that are neither the value nor part of the ref_group.

  • mean_ci_confidence (float) – Confidence level used to establish the mean confidence interval, between 0 and 1.

  • stats (dict(str, str or collections.abc.Callable)) –

    Dictionary of statistical functions to summarize each value group formed by tag columns along the aggregation columns. If None is given as a value, the name will be passed to pandas.core.groupby.SeriesGroupBy.agg(). Otherwise, the provided function will be run.

    Note

    One set of keys is special: 'mean', 'std' and 'sem'. When value None is used, a custom function is used instead of the one from pandas, which will compute other related statistics and provide a confidence interval. An attempt will be made to guess the most appropriate kind of mean to use using the mean_kind_col, unit_col and control_var_col:

    • The mean itself, as:

      • 'mean' (arithmetic)

      • 'hmean' (harmonic)

      • 'gmean' (geometric)

    • The Standard Error of the Mean (SEM):

      • 'sem' (arithmetic)

      • 'hse' (harmonic)

      • 'gse' (geometric)

    • The standard deviation:

      • 'std' (arithmetic)

      • 'hsd' (harmonic)

      • 'gsd' (geometric)

  • stat_col (str) – Name of the column used to hold the name of the statistics that are computed.

  • unit_col (str) – Name of the column holding the unit of each value (as a string).

  • ci_cols (tuple(str, str)) – Name of the two columns holding the confidence interval for each computed statistic.

  • control_var_col (str) – Name of the column holding the control variable name in the experiment leading to the given value. See also guess_mean_kind().

  • mean_kind_col (str) –

    Name of the column holding the type of mean to be used to summarize this value.

    Note

    Unless geometric mean is used, unit_col and control_var_col should be used to make things more obvious and reduce risks of confusion.

  • non_normalizable_units (list(str)) – List of units that cannot be normalized against the reference group.
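The three kinds of mean listed above map onto well-known scipy helpers. As a quick illustration (not the internal implementation, which also produces the matching standard deviation, SEM and confidence intervals):

```python
import numpy as np
from scipy.stats import hmean, gmean

values = np.array([1.0, 2.0, 4.0])

arithmetic = values.mean()   # 'mean'  -> (1 + 2 + 4) / 3
harmonic = hmean(values)     # 'hmean' -> 3 / (1/1 + 1/2 + 1/4)
geometric = gmean(values)    # 'gmean' -> (1 * 2 * 4) ** (1/3) == 2.0

# For positive values: harmonic <= geometric <= arithmetic.
assert harmonic <= geometric <= arithmetic
```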

Examples:

import pandas as pd

# The index is meaningless; all that matters is to uniquely identify
# each row using a set of tag columns, such as 'board', 'kernel',
# 'iteration', ...
df = pd.DataFrame.from_records(
    [
        ('juno', 'kernel1', 'bench1', 'score1', 1, 42, 'frame/s', 's'),
        ('juno', 'kernel1', 'bench1', 'score1', 2, 43, 'frame/s', 's'),
        ('juno', 'kernel1', 'bench1', 'score2', 1, 420, 'frame/s', 's'),
        ('juno', 'kernel1', 'bench1', 'score2', 2, 421, 'frame/s', 's'),
        ('juno', 'kernel1', 'bench2', 'score',  1, 54, 'foobar', ''),
        ('juno', 'kernel2', 'bench1', 'score1', 1, 420, 'frame/s', 's'),
        ('juno', 'kernel2', 'bench1', 'score1', 2, 421, 'frame/s', 's'),
        ('juno', 'kernel2', 'bench1', 'score2', 1, 4200, 'frame/s', 's'),
        ('juno', 'kernel2', 'bench1', 'score2', 2, 4201, 'frame/s', 's'),
        ('juno', 'kernel2', 'bench2', 'score',  1, 540, 'foobar', ''),

        ('hikey','kernel1', 'bench1', 'score1', 1, 42, 'frame/s', 's'),
        ('hikey','kernel1', 'bench1', 'score2', 1, 420, 'frame/s', 's'),
        ('hikey','kernel1', 'bench2', 'score',  1, 54, 'foobar', ''),
        ('hikey','kernel2', 'bench1', 'score1', 1, 420, 'frame/s', 's'),
        ('hikey','kernel2', 'bench1', 'score2', 1, 4200, 'frame/s', 's'),
        ('hikey','kernel2', 'bench2', 'score',  1, 540, 'foobar', ''),
    ],
    columns=['board', 'kernel', 'benchmark', 'metric', 'iteration', 'value', 'unit', 'fixed'],
)


# Get a DataFrame with all the default statistics.
Stats(df).df

# Using a ref_group will also compare the other groups against it
Stats(df, ref_group={'board': 'juno', 'kernel': 'kernel1'}).df
property df

pandas.DataFrame containing the statistics.

See also

get_df() for more controls.

get_df(remove_ref=None, compare=None)[source]

Returns a pandas.DataFrame containing the statistics.

Parameters:
  • compare (bool or None) – See Stats compare parameter. If None, it will default to the value provided to Stats.

  • remove_ref (bool or None) – If True, the rows of the reference group described by ref_group for this object will be removed from the returned dataframe. If None, it will default to compare.

plot_stats(filename=None, remove_ref=None, backend=None, groups_as_row=False, kind=None, **kwargs)[source]

Returns a matplotlib.figure.Figure containing the statistics for the class input pandas.DataFrame.

Parameters:
  • filename (str or None) – Path to the image file to write to.

  • remove_ref (bool or None) – If True, do not plot the reference group. See get_df().

  • backend (str or None) – Holoviews backend to use: bokeh or matplotlib. If None, the current holoviews backend selected with hv.extension() will be used.

  • groups_as_row (bool) – By default, subgroups are used as rows in the subplot matrix, so that the values shown on a given graph can be expected to be in the same order of magnitude. However, when there are many subgroups, this can lead to a very large and hard-to-navigate plot matrix. In that case, using the groups for the rows may help a great deal.

  • kind (str or None) –

    Type of plot. Can be any of:

    • horizontal_bar

    • vertical_bar

    • None

Variable keyword arguments:

Forwarded to get_df().

plot_histogram(cumulative=False, bins=50, nbins=None, density=False, **kwargs)[source]

Returns a matplotlib.figure.Figure with a histogram of the values in the input pandas.DataFrame.

Parameters:
  • cumulative (bool) – Cumulative plot (CDF).

  • bins (int or None) – Number of bins for the distribution.

  • filename (str or None) – Path to the image file to write to.

plot_values(**kwargs)[source]

Returns a holoviews element with the values in the input pandas.DataFrame.

Parameters:

filename (str or None) – Path to the image file to write to.

Workload Automation

exception lisa.wa.WAOutputNotFoundError(collectors)[source]

Bases: Exception

classmethod from_collector(collector, excep)[source]
classmethod from_excep_list(exceps)[source]
class lisa.wa.StatsProp[source]

Bases: object

Provides a stats property.

get_stats(ensure_default_groups=True, ref_group=None, agg_cols=None, **kwargs)[source]

Returns a lisa.stats.Stats loaded with the result pandas.DataFrame.

Parameters:
  • ensure_default_groups (bool) – If True, ensure ref_group will contain appropriate keys for usual Workload Automation result display.

  • ref_group – Forwarded to lisa.stats.Stats

Variable keyword arguments:

Forwarded to lisa.stats.Stats

property stats

Short-hand property equivalent to self.get_stats()

See also

get_stats()

class lisa.wa.WAOutput(path, kernel_path=None)[source]

Bases: StatsProp, Mapping, Loggable

Recursively parse a Workload Automation output, using registered collectors (leaf subclasses of WACollectorBase). The data collected are accessible through a pandas.DataFrame in “database” format:

  • meaningless index

  • all values are tagged using tag columns

Parameters:
  • path (str) – Path containing a Workload Automation output.

  • kernel_path (str) – Kernel source path. Used to resolve the name of the kernel which ran the workload.

Example:

wa_output = WAOutput('wa/output/path')
# Pick a specific collector. See also WAOutput.get_collector()
stats = wa_output['results'].stats
stats.plot_stats(filename='stats.html')
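The "database" format mentioned above is what pandas.melt produces from a wide table: a meaningless index, a single value column, and tag columns uniquely identifying each value. A minimal illustration with hypothetical column names:

```python
import pandas as pd

# Wide format: one column per metric.
wide = pd.DataFrame({
    'board': ['juno', 'hikey'],
    'score1': [42, 43],
    'score2': [420, 421],
})

# "Database" format: meaningless index, one 'value' column, and tag
# columns ('board', 'metric') uniquely identifying each value.
db = wide.melt(id_vars=['board'], var_name='metric', value_name='value')
```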
__hash__()[source]

Each instance is different, like regular objects, and unlike dictionaries.

property df

DataFrame containing the data collected by all the registered WAOutput collectors.

get_collector(name, **kwargs)[source]

Returns a new collector with custom parameters passed to it.

Parameters:

name (str) – Name of the collector.

Variable keyword arguments:

Forwarded to the collector’s constructor.

Example:

WAOutput('wa/output/path').get_collector('energy', postprocess=func)
property jobs

List containing all the jobs present in the output of ‘wa run’.

property outputs

Dict containing a mapping of ‘wa run’ names to RunOutput objects.

class lisa.wa.WACollectorBase(wa_output, df_postprocess=None)[source]

Bases: StatsProp, Loggable, ABC

Base class for all Workload Automation dataframe collectors.

See also

Instances of this class are typically built using WAOutput.get_collector() rather than directly.

property df

pandas.DataFrame containing the data collected.

class lisa.wa.WAResultsCollector(wa_output, df_postprocess=None)[source]

Bases: WACollectorBase

Collector for the Workload Automation test results.

NAME = 'results'
class lisa.wa.WAArtifactCollectorBase(wa_output, df_postprocess=None)[source]

Bases: WACollectorBase

Workload Automation artifact collector base class.

class lisa.wa.WAEnergyCollector(wa_output, df_postprocess=None)[source]

Bases: WAArtifactCollectorBase

WA collector for the energy_measurement augmentation.

Example:

def postprocess(df):
    df = df.pivot_table(values='value', columns='metric', index=['sample', 'iteration', 'workload'])

    df = pd.DataFrame({
        'CPU_power': (
            df['A55_power'] +
            df['A76_1_power'] +
            df['A76_2_power']
        ),
    })
    df['unit'] = 'Watt'
    df = df.reset_index()
    df = df.melt(id_vars=['sample', 'iteration', 'workload', 'unit'], var_name='metric')
    return df

WAOutput('wa/output/path').get_collector(
    'energy',
    df_postprocess=postprocess,
).df
NAME = 'energy'
get_stats(**kwargs)[source]

Returns a lisa.stats.Stats loaded with the result pandas.DataFrame.

Parameters:
  • ensure_default_groups (bool) – If True, ensure ref_group will contain appropriate keys for usual Workload Automation result display.

  • ref_group – Forwarded to lisa.stats.Stats

Variable keyword arguments:

Forwarded to lisa.stats.Stats

class lisa.wa.WATraceCollector(wa_output, trace_to_df=<function _stub_trace_to_df>, **kwargs)[source]

Bases: WAArtifactCollectorBase

WA collector for the trace augmentation.

Parameters:

trace_to_df (collections.abc.Callable) – Function used by the collector to convert the lisa.trace.Trace to a pandas.DataFrame.

Variable keyword arguments:

Forwarded to lisa.trace.Trace.

Example:

def trace_idle_analysis(trace):
    cpu = 0
    df = trace.ana.idle.df_cluster_idle_state_residency([cpu])
    df = df.reset_index()
    df['cpu'] = cpu

    # Melt the column 'time' into lines, so that the dataframe is in
    # "database" format: each value is uniquely identified by "tag"
    # columns
    return df.melt(
        var_name='metric',
        value_vars=['time'],
        id_vars=['idle_state'],
     )

WAOutput('wa/output/path').get_collector(
    'trace',
    trace_to_df=trace_idle_analysis,
).df
NAME = 'trace'
property traces

lisa.utils.LazyMapping that maps job names & iteration numbers to their corresponding lisa.trace.Trace.

class lisa.wa.WAJankbenchCollector(wa_output, df_postprocess=None)[source]

Bases: WAArtifactCollectorBase

WA collector for the jankbench frame timings.

The collector framework will return a single pandas.DataFrame with the results from every jankbench job in lisa.stats.Stats format (i.e. the returned dataframe is arranged such that each reported metric appears as a separate row). The metrics reported are:

  • total_duration: Time in milliseconds to complete the frame.

  • jank_frame: Boolean indicator of a missed frame deadline. 1 is a jank frame, 0 is not.

  • name: Subtest name, provided by the Jankbench app.

  • frame_id: Monotonically increasing frame number, starting from 1 for each subtest iteration.

An example plotter matching the old-style output can be found in the jupyter notebook working directory at ipynb/wltests/WAOutput-JankbenchDemo.ipynb

If you have existing code expecting a more direct translation of the original sqlite database format, you can massage the collected dataframe back into a closer resemblance to the original source database with this sequence of pandas operations:

wa_output = WAOutput('wa/output/path')
df = wa_output['jankbench'].df
db_df = df.pivot(index=['iteration', 'id', 'kernel', 'frame_id'], columns=['variable'])
db_df = db_df['value'].reset_index()
db_df.columns.name = None
# db_df now looks more like the original format
NAME = 'jankbench'
get_stats(**kwargs)[source]

Returns a lisa.stats.Stats loaded with the result pandas.DataFrame.

Parameters:
  • ensure_default_groups (bool) – If True, ensure ref_group will contain appropriate keys for usual Workload Automation result display.

  • ref_group – Forwarded to lisa.stats.Stats

Variable keyword arguments:

Forwarded to lisa.stats.Stats

class lisa.wa.WASysfsExtractorCollector(wa_output, path, type='diff', **kwargs)[source]

Bases: WAArtifactCollectorBase

WA collector for the sysfs-extractor augmentation.

Example:

def pixel6_energy_meter(df):
    # Keep only CPU's meters
    df = df[df.value.str.contains('S4M_VDD_CPUCL0|S3M_VDD_CPUCL1|S2M_VDD_CPUCL2')]
    df[['variable', 'value']] = df.value.str.split(', ', expand=True)

    def _clean_variable(variable):
        if 'S4M_VDD_CPUCL0' in variable:
            return 'little-energy'
        if 'S3M_VDD_CPUCL1' in variable:
            return 'mid-energy'
        if 'S2M_VDD_CPUCL2' in variable:
            return 'big-energy'
        return ''

    df['variable'] = df['variable'].apply(_clean_variable)
    df['value'] = df['value'].astype(int)
    df['unit'] = "bogo-ujoules"

    # Add a total energy variable
    df = pd.concat([
        df,
        pd.DataFrame(data={
            'variable': 'total-energy',
            'value': [df['value'].sum()]
        })
    ])
    df.ffill(inplace=True)

    return df

df = WAOutput('.').get_collector(
        'sysfs-extractor',
        path='/sys/bus/iio/devices/iio:device0/energy_value',
        df_postprocess=pixel6_energy_meter
).df
NAME = 'sysfs-extractor'