
Mismatch between documentation and code for the computation of the non-parametric variables IS and IV #148

@achey2016


The documentation for IS and IV uses formulas with the uncorrected (population) variance,
but the code uses pandas.Series.var without specifying ddof=0, so it applies the bias correction by default (ddof=1).

For long recordings the results should be almost the same, but for shorter recordings the difference between the $n$ and $n-1$ denominators could lead to slight differences with other tools.
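To illustrate the size of the effect, here is a minimal sketch on synthetic data (the series `x` and its length are made up for the example, not taken from any recording). It only relies on the documented default `ddof=1` of `pandas.Series.var`:

```python
import numpy as np
import pandas as pd

# Synthetic activity counts; purely illustrative data.
rng = np.random.default_rng(0)
x = pd.Series(rng.poisson(5, size=48).astype(float))
n = len(x)

var_default = x.var()            # pandas default: ddof=1, divides by n - 1
var_population = x.var(ddof=0)   # uncorrected variance, divides by n

# The two estimates differ by the factor n / (n - 1),
# which tends to 1 as the recording gets longer.
ratio = var_default / var_population
print(ratio, n / (n - 1))
```

With only 48 samples the two variances differ by about 2%; with a week of minute-level data the factor is negligible.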

In the documentation for IS:

This variable is defined in [1]:

$$IS = \frac{d^{24h}}{d^{1h}}$$

with:

$$d^{1h} = \sum_{i}^{n}\frac{\left(x_{i}-\bar{x}\right)^{2}}{n}$$

where $x_{i}$ is the number of active (counts higher than a
predefined threshold) minutes during the $i^{th}$ period,
$\bar{x}$ is the mean of all data and $n$ is the number of
periods covered by the actigraphy data and with:

$$d^{24h} = \sum_{i}^{p} \frac{ \left( \bar{x}_{h,i} - \bar{x} \right)^{2} }{p}$$

What the current implementation does:

$$IS = \frac{d^{24h}}{d^{1h}}$$

with:

$$d^{1h} = \sum_{i}^{n}\frac{\left(x_{i}-\bar{x}\right)^{2}}{n-1} = \mathrm{data.var()}$$

and:

$$d^{24h} = \sum_{i}^{p} \frac{ \left( \bar{x}_{h,i} - \bar{x} \right)^{2} }{p-1} = \mathrm{data.groupby([ data.index.hour, data.index.minute, data.index.second] ).mean().var()}$$
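The two variants can be compared side by side with a small sketch. The function name `interdaily_stability` and its `ddof` parameter are hypothetical and only serve the illustration; they are not the library's API. The data are synthetic (three complete days sampled hourly), and the grouping follows the expression quoted above:

```python
import numpy as np
import pandas as pd

def interdaily_stability(data: pd.Series, ddof: int = 0) -> float:
    """Hypothetical helper: IS as the variance of the mean 24h profile
    divided by the variance of the whole series, with a selectable ddof."""
    profile = data.groupby(
        [data.index.hour, data.index.minute, data.index.second]
    ).mean()
    return profile.var(ddof=ddof) / data.var(ddof=ddof)

# Synthetic recording: 3 complete days, one sample per hour.
rng = np.random.default_rng(1)
index = pd.date_range("2020-01-01", periods=3 * 24, freq=pd.Timedelta(hours=1))
daily_shape = 10 + 8 * np.sin(2 * np.pi * index.hour / 24)
data = pd.Series(rng.poisson(daily_shape).astype(float), index=index)

n = len(data)  # number of samples in the recording (72)
p = 24         # number of points in the mean 24h profile

is_documented = interdaily_stability(data, ddof=0)  # formula in the docs
is_default = interdaily_stability(data, ddof=1)     # behaviour of .var()

# For complete days the two values are related by a fixed factor.
factor = (p / (p - 1)) * ((n - 1) / n)
print(is_documented, is_default, is_documented * factor)
```

For recordings made of complete days, the value obtained with the default `ddof=1` equals the documented one multiplied by $\frac{p}{p-1}\cdot\frac{n-1}{n}$, so the gap shrinks as $n$ and $p$ grow but is visible for short or coarsely sampled recordings.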
