.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/filter_and_interpolate.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_filter_and_interpolate.py>`
        to download the full example code, or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_filter_and_interpolate.py:

Drop outliers and interpolate
================================

Filter out points with low confidence scores and interpolate over
missing values.

.. GENERATED FROM PYTHON SOURCE LINES 9-11

Imports
-------

.. GENERATED FROM PYTHON SOURCE LINES 11-13

.. code-block:: Python

    from movement import sample_data

.. GENERATED FROM PYTHON SOURCE LINES 14-16

Load a sample dataset
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 16-20

.. code-block:: Python

    ds = sample_data.fetch_dataset("DLC_single-wasp.predictions.h5")
    print(ds)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 61kB
    Dimensions:      (time: 1085, individuals: 1, keypoints: 2, space: 2)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * individuals  (individuals)

.. GENERATED FROM PYTHON SOURCE LINES 38-42

We can see that the pose tracks contain some implausible "jumps", such as
the big shift in the final second, and the "spikes" of the stinger near the
14th second. Perhaps we can get rid of those based on the model's reported
confidence scores?

.. GENERATED FROM PYTHON SOURCE LINES 44-56

Visualise confidence scores
---------------------------
The confidence scores are stored in the ``confidence`` data variable.
Since the predicted poses in this example have been generated by DeepLabCut,
the confidence scores should be likelihood values between 0 and 1.
That said, confidence scores are not standardised across pose estimation
frameworks, and their ranges can vary. Therefore, it's always a good idea
to inspect the actual confidence values in the data.

Let's first look at a histogram of the confidence scores. As before, we use
:meth:`xarray.DataArray.squeeze` to remove the ``individuals`` dimension
from the data.

.. GENERATED FROM PYTHON SOURCE LINES 56-59

.. code-block:: Python

    ds.confidence.squeeze().plot.hist(bins=20)

.. image-sg:: /examples/images/sphx_glr_filter_and_interpolate_002.png
   :alt: individuals = individual_0
   :srcset: /examples/images/sphx_glr_filter_and_interpolate_002.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (array([  61.,   13.,   16.,   10.,   10.,    8.,   21.,   11.,   14.,
              11.,   26.,   13.,   28.,   19.,   39.,   44.,   79.,   84.,
             149., 1514.]),
     array([0.        , 0.04999823, 0.09999646, 0.14999469, 0.19999292,
            0.24999115, 0.29998938, 0.34998761, 0.39998584, 0.44998407,
            0.4999823 , 0.54998053, 0.59997876, 0.64997699, 0.69997522,
            0.74997345, 0.79997168, 0.84996991, 0.89996814, 0.94996637,
            0.99996459]),
     <BarContainer object of 20 artists>)

.. GENERATED FROM PYTHON SOURCE LINES 60-63

Based on the above histogram, we can confirm that the confidence scores
indeed range between 0 and 1, with most values closer to 1.
Now let's see how they evolve over time.

.. GENERATED FROM PYTHON SOURCE LINES 63-68

.. code-block:: Python

    ds.confidence.squeeze().plot.line(
        x="time", row="keypoints", aspect=2, size=2.5
    )

.. image-sg:: /examples/images/sphx_glr_filter_and_interpolate_003.png
   :alt: keypoints = head, keypoints = stinger
   :srcset: /examples/images/sphx_glr_filter_and_interpolate_003.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

.. GENERATED FROM PYTHON SOURCE LINES 69-72

Encouragingly, some of the drops in confidence scores do seem to correspond
to the implausible jumps and spikes we had seen in the position. We can use
that to our advantage.

.. GENERATED FROM PYTHON SOURCE LINES 74-87

Filter out points with low confidence
-------------------------------------
Using the :meth:`filter_by_confidence()\ ` method of the ``move`` accessor,
we can filter out points with confidence scores below a certain threshold.
The default ``threshold=0.6`` will be used when ``threshold`` is not provided.
This method will also report the number of NaN values in the dataset before
and after the filtering operation by default (``print_report=True``).
We will use :meth:`xarray.Dataset.update` to update ``ds`` in-place
with the filtered ``position``.

.. GENERATED FROM PYTHON SOURCE LINES 87-90

.. code-block:: Python

    ds.update({"position": ds.move.filter_by_confidence()})

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Missing points (marked as NaN) in input
    Individual: individual_0
        head: 0/1085 (0.0%)
        stinger: 0/1085 (0.0%)
    Missing points (marked as NaN) in output
    Individual: individual_0
        head: 121/1085 (11.2%)
        stinger: 93/1085 (8.6%)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 61kB
    Dimensions:      (time: 1085, individuals: 1, keypoints: 2, space: 2)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * individuals  (individuals) <U12 48B 'individual_0'
      * keypoints    (keypoints) <U7 56B 'head' 'stinger'
      * space        (space) <U1 8B 'x' 'y'
    Data variables:
        position     (time, individuals, keypoints, space) float64 35kB nan ... nan
        confidence   (time, individuals, keypoints) float64 17kB 0.05305 ... 0.0
    Attributes:
        fps:              40.0
        time_unit:        seconds
        source_software:  DeepLabCut
        source_file:      /home/runner/.movement/data/poses/DLC_single-wasp.predi...
        ds_type:          poses
        frame_path:       /home/runner/.movement/data/frames/single-wasp_frame-10...
        video_path:       None


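The 0.6 cut-off is just the default; if it does not suit your data, you can
pass a different ``threshold`` to the same method. As a minimal sketch
(using the documented ``threshold`` argument and an arbitrary, purely
illustrative value of 0.9), one could check how much data a stricter cut-off
would discard on a fresh copy of the dataset:

.. code-block:: Python

    # Sketch: apply a stricter, purely illustrative cut-off to a fresh copy
    # of the sample dataset and count the points that would be set to NaN.
    ds_raw = sample_data.fetch_dataset("DLC_single-wasp.predictions.h5")
    position_strict = ds_raw.move.filter_by_confidence(threshold=0.9)

    # A point counts as missing if any of its spatial coordinates is NaN.
    n_missing = position_strict.isnull().any("space").sum("time")
    print(n_missing)

Whichever threshold you settle on, it is worth re-plotting the filtered
tracks to check that genuine data has not been discarded along with the
outliers.
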
.. GENERATED FROM PYTHON SOURCE LINES 91-105

.. note::
    The ``move`` accessor :meth:`filter_by_confidence()\ ` method is a
    convenience method that applies :func:`movement.filtering.filter_by_confidence`,
    which takes ``position`` and ``confidence`` as arguments.
    The equivalent function call using the :mod:`movement.filtering` module
    would be:

    .. code-block:: python

        from movement.filtering import filter_by_confidence

        ds.update({"position": filter_by_confidence(position, confidence)})

.. GENERATED FROM PYTHON SOURCE LINES 107-109

We can see that the filtering operation has introduced NaN values in the
``position`` data variable. Let's visualise the filtered data.

.. GENERATED FROM PYTHON SOURCE LINES 109-114

.. code-block:: Python

    ds.position.squeeze().plot.line(
        x="time", row="keypoints", hue="space", aspect=2, size=2.5
    )

.. image-sg:: /examples/images/sphx_glr_filter_and_interpolate_004.png
   :alt: keypoints = head, keypoints = stinger
   :srcset: /examples/images/sphx_glr_filter_and_interpolate_004.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

.. GENERATED FROM PYTHON SOURCE LINES 115-119

Here we can see that gaps (consecutive NaNs) have appeared in the pose
tracks, some of which are over the implausible jumps and spikes we had
seen earlier. Moreover, most gaps seem to be brief, lasting < 1 second
(or 40 frames).

.. GENERATED FROM PYTHON SOURCE LINES 121-133

Interpolate over missing values
-------------------------------
Using the :meth:`interpolate_over_time()\ ` method of the ``move`` accessor,
we can interpolate over the gaps we've introduced in the pose tracks.
Here we use the default linear interpolation method (``method="linear"``)
and interpolate over gaps of 40 frames or less (``max_gap=40``).
The default ``max_gap=None`` would interpolate over all gaps, regardless of
their length, but this should be used with caution as it can introduce
spurious data. The ``print_report`` argument acts as described above.

.. GENERATED FROM PYTHON SOURCE LINES 133-136

.. code-block:: Python

    ds.update({"position": ds.move.interpolate_over_time(max_gap=40)})

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Missing points (marked as NaN) in input
    Individual: individual_0
        head: 121/1085 (11.2%)
        stinger: 93/1085 (8.6%)
    Missing points (marked as NaN) in output
    Individual: individual_0
        head: 0/1085 (0.0%)
        stinger: 0/1085 (0.0%)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 61kB
    Dimensions:      (time: 1085, individuals: 1, keypoints: 2, space: 2)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * individuals  (individuals) <U12 48B 'individual_0'
      * keypoints    (keypoints) <U7 56B 'head' 'stinger'
      * space        (space) <U1 8B 'x' 'y'
    Data variables:
        position     (time, individuals, keypoints, space) float64 35kB 1.089e+03...
        confidence   (time, individuals, keypoints) float64 17kB 0.05305 ... 0.0
    Attributes:
        fps:              40.0
        time_unit:        seconds
        source_software:  DeepLabCut
        source_file:      /home/runner/.movement/data/poses/DLC_single-wasp.predi...
        ds_type:          poses
        frame_path:       /home/runner/.movement/data/frames/single-wasp_frame-10...
        video_path:       None


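If you are unsure which gap limit is appropriate, it can help to compare how
many missing values would remain under different ``max_gap`` settings before
committing to one. The following is a minimal sketch that calls
:func:`movement.filtering.interpolate_over_time` directly (as in the note
below), assuming ``position_filtered`` holds the NaN-containing output of the
confidence-filtering step; the candidate gap limits are arbitrary
illustrative values:

.. code-block:: Python

    # Sketch: count the NaN values left behind by different gap limits,
    # assuming `position_filtered` is the filtered (NaN-containing) array.
    from movement.filtering import interpolate_over_time

    for gap in (10, 40, None):
        interpolated = interpolate_over_time(
            position_filtered, max_gap=gap, print_report=False
        )
        print(f"max_gap={gap}: {int(interpolated.isnull().sum())} NaNs remain")
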
.. GENERATED FROM PYTHON SOURCE LINES 137-153

.. note::
    The ``move`` accessor :meth:`interpolate_over_time()\ ` is also a
    convenience method that applies :func:`movement.filtering.interpolate_over_time`
    to the ``position`` data variable.
    The equivalent function call using the :mod:`movement.filtering` module
    would be:

    .. code-block:: python

        from movement.filtering import interpolate_over_time

        ds.update({"position": interpolate_over_time(
            position_filtered, max_gap=40
        )})

.. GENERATED FROM PYTHON SOURCE LINES 155-158

We see that all NaN values have disappeared, meaning that all gaps were
indeed shorter than 40 frames. Let's visualise the interpolated pose tracks.

.. GENERATED FROM PYTHON SOURCE LINES 158-163

.. code-block:: Python

    ds.position.squeeze().plot.line(
        x="time", row="keypoints", hue="space", aspect=2, size=2.5
    )

.. image-sg:: /examples/images/sphx_glr_filter_and_interpolate_005.png
   :alt: keypoints = head, keypoints = stinger
   :srcset: /examples/images/sphx_glr_filter_and_interpolate_005.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

.. GENERATED FROM PYTHON SOURCE LINES 164-172

Log of processing steps
-----------------------
So far, we've processed the pose tracks first by filtering out points with
low confidence scores, and then by interpolating over missing values.
The order of these operations and the parameters with which they were
performed are saved in the ``log`` attribute of the ``position`` data array.
This is useful for keeping track of the processing steps that have been
applied to the data. Let's inspect the log entries.

.. GENERATED FROM PYTHON SOURCE LINES 172-176

.. code-block:: Python

    for log_entry in ds.position.log:
        print(log_entry)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    {'operation': 'filter_by_confidence', 'datetime': '2024-09-06 12:30:55.519032', 'confidence': <xarray.DataArray 'confidence' (time: 1085, individuals: 1, keypoints: 2)> Size: 17kB
    0.05305 0.07366 0.03532 0.03293 0.01707 0.01022 ... 0.0 0.0 0.0 0.0 0.0 0.0
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * individuals  (individuals)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <xarray.Dataset> Size: 96kB
    Dimensions:      (time: 1085, individuals: 1, keypoints: 2, space: 2)
    Coordinates:
      * time         (time) float64 9kB 0.0 0.025 0.05 0.075 ... 27.05 27.07 27.1
      * individuals  (individuals) <U12 48B 'individual_0'
      * keypoints    (keypoints) <U7 56B 'head' 'stinger'
      * space        (space) <U1 8B 'x' 'y'
    Data variables:
        position     (time, individuals, keypoints, space) float64 35kB nan ... nan
        confidence   (time, individuals, keypoints) float64 17kB 0.05305 ... 0.0
        velocity     (time, individuals, keypoints, space) float64 35kB nan ... nan
    Attributes:
        fps:              40.0
        time_unit:        seconds
        source_software:  DeepLabCut
        source_file:      /home/runner/.movement/data/poses/DLC_single-wasp.predi...
        ds_type:          poses
        frame_path:       /home/runner/.movement/data/frames/single-wasp_frame-10...
        video_path:       None


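Because each log entry is a plain dictionary (as in the printed output
above), the log can also be summarised programmatically. A minimal sketch,
assuming every entry carries an ``'operation'`` key as shown earlier:

.. code-block:: Python

    # Sketch: list just the names of the operations recorded in the log.
    applied_operations = [entry["operation"] for entry in ds.position.log]
    print(applied_operations)
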
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.401 seconds)


.. _sphx_glr_download_examples_filter_and_interpolate.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/neuroinformatics-unit/movement/gh-pages?filepath=notebooks/examples/filter_and_interpolate.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: filter_and_interpolate.ipynb <filter_and_interpolate.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: filter_and_interpolate.py <filter_and_interpolate.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: filter_and_interpolate.zip <filter_and_interpolate.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_