API Reference

This page contains the API reference for the pupeyes package.

Reading Tobii Data (from Titta)#

Tobii Data Parsing Module (from Titta)

This module is designed for parsing Tobii data saved from a Titta experiment (hdf5 format). It provides functionalities to parse messages and raw gaze samples. However, it does not support parsing fixations, saccades, and blinks, as these are not saved by Titta.

For more info on the Titta package, see marcus-nystrom/Titta

class pupeyes.data.tobii_titta.TobiiTittaReader(path, start_msg, stop_msg, msg_format, delimiter, add_cols=None)[source]#

Bases: object

A class to read and parse Tobii data saved from Titta (hdf5 format). This class handles loading and parsing of Tobii data files, providing methods to extract messages and gaze samples. However, it does not support parsing fixations, saccades, and blinks, as these are not saved by Titta.

Most functions here are wrappers for existing functionalities in the Titta package.

Parameters:
  • path (str) – Path to the Tobii hdf5 data file

  • start_msg (str) – Common part of message marking the start of a trial. For example, if your trial start messages are ‘TRIAL_START 1 1’, ‘TRIAL_START 1 2’, etc., then start_msg would be ‘TRIAL_START’

  • stop_msg (str) – Common part of message marking the end of a trial. For example, if your trial end messages are ‘TRIAL_END 1 1’, ‘TRIAL_END 1 2’, etc., then stop_msg would be ‘TRIAL_END’

  • msg_format (dict) – Dictionary specifying the format of messages. The messages will be parsed based on this format. Example: {‘marker’: str, ‘event’: str, ‘block’: int, ‘trial’: int}

  • delimiter (str) – Character used to separate message components. For example, if messages are formatted as ‘TRIAL_END 1 1’, the delimiter would be ‘ ’ (a single space).

  • add_cols (dict, optional) – Additional columns to add to output DataFrames. The dictionary should be in the format {‘column_name’: column_data}. For example, to add a column ‘subject’ with value ‘S01’ to all rows, use {‘subject’: ‘S01’}.

  • progress_bar (bool, optional) – If True, shows a progress bar while reading the data file. Default is True.

calibration_history#

Raw calibration history as saved by Titta

Type:

pd.DataFrame

external_signal#

Raw external signal as saved by Titta

Type:

pd.DataFrame

gaze#

Raw gaze data as saved by Titta

Type:

pd.DataFrame

log#

Raw log data as saved by Titta

Type:

pd.DataFrame

msg#

Raw message data as saved by Titta

Type:

pd.DataFrame

notification#

Raw notification data as saved by Titta

Type:

pd.DataFrame

time_sync#

Raw time sync data as saved by Titta

Type:

pd.DataFrame

Examples

>>> reader = TobiiTittaReader(
...     path='subject01.h5',
...     start_msg='TRIAL_START',
...     stop_msg='TRIAL_END',
...     msg_format={'marker': str, 'event': str, 'block': int, 'trial': int},
...     delimiter=' '
... )

get_messages()[source]#

Extract and process marker events from the Titta dataset.

This method extracts all message events from the data and parses them according to the specified message format and delimiter.

Returns:

DataFrame containing processed message data with columns:

  • id (int) – Trial identifier

  • system_time_stamp (float) – System timestamps

  • msg (str) – Raw message string

  • Additional columns to store parsed message parts based on msg_format specification.

  • Additional columns from self.add_cols are added if specified.

Return type:

pd.DataFrame

Notes

Messages are split using the specified delimiter and parsed according to the data types specified in msg_format.
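
The splitting and type conversion described above can be sketched in plain Python. This is an illustration only, not the pupeyes implementation: the helper name parse_message and the choice to raise ValueError on a length mismatch are assumptions made for the example.

```python
def parse_message(msg, msg_format, delimiter=" "):
    """Split one message and cast each part to the type given in msg_format."""
    parts = msg.split(delimiter)
    if len(parts) != len(msg_format):
        # Assumption for this sketch: a mismatch is treated as an error.
        raise ValueError(
            f"expected {len(msg_format)} parts, got {len(parts)}: {msg!r}"
        )
    # Dicts preserve insertion order, so parts line up with the format keys.
    return {
        name: cast(part)
        for (name, cast), part in zip(msg_format.items(), parts)
    }


msg_format = {"marker": str, "block": int, "trial": int}
parsed = parse_message("TRIAL_START 1 2", msg_format, delimiter=" ")
print(parsed)  # {'marker': 'TRIAL_START', 'block': 1, 'trial': 2}
```

Each key in msg_format names one output column, and its value is the type applied to the corresponding message part.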

get_samples(parse_messages=True)[source]#

Extract gaze samples for each trial based on start and stop messages.

Parameters:

parse_messages (bool, optional) – If True, parse message columns and add them to samples. If False, only add raw message. Default is True.

Returns:

DataFrame containing processed sample data. Columns include all columns in self.gaze, as well as:

  • trialtime (float) – Trial timestamps in milliseconds (since the start of each trial)

  • msgtime (float) – Message timestamps in system timestamps (start time of each trial)

  • msg (str) – Raw message strings (start message of each trial)

  • Additional columns from message parsing if parse_messages=True.

  • Additional columns from self.add_cols if specified.

Return type:

pd.DataFrame

Notes

Columns are converted to the appropriate data type that supports pd.NA. Only trials with both start and stop messages are included in the output.
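
As a rough sketch of the trial segmentation described above, the following plain-Python function pairs start and stop messages and keeps only samples that fall between them. It is not the pupeyes implementation. The function name slice_trials, the tuple-based data layout, the inclusive trial boundaries, and the assumption that system timestamps are in microseconds are all made up for this example.

```python
def slice_trials(samples, messages, start_msg, stop_msg):
    """Return sample rows for trials that have both a start and a stop message.

    samples:  list of (timestamp_us, value)
    messages: list of (timestamp_us, text)
    """
    rows = []
    start = None
    for t, text in sorted(messages):
        if text.startswith(start_msg):
            start = (t, text)
        elif text.startswith(stop_msg) and start is not None:
            t0, start_text = start
            for ts, value in samples:
                if t0 <= ts <= t:  # inclusive bounds are an assumption
                    rows.append({
                        "trialtime": (ts - t0) / 1000.0,  # microseconds to ms
                        "msgtime": t0,
                        "msg": start_text,
                        "value": value,
                    })
            start = None  # wait for the next start message
    return rows


messages = [(1000, "TRIAL_START 1 1"), (5000, "TRIAL_END 1 1"),
            (9000, "TRIAL_START 1 2")]  # second trial has no stop message
samples = [(0, 3.0), (1000, 3.1), (3000, 3.2), (5000, 3.3), (9500, 3.4)]
rows = slice_trials(samples, messages, "TRIAL_START", "TRIAL_END")
```

In this example only the first trial contributes rows, because the second trial never receives a stop message.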

Pupil Preprocessing#

Pupil Data Processing Module

This module provides tools for processing pupillometry data from eye trackers. It includes functionality for deblinking, smoothing, baseline correction, and plotting pupil size data.

class pupeyes.pupil.PupilProcessor(data, trial_identifier, pupil_col, time_col, x_col, y_col, samp_freq, convert_pupil_size=False, artificial_d=5, artificial_size=5663, recording_unit='diameter', device='eyelink', eyetracker_missing_value=0, progress_bar=True)[source]#

Bases: object

A class for processing and analyzing pupillometry data.

This class provides methods for preprocessing pupil size data, including blink removal, artifact rejection, smoothing, and baseline correction. It also includes tools for data visualization and analysis.

Parameters:
  • data (pd.DataFrame) – DataFrame containing pupil size data and associated measurements

  • trial_identifier (str or list) – Column name(s) identifying unique trials. If list, trials are uniquely identified by the combination of these columns.

  • pupil_col (str) – Column name containing pupil size measurements

  • time_col (str) – Column name containing timestamps. Must be in milliseconds and integer.

  • x_col (str) – Column name containing x-coordinates of gaze position

  • y_col (str) – Column name containing y-coordinates of gaze position

  • samp_freq (float) – Sampling frequency of the eye tracker in Hz

  • convert_pupil_size (bool, default=False) – Whether to convert pupil size from area to diameter or vice versa

  • artificial_d (float, default=5) – Artificial pupil diameter in mm, used for pupil size conversion

  • artificial_size (float, default=5663) – Artificial pupil size in arbitrary units, used for pupil size conversion

  • recording_unit ({'diameter', 'area'}, default='diameter') – Unit of the recorded pupil size

  • device ({'eyelink', 'tobii_titta', 'tobii_prolab', 'smi'}, default='eyelink') – Device type. At the moment, this only controls whether sampling frequency is checked.

  • eyetracker_missing_value (int, default=0) – Value for missing pupil size for the eye tracker. Different eye trackers use different values to indicate missing values. PupEyes will replace these values with 0. Other possible values are pd.NA, np.nan, -1, -999, etc.

  • progress_bar (bool, default=True) – Whether to show a progress bar for preprocessing steps

data#

The processed pupil data

Type:

pd.DataFrame

summary_data#

Summary statistics for each trial

Type:

pd.DataFrame

trials#

A dataframe of unique trial identifiers

Type:

pd.DataFrame

params#

Dictionary storing parameters used in processing steps

Type:

dict

all_pupil_cols#

List of column names containing pupil data at different processing stages

Type:

list

all_steps#

List of processing steps applied to the data

Type:

list

Notes

  • All processing methods return self for method chaining

  • Most methods create new columns with processed data rather than modifying existing ones

  • Processing parameters are stored in the params dictionary for reproducibility

  • Summary statistics are automatically updated after each processing step

  • artificial_d is the diameter of an artificial pupil provided by Eyelink.

  • artificial_size was measured for the setup of our research group and may not generalize to other setups.
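
To illustrate how artificial_d and artificial_size could relate, here is one common calibration-based scaling: an artificial pupil of known diameter is recorded once, and its measured size serves as the reference. The formula and the helper name area_to_diameter are assumptions for illustration; this page does not state the exact conversion that pupeyes applies.

```python
import math


def area_to_diameter(area, artificial_d=5.0, artificial_size=5663.0):
    """Scale a pupil area in arbitrary units to a diameter in mm.

    Assumed formula: area grows with the square of diameter, so the ratio of
    areas is converted to a ratio of diameters with a square root.
    """
    return artificial_d * math.sqrt(area / artificial_size)


# A recorded area equal to the reference maps back to the reference diameter.
reference_mm = area_to_diameter(5663.0)
```

Under this assumption, quadrupling the recorded area doubles the estimated diameter.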

artifact_rejection(suffix='_ar', method='both', speed_n=16, zscore_threshold=2.5, zscore_allowp=0.1)[source]#

Reject artifacts from pupil data using speed and/or z-score based methods.

This method identifies and removes artifacts using two possible approaches:

  1. Speed-based: Removes samples where pupil size changes too rapidly

  2. Z-score based: Removes extreme values based on z-score thresholds

The method can use either approach individually or combine both.

Parameters:
  • suffix (str, default='_ar') – Suffix to append to the pupil column name for the artifact-rejected data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_ar’.

  • method ({'speed', 'zscore', 'both'}, default='both') – Method to use for artifact rejection:
    • ‘speed’: Use only speed-based rejection

    • ‘zscore’: Use only z-score based rejection

    • ‘both’: Use both methods

  • speed_n (int, default=16) – Number of MADs above median speed to use as threshold for speed-based rejection

  • zscore_threshold (float, default=2.5) – Z-score threshold for artifact rejection for z-score based rejection

  • zscore_allowp (float, default=0.1) – Proportion of mean to use as minimum standard deviation for z-score based rejection. If sd/mean < zscore_allowp, the z-score threshold is not applied. This is to avoid rejecting stable data.

Returns:

self – Returns self for method chaining

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_artifact: Boolean indicating if artifact rejection was performed

    • pct_artifact: Percentage of samples identified as artifacts

  • Creates a new column with suffix appended to the current pupil column name

  • Updates all_pupil_cols and all_steps to track processing history

  • Artifact periods are replaced with NaN values

  • Trials with all missing pupil data are skipped and reported

  • Processing parameters are stored in self.params[‘artifact_rejection’]
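
The z-score rule, including the zscore_allowp guard for stable data, can be sketched as follows. This is a simplified stand-in rather than the pupeyes code: the function name zscore_reject is hypothetical, and the use of the population standard deviation is an assumption.

```python
import math
from statistics import mean, pstdev


def zscore_reject(values, threshold=2.5, allowp=0.1):
    """Replace samples with |z| > threshold by NaN, unless the trace is stable."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return list(values)
    m, sd = mean(valid), pstdev(valid)
    # Equivalent to sd / mean < allowp for positive means, without dividing.
    if sd == 0 or sd < allowp * abs(m):
        return list(values)  # stable trace: threshold is not applied
    return [
        v if math.isnan(v) or abs(v - m) / sd <= threshold else math.nan
        for v in values
    ]


noisy = [10.0] * 9 + [100.0]                 # one extreme sample
stable = [100.0, 101.0, 99.0, 100.0, 104.0]  # sd is far below 10% of the mean
cleaned = zscore_reject(noisy)
untouched = zscore_reject(stable)
```

The guard matters for short or very quiet trials, where even small fluctuations would otherwise exceed the z-score threshold.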

baseline_correction(baseline_query, baseline_range=[None, None], suffix='_bc', method='subtractive')[source]#

Apply baseline correction to pupil data.

Corrects pupil data by subtracting or dividing by baseline values. Creates a new column with the baseline-corrected data.

Parameters:
  • baseline_query (str) – Query string to select baseline period data

  • baseline_range (list, default=[None, None]) – Start and end indices for baseline period

  • suffix (str, default='_bc') – Suffix to append to the pupil column name for the corrected data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_bc’.

  • method ({'subtractive', 'divisive'}, default='subtractive') – Method to use for baseline correction:
    • ‘subtractive’: Subtract baseline mean from pupil data

    • ‘divisive’: Divide pupil data by baseline mean

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_baseline_correction: Boolean indicating if baseline correction was performed

    • baseline: Mean baseline value used for correction

  • Adds a new column with suffix appended to the current pupil column name

  • Updates all_pupil_cols and all_steps to track processing history
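
The two correction methods differ only in how the baseline mean is applied. A minimal sketch, assuming the baseline samples have already been selected (the function name baseline_correct is hypothetical and not part of pupeyes):

```python
from statistics import mean


def baseline_correct(pupil, baseline_samples, method="subtractive"):
    """Correct a pupil trace by the mean of its baseline samples."""
    base = mean(baseline_samples)
    if method == "subtractive":
        return [p - base for p in pupil]
    if method == "divisive":
        return [p / base for p in pupil]
    raise ValueError("method must be 'subtractive' or 'divisive'")


pupil = [4.0, 5.0, 6.0]
baseline = [4.0, 4.0]
subtractive = baseline_correct(pupil, baseline)          # [0.0, 1.0, 2.0]
divisive = baseline_correct(pupil, baseline, "divisive")  # [1.0, 1.25, 1.5]
```

Subtractive correction keeps the original units, while divisive correction expresses the trace as a proportion of the baseline.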

check_baseline_outliers(outlier_by=None, n_mad_baseline=4, plot=True, **kwargs)[source]#

Check for outliers in baseline pupil values.

Identifies outliers in baseline values using median absolute deviation (MAD). Can group data and check outliers within groups.

Parameters:
  • outlier_by (str or list, optional) – Column(s) to group data by for outlier detection

  • n_mad_baseline (float, default=4) – Number of MADs from median to use as outlier threshold

  • plot (bool, default=True) – Whether to plot the baseline distributions

  • **kwargs (dict) – Additional arguments passed to plot_baseline

Returns:

self – Returns self for method chaining

Return type:

PupilProcessor

Notes

  • Updates summary_data with baseline outlier statistics

  • Updates all_steps
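
A MAD-based threshold of the kind described above can be sketched like this. The helper name mad_bounds is hypothetical, and the sketch uses the raw MAD without a scaling constant; whether pupeyes applies such a constant is not stated on this page.

```python
from statistics import median


def mad_bounds(values, n_mad=4):
    """Return (lower, upper) bounds at n_mad MADs around the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med - n_mad * mad, med + n_mad * mad


baselines = [4.0, 4.1, 3.9, 4.2, 3.8, 9.0]
lower, upper = mad_bounds(baselines, n_mad=4)
outliers = [b for b in baselines if b < lower or b > upper]
```

Because both the center and the spread are medians, a single extreme baseline such as 9.0 barely shifts the bounds, which is why MAD is often preferred over mean and standard deviation for this check.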

check_missing(pupil_col=None, missing_value=<NA>)[source]#

Check for missing values in pupil data.

This method calculates the percentage of missing values for each trial and updates the summary statistics. Missing values can be either NaN or a specific value.

Parameters:
  • pupil_col (str, optional) – Column name to check for missing values. If None, uses the latest pupil column.

  • missing_value (float or pd.NA, default=pd.NA) – Value to consider as missing. Can be pd.NA for NaN values or any specific value.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_check_missing: Boolean indicating if missing check was performed

    • missing: Percentage of missing values in each trial

  • Updates all_steps to track processing history

  • Trials that cannot be checked are reported

  • Processing parameters are stored in self.params[‘check_missing’]
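
The per-trial statistic is a simple ratio. The sketch below is not the pupeyes code: the function name pct_missing is hypothetical, and it treats None and NaN as always missing, with an optional sentinel value on top.

```python
import math


def pct_missing(values, missing_value=None):
    """Percentage (0-100) of samples that are NaN, None, or equal to a sentinel."""
    def is_missing(v):
        if v is None or (isinstance(v, float) and math.isnan(v)):
            return True
        return missing_value is not None and v == missing_value

    return 100.0 * sum(is_missing(v) for v in values) / len(values)


trial = [3.0, math.nan, 0.0, 4.0]
nan_only = pct_missing(trial)                       # only the NaN counts
with_sentinel = pct_missing(trial, missing_value=0)  # NaN and the 0.0 count
```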

check_sampling_frequency(sampling_rate=None, data=None)[source]#

Check if the sampling frequency is consistent. Only performed for Eyelink data.

This method checks if the sampling frequency is consistent across trials. If not, it raises an error. It is automatically called when initializing the PupilProcessor. If resampling is performed, the sampling frequency is checked again.

Parameters:
  • sampling_rate (int, default=None) – Sampling rate to check. If None, the sampling rate is checked against the current sampling rate.

  • data (pd.DataFrame, default=None) – Data to check. If None, the data is checked against the current data. The time column must be in milliseconds and integer.

Returns:

check_pass – True if the sampling frequency is consistent, False otherwise

Return type:

bool

check_trace_outliers(time_col=None, pupil_col=None, outlier_by=None, n_mad_trace=4, plot=True, **kwargs)[source]#

Check for outlier trials based on pupil trace values.

Detects outlier trials by comparing each trial’s pupil trace against thresholds calculated from the median absolute deviation (MAD) of all trials. Outliers can be calculated globally or within specified groups.

Parameters:
  • time_col (str, optional) – Column name for x-axis values (time). Defaults to time column.

  • pupil_col (str, optional) – Column name for pupil values. Defaults to last pupil column.

  • outlier_by (str or list, optional) – Column(s) to group trials by when calculating outlier thresholds.

  • n_mad_trace (float, default=4) – Number of MADs to use for outlier threshold.

  • plot (bool, default=True) – Whether to plot the results.

  • **kwargs – Additional arguments passed to plotting function.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_trace_outlier: Boolean indicating if trace outlier detection was performed

    • trace_outlier: Boolean indicating if trial is an outlier

    • trace_upper: Upper threshold for outlier detection

    • trace_lower: Lower threshold for outlier detection

  • Outlier detection uses median absolute deviation (MAD) method

  • Can detect outliers globally or within groups specified by outlier_by

static combine(processors)[source]#

Combine multiple PupilProcessor instances into a single instance.

This method allows combining data from multiple processors that have gone through identical preprocessing pipelines. This is useful for:

  1. Processing large datasets in chunks to manage memory

  2. Adding new data to an existing processed dataset

  3. Processing data from multiple participants separately

Parameters:

processors (list of PupilProcessor) – List of PupilProcessor instances to combine. All processors must have identical preprocessing settings.

Returns:

A new PupilProcessor instance containing combined data.

Return type:

PupilProcessor

Notes

  • All processors must have identical:
    • Initialization parameters (pupil_col, time_col, etc.)

    • Data structure (column names and order)

    • Preprocessing steps and parameters

    • Outlier detection settings (if used)

  • Data and summary statistics are concatenated

Raises:

ValueError – If processors have different preprocessing settings, if processors have incompatible data structures, or if no processors are provided.

copy()[source]#

Create a deep copy of the PupilProcessor object.

This method creates an independent copy of the PupilProcessor object, including all data and processing history. Modifications to the copy will not affect the original object.

Returns:

A deep copy of the current object.

Return type:

PupilProcessor

Notes

  • Creates a completely independent copy using copy.deepcopy

  • All data, parameters, and processing history are copied

  • Useful for creating alternative processing pipelines

deblink(suffix='_db')[source]#

Remove blinks from pupil data using noise-based blink detection.

This method identifies and removes blinks from pupil data using the based_noise_blinks_detection algorithm. Blinks are detected based on rapid changes in pupil size characteristic of eye closure. The method processes each trial separately and creates a new column containing the deblinked data.

Parameters:

suffix (str, default='_db') – Suffix to append to the pupil column name for the deblinked data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_db’.

Returns:

self – Returns self for method chaining

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_deblink: Boolean indicating if deblinking was performed

    • pct_deblink: Percentage of samples identified as blinks

  • Creates a new column with suffix appended to the current pupil column name

  • Updates all_pupil_cols and all_steps to track processing history

  • Blink periods are replaced with NaN values

  • Trials with all missing pupil data are skipped and reported

  • Processing parameters are stored in self.params[‘deblink’]

See also

based_noise_blinks_detection

The underlying blink detection algorithm

downsample(target_samp_freq, agg_methods=None)[source]#

Downsample pupil data to a new sampling rate.

This method downsamples the data by binning into fixed time windows and aggregating values within each bin. This is useful for reducing data size or matching sampling rates between different recordings.

Parameters:
  • target_samp_freq (int) – Target sampling frequency in Hz.

  • agg_methods (dict, optional) – Dictionary mapping column names to aggregation methods. Example: {‘pupil’: ‘mean’, ‘time’: ‘first’, ‘x’: ‘mean’, ‘y’: ‘mean’}. If None, uses ‘first’ for all columns.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Unlike other preprocessing functions, this function replaces the original .data with the downsampled data rather than adding a new column to it.

  • Trials that cannot be downsampled are reported.

  • The sampling frequency is checked and updated again after downsampling.

  • Updates summary_data with:
    • run_downsample: Boolean indicating if downsampling was performed

    • downsampled_bin_size: Size of the downsampled time bin in milliseconds

    • downsampled_samp_freq: Downsampled sampling frequency in Hz

  • Updates all_steps to track processing history

  • Processing parameters are stored in self.params[‘downsample’]
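
Binning into fixed time windows can be sketched in plain Python as below. This is an illustration under simplifying assumptions, not the pupeyes implementation: the function name downsample_trial is hypothetical, timestamps are taken to be integer milliseconds, the time column uses ‘first’ and the pupil column uses ‘mean’, and missing values are not handled.

```python
from statistics import mean


def downsample_trial(times_ms, pupil, target_samp_freq):
    """Aggregate one trial into bins of 1000 / target_samp_freq milliseconds."""
    bin_size = 1000 / target_samp_freq  # bin width in ms
    bins = {}
    for t, p in zip(times_ms, pupil):
        bins.setdefault(int(t // bin_size), []).append((t, p))
    out = []
    for key in sorted(bins):
        rows = bins[key]
        first_time = rows[0][0]                 # 'first' for the time column
        mean_pupil = mean([p for _, p in rows])  # 'mean' for the pupil column
        out.append((first_time, mean_pupil))
    return out


times = list(range(10))               # 1000 Hz: one sample per ms
pupil = [float(t) for t in times]
binned = downsample_trial(times, pupil, target_samp_freq=500)
```

Going from 1000 Hz to 500 Hz gives 2 ms bins, so every output row summarizes two input samples.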

filter_position(vertices, suffix='_xy')[source]#

Filter pupil data based on gaze position within a polygon.

This method removes pupil data points where the gaze position falls outside a specified polygon. This is useful for excluding data where participants were not looking at the intended region of interest.

Parameters:
  • vertices (list of tuples) – List of (x,y) coordinates defining the polygon vertices. Must be in screen coordinates and form a closed polygon. Example: [(0,0), (0,1080), (1920,1080), (1920,0), (0,0)]

  • suffix (str, default='_xy') – Suffix to append to the pupil column name for the filtered data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_xy’.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_gaze_filter: Boolean indicating if gaze filtering was performed

    • pct_gaze_filter: Percentage of samples outside the polygon

    • avg_gaze_x: Average gaze x-coordinate for the remaining samples

    • avg_gaze_y: Average gaze y-coordinate for the remaining samples

  • Creates a new column with suffix appended to the current pupil column name

  • Updates all_pupil_cols and all_steps to track processing history

  • Samples outside the polygon are replaced with NaN values

  • Trials with all missing pupil data are skipped and reported

  • Processing parameters are stored in self.params[‘filter_position’]

Raises:

ValueError – If vertices cannot be converted to float numpy array
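
A point-in-polygon test is the core of this kind of filter. The sketch below uses the standard ray-casting rule and is not necessarily the algorithm pupeyes uses; the function names point_in_polygon and filter_by_position are hypothetical, and the treatment of points lying exactly on an edge is left unspecified.

```python
import math


def point_in_polygon(x, y, vertices):
    """Ray casting: count edge crossings of a horizontal ray going right."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_by_position(pupil, xs, ys, vertices):
    """Keep pupil samples whose gaze is inside the polygon, else NaN."""
    return [
        p if point_in_polygon(x, y, vertices) else math.nan
        for p, x, y in zip(pupil, xs, ys)
    ]


screen = [(0, 0), (0, 1080), (1920, 1080), (1920, 0), (0, 0)]
filtered = filter_by_position([3.0, 3.1], [960, 2000], [540, 540], screen)
```

Repeating the first vertex at the end, as in the example polygon above, is harmless here because the resulting zero-length edge never straddles the ray.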

interpolate(suffix='_it', method='linear', missing_threshold=0.6)[source]#

Interpolate missing values in pupil data.

This method fills missing values in the pupil data using either linear or spline interpolation. Trials with too many missing values (above missing_threshold) are skipped to avoid unreliable interpolation.

Parameters:
  • suffix (str, default='_it') – Suffix to append to the pupil column name for the interpolated data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_it’.

  • method ({'linear', 'spline'}, default='linear') – Method to use for interpolation:
    • ‘linear’: Linear interpolation between points

    • ‘spline’: Cubic spline interpolation

  • missing_threshold (float, default=0.6) – Maximum proportion of missing values allowed for interpolation. Trials with more missing values than this threshold are skipped.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Updates summary_data with:
    • run_interpolate: Boolean indicating if interpolation was performed

    • pct_interpolate: Percentage of interpolated values in each trial

  • Creates a new column with suffix appended to the current pupil column name

  • Updates all_pupil_cols and all_steps to track processing history

  • Trials with too many missing values are skipped and reported

  • Processing parameters are stored in self.params[‘interpolate’]

Raises:

ValueError – If method is not ‘linear’ or ‘spline’
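
The interplay between linear interpolation and missing_threshold can be sketched as follows. This is a simplified illustration, not the pupeyes code: the function name interpolate_linear is hypothetical, samples are assumed to be evenly spaced, and leading or trailing gaps are left as NaN.

```python
import math


def interpolate_linear(values, missing_threshold=0.6):
    """Fill interior NaN gaps linearly; skip trials with too many NaNs."""
    n = len(values)
    known = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not known or (n - len(known)) / n > missing_threshold:
        return list(values)  # trial skipped: too much data is missing
    out = list(values)
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            frac = (i - a) / (b - a)
            out[i] = values[a] + frac * (values[b] - values[a])
    return out


filled = interpolate_linear([1.0, math.nan, math.nan, 4.0])         # 50% missing
skipped = interpolate_linear([math.nan, math.nan, math.nan, 4.0])  # 75% missing
```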

static load(path)[source]#

Load PupilProcessor object from file using dill deserialization.

This method loads a previously saved PupilProcessor object, restoring all data and processing history.

Parameters:

path (str) – Path to the file containing the saved PupilProcessor object.

Returns:

The loaded PupilProcessor object.

Return type:

PupilProcessor

Notes

  • The loaded object will be an exact copy of the saved object

  • All data, parameters, and processing history are preserved

  • Make sure the file was created using the save() method

plot_baseline(plot_by=None, show_outliers=True, save=None, interactive=True, plot_params=None, return_fig=False)[source]#

Plot histogram of baseline pupil sizes. This is a wrapper function that calls either plot_baseline_interactive() or plot_baseline_static() depending on the interactive parameter.

Parameters:
  • plot_by (str or list, optional) – Column(s) to group data by for separate plots.

  • show_outliers (bool, default=True) – Whether to show outlier thresholds.

  • save (str, optional) – Path to save plot.

  • interactive (bool, default=True) – Whether to create interactive plot.

  • plot_params (dict, optional) – Additional plotting parameters.

  • return_fig (bool, default=False) – Whether to return the figure object.

Returns:

  • figure (matplotlib.figure.Figure or plotly.graph_objects.Figure) – Plot figure object if return_fig is True.

  • axes (matplotlib.axes.Axes, optional) – Plot axes object (only for static plots).

See also

plot_baseline_interactive

Create interactive baseline histogram plot

plot_baseline_static

Create static baseline histogram plot

plot_evoked(data=None, pupil_col=None, condition=None, agg_by=None, error='ci', save=None, plot_params=None, **kwargs)[source]#

Plot evoked pupil response.

Creates plot of average pupil response across trials, optionally split by condition and aggregated by specified groups.

Parameters:
  • data (str or pandas.DataFrame, optional) – Data to plot. If string, uses corresponding attribute.

  • pupil_col (str, optional) – Column name for pupil values.

  • condition (str or list, optional) – Column(s) to split data by.

  • agg_by (str or list, optional) – Column(s) to aggregate data by before computing mean trace and confidence bands. For example, to compute subject-level means, use ‘subject_id’.

  • error ({'ci', 'sem', 'std', None}, default='ci') – Type of error to plot:
    • ‘ci’: bootstrap confidence interval

    • ‘sem’: standard error of the mean

    • ‘std’: standard deviation

    • None: no error bars

  • save (str, optional) – Path to save plot.

  • plot_params (dict, default={}) – Additional plotting parameters. This includes all rcParams accepted by matplotlib, as well as the following:
    • ‘title’: title of plot

    • ‘x_title’: x-axis label

    • ‘y_title’: y-axis label

    • ‘vline_color’: color of vertical line

    • ‘vline_linestyle’: linestyle of vertical line

    • ‘grid’: whether to show grid

    • ‘legend_labels’: labels for legend

  • **kwargs – Additional arguments passed to confidence interval calculation.

Returns:

  • arrays_by_condition (dict) – Dictionary of arrays containing trial data for each condition.

  • (figure, axes) (tuple) – Plot figure and axes objects.

plot_pupil_surface(data=None, pupil_col=None, x_col=None, y_col=None, plot_type='count', vertices=None, nbins=64, log_counts=False, plot_by=None, show_centroid=True, save=None, plot_params=None)[source]#

Create an interactive surface plot of pupil dilation by gaze coordinates using numpy.histogram2d.

Parameters:
  • data (pandas.DataFrame, optional) – DataFrame containing pupil size, x-coordinates, and y-coordinates. Being able to specify data is useful for plotting a subset of the data. See examples below. If None, uses self.data.

  • pupil_col (str, optional) – Column name for pupil size. Defaults to self.all_pupil_cols[-1].

  • x_col (str, optional) – Column name for x-coordinates of gaze. Defaults to self.x_col.

  • y_col (str, optional) – Column name for y-coordinates of gaze. Defaults to self.y_col.

  • plot_type (str, optional) – ‘count’ for number of measurements or ‘size’ for mean pupil size. Defaults to ‘count’.

  • nbins (int, optional) – Number of bins for the 2D histogram. Defaults to 64.

  • log_counts (bool, default=False) – Whether to apply log transformation to counts (only applies when plot_type=‘count’).

  • plot_by (str, optional) – Column name to group data by for separate subplots. Defaults to None.

  • show_centroid (bool, default=True) – Whether to show the centroid of the data. Defaults to True.

  • save (str, optional) – Path to save plot.

  • plot_params (dict, optional) – Dictionary of plotting parameters to override defaults:
    • x_title : str, default=‘Gaze X’

    • y_title : str, default=‘Gaze Y’

    • title : str, default=‘Pupil Foreshortening Error Surface’

    • palette : str, default=‘Viridis’

    • width : int, default=400

    • height : int, default=300

Examples

>>> # Plot a 2d histogram of the number of pupil measurements by condition
>>> p.plot_pupil_surface(plot_by='condition')
>>> # Plot a 2d histogram of the number of pupil measurements for a subset of the data
>>> p.plot_pupil_surface(data=p.data[p.data['event'] == 'event_name'])
>>> # Plot the mean pupil size rather than the count of measurements as a function of gaze coordinates
>>> p.plot_pupil_surface(plot_type='size')

plot_spaghetti(time_col=None, pupil_col=None, show_outliers=True, plot_by=None, save=False, plot_params=None, return_fig=True)[source]#

Plot pupil traces for all trials as a spaghetti plot.

Parameters:
  • time_col (str, optional) – Column name for x-axis. Defaults to time column specified during initialization.

  • pupil_col (str, optional) – Column name for y-axis. Defaults to latest pupil column.

  • show_outliers (bool, default=True) – Whether to highlight outlier traces.

  • plot_by (str or list, optional) – Column(s) to group data by for separate plots.

  • save (str, optional) – Path to save plot. Only supports html files. If None, plot is not saved.

  • plot_params (dict, default={}) – Additional plotting parameters.

  • return_fig (bool, default=True) – Whether to return the figure object.

Returns:

Plot figure object if return_fig is True.

Return type:

plotly.graph_objects.Figure

Notes

Creates an interactive spaghetti plot showing pupil traces for all trials. If plot_by is specified, creates separate subplots for each group using dropdown menus. Outlier traces can be highlighted if outlier detection was performed.

plot_trial(trial, time_col=None, pupil_col=None, hue=None, save=None, interactive=True, plot_params=None)[source]#

Plot data for a single trial.

A wrapper function that calls either _plot_trial_interactive() or _plot_trial_static() depending on the interactive parameter.

Parameters:
  • trial (pandas.DataFrame) – DataFrame containing trial identifier.

  • time_col (str, optional) – Column name for x-axis values. Defaults to time column specified during initialization.

  • pupil_col (str or list, optional) – Column name(s) for y-axis values. Defaults to all pupil columns.

  • hue (str or list, optional) – Column(s) to group data by for separate lines.

  • save (str, optional) – Path to save plot.

  • interactive (bool, default=True) – Whether to create interactive plot.

  • plot_params (dict, optional) – Additional plotting parameters.

Returns:

  • figure (matplotlib.figure.Figure or plotly.graph_objects.Figure) – Plot figure object.

  • axes (matplotlib.axes.Axes, optional) – Plot axes object (only for static plots).

save(path)[source]#

Save PupilProcessor object to file using dill serialization.

This method saves the entire PupilProcessor object, including all data and processing history, to a file for later use.

Parameters:

path (str) – Path where the object should be saved. Should include the file extension (e.g., ‘.pkl’).

Raises:

FileExistsError – If a file already exists at the specified path.

smooth(suffix='_sm', method='hann', window=100, **kwargs)[source]#

Smooth pupil data using various smoothing methods.

This method applies signal smoothing to reduce noise in the pupil data. Three smoothing methods are available:

  1. Rolling mean: Simple moving average

  2. Hann window: Weighted moving average using Hann window

  3. Butterworth filter: Low-pass filter with specified cutoff

Parameters:
  • suffix (str, default='_sm') – Suffix to append to the pupil column name for the smoothed data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_sm’.

  • method ({'rollingmean', 'hann', 'butter'}, default='hann') – Method to use for smoothing:
    • ‘rollingmean’: Simple moving average

    • ‘hann’: Hann window smoothing

    • ‘butter’: Butterworth low-pass filter

  • window (int, default=100) – Window size (in number of samples) for rolling mean or Hann window smoothing. Not used for Butterworth filter.

  • **kwargs (dict) –

    Additional arguments for specific smoothing methods.

    • For rolling mean and hann window:

      Check pandas.DataFrame.rolling documentation for additional arguments.

    • For Butterworth filter:

      • cutoff_freq (float) – Cutoff frequency in Hz. Default is 4 Hz.

      • order (int) – Filter order. Default is 3.

Returns:

self – Returns self for method chaining

Return type:

PupilProcessor

Notes

  • Updates summary_data with smoothing method and parameters

  • Creates a new column with suffix appended to the current pupil column name

  • Updates all_pupil_cols and all_steps to track processing history

  • Missing values (NaN) are preserved
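
To show what Hann-window smoothing does, here is a self-contained weighted moving average. It is a sketch and not the pupeyes implementation: the function name hann_smooth is hypothetical, the window is centered and assumed to be odd, and the weights are shifted so that no weight is exactly zero, which may differ from the window used by the library.

```python
import math


def hann_smooth(values, window=5):
    """Centered Hann-weighted moving average that keeps NaN samples as NaN."""
    half = window // 2
    # Hann-shaped weights; the +1 offsets avoid zero weights at the ends.
    weights = [
        0.5 - 0.5 * math.cos(2 * math.pi * (k + 1) / (window + 1))
        for k in range(window)
    ]
    out = []
    for i in range(len(values)):
        if math.isnan(values[i]):
            out.append(math.nan)  # missing samples are preserved
            continue
        num = den = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(values) and not math.isnan(values[j]):
                num += w * values[j]
                den += w
        out.append(num / den)
    return out


spike = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
smoothed = hann_smooth(spike, window=5)
with_gap = hann_smooth([1.0, math.nan, 1.0], window=3)
```

Compared with a plain rolling mean, the tapered weights give the central sample the most influence, so sharp features are attenuated without being spread as widely.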

summary(columns=None, level=None, agg_methods=None)[source]#

Get summary statistics of the data.

Returns summary data for specified columns, optionally grouped by level and aggregated using specified methods.

Parameters:
  • columns (list, optional) – Columns to include in summary. Defaults to all columns.

  • level (str or list, optional) – Column(s) to group by.

  • agg_methods (dict, optional) – Dictionary mapping column names to aggregation methods. If None, uses mean for numeric columns.

Returns:

Summary statistics dataframe.

Return type:

pandas.DataFrame

upsample(target_samp_freq, fill_pupil=False)[source]#

Upsample pupil data to a higher sampling rate.

This method upsamples the data by inserting empty rows to meet the required sampling rate. Missing values are forward-filled for non-pupil columns. Pupil columns remain as NaN where no data exists unless fill_pupil=True. A new column ‘upsampled’ is added to track the inserted rows. Trials that cannot be upsampled are reported. The sampling frequency is checked and updated again after upsampling.

Parameters:
  • target_samp_freq (int) – Target sampling frequency in Hz. Must be higher than current sampling frequency.

  • fill_pupil (bool, default=False) – Whether to also fill missing values in pupil columns. If False, pupil columns remain as NaN where no data exists. If True, missing pupil values are forward-filled; no interpolation is performed. If you want to interpolate missing values, you can do so after upsampling.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • There might be slight discrepancies in the actual sampling rate from the target sampling rate because the time step between samples is rounded to the nearest integer. For example, if you supply a target sampling rate of 1001 Hz, the actual sampling rate will be 1000 Hz (round(1000/1001)= 1 ms time step). In the current implementation, this will result in an error because the actual sampling rate 1000 Hz does not match the target sampling rate 1001 Hz.

  • Upsampled data can be interpolated to fill missing values. You may need to set a lower missing_threshold for interpolation as the upsampling will introduce more missing values.

  • Updates summary_data with:
    • run_upsample: Boolean indicating if upsampling was performed

    • upsampled_bin_size: Size of the upsampled time bin in milliseconds

    • upsampled_samp_freq: Upsampled sampling frequency in Hz

  • Updates all_steps to track processing history

  • Processing parameters are stored in self.params[‘upsample’]
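
The rounding note above can be checked with a few lines of arithmetic. This sketch assumes timestamps in milliseconds and Python's built-in `round`; the helper name `effective_sampling_rate` is hypothetical and not part of pupeyes.

```python
def effective_sampling_rate(target_hz):
    """Return (time step in ms, resulting sampling rate in Hz) for a target rate."""
    step_ms = round(1000 / target_hz)
    if step_ms < 1:
        raise ValueError("target rate is too high for millisecond timestamps")
    return step_ms, 1000 / step_ms


for target in (1000, 1001, 600):
    step, actual = effective_sampling_rate(target)
    status = "ok" if actual == target else "mismatch"
    print(f"target={target} Hz -> step={step} ms, actual={actual:g} Hz ({status})")
```

Targets whose period is a whole number of milliseconds (for example 1000 Hz or 500 Hz) avoid the mismatch.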

validate_trials(trials_to_exclude, invert_mask=False)[source]#

Mark trials as valid/invalid based on exclusion criteria.

This method adds a ‘valid’ column to both the data and summary_data, marking trials as valid or invalid based on the provided exclusion criteria.

Parameters:
  • trials_to_exclude (pandas.DataFrame) – DataFrame containing trial identifiers to exclude. Must have columns matching the trial_identifier of the PupilProcessor.

  • invert_mask (bool, default=False) – If True, excludes all trials except those specified in trials_to_exclude. If False, excludes only the trials specified in trials_to_exclude.

Returns:

self – Returns self for method chaining.

Return type:

PupilProcessor

Notes

  • Adds ‘valid’ column to both summary_data and data

  • Valid column is boolean: True for valid trials, False for invalid trials

  • Trials are matched based on trial_identifier columns

  • Duplicate entries in trials_to_exclude are automatically removed

pupeyes.pupil.compute_speed(x, y)[source]#

Compute the speed of change between two arrays.

This function calculates the rate of change (speed) between corresponding points in two arrays. The speed is computed as the absolute maximum of the forward and backward differences at each point, normalized by the time difference.

Parameters:
  • x (array-like) – First array of values, typically pupil measurements. Must be numeric and same length as y.

  • y (array-like) – Second array of values, typically time points. Must be numeric and same length as x.

Returns:

Array of speed values with same length as input arrays. Contains NaN values at endpoints and where division by zero or invalid values occur.

Return type:

numpy.ndarray

Notes

  • Uses np.diff() to compute differences between consecutive points

  • Takes absolute maximum of forward/backward differences at each point

  • Suppresses RuntimeWarnings for NaN/inf values

  • Sets NaN/inf values to NaN in output
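
A minimal sketch of the computation described above, written against NumPy only. The name `speed_sketch` is hypothetical, and details such as the exact endpoint handling are assumptions based on the description rather than the package's code.

```python
import numpy as np


def speed_sketch(values, times):
    """Max absolute rate of change to the previous/next sample; NaN at the endpoints."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        rate = np.diff(values) / np.diff(times)  # one value per consecutive pair
    speed = np.full(values.shape, np.nan)
    backward = np.abs(rate[:-1])  # change from the previous sample
    forward = np.abs(rate[1:])    # change to the next sample
    speed[1:-1] = np.maximum(backward, forward)
    # Infinite values (e.g. from repeated timestamps) are reported as NaN.
    speed[~np.isfinite(speed)] = np.nan
    return speed


print(speed_sketch([0, 1, 3, 6], [0, 1, 2, 3]))
```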

pupeyes.pupil.convert_pupil(pupil_size, artificial_d, artificial_size, recording_unit='diameter')[source]#

Convert pupil measurements between different recording units.

This function converts pupil measurements from raw units (arbitrary units from the eye tracker) to millimeters using calibration values from an artificial pupil. It handles both diameter and area measurements.

Parameters:
  • pupil_size (float or array-like) – Pupil size in recording units (diameter or area). Can be a single value or an array of measurements.

  • artificial_d (float) – Diameter of artificial pupil used for calibration (in mm). This is the known physical size of the calibration pupil.

  • artificial_size (float) – Size of artificial pupil in recording units (diameter or area). This is the size measured by the eye tracker for the calibration pupil.

  • recording_unit ({'diameter', 'area'}, default='diameter') – Unit of the recorded measurements. With ‘diameter’, linear scaling is applied; with ‘area’, the square root is taken before scaling.

Returns:

Converted pupil measurements in millimeters. Will have same shape as input pupil_size.

Return type:

numpy.ndarray

Notes

  • The unit of artificial_size must match the recording_unit

  • The unit of artificial_d is always in millimeters

  • For diameter recordings: output = artificial_d * pupil_size / artificial_size

  • For area recordings: output = artificial_d * sqrt(pupil_size / artificial_size)

  • Useful for standardizing pupil measurements across different setups

Raises:

ValueError – If recording_unit is not ‘diameter’ or ‘area’
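
The two formulas in the notes can be worked through for scalar inputs as follows. This is a sketch with a hypothetical function name (`to_mm_sketch`) and made-up calibration numbers, not the package's implementation.

```python
import math


def to_mm_sketch(pupil_size, artificial_d, artificial_size, recording_unit="diameter"):
    """Convert one pupil measurement to millimeters using an artificial-pupil calibration."""
    ratio = pupil_size / artificial_size
    if recording_unit == "diameter":
        return artificial_d * ratio
    if recording_unit == "area":
        # Area grows with the square of the diameter, hence the square root.
        return artificial_d * math.sqrt(ratio)
    raise ValueError("recording_unit must be 'diameter' or 'area'")


# Hypothetical calibration: an 8 mm artificial pupil recorded as 4000 (diameter) or 16000 (area) units.
print(to_mm_sketch(2000, artificial_d=8, artificial_size=4000))                          # 4.0
print(to_mm_sketch(4000, artificial_d=8, artificial_size=16000, recording_unit="area"))  # 4.0
```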

pupeyes.pupil.prf(t, t_max=500, n=10.1)[source]#

Pupil response function (PRF) according to Hoeks and Levelt (1993).

Parameters:
  • t (array-like) – Time points in milliseconds.

  • t_max (float, optional) – Location of the peak (default is 500 ms).

  • n (float, optional) – Scale parameter (default is 10.1).

Returns:

Normalized PRF values at each time point.

Return type:

numpy.ndarray
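
For orientation, the Hoeks and Levelt pupil response function has the form h(t) = t^n · exp(−n·t / t_max), which peaks at t = t_max. The sketch below evaluates it in pure Python and scales it so the peak equals 1. That normalization is an assumption made for this example; the package may normalize differently. The name `prf_sketch` is hypothetical.

```python
import math


def prf_sketch(t_ms, t_max=500, n=10.1):
    """Evaluate t**n * exp(-n * t / t_max), rescaled so the value at t_max is 1."""
    values = []
    for t in t_ms:
        if t <= 0:
            values.append(0.0)
            continue
        r = t / t_max
        # Algebraically equal to (t**n * exp(-n*t/t_max)) / (t_max**n * exp(-n)),
        # but avoids huge intermediate numbers.
        values.append(r ** n * math.exp(n * (1 - r)))
    return values


times = list(range(0, 3000, 10))
curve = prf_sketch(times)
print(times[curve.index(max(curve))])  # time of the peak in ms
```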

Areas of Interest (AOI)#

Area of Interest (AOI) Analysis Module

This module provides basic functions for analyzing eye tracking data in relation to Areas of Interest (AOIs).

pupeyes.aoi.compute_aoi_statistics(x, y, aois, durations=None)[source]#

Compute fixation statistics for each Area of Interest (AOI).

Parameters:
  • x (array-like) – Array of x-coordinates for fixation points

  • y (array-like) – Array of y-coordinates for fixation points

  • aois (dict) – Dictionary mapping AOI names to lists of vertex coordinates. Each vertex list should define a polygon as [(x1,y1), (x2,y2), …].

  • durations (array-like, optional) – Array of fixation durations corresponding to each (x,y) point.

Returns:

Dictionary containing statistics for each AOI and points outside AOIs:

  • outside (dict)

    • count (int) – Number of fixations outside all AOIs

    • total_duration (float) – Total duration of outside fixations

  • aoi_name (dict)

    • count (int) – Number of fixations in this AOI

    • total_duration (float) – Total duration in this AOI

If durations is None, total_duration values will be 0. Returns empty dict if aois is empty.

Return type:

dict

Notes

If a fixation point lies within multiple AOIs, it is counted only in the first AOI that contains it based on the iteration order of the aois dictionary.

Examples

>>> aois = {
...     'face': [(0,0), (100,0), (100,100), (0,100), (0,0)],
...     'text': [(150,0), (250,0), (250,50), (150,50), (150,0)]
... }
>>> x = np.array([50, 200, 300])  # points in face, text, outside
>>> y = np.array([50, 25, 300])
>>> durations = np.array([100, 150, 200])  # durations in milliseconds
>>> stats = compute_aoi_statistics(x, y, aois, durations)
>>> stats
{
    'outside': {'count': 1, 'total_duration': 200.0},
    'face': {'count': 1, 'total_duration': 100.0},
    'text': {'count': 1, 'total_duration': 150.0}
}
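
The counting rule described above (first matching AOI wins, everything else goes to ‘outside’) can be sketched in plain Python. The helper names are hypothetical, and the point-in-polygon test is a simplified even-odd check that does not treat boundary points specially, unlike the package's own functions.

```python
def _contains(polygon, x, y):
    """Even-odd ray casting; points exactly on the boundary are not handled specially."""
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        # Only edges that straddle the ray's height can be crossed.
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def aoi_statistics_sketch(xs, ys, aois, durations=None):
    """Count fixations and sum durations per AOI, assigning each point to the first match."""
    if not aois:
        return {}
    stats = {"outside": {"count": 0, "total_duration": 0.0}}
    for name in aois:
        stats[name] = {"count": 0, "total_duration": 0.0}
    if durations is None:
        durations = [0.0] * len(xs)
    for x, y, duration in zip(xs, ys, durations):
        # Dict iteration order decides which AOI wins when AOIs overlap.
        label = next((name for name, poly in aois.items() if _contains(poly, x, y)), "outside")
        stats[label]["count"] += 1
        stats[label]["total_duration"] += float(duration)
    return stats


aois = {
    "face": [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)],
    "text": [(150, 0), (250, 0), (250, 50), (150, 50), (150, 0)],
}
print(aoi_statistics_sketch([50, 200, 300], [50, 25, 300], aois, [100, 150, 200]))
```
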
pupeyes.aoi.get_fixation_aoi(x, y, aois)[source]#

For each fixation point, get the Area of Interest (AOI) that contains it. If the point is outside all AOIs, return None.

Parameters:
  • x (float or numpy.ndarray) – X-coordinate(s) of fixation point(s)

  • y (float or numpy.ndarray) – Y-coordinate(s) of fixation point(s)

  • aois (dict or None) – Dictionary mapping AOI names to lists of vertex coordinates. Each vertex list should define a polygon as [(x1,y1), (x2,y2), …]. The last vertex should be the same as the first vertex to close the polygon.

Returns:

If input coordinates are scalars:

  • str

    Name of the AOI containing the point, or None if not in any AOI

If input coordinates are arrays:

  • list

    List of AOI names for each point, with None for points outside all AOIs

Return type:

str or list

Notes

If a point lies within multiple AOIs, it is assigned to the first AOI that contains it based on the iteration order of the aois dictionary.

Examples

>>> # Single point
>>> aois = {
...     'face': [(0,0), (100,0), (100,100), (0,100), (0,0)],
...     'text': [(150,0), (250,0), (250,50), (150,50), (150,0)]
... }
>>> get_fixation_aoi(50, 50, aois)
'face'
>>> get_fixation_aoi(300, 300, aois)
None
>>> # Multiple points
>>> x = np.array([50, 200, 300])
>>> y = np.array([50, 25, 300])
>>> get_fixation_aoi(x, y, aois)
['face', 'text', None]
pupeyes.aoi.is_inside(points, polygon)[source]#

Check if multiple points lie inside a polygon.

Parameters:
  • points (numpy.ndarray) – Nx2 array of (x,y) coordinates to check

  • polygon (numpy.ndarray) – Array of (x,y) coordinates defining the polygon vertices

Returns:

Boolean array indicating whether each point is inside the polygon

Return type:

numpy.ndarray

Examples

>>> # Define a square polygon
>>> square = np.array([(0,0), (100,0), (100,100), (0,100), (0,0)])
>>>
>>> # Check multiple points
>>> points = np.array([
...     [50, 50],    # inside
...     [150, 150],  # outside
...     [0, 50],     # on edge
...     [0, 0]       # on vertex
... ])
>>> is_inside(points, square)
array([ True, False,  True,  True])
pupeyes.aoi.is_inside_singlepoint(polygon, point)[source]#

Check if a point lies inside a polygon using ray-casting algorithm.

Parameters:
  • polygon (array-like) – List of (x,y) coordinates defining the polygon vertices. The last vertex should be the same as the first to close the polygon.

  • point (tuple) – (x,y) coordinates of the point to check

Returns:

Result code indicating point position:

  • 0

    Point is outside the polygon

  • 1

    Point is inside the polygon

  • 2

    Point lies exactly on the polygon’s edge or vertex

Return type:

int

Notes

Uses a ray-casting algorithm that counts the number of times a horizontal ray from the point intersects with polygon edges.

Examples

>>> # Define a square
>>> square = [(0,0), (100,0), (100,100), (0,100), (0,0)]
>>>
>>> # Check points
>>> is_inside_singlepoint(square, (50, 50))  # inside
1
>>> is_inside_singlepoint(square, (150, 150))  # outside
0
>>> is_inside_singlepoint(square, (0, 50))    # on edge
2
>>> is_inside_singlepoint(square, (0, 0))     # on vertex
2
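
To make the ray-casting idea concrete, here is a small pure-Python sketch that returns the same 0/1/2 codes. It is an independent illustration with a hypothetical name (`point_in_polygon_sketch`), not the package's code. The exact-equality boundary test is only reliable for integer or exactly representable coordinates.

```python
def point_in_polygon_sketch(polygon, point):
    """Return 0 (outside), 1 (inside), or 2 (on an edge or vertex)."""
    px, py = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around, so unclosed polygons also work
        # Boundary check: collinear with the edge and within its bounding box.
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if (cross == 0
                and min(x1, x2) <= px <= max(x1, x2)
                and min(y1, y2) <= py <= max(y1, y2)):
            return 2
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside
    # An odd number of crossings means the point is inside.
    return 1 if inside else 0


square = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
for p in [(50, 50), (150, 150), (0, 50), (0, 0)]:
    print(p, point_in_polygon_sketch(square, p))
```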

Interactive Applications#

Pupil Viewer#

Interactive Pupil Data Viewer

This module provides an interactive web application for visualizing pupil preprocessing steps. It uses Dash and Plotly to create an interface where users can:

  • Select individual trials

  • View all preprocessing steps applied to pupil data

  • Compare raw and processed pupil traces

class pupeyes.apps.pupil_viewer.PupilViewer(pupil_processor, hue=None, columns=None)[source]#

Bases: object

An interactive web-based visualization tool for pupil preprocessing data.

This class provides a Dash-based interface for visualizing pupil data processing steps, allowing users to explore how different preprocessing operations affect the pupil signal. The interface supports trial selection, column selection for comparison, and interactive plotting with subplots for each processing step.

Parameters:
  • pupil_processor (PupilProcessor) – Instance of PupilProcessor containing the pupil data and processing history. This object should contain both raw and processed pupil data.

  • hue (str, optional) – Column name to group data by for separate lines in the plot. Useful for visualizing different components of a single trial.

  • columns (list of str, optional) – List of column names to plot. If not provided, all pupil columns from the PupilProcessor will be shown.

pupil_processor#

The PupilProcessor instance containing the data

Type:

PupilProcessor

hue#

Column name used for plotting different components of a single trial

Type:

str or None

columns#

List of column names being plotted

Type:

list

app#

The Dash application instance

Type:

dash.Dash

run(port=8051, **kwargs)[source]#

Run the Dash server for the pupil data viewer.

Parameters:
  • port (int, default=8051) – Port number to run the server on. Make sure the port is available and not blocked by firewall.

  • **kwargs (dict) – Additional keyword arguments passed to dash.run_server(). See Dash documentation for available options.

Notes

  • The application will run until interrupted (Ctrl+C)

  • Access the interface at http://localhost:<port>

  • Each preprocessing step is shown in a separate subplot

  • Interactive controls allow exploration of different trials and columns

Fixation Viewer#

Interactive Eye Movement Visualization Module using Dash

This module provides an interactive web-based visualization tool for eye movement data, including scanpath replay, heatmaps, areas of interest, and fixation sequence plots.

class pupeyes.apps.fixation_viewer.FixationViewer(data=None, screen_dims=(1920, 1080), col_mapping=None, stimuli_path=None, animation_speed=500, dot_size=10)[source]#

Bases: object

An interactive web-based visualization tool for eye movement data.

This class provides a Dash-based interface for visualizing eye movement data with multiple visualization modes (scanpath, heatmap, AOI), interactive controls, and data export capabilities.

Parameters:
  • data (pandas.DataFrame, optional) – Eye movement data with columns for timestamps, coordinates, etc.

  • screen_dims (tuple, default=(1920, 1080)) – Screen dimensions in pixels (width, height)

  • col_mapping (dict, optional) –

    Column name mapping for required fields:

    • trial_id (str or list) – Trial identifier column(s). Can be a single column name or a list of column names that together uniquely identify a trial (e.g., [‘subject’, ‘block’, ‘trial’])

    • timestamp (str) – Timestamp column (optional)

    • x (str) – X coordinate

    • y (str) – Y coordinate

    • duration (str) – Fixation duration (optional)

    • stimuli (str) – Stimuli path/identifier

  • stimuli_path (str, optional) – Base path for stimuli images

  • animation_speed (int, default=500) – Animation playback speed in milliseconds

  • dot_size (int, default=10) – Fixed size for fixation dots

data#

The eye movement data being visualized

Type:

pandas.DataFrame

screen_dims#

The dimensions of the visualization canvas

Type:

tuple

col_mapping#

Mapping of required columns to data columns

Type:

dict

aois#

Dictionary of Areas of Interest definitions

Type:

dict

app#

The Dash application instance

Type:

dash.Dash

run(debug=False, port=8050, **kwargs)[source]#

Start the Dash server and run the fixation viewer application.

This method initializes and starts the web server for the fixation viewer application. The application will be accessible through a web browser at the specified port.

Parameters:
  • debug (bool, default=False) – Whether to run the server in debug mode

  • port (int, default=8050) – Port to run the server on

  • **kwargs (dict) – Additional arguments to pass to dash.run_server(). See Dash documentation for available options.

Notes

  • The application will run until interrupted (Ctrl+C)

  • Access the interface at http://localhost:<port>

  • Debug mode provides additional error information

  • Default port (8050) can be changed if already in use

set_aois(aois)[source]#

Set Areas of Interest (AOIs) for visualization.

Parameters:

aois (dict) –

Can be either:

  • A nested dictionary mapping stimulus IDs to AOI definitions.

  • A simple dictionary of AOIs that applies to all stimuli.

where each AOI is defined by a list of (x,y) vertex coordinates. The last point should be the same as the first point to close the polygon.

Return type:

None

Examples

>>> # Define AOIs for each stimulus
>>> aois = {
...     'stimulus1': {
...         'aoi1': [(x1,y1), (x2,y2), ..., (x1, y1)],
...         'aoi2': [(x1,y1), (x2,y2), ..., (x1, y1)]
...     },
...     'stimulus2': {
...         'aoi1': [(x1,y1), (x2,y2), ..., (x1, y1)],
...         'aoi2': [(x1,y1), (x2,y2), ..., (x1, y1)]
...     }
... }
>>> # Define AOIs for all stimuli
>>> aois = {
...     'aoi1': [(x1,y1), (x2,y2), ..., (x1, y1)],
...     'aoi2': [(x1,y1), (x2,y2), ..., (x1, y1)]
... }
set_data(data)[source]#

Set the eye movement data for visualization.

AOI Drawer#

Interactive AOI Drawing Tool using Dash

This module provides an interactive web-based tool for drawing Areas of Interest (AOIs) that can be used with the FixationViewer.

class pupeyes.apps.aoi_drawer.AOIDrawer(screen_dims=(1920, 1080), stimuli=None, stimuli_name=None)[source]#

Bases: object

An interactive web-based tool for drawing Areas of Interest (AOIs).

This class provides a Dash-based web interface for drawing and managing Areas of Interest (AOIs) on stimulus images. It supports multiple drawing tools (freeform, rectangle, circle), editing capabilities, and export functionality.

Parameters:
  • screen_dims (tuple, default=(1920, 1080)) – Screen dimensions in pixels (width, height). Used to set the drawing canvas size and scale background images.

  • stimuli (str or numpy.ndarray, optional) – Path to the stimulus image or a numpy array containing the image. Supports various image formats and both RGB and grayscale images.

  • stimuli_name (str, optional) – Name of the stimulus image, used for display and as default save filename. If not provided, defaults to “AOIs”.

aois#

Dictionary storing AOI data, where keys are AOI names and values are lists of (x, y) coordinate tuples defining the AOI vertices.

Type:

dict

app#

The Dash application instance.

Type:

dash.Dash

screen_dims#

The dimensions of the drawing canvas.

Type:

tuple

run(debug=False, port=8051, **kwargs)[source]#

Start the Dash server and run the AOI drawing application.

This method initializes and starts the web server for the AOI drawing interface. The application will be accessible through a web browser at the specified port.

Parameters:
  • debug (bool, default=False) – Whether to run the server in debug mode

  • port (int, default=8051) – Port number to run the server on. Make sure the port is available and not blocked by firewall.

  • **kwargs (dict) – Additional keyword arguments passed to dash.run_server(). See Dash documentation for available options.

Notes

  • The application will run until interrupted (Ctrl+C)

  • Access the interface at http://localhost:<port>

  • Debug mode provides additional error information

  • Default port (8051) can be changed if already in use

Utilities#

General Utilities#

Utility Functions Module

This module provides utility functions used across the pupeyes package, including:

  • Coordinate system conversions between Eyelink and PsychoPy

  • Point-in-polygon testing with parallel processing

  • Signal filtering and data masking

  • Geometric calculations for circular stimulus arrangements and others

pupeyes.utils.angular_distance(line1, line2)[source]#

Calculate the angle between two lines in degrees.

Parameters:
  • line1 (tuple) – Tuple of two points ((x1,y1), (x2,y2)) defining the first line

  • line2 (tuple) – Tuple of two points ((x1,y1), (x2,y2)) defining the second line

Returns:

Angle between the lines in degrees, always in range [0, 180]

Return type:

float

Examples

>>> # Perpendicular lines
>>> line1 = ((0,0), (1,0))  # horizontal line
>>> line2 = ((0,0), (0,1))  # vertical line
>>> angular_distance(line1, line2)
90.0
>>> # 45-degree angle
>>> line1 = ((0,0), (1,0))
>>> line2 = ((0,0), (1,1))
>>> angular_distance(line1, line2)
45.0
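
One way to obtain such an angle is to treat each line as a direction vector and combine the cross and dot products with `atan2`. The sketch below does this in pure Python. The function name is hypothetical and the package may compute the angle differently.

```python
import math


def angle_between_sketch(line1, line2):
    """Angle in degrees, in [0, 180], between the direction vectors of two lines."""
    (ax1, ay1), (ax2, ay2) = line1
    (bx1, by1), (bx2, by2) = line2
    ux, uy = ax2 - ax1, ay2 - ay1
    vx, vy = bx2 - bx1, by2 - by1
    cross = ux * vy - uy * vx
    dot = ux * vx + uy * vy
    # atan2 gives a signed angle in (-180, 180]; abs() folds it into [0, 180].
    return abs(math.degrees(math.atan2(cross, dot)))


print(angle_between_sketch(((0, 0), (1, 0)), ((0, 0), (0, 1))))
print(angle_between_sketch(((0, 0), (1, 0)), ((0, 0), (1, 1))))
```
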
pupeyes.utils.convert_coordinates(coord, screen_dims=None, direction='to_el', psychopy_units='pix', round_to=2)[source]#

Convert coordinates between Eyelink and PsychoPy coordinate systems. For Eyelink, the origin is at the top-left corner of the screen. For PsychoPy, the origin is at the center of the screen. For more information on the psychopy coordinate system, see: https://psychopy.org/general/units.html

Parameters:
  • coord (array-like or str) – The coordinates to convert, given either as an array-like [x, y] or as a string such as ‘x,y’, ‘[x,y]’, or ‘(x,y)’.

  • screen_dims (array-like, optional) – Screen dimensions [width, height] in pixels. Default is [1600, 1200].

  • direction ({'to_el', 'to_psychopy'}, optional) – Conversion direction: ‘to_el’ converts from PsychoPy to Eyelink coordinates; ‘to_psychopy’ converts from Eyelink to PsychoPy coordinates. Default is ‘to_el’.

  • psychopy_units ({'pix', 'norm', 'height'}, optional) – PsychoPy units to convert from/to: ‘pix’ (pixels from center), ‘norm’ (normalized units in [-1, 1]), or ‘height’ (units relative to screen height). Default is ‘pix’.

  • round_to (int or None, optional) – Number of decimal places to round coordinates to. Default is 2. If None, no rounding is performed.

Returns:

Converted [x, y] coordinates

Return type:

numpy.ndarray

Notes

Coordinate system details: Eyelink has its origin at the top-left, with positive x to the right and positive y downward; PsychoPy has its origin at the center, with positive x to the right and positive y upward.

Examples

>>> # Convert screen center from PsychoPy to Eyelink coordinates
>>> convert_coordinates([0, 0], screen_dims=[1600, 1200])
array([800., 600.])  # half width, half height in Eyelink coordinates
>>> # Convert back from Eyelink to PsychoPy coordinates
>>> convert_coordinates([800, 600], direction='to_psychopy')
array([0., 0.])  # back to center in PsychoPy coordinates
>>> # Convert normalized coordinates (range -1 to 1)
>>> convert_coordinates([0.5, 0.5], psychopy_units='norm')
array([1200., 300.])  # scaled by screen dimensions
>>> # Convert height units (relative to screen height)
>>> convert_coordinates([0.5, 0.5], screen_dims=[1600, 1200],
...                    psychopy_units='height')
array([1400., 0.])  # 50% of screen height = 600 pixels
>>> # Convert from string input
>>> convert_coordinates("100,100")
array([900., 500.])  # PsychoPy (100,100) to Eyelink coordinates
Raises:

ValueError – If direction is not ‘to_el’ or ‘to_psychopy’, if psychopy_units is not ‘pix’, ‘norm’, or ‘height’, or if string coordinates cannot be parsed.
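
The arithmetic behind the examples above can be sketched in a few lines of pure Python. The function names are hypothetical, and string parsing and rounding are left out. The unit handling follows the PsychoPy conventions quoted in the parameter list.

```python
def psychopy_to_eyelink_sketch(coord, screen_dims=(1600, 1200), units="pix"):
    """Map a PsychoPy position (origin at center, y up) to Eyelink pixels (origin top-left, y down)."""
    width, height = screen_dims
    x, y = coord
    if units == "pix":
        dx, dy = x, y
    elif units == "norm":
        # -1..1 spans the full width and the full height.
        dx, dy = x * width / 2, y * height / 2
    elif units == "height":
        # Both axes are expressed as fractions of the screen height.
        dx, dy = x * height, y * height
    else:
        raise ValueError(f"unsupported units: {units!r}")
    # Shift the origin to the top-left corner and flip the y axis.
    return (width / 2 + dx, height / 2 - dy)


def eyelink_to_psychopy_pix_sketch(coord, screen_dims=(1600, 1200)):
    """Inverse mapping for pixel units."""
    width, height = screen_dims
    x, y = coord
    return (x - width / 2, height / 2 - y)


print(psychopy_to_eyelink_sketch((0.5, 0.5), units="norm"))    # (1200.0, 300.0)
print(psychopy_to_eyelink_sketch((0.5, 0.5), units="height"))  # (1400.0, 0.0)
print(eyelink_to_psychopy_pix_sketch((800, 600)))              # (0.0, 0.0)
```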

pupeyes.utils.gaussian_2d(img, fc)[source]#

Apply a 2D Gaussian filter to an image. Python adaptation of cvzoya/saliency

Parameters:
  • img (numpy.ndarray) – 2D input image array

  • fc (float) – Cut-off frequency (-6dB)

Returns:

Filtered image with same shape as input

Return type:

numpy.ndarray

Notes

Python adaptation of the Gaussian filtering method from the saliency metrics toolbox (cvzoya/saliency). The filter is applied in the frequency domain using FFT.

Examples

>>> # Create sample image with noise
>>> img = np.random.randn(100, 100)
>>> # Apply Gaussian filter
>>> filtered = gaussian_2d(img, fc=10)
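
A frequency-domain Gaussian low-pass with a −6 dB cut-off can be sketched with NumPy as shown below. This is a simplified illustration, not the package's code: it assumes fc is expressed in cycles per image, applies no padding, and uses a hypothetical function name.

```python
import numpy as np


def gaussian_lowpass_sketch(img, fc):
    """Multiply the image spectrum by a Gaussian whose gain is 0.5 (-6 dB) at radius fc."""
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    # Frequencies in cycles per image, in the same order as the FFT output.
    fy = np.fft.fftfreq(rows) * rows
    fx = np.fft.fftfreq(cols) * cols
    fxx, fyy = np.meshgrid(fx, fy)
    # Choose s so that exp(-fc**2 / s**2) == 0.5.
    s = fc / np.sqrt(np.log(2))
    gain = np.exp(-(fxx ** 2 + fyy ** 2) / s ** 2)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * gain))


# A horizontal grating at exactly fc cycles per image comes out at half amplitude.
columns = np.arange(64)
grating = np.tile(np.cos(2 * np.pi * 8 * columns / 64), (32, 1))
filtered = gaussian_lowpass_sketch(grating, fc=8)
print(round(float(filtered.max()), 3))
```
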
pupeyes.utils.get_isoeccentric_positions(n_items, radius, offset_deg=0, coordinate_system='psychopy', screen_dims=None, round_to=2)[source]#

Get coordinates for items arranged in a circle around screen center.

Parameters:
  • n_items (int) – Number of items to position in circle

  • radius (float) – Distance from screen center to each item

  • offset_deg (float, optional) – Rotation offset in degrees from rightmost position (counterclockwise). Default is 0.

  • coordinate_system ({'psychopy', 'eyelink'}, optional) – Output coordinate system: ‘psychopy’ (origin at center, positive y up) or ‘eyelink’ (origin at top-left, positive y down). Default is ‘psychopy’.

  • screen_dims (list, optional) – Screen dimensions [width, height] in pixels. Only used if coordinate_system is ‘eyelink’. Default is [1600, 1200].

  • round_to (int or None, optional) – Number of decimal places to round coordinates to. Default is 2. If None, no rounding is performed.

Returns:

List of (x,y) coordinate tuples for each item position, arranged counterclockwise starting from the rightmost position.

Return type:

list

Notes

  • Items are arranged counterclockwise at equal angular intervals

  • First item is placed at the rightmost position (0 degrees) plus any offset

  • Angular separation between items is 360°/n_items

Examples

>>> # Get 4 positions in PsychoPy coordinates (origin at center)
>>> get_isoeccentric_positions(4, 100, round_to=0)
[(100, 0), (0, 100), (-100, 0), (0, -100)]
>>> # Get 4 positions with 45° offset
>>> get_isoeccentric_positions(4, 100, offset_deg=45, round_to=0)
[(71, 71), (-71, 71), (-71, -71), (71, -71)]
>>> # Get positions in Eyelink coordinates (origin at top-left)
>>> get_isoeccentric_positions(4, 100, coordinate_system='eyelink', round_to=0)
[(900, 600), (800, 500), (700, 600), (800, 700)]
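
The geometry behind these examples is a polar-to-Cartesian conversion followed, for Eyelink output, by an origin shift and a flip of the y axis. A pure-Python sketch with a hypothetical function name and a simplified signature:

```python
import math


def circle_positions_sketch(n_items, radius, offset_deg=0, eyelink=False,
                            screen_dims=(1600, 1200), round_to=2):
    """Equally spaced positions on a circle, counterclockwise from the rightmost point."""
    positions = []
    for i in range(n_items):
        theta = math.radians(offset_deg + i * 360 / n_items)
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        if eyelink:
            # Move the origin to the top-left corner and flip the y axis.
            x, y = screen_dims[0] / 2 + x, screen_dims[1] / 2 - y
        if round_to is not None:
            x, y = round(x, round_to), round(y, round_to)
        positions.append((x, y))
    return positions


print(circle_positions_sketch(4, 100, offset_deg=45))
print(circle_positions_sketch(4, 100, eyelink=True))
```
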
pupeyes.utils.lowpass_filter(data, sampling_freq, cutoff_freq=4, order=3)[source]#

Apply a Butterworth lowpass filter to the input data.

Uses scipy.signal to create and apply a Butterworth filter that removes high frequency components above the cutoff frequency while preserving lower frequencies.

Parameters:
  • data (array-like) – Input signal to be filtered

  • sampling_freq (float) – Sampling frequency of the input signal in Hz

  • cutoff_freq (float, optional (default=4)) – Cutoff frequency of the filter in Hz. Frequencies above this will be attenuated.

  • order (int, optional (default=3)) – Order of the Butterworth filter. Higher orders give sharper frequency cutoffs but may introduce more ringing artifacts.

Returns:

Filtered version of the input signal with same shape as input

Return type:

numpy.ndarray

Notes

  • Uses scipy.signal.butter() to design the filter coefficients

  • Applies zero-phase filtering using scipy.signal.filtfilt()

  • The filter is applied forward and backward to avoid phase shifts
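
The design-then-apply pattern named in the notes looks roughly like this with SciPy. The wrapper name `lowpass_sketch` and the test signal are made up for illustration, and the package's own function may differ in details such as NaN handling.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def lowpass_sketch(data, sampling_freq, cutoff_freq=4, order=3):
    """Zero-phase Butterworth low-pass: design with butter(), apply with filtfilt()."""
    # Passing fs lets the cutoff be given in Hz rather than as a fraction of Nyquist.
    b, a = butter(order, cutoff_freq, btype="low", fs=sampling_freq)
    return filtfilt(b, a, np.asarray(data, dtype=float))


# Hypothetical signal: a slow 1 Hz component plus 50 Hz noise, sampled at 1000 Hz.
fs = 1000
t = np.arange(0, 4, 1 / fs)
slow = np.sin(2 * np.pi * 1 * t)
noisy = slow + 0.5 * np.sin(2 * np.pi * 50 * t)
smoothed = lowpass_sketch(noisy, fs)
print(float(np.max(np.abs(smoothed[500:-500] - slow[500:-500]))))
```

Away from the edges, the filtered trace should stay close to the 1 Hz component, since 50 Hz lies far above the 4 Hz cut-off.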

pupeyes.utils.make_mask(data, trials_to_mask, invert=False)[source]#

Create a boolean mask for filtering data based on specified trials.

Parameters:
  • data (pandas.DataFrame) – The main dataset to create a mask for

  • trials_to_mask (pandas.DataFrame or dict) – Trials to use for creating the mask. Can be a DataFrame or a dictionary that can be converted to a DataFrame. Should have matching column names with data

  • invert (bool, optional (default=False)) – If True, inverts the mask (changes True to False and vice versa)

Returns:

Boolean mask series with same length as input data. True values indicate rows to keep, False values indicate rows to filter out

Return type:

pandas.Series

Notes

  • If trials_to_mask is a dictionary, it will attempt to convert it to a DataFrame

  • Warns if resulting mask is all True or all False

  • Uses pandas merge with indicator to create the mask

Examples

>>> # Create sample dataset
>>> data = pd.DataFrame({
...     'trial': [1, 2, 3, 4, 5],
...     'condition': ['A', 'B', 'A', 'B', 'C'],
...     'rt': [0.5, 0.6, 0.4, 0.7, 0.5]
... })
>>>
>>> # Mask trials with condition 'A' using dictionary
>>> to_mask = {'condition': 'A'}
>>> mask = make_mask(data, to_mask)
>>> data[mask]  # Shows only trials with conditions B and C
   trial condition   rt
1     2         B  0.6
3     4         B  0.7
4     5         C  0.5
>>>
>>> # Mask multiple trials using DataFrame
>>> to_mask_df = pd.DataFrame({
...     'trial': [1, 3],
...     'condition': ['A', 'A']
... })
>>> mask = make_mask(data, to_mask_df)
>>> data[mask]  # Same result as above
   trial condition   rt
1     2         B  0.6
3     4         B  0.7
4     5         C  0.5
>>>
>>> # Keep only the masked trials using invert=True
>>> mask = make_mask(data, to_mask_df, invert=True)
>>> data[mask]  # Shows only trials with condition A
   trial condition   rt
0     1         A  0.5
2     3         A  0.4
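
The note about pandas merge with an indicator can be illustrated as follows. This is a sketch of the general technique, not the package's code: the function name is hypothetical, it accepts only a DataFrame for the trials to mask, and it omits the warnings.

```python
import pandas as pd


def mask_sketch(data, trials_to_mask, invert=False):
    """True for rows of `data` that do NOT match any row of `trials_to_mask`."""
    keys = trials_to_mask.drop_duplicates()
    # A left merge keeps the row order of `data`; `_merge` records whether a match was found.
    merged = data.merge(keys, on=list(keys.columns), how="left", indicator=True)
    keep = pd.Series((merged["_merge"] == "left_only").to_numpy(), index=data.index)
    return ~keep if invert else keep


data = pd.DataFrame({
    "trial": [1, 2, 3, 4, 5],
    "condition": ["A", "B", "A", "B", "C"],
    "rt": [0.5, 0.6, 0.4, 0.7, 0.5],
})
to_mask_df = pd.DataFrame({"trial": [1, 3], "condition": ["A", "A"]})
print(data[mask_sketch(data, to_mask_df)])
```
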
pupeyes.utils.mat2gray(img)[source]#

Scale image values to grayscale range [0, 1].

Parameters:

img (numpy.ndarray) – Input image array

Returns:

Normalized image with values scaled to range [0, 1]

Return type:

numpy.ndarray

Examples

>>> # Create sample image
>>> img = np.array([[0, 127, 255], [63, 191, 255]])
>>> normalized = mat2gray(img)
>>> normalized
array([[0.        , 0.49803922, 1.        ],
       [0.24705882, 0.74901961, 1.        ]])
pupeyes.utils.parse_pole(pole)[source]#

Parse and validate pole (origin) coordinates. from https://osdoc.cogsci.nl/3.3/manual/python/common/

Parameters:

pole (tuple or array-like) – (x, y) coordinates for the pole/origin point

Returns:

Validated (x, y) coordinates as floats

Return type:

tuple

Raises:

ValueError – If pole is not a valid 2D coordinate pair

Examples

>>> parse_pole((1, 2))
(1.0, 2.0)
>>> parse_pole([1.5, 2.5])
(1.5, 2.5)
pupeyes.utils.xy_circle(n, rho, phi0=0, pole=(0, 0))[source]#

Generate points arranged in a circle. from https://osdoc.cogsci.nl/3.3/manual/python/common/

Parameters:
  • n (int) – Number of points to generate

  • rho (float) – Radius of the circle (distance from center)

  • phi0 (float, optional) – Starting angle in degrees (counterclockwise from right). Default is 0.

  • pole (tuple, optional) – Center point (x, y) coordinates. Default is (0, 0).

Returns:

List of (x, y) coordinate tuples for points arranged in a circle

Return type:

list

Notes

Points are arranged counterclockwise starting from phi0. The angular separation between points is 360°/n.

Examples

>>> # Generate 4 points in a circle of radius 100
>>> xy_circle(4, 100)
[(100, 0), (0, 100), (-100, 0), (0, -100)]
>>> # Generate 4 points with 45° offset
>>> xy_circle(4, 100, phi0=45)
[(70.71, 70.71), (-70.71, 70.71), (-70.71, -70.71), (70.71, -70.71)]
pupeyes.utils.xy_from_polar(rho, phi, pole=(0, 0))[source]#

Convert polar coordinates to Cartesian coordinates. from https://osdoc.cogsci.nl/3.3/manual/python/common/

Parameters:
  • rho (float) – Radial distance from origin (or pole)

  • phi (float) – Angle in degrees (counterclockwise from right)

  • pole (tuple, optional) – Origin point (x, y) coordinates. Default is (0, 0).

Returns:

(x, y) coordinates in Cartesian system

Return type:

tuple

Notes

The angle phi is measured counterclockwise from the positive x-axis, following the mathematical convention.

Examples

>>> # Convert 45° angle at distance 100
>>> xy_from_polar(100, 45)
(70.71, 70.71)
>>> # Convert with offset origin
>>> xy_from_polar(100, 0, pole=(50, 50))
(150, 50)

Plotting Utilities#

Plotting Utilities for Eye Movement Data

This module provides plotting functions for eye movement data visualization, including heatmaps, scanpaths, and areas of interest (AOIs).

pupeyes.plot_utils.draw_aois(aois, screen_dims, x=None, y=None, background_img=None, alpha=0, colors=None, save=None)[source]#

Draw Areas of Interest (AOIs) and optionally plot fixation points within them.

This function visualizes AOIs as polygons and can optionally show fixation points colored according to which AOI they fall within. AOIs are drawn as outlined polygons with optional fill color and can be overlaid on a background image.

Parameters:
  • aois (dict) – Dictionary mapping AOI names to lists of (x, y) vertex coordinates defining the AOI polygons. The last vertex should be the same as the first vertex to close the polygon. Example: {‘AOI1’: [(100, 100), (200, 100), (200, 200), (100, 200), (100, 100)]}

  • screen_dims (tuple) – Screen dimensions in pixels (width, height). Used to set plot boundaries and maintain correct aspect ratio.

  • x (array-like, optional) – X coordinates of fixation points in screen coordinates (0 = left). If provided along with y, points will be plotted and colored based on which AOI they fall within.

  • y (array-like, optional) – Y coordinates of fixation points in screen coordinates (0 = top)

  • background_img (str, PIL.Image or numpy.ndarray, optional) – Background image to overlay AOIs on. Can be a path to an image file (str), a PIL Image object, or a NumPy array of image data. The image is resized to match screen_dims if necessary.

  • alpha (float, default=0) – Fill transparency for AOI polygons (0 = transparent, 1 = opaque). The outlines remain fully opaque regardless of this value.

  • colors (dict, optional) – Dictionary mapping AOI names to colors for both the AOI polygons and their associated fixation points. If None, uses matplotlib’s tab20 colormap to assign colors automatically.

  • save (str, optional) – Path where the plot should be saved. If None, plot is not saved to disk.

Returns:

(figure, axes) tuple containing the plot

Return type:

tuple

Notes

  • The coordinate system uses screen coordinates where (0,0) is at the top-left

  • AOIs are drawn with solid outlines and optional transparent fill

  • When background_img is provided, it is displayed with 40% opacity

  • Fixation points outside any AOI are colored gray

  • A legend is automatically added showing AOI names

  • The plot maintains the correct aspect ratio based on screen dimensions
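
To make the AOI format concrete, the sketch below shows one way a fixation point could be assigned to an AOI polygon, using a standard ray-casting test. It is an illustration under stated assumptions, not the method draw_aois uses internally: `point_in_polygon` and `assign_aoi` are hypothetical helpers, and the pupeyes implementation may rely on a different point-in-polygon routine.

```python
def point_in_polygon(px, py, vertices):
    """Ray-casting test; vertices is a closed list of (x, y) tuples."""
    inside = False
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        # Does a horizontal ray starting at (px, py) cross this edge?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def assign_aoi(px, py, aois):
    """Return the name of the first AOI containing the point, else None."""
    for name, vertices in aois.items():
        if point_in_polygon(px, py, vertices):
            return name
    return None


# AOI dictionary in the format documented for the aois parameter
aois = {"AOI1": [(100, 100), (200, 100), (200, 200), (100, 200), (100, 100)]}
labels = [assign_aoi(x, y, aois) for x, y in [(150, 150), (300, 50)]]
print(labels)
```

Because each polygon repeats its first vertex at the end, pairing consecutive vertices covers every edge without a special case for the closing segment.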

pupeyes.plot_utils.draw_heatmap(x, y, screen_dims, durations=None, fc=6, colormap='viridis', alpha=0.7, background_img=None, return_data=False)[source]#

Create a heatmap visualization of fixation density using 2D histogram and Gaussian smoothing.

This function generates a heatmap by first creating a 2D histogram of fixation locations, then applying Gaussian smoothing to create a continuous representation of fixation density. The resulting heatmap can be overlaid on a background image if provided.

Parameters:
  • x (array-like) – X coordinates of fixations in screen coordinates (0 = left)

  • y (array-like) – Y coordinates of fixations in screen coordinates (0 = top)

  • screen_dims (tuple) – Screen dimensions in pixels (width, height). Used to set the histogram bins and plot boundaries.

  • durations (array-like, optional) – Fixation durations for weighting the heatmap. If provided, longer fixations will contribute more to the density estimate.

  • fc (float, default=6) – Cutoff frequency (-6 dB) for Gaussian smoothing. Higher values result in less smoothing.

  • colormap (str, default='viridis') – Matplotlib colormap to use for the heatmap visualization

  • alpha (float, default=0.7) – Transparency of the heatmap overlay (0 = transparent, 1 = opaque)

  • background_img (str, PIL.Image or numpy.ndarray, optional) – Background image to overlay heatmap on. Can be a path to an image file (str), a PIL Image object, or a NumPy array of image data. The image is resized to match screen_dims if necessary.

  • return_data (bool, default=False) – If True, returns the raw heatmap array instead of plotting

Returns:

If return_data is True:

Returns the normalized heatmap array (shape: height x width)

If return_data is False:

Returns (figure, axes) tuple containing the plot

Return type:

tuple or numpy.ndarray

Notes

  • The heatmap is generated using numpy.histogram2d and smoothed using a Gaussian filter

  • The coordinate system uses screen coordinates where (0,0) is at the top-left

  • The heatmap values are normalized to the range [0,1]

  • When using a background image, the heatmap is overlaid with the specified alpha transparency
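
The histogram-then-smooth pipeline in the notes above can be sketched with NumPy. This is a rough illustration, not the pupeyes implementation: `heatmap_sketch` is a hypothetical name, and the `sigma` parameter (Gaussian width in pixels) is an assumption standing in for the fc-based smoothing, since the mapping from fc to a kernel width is not documented here.

```python
import numpy as np


def heatmap_sketch(x, y, screen_dims, durations=None, sigma=25.0):
    """Hypothetical sketch: 2D histogram, Gaussian smoothing, normalization."""
    width, height = screen_dims
    # Pass y first so the result has shape (height, width)
    hist, _, _ = np.histogram2d(
        y, x,
        bins=(height, width),
        range=[[0, height], [0, width]],
        weights=durations,  # None means every fixation counts equally
    )
    # Separable Gaussian kernel (assumed shorter than both screen dimensions)
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-(t ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    smooth = np.apply_along_axis(np.convolve, 0, hist, kernel, mode="same")
    smooth = np.apply_along_axis(np.convolve, 1, smooth, kernel, mode="same")
    peak = smooth.max()
    # Normalize to [0, 1]; leave an all-zero map untouched
    return smooth / peak if peak > 0 else smooth


hm = heatmap_sketch([50, 150], [20, 80], (200, 100),
                    durations=[100, 300], sigma=5.0)
print(hm.shape, hm.max())
```

Weighting by duration means the longer fixation dominates the normalized map, which is the behavior the durations parameter describes.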

pupeyes.plot_utils.draw_scanpath(x, y, screen_dims, durations=None, dot_size_scale=3.0, line_width=1.0, dot_cmap='viridis', line_cmap='coolwarm', dot_alpha=0.8, line_alpha=0.5, background_img=None, show_labels=True, label_offset=(5, 5))[source]#

Create a visualization of fixation sequence (scanpath) with numbered points and connecting lines.

This function visualizes the sequence of fixations by plotting points at fixation locations and connecting them with lines to show the order. The points can be sized by fixation duration and colored using a colormap. The connecting lines use a different colormap to show sequence order.

Parameters:
  • x (array-like) – X coordinates of fixations in screen coordinates (0 = left)

  • y (array-like) – Y coordinates of fixations in screen coordinates (0 = top)

  • screen_dims (tuple) – Screen dimensions in pixels (width, height). Used to set plot boundaries.

  • durations (array-like, optional) – Fixation durations in milliseconds. If provided, dot sizes will be scaled by the square root of duration.

  • dot_size_scale (float, default=3.0) – Base size for dots if no duration data, or scaling factor for dot sizes when durations are provided. Larger values = bigger dots.

  • line_width (float, default=1.0) – Width of the lines connecting fixation points

  • dot_cmap (str, default='viridis') – Colormap for dots. If durations are provided, dot color represents duration. If no durations are provided, all dots are blue.

  • line_cmap (str, default='coolwarm') – Colormap for connecting lines to show sequence order. Earlier saccades are colored differently from later ones.

  • dot_alpha (float, default=0.8) – Transparency of fixation dots (0 = transparent, 1 = opaque)

  • line_alpha (float, default=0.5) – Transparency of connecting lines (0 = transparent, 1 = opaque)

  • background_img (str, PIL.Image or numpy.ndarray, optional) – Background image to overlay scanpath on. Can be a path to an image file (str), a PIL Image object, or a NumPy array of image data. The image is resized to match screen_dims if necessary.

  • show_labels (bool, default=True) – Whether to show numeric labels for fixation sequence order

  • label_offset (tuple, default=(5, 5)) – (x, y) offset in pixels for the position of numeric labels relative to fixation points

Returns:

(figure, axes) tuple containing the plot

Return type:

tuple

Notes

  • The coordinate system uses screen coordinates where (0,0) is at the top-left

  • Dot sizes are scaled by sqrt(duration) if durations are provided

  • When using a background image, it is displayed with 40% opacity

  • Fixation sequence is numbered starting from 1

  • Lines between fixations show the saccade paths
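
As a small worked example of the square-root scaling mentioned in the notes, the sketch below computes dot sizes and 1-based sequence labels for three fixations. The exact formula used by draw_scanpath is not shown in this reference, so multiplying dot_size_scale by sqrt(duration) is an assumption, and `dot_sizes_sketch` is a hypothetical helper.

```python
import math


def dot_sizes_sketch(durations, dot_size_scale=3.0):
    """Assumed scaling: size = dot_size_scale * sqrt(duration in ms)."""
    return [dot_size_scale * math.sqrt(d) for d in durations]


durations = [100, 400, 225]
sizes = dot_sizes_sketch(durations)
# Fixation labels start from 1, as stated in the notes
labels = list(range(1, len(durations) + 1))
print(list(zip(labels, sizes)))
```

Square-root scaling keeps long fixations from visually overwhelming short ones: a fixation four times as long gets a dot only twice as large.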

Miscellaneous#

Saccade Functions#

Saccade Analysis Module

This module provides functions for analyzing saccadic eye movements recorded with EyeLink eye trackers. Currently, this file only contains functions that are tailored for visual search tasks in which items are presented in a circular array (e.g., the additional singleton task).

pupeyes.saccades.saccade_aoi_angular(sample_data, data, col_sample_timestamp, col_x, col_y, col_saccade_start_time, col_saccade_end_time, col_target_pos, col_distractor_pos, col_distractor_cond, col_other_pos, item_coords, use=None, threshold=30)[source]#

Classify saccades based on their angular deviation towards potential target locations. Different from saccade_aoi_annulus(), this function uses the initial firing direction of a saccade to classify its destination. As a result, it also requires raw gaze position data. Make sure to use the same coordinate system for both sample_data and data.

Parameters:
  • sample_data (pandas.DataFrame) – Raw eye tracking samples containing gaze positions

  • data (pandas.DataFrame) – Saccade data with start/end times

  • col_sample_timestamp (str) – Column name for timestamps in sample_data

  • col_x (str) – Column name for x coordinates in sample_data

  • col_y (str) – Column name for y coordinates in sample_data

  • col_saccade_start_time (str) – Column name for saccade start times in data

  • col_saccade_end_time (str) – Column name for saccade end times in data

  • col_target_pos (str) – Column name for target position coordinates

  • col_distractor_pos (str) – Column name for distractor position coordinates

  • col_distractor_cond (str) – Column name for distractor condition (‘P’ for present, ‘A’ for absent)

  • col_other_pos (list of str or None) – Column names for other item position coordinates

  • item_coords (list or numpy.ndarray) – List of (x,y) coordinates for all possible item positions

  • use (str or int, optional) – Point in the trajectory of a saccade to use for classification: ‘mid’ (midpoint), ‘one-third’ (one-third point), an int (specific sample number), or None (endpoint). The signature default is None.

  • threshold (float, optional) – Maximum angular deviation (degrees) to consider a saccade as directed towards an item (default: 30)

Returns:

Original DataFrame with added columns:

  • curritem (str)

    Item type (‘Target’, ‘Singleton’, ‘Non-singleton’, or NaN)

  • flag (str)

    Reason for invalid classification (‘insufficient_samples’, ‘big_angle’, or NaN)

Return type:

pandas.DataFrame

Notes

  • If a saccade starts outside the annulus, it is classified as ‘invalid_start_pos’.

  • If a saccade ends outside the annulus, it is classified as ‘invalid_end_pos’.

  • If a saccade ends too far from any item, it is classified as ‘no_item_in_range’.
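
The idea of classifying a saccade by its initial direction can be sketched as follows. This is an illustration only, not the pupeyes algorithm: `angular_difference` and `classify_by_direction` are hypothetical helpers, and the sketch assumes the direction is taken from the saccade start to a single probe point (for example the trajectory midpoint) and compared with the direction from the start to each item.

```python
import math


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180) % 360 - 180)


def classify_by_direction(start, probe, item_coords, threshold=30):
    """Return (item index or None, smallest angular deviation in degrees)."""
    saccade_angle = math.degrees(
        math.atan2(probe[1] - start[1], probe[0] - start[0]))
    best_idx, best_dev = None, float("inf")
    for idx, (ix, iy) in enumerate(item_coords):
        item_angle = math.degrees(math.atan2(iy - start[1], ix - start[0]))
        dev = angular_difference(saccade_angle, item_angle)
        if dev < best_dev:
            best_idx, best_dev = idx, dev
    if best_dev > threshold:
        # Comparable in spirit to the 'big_angle' flag described above
        return None, best_dev
    return best_idx, best_dev


items = [(100, 0), (0, 100), (-100, 0), (0, -100)]
print(classify_by_direction((0, 0), (50, 10), items))
print(classify_by_direction((0, 0), (50, 50), items))
```

Wrapping the difference into the range 0 to 180 degrees avoids misclassifying directions that straddle the 0/360 boundary.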

pupeyes.saccades.saccade_aoi_annulus(data, item_coords, col_startx, col_starty, col_endx, col_endy, col_distractor_cond, col_target_pos, col_distractor_pos, col_other_pos=None, screen_dims=(1600, 1200), annulus_range=(50, 600), item_range=None, start_range=None, fixation_mode=False)[source]#

Classify saccade endpoints or fixations based on their proximity to items within an annular region. The function assumes EyeLink coordinates, where the origin is in the top-left corner. You might need to convert your coordinates before using this function.

Parameters:
  • data (pandas.DataFrame) – DataFrame containing saccade or fixation data

  • item_coords (list or numpy.ndarray) – List of (x,y) coordinates for all possible item positions

  • col_startx (str) – Column name for saccade start x coordinates

  • col_starty (str) – Column name for saccade start y coordinates

  • col_endx (str) – Column name for saccade end x coordinates

  • col_endy (str) – Column name for saccade end y coordinates

  • col_distractor_cond (str) – Column name for distractor condition (‘P’ for present, ‘A’ for absent)

  • col_target_pos (str) – Column name for target position coordinates

  • col_distractor_pos (str) – Column name for distractor position coordinates

  • col_other_pos (list of str, optional) – Column names for other item position coordinates

  • screen_dims (tuple, optional) – Screen dimensions (width, height) in pixels (default: (1600, 1200))

  • annulus_range (tuple, optional) – Inner and outer radius of annulus in pixels (default: (50, 600))

  • item_range (float, optional) – Maximum distance to consider a point as belonging to an item

  • start_range (float, optional) – Maximum allowed distance from screen center for start position

  • fixation_mode (bool, optional) – If True, only check end positions (default: False)

Returns:

Original DataFrame with added columns:

  • curritem (str)

    Item type (‘Target’, ‘Singleton’, ‘Non-singleton’, or NaN)

  • currloc (int)

    Index of closest item position, based on the order provided in item_coords

  • flag (str)

    Reason for invalid classification (‘invalid_start_pos’, ‘invalid_end_pos’, ‘no_item_in_range’, or NaN)

Return type:

pandas.DataFrame

Notes

  • If a saccade starts outside the annulus, it is classified as ‘invalid_start_pos’.

  • If a saccade ends outside the annulus, it is classified as ‘invalid_end_pos’.

  • If a saccade ends too far from any item, it is classified as ‘no_item_in_range’.
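
The endpoint rules in the notes above can be sketched for a single saccade endpoint. This is a simplified illustration rather than the pupeyes implementation: `classify_endpoint` is a hypothetical helper, the annulus is assumed to be centered on the screen center, and the start-position check is omitted for brevity.

```python
import math


def classify_endpoint(end, item_coords, screen_dims=(1600, 1200),
                      annulus_range=(50, 600), item_range=None):
    """Return (index of nearest item or None, flag or None) for one endpoint."""
    cx, cy = screen_dims[0] / 2, screen_dims[1] / 2  # assumed annulus center
    radius = math.hypot(end[0] - cx, end[1] - cy)
    inner, outer = annulus_range
    if not (inner <= radius <= outer):
        return None, "invalid_end_pos"
    distances = [math.hypot(end[0] - ix, end[1] - iy) for ix, iy in item_coords]
    nearest = min(range(len(distances)), key=distances.__getitem__)
    if item_range is not None and distances[nearest] > item_range:
        return None, "no_item_in_range"
    return nearest, None


# Four items on a circle of radius 300 px around the screen center
items = [(1100, 600), (800, 900), (500, 600), (800, 300)]
print(classify_endpoint((1090, 610), items, item_range=100))
print(classify_endpoint((800, 600), items, item_range=100))
print(classify_endpoint((950, 750), items, item_range=100))
```

The returned index follows the order of item_coords, which matches how the currloc column is described above.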

pupeyes.saccades.saccade_deviation(sample_data, data, col_sample_timestamp, col_x, col_y, col_saccade_start_time, col_saccade_end_time, find='mid')[source]#

Compute the angular deviation of saccade trajectories from a straight path.

This function measures how much a saccade’s trajectory deviates from a straight line between its start and end points. The deviation is measured as the angle between two lines: one from start to end point, and another from start to a specified point along the trajectory. This function may be helpful for detecting curved saccades. Make sure to use the same coordinate system for both sample_data and data.

Parameters:
  • sample_data (pandas.DataFrame) – Raw eye tracking samples containing gaze positions

  • data (pandas.DataFrame) – Saccade data with start/end times

  • col_sample_timestamp (str) – Column name for timestamps in sample_data

  • col_x (str) – Column name for x coordinates in sample_data

  • col_y (str) – Column name for y coordinates in sample_data

  • col_saccade_start_time (str) – Column name for saccade start times in data

  • col_saccade_end_time (str) – Column name for saccade end times in data

  • find (str or int, optional) – Point in trajectory for curvature calculation: ‘mid’ (use midpoint, default), ‘one-third’ (use one-third point), ‘max’ (find point of maximum deviation), an int (use specific sample number), or None (use endpoint).

Returns:

Original DataFrame with added columns:

  • deviation (float)

    Angular deviation at specified point (degrees)

  • deviation_idx (int)

    Sample index where deviation was computed

  • deviation_time (float)

    Timestamp where deviation was computed

Return type:

pandas.DataFrame

Notes

  • If a saccade starts outside the annulus, it is classified as ‘invalid_start_pos’.

  • If a saccade ends outside the annulus, it is classified as ‘invalid_end_pos’.

  • If a saccade ends too far from any item, it is classified as ‘no_item_in_range’.
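
The deviation measure described for saccade_deviation, the angle between the start-to-end line and the line from the start to a point along the trajectory, can be sketched for one saccade. This is an illustration under assumptions, not the pupeyes code: `deviation_at` is a hypothetical helper, the sign convention is arbitrary, index selection by integer division is a guess, and the ‘max’ option is not covered.

```python
import math


def deviation_at(samples, find="mid"):
    """Signed deviation (degrees) for one saccade.

    samples: list of (x, y) gaze positions, start first and end last.
    """
    if len(samples) < 3:
        return None  # too few samples to define an intermediate point
    if find == "mid":
        idx = len(samples) // 2
    elif find == "one-third":
        idx = len(samples) // 3
    else:
        idx = int(find)  # specific sample number
    (sx, sy), (ex, ey), (px, py) = samples[0], samples[-1], samples[idx]
    straight = math.atan2(ey - sy, ex - sx)  # direction start -> end
    probe = math.atan2(py - sy, px - sx)     # direction start -> chosen point
    diff = math.degrees(probe - straight)
    return (diff + 180) % 360 - 180          # wrap into [-180, 180)


curved = [(0, 0), (50, 50), (100, 0)]
straight_path = [(0, 0), (50, 0), (100, 0)]
print(deviation_at(curved), deviation_at(straight_path))
```

A perfectly straight saccade yields zero deviation, while a trajectory that bows away from the straight path yields a non-zero angle.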

External Modules#

EDF Reader#

EDF Reader of EyeLink Data

Adapted from: esdalmaijer/PyGazeAnalyser

Original Author: Edwin Dalmaijer

License: GNU GPL v3

Adapted By: Han Zhang <hanzh@umich.edu>

Date: 12/25/2024

Changes:
  • Added support for reading metadata.

  • Added support for storing the last message and its time for each sample and event.

  • Moved checking trial end to the end of the loop to allow the last line (stop MSG) to be extracted.

pupeyes.external.edfreader.read_edf(filename, start, stop=None, missing=0.0, debug=False, progress_bar=True)[source]#

Read EyeLink Data Format (EDF) file and extract trial data.

Adapted from: esdalmaijer/PyGazeAnalyser

Original Author: Edwin Dalmaijer

Parameters:
  • filename (str) – Path to the file that has to be read

  • start (str) – Trial start string to identify beginning of trials

  • stop (str, optional) – Trial ending string, by default None

  • missing (float, optional) – Value to be used for missing data, by default 0.0

  • debug (bool, optional) – If True, prints information about current processing steps, by default False

  • progress_bar (bool, optional) – If True, shows a progress bar while reading the file, by default True

Returns:

Contains two elements:

  • data (list)
    List of dictionaries, one per trial, each containing:
    • x (numpy.ndarray)

      Array of x positions

    • y (numpy.ndarray)

      Array of y positions

    • size (numpy.ndarray)

      Array of pupil sizes

    • time (numpy.ndarray)

      Array of timestamps, t=0 at trial start

    • trackertime (numpy.ndarray)

      Array of timestamps according to EDF

    • events (dict)

      Dictionary containing event data (fixations, saccades, blinks, and messages)

  • metadata (dict)

    Dictionary containing calibration and tracking information

Return type:

tuple
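
To illustrate how the per-trial dictionaries might be consumed, the sketch below computes the proportion of missing gaze samples in one trial. It runs on a hand-made mock trial rather than a real EDF file: `missing_proportion` is a hypothetical helper, the mock values are invented for illustration, and only the documented keys x, y, size, and time are used.

```python
def missing_proportion(trial, missing=0.0):
    """Fraction of samples whose x or y equals the missing code."""
    xs, ys = trial["x"], trial["y"]
    n = len(xs)
    if n == 0:
        return 0.0
    n_missing = sum(1 for x, y in zip(xs, ys) if x == missing or y == missing)
    return n_missing / n


# Mock trial using the documented keys; the values are made up
mock_trial = {
    "x": [512.0, 0.0, 520.0, 0.0],
    "y": [384.0, 0.0, 380.0, 0.0],
    "size": [1200.0, 0.0, 1210.0, 0.0],
    "time": [0, 1, 2, 3],
}
print(missing_proportion(mock_trial))
```

The helper iterates over the sequences element by element, so it should work equally with plain lists and with the NumPy arrays that read_edf is documented to return.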

pupeyes.external.edfreader.replace_missing(value, missing=0.0)[source]#

Replace missing values in gaze position data.

Adapted from: esdalmaijer/PyGazeAnalyser

Original Author: Edwin Dalmaijer

Parameters:
  • value (str) – Either an X or a Y gaze position value (NOT pupil size, which is coded ‘0.0’)

  • missing (float, optional) – The missing code to replace missing data with, by default 0.0

Returns:

Either the missing code, or the float value of the gaze position

Return type:

float

Notes

A missing value in the EDF contains only a period, no numbers. This function is for gaze position values only, NOT for pupil size, as missing pupil size data is coded ‘0.0’.
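
The rule in the note above can be sketched in a few lines. This is an illustration of the described behavior, not the library source: `replace_missing_sketch` is a hypothetical name, and stripping surrounding whitespace before the comparison is an assumption.

```python
def replace_missing_sketch(value, missing=0.0):
    """Return the missing code for a lone period, else the value as a float."""
    if value.strip() == ".":  # a missing gaze value contains only a period
        return missing
    return float(value)


print(replace_missing_sketch("."))
print(replace_missing_sketch("512.3"))
print(replace_missing_sketch(".", missing=-1.0))
```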