API Reference#
This page contains the API reference for the pupeyes package.
Reading Eyelink Data#
Eyelink Data Parsing Module
This module parses Eyelink ASC data: messages, samples, fixations, saccades, and blinks.
- class pupeyes.data.eyelink.EyelinkReader(path, start_msg, stop_msg, msg_format, delimiter, add_cols=None, progress_bar=True)[source]#
Bases: object
A class to read and parse Eyelink eye tracking data files. This class handles loading and parsing of Eyelink data files, providing methods to extract messages, samples, fixations, saccades and blinks. It supports customizable message formats and additional column specifications.
- Parameters:
path (str) – Path to the Eyelink data file
start_msg (str) – Common part of message marking the start of a trial. For example, if your trial start messages are ‘TRIAL_START 1 1’, ‘TRIAL_START 1 2’, etc., then start_msg would be ‘TRIAL_START’
stop_msg (str) – Common part of message marking the end of a trial. For example, if your trial end messages are ‘TRIAL_END 1 1’, ‘TRIAL_END 1 2’, etc., then stop_msg would be ‘TRIAL_END’
msg_format (dict) – Dictionary specifying the format of messages. The messages will be parsed based on this format. Example: {‘marker’: str, ‘event’: str, ‘block’: int, ‘trial’: int}
delimiter (str) – Character used to separate message components. For example, if messages are formatted as ‘TRIAL_END 1 1’, the delimiter would be ‘ ’.
add_cols (dict, optional) – Additional columns to add to output DataFrames. The dictionary should be in the format {‘column_name’: column_data}. For example, to add a column ‘subject’ with value ‘S01’ to all rows, use {‘subject’: ‘S01’}.
progress_bar (bool, optional) – If True, shows a progress bar while reading the data file. Default is True.
- data#
Raw unformatted Eyelink data
- Type:
pd.DataFrame
- messages#
Extracted messages from the data file
- Type:
pd.DataFrame
- metadata#
Metadata from the Eyelink data file
- Type:
dict
Examples
>>> reader = EyelinkReader(
...     path='subject01.asc',
...     start_msg='TRIAL_START',
...     stop_msg='TRIAL_END',
...     msg_format={'marker': str, 'event': str, 'block': int, 'trial': int},
...     delimiter=' '
... )
- get_blinks(strict=True, parse_messages=True)[source]#
Extract and process blink events from the dataset.
This method extracts all blink events from the Eyelink recording, including their duration and associated messages.
- Parameters:
strict (bool, optional) – If True, removes “bridge” blinks: blinks with start times before an event and end times after the same event. Default is True.
parse_messages (bool, optional) – If True, parses the associated messages according to the predefined message format. Default is True.
- Returns:
DataFrame containing processed blink data with columns:
- eye (str) – Eye identifier (left/right)
- starttime (float) – Start time of blink
- endtime (float) – End time of blink
- duration (float) – Duration of blink in milliseconds
- msg (str) – Message string (if parse_messages=False)
- msgtime (float) – Message timestamp
Additional columns from message parsing if parse_messages=True.
Additional columns from self.add_cols if specified.
- Return type:
pd.DataFrame
Notes
Columns are converted to the appropriate data type that supports pd.NA.
Blinks are detected by Eyelink’s algorithm.
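The strict filter can be pictured as an interval test. The following is a minimal plain-Python sketch of the "bridge" rule as described above, not the package's implementation; `drop_bridge_events` and the sample values are hypothetical.

```python
def drop_bridge_events(events):
    """Keep events that do not straddle their associated message.

    Each event is a dict with 'starttime', 'endtime', and 'msgtime'.
    An event counts as a "bridge" when it starts before the message
    and ends after it. Hypothetical helper, not part of pupeyes.
    """
    return [
        ev for ev in events
        if not (ev["starttime"] < ev["msgtime"] < ev["endtime"])
    ]


blinks = [
    # starts before the message and ends after it: a bridge blink
    {"eye": "left", "starttime": 950.0, "endtime": 1100.0, "msgtime": 1000.0},
    # lies entirely after the message: kept
    {"eye": "left", "starttime": 1500.0, "endtime": 1620.0, "msgtime": 1000.0},
]
kept = drop_bridge_events(blinks)
```

The same test would apply to fixations and saccades, since their strict option is described in the same terms.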
- get_fixations(strict=True, parse_messages=True)[source]#
Extract and process fixation events from the dataset.
This method extracts all fixation events from the Eyelink recording, including their duration, position, and associated messages.
- Parameters:
strict (bool, optional) – If True, removes “bridge” fixations: fixations with start times before an event and end times after the same event. Default is True.
parse_messages (bool, optional) – If True, parses the associated messages according to the predefined message format. Default is True.
- Returns:
DataFrame containing processed fixation data with columns:
- eye (str) – Eye identifier (left/right)
- starttime (float) – Start time of fixation
- endtime (float) – End time of fixation
- duration (float) – Duration of fixation in milliseconds
- endx (float) – X-coordinate at end of fixation
- endy (float) – Y-coordinate at end of fixation
- msg (str) – Raw message string (if parse_messages=False)
- msgtime (float) – Message timestamp
Additional columns from message parsing if parse_messages=True.
Additional columns from self.add_cols if specified.
- Return type:
pd.DataFrame
Notes
Columns are converted to the appropriate data type that supports pd.NA. If strict=True, fixations starting before their associated trial message are removed from the output.
- get_messages()[source]#
Extract and process marker events from the Eyelink dataset.
This method extracts all message events from the data and parses them according to the specified message format and delimiter.
- Returns:
DataFrame containing processed message data with columns:
- id (int) – Trial identifier
- trackertime (float) – Eye tracker timestamps
- message (str) – Raw message string
Additional columns to store parsed message parts based on msg_format specification.
Additional columns from self.add_cols are added if specified.
- Return type:
pd.DataFrame
Notes
Messages are split using the specified delimiter and parsed according to the data types specified in msg_format.
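The splitting and typing described in the notes can be sketched in plain Python. This is an illustration only, not the package's code: `parse_message` is a hypothetical helper, and a three-field format is assumed here so that it matches the three-part message ‘TRIAL_START 1 1’.

```python
def parse_message(message, msg_format, delimiter=" "):
    """Split a raw message and cast each part to the type in msg_format.

    Hypothetical helper for illustration; not part of pupeyes.
    """
    parts = message.split(delimiter)
    if len(parts) != len(msg_format):
        raise ValueError(
            f"expected {len(msg_format)} parts, got {len(parts)}: {message!r}"
        )
    # dicts preserve insertion order, so field names line up with the parts
    return {
        name: cast(part)
        for (name, cast), part in zip(msg_format.items(), parts)
    }


msg_format = {"marker": str, "block": int, "trial": int}
parsed = parse_message("TRIAL_START 1 1", msg_format)
```

How the real reader handles messages with more or fewer parts than msg_format is not stated here; the sketch simply raises an error.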
- get_saccades(strict=True, remove_blinks=True, srt=True, parse_messages=True)[source]#
Extract and process saccadic eye movements from the dataset.
This method extracts all saccade information from the dataset, with the option to remove saccades that overlap with blinks and calculate saccade reaction times.
- Parameters:
strict (bool, optional) – If True, removes “bridge” saccades: saccades with start times before an event and end times after the same event. Default is True.
remove_blinks (bool, optional) – If True, removes saccades that overlap with blink periods. This is recommended for Eyelink data as Eyelink embeds a blink inside a saccade. Default is True.
srt (bool, optional) – If True, calculates saccade reaction time (srt) as the difference between saccade start time and message timestamp. Default is True.
parse_messages (bool, optional) – If True, parses the associated messages according to the predefined message format. Default is True.
- Returns:
DataFrame containing processed saccade data with columns:
- eye (str) – Eye identifier (left/right)
- starttime (float) – Start time of saccade
- endtime (float) – End time of saccade
- duration (float) – Duration of saccade in milliseconds
- startx (float) – Starting X coordinate
- starty (float) – Starting Y coordinate
- endx (float) – Ending X coordinate
- endy (float) – Ending Y coordinate
- ampl (float) – Amplitude of saccade in degrees
- pv (float) – Peak velocity in degrees/second
- msg (str) – Associated message (if parse_messages=False)
- msgtime (float) – Message timestamp
- srt (float) – Saccade reaction time (if srt=True)
Additional columns from message parsing if parse_messages=True.
Additional columns from self.add_cols if specified.
- Return type:
pd.DataFrame
Notes
Columns are converted to the appropriate data type that supports pd.NA.
If remove_blinks=True, saccades overlapping with blinks are removed
If strict=True, saccades starting before trial message are removed
Saccade reaction time (srt) is calculated as starttime - msgtime
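As a rough illustration of the two rules in these notes, the sketch below drops saccades that overlap a blink and computes srt as starttime - msgtime. It uses plain dicts and assumes closed-interval overlap; `overlaps` and `process_saccades` are hypothetical names, and the package may define overlap differently.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when the closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def process_saccades(saccades, blinks):
    """Drop saccades overlapping any blink, then add 'srt' = starttime - msgtime."""
    out = []
    for sac in saccades:
        if any(
            overlaps(sac["starttime"], sac["endtime"], b["starttime"], b["endtime"])
            for b in blinks
        ):
            continue  # a blink is embedded in this saccade
        out.append({**sac, "srt": sac["starttime"] - sac["msgtime"]})
    return out


saccades = [
    {"starttime": 1200.0, "endtime": 1240.0, "msgtime": 1000.0},
    {"starttime": 1480.0, "endtime": 1650.0, "msgtime": 1000.0},  # wraps the blink
]
blinks = [{"starttime": 1500.0, "endtime": 1620.0}]
result = process_saccades(saccades, blinks)
```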
- get_samples(parse_messages=True)[source]#
Extract and process raw eye tracking samples from the dataset.
This method extracts all sample data points from the Eyelink recording, including gaze position, pupil size, and associated messages.
- Parameters:
parse_messages (bool, optional) – If True, parses the associated messages according to the predefined message format. Default is True.
- Returns:
DataFrame containing processed sample data with columns:
- trialtime (float) – Trial timestamps
- trackertime (float) – Eye tracker timestamps
- x (float) – X coordinates of gaze position
- y (float) – Y coordinates of gaze position
- pp (float) – Pupil size measurements (arbitrary unit; measurement unit [area/diameter] depends on recording setting)
- msg (str) – Raw message strings
- msgtime (float) – Message timestamps
Additional columns from message parsing if parse_messages=True.
Additional columns from self.add_cols if specified.
- Return type:
pd.DataFrame
Notes
Columns are converted to the appropriate data type that supports pd.NA.
- parse_eyelink_data(progress_bar)[source]#
Loads and parses raw Eyelink data from the specified file. A wrapper for the read_edf function.
This method reads the Eyelink data file and extracts both the data and metadata.
- Returns:
A tuple containing:
- pd.DataFrame
The parsed Eyelink data
- dict
Metadata from the Eyelink file
- Return type:
tuple
Notes
The read_edf function is adapted from the PyGaze analysis package (esdalmaijer/PyGazeAnalyser).
Reading Tobii Data (from Titta)#
Tobii Data Parsing Module (from Titta)
This module parses Tobii data saved from a Titta experiment (hdf5 format): messages and raw gaze samples. It does not parse fixations, saccades, or blinks, because Titta does not save them.
For more info on the Titta package, see marcus-nystrom/Titta
- class pupeyes.data.tobii_titta.TobiiTittaReader(path, start_msg, stop_msg, msg_format, delimiter, add_cols=None)[source]#
Bases: object
A class to read and parse Tobii data saved from Titta (hdf5 format). This class handles loading and parsing of Tobii data files, providing methods to extract messages and gaze samples. However, it does not support parsing fixations, saccades, and blinks, as these are not saved by Titta.
Most functions here are wrappers for existing functionalities in the Titta package.
- Parameters:
path (str) – Path to the Tobii hdf5 data file
start_msg (str) – Common part of message marking the start of a trial. For example, if your trial start messages are ‘TRIAL_START 1 1’, ‘TRIAL_START 1 2’, etc., then start_msg would be ‘TRIAL_START’
stop_msg (str) – Common part of message marking the end of a trial. For example, if your trial end messages are ‘TRIAL_END 1 1’, ‘TRIAL_END 1 2’, etc., then stop_msg would be ‘TRIAL_END’
msg_format (dict) – Dictionary specifying the format of messages. The messages will be parsed based on this format. Example: {‘marker’: str, ‘event’: str, ‘block’: int, ‘trial’: int}
delimiter (str) – Character used to separate message components. For example, if messages are formatted as ‘TRIAL_END 1 1’, the delimiter would be ‘ ’.
add_cols (dict, optional) – Additional columns to add to output DataFrames. The dictionary should be in the format {‘column_name’: column_data}. For example, to add a column ‘subject’ with value ‘S01’ to all rows, use {‘subject’: ‘S01’}.
progress_bar (bool, optional) – If True, shows a progress bar while reading the data file. Default is True.
- calibration_history#
Raw calibration history as saved by Titta
- Type:
pd.DataFrame
- external_signal#
Raw external signal as saved by Titta
- Type:
pd.DataFrame
- gaze#
Raw gaze data as saved by Titta
- Type:
pd.DataFrame
- log#
Raw log data as saved by Titta
- Type:
pd.DataFrame
- msg#
Raw message data as saved by Titta
- Type:
pd.DataFrame
- notification#
Raw notification data as saved by Titta
- Type:
pd.DataFrame
- time_sync#
Raw time sync data as saved by Titta
- Type:
pd.DataFrame
Examples
>>> reader = TobiiTittaReader(
...     path='subject01.h5',
...     start_msg='TRIAL_START',
...     stop_msg='TRIAL_END',
...     msg_format={'marker': str, 'event': str, 'block': int, 'trial': int},
...     delimiter=' '
... )
- get_messages()[source]#
Extract and process marker events from the Titta dataset.
This method extracts all message events from the data and parses them according to the specified message format and delimiter.
- Returns:
DataFrame containing processed message data with columns:
- id (int) – Trial identifier
- system_time_stamp (float) – System timestamps
- msg (str) – Raw message string
Additional columns to store parsed message parts based on msg_format specification.
Additional columns from self.add_cols are added if specified.
- Return type:
pd.DataFrame
Notes
Messages are split using the specified delimiter and parsed according to the data types specified in msg_format.
- get_samples(parse_messages=True)[source]#
Extract gaze samples for each trial based on start and stop messages.
- Parameters:
parse_messages (bool, optional) – If True, parse message columns and add them to samples. If False, only add raw message. Default is True.
- Returns:
DataFrame containing processed sample data. Columns include all columns in self.gaze, as well as:
- trialtime (float) – Trial timestamps in milliseconds (since the start of each trial)
- msgtime (float) – Message timestamps in system timestamps (start time of each trial)
- msg (str) – Raw message strings (start message of each trial)
Additional columns from message parsing if parse_messages=True.
Additional columns from self.add_cols if specified.
- Return type:
pd.DataFrame
Notes
Columns are converted to the appropriate data type that supports pd.NA. Only trials with both start and stop messages are included in the output.
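A minimal sketch of how trial-relative time could be derived for one trial. It assumes system timestamps are in microseconds (as in the Tobii Pro SDK) and that samples are kept from the start message up to the stop message; `trial_samples` is a hypothetical helper, not a Titta or pupeyes function.

```python
def trial_samples(gaze_times_us, start_us, stop_us):
    """Return (trialtime_ms, system_time_stamp) pairs for samples inside one trial.

    Assumes timestamps are in microseconds; trialtime is milliseconds
    elapsed since the trial's start message.
    """
    return [
        ((t - start_us) / 1000.0, t)
        for t in gaze_times_us
        if start_us <= t <= stop_us
    ]


gaze_times_us = [1_000_000, 1_008_333, 1_016_667, 1_030_000]
samples = trial_samples(gaze_times_us, start_us=1_008_333, stop_us=1_020_000)
trialtimes = [round(ms, 3) for ms, _ in samples]
```

A trial that lacks either message has no well-defined window under this scheme, which is consistent with such trials being excluded from the output.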
Pupil Preprocessing#
Pupil Data Processing Module
This module provides tools for processing pupillometry data from eye trackers. It includes functionality for deblinking, smoothing, baseline correction, and plotting pupil size data.
- class pupeyes.pupil.PupilProcessor(data, trial_identifier, pupil_col, time_col, x_col, y_col, samp_freq, convert_pupil_size=False, artificial_d=5, artificial_size=5663, recording_unit='diameter', device='eyelink', eyetracker_missing_value=0, progress_bar=True)[source]#
Bases: object
A class for processing and analyzing pupillometry data.
This class provides methods for preprocessing pupil size data, including blink removal, artifact rejection, smoothing, and baseline correction. It also includes tools for data visualization and analysis.
- Parameters:
data (pd.DataFrame) – DataFrame containing pupil size data and associated measurements
trial_identifier (str or list) – Column name(s) identifying unique trials. If list, trials are uniquely identified by the combination of these columns.
pupil_col (str) – Column name containing pupil size measurements
time_col (str) – Column name containing timestamps. Must be in milliseconds and integer.
x_col (str) – Column name containing x-coordinates of gaze position
y_col (str) – Column name containing y-coordinates of gaze position
samp_freq (float) – Sampling frequency of the eye tracker in Hz
convert_pupil_size (bool, default=False) – Whether to convert pupil size from area to diameter or vice versa
artificial_d (float, default=5) – Artificial pupil diameter in mm, used for pupil size conversion
artificial_size (float, default=5663) – Artificial pupil size in arbitrary units, used for pupil size conversion
recording_unit ({'diameter', 'area'}, default='diameter') – Unit of the recorded pupil size
device ({'eyelink', 'tobii_titta', 'tobii_prolab','smi'}, default='eyelink') – Device type. At the moment, this only controls whether sampling frequency is checked.
eyetracker_missing_value (int, default=0) – Value for missing pupil size for the eye tracker. Different eye trackers use different values to indicate missing values. PupEyes will replace these values with 0. Other possible values are pd.NA, np.nan, -1, -999, etc.
progress_bar (bool, default=True) – Whether to show a progress bar for preprocessing steps
- data#
The processed pupil data
- Type:
pd.DataFrame
- summary_data#
Summary statistics for each trial
- Type:
pd.DataFrame
- trials#
A dataframe of unique trial identifiers
- Type:
pd.DataFrame
- params#
Dictionary storing parameters used in processing steps
- Type:
dict
- all_pupil_cols#
List of column names containing pupil data at different processing stages
- Type:
list
- all_steps#
List of processing steps applied to the data
- Type:
list
Notes
All processing methods return self for method chaining
Most methods create new columns with processed data rather than modifying existing ones
Processing parameters are stored in the params dictionary for reproducibility
Summary statistics are automatically updated after each processing step
artificial_d is the diameter of an artificial pupil provided by Eyelink.
artificial_size was measured for the setup of our research group and may not generalize to other setups.
- artifact_rejection(suffix='_ar', method='both', speed_n=16, zscore_threshold=2.5, zscore_allowp=0.1)[source]#
Reject artifacts from pupil data using speed and/or z-score based methods.
This method identifies and removes artifacts using two possible approaches:
1. Speed-based: Removes samples where pupil size changes too rapidly
2. Z-score based: Removes extreme values based on z-score thresholds
The method can use either approach individually or combine both.
- Parameters:
suffix (str, default='_ar') – Suffix to append to the pupil column name for the artifact-rejected data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_ar’.
method ({'speed', 'zscore', 'both'}, default='both') – Method to use for artifact rejection:
- ‘speed’: Use only speed-based rejection
- ‘zscore’: Use only z-score based rejection
- ‘both’: Use both methods
speed_n (int, default=16) – Number of MADs above median speed to use as threshold for speed-based rejection
zscore_threshold (float, default=2.5) – Z-score threshold for artifact rejection for z-score based rejection
zscore_allowp (float, default=0.1) – Proportion of mean to use as minimum standard deviation for z-score based rejection. If sd/mean < zscore_allowp, the z-score threshold is not applied. This is to avoid rejecting stable data.
- Returns:
self – Returns self for method chaining
- Return type:
PupilProcessor
Notes
- Updates summary_data with:
  - run_artifact: Boolean indicating if artifact rejection was performed
  - pct_artifact: Percentage of samples identified as artifacts
- Creates a new column with suffix appended to the current pupil column name
- Updates all_pupil_cols and all_steps to track processing history
- Artifact periods are replaced with NaN values
- Trials with all missing pupil data are skipped and reported
- Processing parameters are stored in self.params[‘artifact_rejection’]
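The two rejection rules can be sketched with the standard library. This is one reading of the parameter descriptions above, not the package's implementation: speed is taken as the absolute sample-to-sample change, the MAD is unscaled, and all function names are hypothetical.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation (unscaled)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def reject_by_speed(pupil, speed_n=16):
    """NaN the sample that ends any change faster than median + speed_n * MAD."""
    speeds = [abs(b - a) for a, b in zip(pupil, pupil[1:])]
    threshold = statistics.median(speeds) + speed_n * mad(speeds)
    out = list(pupil)
    for i, speed in enumerate(speeds):
        if speed > threshold:
            out[i + 1] = math.nan
    return out


def reject_by_zscore(pupil, threshold=2.5, allowp=0.1):
    """NaN samples with |z| > threshold, unless the trace is already stable."""
    valid = [v for v in pupil if not math.isnan(v)]
    mean, sd = statistics.fmean(valid), statistics.stdev(valid)
    if sd / mean < allowp:
        return list(pupil)  # sd is small relative to the mean: leave untouched
    return [
        math.nan if not math.isnan(v) and abs(v - mean) / sd > threshold else v
        for v in pupil
    ]


trace = [1000.0, 1002.0, 1001.0, 1003.0, 400.0, 1002.0, 1004.0, 1003.0]
cleaned = reject_by_speed(trace)
spiky = reject_by_zscore([100.0] * 10 + [1000.0])
```

In this simple version the sample following a dip is flagged as well, because the jump back up is also fast.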
- baseline_correction(baseline_query, baseline_range=[None, None], suffix='_bc', method='subtractive')[source]#
Apply baseline correction to pupil data.
Corrects pupil data by subtracting or dividing by baseline values. Creates a new column with the baseline-corrected data.
- Parameters:
baseline_query (str) – Query string to select baseline period data
baseline_range (list, default=[None, None]) – Start and end indices for baseline period
suffix (str, default='_bc') – Suffix to append to the pupil column name for the corrected data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_bc’.
method ({'subtractive', 'divisive'}, default='subtractive') – Method to use for baseline correction:
- ‘subtractive’: Subtract baseline mean from pupil data
- ‘divisive’: Divide pupil data by baseline mean
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
- Updates summary_data with:
  - run_baseline_correction: Boolean indicating if baseline correction was performed
  - baseline: Mean baseline value used for correction
- Adds a new column with suffix appended to the current pupil column name
- Updates all_pupil_cols and all_steps to track processing history
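The two correction methods differ only in the final arithmetic. A minimal sketch for a single trial, using plain lists in place of the DataFrame and query machinery; `baseline_correct` is a hypothetical name and the values are made up.

```python
import statistics


def baseline_correct(pupil, baseline, method="subtractive"):
    """Correct one trial's pupil trace by the mean of its baseline samples."""
    base = statistics.fmean(baseline)
    if method == "subtractive":
        return [v - base for v in pupil]
    if method == "divisive":
        return [v / base for v in pupil]
    raise ValueError(f"unknown method: {method!r}")


baseline = [1000.0, 1010.0, 990.0]   # samples from the baseline period
pupil = [1000.0, 1050.0, 1100.0]     # samples to correct

subtracted = baseline_correct(pupil, baseline)
divided = baseline_correct(pupil, baseline, method="divisive")
```

Subtractive output stays in the recording's units (change from baseline), while divisive output is a unitless ratio to baseline.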
- check_baseline_outliers(outlier_by=None, n_mad_baseline=4, plot=True, **kwargs)[source]#
Check for outliers in baseline pupil values.
Identifies outliers in baseline values using median absolute deviation (MAD). Can group data and check outliers within groups.
- Parameters:
outlier_by (str or list, optional) – Column(s) to group data by for outlier detection
n_mad_baseline (float, default=4) – Number of MADs from median to use as outlier threshold
plot (bool, default=True) – Whether to plot the baseline distributions
**kwargs (dict) – Additional arguments passed to plot_baseline
- Returns:
self – Returns self for method chaining
- Return type:
PupilProcessor
Notes
Updates summary_data with baseline outlier statistics
Updates all_steps
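A MAD-based threshold of this kind can be sketched as follows. The sketch assumes an unscaled MAD and symmetric bounds around the median; whether the package applies a scaling constant is not stated here, and `mad_outliers` is a hypothetical name.

```python
import statistics


def mad_outliers(values, n_mad=4):
    """Flag values farther than n_mad MADs from the median.

    Returns (flags, lower, upper). Uses the unscaled MAD.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    lower, upper = med - n_mad * mad, med + n_mad * mad
    flags = [not (lower <= v <= upper) for v in values]
    return flags, lower, upper


# One mean baseline value per trial (made-up numbers)
baselines = [980.0, 1000.0, 1010.0, 990.0, 1005.0, 1600.0]
flags, lower, upper = mad_outliers(baselines)
```

With outlier_by set, the same computation would presumably run once per group rather than over all trials.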
- check_missing(pupil_col=None, missing_value=<NA>)[source]#
Check for missing values in pupil data.
This method calculates the percentage of missing values for each trial and updates the summary statistics. Missing values can be either NaN or a specific value.
- Parameters:
pupil_col (str, optional) – Column name to check for missing values. If None, uses the latest pupil column.
missing_value (float or pd.NA, default=pd.NA) – Value to consider as missing. Can be pd.NA for NaN values or any specific value.
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
- Updates summary_data with:
  - run_check_missing: Boolean indicating if missing check was performed
  - missing: Percentage of missing values in each trial
- Updates all_steps to track processing history
- Trials that cannot be checked are reported
- Processing parameters are stored in self.params[‘check_missing’]
- check_sampling_frequency(sampling_rate=None, data=None)[source]#
Check if the sampling frequency is consistent. Only performed for Eyelink data.
This method checks if the sampling frequency is consistent across trials. If not, it raises an error. It is automatically called when initializing the PupilProcessor. If resampling is performed, the sampling frequency is checked again.
- Parameters:
sampling_rate (int, default=None) – Sampling rate to check against. If None, the processor's current sampling rate is used.
data (pd.DataFrame, default=None) – Data to check. If None, the processor's current data is used. The time column must contain integer timestamps in milliseconds.
- Returns:
check_pass – True if the sampling frequency is consistent, False otherwise
- Return type:
bool
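One simple way to test consistency is to compare every inter-sample interval with the interval implied by the sampling rate. The sketch below does this for a single trial with integer millisecond timestamps; it is an assumption about the kind of check performed, and `sampling_is_consistent` is a hypothetical name.

```python
def sampling_is_consistent(times_ms, samp_freq):
    """True when every step between consecutive timestamps equals 1000 / samp_freq ms."""
    expected = 1000 / samp_freq
    steps = {b - a for a, b in zip(times_ms, times_ms[1:])}
    return steps == {expected}


ok = sampling_is_consistent([0, 2, 4, 6, 8], samp_freq=500)
bad = sampling_is_consistent([0, 2, 4, 7, 8], samp_freq=500)
```

Note that dropped samples would fail this strict test even when the tracker's nominal rate is correct.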
- check_trace_outliers(time_col=None, pupil_col=None, outlier_by=None, n_mad_trace=4, plot=True, **kwargs)[source]#
Check for outlier trials based on pupil trace values.
Detects outlier trials by comparing each trial’s pupil trace against thresholds calculated from the median absolute deviation (MAD) of all trials. Outliers can be calculated globally or within specified groups.
- Parameters:
time_col (str, optional) – Column name for x-axis values (time). Defaults to time column.
pupil_col (str, optional) – Column name for pupil values. Defaults to last pupil column.
outlier_by (str or list, optional) – Column(s) to group trials by when calculating outlier thresholds.
n_mad_trace (float, default=4) – Number of MADs to use for outlier threshold.
plot (bool, default=True) – Whether to plot the results.
**kwargs – Additional arguments passed to plotting function.
- Returns:
self – Returns self for method chaining.
- Return type:
object
Notes
- Updates summary_data with:
  - run_trace_outlier: Boolean indicating if trace outlier detection was performed
  - trace_outlier: Boolean indicating if trial is an outlier
  - trace_upper: Upper threshold for outlier detection
  - trace_lower: Lower threshold for outlier detection
- Outlier detection uses median absolute deviation (MAD) method
- Can detect outliers globally or within groups specified by outlier_by
- static combine(processors)[source]#
Combine multiple PupilProcessor instances into a single instance.
This method allows combining data from multiple processors that have gone through identical preprocessing pipelines. This is useful for:
1. Processing large datasets in chunks to manage memory
2. Adding new data to an existing processed dataset
3. Processing data from multiple participants separately
- Parameters:
processors (list of PupilProcessor) – List of PupilProcessor instances to combine. All processors must have identical preprocessing settings.
- Returns:
A new PupilProcessor instance containing combined data.
- Return type:
PupilProcessor
Notes
- All processors must have identical:
  - Initialization parameters (pupil_col, time_col, etc.)
  - Data structure (column names and order)
  - Preprocessing steps and parameters
  - Outlier detection settings (if used)
- Data and summary statistics are concatenated
- Raises:
ValueError – If processors have different preprocessing settings, if processors have incompatible data structures, or if no processors are provided
- copy()[source]#
Create a deep copy of the PupilProcessor object.
This method creates an independent copy of the PupilProcessor object, including all data and processing history. Modifications to the copy will not affect the original object.
- Returns:
A deep copy of the current object.
- Return type:
PupilProcessor
Notes
Creates a completely independent copy using copy.deepcopy
All data, parameters, and processing history are copied
Useful for creating alternative processing pipelines
- deblink(suffix='_db')[source]#
Remove blinks from pupil data using noise-based blink detection.
This method identifies and removes blinks from pupil data using the based_noise_blinks_detection algorithm. Blinks are detected based on rapid changes in pupil size characteristic of eye closure. The method processes each trial separately and creates a new column containing the deblinked data.
- Parameters:
suffix (str, default='_db') – Suffix to append to the pupil column name for the deblinked data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_db’.
- Returns:
self – Returns self for method chaining
- Return type:
PupilProcessor
Notes
- Updates summary_data with:
  - run_deblink: Boolean indicating if deblinking was performed
  - pct_deblink: Percentage of samples identified as blinks
- Creates a new column with suffix appended to the current pupil column name
- Updates all_pupil_cols and all_steps to track processing history
- Blink periods are replaced with NaN values
- Trials with all missing pupil data are skipped and reported
- Processing parameters are stored in self.params[‘deblink’]
See also
based_noise_blinks_detection – The underlying blink detection algorithm
- downsample(target_samp_freq, agg_methods=None)[source]#
Downsample pupil data to a new sampling rate.
This method downsamples the data by binning into fixed time windows and aggregating values within each bin. This is useful for reducing data size or matching sampling rates between different recordings.
- Parameters:
target_samp_freq (int) – Target sampling frequency in Hz.
agg_methods (dict, optional) – Dictionary mapping column names to aggregation methods. Example: {‘pupil’: ‘mean’, ‘time’: ‘first’, ‘x’: ‘mean’, ‘y’: ‘mean’}. If None, uses ‘first’ for all columns.
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
- Unlike other preprocessing functions, this function replaces the original .data with the downsampled data rather than adding a new column to it.
- Trials that cannot be downsampled are reported.
- The sampling frequency is checked and updated again after downsampling.
- Updates summary_data with:
  - run_downsample: Boolean indicating if downsampling was performed
  - downsampled_bin_size: Size of the downsampled time bin in milliseconds
  - downsampled_samp_freq: Downsampled sampling frequency in Hz
- Updates all_steps to track processing history
- Processing parameters are stored in self.params[‘downsample’]
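Binning into fixed time windows can be illustrated with a short sketch. It assumes millisecond timestamps, a bin width of 1000 / target_samp_freq, the bin start as the new timestamp, and the mean as the pupil aggregate; `downsample_trial` is a hypothetical helper, not the method itself.

```python
from collections import defaultdict
import statistics


def downsample_trial(times_ms, pupil, target_samp_freq):
    """Average pupil samples within fixed-width time bins.

    Returns a list of (bin_start_ms, mean_pupil) tuples.
    """
    bin_size = 1000 / target_samp_freq  # bin width in milliseconds
    bins = defaultdict(list)
    for t, p in zip(times_ms, pupil):
        bins[int(t // bin_size)].append(p)
    return [
        (idx * bin_size, statistics.fmean(vals))
        for idx, vals in sorted(bins.items())
    ]


# 1000 Hz recording reduced to 250 Hz (4 ms bins)
times_ms = [0, 1, 2, 3, 4, 5, 6, 7]
pupil = [1000.0, 1002.0, 1004.0, 1006.0, 1010.0, 1012.0, 1014.0, 1016.0]
binned = downsample_trial(times_ms, pupil, target_samp_freq=250)
```

Passing agg_methods would correspond to choosing a different aggregate per column, for example ‘first’ instead of the mean used here.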
- filter_position(vertices, suffix='_xy')[source]#
Filter pupil data based on gaze position within a polygon.
This method removes pupil data points where the gaze position falls outside a specified polygon. This is useful for excluding data where participants were not looking at the intended region of interest.
- Parameters:
vertices (list of tuples) – List of (x,y) coordinates defining the polygon vertices. Must be in screen coordinates and form a closed polygon. Example: [(0,0), (0,1080), (1920,1080), (1920,0), (0,0)]
suffix (str, default='_xy') – Suffix to append to the pupil column name for the filtered data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_xy’.
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
- Updates summary_data with:
  - run_gaze_filter: Boolean indicating if gaze filtering was performed
  - pct_gaze_filter: Percentage of samples outside the polygon
  - avg_gaze_x: Average gaze x-coordinate for the remaining samples
  - avg_gaze_y: Average gaze y-coordinate for the remaining samples
- Creates a new column with suffix appended to the current pupil column name
- Updates all_pupil_cols and all_steps to track processing history
- Samples outside the polygon are replaced with NaN values
- Trials with all missing pupil data are skipped and reported
- Processing parameters are stored in self.params[‘filter_position’]
- Raises:
ValueError – If vertices cannot be converted to float numpy array
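The polygon test itself can be sketched with the classic ray-casting rule. This stand-in is for illustration only; the package may rely on a library routine instead, and both function names below are hypothetical.

```python
import math


def point_in_polygon(x, y, vertices):
    """Ray casting: count how many polygon edges a rightward ray from (x, y) crosses."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mask_outside(pupil, xs, ys, vertices):
    """Replace pupil samples with NaN where gaze falls outside the polygon."""
    return [
        p if point_in_polygon(x, y, vertices) else math.nan
        for p, x, y in zip(pupil, xs, ys)
    ]


screen = [(0, 0), (0, 1080), (1920, 1080), (1920, 0), (0, 0)]
filtered = mask_outside([1000.0, 1005.0], xs=[960, 2000], ys=[540, 540], vertices=screen)
```

The repeated closing vertex is harmless here: the zero-length edge it creates never spans the ray and is skipped.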
- interpolate(suffix='_it', method='linear', missing_threshold=0.6)[source]#
Interpolate missing values in pupil data.
This method fills missing values in the pupil data using either linear or spline interpolation. Trials with too many missing values (above missing_threshold) are skipped to avoid unreliable interpolation.
- Parameters:
suffix (str, default='_it') – Suffix to append to the pupil column name for the interpolated data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_it’.
method ({'linear', 'spline'}, default='linear') – Method to use for interpolation:
- ‘linear’: Linear interpolation between points
- ‘spline’: Cubic spline interpolation
missing_threshold (float, default=0.6) – Maximum proportion of missing values allowed for interpolation. Trials with more missing values than this threshold are skipped.
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
- Updates summary_data with:
  - run_interpolate: Boolean indicating if interpolation was performed
  - pct_interpolate: Percentage of interpolated values in each trial
- Creates a new column with suffix appended to the current pupil column name
- Updates all_pupil_cols and all_steps to track processing history
- Trials with too many missing values are skipped and reported
- Processing parameters are stored in self.params[‘interpolate’]
- Raises:
ValueError – If method is not ‘linear’ or ‘spline’
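The threshold rule and the linear case can be sketched for a single trial as follows. The sketch assumes that gaps at the very start or end of a trial are left as NaN, which may differ from the package's behavior; `interpolate_linear` is a hypothetical name.

```python
import math


def interpolate_linear(pupil, missing_threshold=0.6):
    """Fill interior NaN gaps linearly; return None if too much data is missing."""
    missing = [math.isnan(v) for v in pupil]
    if sum(missing) / len(pupil) > missing_threshold:
        return None  # trial skipped: interpolation would be unreliable
    valid = [i for i, is_missing in enumerate(missing) if not is_missing]
    out = list(pupil)
    for left, right in zip(valid, valid[1:]):
        for i in range(left + 1, right):
            # Straight line between the two valid neighbours
            out[i] = pupil[left] + (pupil[right] - pupil[left]) * (i - left) / (right - left)
    return out


nan = math.nan
filled = interpolate_linear([1000.0, nan, nan, 1030.0, nan])
skipped = interpolate_linear([nan, nan, nan, 1000.0])
```

In the first call exactly 60% of samples are missing, which does not exceed the threshold, so the trial is still interpolated.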
- static load(path)[source]#
Load PupilProcessor object from file using dill deserialization.
This method loads a previously saved PupilProcessor object, restoring all data and processing history.
- Parameters:
path (str) – Path to the file containing the saved PupilProcessor object.
- Returns:
The loaded PupilProcessor object.
- Return type:
PupilProcessor
Notes
The loaded object will be an exact copy of the saved object
All data, parameters, and processing history are preserved
Make sure the file was created using the save() method
- plot_baseline(plot_by=None, show_outliers=True, save=None, interactive=True, plot_params=None, return_fig=False)[source]#
Plot histogram of baseline pupil sizes. This is a wrapper function that calls either plot_baseline_interactive() or plot_baseline_static() depending on the interactive parameter.
- Parameters:
plot_by (str or list, optional) – Column(s) to group data by for separate plots.
show_outliers (bool, default=True) – Whether to show outlier thresholds.
save (str, optional) – Path to save plot.
interactive (bool, default=True) – Whether to create interactive plot.
plot_params (dict, optional) – Additional plotting parameters.
return_fig (bool, default=False) – Whether to return the figure object.
- Returns:
figure (matplotlib.figure.Figure or plotly.graph_objects.Figure) – Plot figure object if return_fig is True.
axes (matplotlib.axes.Axes, optional) – Plot axes object (only for static plots).
See also
plot_baseline_interactive – Create interactive baseline histogram plot
plot_baseline_static – Create static baseline histogram plot
- plot_evoked(data=None, pupil_col=None, condition=None, agg_by=None, error='ci', save=None, plot_params=None, **kwargs)[source]#
Plot evoked pupil response.
Creates plot of average pupil response across trials, optionally split by condition and aggregated by specified groups.
- Parameters:
data (str or pandas.DataFrame, optional) – Data to plot. If string, uses corresponding attribute.
pupil_col (str, optional) – Column name for pupil values.
condition (str or list, optional) – Column(s) to split data by.
agg_by (str or list, optional) – Column(s) to aggregate data by before computing mean trace and confidence bands. For example, to compute subject-level means, use ‘subject_id’.
error ({'ci', 'sem', 'std', None}, default='ci') – Type of error to plot:
- ‘ci’: bootstrap confidence interval
- ‘sem’: standard error of the mean
- ‘std’: standard deviation
- None: no error bars
save (str, optional) – Path to save plot.
plot_params (dict, optional) – Additional plotting parameters. This includes all rcParams accepted by matplotlib, as well as the following:
- ‘title’: title of plot
- ‘x_title’: x-axis label
- ‘y_title’: y-axis label
- ‘vline_color’: color of vertical line
- ‘vline_linestyle’: linestyle of vertical line
- ‘grid’: whether to show grid
- ‘legend_labels’: labels for legend
**kwargs – Additional arguments passed to confidence interval calculation.
- Returns:
arrays_by_condition (dict) – Dictionary of arrays containing trial data for each condition.
(figure, axes) (tuple) – Plot figure and axes objects.
- plot_pupil_surface(data=None, pupil_col=None, x_col=None, y_col=None, plot_type='count', vertices=None, nbins=64, log_counts=False, plot_by=None, show_centroid=True, save=None, plot_params=None)[source]#
Create an interactive surface plot of pupil dilation by gaze coordinates using numpy.histogram2d.
- Parameters:
data (pandas.DataFrame, optional) – DataFrame containing pupil size, x-coordinates, and y-coordinates. Being able to specify data is useful for plotting a subset of the data. See examples below. If None, uses self.data.
pupil_col (str, optional) – Column name for pupil size. Defaults to self.all_pupil_cols[-1].
x_col (str, optional) – Column name for x-coordinates of gaze. Defaults to self.x_col.
y_col (str, optional) – Column name for y-coordinates of gaze. Defaults to self.y_col.
plot_type (str, optional) – ‘count’ for number of measurements or ‘size’ for mean pupil size. Defaults to ‘count’.
nbins (int, optional) – Number of bins for the 2D histogram. Defaults to 64.
log_counts (bool, default=False) – Whether to apply log transformation to counts (only applies when plot_type=’count’). Defaults to False.
plot_by (str, optional) – Column name to group data by for separate subplots. Defaults to None.
show_centroid (bool, default=True) – Whether to show the centroid of the data. Defaults to True.
save (str, optional) – Path to save plot.
plot_params (dict, optional) – Dictionary of plotting parameters to override defaults:
- x_title : str, default=’Gaze X’
- y_title : str, default=’Gaze Y’
- title : str, default=’Pupil Foreshortening Error Surface’
- palette : str, default=’Viridis’
- width : int, default=400
- height : int, default=300
Examples
>>> # Plot a 2d histogram of the number of pupil measurements by condition
>>> p.plot_pupil_surface(plot_by='condition')
>>> # Plot a 2d histogram of pupil measurement counts for a custom subset of the data
>>> p.plot_pupil_surface(data=p.data[p.data['event'] == 'event_name'])
>>> # Plot the mean pupil size rather than the count of measurements as a function of gaze coordinates
>>> p.plot_pupil_surface(plot_type='size')
- plot_spaghetti(time_col=None, pupil_col=None, show_outliers=True, plot_by=None, save=False, plot_params=None, return_fig=True)[source]#
Plot pupil traces for all trials as a spaghetti plot.
- Parameters:
time_col (str, optional) – Column name for x-axis. Defaults to time column specified during initialization.
pupil_col (str, optional) – Column name for y-axis. Defaults to latest pupil column.
show_outliers (bool, default=True) – Whether to highlight outlier traces.
plot_by (str or list, optional) – Column(s) to group data by for separate plots.
save (str, optional) – Path to save plot. Only supports html files. If not provided, the plot is not saved.
plot_params (dict, optional) – Additional plotting parameters.
return_fig (bool, default=True) – Whether to return the figure object.
- Returns:
Plot figure object if return_fig is True.
- Return type:
plotly.graph_objects.Figure
Notes
Creates an interactive spaghetti plot showing pupil traces for all trials. If plot_by is specified, creates separate subplots for each group using dropdown menus. Outlier traces can be highlighted if outlier detection was performed.
- plot_trial(trial, time_col=None, pupil_col=None, hue=None, save=None, interactive=True, plot_params=None)[source]#
Plot data for a single trial.
A wrapper function that calls either _plot_trial_interactive() or _plot_trial_static() depending on the interactive parameter.
- Parameters:
trial (pandas.DataFrame) – DataFrame containing trial identifier.
time_col (str, optional) – Column name for x-axis values. Defaults to time column specified during initialization.
pupil_col (str or list, optional) – Column name(s) for y-axis values. Defaults to all pupil columns.
hue (str or list, optional) – Column(s) to group data by for separate lines.
save (str, optional) – Path to save plot.
interactive (bool, default=True) – Whether to create interactive plot.
plot_params (dict, optional) – Additional plotting parameters.
- Returns:
figure (matplotlib.figure.Figure or plotly.graph_objects.Figure) – Plot figure object.
axes (matplotlib.axes.Axes, optional) – Plot axes object (only for static plots).
- save(path)[source]#
Save PupilProcessor object to file using dill serialization.
This method saves the entire PupilProcessor object, including all data and processing history, to a file for later use.
- Parameters:
path (str) – Path where the object should be saved. Should include the file extension (e.g., ‘.pkl’).
- Raises:
FileExistsError – If a file already exists at the specified path.
- smooth(suffix='_sm', method='hann', window=100, **kwargs)[source]#
Smooth pupil data using various smoothing methods.
This method applies signal smoothing to reduce noise in the pupil data. Three smoothing methods are available:
Rolling mean: Simple moving average
Hann window: Weighted moving average using Hann window
Butterworth filter: Low-pass filter with specified cutoff
- Parameters:
suffix (str, default='_sm') – Suffix to append to the pupil column name for the smoothed data. For example, if pupil column is ‘pupil’, the new column will be ‘pupil_sm’.
method ({'rollingmean', 'hann', 'butter'}, default='hann') – Method to use for smoothing:
- ‘rollingmean’: Simple moving average
- ‘hann’: Hann window smoothing
- ‘butter’: Butterworth low-pass filter
window (int, default=100) – Window size (in number of samples) for rolling mean or Hann window smoothing. Not used for Butterworth filter.
**kwargs (dict) – Additional arguments for specific smoothing methods.
- For rolling mean and Hann window: see the pandas.DataFrame.rolling documentation for additional arguments.
- For Butterworth filter:
- cutoff_freq (float): Cutoff frequency in Hz. Default is 4 Hz.
- order (int): Filter order. Default is 3.
- Returns:
self – Returns self for method chaining
- Return type:
PupilProcessor
Notes
Updates summary_data with smoothing method and parameters
Creates a new column with suffix appended to the current pupil column name
Updates all_pupil_cols and all_steps to track processing history
Missing values (NaN) are preserved
- summary(columns=None, level=None, agg_methods=None)[source]#
Get summary statistics of the data.
Returns summary data for specified columns, optionally grouped by level and aggregated using specified methods.
- Parameters:
columns (list, optional) – Columns to include in summary. Defaults to all columns.
level (str or list, optional) – Column(s) to group by.
agg_methods (dict, optional) – Dictionary mapping column names to aggregation methods. If None, uses mean for numeric columns.
- Returns:
Summary statistics dataframe.
- Return type:
pandas.DataFrame
- upsample(target_samp_freq, fill_pupil=False)[source]#
Upsample pupil data to a higher sampling rate.
This method upsamples the data by inserting empty rows to meet the required sampling rate. Missing values are forward-filled for non-pupil columns. Pupil columns remain as NaN where no data exists unless fill_pupil=True. A new column ‘upsampled’ is added to track the inserted rows. Trials that cannot be upsampled are reported. The sampling frequency is checked and updated again after upsampling.
- Parameters:
target_samp_freq (int) – Target sampling frequency in Hz. Must be higher than current sampling frequency.
fill_pupil (bool, default=False) – Whether to also fill missing values in pupil columns. If False, pupil columns remain as NaN where no data exists. This is simply a forward-fill. If you want to interpolate missing values, you can do so after upsampling.
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
The actual sampling rate may differ slightly from the target sampling rate because the time step between samples is rounded to the nearest integer. For example, a target sampling rate of 1001 Hz gives a 1 ms time step (round(1000/1001) = 1), so the actual sampling rate is 1000 Hz. In the current implementation, this results in an error because the actual sampling rate (1000 Hz) does not match the target sampling rate (1001 Hz).
Upsampled data can be interpolated to fill missing values. You may need to set a lower missing_threshold for interpolation as the upsampling will introduce more missing values.
- Updates summary_data with:
run_upsample: Boolean indicating if upsampling was performed
upsampled_bin_size: Size of the upsampled time bin in milliseconds
upsampled_samp_freq: Upsampled sampling frequency in Hz
Updates all_steps to track processing history
Processing parameters are stored in self.params[‘upsample’]
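The rounding effect described in the notes above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming timestamps are in milliseconds and the time step is computed as round(1000 / target_samp_freq), as the note states. The helper name effective_sampling_rate is hypothetical and not part of pupeyes.

```python
def effective_sampling_rate(target_samp_freq):
    """Return (time step in ms, resulting sampling rate in Hz).

    Hypothetical helper: mirrors the rounding described in the notes,
    not the actual pupeyes implementation.
    """
    step_ms = round(1000 / target_samp_freq)  # integer time step between samples
    if step_ms < 1:
        raise ValueError("target rate is too high for millisecond timestamps")
    return step_ms, 1000 / step_ms


for target in (250, 500, 1000, 1001):
    step, actual = effective_sampling_rate(target)
    # A mismatch between target and actual is the case the notes warn about.
    print(f"target={target} Hz -> step={step} ms, actual={actual:g} Hz")
```

Checking a target rate this way before calling upsample() shows whether it maps onto a whole-millisecond time step.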
- validate_trials(trials_to_exclude, invert_mask=False)[source]#
Mark trials as valid/invalid based on exclusion criteria.
This method adds a ‘valid’ column to both the data and summary_data, marking trials as valid or invalid based on the provided exclusion criteria.
- Parameters:
trials_to_exclude (pandas.DataFrame) – DataFrame containing trial identifiers to exclude. Must have columns matching the trial_identifier of the PupilProcessor.
invert_mask (bool, default=False) – If True, excludes all trials except those specified in trials_to_exclude. If False, excludes only the trials specified in trials_to_exclude.
- Returns:
self – Returns self for method chaining.
- Return type:
PupilProcessor
Notes
Adds ‘valid’ column to both summary_data and data
Valid column is boolean: True for valid trials, False for invalid trials
Trials are matched based on trial_identifier columns
Duplicate entries in trials_to_exclude are automatically removed
- pupeyes.pupil.compute_speed(x, y)[source]#
Compute the speed of change between two arrays.
This function calculates the rate of change (speed) between corresponding points in two arrays. The speed is computed as the absolute maximum of the forward and backward differences at each point, normalized by the time difference.
- Parameters:
x (array-like) – First array of values, typically pupil measurements. Must be numeric and same length as y.
y (array-like) – Second array of values, typically time points. Must be numeric and same length as x.
- Returns:
Array of speed values with same length as input arrays. Contains NaN values at endpoints and where division by zero or invalid values occur.
- Return type:
numpy.ndarray
Notes
Uses np.diff() to compute differences between consecutive points
Takes absolute maximum of forward/backward differences at each point
Suppresses RuntimeWarnings for NaN/inf values
Sets NaN/inf values to NaN in output
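The description above can be illustrated with a plain-Python sketch. It assumes that a point's speed is the larger of the absolute backward and forward rates of change, and that endpoints and any non-finite result become NaN, as the return description states. The function name speed_sketch is hypothetical; the real function works on arrays via np.diff().

```python
import math


def speed_sketch(values, times):
    """Max absolute backward/forward rate of change at each interior point.

    Hypothetical illustration only; endpoints stay NaN because they have
    only one neighbour.
    """
    n = len(values)
    out = [math.nan] * n
    for i in range(1, n - 1):
        back_dt = times[i] - times[i - 1]
        fwd_dt = times[i + 1] - times[i]
        if back_dt == 0 or fwd_dt == 0:
            continue  # division by zero: leave NaN
        back = abs((values[i] - values[i - 1]) / back_dt)
        fwd = abs((values[i + 1] - values[i]) / fwd_dt)
        if math.isfinite(back) and math.isfinite(fwd):
            out[i] = max(back, fwd)
    return out


pupil = [0, 1, 3, 3]
time_ms = [0, 1, 2, 3]
print(speed_sketch(pupil, time_ms))
```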
- pupeyes.pupil.convert_pupil(pupil_size, artificial_d, artificial_size, recording_unit='diameter')[source]#
Convert pupil measurements between different recording units.
This function converts pupil measurements from raw units (arbitrary units from the eye tracker) to millimeters using calibration values from an artificial pupil. It handles both diameter and area measurements.
- Parameters:
pupil_size (float or array-like) – Pupil size in recording units (diameter or area). Can be a single value or an array of measurements.
artificial_d (float) – Diameter of artificial pupil used for calibration (in mm). This is the known physical size of the calibration pupil.
artificial_size (float) – Size of artificial pupil in recording units (diameter or area). This is the size measured by the eye tracker for the calibration pupil.
recording_unit ({'diameter', 'area'}, default='diameter') – Unit of the recorded measurements:
- ‘diameter’: Linear scaling is applied
- ‘area’: Square root is taken before scaling
- Returns:
Converted pupil measurements in millimeters. Will have same shape as input pupil_size.
- Return type:
numpy.ndarray
Notes
The unit of artificial_size must match the recording_unit
The unit of artificial_d is always in millimeters
For diameter recordings: output = artificial_d * pupil_size / artificial_size
For area recordings: output = artificial_d * sqrt(pupil_size / artificial_size)
Useful for standardizing pupil measurements across different setups
- Raises:
ValueError – If recording_unit is not ‘diameter’ or ‘area’
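The two formulas in the notes can be checked by hand with a scalar-only sketch. The function name convert_pupil_sketch and the calibration numbers are made up for illustration; the real function also accepts arrays.

```python
import math


def convert_pupil_sketch(pupil_size, artificial_d, artificial_size,
                         recording_unit="diameter"):
    """Scalar version of the two conversion formulas listed in the notes."""
    if recording_unit == "diameter":
        # Linear scaling
        return artificial_d * pupil_size / artificial_size
    if recording_unit == "area":
        # Area scales with diameter squared, so take the square root first
        return artificial_d * math.sqrt(pupil_size / artificial_size)
    raise ValueError("recording_unit must be 'diameter' or 'area'")


# Hypothetical calibration: a 5 mm artificial pupil measured as 1000 units
print(convert_pupil_sketch(800, artificial_d=5, artificial_size=1000))
# Hypothetical calibration: a 4 mm artificial pupil measured as area 1600
print(convert_pupil_sketch(400, artificial_d=4, artificial_size=1600,
                           recording_unit="area"))
```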
- pupeyes.pupil.prf(t, t_max=500, n=10.1)[source]#
Pupil response function (PRF) according to Hoeks and Levelt (1993).
- Parameters:
t (array-like) – Time points in milliseconds.
t_max (float, optional) – Location of the peak (default is 500 ms).
n (float, optional) – Scale parameter (default is 10.1).
- Returns:
Normalized PRF values at each time point.
- Return type:
numpy.ndarray
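For orientation, the Hoeks and Levelt pupil response function has the form h(t) = t^n * exp(-n * t / t_max), which peaks at t = t_max. The sketch below evaluates it for a single time point. Dividing by the peak value so that the maximum is 1, and returning 0 for t <= 0, are assumptions; the source only says the output is normalized. The name prf_sketch is hypothetical.

```python
import math


def prf_sketch(t, t_max=500, n=10.1):
    """Hoeks-Levelt response at time t (ms), scaled so the peak equals 1.

    h(t) / h(t_max) simplifies to (t / t_max)**n * exp(n * (1 - t / t_max)),
    which avoids computing very large powers of t.
    """
    if t <= 0:
        return 0.0  # assumption: no response before stimulus onset
    r = t / t_max
    return r ** n * math.exp(n * (1 - r))


for t in (0, 250, 500, 1000, 2000):
    print(t, round(prf_sketch(t), 4))
```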
Areas of Interest (AOI)#
Area of Interest (AOI) Analysis Module
This module provides basic functions for analyzing eye tracking data in relation to Areas of Interest (AOIs).
- pupeyes.aoi.compute_aoi_statistics(x, y, aois, durations=None)[source]#
Compute fixation statistics for each Area of Interest (AOI).
- Parameters:
x (array-like) – Array of x-coordinates for fixation points
y (array-like) – Array of y-coordinates for fixation points
aois (dict) – Dictionary mapping AOI names to lists of vertex coordinates. Each vertex list should define a polygon as [(x1,y1), (x2,y2), …].
durations (array-like, optional) – Array of fixation durations corresponding to each (x,y) point.
- Returns:
Dictionary containing statistics for each AOI and points outside AOIs:
- outside (dict)
- count (int): Number of fixations outside all AOIs
- total_duration (float): Total duration of outside fixations
- aoi_name (dict)
- count (int): Number of fixations in this AOI
- total_duration (float): Total duration in this AOI
If durations is None, total_duration values will be 0. Returns empty dict if aois is empty.
- Return type:
dict
Notes
If a fixation point lies within multiple AOIs, it is counted only in the first AOI that contains it based on the iteration order of the aois dictionary.
Examples
>>> aois = {
...     'face': [(0,0), (100,0), (100,100), (0,100), (0,0)],
...     'text': [(150,0), (250,0), (250,50), (150,50), (150,0)]
... }
>>> x = np.array([50, 200, 300])  # points in face, text, outside
>>> y = np.array([50, 25, 300])
>>> durations = np.array([100, 150, 200])  # durations in milliseconds
>>> stats = compute_aoi_statistics(x, y, aois, durations)
>>> stats
{
    'outside': {'count': 1, 'total_duration': 200.0},
    'face': {'count': 1, 'total_duration': 100.0},
    'text': {'count': 1, 'total_duration': 150.0}
}
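The aggregation step can be sketched separately from the point-in-polygon test. The sketch below assumes each fixation has already been labelled with an AOI name, or None for outside, in the way get_fixation_aoi is documented to return. The function name aoi_statistics_sketch is hypothetical, and listing every AOI even when its count is zero is an assumption.

```python
def aoi_statistics_sketch(labels, aoi_names, durations=None):
    """Count fixations and sum durations per AOI label.

    labels: one AOI name per fixation, or None for fixations outside all AOIs.
    """
    if not aoi_names:
        return {}  # documented behaviour: empty dict when there are no AOIs
    stats = {name: {"count": 0, "total_duration": 0.0}
             for name in ["outside", *aoi_names]}
    if durations is None:
        durations = [0.0] * len(labels)  # totals stay at 0 without durations
    for label, duration in zip(labels, durations):
        key = "outside" if label is None else label
        stats[key]["count"] += 1
        stats[key]["total_duration"] += float(duration)
    return stats


labels = ["face", "text", None]
print(aoi_statistics_sketch(labels, ["face", "text"], [100, 150, 200]))
```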
- pupeyes.aoi.get_fixation_aoi(x, y, aois)[source]#
For each fixation point, get the Area of Interest (AOI) that contains it. If the point is outside all AOIs, return None.
- Parameters:
x (float or numpy.ndarray) – X-coordinate(s) of fixation point(s)
y (float or numpy.ndarray) – Y-coordinate(s) of fixation point(s)
aois (dict or None) – Dictionary mapping AOI names to lists of vertex coordinates. Each vertex list should define a polygon as [(x1,y1), (x2,y2), …]. The last vertex should be the same as the first vertex to close the polygon.
- Returns:
If input coordinates are scalars:
- str
Name of the AOI containing the point, or None if not in any AOI
If input coordinates are arrays:
- list
List of AOI names for each point, with None for points outside all AOIs
- Return type:
str or list
Notes
If a point lies within multiple AOIs, it is assigned to the first AOI that contains it based on the iteration order of the aois dictionary.
Examples
>>> # Single point
>>> aois = {
...     'face': [(0,0), (100,0), (100,100), (0,100), (0,0)],
...     'text': [(150,0), (250,0), (250,50), (150,50), (150,0)]
... }
>>> get_fixation_aoi(50, 50, aois)
'face'
>>> get_fixation_aoi(300, 300, aois)
None
>>> # Multiple points
>>> x = np.array([50, 200, 300])
>>> y = np.array([50, 25, 300])
>>> get_fixation_aoi(x, y, aois)
['face', 'text', None]
- pupeyes.aoi.is_inside(points, polygon)[source]#
Check if multiple points lie inside a polygon.
- Parameters:
points (numpy.ndarray) – Nx2 array of (x,y) coordinates to check
polygon (numpy.ndarray) – Array of (x,y) coordinates defining the polygon vertices
- Returns:
Boolean array indicating whether each point is inside the polygon
- Return type:
numpy.ndarray
Examples
>>> # Define a square polygon
>>> square = np.array([(0,0), (100,0), (100,100), (0,100), (0,0)])
>>>
>>> # Check multiple points
>>> points = np.array([
...     [50, 50],    # inside
...     [150, 150],  # outside
...     [0, 50],     # on edge
...     [0, 0]       # on vertex
... ])
>>> is_inside(points, square)
array([ True, False,  True,  True])
- pupeyes.aoi.is_inside_singlepoint(polygon, point)[source]#
Check if a point lies inside a polygon using ray-casting algorithm.
- Parameters:
polygon (array-like) – List of (x,y) coordinates defining the polygon vertices. The last vertex should be the same as the first to close the polygon.
point (tuple) – (x,y) coordinates of the point to check
- Returns:
Result code indicating point position:
- 0
Point is outside the polygon
- 1
Point is inside the polygon
- 2
Point lies exactly on the polygon’s edge or vertex
- Return type:
int
Notes
Uses a ray-casting algorithm that counts the number of times a horizontal ray from the point intersects with polygon edges.
Examples
>>> # Define a square
>>> square = [(0,0), (100,0), (100,100), (0,100), (0,0)]
>>>
>>> # Check points
>>> is_inside_singlepoint(square, (50, 50))  # inside
1
>>> is_inside_singlepoint(square, (150, 150))  # outside
0
>>> is_inside_singlepoint(square, (0, 50))  # on edge
2
>>> is_inside_singlepoint(square, (0, 0))  # on vertex
2
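For readers unfamiliar with ray casting, here is a minimal pure-Python sketch that uses the same 0/1/2 result codes. It is not the pupeyes implementation. The function name ray_cast_sketch is hypothetical, and the edge test uses exact comparisons, which is only reliable for integer or otherwise exactly representable coordinates.

```python
def ray_cast_sketch(polygon, point):
    """Return 0 (outside), 1 (inside) or 2 (on an edge or vertex).

    polygon: closed list of (x, y) vertices, last vertex equal to the first.
    """
    px, py = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:]):
        # On-edge test: point is collinear with the edge and within its bounding box
        cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        if (cross == 0 and min(x1, x2) <= px <= max(x1, x2)
                and min(y1, y2) <= py <= max(y1, y2)):
            return 2
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                inside = not inside  # each crossing flips the parity
    return 1 if inside else 0


square = [(0, 0), (100, 0), (100, 100), (0, 100), (0, 0)]
print([ray_cast_sketch(square, p) for p in [(50, 50), (150, 150), (0, 50), (0, 0)]])
```

An odd number of crossings means the point is inside, an even number means it is outside.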
Interactive Applications#
Pupil Viewer#
Interactive Pupil Data Viewer
This module provides an interactive web application for visualizing pupil preprocessing steps. It uses Dash and Plotly to create an interface where users can:
- Select individual trials
- View all preprocessing steps applied to pupil data
- Compare raw and processed pupil traces
- class pupeyes.apps.pupil_viewer.PupilViewer(pupil_processor, hue=None, columns=None)[source]#
Bases: object
An interactive web-based visualization tool for pupil preprocessing data.
This class provides a Dash-based interface for visualizing pupil data processing steps, allowing users to explore how different preprocessing operations affect the pupil signal. The interface supports trial selection, column selection for comparison, and interactive plotting with subplots for each processing step.
- Parameters:
pupil_processor (PupilProcessor) – Instance of PupilProcessor containing the pupil data and processing history. This object should contain both raw and processed pupil data.
hue (str, optional) – Column name to group data by for separate lines in the plot. Useful for visualizing different components of a single trial.
columns (list of str, optional) – List of column names to plot. If not provided, all pupil columns from the PupilProcessor will be shown.
- pupil_processor#
The PupilProcessor instance containing the data
- Type:
PupilProcessor
- hue#
Column name used for plotting different components of a single trial
- Type:
str or None
- columns#
List of column names being plotted
- Type:
list
- app#
The Dash application instance
- Type:
dash.Dash
- run(port=8051, **kwargs)[source]#
Run the Dash server for the pupil data viewer.
- Parameters:
port (int, default=8051) – Port number to run the server on. Make sure the port is available and not blocked by firewall.
**kwargs (dict) – Additional keyword arguments passed to dash.run_server(). See Dash documentation for available options.
Notes
The application will run until interrupted (Ctrl+C)
Access the interface at http://localhost:<port>
Each preprocessing step is shown in a separate subplot
Interactive controls allow exploration of different trials and columns
Fixation Viewer#
Interactive Eye Movement Visualization Module using Dash
This module provides an interactive web-based visualization tool for eye movement data, including scanpath replay, heatmaps, areas of interest, and fixation sequence plots.
- class pupeyes.apps.fixation_viewer.FixationViewer(data=None, screen_dims=(1920, 1080), col_mapping=None, stimuli_path=None, animation_speed=500, dot_size=10)[source]#
Bases: object
An interactive web-based visualization tool for eye movement data.
This class provides a Dash-based interface for visualizing eye movement data with multiple visualization modes (scanpath, heatmap, AOI), interactive controls, and data export capabilities.
- Parameters:
data (pandas.DataFrame, optional) – Eye movement data with columns for timestamps, coordinates, etc.
screen_dims (tuple, default=(1920, 1080)) – Screen dimensions in pixels (width, height)
col_mapping (dict, optional) – Column name mapping for required fields:
- trial_id (str or list): Trial identifier column(s). Can be a single column name or a list of column names that together uniquely identify a trial (e.g., [‘subject’, ‘block’, ‘trial’])
- timestamp (str): Timestamp column (optional)
- x (str): X coordinate
- y (str): Y coordinate
- duration (str): Fixation duration (optional)
- stimuli (str): Stimuli path/identifier
stimuli_path (str, optional) – Base path for stimuli images
animation_speed (int, default=500) – Animation playback speed in milliseconds
dot_size (int, default=10) – Fixed size for fixation dots
- data#
The eye movement data being visualized
- Type:
pandas.DataFrame
- screen_dims#
The dimensions of the visualization canvas
- Type:
tuple
- col_mapping#
Mapping of required columns to data columns
- Type:
dict
- aois#
Dictionary of Areas of Interest definitions
- Type:
dict
- app#
The Dash application instance
- Type:
dash.Dash
- run(debug=False, port=8050, **kwargs)[source]#
Start the Dash server and run the fixation viewer application.
This method initializes and starts the web server for the fixation viewer application. The application will be accessible through a web browser at the specified port.
- Parameters:
debug (bool, default=False) – Whether to run the server in debug mode
port (int, default=8050) – Port to run the server on
**kwargs (dict) – Additional keyword arguments passed to dash.run_server(). See Dash documentation for available options.
Notes
The application will run until interrupted (Ctrl+C)
Access the interface at http://localhost:<port>
Debug mode provides additional error information
Default port (8050) can be changed if already in use
- set_aois(aois)[source]#
Set Areas of Interest (AOIs) for visualization.
- Parameters:
aois (dict) –
Can be either:
A nested dictionary mapping stimulus IDs to AOI definitions.
A simple dictionary of AOIs that applies to all stimuli.
where each AOI is defined by a list of (x,y) vertex coordinates. The last point should be the same as the first point to close the polygon.
- Return type:
None
Examples
>>> # Define AOIs for each stimulus
>>> aois = {
...     'stimulus1': {
...         'aoi1': [(x1,y1), (x2,y2), ..., (x1, y1)],
...         'aoi2': [(x1,y1), (x2,y2), ..., (x1, y1)]
...     },
...     'stimulus2': {
...         'aoi1': [(x1,y1), (x2,y2), ..., (x1, y1)],
...         'aoi2': [(x1,y1), (x2,y2), ..., (x1, y1)]
...     }
... }
>>> # Define AOIs for all stimuli
>>> aois = {
...     'aoi1': [(x1,y1), (x2,y2), ..., (x1, y1)],
...     'aoi2': [(x1,y1), (x2,y2), ..., (x1, y1)]
... }
AOI Drawer#
Interactive AOI Drawing Tool using Dash
This module provides an interactive web-based tool for drawing Areas of Interest (AOIs) that can be used with the FixationViewer.
- class pupeyes.apps.aoi_drawer.AOIDrawer(screen_dims=(1920, 1080), stimuli=None, stimuli_name=None)[source]#
Bases: object
An interactive web-based tool for drawing Areas of Interest (AOIs).
This class provides a Dash-based web interface for drawing and managing Areas of Interest (AOIs) on stimulus images. It supports multiple drawing tools (freeform, rectangle, circle), editing capabilities, and export functionality.
- Parameters:
screen_dims (tuple, default=(1920, 1080)) – Screen dimensions in pixels (width, height). Used to set the drawing canvas size and scale background images.
stimuli (str or numpy.ndarray, optional) – Path to the stimulus image or a numpy array containing the image. Supports various image formats and both RGB and grayscale images.
stimuli_name (str, optional) – Name of the stimulus image, used for display and as default save filename. If not provided, defaults to “AOIs”.
- aois#
Dictionary storing AOI data, where keys are AOI names and values are lists of (x, y) coordinate tuples defining the AOI vertices.
- Type:
dict
- app#
The Dash application instance.
- Type:
dash.Dash
- screen_dims#
The dimensions of the drawing canvas.
- Type:
tuple
- run(debug=False, port=8051, **kwargs)[source]#
Start the Dash server and run the AOI drawing application.
This method initializes and starts the web server for the AOI drawing interface. The application will be accessible through a web browser at the specified port.
- Parameters:
debug (bool, default=False) – Whether to run the server in debug mode
port (int, default=8051) – Port number to run the server on. Make sure the port is available and not blocked by firewall.
**kwargs (dict) – Additional keyword arguments passed to dash.run_server(). See Dash documentation for available options.
Notes
The application will run until interrupted (Ctrl+C)
Access the interface at http://localhost:<port>
Debug mode provides additional error information
Default port (8051) can be changed if already in use
Utilities#
General Utilities#
Utility Functions Module
This module provides utility functions used across the pupeyes package, including:
- Coordinate system conversions between Eyelink and PsychoPy
- Point-in-polygon testing with parallel processing
- Signal filtering and data masking
- Geometric calculations for circular stimulus arrangements, among others
- pupeyes.utils.angular_distance(line1, line2)[source]#
Calculate the angle between two lines in degrees.
- Parameters:
line1 (tuple) – Tuple of two points ((x1,y1), (x2,y2)) defining the first line
line2 (tuple) – Tuple of two points ((x1,y1), (x2,y2)) defining the second line
- Returns:
Angle between the lines in degrees, always in range [0, 180]
- Return type:
float
Examples
>>> # Perpendicular lines
>>> line1 = ((0,0), (1,0))  # horizontal line
>>> line2 = ((0,0), (0,1))  # vertical line
>>> angular_distance(line1, line2)
90.0
>>> # 45-degree angle
>>> line1 = ((0,0), (1,0))
>>> line2 = ((0,0), (1,1))
>>> angular_distance(line1, line2)
45.0
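One way to obtain an angle in the documented [0, 180] range is to compare the direction of each line with atan2. The sketch below is an illustration under that assumption, treating each line as a directed segment from its first point to its second. It is not necessarily how angular_distance is implemented, and the name angle_between_lines is hypothetical.

```python
import math


def angle_between_lines(line1, line2):
    """Angle in degrees between two directed line segments, in [0, 180]."""
    (ax1, ay1), (ax2, ay2) = line1
    (bx1, by1), (bx2, by2) = line2
    # Direction of each segment measured from the positive x-axis
    a1 = math.atan2(ay2 - ay1, ax2 - ax1)
    a2 = math.atan2(by2 - by1, bx2 - bx1)
    diff = abs(math.degrees(a1 - a2)) % 360
    # Fold reflex angles back into [0, 180]
    return 360 - diff if diff > 180 else diff


print(angle_between_lines(((0, 0), (1, 0)), ((0, 0), (0, 1))))
print(angle_between_lines(((0, 0), (1, 0)), ((0, 0), (1, 1))))
```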
- pupeyes.utils.convert_coordinates(coord, screen_dims=None, direction='to_el', psychopy_units='pix', round_to=2)[source]#
Convert coordinates between Eyelink and PsychoPy coordinate systems. For Eyelink, the origin is at the top-left corner of the screen. For PsychoPy, the origin is at the center of the screen. For more information on the psychopy coordinate system, see: https://psychopy.org/general/units.html
- Parameters:
coord (array-like or str) – The coordinates to convert. Can be:
- array-like: [x, y]
- string: ‘x,y’ or ‘[x,y]’ or ‘(x,y)’
screen_dims (array-like, optional) – Screen dimensions [width, height] in pixels. Default is [1600, 1200].
direction ({'to_el', 'to_psychopy'}, optional) – Conversion direction. Default is ‘to_el’.
- ‘to_el’: convert from PsychoPy to Eyelink coordinates
- ‘to_psychopy’: convert from Eyelink to PsychoPy coordinates
psychopy_units ({'pix', 'norm', 'height'}, optional) – PsychoPy units to convert from/to. Default is ‘pix’.
- ‘pix’: pixels from center
- ‘norm’: normalized units [-1, 1]
- ‘height’: units relative to screen height
round_to (int or None, optional) – Number of decimal places to round coordinates to. Default is 2. If None, no rounding is performed.
- Returns:
Converted [x, y] coordinates
- Return type:
numpy.ndarray
Notes
Coordinate system details:
- Eyelink: origin at top-left, positive x right, positive y down
- PsychoPy: origin at center, positive x right, positive y up
Examples
>>> # Convert screen center from PsychoPy to Eyelink coordinates
>>> convert_coordinates([0, 0], screen_dims=[1600, 1200])
array([800., 600.])  # half width, half height in Eyelink coordinates
>>> # Convert back from Eyelink to PsychoPy coordinates
>>> convert_coordinates([800, 600], direction='to_psychopy')
array([0., 0.])  # back to center in PsychoPy coordinates
>>> # Convert normalized coordinates (range -1 to 1)
>>> convert_coordinates([0.5, 0.5], psychopy_units='norm')
array([1200.,  300.])  # scaled by screen dimensions
>>> # Convert height units (relative to screen height)
>>> convert_coordinates([0.5, 0.5], screen_dims=[1600, 1200],
...                     psychopy_units='height')
array([1400.,    0.])  # 50% of screen height = 600 pixels
>>> # Convert from string input
>>> convert_coordinates("100,100")
array([900., 500.])  # PsychoPy (100,100) to Eyelink coordinates
- Raises:
ValueError – Raised in any of the following cases:
- direction is not ‘to_el’ or ‘to_psychopy’
- psychopy_units is not ‘pix’, ‘norm’, or ‘height’
- string coordinates cannot be parsed
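The arithmetic behind the documented examples can be written out as a small sketch. It assumes the standard PsychoPy unit definitions: ‘norm’ spans half the screen per unit on each axis, and ‘height’ scales both axes by the screen height. It omits string parsing and rounding, and the name convert_sketch is hypothetical.

```python
def convert_sketch(coord, screen_dims=(1600, 1200), direction="to_el",
                   psychopy_units="pix"):
    """Convert one (x, y) pair between PsychoPy and Eyelink conventions."""
    width, height = screen_dims
    # Pixels spanned by one PsychoPy unit along each axis
    if psychopy_units == "pix":
        sx = sy = 1.0
    elif psychopy_units == "norm":
        sx, sy = width / 2, height / 2
    elif psychopy_units == "height":
        sx = sy = float(height)
    else:
        raise ValueError("psychopy_units must be 'pix', 'norm' or 'height'")

    x, y = coord
    if direction == "to_el":
        # Shift the origin to the top-left corner and flip the y-axis
        return (width / 2 + x * sx, height / 2 - y * sy)
    if direction == "to_psychopy":
        # Shift the origin to the screen center and flip the y-axis back
        return ((x - width / 2) / sx, (height / 2 - y) / sy)
    raise ValueError("direction must be 'to_el' or 'to_psychopy'")


print(convert_sketch((0, 0)))
print(convert_sketch((0.5, 0.5), psychopy_units="norm"))
print(convert_sketch((0.5, 0.5), psychopy_units="height"))
```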
- pupeyes.utils.gaussian_2d(img, fc)[source]#
Apply a 2D Gaussian filter to an image. Python adaptation of cvzoya/saliency
- Parameters:
img (numpy.ndarray) – 2D input image array
fc (float) – Cut-off frequency (-6dB)
- Returns:
Filtered image with same shape as input
- Return type:
numpy.ndarray
Notes
Python adaptation of the Gaussian filtering method from the saliency metrics toolbox [1]. The filter is applied in the frequency domain using FFT.
References
[1] cvzoya/saliency (saliency metrics toolbox)
Examples
>>> # Create sample image with noise
>>> img = np.random.randn(100, 100)
>>> # Apply Gaussian filter
>>> filtered = gaussian_2d(img, fc=10)
- pupeyes.utils.get_isoeccentric_positions(n_items, radius, offset_deg=0, coordinate_system='psychopy', screen_dims=None, round_to=2)[source]#
Get coordinates for items arranged in a circle around screen center.
- Parameters:
n_items (int) – Number of items to position in circle
radius (float) – Distance from screen center to each item
offset_deg (float, optional) – Rotation offset in degrees from rightmost position (counterclockwise). Default is 0.
coordinate_system ({'psychopy', 'eyelink'}, optional) – Output coordinate system. Default is ‘psychopy’.
- ‘psychopy’: origin at center, positive y up
- ‘eyelink’: origin at top-left, positive y down
screen_dims (list, optional) – Screen dimensions [width, height] in pixels. Only used if coordinate_system is ‘eyelink’. Default is [1600, 1200].
round_to (int or None, optional) – Number of decimal places to round coordinates to. Default is 2. If None, no rounding is performed.
- Returns:
List of (x,y) coordinate tuples for each item position, arranged counterclockwise starting from the rightmost position.
- Return type:
list
Notes
Items are arranged counterclockwise at equal angular intervals
First item is placed at the rightmost position (0 degrees) plus any offset
Angular separation between items is 360°/n_items
Examples
>>> # Get 4 positions in PsychoPy coordinates (origin at center)
>>> get_isoeccentric_positions(4, 100, round_to=0)
[(100, 0), (0, 100), (-100, 0), (0, -100)]
>>> # Get 4 positions with 45° offset
>>> get_isoeccentric_positions(4, 100, offset_deg=45, round_to=0)
[(71, 71), (-71, 71), (-71, -71), (71, -71)]
>>> # Get positions in Eyelink coordinates (origin at top-left)
>>> get_isoeccentric_positions(4, 100, coordinate_system='eyelink', round_to=0)
[(900, 600), (800, 500), (700, 600), (800, 700)]
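The geometry described in the notes reduces to evenly spaced angles on a circle. The sketch below is a minimal illustration of that idea, assuming the default 1600 x 1200 screen for the Eyelink case. The name isoeccentric_sketch is hypothetical, and unlike the documented output it returns floats.

```python
import math


def isoeccentric_sketch(n_items, radius, offset_deg=0,
                        coordinate_system="psychopy",
                        screen_dims=(1600, 1200), round_to=2):
    """Positions on a circle, counterclockwise from the rightmost point."""
    positions = []
    for i in range(n_items):
        # Equal angular steps of 360 / n_items degrees, plus the offset
        angle = math.radians(offset_deg + i * 360 / n_items)
        x = radius * math.cos(angle)
        y = radius * math.sin(angle)
        if coordinate_system == "eyelink":
            # Origin at top-left, y-axis pointing down
            x = screen_dims[0] / 2 + x
            y = screen_dims[1] / 2 - y
        if round_to is not None:
            x, y = round(x, round_to), round(y, round_to)
        positions.append((x, y))
    return positions


print(isoeccentric_sketch(4, 100, round_to=0))
print(isoeccentric_sketch(4, 100, coordinate_system="eyelink", round_to=0))
```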
- pupeyes.utils.lowpass_filter(data, sampling_freq, cutoff_freq=4, order=3)[source]#
Apply a Butterworth lowpass filter to the input data.
Uses scipy.signal to create and apply a Butterworth filter that removes high frequency components above the cutoff frequency while preserving lower frequencies.
- Parameters:
data (array-like) – Input signal to be filtered
sampling_freq (float) – Sampling frequency of the input signal in Hz
cutoff_freq (float, optional (default=4)) – Cutoff frequency of the filter in Hz. Frequencies above this will be attenuated.
order (int, optional (default=3)) – Order of the Butterworth filter. Higher orders give sharper frequency cutoffs but may introduce more ringing artifacts.
- Returns:
Filtered version of the input signal with same shape as input
- Return type:
numpy.ndarray
Notes
Uses scipy.signal.butter() to design the filter coefficients
Applies zero-phase filtering using scipy.signal.filtfilt()
The filter is applied forward and backward to avoid phase shifts
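The notes name the two SciPy calls involved, so the filter can be sketched as follows. The function name `lowpass_filter_sketch` and the normalization of the cutoff by the Nyquist frequency are assumptions; the package may pass the sampling frequency to `butter()` differently.

```python
import numpy as np
from scipy import signal


def lowpass_filter_sketch(data, sampling_freq, cutoff_freq=4, order=3):
    """Zero-phase Butterworth lowpass filter (illustrative sketch)."""
    nyquist = 0.5 * sampling_freq
    # Design the filter; the cutoff is expressed as a fraction of Nyquist
    b, a = signal.butter(order, cutoff_freq / nyquist, btype="low")
    # Forward-backward filtering cancels the phase shift
    return signal.filtfilt(b, a, np.asarray(data, dtype=float))


# A 1 Hz signal contaminated with 50 Hz noise, sampled at 1000 Hz
fs = 1000
t = np.arange(0, 4, 1 / fs)
slow = np.sin(2 * np.pi * 1 * t)
noisy = slow + 0.5 * np.sin(2 * np.pi * 50 * t)
smoothed = lowpass_filter_sketch(noisy, fs)
```

Away from the edges of the recording, `smoothed` should follow the 1 Hz component closely while the 50 Hz component is strongly attenuated.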
- pupeyes.utils.make_mask(data, trials_to_mask, invert=False)[source]#
Create a boolean mask for filtering data based on specified trials.
- Parameters:
data (pandas.DataFrame) – The main dataset to create a mask for
trials_to_mask (pandas.DataFrame or dict) – Trials to use for creating the mask. Can be a DataFrame or a dictionary that can be converted to a DataFrame. Should have matching column names with data
invert (bool, optional (default=False)) – If True, inverts the mask (changes True to False and vice versa)
- Returns:
Boolean mask series with same length as input data. True values indicate rows to keep, False values indicate rows to filter out
- Return type:
pandas.Series
Notes
If trials_to_mask is a dictionary, it will attempt to convert it to a DataFrame
Warns if resulting mask is all True or all False
Uses pandas merge with indicator to create the mask
Examples
>>> # Create sample dataset
>>> data = pd.DataFrame({
...     'trial': [1, 2, 3, 4, 5],
...     'condition': ['A', 'B', 'A', 'B', 'C'],
...     'rt': [0.5, 0.6, 0.4, 0.7, 0.5]
... })
>>>
>>> # Mask trials with condition 'A' using dictionary
>>> to_mask = {'condition': 'A'}
>>> mask = make_mask(data, to_mask)
>>> data[mask]  # Shows only trials with conditions B and C
   trial condition   rt
1      2         B  0.6
3      4         B  0.7
4      5         C  0.5
>>>
>>> # Mask multiple trials using DataFrame
>>> to_mask_df = pd.DataFrame({
...     'trial': [1, 3],
...     'condition': ['A', 'A']
... })
>>> mask = make_mask(data, to_mask_df)
>>> data[mask]  # Same result as above
   trial condition   rt
1      2         B  0.6
3      4         B  0.7
4      5         C  0.5
>>>
>>> # Keep only the masked trials using invert=True
>>> mask = make_mask(data, to_mask_df, invert=True)
>>> data[mask]  # Shows only trials with condition A
   trial condition   rt
0      1         A  0.5
2      3         A  0.4
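The merge-with-indicator approach mentioned in the notes can be sketched as below. The name `make_mask_sketch` is hypothetical, and details such as wrapping scalar dictionary values in lists and dropping duplicate rows are assumptions, not confirmed behaviour of the package.

```python
import pandas as pd


def make_mask_sketch(data, trials_to_mask, invert=False):
    """Boolean mask that is False for rows matching trials_to_mask."""
    if isinstance(trials_to_mask, dict):
        # Wrap scalar values in lists so pandas can build the frame
        trials_to_mask = pd.DataFrame(
            {k: (v if isinstance(v, (list, tuple)) else [v])
             for k, v in trials_to_mask.items()}
        )
    keys = list(trials_to_mask.columns)
    # A left merge keeps the row order of `data`; the indicator column
    # records whether each row found a match in trials_to_mask
    merged = data.merge(trials_to_mask[keys].drop_duplicates(),
                        on=keys, how="left", indicator=True)
    keep = (merged["_merge"] == "left_only").to_numpy()
    if invert:
        keep = ~keep
    return pd.Series(keep, index=data.index)


data = pd.DataFrame({
    "trial": [1, 2, 3, 4, 5],
    "condition": ["A", "B", "A", "B", "C"],
    "rt": [0.5, 0.6, 0.4, 0.7, 0.5],
})
mask = make_mask_sketch(data, {"condition": "A"})
inverted = make_mask_sketch(data, {"condition": "A"}, invert=True)
```

Dropping duplicates on the right-hand frame before merging prevents the merge from multiplying rows, so the mask stays the same length as `data`.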
- pupeyes.utils.mat2gray(img)[source]#
Scale image values to grayscale range [0, 1].
- Parameters:
img (numpy.ndarray) – Input image array
- Returns:
Normalized image with values scaled to range [0, 1]
- Return type:
numpy.ndarray
Examples
>>> # Create sample image
>>> img = np.array([[0, 127, 255], [63, 191, 255]])
>>> normalized = mat2gray(img)
>>> normalized
array([[0.  , 0.5 , 1.  ],
       [0.25, 0.75, 1.  ]])
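A plausible reading of "scaled to range [0, 1]" is min-max normalization, sketched below. This is an assumption about the method, and `mat2gray_sketch` is a hypothetical name. With this formula, 127 maps to 127/255 (about 0.498), which agrees with the documented output after rounding to two decimals.

```python
import numpy as np


def mat2gray_sketch(img):
    """Min-max scale an array to [0, 1] (assumed behaviour)."""
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # A constant image has no range to scale; return zeros
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


normalized = mat2gray_sketch([[0, 127, 255], [63, 191, 255]])
```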
- pupeyes.utils.parse_pole(pole)[source]#
Parse and validate pole (origin) coordinates. From https://osdoc.cogsci.nl/3.3/manual/python/common/
- Parameters:
pole (tuple or array-like) – (x, y) coordinates for the pole/origin point
- Returns:
Validated (x, y) coordinates as floats
- Return type:
tuple
- Raises:
ValueError – If pole is not a valid 2D coordinate pair
Examples
>>> parse_pole((1, 2))
(1.0, 2.0)
>>> parse_pole([1.5, 2.5])
(1.5, 2.5)
- pupeyes.utils.xy_circle(n, rho, phi0=0, pole=(0, 0))[source]#
Generate points arranged in a circle. From https://osdoc.cogsci.nl/3.3/manual/python/common/
- Parameters:
n (int) – Number of points to generate
rho (float) – Radius of the circle (distance from center)
phi0 (float, optional) – Starting angle in degrees (counterclockwise from right). Default is 0.
pole (tuple, optional) – Center point (x, y) coordinates. Default is (0, 0).
- Returns:
List of (x, y) coordinate tuples for points arranged in a circle
- Return type:
list
Notes
Points are arranged counterclockwise starting from phi0. The angular separation between points is 360°/n.
Examples
>>> # Generate 4 points in a circle of radius 100
>>> xy_circle(4, 100)
[(100, 0), (0, 100), (-100, 0), (0, -100)]
>>> # Generate 4 points with 45° offset
>>> xy_circle(4, 100, phi0=45)
[(70.71, 70.71), (-70.71, 70.71), (-70.71, -70.71), (70.71, -70.71)]
- pupeyes.utils.xy_from_polar(rho, phi, pole=(0, 0))[source]#
Convert polar coordinates to Cartesian coordinates. From https://osdoc.cogsci.nl/3.3/manual/python/common/
- Parameters:
rho (float) – Radial distance from origin (or pole)
phi (float) – Angle in degrees (counterclockwise from right)
pole (tuple, optional) – Origin point (x, y) coordinates. Default is (0, 0).
- Returns:
(x, y) coordinates in Cartesian system
- Return type:
tuple
Notes
The angle phi is measured counterclockwise from the positive x-axis, following the mathematical convention.
Examples
>>> # Convert 45° angle at distance 100
>>> xy_from_polar(100, 45)
(70.71, 70.71)
>>> # Convert with offset origin
>>> xy_from_polar(100, 0, pole=(50, 50))
(150, 50)
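The conversion itself is the standard polar-to-Cartesian formula, offset by the pole. The sketch below uses a hypothetical name, `xy_from_polar_sketch`, and rounds to two decimals only to match the documented output; whether the package rounds is an assumption.

```python
import math


def xy_from_polar_sketch(rho, phi, pole=(0, 0), round_to=2):
    """Convert (rho, phi in degrees) to (x, y) relative to a pole."""
    ox, oy = float(pole[0]), float(pole[1])
    angle = math.radians(phi)
    x = ox + rho * math.cos(angle)
    y = oy + rho * math.sin(angle)
    # Rounding is an assumption made to mirror the documented examples
    return round(x, round_to), round(y, round_to)


diagonal = xy_from_polar_sketch(100, 45)
shifted = xy_from_polar_sketch(100, 0, pole=(50, 50))
```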
Plotting Utilities#
Plotting Utilities for Eye Movement Data
This module provides plotting functions for eye movement data visualization, including heatmaps, scanpaths, and areas of interest (AOIs).
- pupeyes.plot_utils.draw_aois(aois, screen_dims, x=None, y=None, background_img=None, alpha=0, colors=None, save=None)[source]#
Draw Areas of Interest (AOIs) and optionally plot fixation points within them.
This function visualizes AOIs as polygons and can optionally show fixation points colored according to which AOI they fall within. AOIs are drawn as outlined polygons with optional fill color and can be overlaid on a background image.
- Parameters:
aois (dict) – Dictionary mapping AOI names to lists of (x, y) vertex coordinates defining the AOI polygons. The last vertex should be the same as the first vertex to close the polygon. Example: {‘AOI1’: [(100, 100), (200, 100), (200, 200), (100, 200), (100, 100)]}
screen_dims (tuple) – Screen dimensions in pixels (width, height). Used to set plot boundaries and maintain correct aspect ratio.
x (array-like, optional) – X coordinates of fixation points in screen coordinates (0 = left). If provided along with y, points will be plotted and colored based on which AOI they fall within.
y (array-like, optional) – Y coordinates of fixation points in screen coordinates (0 = top)
background_img (str, PIL.Image or numpy.ndarray, optional) – Background image to overlay AOIs on. Can be: - Path to an image file (str) - PIL Image object - Numpy array of image data Image will be resized to match screen_dims if necessary.
alpha (float, default=0) – Fill transparency for AOI polygons (0 = transparent, 1 = opaque). The outlines remain fully opaque regardless of this value.
colors (dict, optional) – Dictionary mapping AOI names to colors for both the AOI polygons and their associated fixation points. If None, uses matplotlib’s tab20 colormap to assign colors automatically.
save (str, optional) – Path where the plot should be saved. If None, plot is not saved to disk.
- Returns:
(figure, axes) tuple containing the plot
- Return type:
tuple
Notes
The coordinate system uses screen coordinates where (0,0) is at the top-left
AOIs are drawn with solid outlines and optional transparent fill
When background_img is provided, it is displayed with 40% opacity
Fixation points outside any AOI are colored gray
A legend is automatically added showing AOI names
The plot maintains the correct aspect ratio based on screen dimensions
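Coloring fixations by AOI requires deciding which polygon, if any, contains each point. The documentation does not say how the package performs this test; one common technique is ray casting, sketched here with hypothetical helper names `point_in_polygon` and `assign_aoi`.

```python
def point_in_polygon(px, py, vertices):
    """Ray-casting test: count edge crossings to the right of the point."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def assign_aoi(px, py, aois):
    """Return the name of the first AOI containing the point, else None."""
    for name, vertices in aois.items():
        if point_in_polygon(px, py, vertices):
            return name
    return None


aois = {"AOI1": [(100, 100), (200, 100), (200, 200), (100, 200), (100, 100)]}
hit = assign_aoi(150, 150, aois)
miss = assign_aoi(300, 300, aois)
```

The repeated closing vertex is harmless here because a zero-length edge never straddles the test line.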
- pupeyes.plot_utils.draw_heatmap(x, y, screen_dims, durations=None, fc=6, colormap='viridis', alpha=0.7, background_img=None, return_data=False)[source]#
Create a heatmap visualization of fixation density using 2D histogram and Gaussian smoothing.
This function generates a heatmap by first creating a 2D histogram of fixation locations, then applying Gaussian smoothing to create a continuous representation of fixation density. The resulting heatmap can be overlaid on a background image if provided.
- Parameters:
x (array-like) – X coordinates of fixations in screen coordinates (0 = left)
y (array-like) – Y coordinates of fixations in screen coordinates (0 = top)
screen_dims (tuple) – Screen dimensions in pixels (width, height). Used to set the histogram bins and plot boundaries.
durations (array-like, optional) – Fixation durations for weighting the heatmap. If provided, longer fixations will contribute more to the density estimate.
fc (float, default=6) – Cutoff frequency (-6dB) for Gaussian smoothing. Higher values result in less smoothing.
colormap (str, default='viridis') – Matplotlib colormap to use for the heatmap visualization
alpha (float, default=0.7) – Transparency of the heatmap overlay (0 = transparent, 1 = opaque)
background_img (str, PIL.Image or numpy.ndarray, optional) – Background image to overlay heatmap on. Can be: - Path to an image file (str) - PIL Image object - Numpy array of image data Image will be resized to match screen_dims if necessary.
return_data (bool, default=False) – If True, returns the raw heatmap array instead of plotting
- Returns:
- If return_data is True:
Returns the normalized heatmap array (shape: height x width)
- If return_data is False:
Returns (figure, axes) tuple containing the plot
- Return type:
tuple or numpy.ndarray
Notes
The heatmap is generated using numpy.histogram2d and smoothed using a Gaussian filter
The coordinate system uses screen coordinates where (0,0) is at the top-left
The heatmap values are normalized to the range [0,1]
When using a background image, the heatmap is overlaid with the specified alpha transparency
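The histogram-then-smooth pipeline from the notes can be sketched as follows. Note the substitution: this sketch smooths with `scipy.ndimage.gaussian_filter` and a pixel-based `sigma`, not the cutoff-frequency (`fc`) parameterization the function documents, and `heatmap_sketch` is a hypothetical name.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def heatmap_sketch(x, y, screen_dims, durations=None, sigma=10.0):
    """Fixation density map of shape (height, width), scaled to [0, 1]."""
    width, height = screen_dims
    # One bin per pixel; rows index y and columns index x
    counts, _, _ = np.histogram2d(
        y, x, bins=[height, width],
        range=[[0, height], [0, width]], weights=durations)
    smooth = gaussian_filter(counts, sigma=sigma)
    span = smooth.max() - smooth.min()
    if span == 0:
        return np.zeros_like(smooth)
    return (smooth - smooth.min()) / span


# Two fixations near (60, 45) outweigh a single short one at (20, 15)
hm = heatmap_sketch([20, 60, 60], [15, 45, 45], (80, 60),
                    durations=[100, 300, 250], sigma=3)
peak_row, peak_col = np.unravel_index(hm.argmax(), hm.shape)
```

Passing `durations` as histogram weights is what makes longer fixations contribute more to the density estimate.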
- pupeyes.plot_utils.draw_scanpath(x, y, screen_dims, durations=None, dot_size_scale=3.0, line_width=1.0, dot_cmap='viridis', line_cmap='coolwarm', dot_alpha=0.8, line_alpha=0.5, background_img=None, show_labels=True, label_offset=(5, 5))[source]#
Create a visualization of fixation sequence (scanpath) with numbered points and connecting lines.
This function visualizes the sequence of fixations by plotting points at fixation locations and connecting them with lines to show the order. The points can be sized by fixation duration and colored using a colormap. The connecting lines use a different colormap to show sequence order.
- Parameters:
x (array-like) – X coordinates of fixations in screen coordinates (0 = left)
y (array-like) – Y coordinates of fixations in screen coordinates (0 = top)
screen_dims (tuple) – Screen dimensions in pixels (width, height). Used to set plot boundaries.
durations (array-like, optional) – Fixation durations in milliseconds. If provided, dot sizes will be scaled by the square root of duration.
dot_size_scale (float, default=3.0) – Base size for dots if no duration data, or scaling factor for dot sizes when durations are provided. Larger values = bigger dots.
line_width (float, default=1.0) – Width of the lines connecting fixation points
dot_cmap (str, default='viridis') – Colormap for dots. If durations provided, represents duration. If no durations, all dots will be blue.
line_cmap (str, default='coolwarm') – Colormap for connecting lines to show sequence order. Earlier saccades are colored differently from later ones.
dot_alpha (float, default=0.8) – Transparency of fixation dots (0 = transparent, 1 = opaque)
line_alpha (float, default=0.5) – Transparency of connecting lines (0 = transparent, 1 = opaque)
background_img (str, PIL.Image or numpy.ndarray, optional) – Background image to overlay scanpath on. Can be: - Path to an image file (str) - PIL Image object - Numpy array of image data Image will be resized to match screen_dims if necessary.
show_labels (bool, default=True) – Whether to show numeric labels for fixation sequence order
label_offset (tuple, default=(5, 5)) – (x, y) offset in pixels for the position of numeric labels relative to fixation points
- Returns:
(figure, axes) tuple containing the plot
- Return type:
tuple
Notes
The coordinate system uses screen coordinates where (0,0) is at the top-left
Dot sizes are scaled by sqrt(duration) if durations are provided
When using a background image, it is displayed with 40% opacity
Fixation sequence is numbered starting from 1
Lines between fixations show the saccade paths
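The quantities behind a scanpath plot (saccade segments, dot sizes, and sequence labels) can be computed separately from the drawing. The sketch below is illustrative only; `scanpath_geometry` is a hypothetical helper, and the exact size formula used by the package beyond "scaled by sqrt(duration)" is an assumption.

```python
import numpy as np


def scanpath_geometry(x, y, durations=None, dot_size_scale=3.0):
    """Return line segments, dot sizes, and 1-based labels for a scanpath."""
    points = np.column_stack([np.asarray(x, dtype=float),
                              np.asarray(y, dtype=float)])
    # Each segment joins consecutive fixations: shape (n - 1, 2, 2)
    segments = np.stack([points[:-1], points[1:]], axis=1)
    if durations is None:
        sizes = np.full(len(points), dot_size_scale)
    else:
        # Assumed formula: scale factor times sqrt(duration)
        sizes = dot_size_scale * np.sqrt(np.asarray(durations, dtype=float))
    labels = np.arange(1, len(points) + 1)
    return segments, sizes, labels


segments, sizes, labels = scanpath_geometry(
    [100, 400, 250], [100, 300, 500], durations=[100, 400, 900])
```

The `(n - 1, 2, 2)` segment array is the layout accepted by `matplotlib.collections.LineCollection`, which allows each saccade to be colored by its position in the sequence.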
Miscellaneous#
Saccade Functions#
Saccade Analysis Module
This module provides functions for analyzing saccadic eye movements recorded with Eyelink eye trackers. Currently, this file only contains functions that are tailored for visual search tasks in which items are presented in a circular array (e.g., the additional singleton task).
- pupeyes.saccades.saccade_aoi_angular(sample_data, data, col_sample_timestamp, col_x, col_y, col_saccade_start_time, col_saccade_end_time, col_target_pos, col_distractor_pos, col_distractor_cond, col_other_pos, item_coords, use=None, threshold=30)[source]#
Classify saccades based on their angular deviation towards potential target locations. Different from saccade_aoi_annulus(), this function uses the initial firing direction of a saccade to classify its destination. As a result, it also requires raw gaze position data. Make sure to use the same coordinate system for both sample_data and data.
- Parameters:
sample_data (pandas.DataFrame) – Raw eye tracking samples containing gaze positions
data (pandas.DataFrame) – Saccade data with start/end times
col_sample_timestamp (str) – Column name for timestamps in sample_data
col_x (str) – Column name for x coordinates in sample_data
col_y (str) – Column name for y coordinates in sample_data
col_saccade_start_time (str) – Column name for saccade start times in data
col_saccade_end_time (str) – Column name for saccade end times in data
col_target_pos (str) – Column name for target position coordinates
col_distractor_pos (str) – Column name for distractor position coordinates
col_distractor_cond (str) – Column name for distractor condition (‘P’ for present, ‘A’ for absent)
col_other_pos (list of str or None) – Column names for other item position coordinates
item_coords (list or numpy.ndarray) – List of (x,y) coordinates for all possible item positions
use (str or int, optional) – Point in the trajectory of a saccade to use for classification: - ‘mid’: midpoint - ‘one-third’: one-third point - int: specific sample number - None: endpoint (default)
threshold (float, optional) – Maximum angular deviation (degrees) to consider a saccade as directed towards an item (default: 30)
- Returns:
Original DataFrame with added columns:
- curritem (str)
Item type (‘Target’, ‘Singleton’, ‘Non-singleton’, or NaN)
- flag (str)
Reason for invalid classification (‘insufficient_samples’, ‘big_angle’, or NaN)
- Return type:
pandas.DataFrame
Notes
If a saccade has too few samples for the requested trajectory point, it is flagged ‘insufficient_samples’.
If the angular deviation to the nearest item exceeds threshold, it is flagged ‘big_angle’.
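Classification by direction can be sketched as comparing the angle of the saccade's early trajectory with the angle towards each item, both measured from the saccade start. The helper names below are hypothetical, and the rule "flag ‘big_angle’ when the smallest deviation exceeds the threshold" is an assumption based on the parameter descriptions.

```python
import math


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def classify_by_direction(start, probe, item_coords, threshold=30.0):
    """Return (item index, flag) for the item best aligned with the saccade."""
    sx, sy = start
    saccade_angle = math.degrees(math.atan2(probe[1] - sy, probe[0] - sx))
    best_idx, best_dev = None, float("inf")
    for idx, (ix, iy) in enumerate(item_coords):
        item_angle = math.degrees(math.atan2(iy - sy, ix - sx))
        dev = angular_difference(saccade_angle, item_angle)
        if dev < best_dev:
            best_idx, best_dev = idx, dev
    if best_dev > threshold:
        # Assumed rule: no item lies within the angular threshold
        return None, "big_angle"
    return best_idx, None


items = [(900, 600), (800, 500), (700, 600), (800, 700)]
towards_first = classify_by_direction((800, 600), (850, 610), items)
between_items = classify_by_direction((800, 600), (850, 650), items)
```

Here `probe` stands for whichever trajectory sample the `use` argument selects, such as the midpoint of the saccade.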
- pupeyes.saccades.saccade_aoi_annulus(data, item_coords, col_startx, col_starty, col_endx, col_endy, col_distractor_cond, col_target_pos, col_distractor_pos, col_other_pos=None, screen_dims=(1600, 1200), annulus_range=(50, 600), item_range=None, start_range=None, fixation_mode=False)[source]#
Classify saccade endpoints or fixations based on their proximity to items within an annular region. The function assumes eyelink coordinates are used, where the origin is in the top-left corner. You might need to convert your coordinates before using this function.
- Parameters:
data (pandas.DataFrame) – DataFrame containing saccade or fixation data
item_coords (list or numpy.ndarray) – List of (x,y) coordinates for all possible item positions
col_startx (str) – Column name for saccade start x coordinates
col_starty (str) – Column name for saccade start y coordinates
col_endx (str) – Column name for saccade end x coordinates
col_endy (str) – Column name for saccade end y coordinates
col_distractor_cond (str) – Column name for distractor condition (‘P’ for present, ‘A’ for absent)
col_target_pos (str) – Column name for target position coordinates
col_distractor_pos (str) – Column name for distractor position coordinates
col_other_pos (list of str, optional) – Column names for other item position coordinates
screen_dims (tuple, optional) – Screen dimensions (width, height) in pixels (default: (1600, 1200))
annulus_range (tuple, optional) – Inner and outer radius of annulus in pixels (default: (50, 600))
item_range (float, optional) – Maximum distance to consider a point as belonging to an item
start_range (float, optional) – Maximum allowed distance from screen center for start position
fixation_mode (bool, optional) – If True, only check end positions (default: False)
- Returns:
Original DataFrame with added columns:
- curritem (str)
Item type (‘Target’, ‘Singleton’, ‘Non-singleton’, or NaN)
- currloc (int)
Index of closest item position, based on the order provided in item_coords
- flag (str)
Reason for invalid classification (‘invalid_start_pos’, ‘invalid_end_pos’, ‘no_item_in_range’, or NaN)
- Return type:
pandas.DataFrame
Notes
If a saccade starts outside the annulus, it is classified as ‘invalid_start_pos’.
If a saccade ends outside the annulus, it is classified as ‘invalid_end_pos’.
If a saccade ends too far from any item, it is classified as ‘no_item_in_range’.
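The endpoint checks in the notes can be sketched for a single landing position, which corresponds to what the documentation calls fixation_mode. `classify_endpoint` is a hypothetical helper; measuring eccentricity from the screen center and treating the annulus bounds as inclusive are assumptions.

```python
import math


def classify_endpoint(end, item_coords, screen_dims=(1600, 1200),
                      annulus_range=(50, 600), item_range=None):
    """Return (closest item index, flag) for one landing position."""
    cx, cy = screen_dims[0] / 2, screen_dims[1] / 2
    eccentricity = math.hypot(end[0] - cx, end[1] - cy)
    inner, outer = annulus_range
    if not inner <= eccentricity <= outer:
        return None, "invalid_end_pos"
    # Distance from the landing position to every candidate item
    distances = [math.hypot(end[0] - ix, end[1] - iy)
                 for ix, iy in item_coords]
    currloc = min(range(len(distances)), key=distances.__getitem__)
    if item_range is not None and distances[currloc] > item_range:
        return None, "no_item_in_range"
    return currloc, None


items = [(900, 600), (800, 500), (700, 600), (800, 700)]
on_item = classify_endpoint((890, 610), items)
too_central = classify_endpoint((805, 600), items)
between = classify_endpoint((870, 670), items, item_range=50)
```

Mapping the returned index to ‘Target’, ‘Singleton’, or ‘Non-singleton’ would then depend on the per-trial position columns.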
- pupeyes.saccades.saccade_deviation(sample_data, data, col_sample_timestamp, col_x, col_y, col_saccade_start_time, col_saccade_end_time, find='mid')[source]#
Compute the angular deviation of saccade trajectories from a straight path.
This function measures how much a saccade’s trajectory deviates from a straight line between its start and end points. The deviation is measured as the angle between two lines: one from start to end point, and another from start to a specified point along the trajectory. This function may be helpful for detecting curved saccades. Make sure to use the same coordinate system for both sample_data and data.
- Parameters:
sample_data (pandas.DataFrame) – Raw eye tracking samples containing gaze positions
data (pandas.DataFrame) – Saccade data with start/end times
col_sample_timestamp (str) – Column name for timestamps in sample_data
col_x (str) – Column name for x coordinates in sample_data
col_y (str) – Column name for y coordinates in sample_data
col_saccade_start_time (str) – Column name for saccade start times in data
col_saccade_end_time (str) – Column name for saccade end times in data
find (str or int, optional) – Point in trajectory for curvature calculation: - ‘mid’: use midpoint (default) - ‘one-third’: use one-third point - ‘max’: find point of maximum deviation - int: use specific sample number - None: use endpoint
- Returns:
Original DataFrame with added columns:
- deviation (float)
Angular deviation at specified point (degrees)
- deviation_idx (int)
Sample index where deviation was computed
- deviation_time (float)
Timestamp where deviation was computed
- Return type:
pandas.DataFrame
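The deviation measure described for this function is the angle between the start-to-end line and the line from the start to a chosen trajectory sample. The sketch below uses hypothetical helpers; returning a signed angle and the handling of the ‘max’ mode are assumptions, since the documentation does not state the sign convention.

```python
import math


def deviation_angle(start, probe, end):
    """Signed angle (degrees) from the start->end line to start->probe."""
    ref = math.atan2(end[1] - start[1], end[0] - start[0])
    traj = math.atan2(probe[1] - start[1], probe[0] - start[0])
    diff = math.degrees(traj - ref)
    # Wrap into [-180, 180)
    return (diff + 180.0) % 360.0 - 180.0


def max_deviation(xs, ys):
    """Largest absolute deviation over interior samples and its index.

    Requires at least three samples (start, one interior point, end).
    """
    start, end = (xs[0], ys[0]), (xs[-1], ys[-1])
    devs = [deviation_angle(start, (x, y), end)
            for x, y in zip(xs[1:-1], ys[1:-1])]
    idx = max(range(len(devs)), key=lambda i: abs(devs[i]))
    return devs[idx], idx + 1  # +1 because the start sample was skipped


straight = deviation_angle((0, 0), (50, 0), (100, 0))
curved = deviation_angle((0, 0), (50, 50), (100, 0))
peak_dev, peak_idx = max_deviation([0, 25, 50, 75, 100], [0, 10, 12, 6, 0])
```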
External Modules#
EDF Reader#
EDF Reader of EyeLink Data
Adapted from: esdalmaijer/PyGazeAnalyser
Original Author: Edwin Dalmaijer
License: GNU GPL v3
Adapted By: Han Zhang <hanzh@umich.edu>
Date: 12/25/2024
- Changes:
Added support for reading metadata.
Added support for storing the last message and its time for each sample and event.
Moved checking trial end to the end of the loop to allow the last line (stop MSG) to be extracted.
- pupeyes.external.edfreader.read_edf(filename, start, stop=None, missing=0.0, debug=False, progress_bar=True)[source]#
Read EyeLink Data Format (EDF) file and extract trial data.
Adapted from: esdalmaijer/PyGazeAnalyser
Original Author: Edwin Dalmaijer
- Parameters:
filename (str) – Path to the file that has to be read
start (str) – Trial start string to identify beginning of trials
stop (str, optional) – Trial ending string, by default None
missing (float, optional) – Value to be used for missing data, by default 0.0
debug (bool, optional) – If True, prints information about current processing steps, by default False
progress_bar (bool, optional) – If True, shows a progress bar while reading the file, by default True
- Returns:
Contains two elements:
- data (list)
- List of dictionaries, one per trial, each containing:
- x (numpy.ndarray)
Array of x positions
- y (numpy.ndarray)
Array of y positions
- size (numpy.ndarray)
Array of pupil sizes
- time (numpy.ndarray)
Array of timestamps, t=0 at trial start
- trackertime (numpy.ndarray)
Array of timestamps according to EDF
- events (dict)
Dictionary containing event data (fixations, saccades, blinks, and messages)
- metadata (dict)
Dictionary containing calibration and tracking information
- Return type:
tuple
- pupeyes.external.edfreader.replace_missing(value, missing=0.0)[source]#
Replace missing values in gaze position data.
Adapted from: esdalmaijer/PyGazeAnalyser
Original Author: Edwin Dalmaijer
- Parameters:
value (str) – Either an X or a Y gaze position value (NOT pupil size, which is coded ‘0.0’)
missing (float, optional) – The missing code to replace missing data with, by default 0.0
- Returns:
Either the missing code, or the float value of the gaze position
- Return type:
float
Notes
A missing value in the EDF contains only a period, no numbers. This function is for gaze position values only, NOT for pupil size, as missing pupil size data is coded ‘0.0’.
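The rule in the notes (a lone period marks a missing gaze value) can be sketched in a few lines. `replace_missing_sketch` is a hypothetical name, and stripping surrounding whitespace before the comparison is an assumption.

```python
def replace_missing_sketch(value, missing=0.0):
    """Return the missing code for '.', otherwise the value as a float."""
    if value.strip() == ".":
        return missing
    return float(value)


valid = replace_missing_sketch("512.3")
absent = replace_missing_sketch(".")
custom = replace_missing_sketch(" . ", missing=-1.0)
```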
Blink Detection#
From the original code:
This adaptation to Python was made with the supervision and encouragement of Upamanyu Ghose. For more information about this adaptation and for more Python solutions, don’t hesitate to contact him:
Email: titoghose@gmail.com
Github code repository: github.com/titoghose
- pupeyes.external.based_noise_blinks_detection.based_noise_blinks_detection(pupil_size, sampling_freq)[source]#
Function to find blinks and return blink onset and offset indices.
Adapted from: R. Hershman, A. Henik, and N. Cohen, “A novel blink detection method based on pupillometry noise,” Behav. Res. Methods, vol. 50, no. 1, pp. 107–114, 2018.
- Parameters:
pupil_size (array-like) – Array of pupil size data for left/right eye
sampling_freq (float) – Sampling frequency of the eye tracking hardware in Hz (e.g., 1000)
- Returns:
- Dictionary with keys:
“blink_onset” : array of blink onset indices
“blink_offset” : array of blink offset indices
- Return type:
dict
Notes
- The function handles several edge cases:
No blinks in the data
Data starts with a blink
Data ends with a blink
- The algorithm:
Smooths the data to increase difference between measurement noise and eyelid signal
Finds monotonically increasing and decreasing sections
Updates blink onsets and offsets using these sections
Concatenates close blinks/missing trials if they are within 100ms
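The last step of the algorithm, merging blinks separated by 100 ms or less, can be illustrated on its own. This sketch is deliberately simplified: it treats runs of zero-valued samples as blinks and omits the smoothing and monotonic-section refinement of the published method. `merge_close_blinks` is a hypothetical name, and coding missing pupil data as 0 is an assumption.

```python
import numpy as np


def merge_close_blinks(pupil_size, sampling_freq, max_gap_ms=100):
    """Find runs of missing samples and merge runs separated by short gaps."""
    pupil = np.asarray(pupil_size, dtype=float)
    missing = (pupil == 0).astype(int)
    # Padding with zeros lets runs at the very start or end be detected
    edges = np.diff(np.concatenate(([0], missing, [0])))
    onsets = np.flatnonzero(edges == 1)
    offsets = np.flatnonzero(edges == -1) - 1
    max_gap = max_gap_ms / 1000 * sampling_freq  # gap length in samples
    merged_on, merged_off = [], []
    for on, off in zip(onsets, offsets):
        if merged_off and on - merged_off[-1] <= max_gap:
            # Close to the previous blink: extend it instead of starting anew
            merged_off[-1] = off
        else:
            merged_on.append(on)
            merged_off.append(off)
    return {"blink_onset": np.array(merged_on, dtype=int),
            "blink_offset": np.array(merged_off, dtype=int)}


pupil = np.full(1000, 5.0)
pupil[100:150] = 0   # first blink
pupil[200:260] = 0   # 51 samples after the first: merged at 1000 Hz
pupil[600:650] = 0   # far from the others: kept separate
blinks = merge_close_blinks(pupil, sampling_freq=1000)
```

The zero padding also covers the edge cases listed in the notes, where the data starts or ends with a blink.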