Tutorial 2 – Data Preprocessing
Tutorial Developers: Robert Salati and B.J. Fregly, Rice Computational Neuromechanics Lab, Rice University
Last Updated: 11/4/2025
The NMSM Pipeline expects input data to be in a specific format for the tools to work properly. The data preprocessing sub-tool focuses on processing raw kinematic, force, and EMG data into a usable format. Data preprocessing is the next step in the pipeline after JMP, and is required to proceed with MTP, NCP, GCP, and finally, Treatment Optimization.
The inputs to data preprocessing are experimental IK motions, ID loads, ground reaction forces and moments, geometric muscle data, and EMG data. The tool then crops and filters the relevant quantities for the NMSM Pipeline tools to run.
Data Preparation:
Open the OpenSim model UF_Subject_3_reduced_muscles.osim in the OpenSim GUI.
Run the Inverse Kinematics tool on the model using the settings file input_data\IKSettings.xml.
a. This settings file runs IK using 4 seconds of marker data for the gait trial to be analyzed. All weights are set to 1 for simplicity.
Run the Inverse Dynamics tool on the model using the settings file input_data\IDSettings.xml.
a. This settings file runs ID using the previous filtered IK results, and external force data in electrical center format.
Run the Muscle Analysis tool on the model using the settings file input_data\MASettings.xml.
a. This settings file runs MA using the previous IK results to calculate muscle-tendon lengths and moment arms that will be used later in the pipeline.
Data Preprocessing:
Open the MATLAB script preprocessing.m and examine the file.
This MATLAB script does 4 primary tasks:
a. Process EMG data: Starting from the raw EMG file, the script high pass filters, demeans, rectifies, and low pass filters the EMG signals. Next, any remaining negative EMG values are set to zero. EMG signals are then offset so that the minimum value of each signal is 0. Finally, EMG signals are normalized so that the max value of each signal is 1.
b. Create muscle-tendon velocities: The script low pass filters the muscle-tendon lengths created by Muscle Analysis, splines the filtered data using GCV splines, and then differentiates the filtered data.
c. Crop data: One user input to preprocessing is a set of time pairs to which the data should be cropped. One file is created for each time pair specified by the variable trialTimePairs.
d. Filter data: All data are low pass filtered using the cutoff frequency specified by inputSettings.cutoffFrequency.
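The four tasks above can be illustrated with a small, self-contained sketch. It is written in Python purely for illustration and is not the code in preprocessing.m. All function and parameter names (low_pass, process_emg, differentiate, crop_trials, and their arguments) are hypothetical. The source does not state the filter type or order, so a zero-phase first-order filter stands in for the script's filters, and central finite differences stand in for GCV spline differentiation. Treating the crop window as inclusive at both ends is also an assumption.

```python
import math


def low_pass(signal, cutoff_hz, sample_rate_hz):
    """Zero-phase first-order low-pass filter (forward pass, then backward pass).

    Stand-in only: the tutorial does not state the filter type or order.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)

    def one_pass(values):
        state, out = values[0], []
        for value in values:
            state += alpha * (value - state)
            out.append(state)
        return out

    # Filtering the reversed output a second time cancels the phase lag.
    return one_pass(one_pass(signal)[::-1])[::-1]


def process_emg(raw, sample_rate_hz, high_cut_hz, low_cut_hz):
    """High pass, demean, rectify, low pass, clip, offset, and normalize."""
    # High pass: subtract the low-passed (slow drift) part of the signal.
    drift = low_pass(raw, high_cut_hz, sample_rate_hz)
    x = [r - d for r, d in zip(raw, drift)]
    mean = sum(x) / len(x)
    x = [abs(v - mean) for v in x]            # demean, then rectify
    x = low_pass(x, low_cut_hz, sample_rate_hz)
    x = [max(v, 0.0) for v in x]              # set remaining negatives to zero
    lowest = min(x)
    x = [v - lowest for v in x]               # offset so the minimum is 0
    highest = max(x)
    return [v / highest for v in x] if highest > 0 else x  # maximum becomes 1


def differentiate(time, values):
    """Central differences inside the record, one-sided at the two ends.

    Stand-in for differentiating GCV splines fitted to the filtered lengths.
    """
    last = len(values) - 1
    rates = []
    for i in range(len(values)):
        a, b = max(i - 1, 0), min(i + 1, last)
        rates.append((values[b] - values[a]) / (time[b] - time[a]))
    return rates


def crop_trials(time, values, trial_time_pairs):
    """Return one (time, values) pair per (start, end) window, ends inclusive."""
    trials = []
    for start, end in trial_time_pairs:
        keep = [i for i, t in enumerate(time) if start <= t <= end]
        trials.append(([time[i] for i in keep], [values[i] for i in keep]))
    return trials
```

In this sketch, each entry returned by crop_trials corresponds to one output file in the real tool, and calling low_pass with a single cutoff mirrors the role of inputSettings.cutoffFrequency.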
Processed data are output into the directory preprocessed.
Before moving on to the next tools, it is important to visualize your preprocessed data to make sure it looks correct. Run PlotPreprocessed.m to do this.