Data Preprocessing

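Data Preprocessing is a set of functions, combined into a MATLAB script, that helps users process and organize their data for use in Muscle-Tendon Personalization (MTP), Neural Control Personalization (NCP), and Ground Contact Personalization (GCP). This tutorial assumes that Joint Model Personalization (JMP) has already been run successfully on your OpenSim model and marker data.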

Prior to Preprocessing

Before Data Preprocessing, you need results from OpenSim's Inverse Kinematics (IK), Inverse Dynamics (ID), and Muscle Analysis (MA) tools. Run these OpenSim tools after JMP, because the joint-personalized model yields improved IK, ID, and MA results.

Muscle Analysis Notes

The data needed from Muscle Analysis for the NMSM Pipeline is described here:

  • Moment Arm files for all coordinates of interest. These should be produced by Muscle Analysis and named something like model_MuscleAnalysis_MomentArm_hip_r.sto.
  • Length file containing the muscle-tendon length of all muscles throughout the motion. This should be named something like model_MuscleAnalysis_Length.sto.

Data Preprocessing Steps

Data Preprocessing provides a few convenience functions for processing your data. You are free to process your data by another method, but the result must follow the data organization described below, which the other Model Personalization tools require.

These preprocessing steps are available as a script, preprocessing.m, which can be found in the core codebase under the Preprocessing directory or in the examples.

EMG Processing

Raw EMG signals can be processed with processRawEmgFile.m. The processing itself is implemented in processEmg.m.

Muscle-Tendon Velocity

Muscle-tendon velocity is needed by the remaining tools and can be precalculated. Pass createMuscleTendonVelocity.m the name of a muscle-tendon length file and a cutoff frequency.
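
As a rough illustration of what precalculating a velocity involves, the Python sketch below differentiates a length time series with finite differences. It is not the code in createMuscleTendonVelocity.m: the function name finite_difference_velocity is hypothetical, the data are made up, and the low-pass filtering implied by the cutoff frequency argument is left out.

```python
def finite_difference_velocity(time, length):
    """Approximate d(length)/dt at every sample.

    Interior samples use a central difference; the first and last
    samples fall back to a one-sided difference.
    """
    if len(time) != len(length) or len(time) < 2:
        raise ValueError("time and length need the same size, at least 2 samples")
    last = len(time) - 1
    velocity = []
    for i in range(len(time)):
        lo = max(i - 1, 0)      # previous sample, clamped at the start
        hi = min(i + 1, last)   # next sample, clamped at the end
        velocity.append((length[hi] - length[lo]) / (time[hi] - time[lo]))
    return velocity


# Made-up example: a length that grows steadily by 0.5 units per second.
time = [0.0, 0.1, 0.2, 0.3]
length = [0.20, 0.25, 0.30, 0.35]
velocity = finite_difference_velocity(time, length)
```

Differentiation amplifies measurement noise, which is presumably why the MATLAB function asks for a cutoff frequency; a real workflow would filter before or after this step.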

Data Splitting

Long data files can be split into individual trials with splitIntoTrials.m. This function takes matching long inverse kinematics and inverse dynamics files, an optional EMG file, and a muscle analysis directory, and organizes that data for use in the rest of the pipeline.
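
For readers who split their data by another method, the idea can be sketched in Python as follows. This is only an illustration under stated assumptions: it supposes that trial boundaries are known as start and end times, which is not necessarily how splitIntoTrials.m defines them, and the function name split_rows_by_time and the sample values are hypothetical.

```python
def split_rows_by_time(rows, trial_windows):
    """Group data rows into trials.

    rows          -- sequence of tuples whose first element is the time stamp
    trial_windows -- dict mapping a trial prefix to (start_time, end_time),
                     both ends inclusive (assumed to be known in advance)
    """
    trials = {}
    for prefix, (start, end) in trial_windows.items():
        trials[prefix] = [row for row in rows if start <= row[0] <= end]
    return trials


# Made-up long recording: (time, one data column)
rows = [(0.0, 1.0), (0.5, 1.5), (1.0, 2.0), (1.5, 2.5), (2.0, 3.0)]
windows = {"gait_1": (0.0, 1.0), "gait_2": (1.0, 2.0)}  # hypothetical windows
trials = split_rows_by_time(rows, windows)
```

Applying the same windows to the inverse kinematics, inverse dynamics, and EMG data keeps the resulting trials time synchronized with one another.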

Data Organization Specification

Data is organized in a specific way so that it can be conveniently loaded into each tool in the pipeline. Each trial has its own prefix (gait_1, gait_2, gait_3, etc.) that identifies which files are time synchronized between directories. The directories are IKData, IDData, MAData, and EMGData. For IKData, IDData, and EMGData, every file must contain 101 data points, and the time columns must match across the files for a given trial.
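
If you prepare the files yourself, one way to meet the 101-point requirement is to resample each column by linear interpolation onto a shared time vector. The Python sketch below shows this under stated assumptions: the function name resample_to_n_points and the sample data are hypothetical, and the interpolation used by the pipeline's own functions may differ.

```python
from bisect import bisect_right


def resample_to_n_points(time, values, n_points=101):
    """Linearly interpolate values onto n_points evenly spaced times
    spanning time[0] to time[-1]. time must be strictly increasing."""
    if len(time) != len(values) or len(time) < 2:
        raise ValueError("time and values need the same size, at least 2 samples")
    start, end = time[0], time[-1]
    new_time = [start + (end - start) * i / (n_points - 1) for i in range(n_points)]
    new_values = []
    for t in new_time:
        # Left index of the original interval that contains t,
        # clamped so that j + 1 is always a valid index.
        j = min(max(bisect_right(time, t) - 1, 0), len(time) - 2)
        fraction = (t - time[j]) / (time[j + 1] - time[j])
        new_values.append(values[j] + fraction * (values[j + 1] - values[j]))
    return new_time, new_values


# Made-up example: three samples resampled to 101 points.
new_time, new_values = resample_to_n_points([0.0, 1.0, 2.0], [0.0, 10.0, 20.0])
```

Resampling the IK, ID, and EMG data for a trial over the same start and end times gives all three the same time column.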

For the MAData directory, each trial is a sub-directory (gait_1, gait_2, gait_3, etc.). In each sub-directory, the moment arm files must be prefixed with the trial name and suffixed with MomentArm_*coordinate_name*.sto, and a single file suffixed with _Length.sto must hold the muscle-tendon lengths.
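
The naming rules for a trial's MAData sub-directory can be checked with a short script before running the other tools. The Python sketch below is one possible check, not part of the NMSM Pipeline: the function name check_ma_trial_files is hypothetical, and the example file names assume the trial prefix is joined to the suffix with an underscore.

```python
def check_ma_trial_files(trial, file_names, coordinates):
    """Return a list of problems found in one trial's MAData sub-directory.

    trial       -- trial prefix, for example "gait_1"
    file_names  -- names of the files inside MAData/<trial>
    coordinates -- coordinate names that need a moment arm file
    """
    problems = []
    for name in file_names:
        if not name.startswith(trial):
            problems.append(f"{name}: missing trial prefix {trial}")
    for coordinate in coordinates:
        suffix = f"MomentArm_{coordinate}.sto"
        if not any(name.endswith(suffix) for name in file_names):
            problems.append(f"no moment arm file for {coordinate}")
    length_files = [name for name in file_names if name.endswith("_Length.sto")]
    if len(length_files) != 1:
        problems.append(
            f"expected exactly one _Length.sto file, found {len(length_files)}"
        )
    return problems


# Hypothetical contents of MAData/gait_1
files = [
    "gait_1_MomentArm_hip_r.sto",
    "gait_1_MomentArm_knee_r.sto",
    "gait_1_Length.sto",
]
problems = check_ma_trial_files("gait_1", files, ["hip_r", "knee_r"])
missing = check_ma_trial_files("gait_1", files[:2], ["hip_r", "ankle_r"])
```

An empty list means the sub-directory follows the naming rules; in practice file_names could come from os.listdir on the trial's sub-directory.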

If you want to write your own preprocessing script, it may be best to run splitIntoTrials.m to better understand how the data is organized for use in subsequent tools.