PhysBox: The Psychophysiology Toolbox
Latest revision as of 19:58, 20 August 2018
- 1 Overview
- 2 Installation
- 3 File Naming and Directory Structure Conventions
- 4 The EEG Data Structure
- 5 The EEGLab and Physbox GUI
- 6 PhysBox Processing by Script
- 7 Checking Data Reduction Integrity
- 8 PhysBox Functions
- 9 PhysBox Listserv
- 10 Citing PhysBox
- 11 For Developers
- 12 To Do and Future Functionality
PhysBox is a plugin that works in conjunction with the EEGLab toolbox. PhysBox provides additional functionality and/or easier use for novice users to process psychophysiological data including EEG/ERP, startle EMG, skin conductance, heart rate, etc. PhysBox is designed to allow for the use of a very simple Matlab script along with a simple tab-delimited text "Parameter" file for batch reduction. In most instances, the user only needs to implement very minor (if any) changes to the script along with simple updates of parameters (e.g., path and file names) in the Parameter file. PhysBox is also designed to output by default numerous quantitative and visual metrics to verify the integrity of the data reduction for all participants. This facilitates identification of problematic subjects and/or other problems in the data reduction process.
Installation of both EEGLab and PhysBox is required to use PhysBox as PhysBox is a plugin for EEGLab. Both EEGLab and PhysBox are installed via Subversion (SVN). Installation via SVN is attractive because it will allow you to update your installations of EEGLab and PhysBox easily and as frequently as you like.
To start, you will need to determine where to install EEGLab and PhysBox. If the computers in your laboratory are on a network, we strongly suggest that you install EEGLab and PhysBox in their own folders on one central computer (e.g., a file server) that is accessible from all of your data processing workstations. This provides two benefits. First, when you install/update PhysBox and EEGLab, you will only need to install/update on this one computer rather than following these steps for every data processing workstation in your laboratory. Second, this will guarantee that all your data processing workstations are using the same versions of EEGLab and PhysBox with the same configuration setup. Of course, if you don't have a local network, you can simply install both on every workstation. In this case, I suggest installing into appropriately named folders in Program Files. It will work fine this way as well but will take more time to install/update.
NOTE: Matlab will need to be configured to recognize EEGLab and PhysBox by adding their folders to the Matlab path. This step is accomplished after installation of both EEGLab and PhysBox and is described below.
NOTE: PhysBox should work with most recent versions of EEGLab and Matlab. Our lab tends to be up to date within the past year for versions of both. In addition to the base package for Matlab, PhysBox filtering requires the Signal Processing Toolbox.
Installation via SVN
Subversion (SVN) provides version control for software developers. It is used by both the EEGLab and PhysBox developers. The files are hosted on a central repository on a server on the web, and users can download the most up-to-date copy of these files very easily. To accomplish this, you need to interact with the SVN server via an SVN client. The client that you use (and therefore the steps to install or update the files) varies by OS. Follow the instructions for your OS.
Windows users will use the TortoiseSVN client. TortoiseSVN is very simple to use. It is a shell program that will add context menus to Windows Explorer, which will allow you to download/update both EEGLab and PhysBox directly using Windows Explorer.
1. Go to the TortoiseSVN download page and download and install the correct version of this software (32 bit or 64 bit). You will need to restart your computer after installing TortoiseSVN before your first use.
2. Determine where you will install EEGLab (see comments above). Create a folder called EEGLab in this location.
3. Right click on this folder in Windows Explorer and select SVN Checkout.... In the URL of Repository box, enter the following URL (You will only need to do this once. Future updates will remember this URL):
All other defaults are correct. Press OK and the most current version of EEGLab will be downloaded into this folder.
4. Now determine where you will install PhysBox (see comments above). Create a folder called PhysBox in this location.
5. Right click on this folder in Windows Explorer and select SVN Checkout.... In the URL of Repository box, enter one of the following URLs:
for anonymous access: svn://svn.code.sf.net/p/physbox/code/trunk/
for read/write access (developers only): https://svn.code.sf.net/p/physbox/code/trunk/
All other defaults are correct. Press OK and the most current version of PhysBox will be downloaded into this folder.
6. In Windows Explorer, copy the file called eegplugin_PhysBox.m within the PhysBox folder. Paste this file into the plugin folder inside of the EEGLab folder. This will add PhysBox to the EEGLab GUI.
NOTE: In the future, you can update either EEGLab or PhysBox to their most current version by right clicking on the appropriate folder and selecting SVN Update. This will bring up a dialog box where you will again select the appropriate Repository URL from a drop down menu (TortoiseSVN will remember your previous entries). All other defaults will be correct. Click OK and your software will be up to date! If you update PhysBox, you will still need to manually copy the eegplugin_PhysBox.m file to the plugin folder for EEGLab to keep the GUI current.
Mac users should first read the documentation above for Windows. The only difference is that you will not use TortoiseSVN and Windows Explorer. Instead, you will use the built-in Subversion client.
First open the Terminal application. (An introduction to Terminal is here: http://macapper.com/2007/03/08/the-terminal-an-introduction/)
To get an initial copy of the current version of PhysBox, type:
svn checkout 'http://svn.code.sf.net/p/physbox/code/trunk' physbox
To update an existing copy to the latest version in the repository:
cd physbox; svn update
Repeat this process with the URL for EEGLab to get or update a copy of it.
If you would like to explore other SVN clients with GUIs for the Mac:
Add PhysBox and EEGLab to the Matlab Path
Install our lab's custom startup file as per the instructions here: Installing Matlab and updating license and startup files
After you have modified (or created) your startup.m file, restart Matlab and test your installation with the following tests:
Type eeglab at the Matlab command prompt. The EEGLab GUI should start. If it does, you have successfully installed EEGLab.
If PhysBox is the first menu option at the top left of the EEGLab GUI, you have installed the PhysBox GUI correctly.
Type help pop_ProcessSet. If this command returns help for this function, you have added PhysBox to the Matlab path correctly.
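If you are not using the lab's custom startup file, the path setup can be sketched in a few lines of startup.m. This is a minimal sketch, not the lab's actual file: both folder locations are hypothetical, and adding PhysBox with genpath (so that any subfolders are included) is an assumption.

```matlab
% Minimal startup.m sketch. Both folder locations are hypothetical;
% replace them with the folders you chose during installation.
eeglabDir  = 'C:\Toolboxes\EEGLab';
physboxDir = 'C:\Toolboxes\PhysBox';

% Add each folder only if it exists, so a wrong path produces a clear warning.
if exist(eeglabDir, 'dir')
    addpath(eeglabDir);
else
    warning('EEGLab folder not found: %s', eeglabDir);
end

if exist(physboxDir, 'dir')
    % genpath also adds subfolders (assumption: PhysBox may use subfolders).
    addpath(genpath(physboxDir));
else
    warning('PhysBox folder not found: %s', physboxDir);
end
```

After restarting Matlab, the tests listed above (eeglab and help pop_ProcessSet) confirm whether the folders were picked up.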
Recommended EEGLab configuration changes
1. Type eeglab at the Matlab command prompt
2. From the GUI, select File:Memory and Other Options
3. Under STUDY options, unselect the box which indicates If set, save not one but two files....
4. Under Memory Options, unselect the box which indicates If set, use single precision...
5. Under Folder Options, select the box which indicates If set, when browsing to open a new data...
6. Under EEGLAB chat option, unselect the box which indicates If set, enable EEGLAB chat...
NOTE: Depending on your OS and your level of permissions, you may need to change the location of your Options file. The path to the options file can be changed via the button at the bottom of the dialog box. If necessary, select a location that you have permission to write/modify files.
NOTE2: It may be more direct to edit the options file by typing edit eeg_optionsbackup at the command line.
Final Recommendation: You may also prefer to change the default y-axis scale when viewing data. To do this, type edit eegplot at the Matlab command prompt. Use Edit:Find and Replace from the editor menu to search for this line:
try, g.spacing; catch, g.spacing = 0; end;
Change g.spacing=0 to an appropriate value. I suggest 400.
File Naming and Directory Structure Conventions
MyStudy -> MyRawDataRootDir -> MySingleSubDirs (1 per subject; contains raw files) -> MyReducedDataDir (contains processed intermediate files and logs)
1. For each study there is a study folder. One folder within it holds the raw data and serves as the root path for all data files (root path).
   a. Inside the root path there is an individual folder for each subject, named with the subject ID number. These files are never modified. If you need to do anything by hand, it would go here (input path).
      i. Within each subject's folder there is a reduce folder, where processed data is saved. Its contents can be recreated easily (output path).
2. File names: somethingsubid.whatever
Data file types and formats
CNT (int16, int32) vs. SMA vs. SET
CON vs. EPH vs. AVG vs GND
xxx = local file info (applied processing)
yyy = subid and/or runid type info (root name)
aaa = file type extension
PhysBox function names and types
1. pop_TitleCase: these functions work at the command line and are called by the GUI. If you provide a single parameter, a GUI will pop up to ask for the rest of the necessary parameters.
2. notes_TitleCase: most processing steps make notes that are saved in the data structure so you can review whether things worked well.
3. Support functions: these come from multiple places (inconsistent naming) and are never called directly.
Parameters in functions and Parameter file
time in ms
all entries in cell arrays
Windows for epochs
ChanList of channel labels ('all' 'exclude')
The EEG Data Structure
Read documentation on the EEG data structure from the EEGLab Wiki
NEED TO describe the notes and scores fields that are unique to PhysBox
The EEGLab and Physbox GUI
PhysBox Processing by Script
pop_ProcessSet() is the main workhorse for data processing by script. It is a liaison between the individual pop functions and the parameter (P) file. The P file provides information about the parameters needed for any specific pop function for each subject. The P file also tracks if subjects have been rejected and/or reduced already. Finally, the pop functions write output (via notes functions) back to the P file with assistance from pop_ProcessSet().
pop_ProcessSet() calls are typically embedded in a for loop that loops through all entries in a parameter file, e.g.:
for i = 1:CountSets(P)
    [EEG, P] = pop_ProcessSet(...)
    [EEG, P] = pop_ProcessSet(...)
    [EEG, P] = pop_ProcessSet(...)
end
The use of pop_ProcessSet() requires the set up and loading of a Parameter (P) file. This file will be used to a) determine if subject is rejected or reduced already, b) provide input parameters for the pop functions for each subject, and c) receive output from the pop functions that is useful for data reduction integrity checks.
USAGE: [EEG, P] = pop_ProcessSet(EEG, Function, P, SubjectIndex, ParameterIndex)
The calls to pop_ProcessSet() provide a standard interface to all pop functions that require you to input only (up to) five simple parameters to this function in your processing scripts. They are defined as follows:
EEG: an EEG set file that will be the target for the associated pop function indicated in Function
Function: A character string that matches the name of a pop function without the 'pop_' and the '()'. This string is case sensitive and should use TitleCase
P: A parameter file which is a DAT file with input parameters for all processing steps for every subject. This P file should have already been loaded into the workspace through a call to pop_GetParameters(). More information about this P file is provided in the appropriate section below
SubjectIndex: Index (row) of the current subject in the P file to obtain parameters. If you are calling pop_ProcessSet() in a for i = 1:CountSets(P) loop as described above, SubjectIndex is simply i
ParameterIndex: Optional index to add onto parameter field names in the P file when the same pop function will be called multiple times in one reduction. For example, SaveSet requires a parameter field named 'ssFile'. SaveSet is often called multiple times in one reduction script. Therefore separate calls to pop_ProcessSet() for each application of SaveSet will use a ParameterIndex of 1, 2, 3, etc respectively and the P file will include fields labeled 'ssFile1', 'ssFile2', 'ssFile3', etc:
    [EEG, P] = pop_ProcessSet(EEG, 'SaveSet', P, i, 1)
    [EEG, P] = pop_ProcessSet(EEG, 'SaveSet', P, i, 2)
    [EEG, P] = pop_ProcessSet(EEG, 'SaveSet', P, i, 3)
OUTPUT PARAMETERS
EEG: the EEG set file, updated after application of the pop function indicated by Function
P: the Parameter file, possibly updated (depending on the pop function)
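The way ParameterIndex selects a numbered field can be illustrated with a toy structure. This is only a sketch of the naming idea: the internals of pop_ProcessSet() are not shown in this documentation, the P values here are made up, and storing character fields as cell arrays is an assumption.

```matlab
% Toy stand-in for the P structure (values are made up for illustration).
P.SubID   = [18; 1021];
P.ssFile1 = {'EPHAll'; 'EPHAll'};
P.ssFile2 = {'EPHFin'; 'EPHFin'};
P.ssFile3 = {'AVG'; 'AVG'};

i = 1;               % SubjectIndex: row of the current subject
ParameterIndex = 2;  % e.g., the second SaveSet call in the script

% Append the index to the base field name, then read that subject's entry
% using a dynamic field reference.
fieldName = sprintf('ssFile%d', ParameterIndex);   % 'ssFile2'
fileTag   = P.(fieldName){i};                      % 'EPHFin'
fprintf('%s -> %s\n', fieldName, fileTag);
```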
The Parameter File P
The Parameter file is a tab-delimited text file (typically ending in .dat extension) that is used by pop_ProcessSet() and pop_MultiSet() to obtain input parameters for the associated pop functions and to save output parameters (to check data reduction integrity) from the application of these pop functions.
On disk, the Parameter file is a Rows X Columns DAT file where individual subjects are recorded in separate rows and input/output parameter fields are saved in separate columns. The field names are case sensitive. No specific ordering of fields is necessary but it is good practice to use a sensible ordering (e.g., required fields first, remaining fields in the order that they are called by processing script). The Parameter files are best viewed and edited in Excel but should always be saved as .dat text files. It is also important that you close the Parameter file before running your scripts.
A call to [P] = pop_GetParameters() is used to import the data in this Parameter file into a data structure, P, that has different fields for each input/output parameter field/column. These fields can contain varied data types including double, char, and cell arrays.
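To make the on-disk layout concrete, the sketch below writes a tiny tab-delimited file and reads it back with base Matlab. It is only a stand-in for inspecting the format: PhysBox itself imports the file with pop_GetParameters(), and the two subject rows are made up.

```matlab
% Write a tiny, made-up Parameter file to a temporary location.
pFile = [tempname '.dat'];
fid = fopen(pFile, 'w');
fprintf(fid, 'SubID\tSubIDDigits\tRejected\tReduced\tssFile1\n');
fprintf(fid, '18\t4\t0\t0\tEPHAll\n');
fprintf(fid, '1021\t4\t1\t0\tEPHAll\n');
fclose(fid);

% Read it back: one row per subject, one column per parameter field.
% readtable is base Matlab, not part of PhysBox.
T = readtable(pFile, 'FileType', 'text', 'Delimiter', '\t');
disp(T.Properties.VariableNames);

% Subjects that are neither rejected nor already reduced.
toReduce = T.SubID(T.Rejected == 0 & T.Reduced == 0);
disp(toReduce);

delete(pFile);   % clean up the temporary file
```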
The P data structure created from the Parameter file is intended to be used by a specific script that contains calls to pop_ProcessSet() or by command line calls to pop_MultiSet(). Sample Parameter files and associated data reduction scripts are available for the reduction of Startle ([Script]; [Parameter file]), ERP ([Script]; [Parameter file]), Corrugator ([Script]; [Parameter file]), and Skin Conductance Response ([Script]; [Parameter file]). You should review a couple of these sample Parameter files before reading further.
Input Parameter Fields
There are a handful of input parameter fields that are required in all Parameter files. As noted earlier, ordering does not matter but I prefer to list them at the start of the Parameter file in this order...
- This numeric field contains the SubID
- This numeric field indicates how many characters should be preserved when converting SubID to character for appending to filenames. This allows for padding shorter SubIDs with zeros and/or SubIDs of different lengths. For example, if you list a SubID of 18 and SubIDDigits of 4, the character version of SubID will be '0018'
- This numeric field indicates if the subject has been rejected for future data processing. 1 = Rejected, 0 = Not Rejected. This allows you to retain a record of subjects that you will not include in final processing and analysis because of data or other problems. This field can also take on a value of -1 if a subject is 'Auto-Rejected' by pop_ProcessSet(). This will happen if the script attempts to open the original raw data file but does not find it due to a filename error (or because it is truly missing). This allows your script to continue without crashing, and you can later set the Rejected field back to 0 after fixing the problem and re-run the script to reduce this individual subject.
- This character field is used primarily to store notes on why you have rejected a subject. However, pop_ProcessSet() will add notes to this field if it auto-rejects a subject.
- This character field is never used directly by PhysBox functions. As such, it is not really required. However, it provides a nice place to keep track of notes about a subject that might be relevant for interpreting their data.
- This numeric field is used by pop_CheckStartle() and pop_CheckERP(). If you are not using those functions, it is not required. It is set to 1 (manually by you) once you have checked the reduction for a participant. Future calls to the pop_Check functions will then ignore this subject to save you time. This field should be initially set to 0 when the subject has not been checked.
- This numeric field should initially be set to 0 ('not reduced'). When a subject is reduced by script and you call pop_ReductionComplete() at the end of the data reduction loop, this function will set the Reduced field to 1. This allows subsequent runs of the data reduction script to skip reduced subjects and only reduce new subjects in the Parameter file. It is also used by pop_MultiSet() to determine which subjects to display figs, export notes, score, etc.
- This character field should provide the full path to your root folder that contains all subject data files. In our lab, this folder is called RawData. This is the higher level folder that contains all the subject (InputPath) folders. See notes above under File/Folder naming conventions for more details.
- This character field should provide the full path to the folder that contains all aggregate data files (i.e., data files with data from all subjects). In our lab, this folder is called Analysis. See notes above under File/Folder naming conventions for more details.
- This character field contains a string that is added to the front of the names of all files (data, figure, etc) that are created. Use a unique Prefix for each data reduction script to avoid overwriting files that have otherwise similar names.
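The SubID and SubIDDigits example in the list above (18 padded to four characters) can be reproduced in base Matlab. This is a minimal sketch of the padding rule as described, not PhysBox's own conversion code.

```matlab
SubID       = 18;
SubIDDigits = 4;

% Build a zero-padded integer format with the requested width, then apply it.
fmt      = sprintf('%%0%dd', SubIDDigits);   % builds the format string '%04d'
subIDStr = sprintf(fmt, SubID);              % '0018'
disp(subIDStr);
```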
While not strictly required, the next set of input parameter fields are the ssFile# fields. These fields are used when you call SaveSet within pop_ProcessSet(). These are character fields that are used to determine output file names for your data files. Therefore if you are saving any EEG files for each subject, you will have to include these fields. For example, in the typical startle reduction, we first save a raw epoched file with all trials (ssFile1 = EPHAll), followed by a final epoched file with artifactual trials rejected (ssFile2 = EPHFin), and finally an averaged file across event types (ssFile3 = AVG). See notes about the ParameterIndex in pop_ProcessSet() above to understand how individual calls to SaveSet unambiguously link to the appropriately numbered ssFile field.
The remaining input parameter fields are each linked to calls to specific pop functions within pop_ProcessSet(). Each pop function requires its own parameters with specific field names. You can see a list of the parameters and their field names for each pop function by typing help pop_ProcessSet at the command line. These field names do have some consistent structure. Specifically, each field name begins with two lowercase letters that reflect the two words in the pop function name (e.g., ss for SaveSet, ee for ExtractEpochs, de for DeleteEpochs, etc). Following these two lowercase letters is the actual descriptive parameter name in TitleCase.
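The two-letter prefix convention can be sketched as follows. This assumes the function name consists of exactly two capitalized words; names that contain acronyms (e.g., ScoreERP) may not follow this simple rule, so the help for pop_ProcessSet remains the authoritative list.

```matlab
% Derive the two-letter prefix from a two-word TitleCase function name.
funcNames = {'SaveSet', 'ExtractEpochs', 'DeleteEpochs'};
prefixes  = cell(size(funcNames));

for k = 1:numel(funcNames)
    caps        = regexp(funcNames{k}, '[A-Z]', 'match');  % capital letters
    prefixes{k} = lower([caps{1:2}]);                      % first two, lowercased
    fprintf('%s -> %s\n', funcNames{k}, prefixes{k});
end

% A full field name is the prefix plus a TitleCase parameter name,
% e.g. 'ss' + 'File' gives the ssFile field described above.
ssField = [prefixes{1} 'File'];
```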
There is one final 'field' associated with the input parameters. This field is called 'StartNotes' and is typically filled with a series of 'XXXXX' for all subjects. This field is not required but serves as a nice visual separator between the input parameter fields and the output parameter fields.
Output Parameter Fields
An attractive feature of the PhysBox pop functions is that many call associated notes functions (with same TitleCase name as pop function). These notes functions add to a 'notes' field in the EEG set file. If you call AppendNotes via pop_ProcessSet() or pop_MultiSet(), these notes fields in each EEG file will be appended to the end of your Parameter file in the row associated with that subject. This puts many details about the data reduction in one place for you to review to confirm the integrity of your data reduction. These output parameter fields also use a semi-consistent name structure. Specifically they begin with two lower case letters and a '_' (to distinguish them from the input parameters). The lower case letters indicate which pop function produced these output parameters. The remainder of the field name uses TitleCase to provide more detail about the specific parameter.
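Because output fields carry an underscore after the two-letter prefix, they can be separated from input fields by pattern. The sketch below assumes that convention holds strictly; the field names other than SubID and ssFile1 are hypothetical and serve only to show the pattern.

```matlab
% Hypothetical field names; only the naming pattern matters here.
fieldNames = {'SubID', 'ssFile1', 'eeWindow', 'ee_NumEpochs', 'de_NumRejected'};

% Output fields start with two lowercase letters followed by an underscore.
isOutput     = ~cellfun(@isempty, regexp(fieldNames, '^[a-z]{2}_', 'once'));
outputFields = fieldNames(isOutput);
inputFields  = fieldNames(~isOutput);

disp(outputFields);
```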
It is good practice to put a call to pop_ProcessSet() with AppendNotes as the function near the end of your reduction script loop. Alternatively, you can append these output parameters after data reduction at any point using pop_MultiSet() with AppendNotes as the function. Make sure you use the EEG file with the most complete set of notes. This is typically the last EEG file created in the data reduction stream for a subject.
pop_MultiSet() is designed to execute pop functions on all Reduced (and typically not Rejected) subjects in a parameter file without the need to put it in a loop (in contrast to pop_ProcessSet()). This function is designed to be used at the command line (or by GUI) after primary processing is complete. It is only implemented for a subset of pop functions that are useful after primary processing at the command line. For example, ScoreERP will likely be called many times as you consider various windows and scoring methods. Similarly, LoadFig and ViewRejects are intended to be used after primary processing to check the validity of that processing. It is expected that additional necessary parameters will be passed to the pop function via the OptParams parameter in pop_MultiSet(). The functions that are implemented, along with the parameters that need to be supplied via OptParams, are listed in the help for this function (type help pop_MultiSet at the command line).
As noted earlier, the specific pop function will generally be applied to all Reduced but not Rejected subjects in the P file. ExportNotes and AppendNotes are exceptions that will include Rejected subjects to the degree that notes exist for them.
USAGE: [P] = pop_MultiSet(P, Function, ssFileIndex, OptParams)
Command line calls to pop_MultiSet() provide a standard interface to apply many pop functions to all subjects in a P file. It requires you to input only (up to) four simple parameters at the command line. They are defined as follows:
P: A Parameter (P) file
Function: A character string that matches the name of a pop function without the 'pop_' and the '()'. It is case sensitive and should use TitleCase
ssFileIndex: This is the index of the ssFile field in the P file with the name of the EEG file required to execute this pop function. It is not necessary for all pop functions as some (e.g., LoadFig) do not require an EEG file
OptParams: This is a cell array that allows input of any additional parameters needed by the pop function. As a cell array, it is quite flexible and entries in this cell array can take varied forms (double, char, cell). See help for pop_MultiSet() for details on the requirements for any specific pop function that is implemented
OUTPUT PARAMETERS
P: a (possibly) updated P file. Currently, none of these functions update P but it is included for consistency and use in scripting.
Examples of typical/useful pop_MultiSet() calls are provided at the bottom (commented out) of each demo script.
Sample PhysBox Processing Scripts, Parameter Files, and Data
Demonstration Sets by Measure
- Startle: Zip file
- ERP: Zip file
- Corrugator (time domain): Zip file
- Skin Conductance Response: Zip file
Checking Data Reduction Integrity
Checking Event Table Structure
Demo EEGlab/Matlab script for checking event code count, timing/spacing and counterbalancing: CheckEventCodes.m
Check Startle Data Reduction
Check ERP Data Reduction
pop_LoadFig() and multiset processing
pop_ViewRejects() and multiset processing (soon)
list each function here
To join or leave the PhysBox Listserv, send a blank email from your relevant email account to either:
Cite both PhysBox plugin and EEGLab toolbox
Curtin, J. J. (2011). PhysBox: The Psychophysiology toolbox. An open source toolbox for psychophysiological data reduction within EEGLab. http://dionysus.psych.wisc.edu/PhysBox.htm
A Delorme & S Makeig (2004) EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics. Journal of Neuroscience Methods, 134:9-21
To Do and Future Functionality
- Need help for pop_DiagnosticFigure
- AppendNotes has a bug: when a new note is shorter than the previous note for the same subject, it does not erase all of the previous note.
- AvgFigure should take an (optional) parameter which lists events to plot
- Check on NaNs for subjects with no epochs in SAFE. Why are fields in sets earlier than EPHFin NaN?
- Make the method input parameter of ScoreERP a cell array, rather than a string. It would be fun if we had the option to specify unique scoring methods for each of our windows. perhaps the current setup (ie, single method) could be the default if only one method were passed?
- Demo script and Parameter file for CRG (time and frequency) and SCR
- Save vs. Saveas in GUI
- Create notes_ScoreWindows() and notes_ScoreERP() based on notes_ScoreStartle()
- Systematic update of COM at end of functions.
- Systematic completion of help for notes and support functions
- fix warning message precision error in AdjTime()
- Figure out how setname is set
- Create a ReadDat and WriteDat that works with tdfread/write but converts char fields to cell rather than char array.
- Make sure that MarkMean and MarkThreshold can be called multiple times without overwriting past results. Also update code in MarkMean (and maybe MarkThreshold) to model use of max for identifying rejected trials
- Figure out how to speed up pop_AppendNotes
- make change gain (e.g., for PRB) function for CON files
- Make all parameter calls like ButterworthFilter with option to include ParameterIndex in fieldname in pop_ProcessSet(). Definitely need ParameterIndex for ScoreERP and Score Windows
- Warning for ExportScores function if subID is in file already. Maybe save in notes like agWarning?
- Add LoadSet to pop_ProcessSet for later calls in a script to load set files that aren't currently in the processing stream
- Write help menu for function with links as per eeglab
- ExportScores takes list of scores
- Figure out when to use eeg_hist and eegh. Generally consider what COMs to record. Possibly Add new COM field for multiset?
- Need pop_ScoreLatency() for ERP
- Need pop wrapper for pop_image()
- Need wrapper for pop_rejepoch (already in menu)
- Need wrapper for pop_select (need in Menu)
- Make pop_Convert2set()
- Finish pop_CheckERP()
- Consider what happens to notes fields when EEG = emptyset for functions like pop_CreateAvg and pop_AverageWaveform. If this is a second (or later) reduction, will the old notes remain?
Possible Future Functionality
- Consider ways to mark bad trials and review visually with independent reviewers in PhysBox GUI
- Integrate methods from MASS: http://openwetware.org/wiki/Mass_Univariate_ERP_Toolbox
- consider adding a field in output file and notes output for mean in 150-250 for startle. It would be very similar to basedeflect scoring currently.
- need a REMOVE SCORES function to get rid of ERP scores from scores field that don’t work.
- write a Parameter file creating script. This can actually be a pop_function that takes STL or ERP and creates a shell (header) with typical columns for default reduction
- Explore new toolbox for viewing data. Consider including as alternative view option
- Estimate missing data for STL
- reject based on point by point exceeding SDs from mean (i.e., confidence envelope around full trial)
- Check for each parameter in sub functions and output descriptive error when not found
- Need better DiagnosticFigure for ERP
- Think more carefully about eeg_FastFourier. Compare to pwelch, etc
- Function to find peak latency of components (in various windows) in grand average to use in selection of scoring windows.
- better screen processing info for EventRecode. Include what is being recoded to what