Procedures of analysis using NeoAnalysis
After successfully installing NeoAnalysis, users can use the toolbox following the procedures depicted in Fig. 2. A step-by-step tutorial can be found in the user manual. In brief, users first import the raw data of any supported format and convert it to HDF5 format. Then users can perform spike detection (see Additional file 1), spike sorting, and/or signal filtering on the converted data. Next, if the experiment includes eye-movement data, users can perform saccade detection and extraction. Otherwise, users can begin to analyze spike trains, LFPs, and other behavioral data using the corresponding plotting functions. The results of each analysis session can be saved for future use. If users want to analyze data from a population of neurons, NeoAnalysis can retrieve the saved workspaces and perform analysis and statistics across the data gathered for the entire population.
Spike sorting
Offline spike sorting can be done using the SpikeSorting module. The following code starts the module with a 3D view:
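A minimal launch sketch is shown below. The class name follows the module name given above, but the `pca_3d` flag is an illustrative parameter name reconstructed from the description of the 3D view, not a confirmed signature; consult the user manual for the exact call.

```python
# Launch the offline spike-sorting GUI with the 3D PCA view.
# NOTE: 'pca_3d' is an assumed parameter name, reconstructed from the
# text; check the NeoAnalysis user manual for the exact signature.
from NeoAnalysis import SpikeSorting

SpikeSorting(pca_3d=True)
```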
An interface with several buttons and panels is then displayed. Users can load data from a specific location by clicking the Load button at the bottom of the control panel. All spike channels are listed in the drop-down box labeled ‘Channel’, and users can select the channel of interest to begin sorting. All spikes recorded in the selected channel are shown in the bottom panel (labeled ‘timestamps’). A sliding window allows a portion of the spikes to be selected so that their waveforms are displayed in the left panel (labeled ‘waveforms’). The right panel shows the principal components of all spikes in the selected channel (labeled ‘PCA’, Fig. 3a). Users can check the ‘AutoSortThisChannel’ box to start automated sorting using the wavelet analysis and superparamagnetic clustering method [19, 20]. The parameters displayed below it are set to optimal values based on a previous study [19], and this function generally produces satisfactory results without further adjustment. Users can adjust UnitsNum to define the number of sorted units; however, this does not change the major units but only reassigns units containing few spikes to the unsorted group (unit 0). If users are not satisfied with the automated sorting, they can sort manually using either the window discriminator or the PCA discriminator. With the window discriminator, users select spike waveforms in the left panel using the segment widget (two red lines with square ends, which can be moved, stretched, shortened, and reoriented). With the PCA discriminator, users select spike principal components in the right panel using a polygon widget (a red polygon with square nodes, which can be moved and reshaped, and whose edges can be added or removed). The selected spikes can then be assigned to unit 1–unit 9 (unit 0 means unsorted).
It is important to note that re-sorting done with either discriminator is displayed simultaneously in both panels. In addition, this module provides a 3D view of the first three principal components of all spikes in the selected channel (Fig. 3b). Although no operations can be performed in this view, it gives users an overview of the data and helps them verify selections made with the PCA discriminator. Once users are satisfied with the sorting results, they can click the Save button to save the data; otherwise, they can click the ResetAll button to start over. An additional movie shows this procedure in more detail (see Additional file 2).
Single unit analysis
The graphics module of NeoAnalysis provides several useful functions for basic analyses. A notable feature of these functions is that they are all equipped with a powerful ‘sort_by’ option, which allows users to obtain results sorted by experimental condition (see “Design principles”). The graphics module first provides a data table that includes all of the experimental information and recorded signals on a trial-by-trial basis. Then, using the ‘sort_by’ option in combination with other settings, users can obtain the required results without writing complex code. The following command lines illustrate how the graphics module computes PSTHs, plots rasters, and calculates spike counts.
In the Python console window, run the following codes:
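A reconstruction of the command lines described below is sketched here. The parameter names (filename, trial_start_mark, comment_expr, channel, sort_by, align_to, pre_time, post_time, timebin) come from the text, but the file path, marker values, channel name, numeric settings, and the method names `plot_raster_PSTH` and `plot_spike_count` are illustrative assumptions, not verified NeoAnalysis API:

```python
from NeoAnalysis import graphics                                  # line 1
                                                                  # line 2
file_name = './data/sample_data.h5'                               # line 3
gh = graphics(file_name, trial_start_mark='event_64720',
              comment_expr='key:value')                           # line 4
gh.data_df                                                        # line 5
gh.data_df['patch_direction'] = \
    gh.data_df['patch_direction'].astype(float)                   # line 6
gh.plot_raster_PSTH(channel='spike_26_1',
                    sort_by=['patch_direction'],
                    align_to='event_64721',
                    pre_time=-300, post_time=1000)                # line 7
gh.plot_spike_count(channel='spike_26_1',
                    sort_by=['patch_direction'],
                    align_to='event_64721',
                    timebin=[0, 500])                             # line 8
```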
In line 1, the graphics module is imported from NeoAnalysis for single-unit analysis. Line 3 defines the path and filename of the data. Line 4 initiates the graphics class by setting the parameters filename, trial_start_mark, and comment_expr. The trial_start_mark is the marker representing the start of a trial, which is used to separate the raw data into trials. The comment_expr tells the program how the experimental conditions and parameters are stored in the data. In this example, the experimental condition (here, ‘patch_direction’) and the setting of each trial (here, a value in degrees) are stored together as a comment entity with a colon in between (i.e. ‘patch_direction:degree’). By setting comment_expr to ‘key:value’, the program decodes the key as ‘patch_direction’, and the value for a particular trial as the direction (in degrees) of that trial. This convention gives users the flexibility to store their experimental parameters freely. After this step, all data are reorganized into an informative trial-by-trial data table, which can be displayed using the code in line 5. A portion of the table is shown in the graphics panel of Fig. 2.
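The ‘key:value’ decoding rule itself is simple; the plain-Python sketch below shows what such a parser does (`decode_comment` is a hypothetical helper written for illustration, not part of the NeoAnalysis API):

```python
def decode_comment(comment, comment_expr='key:value'):
    # comment_expr specifies how the condition name and the per-trial
    # setting are joined; 'key:value' means a colon separator.
    sep = comment_expr.replace('key', '').replace('value', '')
    key, value = comment.split(sep, 1)
    return key, value

# A trial stored as 'patch_direction:90' decodes to the condition name
# and that trial's setting.
print(decode_comment('patch_direction:90'))  # ('patch_direction', '90')
```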
Because experimental conditions are stored as strings in the data, converting them to numeric values ensures that conditions are sorted in their logical (numerical) order and speeds up sorting during conditioning. This is done using the code in line 6.
A raster plot with an accumulated PSTH can be produced using the function in line 7. Most parameters, including bin_size, overlap, Mean, Sigma, filter_nan, and fig_column, have default values, so users do not need to set them unless they have particular requirements. Users do need to define the parameters channel, sort_by, align_to, pre_time, and post_time. The channel parameter defines the spike channel and the unit order, in case multiple units were recorded. The sort_by parameter defines which experimental conditions are used to sort the data. The align_to parameter defines which event marker is used to align the data; in this example, the event marker ‘event_64721’ represents the onset time of the visual stimuli. The pre_time and post_time parameters define the time range (relative to the align_to marker) selected for the analysis. The bin_size and overlap parameters specify the bin width for computing the PSTH and the overlap between adjacent bins. Mean and Sigma define the Gaussian kernel for data smoothing. The output of line 7 is shown in Fig. 4, with the smoothed PSTH at the bottom and the raster at the top of each panel. Notably, this function does not simply produce a single figure; it plots the results separately for each of the required experimental conditions.
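Conceptually, a PSTH is computed by aligning spike times to the event marker, binning them, and converting counts to rates; the NumPy sketch below illustrates that core step (it is not the NeoAnalysis implementation, and omits the overlap and Gaussian-smoothing options for brevity):

```python
import numpy as np

def psth(spike_times, align_time, pre_time, post_time, bin_size):
    """Firing rate (spikes/s) in fixed-width bins around an event.
    All times in ms; a bare-bones sketch without overlap or smoothing."""
    t = np.asarray(spike_times, dtype=float) - align_time  # align to event
    edges = np.arange(pre_time, post_time + bin_size, bin_size)
    counts, _ = np.histogram(t, bins=edges)
    return counts / (bin_size / 1000.0)  # spikes per bin -> Hz

# Spikes at 120-150 ms cluster in the first 100-ms bin after the event
# at 100 ms; the spike at 400 ms falls in the last bin.
rates = psth([120, 130, 150, 400], align_time=100,
             pre_time=0, post_time=300, bin_size=100)
print(rates)  # [30.  0. 10.]
```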
The command in line 8 plots the spike counts during the period defined by the parameter timebin. The other parameters follow the same convention as in line 7. The output of this command is shown in Fig. 5, which shows the direction tuning of this example neuron.
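The spike-count analysis amounts to grouping the per-trial counts by the sort_by condition and averaging; the toy sketch below uses made-up numbers to illustrate how such a direction-tuning curve arises:

```python
import numpy as np

# Per-trial spike counts within the timebin window, and the value of the
# sorting condition (patch direction, deg) on each trial -- toy data.
directions = np.array([0, 0, 90, 90, 180, 180])
counts = np.array([4, 6, 12, 10, 3, 5])

# Mean count per condition: the direction-tuning curve.
tuning = {int(d): float(counts[directions == d].mean())
          for d in np.unique(directions)}
print(tuning)  # {0: 5.0, 90: 11.0, 180: 4.0}
```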
Spectrum analysis
A common analysis for LFPs is to plot the spectrogram. The graphics module provides several functions for spectrum analysis using the periodogram method [23]. For example, the function below plots the time–frequency spectrum of the LFP in the low-frequency domain (< 100 Hz):
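A reconstruction of that call is sketched here; the parameter names (channel, sort_by, align_to, pre_time, post_time, color_bar) come from the text, but the method name `plot_spectrogram` and the numeric values are illustrative assumptions, not verified NeoAnalysis API:

```python
# Hypothetical reconstruction; 'plot_spectrogram' and the time-window
# values are assumptions based on the surrounding description.
gh.plot_spectrogram(channel='analog_26',
                    sort_by=['patch_direction'],
                    align_to='event_64721',
                    pre_time=-300, post_time=1000,
                    color_bar=True)
```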
This function sorts the data in channel ‘analog_26’ by the patch_direction parameter, within the time window defined by pre_time and post_time. Setting color_bar to True turns on the scale bar. By default, the function uses a ‘hann’ window to calculate the density across the time–frequency domain. Users can refer to the manual for more details about the available options. The result is shown in Fig. 6.
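Under the hood, a periodogram is the squared magnitude of the Fourier transform of the windowed signal; the self-contained NumPy sketch below applies a ‘hann’ taper to a synthetic LFP and recovers its dominant low-frequency rhythm (illustrative, not NeoAnalysis code):

```python
import numpy as np

fs = 1000.0                          # sampling rate in Hz (illustrative)
t = np.arange(0, 1, 1 / fs)
lfp = np.sin(2 * np.pi * 20 * t)     # synthetic LFP with a 20-Hz rhythm

win = np.hanning(lfp.size)           # 'hann' taper, as in the default
spec = np.abs(np.fft.rfft(lfp * win)) ** 2   # periodogram
freqs = np.fft.rfftfreq(lfp.size, 1 / fs)

low = freqs < 100                    # keep the low-frequency domain
peak = freqs[low][np.argmax(spec[low])]
print(peak)  # 20.0
```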
Saccade detection
NeoAnalysis provides a function called find_saccade to detect saccades. The algorithm for saccade detection in this function is based on setting thresholds for eye-movement speed, duration, and distance [24]. These parameters have been set to optimal values based on our experience; however, users can adjust them if the default settings do not satisfy their needs. The results of saccade detection contain information regarding when and where a saccade starts and ends, as well as the amplitude of the saccade. This information is also added to the aforementioned data table that contains all of the experimental settings and recorded signals. In addition, NeoAnalysis provides another function called choose_saccade, which can be used to select saccades during a given period of time and/or within a certain range of amplitude. An example of saccade detection is illustrated in Fig. 7, in which the black vertical lines indicate the start and end times, and the red and green spots indicate the start and end positions of the detected saccade, respectively.
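The thresholding idea behind find_saccade can be sketched in a few lines. The toy detector below flags stretches where eye speed stays above a velocity threshold for a minimum duration; it is written for illustration only (it is not the NeoAnalysis algorithm, and the distance threshold is omitted for brevity):

```python
import numpy as np

def detect_saccades(x, y, fs, speed_th=30.0, min_samples=3):
    """Return (start, end) sample indices where eye speed (deg/s)
    exceeds speed_th for at least min_samples consecutive samples."""
    speed = np.hypot(np.gradient(x), np.gradient(y)) * fs
    fast = np.append(speed > speed_th, False)   # sentinel closes the last run
    saccades, start = [], None
    for i, is_fast in enumerate(fast):
        if is_fast and start is None:
            start = i                            # run of fast samples begins
        elif not is_fast and start is not None:
            if i - start >= min_samples:         # long enough to count
                saccades.append((start, i - 1))
            start = None
    return saccades

# Fixation, a fast 10-degree rightward jump, then fixation again.
x = np.r_[np.zeros(50), np.linspace(0, 10, 20), 10 * np.ones(50)]
y = np.zeros_like(x)
print(detect_saccades(x, y, fs=1000))  # [(50, 69)]
```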
Data analysis at population level
The results obtained from the analysis discussed above can be stored in a workspace for each recording session. NeoAnalysis then provides a module, named PopuAnalysis, to analyze the population data across all sessions. In the following example, we illustrate how to use this module to analyze behavioral and electrophysiological data at the population level using a simulated workspace named ‘sample_workspace.h5’.
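The command lines described below are reconstructed here. The module name PopuAnalysis and the parameter names store_key and conditions come from the text, but the constructor argument, the method name `plot_line`, and the store_key value are illustrative assumptions, not verified NeoAnalysis API:

```python
from NeoAnalysis import PopuAnalysis                           # line 1
path = './data'                                                # line 2
pa = PopuAnalysis(path + '/sample_workspace.h5')               # line 3
pa.plot_line(store_key='reaction_time',
             conditions=[['a', 'b', 'c'], ['A', 'B', 'C']])    # line 4
```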
Using the code above, the workspace in the data folder is first loaded (lines 1–3), and then the mean reaction time is computed for the different experimental conditions and displayed as line plots (line 4). The parameter store_key in line 4 defines which data in the workspace will be analyzed, and the parameter conditions defines the conditions for data sorting. In this example, there are two levels of conditions, each containing three factors (‘a’, ‘b’, ‘c’ for level 1 and ‘A’, ‘B’, ‘C’ for level 2). The result of this analysis is shown in Fig. 8.
For spike train analysis at the population level, the function plot_spike is used:
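A reconstruction of that command line is sketched here; the function name plot_spike and the parameter names (store_key, normalize, fig_mark, error_style, ci) come from the text, but the argument values are illustrative assumptions:

```python
# Hypothetical argument values; only the parameter names are from the text.
pa.plot_spike(store_key='spike_26_1',
              conditions=[('a', 'A'), ('a', 'B'), ('b', 'A'), ('b', 'B')],
              normalize=True,
              fig_mark=[0],            # vertical reference line at t = 0
              error_style='band', ci=95)
```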
The command line above compares neuronal activity among four experimental conditions: (‘a’, ‘A’), (‘a’, ‘B’), (‘b’, ‘A’), and (‘b’, ‘B’). The parameter store_key defines the data to be analyzed. If the parameter normalize is set to True, the activities of different neurons are normalized before the mean responses are calculated. The fig_mark parameter denotes where to place vertical reference lines indicating specific events (e.g. stimulus onset). The error_style parameter sets the error-bar style in the figure, and ci sets the confidence interval. The result of this command is shown in Fig. 9.
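The normalization step can be sketched as follows. Peak normalization is used here purely for illustration, since the text does not specify NeoAnalysis’s exact scheme; the point is that each neuron is rescaled before averaging so that high-rate neurons do not dominate the population mean:

```python
import numpy as np

# Each row is one neuron's mean response over time -- toy data with
# very different firing-rate scales.
resp = np.array([[2.0, 4.0, 8.0],
                 [10.0, 20.0, 40.0]])

# Rescale each neuron by its own peak, then average across neurons.
norm = resp / resp.max(axis=1, keepdims=True)
population = norm.mean(axis=0)
print(population)  # -> 0.25, 0.5, 1.0 at the three time points
```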