The PDM (and nwbPipeline) organize and process data with the following steps:
- Set montage: Set the montage information, which maps each device channel to a brain region.
- Unpack data: Read the binary data and save the CSC (Continuously Sampled Channel) signals and timestamps to .mat files.
- Automatic spike sort: Detect spikes and cluster them into units.
- Extract LFP: Remove spikes from the raw CSC signals and downsample to 2 kHz.
- Manual spike sort: Select spike clusters by visual inspection.
- Export to NWB: Export the data to NWB (Neurodata Without Borders) format for data sharing.
- Read NWB with Python
This document describes the output data saved in the NWB file. For more details on the NWB pipeline, check the GitHub repo.
Automatic Spike Sort
The output of spike sorting has the following structure:
Experiment-<exp_id1>–<exp_id2>-…
– CSC_micro_spikes
  – G[A-D][1-8]-<channel_name>_spikes.mat
  – G[A-D][1-8]-<channel_name>_spikeCodes.mat
  – times_G[A-D][1-8]-<channel_name>.mat
We refer to these as the spike file, the spike code file, and the times file. The spike file holds the spike waveforms and times extracted by spike detection. The spike code file holds the features used for automatic clustering. The times file holds a cluster index for each spike, with noise spikes rejected.
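To make the naming convention concrete, here is a minimal Python sketch that classifies a file name as a spike, spike code, or times file. The helper `classify` and the channel names in the usage note are hypothetical; only the `G[A-D][1-8]` prefix and the suffixes come from the layout above.

```python
import re

# Patterns derived from the directory layout above. The capture groups are
# the G[A-D][1-8] prefix and the <channel_name> part.
_PATTERNS = {
    "spike": re.compile(r"^(G[A-D][1-8])-(.+)_spikes\.mat$"),
    "spike_code": re.compile(r"^(G[A-D][1-8])-(.+)_spikeCodes\.mat$"),
    "times": re.compile(r"^times_(G[A-D][1-8])-(.+)\.mat$"),
}


def classify(filename):
    """Return (kind, prefix, channel_name), or None if nothing matches."""
    for kind, pattern in _PATTERNS.items():
        match = pattern.match(filename)
        if match:
            return kind, match.group(1), match.group(2)
    return None
```

For example, `classify("GA1-RAH1_spikes.mat")` returns `("spike", "GA1", "RAH1")`, where `RAH1` is a made-up channel name.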
G[A-D][1-8]-<channel_name>_spikes.mat
- spikes: an N (number of spikes) by 74 matrix of raw spike waveforms.
- spikeTimestamps: the timestamps (in seconds) of the spikes in the experiment (or combined experiment).
- timestampsStart: the start time (UNIX time in seconds) of the experiment (or of the first of the combined experiments).
- ExpName: a cell array of strings holding the experiment names.
- ExpNameId: a vector with one entry per spike, each an index into ExpName that links the spike to its experiment.
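As an illustration of how ExpNameId links each spike to ExpName, the sketch below resolves the experiment name for every spike. It assumes the indices are 1-based, as is usual for data written from MATLAB; this is not stated in the file description, so check it against your data. The function name and the example values are hypothetical.

```python
def experiment_per_spike(exp_names, exp_name_id, one_based=True):
    """Return the experiment name for each spike.

    exp_names   -- sequence of experiment names (ExpName)
    exp_name_id -- one index per spike (ExpNameId)
    one_based   -- ASSUMPTION: indices follow MATLAB's 1-based convention
    """
    offset = 1 if one_based else 0
    names = []
    for raw in exp_name_id:
        index = int(raw) - offset
        # Guard against silent wrap-around from negative Python indices.
        if not 0 <= index < len(exp_names):
            raise ValueError(f"ExpNameId value {raw} is out of range")
        names.append(exp_names[index])
    return names
```

With `exp_names = ["expA", "expB"]` and `exp_name_id = [1, 1, 2]`, the result is `["expA", "expA", "expB"]`.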
times_G[A-D][1-8]-<channel_name>.mat
- cluster_class: an M (number of spikes after noise rejection) by 2 matrix; the 1st column is the cluster index and the 2nd column is the spike timestamp (in seconds).
- timestampsStart (same as in the spike file): the start time (UNIX time in seconds) of the experiment (or of the first of the combined experiments).
- spikeIdxRejected: a logical vector (with one entry per spike in the spike file) indicating whether each spike is noise.
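The sketch below shows one way the spike file and the times file might be combined: drop the waveforms flagged in spikeIdxRejected, then group the remaining waveforms by the cluster index in cluster_class. It assumes that the rows of cluster_class are in the same order as the non-rejected rows of spikes, which the description suggests but does not state. The function name is hypothetical.

```python
def group_waveforms_by_cluster(spikes, spike_idx_rejected, cluster_class):
    """Group non-rejected waveforms by cluster index.

    spikes             -- N waveforms (rows of the `spikes` matrix)
    spike_idx_rejected -- N booleans, True where the spike is noise
    cluster_class      -- M rows of (cluster index, timestamp in seconds)

    ASSUMPTION: cluster_class rows follow the order of the kept spikes.
    """
    kept = [w for w, rejected in zip(spikes, spike_idx_rejected) if not rejected]
    if len(kept) != len(cluster_class):
        raise ValueError(
            f"{len(kept)} kept spikes but {len(cluster_class)} cluster_class rows"
        )
    clusters = {}
    for waveform, (cluster_id, _timestamp) in zip(kept, cluster_class):
        clusters.setdefault(int(cluster_id), []).append(waveform)
    return clusters
```

The length check makes a mismatch between N, M, and the rejection mask fail loudly rather than silently misaligning waveforms and cluster labels.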
Manual Spike Sort
In addition to automatic spike sorting, we usually do manual spike sorting with wave_clus. Open the UI by running wave_clus in MATLAB and load the spike file (G[A-D][1-8]-<channel_name>_spikes.mat). Use the UI to reject or merge clusters, or change the temperature to redo the clustering of spikes. This creates an additional times file named times_manual_G[A-D][1-8]-<channel_name>.mat, with the following variables:
- cluster_class: the updated cluster index and timestamps of spikes.
- sortedBy: who performed the manual spike sorting and when.
>> Note: you need both the spike file and the times file to run wave_clus. These files are not modified by wave_clus, so you only need to upload the times_manual_* file to the server.
Extract LFP
The output of LFP extraction has the following structure:
Experiment-<exp_id1>–<exp_id2>-…
– lfpTimeStamps.mat
– LFP_micro
  – G[A-D][1-8]-<channel_name>_lfp.mat
  – G[A-D][1-8]-<channel_name>_001.png
– LFP_macro
  – <channel_name>_lfp.mat
G[A-D][1-8]-<channel_name>_lfp.mat
- lfp: the local field potential, downsampled to 2000 Hz.
- lfpTimestamps: the timestamps of the LFP (as the LFP is regularly sampled with a fixed sampling interval, we plan to replace this with the sampling frequency, FS, in the future).
- timestampStart: the start time (UNIX time in seconds) of the experiment (or of the first of the combined experiments).
The PNG file in LFP_micro shows the distribution of the gaps left by removed spikes, which are filled by linear interpolation. The variables for macro LFP are the same as in the micro LFP file.
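The idea of filling spike gaps by linear interpolation can be sketched as follows. This is an illustrative Python version, not the pipeline's implementation. It assumes each gap is given as an inclusive (first, last) pair of sample indices, that gaps do not touch each other, and that no gap touches either end of the signal. The function name is hypothetical.

```python
def fill_gaps_linear(signal, gaps):
    """Replace samples inside each gap with a straight line.

    signal -- sequence of samples
    gaps   -- list of (first, last) inclusive sample indices to replace
              (ASSUMPTION: gaps are separated and away from the edges)
    """
    out = list(signal)
    n = len(out)
    for first, last in gaps:
        left, right = first - 1, last + 1  # nearest samples outside the gap
        if left < 0 or right >= n or first > last:
            raise ValueError(f"gap ({first}, {last}) cannot be interpolated")
        span = right - left
        for i in range(first, last + 1):
            frac = (i - left) / span
            # Anchor on the original samples, not on already-filled values.
            out[i] = signal[left] + frac * (signal[right] - signal[left])
    return out
```

For instance, `fill_gaps_linear([0.0, 9.0, 9.0, 9.0, 4.0], [(1, 3)])` replaces the three middle samples with values on the line from 0.0 to 4.0.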
lfpTimeStamps.mat
The LFP timestamps are identical for micro and macro channels with the same sampling frequency (2000 Hz). For data with multiple experiments combined or experiments with multiple segments (usually caused by a pause during the experiment), the gaps between experiments/segments are filled with NaNs that cover the whole duration. The lfpTimeStamps.mat file has the following variables:
- timestamps: a vector of [start time, sampling interval, end time]. Use `(timestamps(1): timestamps(2): timestamps(3))` to recover the full timestamps.
- timestampStart: the start time of the experiment, as UNIX time (in seconds).
>> Note: all times are in seconds in the lfpTimeStamps.mat file.
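For readers working in Python, the compact [start time, sampling interval, end time] vector can be expanded in a way that mirrors the MATLAB colon expression above. This is a minimal sketch with a hypothetical function name. It builds each value as start + i * interval rather than accumulating the interval repeatedly, which avoids floating-point drift, and the small tolerance on the sample count is our choice, not something the pipeline specifies.

```python
import math


def expand_timestamps(compact, tol=1e-6):
    """Expand [start, interval, end] (all in seconds) into full timestamps."""
    start, step, end = (float(x) for x in compact)
    if step <= 0:
        raise ValueError("sampling interval must be positive")
    if end < start:
        return []
    # Number of whole steps that fit between start and end; tol absorbs
    # floating-point error in the division.
    n_steps = int(math.floor((end - start) / step + tol))
    return [start + i * step for i in range(n_steps + 1)]
```

At 2000 Hz the interval is 0.0005 s, so `expand_timestamps([0.0, 0.0005, 0.002])` yields five timestamps. For long recordings a NumPy array built as `start + step * numpy.arange(n_steps + 1)` is the more memory-efficient equivalent.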
Export NWB
Finally, the processed data are exported to an HDF5 file in NWB (Neurodata Without Borders) format, with the following information:
- subject metadata
- trial
- electrode table
- spike time
- mean spike waveform
- micro LFP
- macro LFP
All the above data will be packed into a single file in the directory:
Experiment-<exp_id1>–<exp_id2>-…
– nwb
  – ecephys.nwb
This file can be written with pynwb or matnwb. For details on reading an NWB file, see this demo.
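As a starting point for reading the exported file in Python, the sketch below opens an NWB file with pynwb and summarizes what it contains. The function name is hypothetical. This document does not say where the pipeline stores the micro and macro LFP inside the file, so the sketch only lists the names found under acquisition and processing rather than assuming a location.

```python
def summarize_nwb(path):
    """Return a small summary of the contents of an NWB file."""
    # pynwb is a third-party package; it is imported here so the sketch
    # can be loaded even where pynwb is not installed.
    from pynwb import NWBHDF5IO

    def table_length(table):
        # Optional tables (electrodes, units, trials) are None when absent.
        return len(table) if table is not None else 0

    with NWBHDF5IO(path, mode="r") as io:
        nwbfile = io.read()
        subject = nwbfile.subject
        return {
            "subject_id": subject.subject_id if subject is not None else None,
            "n_electrodes": table_length(nwbfile.electrodes),
            "n_units": table_length(nwbfile.units),
            "n_trials": table_length(nwbfile.trials),
            # ASSUMPTION: LFP may live in either container; list both.
            "acquisition": sorted(nwbfile.acquisition.keys()),
            "processing": sorted(nwbfile.processing.keys()),
        }
```

Call it as `summarize_nwb("path/to/nwb/ecephys.nwb")`, with the path adjusted to your experiment directory. The summary is built inside the `with` block because NWB data are read lazily and the file must still be open while they are accessed.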