Spike Sorting: pipeline version 1¶
This is a tutorial for Spyglass spike sorting pipeline version 1 (V1). This pipeline coexists with version 0 but differs in that:
- it stores more of the intermediate results (e.g. filtered and referenced recording) in the NWB format
- it has more streamlined curation pipelines
- it uses UUIDs as the primary key for important tables (e.g. SpikeSorting) to reduce the number of keys that make up the composite primary key

The output of both versions of the pipeline is saved in a merge table called SpikeSortingOutput.
To start, connect to the database. See instructions in Setup.
import os
import datajoint as dj
import numpy as np
# change to the upper level folder to detect dj_local_conf.json
if os.path.basename(os.getcwd()) == "notebooks":
    os.chdir("..")
dj.config["enable_python_native_blobs"] = True
dj.config.load("dj_local_conf.json") # load config for database connection info
Insert data and populate prerequisite tables¶
First, import the pipeline and other necessary modules.
import spyglass.common as sgc
import spyglass.spikesorting.v1 as sgs
import spyglass.data_import as sgi
[2024-04-19 10:57:17,965][INFO]: Connecting sambray@lmf-db.cin.ucsf.edu:3306 [2024-04-19 10:57:17,985][INFO]: Connected sambray@lmf-db.cin.ucsf.edu:3306
We will be using minirec20230622.nwb as our example. As usual, first insert the NWB file into Session (skip if you have already done so).
nwb_file_name = "minirec20230622.nwb"
nwb_file_name2 = "minirec20230622_.nwb"
sgi.insert_sessions(nwb_file_name)
sgc.Session() & {"nwb_file_name": nwb_file_name2}
/home/sambray/Documents/spyglass/src/spyglass/data_import/insert_sessions.py:58: UserWarning: Cannot insert data from minirec20230622.nwb: minirec20230622_.nwb is already in Nwbfile table. warnings.warn(
nwb_file_name name of the NWB file | subject_id | institution_name | lab_name | session_id | session_description | session_start_time | timestamps_reference_time | experiment_description |
---|---|---|---|---|---|---|---|---|
minirec20230622_.nwb | 54321 | UCSF | Loren Frank Lab | 12345 | test yaml insertion | 2023-06-22 15:59:58 | 1970-01-01 00:00:00 | Test Conversion |
Total: 1
All spikesorting results are linked to a team name from the LabTeam table. If you haven't already inserted a team for your project, do so here.
# Make a lab team if it doesn't already exist, otherwise insert yourself into the team
team_name = "My Team"
if not sgc.LabTeam() & {"team_name": team_name}:
    sgc.LabTeam().create_new_team(
        team_name=team_name,  # Should be unique
        team_members=[],
        team_description="test",  # Optional
    )
Define sort groups and extract recordings¶
Each NWB file will have multiple electrodes we can use for spike sorting. We commonly group electrodes into a SortGroup by the tetrode or probe shank they were on. Electrodes in the same sort group will then be sorted together.
sgs.SortGroup.set_group_by_shank(nwb_file_name=nwb_file_name2)
The next step is to filter and reference the recording so that we isolate the spike band data. This is done by combining the data with the parameters in SpikeSortingRecordingSelection. To insert into this table, use the insert_selection method, which automatically generates a UUID for the recording.
# define and insert a key for each sort group and interval you want to sort
key = {
"nwb_file_name": nwb_file_name2,
"sort_group_id": 0,
"preproc_param_name": "default",
"interval_list_name": "01_s1",
"team_name": "My Team",
}
sgs.SpikeSortingRecordingSelection.insert_selection(key)
{'nwb_file_name': 'minirec20230622_.nwb', 'sort_group_id': 0, 'preproc_param_name': 'default', 'interval_list_name': '01_s1', 'team_name': 'My Team', 'recording_id': UUID('3450db49-28d5-4942-aa37-7c19126d16db')}
Next we will call the populate method of SpikeSortingRecording.
# Assuming 'key' is a dictionary with fields that you want to include in 'ssr_key'
ssr_key = {
"recording_id": (sgs.SpikeSortingRecordingSelection() & key).fetch1(
"recording_id"
),
} | key
ssr_pk = (sgs.SpikeSortingRecordingSelection & key).proj()
sgs.SpikeSortingRecording.populate(ssr_pk)
sgs.SpikeSortingRecording() & ssr_key
[10:57:43][INFO] Spyglass: Writing new NWB file minirec20230622_PTCFX77XOI.nwb /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/hdmf/build/objectmapper.py:668: MissingRequiredBuildWarning: NWBFile 'root' is missing required value for attribute 'source_script_file_name'. warnings.warn(msg, MissingRequiredBuildWarning)
recording_id | analysis_file_name name of the file | object_id Object ID for the processed recording in NWB file |
---|---|---|
3450db49-28d5-4942-aa37-7c19126d16db | minirec20230622_PTCFX77XOI.nwb | 15592178-c317-4112-bfa6-b0943542e507 |
Total: 1
key = (sgs.SpikeSortingRecordingSelection & key).fetch1()
Artifact Detection¶
Sometimes the recording may contain artifacts that can confound spike sorting. For example, we often have artifacts when the animal licks the reward well for milk during behavior. These appear as sharp transients across all channels, and sometimes they are not adequately removed by filtering and referencing. We will identify the periods during which this type of artifact appears and set them to zero so that they won't interfere with spike sorting.
sgs.ArtifactDetectionSelection.insert_selection(
{"recording_id": key["recording_id"], "artifact_param_name": "default"}
)
sgs.ArtifactDetection.populate()
[10:57:52][INFO] Spyglass: Using 4 jobs...
detect_artifact_frames: 0%| | 0/2 [00:00<?, ?it/s]
[10:57:53][WARNING] Spyglass: No artifacts detected.
sgs.ArtifactDetection()
artifact_id |
---|
0058dab4-41c1-42b1-91f4-5773f2ad36cc |
01b39d37-3ff8-4907-9da6-9fec9baf87b5 |
035f0bae-80b3-4ce9-a767-94d336f36283 |
038ee778-6cf1-4e99-ab80-e354db5170c9 |
03e9768d-d101-4f56-abf9-5b0e3e1803b7 |
0490c820-c381-43b6-857e-f463147723ff |
04a289c6-9e19-486a-a4cb-7e9638af225a |
06dd7922-7042-4023-bebf-da1dacb0b6c7 |
07036486-e9f5-4dba-8662-7fb5ff2a6711 |
070ed448-a52d-478e-9102-0d04a6ed0b96 |
07a65788-bb89-48f3-90ea-4ab1add06eae |
0a6611b3-c593-4900-a715-66bb1396940e |
...
Total: 151
The output of ArtifactDetection is actually stored in IntervalList, because it is another type of interval; the UUID, however, can be found in both tables.
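To illustrate that convention: the artifact_id is reused verbatim, as a string, as the interval_list_name, so the same UUID identifies the entry in either table. A minimal sketch with a stand-in UUID (not the tutorial's actual artifact_id):

```python
import uuid

# Stand-in for an artifact_id generated by ArtifactDetectionSelection.
artifact_id = uuid.uuid4()

# The artifact-free times are stored in IntervalList under the stringified UUID,
# so the corresponding entry could be looked up with, e.g.:
# sgc.IntervalList & {"interval_list_name": str(artifact_id)}
interval_list_name = str(artifact_id)

# The string round-trips back to the same UUID.
assert uuid.UUID(interval_list_name) == artifact_id
```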
Run Spike Sorting¶
Now that we have prepared the recording, we will pair it with a spike sorting algorithm and associated parameters. This is inserted into SpikeSortingSelection, again via the insert_selection method.
The spike sorting pipeline is powered by spikeinterface, a community-developed Python package that makes it easy to apply multiple spike sorters to a single recording. Some spike sorters have special requirements, such as a GPU, and others need to be installed separately from spyglass. In the Frank lab we have been using mountainsort4, though the pipeline has been tested with mountainsort5, kilosort2_5, kilosort3, and ironclust as well.
When using mountainsort5, make sure to run pip install mountainsort5 first. kilosort2_5, kilosort3, and ironclust are MATLAB-based, but thanks to spikeinterface we can run them without installing MATLAB. This does require downloading additional files (as singularity containers), so make sure to pip install spython. These sorters also require GPU access, so pip install cuda-python as well (and make sure your computer does have a GPU).
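The optional installs mentioned above, collected in one place (run only the lines relevant to your chosen sorter):

```shell
pip install mountainsort5   # only if sorting with mountainsort5
pip install spython         # singularity support for the MATLAB-based sorters
pip install cuda-python     # GPU support for kilosort2_5 / kilosort3 / ironclust
```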
sorter = "mountainsort4"
common_key = {
"recording_id": key["recording_id"],
"sorter": sorter,
"nwb_file_name": nwb_file_name2,
"interval_list_name": str(
(
sgs.ArtifactDetectionSelection
& {"recording_id": key["recording_id"]}
).fetch1("artifact_id")
),
}
if sorter == "mountainsort4":
    key = {
        **common_key,
        "sorter_param_name": "franklab_tetrode_hippocampus_30KHz",
    }
else:
    key = {
        **common_key,
        "sorter_param_name": "default",
    }
sgs.SpikeSortingSelection.insert_selection(key)
sgs.SpikeSortingSelection() & key
sorting_id | recording_id | sorter | sorter_param_name | nwb_file_name name of the NWB file | interval_list_name descriptive name of this interval list |
---|---|---|---|---|---|
16cbb873-052f-44f3-9f4d-89af3544915e | 3450db49-28d5-4942-aa37-7c19126d16db | mountainsort4 | franklab_tetrode_hippocampus_30KHz | minirec20230622_.nwb | f03513af-bff8-4732-a6ab-e53f0550e7b0 |
Total: 1
Once SpikeSortingSelection is populated, let's run SpikeSorting.populate.
sss_pk = (sgs.SpikeSortingSelection & key).proj()
sgs.SpikeSorting.populate(sss_pk)
Mountainsort4 use the OLD spikeextractors mapped with NewToOldRecording
[10:58:17][INFO] Spyglass: Writing new NWB file minirec20230622_PP6Y10VW0V.nwb /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/hdmf/build/objectmapper.py:668: MissingRequiredBuildWarning: NWBFile 'root' is missing required value for attribute 'source_script_file_name'. warnings.warn(msg, MissingRequiredBuildWarning) /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/hash.py:39: ResourceWarning: unclosed file <_io.BufferedReader name='/stelmo/nwb/analysis/minirec20230622/minirec20230622_PP6Y10VW0V.nwb'> return uuid_from_stream(Path(filepath).open("rb"), init_string=init_string) ResourceWarning: Enable tracemalloc to get the object allocation traceback /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/external.py:276: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty. if check_hash: /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/tempfile.py:821: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/stelmo/nwb/tmp/tmpa7_uli3g'> _warnings.warn(warn_message, ResourceWarning)
The spike sorting results (spike times of detected units) are saved in an NWB file. We can access them in two ways. First, the fetch_nwb method gives direct access to the spike times saved in the units table of the NWB file. Second, we can retrieve them as a spikeinterface.NWBSorting object, which lets us take advantage of the rich spikeinterface APIs to further analyze the sorting.
sorting_nwb = (sgs.SpikeSorting & key).fetch_nwb()
sorting_si = sgs.SpikeSorting.get_sorting(key)
/home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/hash.py:39: ResourceWarning: unclosed file <_io.BufferedReader name='/stelmo/nwb/analysis/minirec20230622/minirec20230622_PP6Y10VW0V.nwb'> return uuid_from_stream(Path(filepath).open("rb"), init_string=init_string) ResourceWarning: Enable tracemalloc to get the object allocation traceback
Note that the spike times returned by fetch_nwb are in units of seconds, aligned with the timestamps of the recording, whereas the spike times of the spikeinterface.NWBSorting object are in units of samples (as is generally true for sorting objects in spikeinterface).
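As a plain-numpy illustration of that unit convention (the sampling rate and start time below are made-up values, not from minirec20230622):

```python
import numpy as np

# Converting spike times in samples (spikeinterface convention)
# to seconds aligned with the recording timestamps (fetch_nwb convention).
sampling_frequency = 30_000.0  # Hz; assumed for illustration
t_start = 1_687_460_398.0  # recording start in seconds; made-up value

spike_frames = np.array([0, 30_000, 45_000])  # sample indices
spike_times_sec = t_start + spike_frames / sampling_frequency

# 30,000 samples at 30 kHz is exactly 1 s after the recording start.
assert np.allclose(spike_times_sec - t_start, [0.0, 1.0, 1.5])
```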
Automatic Curation¶
The next step is to curate the results of spike sorting. This is often necessary because spike sorting algorithms are not perfect:
they often return clusters that are clearly not biological in origin, and they sometimes oversplit clusters that should have been merged.
We have two main ways of curating spike sorting: computing quality metrics followed by thresholding, and manually applying curation labels.
To do either, we first insert the spike sorting into CurationV1 using the insert_curation method.
sgs.SpikeSortingRecording & key
sgs.CurationV1.insert_curation(
sorting_id=(
sgs.SpikeSortingSelection & {"recording_id": key["recording_id"]}
).fetch1("sorting_id"),
description="testing sort",
)
[10:58:32][INFO] Spyglass: Writing new NWB file minirec20230622_SYPH1SYT75.nwb /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/hdmf/build/objectmapper.py:668: MissingRequiredBuildWarning: NWBFile 'root' is missing required value for attribute 'source_script_file_name'. warnings.warn(msg, MissingRequiredBuildWarning) /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/hash.py:39: ResourceWarning: unclosed file <_io.BufferedReader name='/stelmo/nwb/analysis/minirec20230622/minirec20230622_SYPH1SYT75.nwb'> return uuid_from_stream(Path(filepath).open("rb"), init_string=init_string) ResourceWarning: Enable tracemalloc to get the object allocation traceback /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/external.py:276: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty. if check_hash:
{'sorting_id': UUID('16cbb873-052f-44f3-9f4d-89af3544915e'), 'curation_id': 0, 'parent_curation_id': -1, 'analysis_file_name': 'minirec20230622_SYPH1SYT75.nwb', 'object_id': '3e4f927b-716f-4dd8-9c98-acd132d758fb', 'merges_applied': False, 'description': 'testing sort'}
sgs.CurationV1()
sorting_id | curation_id | parent_curation_id | analysis_file_name name of the file | object_id | merges_applied | description |
---|---|---|---|---|---|---|
021fb85a-992f-4360-99c7-e2da32c5b9cb | 0 | -1 | BS2820231107_8Z8CLG184Z.nwb | 37ee7365-028f-46e1-8351-1cd402a7b36c | 0 | testing sort |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 1 | 0 | BS2820231107_HPIQR9LZWU.nwb | 538032a5-5d29-4cb8-b0a2-7224fee6d8ce | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 2 | 0 | BS2820231107_SVW8YK84IP.nwb | ed440315-7302-4217-be15-087c7efeda7e | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 3 | 0 | BS2820231107_7CWR2JR68B.nwb | 0d8be667-2831-4e99-8c9b-54102de48e85 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 4 | 0 | BS2820231107_1PCRTB2UZ2.nwb | 9f9e9a1e-9be3-405c-9c66-4bf6dc54d4d9 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 5 | 0 | BS2820231107_4NPZ4YTASV.nwb | 89170a28-487a-4787-83dd-18009c446700 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 6 | 0 | BS2820231107_MMSIJ8YQ54.nwb | c9fb8c88-6449-4d9a-a40a-cd10dcdc193f | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 7 | 0 | BS2820231107_LZJWQPP1YW.nwb | f078e3bb-92fc-4e7f-b3a8-32936a90e057 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 8 | 0 | BS2820231107_RJ7DLUKOIG.nwb | c311fbfb-cd3d-4d92-b535-b5da3d4a6ec3 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 9 | 0 | BS2820231107_6ZJP5NRCX9.nwb | a54ee3f8-851a-4dca-bb46-7673e2807462 | 0 | after metric curation |
03dc29a5-febe-4a59-ab61-21a25dea3625 | 0 | -1 | j1620210710_EOE1VZ4YAX.nwb | 52889e86-c249-4916-9576-a9ccf7f48dbe | 0 | |
061ba57b-d2cb-4052-b375-42ba13684e41 | 0 | -1 | BS2820231107_S21IIVRCZA.nwb | 5d71500d-1065-4610-b3a6-746821d0f438 | 0 | testing sort |
...
Total: 626
We will first do an automatic curation based on quality metrics. Under the hood, this part again makes use of spikeinterface. Some of the quality metrics that we often compute are the nearest-neighbor isolation and noise overlap metrics, as well as SNR and ISI violation rate. To compute some of these metrics, the waveforms must be extracted and projected onto a feature space. Thus here we set the parameters for waveform extraction as well as the rules for curating units based on these metrics (e.g. if nn_noise_overlap is greater than 0.1, mark the unit as noise).
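The thresholding rule in parentheses can be sketched in plain Python (illustrative metric values only; the real work is done by MetricCuration using the parameters we select next):

```python
# Made-up nn_noise_overlap values for three hypothetical units.
nn_noise_overlap = {1: 0.02, 2: 0.35, 3: 0.08}

# Units whose noise overlap exceeds 0.1 get labeled "noise".
labels = {
    unit_id: ["noise"] if value > 0.1 else []
    for unit_id, value in nn_noise_overlap.items()
}

assert labels == {1: [], 2: ["noise"], 3: []}
```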
key = {
"sorting_id": (
sgs.SpikeSortingSelection & {"recording_id": key["recording_id"]}
).fetch1("sorting_id"),
"curation_id": 0,
"waveform_param_name": "default_not_whitened",
"metric_param_name": "franklab_default",
"metric_curation_param_name": "default",
}
sgs.MetricCurationSelection.insert_selection(key)
sgs.MetricCurationSelection() & key
metric_curation_id | sorting_id | curation_id | waveform_param_name name of waveform extraction parameters | metric_param_name | metric_curation_param_name |
---|---|---|---|---|---|
5bd75cd5-cc2e-41dd-9056-5d62fa46021a | 16cbb873-052f-44f3-9f4d-89af3544915e | 0 | default_not_whitened | franklab_default | default |
Total: 1
sgs.MetricCuration.populate(key)
sgs.MetricCuration() & key
metric_curation_id | analysis_file_name name of the file | object_id Object ID for the metrics in NWB file |
---|---|---|
5bd75cd5-cc2e-41dd-9056-5d62fa46021a | minirec20230622_PVSMM7XHHJ.nwb | 01b58a59-1b49-4bd1-a204-16fb09d67b2a |
Total: 1
To do another round of curation, fetch the relevant info and insert it back into CurationV1 using insert_curation.
key = {
"metric_curation_id": (
sgs.MetricCurationSelection & {"sorting_id": key["sorting_id"]}
).fetch1("metric_curation_id")
}
labels = sgs.MetricCuration.get_labels(key)
merge_groups = sgs.MetricCuration.get_merge_groups(key)
metrics = sgs.MetricCuration.get_metrics(key)
sgs.CurationV1.insert_curation(
sorting_id=(
sgs.MetricCurationSelection
& {"metric_curation_id": key["metric_curation_id"]}
).fetch1("sorting_id"),
parent_curation_id=0,
labels=labels,
merge_groups=merge_groups,
metrics=metrics,
description="after metric curation",
)
[11:08:29][INFO] Spyglass: Writing new NWB file minirec20230622_ZCMODPF1NM.nwb /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/hdmf/build/objectmapper.py:668: MissingRequiredBuildWarning: NWBFile 'root' is missing required value for attribute 'source_script_file_name'. warnings.warn(msg, MissingRequiredBuildWarning) /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/hash.py:39: ResourceWarning: unclosed file <_io.BufferedReader name='/stelmo/nwb/analysis/minirec20230622/minirec20230622_ZCMODPF1NM.nwb'> return uuid_from_stream(Path(filepath).open("rb"), init_string=init_string) ResourceWarning: Enable tracemalloc to get the object allocation traceback /home/sambray/mambaforge-pypy3/envs/spyglass/lib/python3.9/site-packages/datajoint/external.py:276: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty. if check_hash:
{'sorting_id': UUID('16cbb873-052f-44f3-9f4d-89af3544915e'), 'curation_id': 1, 'parent_curation_id': 0, 'analysis_file_name': 'minirec20230622_ZCMODPF1NM.nwb', 'object_id': 'c43cd7ab-e5bd-4528-a0e5-0ca7c337a72d', 'merges_applied': False, 'description': 'after metric curation'}
sgs.CurationV1()
sorting_id | curation_id | parent_curation_id | analysis_file_name name of the file | object_id | merges_applied | description |
---|---|---|---|---|---|---|
021fb85a-992f-4360-99c7-e2da32c5b9cb | 0 | -1 | BS2820231107_8Z8CLG184Z.nwb | 37ee7365-028f-46e1-8351-1cd402a7b36c | 0 | testing sort |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 1 | 0 | BS2820231107_HPIQR9LZWU.nwb | 538032a5-5d29-4cb8-b0a2-7224fee6d8ce | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 2 | 0 | BS2820231107_SVW8YK84IP.nwb | ed440315-7302-4217-be15-087c7efeda7e | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 3 | 0 | BS2820231107_7CWR2JR68B.nwb | 0d8be667-2831-4e99-8c9b-54102de48e85 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 4 | 0 | BS2820231107_1PCRTB2UZ2.nwb | 9f9e9a1e-9be3-405c-9c66-4bf6dc54d4d9 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 5 | 0 | BS2820231107_4NPZ4YTASV.nwb | 89170a28-487a-4787-83dd-18009c446700 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 6 | 0 | BS2820231107_MMSIJ8YQ54.nwb | c9fb8c88-6449-4d9a-a40a-cd10dcdc193f | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 7 | 0 | BS2820231107_LZJWQPP1YW.nwb | f078e3bb-92fc-4e7f-b3a8-32936a90e057 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 8 | 0 | BS2820231107_RJ7DLUKOIG.nwb | c311fbfb-cd3d-4d92-b535-b5da3d4a6ec3 | 0 | after metric curation |
021fb85a-992f-4360-99c7-e2da32c5b9cb | 9 | 0 | BS2820231107_6ZJP5NRCX9.nwb | a54ee3f8-851a-4dca-bb46-7673e2807462 | 0 | after metric curation |
03dc29a5-febe-4a59-ab61-21a25dea3625 | 0 | -1 | j1620210710_EOE1VZ4YAX.nwb | 52889e86-c249-4916-9576-a9ccf7f48dbe | 0 | |
061ba57b-d2cb-4052-b375-42ba13684e41 | 0 | -1 | BS2820231107_S21IIVRCZA.nwb | 5d71500d-1065-4610-b3a6-746821d0f438 | 0 | testing sort |
...
Total: 627
Manual Curation (Optional)¶
Next we will do manual curation. This is done with figurl. To incorporate info from other stages of processing (e.g. metrics), we have to store it with kachery cloud and get a curation URI referring to it. This can be done with generate_curation_uri.
Note: This step depends on setting up a kachery sharing system as described in 02_Data_Sync.ipynb and will likely not work correctly on the spyglass-demo server.
curation_uri = sgs.FigURLCurationSelection.generate_curation_uri(
{
"sorting_id": (
sgs.MetricCurationSelection
& {"metric_curation_id": key["metric_curation_id"]}
).fetch1("sorting_id"),
"curation_id": 1,
}
)
key = {
"sorting_id": (
sgs.MetricCurationSelection
& {"metric_curation_id": key["metric_curation_id"]}
).fetch1("sorting_id"),
"curation_id": 1,
"curation_uri": curation_uri,
"metrics_figurl": list(metrics.keys()),
}
sgs.FigURLCurationSelection()
sgs.FigURLCurationSelection.insert_selection(key)
sgs.FigURLCurationSelection()
sgs.FigURLCuration.populate()
sgs.FigURLCuration()
Or you can manually specify the curation URI if you already have a curation.json:
gh_curation_uri = (
"gh://LorenFrankLab/sorting-curations/main/khl02007/test/curation.json"
)
key = {
"sorting_id": key["sorting_id"],
"curation_id": 1,
"curation_uri": gh_curation_uri,
"metrics_figurl": [],
}
sgs.FigURLCurationSelection.insert_selection(key)
sgs.FigURLCuration.populate()
sgs.FigURLCuration()
Once you have applied manual curation (curation labels and merge groups), you can store it as NWB by inserting another row into CurationV1, and then do more rounds of curation if you want.
labels = sgs.FigURLCuration.get_labels(gh_curation_uri)
merge_groups = sgs.FigURLCuration.get_merge_groups(gh_curation_uri)
sgs.CurationV1.insert_curation(
sorting_id=key["sorting_id"],
parent_curation_id=1,
labels=labels,
merge_groups=merge_groups,
metrics=metrics,
description="after figurl curation",
)
sgs.CurationV1()
Downstream usage (Merge table)¶
Regardless of the curation method used, to make use of spike sorting results in downstream pipelines like Decoding, we need to insert them into the SpikeSortingOutput merge table.
from spyglass.spikesorting.spikesorting_merge import SpikeSortingOutput
SpikeSortingOutput()
merge_id | source |
---|---|
0001a1ab-7c2b-1085-2062-53c0338ffe22 | CuratedSpikeSorting |
000c5d0b-1c4c-55d1-ccf6-5808f57152d3 | CuratedSpikeSorting |
0015e01d-0dc0-ca2c-1f5c-2178fa2c7f1e | CuratedSpikeSorting |
001628b1-0af1-7c74-a211-0e5c158ba10f | CuratedSpikeSorting |
001783f0-c5da-98c2-5b2a-63f1334c0a43 | CuratedSpikeSorting |
0020b039-6a2d-1d68-6585-4866fb7ea266 | CuratedSpikeSorting |
002be77b-38a6-fff8-cb48-a81e20ccb51b | CuratedSpikeSorting |
002da11c-2d16-a6dc-0468-980674ca12b0 | CuratedSpikeSorting |
003bf29a-fa09-05be-5cac-b7ea70a48c0c | CuratedSpikeSorting |
003cabf2-c471-972a-4b18-63d4ab7e1b8b | CuratedSpikeSorting |
004d99c6-1b2e-1696-fc85-e78ac5cc7e6b | CuratedSpikeSorting |
004faf9a-72cb-4416-ae13-3f85d538604f | CuratedSpikeSorting |
...
Total: 8684
# insert the automatic curation spikesorting results
curation_key = sss_pk.fetch1("KEY")
curation_key["curation_id"] = 1
merge_insert_key = (sgs.CurationV1 & curation_key).fetch("KEY", as_dict=True)
SpikeSortingOutput.insert(merge_insert_key, part_name="CurationV1")
SpikeSortingOutput.merge_view()
*merge_id *source *sorting_id *curation_id *nwb_file_name *sort_group_id *sort_interval *preproc_param *team_name *sorter *sorter_params *artifact_remo +------------+ +------------+ +------------+ +------------+ +------------+ +------------+ +------------+ +------------+ +-----------+ +--------+ +------------+ +------------+ d76584f8-0969- CurationV1 03dc29a5-febe- 0 None 0 None None None None None None 33d71671-63e5- CurationV1 090377fb-72b7- 0 None 0 None None None None None None dfa87e8e-c5cf- CurationV1 0cf93833-6a14- 0 None 0 None None None None None None a6cc0a23-7e29- CurationV1 110e27f6-5ffa- 0 None 0 None None None None None None 7f8841a6-5e27- CurationV1 16cbb873-052f- 1 None 0 None None None None None None 91e8e8d8-1568- CurationV1 21bea0ea-3084- 0 None 0 None None None None None None 218c17c7-8a4c- CurationV1 21bea0ea-3084- 1 None 0 None None None None None None 25823222-85ed- CurationV1 2484ee5d-0819- 0 None 0 None None None None None None 5ae79d97-6a99- CurationV1 3046a016-1613- 0 None 0 None None None None None None 869072e1-76d6- CurationV1 41a13836-e128- 0 None 0 None None None None None None a0771d6c-fc9d- CurationV1 4bc61e94-5bf9- 0 None 0 None None None None None None ed70dacb-a637- CurationV1 5d15f94e-d53d- 0 None 0 None None None None None None ... (Total: 0)
Finding the merge ids corresponding to an interpretable restriction such as nwb_file_name or interval_list_name can require several join steps with upstream tables. To simplify this process, we can use the included helper function SpikeSortingOutput().get_restricted_merge_ids() to perform the necessary joins and return the matching merge ids.
selection_key = {
"nwb_file_name": nwb_file_name2,
"sorter": "mountainsort4",
"interval_list_name": "01_s1",
"curation_id": 0,
} # this function can use restrictions from throughout the spikesorting pipeline
spikesorting_merge_ids = SpikeSortingOutput().get_restricted_merge_ids(
selection_key, as_dict=True
)
spikesorting_merge_ids
[13:34:12][WARNING] Spyglass: V0 requires artifact restrict. Ignoring "restrict_by_artifact" flag.
[{'merge_id': UUID('74c006e8-dcfe-e994-7b40-73f8d9f75b85')}]
With the spikesorting merge_ids we want, we can also use the get_sort_group_info method to get a table linking each merge id to the electrode group it is sourced from. This can be helpful for restricting to just the electrodes from a brain area of interest.
merge_keys = [{"merge_id": str(id)} for id in spikesorting_merge_ids]
SpikeSortingOutput().get_sort_group_info(merge_keys)
merge_id | nwb_file_name name of the NWB file | electrode_group_name electrode group name from NWBFile | electrode_id the unique number for this electrode | curation_id a number correponding to the index of this curation | sort_group_id identifier for a group of electrodes | sort_interval_name name for this interval | preproc_params_name | team_name | sorter | sorter_params_name | artifact_removed_interval_list_name | region_id | probe_id | probe_shank shank number within probe | probe_electrode electrode | name unique label for each contact | original_reference_electrode the configured reference electrode for this electrode | x the x coordinate of the electrode position in the brain | y the y coordinate of the electrode position in the brain | z the z coordinate of the electrode position in the brain | filtering description of the signal filtering | impedance electrode impedance | bad_channel if electrode is "good" or "bad" as observed during recording | x_warped x coordinate of electrode position warped to common template brain | y_warped y coordinate of electrode position warped to common template brain | z_warped z coordinate of electrode position warped to common template brain | contacts label of electrode contacts used for a bipolar signal - current workaround | analysis_file_name name of the file | units_object_id | region_name the name of the brain region | subregion_name subregion name | subsubregion_name subregion within subregion |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
662f3e35-c81e-546c-69c3-b3a2f5ed2776 | minirec20230622_.nwb | 0 | 0 | 1 | 0 | 01_s1_first9 | default_hippocampus | My Team | mountainsort4 | hippocampus_tutorial | minirec20230622_.nwb_01_s1_first9_0_default_hippocampus_none_artifact_removed_valid_times | 35 | tetrode_12.5 | 0 | 0 | 0 | 0 | 0.0 | 0.0 | 0.0 | None | 0.0 | False | 0.0 | 0.0 | 0.0 | minirec20230622_RXRSAFCGVJ.nwb | corpus callosum and associated subcortical white matter (cc-ec-cing-dwm) | None | None |
Total: 1