Clusterless Decoding¶
Overview¶
Developer Note: if you plan to make a PR in the future, be sure to copy this notebook and use the gitignore prefix `temp` to avoid future conflicts.
This is one notebook in a multi-part series on Spyglass.
- To set up your Spyglass environment and database, see the Setup notebook
- This tutorial assumes you've already extracted waveforms and loaded position data. For 1D decoding, the position data should also be linearized.
Clusterless decoding can be performed on either 1D or 2D data. We will start with 2D data.
Elements of Clusterless Decoding¶
- Position Data: This is the data that we want to decode. It can be 1D or 2D.
- Spike Waveform Features: These are the features that we will use to decode the position data.
- Decoding Model Parameters: This is how we define the model that we will use to decode the position data.
Grouping Data¶
An important concept will be groups. Groups are tables that allow us to specify collections of data. We will use groups in two situations here:
- We want to decode from more than one tetrode (or probe), so we will create a group that contains all of the tetrodes that we want to decode from.
- Similarly, we will create a group for the position data that we want to decode, so that we can decode from position data from multiple sessions.
Grouping Waveform Features¶
Let's start with grouping the waveform features. We will first inspect the waveform features that we have extracted to figure out the primary keys of the data that we want to decode from. We need to use the tables `SpikeSortingSelection` and `SpikeSortingOutput` to figure out the `merge_id` associated with `nwb_file_name`, which gets us the waveform features associated with the NWB file of interest.
from pathlib import Path
import datajoint as dj
dj.config.load(
Path("../dj_local_conf.json").absolute()
) # load config for database connection info
from spyglass.spikesorting.merge import SpikeSortingOutput
import spyglass.spikesorting.v1 as sgs
from spyglass.decoding.v1.waveform_features import UnitWaveformFeaturesSelection
nwb_copy_file_name = "mediumnwb20230802_.nwb"
sorter_keys = {
"nwb_file_name": nwb_copy_file_name,
"sorter": "clusterless_thresholder",
"sorter_param_name": "default_clusterless",
}
feature_key = {"features_param_name": "amplitude"}
(sgs.SpikeSortingSelection & sorter_keys) * SpikeSortingOutput.CurationV1 * (
UnitWaveformFeaturesSelection.proj(merge_id="spikesorting_merge_id")
& feature_key
)
[2024-01-17 22:43:09,216][INFO]: Connecting root@localhost:3306 [2024-01-17 22:43:09,293][INFO]: Connected root@localhost:3306
sorting_id | merge_id | features_param_name (a name for this set of parameters) | recording_id | sorter | sorter_param_name | nwb_file_name (name of the NWB file) | interval_list_name (descriptive name of this interval list) | curation_id |
---|---|---|---|---|---|---|---|---|
08a302b6-5505-40fa-b4d5-62162f8eef58 | 485a4ddf-332d-35b5-3ad4-0561736c1844 | amplitude | 449b64e3-db0b-437e-a1b9-0d29928aa2dd | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 45f6b9a1-eef3-46eb-866d-d0999afebda6 | 0 |
0ca508ee-af4c-4a89-8181-d48bd209bfd4 | 6acb99b8-6a0c-eb83-1141-5f603c5895e0 | amplitude | 328da21c-1d9c-41e2-9800-76b3484b707b | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 686d9951-1c0f-4d5e-9f5c-09e6fd8bdd4c | 0 |
209dc048-6fae-4315-b293-c06fff29f947 | f7237e18-4e73-4aee-805b-90735e9147de | amplitude | aff78f2f-2ba0-412a-95cc-447c3a2f4683 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 719e8a86-fcf1-4ffc-8c1f-ea912f67ad5d | 0 |
21a9a593-f6f3-4b82-99d7-8fc46556eff3 | 7e3fa66e-727e-1541-819a-b01309bb30ae | amplitude | 2402805a-04f9-4a88-9ccf-071376c8de19 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | d581b117-160e-4311-b096-7781a4de4394 | 0 |
406a20e3-5a9f-4fec-b046-a6561f72461e | 6d039a63-17ad-0b78-4b1e-f02d5f3dbbc5 | amplitude | f1427e00-2974-4301-b2ac-b4dc29277c51 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 0e848c38-9105-4ea4-b6ba-dbdd5b46a088 | 0 |
4131c51b-c56d-41fa-b046-46635fc17fd9 | e0e9133a-7a4e-1321-a43a-e8afcb2f25da | amplitude | 9e332d82-1daf-4e92-bb50-12e4f9430875 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 9ed11db5-c42e-491a-8caf-7d9a37a65f13 | 0 |
4c5a629a-71d9-481d-ab11-a4cb0fc16087 | 9959b614-2318-f597-6651-a3a82124d28a | amplitude | 3a2c3eed-413a-452a-83c8-0e4648141bde | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 2b9fbf14-74a0-4294-a805-26702340aac9 | 0 |
4d629c07-1931-4e1f-a3a8-cbf1b72161e3 | c0eb6455-fc41-c200-b62e-e3ca81b9a3f7 | amplitude | f07bc0b0-de6b-4424-8ef9-766213aaca26 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 5c68f0f0-f577-4905-8a09-e4d171d0a22d | 0 |
554a9a3c-0461-48be-8435-123eed59c228 | 912e250e-56d8-ee33-4525-c844d810971b | amplitude | 7f128981-6868-4976-ba20-248655dcac21 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | f4b9301f-bc91-455b-9474-c801093f3856 | 0 |
7bb007f2-26d3-463f-b7dc-7bd4d271725e | d7d2c97a-0e6e-d1b8-735c-d55dc66a30e1 | amplitude | a9b7cec0-1256-49cf-abf0-8c45fd155379 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 74270cba-36ee-4afb-ab50-2a6cc948e68c | 0 |
80e1f37f-48a7-4087-bd37-7a37b6a2c160 | abb92dce-4410-8f17-a501-a4104bda0dcf | amplitude | 3c40ebdc-0b61-4105-9971-e1348bd49bc7 | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | 0f91197e-bebb-4dc6-ad41-5bf89c3eed28 | 0 |
8848c4a8-a2f2-4f3d-82cd-51b13b8bae3c | 74e10781-1228-4075-0870-af224024ffdc | amplitude | 257c077b-8f3b-4abb-a631-6b8084d6a1ea | clusterless_thresholder | default_clusterless | mediumnwb20230802_.nwb | e289e03d-32ad-461a-a1cc-c88537343149 | 0 |
...
Total: 23
from spyglass.decoding.v1.waveform_features import UnitWaveformFeaturesSelection
spikesorting_merge_id = (
(sgs.SpikeSortingSelection & sorter_keys)
* SpikeSortingOutput.CurationV1
* (
UnitWaveformFeaturesSelection.proj(merge_id="spikesorting_merge_id")
& feature_key
)
).fetch("merge_id")
waveform_selection_keys = [
{"spikesorting_merge_id": merge_id, "features_param_name": "amplitude"}
for merge_id in spikesorting_merge_id
]
UnitWaveformFeaturesSelection & waveform_selection_keys
spikesorting_merge_id | features_param_name (a name for this set of parameters) |
---|---|
0751a1e1-a406-7f87-ae6f-ce4ffc60621c | amplitude |
485a4ddf-332d-35b5-3ad4-0561736c1844 | amplitude |
4a712103-c223-864f-82e0-6c23de79cc14 | amplitude |
4a72c253-b3ca-8c13-e615-736a7ebff35c | amplitude |
5c53bd33-d57c-fbba-e0fb-55e0bcb85d03 | amplitude |
614d796c-0b95-6364-aaa0-b6cb1e7bbb83 | amplitude |
6acb99b8-6a0c-eb83-1141-5f603c5895e0 | amplitude |
6d039a63-17ad-0b78-4b1e-f02d5f3dbbc5 | amplitude |
74e10781-1228-4075-0870-af224024ffdc | amplitude |
7e3fa66e-727e-1541-819a-b01309bb30ae | amplitude |
86897349-ff68-ac72-02eb-739dd88936e6 | amplitude |
8bbddc0f-d6ae-6260-9400-f884a6e25ae8 | amplitude |
...
Total: 23
We will create a group called `test_group` that contains all of the tetrodes that we want to decode from. We will use the `create_group` function to create this group. This function takes two arguments: the name of the group, and the keys of the tables that we want to include in the group.
from spyglass.decoding.v1.clusterless import UnitWaveformFeaturesGroup
UnitWaveformFeaturesGroup().create_group(
nwb_file_name=nwb_copy_file_name,
group_name="test_group",
keys=waveform_selection_keys,
)
UnitWaveformFeaturesGroup & {"waveform_features_group_name": "test_group"}
nwb_file_name (name of the NWB file) | waveform_features_group_name |
---|---|
mediumnwb20230802_.nwb | test_group |
Total: 1
We can see that we successfully associated `test_group` with the tetrodes that we want to decode from by querying the `UnitFeatures` part table.
UnitWaveformFeaturesGroup.UnitFeatures & {
"nwb_file_name": nwb_copy_file_name,
"waveform_features_group_name": "test_group",
}
nwb_file_name (name of the NWB file) | waveform_features_group_name | spikesorting_merge_id | features_param_name (a name for this set of parameters) |
---|---|---|---|
mediumnwb20230802_.nwb | test_group | 0751a1e1-a406-7f87-ae6f-ce4ffc60621c | amplitude |
mediumnwb20230802_.nwb | test_group | 485a4ddf-332d-35b5-3ad4-0561736c1844 | amplitude |
mediumnwb20230802_.nwb | test_group | 4a712103-c223-864f-82e0-6c23de79cc14 | amplitude |
mediumnwb20230802_.nwb | test_group | 4a72c253-b3ca-8c13-e615-736a7ebff35c | amplitude |
mediumnwb20230802_.nwb | test_group | 5c53bd33-d57c-fbba-e0fb-55e0bcb85d03 | amplitude |
mediumnwb20230802_.nwb | test_group | 614d796c-0b95-6364-aaa0-b6cb1e7bbb83 | amplitude |
mediumnwb20230802_.nwb | test_group | 6acb99b8-6a0c-eb83-1141-5f603c5895e0 | amplitude |
mediumnwb20230802_.nwb | test_group | 6d039a63-17ad-0b78-4b1e-f02d5f3dbbc5 | amplitude |
mediumnwb20230802_.nwb | test_group | 74e10781-1228-4075-0870-af224024ffdc | amplitude |
mediumnwb20230802_.nwb | test_group | 7e3fa66e-727e-1541-819a-b01309bb30ae | amplitude |
mediumnwb20230802_.nwb | test_group | 86897349-ff68-ac72-02eb-739dd88936e6 | amplitude |
mediumnwb20230802_.nwb | test_group | 8bbddc0f-d6ae-6260-9400-f884a6e25ae8 | amplitude |
...
Total: 23
Grouping Position Data¶
We will now create a group called `test_group` that contains all of the position data that we want to decode from. As before, we will use the `create_group` function, passing it the name of the group and the keys of the tables that we want to include.
We use the `PositionOutput` table to figure out the `merge_id` associated with `nwb_file_name`, which gets us the position data associated with the NWB file of interest. In this case, we only have one position entry to insert, but we could insert multiple entries if we wanted to decode from multiple sessions.
Note that the position data sampling frequency determines the time step of the decoding. In this case, the position data is sampled at 30 Hz, so the time step of the decoding will be 1/30 s. In practice, you will often want a smaller time step, e.g. 1/500 s (500 Hz), which lets you decode at a finer time scale. To do this, interpolate the position data to a higher sampling frequency, as shown in the position trodes notebook.
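As a rough illustration of what that upsampling does (a minimal sketch with plain pandas on synthetic data; in the actual pipeline you would configure this when processing position, as described in the position trodes notebook):
import numpy as np
import pandas as pd

# synthetic 30 Hz position samples (values are made up for illustration)
t_30hz = np.arange(0, 10, 1 / 30)
pos = pd.DataFrame(
    {"position_x": np.cos(t_30hz), "position_y": np.sin(t_30hz)},
    index=pd.Index(t_30hz, name="time"),
)

# target 500 Hz time base for decoding
t_500hz = np.arange(t_30hz[0], t_30hz[-1], 1 / 500)

# linearly interpolate position onto the finer grid
pos_500hz = (
    pos.reindex(pos.index.union(t_500hz))
    .interpolate(method="index")
    .loc[t_500hz]
)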
You will also want to specify the names of the position variables if they differ from the defaults, which are `position_x` and `position_y`.
from spyglass.position import PositionOutput
PositionOutput.TrodesPosV1 & {"nwb_file_name": nwb_copy_file_name}
merge_id | nwb_file_name (name of the NWB file) | interval_list_name (descriptive name of this interval list) | trodes_pos_params_name (name for this set of parameters) |
---|---|---|---|
a95d0105-de87-9c2f-85cf-f940e0490bee | mediumnwb20230802_.nwb | pos 0 valid times | default |
Total: 1
from spyglass.decoding.v1.core import PositionGroup
position_merge_ids = (
PositionOutput.TrodesPosV1
& {
"nwb_file_name": nwb_copy_file_name,
"interval_list_name": "pos 0 valid times",
"trodes_pos_params_name": "default",
}
).fetch("merge_id")
PositionGroup().create_group(
nwb_file_name=nwb_copy_file_name,
group_name="test_group",
keys=[{"pos_merge_id": merge_id} for merge_id in position_merge_ids],
)
PositionGroup & {
"nwb_file_name": nwb_copy_file_name,
"position_group_name": "test_group",
}
nwb_file_name (name of the NWB file) | position_group_name | position_variables (list of position variables to decode) |
---|---|---|
mediumnwb20230802_.nwb | test_group | =BLOB= |
Total: 1
(
PositionGroup
& {"nwb_file_name": nwb_copy_file_name, "position_group_name": "test_group"}
).fetch1("position_variables")
['position_x', 'position_y']
PositionGroup.Position & {
"nwb_file_name": nwb_copy_file_name,
"position_group_name": "test_group",
}
nwb_file_name (name of the NWB file) | position_group_name | pos_merge_id |
---|---|---|
mediumnwb20230802_.nwb | test_group | a95d0105-de87-9c2f-85cf-f940e0490bee |
Total: 1
Decoding Model Parameters¶
We will use the `non_local_detector` package to decode the data. This package is highly flexible and allows several different types of models to be used. In this case, we will use the `ContFragClusterlessClassifier` to decode the data. It has two discrete states, Continuous and Fragmented, which correspond to different types of movement models. To read more about this model, see:
Denovellis, E.L., Gillespie, A.K., Coulter, M.E., Sosa, M., Chung, J.E., Eden, U.T., and Frank, L.M. (2021). Hippocampal replay of experience at real-world speeds. eLife 10, e64505. 10.7554/eLife.64505.
Let's first look at the model and the default parameters:
from non_local_detector.models import ContFragClusterlessClassifier
ContFragClusterlessClassifier()
ContFragClusterlessClassifier(clusterless_algorithm='clusterless_kde', clusterless_algorithm_params={'block_size': 10000, 'position_std': 6.0, 'waveform_std': 24.0}, continuous_initial_conditions_types=[UniformInitialConditions(), UniformInitialConditions()], continuous_transition_types=[[RandomWalk(environment_name='', movement_var=6.0, movement_mean=0.0, use... environments=(Environment(environment_name='', place_bin_size=2.0, track_graph=None, edge_order=None, edge_spacing=None, is_track_interior=None, position_range=None, infer_track_interior=True, fill_holes=False, dilate=False, bin_count_threshold=0),), infer_track_interior=True, no_spike_rate=1e-10, observation_models=[ObservationModel(environment_name='', encoding_group=0, is_local=False, is_no_spike=False), ObservationModel(environment_name='', encoding_group=0, is_local=False, is_no_spike=False)], sampling_frequency=500.0, state_names=['Continuous', 'Fragmented'])
You can change these parameters like so:
from non_local_detector.models import ContFragClusterlessClassifier
ContFragClusterlessClassifier(
clusterless_algorithm_params={
"block_size": 10000,
"position_std": 12.0,
"waveform_std": 24.0,
},
)
ContFragClusterlessClassifier(clusterless_algorithm='clusterless_kde', clusterless_algorithm_params={'block_size': 10000, 'position_std': 12.0, 'waveform_std': 24.0}, continuous_initial_conditions_types=[UniformInitialConditions(), UniformInitialConditions()], continuous_transition_types=[[RandomWalk(environment_name='', movement_var=6.0, movement_mean=0.0, us... environments=(Environment(environment_name='', place_bin_size=2.0, track_graph=None, edge_order=None, edge_spacing=None, is_track_interior=None, position_range=None, infer_track_interior=True, fill_holes=False, dilate=False, bin_count_threshold=0),), infer_track_interior=True, no_spike_rate=1e-10, observation_models=[ObservationModel(environment_name='', encoding_group=0, is_local=False, is_no_spike=False), ObservationModel(environment_name='', encoding_group=0, is_local=False, is_no_spike=False)], sampling_frequency=500.0, state_names=['Continuous', 'Fragmented'])
This is how to insert the model parameters into the database:
from spyglass.decoding.v1.core import DecodingParameters
DecodingParameters.insert1(
{
"decoding_param_name": "contfrag_clusterless",
"decoding_params": ContFragClusterlessClassifier(),
"decoding_kwargs": dict(),
},
skip_duplicates=True,
)
DecodingParameters & {"decoding_param_name": "contfrag_clusterless"}
decoding_param_name (a name for this set of parameters) | decoding_params (initialization parameters for model) | decoding_kwargs (additional keyword arguments) |
---|---|---|
contfrag_clusterless | =BLOB= | =BLOB= |
Total: 1
We can retrieve these parameters and rebuild the model like so:
model_params = (
DecodingParameters & {"decoding_param_name": "contfrag_clusterless"}
).fetch1()
ContFragClusterlessClassifier(**model_params["decoding_params"])
ContFragClusterlessClassifier(clusterless_algorithm='clusterless_kde', clusterless_algorithm_params={'block_size': 10000, 'position_std': 6.0, 'waveform_std': 24.0}, continuous_initial_conditions_types=[UniformInitialConditions(), UniformInitialConditions()], continuous_transition_types=[[RandomWalk(environment_name='', movement_var=6.0, movement_mean=0.0, use... environments=[Environment(environment_name='', place_bin_size=2.0, track_graph=None, edge_order=None, edge_spacing=None, is_track_interior=None, position_range=None, infer_track_interior=True, fill_holes=False, dilate=False, bin_count_threshold=0)], infer_track_interior=True, no_spike_rate=1e-10, observation_models=[ObservationModel(environment_name='', encoding_group=0, is_local=False, is_no_spike=False), ObservationModel(environment_name='', encoding_group=0, is_local=False, is_no_spike=False)], sampling_frequency=500.0, state_names=['Continuous', 'Fragmented'])
1D Decoding¶
If you want to do 1D decoding, you will need to specify the `track_graph`, `edge_order`, and `edge_spacing` in the `environments` parameter. You can read more about these parameters in the linearization notebook. You can retrieve them from the `TrackGraph` table if you have stored them there. These then go into the `environments` parameter of the `ContFragClusterlessClassifier` model.
from non_local_detector.environment import Environment
?Environment
Init signature:
Environment(
    environment_name: str = '',
    place_bin_size: Union[float, Tuple[float]] = 2.0,
    track_graph: Optional[networkx.classes.graph.Graph] = None,
    edge_order: Optional[tuple] = None,
    edge_spacing: Optional[tuple] = None,
    is_track_interior: Optional[numpy.ndarray] = None,
    position_range: Optional[numpy.ndarray] = None,
    infer_track_interior: bool = True,
    fill_holes: bool = False,
    dilate: bool = False,
    bin_count_threshold: int = 0,
) -> None
Docstring:
Represent the spatial environment with a discrete grid.

Parameters
----------
environment_name : str, optional
place_bin_size : float, optional
    Approximate size of the position bins.
track_graph : networkx.Graph, optional
    Graph representing the 1D spatial topology
edge_order : tuple of 2-tuples, optional
    The order of the edges in 1D space
edge_spacing : None or int or tuples of len n_edges-1, optional
    Any gaps between the edges in 1D space
is_track_interior : np.ndarray or None, optional
    If given, this will be used to define the valid areas of the track.
    Must be of type boolean.
position_range : sequence, optional
    A sequence of `n_position_dims`, each an optional (lower, upper) tuple
    giving the outer bin edges for position. An entry of None in the
    sequence results in the minimum and maximum values being used for the
    corresponding dimension. The default, None, is equivalent to passing a
    tuple of `n_position_dims` None values.
infer_track_interior : bool, optional
    If True, then use the given positions to figure out the valid track
    areas.
fill_holes : bool, optional
    Fill holes when inferring the track
dilate : bool, optional
    Inflate the available track area with binary dilation
bin_count_threshold : int, optional
    Greater than this number of samples should be in the bin for it to be
    considered on the track.
File: ~/miniconda3/envs/spyglass/lib/python3.9/site-packages/non_local_detector/environment.py
Type: type
Subclasses:
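As a sketch of how these pieces fit together for 1D decoding (the node positions, edges, and environment name below are hypothetical; `make_track_graph` comes from the `track_linearization` package used in the linearization notebook, and we assume `environments` accepts a single-element list, matching the default tuple form shown above):
from track_linearization import make_track_graph
from non_local_detector.environment import Environment
from non_local_detector.models import ContFragClusterlessClassifier

# hypothetical linear track: two nodes 100 cm apart joined by one edge
node_positions = [(0.0, 0.0), (100.0, 0.0)]
edges = [(0, 1)]
track_graph = make_track_graph(node_positions, edges)

env_1d = Environment(
    environment_name="linear_track",
    track_graph=track_graph,
    edge_order=[(0, 1)],  # traversal order of the edges in 1D space
    edge_spacing=0,       # no gaps between edges (only one edge here)
)

# the 1D environment then goes into the environments parameter
ContFragClusterlessClassifier(environments=[env_1d])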
Decoding¶
Now that we have grouped the data and defined the model parameters, we have finally set up all the elements we need to decode the data. We now use the `ClusterlessDecodingSelection` table to fully specify all the parameters and data that we want.
This has:
- `waveform_features_group_name`: the name of the group that contains the waveform features that we want to decode from
- `position_group_name`: the name of the group that contains the position data that we want to decode from
- `decoding_param_name`: the name of the decoding parameters that we want to use
- `nwb_file_name`: the name of the NWB file that we want to decode from
- `encoding_interval`: the interval of time that we want to train the initial model on
- `decoding_interval`: the interval of time that we want to decode from
- `estimate_decoding_params`: whether or not we want to estimate the decoding parameters
The first three parameters should be familiar to you.
Decoding and Encoding Intervals¶
The `encoding_interval` is the interval of time that we want to train the initial model on. The `decoding_interval` is the interval of time that we want to decode. These two intervals can be the same, but they do not have to be. For example, we may want to train the model on a long interval that is representative of the entire session, but decode only a short interval of interest.
These keys come from the `IntervalList` table, which contains the `nwb_file_name` and `interval_list_name` that we need to specify the `encoding_interval` and `decoding_interval`. We will define a short decoding interval called `test decoding interval` and use that to decode from.
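For instance, to inspect the valid times of the interval we will use for encoding (a small sketch; `pos 0 valid times` is the interval shown for this session):
from spyglass.common import IntervalList

(
    IntervalList
    & {
        "nwb_file_name": nwb_copy_file_name,
        "interval_list_name": "pos 0 valid times",
    }
).fetch1("valid_times")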
Estimating Decoding Parameters¶
The last parameter is `estimate_decoding_params`. This is a boolean that specifies whether or not we want to estimate the decoding parameters. If `True`, we will estimate the initial conditions and discrete transition matrix from the data.
NOTE: If estimating parameters, we need to treat times outside the decoding interval as missing. This means that times outside the decoding interval will not use the spiking data; only the state transition matrix and the previous time step will be used. This may or may not be desired, depending on the length of the missing interval.
from spyglass.decoding.v1.clusterless import ClusterlessDecodingSelection
ClusterlessDecodingSelection()
nwb_file_name (name of the NWB file) | waveform_features_group_name | position_group_name | decoding_param_name (a name for this set of parameters) | encoding_interval (descriptive name of this interval list) | decoding_interval (descriptive name of this interval list) | estimate_decoding_params (whether to estimate the decoding parameters) |
---|---|---|---|---|---|---|
mediumnwb20230802_.nwb | test_group | test_group | contfrag_clusterless | pos 0 valid times | test decoding interval | 0 |
Total: 1
from spyglass.common import IntervalList
IntervalList & {"nwb_file_name": nwb_copy_file_name}
nwb_file_name (name of the NWB file) | interval_list_name (descriptive name of this interval list) | valid_times (numpy array with start/end times for each interval) | pipeline (type of interval list, e.g. 'position', 'spikesorting_recording_v1') |
---|---|---|---|
mediumnwb20230802_.nwb | 02_r1 | =BLOB= | |
mediumnwb20230802_.nwb | 04f3ecb4-a18c-4ffb-85d8-2f5f62d4d6d4 | =BLOB= | spikesorting_recording_v1 |
mediumnwb20230802_.nwb | 0e848c38-9105-4ea4-b6ba-dbdd5b46a088 | =BLOB= | spikesorting_artifact_v1 |
mediumnwb20230802_.nwb | 0f91197e-bebb-4dc6-ad41-5bf89c3eed28 | =BLOB= | spikesorting_artifact_v1 |
mediumnwb20230802_.nwb | 15c8a3e8-5ce9-4654-891e-6ee4109d6f1a | =BLOB= | spikesorting_artifact_v1 |
mediumnwb20230802_.nwb | 1d2b5966-415a-4c65-955a-0e422d8b5b00 | =BLOB= | spikesorting_recording_v1 |
mediumnwb20230802_.nwb | 1e3f3707-613e-4a44-93f1-c7e5484112cd | =BLOB= | spikesorting_recording_v1 |
mediumnwb20230802_.nwb | 2402805a-04f9-4a88-9ccf-071376c8de19 | =BLOB= | spikesorting_recording_v1 |
mediumnwb20230802_.nwb | 24107d8c-ce26-4c77-8f6a-bf6955d8a3c7 | =BLOB= | spikesorting_recording_v1 |
mediumnwb20230802_.nwb | 257c077b-8f3b-4abb-a631-6b8084d6a1ea | =BLOB= | spikesorting_recording_v1 |
mediumnwb20230802_.nwb | 2b93bcd0-7b05-457c-8aab-c41ef543ecf2 | =BLOB= | spikesorting_artifact_v1 |
mediumnwb20230802_.nwb | 2b9fbf14-74a0-4294-a805-26702340aac9 | =BLOB= | spikesorting_artifact_v1 |
...
Total: 52
decoding_interval_valid_times = [
[1625935714.6359036, 1625935714.6359036 + 15.0]
]
IntervalList.insert1(
{
"nwb_file_name": "mediumnwb20230802_.nwb",
"interval_list_name": "test decoding interval",
"valid_times": decoding_interval_valid_times,
},
skip_duplicates=True,
)
Once we have figured out the keys that we need, we can insert the `ClusterlessDecodingSelection` entry into the database.
selection_key = {
"waveform_features_group_name": "test_group",
"position_group_name": "test_group",
"decoding_param_name": "contfrag_clusterless",
"nwb_file_name": nwb_copy_file_name,
"encoding_interval": "pos 0 valid times",
"decoding_interval": "test decoding interval",
"estimate_decoding_params": False,
}
ClusterlessDecodingSelection.insert1(
selection_key,
skip_duplicates=True,
)
ClusterlessDecodingSelection & selection_key
nwb_file_name (name of the NWB file) | waveform_features_group_name | position_group_name | decoding_param_name (a name for this set of parameters) | encoding_interval (descriptive name of this interval list) | decoding_interval (descriptive name of this interval list) | estimate_decoding_params (whether to estimate the decoding parameters) |
---|---|---|---|---|---|---|
mediumnwb20230802_.nwb | test_group | test_group | contfrag_clusterless | pos 0 valid times | test decoding interval | 0 |
Total: 1
ClusterlessDecodingSelection()
nwb_file_name (name of the NWB file) | waveform_features_group_name | position_group_name | decoding_param_name (a name for this set of parameters) | encoding_interval (descriptive name of this interval list) | decoding_interval (descriptive name of this interval list) | estimate_decoding_params (whether to estimate the decoding parameters) |
---|---|---|---|---|---|---|
mediumnwb20230802_.nwb | test_group | test_group | contfrag_clusterless | pos 0 valid times | test decoding interval | 0 |
Total: 1
To run decoding, we simply populate the `ClusterlessDecodingV1` table. This will run the decoding and insert the results into the database. We can then retrieve the results from the database.
from spyglass.decoding.v1.clusterless import ClusterlessDecodingV1
ClusterlessDecodingV1.populate(selection_key)
We can now see it as an entry in the `DecodingOutput` table.
from spyglass.decoding.decoding_merge import DecodingOutput
DecodingOutput.ClusterlessDecodingV1 & selection_key
merge_id | nwb_file_name (name of the NWB file) | waveform_features_group_name | position_group_name | decoding_param_name (a name for this set of parameters) | encoding_interval (descriptive name of this interval list) | decoding_interval (descriptive name of this interval list) | estimate_decoding_params (whether to estimate the decoding parameters) |
---|---|---|---|---|---|---|---|
b63395dd-402e-270a-a8d1-7aabaf83d452 | mediumnwb20230802_.nwb | test_group | test_group | contfrag_clusterless | pos 0 valid times | test decoding interval | 0 |
Total: 1
We can load the results of the decoding:
decoding_results = (ClusterlessDecodingV1 & selection_key).load_results()
decoding_results
[2024-01-17 22:43:13,358][WARNING]: Skipped checksum for file with hash: ecf01dd1-0d3b-24c2-b843-7e554abf0ea7, and path: /Users/edeno/Documents/GitHub/spyglass/DATA/analysis/mediumnwb20230802/mediumnwb20230802_a26e1d1a-1480-4f89-b5e0-bb6486d7d15e.nc /Users/edeno/miniconda3/envs/spyglass/lib/python3.9/site-packages/xarray/namedarray/core.py:487: UserWarning: Duplicate dimension names present: dimensions {'states'} appear more than once in dims=('states', 'states'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``. warnings.warn( /Users/edeno/miniconda3/envs/spyglass/lib/python3.9/site-packages/xarray/namedarray/core.py:487: UserWarning: Duplicate dimension names present: dimensions {'states'} appear more than once in dims=('states', 'states'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``. warnings.warn(
<xarray.Dataset>
Dimensions:                      (state_ind: 26668, dim_0: 26668, time: 451,
                                  states: 2, intervals: 1, state_bins: 26668)
Coordinates:
  * state_ind                    (state_ind) int32 0 0 0 0 0 0 0 ... 1 1 1 1 1 1
  * time                         (time) float64 1.626e+09 ... 1.626e+09
  * states                       (states) object 'Continuous' 'Fragmented'
    environments                 (states) object ...
    encoding_groups              (states) int32 ...
  * state_bins                   (state_bins) object MultiIndex
  * state                        (state_bins) object 'Continuous' ... 'Fragme...
  * x_position                   (state_bins) float64 29.02 29.02 ... 262.7
  * y_position                   (state_bins) float64 0.4493 2.445 ... 224.0
Dimensions without coordinates: dim_0, intervals
Data variables:
    initial_conditions           (dim_0) float64 ...
    discrete_state_transitions   (states, states) float64 ...
    acausal_posterior            (intervals, time, state_bins) float32 ...
    acausal_state_probabilities  (intervals, time, states) float64 ...
Attributes:
    marginal_log_likelihoods: -154947.16
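For example, to get the posterior over 2D position marginalized across the discrete states (this mirrors the unstacking used in the visualization cell below, with the variable names taken from the output above):
# posterior over (x, y) position, summed over the discrete states
posterior_2d = (
    decoding_results.acausal_posterior
    .isel(intervals=0)       # select the single decoding interval
    .unstack("state_bins")   # expand the stacked (state, x, y) dimension
    .sum("state")            # marginalize over 'Continuous'/'Fragmented'
)

# probability of each discrete state over time
state_probabilities = decoding_results.acausal_state_probabilities.isel(
    intervals=0
)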
Finally, if we delete the results from the database, we can use the `cleanup` function to remove the orphaned result files from the file system:
DecodingOutput().cleanup()
[22:43:13][INFO] Spyglass: Cleaning up decoding outputs [2024-01-17 22:43:13,623][WARNING]: Skipped checksum for file with hash: ecf01dd1-0d3b-24c2-b843-7e554abf0ea7, and path: /Users/edeno/Documents/GitHub/spyglass/DATA/analysis/mediumnwb20230802/mediumnwb20230802_a26e1d1a-1480-4f89-b5e0-bb6486d7d15e.nc [2024-01-17 22:43:13,713][WARNING]: Skipped checksum for file with hash: 257962d6-fc68-dc91-b0d2-8bef2a4914f3, and path: /Users/edeno/Documents/GitHub/spyglass/DATA/analysis/mediumnwb20230802/mediumnwb20230802_a26e1d1a-1480-4f89-b5e0-bb6486d7d15e.pkl
Visualization of Decoding Output¶
The output of decoding can be challenging to visualize with static graphs, especially if the decoding is performed on 2D data.
We can interactively visualize the output of decoding using the figurl package. This package allows you to create a visualization of the decoding output that can be viewed in a web browser. This is useful for exploring the decoding output over time and sharing the results with others.
NOTE: You will need a kachery cloud instance to use this feature. If you are a member of the Frank lab, you should have access to the Frank lab kachery cloud instance. If you are not a member of the Frank lab, you can create your own kachery cloud instance by following the instructions here.
Each user will need to run `kachery-cloud-init` in a terminal and follow the instructions to associate their computer with their GitHub account on the kachery-cloud network.
# from non_local_detector.visualization import (
# create_interactive_2D_decoding_figurl,
# )
# (
# position_info,
# position_variable_names,
# ) = ClusterlessDecodingV1.load_position_info(selection_key)
# results_time = decoding_results.acausal_posterior.isel(intervals=0).time.values
# position_info = position_info.loc[results_time[0] : results_time[-1]]
# env = ClusterlessDecodingV1.load_environments(selection_key)[0]
# spike_times, _ = ClusterlessDecodingV1.load_spike_data(selection_key)
# create_interactive_2D_decoding_figurl(
# position_time=position_info.index.to_numpy(),
# position=position_info[position_variable_names],
# env=env,
# results=decoding_results,
# posterior=decoding_results.acausal_posterior.isel(intervals=0)
# .unstack("state_bins")
# .sum("state"),
# spike_times=spike_times,
# head_dir=position_info["orientation"],
# speed=position_info["speed"],
# )
GPUs¶
We can use GPUs for decoding, which results in a significant speedup. This is achieved using the jax package.
Ensuring jax can find a GPU¶
Assuming you've set up a GPU, we can use `jax.devices()` to make sure the decoding code can see it. If a GPU is available, it will be listed.
In the following instance, we do not have a GPU:
import jax
jax.devices()
[CpuDevice(id=0)]
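If you want to check programmatically rather than by eye, you can filter on each device's platform (a minimal sketch; `platform` is the jax device attribute naming the backend):
import jax

# list devices whose backend platform is a GPU
gpus = [d for d in jax.devices() if d.platform == "gpu"]
print(f"{len(gpus)} GPU(s) visible; decoding will run on {'GPU' if gpus else 'CPU'}")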
Selecting a GPU¶
If you have multiple GPUs, you can use the `jax` package to set the device (GPU) that you want to use. For example, to use the GPU at index 2, you can use the following code (uncomment first):
# device_id = 2
# device = jax.devices()[device_id]
# jax.config.update("jax_default_device", device)
# device
Monitoring GPU Usage¶
You can see which GPUs are occupied (if you have multiple GPUs) by running the command `nvidia-smi` in a terminal (or `!nvidia-smi` in a notebook). Pick a GPU with low memory usage.
We can monitor GPU use with the terminal command `watch -n 0.1 nvidia-smi`, which will rerun `nvidia-smi` every 100 ms. This won't work in a notebook, as it won't display the updates.
Other ways to monitor GPU usage are:
- A jupyter widget by nvidia to monitor GPU usage in the notebook
- A terminal program like nvidia-smi with more information about which GPUs are being utilized and by whom.