Merge Tables¶
Intro¶
Developer Note: if you plan to make a PR in the future, be sure to copy this
notebook, and use the gitignore prefix temp
to avoid future conflicts.
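As a minimal sketch of that copy step (the notebook filename here is hypothetical; substitute the one you are editing):

```python
from pathlib import Path
import shutil

# Hypothetical notebook name; substitute the notebook you are copying.
src = Path("03_Merge_Tables.ipynb")
dst = src.with_name("temp_" + src.name)  # files prefixed "temp" are gitignored

if src.exists():  # guard so this is safe to run anywhere
    shutil.copy(src, dst)
```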
This is one notebook in a multi-part series on Spyglass.
- To set up your Spyglass environment and database, see the Setup notebook
- To insert data, see the Insert Data notebook
- For additional info on DataJoint syntax, including table definitions and inserts, see these additional tutorials
- For information on why we use merge tables, and how to make one, see our documentation
Imports¶
Let's start by importing the spyglass package, along with a few others.
import os
import datajoint as dj
# change to the upper level folder to detect dj_local_conf.json
if os.path.basename(os.getcwd()) == "notebooks":
os.chdir("..")
dj.config.load("dj_local_conf.json") # load config for database connection info
# ignore datajoint+jupyter async warnings
import warnings
warnings.simplefilter("ignore", category=DeprecationWarning)
warnings.simplefilter("ignore", category=ResourceWarning)
warnings.simplefilter("ignore", category=UserWarning)
import spyglass.common as sgc
import spyglass.lfp as lfp
from spyglass.utils.nwb_helper_fn import get_nwb_copy_filename
from spyglass.utils.dj_merge_tables import delete_downstream_merge, Merge
from spyglass.common.common_ephys import LFP as CommonLFP # Upstream 1
from spyglass.lfp.lfp_merge import LFPOutput # Merge Table
from spyglass.lfp.v1.lfp import LFPV1 # Upstream 2
[2023-10-12 11:15:17,864][INFO]: Connecting root@localhost:3306 [2023-10-12 11:15:17,873][INFO]: Connected root@localhost:3306
Example data¶
Check to make sure the data inserted in the previous notebook is still there.
nwb_file_name = "minirec20230622.nwb"
nwb_copy_file_name = get_nwb_copy_filename(nwb_file_name)
nwb_file_dict = {"nwb_file_name": nwb_copy_file_name}
sgc.Session & nwb_file_dict
nwb_file_name name of the NWB file | subject_id | institution_name | lab_name | session_id | session_description | session_start_time | timestamps_reference_time | experiment_description |
---|---|---|---|---|---|---|---|---|
minirec20230622_.nwb | 54321 | UCSF | Loren Frank Lab | 12345 | test yaml insertion | 2023-06-22 15:59:58 | 1970-01-01 00:00:00 | Test Conversion |
Total: 1
If you haven't already done so, insert data into a Merge Table.
Note: Some existing parents of Merge Tables perform the Merge Table insert as part of the populate methods. This practice will be revised in the future.
sgc.FirFilterParameters().create_standard_filters()
lfp.lfp_electrode.LFPElectrodeGroup.create_lfp_electrode_group(
nwb_file_name=nwb_copy_file_name,
group_name="test",
electrode_list=[0],
)
lfp_key = {
"nwb_file_name": nwb_copy_file_name,
"lfp_electrode_group_name": "test",
"target_interval_list_name": "01_s1",
"filter_name": "LFP 0-400 Hz",
"filter_sampling_rate": 30_000,
}
lfp.v1.LFPSelection.insert1(lfp_key, skip_duplicates=True)
lfp.v1.LFPV1().populate(lfp_key)
LFPOutput.insert([lfp_key], skip_duplicates=True)
Helper functions¶
Merge Tables have multiple custom methods that begin with merge.
help can show us the docstring of each.
merge_methods = [d for d in dir(Merge) if d.startswith("merge")]
print(merge_methods)
['merge_delete', 'merge_delete_parent', 'merge_fetch', 'merge_get_parent', 'merge_get_part', 'merge_html', 'merge_populate', 'merge_restrict', 'merge_view']
help(getattr(Merge, merge_methods[-1]))
Help on method merge_view in module spyglass.utils.dj_merge_tables:

merge_view(restriction: str = True) method of datajoint.user_tables.TableMeta instance
    Prints merged view, including null entries for unique columns.
    Note: To handle this Union as a table-like object, use `merge_restrict`
    Parameters
    ----------
    restriction: str, optional
        Restriction to apply to the merged view
Showing data¶
merge_view shows a union of the master and all part tables.
Note: Restrict Merge Tables with arguments, not the & operator.
- Normally: Table & "field='value'"
- Instead: MergeTable.merge_view(restriction="field='value'")
LFPOutput.merge_view()
*merge_id        *source  *nwb_file_name  *lfp_electrode  *target_interv  *filter_name   *filter_sampli
+------------+   +------+ +------------+  +------------+  +------------+  +------------+ +------------+
c34f98c5-7de7-   LFPV1    minirec2023062  test            01_s1           LFP 0-400 Hz   30000
(Total: 1)
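To build intuition for why the restriction is passed as an argument rather than used as an & operand, here is a toy stand-in (plain Python, not the Spyglass implementation): the merged view is a union computed at call time, so there is no persistent table object to restrict afterward. The row values are illustrative only.

```python
# Toy rows standing in for two part tables; None marks a column that
# exists in one part table but not the other.
rows = [
    {"source": "LFPV1", "filter_name": "LFP 0-400 Hz"},
    {"source": "CommonLFP", "filter_name": None},
]

def merge_view(restriction=None):
    """Union of all part-table rows, optionally restricted.

    restriction: (column, value) pair, or None for no restriction.
    """
    if restriction is None:
        return rows
    column, value = restriction
    return [row for row in rows if row.get(column) == value]

print(merge_view(("source", "LFPV1")))
# [{'source': 'LFPV1', 'filter_name': 'LFP 0-400 Hz'}]
```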
UUIDs help retain unique entries across all part tables. We can fetch the NWB file by referencing this or other features.
uuid_key = LFPOutput.fetch(limit=1, as_dict=True)[-1]
restrict = LFPOutput & uuid_key
restrict
merge_id | source |
---|---|
c34f98c5-7de7-1daf-6eaf-1e15981def44 | LFPV1 |
Total: 1
result1 = restrict.fetch_nwb()
result1
[{'nwb_file_name': 'minirec20230622_.nwb', 'lfp_electrode_group_name': 'test', 'target_interval_list_name': '01_s1', 'filter_name': 'LFP 0-400 Hz', 'filter_sampling_rate': 30000, 'analysis_file_name': 'minirec20230622_JOV02AWW09.nwb', 'interval_list_name': 'lfp_test_01_s1_valid times', 'lfp_object_id': '340b9a0b-626b-40ca-8b48-e033be72570a', 'lfp_sampling_rate': 1000.0, 'lfp': filtered data pynwb.ecephys.ElectricalSeries at 0x139910624563552 Fields: comments: no comments conversion: 1.0 data: <HDF5 dataset "data": shape (10476, 1), type "<i2"> description: filtered data electrodes: electrodes <class 'hdmf.common.table.DynamicTableRegion'> interval: 1 offset: 0.0 resolution: -1.0 timestamps: <HDF5 dataset "timestamps": shape (10476,), type "<f8"> timestamps_unit: seconds unit: volts}]
nwb_key = LFPOutput.merge_restrict(nwb_file_dict).fetch(as_dict=True)[0]
nwb_key
{'merge_id': UUID('c34f98c5-7de7-1daf-6eaf-1e15981def44'), 'source': 'LFPV1', 'nwb_file_name': 'minirec20230622_.nwb', 'lfp_electrode_group_name': 'test', 'target_interval_list_name': '01_s1', 'filter_name': 'LFP 0-400 Hz', 'filter_sampling_rate': 30000}
result2 = (LFPOutput & nwb_key).fetch_nwb()
result2 == result1
True
Selecting data¶
There are also functions for retrieving part/parent table(s) and fetching data.
These get functions will either return the part table of the Merge Table or the parent table with the source information for that part.
result4 = LFPOutput.merge_get_part(restriction=nwb_file_dict, join_master=True)
result4
merge_id | source | nwb_file_name name of the NWB file | lfp_electrode_group_name the name of this group of electrodes | target_interval_list_name descriptive name of this interval list | filter_name descriptive name of this filter | filter_sampling_rate sampling rate for this filter |
---|---|---|---|---|---|---|
c34f98c5-7de7-1daf-6eaf-1e15981def44 | LFPV1 | minirec20230622_.nwb | test | 01_s1 | LFP 0-400 Hz | 30000 |
Total: 1
result5 = LFPOutput.merge_get_parent(restriction='nwb_file_name LIKE "mini%"')
result5
nwb_file_name name of the NWB file | lfp_electrode_group_name the name of this group of electrodes | target_interval_list_name descriptive name of this interval list | filter_name descriptive name of this filter | filter_sampling_rate sampling rate for this filter | analysis_file_name name of the file | interval_list_name descriptive name of this interval list | lfp_object_id the NWB object ID for loading this object from the file | lfp_sampling_rate the sampling rate, in HZ |
---|---|---|---|---|---|---|---|---|
minirec20230622_.nwb | test | 01_s1 | LFP 0-400 Hz | 30000 | minirec20230622_JOV02AWW09.nwb | lfp_test_01_s1_valid times | 340b9a0b-626b-40ca-8b48-e033be72570a | 1000.0 |
Total: 1
fetch will collect all relevant entries and return them as a list in the format specified by keyword arguments and one's DataJoint config.
result6 = result5.fetch("lfp_sampling_rate") # Sample rate for all mini* files
result6
array([1000.])
merge_fetch requires a restriction as the first argument. For no restriction, use True.
result7 = LFPOutput.merge_fetch(True, "filter_name", "nwb_file_name")
result7
[array(['LFP 0-400 Hz'], dtype=object), array(['minirec20230622_.nwb'], dtype=object)]
result8 = LFPOutput.merge_fetch(as_dict=True)
result8
{'merge_id': UUID('c34f98c5-7de7-1daf-6eaf-1e15981def44'), 'nwb_file_name': 'minirec20230622_.nwb', 'lfp_electrode_group_name': 'test', 'target_interval_list_name': '01_s1', 'filter_name': 'LFP 0-400 Hz', 'filter_sampling_rate': 30000}
Deletion from Merge Tables¶
When deleting from Merge Tables, we can either...
- delete from the Merge Table itself with merge_delete, deleting both the master and part.
- use merge_delete_parent to delete from the parent sources, getting rid of the entries in the source table they came from.
- use delete_downstream_merge to find Merge Tables downstream and get rid of the full entries, avoiding orphaned master table entries.
The latter two cases can be destructive, so we include an extra layer of protection with dry_run. When True (the default), these functions return a list of tables with the entries that would otherwise be deleted.
LFPOutput.merge_delete(nwb_file_dict) # Delete from merge table
LFPOutput.merge_delete_parent(restriction=nwb_file_dict, dry_run=True)
delete_downstream_merge(
table=LFPV1,
restriction=nwb_file_dict,
dry_run=True,
)
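The dry_run guard used above can be sketched with a toy deleter (plain Python, not the Spyglass implementation): when dry_run is true, matching entries are returned for inspection instead of being removed.

```python
# Toy stand-in for a destructive delete guarded by dry_run.
def delete_entries(entries, restriction, dry_run=True):
    """Delete entries matching a restriction; preview only when dry_run."""
    matches = [e for e in entries if restriction(e)]
    if dry_run:
        return matches  # nothing removed; caller inspects first
    for match in matches:
        entries.remove(match)
    return matches

table = [{"merge_id": 1}, {"merge_id": 2}]
preview = delete_entries(table, lambda e: e["merge_id"] == 1)
print(len(preview), len(table))  # 1 2 -> previewed, nothing deleted
delete_entries(table, lambda e: e["merge_id"] == 1, dry_run=False)
print(len(table))  # 1
```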
To delete all merge table entries associated with an NWB file, use delete_downstream_merge with the Nwbfile table.
delete_downstream_merge(
table=sgc.Nwbfile,
restriction={"nwb_file_name": nwb_copy_file_name},
dry_run=True,
recurse_level=3, # for long pipelines with many tables
)
Up Next¶
In the next notebook, we'll start working with ephys data with spike sorting.