Merge Tables¶
Intro¶
Developer Note: if you plan to make a PR in the future, be sure to copy this
notebook, and use the gitignore prefix temp
to avoid future conflicts.
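As a minimal sketch of that copy step (the notebook filename here is hypothetical; substitute the one you are editing):

```python
from pathlib import Path
import shutil

# Hypothetical notebook name; substitute the notebook you are copying.
src = Path("03_Merge_Tables.ipynb")
dst = src.with_name("temp_" + src.name)  # files prefixed "temp" are gitignored

if src.exists():  # guard so this is safe to run anywhere
    shutil.copy(src, dst)
```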
This is one notebook in a multi-part series on Spyglass.
- To set up your Spyglass environment and database, see the Setup notebook
- To insert data, see the Insert Data notebook
- For additional info on DataJoint syntax, including table definitions and inserts, see these additional tutorials
- For information on why we use merge tables, and how to make one, see our documentation
Imports¶
Let's start by importing the spyglass package, along with a few others.
import os
import datajoint as dj
# change to the upper level folder to detect dj_local_conf.json
if os.path.basename(os.getcwd()) == "notebooks":
os.chdir("..")
dj.config.load("dj_local_conf.json") # load config for database connection info
# ignore datajoint+jupyter async warnings
import warnings
warnings.simplefilter("ignore", category=DeprecationWarning)
warnings.simplefilter("ignore", category=ResourceWarning)
warnings.simplefilter("ignore", category=UserWarning)
import spyglass.common as sgc
import spyglass.lfp as lfp
from spyglass.utils.nwb_helper_fn import get_nwb_copy_filename
from spyglass.utils.dj_merge_tables import delete_downstream_merge, Merge
from spyglass.common.common_ephys import LFP as CommonLFP # Upstream 1
from spyglass.lfp.lfp_merge import LFPOutput # Merge Table
from spyglass.lfp.v1.lfp import LFPV1 # Upstream 2
[2023-10-12 11:15:17,864][INFO]: Connecting root@localhost:3306 [2023-10-12 11:15:17,873][INFO]: Connected root@localhost:3306
Example data¶
Check to make sure the data inserted in the previous notebook is still there.
nwb_file_name = "minirec20230622.nwb"
nwb_copy_file_name = get_nwb_copy_filename(nwb_file_name)
nwb_file_dict = {"nwb_file_name": nwb_copy_file_name}
sgc.Session & nwb_file_dict
nwb_file_name name of the NWB file | subject_id | institution_name | lab_name | session_id | session_description | session_start_time | timestamps_reference_time | experiment_description |
---|---|---|---|---|---|---|---|---|
minirec20230622_.nwb | 54321 | UCSF | Loren Frank Lab | 12345 | test yaml insertion | 2023-06-22 15:59:58 | 1970-01-01 00:00:00 | Test Conversion |
Total: 1
If you haven't already done so, insert data into a Merge Table.
Note: Some existing parents of Merge Tables perform the Merge Table insert as part of the populate methods. This practice will be revised in the future.
sgc.FirFilterParameters().create_standard_filters()
lfp.lfp_electrode.LFPElectrodeGroup.create_lfp_electrode_group(
nwb_file_name=nwb_copy_file_name,
group_name="test",
electrode_list=[0],
)
lfp_key = {
"nwb_file_name": nwb_copy_file_name,
"lfp_electrode_group_name": "test",
"target_interval_list_name": "01_s1",
"filter_name": "LFP 0-400 Hz",
"filter_sampling_rate": 30_000,
}
lfp.v1.LFPSelection.insert1(lfp_key, skip_duplicates=True)
lfp.v1.LFPV1().populate(lfp_key)
LFPOutput.insert([lfp_key], skip_duplicates=True)
Helper functions¶
Merge Tables have multiple custom methods that begin with merge.
help can show us the docstring of each.
merge_methods = [d for d in dir(Merge) if d.startswith("merge")]
print(merge_methods)
['merge_delete', 'merge_delete_parent', 'merge_fetch', 'merge_get_parent', 'merge_get_part', 'merge_html', 'merge_populate', 'merge_restrict', 'merge_view']
help(getattr(Merge, merge_methods[-1]))
Help on method merge_view in module spyglass.utils.dj_merge_tables:

merge_view(restriction: str = True) method of datajoint.user_tables.TableMeta instance
    Prints merged view, including null entries for unique columns.
    Note: To handle this Union as a table-like object, use `merge_restrict`
    Parameters
    ----------
    restriction: str, optional
        Restriction to apply to the merged view
Showing data¶
merge_view shows a union of the master and all part tables.
Note: Restrict Merge Tables with arguments, not the & operator.
- Normally: Table & "field='value'"
- Instead: MergeTable.merge_view(restriction="field='value'")
LFPOutput.merge_view()
*merge_id        *source  *nwb_file_name  *lfp_electrode  *target_interv  *filter_name   *filter_sampli
+------------+   +------+ +------------+  +------------+  +------------+  +------------+ +------------+
c34f98c5-7de7-   LFPV1    minirec2023062  test            01_s1           LFP 0-400 Hz   30000
(Total: 1)
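To build intuition for why the restriction is passed as an argument rather than used as an & operand, here is a toy stand-in (plain Python, not the Spyglass implementation): the merged view is a union computed at call time, so there is no persistent table object to restrict afterward. The row values are illustrative only.

```python
# Toy rows standing in for two part tables; None marks a column that
# exists in one part table but not the other.
rows = [
    {"source": "LFPV1", "filter_name": "LFP 0-400 Hz"},
    {"source": "CommonLFP", "filter_name": None},
]

def merge_view(restriction=None):
    """Union of all part-table rows, optionally restricted.

    restriction: (column, value) pair, or None for no restriction.
    """
    if restriction is None:
        return rows
    column, value = restriction
    return [row for row in rows if row.get(column) == value]

print(merge_view(("source", "LFPV1")))
# [{'source': 'LFPV1', 'filter_name': 'LFP 0-400 Hz'}]
```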
UUIDs help retain unique entries across all part tables. We can fetch the NWB file by referencing this or other features.
uuid_key = LFPOutput.fetch(limit=1, as_dict=True)[-1]
restrict = LFPOutput & uuid_key
restrict
merge_id | source |
---|---|
c34f98c5-7de7-1daf-6eaf-1e15981def44 | LFPV1 |
Total: 1
result1 = restrict.fetch_nwb()
result1
[{'nwb_file_name': 'minirec20230622_.nwb', 'lfp_electrode_group_name': 'test', 'target_interval_list_name': '01_s1', 'filter_name': 'LFP 0-400 Hz', 'filter_sampling_rate': 30000, 'analysis_file_name': 'minirec20230622_JOV02AWW09.nwb', 'interval_list_name': 'lfp_test_01_s1_valid times', 'lfp_object_id': '340b9a0b-626b-40ca-8b48-e033be72570a', 'lfp_sampling_rate': 1000.0, 'lfp': filtered data pynwb.ecephys.ElectricalSeries at 0x139910624563552 Fields: comments: no comments conversion: 1.0 data: <HDF5 dataset "data": shape (10476, 1), type "<i2"> description: filtered data electrodes: electrodes <class 'hdmf.common.table.DynamicTableRegion'> interval: 1 offset: 0.0 resolution: -1.0 timestamps: <HDF5 dataset "timestamps": shape (10476,), type "<f8"> timestamps_unit: seconds unit: volts}]
nwb_key = LFPOutput.merge_restrict(nwb_file_dict).fetch(as_dict=True)[0]
nwb_key
{'merge_id': UUID('c34f98c5-7de7-1daf-6eaf-1e15981def44'), 'source': 'LFPV1', 'nwb_file_name': 'minirec20230622_.nwb', 'lfp_electrode_group_name': 'test', 'target_interval_list_name': '01_s1', 'filter_name': 'LFP 0-400 Hz', 'filter_sampling_rate': 30000}
result2 = (LFPOutput & nwb_key).fetch_nwb()
result2 == result1
True
Selecting data¶
There are also functions for retrieving part/parent table(s) and fetching data.
These get functions will either return the part table of the Merge Table or the parent table with the source information for that part.
result4 = LFPOutput.merge_get_part(restriction=nwb_file_dict, join_master=True)
result4
merge_id | source | nwb_file_name name of the NWB file | lfp_electrode_group_name the name of this group of electrodes | target_interval_list_name descriptive name of this interval list | filter_name descriptive name of this filter | filter_sampling_rate sampling rate for this filter |
---|---|---|---|---|---|---|
c34f98c5-7de7-1daf-6eaf-1e15981def44 | LFPV1 | minirec20230622_.nwb | test | 01_s1 | LFP 0-400 Hz | 30000 |
Total: 1
result5 = LFPOutput.merge_get_parent(restriction='nwb_file_name LIKE "mini%"')
result5
nwb_file_name name of the NWB file | lfp_electrode_group_name the name of this group of electrodes | target_interval_list_name descriptive name of this interval list | filter_name descriptive name of this filter | filter_sampling_rate sampling rate for this filter | analysis_file_name name of the file | interval_list_name descriptive name of this interval list | lfp_object_id the NWB object ID for loading this object from the file | lfp_sampling_rate the sampling rate, in HZ |
---|---|---|---|---|---|---|---|---|
minirec20230622_.nwb | test | 01_s1 | LFP 0-400 Hz | 30000 | minirec20230622_JOV02AWW09.nwb | lfp_test_01_s1_valid times | 340b9a0b-626b-40ca-8b48-e033be72570a | 1000.0 |
Total: 1
fetch will collect all relevant entries and return them as a list in the format specified by keyword arguments and one's DataJoint config.
result6 = result5.fetch("lfp_sampling_rate") # Sample rate for all mini* files
result6
array([1000.])
merge_fetch requires a restriction as the first argument. For no restriction, use True.
result7 = LFPOutput.merge_fetch(True, "filter_name", "nwb_file_name")
result7
[array(['LFP 0-400 Hz'], dtype=object), array(['minirec20230622_.nwb'], dtype=object)]
result8 = LFPOutput.merge_fetch(as_dict=True)
result8
{'merge_id': UUID('c34f98c5-7de7-1daf-6eaf-1e15981def44'), 'nwb_file_name': 'minirec20230622_.nwb', 'lfp_electrode_group_name': 'test', 'target_interval_list_name': '01_s1', 'filter_name': 'LFP 0-400 Hz', 'filter_sampling_rate': 30000}
Deletion from Merge Tables¶
When deleting from Merge Tables, we can either...
- delete from the Merge Table itself with merge_delete, deleting both the master and part.
- use merge_delete_parent to delete from the parent sources, getting rid of the entries in the source table they came from.
- use delete_downstream_merge to find Merge Tables downstream and get rid of the full entries, avoiding orphaned master table entries.
The latter two cases can be destructive, so we include an extra layer of protection with dry_run. When True (the default), these functions return a list of tables with the entries that would otherwise be deleted.
LFPOutput.merge_delete(nwb_file_dict) # Delete from merge table
LFPOutput.merge_delete_parent(restriction=nwb_file_dict, dry_run=True)
delete_downstream_merge(
table=LFPV1,
restriction=nwb_file_dict,
dry_run=True,
)
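The dry_run guard used above can be sketched with a toy deleter (plain Python, not the Spyglass implementation): when dry_run is true, matching entries are returned for inspection instead of being removed.

```python
# Toy stand-in for a destructive delete guarded by dry_run.
def delete_entries(entries, restriction, dry_run=True):
    """Delete entries matching a restriction; preview only when dry_run."""
    matches = [e for e in entries if restriction(e)]
    if dry_run:
        return matches  # nothing removed; caller inspects first
    for match in matches:
        entries.remove(match)
    return matches

table = [{"merge_id": 1}, {"merge_id": 2}]
preview = delete_entries(table, lambda e: e["merge_id"] == 1)
print(len(preview), len(table))  # 1 2 -> previewed, nothing deleted
delete_entries(table, lambda e: e["merge_id"] == 1, dry_run=False)
print(len(table))  # 1
```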
To delete all merge table entries associated with an NWB file, use delete_downstream_merge with the Nwbfile table.
delete_downstream_merge(
table=sgc.Nwbfile,
restriction={"nwb_file_name": nwb_copy_file_name},
dry_run=True,
recurse_level=3, # for long pipelines with many tables
)
Up Next¶
In the next notebook, we'll start working with ephys data with spike sorting.