Spyglass Mixin¶
The Spyglass Mixin provides a way to centralize all Spyglass-specific functionalities that have been added to DataJoint tables. This includes...
- Fetching NWB files
- Long-distance restrictions.
- Permission checks on delete
- Export logging. See export doc for more information.
- Miscellaneous helper functions
To add this functionality to your own tables, simply inherit from the mixin:
import datajoint as dj
from spyglass.utils import SpyglassMixin
schema = dj.schema("my_schema")
@schema
class MyOldTable(dj.Manual):
pass
@schema
class MyNewTable(SpyglassMixin, dj.Manual):
pass
NOTE: The mixin must be the first class inherited from in order to override default DataJoint functions.
Fetching NWB Files¶
Many tables in Spyglass inheris from central tables that store records of NWB files. Rather than adding a helper function to each table, the mixin provides a single function to access these files from any table.
This function will look at the table definition to determine if the raw file
should be fetched from Nwbfile
or an analysis file should be fetched from
AnalysisNwbfile
. If neither is foreign-key-referenced, the function will refer
to a _nwb_table
attribute.
Long-Distance Restrictions¶
In complicated pipelines like Spyglass, there are often tables that 'bury' their foreign keys as secondary keys. This is done to avoid having to pass a long list of foreign keys through the pipeline, potentially hitting SQL limits (see also Merge Tables). This burrying makes it difficult to restrict a given table by familiar attributes.
Spyglass provides a function, restrict_by
, to handle this. The function takes
your restriction and checks parents/children until the restriction can be
applied. Spyglass introduces <<
as a shorthand for restrict_by
an upstream
key and >>
as a shorthand for restrict_by
a downstream key.
from spyglass.example import AnyTable
AnyTable() << 'upstream_attribute="value"'
AnyTable() >> 'downstream_attribute="value"'
# Equivalent to
AnyTable().restrict_by('downstream_attribute="value"', direction="down")
AnyTable().restrict_by('upstream_attribute="value"', direction="up")
Some caveats to this function:
- 'Peripheral' tables, like
IntervalList
andAnalysisNwbfile
make it hard to determine the correct parent/child relationship and have been removed from this search by default. - This function will raise an error if it attempts to check a table that has not been imported into the current namespace. It is best used for exploring and debugging, not for production code.
- It's hard to determine the attributes in a mixed dictionary/string restriction. If you are having trouble, try using a pure string restriction.
- The most direct path to your restriction may not be the path your data took, especially when using Merge Tables. When the result is empty see the warning about the path used. Then, ban tables from the search to force a different path.
my_table = MyTable() # must be instanced
my_table.ban_search_table(UnwantedTable1)
my_table.ban_search_table([UnwantedTable2, UnwantedTable3])
my_table.unban_search_table(UnwantedTable3)
my_table.see_banned_tables()
my_table << my_restriction
my_table << upstream_restriction >> downstream_restriction
When providing a restriction of the parent, use 'up' direction. When providing a restriction of the child, use 'down' direction.
Delete Permission Checks¶
By default, DataJoint is unable to set delete permissions on a per-table basis. If a user is able to delete entries in a given table, she can delete entries in any table in the schema.
The mixin relies on the Session.Experimenter
and LabTeams
tables to ...
- Check the session and experimenter associated with the attempted deletion.
- Check the lab teams associated with the session experimenter and the user.
If the user shares a lab team with the session experimenter, the deletion is permitted.
This is not secure system and is not a replacement for database backups (see database management). A user could readily curcumvent the default permission checks by adding themselves to the relevant team or removing the mixin from the class declaration. However, it provides a reasonable level of security for the average user.
Because parts of this process rely on caching, this process will be faster if you assign the instanced table to a variable.
# Slower
YourTable().delete()
YourTable().delete()
# Faster
nwbfile = YourTable()
nwbfile.delete()
nwbfile.delete()
Deprecated delete feature
Previous versions of Spyglass also deleted masters of parts with foreign key references. This functionality has been migrated to DataJoint in version 0.14.2 via the `force_masters` delete argument. This argument is `True` by default in Spyglass tables.Populate Calls¶
The mixin also overrides the default populate
function to provide additional
functionality for non-daemon process pools and disabling transaction protection.
Non-Daemon Process Pools¶
To allow the make
function to spawn a new process pool, the mixin overrides
the default populate
function for tables with _parallel_make
set to True
.
See issue #1000 and
PR #1001 for more
information.
Disable Transaction Protection¶
By default, DataJoint wraps the populate
function in a transaction to ensure
data integrity (see
Transactions).
This can cause issues when populating large tables if another user attempts to declare/modify a table while the transaction is open (see issue #1030 and DataJoint issue #1170).
Tables with _use_transaction
set to False
will not be wrapped in a
transaction when calling populate
. Transaction protection is replaced by a
hash of upstream data to ensure no changes are made to the table during the
unprotected populate. The additional time required to hash the data is a
trade-off for already time-consuming populates, but avoids blocking other users.
Miscellaneous Helper functions¶
file_like
allows you to restrict a table using a substring of a file name.
This is equivalent to the following:
MyTable().file_like("eg")
MyTable() & ('nwb_file_name LIKE "%eg%" OR analysis_file_name LIKE "%eg%"')
find_insert_fail
is a helper function to find the cause of an IntegrityError
when inserting into a table. This checks parent tables for required keys.
my_key = {"key": "value"}
MyTable().insert1(my_key) # Raises IntegrityError
MyTable().find_insert_fail(my_key) # Shows the parent(s) missing the key
Populate Calls¶
The mixin also overrides the default populate
function to provide additional
functionality for non-daemon process pools and disabling transaction protection.
Non-Daemon Process Pools¶
To allow the make
function to spawn a new process pool, the mixin overrides
the default populate
function for tables with _parallel_make
set to True
.
See issue #1000 and
PR #1001 for more
information.
Disable Transaction Protection¶
By default, DataJoint wraps the populate
function in a transaction to ensure
data integrity (see
Transactions).
This can cause issues when populating large tables if another user attempts to declare/modify a table while the transaction is open (see issue #1030 and DataJoint issue #1170).
Tables with _use_transaction
set to False
will not be wrapped in a
transaction when calling populate
. Transaction protection is replaced by a
hash of upstream data to ensure no changes are made to the table during the
unprotected populate. The additional time required to hash the data is a
trade-off for already time-consuming populates, but avoids blocking other users.