offspect.cache.file

CacheFile

The Python interface to the CacheFile, which checks for filename validity during instantiation. When one of its properties is accessed, it loads and parses the metadata and datasets fresh from the hdf5 and aggregates them.

Examples

Peek

The most straightforward example would be loading a CacheFile and printing its content.

from offspect.api import CacheFile
cf = CacheFile("example.hdf5")
print(cf)

Iterate

Another use case would be printing a TraceAttribute across all traces in the file, using the iterator interface of the CacheFile, which returns data and attributes of each Trace.

from offspect.api import CacheFile
cf = CacheFile("example.hdf5")
for data, attrs in cf:
    print("rejected?:", attrs["reject"])

Manipulate

We can change the value for a key of the annotations of a specific trace by reading its attributes with CacheFile.get_trace_attrs() at a specific index. Please note that the values of the attrs are stored as strings in the CacheFile. To manipulate them properly, decode them into their respective type first, and encode them again before you set the attributes with set_trace_attrs(). In the following example we assign a new value, so only encode() is needed.

from offspect.api import CacheFile, encode
cf = CacheFile("example.hdf5")
attrs = cf.get_trace_attrs(0)
attrs["stimulation_intensity_mso"] = encode(66)
cf.set_trace_attrs(0, attrs)
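
The reason for encoding can be illustrated without a CacheFile at hand. The sketch below uses hypothetical stand-ins, toy_encode and toy_decode, built on the standard json module. They are not offspect's encode() and decode(), whose serialization format may differ. The sketch only shows why a value stored as a string has to be decoded before arithmetic and encoded again before storing.

```python
import json


def toy_encode(value):
    """Hypothetical stand-in for encode(): serialize a value to a string."""
    return json.dumps(value)


def toy_decode(text):
    """Hypothetical stand-in for decode(): restore the typed value."""
    return json.loads(text)


# a plain dict standing in for the attributes of one trace
attrs = {"stimulation_intensity_mso": toy_encode(66)}

stored = attrs["stimulation_intensity_mso"]
print(type(stored).__name__)  # the stored value is a string, not an int

# decode, manipulate with the proper type, then encode before storing
intensity = toy_decode(stored)
attrs["stimulation_intensity_mso"] = toy_encode(intensity + 4)
print(attrs["stimulation_intensity_mso"])
```

Adding 4 directly to the stored string would raise a TypeError, which is why the round trip through decode and encode is needed.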

Batch-Manipulate

Another typical use case would be changing one TraceAttribute across all traces in the file. Here, we iterate across all traces, and shift the onset of the TMS 5 samples to the right.

from offspect.api import CacheFile, decode, encode
cf = CacheFile("merged.hdf5")
for ix, (data, attrs) in enumerate(cf):
    key = "onset_shift"
    old = decode(attrs[key])
    print(f"Trace {ix} {key}:", old, end=" ")
    new = old + 5
    attrs["onset_shift"] = encode(new)
    cf.set_trace_attrs(ix, attrs)
    test = decode(cf.get_trace_attrs(ix)["onset_shift"])
    print("to", test)

Plotting

Eventually, and ideally after visual inspection, you might want to plot the resulting map. You can do so using plot_map(), as in the following example.

from offspect.api import plot_map, CacheFile
# we load a cachefile
cf = CacheFile("example.hdf5")
# and plot and show it.
display = plot_map(cf)
display.show()
# you can also save the figure with
display.savefig("example_map.png")

There is a variety of options to tune the plotting to your whims. For example, you can normalize the values, e.g. by taking the logarithm or by thresholding, by passing a suitable Callable as the foo argument. Note that we add 1 to be able to deal with a Vpp of 0 from e.g. MEP-negative traces.

from math import log10
# taking the log10
plot_map(cf, foo = lambda x : log10(x + 1))
# thresholding
def threshold(x):
    return float(x>50)
plot_map(cf, foo = threshold)
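
Before passing a callable to plot_map(), it can help to check what it does to a few numbers. The following sketch assumes that foo is applied to each value individually, as the scalar math.log10 in the example above suggests. The Vpp values are made up for illustration, and log_scale is just a named version of the lambda above.

```python
from math import log10


def log_scale(x):
    """Compress large values; adding 1 maps a Vpp of 0 to 0."""
    return log10(x + 1)


def threshold(x):
    """Binarize: 1.0 for values above 50, else 0.0."""
    return float(x > 50)


# made-up Vpp values, for illustration only
vpps = [0, 9, 50, 99, 999]
print([log_scale(v) for v in vpps])
print([threshold(v) for v in vpps])
```

Without the added 1, log10(0) would raise a ValueError for MEP-negative traces.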

Additionally, you can use all the keywords from plot_glass() to beautify your plot.

plot_map(cf, vmax=100, title="Example", smooth=25)

class CacheFile(fname)

instantiate a new CacheFile object from a file on the hard drive

Parameters
  • fname (FileName) – path to the file

For each readout, a specific set of fields must be in the metadata of a trace. Whenever attributes are read or written, the validity of the metadata will automatically be checked to be consistent with its 'readout'.

__iter__()

iterate over all traces in the cachefile

Returns

  • data (TraceData) – the data of this trace

  • attrs (TraceAttributes) – the attributes of this trace

get_trace_attrs(idx)

read the TraceAttributes for a specific trace in the file

Parameters

idx (int) – which trace to pick.

Returns

attrs (TraceAttributes) – the collapsed attributes for this trace.

Example:

from offspect.api import CacheFile
cf = CacheFile("example.hdf5")
# visit every trace by its index
for i in range(len(cf)):
    attrs = cf.get_trace_attrs(i)

Note

The TraceAttributes contain the metadata of this trace, and the metadata of its parent group, i.e. sourcefile. Additionally, two fields will be added, containing information about the ‘cache_file’ and the ‘cache_file_index’. The number of fields is therefore larger than the number of fields valid for TraceAttributes according to filter_trace_attrs(). This is no problem, because when you update with set_trace_attrs(), these fields will be used for safety checks and subsequently discarded.
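
The collapsing and later discarding of fields described in this note can be pictured with plain dictionaries. This is only a sketch, not offspect's implementation: collapse and keep_trace_fields are hypothetical helpers, the field names other than ‘cache_file’ and ‘cache_file_index’ are invented, and how the index is stored is an assumption.

```python
# hypothetical field names and values, for illustration only
group_attrs = {"origin": "recording.xdf", "readout": "some_readout"}
trace_attrs = {"reject": "False", "onset_shift": "0"}


def collapse(group, trace, cache_file, index):
    """Merge group and trace metadata and add the two bookkeeping fields."""
    collapsed = {**group, **trace}
    collapsed["cache_file"] = cache_file
    collapsed["cache_file_index"] = index  # storage type is an assumption
    return collapsed


def keep_trace_fields(attrs, valid_fields):
    """Keep only the fields that belong to the trace itself."""
    return {key: value for key, value in attrs.items() if key in valid_fields}


attrs = collapse(group_attrs, trace_attrs, "example.hdf5", 0)
print(sorted(attrs))

# on writing back, group-level and bookkeeping fields are dropped again
saved = keep_trace_fields(attrs, valid_fields=set(trace_attrs))
print(saved == trace_attrs)
```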

get_trace_data(idx)

return TraceData for a specific trace in the file

Parameters

idx (int) – which trace to pick.

Returns

data (TraceData) – the data stored for this trace.

Note

This is a read-only attribute, and raw data can never be overwritten with the CacheFile interface. If you need to perform any preprocessing steps, manage the TraceData with a low-level interface, e.g. populate().

property origins

returns a list of original files used in creating this cachefile

set_trace_attrs(idx, attrs)

update the attributes of a specific trace

Parameters
  • idx (int) – at which index to overwrite

  • attrs (TraceAttributes) – with which attributes to overwrite

Example:

import datetime
from offspect.api import CacheFile
now = str(datetime.datetime.now())
cf = CacheFile("example.hdf5")
attrs = cf.get_trace_attrs(0)
attrs["comment"] = now
cf.set_trace_attrs(0, attrs)

Note

Because we expect the TraceAttributes to originate from a CacheFile's get_trace_attrs() method, we expect them to include information about their original file and index. For safety reasons, you have to specify the index when calling this setter. Additionally, the original file must be this instance of CacheFile. If you want to directly overwrite an arbitrary attribute without these safety checks, update the values for original_file and original_index and use update_trace_attributes(). Please also note that while get_trace_attrs() returns a complete dictionary of attributes, including those that apply to the whole group or origin file, only valid fields for trace metadata will be saved, i.e. those fields which correspond to the “readout” parameter (see filter_trace_attrs()).

asdict(attrs)

parse the metadata from a cachefile and return it as dictionary

merge(to, sources)

merge one or more cachefiles into one file

Parameters
  • to (FileName) – the name of the file to be written into. Will be overwritten if it already exists

  • sources (List[FileName]) – a list of source files from which we will read traces and annotations

Returns

fname (FileName) – the name of the target file

parse_traceattrs(attrs)

parse any metadata from a cachefile and return it as Dict

parse_tracedata(dset)

parse a hdf5 dataset from a cachefile and return it as a trace

populate(tf, annotations, traceslist)

create a new cachefile from annotations and traces

Parameters
  • tf (FileName) – the name of the file to be created. will overwrite an existing file

  • annotations (List[Attributes]) – a list of annotation dictionaries

  • traceslist (List[List[TraceData]]) – a list of list of traces

Returns

fname (FileName) – the path to the freshly populated cachefile

read_file = functools.partial(<class 'h5py._hl.files.File'>, mode='r', libver='latest', swmr=True)

open an hdf5 file for reading in single-writer-multiple-reader (SWMR) mode
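
The read_file entry is a functools.partial object, i.e. a callable with some keyword arguments already bound. The sketch below shows the mechanism with a hypothetical stand-in, open_file, in place of h5py.File, so that it runs without h5py or an actual hdf5 file. The name toy_read_file is likewise only illustrative.

```python
from functools import partial


def open_file(name, mode="a", libver="earliest", swmr=False):
    """Hypothetical stand-in for h5py.File that only records its arguments."""
    return {"name": name, "mode": mode, "libver": libver, "swmr": swmr}


# pre-bind the keyword arguments, mirroring the read_file definition
toy_read_file = partial(open_file, mode="r", libver="latest", swmr=True)

# callers now only pass the filename
handle = toy_read_file("example.hdf5")
print(handle)

# the bound keywords stay inspectable on the partial object
print(toy_read_file.keywords)
```

Binding the arguments once keeps every read access in the module on the same mode and library version.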

read_trace(cf, idx, what='attrs')

read either data or attributes for a specific trace

Parameters
  • cf (CacheFile) – for which file

  • idx (int) – which trace to load

  • what (str) – whether to load ‘data’ or ‘attrs’. Defaults to ‘attrs’

recover_annotations(cf)

recover the file and annotations from a cachefile

Parameters

cf (CacheFile) – the cachefile from which to recover

Returns

annotations (List[Annotations]) – a list of annotations, where annotations are the collapsed metadata of all sourcefiles in the cachefile organized as [sourcefiles][Annotations]

recover_parts(cf)

recover the two parts of a cachefile, i.e. annotations and traces

Parameters

cf (CacheFile) – the cachefile from which to recover

Returns

  • annotations (List[Annotations]) – a list of annotations, i.e. the metadata of all sourcefiles in the cachefile organized as [sourcefiles][Annotations]

  • traces (List[List[TraceData]]) – a list of the traces of all sourcefiles saved in the cachefile organized as [sourcefiles][traceidx][TraceData]
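
The nested [sourcefiles][traceidx] layout can be related to the flat trace index used by the CacheFile with a small sketch. It assumes that the flat index counts traces sourcefile by sourcefile in order, which this documentation does not state explicitly. The helper locate and all values are hypothetical and only illustrate the nesting.

```python
# made-up nested structure: two sourcefiles with 2 and 3 traces
traces = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]],
]
annotations = [{"origin": "a.xdf"}, {"origin": "b.xdf"}]


def locate(traces, flat_idx):
    """Map a flat trace index to (sourcefile index, trace index)."""
    if flat_idx < 0:
        raise IndexError("trace index out of range")
    for source_idx, source_traces in enumerate(traces):
        if flat_idx < len(source_traces):
            return source_idx, flat_idx
        # skip past all traces of this sourcefile
        flat_idx -= len(source_traces)
    raise IndexError("trace index out of range")


source_idx, trace_idx = locate(traces, 3)
print(annotations[source_idx]["origin"], traces[source_idx][trace_idx])
```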

sort_keys(okeys)

update_trace_attributes(attrs)

overwrite the traceattributes for a trace

the original file and index of the trace are specified as fields within the TraceAttributes

Parameters

attrs (TraceAttributes) – the attributes to write, including the original file and index of the trace

write_file = functools.partial(<class 'h5py._hl.files.File'>, mode='r+', libver='latest', swmr=True)

open an hdf5 file for reading and writing in single-writer-multiple-reader (SWMR) mode

write_tracedata(cf, data, idx)