Notebook 1: Loading and converting ndf files to hdf5 files¶
In this notebook we go through converting .ndf files to .h5 files and adding to them the features used for classification.
In [1]:
import sys
import os
Loading the pyecog module¶
The easiest place to download and run this notebook is the pyecog directory downloaded from GitHub, e.g. “pyecog-Development”, because the pyecog module will be found in this folder. However, if you want to run the notebook from elsewhere on your computer, you first need to make sure that python can find the pyecog module using sys.path.append(). To do this, modify the following code, copy it into a cell, and run it (shift+enter).
pyecog_path = '/home/jonathan/git_repos/pyecog' # replace this with the pyecog's location on your computer
sys.path.append(pyecog_path)
If you are on windows, you have to deal with the problem that backslashes in your paths (when you copy them) are treated as escape characters by python. Prefixing the string with ‘r’ prevents this, as python creates a “raw” string literal in which backslashes are treated as literal characters, not escape characters.
pyecog_path = r'C:\home\jonathan\git_repos\pyecog' # replace this with the pyecog's location on your computer
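The effect of the ‘r’ prefix is easy to see by comparing string lengths. A minimal sketch, using a made-up path chosen only because it contains the escape sequences \t and \n:

```python
# Illustrative path only: '\t' and '\n' are escape sequences in a normal string.
plain_path = 'C:\temp\new_data'
raw_path = r'C:\temp\new_data'

# In the plain string, \t became a tab and \n became a newline,
# so each backslash-letter pair collapsed into a single character.
print(len(plain_path), len(raw_path))
print('\t' in plain_path, '\t' in raw_path)
```

The raw string keeps both backslashes, so it is the one that points at a real location on disk.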
In [2]:
# note if you are in the directory downloaded from github you do not have to run this cell
pyecog_path = '/home/jonathan/git_repos/pyecog'
# e.g. if on a windows computer
# pyecog_path = r'C:\Users\jonathan\Desktop\pyecog-Development'
sys.path.append(pyecog_path)
In [3]:
# we can now import pyecog and check it is located where we expect
import pyecog as pg
pg # check the module is imported from where you expect
Out[3]:
<module 'pyecog' from '/home/jonathan/git_repos/pyecog/pyecog/__init__.py'>
Warnings and errors!¶
- If you have an error message like:
ImportError: No module named 'pyecog'
This probably means you haven't set the "pyecog_path" variable correctly.
- If you have a different warning or error, check that you have activated your pyecog environment before running jupyter notebook. On a windows machine, in anaconda prompt:
>> activate pyecog_env # "source activate pyecog_env" for mac or linux
>> jupyter notebook
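To tell these two cases apart, it can help to ask python directly whether it can locate the module and which interpreter the notebook is running. A minimal sketch using only the standard library; the helper name can_find_module is made up for this example and is not part of pyecog:

```python
import importlib.util
import sys

def can_find_module(name):
    """Return True if python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None

# The interpreter path shows which environment the notebook is running in.
print('Python executable:', sys.executable)
print('pyecog found:', can_find_module('pyecog'))
```

If the executable is not inside your pyecog environment, the environment was probably not activated; if it is, but pyecog is not found, the path passed to sys.path.append() is the more likely culprit.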
Converting ndf files to HDF5 format¶
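The conversion has two stages: first the ndf files are converted to h5 files, then features used for prediction are added to the h5 files.
To begin, we define variables for the ndf and h5 folder paths.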
In [4]:
ndf_folderpath = r'D:\2019\rat ivc\ndf\01_baseline_01'
In [5]:
ndf_filenames = os.listdir(ndf_folderpath)
print('Found', len(ndf_filenames), '.ndf files to convert to h5 file format')
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-5-a7ae3e3e88a0> in <module>()
----> 1 ndf_filenames = os.listdir(ndf_folderpath)
2 print('Found', len(ndf_filenames), '.ndf files to convert to h5 file format')
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\2019\\rat ivc\\ndf\\01_baseline_01'
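The FileNotFoundError above simply means that the folder stored in ndf_folderpath does not exist on the computer running the notebook. Checking the folder before listing it gives a clearer message, and filtering by extension makes the printed count reflect only .ndf files, since os.listdir returns everything in the folder. This is a sketch; list_ndf_files is a hypothetical helper, not a pyecog function:

```python
import os

def list_ndf_files(folder):
    """Return sorted names of the visible .ndf files in `folder`."""
    if not os.path.isdir(folder):
        raise FileNotFoundError('ndf folder not found, check the path: ' + folder)
    # Skip hidden files (starting with '.') and anything that is not .ndf
    return sorted(f for f in os.listdir(folder)
                  if f.endswith('.ndf') and not f.startswith('.'))
```

With this helper, the cell above could instead print len(list_ndf_files(ndf_folderpath)).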
In [6]:
h5_folderpath = r'D:\2019\analysis pyecog\01_baselineconverted_01'
# make folder for conversion if it doesn't already exist
if not os.path.exists(h5_folderpath):
    os.makedirs(h5_folderpath)
1. Convert an ndf folder to h5 files¶
Below we instantiate a DataHandler instance. Its convert_ndf_directory_to_h5 method has the following arguments:
- ndf_dir : path to the directory containing ndf files
- save_dir : path to the directory in which to save h5 files
- fs : sampling rate in Hz - int e.g. 256, 512, 1024, or 'auto' (not recommended)
- tids : Transmitter ids to convert. 'all', or list of ids e.g. [88,89,92,94]
- n_cores : Number of cores to use for conversion. Either int or -1 for all.
- glitch_detection : boolean flag, True or False. Specifies whether to apply glitch detection.
- high_pass_filter : boolean flag, True or False. Specifies whether to apply a 1 Hz high pass filter.
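As an illustration of the n_cores convention, -1 can be resolved to the machine's core count with os.cpu_count(). This sketch only shows the idea; resolve_n_cores is a hypothetical helper, and pyecog's own handling may differ:

```python
import os

def resolve_n_cores(n_cores):
    """Translate -1 into the number of available cores, capping larger requests."""
    available = os.cpu_count() or 1  # cpu_count() can return None
    if n_cores == -1:
        return available
    if n_cores < 1:
        raise ValueError('n_cores must be a positive int or -1')
    return min(n_cores, available)

print('Cores that would be used for n_cores=-1:', resolve_n_cores(-1))
```

Leaving one or two cores free (for example n_cores = 3 on a four core machine) keeps the computer responsive during a long conversion.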
In [7]:
handler = pg.DataHandler()
In [8]:
sampling_freq = 512
In [9]:
handler.convert_ndf_directory_to_h5(ndf_dir = ndf_folderpath,
                                    save_dir = h5_folderpath,
                                    fs = sampling_freq,
                                    tids = 'all', # [1,2,4,6,9]
                                    n_cores = 4, # -1 for all cores
                                    glitch_detection=True,
                                    high_pass_filter=True)
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-9-baa09b819f99> in <module>()
5 n_cores = 4, # -1 for all cores
6 glitch_detection=True,
----> 7 high_pass_filter=True)
~/git_repos/pyecog/pyecog/ndf/datahandler.py in convert_ndf_directory_to_h5(self, ndf_dir, tids, save_dir, n_cores, fs, glitch_detection, high_pass_filter, gui_object)
590 gui_object = gui_object
591
--> 592 files = [f for f in self.fullpath_listdir(ndf_dir) if f.endswith('.ndf')]
593 if type(tids)=='tid': tids = tids.strip(' ')
594
~/git_repos/pyecog/pyecog/ndf/datahandler.py in fullpath_listdir(d)
560 def fullpath_listdir(d):
561 ''' returns full filepath, excludes hidden files, starting with .'''
--> 562 return [os.path.join(d, f) for f in os.listdir(d) if not f.startswith('.')]
563
564 def convert_ndf_directory_to_h5(self, ndf_dir,
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\2019\\rat ivc\\ndf\\01_baseline_01'
In [10]:
os.listdir(h5_folderpath)[:5]
Out[10]:
[]
2. Add features to h5 files to be used for prediction¶
The parallel_add_prediction_features method has the following arguments:
- h5py_folder : path to the directory containing h5 files
- n_cores : number of cores to use for adding features. Either int or -1 for all.
- timewindow : length in seconds of the chunks into which the data is split before features are calculated.
- overwrite_features : boolean. Whether to overwrite features that already exist in the h5 files.
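To make timewindow concrete: at the 512 Hz sampling rate used for the conversion, a 5 second window holds 2560 samples. The sketch below works through the arithmetic for a hypothetical one hour recording; the recording length is an assumption for illustration, and how pyecog treats a trailing partial chunk is not covered here.

```python
fs = 512            # sampling rate in Hz, as used for the conversion
timewindow = 5      # chunk length in seconds
duration_s = 3600   # assumed recording length, for illustration only

samples_per_chunk = fs * timewindow
n_chunks = (fs * duration_s) // samples_per_chunk  # whole chunks only

print(samples_per_chunk, 'samples per chunk,', n_chunks, 'chunks')
```

A shorter timewindow gives more chunks and finer time resolution, at the cost of fewer samples per chunk from which to calculate each feature.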
In [11]:
handler.parallel_add_prediction_features(h5py_folder = h5_folderpath,
n_cores=3,
timewindow=5,
overwrite_features=False)
Adding features to transmitters in 0 h5 files in D:\2019\analysis pyecog\01_baselineconverted_01
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-11-b2c9ed71c732> in <module>()
2 n_cores=3,
3 timewindow=5,
----> 4 overwrite_features=False)
~/git_repos/pyecog/pyecog/ndf/datahandler.py in parallel_add_prediction_features(self, h5py_folder, n_cores, timewindow, overwrite_features, gui_object)
249
250 print( 'Adding features to transmitters in '+str(l)+ ' h5 files in '+ h5py_folder)
--> 251 self.printProgress(0,l, prefix = 'Progress:', suffix = 'Complete', barLength = 50)
252 for i, _ in enumerate(pool.imap(self.add_predicition_features_to_h5_file, files_to_add_features), 1):
253 self.printProgress(i, l, prefix = 'Progress:', suffix = 'Complete', barLength = 50)
~/git_repos/pyecog/pyecog/ndf/datahandler.py in printProgress(self, iteration, total, prefix, suffix, decimals, barLength)
729 """
730 str_format = "{0:." + str(decimals) + "f}"
--> 731 percents = str_format.format(100 * (iteration / float(total)))
732 filled_length = int(round(barLength * iteration / float(total)))
733 bar = '*' * filled_length + '-' * (barLength - filled_length)
ZeroDivisionError: float division by zero
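The ZeroDivisionError above is a consequence of the earlier failure: no ndf files were converted, so the h5 folder is empty and the progress bar divides by a total of zero. One way to catch this early is to count the h5 files before adding features. A minimal sketch; count_h5_files and check_h5_folder are hypothetical helpers rather than part of pyecog:

```python
import os

def count_h5_files(folder):
    """Count the visible .h5 files in `folder`."""
    return sum(1 for f in os.listdir(folder)
               if f.endswith('.h5') and not f.startswith('.'))

def check_h5_folder(folder):
    """Raise a descriptive error if there is nothing to add features to."""
    n_files = count_h5_files(folder)
    if n_files == 0:
        raise RuntimeError('No .h5 files in ' + folder +
                           ', run the ndf conversion first')
    return n_files
```

Calling check_h5_folder(h5_folderpath) before parallel_add_prediction_features would then stop with a readable message instead of the traceback above.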