Baby Language Lab Scripts
A collection of data processing tools.
parsers.trs_splitter.TRSSplitter Class Reference

This class splits a TRS file into chunks of a given length, writing the split files to a user-specified directory.


Public Member Functions

def __init__
 Constructor.
 
def split
 Splits the TRS file.
 

Public Attributes

 logger
 
 dest_path
 
 filename_base
 
 filename
 
 tree
 

Private Member Functions

def _get_next_speaker_num
 Speakers are given string ids 'spk0', 'spk1', etc.
 
def _insert_void_speaker
 Inserts a speaker with code 'VOID' into the XML file.
 
def _get_time_str
 Constructs a string in the format "hh:mm:ss.ss" from a total seconds count.
 
def _get_void_section
 Constructs a section element (with a turn subelement) for the void speaker, for the specified time period.
 
def _build_episode
 Each TRS file contains a single <episode> tag that encloses all <turn> tags.
 

Detailed Description

This class splits a TRS file into chunks of a given length, writing the split files to a user-specified directory.

Definition at line 11 of file trs_splitter.py.

Constructor & Destructor Documentation

def parsers.trs_splitter.TRSSplitter.__init__ (   self,
  filename,
  dest_path 
)

Constructor.

Parameters
self
filename(string) name of the TRS file to split (absolute path)
dest_path(string) directory in which to store the split TRS files (absolute path)

Definition at line 16 of file trs_splitter.py.

Member Function Documentation

def parsers.trs_splitter.TRSSplitter._build_episode (   self,
  sections,
  start_offset,
  win_len,
  void_speaker_num,
  progress_update_fcn 
)
private

Each TRS file contains a single <episode> tag that encloses all <turn> tags.

This method constructs a new <episode> tag containing as many sections as will fit into the time period specified by win_len. The result can be used to write a new TRS file. If a single section is longer than win_len, that section is placed in an episode of its own and written as a single file.

Parameters
self
start_offset(int) section index at which to start building the episode
win_len(float) The size of the chunks we want to split this file into (specified in seconds)
void_speaker_num(int) next available speaker integer - see _get_next_speaker_num()
progress_update_fcn(function=None) function that updates the progress bar, accepting a single parameter, a real number in [0.0, 1.0]
Returns
(Element, int, float, float) New Episode XML element, index of the last segment we stuffed into it, start time of the first segment in the episode, end time of the last segment in the episode

Definition at line 118 of file trs_splitter.py.
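
The packing rule described above can be illustrated with a minimal sketch. This is not the code in trs_splitter.py: the function name build_episode, the element name "Section", and the attribute names "startTime" and "endTime" are assumptions made for illustration, and the void_speaker_num and progress_update_fcn parameters are left out.

```python
import xml.etree.ElementTree as ET


def build_episode(sections, start_offset, win_len):
    """Hypothetical greedy packer: fill one episode with consecutive sections.

    Assumes each section element carries 'startTime' and 'endTime'
    attributes holding seconds (an assumption, not confirmed by the source).
    """
    episode = ET.Element("Episode")
    start_time = float(sections[start_offset].get("startTime"))
    end_time = start_time
    last_index = start_offset - 1
    for i in range(start_offset, len(sections)):
        sec_end = float(sections[i].get("endTime"))
        # The first section is always accepted, even if it alone is longer
        # than win_len; later sections must keep the episode within win_len.
        if i > start_offset and sec_end - start_time > win_len:
            break
        episode.append(sections[i])
        end_time = sec_end
        last_index = i
    return episode, last_index, start_time, end_time
```

The four return values mirror the tuple documented above: the new episode element, the index of the last section placed in it, and the start and end times of the span it covers.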

def parsers.trs_splitter.TRSSplitter._get_next_speaker_num (   self)
private

Speakers are given string ids 'spk0', 'spk1', etc.

This method retrieves the integer from the next available id.

Parameters
self
Returns
(int) next available id for a Speaker

Definition at line 33 of file trs_splitter.py.
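
As a rough illustration of the id scheme, the sketch below scans speaker ids of the form 'spkN' and returns one more than the highest N found. The function name next_speaker_num, the element name "Speaker", and the "id" attribute are assumptions, as is the choice to treat "next available" as "highest used plus one"; the actual method may differ.

```python
import re
import xml.etree.ElementTree as ET


def next_speaker_num(tree):
    """Hypothetical helper: return the next unused integer for an 'spkN' id."""
    used = []
    # Assumed layout: <Speaker id="spk0" .../> elements somewhere in the tree.
    for speaker in tree.iter("Speaker"):
        match = re.fullmatch(r"spk(\d+)", speaker.get("id", ""))
        if match:
            used.append(int(match.group(1)))
    # With no speakers present, numbering starts at 0.
    return max(used) + 1 if used else 0
```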

def parsers.trs_splitter.TRSSplitter._get_time_str (   self,
  total_sec 
)
private

Constructs a string in the format "hh:mm:ss.ss" from a total seconds count.

Parameters
self
total_sec(float) The total second count to convert to the specified format
Returns
(string) the formatted result, as indicated above

Definition at line 59 of file trs_splitter.py.
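
One way to produce the "hh:mm:ss.ss" format is sketched below. The function name get_time_str is a stand-in, and rounding to whole hundredths before splitting into fields is a design choice of this sketch rather than something the source specifies; it avoids results such as "00:00:60.00".

```python
def get_time_str(total_sec):
    """Hypothetical stand-in for _get_time_str: seconds -> 'hh:mm:ss.ss'."""
    # Work in whole hundredths of a second so that rounding carries
    # correctly into the minutes and hours fields.
    total_cs = int(round(total_sec * 100))
    hours, rem = divmod(total_cs, 360000)
    minutes, rem = divmod(rem, 6000)
    seconds, centis = divmod(rem, 100)
    return "%02d:%02d:%02d.%02d" % (hours, minutes, seconds, centis)


print(get_time_str(3725.5))  # 01:02:05.50
```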

def parsers.trs_splitter.TRSSplitter._get_void_section (   self,
  start_time,
  end_time,
  speaker_num 
)
private

Constructs a section element (with a turn subelement) for the void speaker, for the specified time period.

Parameters
start_time(float) section start time, in seconds (specified as offset from beginning of file)
end_time(float) section end time, in seconds (specified as offset from beginning of file)
speaker_num(int) next available speaker integer - see _get_next_speaker_num()
Returns
(Element) etree "section" Element for the void speaker

Definition at line 105 of file trs_splitter.py.
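
A section of this shape might be built as in the following sketch. The function name void_section, the element names "Section" and "Turn", the attribute names, the "report" section type, and the two-decimal time formatting are all assumptions for illustration, not details confirmed by the source.

```python
import xml.etree.ElementTree as ET


def void_section(start_time, end_time, speaker_num):
    """Hypothetical builder for a padding section owned by the void speaker."""
    # Both the section and its single turn span the same time period.
    times = {"startTime": "%.2f" % start_time, "endTime": "%.2f" % end_time}
    section = ET.Element("Section", dict(times, type="report"))
    ET.SubElement(section, "Turn", dict(times, speaker="spk%d" % speaker_num))
    return section
```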

def parsers.trs_splitter.TRSSplitter._insert_void_speaker (   self,
  tree,
  speaker_num 
)
private

Inserts a speaker with code 'VOID' into the XML file.

This speaker is used to pad the start and end of the file (from time 0 to the start of the first segment, and from the end of the last segment to the end-of-file time). This is done so that when the split file is opened in Transcriber, the wav file will still sync up. This method modifies the "Speakers" tag at the top of a TRS file, which contains a list of all the speakers in the file. Nothing is returned; instead, the tree param is modified in place.

Parameters
self
tree(etree ElementTree) The XML tree in which to search for the speakers tag
speaker_num(int) the number for the new VOID speaker (should be unused by other speakers already present) - see _get_next_speaker_num()

Definition at line 45 of file trs_splitter.py.
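
The in-place modification could look roughly like the sketch below. The function name insert_void_speaker, the "Speaker" element name, and the use of a "name" attribute to carry the 'VOID' code are assumptions; the real method may set different or additional attributes.

```python
import xml.etree.ElementTree as ET


def insert_void_speaker(tree, speaker_num):
    """Hypothetical helper: add a VOID speaker to the tree's <Speakers> list."""
    speakers = tree.find(".//Speakers")
    if speakers is None:
        raise ValueError("no <Speakers> element found")
    # The tree is modified in place; nothing is returned.
    ET.SubElement(
        speakers,
        "Speaker",
        {"id": "spk%d" % speaker_num, "name": "VOID"},
    )
```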

def parsers.trs_splitter.TRSSplitter.split (   self,
  win_len,
  progress_update_fcn = None 
)

Splits the TRS file.

The split files are written to the destination directory (dest_path) given to the constructor.

Parameters
self
win_len(float) The size of the chunks we want to split this file into (specified in seconds)
progress_update_fcn(function=None) function that updates the progress bar, accepting a single parameter, a real number in [0.0, 1.0]

Definition at line 68 of file trs_splitter.py.
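
A caller-side sketch of a progress_update_fcn is shown below. It only relies on the documented contract (a single real number in [0.0, 1.0]); the helper name make_progress_printer and the text bar are illustrative, and the commented-out invocation at the end uses placeholder paths and has not been run against the actual module.

```python
def make_progress_printer(width=20):
    """Return a progress callback and the list of bars it has rendered."""
    rendered = []

    def update(fraction):
        # Clamp defensively in case a value falls slightly outside [0.0, 1.0].
        fraction = min(1.0, max(0.0, fraction))
        filled = int(round(fraction * width))
        bar = "[" + "#" * filled + "-" * (width - filled) + "]"
        bar += " %3d%%" % round(fraction * 100)
        rendered.append(bar)
        print(bar)

    return update, rendered


update, rendered = make_progress_printer(width=10)
update(0.5)  # prints "[#####-----]  50%"

# Hypothetical invocation, based only on the signatures documented here:
# from parsers.trs_splitter import TRSSplitter
# splitter = TRSSplitter("/abs/path/to/input.trs", "/abs/path/to/output_dir")
# splitter.split(300.0, update)  # 300-second chunks
```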

Member Data Documentation

parsers.trs_splitter.TRSSplitter.dest_path

Definition at line 18 of file trs_splitter.py.

parsers.trs_splitter.TRSSplitter.filename

Definition at line 21 of file trs_splitter.py.

parsers.trs_splitter.TRSSplitter.filename_base

Definition at line 20 of file trs_splitter.py.

parsers.trs_splitter.TRSSplitter.logger

Definition at line 17 of file trs_splitter.py.

parsers.trs_splitter.TRSSplitter.tree

Definition at line 24 of file trs_splitter.py.


The documentation for this class was generated from the following file: trs_splitter.py