Workflow is an online utility for constructing scientific workflows to assist with analysis of network data.
A basic vocabulary is useful for navigating within this tool.
|Actor:||a module which represents one step or task (e.g., low pass filter)|
|Category:||a node in the toolbox tree that has other categories or libraries attached|
|Channel:||a single curve or line that connects two actors together (also referred to as a wire)|
|Library:||a branch in the toolbox that has one or more actors attached to it|
|Port:||a single input or output attached to an actor to which a wire can connect|
|Toolbox:||a collection of actors|
|Workflow:||a user generated collection of actors and channels|
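The vocabulary above can be pictured as a small data model. The following is only an illustrative sketch, not the tool's internal representation: the class names mirror the glossary terms, while the field names (`direction`, `data_type`) and the rule that a channel must join an output port to an input port of the same type are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    """A single input or output attached to an actor."""
    actor: str       # name of the actor that owns this port
    name: str
    direction: str   # "in" or "out" (hypothetical field)
    data_type: str   # e.g. "wav" (hypothetical field)


@dataclass(frozen=True)
class Channel:
    """A wire connecting two actors through their ports."""
    source: Port
    target: Port


@dataclass
class Workflow:
    """A user-generated collection of actors and channels."""
    name: str
    actors: list = field(default_factory=list)
    channels: list = field(default_factory=list)

    def connect(self, source: Port, target: Port) -> Channel:
        # Assumed rule: wires run from an output port to an input port.
        if source.direction != "out" or target.direction != "in":
            raise ValueError("a channel runs from an output port to an input port")
        # Assumed rule: both ends must carry the same data type.
        if source.data_type != target.data_type:
            raise ValueError("port types do not match")
        channel = Channel(source, target)
        self.channels.append(channel)
        return channel


# Usage: wire a hypothetical hydrophone source into a low pass filter.
wf = Workflow("demo", actors=["HydrophoneData", "LowPass"])
wav_out = Port("HydrophoneData", "audio", "out", "wav")
wav_in = Port("LowPass", "audio", "in", "wav")
wf.connect(wav_out, wav_in)
```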
To create a private workflow:
- Right-click on Private under the Workflows tab
- Select 'New Workflow'
- Select the 'Toolboxes' tab
- Navigate through the treeview, and drag the desired actors into the 'Design' workspace.
- Use your mouse to connect actors (hovering over a port will indicate the expected input/output type).
- Double-click on an actor to adjust its parameters.
- Click Save.
- Enter a unique workflow name. Once saved, it will appear in your list of Private workflows under the 'Workflows' tab.
- When ready, click Run to execute your workflow. A message will appear when execution is completed.
- At this point, the results will be available as icons in the output actor and/or available in the FTP Directory.
Private workflows can be opened, edited and executed at any later time.
Public workflows provide a continually expanding resource for getting started on your data analysis.
To use a public workflow, simply select it and make the required changes. Saving these changes will make the changed version a private workflow that you can continue to modify.
Results of every executed workflow are stored in a separate timestamped folder within the FTP directory. An icon on the last output actor will provide a link to this directory. Within this directory, you will also find a text file called commands.txt that is the source code generated from the workflow during execution. This file can be useful in troubleshooting.
- Annotation Set - A pre-determined annotation set can be selected for training a classifier. To create an annotation set, use the Annotation Search (for more information, refer to Annotations Help ) to generate a list of annotations. Then you can save this list of annotations as a set.
- ARFF (Attribute-Relation File Format) - These feature files are text files describing data sets with features for training classifiers and making predictions. We follow the .arff format used by the Weka machine learning software. The files describe the features (attributes), their types, and the output class (which is used for training or for evaluating predictions). The last column is the annotation (if the data comes from a training set) or the source text filename (if it does not). The number of DATA entries is governed by the window size, the hop size and the length of the input wave file. For example, if the window size and hop size are both 512 (no overlap), there will be one entry for every 512 samples in the data stream. An example .arff file would be:
- Hydrophone Data - Multiple hydrophone files from the NEPTUNE Canada archive can be selected with this actor for a chosen hydrophone and time range.
- MPL File - Marsyas plugins are textual descriptions of networks of Marsyas processing objects (MarSystems) and their associated parameters. The networks operate at a finer granularity (buffers) and contain many more parameters than need to be exposed to a user. An MPL file can be treated as a file of text that is saved and loaded by Marsyas and does not require any interpretation. Other formats such as JSON are supported. MPL files are rather verbose and hard to read; however, they contain all the information necessary to run a network. They are primarily used in the Neptune modules to store trained classifiers.
- PNG File - image format used for plots (e.g., spectrogram, waveform)
- Timeline File - Timelines contain a list of TAB-separated events (one per line) in the format (start, end, label). The start and end times are in seconds with respect to the input WAV file. This format is used both for the annotation files of a training set and for the timeline output. Timelines can be created, saved and loaded using the Audacity open source audio editor. At least two distinct labels are needed to get any meaningful results out of feature extraction and classification. An example timeline would be:
- TXT File - Text files are used by a variety of actors, such as the SoX statistics actor, which outputs its results to a simple text file.
- WAV - audio files (e.g., for hydrophones)
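Two of the file-format details above, the number of ARFF data entries and the timeline layout, can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the tool's own code: `count_entries` assumes that only complete windows produce an entry (the actual handling of a trailing partial window may differ), and the labels `background` and `whale` are hypothetical.

```python
def count_entries(num_samples, window_size=512, hop_size=512):
    """Number of complete analysis windows in a stream of num_samples samples.

    Assumption: a trailing partial window is dropped rather than padded.
    """
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    if num_samples < window_size:
        return 0
    return (num_samples - window_size) // hop_size + 1


def parse_timeline(text):
    """Parse TAB-separated (start, end, label) events, one per line."""
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        start, end, label = line.split("\t", 2)
        events.append((float(start), float(end), label.strip()))
    return events


def label_at(events, seconds):
    """Return the label of the first event covering the given time, or None."""
    for start, end, label in events:
        if start <= seconds < end:
            return label
    return None


# Hypothetical timeline with two labels, times in seconds.
timeline_text = "0.0\t1.5\tbackground\n1.5\t3.0\twhale\n"
events = parse_timeline(timeline_text)

# With window size and hop size both 512, each block of 512 samples gives one entry.
entries = count_entries(1024, window_size=512, hop_size=512)
```

With these inputs, `entries` is 2 and `label_at(events, 2.0)` returns the hypothetical label `whale`. A smaller hop size than window size makes the windows overlap and therefore increases the number of entries for the same input.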
Marsyas (Music Analysis, Retrieval and Synthesis for Audio Signals) is an open source software framework for audio processing. It has been designed and written by George Tzanetakis with help from students and researchers from around the world. A subset of its functionality has been integrated into the Workflow tool. For specific details, refer to the Marsyas websites:
Marsyas modules available in the Workflow tool:
- Visualization Modules: Spectrogram, Waveform
- Audio Modules: LowPass, BandPass, HighPass, Normalize
- Feature Extraction Modules: SpectralCentroid, SpectralRolloff, SpectralFlux, MelFrequencyCepstralCoefficients, ZeroCrossings
- Classifier Modules: Gaussian, SVM, ZeroR
- Prediction Modules: TimeLine, ClassificationStatistics, Frames
These modules are all based on command-line tools included in the Marsyas repository.
SoX is a command-line utility that can apply various effects to sound files. A subset of its functionality has been integrated into the Workflow tool. For specific details, refer to its website: http://sox.sourceforge.net/Main/HomePage
The following SoX functionality is incorporated as actors in the Workflow tool (example command-line calls are given, but refer to the SoX documentation for the full list of parameters).
- stat (time and frequency domain stats)
- stats (time domain stats)
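As a rough illustration of what these two actors wrap, the sketch below builds and runs the corresponding SoX calls from Python. The helper names `sox_stats_command` and `run_sox_stats` and the file name `input.wav` are hypothetical; how the Workflow tool actually invokes SoX is not documented here. The sketch relies on two SoX behaviours: `-n` selects the null output file, and the `stat` and `stats` effects write their report to standard error.

```python
import shutil
import subprocess


def sox_stats_command(wav_path, effect="stats"):
    """Build the argument list for `sox <file> -n stat` or `sox <file> -n stats`."""
    if effect not in ("stat", "stats"):
        raise ValueError("effect must be 'stat' or 'stats'")
    # -n is the null output file: the audio is analysed but nothing is written.
    return ["sox", wav_path, "-n", effect]


def run_sox_stats(wav_path, effect="stats"):
    """Run SoX and return the statistics report as text."""
    if shutil.which("sox") is None:
        raise RuntimeError("sox was not found on PATH")
    result = subprocess.run(
        sox_stats_command(wav_path, effect),
        capture_output=True,
        text=True,
        check=True,
    )
    # SoX prints the stat/stats report on standard error, not standard output.
    return result.stderr


# Building the command does not require SoX to be installed.
command = sox_stats_command("input.wav", "stat")
```

Calling `run_sox_stats("input.wav", "stats")` on a machine with SoX installed returns the report text, which could then be saved to a TXT file in the same way the statistics actor does.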