h2. Workflow Help
[Workflow|http://dmas.uvic.ca/Workflow] is an online utility for constructing scientific workflows to assist with analysis of network data.
{toc}
h3. Vocabulary
A basic vocabulary is useful for navigating within this tool.
| _Actor_: | a module which represents one step or task (e.g., low pass filter) |
| _Category_: | a node in the toolbox tree that has other categories or libraries attached |
| _Channel_: | a single curve or line that connects two actors together (also referred to as a wire) |
| _Library_: | a branch in the toolbox that has one or more actors attached to it |
| _Port_: | a single input or output attached to an actor to which a wire can connect |
| _Toolbox_: | a collection of actors |
| _Workflow_: | a user-generated collection of actors and channels |
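The relationships between these terms can be pictured as a small data model. The following Python sketch is only an illustration of the vocabulary, not the tool's actual implementation: the class and field names are hypothetical, and the rule that a channel runs from an output port to an input port is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    """A single input or output attached to an actor."""
    actor: str      # name of the owning actor
    name: str       # hypothetical port name, e.g. "out"
    direction: str  # "input" or "output"


@dataclass(frozen=True)
class Channel:
    """A wire connecting two actors through their ports."""
    source: Port
    target: Port

    def __post_init__(self):
        # Assumption: wires always run from an output port to an input port.
        if self.source.direction != "output" or self.target.direction != "input":
            raise ValueError("a channel must run from an output port to an input port")


@dataclass
class Workflow:
    """A user-generated collection of actors and channels."""
    name: str
    actors: list = field(default_factory=list)
    channels: list = field(default_factory=list)

    def connect(self, source: Port, target: Port) -> Channel:
        """Add a channel between two ports and return it."""
        channel = Channel(source, target)
        self.channels.append(channel)
        return channel


# Two actors joined by one wire (actor and port names are made up).
wf = Workflow("demo", actors=["WAV", "LowPass"])
wire = wf.connect(Port("WAV", "out", "output"), Port("LowPass", "in", "input"))
```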
h3. Workflows
h4. Private Workflows
To create a private workflow:
# Right-click on Private under the Workflows tab
# Select 'New Workflow'
# Select the 'Toolboxes' tab
# Navigate through the treeview, and drag the desired actors into the 'Design' workspace.
# Use your mouse to connect actors (hovering over a port will indicate the expected input/output type).
# Double-click on an actor to adjust its parameters.
# Click Save.
# Enter a unique workflow name. Once saved, it will appear in your list of Private workflows under the 'Workflows' tab.
# When ready, click Run to execute your workflow. A message will appear when execution is completed.
# At this point, the results will be available as icons in the output actor and/or available in the FTP Directory.
Private workflows can be opened, edited and executed at any later time.
h4. Public Workflows
Public workflows provide a continually expanding resource for getting started on your data analysis.
To use a public workflow, select it and make any required changes. Saving these changes stores the modified version as a private workflow that you can continue to modify.
h3. FTP Directory
Results of every executed workflow are stored in a separate timestamped folder within the FTP directory. An icon on the last output actor will provide a link to this directory. Within this directory, you will also find a text file called commands.txt that is the source code generated from the workflow during execution. This file can be useful in troubleshooting.
h3. Toolboxes
h4. Input/Output Toolboxes
* *Annotation Set* \- A pre-determined annotation set can be selected for training a classifier. To create an annotation set, use the Annotation Search (for more information, refer to [help:Annotations Help] ) to generate a list of annotations. Then you can save this list of annotations as a set.
* *ARFF* (Attribute-Relation File Format) - Feature files are text files that describe data sets with features for training classifiers and making predictions. They follow the .arff format used by the Weka machine learning software. Each file declares the features (attributes), their types, and the output class (which is used for training or for evaluating predictions). The last column holds the annotation (if the data comes from a training set) or the source text file name (if it does not). The number of DATA entries is governed by the window size, the hop size, and the length of the input WAV file. For example, if the window size and hop size are both 512 (no overlap), there will be one entry for every 512 samples in the data stream. An example .arff file would be:
{noformat}
@RELATION iris
@ATTRIBUTE sepallength REAL
@ATTRIBUTE sepalwidth REAL
@ATTRIBUTE petallength REAL
@ATTRIBUTE petalwidth REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
{noformat}
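As a rough illustration of how such a file is laid out, here is a minimal Python sketch that reads the header and data rows and estimates the number of DATA entries from the window and hop sizes. It is a simplification under stated assumptions: it handles only space-separated header keywords and plain comma-separated values (no quoting or sparse rows), and the entry count assumes that only full windows are kept, which may differ from how Marsyas treats the final partial window. The function names are hypothetical.

```python
def parse_arff(text):
    """Parse a minimal ARFF document into (relation, attributes, rows)."""
    relation = None
    attributes = []  # (name, type) pairs in declaration order
    rows = []        # each row is a list of strings
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # '%' starts an ARFF comment
            continue
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword, _, rest = line.partition(" ")
        keyword = keyword.upper()
        if keyword == "@RELATION":
            relation = rest.strip()
        elif keyword == "@ATTRIBUTE":
            name, _, attr_type = rest.strip().partition(" ")
            attributes.append((name, attr_type.strip()))
        elif keyword == "@DATA":
            in_data = True
    return relation, attributes, rows


def expected_entries(num_samples, window_size, hop_size):
    """Count the full analysis windows that fit in num_samples (assumed rule)."""
    if num_samples < window_size:
        return 0
    return 1 + (num_samples - window_size) // hop_size


sample = """@RELATION iris
@ATTRIBUTE sepallength REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,Iris-setosa
7.0,Iris-versicolor
"""
relation, attributes, rows = parse_arff(sample)
print(relation, [name for name, _ in attributes], rows[0][-1])
# With window size = hop size = 512, 5120 samples give 10 entries.
print(expected_entries(5120, 512, 512))
```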
* *Hydrophone Data* \- Multiple hydrophone files from the NEPTUNE Canada archive can be selected with this actor for a chosen hydrophone and time range.
* *MPL File* \- Marsyas plugins are textual descriptions of networks of Marsyas processing objects (MarSystems) and their associated parameters. The networks operate at a finer granularity (buffers) and contain many more parameters than need to be exposed to a user. An MPL file can be treated as a text file that is saved and loaded by Marsyas and does not require any interpretation. Other formats such as JSON are supported. MPL files are rather verbose and hard to read; however, they contain all the information needed to run a network. They are primarily used in the Neptune modules to store trained classifiers.
* *PNG File* \- image format used for plots (e.g., spectrogram, waveform)
* *Timeline File* \- Timelines contain a list of TAB-separated events (one per line) in the format (start, end, label). The start and end times are in seconds relative to the input WAV file. This format is used both for the annotation files of a training set and for the output of the TimeLine actor. Timeline files can be created, saved and loaded using the open source audio editor Audacity. There must be at least two distinct labels to get meaningful results from feature extraction and classification. An example timeline would be:
{noformat}
0.000000 0.633949 background
0.633949 2.493532 whale
2.609756 4.395379 background
4.606695 6.719859 whale
7.068530 8.473784 background
8.748495 10.565815 whale
10.787697 12.340872 background
12.594452 15.214774 whale
16.968699 18.595835 whale
18.807151 21.258420 background
21.533132 23.223662 whale
23.350452 25.474181 background
25.632668 27.217540 whale
29.964652 30.894444 whale
31.750275 36.526024 background
37.149407 39.283701 whale
41.819497 44.693399 whale
45.528098 47.789183 background
48.412566 49.257831 whale
49.860083 51.666837 background
{noformat}
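A timeline file of this kind can be read with a few lines of Python. The sketch below is not part of the Workflow tool; the function names are hypothetical, and it splits on any whitespace (not only TAB) so that it also accepts text copied from this page.

```python
def parse_timeline(text):
    """Parse timeline text into a list of (start, end, label) tuples."""
    events = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        # At most three fields; the label keeps any internal spaces.
        parts = line.split(None, 2)
        if len(parts) != 3:
            raise ValueError(f"line {number}: expected start, end, label")
        start, end, label = float(parts[0]), float(parts[1]), parts[2]
        if end < start:
            raise ValueError(f"line {number}: end time precedes start time")
        events.append((start, end, label))
    return events


def distinct_labels(events):
    """Return the sorted set of labels used in the timeline."""
    return sorted({label for _, _, label in events})


sample = "0.000000\t0.633949\tbackground\n0.633949\t2.493532\twhale\n"
events = parse_timeline(sample)
labels = distinct_labels(events)
if len(labels) < 2:
    print("warning: fewer than two labels; classification will not be meaningful")
print(labels)
```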
* *TXT File* \- Text files are used by a variety of actors, such as the SoX statistics actor, which outputs its results to a simple text file.
* *WAV* \- audio files (e.g., for hydrophones)
h4. Digital Signal Processing (DSP) Toolboxes
h5. Marsyas
Marsyas (Music Analysis, Retrieval and Synthesis for Audio Signals) is an open source software framework for audio processing. It was designed and written by George Tzanetakis (gtzan@cs.uvic.ca) with help from students and researchers from around the world. A subset of its functionality has been integrated into the Workflow tool. For specific details, refer to the Marsyas websites:
* [http://marsyas.info/]
* [http://sourceforge.net/projects/marsyas/]
The following Marsyas modules are available in the Workflow tool:
* Visualization Modules: Spectrogram, Waveform
* Audio Modules: LowPass, BandPass, HighPass, Normalize
* Feature Extraction Modules: SpectralCentroid, SpectralRolloff, SpectralFlux, MelFrequencyCepstralCoefficients, ZeroCrossings
* Classifier Modules: Gaussian, SVM, ZeroR
* Prediction Modules: TimeLine, ClassificationStatistics, Frames
These modules are all based on command-line tools included in the Marsyas repository.
h5. SoX
SoX is a command-line utility that can apply various effects to sound files. A subset of its functionality has been integrated into the Workflow tool. For specific details, refer to its website: [http://sox.sourceforge.net/Main/HomePage]
The following SoX functionality is incorporated as actors in the Workflow tool (example command-line calls are given, but refer to the SoX documentation for the full list of parameters).
* highpass
* lowpass
* stat (time and frequency domain stats)
* stats (time domain stats)
* spectrogram
* dcshift:
* gain:
* rate:
* pad
* trim
* vol
* norm
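To give a feel for what a SoX call looks like, the Python sketch below builds a few command lines of the general form {{sox INPUT OUTPUT EFFECT [PARAMS]}}. How the Workflow tool actually invokes SoX is not documented here, so treat this as an assumption: the helper names, file names and parameter values are all hypothetical, and the generated commands.txt in the FTP directory is the authoritative record of what was run.

```python
import shlex


def sox_effect_command(input_path, output_path, effect, *params):
    """Build the argument list for an effect that writes a new audio file."""
    return ["sox", str(input_path), str(output_path), effect, *map(str, params)]


def sox_analysis_command(input_path, effect, *params):
    """Build the argument list for an effect with no audio output.

    SoX's -n option stands for the null file, so no output audio is written.
    """
    return ["sox", str(input_path), "-n", effect, *map(str, params)]


examples = [
    # High-pass filter with an illustrative 1000 Hz cutoff.
    sox_effect_command("in.wav", "out.wav", "highpass", 1000),
    # Keep 10 seconds of audio starting at position 0.
    sox_effect_command("in.wav", "out.wav", "trim", 0, 10),
    # Time and frequency domain statistics.
    sox_analysis_command("in.wav", "stat"),
    # Spectrogram written to a PNG file.
    sox_analysis_command("in.wav", "spectrogram", "-o", "spectrogram.png"),
]

for command in examples:
    print(shlex.join(command))
```

Each list can be passed to {{subprocess.run(command, check=True)}} on a machine where SoX is installed.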
h5. MATLAB - coming soon
h5. Python - coming soon
h3. Resources
[Audacity|http://audacity.sourceforge.net/]