The module in charge of assigning lexical probabilities to each word analysis requires only one data file, referenced by the ProbabilityFile configuration option.
This file may be created from a tagged corpus using the script provided in src/utilities/TRAIN.
See section 3.11 for format details.