OpenFst troubleshooting

A bit of OpenFst troubleshooting for when you try to build a WFST with Juicer. Say you are running


fstcompose ${OUTLEXBFSM} ${OUTGRAMBFSM} | \
fstepsnormalize | \
fstdeterminize | \
fstencode --encode_labels - $CODEX | \
fstminimize - | \
fstencode --decode - $CODEX | \
fstpush --push_weights | \
fstarcsort

and get this


FATAL: StringWeight::Plus: unequal arguments (non-functional FST?)


Huh? Which arguments are unequal? What caused this? How do I fix it? The message should definitely be more self-explanatory. This is quite a common issue: you get a short message that nobody, including the author, could understand, and you are left to find out how to fix it on your own.
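One way to narrow this down is to stop piping and run the stages one at a time, saving each result to a file, so the failing command is obvious. The sketch below is only an illustration: the stage helper, the debug_pipeline function and the intermediate file names (lg.fst and so on) are made up for this example, and only the first three stages of the pipeline are shown. As far as I understand, the message usually comes from a stage that determinizes a transducer, and the "(non-functional FST?)" hint suggests that some input string maps to more than one output string (for instance, two lexicon entries with the same pronunciation), but treat that as an assumption to verify rather than a diagnosis.

```shell
#!/bin/sh
# Run one pipeline stage with its output saved to a file, so that a
# failure can be pinned to a specific command.
# Usage: stage NAME OUTFILE CMD [ARGS...]
stage() {
    name=$1
    out=$2
    shift 2
    if "$@" > "$out"; then
        echo "ok: $name"
    else
        echo "FAILED: $name" >&2
        return 1
    fi
}

# Hypothetical helper: the first stages of the pipeline, unrolled.
# OUTLEXBFSM and OUTGRAMBFSM are the same variables as in the pipeline;
# the intermediate file names are invented for this sketch.
debug_pipeline() {
    stage compose      lg.fst     fstcompose "$OUTLEXBFSM" "$OUTGRAMBFSM" &&
    stage epsnormalize lg.eps.fst fstepsnormalize lg.fst &&
    stage determinize  lg.det.fst fstdeterminize lg.eps.fst
}
```

With OUTLEXBFSM and OUTGRAMBFSM set, calling debug_pipeline stops at the first stage that fails and leaves the last good intermediate file on disk for inspection. The remaining stages can be added in the same way.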



Looking at the waves

Here is the question: a perfectly fine-looking sound file that is transcribed with 10% accuracy. Sounds crazy, doesn't it? No noise, no accent.



Because of that I'm looking into the state of the art in channel normalization, especially for non-linear channel distortions. No good solution yet; I've only found a description of the problem in a very old paper.



There is CDCN normalization, a few CMN improvements, RASTA and even the recently invented HN normalization. CDCN is surprisingly available in SphinxTrain, but nobody uses it. Well, it gives no improvement, but it's an interesting approach worth documenting one day. The idea of collecting statistics from the speech and applying them later sounds nice.

There are model-level approaches, various feature transforms and adaptations. They do not really look that attractive. Most papers now deal with channel compensation for speaker recognition, not speech recognition. I must admit the topic is too large to survey in a few weeks.

Luckily, I can also spend time looking at the waves like the one on the right. Somewhat more pleasant, I would say.