The discussion is here.
I'm glad to see progress being made; big thanks to everyone involved: Joe, Piter and others.
About git itself I have mixed feelings. The advantages of DVCS aren't obvious to me, and in the past I even gave up participation in one project after its migration to Mercurial (it was http://linuxtv.org). The distributed nature increases complexity and confuses at least me: it's hard to understand where the latest changes are, what the real state of things is, and where development happens. Developers tend to push changes to their own branches, and little effort is made to maintain a common branch. Also, among all DVCSs git is the worst in terms of usability. Sadly, GNOME is also migrating to git in the near future.
Every change has its black and white sides. There are many things I like in the new sphinx4, such as the clear split of the tests one can run. Some things are harder to understand, like the Rakefile migration. I worry about Windows users: how will they build sphinx4 now? Anyhow, let's hope the issues will be resolved and the new shiny release will appear very soon.
- Text summarizer in Epiphany
- Improved spell check for GEdit
- Doxygen support for gtk-doc
- Desktop-wide services for activity registration
- Automatic workstation mode detection and other AI tasks the desktop can benefit from
- Cleanup of the Evolution interface where sent and received mail are grouped together
The overview of this issue makes me think again about GNOME as a product on the market and the possible ways of its development. It seems we are now at a point where the feature sets of competitors have stabilized and it's hard to invent something new in the market: the so-called mature product stage, where it's important to polish and lower costs. A big step is required to shift the product to a new level. Probably I need to investigate the research desktops that completely change the way users work with the system. For example, I'd love to see better AI support everywhere, like adaptive preferences; better stability and security with proper IPC and a service-based architecture; self-awareness services; a modern programming language. I'm not sure I'm brave enough for that, though.
Amazing news, really. The new features of the release include:
1. The HTK Book has been extended to include tutorial sections on HDecode and discriminative training. An initial description of the theory and options for discriminative training has also been added.
2. HDecode has been extended to support decoding with trigram language models.
3. Lattice generation with HDecode has been improved to yield a greater lattice density.
4. HVite now supports model-marking of lattices.
5. Issues with HERest using single-pass retraining with HLDA and other input transforms have been resolved.
6. Many other smaller changes and bug fixes have been integrated.
The release is available on the HTK website.
Sadly, cmuclmtk requires a lot of magic passes over the models to get lm_combine to work. Many thanks to Bayle Shanks of voice-keyboard for writing up a recipe. So if you want to give it a try:
- Download voice-keyboard
- Unpack it
- Train both language models
- Process them with the scripts lm_combine_workaround
- Process both with lm_fix_ngram_counts
- Create a weight file listing each model with its interpolation weight (the weights could be different, of course)
- Combine models with lm_combine.
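The combination step at the end of the list above is, conceptually, linear interpolation of the two models' probabilities. A minimal Python sketch of the idea, using hypothetical unigram probability dicts and made-up weights (the real lm_combine operates on full ARPA n-gram files):

```python
# Linear interpolation of two language models (unigram case for brevity).
# The dicts and weights below are illustrative, not real model files.

def interpolate(lm1, lm2, w1=0.5, w2=0.5):
    """Combine two word-probability tables: P(w) = w1*P1(w) + w2*P2(w)."""
    assert abs(w1 + w2 - 1.0) < 1e-9, "weights must sum to 1"
    vocab = set(lm1) | set(lm2)
    return {w: w1 * lm1.get(w, 0.0) + w2 * lm2.get(w, 0.0) for w in vocab}

generic = {"hello": 0.6, "world": 0.4}
dialog  = {"hello": 0.2, "okay": 0.8}
combined = interpolate(generic, dialog, 0.7, 0.3)
print(combined["hello"])  # 0.7*0.6 + 0.3*0.2 = 0.48
```

Since each input distribution sums to 1 and the weights sum to 1, the combined table is still a valid probability distribution.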
It's discouraging that sphinx4 doesn't support higher-order n-grams. Another article suggests a workaround: join frequent word combinations into compound words.
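The compound-word trick can be sketched as follows: count bigrams in the training text and rewrite frequent ones as single tokens, so a trigram model effectively sees longer context. The function names, joiner and threshold here are made up for illustration:

```python
from collections import Counter

def join_compounds(sentences, min_count=2, joiner="_"):
    """Replace frequent bigrams with single compound tokens."""
    bigrams = Counter()
    for s in sentences:
        toks = s.split()
        bigrams.update(zip(toks, toks[1:]))
    frequent = {bg for bg, c in bigrams.items() if c >= min_count}

    out = []
    for s in sentences:
        toks = s.split()
        merged, i = [], 0
        while i < len(toks):
            # Greedily merge a frequent bigram into one token.
            if i + 1 < len(toks) and (toks[i], toks[i + 1]) in frequent:
                merged.append(toks[i] + joiner + toks[i + 1])
                i += 2
            else:
                merged.append(toks[i])
                i += 1
        out.append(" ".join(merged))
    return out

sents = ["good morning everyone", "good morning team", "morning run"]
print(join_compounds(sents))
# ['good_morning everyone', 'good_morning team', 'morning run']
```

The same rewrite has to be applied to the decoder's dictionary so the compound tokens have pronunciations.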
Btw, the generic model gives 40% accuracy while the home-grown dialog model gives 60%, so it's a promising direction anyhow.
I tried to find articles about training acoustic models on incomplete data, but it seems most such research is devoted to other domains, like web classification. Web data is by definition incomplete and contains errors. We could reuse their unsupervised-learning methods, but I failed to find information on this. Links are welcome.
Another interesting read today was about performance on the Fisher database. Articles mention a baseline of around 22% WER at 20xRT speed. 20xRT is unacceptably slow, I think, but even at 5xRT we are close to this barrier. The thing that makes me wonder is that in sphinx4 widening the beams makes decoding slower but doesn't improve accuracy. It must be a bug, I think.
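For reference, WER is the word-level edit distance between hypothesis and reference divided by the reference length. A minimal sketch of the standard dynamic-programming computation (not the scorer used in those articles):

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + insertions + deletions) / len(ref)."""
    r, h = ref.split(), hyp.split()
    # Edit-distance table over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1/6, one deletion
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why it is an error rate rather than an accuracy.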
The great article about architecture management:
And some consequences.
It's impossible to get consistent behaviour without sharing a codebase. One can make an application written in Qt or with the Mozilla suite look like a Gtk application, but the user will easily see the difference in the way such applications do things. Once you open a settings dialog, the mirage of consistency disappears. The HIG should not be a set of recommendations everyone tries to follow, but documentation for hardcoded rules that anyone using the library follows automatically.
Integration with other toolkits doesn't make any sense, nor does supporting software across different platforms. If an application uses another codebase, it will behave differently. Take, for example, gecko or gtk-mozembed applications. They all have problems with keyboard focus and accessibility. It's impossible to make them work the way a GNOME user expects. Even if you get them to look similar, it's impossible to maintain such consistency every time something changes in gtk.