I’ll try to read more about this new methodology in my spare time. ]]>

I agree we would want more than just probability estimates of observed sequences. Since the “spectral learning of HMMs” paper, there have been several other papers that use the method of moments to recover the actual parameters of an HMM, in a more direct way than the Hsu et al. paper does.

Here is one of them:

http://arxiv.org/abs/1210.7559

and here is another (older):

http://arxiv.org/abs/1203.0683

Michael Collins and I also had a paper on using a method of moments to extract the parameters of an L-PCFG, of which HMMs are a special case:

http://homepages.inf.ed.ac.uk/scohen/acl14pivot+supp.pdf

I am sure that there are other papers that do similar things.
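
To make the "probabilities but not parameters" point concrete, here is a minimal sketch of the observable-operator construction as I understand it from the Hsu et al. paper. The toy T, O and \pi values are made up for illustration, and the moments are computed exactly from the model rather than estimated from data, so this only shows the algebra, not the statistical behaviour. T is taken to be column-stochastic (T[i, j] = Pr[next state i | current state j]).

```python
import itertools
import numpy as np

# Toy HMM (hypothetical values): 2 hidden states, 3 observation symbols.
T = np.array([[0.7, 0.2],
              [0.3, 0.8]])          # T[i, j] = Pr[h' = i | h = j]
O = np.array([[0.6, 0.1],
              [0.3, 0.2],
              [0.1, 0.7]])          # O[x, j] = Pr[obs = x | h = j]
pi = np.array([0.6, 0.4])           # initial state distribution
m, n = T.shape[0], O.shape[0]

def forward_prob(seq):
    """Sequence probability computed from the true parameters."""
    alpha = pi.copy()
    for x in seq:
        alpha = T @ (O[x] * alpha)  # apply A_x = T diag(O[x, :])
    return alpha.sum()

# Exact low-order moments of the observations (normally estimated from data).
P1 = O @ pi                                   # Pr[x1]
P21 = O @ T @ np.diag(pi) @ O.T               # Pr[x2, x1]
P3x1 = [O @ T @ np.diag(O[x]) @ T @ np.diag(pi) @ O.T
        for x in range(n)]                    # Pr[x3, x2 = x, x1]

# Observable representation built only from the moments above.
U = np.linalg.svd(P21)[0][:, :m]              # top-m left singular vectors
b1 = U.T @ P1
binf = np.linalg.pinv(P21.T @ U) @ P1
B = [U.T @ P3x1[x] @ np.linalg.pinv(U.T @ P21) for x in range(n)]

def spectral_prob(seq):
    """Sequence probability computed without ever touching T, O or pi."""
    b = b1
    for x in seq:
        b = B[x] @ b
    return binf @ b

for seq in itertools.product(range(n), repeat=3):
    assert np.isclose(forward_prob(seq), spectral_prob(seq))
```

The operators B[x] are the true A_x = T diag(O[x, :]) only up to an unknown invertible change of basis (U^T O), which is why this representation reproduces sequence probabilities but does not by itself hand back T, O and \pi.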

]]>The idea of exploiting spectral properties is brilliant, though it seems that this paper does not tell us how to recover T, O and \pi.

HMMs may not be useful for most NLP tasks if all we have are probability estimates of observed sequences. ]]>