Opened 12 years ago
Closed 12 years ago
#3726 closed enhancement (fixed)
[with patch, positive review] stats/finance -- add support for hidden markov models to sage
Reported by: | was | Owned by: | was |
---|---|---|---|
Priority: | major | Milestone: | sage-3.1.2 |
Component: | finance | Keywords: | |
Cc: | Merged in: | ||
Authors: | Reviewers: | ||
Report Upstream: | Work issues: | ||
Branch: | Commit: | ||
Dependencies: | Stopgaps: |
Description (last modified by )
The spkg:
http://sage.math.washington.edu/home/was/patches/ghmm-20080803.spkg
builds ghmm and doesn't depend on anything not in sage (I hope). It does *not* build the official GHMM bindings. This ticket replaces those bindings with clean new Cython bindings that have much much better documentation, but initially expose much less functionality.
NOTE: This shouldn't actually get added to Sage until it gets formally voted on in sage-devel.
SEE ALSO: #3773
Attachments (10)
Change History (18)
Changed 12 years ago by
Changed 12 years ago by
Changed 12 years ago by
Changed 12 years ago by
Changed 12 years ago by
Changed 12 years ago by
Changed 12 years ago by
comment:1 Changed 12 years ago by
- Description modified (diff)
- Summary changed from stats/finance -- add support for hidden markov models to sage to [with patch; needs review] stats/finance -- add support for hidden markov models to sage
comment:2 Changed 12 years ago by
- Description modified (diff)
Changed 12 years ago by
Changed 12 years ago by
comment:3 Changed 12 years ago by
- Description modified (diff)
comment:4 Changed 12 years ago by
- Summary changed from [with patch; needs review] stats/finance -- add support for hidden markov models to sage to [with patch; with positive review] stats/finance -- add support for hidden markov models to sage
Code quality is of course excellent and of course positive review
but
Actual Bugs
- No checks are made that matrices given are valid stochastic matrices (rows add to 1). Or more importantly that the probabilities are actually positive. There is a function to normalize but it isn't called by default. Actually I'm not sure if this is a bug but there should be some documentation about what to expect if the inputs matrices are not stochastic, does it do reasonable things or is the result numerical junk.
- For continuous hidden markov models baum welch does not accept a set of sequences only a single sequence.
Miscellaneous comments
- Doc string locations are sort of hidden. It would be nice if hmm? or hmm.DiscreteHiddenMarkovModel?? hit something.
- (very minor) sample vs sample_single annoyed me, I would have rather sample had the number of sequences be optional which defaulted to 1.
- Does there need to be a discussion as to whether or not we want to support this in standard or optional since it involves adding another spkg.
comment:5 Changed 12 years ago by
- No checks are made that matrices given are valid stochastic matrices (rows add to 1). Or more
importantly that the probabilities are actually positive. There is a function to normalize but it
isn't called by default. Actually I'm not sure if this is a bug but there should be some
documentation about what to expect if the inputs matrices are not stochastic, does it do reasonable things or is the result numerical junk.
I have no idea what it will do :-). I will find out by looking at the GHMM docs. I'm not sure what the best behavior would be on invalid input yet.
- For continuous hidden markov models baum welch does not accept a set of sequences only a single sequence.
You're right, this is a bug. It does take multiple sequences but only like this with a weight:
sage: z.baum_welch([([1,2,3], 1), ([1,2,8], 2)])
It should also work with no weights.
DiscreteMarkovModels? in this patch don't take single sequences (only multiple), but I fixed that in #3773 (I think).
- Doc string locations are sort of hidden. It would be nice if hmm? or hmm.DiscreteHiddenMarkovModel??? hit something.
Thanks for finding this. I consider this a serious problem and will fix it.
- (very minor) sample vs sample_single annoyed me, I would have rather sample had the number of sequences be optional which defaulted to 1.
I'm annoyed by not having it since foo.sample(n) returns something like 1,2,5,3? so one ends up typing foo.sample(n)[0] a lot, which seems odd. Wait, this is easy. Do this:
23:08 < wstein-3514> I can just make M.sample(n, None) give a single list, 23:08 < wstein-3514> whereas M.sample(n, k) with k >= 1 give a list of lists. 23:08 < jkantor> right 23:08 < wstein-3514> That is way better. 23:08 < wstein-3514> Thanks.
- Does there need to be a discussion as to whether or not we want to support this in standard or > optional since it involves adding another spkg.
Yes. This will not go in standard until the spkg is officially voted on in sage-devel. This will happen in the next day, as soon as I fix all the problems you point out above.
comment:6 Changed 12 years ago by
- Regarding invalid input, I would suggest accepting positive inputs and normalizing but printing a message to that affect (that the probabilities didn't sum to 1 and were normalized) and I would raise an exception on negative inputs.
- The solution to 2 in what I had in mind.
comment:7 Changed 12 years ago by
- Summary changed from [with patch; with positive review] stats/finance -- add support for hidden markov models to sage to [with patch, positive review] stats/finance -- add support for hidden markov models to sage
The latest spkg seems to be
http://sage.math.washington.edu/home/was/patches/ghmm-20080813.spkg
Cheers,
Michael
comment:8 Changed 12 years ago by
- Resolution set to fixed
- Status changed from new to closed
Merged hmm-bundle-3.0.5.hg in Sage 3.1.2.alpha0. This bundle contains 12 changesets, so there are some patches missing in the broken out series. Oh well ...
Cheers,
Michael
The attached patches bring the coverage of this code to 100% and cleanly wrap a solid set of functionality.