@tixxit ok I think this is ready now for review
It seems a bit arbitrary to make this be a Double - it could be some generic F:Ordering, but then again I've got a bit of type param exhaustion.
I also wonder whether the Option[T] is actually confounding two concepts: "unacceptable" should really be something that Stopper determines separately. That is, Evaluator would always return Double but Stopper would include both isWorthTryingToSplit and isAcceptableSplitOutcome.
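A minimal sketch of that split of responsibilities, assuming a toy `Counts` target type and a threshold-based stopper. The names `isWorthTryingToSplit` and `isAcceptableSplitOutcome` come from the comment above; `Counts`, `ErrorRate`, `ThresholdStopper` and `consider` are hypothetical and are not part of this PR.

```scala
object StopperSketch {
  // Hypothetical target type: misclassified and total examples in a node.
  final case class Counts(errors: Long, total: Long)

  // The evaluator only scores; it never decides what is acceptable.
  trait Evaluator[T] {
    def trainingError(leaves: Iterable[T]): Double
  }

  // The stopper owns both decisions.
  trait Stopper[T] {
    def isWorthTryingToSplit(parent: T): Boolean
    def isAcceptableSplitOutcome(error: Double): Boolean
  }

  object ErrorRate extends Evaluator[Counts] {
    def trainingError(leaves: Iterable[Counts]): Double = {
      val errors = leaves.map(_.errors).sum
      val total = leaves.map(_.total).sum
      errors.toDouble / total.toDouble
    }
  }

  final case class ThresholdStopper(minSize: Long, maxError: Double)
      extends Stopper[Counts] {
    def isWorthTryingToSplit(parent: Counts): Boolean =
      parent.total >= minSize
    def isAcceptableSplitOutcome(error: Double): Boolean =
      error <= maxError
  }

  // Returns the error of a candidate split only if both checks pass.
  def consider(
      parent: Counts,
      leaves: Iterable[Counts],
      stopper: Stopper[Counts]
  ): Option[Double] =
    if (!stopper.isWorthTryingToSplit(parent)) None
    else Some(ErrorRate.trainingError(leaves)).filter(stopper.isAcceptableSplitOutcome)
}
```

With this shape the `Option` only appears where the two concerns are combined, not in `Evaluator` itself.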
So, I think we should stick with Double instead of Option[Double]. My guess is that a NaN value would be totally fine to use for this purpose (it certainly won't be a valid score).
The idea of using some Double => Boolean values to test criteria seems fine to me. I just think we don't need to create more boxes in this case.
(Sorry I am late to this party.)
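A rough sketch of the NaN-as-sentinel idea with `Double => Boolean` criteria, assuming a toy variance-style evaluator. All names here (`trainingError` over a `Seq[Double]`, `lowEnough`, `acceptable`) are hypothetical. One caveat worth showing: NaN fails every ordered comparison, so a criterion written as a negation can accept it by accident, which is why the sketch checks `isNaN` explicitly.

```scala
object NaNSketch {
  // Hypothetical evaluator: mean squared deviation from the mean,
  // or NaN when there is nothing to score.
  def trainingError(targets: Seq[Double]): Double =
    if (targets.isEmpty) Double.NaN
    else {
      val mean = targets.sum / targets.size
      targets.map(y => (y - mean) * (y - mean)).sum / targets.size
    }

  // A criterion is just a Double => Boolean.
  val lowEnough: Double => Boolean = _ <= 0.5

  // `lowEnough(Double.NaN)` is false because NaN <= 0.5 is false, but a
  // criterion such as `e => !(e > 0.5)` would be true for NaN. The explicit
  // isNaN check keeps the sentinel from slipping through either way.
  def acceptable(error: Double, criteria: Seq[Double => Boolean]): Boolean =
    !error.isNaN && criteria.forall(c => c(error))
}
```

No boxing is needed here, at the cost of every caller having to remember that NaN means "no valid score".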
Re: Double vs new type param - I'm definitely pro-Double until we reach a point where we have an Evaluator where Double won't work anymore.
Got rid of the
def evaluate(split: Split[V, T]): Option[(Split[V, T], Double)]
trait Evaluator[T] {
  /** returns an overall numeric training error for a tree or split, or None for infinite/unacceptable error */
  def trainingError(leaves: Iterable[T]): Option[Double]
We should probably remove all the commented code in this PR once we agree it is really gone.
This makes a few related changes to Evaluator:
- […] Split and just has it directly operate on Iterable[T], which was the only part of a Split it ever cared about anyway.
- […] Evaluator to a tree or sub-tree, which could be useful in various ways in the future (e.g. if we want to change Splitter to produce candidate sub-trees instead of splits)
- […] Iterable[T] represents the leaves of the sub-tree, and so it's now labeled as such
- […] Infinity to represent an unacceptable level of error, it now returns Option[Double] so you can return None to signal that
- […] Split, just the error
- […] T is now also provided. None of the evaluators makes use of it, but there are at least two potential uses I have in mind:
  - […] T to the root T (e.g. "all leaves must contain at least X% of the input Ys")

This is WIP because I haven't fully updated the training code to make use of it.
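To make the "X% of the input" use case concrete, here is a minimal sketch of an evaluator that sees both the root `T` and the leaves and returns `None` for an unacceptable sub-tree. The two-argument `trainingError(root, leaves)` signature, the `Counts` histogram type and `MinLeafFraction` are assumptions for illustration; the PR's actual signature may differ.

```scala
object EvaluatorSketch {
  // Hypothetical shape: the parameter order and the name `root` are assumptions.
  trait Evaluator[T] {
    /** Error for the sub-tree with these leaves, or None if unacceptable. */
    def trainingError(root: T, leaves: Iterable[T]): Option[Double]
  }

  // Toy target: label -> count for the examples reaching a node.
  type Counts = Map[String, Long]

  // Misclassification rate, rejecting any sub-tree in which some leaf holds
  // less than `minFraction` of the root's examples.
  final case class MinLeafFraction(minFraction: Double) extends Evaluator[Counts] {
    private def total(c: Counts): Long = c.values.sum

    // Each leaf predicts its majority label; everything else is an error.
    private def errors(c: Counts): Long =
      if (c.isEmpty) 0L else total(c) - c.values.max

    def trainingError(root: Counts, leaves: Iterable[Counts]): Option[Double] = {
      val rootTotal = total(root)
      val bigEnough = leaves.forall(l => total(l) >= minFraction * rootTotal)
      if (rootTotal == 0L || !bigEnough) None
      else Some(leaves.map(errors).sum.toDouble / rootTotal)
    }
  }
}
```

Because the leaf-size constraint is expressed relative to the root, the evaluator needs no configuration beyond the fraction itself.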