Skip to content
This repository was archived by the owner on Apr 8, 2021. It is now read-only.

Comments

rethink Evaluator#78

Open
avibryant wants to merge 16 commits intomasterfrom
avi-new-evaluator
Open

rethink Evaluator#78
avibryant wants to merge 16 commits intomasterfrom
avi-new-evaluator

Conversation

@avibryant
Copy link
Contributor

This makes a few related changes to Evaluator:

  • it decouples it from Split and just has it directly operate on Iterable[T], which was the only part of a Split it ever cared about anyway.
  • this makes it possible to apply an Evaluator to a tree or sub-tree, which could be useful in various ways in the future (eg if we want to change Splitter to produce candidate sub-trees instead of splits)
  • in this context the Iterable[T] represents the leaves of the sub-tree, and so it's now labeled as such
  • rather than using Infinity to represent an unacceptable level of error, it now returns Option[Double] so you can return None to signal that
  • the sign has been reversed (ie this is now an error value not a "goodness" value)
  • it doesn't return a new Split, just the error
  • the root T is now also provided. None of the evaluators makes use of it, but there are at least two potential uses I have in mind:
    • considering a split unacceptable based on some ratio of the leaf T to the root T (eg, "all leaves must contain at least X% of the input Ys")
    • using more global optimization criteria (this comes up in particular in the "explanation tree" use case which I won't explain in more detail here)

This is WIP because I haven't fully updated the training code to make use of it.

@avibryant avibryant changed the title WIP: rethink Evaluator rethink Evaluator Dec 29, 2015
@avibryant
Copy link
Contributor Author

@tixxit ok I think this is ready now for review

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit arbitrary to make this be a Double - it could be some generic F:Ordering, but then again I've got a bit of type param exhaustion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder whether the Option[T] is actually confounding two concepts: "unacceptable" should really be something that Stopper determines separately. That is, Evaluator would always return Double but Stopper would include both isWorthTryingToSplit and isAcceptableSplitOutcome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think we should stick with Double instead of Option[Double]. My guess is that a NaN value would be totally fine to use for this purpose (it certainly won't be a valid score).

The idea of using some Double => Boolean values to test criteria seems fine to me. I just think we don't need to create more boxes in this case.

(Sorry I am late to this party.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: Double vs new type param - I'm definitely pro-Double until we reach a point where we have an Evaluator where Double won't work anymore.

@avibryant
Copy link
Contributor Author

Got rid of the root part, and merged with #77 which should now be landed first.

def evaluate(split: Split[V, T]): Option[(Split[V, T], Double)]
trait Evaluator[T] {
/** returns an overall numeric training error for a tree or split, or None for infinite/unacceptable error */
def trainingError(leaves: Iterable[T]): Option[Double]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 on the rename!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, 1 less type param!)

@erik-stripe
Copy link
Contributor

We should probably remove all the commented code in this PR once we agree it is really gone.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants