Trying to understand the testing framework #169

Hi @zaxtax 

My team and I have been working on Hakaru for our undergrad capstone project. We are wrapping things up and writing a paper based on our results. However, we've been having trouble formulating our hypothesis.

We've been writing a number of RoundTrip test cases to test known relationships between distributions. However, we are having trouble understanding what exactly is going on with these tests. We've tried diving into the related files, but none of us knows Haskell well enough to figure it out. As best we can tell, the framework grabs the test cases and passes them to some Maple environment.

@JacquesCarette has told us to ask you to explain it to us.

I'll explain what we've figured out so far, and then hopefully you can fill in the gaps for us. For example, we added a test that produces the following result:

```
### Failure in: 6:RoundTrip:7:2:t_rayleigh_to_stdChiSq:0
haskell/Tests/TestTools.hs:130
expected:
chiSq_iid = fn n nat:
            fn mean real:
            fn stdev prob:
            q <~ plate _ of n: normal(mean, stdev)
            return summate i from 0 to size(q):
                   ((q[i] - mean) * prob2real(1/ stdev)) ^ 2
standardChiSq = fn n nat: chiSq_iid(n, nat2real(0), nat2prob(1))
standardChiSq(2)
but got:
q307 <~ normal(+0/1, 1/1)
q315 <~ normal(+0/1, 1/1)
return q307 ^ 2 + q315 ^ 2
Cases: 338  Tried: 287  Errors: 2  Failures: 20
                                               
### Failure in: 6:RoundTrip:7:2:t_rayleigh_to_stdChiSq:1
haskell/Tests/TestTools.hs:130
expected:
chiSq_iid = fn n nat:
            fn mean real:
            fn stdev prob:
            q <~ plate _ of n: normal(mean, stdev)
            return summate i from 0 to size(q):
                   ((q[i] - mean) * prob2real(1/ stdev)) ^ 2
standardChiSq = fn n nat: chiSq_iid(n, nat2real(0), nat2prob(1))
standardChiSq(2)
but got:
X3 <~ uniform(+0/1, +1/1)
return log(real2prob(X3)) * (-2/1)
Cases: 338  Tried: 288  Errors: 2  Failures: 21
```
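As a sanity check independent of Hakaru, the two `but got` programs can be sampled and compared numerically. The sketch below is plain Python, not Hakaru or its test harness; the function names are made up for illustration, and it only assumes that `normal(0, 1)` and `uniform(0, 1)` mean what they usually do. A chi-squared distribution with 2 degrees of freedom has mean 2 and variance 4, so both sample sets should land near those values.

```python
import math
import random


def sample_sum_of_squares(rng):
    # Mirrors the 0-test output: q307, q315 <~ normal(0, 1); return q307^2 + q315^2
    a = rng.gauss(0.0, 1.0)
    b = rng.gauss(0.0, 1.0)
    return a * a + b * b


def sample_neg_two_log_uniform(rng):
    # Mirrors the 1-test output: X3 <~ uniform(0, 1); return log(X3) * (-2)
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
    return -2.0 * math.log(u)


def mean_and_variance(xs):
    # Sample mean and unbiased sample variance
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 200_000
m0, v0 = mean_and_variance([sample_sum_of_squares(rng) for _ in range(n)])
m1, v1 = mean_and_variance([sample_neg_two_log_uniform(rng) for _ in range(n)])

# Both pairs should be close to (2, 4) if the programs describe chi-squared(2)
print("sum of two squared normals:", m0, v0)
print("-2 * log(uniform):         ", m1, v1)
```

Matching moments are only weak evidence, not a proof of equivalence, but they are a cheap way to catch a test case that was written down incorrectly.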

It looks like two tests are run for each test case. I've messed around with `hk-maple` a bit, and this is roughly what seems to be happening:

- 0-test: run default and Summarize modes on the expected file and check if their outputs match
- 1-test: run default mode on the expected file again, run Summarize mode on the test file and check if their outputs match

However, the outputs in the test log don't always exactly match what I get when I run Summarize on these files myself (although they are very close), so I don't think this is exactly what is happening. Can you clarify how these outputs are generated?

We would also like to make sure we understand the purpose of each test. The 0-test seems to be some sort of preliminary test, run before the 1-test checks the actual relationship we are interested in. As far as I can tell, Summarize seems to be a more ambitious version of Simplify. So I expect that if their outputs are equal, Summarize can't do any better, right? I think Dr. Carette said it has to do with making sure some sort of change of variables is done correctly. Can you expand on this?

I know the 1-test is meant to produce equivalent code for Hakaru files describing equivalent distributions. Can you explain how the inference algorithms used in the test are meant to accomplish this?

For reference, this is the hypothesis we are currently working with:

_Assume we know a relationship between two statistical distributions that transforms distribution A into distribution B, and that we can prove it by analyzing their PDFs._

_We hypothesize that by applying the appropriate transformations to an implementation of distribution A in Hakaru, we can create a Hakaru program whose `hk-maple` output is equivalent to the `hk-maple` output for an implementation of distribution B._
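To make the hypothesis concrete, here is the PDF-level argument for the Rayleigh test in the log above. It assumes a Rayleigh distribution with unit scale, which the test name suggests but the log does not confirm. Squaring a Rayleigh(1) variable gives the chi-squared density with 2 degrees of freedom, and the same distribution is obtained from $-2\ln U$ with $U$ uniform on $(0,1)$, which is the form of the 1-test output:

```latex
\begin{aligned}
X \sim \mathrm{Rayleigh}(1):\quad & f_X(x) = x\,e^{-x^2/2}, \qquad x \ge 0 \\
Y = X^2:\quad & f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}}
              = \tfrac{1}{2}\,e^{-y/2}, \qquad y > 0 \\
\chi^2_k \text{ with } k = 2:\quad & f(y) = \frac{y^{k/2-1}\,e^{-y/2}}{2^{k/2}\,\Gamma(k/2)}
              = \tfrac{1}{2}\,e^{-y/2} \\
Y' = -2\ln U,\ U \sim \mathrm{Uniform}(0,1):\quad & P(Y' \le y) = P\!\left(U \ge e^{-y/2}\right) = 1 - e^{-y/2}
              \;\Rightarrow\; f_{Y'}(y) = \tfrac{1}{2}\,e^{-y/2}
\end{aligned}
```

So the expected program and both `but got` programs denote the same distribution on paper, even though their printed forms differ.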

Really appreciate your help with this.
