WIP Add lifted version of split #34

davidanthoff · 2017-11-24T01:19:31Z

This is probably a horrible idea and most likely I won't merge it. But hey, let's think about it for a while. It is the only way I can think of that would make the last query in queryverse/Query.jl#134 work out of the gate without a need to think about missing values... Maybe that goal is mistaken in the first place, though.

Essentially what this would do is treat a missing string passed to split as equivalent to an empty string...
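
For illustration only, here is a minimal sketch of that "missing behaves like an empty string" behavior. It is not the code in this PR: the helper name `split_na_as_empty` is hypothetical, and it assumes a DataValues.jl version that exports `isna`.

```julia
using DataValues  # assumption: a version that exports `isna`

# Hypothetical helper, not the method added by this PR.
function split_na_as_empty(s::DataValue{String}, dlm)
    # Substitute "" for a missing input, then defer to Base.split.
    str = isna(s) ? "" : get(s)
    return split(str, dlm)
end

a = split_na_as_empty(DataValue("a b"), " ")     # splits the wrapped string
b = split_na_as_empty(DataValue{String}(), " ")  # missing input is split as ""
```

Note that under this reading a missing input does not give an empty array: `split("", " ")` returns a one-element array holding an empty string, so `b` has length 1.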

codecov-io · 2017-11-24T01:28:22Z

Codecov Report

Merging #34 into master will decrease coverage by 1.06%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master      #34      +/-   ##
==========================================
- Coverage   84.95%   83.89%   -1.07%     
==========================================
  Files          11       12       +1     
  Lines         472      478       +6     
==========================================
  Hits          401      401              
- Misses         71       77       +6

Impacted Files	Coverage Δ
src/scalar/strings.jl	`0% <0%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 45e1c18...a711151.

tcovert · 2017-11-24T18:07:21Z

I sorta think that if you go down this path, you ought to add lifted versions of many other string functions in the standard library: length, sizeof, isvalid, all the regex stuff, etc. I would totally be a fan of this.

davidanthoff · 2017-11-24T19:52:39Z

Oh, I definitely want to add all those lifted versions!

The question is, what should they return when they encounter an NA? There is the philosophy that they should always propagate NA. That is nice and consistent and predictable. But I'm not sure it is really helpful for something like split, i.e. if the return value could be either an array or NA. But maybe that is what it should do... I'm just not sure.
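
To make the "always propagate NA" option concrete, here is a rough sketch, again with a hypothetical helper name (`split_propagating`) and assuming a DataValues.jl version that exports `isna`. The result is wrapped in a DataValue so that the return type is the same for missing and non-missing inputs.

```julia
using DataValues  # assumption: a version that exports `isna`

# Hypothetical helper illustrating NA propagation; not part of this PR.
function split_propagating(s::DataValue{String}, dlm)
    # A missing input yields a missing result of the wrapped array type.
    isna(s) && return DataValue{Vector{SubString{String}}}()
    # Otherwise wrap the ordinary split result.
    return DataValue(split(get(s), dlm))
end

r1 = split_propagating(DataValue("a b"), " ")
r2 = split_propagating(DataValue{String}(), " ")
```

The cost of this design is that every caller has to unwrap the result (or check `isna`) before indexing into the pieces, which is the inconvenience described above.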

tcovert · 2017-11-24T20:02:04Z

Ah, good point. Is there any common theme across base string functions that return scalars vs. arrays? In the array case (i.e. split), where the correct return value for a non-null input is, say, an empty array, returning the same thing for a null input seems reasonable. For the scalar case, maybe null propagation makes more sense?

tcovert · 2017-11-24T21:35:19Z

Here's another thought, at least based on the split case. When you split a string on a pattern that isn't found, split seems to give you an array containing just the original string:

julia> split("abc", " ")
1-element Array{SubString{String},1}:
 "abc"

The closest analog to this that I can think of for the null case would be a single-element array containing a DataValue{String}().
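
A sketch of that idea, under the same assumptions as before (hypothetical helper name `split_na_element`, a DataValues.jl version that exports `isna`). To keep the element type uniform, this version wraps the non-missing pieces in DataValue{String} as well, which is a choice made here for illustration and not something settled in this thread.

```julia
using DataValues  # assumption: a version that exports `isna`

# Hypothetical helper; not part of this PR.
function split_na_element(s::DataValue{String}, dlm)
    # Missing input: one-element array holding a missing string,
    # mirroring how split returns [original] when the pattern is not found.
    isna(s) && return [DataValue{String}()]
    # Non-missing input: wrap each piece so the element type matches.
    return [DataValue{String}(String(p)) for p in split(get(s), dlm)]
end

m = split_na_element(DataValue{String}(), " ")
v = split_na_element(DataValue("abc"), " ")
```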

Add lifted version of split

a711151

davidanthoff added the enhancement label Nov 24, 2017

davidanthoff added this to the Backlog milestone Nov 24, 2017

davidanthoff mentioned this pull request Nov 24, 2017

Selection using Base functions and possibly missing values queryverse/Query.jl#134

Closed
