home | news | discuss | issues | license LURE: range |
|
---|---|
range : simple discretiationtim@menzies.us Use this code to find "natural ranges"
in a list of numbers For example,
consider a random number function
Note that a line drawn through our breaks is a good approximation of the original curve. There are two benefits in using this approximation:
In the literature, finding such ranges ins called discretisation. A common division of discretisation methods is
This code is EFD, with some extras to avoid common problems. For example, consider a 33 percent split on the following numbers. Note that the first and second bin would contain the same numbers-- which is odd since you'd expect different bins to be, well, different.
This code builds ranges that at least the size of the equal frequency division. But it fuses one range to the next if they do not have different values. Input
OutputReturns a list of
The numbers in the input list
Example usage
Configuration
|
require "show"
local the=require "config"
local NUM=require "num"
local SOME=require "sample"
|
Initialize a range |
local function create() return {
_all= SOME.create(),
n = 0,
hi = -2^63,
lo = 2^63,
span = 2^64} end
|
Update range i with
some numeric x found from within
one. Note that, for efficiency sake, we only keep |
local function update(i,one, x)
if x ~= the.ignore then
SOME.update(i._all, one)
i.n = i.n + 1
if x > i.hi then i.hi = x end
if x < i.lo then i.lo = x end
i.span = i.hi - i.lo
return x end end
|
Update range manager i with a new range i.now. Push that range onto the list of all ranges i.ranges. |
local function nextRange(i)
i.now = create()
i.ranges[#i.ranges+1] = i.now end
|
Initialize the control parameters for range generation |
local function rangeManager(t, x)
local _ = {
x = x,
cohen = the.chop.cohen,
m = the.chop.m,
size = #t,
ranges= {} -- list of all known ranges
}
nextRange(_)
_.num = NUM.updates(t, _.x)
_.hi = _.num.hi
|
Any range holding less than enough items is ignored. |
_._.enough = _.size^_.m
|
Any range whose span is less than epsilon is ignored. |
_._.epsilon= _.num.sd * _.cohen
return _ end
|
Return a function that
|
return function (t, x, last)
x= x or function (z) return z end -- _x_ defaults to the identity
table.sort(t, function (z1,z2)
local one,two=x(z1),x(z2)
return one ~=the.ignore and
two ~=the.ignore and
one < two end )
local i= rangeManager(t, x)
for j,one in pairs(t) do
local x1 = x(one)
if x1 ~= the.ignore then
update(i.now, one, x1)
if j > 1 and
x1 > last and
i.now.n > i.enough and
i.now.span > i.epsilon and
|
These last two tests are important: they stop the final range being too small |
i.num.n - j > i.enough and
i.num.hi - x1 > i.epsilon
then nextRange(i) end
last = x1 end end
return i.ranges end
|
LegalLURE, Copyright (c) 2017, Tim Menzies All rights reserved, BSD 3-Clause License Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
|