NumPy: finding intervals which have at least k points
After a bit of struggle I came up with this solution.
First, a bit of explanation and my order of thoughts:
- Ideally, we would set a window size and slide the window from the leftmost acceptable point to the rightmost acceptable point, start counting when min_points points are in the window, and stop counting when min_points points are no longer inside it (imagine it as a convolution operator or so).
- The basic pitfall is that the sliding has to be discretized, so the trick is to check only where the number of points can drop below or rise above min_points, which means at every occurrence of an element or window_size below it (as optional_starts reflects).
- Then iterate over optional_starts and record the first time the condition is met and the last time the condition is met for each interval.
The following code was written as described above:
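def consist_at_least(start, points, min_points, window_size):
    # True if the window [start, start + window_size] holds at least min_points points
    a = [point for point in points if start <= point <= start + window_size]
    return len(a) >= min_points

points = [1.4, 1.8, 11.3, 11.8, 12.3, 13.2, 18.2, 18.3, 18.4, 18.5]
min_points = 4
window_size = 3
total_interval = [0, 20]

# Candidate starts: every point, every point shifted left by window_size,
# and the boundary values of the total interval
optional_starts = points + [item - window_size for item in points if item - window_size >= total_interval[0]] + [total_interval[0] + window_size] + [total_interval[1] - window_size] + [total_interval[0]]
# Keep only starts whose window still fits inside the total interval
optional_starts = [item for item in optional_starts if item <= total_interval[1] - window_size]

intervals = []
potential_ends = []
for start in sorted(optional_starts):
    # An even number of stored values means the next valid start opens a new interval
    is_start_interval = len(intervals) % 2 == 0
    if consist_at_least(start, points, min_points, window_size):
        if is_start_interval:
            intervals.append(start)
        else:
            potential_ends.append(start)
    elif len(potential_ends) > 0:
        # Condition no longer met: close the interval at the last valid start
        intervals.append(potential_ends[-1])
        potential_ends = []

# Close an interval that is still open after the last candidate
if len(potential_ends) > 0:
    intervals.append(potential_ends[-1])
print(intervals)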
output:
[10.2, 11.3, 15.5, 17]
Each two consecutive elements give the start and end of an interval.
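The membership test in consist_at_least scans all points for every candidate start. As a rough sketch of a vectorized alternative, assuming NumPy is available and the points can be sorted, the same counts can be obtained with np.searchsorted. The helper names candidate_starts and count_in_windows are illustrative only and not part of the code above, and this sketch keeps just the points, the points shifted left by window_size, and the two boundary starts as candidates.

```python
import numpy as np

def candidate_starts(points, window_size, total_interval):
    """Starts at which the number of points in the window can change."""
    lo, hi = total_interval
    p = np.asarray(points, dtype=float)
    # A point leaves the window when start passes it (start == p) and
    # enters it when the right edge reaches it (start == p - window_size).
    cands = np.concatenate([p, p - window_size, [lo, hi - window_size]])
    # Keep only starts whose window fits inside the total interval
    cands = cands[(cands >= lo) & (cands <= hi - window_size)]
    return np.unique(cands)

def count_in_windows(points, starts, window_size):
    """Count points inside [start, start + window_size] for every start."""
    pts = np.sort(np.asarray(points, dtype=float))
    starts = np.asarray(starts, dtype=float)
    # Index of the first point >= start
    left = np.searchsorted(pts, starts, side="left")
    # Index one past the last point <= start + window_size
    right = np.searchsorted(pts, starts + window_size, side="right")
    return right - left

points = [1.4, 1.8, 11.3, 11.8, 12.3, 13.2, 18.2, 18.3, 18.4, 18.5]
starts = candidate_starts(points, 3, [0, 20])
counts = count_in_windows(points, starts, 3)
valid = starts[counts >= 4]
print(valid)
```

Note that valid lists every candidate start that satisfies the condition. Runs of consecutive valid candidates would still have to be collapsed into (first, last) pairs, as the loop above does.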
After additional information was given regarding the nature of the "intervals", I propose the following solution, which assumes inter-interval distances of at least window_size:
import numpy as np

def get_start_windows(inter, ws, p, mp):

    # Initialize list of suitable start ranges
    start_ranges = []

    # Determine possible intervals w.r.t. window size
    int_start = np.insert(np.array([0, p.shape[0]]), 1,
                          (np.argwhere(np.diff(p) > ws) + 1).squeeze()).tolist()

    # Iterate found intervals
    for i in np.arange(len(int_start)-1):

        # The actual interval
        int_ = p[int_start[i]:int_start[i+1]]

        # If interval has less than minimum points, reject
        if int_.shape[0] < mp:
            continue

        # Determine first and last possible starting point
        first = max(inter[0], int_[mp-1] - ws)
        last = min(int_[-mp], inter[1] - ws)

        # Add to list of suitable start ranges
        start_ranges.append((first, last))

    return start_ranges

# Example 1
interval = [0, 20]
window_size = 3.0
min_points = 4
points = [1.4, 1.8, 11.3, 11.8, 12.3, 13.2, 18.2, 18.3, 18.4, 18.5]
print(get_start_windows(interval, window_size, np.array(points), min_points))

# Example 2
points = [1.4, 1.8, 1.9, 2.1, 11.3, 11.8, 12.3, 13.2, 18.2, 18.3, 18.4, 18.5]
print(get_start_windows(interval, window_size, np.array(points), min_points))

# Example 3
points = [1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 3.49]
print(get_start_windows(interval, window_size, np.array(points), min_points))
(Code might be optimized, I didn't pay attention to that...)
Output:
[(10.2, 11.3), (15.5, 17.0)]
[(0, 1.4), (10.2, 11.3), (15.5, 17.0)]
[(0, 1.9)]
Hopefully, the desired cases are covered by that solution.
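One way to gain some confidence in the reported ranges is a brute-force comparison on a fine grid of start values. The following is only a sketch under stated assumptions: the helper names brute_force_check and covered are hypothetical, the grid step of 0.1 matches the resolution of the Example 1 points, the tolerance eps is there to absorb floating-point rounding at the range boundaries, and the ranges are copied by hand from the Example 1 output above.

```python
import numpy as np

def brute_force_check(inter, ws, p, mp, step=0.1, eps=1e-9):
    """Return a grid of starts and a mask telling which windows
    [s, s + ws] contain at least mp points."""
    n = int(round((inter[1] - ws - inter[0]) / step))
    grid = inter[0] + step * np.arange(n + 1)
    p = np.sort(np.asarray(p, dtype=float))
    # Count points in [s - eps, s + ws + eps] for every grid start s
    left = np.searchsorted(p, grid - eps, side="left")
    right = np.searchsorted(p, grid + ws + eps, side="right")
    return grid, (right - left) >= mp

def covered(s, ranges, eps=1e-9):
    """True if start s lies inside one of the (first, last) ranges."""
    return any(lo - eps <= s <= hi + eps for lo, hi in ranges)

points = [1.4, 1.8, 11.3, 11.8, 12.3, 13.2, 18.2, 18.3, 18.4, 18.5]
ranges = [(10.2, 11.3), (15.5, 17.0)]  # copied from the Example 1 output
grid, ok = brute_force_check([0, 20], 3.0, points, 4)
in_ranges = np.array([covered(s, ranges) for s in grid])
# The two masks agree if the ranges describe exactly the valid grid starts
print(np.array_equal(ok, in_ranges))
```

A check like this only samples the grid, so it can miss disagreements between grid points. It is meant as a sanity test, not as a proof.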
-------------------------------------
System information
-------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.5
NumPy: 1.19.2
-------------------------------------