@VRonin , @J-Hilk
So nobody wanted to examine/implement my suggested algorithm, shame.... :( ;-)
So.... I sat down this morning and wrote an implementation of my approach. Here is the complete, standalone code (C++17+): it has a main() which randomly generates a bunch of min-max range pairs, calls the best ChatGPT implementation (i.e. @VRonin's second one) and my "jon" implementation (extensively commented), and reports the answer and the time taken for each.
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <random>
#include <utility>
#include <vector>
std::pair<int, int> chatGPTNumberInMostIntervals(const std::vector<std::pair<int, int>>& intervals, int upperLimit)
{
    if (intervals.empty())
        return {-1, -1};

    std::vector<int> frequency(upperLimit + 2, 0); // use +2 to handle the boundary at upperLimit properly

    // Mark the start and end points for each interval
    for (const auto& interval : intervals) {
        ++frequency[interval.first];
        --frequency[interval.second + 1];
    }

    // Apply the prefix-sum technique to get the frequency of each number
    int maxCount = 0;
    int result = 0;
    int currentCount = 0;
    for (int i = 0; i <= upperLimit; ++i) {
        currentCount += frequency[i];
        if (currentCount > maxCount) {
            maxCount = currentCount;
            result = i;
        }
    }
    return {result, maxCount};
}
std::pair<int, int> jonNumberInMostIntervals(const std::vector<std::pair<int, int>>& intervals, int upperLimit)
{
    (void)upperLimit; // unused here; kept so the signature matches the ChatGPT version

    if (intervals.empty())
        return {-1, -1};

    // Vector of size intervals.size(); each pair<int, int> element
    // will hold <min-bound, count-of-min-bound>
    std::vector<std::pair<int, int>> frequency(intervals.size());

    // Fill frequency[] with the min-bounds in each first member;
    // the count in the second member does not matter yet
    for (int i = 0; i < static_cast<int>(intervals.size()); i++)
        frequency[i] = {intervals[i].first, 0};

    // Sort frequency[] by the min-bound value in each element, ascending
    std::sort(frequency.begin(), frequency.end(),
              [](const std::pair<int, int>& x, const std::pair<int, int>& y)
              {
                  return x.first < y.first;
              });

    // Fill in the sorted frequency[] count-of-min-bound in each element:
    // frequency[0].count = 1, frequency[1].count = 2, ..., frequency[i].count = i + 1
    for (int i = 0; i < static_cast<int>(frequency.size()); i++)
        frequency[i].second = i + 1;

    // Go through the intervals[] pair elements (random order for the max range values)
    // and go *down* through the frequency[] pair elements (sorted by min range values,
    // ascending): while a frequency min value is greater than the interval's max value,
    // decrement that element's frequency count
    for (int i = 0; i < static_cast<int>(intervals.size()); i++)
        for (int j = static_cast<int>(frequency.size()) - 1; j >= 0 && frequency[j].first > intervals[i].second; j--)
            frequency[j].second--;

    // Find the frequency[] pair element with the greatest count
    auto most_frequent = std::max_element(frequency.begin(), frequency.end(),
                                          [](const std::pair<int, int>& x, const std::pair<int, int>& y)
                                          {
                                              return x.second < y.second;
                                          });
    return *most_frequent;
}
int main()
{
    constexpr int upperLimit = 1000000000;
    constexpr int num_intervals = 10;

    std::vector<std::pair<int, int>> intervals;
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<int> distrib(0, 100);
    for (int i = 0; i < num_intervals; i++) {
        std::pair<int, int> pair(distrib(gen), distrib(gen));
        if (pair.first > pair.second)
            std::swap(pair.first, pair.second);
        intervals.push_back(pair);
        std::cout << "(" << pair.first << ", " << pair.second << ")" << std::endl;
    }

    // Current time in milliseconds since the epoch
    auto now_ms = [] {
        return static_cast<uint64_t>(std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::system_clock::now().time_since_epoch()).count());
    };

    uint64_t start_time = now_ms();
    std::pair<int, int> result = chatGPTNumberInMostIntervals(intervals, upperLimit);
    uint64_t end_time = now_ms();
    std::cout << "ChatGPT: " << "Number: " << result.first << ", Frequency: " << result.second << ", Time: " << end_time - start_time << std::endl;

    start_time = now_ms();
    result = jonNumberInMostIntervals(intervals, upperLimit);
    end_time = now_ms();
    std::cout << "jon: " << "Number: " << result.first << ", Frequency: " << result.second << ", Time: " << end_time - start_time << std::endl;
    return 0;
}
I am deliberately testing with just 10 interval ranges but an upperLimit of 1 billion(!), i.e. large.
In that case the ChatGPT version uses (a) a large amount of memory (its frequency[] vector holds 1 billion elements, roughly 4 GB with 4-byte ints) and (b) a large number of iterations and hence time (it walks through all 1 billion elements of the vector). OTOH mine uses (a) little memory (its frequency[] vector is the same size as the intervals[] vector) and (b) few iterations and little time (it only walks the intervals[]/frequency[] vector elements).
Here is the output compiled and run for debug:
ChatGPT: Number: 17, Frequency: 6, Time: 6391
jon: Number: 17, Frequency: 6, Time: 0
and here compiled and run for release:
ChatGPT: Number: 45, Frequency: 8, Time: 2355
jon: Number: 45, Frequency: 8, Time: 0
So, I don't mean to blow my own trumpet, but mine is somewhere between thousands of times and "infinitely" faster than the ChatGPT one, and uses something like a hundred-millionth of the memory, at least with a "large" upperLimit... ;-) I realise it won't make as much difference with a "smaller" upper limit, and that is your RL case, but still....
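Since each version wins in a different regime, one could of course just dispatch on whichever dimension is large. Below is only a sketch of that idea: the name hybridNumberInMostIntervals and the n² threshold are my own illustrative choices, and the small-n branch is a simpler O(n²) stand-in for my algorithm (test every min-bound against every interval) rather than a copy of it.

```cpp
#include <utility>
#include <vector>

// Hypothetical hybrid: pick the strategy whose cost driver is smaller.
// The prefix-sum approach costs O(upperLimit) time and memory; the
// interval-only approach costs O(n^2) time but only O(1) extra memory.
std::pair<int, int> hybridNumberInMostIntervals(const std::vector<std::pair<int, int>>& intervals, int upperLimit)
{
    if (intervals.empty())
        return {-1, -1};
    const std::size_t n = intervals.size();

    if (static_cast<std::size_t>(upperLimit) <= n * n) {
        // Small value range: difference array + prefix sum, as in the ChatGPT version
        std::vector<int> diff(upperLimit + 2, 0);
        for (const auto& iv : intervals) {
            ++diff[iv.first];
            --diff[iv.second + 1];
        }
        int best = 0, bestAt = 0, current = 0;
        for (int i = 0; i <= upperLimit; ++i) {
            current += diff[i];
            if (current > best) { best = current; bestAt = i; }
        }
        return {bestAt, best};
    }

    // Few intervals: count, for every min-bound, how many intervals contain it
    // (maximum coverage is always achieved at some interval's min-bound)
    int best = 0, bestAt = -1;
    for (const auto& candidate : intervals) {
        int count = 0;
        for (const auto& iv : intervals)
            if (iv.first <= candidate.first && candidate.first <= iv.second)
                ++count;
        if (count > best) { best = count; bestAt = candidate.first; }
    }
    return {bestAt, best};
}
```

The n² threshold is only a back-of-the-envelope crossover; a real choice would also weigh the cost of allocating upperLimit + 2 ints.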
@VRonin
Having taken the time to produce this to address your "what I'm doing now but feels sooooo inefficient", I hope you might take a look at it / test it for yourself.
My conclusion: mine is more work to figure out than copying a (decent) solution from ChatGPT, but in view of the vast speed and space improvements I feel I do not yet need to hang up my programming clogs; I am not replaced by ChatGPT, yet... :D
P.S.
In the interests of clarity/fairness I must admit where my algorithm is not so good. As you increase num_intervals mine will use more memory for the frequency[] array and spend more time in its loops and sorting (the decrement loops are O(num_intervals²) in the worst case), while ChatGPT's will be relatively unaffected. Basically, for both speed and space, ChatGPT's scales with upperLimit while mine scales with num_intervals. So the "best" choice depends on whether you have a large range bound in your ranges or a large number of ranges.
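P.P.S. If the number of intervals ever did get large, the O(num_intervals²) decrement loops could be avoided by also sorting the max-bounds: treat all 2n endpoints as +1/-1 events and sweep over them once, which is O(n log n) overall. The sketch below is my own refinement of the idea, not part of the code above (the function name is made up for illustration).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Sweep over interval endpoints as +1/-1 events, sorted by coordinate.
// Returns {number, count} like the functions above; {-1, -1} if empty.
std::pair<int, int> sweepNumberInMostIntervals(const std::vector<std::pair<int, int>>& intervals)
{
    if (intervals.empty())
        return {-1, -1};

    // Each event: <coordinate, +1 at a min-bound, -1 just past a max-bound>
    std::vector<std::pair<int, int>> events;
    events.reserve(intervals.size() * 2);
    for (const auto& iv : intervals) {
        events.push_back({iv.first, +1});
        events.push_back({iv.second + 1, -1});
    }

    // Sort by coordinate; at equal coordinates process -1 before +1,
    // because an interval ending at c-1 no longer covers c
    std::sort(events.begin(), events.end(),
              [](const std::pair<int, int>& x, const std::pair<int, int>& y)
              {
                  return x.first != y.first ? x.first < y.first : x.second < y.second;
              });

    int current = 0, best = 0, bestAt = -1;
    for (const auto& e : events) {
        current += e.second;
        if (current > best) {
            best = current;
            bestAt = e.first; // max coverage always begins at some min-bound
        }
    }
    return {bestAt, best};
}
```

Note the sweep needs no upperLimit parameter at all: neither its time nor its memory depends on the size of the value range.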