Solving common binary search problems with std::lower_bound and std::upper_bound


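Common binary search problems fall into the following cases:

1. Find a specific value in a sorted array.

2. Find the largest element that is less than a given value in a sorted array.

3. Find the largest element that is less than or equal to a given value in a sorted array.

4. Find the smallest element that is greater than a given value in a sorted array.

5. Find the smallest element that is greater than or equal to a given value in a sorted array.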

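"Sorted" here means sorted in ascending order.

Case 1 is the simplest and is not discussed here.

Cases 2 and 3 belong together: both look for a bound from below (the answer lies to the left of the given value).

Cases 4 and 5 belong together: both look for a bound from above (the answer lies to the right of the given value).

The first reference at the end of this article gives hand-written implementations of these cases.

Besides writing the search yourself, the STL already provides good implementations: std::lower_bound and std::upper_bound.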

std::lower_bound has two declarations:

    template< class ForwardIt, class T >
    ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value );

    template< class ForwardIt, class T, class Compare >
    ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value, Compare comp );

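cppreference describes it as returning an iterator to the first element in the range [first, last) that is not less than (i.e. greater than or equal to) the given value, or last if no such element exists.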
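
To make that description concrete, here is a minimal sketch of the three possible outcomes: the value is present, the value falls between two elements, and the value is larger than everything. The helper name first_not_less_index and the sample data are made up for illustration; only std::lower_bound itself comes from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first element that is not less than
// `value`, or v.size() when every element is less than `value`.
std::size_t first_not_less_index(const std::vector<int>& v, int value) {
    auto it = std::lower_bound(v.begin(), v.end(), value);
    return static_cast<std::size_t>(it - v.begin());
}

// Small self-check covering the three outcomes.
void lower_bound_demo() {
    const std::vector<int> v = {1, 3, 3, 5, 7};
    assert(first_not_less_index(v, 3) == 1);  // first of the two 3s
    assert(first_not_less_index(v, 4) == 3);  // 5 is the first element >= 4
    assert(first_not_less_index(v, 8) == 5);  // nothing >= 8, so last is returned
}
```

Returning an index instead of an iterator is only a convenience for the checks; the iterator form is what the rest of the article uses.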

A possible implementation of the first declaration is:

    template<class ForwardIt, class T>
    ForwardIt lower_bound(ForwardIt first, ForwardIt last, const T& value)
    {
        ForwardIt it;
        typename std::iterator_traits<ForwardIt>::difference_type count, step;
        count = std::distance(first, last);  // number of elements still in play

        while (count > 0) {
            it = first;
            step = count / 2;
            std::advance(it, step);          // it now points at the middle element
            if (*it < value) {
                first = ++it;                // answer lies strictly to the right
                count -= step + 1;
            }
            else
                count = step;                // answer is at it or to its left
        }
        return first;
    }

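Reading the source, we can see that first keeps moving right for as long as the condition (*it < value) holds, and that the function finally returns the first element for which the condition fails. This solves case 5 directly. Case 2 is closely related, except that it asks for the last element for which the condition holds. To solve case 2 with std::lower_bound, take the element immediately to the left of the returned iterator. If the returned iterator is already first (the leftmost position), no element is smaller than the value and the answer does not exist. A return value of last needs no special handling: its left neighbor is the last element of the range, which is then the answer.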
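
The two uses just described can be sketched as follows. The function names, the Iter alias, and the convention of returning v.end() for "no answer" are assumptions made for this example, not something the library prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Case 5: smallest element >= target, or v.end() if there is none.
Iter find_greater_equal_min(const std::vector<int>& v, int target) {
    return std::lower_bound(v.begin(), v.end(), target);
}

// Case 2: largest element < target, or v.end() if there is none.
Iter find_less_max(const std::vector<int>& v, int target) {
    Iter it = std::lower_bound(v.begin(), v.end(), target);
    if (it == v.begin()) {
        return v.end();    // nothing to the left: no element is < target
    }
    return std::prev(it);  // also correct when it == v.end()
}

void case2_and_case5_demo() {
    const std::vector<int> v = {1, 3, 3, 5, 7};
    assert(*find_greater_equal_min(v, 4) == 5);
    assert(find_greater_equal_min(v, 8) == v.end());
    assert(*find_less_max(v, 3) == 1);         // skips both 3s
    assert(find_less_max(v, 1) == v.end());    // no element below 1
    assert(*find_less_max(v, 100) == 7);       // lower_bound returned end()
}
```

The last check is the easy one to get wrong: when every element is smaller than the target, std::lower_bound returns end(), and stepping left from there is still valid as long as the range is not empty.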

The STL also provides a second overload that lets the caller supply the comparison function. A possible implementation is:

    template<class ForwardIt, class T, class Compare>
    ForwardIt lower_bound(ForwardIt first, ForwardIt last, const T& value, Compare comp)
    {
        ForwardIt it;
        typename std::iterator_traits<ForwardIt>::difference_type count, step;
        count = std::distance(first, last);

        while (count > 0) {
            it = first;
            step = count / 2;
            std::advance(it, step);
            if (comp(*it, value)) {          // element first, value second
                first = ++it;
                count -= step + 1;
            }
            else
                count = step;
        }
        return first;
    }

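This overload can be described as returning an iterator to the first element in [first, last) for which comp(element, value) is false. The range and comp must be related in a specific way: the range has to be partitioned so that comp is true for every element in the left part and false for every element in the right part. With a comp of our own and the step-left approach described above, we can solve case 3: if comp(element, value) is defined as element <= value, the returned iterator points to the first element greater than the value, and its left neighbor is the largest element less than or equal to the value.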
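
As a sketch of case 3 under these rules, the helper below passes a "less than or equal" comparison to std::lower_bound and then steps left. The helper names are hypothetical. Since <= is not a strict weak ordering, this relies only on the partitioning property described above; readers who prefer to stay with the default comparison can get the same position from std::upper_bound, which the second helper shows for comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Case 3: largest element <= target, or v.end() if there is none.
Iter find_less_equal_max(const std::vector<int>& v, int target) {
    // comp(element, value) is true on the left part (element <= target)
    // and false on the right part (element > target).
    Iter it = std::lower_bound(v.begin(), v.end(), target,
                               [](int element, int value) { return element <= value; });
    return it == v.begin() ? v.end() : std::prev(it);
}

// Same position obtained through std::upper_bound with the default comparison.
Iter find_less_equal_max_via_upper(const std::vector<int>& v, int target) {
    Iter it = std::upper_bound(v.begin(), v.end(), target);
    return it == v.begin() ? v.end() : std::prev(it);
}

void case3_demo() {
    const std::vector<int> v = {1, 3, 3, 5, 7};
    assert(find_less_equal_max(v, 3) - v.begin() == 2);  // the last of the 3s
    assert(*find_less_equal_max(v, 4) == 3);
    assert(find_less_equal_max(v, 0) == v.end());         // nothing <= 0
    assert(*find_less_equal_max(v, 9) == 7);
    for (int t = 0; t <= 8; ++t) {
        assert(find_less_equal_max(v, t) == find_less_equal_max_via_upper(v, t));
    }
}
```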

Likewise, std::upper_bound has two declarations:

    template< class ForwardIt, class T >
    ForwardIt upper_bound( ForwardIt first, ForwardIt last, const T& value );

    template< class ForwardIt, class T, class Compare >
    ForwardIt upper_bound( ForwardIt first, ForwardIt last, const T& value, Compare comp );

cppreference describes it as returning an iterator to the first element in the range [first, last) that is greater than the given value, or last if no such element exists.

A possible implementation of the first declaration is:

    template<class ForwardIt, class T>
    ForwardIt upper_bound(ForwardIt first, ForwardIt last, const T& value)
    {
        ForwardIt it;
        typename std::iterator_traits<ForwardIt>::difference_type count, step;
        count = std::distance(first, last);

        while (count > 0) {
            it = first;
            step = count / 2;
            std::advance(it, step);
            if (!(value < *it)) {            // i.e. *it <= value
                first = ++it;
                count -= step + 1;
            }
            else
                count = step;
        }
        return first;
    }

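The implementations of std::upper_bound and std::lower_bound differ in exactly one place, the comparison (!(value < *it)), which is equivalent to (value >= *it). Described in the terms used for std::lower_bound, this code returns the first element for which (value >= *it) fails, that is, the first element for which (*it > value) holds, which is exactly the meaning of std::upper_bound. It solves case 4 directly.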
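
A short sketch of case 4 follows, together with one visible consequence of that single differing condition: the elements between the two bounds are exactly those equal to the value. The names find_greater_min and count_equal are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Iter = std::vector<int>::const_iterator;

// Case 4: smallest element > target, or v.end() if there is none.
Iter find_greater_min(const std::vector<int>& v, int target) {
    return std::upper_bound(v.begin(), v.end(), target);
}

// Number of elements equal to target: the gap between the two bounds.
std::ptrdiff_t count_equal(const std::vector<int>& v, int target) {
    Iter lo = std::lower_bound(v.begin(), v.end(), target);  // first element >= target
    Iter hi = std::upper_bound(v.begin(), v.end(), target);  // first element >  target
    return hi - lo;
}

void case4_demo() {
    const std::vector<int> v = {1, 3, 3, 5, 7};
    assert(*find_greater_min(v, 3) == 5);       // skips both 3s
    assert(*find_greater_min(v, 0) == 1);
    assert(find_greater_min(v, 7) == v.end());  // nothing above 7
    assert(count_equal(v, 3) == 2);
    assert(count_equal(v, 4) == 0);             // both bounds point at 5
}
```
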
std::upper_bound also accepts a caller-supplied comparison function. A possible implementation is:

    template<class ForwardIt, class T, class Compare>
    ForwardIt upper_bound(ForwardIt first, ForwardIt last, const T& value, Compare comp)
    {
        ForwardIt it;
        typename std::iterator_traits<ForwardIt>::difference_type count, step;
        count = std::distance(first, last);

        while (count > 0) {
            it = first;
            step = count / 2;
            std::advance(it, step);
            if (!comp(value, *it)) {         // value first, element second
                first = ++it;
                count -= step + 1;
            }
            else
                count = step;
        }
        return first;
    }

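This overload can be described as returning an iterator to the first element in [first, last) for which comp(value, element) is true. Again the range and comp must be related in a specific way: comp is false for every element in the left part of the range and true for every element in the right part. With a comp of our own, for example comp(value, element) defined as value <= element, it can be used to solve case 5.

Note that the comp functions of std::lower_bound and std::upper_bound are not interchangeable, because their arguments arrive in a different order: std::lower_bound calls comp(element, value), while std::upper_bound calls comp(value, element).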
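
The difference in argument order is easiest to see when the searched value and the elements have different types, because then a comparison written for one function does not even compile for the other. The sketch below assumes a made-up Record struct searched by its integer key, and also includes the case 5 comparison mentioned above; all names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical element type, sorted by key.
struct Record {
    int key;
    std::string label;
};

// std::lower_bound calls comp(element, value).
std::size_t lower_index(const std::vector<Record>& v, int key) {
    auto it = std::lower_bound(v.begin(), v.end(), key,
        [](const Record& element, int value) { return element.key < value; });
    return static_cast<std::size_t>(it - v.begin());
}

// std::upper_bound calls comp(value, element).
std::size_t upper_index(const std::vector<Record>& v, int key) {
    auto it = std::upper_bound(v.begin(), v.end(), key,
        [](int value, const Record& element) { return value < element.key; });
    return static_cast<std::size_t>(it - v.begin());
}

// Case 5 through std::upper_bound: first element with target <= element.
std::size_t first_not_less_via_upper(const std::vector<int>& v, int target) {
    auto it = std::upper_bound(v.begin(), v.end(), target,
        [](int value, int element) { return value <= element; });
    return static_cast<std::size_t>(it - v.begin());
}

void argument_order_demo() {
    const std::vector<Record> records = {{1, "a"}, {3, "b"}, {3, "c"}, {5, "d"}};
    assert(lower_index(records, 3) == 1);  // first record with key >= 3
    assert(upper_index(records, 3) == 3);  // first record with key > 3
    assert(lower_index(records, 4) == 3);
    assert(upper_index(records, 4) == 3);

    const std::vector<int> v = {1, 3, 3, 5, 7};
    assert(first_not_less_via_upper(v, 3) == 1);
    assert(first_not_less_via_upper(v, 4) == 3);
}
```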

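Looking at the code above, std::lower_bound and std::upper_bound only ever move first and never move last, which is quite different from the binary search we usually write by hand (see the implementation in the first reference at the end of this article), where both ends move. This is probably not because last cannot be moved: it is passed by value, so the function is free to change its own copy. I think the more likely reason is that these algorithms only require forward iterators, which cannot step backwards, so the implementation tracks the remaining range as first plus count, and shrinking count plays the role of moving the right end. Because both functions return the first element for which a condition stops (or starts) holding, cases 2 and 3 cannot be read off the result directly and need one step to the left.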
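
One thing worth noting about an implementation that only uses std::distance, std::advance and ++ is that it needs nothing beyond forward iterators, so it also works on a singly linked list. The sketch below assumes a std::forward_list of ints; the helper name and the fallback parameter are made up for the example. The number of comparisons is still logarithmic, although walking the iterators takes linear time on such a container.

```cpp
#include <algorithm>
#include <cassert>
#include <forward_list>

// Hypothetical helper: first value >= target in a sorted singly linked
// list, or `fallback` when no such value exists.
int first_not_less_or(const std::forward_list<int>& sorted, int target, int fallback) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), target);
    return it == sorted.end() ? fallback : *it;
}

void forward_list_demo() {
    const std::forward_list<int> sorted = {1, 3, 3, 5, 7};
    assert(first_not_less_or(sorted, 4, -1) == 5);
    assert(first_not_less_or(sorted, 3, -1) == 3);
    assert(first_not_less_or(sorted, 8, -1) == -1);  // nothing >= 8
}
```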

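I wrote a small program that uses std::lower_bound and std::upper_bound to solve the cases above and checks the results against straightforward linear scans.

It uses gtest and C++11.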

    #include <gtest/gtest.h>

    #include <algorithm>
    #include <cstdlib>
    #include <iostream>
    #include <iterator>
    #include <vector>

    typedef std::vector<int>::const_iterator iter;

    const int NUMBER_SIZE = 100000;
    const int TEST_COUNT = 100;

    int GenARandomNumber() {
      return std::rand() % 1000;
    }

    std::vector<int> GenerateRandomSortedVector(int n) {
      std::vector<int> numbers(n);
      for (int i = 0; i < n; ++i) {
        numbers[i] = GenARandomNumber();
      }
      std::sort(numbers.begin(), numbers.end());
      return numbers;
    }

    // Debug helper: prints all numbers on one line.
    void ShowNumbers(const std::vector<int>& numbers) {
      for (auto num : numbers) {
        std::cout << num << " ";
      }
      std::cout << std::endl;
    }

    // Linear scan: first element for which comp(element, target) is true.
    template <class Compare>
    iter FindFirstTrue(const std::vector<int>& numbers, int target, Compare comp) {
      iter it = numbers.begin();
      for (; it != numbers.end(); ++it) {
        if (comp(*it, target)) {
          break;
        }
      }
      return it;
    }

    // Linear scan: first element for which comp(element, target) is false.
    template <class Compare>
    iter FindFirstFalse(const std::vector<int>& numbers, int target, Compare comp) {
      iter it = numbers.begin();
      for (; it != numbers.end(); ++it) {
        if (!comp(*it, target)) {
          break;
        }
      }
      return it;
    }

    // Case 2, linear reference: largest element < target, or end() if none.
    iter FindLessMax(const std::vector<int>& numbers, int target) {
      iter result = numbers.end();
      for (iter it = numbers.begin(); it != numbers.end() && *it < target; ++it) {
        result = it;
      }
      return result;
    }

    // Case 3, linear reference: largest element <= target, or end() if none.
    iter FindLessEqualMax(const std::vector<int>& numbers, int target) {
      iter result = numbers.end();
      for (iter it = numbers.begin(); it != numbers.end() && *it <= target; ++it) {
        result = it;
      }
      return result;
    }

    // Case 4, linear reference: smallest element > target, or end() if none.
    iter FindGreaterMin(const std::vector<int>& numbers, int target) {
      iter it = numbers.begin();
      for (; it != numbers.end(); ++it) {
        if (*it > target) {
          break;
        }
      }
      return it;
    }

    // Case 5, linear reference: smallest element >= target, or end() if none.
    iter FindGreaterEqualMin(const std::vector<int>& numbers, int target) {
      iter it = numbers.begin();
      for (; it != numbers.end(); ++it) {
        if (*it >= target) {
          break;
        }
      }
      return it;
    }

    // Moves one position left of a bound. A bound equal to begin() means no
    // answer exists, reported as end(); a bound equal to end() is fine and
    // yields the last element.
    iter StepLeft(const std::vector<int>& numbers, iter bound) {
      return bound == numbers.begin() ? numbers.end() : std::prev(bound);
    }

    TEST(BsearchTest, findLessMax) {
      const std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
      for (int i = 0; i < TEST_COUNT; ++i) {
        int target = GenARandomNumber();
        iter it_line = FindLessMax(numbers, target);
        iter it_bin = StepLeft(
            numbers, std::lower_bound(numbers.begin(), numbers.end(), target));
        ASSERT_TRUE(it_line == it_bin);
      }
    }

    TEST(BsearchTest, findLessEqualMax) {
      const std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
      for (int i = 0; i < TEST_COUNT; ++i) {
        int target = GenARandomNumber();
        iter it_line = FindLessEqualMax(numbers, target);
        // lower_bound calls comp(element, value).
        iter it_bin = StepLeft(
            numbers, std::lower_bound(numbers.begin(), numbers.end(), target,
                                      [](int val, int tar) { return val <= tar; }));
        ASSERT_TRUE(it_line == it_bin);
      }
    }

    TEST(BsearchTest, findGreaterMin) {
      const std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
      for (int i = 0; i < TEST_COUNT; ++i) {
        int target = GenARandomNumber();
        iter it_line = FindGreaterMin(numbers, target);
        iter it_bin = std::upper_bound(numbers.begin(), numbers.end(), target);
        ASSERT_TRUE(it_line == it_bin);
      }
    }

    TEST(BsearchTest, findGreaterEqualMin) {
      const std::vector<int> numbers = GenerateRandomSortedVector(NUMBER_SIZE);
      for (int i = 0; i < TEST_COUNT; ++i) {
        int target = GenARandomNumber();
        iter it_line = FindGreaterEqualMin(numbers, target);
        // upper_bound calls comp(value, element).
        iter it_bin = std::upper_bound(numbers.begin(), numbers.end(), target,
                                       [](int tar, int val) { return tar <= val; });
        ASSERT_TRUE(it_line == it_bin);
      }
    }

    int main(int argc, char** argv) {
      testing::InitGoogleTest(&argc, argv);
      return RUN_ALL_TESTS();
    }

If you have questions about this article, or think something in it is incorrect, please leave a comment.

References:

https://www.cnblogs.com/ider/archive/2012/04/01/binary_search.html

http://en.cppreference.com/w/cpp/algorithm/lower_bound

http://en.cppreference.com/w/cpp/algorithm/upper_bound

0 0