Standard Template Library (STL) V - Function Objects
Functors (Function Objects or Functionals) are simply put object + (). In other words, a functor is any object that can be used with () in the manner of a function.
Function objects are useful in that they can further leverage STL library. Also, they are inlined and can produce efficient object code.
This includes normal functions, pointers to functions, and class objects for which the () operator (function call operator) is overloaded, i.e., classes for which the function operator()() is defined.
Sometimes we can use a function object when an ordinary function won't work. The STL often uses function objects and provides several function objects that are very helpful.
Function objects are another example of the power of generic programming and the concept of pure abstraction. We could say that anything that behaves like a function is a function. So, if we define an object that behaves as a function, it can be used as a function.
For example, we could define a struct names absValue that encapsulates the operation of converting a value of type float to its absolute valie:
#include <iostream> struct absValue { float operator()(float f) { return f > 0 ? f : -f; } }; int main( ) { using namespace std; float f = -123.45; absValue aObj; float abs_f = aObj(f); cout << "f = " << f << " abs_f = " << abs_f << endl; return 0; }
Output is:
f = -123.45 abs_f = 123.45
As we see from the definition, even though aObj is an object and not a function, we were able to make a call on that object. The effect is to run the overloaded call operator defined by the object absValue. The operator takes a float value and returns its absolute value. Note that the function-call operator must be declared as a member function.
So, Objects of class types, like the absValue object, that define the call operator are referred to as function object.
Let's look at another example simulating a line. The class object is working as a functor taking x for a given y-intercept(b) and slope(a) giving us the corresponding y coordinate.
y = ax + b
Here is the code:
#include <iostream> using namespace std; class Line { double a; // slope double b; // y-intercept public: Line(double slope = 1, double yintercept = 1): a(slope),b(yintercept){} double operator()(double x){ return a*x + b; } }; int main () { Line fa; // y = 1*x + 1 Line fb(5.0,10.0); // y = 5*x + 10 double y1 = fa(20.0); // y1 = 20 + 1 double y2 = fb(3.0); // y2 = 5*3 + 10 cout << "y1 = " << y1 << " y2 = " << y2 << endl; return 0; }
Here y1 is calculated using the expression 1 * 20 + 1 and y2 is calculated using the expression 5 * 3 + 10. In the expression a *x + b, the values for b and a come from the constructor for the object, and the value of x comes from the argument to operator()().
Now that we have a little bit of taste for functor, let's step back and think. So, what's the behavior of a function? A functional behavior is something that we can call by using parentheses and passing arguments. For example:
func(arg1, arg2);
If we want objects to behave this way, we have to make it possible to call them by using parentheses and passing arguments. It's not that difficult. All we have to do is to define operator () with the appropriate parameter types:
Class X { public: // define "function call" operator return-value operator() (arguments) const; ... };
Then we can use object of this class to behave as a function that we can call:
X fn; ... fn(arg1, arg2); // call operator () for function object fn
This call is equivalent to:
fn.operator()(arg1,arg2); // call operator () for function object fn
Here is a function object example.
#include <iostream> #include <vector> #include <algorithm> using namespace std; class Print { public: void operator()(int elem) const { cout << elem << " "; } }; int main () { vector<int> vect; for (int i=1; i<10; ++i) { vect.push_back(i); } Print print_it; for_each (vect.begin(), vect.end(), print_it); cout << endl; return 0; }
The for_each function applied a specific function to each member of a range:
for_each (vect.begin(), vect.end(), print_it);
In general, the 3rd argument could be a functor, not just a regular function. Actually, this raises a question. How do we declare the third argument? We can't declare it as a function pointer because a function pointer specifies the argument type. Because a container can contain just about any type, we don't know in advance what particular type should be used. The STL solves that problem by using template.
The class Print defines object for which we can call operator () with an int argument. The expression
print_itin the statement
for_each (vect.begin(), vect.end(), print_it);creates a temporary object of this class, which is passed to the for_each() algorithm as an argument. The for_each algorithm looks like this:
template<typename Iterator, typename Function> Function for_each(Iterator first, Iterator last, Function f) { while (first != end) { f(*first); ++first; } return f; }Note that the algorithm takes and returns a function object by value. Regarding this, in his book ("Effective STL" Chapter 6), Scott Meyers gives us the following comments: "Because function objects are passed and returned by value, the onus is on you to make sure that your function behave well when passed that way (i.e.copied). This implies two things, First, your function needs to be small. Otherwise they will be too expensive to copy. Second, your function object must be monomorphic (i.e., not polymorphic) - they must not use virtual functions. That's because derived class objects passed by value into the parameters of base class type suffer from the slicing problem....."
The for_each uses the temporary function f to call f(*first) for each element first. If the third parameter is an ordinary function, it simply calls it with *first as an argument. If the third parameter is a function object, it calls operator() for the function object f with *first as an argument. Thus, in this example for_each() calls:
print_it::operator()(*first)The out is:
1 2 3 4 5 6 7 8 9
Here are some advantages of function object listed in "The C++ Standard Library" by Nicolai M. Josuttis.
- Function object are "smart functions."
Objects that behave like pointers are smart pointers. This is similarly true for objects that behave like functions: They can be "smart functions" because they may have abilities beyond operator (). Function objects may have other member functions and attributes. This means that function objects have a state. .... - Each function object has its own type.
Ordinary functions have different types only when their signatures differ. However, function objects can have different types when their signatures are the same. In fact, each functional behavior defined by a function object has its own type. This is a significant improvement for generic programming using templates because you can pass functional behavior as a template parameter. ... - Function objects are usually faster than ordinary functions.
The concept of templates usually allows better optimization because more details are defined at compile time. Thus, passing function objects instead of ordinary functions often results in better performance.
A template nontype parameter is a constant value inside the template definition. We can use a nontype parameter when we need constant expressions. In the example, we need that to specify the size of an array. So, when arrayInit is called, the compiler figures out the value of the nontype parameter from the array argument.
Here is an example using integer as a non-type parameters. In the code below, we want to assign values for all the elements of a vector. So, we set the template parameter with an integer value as non-type, and as we traverse each element of the vector container, we set the value we want by calling setValue().
#include <iostream> #include <vector> #include <algorithm> using namespace std; template <int val> void setValue(int& elem) { elem = val; } template <typename T> class PrintElements { public: void operator()(T& elm) const {cout << elm << ' ';} }; int main() { int size = 5; vector<int> v(size); PrintElements<int> print_it; for_each(v.begin(), v.end(), print_it); for_each(v.begin(), v.end(), setValue<10>); for_each(v.begin(), v.end(), print_it); return 0; }
Output:
0 0 0 0 0 10 10 10 10 10
Here is a similar example but this time it's adding some value to each element of the vector at runtime:
#include <iostream> #include <vector> #include <algorithm> using namespace std; template <typename T> class Add { T x; public: Add(T xx):x(xx){} void operator()(T& e) const { e += x; } }; template <typename T> class PrintElements { public: void operator()(T& elm) const { cout << elm << ' ';} }; int main() { int size = 5; vector<int> v; for(int i = 0; i < size; i++) v.push_back(i); PrintElements<int> print_it; for_each(v.begin(), v.end(), print_it); cout << endl; for_each(v.begin(), v.end(), Add<int>(10)); for_each(v.begin(), v.end(), print_it); cout << endl; for_each(v.begin(), v.end(), Add<int>(*v.begin())); for_each(v.begin(), v.end(), print_it); return 0; }
Output:
0 1 2 3 4 10 11 12 13 14 20 21 22 23 24
The first call to Add(10) creates a Add type temporary object which is initialized with x = 10. Then, it adds its member value 10 to each of the element of the vector v while traversing. The second call to Add(*v.begin()) sets the member of the temporary object to v.begin() element, and then adds that value to each of the element of the vector.
STL refines functor concepts as follows:
- A generator is a functor that can be called with no argument.
- A unary function is a functor that can be called with one argument.
- A binary function is a functor that can be called with two arguments.
The print_it functor for for_each() we used in the previous section is a unary function because it is applied to one container element at a time.
A special auxiliary function for algorithm is a predicate. Predicates are functions that return a Boolean value (or something that can be implicitly converted to bool). In other words, a predicate class is a functor class whose operator() function is a predicate, i.e., its operator() returns true or false.
Predicates are widely used in the STL. The comparison functions for the standard associative containers are predicates, and predicate functions are commonly passed as parameters to algorithms like find_if. Depending on their purpose, predicates are unary or binary.
- A unary function that returns a bool value is a predicate.
- A binary function that returns a bool value is a binary predicate.
A typical example of predefined function object in STL is set. It sorts element in increasing order because by default it's using less(<) sorting criteria. So, when we do:
set<int> mySet;
internal code looks like this:
set<int, less<int> > mySet;
So, we want to store our elements in decreasing order in a set, we do:
set<int, greater<int> > mySet;
In the example below, the negate<int>() creates a function object from the predefined template class negate. It simply returns the negated int element.
The second transform() algorithm combines the elements from two vector containers by using multiplies<int>() operation. Then it writes the results into the 3rd vector container.
#include <iostream> #include <vector> #include <algorithm> using namespace std; template <typename T> class PrintElements { public: void operator()(T& elm) const { cout << elm << ' ';} }; int main() { PrintElements<int> print_it; int size = 5; vector<int> v; for(int i = 0; i < size; i++) v.push_back(i); for_each(v.begin(), v.end(), print_it); cout << endl; transform(v.begin(), v.end(), // source v.begin(), // destination negate<int>()); // operation for_each(v.begin(), v.end(), print_it); cout << endl; transform(v.begin(), v.end(), // source v.begin(), // second source v.begin(), // destination multiplies<int>()); // operation for_each(v.begin(), v.end(), print_it); cout << endl; return 0; }
Output:
0 1 2 3 4 0 -1 -2 -3 -4 0 1 4 9 16
Here are the two types of transform() algorithms:
- Transforming elements
OutputIterator transform ( InputIterator source.begin(), InputIterator source.end(), OutputIterator destination.begin(), UnaryFunc op )
- Combining elements of two sequences
OutputIterator transform ( InputIterator1 source1.begin(), InputIterator1 source1.end(), InputIterator2 source2.begin() OutputIterator destination.begin(), BinaryFunc op )
In the example below, we call the algorithm transform() to apply the operation defined by absValue to every element of the vec.
#include <iostream> #include <vector> #include <algorithm> using namespace std; class absValue { public: int operator()(int val) { return val < 0 ? -val : val; } }; int main() { int a[] = {-3,-2,-1, 0, 1, 2, 3}; int size = sizeof(a)/sizeof(a[0]); vector<int> vec(a, a+size); for(int i = 0; i < size; i++) cout << vec[i] << " "; transform(vec.begin(), vec.end(), vec.begin(), absValue()); cout << "\nafter transform()\n"; for(int i = 0; i < size; i++) cout << vec[i] << " "; return 0; }
Output:
-3 -2 -1 0 1 2 3 after transform() 3 2 1 0 1 2 3
Actually, the tranform() should look like this:
typedef vector<int>::iterator iter_type; iter_type transform(iter_type iter, iter_type last, iter_type result, absValue func) { while( iter != last) *result++ = func( *iter++); return iter; }
Though the STL takes the responsibility for holding new objects when we do push_back() etc., there are cases we should also take the responsibility as shown in the example below. When we use transform() and put the results into the container, we may want to make sure the container has enough room. In the code, the result vector does not have enough space for it:
#include <iostream> #include <vector> #include <algorithm> using namespace std; class absValue { public: int operator()(int val) { return val < 0 ? -val : val; } }; int main() { int a[] = {-3,-2,-1, 0, 1, 2, 3}; int size = sizeof(a)/sizeof(a[0]); vector<int> vec(a, a+size); vector<int> result; for(int i = 0; i < size; i++) cout << vec[i] << " "; transform(vec.begin(), vec.end(), result.begin(), absValue()); cout << "\nafter transform()\n"; for(int i = 0; i < size; i++) cout << vec[i] << " "; return 0; }
To make it work, we may want to call back_inserter to generate the iterator. Internally, the iterator triggers the call to push_back():
#include <iostream> #include <vector> #include <algorithm> #include <iterator> using namespace std; class absValue { public: int operator()(int val) { return val < 0 ? -val : val; } }; int main() { int a[] = {-3,-2,-1, 0, 1, 2, 3}; int size = sizeof(a)/sizeof(a[0]); vector<int> vec(a, a+size); vector<int> result; for(int i = 0; i < size; i++) cout << vec[i] << " "; transform(vec.begin(), vec.end(), back_inserter(result), absValue()); cout << "\nafter transform(), the result vector\n"; for(int i = 0; i < size; i++) cout << result[i] << " "; return 0; }
To be more efficient, we need to reserve enough space before we insert new elements since whenever we insert a new item, there will be a cost of shifting elements. (when we use inserter or front_inserter().
vector<int> result; result.reserve(vec.size());
Function adapter converts some other interface to an interface used by STL.
Here is the declaration of bind2nd():
template <class Operation, class T> binder2nd<Operation> bind2nd (const Operation& op, const T& x);
It returns function object with second parameter bound. This function constructs an unary function object from the binary function object op by binding its second parameter to the fixed value x.
The function object returned by bind2nd() has its operator() defined such that it takes only one argument. This argument is used to call binary function object op with x as the fixed value for the second argument.
#include <iostream> #include <vector> #include <algorithm> #include <iterator> using namespace std; struct PrintElm { void operator()(int & elm) const { cout << elm << ' ';} }; int main() { int size = 10; vector<int> v; for(int i = 0; i < size; i++) v.push_back(i); for_each(v.begin(), v.end(), PrintElm()); cout << endl; replace_if(v.begin(), v.end(), bind2nd(equal_to<int>(),0), -1); for_each(v.begin(), v.end(), PrintElm()); cout << endl; v.erase( remove_if(v.begin(), v.end(), bind2nd(less<int>(), 3)), v.end() ); for_each(v.begin(), v.end(), PrintElm()); cout << endl; transform(v.begin(), v.end(), ostream_iterator<int>(cout, " "), negate<int>()); cout << endl; cout << "reset vector: "; v.clear(); for(int i = 0; i < size; i++) v.push_back(i); for_each(v.begin(), v.end(), PrintElm()); cout << endl; typedef vector<int>::const_iterator CI; CI cib = v.begin(); CI cie = v.end(); vector<int>::iterator ib = v.begin(); cout << "add 10 to each element: "; transform(cib, cie, ib, bind2nd(plus<int>(),10)); copy(v.begin(), v.end(), ostream_iterator<int>(cout, " ")); cout << endl; /* starting from the 2nd element, set value after adding 10 to the previous element */ cout << "plus 10 to the previous element: "; transform(cib, --cie, ++ib, bind2nd(plus<int>(),10)); copy(v.begin(), v.end(), ostream_iterator<int>(cout, " ")); return 0; }
Output:
0 1 2 3 4 5 6 7 8 9 -1 1 2 3 4 5 6 7 8 9 3 4 5 6 7 8 9 -3 -4 -5 -6 -7 -8 -9 reset vector: 0 1 2 3 4 5 6 7 8 9 add 10 to each element: 10 11 12 13 14 15 16 17 18 19 plus 10 to the previous element: 10 20 30 40 50 60 70 80 90 100
Here are additional examples of bind2nd():
- To count all the elements within a vector that are less than or equal to 100, we can use count_if():
count_if(vec.begin(), vec.end(), bind2nd(less_equal<int>(), 100));
The 3rd argument uses bind2nd() function adaptor. This adaptor returns a function object that applies the <= operator using 100 as the right hand operand. So, the call to count_if() counts the number of elements in the input range (from vec.begin() to vec.end()) that are less than or equal to 100. - We can negate the binding of less_equal:
count_if(vec.begin(), vec.end(), not1(bind2nd(less_equal<int>(), 100)));
As in the previous sample, we first bind the second operand of the less_equal object to 100, transforming that binary operation into a unary operation. Then, we negate the return from the operation using not1. So, what it does is that each element will be tested to see if it is <= 100. Then, the truth value of the result will be negated. Actually, the call counts those elements that are not <= 100.
When we use remove() from STL, we need special care.
"... remove is the most confusing algorithm in the STL. Misunderstanding remove is easy, and it's important to dispel all doubt about what remove() does, why it does it, and how it goes about doing it." - Scott Meyers in Item 32: Follow remove-like algorithms by erase if you really want to remove something. In his book "Effective STL"
The code below shows the size of the container does not change even after the remove_if().
#include <algorithm> #include <iostream> #include <list> using namespace std; struct P { bool operator()(const int &n;) const { return n % 3 == 0; } }; int main() { const int a[] = { 1,2,3,4,5,6,7,8,9 }; const int count = sizeof(a) / sizeof(a[0]); list<int> lst(a, a+ count); cout << lst.size(); // size is 9 remove_if(lst.begin(), lst.end(), P()); cout << lst.size() << endl; // size is still 9 return 0; }
Actuall, remove_if() returns the something like lst.newEnd(), however, it does not change the lst.end() and the size remains the same. Internally, remove() walks down the range, overwriting the values that are to be removed with later values that are to be retained.
So, if we really want to remove the items, the remove() should be used with erase as shown in the code below:
lst.erase( remove_if(lst.begin(), lst.end(), P()), lst.end() );
#include <algorithm> #include <iostream> #include <iterator> #include <list> using namespace std; template<typename T> class printElements { public: void operator()(T &elm;) const {cout << elm << ' ';} }; bool compare(int a, int b) { return a % 2 == b % 2; } int main() { printElements<int> print_it; // source data int src[] = {1, 4, 1, 3, 8, 6, 5, 2, 2, 7}; const int count = sizeof(src) / sizeof(src[0]); // initialize col with elements from src list<int> col; copy(src, src+count, back_inserter(col)); for_each(col.begin(), col.end(), print_it); cout << endl; // remove consecutive even or odd number int *last = std::unique(src, src + count, compare); std::copy(src, last, std::ostream_iterator<int>(cout, " ")); cout << "\n\n"; // reinitialize col with elements from src copy(src, src+count, col.begin()); for_each(col.begin(), col.end(), print_it); cout << endl; list<int>::iterator new_pos; // remove consecutive duplicates new_pos = std::unique(col.begin(), col.end()); copy(col.begin(), new_pos, ostream_iterator<int>(cout, " ")); cout << "\n\n"; // reinitialize col with elements from src copy(src, src+count, col.begin()); for_each(col.begin(), col.end(), print_it); cout << endl; // remove an element smaller than the previous element col.erase(std::unique(col.begin(), col.end(), greater<int>()), col.end()); for_each(col.begin(), col.end(), print_it); cout << endl; return 0; }
Output:
1 4 1 3 8 6 5 2 2 7 1 4 1 8 5 2 7 1 4 1 8 5 2 7 2 2 7 1 4 1 8 5 2 7 2 7 1 4 1 8 5 2 7 2 2 7 1 4 8
The first call to unique() removes the consecutive even or odd numbers. Starting from 1 which is odd number, the next element 4 is even, so we keep 4. The next one is 1 (odd) and keep it because previous element 4 is even. But the next element after 1 (odd) is 3 which is also odd. So, we drop 3 because we have consecutive odd numbers.
The second call to unique() removes any consecutive duplicates.
The third call removes any element less than the previous element. The third element 1 is less than the second element 4, so we drop it. The fourth element 8 is greater than 4 which is the last element currently kept in the list. After 8, we erase all the rest since every element is less than 8.
As another function adapter, in this section, we're going to deal with mem_fun() and mem_fun_ref(). They enable us to call member function for each element of a collection.
Why do we need this?
The answer is because we need it to use STL. To be more specific, to use algorithm like for_each().
When we look at the for_each(),
template<typename Iterator, typename Function> Function for_each(Iterator first, Iterator last, Function f) { while (first != end) { f(*first); ++first; } return f; }it calls the f using the syntax of calling non-member function. So, to call a member function, we needs mem_fun(). STL could have two other types of
#include <algorithm> #include <iostream> #include <vector> #include <string> using namespace std; class A { public: explicit A(string s) :name(s) {} void NamePrint() const { cout << name << endl; } void PrefixNamePrint(string prefix) const { cout << prefix << name << endl; } static bool compare(A* a, A* b) { return a->name.length() > b->name.length(); } private: string name; }; int main() { vector<A*> v; v.push_back(new A("Steve")); v.push_back(new A("Jay")); v.push_back(new A("Fonseca")); std::stable_sort(v.begin(), v.end(), A::compare); for_each(v.begin(), v.end(), std::mem_fun(&A;::NamePrint)); cout << endl; for_each(v.begin(), v.end(), bind2nd(std::mem_fun(&A;::PrefixNamePrint), "Mr. ")); return 0; }
Output:
Fonseca Steve Judy Mr. Fonseca Mr. Steve Mr. Judy
In this sample, we put three pointers of object A. Then, we sort them decreasing order by the length of the member (length of the name) and print out using
Here is another version modified for use of mem_fun_ref() instead of mem_fun(). So, the vector elements have been switched to object not the pointer to the object. Also, the compare() member function has been changed.
#include <algorithm> #include <iostream> #include <vector> #include <string> using namespace std; class A { public: explicit A(string s) :name(s) {} void NamePrint() const { cout << name << endl; } void PrefixNamePrint(string prefix) const { cout << prefix << name << endl; } static bool compare(const A &a;, const A &b;) { return a.name.length() > b.name.length(); } private: string name; }; int main() { vector<A> v; A a1("Steve"); A a2("Jay"); A a3("Fonseca"); v.push_back(a1); v.push_back(a2); v.push_back(a3); std::stable_sort(v.begin(), v.end(), A::compare); for_each(v.begin(), v.end(), std::mem_fun_ref(&A;::NamePrint)); cout << endl; for_each(v.begin(), v.end(), bind2nd(std::mem_fun_ref(&A;::PrefixNamePrint), "Mr. ")); return 0; }
The following sample also uses mem_fun to call A:f(). In the code, we created 3 types objects from the base class A and two derived classes B and B. We put the pointers to vector and sort them in increasing order of the size of the class.
#include <algorithm> #include <iostream> #include <vector> using namespace std; class A { public: A() : mSize(sizeof(A)) { } public: virtual void f() const { cout << "sizeof(A) = " << sizeof(A) << endl; } virtual ~A() { } public: static bool compare(const A *a, const A *b) { return a->mSize < b->mSize; } protected: size_t mSize; }; class B : public A { public: B() : mB(0) { mSize = sizeof(B); } public: virtual void f() const { cout << "sizeof(B) = " << sizeof(B) << endl; } private: char *mB; }; class C : public A { public: C() { mSize = sizeof(C); } public: virtual void f() const { cout << "sizeof(C) = " << sizeof(C) << endl; } public: static int *mC; }; int *C::mC = 0; // for_each op () struct Destruct { void operator()(A *a) const { cout << "\ndelete"; delete a; } }; int main() { vector<A*> v; v.push_back(new C); v.push_back(new B); v.push_back(new A); // sort by increasing order of the size of a class std::stable_sort(v.begin(), v.end(), A::compare); for_each(v.begin(), v.end(), std::mem_fun(&A;::f)); Destruct d; // this calls Destruct::operator() for_each(v.begin(), v.end(), d); return 0; }
Output:
sizeof(C) = 8 sizeof(A) = 8 sizeof(B) = 12 delete delete delete
The size of A is 8 byte because 4(int) + 4(vtable pointer), and the size of C is also 8 because of the inherited member from A which is mSize (4) and 4 bytes for a vpointer. B requires additional 4 byte for a pointer to a character, so the size of B is 12. Note that static members are not counted when we calculate the size of a class.
Also note that we're using for_each() to delete our pointers to the three objects in the Destruct operator().
Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization