This question comes up often in C++ development: how do we return a std::vector from a function? Even though the relevant rules have been in place since C++17, the answer is still not always obvious. And the next natural question is: does the same rule apply when returning a std::vector from a class member function?

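To answer this properly, we first need to understand an important language feature: Return Value Optimisation, commonly referred to as RVO, which is a form of copy elision. Since C++17 it is guaranteed when a function returns a temporary (a prvalue). For a named local variable (often called NRVO) it is permitted and widely implemented, but not guaranteed.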

When a std::vector is returned by value, the costly element-by-element copy is avoided in one of two ways: RVO, or move semantics.

With RVO, the compiler constructs the returned object directly in the caller’s storage. No temporary object is created, and no copy or move takes place at all.

When RVO cannot be applied, the compiler falls back to move semantics. The returned local variable is treated as an rvalue, so the move constructor of std::vector is selected without you writing std::move, and the underlying buffer is transferred to the new vector. This operation is constant time, O(1), because only the internal pointers are handed over; the elements themselves are not copied.
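One way to convince yourself that a move hands over the buffer instead of copying elements is to compare the address of the element storage before and after the move. The following is only a small sketch; the helper name bufferSurvivesMove is made up for this post.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns true when move-constructing a vector
// hands the original element storage over to the new vector.
bool bufferSurvivesMove(std::size_t count)
{
    std::vector<int> source(count, 42);          // count elements, all 42
    const int* before = source.data();           // address of the element storage

    std::vector<int> target(std::move(source));  // move constructor takes over the buffer

    // No element was copied: target owns the very same storage.
    return target.data() == before && target.size() == count;
}
```

Calling bufferSurvivesMove(1000) from main() should report true regardless of the element count, which is what makes the move O(1).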

The general rule is this:
If a function returns a named local object by value and every return statement names that same object, the compiler can apply RVO. If a function has multiple return paths that return different named local objects, RVO may not be possible, and the compiler will typically use move semantics instead. Either way, no expensive deep copy of the vector occurs.

In this blog post I am going to discuss the following scenarios:

  1. Free Standing Functions
  2. Multiple Return Path
  3. Exception: The Conditional Operator (?:)
  4. Never Return a Reference to a Local std::vector
  5. Returning from a member function
  6. Returning a Temporary Object
  7. Rule of Thumb for Returning std::vector in C++

Now, let us look at an example to understand this further.

Free Standing Functions

The following example shows a case where the compiler applies Return Value Optimisation:

#include <iostream>
#include <vector>

std::vector<int> getValues()
{
    std::vector<int> myVec{1, 2, 3, 4};
    return myVec;
}

int main()
{
    std::vector<int> valueSet = getValues();
    return 0;
}

In this example, the free-standing function getValues() creates a local std::vector<int> called myVec and then returns it. The main() function calls getValues() and stores the returned value in a local vector called valueSet.

Rather than copying the elements from myVec into valueSet, the compiler constructs the vector directly in the storage of valueSet, completely avoiding the copy operation.

You can verify this using the online compiler explorer at https://godbolt.org/. If you examine the generated assembly, you will notice instructions similar to the following:

leaq -48(%rbp), %rax ; address of valueSet (the return object storage)
movq %rax, %rdi ; pass it as first arg
call _Z9getValuesv

In the above assembly code, the following is happening:

  • -48(%rbp) is the stack slot chosen for valueSet
  • leaq computes the address of that slot
  • movq places that address into %rdi, the first argument register
  • getValues is called with the address of the caller’s vector storage

This shows that the compiler is not expecting a vector to be returned by value in the traditional sense. Instead, it provides a destination address where the vector must be constructed.

Inside getValues(), the vector is then constructed directly at that address:

movq %rdi, -40(%rbp)
movq -40(%rbp), %rax
movq %rax, %rdi
call _ZNSt6vectorIiSaIiEEC1ESt16initializer_listIiERKS0_

So this is what is happening inside the getValues() function:

  • The passed-in address is saved. %rdi already contains the caller’s vector address, and the compiler stores it in a local stack slot using movq %rdi, -40(%rbp).
  • The same address is then reloaded. The compiler retrieves the saved return-object address using movq -40(%rbp), %rax.
  • That address is used as the constructor’s this pointer. %rdi is set to the caller’s vector address via movq %rax, %rdi.
  • Finally, the vector is constructed at that address. The std::vector constructor is invoked directly on the caller’s storage using call _ZNSt6vectorIiSaIiEEC1ESt16initializer_listIiERKS0_.

This clearly shows that the compiler is avoiding any copying of std::vector<int> from getValues() to main(). Instead, it constructs the result vector in place, meaning directly inside the caller’s stack frame.

Multiple Return Path

Now, the next case is where there are multiple return paths. In such situations, the RVO mechanism may not be suitable because the objects first need to be constructed, and only then can a decision be made about which one to return.

In this scenario, the compiler typically falls back to move semantics, which is still extremely efficient. A move operation only transfers the internal pointers to the new vector and therefore avoids any costly deep copy of the elements.

Let us have a look at the following example code.

#include <iostream>
#include <vector>

std::vector<int> getValues()
{
    bool flag = true;
    std::vector<int> myvec{1, 2, 3, 4};
    std::vector<int> yourvec{100, 200, 200, 400, 500};
    if (flag)
        return myvec;
    else
        return yourvec;
}

int main()
{
    std::vector<int> valueSet = getValues();
    for (auto const &v : valueSet)
        std::cout << v << "\n";
    return 0;
}

In this case, there are multiple return paths from the getValues() function. As a result, the compiler has to construct both vectors, evaluate the condition, and then decide which one to return. Because there is no single, unambiguous return object, the opportunity to construct the vector directly in the caller does not exist. Hence, RVO is not applicable here.

Instead, the compiler falls back to move semantics and moves the selected vector rather than copying it, thereby saving a significant amount of CPU cycles.
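To observe this rather than take it on trust, you can wrap the vector in a small type that counts its own copies and moves. The sketch below is an illustration only: Stats, g_stats, Counted and the measure functions are hypothetical names invented for this post, and the exact number of moves depends on whether your compiler applies RVO.

```cpp
#include <utility>
#include <vector>

// Hypothetical bookkeeping: how many copies and moves have happened.
struct Stats
{
    int copies = 0;
    int moves = 0;
};
Stats g_stats;

// Hypothetical wrapper around std::vector<int> that records every
// copy construction and move construction in g_stats.
struct Counted
{
    std::vector<int> data;

    explicit Counted(std::vector<int> d) : data(std::move(d)) {}
    Counted(const Counted& other) : data(other.data) { ++g_stats.copies; }
    Counted(Counted&& other) noexcept : data(std::move(other.data)) { ++g_stats.moves; }
};

// Every return names the same local object: an RVO candidate.
Counted onePath()
{
    Counted result(std::vector<int>{1, 2, 3, 4});
    return result;
}

// Two different locals can be returned: RVO is unlikely, but each
// return still names a local, so the move constructor is selected.
Counted twoPaths(bool flag)
{
    Counted first(std::vector<int>{1, 2, 3, 4});
    Counted second(std::vector<int>{100, 200, 300});
    if (flag)
        return first;
    else
        return second;
}

// Reset the counters, call one of the functions, report what happened.
Stats measureOnePath()
{
    g_stats = Stats{};
    Counted result = onePath();
    return g_stats;
}

Stats measureTwoPaths(bool flag)
{
    g_stats = Stats{};
    Counted result = twoPaths(flag);
    return g_stats;
}
```

In both cases the copies counter stays at zero. With mainstream compilers you would typically see zero moves from measureOnePath() and one move from measureTwoPaths(), but only the absence of copies is something the language rules promise here.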

However, do not assume that the compiler will always fall back to move semantics whenever RVO is not possible. There are important exceptions.

Exception: The Conditional Operator (?:)

Now, let us look at the following example and see what is going on.

#include <iostream>
#include <vector>

std::vector<int> getValues()
{
    bool flag = true;
    std::vector<int> myvec{1, 2, 3, 4};
    std::vector<int> yourvec{100, 200, 200, 400, 500};
    return flag ? myvec : yourvec;
}

int main()
{
    std::vector<int> valueSet = getValues();
    for (auto const &v : valueSet)
        std::cout << v << "\n";
    return 0;
}

At first glance, you might think the compiler will behave similarly to the previous example. But that is not what happens here.

The reason lies in how the conditional operator (?:) works. When both operands of the conditional operator are lvalues of the same type, the result of the expression is also an lvalue. In this case, both myvec and yourvec are lvalues, so the entire return expression is an lvalue.

When the return expression is an lvalue:

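  • RVO cannot apply, because the return expression is not simply the name of a local object.
  • The implicit move does not apply either. It is only permitted when the return expression is the name of a local variable (as in return myvec;), and a conditional expression is not.
  • The only remaining legal operation is a copy.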

As a result, the compiler is forced to perform a copy operation in this case.

If you want to avoid the copy here, you need to make the move explicit, as shown below.

#include <iostream>
#include <utility> // std::move
#include <vector>

std::vector<int> getValues()
{
    bool flag = true;
    std::vector<int> myvec{1, 2, 3, 4};
    std::vector<int> yourvec{100, 200, 200, 400, 500};
    // Both operands are now rvalues, so the result is moved, not copied.
    return flag ? std::move(myvec) : std::move(yourvec);
}

int main()
{
    std::vector<int> valueSet = getValues();
    for (auto const &v : valueSet)
        std::cout << v << "\n";
    return 0;
}

Now, the costly copy does not happen. Instead, the selected vector is moved, and the operation remains efficient.
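The difference between the variants can be made visible with a type that counts its own copies and moves. This is a sketch under assumptions: Stats, g_stats, Counted, the three via... functions and measure are hypothetical names used only for this illustration, and the counts assume a C++17 compiler.

```cpp
#include <utility>
#include <vector>

// Hypothetical bookkeeping: how many copies and moves have happened.
struct Stats
{
    int copies = 0;
    int moves = 0;
};
Stats g_stats;

// Hypothetical wrapper that records copy and move constructions.
struct Counted
{
    std::vector<int> data;

    explicit Counted(std::vector<int> d) : data(std::move(d)) {}
    Counted(const Counted& other) : data(other.data) { ++g_stats.copies; }
    Counted(Counted&& other) noexcept : data(std::move(other.data)) { ++g_stats.moves; }
};

Counted viaConditional(bool flag)
{
    Counted a(std::vector<int>{1, 2, 3, 4});
    Counted b(std::vector<int>{100, 200, 300});
    return flag ? a : b;                        // lvalue expression: copy constructor
}

Counted viaConditionalMove(bool flag)
{
    Counted a(std::vector<int>{1, 2, 3, 4});
    Counted b(std::vector<int>{100, 200, 300});
    return flag ? std::move(a) : std::move(b);  // rvalue expression: move constructor
}

Counted viaIfElse(bool flag)
{
    Counted a(std::vector<int>{1, 2, 3, 4});
    Counted b(std::vector<int>{100, 200, 300});
    if (flag)
        return a;                               // names a local: never a copy
    return b;
}

// Reset the counters, run one variant, report what it cost.
Stats measure(Counted (*make)(bool), bool flag)
{
    g_stats = Stats{};
    Counted result = make(flag);
    return g_stats;
}
```

Here measure(viaConditional, true) reports one copy, while measure(viaConditionalMove, true) reports one move and no copy. The plain if/else variant also avoids the copy without any explicit std::move, which is why it is usually the simpler choice.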

Never Return a Reference to a Local std::vector

A local vector in a free-standing function is destroyed when the function returns. Therefore, you should never return a reference to a local std::vector from such a function. Doing so would leave the caller with a dangling reference.

However, if the function is part of a class, that is, a class member function, the situation is different. We will see that in the next example.

Returning from a member function

When returning a std::vector from a member function, you need to consider the following cases.

  • In most situations, you will not want the caller to modify the vector. In that case, returning a const reference to the std::vector is usually the right choice.
  • In some cases, however, you may deliberately want to return a std::vector by value from a class member function because you want the caller to operate on its own copy. That is a design decision driven by your requirements, so make it consciously.

Let us look at the first case.

#include <iostream>
#include <vector>

class MyClass
{
public:
    // const member function: callable on const objects, returns a read-only view
    const std::vector<int>& getValues() const
    {
        return my_class_vec;
    }

private:
    std::vector<int> my_class_vec{1, 2, 3, 4};
};

int main()
{
    MyClass m;
    const std::vector<int>& valueSet = m.getValues();
    for (auto const &v : valueSet)
        std::cout << v << "\n";
    return 0;
}

Here, a const reference is returned, so there is no question of copying. This is straightforward and efficient. Because the std::vector is a data member of the class, its lifetime is tied to the lifetime of the object m, not to the member function call. Therefore, the reference remains valid after the function returns, as long as the object itself is alive.

Alternatively, if you choose to do the following, there will be a copy, and that is by choice, as mentioned earlier.

#include <iostream>
#include <vector>

class MyClass
{
public:
    // Returns an independent copy of the member
    std::vector<int> getValues() const
    {
        return my_class_vec;
    }

private:
    std::vector<int> my_class_vec{1, 2, 3, 4};
};

int main()
{
    MyClass m;
    std::vector<int> valueSet = m.getValues();
    for (auto const &v : valueSet)
        std::cout << v << "\n";
    return 0;
}

In this case, the compiler cannot apply RVO, because the vector already exists as a subobject of MyClass and was constructed long before the function was called. There is no opportunity for in-place construction in the caller.

Also, applying move semantics here would mean moving from a class data member, which would leave the object in a moved-from state. The compiler is not allowed to implicitly do that. As a result, the only remaining option is a copy, and you have consciously chosen that behaviour by returning the vector by value.
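The two accessor styles can also sit side by side in one class, which makes the trade-off easy to test. The sketch below is only an illustration; Inventory, view(), snapshot() and the two helper functions are hypothetical names, not part of the examples above.

```cpp
#include <vector>

// Hypothetical class offering both accessor styles.
class Inventory
{
public:
    // Read-only view: no copy, valid while the Inventory object lives.
    const std::vector<int>& view() const { return items_; }

    // Independent snapshot: always copies the member.
    std::vector<int> snapshot() const { return items_; }

private:
    std::vector<int> items_{1, 2, 3, 4};
};

// True when two calls to view() refer to the very same vector object.
bool viewAliasesMember(const Inventory& inv)
{
    const std::vector<int>& first = inv.view();
    const std::vector<int>& second = inv.view();
    return &first == &second;
}

// True when changing the snapshot leaves the member untouched.
bool snapshotIsIndependent(const Inventory& inv)
{
    std::vector<int> copy = inv.snapshot();
    copy.push_back(99);                        // changes only the caller's copy
    return inv.view().size() == 4 && copy.size() == 5;
}
```

Both helpers return true for a default-constructed Inventory: the reference gives cheap read access to the member, and the by-value version gives the caller a vector it can modify freely.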

Returning a Temporary Object

If the returned std::vector is a temporary object, you can be assured that no copy will occur in C++17 and later.

#include <iostream>
#include <vector>

std::vector<int> getValues()
{
    return std::vector<int>{1, 2, 3, 4};
}

int main()
{
    std::vector<int> valueSet = getValues();
    for (auto const &v : valueSet)
        std::cout << v << "\n";
    return 0;
}

In this case, the return expression of getValues() is a temporary std::vector<int> (a prvalue) rather than a named variable. Since C++17, copy elision is guaranteed here: the vector is constructed directly in the caller’s storage, and no copy or move operation takes place at all.
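A strong way to check this guarantee is to return a type that can be neither copied nor moved. The sketch below assumes the code is compiled as C++17 or later (for example with -std=c++17); with an older standard it is expected to be rejected. Pinned and makePinned are hypothetical names used only for this illustration.

```cpp
#include <utility>
#include <vector>

// Hypothetical type that can be neither copied nor moved.
struct Pinned
{
    std::vector<int> values;

    explicit Pinned(std::vector<int> v) : values(std::move(v)) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// The temporary in the return statement initialises the caller's object
// directly, so neither deleted constructor is ever needed.
Pinned makePinned()
{
    return Pinned(std::vector<int>{1, 2, 3, 4});
}
```

A caller can write Pinned p = makePinned(); and use p.values as usual. If the temporary were replaced by a named local variable, the code would no longer compile, because returning a named object still requires an accessible copy or move constructor even when the compiler elides the call.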

Rule of Thumb for Returning std::vector in C++

  1. Single Return Statement: If a function returns a local object by value through a single return statement, RVO is typically applied, avoiding copies.
  2. Multiple Return Paths: If a function has multiple return paths with different local objects, it likely falls back to move semantics, which is efficient and avoids deep copies.
  3. Conditional Operator (?:): Avoid using the conditional operator to return local vectors. If you must, apply std::move to both operands to avoid an unnecessary copy.
  4. Never Return a Reference to a Local Vector: Returning a reference to a local std::vector is unsafe, as the vector is destroyed when the function returns, leading to dangling references.
  5. Returning from Member Functions:
    • Prefer returning a const reference to a std::vector if you do not want the caller to modify it.
    • If returning a vector by value is necessary, be aware that returning a data member this way copies it, because the compiler will not implicitly move from a member.
  6. Returning Temporary Objects: When returning a temporary std::vector, it is guaranteed that no copy will occur in C++17 and later due to guaranteed copy elision.
  7. Performance Consideration: Use move semantics or RVO to maintain efficient code; avoid unnecessary copies unless explicitly needed for the function’s design.
