PrevNext

Introduction to Data Structures

Authors: Darren Yao, Benjamin Qi, Nathan Wang, Abutalib Namazov, Allen Li

Introduces the concept of a data structure, (dynamic) arrays, pairs, tuples.

Data Structures

A data structure determines how data is organized so that information can be used efficiently. Each data structure supports some operations efficiently, while other operations are either inefficient or not supported at all. Since different operations are supported by each data structure, you should carefully evaluate which data structure will work best for your particular problem.

C++

The C++ standard library data structures are designed to store any type of data. We put the desired data type within the <> brackets when declaring the data structure, as follows:

vector<string> v;

This creates a vector structure that only stores objects of type string.

For our examples below, we will primarily use the int data type, but note that you can use any data type including string and user-defined structures.

Essentially, every standard library data structure supports the size() method, which returns the number of elements in the data structure, and the empty() method, which returns true if the data structure is empty, and false otherwise.

Arrays

You already know one of the simplest data structures: the array! In C++11, in addition to normal arrays, there exists an array class in the STL. For example, an array of ints can be initialized using the following line of code:

array<int, 25> ar;

The array class supports the normal STL operations (such as .empty() or .size()) as well as the normal square-bracket access operator:

ar[5] //access the element at the 5th index.

Java

Java default Collections data structures are designed to store any type of object. However, we usually don't want our data structures to only store one type of data, like integers or strings. We do this by putting the desired data type within the <> brackets when declaring the data structure, as follows:

ArrayList<String> list = new ArrayList<String>();

This creates an ArrayList structure that only stores objects of type String.

For our examples below, we will primarily use the Integer data type, but note that you can have Collections of any object type, including Strings , other Collections, or user-defined objects.

Collections data types always contain an add method for adding an element to the collection, and a remove method which removes and returns a certain element from the collection. They also support the size() method, which returns the number of elements in the data structure, and the isEmpty() method, which returns true if the data structure is empty, and false otherwise.

Python

Python

Lists

The default way to store data in Python is using a list, which can automatically resize itself to accommodate more elements. We can add and delete elements at the end in O(1)O(1) time. A list can be initialized as follows:

list = []

Python lists are generic. This means that they can store any kind of data type, including objects. For example, the following code creates a dynamic array and adds the numbers 11 through 1010 to it:

for i in range(1, 11): # Note that range(i, j) includes i, but does not include j
list.append(i)

In Python, we can give a dynamic array an initial size. The code below creates a dynamic array with 3030 zeroes.

list = [0] * 30

Iterating

We can use a regular for loop to iterate through all elements of a list.

list = [1, 7, 4, 5, 2]
for i in range(len(list)):
print(list[i], end = " ")
print()
for element in list:
print(element, end = " ")
print()

We can also use iterators. An iterator allows you to traverse a container by pointing to an object within the container. iter(list) returns an iterator pointing to the first element of the list list.

list = [4, 2, 0, 0, 5]
it = iter(list)
print(next(it)) # 4
print(next(it)) # 2
print(next(it)) # 0

Inserting and Erasing

list = []
list.append(2) # [2]
list.append(3) # [2, 3]
list.append(7) # [2, 3, 7]
list.append(5) # [2, 3, 7, 5]
list[1] = 4; # sets element at index 1 to 4 -> [2, 4, 7, 5]
list.pop(1) # removes element at index 1 -> [2, 7, 5]
# this remove method is O(n); to be avoided
list.append(8) # [2, 7, 5, 8]
list.pop() # [2, 7, 5]

C++

Dynamic Arrays

Resources
IUSACOmodule is based off this
CPHvectors, strings
PAPS
LCPP

Dynamic arrays (vector in C++) support all the functions that a normal array does, and can resize itself to accommodate more elements. In a dynamic array, we can also add and delete elements at the end in O(1)O(1) time.

For example, the following code creates a dynamic array and adds the numbers 11 through 1010 to it:

vector<int> v;
for(int i = 1; i <= 10; i++){
v.push_back(i);
}

In C++, we can give a dynamic array an initial size. The code below creates a dynamic array with 3030 zeroes.

vector<int> v(30); // one way
vector<int> v; v.resize(30); // another way

In array-based contest problems, we'll use one-, two-, and three-dimensional static arrays most of the time. However, we can also have static arrays of dynamic arrays, dynamic arrays of static arrays, and so on. Usually, the choice between a static array and a dynamic array is just personal preference.

Warning!

Nothing beyond this warning is required knowledge for Bronze, though you may find it useful.

Iterating

Note: Iterating isn't required for Bronze, though it may be useful.

One way to iterate through all elements of a static or dynamic array is to use a regular for loop.

vector<int> v{1,7,4,5,2};
for (int i = 0; i < int(size(v)); i++){
cout << v[i] << " ";
}
cout << endl;

We can also use iterators. An iterator allows you to traverse a container by pointing to an object within the container. However, they are not the same thing as pointers.

For example, v.begin() or begin(v) returns an iterator pointing to the first element of the vector v. Apart from the standard way of traversing a vector (by treating it as an array), you can also use iterators:

for (vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
cout << *it << " "; //prints the values in the vector using the iterator
}

Here is another way to write this. auto (since C++11) automatically infers the type of an object:

vector<int> v{1,7,4,5,2};
for (auto it = begin(v); it != end(v); it = next(it)) {
cout << *it << " "; //prints the values in the vector using the iterator
}

We can also use a for-each loop.

for (int element : v) {
cout << element << " "; //prints the values in the vector
}

Inserting and Erasing

Note: Inserting and Erasing isn't required for Bronze, though it may be useful.

Keep in mind that insertion and erasure in the middle of a vector are O(n)O(n).

vector<int> v;
v.push_back(2); // [2]
v.push_back(3); // [2, 3]
v.push_back(7); // [2, 3, 7]
v.push_back(5); // [2, 3, 7, 5]
v[1] = 4; // sets element at index 1 to 4 -> [2, 4, 7, 5]
v.erase(v.begin() + 1); // removes element at index 1 -> [2, 7, 5]
// this remove method is O(n); to be avoided
v.push_back(8); // [2, 7, 5, 8]
v.erase(v.end()-1); // [2, 7, 5]

Strings

This section is not complete.

Feel free to file a request to complete this using the "Contact Us" button.

All the vector operations above also work on strings.

Java

Dynamic Arrays

Dynamic arrays (ArrayList in Java) that support all the functions that a normal array does, and can repeatedly reallocate storage to accommodate more elements as they are added.

In a dynamic array, we can also add and delete elements at the end in O(1)O(1) time. For example, the following code creates a dynamic array and adds the numbers 11 through 1010 to it:

ArrayList<Integer> list = new ArrayList<Integer>();
for(int i = 1; i <= 10; i++){
list.add(i);
}

In array-based contest problems, we'll use one-, two-, and three-dimensional static arrays most of the time. However, we can also have static arrays of dynamic arrays, dynamic arrays of static arrays, and so on. Usually, the choice between a static array and a dynamic array is just personal preference.

Iterating

Note: Iterating isn't required for Bronze, though it may be useful.

To iterate through a static or dynamic array, we can use either the regular for loop or the for-each loop.

ArrayList<Integer> list = new ArrayList<Integer>();
list.add(1); list.add(7); list.add(4); list.add(5); list.add(2);
int[] arr = {1, 7, 4, 5, 2};
for(int i = 0; i < list.size(); i++){ // regular
System.out.println(list.get(i));
}
for(int element : arr){ // for-each
System.out.println(element);
}

Adding and Removing

Note: Adding and Removing isn't required for Bronze, though it may be useful.

We can add and remove at any index of a dynamic array in O(n)O(n) time.

ArrayList<Integer> list = new ArrayList<Integer>();
list.add(2); // [2]
list.add(3); // [2, 3]
list.add(7); // [2, 3, 7]
list.add(5); // [2, 3, 7, 5]
list.set(1, 4); // sets element at index 1 to 4 -> [2, 4, 7, 5]
list.remove(1); // removes element at index 1 -> [2, 7, 5]
// this remove method is O(n); to be avoided
list.add(8); // [2, 7, 5, 8]
list.remove(list.size()-1); // [2, 7, 5]

Pairs & Tuples

Note: Pairs & Tuples aren't required for Bronze, though it may be useful.

If we want to store a collection of points on the 2D plane, then we can use a dynamic array of pairs.

C++

Of course, both vector<vector<int>> and vector<array<int,2>> would suffice for this case, but a pair can also store two elements of different types.

C++ Pairs

  • pair<type1, type2> p: Creates a pair p with two elements with the first one being of type1 and the second one being of type2.
  • make_pair(a, b): Returns a pair with values a, b.
  • {a, b}: With C++11 and above, this can be used as to create a pair, which is easier to write than make_pair(a, b).
  • pair.first: The first value of the pair.
  • pair.second: The second value of the pair.

Example:

#include <bits/stdc++.h>
using namespace std;
int main() {
pair<string, int> myPair1 = make_pair("Testing", 123);
cout << myPair1.first << " " << myPair1.second << endl;
myPair1.first = "It is possible to edit pairs after declaring them";
cout << myPair1.first << " " << myPair1.second << endl;
pair<string, string> myPair2 = {"Testing", "curly braces"};
cout << myPair2.first << " " << myPair2.second << endl;

C++ Tuples

Of course, we can hold more than two values with something like pair<int,pair<int,int>>, but it gets messy when you need a lot of elements. In this case, using tuples might be more convenient.

  • tuple<type1, type2, ..., typeN> t: Creates a tuple with N elements, i'th one being of typei.
  • make_tuple(a, b, c, ..., d): Returns a tuple with values written in the brackets.
  • get<i>(t): Returns the i'th element of the tuple t. Can also be used to change the element of a tuple.

This operation only works for constant i. Namely, it is not allowed to do something like the following since i is not constant:

tuple<int,int,int> t{3,4,5};
int i = 1; cout << get<i>(t);
  • tie(a, b, c, ..., d) = t: Equals a, b, c, ..., d to the elements of the tuple tt accordingly.

Example:

#include <bits/stdc++.h>
using namespace std;
int main() {
int a = 3, b = 4, c = 5;
tuple<int, int, int> t = tie(a, b, c);
cout << get<0>(t) << " " << get<1>(t) << " " << get<2>(t) << endl;
get<0>(t) = 7;
cout << get<0>(t) << " " << get<1>(t) << " " << get<2>(t) << endl;

Java

Java

Warning: Language Note

Pairs and tuples are not available in Java's standard libraries, though we can work around that.

We can create our own generic Pair class in Java:

static class Pair<K, V> {
K first;
V second;
public Pair(K first_value, V second_value) {
first = first_value;
second = second_value;
}
}

Then, we can use the Pair class as follows:

Pair<Integer, String> p = new Pair(5, "hello");
System.out.println(p.first + " " + p.second);

Python

Warning: Language Note

Pairs are not available in Python. Just use tuples; no need for pairs!

Python Tuples

Use the link below (if you know of a better one, please let us know!) to learn about tuples.

Resources
Tutorialspoint

Memory Allocation

One thing to keep in mind when using arrays is the memory limit. Usually the USACO memory limit is 256 MB, which can store up to roughly 60 million 32-bit int values, or less if you're using 64-bit integers, long strings, custom objects, etc. As a general rule of thumb, if your array size exceeds 51065 \cdot 10^6 (5 million), you're at risk of running the memory limit and should consider alternative approaches to the problem.

To do the calculation on how many int values can be stored:

  1. Calculate the total memory size in bytes: for 256 MB, that's 25610241024256*1024*1024
  2. Divide by the size, in bytes, of an int (4), or a long long (8), etc: 25610241024/460256*1024*1024/4 \approx 60 million
  3. Be aware that program overhead (which can be very significant, especially with recursive functions) will reduce memory available.

C++

In C++, arrays initialized locally using the default syntax (i.e. int arr[10]; ) are initialized to random numbers, because C++ doesn't have built-in memory management. In order to initialize your arrays to zero, either declare them globally, or use either memset(arr, 0, sizeof(arr)) or std::fill(arr, arr+N, 0) where NN is the array length.

Java

Python

Problems

Nothing to see here! Arrays of fixed size suffice for almost every Bronze problem, but dynamic arrays, pairs, and tuples can greatly simplify implementation at times.

Give Us Feedback on Introduction to Data Structures!

Module Progress:

PrevNext