
Python Collections Module: Counter, ChainMap, Deque & Tuple

Collection Module in Python

The Python collections module offers a variety of container types. A container is an object that holds other items and lets you access and iterate over them. Tuple, list, and dictionary are a few examples of built-in containers. This article discusses the containers that the collections module offers.

Python Collections module offers a set of container data types that extend the features of stock containers like Lists, Tuples, Sets, and Dictionaries. With these special containers, you not only have the features of stock containers, but also some extra methods which come in very handy for certain tasks.

By the end of this tutorial, you’ll have the knowledge of the following:

  • What is the collections module?
  • Various container types, such as:
  1. Counter
  2. ChainMap
  3. Deque
  4. Named Tuple 
  • Working examples

The collections module is part of Python’s standard library, so there is nothing to pip install. Just import it and you’re ready to go! Let’s go through the most used container types in detail.

What is the gc Module in Python?

The gc module provides an interface to Python’s optional garbage collector. It lets you turn the collector off, tune the collection frequency, and set debugging options. Python already reclaims most objects through reference counting; the garbage collector supplements this by finding and freeing objects that are kept alive only by reference cycles. The module also gives access to unreachable objects that the collector found but could not free. If you are certain that your program does not create reference cycles, you can turn the collector off: calling gc.disable() disables automatic collection.

Call gc.set_debug(gc.DEBUG_LEAK) to debug a leaking program. Because gc.DEBUG_LEAK includes gc.DEBUG_SAVEALL, the objects found by the collector are then kept in gc.garbage for inspection instead of being freed.
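As a rough illustration of the calls mentioned above, the sketch below builds a reference cycle, disables automatic collection, and then triggers a collection by hand. The Node class and the variable names are made up for this example; the exact number reported by gc.collect() can vary between Python versions, so it is printed rather than assumed.

```python
import gc


class Node:
    """Hypothetical class used only to build a reference cycle."""

    def __init__(self):
        self.partner = None


gc.disable()                      # turn off automatic collection
auto_enabled = gc.isenabled()     # False while the collector is disabled

gc.collect()                      # clear out any garbage that already exists

a, b = Node(), Node()
a.partner, b.partner = b, a       # a and b now reference each other
del a, b                          # unreachable, but the reference counts never hit zero

unreachable = gc.collect()        # run a full collection manually
print(unreachable)                # number of unreachable objects that were found

gc.enable()                       # restore automatic collection
```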


Defaultdict

In Python, defaultdict behaves almost exactly like a dictionary; it is a subclass of the built-in dict class. The only difference is that accessing a key that doesn’t exist does not raise a KeyError. Instead, the function passed as default_factory is called to produce a default value, which is stored under the missing key and returned. If default_factory is not supplied (or is None), a missing key raises a KeyError, just as it does with a regular dictionary.

Example

With a regular dictionary, looking up a missing key raises a KeyError:

>>> favorites = {"pet": "dog", "color": "blue", "language": "Python"}

>>> favorites["fruit"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'fruit'
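For comparison, here is a minimal sketch of how a defaultdict handles missing keys. The dictionary names and the sample data are made up for illustration.

```python
from collections import defaultdict

# default_factory is `list`, so a missing key starts out as an empty list
pets_by_owner = defaultdict(list)
pets_by_owner["asha"].append("dog")     # no KeyError: "asha" is created on access
pets_by_owner["asha"].append("cat")
pets_by_owner["ravi"].append("parrot")
print(dict(pets_by_owner))   # {'asha': ['dog', 'cat'], 'ravi': ['parrot']}

# default_factory is `int`, so a missing key starts out as 0
letter_count = defaultdict(int)
for letter in "banana":
    letter_count[letter] += 1
print(dict(letter_count))    # {'b': 1, 'a': 3, 'n': 2}

# Without a default_factory, a missing key raises KeyError like a plain dict
no_factory = defaultdict()
try:
    no_factory["fruit"]
except KeyError:
    print("KeyError raised")
```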

OrderedDict

An OrderedDict is a dictionary that remembers the order in which its keys were inserted. A key’s position does not change even if you modify its value later. Since Python 3.7, regular dictionaries preserve insertion order too; OrderedDict still adds order-aware features such as move_to_end() and order-sensitive equality.

It has a similar application programming interface (API) to dict. OrderedDict, on the other hand, iterates over keys and values in the order in which they were added to the dictionary. The order of the key-value pair does not change if a new value is assigned to an existing key. The dictionary will relocate an entry to the end if it is deleted and subsequently reinserted.

Example

from collections import OrderedDict

od = OrderedDict()
od['a'] = 1
od['b'] = 2
od['c'] = 3
od['d'] = 4

print('Before Deleting')
for key, value in od.items():
    print(key, value)

# deleting element
od.pop('a')

# Re-inserting the same
od['a'] = 1

print('\nAfter re-inserting')
for key, value in od.items():
    print(key, value)
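The ordering rules described above can be checked with a short sketch. It also uses move_to_end(), an OrderedDict method that repositions a key without deleting it; the variable names are only illustrative.

```python
from collections import OrderedDict

od = OrderedDict(a=1, b=2, c=3)

od["a"] = 100                      # updating a value keeps the key's position
after_update = list(od)            # ['a', 'b', 'c']

od.pop("a")                        # deleting and re-inserting ...
od["a"] = 1
after_reinsert = list(od)          # ... moves the key to the end: ['b', 'c', 'a']

od.move_to_end("a", last=False)    # move "a" back to the front without deleting it
after_move = list(od)              # ['a', 'b', 'c']

print(after_update, after_reinsert, after_move)
```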

Counter

Counter is easily one of the most used and most useful classes in the collections module. It is a subclass of Python’s dict class. It counts the number of occurrences of each element in an iterable (such as a string, tuple, or list) and stores the counts in a dictionary. The dictionary keys are the unique elements in the iterable and the values are the counts of those elements.

Let’s try it out with some examples.

import collections
Marvel = 'Bad Wolverine bullied poor Iron Man Bad Wolverine poor poor Iron Man'
Marvel_count = collections.Counter(Marvel.split())
Marvel_count

#Output:
Counter({'poor': 3,
        'Bad': 2,
        'Wolverine': 2,
        'Iron': 2,
        'Man': 2,
        'bullied': 1})

As we see, it counted the occurrences of every element and put them in a dictionary. This works with any type of iterable. Now let’s see what methods it has.

Marvel_count['Bad']
#>> 2

Marvel_count.values()
#>> dict_values([2, 2, 1, 3, 2, 2])

Marvel_count.keys()
#>> dict_keys(['Bad', 'Wolverine', 'bullied', 'poor', 'Iron', 'Man'])

The most_common(n) method returns a list of the n most common elements arranged in descending order of count.

Marvel_count.most_common(2)
#>> [('poor', 3), ('Bad', 2)]
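Since Counter accepts any iterable, a short sketch with a string and a list may help. It also shows update(), the zero count returned for missing keys, and Counter addition. The sample data is made up for illustration.

```python
from collections import Counter

letters = Counter("mississippi")           # a string is an iterable of characters
print(letters.most_common(2))              # [('i', 4), ('s', 4)]

votes = Counter(["tea", "coffee", "tea"])  # a list works the same way
votes.update(["coffee", "coffee"])         # update() adds to the existing counts
print(votes["coffee"])                     # 3
print(votes["juice"])                      # 0, a missing key does not raise KeyError

total = letters + Counter("pi")            # adding two Counters sums their counts
print(total["p"], total["i"])              # 3 5
```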


ChainMap

ChainMap groups several dictionaries into a single view so that they can be accessed and updated through one object, the ChainMap itself. Keep in mind that a ChainMap only holds references to the actual dictionaries: lookups search them in order, while insertions, updates, and deletions are applied to the first dictionary in the chain.

ChainMap implements the standard mapping interface, so the usual dictionary methods are supported, plus a few extras which we’ll be going over.

dic1 = {'a' : 1, 'b' : 2}
dic2 = {'b' : 3, 'c' : 4}
dic3 = {'b' : 9, 'd' : 4}
chain1 = collections.ChainMap(dic2, dic1)
chain1

In the above code, we define three dictionaries, dic1, dic2 and dic3, and put dic2 and dic1 in a ChainMap object. We will use dic3 a little later.

#Output:
ChainMap({'b': 3, 'c': 4}, {'a': 1, 'b': 2})

As we see, dic2 is ‘chained’ with dic1  in this very order. In essence, you can imagine dic2 being connected to dic1 like dic2–>dic1. So when we search for the key ‘b’, it will first search in the first mapping which is dic2 and if the key is not found, it will go to the next mappings. 

Therefore, the order of the ChainMap is important to determine which mapping is searched first. Let’s see that in action. 

chain1['b']
#>> 3

As we see, the above ChainMap has the key ‘b’ in both dictionaries. So when we search for the key ‘b’, it searches in the first mapping, which is dic2, and returns the value found there.
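To illustrate the point that a ChainMap only holds references, the sketch below shows that writes made through the chain land in the first mapping, while changes made to an underlying dictionary are visible through the chain. The defaults and user_settings dictionaries are hypothetical.

```python
from collections import ChainMap

defaults = {"theme": "light", "font_size": 12}
user_settings = {"theme": "dark"}

settings = ChainMap(user_settings, defaults)   # user_settings is searched first
print(settings["theme"])       # dark, found in the first mapping
print(settings["font_size"])   # 12, falls through to defaults

settings["font_size"] = 14     # writes always go to the first mapping
print(user_settings)           # {'theme': 'dark', 'font_size': 14}
print(defaults["font_size"])   # 12, the later mapping is untouched

defaults["language"] = "en"    # changing an underlying dict shows up in the chain
print(settings["language"])    # en
```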

maps attribute

The maps attribute of a ChainMap returns the list of mappings in search order, i.e., dic2 is first in the list, so it will be searched first, and so on.

chain1.maps
#>> [{'b': 3, 'c': 4}, {'a': 1, 'b': 2}]

Similarly, we can check for keys and values:

list(chain1.keys())
#>> ['a', 'b', 'c']

list(chain1.values())
#>> [1, 3, 4]

As we see, each key appears only once, and for the duplicated key ‘b’ the value comes from the first mapping that contains it (dic2).

new_child(m=None)

The new_child() method is used to add new maps into the ChainMap. This method returns a new ChainMap with the new map as the first map followed by the rest of maps. If m is specified, it becomes the first map, else an empty dictionary is added as the first map.

chain1 = chain1.new_child(dic3)
chain1.maps

#Output:
[{'b': 9, 'd': 4}, {'b': 3, 'c': 4}, {'a': 1, 'b': 2}]

As we see, new_child() returned a new ChainMap object with dic3 at the front. The original ChainMap is left unchanged, which is why we assign the result back to chain1.
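Because new_child() leaves the original ChainMap untouched, its return value has to be kept. The sketch below makes this visible with two separate names, base and child (both made up for this example), and also uses the parents property, which gives a ChainMap with the first map skipped.

```python
from collections import ChainMap

base = ChainMap({"b": 3, "c": 4}, {"a": 1, "b": 2})
child = base.new_child({"b": 9, "d": 4})   # returns a new ChainMap

print(len(base.maps))     # 2, the original chain is unchanged
print(len(child.maps))    # 3, the new map sits in front
print(child["b"])         # 9
print(base["b"])          # 3

# parents is a ChainMap containing every map except the first one
print(child.parents == base)    # True

empty_child = base.new_child()  # with no argument, an empty dict is added in front
print(empty_child.maps[0])      # {}
```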

reversed

You might be wondering how you can change the search order of the ChainMap. That can be achieved with the built-in reversed function, which returns an iterator over the list of maps in reverse order. Let’s see this in action.

The key ‘b’ is now in all the maps. The first map in the ChainMap has key ‘b’ with value as 9. 

chain1['b']
#>> 9

Let’s see what happens once we iterate in the reversed direction.

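chain1.maps = list(reversed(chain1.maps))
chain1['b']
#>> 2

Keep in mind that the reversed function doesn’t reverse the list in place; it just returns a reversed iterator. That is why we wrap it in list() before assigning it back to maps: a bare iterator would be exhausted after the first lookup.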


Deque

Deque (pronounced ‘deck’) is a list-like container that is double ended. Deque stands for Double Ended Queue: we can append and pop elements at either end efficiently, unlike lists, where appends and pops are fast only at the right end.

deque(iterable, maxlen) takes in iterables and returns deque objects. They also have a maxlen parameter which decides the upper limit on the number of elements. If not specified, deque can grow indefinitely. Let’s take a look at its snappy methods.

deq = collections.deque([1, 2, 3, 4, 5], maxlen=6)
deq.appendleft(8)
deq

#Output:
deque([8, 1, 2, 3, 4, 5], maxlen=6)

As we see, calling the appendleft method appended the element on the left end. Moreover, since we initialized the deque with maxlen as 6, which it has now reached, appending another element will not raise an error; instead, an element is silently discarded from the opposite end to make room.

Now, let’s remove the leftmost element using popleft:

deq.popleft()
#>> 8

We can also remove a specific element by value using remove:

deq.remove(5)
deq
#>> deque([1, 2, 3, 4], maxlen=6)

Note: calling remove method with an element which is not in the deque will throw a “ValueError”.

We can insert any element at the specified index using insert(index, element).

deq.insert(2, 7)
deq
#>> deque([1, 2, 7, 3, 4], maxlen=6)

Deque can be reversed by calling the reverse method.

deq.reverse()
deq
#>> deque([4, 3, 7, 2, 1], maxlen=6)


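Deque can also be rotated to the right or to the left using the rotate method. A positive argument rotates to the right and a negative one rotates to the left.

#Rotate right
deq.rotate(2)
deq
#>> deque([2, 1, 4, 3, 7], maxlen=6)

#Rotate left
deq.rotate(-2)
deq
#>> deque([4, 3, 7, 2, 1], maxlen=6)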
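The effect of maxlen is easiest to see on a small bounded deque. The sketch below, with made-up names and values, shows that a full deque drops items from the opposite end when new ones are appended, and that an unbounded deque works as a simple first-in, first-out queue.

```python
from collections import deque

recent = deque(maxlen=3)          # keeps only the three most recent items
for reading in [10, 20, 30, 40, 50]:
    recent.append(reading)        # once full, the oldest item on the left is dropped
print(recent)                     # deque([30, 40, 50], maxlen=3)

recent.appendleft(5)              # appending on the left drops an item from the right
print(recent)                     # deque([5, 30, 40], maxlen=3)

queue = deque(["a", "b", "c"])    # no maxlen, so it can grow freely
queue.append("d")                 # enqueue on the right
first = queue.popleft()           # dequeue from the left
print(first, list(queue))         # a ['b', 'c', 'd']
```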

Named Tuple

namedtuple() is a great upgrade over the usual tuple object in Python. Named tuples allow us to access elements by their names rather than just by their positions. You can think of a named tuple as a table row, with the table name as the tuple name and the column names as the field names. A named tuple essentially assigns meaning to each element for easier access and more readable code.


Let’s take some examples and understand how it works.

Performance = collections.namedtuple('Employee_Rating', ['Q1', 'Q2', 'Q3', 'Q4'])

In the above code, we defined a named tuple class, Performance, whose type name is Employee_Rating, with the field names Q1, Q2, Q3 and Q4, which will store the quarterly ratings of the employees. Let’s make 2 named tuple entries of Employee_Rating.

rahul = Performance(3, 4, 3.5, 4.5)
ankit = Performance(4, 4.5, 4, 4.5)
print(rahul)
print(ankit)

#Output:
Employee_Rating(Q1=3, Q2=4, Q3=3.5, Q4=4.5)
Employee_Rating(Q1=4, Q2=4.5, Q3=4, Q4=4.5)


Now that we have created 2 entries, we can access their values by field name.

ankit.Q1
#>> 4

 

ankit.Q3 > rahul.Q3
#>> True


To create a new named tuple from an existing iterable, such as a list, we can use the _make() method.

Milkha = Performance._make([4, 5, 5, 4.5])
Milkha

 

#Output:
Employee_Rating(Q1=4, Q2=5, Q3=5, Q4=4.5)

 

Named tuples are immutable, so we cannot change a field in place. Instead, the _replace method returns a new named tuple with the specified fields changed.

rahul._replace(Q1=2)

 

#Output:
Employee_Rating(Q1=2, Q2=4, Q3=3.5, Q4=4.5)
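A short sketch of how _replace behaves, along with two other helpers that named tuples provide, _fields and _asdict(). It redefines Performance so that it runs on its own; the printed _asdict() result assumes Python 3.8 or later, where it is a regular dict.

```python
from collections import namedtuple

Performance = namedtuple("Employee_Rating", ["Q1", "Q2", "Q3", "Q4"])
rahul = Performance(3, 4, 3.5, 4.5)

updated = rahul._replace(Q1=2)     # returns a new named tuple
print(rahul.Q1, updated.Q1)        # 3 2, the original is unchanged

try:
    rahul.Q1 = 5                   # fields cannot be assigned to
except AttributeError:
    print("cannot assign to a field")

print(Performance._fields)         # ('Q1', 'Q2', 'Q3', 'Q4')
print(rahul._asdict())             # {'Q1': 3, 'Q2': 4, 'Q3': 3.5, 'Q4': 4.5}
print(rahul[0] == rahul.Q1)        # True, positional access still works
```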

Before you go

The collections module has a few more useful classes, such as UserList, UserString, and UserDict. Make sure you get some hands-on practice with the containers we discussed in this tutorial. These container types not only make your life easier, but also improve the quality of the code you write.


What is the collections module, and how is it useful?

Python's collections module provides several types of containers. A container is an object that stores items and gives you the means to retrieve and iterate over them. Tuple, list, and dictionary are the familiar built-in containers. The collections module adds specialized container data types such as namedtuple(), deque, OrderedDict, and Counter, which handle certain tasks more conveniently or efficiently than the built-in containers.

Is the collections module a necessary topic for Python?

Yes, the collections module is well worth learning. Counting objects, constructing queues and stacks, handling missing keys in dictionaries, and more are all possible with it. Its data types are designed to be efficient and Pythonic, and they extend the general-purpose built-in containers in ways that help in larger projects too, for example with fast appends and pops at both ends of a deque.

What are the data types present in the collections module?

The collections module includes deque, defaultdict, namedtuple, OrderedDict, Counter, ChainMap, UserDict, UserList, and UserString. They cover tasks such as adding and removing items from either end of a sequence (deque), building default values for missing keys and adding them to the dictionary automatically (defaultdict), providing named fields while keeping access by index (namedtuple), counting unique items in an iterable (Counter), and treating several mappings as a single dictionary-like object (ChainMap).
