Python tips

Python branching tricks

Chained comparisons (a shortcut for combining multiple predicates).

>>> n = 10
>>> 1 < n < 20
True

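For-else construct: useful when you search for something and need to handle the case where it is not found (the else branch runs only if the loop ends without a break).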

# For example, assume that I need to search through a list and process each item until a flag item is found,
# then stop processing. If the flag item is missing, an exception needs to be raised.

for i in mylist:
    if i == theflag:
        break
    process(i)
else:
    raise ValueError("List argument missing terminal flag.")
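The snippet above relies on names that are not defined here (mylist, theflag, process). A minimal self-contained sketch, assuming a hypothetical helper process_until_flag in which "processing" simply collects the items, might look like this:

```python
def process_until_flag(items, flag):
    """Collect items until `flag` is seen; raise if the flag never appears."""
    processed = []
    for item in items:
        if item == flag:
            break  # flag found: the else clause is skipped
        processed.append(item)  # stand-in for process(item)
    else:
        # Runs only when the loop finished without hitting `break`.
        raise ValueError("List argument missing terminal flag.")
    return processed


print(process_until_flag([1, 2, "STOP", 3], "STOP"))  # [1, 2]
```

Calling process_until_flag([1, 2], "STOP") raises ValueError, because the loop runs to completion and falls through to the else branch.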

Ternary operator.

>>> "Python ROCK" if True else " I AM GRUMPY"
'Python ROCK'

Try-except-else-finally construct.

try:
    foo()
except Exception:
    print("Exception occurred")
else:
    print("Exception didn't occur")
finally:
    print("Always gets here")
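To see which branches run in each case, here is a small sketch. The helper run and the two callables fails and succeeds are hypothetical names introduced only for this illustration:

```python
def run(action):
    """Call `action` and return the names of the clauses that ran, in order."""
    steps = []
    try:
        action()
    except Exception:
        steps.append("except")   # only when action() raised
    else:
        steps.append("else")     # only when action() did not raise
    finally:
        steps.append("finally")  # always
    return steps


def fails():
    raise RuntimeError("boom")


def succeeds():
    return None


print(run(fails))     # ['except', 'finally']
print(run(succeeds))  # ['else', 'finally']
```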

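While-else construct (the else branch runs only if the loop ends without a break; in this example the break at i == 3 skips it).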

i = 5

while i > 1:
    print("Whil-ing away!")
    i -= 1
    if i == 3:
        break
else:
    print("Finished up!")
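A sketch that shows both outcomes side by side. The function countdown and its stop_at parameter are made up for this example; it returns True only when the else branch ran:

```python
def countdown(start, stop_at=None):
    """Count down from `start` to 1; return True if the else clause ran."""
    i = start
    while i > 1:
        i -= 1
        if i == stop_at:
            break
    else:
        return True   # the condition became false without a break
    return False      # the loop was left through break


print(countdown(5, stop_at=3))  # False
print(countdown(5))             # True
```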

Python Iterables tricks

Creating a sequence of numbers (zero to ten with skips).

>>> list(range(0, 10, 2))
[0, 2, 4, 6, 8]

Summing a sequence of numbers (calculating the sum of zero to ten with skips).

>>> l = range(0,10,2)
>>> sum(l)
20

Checking whether any element in the sequence is truthy (here, whether any element between zero and ten with skips is even).

>>> any(a % 2==0 for a in range(0,10,2))
True

Checking whether all elements in the sequence are truthy (here, whether all elements between zero and ten with skips are even).

>>> all(a % 2==0 for a in range(0,10,2))
True

Cumulative summing a sequence of numbers (calculating the cumulative sum of zero to ten with skips).

>>> import numpy as np
>>> res = np.cumsum(range(0, 10, 2)).tolist()
>>> res
[0, 2, 6, 12, 20]
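If pulling in NumPy only for a running total feels heavy, the standard library offers itertools.accumulate. A minimal sketch (the variable name running_totals is just for illustration):

```python
from itertools import accumulate

# accumulate yields the running sum of the values it is given.
running_totals = list(accumulate(range(0, 10, 2)))
print(running_totals)  # [0, 2, 6, 12, 20]
```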

Pairing each element of an iterable with its index.

>>> a = ['Hello', 'world', '!']
>>> list(enumerate(a))
[(0, 'Hello'), (1, 'world'), (2, '!')]

Concatenating an iterable of strings into a single string.

>>> a = ["python","really", "rocks"]
>>> " ".join(a)
'python really rocks'

Combining two iterables into tuples, or pivoting nested iterables.

# Combining two iterables (zip returns a lazy iterator in Python 3, so wrap it in list())
>>> a = [1, 2, 3]
>>> b = ['a', 'b', 'c']
>>> z = list(zip(a, b))
>>> z
[(1, 'a'), (2, 'b'), (3, 'c')]

# Pivoting list of tuples
>>> list(zip(*z))
[(1, 2, 3), ('a', 'b', 'c')]
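Two common uses of the same idea, sketched with made-up data: building a dictionary from parallel lists, and splitting it back into separate tuples.

```python
names = ["a", "b", "c"]
values = [1, 2, 3]

# Pair keys with values and build a dictionary from the pairs.
lookup = dict(zip(names, values))
print(lookup)  # {'a': 1, 'b': 2, 'c': 3}

# "Unzip" the (key, value) pairs back into two tuples.
letters, numbers = zip(*lookup.items())
print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)
```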

Getting min/max from iterable (with/without specific function).

# Getting maximum from iterable
>>> a = [1, 2, -3]
>>> max(a)
2

# Getting minimum from iterable
>>> min(a)
-3

# Both min and max accept a key argument, so you can pick the element that maximizes a function (here, the absolute value)
>>> max(a, key=abs)
-3

Getting a sorted list from an iterable (optionally sorted by a key function).

>>> a = [1, 2, -3]
>>> sorted(a)
[-3, 1, 2]

>>> sorted(a,key=abs)
[1, 2, -3]
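The key argument is especially handy for sorting records. A small sketch with a hypothetical list of (name, age) tuples:

```python
people = [("Alice", 31), ("Bob", 25), ("Carol", 31)]

# Oldest first. sorted() is stable, so Alice stays ahead of Carol (same age).
by_age_desc = sorted(people, key=lambda person: person[1], reverse=True)
print(by_age_desc)  # [('Alice', 31), ('Carol', 31), ('Bob', 25)]

# Sort by several criteria at once by returning a tuple from the key function:
# age descending (negated), then name ascending.
by_age_then_name = sorted(people, key=lambda person: (-person[1], person[0]))
print(by_age_then_name)  # [('Alice', 31), ('Carol', 31), ('Bob', 25)]
```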

Splitting a single string into a list.

>>> s = "a,b,c"
>>> s.split(",")
['a', 'b', 'c']

Initializing a list filled with a repeated value.

>>> [1] * 10
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

Merging/Upserting two dictionaries.

>>> a = {"a":1, "b":1}
>>> b = {"b":2, "c":1}
>>> a.update(b)
>>> a
{'a': 1, 'b': 2, 'c': 1}
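Note that update() changes the first dictionary in place. If both inputs should stay untouched, one option is to unpack them into a new dictionary (Python 3.5 or later); the name merged is just for illustration:

```python
a = {"a": 1, "b": 1}
b = {"b": 2, "c": 1}

# Later entries win, so values from b override those from a.
merged = {**a, **b}

print(merged)  # {'a': 1, 'b': 2, 'c': 1}
print(a)       # {'a': 1, 'b': 1}  (unchanged)
```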

Naming and saving slices of iterables.

# Naming slices (slice(start, stop, step))
>>> a = [0, 1, 2, 3, 4, 5]
>>> LASTTHREE = slice(-3, None)
>>> LASTTHREE
slice(-3, None, None)
>>> a[LASTTHREE]
[3, 4, 5]

Finding the index of an item in a list.

>>> a = ["foo", "bar", "baz"]
>>> a.index("bar")
1

Finding the index of the min/max item in an iterable.

>>> a = [2, 3, 1]
>>> min(enumerate(a),key=lambda x: x[1])[0]
2

Rotating an iterable by k elements.

>>> a = [1, 2, 3, 4]
>>> k = 2
>>> a[-k:] + a[:-k]
[3, 4, 1, 2]
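The slicing trick can be wrapped in a small helper that also copes with k larger than the list, negative k, and empty lists. The function name rotate_right is an assumption made for this sketch:

```python
def rotate_right(items, k):
    """Return a new list with `items` rotated right by k positions."""
    if not items:
        return items[:]
    k %= len(items)  # normalizes large and negative k into range
    return items[-k:] + items[:-k]


print(rotate_right([1, 2, 3, 4], 2))   # [3, 4, 1, 2]
print(rotate_right([1, 2, 3, 4], 6))   # [3, 4, 1, 2]
print(rotate_right([1, 2, 3, 4], -1))  # [2, 3, 4, 1]
```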

Removing unwanted characters from the start, the end, or both ends of a string.

>>> name = "//George//"
>>> name.strip("/")
'George'
>>> name.rstrip("/")
'//George'
>>> name.lstrip("/")
'George//'

Reversing an ordered iterable (string, list, etc.).

# Reversing string
>>> s = "abc"
>>> s[::-1]
'cba'

# Reversing list
>>> l = ["a", "b", "c"]
>>> l[::-1]
['c', 'b', 'a']

Printing values with a custom separator

Chances are that you haven't seen this trick before. In my case, I never had to use a custom separator until now, so I never got to know this one. But fret not my friends, it's quite easy: pass the sep argument to print().

print("05", "09", "2021", sep="/")
print("username", "gmail.com", sep="@")

Output:

05/09/2021
username@gmail.com

Removing duplicates from a list

Lists can contain duplicate elements and, let's admit it people, even though that is often useful, sometimes the duplicates are unwanted. I've got a solution for you guys!

nums = [10, 12, 14, 16, 18, 18, 10, 10, 14]
print("Original list:", nums)
nums = list(set(nums))
print("After removing duplicates:", nums)

Output:

Original list: [10, 12, 14, 16, 18, 18, 10, 10, 14]
After removing duplicates: [10, 12, 14, 16, 18]

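Easy, isn't it? Just convert the list into a set and then change it back, all in one line! Converting it to a set removes all duplicates because a set cannot contain duplicate elements. Keep in mind that sets are unordered, so the original order of the list is not guaranteed to survive the round trip.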
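When the original order matters, one alternative is dict.fromkeys, since dictionaries keep insertion order in Python 3.7 and later. A minimal sketch with a made-up list:

```python
values = [18, 10, 14, 10, 12, 18]

# Each value becomes a dictionary key once, in order of first appearance.
unique_in_order = list(dict.fromkeys(values))
print(unique_in_order)  # [18, 10, 14, 12]
```

Like the set-based version, this only works when the elements are hashable.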

Python collections tricks

Basic set operations.

>>> A = {1, 2, 3, 3}
>>> A
{1, 2, 3}
>>> B = {3, 4, 5, 6, 7}
>>> B
{3, 4, 5, 6, 7}
>>> A | B
{1, 2, 3, 4, 5, 6, 7}
>>> A & B
{3}
>>> A - B
{1, 2}
>>> B - A
{4, 5, 6, 7}
>>> A ^ B
{1, 2, 4, 5, 6, 7}
>>> (A ^ B) == ((A - B) | (B - A))
True

Counter data structure (an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values).

>>> import collections
>>> A = collections.Counter([1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7])
>>> A
Counter({3: 4, 1: 2, 2: 2, 4: 1, 5: 1, 6: 1, 7: 1})
>>> A.most_common(1)
[(3, 4)]
>>> A.most_common(3)
[(3, 4), (1, 2), (2, 2)]
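A typical use is counting words. The sentence below is made up for this sketch:

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()
counts = Counter(words)

print(counts.most_common(1))  # [('the', 3)]
print(counts["fox"])          # 1
print(counts["cat"])          # 0  (missing keys count as zero, no KeyError)
```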

Default dictionary structure (a subclass of dictionary that creates and returns a default value when a non-existing key is accessed).

>>> import collections
>>> m = collections.defaultdict(int)
>>> m['a']
0

>>> m = collections.defaultdict(str)
>>> m['a']
''
>>> m['b'] += 'a'
>>> m['b']
'a'

>>> m = collections.defaultdict(lambda: '[default value]')
>>> m['a']
'[default value]'
>>> m['b']
'[default value]'

>>> m = collections.defaultdict(list)
>>> m['a']
[]
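The list default is what makes grouping convenient, because there is no need to check whether a key already exists. A sketch with a hypothetical word list, grouped by first letter:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = defaultdict(list)
for word in words:
    # A missing key is created with an empty list on first access.
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```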

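Ordered dict structure (a subclass of dictionary that keeps insertion order and adds order-aware methods such as move_to_end; regular dictionaries also preserve insertion order since Python 3.7).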

>>> from collections import OrderedDict

>>> d = OrderedDict.fromkeys('abcde')
>>> d.move_to_end('b')
>>> ''.join(d.keys())
'acdeb'

>>> d.move_to_end('b', last=False)
>>> ''.join(d.keys())
'bacde'

Deque structure (a deque is a generalization of stacks and queues).

>>> import collections
>>> Q = collections.deque()
>>> Q.append(1)
>>> Q.appendleft(2)
>>> Q.extend([3, 4])
>>> Q.extendleft([5, 6])
>>> Q
deque([6, 5, 2, 1, 3, 4])
>>> Q.pop()
4
>>> Q.popleft()
6
>>> Q
deque([5, 2, 1, 3])
>>> Q.rotate(3)
>>> Q
deque([2, 1, 3, 5])
>>> Q.rotate(-3)
>>> Q
deque([5, 2, 1, 3])

>>> last_three = collections.deque(maxlen=3)
>>> for i in range(5):
...     last_three.append(i)
...     print(', '.join(str(x) for x in last_three))
...
0
0, 1
0, 1, 2
1, 2, 3
2, 3, 4
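A bounded deque is a natural fit for sliding-window calculations. Here is a minimal sketch of a moving average; the function name moving_average and its parameters are assumptions for this example:

```python
from collections import deque


def moving_average(values, window):
    """Average of the last `window` values seen so far, for each position."""
    recent = deque(maxlen=window)  # the oldest item is dropped automatically
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


print(moving_average([1, 2, 3, 4, 5], 3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```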

Named tuples structure (create tuple-like objects that have fields accessible by attribute lookup as well as being indexable and iterable).

>>> import collections
>>> Point = collections.namedtuple('Point', ['x', 'y'])
>>> p = Point(x=1.0, y=2.0)
>>> p
Point(x=1.0, y=2.0)
>>> p.x
1.0
>>> p.y
2.0

Using a dictionary as a switch.

>>> func_dict = {'sum': lambda x, y: x + y, 'subtract': lambda x, y: x - y}
>>> func_dict['sum'](9,3)
12
>>> func_dict['subtract'](9,3)
6
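Unlike a real switch statement, a plain dictionary lookup raises KeyError for an unknown key. One way to handle that, sketched with a hypothetical apply_operation helper and the standard operator module:

```python
import operator

# Hypothetical dispatch table: operation name -> two-argument function.
OPERATIONS = {
    "sum": operator.add,
    "subtract": operator.sub,
}


def apply_operation(name, x, y):
    """Look up `name` in the table and call the matching function."""
    try:
        func = OPERATIONS[name]
    except KeyError:
        # Acts as the "default" branch of the switch.
        raise ValueError(f"Unknown operation: {name!r}") from None
    return func(x, y)


print(apply_operation("sum", 9, 3))       # 12
print(apply_operation("subtract", 9, 3))  # 6
```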

Data classes structure.

>>> from dataclasses import dataclass

>>> @dataclass
... class DataClassCard:
...     rank: str
...     suit: str
...

>>> queen_of_hearts = DataClassCard('Q', 'Hearts')
>>> queen_of_hearts.rank
'Q'
>>> queen_of_hearts
DataClassCard(rank='Q', suit='Hearts')
>>> queen_of_hearts == DataClassCard('Q', 'Hearts')
True
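The decorator also takes options. The sketch below uses a made-up Version class to show a default field value, ordering comparisons (order=True), and immutability (frozen=True):

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True, order=True)
class Version:
    major: int
    minor: int = 0  # default value, so Version(2) means 2.0


# order=True compares instances field by field, like tuples.
print(Version(1, 2) < Version(1, 10))  # True
print(Version(2) == Version(2, 0))     # True

# frozen=True makes instances read-only.
try:
    Version(1).major = 5
except FrozenInstanceError:
    print("Version instances cannot be modified")
```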

Concatenating Strings

When you need to concatenate a list of strings, you can do this using a for loop and adding each element one by one. However, this can be very inefficient, especially if the list is long. In Python, strings are immutable, so each concatenation may have to copy both the left and the right string into a new string. A better approach is to use the str.join() method, which assembles the result without creating all those intermediate strings, as shown below:

# Naive way to concatenate strings
letters = ['a', 'b', 'c', 'd', 'e']
joined = ""
for x in letters:
    joined += x
print(joined)

abcde

# Joining strings
letters = ['a', 'b', 'c', 'd', 'e']
joined = "".join(letters)
print(joined)

abcde
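One thing to watch out for: join() only accepts strings, so other values have to be converted first. A short sketch with made-up numbers:

```python
numbers = [1, 2, 3]

# ",".join(numbers) would raise TypeError, because the items are ints.
csv_line = ",".join(str(n) for n in numbers)
print(csv_line)  # 1,2,3
```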

Using List Comprehensions

List comprehensions are one of the key Python features, and you may already be familiar with the concept. Even if you are, here's a quick reminder of how list comprehensions help us create lists more concisely, and usually faster, than an explicit loop:

# Less efficient way to create a new list based on an existing iterable
squares = []
for x in range(5):
    squares.append(x**2)
print(squares)

[0, 1, 4, 9, 16]

# More efficient way to create a new list based on an existing iterable
squares = [x**2 for x in range(5)]
print(squares)

[0, 1, 4, 9, 16]
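Comprehensions can also filter while they build, and the same syntax works for dictionaries. A brief sketch (the variable names are just for illustration):

```python
# Keep only the even numbers, then square them.
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]

# Dictionary comprehension: map each number to its square.
square_by_number = {x: x**2 for x in range(4)}
print(square_by_number)  # {0: 0, 1: 1, 2: 4, 3: 9}
```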

Unstack

Sometimes, you'll prefer to transform one level of the index (like email_provider) into the columns of your data frame. That's exactly what unstack() does. It's easiest to explain with an example, so let's unstack the result of grouping the clients by state and email provider:

# Moving 'Mail providers' to the column names
clients.groupby('state')['email_provider'].value_counts().unstack().fillna(0)

As you can see, the email service providers are now the columns of our data frame, and fillna(0) replaces the missing state/provider combinations with zero.

Grouping Data

To demonstrate how we can group data efficiently in pandas, let's first create a new column with the providers of email services. Here, we can use the trick for splitting columns that you're already familiar with:

# Creating new column with the email service providers
clients['email_provider'] = clients['email'].str.split('@', expand=True)[1]

clients['email_provider'].head()

Now let's group the clients by state and email_provider:

# Grouping clients by state and email provider
clients.groupby('state')['email_provider'].value_counts()

We've now got a pandas Series that uses several levels of indexing (state and email_provider) to provide access to each count. This is known as multi-indexing.
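The clients dataset itself is not included here, so the following sketch builds a tiny made-up stand-in (five rows, with invented states and email addresses) to show the grouping and unstacking steps end to end. It assumes pandas is installed:

```python
import pandas as pd

# Hypothetical stand-in for the clients data frame.
clients = pd.DataFrame({
    "state": ["NY", "NY", "NY", "CA", "CA"],
    "email": ["a@gmail.com", "b@gmail.com", "c@yahoo.com",
              "d@gmail.com", "e@outlook.com"],
})

# Everything after the "@" is the provider.
clients["email_provider"] = clients["email"].str.split("@", expand=True)[1]

# Counts per (state, provider) pair: a Series with a two-level index.
counts = clients.groupby("state")["email_provider"].value_counts()
print(counts.loc[("NY", "gmail.com")])  # 2

# Move the provider level into the columns; absent pairs become 0.
wide = counts.unstack().fillna(0)
print(wide)
```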

Checking if Two Columns Are Identical

Since we've practiced joining and splitting columns, you might have noticed that we now have two columns with the first name (first_name and f_name) and two columns with the last name (last_name and l_name). Let's quickly check if these columns are identical. First, note that you can use equals() to check the equality of columns or even entire datasets:

# Checking if two columns are identical with .equals()
clients['first_name'].equals(clients['f_name'])

True

You'll get a True or False answer. But what if you get False and want to know how many entries don't match? Here's a simple way to get this information:

# Checking how many entries in the initial column match the entries in the new column
(clients['first_name'] == clients['f_name']).sum()

500

We started by getting the number of entries that do match. Here, we again use the fact that True is treated as 1 in arithmetic. We see that 500 entries from the first_name column match the entries in the f_name column. You may recall that 500 is the total number of rows in our dataset, so this means all entries match. However, you may not always remember (or know) the total number of entries in your dataset. So, for our second example, we get the number of entries that do not match by subtracting the number of matching entries from the total number of entries:

# Checking how many entries in the initial column DO NOT match the entries in the new column
clients['last_name'].count() - (clients['last_name'] == clients['l_name']).sum()

0
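Another way to get at the mismatches, sketched on a small made-up frame (three invented rows, not the real clients data), is to build a boolean mask with != and use it both for counting and for pulling out the offending rows. It assumes pandas is installed:

```python
import pandas as pd

# Hypothetical data with one deliberate mismatch in row 1.
clients = pd.DataFrame({
    "last_name": ["Smith", "Jones", "Brown"],
    "l_name": ["Smith", "Jonas", "Brown"],
})

# True wherever the two columns differ.
mismatch = clients["last_name"] != clients["l_name"]

n_mismatch = int(mismatch.sum())
print(n_mismatch)  # 1

# The same mask selects the rows that need a closer look.
mismatched_rows = clients[mismatch]
print(mismatched_rows)
```

One caveat: element-wise comparison treats missing values (NaN) as unequal to each other, whereas equals() considers NaNs in the same position equal, so the two approaches can disagree on columns with missing data.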