All about Python — 100 + Code Snippets, Tricks, Concepts and Important Modules
Python is one of the most popular programming languages today. It is used heavily across fields, from website building to artificial intelligence.
Some of the people who use Python in their day-to-day work are data analysts, data scientists, data engineers, machine learning engineers, web developers, etc. In this article I have shared some code snippets, concepts and important modules in Python which I find very useful, and I hope they are useful to you. Most of this I found in the Python documentation, Stack Overflow and Kaggle. If you want to know the ins and outs of data structures and algorithms, then do practice LeetCode or HackerRank type problems.
I have used the following tools:
- carbon.now.sh for the code snippets.
- Excalidraw for the drawings.
- GitMind for the flow diagrams.
Let's start.
1. Convert two lists into a Dictionary:
{'CLIPPERS': 3, 'GSW': 1, 'LAKERS': 2}
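A minimal sketch of how a dictionary like the one above can be built. The two lists are hypothetical sample data, since the original inputs are not shown here.

```python
# Hypothetical sample data (the original lists are not shown in the article).
teams = ["LAKERS", "GSW", "CLIPPERS"]
ranks = [2, 1, 3]

# zip pairs the items position by position; dict turns the pairs into a mapping.
standings = dict(zip(teams, ranks))
print(standings)
```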
2. Use zip and zip(*):
[('car', 10), ('truck', 20), ('bus', 30)]
('car', 'truck', 'bus')
(10, 20, 30)
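The output above can be reproduced with a sketch like the following, assuming two hypothetical input lists. zip pairs the items up, and zip(*pairs) reverses the operation.

```python
vehicles = ["car", "truck", "bus"]  # assumed sample data
counts = [10, 20, 30]

# Pair the two lists element by element.
pairs = list(zip(vehicles, counts))
print(pairs)

# The * operator unpacks the pairs, so zip "unzips" them back into two tuples.
names, numbers = zip(*pairs)
print(names)
print(numbers)
```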
3. Flatten a list:
[1, 2, 3, 3, 7, 8, 9, 12, 17]
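One common way to get a flat list like the one above is a nested list comprehension. The nested input here is an assumption chosen to match the output.

```python
nested = [[1, 2, 3], [3, 7, 8], [9, 12, 17]]  # assumed input

# Read as: for each sublist, for each item in that sublist, keep the item.
flat = [item for sublist in nested for item in sublist]
print(flat)
```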
4. Save and Load a Machine Learning model Using Pickle:
Also, check this Kaggle article for more info
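As a rough sketch of the save and load round trip, the snippet below pickles a plain dictionary that stands in for a trained model. Any picklable object, such as a fitted scikit-learn estimator, would be handled the same way. The file name is hypothetical. Only unpickle files you trust, because loading a pickle can execute arbitrary code.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model; a real model object would go here.
model = {"weights": [0.1, 0.2], "bias": 0.5}

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Save: pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(model, f)

# Load the object back from disk.
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded)
```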
5. Melt Function.
Pandas melt() function is used to change the DataFrame format from wide to long.
df:
df1:
Another example:
For more info please check this stack-overflow.
6. Faker:
Faker is a Python package that generates fake data for you. Whether you need to bootstrap your database, create good-looking XML documents, fill-in your persistence to stress test it, or anonymize data taken from a production service, Faker is for you.
Jeremy Craig
1302 Brittany Estate
Lake Diamondburgh, HI 31774
Level president life time follow indicate size should. Consumer ability this perform write. Oil wait left tough product.
Need out per third most job special. Good gas star build blood.
7. String and Split:
Split strings around given separator/delimiter.
Output:
8. Case-upper and lower
warriors is the best team
WARRIORS IS THE BEST TEAM
wARRIORS IS THE BEST TEAM
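The three lines above correspond to lower(), upper() and swapcase(). A small sketch, assuming the original string was capitalized as shown:

```python
text = "Warriors is the best team"  # assumed original string

lowered = text.lower()      # every character in lower case
uppered = text.upper()      # every character in upper case
swapped = text.swapcase()   # lower becomes upper and vice versa

print(lowered)
print(uppered)
print(swapped)
```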
9. Magic commands:
- Run the command %lsmagic to see all the available magic commands.
Available line magics:
%alias %alias_magic %autocall %automagic %autosave %bookmark %cat %cd %clear %colors %config %connect_info %cp %debug %dhist %dirs %doctest_mode %ed %edit %env %gui %hist %history %killbgscripts %ldir %less %lf %lk %ll %load %load_ext %loadpy %logoff %logon %logstart %logstate %logstop %ls %lsmagic %lx %macro %magic %man %matplotlib %mkdir %more %mv %notebook %page %pastebin %pdb %pdef %pdoc %pfile %pinfo %pinfo2 %pip %popd %pprint %precision %profile %prun %psearch %psource %pushd %pwd %pycat %pylab %qtconsole %quickref %recall %rehashx %reload_ext %rep %rerun %reset %reset_selective %rm %rmdir %run %save %sc %set_env %shell %store %sx %system %tb %tensorflow_version %time %timeit %unalias %unload_ext %who %who_ls %whos %xdel %xmode
Available cell magics:
%%! %%HTML %%SVG %%bash %%bigquery %%capture %%debug %%file %%html %%javascript %%js %%latex %%perl %%prun %%pypy %%python %%python2 %%python3 %%ruby %%script %%sh %%shell %%svg %%sx %%system %%time %%timeit %%writefile
Automagic is ON, % prefix IS NOT needed for line magics.
10. Reverse a String:
11. Colab: You can display the data frame in table format:
12. Swap Variables:
In Python, you don’t need a temp variable.
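A minimal sketch of the swap using tuple unpacking. The variable names are arbitrary.

```python
a, b = 10, 20

# The right-hand side is evaluated first, then unpacked into a and b.
a, b = b, a

print(a, b)
```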
13. Merge Dictionaries:
{'apple': 2, 'banana': 3, 'orange': 2}
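One way to get a merged dictionary like the one above is dictionary unpacking. The two inputs are assumed sample data. On Python 3.9 and later the `|` operator does the same thing.

```python
fruits_a = {"apple": 2, "banana": 3}  # assumed sample data
fruits_b = {"orange": 2}

# Unpack both dictionaries into a new one; later keys win on conflicts.
merged = {**fruits_a, **fruits_b}
print(merged)
```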
14. Print Emojis
You can check the whole list of emojis in this link or you can google it. Use the CLDR short name.
15. TQDM: progress Bar
tqdm derives from the Arabic word taqaddum (تقدّم) which can mean "progress," and is an abbreviation for "I love you so much" in Spanish (te quiero demasiado). Instantly make your loops show a smart progress meter: just wrap any iterable with tqdm(iterable), and you're done. trange(N) can also be used as a convenient shortcut for tqdm(range(N)).
Check out their GitHub repository for more info.
16. Unpack a list:
Output:
1 2 3
[1, 2, 3, 4] 5 6
1 [2, 3, 4, 5] 6
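The output above is consistent with plain and starred unpacking. Here is a sketch with assumed input lists:

```python
# Plain unpacking: the number of names must match the number of items.
a, b, c = [1, 2, 3]
print(a, b, c)

# A starred name collects everything that the other names do not take.
*head, second_last, last = [1, 2, 3, 4, 5, 6]
print(head, second_last, last)

first, *middle, end = [1, 2, 3, 4, 5, 6]
print(first, middle, end)
```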
17. Remove Punctuations in a Sentence:
Use the string module. You can remove punctuation as shown above.
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
The wellknown story I told at the conferences about hypocondria in Boston New York Philadelphiaand Richmond went as follows
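A minimal sketch of the technique using string.punctuation with str.translate. The sentence is a shortened, assumed version of the one in the output.

```python
import string

sentence = "The well-known story I told at the conferences, about hypocondria!"

# Build a translation table that deletes every punctuation character.
table = str.maketrans("", "", string.punctuation)
cleaned = sentence.translate(table)

print(string.punctuation)
print(cleaned)
```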
18. Find the common elements in 2 lists:
Output:
['g', 'f', 'e', 'h']
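A sketch using set intersection, with two assumed input lists. Sets are unordered, which is why the result can come back in an arbitrary order like the output above.

```python
list1 = ["a", "b", "c", "d", "e", "f", "g", "h"]  # assumed sample data
list2 = ["e", "f", "g", "h", "i", "j", "k", "l"]

# & keeps only the elements present in both sets.
common = list(set(list1) & set(list2))
print(common)
```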
19. Append a value to the list:(Append or Concatenate)
Output:
['apple', 'oranges', 'bananas', 'grapes']
20. Remove a sublist from a list:
Output:
[10, 20, 60]
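One possible way to arrive at the result above, assuming the original list held 10 to 60 and the sublist was [30, 40, 50]:

```python
numbers = [10, 20, 30, 40, 50, 60]  # assumed original list
to_remove = [30, 40, 50]            # assumed sublist

# Keep only the items that are not in the sublist.
remaining = [n for n in numbers if n not in to_remove]
print(remaining)

# If the sublist is contiguous, deleting a slice works in place.
in_place = [10, 20, 30, 40, 50, 60]
del in_place[2:5]
print(in_place)
```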
21. Use negative index [-1] or [~0]:
60
50
60
50
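The four values above match the following sketch, assuming a list that ends in 50, 60. The ~ operator is bitwise NOT, so ~0 is -1 and ~1 is -2.

```python
numbers = [10, 20, 30, 40, 50, 60]  # assumed sample list

last = numbers[-1]           # last element
second_last = numbers[-2]    # second to last element
last_tilde = numbers[~0]     # ~0 == -1, so this is also the last element
second_tilde = numbers[~1]   # ~1 == -2

print(last, second_last, last_tilde, second_tilde)
```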
To understand slice notation please check this wonderful stackoverflow thread.
22. Slice Assignment:
Slice assignment is another way you can manipulate the list. Check this stackoverflow for more info.
Output:
[10, 21, 31, 40, 50, 60, 70, 80, 90]
[10, 40, 50, 60, 70, 80, 90]
[10, 10, 20, 30, 40, 50, 60, 70, 80, 90]
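The three outputs above can be reproduced with slice assignment as sketched below. The starting list and the slices are assumptions chosen to match the output, and each operation starts from a fresh copy.

```python
base = [10, 20, 30, 40, 50, 60, 70, 80, 90]  # assumed starting list

replaced = base.copy()
replaced[1:3] = [21, 31]   # replace two items
print(replaced)

removed = base.copy()
removed[1:3] = []          # assigning an empty list deletes the slice
print(removed)

inserted = base.copy()
inserted[0:0] = [10]       # an empty slice inserts without replacing anything
print(inserted)
```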
23. Variable Names:
- Variable names must start with a letter or an underscore.
- Variable names can’t start with a numeral.
- Variable names can’t start with a symbol.
- Variable names can contain letters, numbers and underscores.
- Variable names are case sensitive.
- A variable name can’t be a Python keyword.
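These rules can be checked programmatically. The helper below is a hypothetical illustration built on str.isidentifier() and the keyword module.

```python
import keyword


def is_valid_name(name):
    """Return True if name can be used as a variable name (hypothetical helper)."""
    # isidentifier checks the character rules; iskeyword rejects reserved words.
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["_total", "total2", "2total", "my-var", "for", "format"]:
    print(candidate, is_valid_name(candidate))
```

Note that "format" is valid: a name may contain a keyword such as "for", it just can't be one.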
24. Mutable vs Immutable:
25. Built-in functions:
To list the built-in functions in Python we can import builtins and call dir(builtins).
To see what any function does, we can use the built-in function help.
The same applies to modules.
For example, if you use pandas, call dir on it to find the functions available and then call help on the one you need.
import pandas as pd
dir(pd)
help(pd.util)
26. Comments:
- # single line comment.
- y = a + b # Calculates the sum (an inline comment).
- """ This is a multi-line string, often used as a multi-line comment and in the documentation of functions, modules, classes, etc. """
- A docstring is a string literal used to document modules, classes, functions and methods. It has to be the first statement of the component it describes.
27. Break vs Continue vs Return vs Pass:
- Break: control flows out of the loop immediately.
- Continue: this statement skips to the next iteration and doesn’t execute the statements after the continue in the current iteration.
- Return: this statement exits from the function without executing the code after it.
- Pass: the pass statement is a null statement. Nothing happens when pass is executed. It results in no operation (NOP). This can be useful as a placeholder for code that is yet to be written.
Output:
0 1 2 3 4 Break
Continue:
Output:
0 1 2 3 4 Continue 5 Continue 6 7 8 9 Continue
Return:
Output:
1
2
3
2
4
3
Pass:
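The four statements can be sketched together as follows. The function and variable names are hypothetical and this is not the original example.

```python
def first_multiple(numbers, divisor):
    """Return the first number divisible by divisor, or None."""
    for n in numbers:
        if n % divisor == 0:
            return n  # return leaves the function immediately
    return None


before_break = []
for i in range(10):
    if i == 5:
        break  # leave the loop entirely
    before_break.append(i)

odd_only = []
for i in range(10):
    if i % 2 == 0:
        continue  # skip the rest of this iteration
    odd_only.append(i)


def not_written_yet():
    pass  # placeholder: does nothing, returns None


print(first_multiple([3, 5, 8, 10], 4), before_break, odd_only)
```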
28. Enumerate:
If you want to loop through a list and need both each element and its index, use the enumerate function.
Output:
0 : sanjose
1 : cupertino
2 : sunnyvale
3 : fremont
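A sketch that produces this kind of output. The list is assumed from the values shown.

```python
cities = ["sanjose", "cupertino", "sunnyvale", "fremont"]

lines = []
# enumerate yields (index, element) pairs, starting at 0 by default.
for index, city in enumerate(cities):
    lines.append(f"{index} : {city}")

print("\n".join(lines))
```

Passing start=1 to enumerate makes the numbering begin at 1 instead.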
Please check this stackoverflow discussion .
29. Pop, Remove and Reverse:
Pop: removes and returns the item at the given index. With no argument it removes and returns the last element of the list.
l1 = [10,20,30,40,50,60,70,80]
l1.pop(2)
returns 30
Remove: removes the first occurrence of the specified value. If the provided value cannot be found, a ValueError is raised.
l1 = [10,20,30,40,50,60,70,80]
l1.remove(30)
l1
[10, 20, 40, 50, 60, 70, 80]
Reverse: Reverses the list.
l1 = [10,20,30,40,50,60,70,80]
l1.reverse()
print(l1)
[80, 70, 60, 50, 40, 30, 20, 10]
30. Check if the list is empty:
Use len(list) == 0 or not list.
31. Check if the element exists in the list:
Just use the in operator, as in element in list.
32. All and Any:
- use all() to determine if all the values in an iterable evaluate to True.
- any() determines if one or more values in an iterable evaluate to True.
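A small sketch of both functions on an assumed list of numbers:

```python
values = [2, 4, 6, 7]  # assumed sample data

# all is True only if every value passes the test.
all_even = all(v % 2 == 0 for v in values)

# any is True as soon as one value passes the test.
any_odd = any(v % 2 == 1 for v in values)

print(all_even, any_odd)

# Edge case: on an empty iterable, all returns True and any returns False.
print(all([]), any([]))
```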
33. To find the n largest and n smallest numbers in a list:
To find the n largest and n smallest items, use heapq.
[800, 500, 320, 200]
[10, 25, 40, 59]
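The two lists above match heapq.nlargest and heapq.nsmallest with n = 4. The input list is an assumption chosen to contain those values.

```python
import heapq

numbers = [200, 10, 500, 25, 800, 40, 100, 320, 59]  # assumed sample data

largest = heapq.nlargest(4, numbers)     # returned in descending order
smallest = heapq.nsmallest(4, numbers)   # returned in ascending order

print(largest)
print(smallest)
```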
34.Check if the file exists:
Output:
True
35. Check if the file is not empty:
Output:
True
36.Copying a file and a directory:
37. os.path:
- check the current directory.
- check the parent directory.
- check if the path is a directory.
- check if the path is a file.
- check if the path is a mount point.
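The checks in this list map onto functions in os and os.path roughly as follows. The current working directory is used as the path, so the printed results depend on where the script runs.

```python
import os

current_dir = os.getcwd()                    # current directory
parent_dir = os.path.dirname(current_dir)    # parent directory

is_dir = os.path.isdir(current_dir)          # is the path a directory?
is_file = os.path.isfile(current_dir)        # is the path a regular file?
is_mount = os.path.ismount(current_dir)      # is the path a mount point?

print(current_dir, parent_dir, is_dir, is_file, is_mount)
```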
38. Difference between Python’s Generators and Iterators:
- iterator: any object whose class has a __next__ method and an __iter__ method that does return self.
- Every generator is an iterator, but not vice versa. A generator is built by calling a function that has one or more yield expressions and is an object that meets the definition of an iterator.
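To make the distinction concrete, here is a sketch of the same countdown written once as a hand-made iterator class and once as a generator function. Both names are hypothetical.

```python
class Countdown:
    """Iterator: implements __iter__ (returning self) and __next__."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that the iterator is exhausted
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function: yield does the bookkeeping for us."""
    while start > 0:
        yield start
        start -= 1


print(list(Countdown(3)), list(countdown(3)))
```

Calling countdown(3) returns a generator object that already has __iter__ and __next__, which is why every generator is an iterator.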
Please check out this stackoverflow question on the difference between iterators and generators.
39.Function with arbitrary number of arguments and arbitrary number of keyword arguments:
Output:
200 400 500 700
apple : 1 orange : 2 grapes : 2
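A sketch of a function that accepts any number of positional and keyword arguments. The function name and the returned strings are assumptions chosen to mirror the output above.

```python
def describe(*args, **kwargs):
    """Collect positional arguments in a tuple and keyword arguments in a dict."""
    positional = " ".join(str(a) for a in args)
    keyword = " ".join(f"{key} : {value}" for key, value in kwargs.items())
    return positional, keyword


positional, keyword = describe(200, 400, 500, 700, apple=1, orange=2, grapes=2)
print(positional)
print(keyword)
```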
40.Lambda Functions:
- Lambda creates a function that contains a single expression.
- No need to use return when writing lambda function.
- The value after :(colon) is returned.
- Lambda can take arguments also.
- Lambdas are used for short functions .
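The points above can be illustrated with a short sketch. The data is made up for the example.

```python
# A lambda with one argument; the expression after the colon is the return value.
square = lambda x: x * x

# Lambdas are most useful inline, for example as a sort key.
pairs = [("b", 2), ("a", 3), ("c", 1)]
by_count = sorted(pairs, key=lambda pair: pair[1])

print(square(4))
print(by_count)
```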
Check this stackoverflow question for more details on Lambda.
41. Map Function:
- Map takes a function and a collection of items. It runs the function on each item in the original collection and yields each return value. In Python 3 it returns a lazy iterator rather than a new list, so wrap it in list() if you need a list.
42. Filter Function:
Filter takes a function and a collection. It returns an iterator over every item for which the function returned a truthy value.
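A minimal sketch of map and filter on an assumed list of numbers. Both return iterators in Python 3, so the results are wrapped in list() here.

```python
numbers = [1, 2, 3, 4, 5]  # assumed sample data

# map applies the function to every item.
squares = list(map(lambda n: n * n, numbers))

# filter keeps the items for which the function returns a truthy value.
evens = list(filter(lambda n: n % 2 == 0, numbers))

print(squares)
print(evens)
```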
43. Python Decorator:
Check out my post in detail about python decorator.
- Decorators were introduced in Python 2.4. A Python decorator is a function that modifies another function and returns a function.
- It takes a function as its argument and typically returns a closure. A closure in Python is a function returned by another function that remembers the variables of the enclosing scope.
- There is a wrapper function inside the decorator function.
- It adds some additional functionality to the existing function without changing the code of the existing function.
- A decorator allows you to execute code before and after the function; it decorates without modifying the function itself.
- In Python, a decorator is applied with the @ symbol followed by the name of the decorator function. Decorators add a small overhead to each function call.
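A small sketch of a decorator that counts how often a function is called. The names count_calls and add are hypothetical. functools.wraps keeps the decorated function's name and docstring intact.

```python
import functools


def count_calls(func):
    """Decorator that records how many times func has been called."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1             # code that runs before the function
        return func(*args, **kwargs)   # call the original function

    wrapper.calls = 0
    return wrapper


@count_calls
def add(a, b):
    return a + b


result = add(2, 3)
print(result, add.calls, add.__name__)
```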
44. f Strings:
- f-strings were introduced in Python 3.6.
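A few common f-string patterns, sketched with made-up values:

```python
name = "Python"
price = 1234.5

greeting = f"Hello, {name}!"          # variable substitution
total = f"{2 + 3}"                    # any expression can go inside the braces
formatted = f"{price:,.2f}"           # thousands separator, two decimals
padded = f"{7:03d}"                   # zero-padded to width 3

print(greeting, total, formatted, padded)
```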
Check out this blog for more f-string formatting.
45.String Module:
46.Timeit — Measure execution time of small code snippets:
This module provides a simple way to time small bits of Python code. It has both a Command-Line Interface as well as a callable one.
47.Module Vs Package:
- A module is a single Python file that can be imported.
- A package is made up of multiple Python files (or modules), and can even include libraries written in C or C++.Instead of being a single file, it is an entire folder structure.
Check out this stack overflow question on the difference between module and package.
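48. Copy and Deep Copy:
The default assignment “=” assigns a reference of the original list to the new name. That is, the original name and new name are both pointing to the same list object.
To get an independent copy, use the copy module: copy.copy() makes a shallow copy and copy.deepcopy() also copies nested objects.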
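The difference between assignment, a shallow copy and a deep copy shows up with nested lists. This sketch uses made-up data.

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # same object, just another name
shallow = copy.copy(original)     # new outer list, shared inner lists
deep = copy.deepcopy(original)    # fully independent copy

original[0].append(99)            # mutate a nested list
original.append([5, 6])           # mutate the outer list

print(alias)
print(shallow)
print(deep)
```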
49. Collections-Count:
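50. Ordered Dictionary:
Before Python 3.7, the order of keys in a regular dictionary was not guaranteed. This could cause confusion when you keep adding keys and also in debugging. Since Python 3.7, regular dictionaries preserve insertion order.
Use OrderedDict from collections when you need to support older versions or want order-specific methods such as move_to_end().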
51. JSON File:
- Use the json package to write and read a JSON file.
Write
Read and Print
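A minimal sketch of writing a dictionary to a JSON file and reading it back. The record and the file name are hypothetical, and a temporary directory is used so nothing is left behind in the working directory.

```python
import json
import os
import tempfile

record = {"name": "apple", "count": 3, "tags": ["fruit", "red"]}
path = os.path.join(tempfile.mkdtemp(), "data.json")  # hypothetical file name

# Write: json.dump serializes the object straight into the file.
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f, indent=2)

# Read and print: json.load parses the file back into Python objects.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded)
```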
52. Filter a list in a single line code:
53. Random Seed: why it is important:
To learn more about random seeds, refer to the Stack Overflow question. The random() function in Python generates pseudo-random numbers. The sequence it produces is determined by a starting value called the seed. The random.seed() function initializes the random number generator. By default, the generator is seeded from the current system time or an operating-system randomness source. If you use the same seed value twice, you get the same sequence of random numbers twice.
Without setting the seed, the output is different in each execution.
When using the seed, you get the same result again and again.
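A sketch showing that re-seeding with the same value reproduces the same numbers. The seed value 42 is arbitrary.

```python
import random

random.seed(42)  # arbitrary seed value
first_run = [random.randint(1, 100) for _ in range(5)]

random.seed(42)  # same seed again, so the sequence restarts
second_run = [random.randint(1, 100) for _ in range(5)]

print(first_run)
print(second_run)
```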
54. Read and Write a Text File:
55.Read and Write a CSV File:
56. Inverting a dictionary:
57. Merge two dictionaries:
Please check this stack overflow.
58. Sort the dictionary:
{'apple': 1,
'bananas': 3,
'grapes': 4,
'oranges': 2,
'strawberries': 5,
'watermelon': 6}
59.Convert a string to words:
60. Difference between list1 and list2:
61.Load multiple CSV files into a dataframe:
Using glob
62. Unzip Files:
There are Python modules available to unzip files:
- tarfile
- zipfile
- gzip
zipfile:
tarfile:
gzip:
63. dict.get(key) vs dict[key]: which is better:
Please check out this Stack Overflow discussion.
- If the key is not present, dict[key] raises a KeyError.
- If you use dict.get() and a key is not found, the code will return None (or a custom value, if you specify one).
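Both behaviours in one short sketch, using a made-up dictionary:

```python
prices = {"apple": 1.5}  # assumed sample data

missing = prices.get("kiwi")            # None, no exception
with_default = prices.get("kiwi", 0.0)  # custom fallback value

try:
    prices["kiwi"]                      # square brackets raise on a missing key
    raised = False
except KeyError:
    raised = True

print(missing, with_default, raised)
```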
64. Difference between == and is:
- is will return True if two variables point to the same object (in memory).
- == will return True if the objects referred to by the variables are equal.
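A quick sketch with two equal but distinct lists:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate object
c = a           # another name for the same object

print(a == b)   # compares values
print(a is b)   # compares identity
print(a is c)
```

The most common idiomatic use of is is the comparison with None, as in `if value is None:`.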
Check out this stackoverflow one.
65.Convert a dictionary into list of tuples:
66. Convert a dictionary keys to a list:
67.Find out the indexes of an element in the list:
* Use enumerate
68. Flatten a list using Numpy and itertools:
There are a lot of ways you can do it. You can also use numpy or itertools.
69. Removing leading and trailing spaces:
70.Difference in days and months between two dates:
In months.
71. Check if the date is a weekday or a weekend:
72. OS Module:
73.Handling exceptions-Use Try,except and finally:
74. Find out the memory used by Python objects:
- Use sys.getsizeof().
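A minimal sketch of sys.getsizeof. The exact byte counts vary between Python versions and platforms, so none are shown here. Note that the result is shallow: for a container it does not include the size of the objects it refers to.

```python
import sys

empty_list = []
big_list = list(range(1000))

# Size in bytes of the list object itself (not of the integers it points to).
empty_size = sys.getsizeof(empty_list)
big_size = sys.getsizeof(big_list)

print(empty_size, big_size)
```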
75.Extract an email ID from the text using regex:
76. Regular import vs from module import vs *:
- Regular import: import pandas as pd
- The Python style guide (PEP 8) recommends putting each import on its own line.
import pandas,numpy,sys
This is not recommended.
import pandas as pd
import numpy as np
import sys
The above is recommended.
- from module import: Sometimes you just want to import a part of a module. Python allows it using
from tensorflow import keras
from tensorflow.keras import layers
from keras.models import Sequential
For example, if you have imported only tensorflow, then to use keras you have to write tensorflow.keras.
- from module import *: Wildcard (*) imports are allowed by the language but not advised. The reason is that they can cause namespace collisions. In short, don’t use wildcard imports. The below is not recommended.
from tensorflow import *
- You can also use a local import (inside a function) instead of a global import, but it’s generally better to use global imports.
77. Type Hints:
In many other languages you need to declare the data type before using a variable. In Python you don’t need to declare the data type. Type hints were introduced in Python 3.5; they are optional and are not enforced at runtime. Please check the documentation for more info.
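A small sketch of annotated functions. The function names are hypothetical. The hints are stored in `__annotations__` and are used by tools such as type checkers and IDEs rather than by the interpreter.

```python
from typing import List


def total_length(words: List[str]) -> int:
    """Return the combined length of all words."""
    return sum(len(word) for word in words)


def greet(name: str, times: int = 1) -> str:
    return ", ".join([f"Hello {name}"] * times)


print(total_length(["ab", "cde"]))
print(greet("Ada", 2))
print(greet.__annotations__)
```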
78. Delete a variable using del :
Also check out this stackoverflow discussion.
79. What happens when you try to change an immutable object:
Please check the stackoverflow discussion on this.
80.Triple Quotes:
- You can use triple quotes
- Also you can use backslash like \ before the quotes or you can use double quotes.
Please check the stackoverflow discussion.
81.urllib:
- urllib.request
Also check out
- urllib.parse
- urllib.error
- urllib.robotparser
Check out the python documentation.
82. ChainMap (collections):
According to the documentation: A ChainMap class is provided for quickly linking a number of mappings so they can be treated as a single unit. It is often much faster than creating a new dictionary and running multiple update() calls.
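A typical use is layering settings, sketched here with two made-up dictionaries. Lookups search the mappings from left to right, so the first one acts as an override.

```python
from collections import ChainMap

defaults = {"color": "red", "user": "guest"}   # assumed sample data
overrides = {"user": "admin"}

# overrides is searched first, then defaults.
settings = ChainMap(overrides, defaults)

print(settings["user"])
print(settings["color"])
print(dict(settings))
```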
83. Global Variable vs Nonlocal Variable:
- Scenario 1: Use the global variable. If you try to modify the global variable inside a function, you get the below error.
To fix this error, declare the variable as global in the function: global glb_var
- Scenario 2: Use nonlocal variables. This doesn’t change the global variable’s value.
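Both scenarios in one sketch. The names are hypothetical and this is not the original example. Without the global or nonlocal declaration, each += below would raise UnboundLocalError.

```python
counter = 0  # module-level (global) variable


def increment_global():
    global counter       # rebind the module-level name
    counter += 1


def make_counter():
    count = 0            # local to make_counter

    def increment():
        nonlocal count   # rebind the variable of the enclosing function
        count += 1
        return count

    return increment


increment_global()
next_value = make_counter()
first, second = next_value(), next_value()

print(counter, first, second)
```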
84. Walrus operator:
According to the Python documentation: There is new syntax := that assigns values to variables as part of a larger expression. It is affectionately known as “the walrus operator” due to its resemblance to the eyes and tusks of a walrus. The walrus operator was introduced in Python 3.8.
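A short sketch in the spirit of the documentation's example. It needs Python 3.8 or later, and the data is made up.

```python
data = [1, 2, 3, 4, 5, 6]

# n is assigned and compared in the same expression.
if (n := len(data)) > 5:
    message = f"List is too long ({n} elements, expected <= 5)"
else:
    message = "List length is fine"

print(message)
```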
Another example
Check out the documentation .
85. Better Error Messages in Python 3.10 :
For example in the previous version
In Python 3.10 version
There are a lot of better error messages, like the ones below:
SyntaxError: multiple exception types must be parenthesized
SyntaxError: expression expected after dictionary key and ':'
SyntaxError: ':' expected after dictionary key
SyntaxError: invalid syntax. Perhaps you forgot a comma?
SyntaxError: did you forget parentheses around the comprehension target?
SyntaxError: expected ':'
IndentationError: expected an indented block after 'if' statement in line 2
AttributeError: module 'collections' has no attribute 'namedtoplo'. Did you mean: namedtuple?
86. New match-case statement in Python 3.10:
The below example is from the python documentation.
Another example
Please check the documentation for more info.
87. For encryption and decryption you can use cryptography:
Please check the cryptography documentation.
88. Why if __name__ == "__main__":
- The code under the condition if __name__ == "__main__": will be executed only when the Python script is run as a standalone program.
- If you import the module in another script or module, then the code under if __name__ == "__main__": won’t be executed.
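A minimal sketch of the pattern. The function name main is a common convention, not a requirement.

```python
def main():
    """Entry point used when the file is run directly."""
    return "running as a script"


# __name__ is "__main__" only when this file is executed directly,
# not when it is imported from another module.
if __name__ == "__main__":
    print(main())
```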
Please check this stackoverflow question for a longer answer.
89.Why we need self in python Class:
Please refer to this stackoverflow discussion.
90. Why we need init in python class:
There is another great discussion on stackoverflow on why we need init. Check it out. According to the documentation, the __init__ method is the Python equivalent of the C++ constructor in an object-oriented approach. The __init__ function is called every time an object is created from a class. The __init__ method lets the class initialize the object’s attributes. It is only used within classes.
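A small sketch showing self and __init__ together. The class and its attributes are hypothetical.

```python
class Player:
    def __init__(self, name, team):
        # __init__ runs once per new object; self is the object being created.
        self.name = name
        self.team = team

    def describe(self):
        # self gives the method access to this particular object's attributes.
        return f"{self.name} plays for {self.team}"


p1 = Player("Alex", "GSW")
p2 = Player("Sam", "LAKERS")

print(p1.describe())
print(p2.describe())
```

Calling p1.describe() is equivalent to Player.describe(p1), which is why the method needs the explicit self parameter.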
91. Sys Module:
- The sys module provides system specific parameters and functions.Please check the documentation for more info.
- We can see few of sys methods below
How to list all files in a directory using sys module.
92. Python Profiling:
According to the Python documentation: cProfile and profile provide deterministic profiling of Python programs. A profile is a set of statistics that describes how often and for how long various parts of the program executed. These statistics can be formatted into reports via the pstats module.
It’s simple. Just import cProfile and start using it.
93. Read table data from PDF into dataframe and save it as CSV or JSON:
- tabula-py is a tool for converting PDF tables to pandas DataFrames. tabula-py is a wrapper of tabula-java, which requires Java on your machine. tabula-py also enables you to convert tables in a PDF into CSV/TSV files.
Please check this colab notebook for full implementation.
94. Extract images from a PDF file using PyMuPDF:
Check out their github repository for more info.
95. Merging PDF Files:
You can use PyPDF2 or PyPDF4. The code snippet is for PyPDF2.
Check out the stackoverflow for more info.
96.How to get an index of the element in the list:
Please refer for more info.
97. Add keys to existing dictionary:
Please check out this stackoverflow discussion on adding keys to dictionary.
98. Difference between append and extend:
- append adds its argument as a single element to the end of a list. The length of the list itself will increase by one.
- extend iterates over its argument adding each element to the list, extending the list. The length of the list will increase by however many elements were in the iterable argument.
- Please check out this stackoverflow discussion for more info.
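The difference is easiest to see when the argument is itself a list, as in this sketch:

```python
appended = [1, 2]
appended.append([3, 4])   # the whole list becomes one new element

extended = [1, 2]
extended.extend([3, 4])   # each item is added separately

print(appended, len(appended))
print(extended, len(extended))
```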
99. How to put a time delay in a python script:
- Use the time module and time.sleep.
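A minimal sketch. The 0.1 second delay is arbitrary, and time.monotonic() is used here only to show how long the pause took.

```python
import time

start = time.monotonic()
time.sleep(0.1)   # pause for 0.1 seconds; the argument can be a float
elapsed = time.monotonic() - start

print(f"slept for about {elapsed:.2f} seconds")
```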
100. Convert a string to date :
- There are a lot of ways you can do it.
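Two standard-library options, sketched with made-up date strings: datetime.strptime with an explicit format, and date.fromisoformat (Python 3.7+) for ISO strings.

```python
from datetime import date, datetime

# strptime parses a string according to the format codes given.
parsed = datetime.strptime("15/03/2022", "%d/%m/%Y").date()

# ISO 8601 dates (YYYY-MM-DD) can be parsed directly.
iso_parsed = date.fromisoformat("2022-03-15")

print(parsed, iso_parsed)
```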
101. Data Preprocessing Libraries:
The important data preprocessing libraries used heavily by data scientists, data analysts and other data professionals are:
- Pandas
- Numpy
102. Data Visualization Libraries:
Here are some of the important data visualization libraries
- Matplotlib
- Seaborn
- Plotly
- Bokeh
- Altair
103. Web Scraping Python Modules:
Some of the important web scraping libraries are
- Scrapy
- Beautiful Soup
- Selenium
- Requests
- Urllib
104. Machine and Deep Learning Libraries:
Some of the popular machine learning and deep learning libraries are:
- scikit-learn
- Keras
- TensorFlow
- PyTorch
- MXNet
105: Python Excel Libraries:
- xlwings
- XlsxWriter
- xlrd
- pyexcel
Check the link for more info.
For example using xlwings to view a dataframe in excel.
106. Commonly used Docstrings:
- reStructuredText (reST) format:
- Google format: For example
- Numpy format:
107. Iterable vs Iterator vs Iteration:
108. Remove the spaces in a string:
Use strip() ,replace() and split()
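Each of the three methods removes a different kind of space, as this sketch with a made-up string shows:

```python
text = "  hello   world  "

edges_trimmed = text.strip()            # leading and trailing spaces only
all_removed = text.replace(" ", "")     # every space character
collapsed = " ".join(text.split())      # runs of whitespace become one space

print(repr(edges_trimmed))
print(repr(all_removed))
print(repr(collapsed))
```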
109. Find all the text files in a directory or any file type in a directory:
Please check the StackOverflow for more info.
110. PIP vs Conda:
Please check this one for more info about conda.
111. Remove sensitive information like name, email, phone no, etc from text:
- Use scrubadub to hide sensitive information like phone numbers, names, credit card details, etc.
Check the documentation for more info.
112. Faster pandas dataframe using Modin:
- In pandas, you are only able to use one core at a time when you are doing computation of any kind. With Modin, you are able to use all of the CPU cores on your machine. Please check their documentation for installation instructions and the getting started guide.
import modin.pandas as pd
df = pd.read_csv("my_dataset.csv")
113. Convert a python notebook into a web app using Mercury:
pip install mljar-mercury
mercury watch my_notebook.ipynb
The watch command will monitor your notebook for changes and will automatically reload them in the Mercury web app. Check out their github repository for more info.
114. Apache Spark-Pyspark- Pandas API:
The pandas API is available from Spark 3.2, which was released in October 2021. PySpark users can access the full PySpark APIs by calling DataFrame.to_spark(). pandas-on-Spark DataFrame and Spark DataFrame are virtually interchangeable.
Conclusion:
Hope you find this article useful. Again, Python is super popular and is used in almost every field these days. The main purpose of this article is to serve as a reference for you. For example, if you want to know how to merge two PDF files, come and check the code snippet, which points to the Python module PyPDF2/PyPDF4, and then deep dive into the GitHub repository of the libraries. Please feel free to connect with me on LinkedIn. Thanks!