Utilizing Google Search more effectively for finding data

Parul Pandey
Towards Data Science
4 min read · Sep 14, 2020

Image by Author

“Data! Data! Data!” he cried impatiently. “I cannot make bricks without clay.”

Sherlock Holmes in “The Adventure of the Copper Beeches,” Sir Arthur Conan Doyle

Data is central to any data science process: the results of an analysis can only be as good as the data fed into it. Yet obtaining that data is often a pain point in itself. I recently took a short course titled Data Journalism and Visualization with Free Tools, which shared some great resources. I’ll pass on the most valuable tips in a series of articles, highlighting ways to find free data on the internet and turn it into something meaningful.

This article is part of a complete series on finding good datasets. Here are all the articles included in the series:

Part 1: Getting Datasets for Data Analysis tasks — Advanced Google Search

Part 2: Useful sites for finding datasets for Data Analysis tasks

Part 3: Creating custom image datasets for Deep Learning projects

Part 4: Import HTML tables into Google Sheets effortlessly

Part 5: Extracting tabular data from PDFs made easy with Camelot

Part 6: Extracting information from XML files into a Pandas dataframe

Part 7: 5 Real-World datasets for honing your Exploratory Data Analysis skills

Advanced Google Search

Let’s begin with advanced Google search, one of the most common ways to find publicly available datasets. Simply typing the name of the required dataset into the search bar already surfaces plenty of resources. However, a simple trick can make this much easier by restricting the results to files of a specific type.

1. Using the filename and extension of the file to be downloaded

Let’s say we have a task at hand to find healthcare-related data in CSV format. A CSV file indicates a…
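
One way to express such a search is with Google’s filetype: operator, which restricts results to a given file extension. The snippet below is a minimal sketch that only builds the search URL. The helper name build_filetype_query and the example search terms are illustrative assumptions, not something taken from the course material.

```python
from urllib.parse import urlencode


def build_filetype_query(terms: str, extension: str) -> str:
    """Return a Google search URL restricted to one file extension.

    Hypothetical helper for illustration: it only assembles the URL and
    does not send any request.
    """
    # Combine the free-text terms with the filetype: search operator.
    query = f"{terms} filetype:{extension}"
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Example terms are an assumption; substitute your own topic.
    print(build_filetype_query("healthcare data", "csv"))
```

Pasting the printed URL into a browser is equivalent to typing healthcare data filetype:csv into the search bar, so the same pattern can be reused for other extensions such as xlsx or json.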

Principal Data Scientist @H2O.ai | Author of Machine Learning for High-Risk Applications