When I’m working with data, I probably spend 90% of my time cleaning. In a recent project, I ran into an issue where I needed to make ZIP codes consistent – some were five-digit, some had the plus-four, some had a hyphen while others didn’t, and some were just invalid. I searched for a simple […]
What this is for: Extracting bulk data from the NPI registry and downloading it as CSV. Requirements: Knowledge of JSON, Pandas dataframes. Concepts covered: Making an API request, flattening JSON, combining dataframes. Download: Download the Jupyter notebook. If you work with health care provider data, you are no doubt familiar with NPI (National Provider Identifier) data. […]
What this is for: Querying Facebook’s ad library for political or issue ads and downloading results to CSV. Requirements: Verified Facebook ID, Facebook developer account, basic understanding of APIs and Pandas dataframes. In 2019, Facebook launched its Ad Library to improve transparency in advertising, allowing anyone to easily search ads that pages are running. While […]
What this is for: Forecasting seasonal data. Requirements: Python Anaconda distribution, understanding of statistics and experience with machine learning. Concepts covered: Calculating confidence intervals and forecasting future values with the pmdarima library. Download the Jupyter notebook. One of the more helpful applications of data science to marketing is developing forecasts. You can use forecasts to predict […]
What this is for: Collecting comments from a public Facebook page and performing a basic content analysis. Requirements: Python Anaconda distribution, basic understanding of HTML structure and Chrome Inspector tool. Concepts covered: Social listening, word clouds. For anyone who works in strategic communications, it’s critical that you have a finger on the pulse of your […]
What this is for: Telling a Selenium web scraper to click on a link or enter basic information into a form. Requirements: Python Anaconda distribution, basic knowledge of HTML structure and Chrome Inspector tool. Concepts covered: Selenium. Selenium is an incredibly useful tool for scraping websites with Python, but occasionally your scraper may need to interact […]
What this is for: Isolating elements you want to scrape with Selenium. Requirements: Python Anaconda distribution, basic knowledge of HTML structure and Chrome Inspector tool. Concepts covered: Selenium, XPath. Occasionally when you’re testing or scraping a web page with Selenium, you may need to select an element or group of elements where you may only […]
What this is for: Analyzing and visualizing CTR and search position for organic search terms. Requirements: Google Analytics data connected to Google Search Console, Python Anaconda distribution. Concepts covered: Simple data cleaning, Pandas dataframes, Matplotlib, search engine result position, click-through rates. Complete Python file available for download. We all know search rankings matter. Users […]
What this is for: Scraping web pages to collect review data and storing the data in a CSV. Requirements: Python Anaconda distribution, basic knowledge of HTML structure and Chrome Inspector tool. Concepts covered: Selenium, exception handling. Download the entire Python file. In an earlier blog post, I wrote a brief tutorial on web scraping […]
Update: Since writing this post, Google has removed documentation for Python for this API. My code still works for me, but as this is not a supported language, I’d recommend building in a different language. Read more here. What this is for: Getting reviews from multiple Google My Business locations and converting data into a […]