How to analyze trends in content marketing

Taking a quick glance at your content marketing data, you can usually glean a couple of insights and major themes.

While this anecdotal evidence can be beneficial, sometimes we need a deeper dive – the insights might not always be clear, or perhaps you have a theory that you want to test.

Are certain topics more engaging than others? Are there factors that work together to create better content?

In this tutorial, we’ll look at how to do some basic statistical analysis of blog data using Google Analytics and Excel, but you could use the same principles for Facebook, Twitter or other social data. By adding a few extra variables to the exported data, such as author or category, we’ll be able to create segments and glean more meaningful insights.

There are many paid tools that will help drive insights, but we’ll take a look at some simple methods using free or readily available tools.

Clean the data

First, we’ll need a clean dataset. I’m using data from the All Pages report in Google Analytics (Behavior > Site Content > All Pages).

In this example, I’m only interested in blog posts, so I’ll run a query in Google Analytics to only show these pages.

Depending on how your site is set up, this may be straightforward or may involve some additional steps or workarounds. If all blog posts begin with the same path (/blog or /news or something similar) you can simply create a query for pages that begin with that path.
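If it is easier to export everything and filter afterwards, the same "begins with" rule can be applied outside Google Analytics. This is only a minimal sketch in Python; the `/blog/` prefix and the sample paths are assumptions for illustration, not values from a real report.

```python
def is_blog_post(page_path: str, prefix: str = "/blog/") -> bool:
    """Return True when the page path sits under the (assumed) blog prefix."""
    return page_path.startswith(prefix)


# Hypothetical page paths as they might appear in an All Pages export.
pages = ["/blog/entry-one", "/about", "/blog/entry-two", "/products/widget"]

# Keep only the rows that belong to the blog.
blog_pages = [path for path in pages if is_blog_post(path)]
print(blog_pages)
```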

Once you download the data as an Excel spreadsheet, look for any major issues that will affect your analysis. Common issues with URLs include fragments (/blog/entry-one#fragment) or parameters (/blog/entry-one?fbclid=123455) that are not stripped from the page path. Left as they are, these split a single post across several rows.
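For a large export, this cleanup can be scripted rather than done by hand. The sketch below strips fragments and parameters from each path and then merges the rows that collapse into the same post. The sample rows are made up, and removing the trailing slash is an extra assumption that may not suit every site.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def clean_path(raw_path: str) -> str:
    """Drop the query string and fragment, keeping only the page path."""
    path = urlsplit(raw_path).path
    # Assumption: "/blog/entry-two/" and "/blog/entry-two" are the same post.
    return path.rstrip("/") or "/"


# Hypothetical (page path, pageviews) rows from the export.
rows = [
    ("/blog/entry-one", 120),
    ("/blog/entry-one#fragment", 5),
    ("/blog/entry-one?fbclid=123455", 30),
    ("/blog/entry-two/", 80),
]

# Sum pageviews for rows that point at the same cleaned path.
pageviews_by_path = defaultdict(int)
for raw_path, pageviews in rows:
    pageviews_by_path[clean_path(raw_path)] += pageviews

print(dict(pageviews_by_path))
```

Only counts such as pageviews can be summed this way. Averages and rates (average time on page, bounce rate) would need to be recomputed or weighted by pageviews when rows are merged.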

Next, we’ll isolate the dependent variables we want to track. In this example, we’re most interested in pageviews, average time on page and bounce rate, so we’ll remove the page value, entrances, and exits columns.

Adding variables

From a cursory scan of our data, we should have a general idea of the trends we want to examine more closely.

For example, if several of the top blog posts appear to be customer case studies, maybe we want to test whether these posts perform better than posts about new products. Or maybe we want to compare blog posts among different writers. Perhaps we want to test the correlation between word count and engagement or between the number of links and bounce rates.

The difficulty and time involved with this step will depend on how your website is structured and what data you have readily available. If you utilize tags, categories or other similar data on your blog, you may be able to pass these values to Google Analytics via custom dimensions and actually do some analysis within Google Analytics. If there are variables that you would find value in regularly measuring, it is well worth your time to set these up as custom dimensions.

If this data is not stored in Google Analytics, you will need to enter it into your spreadsheet manually or obtain it from other sources.

For example, if we are going to compare blog posts among different writers, we can create a new column called “Author” and fill in the author data for each row.

We could also use categories or topics to slice our data. For example, we could create a “Category” column and fill in the appropriate category.

Similarly, we could create columns for any other variable we want to test, such as the type of visual used, whether a video was embedded, or anything else we suspect might influence performance.
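If the author and category values already live somewhere else, such as a CMS export, the new columns can be filled in with a lookup instead of typing each one. Here is a rough sketch of that idea; the `metadata` lookup, the names in it, and the "Unknown" default are all hypothetical.

```python
# Hypothetical rows from the cleaned spreadsheet.
posts = [
    {"Page": "/blog/entry-one", "Pageviews": 155},
    {"Page": "/blog/entry-two", "Pageviews": 80},
]

# Hypothetical lookup, maintained by hand or exported from the CMS.
metadata = {
    "/blog/entry-one": {"Author": "Alex", "Category": "Case study"},
}


def add_variables(posts, metadata, default="Unknown"):
    """Return new rows with Author and Category columns filled from the lookup."""
    enriched = []
    for post in posts:
        extra = metadata.get(post["Page"], {})
        enriched.append({
            **post,
            "Author": extra.get("Author", default),
            "Category": extra.get("Category", default),
        })
    return enriched


enriched = add_variables(posts, metadata)
for row in enriched:
    print(row)
```

Rows that fall back to "Unknown" are the ones that still need to be filled in manually.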

Analysis

Once we have added the additional data we want to analyze, we’ll create a pivot table to summarize the data. Click on the Insert tab and select Pivot Table.

First we’ll compare blog posts among authors.

Drag the Author field to the Rows box. Then drag Pageviews, Avg. Time on Page, and Bounce Rate to the Values box. Click on the drop-down arrow for each of the Values fields and change the summary from Sum to Average.

This gives us a table with high level summaries of our blog data segmented by author.
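For readers who prefer a script to a pivot table, the same summary amounts to grouping rows by author and averaging each metric. The following is a sketch with invented sample rows; the column names mirror the spreadsheet, and bounce rate is assumed to be stored as a fraction between 0 and 1.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; the values are illustrative only.
posts = [
    {"Author": "Alex", "Pageviews": 100, "Avg. Time on Page": 60.0, "Bounce Rate": 0.5},
    {"Author": "Alex", "Pageviews": 200, "Avg. Time on Page": 90.0, "Bounce Rate": 0.7},
    {"Author": "Sam", "Pageviews": 50, "Avg. Time on Page": 30.0, "Bounce Rate": 0.8},
]

METRICS = ("Pageviews", "Avg. Time on Page", "Bounce Rate")


def average_by(posts, keys, metrics=METRICS):
    """Group rows by the given columns and average each metric per group."""
    groups = defaultdict(list)
    for post in posts:
        groups[tuple(post[key] for key in keys)].append(post)
    return {
        group: {metric: mean(row[metric] for row in rows) for metric in metrics}
        for group, rows in groups.items()
    }


by_author = average_by(posts, ["Author"])
for group, averages in by_author.items():
    print(group, averages)
```

Like the pivot table, this gives every post equal weight, so a post with very few pageviews counts as much toward the average bounce rate as a popular one.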

Next, we’ll add another layer to this data. Suppose you want to see the big picture of how different categories are performing, but within those categories see data broken down by each writer.

Simply drag the “Category” column into the Rows box above the Author field.
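The two-level layout can be pictured as a nested grouping: categories on the outside, authors inside each category. This small sketch uses made-up rows and only the Pageviews metric to keep it short.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; the values are illustrative only.
posts = [
    {"Category": "Case study", "Author": "Alex", "Pageviews": 300},
    {"Category": "Case study", "Author": "Alex", "Pageviews": 100},
    {"Category": "Case study", "Author": "Sam", "Pageviews": 150},
    {"Category": "Product news", "Author": "Sam", "Pageviews": 50},
]

# Outer key: category. Inner key: author. Value: list of pageview counts.
nested = defaultdict(lambda: defaultdict(list))
for post in posts:
    nested[post["Category"]][post["Author"]].append(post["Pageviews"])

for category, authors in nested.items():
    # Category subtotal: average over every post in the category.
    all_views = [views for author_views in authors.values() for views in author_views]
    print(f"{category}: {mean(all_views):.1f}")
    for author, author_views in authors.items():
        print(f"  {author}: {mean(author_views):.1f}")
```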

Testing correlations

Now suppose we wanted to see if there was a correlation between word count and average time on page. The hope is that the longer the post, the longer our visitors are spending on the page.

To test this, we could create a new column called “Word Count” and then add the word count data for each blog post.

Once the data is entered, it helps to first visualize it with a scatter plot. Highlight the two columns, then click on the Insert tab and select Scatter chart.

Open the Chart Elements options and select Trendline. The chart shows word count on the X axis and average time on page in seconds on the Y axis.
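Assuming the trendline is the linear one, it is an ordinary least-squares line through the points. The sketch below shows how such a line can be fitted by hand; the function name and the sample numbers are illustrative, not taken from the dataset in this tutorial.

```python
def fit_trendline(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("all x values are identical, so no line can be fitted")
    co_spread = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_spread / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical word counts (x) and average time on page in seconds (y).
word_counts = [400, 650, 800, 1200, 1500]
seconds_on_page = [95, 80, 130, 110, 150]

slope, intercept = fit_trendline(word_counts, seconds_on_page)
print(f"Each extra word adds roughly {slope:.3f} seconds on the fitted line")
```

The slope says how steep the trend is, not how tightly the points follow it.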

The trend is generally as we expected, but it doesn’t appear to be a very strong correlation. Next we’ll calculate the correlation coefficient to measure the strength of the correlation between word count and time on page.

To do this, you will need the Analysis ToolPak installed (free).

Remember that a correlation coefficient is a number between -1 and 1. A 0 suggests there is no linear correlation between the two data sets, while a 1 indicates a perfect positive correlation and a -1 indicates a perfect negative correlation.

To calculate the correlation coefficient, we’ll move the word count column next to the average time on page column.

Click on the Data tab then select Data Analysis. When the Data Analysis prompt appears, select Correlation.

For the input range, select the two columns. Check the box that the first row contains headers.

For this data set, Excel returned a correlation coefficient matrix showing a correlation coefficient of 0.19. This confirms what we saw in the scatter plot and trend line – that there was a weak correlation between word count and time on page.
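To see what goes into that number, the Pearson correlation coefficient can also be computed directly. This is a minimal sketch with invented sample values, so its result will not match the 0.19 above.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_spread = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return co_spread / sqrt(spread_x * spread_y)


# Hypothetical word counts and average time on page in seconds.
word_counts = [400, 650, 800, 1200, 1500]
seconds_on_page = [95, 80, 130, 110, 150]

r = pearson(word_counts, seconds_on_page)
print(f"correlation coefficient: {r:.2f}")
```

With only a handful of posts the coefficient can swing a lot from one sample to the next, so it is best read as a rough indicator.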

This analysis tells us a couple things:

  • Longer posts are only weakly associated with more time on page. This might provide some guidance on how long users are willing to spend on a page, and therefore how long your content should be.
  • There may be other variables that are more strongly correlated with time on page. For example, maybe the presence of video content or photo galleries has a stronger correlation, or maybe the topic of the post is a strong indicator.

Grouping pivot tables

We may want to segment our data based on bounce rates and then look for commonalities among articles with lower or higher bounce rates. To do this, we’ll build a pivot table with Category in the Columns box, Bounce Rate in the Rows box, and Count of Bounce Rate in the Values box.

The resulting table will be overwhelming because each bounce rate value will create its own row. To fix this, we’ll group the bounce rates by right-clicking any value in the Bounce Rate rows and selecting Group.

The new table is much more user friendly and allows us to better see how bounce rates compare across different categories.
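Grouping here means replacing each exact bounce rate with the range it falls into and then counting posts per range and category. The sketch below assumes bounce rates expressed as percentages and a bucket width that divides 100 evenly; the sample rows are made up.

```python
from collections import Counter


def bounce_bucket(rate_percent: float, width: int = 10) -> str:
    """Label a bounce rate (0 to 100) with the range it falls into."""
    if not 0 <= rate_percent <= 100:
        raise ValueError("bounce rate must be between 0 and 100")
    # Clamp so that exactly 100% lands in the top bucket, not a new one.
    index = min(int(rate_percent // width), 100 // width - 1)
    low = index * width
    return f"{low}-{low + width}%"


# Hypothetical (category, bounce rate in percent) pairs.
posts = [
    ("Case study", 42.0),
    ("Case study", 48.5),
    ("Product news", 71.2),
    ("Product news", 45.0),
]

# Count posts per (bounce range, category), like the grouped pivot table.
counts = Counter((bounce_bucket(rate), category) for category, rate in posts)
for (bucket, category), count in sorted(counts.items()):
    print(bucket, category, count)
```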

As you dig deeper, keep in mind that there may be many factors you are not controlling for. For example, if some posts received more pageviews, perhaps it is because they were promoted on social media while others were not.
