Search console API in R

How to use Google Search Console API in R

One of R's big strengths is its ability to combine and process large amounts of data. Using R for SEO is therefore a good way to quickly analyze data coming from several different services.

Let's discover how to use an API in R with one of the most popular ones for SEO: the Google Search Console API.

Why use R to explore the Search Console API?

The advantage of using the API is that you can extract the dimensions and metrics suited to your project. You can automatically integrate data about the visibility of your websites into your data frames.

Moreover, this API allows you to combine dimensions that cannot be combined in the web version of the Search Console. Where the web interface is limited to one dimension at a time (query or page, for example), the API gives you more granularity.

To explore the Search Console, we will use 2 libraries:

  • googleAuthR: a library to authenticate with Google services
  • searchConsoleR: the library that we will use to explore Search Console data

These packages were created by Mark Edmondson and are available on CRAN, the default package repository for R. The first step is to install these packages and load them into our programming environment with the following lines.

install.packages("googleAuthR")
install.packages("searchConsoleR")
library(googleAuthR)
library(searchConsoleR)
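When the script is run regularly, reinstalling the packages on every run is unnecessary. The sketch below installs only what is missing. The helper names missing_packages() and install_if_missing() are hypothetical (they are not part of googleAuthR or searchConsoleR) and rely only on base R functions.

```r
# Hypothetical helper (not part of either package): returns the names of
# the packages in `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: installs only the missing packages and returns
# their names invisibly.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Usage (uncomment to run):
# install_if_missing(c("googleAuthR", "searchConsoleR"))
```

The library() calls are still needed afterwards to load the packages into the session.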

Google Search Console API authentication in R

Once the packages are loaded, the first step is to specify the service you need (the scope), so that the right permissions are requested, and then to authenticate with Google. The easiest way to connect to the API is browser-based authentication. To do that, use the scr_auth() function.

options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/webmasters")
scr_auth()

This command opens a browser window asking you to authorize searchConsoleR to access your Search Console data.

Your token is then stored at the root of your working directory, in the sc.oauth file.

Once you are authenticated, you can return to RStudio. Note that you can also authenticate with a service account key, which is better suited to automated scripts. To use this method of authentication, run the following command.

gar_auth_service("myKey.json")

This method is more complex to set up, but it can be essential if you want to launch your scripts automatically and limit manual operations.
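The two authentication methods above can be combined in one place, so the same script works both interactively and in automation. This is a minimal sketch: choose_auth_method() and authenticate() are hypothetical helpers, and it assumes that the service account behind the key file has been given access to the Search Console property.

```r
# Hypothetical helper: picks the service account method when a key file
# exists at `key_path`, and browser authentication otherwise.
choose_auth_method <- function(key_path = NULL) {
  if (!is.null(key_path) && file.exists(key_path)) {
    "service_account"
  } else {
    "browser"
  }
}

# Hypothetical wrapper around the two calls shown above.
authenticate <- function(key_path = NULL) {
  options(googleAuthR.scopes.selected = "https://www.googleapis.com/auth/webmasters")
  method <- choose_auth_method(key_path)
  if (method == "service_account") {
    googleAuthR::gar_auth_service(key_path)
  } else {
    searchConsoleR::scr_auth()
  }
  invisible(method)
}

# Usage (uncomment to run; "myKey.json" is the key file from the example above):
# authenticate("myKey.json")
```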

Setting up the Search Console API exploration

Now that you're authenticated, we can explore the data of the websites associated with your account.

To do that, we will configure 5 variables.

  • website: URL of the website to analyze, exactly as it appears in the Search Console
  • start: earliest date of the analysis
  • end: most recent date of the analysis
  • download_dimensions: dimensions to download (available dimensions include date, query, page, device and country)
  • type: type of search (web, image or video)

Before launching our analysis, let's take two minutes to explain our plan.

For our example, we will study the visibility of the website aseox.fr. This analysis will focus on the last 90 days. Note that the Search Console has a delay of 3 days before the data becomes available, which is why the end date always takes this delay into account.

The downloaded dimensions can be the query, the page, the type of device or the country. For our example, we will study the visibility of queries by page.

Finally, we will only study web searches; video and image searches are excluded from this analysis.

Now that the problem is laid out, here is how we translate it into our script:

# Website settings
website <- "http://www.aseox.fr/"
# Data available D-3
start <- Sys.Date() - 93 # start 93 days ago (90 days + 3-day delay)
end <- Sys.Date() - 3

# Dimensions to download: date, query, page, device, country
download_dimensions <- c('page', 'query')

# Type of search
type <- c('web')

That's it: this is the R configuration that matches our problem.

Your configuration is ready; all that remains is to launch the exploration.
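To make the date logic above reusable for other periods, the start and end dates can be derived from a window length and a delay. This is a minimal sketch in base R; the function analysis_window() and its parameters are hypothetical, and it simply reproduces the arithmetic of the configuration above (end = today minus the delay, start = today minus the window and the delay).

```r
# Hypothetical helper: computes the start and end dates of the analysis.
# `days`  : length of the period to analyze
# `delay` : number of days before Search Console data becomes available
# `today` : reference date (defaults to the current date)
analysis_window <- function(days = 90, delay = 3, today = Sys.Date()) {
  list(
    start = today - (days + delay),
    end = today - delay
  )
}

window <- analysis_window()
start <- window$start
end <- window$end
```

Passing a fixed date as today makes the result reproducible, which helps when re-running an old analysis.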

How to explore Search Console API in R?

To explore the Search Console API, we will use the search_analytics() function from the searchConsoleR package. This function returns a data frame, so we advise you to store it in an object. Our parameters are configured; we just have to call the function.

searchquery <- search_analytics(siteURL = website,
                                startDate = start,
                                endDate = end,
                                dimensions = download_dimensions,
                                searchType = type,
                                rowLimit = 5000)
print("Search Console API Request : completed")
# Sort by clicks, then impressions, in descending order (requires the dplyr package)
searchquery <- dplyr::arrange(searchquery, dplyr::desc(clicks), dplyr::desc(impressions))

To make the results easier to read, the data is sorted in descending order of clicks, then impressions, at the end of the analysis. Your analysis is now available in the searchquery data frame.

You will find the clicks, impressions, CTR and position that are available in the Search Console, but broken down by page and by query. This lets us study the performance and ranking of a page for a specific query or set of queries, which is very useful when analyzing how your pages rank.
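As an illustration of this page-by-query reading, the sketch below extracts the queries of a single page from a data frame shaped like the result described above. The rows are made up for the example (they are not real Search Console data), and queries_for_page() is a hypothetical helper written in base R.

```r
# Made-up sample rows with the columns described above;
# the values are illustrative only, not real Search Console data.
sample_result <- data.frame(
  page = c("https://www.example.com/a",
           "https://www.example.com/a",
           "https://www.example.com/b"),
  query = c("first query", "second query", "third query"),
  clicks = c(12, 3, 7),
  impressions = c(150, 90, 60),
  ctr = c(0.08, 0.0333, 0.1167),
  position = c(4.2, 11.5, 2.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keeps the rows of one page, sorted by clicks and
# then impressions in descending order.
queries_for_page <- function(df, page_url) {
  rows <- df[df$page == page_url, , drop = FALSE]
  rows[order(-rows$clicks, -rows$impressions), , drop = FALSE]
}

page_a <- queries_for_page(sample_result, "https://www.example.com/a")
print(page_a)
```

With real data, the same call would be made on searchquery instead of sample_result.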

To go further with the Google Search Console API and R

The searchConsoleR package has several other functions to work with the data of this service. For example, you can retrieve the list of websites in your account and the associated alerts. Moreover, wrapping this script in a function and integrating it into a monitoring project is a good way to automate your monitoring and the cross-checking of your SEO data.
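Wrapping the script in a function could look like the sketch below. The function name fetch_search_console() and its parameter names are hypothetical; only the call to search_analytics() comes from the package. It assumes you are already authenticated, and it sorts the result with base R so that no extra package is required.

```r
# Hypothetical wrapper around searchConsoleR::search_analytics().
# Assumes authentication has already been done in the session.
fetch_search_console <- function(website,
                                 start = Sys.Date() - 93,
                                 end = Sys.Date() - 3,
                                 dimensions = c("page", "query"),
                                 type = "web",
                                 row_limit = 5000) {
  result <- searchConsoleR::search_analytics(
    siteURL = website,
    startDate = start,
    endDate = end,
    dimensions = dimensions,
    searchType = type,
    rowLimit = row_limit
  )
  # Sort by clicks, then impressions, in descending order
  result[order(-result$clicks, -result$impressions), , drop = FALSE]
}

# Usage (uncomment after authenticating):
# searchquery <- fetch_search_console("https://www.example.com")
```

A monitoring job can then call this function for each website and each period it needs to follow.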

Search console analysis in R: the script

Feel free to test this yourself: below is the full script to analyze your visibility with the Search Console API in R.

# Search Console Analysis
install.packages("googleAuthR")
install.packages("searchConsoleR")
library(googleAuthR)
library(searchConsoleR)

# Website settings
website <- "https://www.example.com"

options(googleAuthR.scopes.selected = c("https://www.googleapis.com/auth/webmasters"))

# Web browser authentication
scr_auth()

# Data available D-3
start <- Sys.Date() - 93 # start 93 days ago (90 days + 3-day delay)
end <- Sys.Date() - 3

# Dimensions to download: date, query, page, device, country
download_dimensions <- c('page', 'query')

# Type of search
type <- c('web')

# Search Console API query
searchquery <- search_analytics(siteURL = website,
                                startDate = start,
                                endDate = end,
                                dimensions = download_dimensions,
                                searchType = type,
                                rowLimit = 5000)

print("Search Console API Request : completed")

# Sort by clicks, then impressions, in descending order (requires the dplyr package)
searchquery <- dplyr::arrange(searchquery, dplyr::desc(clicks), dplyr::desc(impressions))