Google Search API Leak: SEO Hot Takes, Helpful Tools & Making Sense of the Hype

Google confirmed the authenticity of 2,500 leaked internal documents recently, and the SEO industry is freakin’ out, man. These documents reveal details about the data Google collects and how it might have been used in their super-secret search ranking algorithm.

SEO OGs Rand Fishkin and Mike King were the first to dig into the details, each sharing their initial analyses. Since then, we’ve seen a ton of hot takes – but how do you know what to believe? Which bits and pieces may actually be useful?

Let’s take a look at how the Google Leak occurred, who’s been talking about it, and what interesting nuggets have surfaced since the story broke on May 27.

An important caveat before we jump in: Every time Google sneezes, the SEO industry goes wild – and this was a big one. There are all kinds of theories and opinions being floated, and the only guarantee is that each one feels valid to the person it’s coming from. Personally, I find myself landing somewhere between “meh” and “let’s wait and see how it shakes out” on most earth-shattering SEO news, but then I’m jaded and desensitized after twenty years of it. Learning to take in a massive amount of information, filter out the noise, and get busy testing what actually matters to you is an important skill to develop in SEO.

All of that is to say, don’t feel like you have to “know the answer” or are somehow falling behind because everyone else seems so sure they know what these leaked documents mean. If you’re looking to expand your knowledge on this Google Search API docs leak and get a bigger-picture view of what it’s all about, hopefully, these takes will help inform a well-rounded view and give you plenty of new things to test and play with.

Disclaimer: Sisters in SEO often refers and links to websites, tools, apps, and other content that can help improve your skills and build your business. Sometimes, we receive compensation if readers sign up or make a purchase. A sister’s gotta eat! 👏 👏🏾 👏🏼

First up, Rand Fishkin and Mike King verify the Google Search API leak.

In a May 27, 2024, blog post entitled ‘An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them,’ Rand Fishkin, founder of Moz and now CEO of SparkToro, shares the process by which he received and then validated the leaked documents. The person who brought him the information initially wanted to be kept anonymous. However, Erfan Azimi outed himself the very next day in a 13-minute YouTube video.

“In early May, I made the decision to bring to Rand the documents that I had. Previously, before this particular leak, I’ve spoke to certain Googlers who are no longer working in Google Search,” he said. Erfan went on to explain that he originally wanted to make a documentary in which former Googlers would appear on-camera but anonymized and spill the algorithm’s secrets. His source chose not to be involved in this.

You can watch the video below to get his full story, but in short, Erfan felt Googlers like John Mueller weren’t being honest when responding to SEO questions, and the truth needed to come out. He had an interaction with John sometime around 2021 that he felt was a sure indication that click data and Chrome data were being used in Google’s Search algorithms, despite Google’s denials.

Back to Rand… he reached a point in analyzing the leaked documents at which he wanted to bring in an API ringer. As Rand said, “I’ve worked with APIs a bit, but it’s been 20 years since I wrote code and 6 years since I practiced SEO professionally. So, I reached out to one of the world’s foremost technical SEOs: Mike King, founder of iPullRank.”

Over the course of a 40-minute call in which they reviewed the documentation together, Rand says it became clear to him: “this appears to be a legitimate set of documents from inside Google’s Search division, and contains an extraordinary amount of previously-unconfirmed information about Google’s inner workings.”

The next piece to hit the SEO airwaves was Mike King’s ‘Secrets from the Algorithm: Google Search’s Internal Engineering Documentation Has Leaked’ on May 27.

Mike took a deeper dive into the leaked internal documentation for Google Search’s Content Warehouse API. “While there is no detail about Google’s scoring functions in the documentation I’ve reviewed, there is a wealth of information about data stored for content, links, and user interactions. There are also varying degrees of descriptions (ranging from disappointingly sparse to surprisingly revealing) of the features being manipulated and stored,” he wrote.

So – take an hour here when you can, and read those pieces as a starting point. It doesn’t all need to make perfect sense, and you don’t have to understand all the concepts. But this is what others are talking about, so it’s important to have the base context first. With that under your belt, let’s move on to a few other resources that may help you make sense of the Google Search API Leaks hype.

Marie Haynes: What is this leaked Google code?

I love me some Marie Haynes perspective. She has a way of cutting through the noise to get to what matters. In this case, Marie says, the documents provide attributes and modules that, while not part of Google’s ranking algorithms, can offer valuable insights into how Google structures and uses data. These attributes, such as those related to NavBoost and quality raters, may indirectly inform us about Google’s ranking considerations and methodologies.

This is way oversimplified, of course… read her full article here (you’ll find a link there to part two, where she talks more about attributes).

Aleyda Solís: Let’s not read too much into this

Aleyda weighed in on the Google Leaks issue June 3 via LinkedIn, reminding readers it’s important to “take whatever is shared in SEO – from Google or not – with a healthy dose of skepticism.” She recommends that SEOs adopt a proactive “test for yourself” mindset to determine whether something is actually useful/impactful and makes sense in their particular case.

You can read more of Aleyda’s take in her latest SEOFOMO newsletter. One of the things I (and many others, clearly) love most about Aleyda is her ability to quickly curate data and her willingness to share it openly. She’s already pulled together a spreadsheet of Leaked Google Search API Doc resources, including tools, analysis, and interviews, that’s growing by the day and that you’ll want to check out.

Chima Mmeje: This tea is too freaking good

Chima shared a recap of Erfan’s video and kicked off a delicious conversation about the controversy in the Sisters in SEO FB group. Her quick take? Stop obsessing over Gen AI. Collect topical links like infinity stones because they still matter. Build websites that provide helpful content because that naturally attracts a lot of clicks and engagement.

And, of course: If you’re not thinking of Brand Authority, are you even doing SEO in 2024? Sisters in SEO members can read Chima’s post here in the group.

Andrew Ansley: The Google algorithm leak lifts the veil for SEOs

Among the key takeaways Andrew has identified from the leaked docs are:

  • Google has 7 different types of PageRank mentioned, one of which is the famous ToolBarPageRank.
  • The most important components of Google’s algorithm appear to be NavBoost, NSR, and ChardScores. These three components of Google’s search algorithm directly conflict with what Google publicly discloses.
  • Nearest Seed has modified PageRank. The algo is called pageRank_NS and it is associated with document understanding. Specifically, this is a more focused version of PageRank that can theoretically be used with clusters and low quality pages for a variety of purposes.

His article is definitely worth a read, and you’ll find it here.

Nina Clapperton: Here’s what the Google Search algorithm leak means for bloggers

Nina reminds us to take a breath and says, “The sky isn’t falling.” She brings much-needed caution and calm to the conversation. “Just because this data exists, doesn’t mean it’s up to date. It also doesn’t mean that every attribute existing in the API data is in use. How many of us have half written blog posts taking up space in our Google Drive? It could very well be similar,” she wrote, and continued, “Rand noted that since this document is for internal use, and it seems to show attributes that are no longer in use, that it’s likely it’s pretty up to date. I’m not discounting that, but we can’t be 100% sure what’s being used.”

She’s already put out an hour-long video on it, which is worth a watch (especially if your eyes are bugging out from all the reading so far).

Nina shared a list of 17 finds that will be particularly interesting to bloggers and content creators, so be sure to check those out here. She also shared this searchable database of all 14,000+ entries, created by Dixon Jones from InLinks. Damn, those SEOs are quick to make sense of massive amounts of data… like it’s their JOB, I tell you.

Andrew Shotland: Here’s what the Local SEOs among us should be looking at

Andrew Shotland at Local SEO Guide identified 271 Google API Docs and 461 attributes as relevant to Local SEO. To do so, he and his team identified various terms that implied “local,” then crawled the docs with Screaming Frog v20 connected to ChatGPT and had it summarize each document. You can scroll through their summarized documents and data structure in Andrew’s blog post here.
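
If you want to try the first step of that approach yourself (finding docs that mention “local”-ish terms), here’s a minimal Python sketch. To be clear, this is my own illustration and not Andrew’s actual process: the term list, the `find_local_docs` helper, and the sample data are all made up for the example, and it skips the crawling and ChatGPT summarizing entirely.

```python
import re

# Hypothetical terms that might imply "local"; the real list used by
# Local SEO Guide isn't given in this post.
LOCAL_TERMS = ["local", "geo", "address", "business", "place"]


def find_local_docs(docs, terms=LOCAL_TERMS):
    """Return {doc_name: [matching attribute names]} for every doc whose
    name or attributes contain any of the terms (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    matches = {}
    for name, attributes in docs.items():
        hits = [attr for attr in attributes if pattern.search(attr)]
        if pattern.search(name) or hits:
            matches[name] = hits
    return matches


# Made-up sample data, NOT real entries from the leaked docs.
sample_docs = {
    "ExampleLocalListing": ["streetAddress", "openingHours"],
    "ExampleImageData": ["width", "height"],
}

print(find_local_docs(sample_docs))
```

Swap in your own term list and a real export of doc and attribute names, then eyeball the matches before trusting them. Simple substring matching like this will pull in false positives.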

Wil Reynolds: Balance how deep a dive you do on the Google Algorithm doc leak

In a lengthy post on LinkedIn, Wil Reynolds from Seer Interactive calls for caution and balance. “Does the Google algorithm leak help you better understand your customer? No. Does it help you better understand if your customer is getting answers somewhere other than Google? No,” he writes. “IMHO understanding Google better will not help you understand your customer better.”

While the leak gives SEOs plenty of things to test, it’s important to keep the info in context. Read more of Wil’s take here on LinkedIn.

Rand Fishkin: The Google Leak in 7 mins

Oh hey, it’s Rand again. Fair enough, he was first to the table so he gets an extra helping from the bottomless and completely imaginary Sisters in SEO publicity buffet. Plus, I think we can all appreciate those who are making the effort to distill this monstrosity down into something manageable for the attention-deficit among us. 🤓

You’ll find this video in his May 31 blog post, ‘The Google API Leak Should Change How Marketers and Publishers Do SEO.’

Patrick Stox: Remember, everyone interprets things through the bias of their own experience

It’s a great reminder, and we sometimes need to hear it again. As Patrick Stox shared on Ahrefs, “Having some features or information stored does not mean they’re used in ranking. For our search engine, Yep.com, we have all kinds of things stored that might be used for crawling, indexing, ranking, personalization, testing, or feedback. We store lots of things that we haven’t used yet, but likely will in the future.”

It can be dangerous to get caught up in the hype. Ask anyone who’s been in SEO for a few years, and I bet almost all, if not every single one of us, has been asked to do something we don’t agree with because this SEO said this on their blog or that SEO suggested that at a conference. Patrick’s ‘Google Documents Leaked & SEOs Are Making Some Wild Assumptions’ is on the Ahrefs blog, and it’s well worth a read.

If there’s only one thing you get out of this blog post, I hope it is this…

No magic bullet secret has been revealed in the Google Search API Leak.

We all have some reading to do, and plenty of potential factors to consider. We’ve all had an important and timely reminder that Google reps aren’t always going to answer SEO questions fully and accurately… such is the nature of the beast.

Come on, did you really think we were going to “solve” SEO? That would be no fun at all! SEO will continue on as the wild, unpredictable ride we may love but will never really know – not fully. And I, for one, wouldn’t have it any other way.
