We spoke to the CEO of Holistic SEO, Koray Tuğberk GÜBÜR, about all things SEO - including semantic search, knowledge graphs, tools and more.
When was the last time I learned something in SEO that opened my eyes?
The first rule of thumb as an SEO is to always be hungry for knowledge and innovation. However, when you read articles for 4 to 5 hours every day, it becomes very difficult to be surprised by something new. That's why I don't get excited when so-called industry leaders present, as a novelty, something I learned years ago, or something Google engineers announced years ago.
Also, look at the situation from Bill Slawski's perspective. He wrote about "entities" exactly 14 years ago, and Google was there 10 years before that. Do you think he is surprised by anything new? I do not think so.
Many times, an industry leader comes to me and tells me an idea. Unfortunately, sometimes I pretend to be surprised so as not to be offensive. In reality, they are relaying an idea I read about and tried probably 2 or 3 years ago.
Another problem is that most industry leaders are very bad at conceptualisation and standardisation. They call the same things by different names. They produce false definitions. Because they do not see the big differences between small things, they make up wrong methods. This has turned into shiny object syndrome, which I don't like at all.
For example, when I publish my own SEO Course, you will probably be introduced to over 200 concepts from the intersection of semantics and SEO. Some come straight from Google engineers, some from me. However, they all have clean and sharp definitions.
So what most SEOs are surprised about is last year's news in my case. In Bill Slawski's case, it's news from the year before that!
“What most SEOs are surprised about is last year's news in my case.”
Therefore, the concepts that surprise and delight me may not be very familiar: for example, context specifying and context deepening, or query processing and query phrasification.
The difference between a topic and a context, the difference between a processed query and a raw query, and many other concepts are among those that have puzzled and improved me in recent years. In technical SEO, the concept of "cost of retrieval" from Trystan Upstill, "crawl resource allocation", and the search engine's interest in "links on the homepage" are among the surprises.
In the PageRank field, the concepts of topic-sensitive PageRank, and RankMerge are among them. Those who have finished any book on information retrieval and are familiar with the everyday problems of search engine engineers can easily understand these concepts.
That's why the SEO Industry Leaders I'm talking about come largely from the "editorial" and "verbal" schools. In other words, they present the same deck, in a different order, in different years and countries. After a while, everyone starts to take SEO lightly. However, with the energy required to train an SEO, you can train at least 5 developers.
What I've done in recent years is to increase the value and depth of SEO, to increase the respect for it. So even though I don't learn much from the industry, there are still things that surprise me in search engine research papers and patents.
For example, MUM is talked about today, when the research paper is actually from 2019. Recently, Google received patents on performing searches with hand gestures. We can talk about that in the future, but the gap between those who are surprised and those who already know will not close.
In 2019, I made three promises to myself.
The first was to create 10 SEO Case Studies. I have published 7 SEO Case Studies so far, covering 18 websites and SEO success stories in over 200,000 words.
I have now written 5 more SEO Case Studies, covering 16 websites in total, at around 70,000 words.
I am very happy to have kept this promise. My promise was to leave documents, ways of thinking, methods, and conclusions that could guide someone 100 years from now if they were to do research on search engines or similar technologies.
“My promise was to leave documents, ways of thinking, methods and conclusions that could guide someone 100 years from now.”
Search Engine Optimisation is not only for ranking websites; it is also highly relevant to probability analysis and comparative thinking, along with creative thinking methods. It requires a high level of discipline and multi-tasking skills, from coding to marketing.
Therefore, it is more important to me than money to be able to create permanent resources on behalf of SEO, and to do so by documenting it in the most inclusive, consistent, and well-evidenced manner possible.
In this context, my biggest wish is to be able to completely change the SEO Industry.
The four key facts that should always be on an SEO's mind are:
1. PageRank
2. Information Retrieval
3. Information Extraction
4. Cost of Retrieval
Cost of retrieval is the ratio between the cost incurred during crawling, understanding, evaluating, indexing, ranking, and serving, and the value obtained afterwards. Therefore, in order to build a rank-worthy website, one must understand how the search engine will approach the source in terms of PageRank, IR, and IE.
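As a minimal sketch, that ratio can be expressed in a few lines of Python. The stage names and numbers below are hypothetical, purely to make the definition concrete; they are not taken from any Google document.

```python
# Illustrative only: "cost of retrieval" as a ratio of the engine's spend
# on a source to the value obtained afterwards. All stage costs are made up.
def cost_of_retrieval(stage_costs: dict, value_obtained: float) -> float:
    """Total pipeline cost (crawl, understand, evaluate, index, rank,
    serve) divided by the value obtained. Lower is better for the source."""
    return sum(stage_costs.values()) / value_obtained

ratio = cost_of_retrieval(
    {"crawl": 4.0, "understand": 2.5, "evaluate": 1.0,
     "index": 1.5, "rank": 0.5, "serve": 0.5},
    value_obtained=25.0,
)
print(f"Cost of retrieval ratio: {ratio:.2f}")  # 0.40 in this toy example
```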
Within the scope of PageRank, link graph understanding and link labels should be kept in mind.
For information retrieval, subtopics such as term-weight calculation and IR types, for example boolean retrieval, should be kept in mind.
For information extraction, query processing, question generation, and question-and-answer pairing should be kept in mind.
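To make the first of these concrete, here is a minimal sketch of PageRank on a toy internal link graph, using the networkx library. The URLs and link structure are invented for illustration, and alpha=0.85 is the classic damping factor from the original paper.

```python
# Toy PageRank over a small internal link graph (pip install networkx).
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("/home", "/services"),
    ("/home", "/blog"),
    ("/blog", "/services"),   # blog posts funnel PageRank to services
    ("/blog", "/contact"),
    ("/services", "/contact"),
])

scores = nx.pagerank(G, alpha=0.85)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{page:12s} {score:.3f}")
```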
In this context, "Evaluation" or "Serving" can be considered individually. For example, saying "Evaluation" will iteratively change the rankings in a rotating fashion in case of any canonicalisation errors, or Indexing Signal Conflict. This situation creates "Ranking Signal Dilution".
Or, Google search engine gets a 404 status code from Robots.txt pages 25% of the time. This makes it difficult to understand the source.
Or, they rank "web entities" instead of websites. Web entities include websites, social media profiles, employees, offices, and company owners. One Identity Graph competes with another for reliability.
In this context, from Technical SEO to Semantic SEO, there are many basics to keep in mind. However, they all basically refer to the four concepts.
There is tool inflation in the SEO industry. Many SEO tools do the same thing with different designs. For the most part, I don't use a dedicated SEO tool; much of what I do comes from insight that is pretty hard to measure and requires search engine understanding.
However, Advertools and Python are of particular importance to me. I like to code and modify my own tools in a jiffy. Google Chrome DevTools and Firefox Developer Tools are fundamental tools.
As a resource, I use Google Research Papers, Google AI, the Google Blog, the Microsoft Bing Blog, Google Patents, Microsoft Patents, and SEO by the Sea for most of my research. However, I can easily recommend tools such as Keyword Cupid, Ahrefs, JetOctopus, OnCrawl, and NewzDash to many people. Especially after Russia's attacks on Ukraine, I started to support all SEO tools in Ukraine.
“After Russia's attacks on Ukraine, I started to support all SEO tools in Ukraine.”
I sometimes use Screaming Frog if I want to speed something up using a tool. Sometimes I do things directly with formulas on Google Sheets. However, most of the time this depends on the level of the client and the project.
A lot of my time lately has been spent teaching. Although I usually recommend Python to my employees, I realised that it takes more time to reach that level. That's why we are progressing with them using many different tools such as Screaming Frog, JetOctopus, Ahrefs, Authoritas, SEOTesting, and NewzDash.
I'm in love with every SEO Task that allows me to discover something new. Sometimes this can be creating a topical map, or designing a semantic content network.
In some cases, it may be to see how the language and region differences create a PageRank dilution on the link graph and to calculate and report whether it’s worth translating the website.
“I'm in love with every SEO Task that allows me to discover something new.”
In some cases, when performing Technical SEO tasks, it may be to find out how many seconds each extra request delays DOMContentLoaded, or to measure the cost to Googlebot in terms of time over a year.
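One hedged way of sketching that measurement in Python is through the browser's Navigation Timing API via Selenium. This assumes Selenium and a matching chromedriver are installed, and the URL is a placeholder; running it before and after removing an extra request gives a rough estimate of the delay that request causes.

```python
# Read DOMContentLoaded timing from the Navigation Timing API.
from selenium import webdriver

driver = webdriver.Chrome()
try:
    driver.get("https://example.com/")
    # PerformanceNavigationTiming reports milliseconds from navigation start.
    dcl_ms = driver.execute_script(
        "return performance.getEntriesByType('navigation')[0]"
        ".domContentLoadedEventEnd;"
    )
    print(f"DOMContentLoaded fired after {dcl_ms:.0f} ms")
finally:
    driver.quit()
```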
That's why I called my company Holistic SEO. But I can definitely say that the things I hate most in the business are the things that are not related to SEO, for example, meetings, emails, or accounting…
Perseverance and Diligence.
Most people, especially in the "editorial professions", are extremely lazy. They have no military discipline. They don't see or feel that the incorrect data, or the indication of poor quality, that they give to the Google search engine today can destroy a business 6 months later.
The date of August 1 is unlucky for me, for example. On August 1, 2018, Medic Update destroyed the PBN I was managing and many of the black hat techniques I used.
On August 1, 2020, the servers crashed on the first international SEO Case Study I published, HangiKredi.com. Therefore, I had to regain the same traffic and more in 3 months. If it wasn't for this mistake, I could have come here 3 months earlier!
This example is quite apt, because I understood from Googlebot's "crawl delay" and "crawl frequency" that the server was getting tired, and I warned the CTO. I did not let it rest. After 2 days, there was a trending event, and the servers were down for 9 hours. Competitors regained the rankings by taking all the historical data.
Even this relates to the "Multi-stage Query Processing" patent. If there is a trending event, Google will raise the pages related to the trending entity, even if they are old pages. If, meanwhile, you are returning 404 or 500 status codes, it will take a long time for the search engine to trust you again.
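As a rough sketch of how one might watch that "server getting tired" signal, the snippet below counts Googlebot hits per hour in a standard combined-format access log; a sustained drop in crawl frequency, or a spike in 5xx responses, is the kind of warning sign described above. The log path and format are assumptions.

```python
# Googlebot crawl frequency per hour from an access log (stdlib only).
import re
from collections import Counter
from datetime import datetime

TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2})")  # day + hour

hits_per_hour = Counter()
with open("access.log", encoding="utf-8") as f:
    for line in f:
        if "Googlebot" not in line:
            continue
        m = TIMESTAMP.search(line)
        if m:
            hour = datetime.strptime(m.group(1), "%d/%b/%Y:%H")
            hits_per_hour[hour] += 1

for hour, hits in sorted(hits_per_hour.items()):
    print(hour.isoformat(), hits)
```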
I know that small things like this can create irreversible errors. Therefore, I take my job seriously. However, I must say that over the past year, I haven't been as sharing or helpful as I used to be, because I've seen that many international leads have different kinds of ulterior motives. However, I'm still the same for a client, or a former client who becomes an ally.
Therefore, perseverance and diligence are the two most basic reasons.
“Perseverance and Diligence.”
If I had to add a few more, I might add creativity and solitude. In the field of SEO, you often do not find support. Therefore, I am well aware of the suffering of many SEO consultants.
In fact, both in my course and on my YouTube channel, I want to help them by telling them: You can make loneliness a friend, it's okay to read articles for 5 hours a day or work 16 hours a day.
Another reason might be "honesty". When I worked in agencies in Istanbul, I saw that every agency lied to every client. For example, the reason for a traffic drop may simply be called a "trend change", whereas the main reason is a minor Google update made recently. Or, I've witnessed moments where customers are unfairly overcharged or simple tasks are exaggerated... I don't do that in my own business.
This pushes me to develop myself more. That's why my articles are so detailed: fact and proof go together. This also attracts people's attention. When you say something, it must have weight.
In my agency, I think there are five different religions and four different national identities. I sometimes learn the languages of my employees from different cultures. In some cases, I research their religion and celebrate their holidays and special days.
As for men and women, I think most of the people in my agency are women. I have female and male team leaders.
I think that to provide "diversity", it is necessary not to focus on it. In my case, I have no expectations either way, whether in terms of gender, origin, or belief. I simply give the opportunity to anyone with a good personality and talent.
When I look back after a while, they already come from different places, histories, and stories. This inadvertently provides diversity.
“I give the opportunity to anyone with a good personality and talent.”
In a few of my projects, especially USA-based projects, I have encountered the situation of being "overlooked". There is a perception of who is "in the centre", from meeting times to other matters. Sometimes, my employees have come back to me about the same issue. I think it's essential to make every employee feel like they belong, regardless of their background. When you achieve this for everyone, kindness and diversity come naturally. And for that, sometimes you have to let some customers go.
Some of these are social skills, and most of them are about where I come from, in Turkey. Another part is professional skills, and a lot of that is about where SEO is going.
My biggest mistake in terms of social skills was assuming that other people were honest and fair too. My father was a civil servant, a teacher. He always instilled righteousness in everyone, and I think I suffer from it too.
I put it this way because, when you work in agencies in Turkey, if you are very good at your job, nobody will like you. Even if you help and close the gaps, you will not be loved. I would always work, get results, and wait for it to be noticed. It took me a long time to see that it doesn't work that way.
Therefore, I stopped waiting to be appreciated and moved quickly to act according to my own wishes. From this point on, I understood the importance of establishing a network and showing your work, as much as achieving success.
This can sometimes be harmful. For example, don't be mad at me, but I call many people "fake industry leaders" because I don't see any unique success or experience behind them. There are people who are rising in the field of SEO without ever doing SEO.
The reason for this is purely networking and empty marketing. In this regard, I would advise newbies in the SEO field to be successful first, and not to overlook marketing. That way, your difference is better revealed.
As for the professional skills, I would definitely like to have entered the fields of front-end development, back-end development, data science, machine learning, and NLP earlier. The reason I learned web development was bad developers: when they didn't do their job, I had to do it myself to get a case study.
My interest in NLP was born out of personal curiosity and, of course, Semantic SEO. I still need to learn a lot in the field of ML, and likewise for data science and visualisation. However, I have done a few things in these areas as well. I wrote a 30,000-word guideline for Data Science and SEO, which is now being turned into training within the Traffic Think Tank.
Therefore, in the field of social skills, I would like to learn to trust people less, and in the field of professional skills, I would have liked to learn to program sooner.
Programming SEO will make a name for itself in the future.
Natural language processing (NLP) will be able to generate content that is largely unique, or it will be able to replace existing content. Even now, some people I know are able to publish 9,000 words of content a day, but the Google algorithm is largely catching up with them. When Google can't catch up, 40,000 clicks per day are possible within 2 weeks.
Since it is related to this subject, I prepared a research paper of 15,000 words. It focuses on the concepts of Google Author Rank, Author Identity, and Expression Identity. When content is produced with AI, the search engine can understand that it does not belong to a real author, tolerate it to some extent, or degrade it from rankings.
Increasingly, everything, including informational content, ends up in a "service" or "product". As this distance decreases, true identity brands and their web entities will come to the fore.
“Programming SEO will make a name for itself in the future.”
In this context, “Natural Language Generation” will be of great importance in the future, but people will need to configure it, make it unique, and train it for an even better "context specification and deepening".
That's why I said the same thing in my part in David Bain's book SEO in 2022. SEOs who can code their Transformers can change or renew a whole network of semantic content in one day.
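As a hedged illustration of "coding your Transformers", the snippet below generates text with Hugging Face's transformers library. GPT-2 is only a stand-in model here; renewing a real semantic content network would involve a far more capable, fine-tuned model and careful human configuration, as noted above.

```python
# Minimal text generation with a Transformer (pip install transformers).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Semantic SEO is the practice of"
result = generator(prompt, max_new_tokens=60, num_return_sequences=1)
print(result[0]["generated_text"])
```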
I also published two SEO case studies within the scope of Semantic Content Network. However, the concept is still not understood. In fact, I will be presenting Semantic Content Networks in Poland at Kulturalnie o SEO. I will be proving this concept and why it is necessary in my Semantic SEO course.
Because the information graph and information content are taken from a cross-content network and placed in a semantic network by the search engine. Even if you produce content with Natural Language Generation, if your semantic network structure is not correct, you will not be able to win.
For example: which entity, which attribute, which sentence structure, and which facts should be given, and in which order; what questions should be asked, and how the design and centerpiece annotation should look; how many links should fall per page, and what the link distance should be.
The concept of Semantic Content Network and Natural Language Generation will be the concepts of future SEO winners. However, at this point, I should add that the job of the search engine is difficult. Due to relative and comparative ranking, a 0.7 PageRank and 94 IR score may be sufficient for a query today, but next year, this may increase even more due to content inflation.
Too much similar content can create different requirements for "similarity threshold" and "content hashing". Between 2000 and 2018, blackhat SEOs, myself included, relied heavily on links and information retrieval. In the future, there will be many SEO connections with large device networks and artificial intelligence. At this point, playing the "mind game" with Google will become even more enjoyable.
Semantic SEO was born from the concepts of the Semantic Web, semantic search, and the semantic search engine. The query processing and even indexing methods of semantic search engines are very different. A semantic search engine can use "entity lists" and "triple stores" instead of "phrase lists".
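A toy illustration of that difference: a phrase list only knows strings, while a triple store records subject-predicate-object facts that can be queried by meaning. The facts and the query helper below are invented for the example.

```python
# Phrase list vs. a tiny hand-rolled triple store.
phrase_list = ["london population", "capital of england", "big ben city"]

triples = [
    ("London", "is_capital_of", "England"),
    ("London", "has_population", "8,800,000"),
    ("Big Ben", "is_located_in", "London"),
]

def query(subject=None, predicate=None, obj=None):
    """Return every triple matching the non-None fields."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

print(query(subject="London"))                         # all facts about London
print(query(predicate="is_located_in", obj="London"))  # what is in London?
```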
In this context, it is necessary to organise a website within the scope of "entity-oriented search". A parallel should be established between your own semantic content network and the search engine's knowledge base, with high consistency and factuality.
In this case, there are concepts such as relevance attribution, dilution, and radius. However, a lot of it is about when to open a new page, or how many separate topics can be put on a page.
Semantic SEO is performing meaning-based optimisation for meaning-based search engines, mainly by utilising concepts such as taxonomy and ontology, onomastics, semantic role labels, named entity recognition, and part-of-speech tags.
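Two of those building blocks, named entity recognition and part-of-speech tagging, can be sketched with the spaCy library (pip install spacy, plus downloading the small English model). The example sentence is arbitrary.

```python
# NER and POS tagging with spaCy's small English model
# (python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Google expanded its Knowledge Graph after acquiring Metaweb in 2010.")

# Named entity recognition: which real-world things does the text mention?
for ent in doc.ents:
    print(ent.text, "->", ent.label_)

# Part-of-speech tags: the grammatical role of each token.
for token in doc:
    print(token.text, token.pos_, sep="\t")
```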
In other words, semantic SEO is talking to the search engine. Even if I don't want to say too much, you can change a search engine's mind, or mislead it. This illusion can also affect the ranking of other web pages.
For this reason, "slow but consistent" reflexes were adopted by search engines over "fast but unstable" reflexes. Here, the "complex adaptive system" should also be mentioned, but that stretches the point. In this context, I have previously covered the query processing methods of semantic search engines - you can read more on my Slideshare.
I also touched on how the concept of Knowledge-based Trust can reduce the dependency on PageRank with Semantic Content Networks in a recent blog for OnCrawl.
"Semantic SEO is to perform meaning-based optimisation for meaning-based search engines."
Therefore, the benefit of semantic SEO for users is that they can use their semantic nature when searching. All humans are semantic creatures. The way we define or associate things is governed entirely by semantic rules. This is why even neural network technology, which is adapted from the human brain, advances with semantic models.
The main benefit of semantic SEO for Search Engines is that it reduces costs and increases user satisfaction. Semantic search engines can spend much less time on document clustering, or query clustering. Entities in the same group have queries of the same type, making query patterns, query pattern generation, question generation, and information extraction much faster.
They can see many links that string-based search engines cannot. The concept of phrase-based indexing should be known in this context. Anna Patterson has published a very nice set of patents. She also wanted to establish a semantic structure with phrase-based indexing, but this was not enough either.
Information extraction means extracting and structuring information, while information retrieval is about understanding the relevance of a document to a group of words. Therefore, information extraction is only valid for semantic search engines.
In this context, it should be understood that a semantic search engine fully understands the content, and the SEO should describe the content accordingly. Semantic SEO is a full conversation with a search engine.
Content hubs provide semantic annotation to web documents while creating a contextual crawl path for search engines.
A content hub is largely for creating topicality differences between query and entity groups, separating minor search intents and major search intents from each other, and supporting them with correct PageRank distribution.
Every possible query of the user and every possible search thought is covered by part of the same content hub. In this context, while the user is satisfied without the need for a second search activity, positive historical data is obtained from other users in the same group.
"Content hubs are largely associated with semantic SEO."
Content hubs are largely associated with semantic SEO. A site can be connected semantically by creating different content hubs. At this point, it is necessary to understand the semantic network of the search engine, which entities are central, and which attributes are more prominent, and to create a context vector accordingly.
In this context, the concept of a Semantic Content Network can be seen as a more detailed and organised version of content hubs, covering everything from the word count of anchor texts to their order, the position of lists and tables, and the ordering of list and table elements.
A content hub can contain different phrase taxonomies and different n-grams. However, related entities and phrases form substantially closer co-occurrence matrices. In this context, activities within the scope of semantic SEO largely parallel content hubs.
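Here is a minimal sketch of a co-occurrence matrix, counting which words appear together within the same sentence; related phrases end up with higher joint counts. The two example sentences are made up.

```python
# Word co-occurrence counts over sentences (stdlib only).
from collections import Counter
from itertools import combinations

sentences = [
    "espresso is a coffee brewing method",
    "a moka pot is a stovetop coffee brewing device",
]

cooc = Counter()
for sentence in sentences:
    words = sorted(set(sentence.split()))
    for pair in combinations(words, 2):
        cooc[pair] += 1

# Pairs that appear together in both sentences score highest.
for pair, count in cooc.most_common(5):
    print(pair, count)
```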
A knowledge graph is essentially a visualisation of a knowledge base. The actual data is kept in the knowledge base. While a knowledge base is in data-frame format, a knowledge graph more often expresses the same data frame as nodes and edges. Every knowledge graph is a knowledge base, but not every knowledge base is a knowledge graph.
Along with this refinement, a knowledge graph is a network of linked knowledge that includes real-world entities and concepts, their interconnections, and their real-world attributes. A search engine aims to constantly expand its knowledge graph, and as it expands, it categorises and clusters the web pages in its index accordingly. This categorisation and indexing process is largely driven by the relationships and properties of entities.
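That data-frame vs. nodes-and-edges distinction can be sketched directly: the same invented facts stored as a pandas data frame (the knowledge base) and re-expressed as a directed graph (the knowledge graph) with networkx.

```python
# Knowledge base as a data frame, knowledge graph as nodes and edges
# (pip install pandas networkx).
import pandas as pd
import networkx as nx

kb = pd.DataFrame({
    "subject":   ["London", "London", "Big Ben"],
    "predicate": ["is_capital_of", "has_population", "is_located_in"],
    "object":    ["England", "8,800,000", "London"],
})

kg = nx.from_pandas_edgelist(
    kb, source="subject", target="object",
    edge_attr="predicate", create_using=nx.DiGraph(),
)

print(kg.nodes())
print(kg.edges(data=True))
```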
The definition of an entity can affect search result pages.
If you get the search engine to recognise that definition, the search result pages will be affected.
If Google always accepts your definition, you determine who is right and who is wrong.
"Understanding the knowledge graph for content marketing is seeing the connections of real-world things through Google and witnessing Google's perception."
Understanding the knowledge graph for content marketing is seeing the connections of real-world things through Google and witnessing Google's perception. Google is expanding its data network in collaboration with the World Data Bank and many other organisations, recording billions of real-world entities with a variety of information and dimensions.
The knowledge graph contributes to search engines in many ways: parsing a query successfully, understanding which source is telling the truth, grouping queries, and seeing whether a search session ends with satisfaction. For SEO, understanding the knowledge graph is a foundation for entity-oriented search.
As an example, consider this SEO case study on InLinks. You can see that when a "beverage" is recorded as a "food" in the knowledge graph, it is addressed with questions that it should not be. In this context, Web Answers (Featured Snippets) and many other features are linked to this.
For example, to the question "how many legs do horses have", Google used to answer that they have 3 wings. Later, they corrected this, thanks to the knowledge graph. Corroborating answers from the open web is a line of research that I suggested and that Google engineers have also benefited from.
Links and link texts say a few things to a search engine:
▪️ Crawl this.
▪️ Index this.
▪️ Relate these two pages.
▪️ Flow the PageRank.
▪️ Register the link text into the anchor tag index.
Considering each of the above constructs properly would require publishing a research paper on links and crawling patterns.
However, when one web page links to another, one segment loses PageRank and transfers it. Using the wrong internal links may cause Google to ignore the internal link network after a while. For example, linking two different web pages with the same word gives Google the ability to rank both pages for that word.
Putting links in lists, tables, or sidebars has different meanings. Using different header menu structures can sometimes create brand identity problems. Using the same link twice may prevent the second from being considered. In this context, there are a very high number of "clickability" considerations across all these points.
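As a hedged sketch of one of these checks, the snippet below finds anchor texts that point at more than one URL, the situation described above where the same word links two different pages. BeautifulSoup is assumed to be installed, and the HTML stands in for a crawled page set.

```python
# Audit: anchor texts that link to more than one internal URL.
from collections import defaultdict
from bs4 import BeautifulSoup

html = """
<a href="/seo-guide">seo basics</a>
<a href="/seo-checklist">seo basics</a>
<a href="/contact">contact us</a>
"""

anchor_targets = defaultdict(set)
for a in BeautifulSoup(html, "html.parser").find_all("a", href=True):
    text = a.get_text(strip=True).lower()
    anchor_targets[text].add(a["href"])

for text, targets in anchor_targets.items():
    if len(targets) > 1:
        print(f"Conflicting anchor '{text}': {sorted(targets)}")
```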