09/06/17 // Written by Emma Phillips

Google Hangout Sessions Update: Sites Scraping Content, Location Pages and Treating Diacritics

Google frequently runs live Google Hangout sessions online, hosted by its Webmaster Trends Analyst, John Mueller, who fields questions from a select group of search specialists.

Ingenuity Digital’s SEO Technical Account Manager, Dan Picken, attends these sessions to ensure he’s at the leading edge of search, asking the questions that enable us to keep our clients’ businesses fully optimised online.

Below are the highlights from the Google Hangout on the 2nd of June.

Highlights

Are sites scraping your content?

An interesting question was asked about the widespread problem of scraper sites. For those of you who don’t know what scraper sites are, they are essentially websites created to copy other sites’ content with the goal of gaining visibility and potentially revenue (for example, through AdSense).

One of the main issues here (other than it being maddeningly annoying!) is that these sites can rank higher than your site for the content you’ve produced. Although Google does try to recognise when a site is a wholesale scrape of another, that detection is algorithmic in nature, so from time to time it will make mistakes and rank the duplicate content above yours.
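If you suspect a particular site is copying one of your pages, a rough way to quantify the overlap before raising a complaint is to compare the visible text of the two URLs. The sketch below is only an illustration: the URLs are placeholders and the similarity threshold is an arbitrary figure I’ve chosen, not anything Google defines.

```python
# Minimal sketch: estimate how closely a suspected scraper page matches one of
# your own pages. URLs and the 0.8 threshold are placeholders/assumptions.
import re
import urllib.request
from difflib import SequenceMatcher

def page_text(url: str) -> str:
    """Fetch a page and crudely strip tags and whitespace to leave visible text."""
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    text = re.sub(r"<script.*?</script>|<style.*?</style>", " ", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)  # drop remaining tags
    return re.sub(r"\s+", " ", text).strip().lower()

original = page_text("https://www.example.com/your-article")        # your content
suspect = page_text("https://scraper-site.example/copied-article")  # suspected copy

ratio = SequenceMatcher(None, original, suspect).ratio()
print(f"Text similarity: {ratio:.0%}")
if ratio > 0.8:  # arbitrary threshold for "substantially copied"
    print("Looks heavily copied - worth documenting for a DMCA claim.")
```

A similarity report like this is only supporting evidence for your own records; the removal itself still has to go through the legal process described below.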

John Mueller from Google says the best way to deal with this issue is to submit a DMCA (Digital Millennium Copyright Act) claim to get the copied content removed. This is a legal process and so needs to be handled by Google’s legal team, who will assess all requests and “take action if appropriate.”

Google do stress that there is no “back door system” to remove this content; you must go through the legal process below.

You can submit a DMCA claim to take down copied content here: https://support.google.com/legal/troubleshooter/1114905?hl=en-GB

For more information on DMCA claims please visit: https://support.google.com/legal/answer/3110420?hl=en-GB

Here is the video:

https://youtu.be/FCMU2mPhcsE?t=678

How does Google treat diacritics in text on the page or in title tags?

We submitted a question to the Google Plus page to ask John Mueller how Google understands and interprets diacritics in text. For those of you who are not familiar, diacritics are the small accent marks added to letters, such as the caron in “Škoda”.

So how does Google handle these exactly?

Well, according to John Mueller, they simply “treat them as they come.” They “don’t have a linguistic model i.e. this character matches to this one and so all words that use this character can be found like this”; instead, in practice they recognise that certain words are synonyms of one another and try to capture that algorithmically. Their understanding of which words are synonyms is based on user searches, and on that basis they try to surface the relevant results in Google search for users.

He recommends simply writing the content naturally, as you would anyway, and in theory this should “just work out fine”.
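To picture what a fixed character-level mapping would look like (the kind of model Mueller says Google doesn’t rely on), you can fold diacritics out of a string with Unicode normalisation. The snippet below is purely an illustration, for example for comparing spelling variants in your own keyword lists; the example terms are placeholders.

```python
# Illustration only: fold diacritics out of a string so that variants like
# "Škoda" and "Skoda" compare as equal. This mirrors a naive character-mapping
# approach; per John Mueller, Google instead treats such variants as synonyms
# learned from user searches rather than using a fixed mapping like this.
import unicodedata

def strip_diacritics(text: str) -> str:
    decomposed = unicodedata.normalize("NFD", text)  # split base letters from accents
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

for term in ["Škoda", "café", "München"]:  # placeholder example terms
    print(term, "->", strip_diacritics(term))
# Škoda -> Skoda, café -> cafe, München -> Munchen
```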

Here is the video:

https://youtu.be/FCMU2mPhcsE?t=1133

Location pages: one page for all locations or a separate page for each?

A question was asked by a childcare business which has five physical locations in Houston, TX. They asked whether they should have one page covering all five locations or five separate pages, each dedicated to a specific location. Which would perform better in Google?

JM says, “in theory you could go either way”, so there is no right or wrong answer. However, he suggests that if you have several locations and the content is unique to each one, for example different opening hours, then it makes sense to have an individual page per location.

He does stress that one page would be fine, but from the sounds of things this would be less optimal than having a dedicated page for each location.

So, as a rule, I’d recommend having a location page per location where possible, so you can optimise the meta data and content for that location and give it the best chance of ranking in Google search and Google local results.
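As a rough illustration of what “unique content per location” can mean in practice, the sketch below builds distinct URLs, title tags, meta descriptions and opening hours for each location page from a small data set. The business name, slugs and hours are invented for the example, not taken from the question asked in the Hangout.

```python
# Minimal sketch of per-location page data: each location gets its own URL,
# title tag, meta description and opening hours rather than sharing one page.
# The business name, slugs and hours here are invented for illustration.
locations = [
    {"slug": "houston-heights",    "area": "Houston Heights", "hours": "7am-6pm Mon-Fri"},
    {"slug": "houston-katy",       "area": "Katy",            "hours": "7.30am-6.30pm Mon-Fri"},
    {"slug": "houston-sugar-land", "area": "Sugar Land",      "hours": "7am-5.30pm Mon-Fri"},
]

for loc in locations:
    page = {
        "url": f"https://www.example-childcare.com/locations/{loc['slug']}/",
        "title": f"Childcare in {loc['area']}, Houston TX | Example Childcare",
        "meta_description": (
            f"Visit our {loc['area']} centre - open {loc['hours']}. "
            "Directions, staff and enrolment details for this location."
        ),
    }
    print(page["url"])
    print(" ", page["title"])
    print(" ", page["meta_description"])
```

The point of the exercise is simply that each page carries details the others don’t, which is the condition Mueller attaches to recommending separate pages.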

Here is the video:

https://youtu.be/FCMU2mPhcsE?t=1219

Why does Google’s Search Console data take so long to update?

Why do issues such as the crawl errors and the duplication warnings in the ‘HTML Improvements’ section take so long to be re-crawled and updated once you’ve fixed them?

Well, according to JM, the issue lies with Google’s incremental approach to crawling a site. The reasons he gives are the risk of overloading a site’s servers, plus technical reasons on Google’s part (he doesn’t go into detail) why they can’t re-crawl the entire site immediately once you’ve fixed the issues. This comes down to crawl budget, and you can find out more on crawl budget here: https://www.seroundtable.com/google-crawl-budget-23265.html

Because of this incremental approach to crawling, it can take anywhere from two days to two months for a given page to update, depending on the page.

So, some errors will update quickly and others will take some time, depending on the site and the page. He goes on to say that, in practice, when Google sees bigger site-wide changes across a website it will try to crawl the site a little faster than it normally would.

What I would recommend is that once you have fixed the issues you’ve identified in Search Console, download the latest errors (such as the crawl errors report) and use a tool like Screaming Frog to crawl those URLs. You can then find out quickly whether the issues really are fixed, and it’s then just a case of waiting for the data to refresh in Search Console. Just my 2 cents.
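If you’d rather script that check than run a full Screaming Frog crawl, a few lines of Python can confirm that the URLs you exported now respond correctly. The file name and the “URL” column are assumptions for the example; adjust them to match whatever export you’re working from.

```python
# Minimal sketch: re-check URLs exported from a Search Console crawl errors
# report (assumed here to be saved as crawl_errors.csv with a "URL" column)
# and print the live HTTP status, so you can confirm your fixes before the
# report itself refreshes.
import csv
import urllib.error
import urllib.request

def status_of(url: str) -> int:
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 500 if the error is still live

with open("crawl_errors.csv", newline="") as f:
    for row in csv.DictReader(f):
        url = row["URL"]
        print(status_of(url), url)
```

Anything still returning a 4xx or 5xx code needs another look; 200s (or sensible 301s) mean you’re just waiting on Search Console to catch up.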

Here is the video:

https://youtu.be/FCMU2mPhcsE?t=2144