Is the Knowledge Graph Ethical?

“A system of morality which is based on relative emotional values is a mere illusion, a thoroughly vulgar conception which has nothing sound in it and nothing true” – Socrates

Socrates posited that an ethical argument based on emotion was not one worth discussing, yet as SEO consultants, we have been guilty of this in recent weeks. Back in 2012 Google introduced a feature to the search results called the Knowledge Graph. It gave users a level of interaction with the SERPs that they had never seen before (or since). There has been much talk of this aspect of search over the past few months, but the ethical questions around the implementation of raw data into search deserve to be discussed with logic at the forefront. When discussing ethical matters like this, I’ve long been an advocate of the voice of the collective, so for this post I have decided to surround myself with people much more intelligent than I, such as Bill Slawski, Dr Pete Meyers and Gianluca Fiorelli. We discussed the topics in question briefly over email, and here are their answers:

With Google expanding to include more panels pulled from pages without markup, how do you see information retrieval affecting brands, publishers and retailers alike?

Bill Slawski

The purpose behind knowledge panels is really two-fold. The first is to improve discoverability, to make it easier for people who don’t know a topic well to learn more, so that they have related information and topics to search for. The second purpose is similar to that of a snippet in search results. Knowledge panels provide a representation of the entities they are about, and include some disambiguation information when there are other entities or concepts by the same name, so that a searcher can explore those as well. In neither instance is the purpose to replace web pages or documents that might be pointed to by Google, but instead to give people more to search for from the search engine, including, in many instances, topics that people have historically searched for next after searching for the original entity.

Dr Pete

I think it depends a lot on the vertical. It’s easy to look at a quick answer derailing a result and see nothing but bad news. It’s fair to ask, though – if your business is nothing but aggregating easy answers (plus ads, most likely), how much value do you add? Sites that listed dates for major holidays provided a service for a while and made good money on ads, but now that Google can answer a question like “When is Christmas?”, that business model is over. Being brutally honest, though – it wasn’t a very strong model to begin with. On the other hand, imagine you’re a local restaurant, and Google is serving up a rich knowledge panel with your photos, address, telephone and today’s operating hours. Have they potentially taken a click from your website? Sure, but does that matter? They’ve made your brand look more credible and given people the information they need to find you. If those people walk in the door, it doesn’t matter where the information comes from. I’m not arguing about Google’s intent or responsibility to webmasters (I think they’ve milked “good for users” a bit too hard lately). I’m just saying that the impact on your business can vary wildly. Some people will do well.

Gianluca Fiorelli

I think it is already doing it, if what implementation data tell us about the real use of schema.org and other structured data is true: it is quite small with respect to the total number of web documents indexed by Google. A very simple example is how Google is able (well, not always) to interpret authorship thanks to the by-line, even when rel=”author” is absent. How are brands, publishers et al going to be affected? I think that at first they will notice a traffic decrease, probably… But what they will also see will be – IMHO – a better quality in the traffic that they still receive, including from Knowledge Graph navigation. They will lose traffic that tends to bounce a lot or that is never going to convert. Moreover, if website owners/SEOs are able to monitor and control what Google is “scraping” from them, they can gain visibility above the fold in the SERPs, which is quite precious right now, as the visibility of organic search snippets is shrinking.

Many see Google’s expansion of the Knowledge Graph to include more and more terms as aggressive. Do you, and would you ever, recommend against schema.org or other microformats to limit the information passed to search engines?

Bill Slawski

Search engines have been working to extract structured data from the somewhat unstructured nature of web pages for a long time. The labels from microformats and schema might make it easier for a search engine to extract information from a page, and if you want your page to be a source of such information, including that kind of markup isn’t a bad idea. I can envision some people portraying Google’s knowledge base as “aggressive”, and there have been people who have written about search engine bias, and a desire for search engines to show their own properties instead of those from original sources. But often those other properties are just more finely focused vertical searches.
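
To make the kind of markup discussed above concrete, here is a minimal sketch (not from any of the contributors) of schema.org data for a local business, built in Python and emitted as JSON-LD. The business details are hypothetical, and the function names build_restaurant_jsonld and to_script_tag are made up for illustration; only the type and property names (Restaurant, PostalAddress, name, telephone, address, openingHours) come from the public schema.org vocabulary. Whether any of this ends up in a knowledge panel is entirely up to Google.

```python
import json


def build_restaurant_jsonld(name, street, locality, telephone, opening_hours):
    """Return a schema.org Restaurant description as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Restaurant",
        "name": name,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
        },
        "openingHours": opening_hours,
    }


def to_script_tag(data):
    """Wrap the dict in the script element used to embed JSON-LD in HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical business details, for illustration only.
example = build_restaurant_jsonld(
    name="Example Bistro",
    street="1 Example Street",
    locality="Cardiff",
    telephone="+44 29 0000 0000",
    opening_hours=["Mo-Fr 12:00-22:00", "Sa 12:00-23:00"],
)
print(to_script_tag(example))
```

Pasting the printed script element into a page’s HTML is one common way to publish this kind of structured data; microdata or RDFa attributes on the visible markup are alternatives.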

Dr Pete

There may be isolated cases, but in general, I wouldn’t recommend that. Google is going to find ways to extract data from someone, somehow. Either you can control that data and make sure it comes from you, or it can either (a) come from a competitor, or (b) come from you however Google finds and mangles it. From a purely commercial standpoint, I’m not sure what choice we have but to play the evolving game.

Gianluca Fiorelli

No, I wouldn’t. What I would suggest, and actually what I have been suggesting to my clients for some time now, is to craft their content so as to have “answers” ready to be used by Google in the Knowledge Graph and Answers box, but to put special effort into offering in-depth content on the same page. For instance, taking as an example a site offering IP information: if it just answers a question like “what’s my IP” with the IP number of a domain name, then that site is going to sink due to the Answers box. But if on the same page the site offers deeper information, such as what other domains are hosted on the same IP, what country that IP is assigned to, what historical information we can find about that IP, whether that IP was ever flagged for malware and what kind of malware, and so on, then we are offering information that will be valuable to users and that Google cannot offer with a simple answer.

Many webmasters have complained about results containing scraped data, but in your opinion is Google doing anything wrong? Is there any logical or ethical argument (from a user perspective) against Google presenting scraped data within panels?

Bill Slawski

One of the tenets of copyright is the concept of fair use, and there’s a four-pronged test for whether a use of someone’s artistic work is or isn’t fair use. Facts themselves aren’t something you can copyright, though unique compilations of facts have been shown to be. So, Abraham Lincoln’s height isn’t something that you can copyright, and the fact that Bill Clinton plays the saxophone isn’t either. If a summary of facts is shown in a knowledge panel from a templated Wikipedia biography box, that information isn’t necessarily going to stop people from visiting the Wikipedia page, and may actually encourage more people to visit it.

Dr Pete

I think they’re starting to tip the balance. Google will argue that this data is good for users and that they’ve made webmasters a lot of money over the years. This is true, and we should be honest and admit it. Many of us have made a lot of money off of Google and they leveled the playing field for a while for small business. On the other hand, they make $60B/year, and the vast majority of that comes either from putting advertisements on search results extracted from our sites (AdWords) or from ads placed directly on our sites (AdSense). There’s always been an implied promise – Google will make money from our data, but in return they’ll drive traffic back to us. Once they start to extract answers or create knowledge panels that just link to more Google searches, the relationship starts to break. Is that illegal? No. Is it unethical? I think it’s a broken promise, even if the promise is implied. I think they run the risk that, pushed too hard, we may block our sites and abandon Google. They still hold most of the power, admittedly, but I don’t think they should take the balance lightly.

Gianluca Fiorelli

My first reaction, as a marketer, is not really a happy one when I see Google “scraping” an answer from a site. But as a user I must admit that it really makes my life easier, and if the answer is followed by a link to the source (and that link should be more visible as such, not in light grey), I find myself clicking on that link many times, and with far more convinced interest than when I find the same hint in an organic search snippet. And that is surely better for the website owner too. So… after calmer reflection, what I think Google is doing is not really scraping, but: a) offering an immediate answer for those who are looking for just that, especially on mobile; b) doing a sort of curation of its own indexed data.

Where do you see the Knowledge Graph expanding to by 2020?

Bill Slawski

I can see more people working to help expand the amount of information shown in knowledge panels by 2020. We will see information that is publicly accessible but not necessarily publicly available on a wide scale showing up in knowledge panels, Google Now cards, or Google Field Trip cards. These will include things like information from historical marker programs, inscriptions on landmarks and memorials, or from documents like historical register applications.

Dr Pete

I strongly expect the on-the-fly Knowledge Graph to expand rapidly. Google can’t rely on human-edited databases for entity data – they have to be able to create entities and relationships directly from their index. Honestly, though, that expansion will happen in 2014-2015. By 2020, Google will have made the SERP completely modular, allowing for any variation of device, screen, resolution, etc. Ten-result pages will be gone and replaced with fully dynamic combinations of knowledge panels, targeted results (maybe just one or a handful, depending on the use case), and entity/relationship browsing. I’d expect something less linear and more mind-map style, especially for data on people, places, and things. I’d also expect the Knowledge Graph to expand into social and be more and more personalized. Part of that is already available in Google Now cards, but I’m not just talking about things like your flight status. I think Google will try to extract your own relationships and build on your network. There’s a huge untapped commercial potential in being able to personalize product recommendations built on your trust of your own connections, for example. Your Knowledge Graph experience and mine in 2020 may be completely different.

Gianluca Fiorelli

It’s hard to know or even predict. What I expect is that Google will start looking at ways to prevent people from “spamming” the Knowledge Graph itself, which is now theoretically possible (and easy), as we can manipulate the sources from which a large part of the information is pulled.

In summary

The question of ethics surrounding the Knowledge Graph will no doubt continue for many months or years, but there is one fact that is not going away: users love it. Providing answers within the search results not only gives users access to information at a glance but also lets them do all this within Google’s environment. That’s good UX. To paraphrase Socrates once more: “From the user’s deepest desires often comes the SEO’s deadliest hate.” While the Knowledge Graph continues to give users a superior search experience, we can expect Google to display more and more information within the SERPs. Ethical or not…

More on influencing the Knowledge Graph here but, as always, let’s discuss in the comments!

  • http://www.barryadams.co.uk/ Barry Adams

    This is a good start to the debate around the ethical implications of the Knowledge Graph. I hope others will delve into this more thoroughly in upcoming months, as I feel it deserves an in-depth debate.

    For example I think, like Pete, that Google is skirting the edges of what is morally acceptable. Yes if a site offers nothing but easy answers it doesn’t have a particularly rich business model. But if a website has worked hard for many years to collate a whole range of obscure facts about a specific niche, and has created a great database of these facts, is it morally acceptable for Google to scrape this database and serve up those facts directly in its SERPs?

    In the Netherlands this week a judge argued that an aggregator site that scrapes another site’s database in order to present that information as its own and leech traffic from the original site (in that case a job vacancy site) violates the original site’s copyright.

    Arguably Google is doing the same thing on an increasingly large scale, and at some stage ‘fair use’ does tip over into content theft, especially if that content was created/collated with significant effort. And its excuse that this is solely for a better user experience becomes a bit hollow if it starts monetising those Knowledge Graph entries with relevant ads…

    • http://www.andrewisidoro.co.uk/ Andrew Isidoro

      I agree. A few paragraphs is in no way enough to iron out the arguments needed for a topic like this but I’m happy that it has clearly struck a chord with many.

      Though I agree with your sentiment, I’d like to play devil’s advocate for a second with two counter questions to extend the argument:

      1. If a company holds a large database of known facts, do you not agree that the replication of their service in a more user-accessible way is an advancement in the eyes of the common user, and therefore the public at large?

      2. Does making the user experience paramount to their search product make the Knowledge Graph unethical if a by-product is revenue?

      I’m not for one second indicating that Google is naive in its approach here, but I do wonder if a service from a smaller vendor (with less “notoriety” for doing “evil”) such as Bing or Yahoo would have the same reception.

      Would love to hear your thoughts…

      • http://www.barryadams.co.uk/ Barry Adams

        1. Aye, in the short term it will definitely result in a better user experience – fewer clicks to get to the desired action. What Google perhaps doesn’t realise is that they’re disincentivising companies from creating these types of databases in the first place or, more likely, causing these companies to go bust, therefore depriving Google of that Knowledge Graph information (and thus reducing user experience in the long run).

        Though at such a time Google has already harvested the info they need to fill the KG, so maybe they’re all too aware of the damage they’re causing and simply don’t care.

        2. Google’s marketshare is definitely a core aspect of the issue here, as their market dominance does mean that if they decide to ‘replace’ a website’s features with in-SERP functions, that website is very likely to see a severe decrease in traffic (and revenue) as a result of Google’s pervasive presence. Smaller search engines would not have such a dramatic impact on other websites’ traffic.

        So yes, Google’s monopoly in search is truly the crux of the matter.

  • JJ Grice

    In hindsight, I think we’d all be doing the same thing if we were to put ourselves in Google’s shoes. At the end of the day, one of their main objectives is to enhance UX, which, in fairness, the Knowledge Graph does perfectly.

    Our job as SEOs continues to evolve at a rapid rate. We are tasked with a lot more than simply driving traffic to a site via Google; for me, that alone doesn’t quite cut it anymore. If we are living in fear about whether or not the KG will steal our traffic then we are surely setting ourselves up to fail.

    I wholeheartedly agree with Dr. Pete’s point that the potential success the KG could bring is highly dependent on the business model of the site. Ethical or not, are webmasters really going to start banning Google from their sites? I personally can’t see it.

    • http://www.andrewisidoro.co.uk/ Andrew Isidoro

      I agree. I think while SEOs have to evolve, it’s also our job to inform our companies and clients of the effect on the business model as a whole.

      Dr Pete made an excellent point that highlights exactly that.

  • Jeremy Niedt

    Great article guys, three authors (and one host) I respect very highly.

    Impact on Brands:

    For the most part I agree with what everyone said; I think that the KG improves the search experience overall. I don’t completely agree with Dr. Pete’s statement on restaurants, as I think the local KG results leave a lot to be desired, but as they evolve I could see this being true.

    Application of Microdata:

    “Either you can control that data and make sure it comes from you, or it can either (a) come from a competitor, or (b) come from you however Google finds and mangles it.” – Nail on the head

    Ethical Nature:

    I agree, Google’s definitely treading the line as of late. I fully support, as Bill said in the first section, using KG data to improve searcher experience through disambiguation and linking related searches. I think where they start to fail Bill’s fair use test is with KG panels that try to replace web pages and, as Dr. Pete said so perfectly, break their implied promise.

    The Future:

    Great insights, curious what we should do to prepare…

    • http://www.andrewisidoro.co.uk/ Andrew Isidoro

      In all honesty no-one is fully “clued up” on how to combat the Knowledge Graph issue. I have been researching this for over a year with limited success, and I know this is an area that Bill, Pete and Gianluca are also very interested in.

      I recently gave a talk at BrightonSEO that might help give a basic understanding of what I have found so far but it is by no means comprehensive: http://www.andrewisidoro.co.uk/blog/brighton-seo-hacking-the-knowledge-graph/