As part of our Blogvember experiment, Bill Slawski of Go Fish Digital very kindly agreed to an interview. As a fan of Bill’s blog SEO by the Sea, I was keen to talk patents with him (although we meander off into tech and sci-fi halfway through).
- How do you decipher patents?
- How are patents implemented at Google?
- Are you reading patents that describe something amazing in the future?
- Will Chromecast and online streaming mean an end for broadcast TV?
- Is Google Glass a waste of time or a masterstroke?
- In the future, could Google Glass take the form of a contact lens?
- Predict a groundbreaking invention in the not-too-distant future.
- Have you ever encountered a patent that describes Panda or Penguin type spam detection?
- Is the recent ‘Mugshot’ algorithm opening a can of worms for Google?
- Have you seen a patent on methods that might tackle Negative SEO?
Q1: I have read a few Google patents, and one thing I’ve noticed is that the language they use is often not definitive:
- ‘One implementation of this…’
- ‘Can be calculated…’
- ‘Can be implemented…’
This implies that when they come to implement the techniques, they may only be using a few bits and pieces of what was granted in the patent. However, when I read blog posts that refer to patents, often on well-respected blogs, the writer will jump to conclusions and say things like ‘it will be implemented in this manner’.
What are your thoughts on this? Do you think it’s dangerous for authority figures to be making such claims without clearer evidence?
Note: I’ve never seen this on seobythesea.com!
A1: My usual mode of operation when I’m writing about a patent is to copy the whole thing into Notepad, and then start whittling away at it, removing a lot of the boilerplate and the meaningless legal jargon. That means that phrases such as “In an alternative implementation,” and “One skilled in the art will recognize that” get excised from the patent as I summarize it, and make it more readable for me. I also end up adding a lot of the words “may” and “might” to the existing text, to indicate that some options or things that could be part of the process behind a patent might be added or used.
It’s never possible to completely decide what to leave in and what to remove, since I want as many people as possible to get an idea of what a patent says, but I don’t want to oversimplify things either. I sometimes receive comments that something I’ve written about was too “obvious”, and for most of those, it often wasn’t obvious within the patent itself, but took a lot of work and writing to get something to sound “obvious.”
When I wrote about patents at Search Engine Watch and Search Engine Land a few years ago, Danny Sullivan used to insist that I include a disclaimer for any posts involving patents. He was concerned that people might walk away from a patent post thinking that it is a surefire fact that Google is using what is described in a patent. I’ve also seen a lot of people who read something that describes a potential process or technology that might be used by a search engine, and go on to insist that is what the search engine is using. I try to avoid making any claims like that at all. As Matt Cutts noted in one of his “help” videos earlier this year, just because Google has a patent on something doesn’t mean that they are presently using the technology or approach described in the patent “at this time.” I don’t like repeating a disclaimer like that either, but try not to write about a patent as if it’s clear proof of Google’s use of something. I’d rather write a post that helps to create discussions, raise questions, or spark ideas for experiments.
It is dangerous making claims that Google is doing something a certain way because there’s a patent. But there’s still a lot of value in knowing about a patent that covers certain issues. Those help give us insights into issues that a search engine might have explored, and considered using. They can help provide perspective from a search engine’s point of view (or search engineers’ points of view). They can give us a peek into business issues that surround different topics, and issues such as privacy concerns and technological needs.
When Google was granted the patent I called the “Reasonable Surfer” patent, I started the post by pointing out statements from Matt Cutts and from one of Yahoo’s search engineers that described how neither Google nor Yahoo treated every link on a page as if it passed along the same weight or value but might treat links from main content areas as if they were stronger than links from footers or comments. Adding things like that provides a lot more value than just a summary of a patent. When Google was recently granted a patent that describes how they might treat some content as web spam, where they identify gibberish content on a page, as if it were created by a source such as Mechanical Turk, or through translation into another language and then back to the original language, the patent sounded like something that many people felt that Google had been doing for a while (including me). But it was satisfying to see that written out in a patent, as if it were confirming an intuition about part of how Google works.
Sometimes there really aren’t any “confirming” sources of information that might be pointed to when writing about a process in a patent, though I will often research the inventors listed on the patent to see if there are other closely related patents, or white papers that might be on similar or the same topic. If there are, I like to include those in a post that I write about a patent, too.
Q2: Can you explain a bit about how the implementation process works at Google, regarding patents and implementing new technologies? For instance, say they had developed a breakthrough technique in identifying paid links, and applied for the patent 2 years ago. Would they put the technique into practice before the patent was granted, or would they need to wait?
A2: I really can’t describe the implementation process at Google for patents, because I haven’t experienced it personally, and because it’s quite possible that it differs from patent to patent.
I remember seeing a blog post from one of the Google inventors listed on one of the patents I wrote about, and he said, “the invention behind the patent used less lines of code than this guy’s blog post.” 🙂
Some patents involve changes to what is displayed at the search engine, and may be something very new that Google might want to announce and describe to searchers when they release it. One of the things that is sometimes fun about those is that one of the inventors listed on the patent might write a blog post on one of Google’s blogs announcing and describing the change or update. When Google introduced personalized search, I had written about the patent applications describing that around 6 months before it was launched, and those pending patents did a great job of describing how well it worked.
The same is true of the patent application on Google authorship badges, and how authorship markup at Google works seems to echo what was in the patent application behind it.
Some patents don’t involve any changes to what is displayed at the search engine, like the many different patents that involve phrase-based indexing.
It can be hard to tell if those have been implemented, or to what degree they may have.
When Google released a patent application about named entities in queries, and how Google might assume that a searcher was implying that they wanted to do a site search at a site that might be associated by Google with that named entity, the impact of that process was pretty evident. You could search for something like [spaceneedle hours] in Google, and 8 of the first results returned at Google were from spaceneedle.com, and featured “hours” on those pages.
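As a rough illustration of the idea described above, here is a minimal sketch of how a query containing a named entity might be turned into an implied site search. The `ENTITY_SITES` table and the `implied_site_query` function are hypothetical names invented for this example, and a real search engine would more likely bias its rankings than literally rewrite the query.

```python
# Hypothetical lookup of named entities to the site associated with them.
ENTITY_SITES = {"spaceneedle": "spaceneedle.com"}


def implied_site_query(query, entity_sites=ENTITY_SITES):
    """Rewrite a query as a site search if it contains a known entity.

    Returns the query unchanged when no known entity is found.
    """
    terms = query.lower().split()
    for i, term in enumerate(terms):
        site = entity_sites.get(term)
        if site:
            # Drop the entity term and restrict the remaining terms to its site.
            rest = terms[:i] + terms[i + 1:]
            return " ".join(rest + [f"site:{site}"])
    return query


print(implied_site_query("spaceneedle hours"))
```

Under these assumptions, the [spaceneedle hours] example is treated as a search for “hours” restricted to spaceneedle.com, which is roughly the behaviour that was visible in the results.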
If you wait too long to file a patent, it’s possible to lose the ability to be granted a patent. I don’t think there’s a specific time limit, but would guess that anything used more than a year before a patent was filed on it might be subject to a higher level of scrutiny from a patent examiner than something that a search engine might have only been doing for a shorter period of time.
You don’t have to wait until a patent is granted to start using what is described within it. I saw a patent within the last year or so get granted which described how pages in a directory might be sorted by something like PageRank so that the highest PageRank pages were listed first – like in the Google Directory, which ran for many years before Google discontinued it.
Q3: I guess the thing which I find amazing is the foresight of some of the patents – for instance you were writing about the original Agent Rank patent application back in 2007, and we are only just now starting to really see Authorship and potentially ‘AuthorRank’ come into play. Do you see any patent applications now that you think could be describing something amazing happening in like 5 or 6 years time?
A3: Sometimes I see a patent and wonder if it’s about something that we’ve already had Google or Bing implement or wonder if Yahoo would have if they still had their own search engine. Sometimes a patent may involve something that I’m sure has already happened, and I’m thankful that I now know what the search engineers have been calling things. Those include terms like link merging and domain clustering. The Agent Rank one was a blast to come across in 2007, but it kept me wondering when Google might incorporate some reputation score into the rankings of pages. That seems like something that might happen any day now, and there are a lot of people impatiently asking at places like the Google Plus Author Rank group, “Are we there yet?”
I’ve been asking that since February of 2007 🙁
Google’s summertime presentation this year introduced a number of aspects of the Knowledge Base, and the ability to perform spoken searches on desktop computers, and Amit Singhal told us that Google is only about 1% of the way towards a knowledge base search, which means that Google still has a long way to go on that one. Someone back in 2007 at a conference asked me what was new in search, and I started describing how named entities might be useful in returning information-filled search results. I suspect I was more than a few years early with that answer, and that while we may see more and more of the knowledge base in our search results, the really big aspects of that aren’t going to be uncovered for another 5-6 years. There’s a lot going on with sensors on phones (and smart watches and Google Glass) that I think is going to be incredible, but that we’re going to have to sit on our hands and wait for, too.
It’s actually helpful to not only look at granted patents, but also newly published pending patents, and even white papers. The granted patents more often focus upon older things, the pending applications on things that might be closer in time, and the white papers on things that are still in the middle of happening. Regardless of that though, many of the initiatives that we first learn about in sources like those sometimes end up having more significance when looked at in perspective with others.
Self driving cars? While sounding like science fiction, they collect a lot of data, which is what you want when building a Google Earth and a Google Maps. Android phones and Google Glasses – make it easy for people to crawl the world for you, and phone home things like traffic estimation data and driving times.
We’re probably going to run into a lot of surprises, too. Who knew that something like Google Chromecast would be the biggest selling electronics item at Amazon.com, given Google’s questionable sales of Google TV?
Q4: Yes, it must be immensely satisfying to see features finally coming to fruition when you’ve been anticipating them for years.
The Chromecast thing is an interesting one, as it reflects how the consumption of televisual content has evolved over the last few years. With on-demand, catch-up and online streaming increasingly becoming the norm, do you think we’ll see the end of ‘live TV’?
A4: It is exciting seeing features that I’ve only read about in patents or papers come to life, but part of the excitement is seeing what shape or form they might be presented in when they do arrive. For example, regardless of how many Agent Rank patents and Authorship patents and social annotation papers I’ve read, I would have had a hard time envisioning what Google Plus ended up looking like, or that Google would use authorship markup to come out with something such as In-Depth snippets and articles.
It’s a little like reading advance summaries and rumors about new television shows, and having a sense of what they might be about, and then experiencing them for yourself.
After buying Chromecast, and getting 3 free months of Netflix, I’m not ready to cut the cord. At least not yet. I’d want more out of Netflix before I would do that.
I do end up watching a lot of television through On-Demand, especially since I often work or write through many shows when they are broadcast for the first time. I’m also a big movie fan, but I haven’t been to the cinema in a few years. There used to be one within a short walk from my house, but the closest is now a half-hour drive away, and the cost of going to the theatre is getting up there. Being able to watch a movie on a big screen in my own home, and make my own popcorn is usually a better experience, anyway.
Q5: I can’t believe that they could possibly have planned Google+ as far back as 2007, particularly considering they tried and failed at social a few times in between. I guess it solves a few problems for them all at once. So whilst we’re talking about technology, then, what are your thoughts on Google Glass? Is it the gigantic leap that Google claim? It seems to be just a smartphone for your eye, with a few whistles and bells. Is that unfair?
A5: Developing Google Glass is a pretty smart move on Google’s behalf. Eric Schmidt remarked a couple of years ago that Google attempts to only involve itself in areas that allow the company to be innovative, and in Google Glass, it’s found one of those areas.
Glass isn’t just a smartphone for the eye; it’s a hardware device that enables Google to focus upon a completely different way of interacting with the Web and others by developing a visual paradigm that not only focuses upon visual searches and queries of the kind covered by Google Goggles, which includes object recognition, facial recognition, bar code reading, similar image searches, landmark and location searches, but also enables Google to delve heavily into mobile searches and making location based search technology stronger.
With applications such as Google Now, which uses predictive algorithms to try to deliver information as you may need it and acts as a personal assistant, Google Glass enables you to create your own memories of things that have happened in your life in a format that’s searchable. There were a number of visually-based patents that were granted for Google Glass by Google before the Glass Explorer program even began to allow developers to create more. These include actual ways to augment your senses and your memory.
One of those involves an inward facing camera that can watch where you’re gazing that can be matched up with an outward facing camera, to provide eye tracking information on how people experience the world around them. The cameras can also work together to provide the ability to zoom in on things that might be far away, or provide information about things that you see and want to learn more about.
Mobile devices contain a growing number of sensors, including micro-electronic devices such as gyroscopes, magnetometers, temperature and pressure sensors, as well as GPS and Bluetooth sensors, and data collected from those, and aggregated together might be useful in a number of ways.
Google uses some of these sensors in mobile devices presently to provide traffic estimates in things like Google Maps and Navigation already, but imagine that these types of sensors can be helpful in other ways as well, such as being used to predict weather, or even the possible spread of epidemics and illnesses.
Google recently acquired a couple of hundred patents from Hitachi this past summer, and one of those describes using input from gestures captured on camera (like the cameras that Google Glass has), as computer inputs. Google also acquired the startup Flutter at the end of the summer that also uses gestures captured on webcams to act as a computer interface. Together, these acquisitions seem to point towards a gesture-based interface for Google Glass. Google has also been granted a patent for a user interface that can project something like a keyboard and capture inputs on that keyboard as an interface as well.
There is a realm of work assisted possibilities for Google Glass that we might see as well, such as assisted surgeries and mechanical work where others can watch through the wearer’s glasses and provide their input. Glass can be used to create training simulations and capture images for others to see as well.
It’s the “bells and whistles” that make Glass special, and Google has hired a good number of people who have been working on visual augmented reality for years to work on Glass. Expect Glass to deliver something that is truly innovative in many ways.
Q6: Wow, that actually sounds really cool, like something out of a sci-fi novel. In fact, many ‘future Earth’ stories end up with Glass-esque technology evolving into neural implants that communicate directly with the brain. Whilst that’ll never happen in my lifetime at least, I wonder if we could ever see that sort of thing integrated into a contact lens overlay or something. Do you think Glass development could go in that direction?
A6: I’m not sure if Google will venture into augmented reality contact lenses at any point in the foreseeable future. If you want a look at what that might potentially look like, the Hugo award winning 2007 book Rainbows End by Vernor Vinge includes contact lenses with their own operating systems, augmented reality overlays on those, and has both Microsoft and Google as active participants in the story line in a not too distant future. Part of the action takes place at a protest at Stanford’s Library, where the Google book scanning project involves putting books through a shredder, and then scanning the fragments left and piecing images of those together electronically to reproduce the books.
It wouldn’t surprise me to run across a patent from Google that might cover this possibility, but I don’t know if Google might make the business decision to be involved in the creation or release of such technology. At its roots, Google is still a search engine, and I’m not sure if they would venture too far from those roots.
The whole idea of Google creating actual hardware products was boosted significantly with strong sales of Google’s Chromecast, and they do own a number of patents describing other hardware and devices that they might build upon. Will they develop game controllers or in-house entertainment systems? Those are both possibilities. Google has released a few whitepapers that describe how cloud computing might be used by robots to learn more about the world using technology like object recognition.
It’s hard to speculate too much and too far in the future, but I suspect that we might have some serious surprises headed our way.
Q7: I certainly hope so. Come on then Bill, don your Arthur C. Clarke hat and give us a prediction for a semi-feasible yet relatively groundbreaking tech invention or innovation within the next 20 years (if that’s not too far in the future). You could go down in history here…
A7: Coming up with an innovation that’s worth developing is hard because it requires a deep level of understanding of what will actually make a difference in people’s lives. Creating science fiction that might meet some goals like that is probably easier, especially if those innovations can help drive the plot of a story further along. But not everyone can envision some of the things that we see featured in television shows like Prophets of Science Fiction. Besides, if I let some of those slip, it might ruin plotlines of anything I might want to publish in the future. 🙂
One of the tests that many patents have to face when they are submitted is a three pronged one that asks, “Is it new, is it useful, is it obvious?”
But when it comes to innovation, most patents I see don’t shake the world with a unique view on something that might cause its magnetic poles to shift, and change our views forever. Sometimes they describe something that might just add a little something new, or make small changes. For example, I did see a patent granted in the last year on how to make an easier-to-build snowman, which probably wasn’t too great of an innovation, but made for an interesting read in a Jorge Luis Borges kind of way (especially the part on the history of snowmen). It’s also not unusual, when you’re keeping an eye out for patents from Google and Yahoo and Microsoft (Bing), to see patents that might come up with very similar results, but follow very different paths to arrive at them. The innovation in those isn’t necessarily the thing being created, but rather the path followed.
Q8: A deft sidestep if ever I saw one. OK, slaloming the questions back on course, have you identified patents that described spam detection à la Panda or Penguin?
A8: Patents that describe spam detection for Panda or Penguin?
I suspect that there’s a good chance that the patent behind Penguin is Methods and systems for identifying manipulated articles which uses a link analysis method to try to identify a dense cluster of low quality pages that might try to funnel PageRank to another page, and a content analysis on that page being pointed to. If it isn’t the patent behind Penguin, it’s quite possible that it shares a number of elements with that particular patent.
As for Panda, I’m not sure that there’s a specific patent that we can point to directly, but there are more than a couple that might fit. I did write the post Document Level Classifiers and Google Spam Identification a couple of years ago, looking in more depth at a couple of possibilities, but there are other approaches that could have been taken, and it’s possible that if there’s a patent more on point, that it might not have been granted yet.
Q9: That patent certainly looks Penguin-esque. For those that can’t be bothered to read it, the patent describes methods for identifying documents which link to target articles, that have been ‘manipulated’; and that a manipulation indicator can be used in a ranking function to lower the rank of a document.
It also contains details of methods we’ve seen implemented way before Penguin:
- “…the corresponding document has a large number of keywords without a proportional number of sentences”
- “…the corresponding document includes meta tags having a large number of repeated keywords”
- “…the corresponding document includes more than a threshold amount of text having the same color as a background color of the corresponding document”
- “…the corresponding document includes more than a threshold number of unrelated links”
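To make the four quoted indicators more concrete, here is a minimal sketch of how they might feed a “manipulation indicator” that lowers a document’s rank. Everything in it is an assumption made for illustration: the `Document` fields, the threshold values, the signal names and the halving penalty are invented here and are not taken from the patent.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Hypothetical, pre-extracted features of a page."""
    text: str
    keywords: List[str]  # terms the page appears to target
    meta_keywords: List[str] = field(default_factory=list)
    hidden_text_chars: int = 0  # characters coloured the same as the background
    unrelated_links: int = 0


def manipulation_signals(doc, max_keywords_per_sentence=5.0,
                         max_meta_repeats=3, max_hidden_chars=200,
                         max_unrelated_links=20):
    """Return the names of the indicators that fire (thresholds are made up)."""
    sentences = [s for s in re.split(r"[.!?]+", doc.text) if s.strip()]
    words = re.findall(r"[a-z0-9']+", doc.text.lower())
    targets = {k.lower() for k in doc.keywords}
    keyword_hits = sum(1 for w in words if w in targets)

    signals = []
    # Many keywords without a proportional number of sentences.
    if keyword_hits / max(len(sentences), 1) > max_keywords_per_sentence:
        signals.append("keyword_stuffing")
    # Meta tags with a large number of repeated keywords.
    repeats = Counter(k.lower() for k in doc.meta_keywords)
    if repeats and max(repeats.values()) > max_meta_repeats:
        signals.append("repeated_meta_keywords")
    # More than a threshold amount of background-coloured text.
    if doc.hidden_text_chars > max_hidden_chars:
        signals.append("hidden_text")
    # More than a threshold number of unrelated links.
    if doc.unrelated_links > max_unrelated_links:
        signals.append("unrelated_links")
    return signals


def adjusted_score(base_score, signals, penalty=0.5):
    """Lower a ranking score once for every indicator that fired."""
    return base_score * (penalty ** len(signals))
```

A page that repeats “cheap shoes” in a single sentence, repeats a meta keyword five times and hides 500 characters of text would trip three of the four indicators under these made-up thresholds, and its score would be cut to an eighth.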
OK, so I just want to finish up by talking about a couple of algorithm updates that have been announced recently. One of them went relatively unnoticed – the Google “Mugshot” algorithm – which penalised websites that featured people who had been arrested. The underlying ‘issue’ was that when people’s names were searched, these sites were coming up first, whether or not the people had been convicted.
Is this sort of action opening a can of worms for Google? Where do they draw the line as to what is acceptable and not acceptable to include in their results?
A9: Regarding the Mugshots change at Google, I think that’s an area where search engines are going to struggle with the best thing to do. There’s a lot of information that is publicly accessible and available, possibly because at one point in time it wasn’t very accessible. As a former Court employee, it’s something I’ve seen a lot of. There are ways to get official Criminal Records history from different states, but usually access to that information needs to be accompanied by a release from the person covered by that history, or by a need to access that kind of information as defined by state laws. I needed a security clearance to access that kind of criminal history information, and I supervised clerks in the criminal division of the highest level trial court in my state.
An arrest record and a mugshot isn’t an indication of whether or not someone has been found guilty of a particular crime, and the existence of sites that make such arrest information available to the public can have a profound and negative impact upon people who have been arrested because other people might draw the wrong inferences from that information.
Innocent people are sometimes arrested, and people can sometimes have charges on their criminal records expunged if charges have been dropped, or if they’ve gone through some kind of diversion or probation prior to prosecution type process. If a potential employer wants to perform a criminal history background check on someone, they do have ways to access that information, but it usually requires a legal release form be filled out by the subject of their inquiry, and such records checks are carefully supervised by states and the federal government.
What role should a search engine play when there are sites that scrape such information from public sources? Google decided that sites that display arrest and mug shot type information shouldn’t rank as highly as other sites that might mention the names of people listed on those sites. From what I’ve read, it appears that many, if not most, of these sites listing arrest records charge people a “removal” fee to have their information taken down.
Q10: I guess few people could argue that their stance (regarding the Mugshots) is fair, but if it starts a trend of ‘censorship’ then I think it could upset a few people. Apparently during his PubCon keynote, Matt Cutts mentioned that “child porn, international issues and really nasty queries are being addressed”, so we may well see a few more of these very targeted updates in the coming months.
Another controversial topic is negative SEO, which Google have tried to claim doesn’t have an impact. I have very recently come across a very clear example of negative SEO getting a site heavily penalised by Penguin 2.1. From my observations the spammy links are very obviously negative SEO, but they have managed to trick Google’s algorithms – clearly this is something they need to work on further.
Have you ever seen a patent on methods that might identify manipulative linkbuilding specifically to damage another website? What is your opinion on negative SEO?
A10: I can’t say that it bothers me in any way that Google might reduce the rankings of pages that are set up purposefully to cause harm to people, and to extort them into paying to have a mugshot and arrest record removed from the web, that were put there under an argument that those sites are “protecting the public.” There are limitations to free speech, such as a prohibition against shouting “fire” in a crowded movie theatre when there is none. We have trials and court proceedings to determine whether or not someone is guilty. Given the “pay for removal” approach that these sites are taking, their motivation seems more inspired by cash than protecting anyone.
Google doesn’t have any patents that I’ve seen that are written about negative SEO, or that mention it in any way. There are people who attempt to manipulate rankings in search results in many ways, including ones that might get a competitor’s site penalized by the search engines. Google representatives have said that it is possible to be harmed in that manner, but that the best approach is to build or attract quality links to your site, publish content that provides value to your audience, and to follow Google’s guidelines.
Google has become more transparent regarding penalties in the past year or so, including details about “manual” actions that they may have taken about sites, so they are working on this. It’s likely that most algorithms that come out these days anticipate the possibility that they might be manipulated in some manner, and attempt to make the cost of that manipulation high enough that it is easier and less expensive not to attempt to manipulate those ranking signals.
I have seen at least one patent that focuses upon identifying manipulative link building, including the creation of a lot of interlinked low quality pages that tend to only link to each other and to a page that may attempt to push PageRank towards another page in the fashion of a doorway page. Chances are good that Google didn’t need to pursue too many patents that spell out how they might try to stop people from spamming it. Google has been involved in the AIRWeb (Adversarial Information Retrieval on the Web) workshops over the past few years where they’ve been sharing information about Web Spam with others, including Microsoft and Yahoo and academics, for the mutual benefit of all. A lot of interesting papers have come out of the workshops that describe ways to take action against spam and spammers.
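The link pattern described above, a group of low quality pages that mostly link to each other and to one page they are trying to boost, can be sketched in a few lines. This is only an illustration under stated assumptions: the `suspicious_targets` function, the `links` and `low_quality` inputs and both thresholds are hypothetical, and it is not the method from any particular patent.

```python
def suspicious_targets(links, low_quality, min_cluster=3, min_internal_ratio=0.8):
    """Find pages that a tight cluster of low quality pages all point to.

    links: dict mapping each page to the set of pages it links to.
    low_quality: set of pages already judged to be low quality.
    Returns a dict mapping each suspicious target to its sorted feeder pages.
    """
    # Candidate targets: anything a low quality page links to outside that set.
    candidates = {
        out
        for page in low_quality
        for out in links.get(page, ())
        if out not in low_quality
    }
    flagged = {}
    for target in candidates:
        feeders = {p for p in low_quality if target in links.get(p, ())}
        if len(feeders) < min_cluster:
            continue
        total = 0
        internal = 0
        for page in feeders:
            for out in links.get(page, ()):
                total += 1
                # Links that stay inside the cluster or go to the target.
                if out in feeders or out == target:
                    internal += 1
        if total and internal / total >= min_internal_ratio:
            flagged[target] = sorted(feeders)
    return flagged


links = {
    "a": {"b", "c", "money"},
    "b": {"a", "c", "money"},
    "c": {"a", "b", "money"},
    "blog": {"money", "news"},
}
print(suspicious_targets(links, low_quality={"a", "b", "c"}))
```

In this toy graph the pages a, b and c link only to each other and to “money”, so “money” is flagged. Low quality pages that link widely across the web would fall below the internal-link ratio and would not be flagged.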
My opinion about negative SEO? I’d much rather spend my time and energy trying to build something creative and positive and remarkable than something harmful and destructive that might hurt people’s lives and livelihoods. I’m tired of seeing people refer to “black hat” and “white hat” SEO, as if it’s something like a John Wayne western.