How Flickr Favorites Get Your Images Scraped
Bob Leggitt | Friday, 17 June 2016
Back in early 2013, I wrote a post about Flickr’s attitude to its own ‘licence’ categorisations. Flickr had confirmed to me that the official Flickr API, used by third-party scraper sites to ‘legitimately’ scrape content from Flickr and display it on their own domains, was engineered to allow the scraping and automatic republishing of All Rights Reserved images. In other words, the licensing choices made by Flickr users had no relevance at all to Flickr, and were disrespected by the site’s own redistribution service.
At the time, I was advised by Flickr that if I didn’t want my All Rights Reserved content scraped and republished with Flickr’s blessing, I needed to change my settings. If I reset my ‘Hide your stuff from public searches’ selection to ‘Yes, on 3rd-party sites’, the problem, Flickr assured me, would be resolved. In the course of ‘resolving’ this problem, however, I’d be creating another huge disadvantage for myself. Namely, the search engines would not be able to access my images, so outside of Flickr, none of my content could be found. But hey, that was just my hard luck. Should have spent my time building online scraper sites instead of actually contributing something useful and enriching to the world.
So I took Flickr’s advice, and blocked all third party access. Then I got on with other projects, stopping very occasionally to add the odd image to Flickr.
But recently, I decided to consolidate a proper, unified home for my images. A site I can monetise, and perhaps use to gain a little compensation for all the time I’ve spent creating and publishing my photos.
When adding any of my published photos to a new site, I’ll reverse search them in Google’s index. I do this primarily to see what I’m duplicating and ensure that I don’t end up with a site Google thinks is some kind of copycat venture. So, as I transferred posts from Flickr to their new home, I ran each photo past Google Images. I knew (or at least I thought I knew) the pics were not available to Google from Flickr. So my goal was to determine if I’d posted the shots anywhere else and forgotten about doing so, or if anyone else had reposted the pics without permission.
I wasn’t surprised to see my supposedly inaccessible images appearing on Google. I was, however, surprised to see them indexed not only from a major Flickr scraper site, but also from Flickr itself. How could this be? My Flickr settings have been clear and unchanged since early 2013. All content unavailable to third party sites. So why was Google finding my photos on Flickr, and why was a scraper site using the official Flickr API able to scrape them?
The answer quickly became apparent. Favorites.
Investigating each instance of my images on Google (images Google was not meant to be able to access, remember), I could see that Google’s links did not lead to my Flickr page. They led to the Flickr pages of people who’d Favorited my pics. If those Flickr users made their content available to Google and other third party sites, then Google could feasibly index a 500px version of the photo. But worse than that, a Flickr API-driven scraper site could then somehow access a 1024px version of the photo, which was substantial enough to outrank the 500px Flickr version in search. Ergo, the scraper site gets the top Google result for my images. My images, which Flickr claimed would not be fed to scrapers using the official API.
Flickr’s Favorites function quite literally leaks your content out to the exact parties Flickr tells you you’ve blocked.
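To make the logic of the leak concrete, here's a minimal sketch in Python. It is a toy model only: the `Account` class, the `hidden_from_third_parties` flag and the `exposed_photos` function are names I've made up for illustration, and none of it is Flickr's actual code or API. It simply assumes that a third party can see whatever appears on any account that has not hidden itself, uploads and Favorites alike.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical stand-in for a Flickr account (not a real API object)."""
    name: str
    # Models the 'Hide your stuff from public searches' choice.
    hidden_from_third_parties: bool
    uploads: set = field(default_factory=set)
    favorites: set = field(default_factory=set)


def exposed_photos(accounts):
    """Return photo IDs a third party can reach via any non-hidden account.

    Assumption: a non-hidden account exposes its Favorites as well as
    its own uploads, regardless of what the photo's owner has chosen.
    """
    exposed = set()
    for acct in accounts:
        if not acct.hidden_from_third_parties:
            exposed |= acct.uploads | acct.favorites
    return exposed


# The owner has blocked third-party access; the fan has not.
owner = Account("owner", True, uploads={"photo-1", "photo-2"})
fan = Account("fan", False, favorites={"photo-1"})

# Which of the owner's photos leak out anyway?
leaked = exposed_photos([owner, fan]) & owner.uploads
print(sorted(leaked))  # ['photo-1']
```

In this model the owner's setting only ever protects photos nobody else has Favorited. One Favorite from a non-hidden account is enough to expose the image.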
So, despite the sacrifices I’ve made… Despite taking myself out of search, on Flickr’s advice, in order to keep scrapers’ hands off my work (even though my images are marked as All Rights Reserved and should never have been officially available to scrapers in the first place)… Despite losing over three years of search publicity for my own pages… I now find that Flickr is STILL officially pushing my images to scrapers. And what’s more, it’s now doing so in a way that enables scrapers to gain the premium, number one slot on Google.
The lesson is, if you don’t want to be scraped with Flickr’s blessing, the only way you can achieve your aim is: DO NOT USE FLICKR.
Flickr has no respect whatsoever for the photographers who built it. Its entire mechanism is an illusion, designed to lull posters of original content into the sense that they’re somehow ‘safe’ from the grasping hand of parasites.
I know it would be incredibly naïve to believe that a photo posted on the public Web was in any way immune to unauthorised redistribution. Of course people will steal pictures. The Internet deliberately conditions them to do so. But this is not unauthorised distribution. It’s authorised. By Flickr. Try getting a photo taken down from a scraper site that uses the official Flickr API and see how you get on. Even a DMCA notice will be difficult to enforce, because the scraper is using Flickr’s official API, and therefore the onus is on Flickr to decline the scraper access. With the API as it is, Flickr is tacitly giving scrapers permission to display your All Rights Reserved images. And since you can immediately remove your image from the scraper site by taking it off Flickr, the situation is deemed to be within your control.
You can’t win. By directly feeding the scrapers with your content, against your wishes, Flickr legitimises those scrapers, and effectively immunises them from DMCA notices. And if you take Flickr’s advice, you’re worse off still. The Favorites loophole leaves photographers who block third party access at an even greater disadvantage. Not only do the scrapers still get your content. You're not even on Google to compete with them!
In fairness, I was going to progressively transfer my images away from Flickr anyway, so I’m not going to claim I’m leaving the site in disgust. But that doesn’t mean I’m not disgusted. I am. I’m disgusted by the contempt in which all photographers and image providers' rights are held by the Internet. But most of all, I’m disgusted by the way the Internet pretends it gives some kind of a stuff about the rights and choices of photographers. There is no greater example of this pretence than the example you'll find on Flickr.
Flickr’s behaviour should serve as a lesson for every photographer. Yes, you can trust Flickr, but only to completely disregard the sham user choices it manufactures for PR purposes.