Re: blogs and scrapers
On 25-Jan-08, at 3:42 PM, Lawrence F. London, Jr. wrote:
> Douglas Green wrote:
>
>> A while ago in the "blogging world", the topic of "scrapers" came up.
>> A bunch of blogs went to part feeds as a result of this, in what I
>> think was a misguided sense that they could "protect" their content
>> by doing so. (For the uninitiated, "scraping" is where automated
>> software links into your blog feed and puts the content of that feed
>> up on another website - it "scrapes" the content from your site and
>> puts it on its own.)
>
> Without dragging this out, it seems that you can either allow or
> deny RSS feeds. So, if you allow them, you have no control over who
> uses your blog content on their own sites, where, or for what purpose.
You have no electronic control of any of your site's contents at any
time. Period. I'm sure Chris Lindsey can comment further on this, but
robot scrapers can take content from any site - blog or regular web
page. If it's electronic and they can find it, they can scrape it and
take it. The control you have is the DMCA and the ability to have
their material taken down (always assuming you can find the owner,
etc. - it's black and white in the text of the law but not always
easy to accomplish).
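
(If anyone wants to see just how trivial scraping is, here's a rough
sketch in Python. The feed URL is made up, and it uses the third-party
"feedparser" library - nothing blog-specific is required:)

    # Everything a blog publishes in its feed is there for the taking.
    # Requires the third-party feedparser library (pip install feedparser).
    import feedparser

    # Made-up example URL - any public feed works the same way.
    feed = feedparser.parse("http://example-garden-blog.com/feed/")
    for entry in feed.entries:
        print(entry.title)
        print(entry.link)
        print(entry.get("summary", ""))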
>
>
> I assume this WP plugin at least guarantees an equal exchange of
> blogmatter and gives you the ID of the blog using your content so
> you can contact them or complain, etc.
No - the plugin puts a bit of code on each feed from your blog that
delivers a backlink. Period. You have to use Google, Technorati, or
another search engine to find the sites using your material with this
system. If you don't use this system, you can use copyscape.com.
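
(For the technically inclined: the plugin itself is WordPress/PHP, but
the idea is simple enough to sketch in a few lines of Python - append
an attribution link to each item's content before the feed goes out,
so any copy carries a link home. The post URL and names below are just
examples, not real post data:)

    # Sketch of the backlink idea, not the actual plugin code.
    def add_backlink(item_html, post_url, blog_name):
        # Append an attribution link to the item's HTML content.
        credit = ('<p>Originally published at <a href="%s">%s</a>.</p>'
                  % (post_url, blog_name))
        return item_html + credit

    # Example values only:
    print(add_backlink("<p>Divide hostas in early spring...</p>",
                       "http://example-garden-blog.com/divide-hostas/",
                       "Example Garden Blog"))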
>
>
>> Site content is not particularly "safe" with only a part feed. Sad
>
> What is a "part" feed? I thought RSS feeds either were or weren't.
Nope. With WordPress, you have the choice of putting the entire
article in your feed or only a small portion of it. This is probably
software dependent, I note.
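
(To make the difference concrete: a full feed carries the whole
article in each item, a partial feed only an excerpt. WordPress makes
that cut for you; the Python sketch below just shows what a
partial-feed excerpt amounts to. The word limit is an illustrative
assumption, not a WordPress quote:)

    # What a "part" (partial) feed boils down to: each item carries
    # only a short excerpt instead of the full article.
    def make_excerpt(article_text, word_limit=55):
        words = article_text.split()
        if len(words) <= word_limit:
            return article_text
        return " ".join(words[:word_limit]) + " [...]"

    print(make_excerpt("Full article text goes here ..."))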
>
>
>> In other words, if your site is scraped, you get a link back from the
>> scraping site. This works on two levels. The first is that you get
>> an inbound link
>
> What if you don't want their content?
You're not getting their content. All you're getting is an inbound
link. And that is always good. No, Google doesn't penalize you for
the quality of your inbound links, as you have no control over that
(to forestall another common question). :-)
>
>
>> The second is that if you're really determined, you can find
>> and stomp scraper sites with DMCA complaints by using a
>> "link:your-site" search on Google or by going to Technorati and
>> finding inbounds.
>
> So, you can use this Google command to discover who is using your
> RSS feeds?
You can use Google to discover damn near anything. LOL!
All the best
Doug
Douglas Green
Online Garden Publishing
Blog: http://blog.douggreensgarden.com
Home: http://www.simplegiftsfarm.com