How to scrape a target site for articles

Scraping a URL for content

SEO Content Machine (SCM) can scrape a specific site for articles through its Article Downloader.

Scrape target site

Add a URL by clicking the + sign at the bottom.


Go to the URL grid -> Right click -> Paste the URL from your clipboard.

Click [Download Article]

The above procedure will scrape the content from the input URL.
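For readers curious what "downloading an article" from a URL involves in general, here is a minimal Python sketch. It is not how SCM is implemented; `TextExtractor`, `extract_text`, and `download_article` are hypothetical names, and the approach (keep visible text, skip script and style blocks) is only an assumption about what a basic downloader might do.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def download_article(url, timeout=10):
    """Fetch a page and return its visible text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

Calling `download_article` on a page URL returns the page text as a plain string. A real article downloader would also need to separate the article body from menus and footers, which this sketch does not attempt.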

Scrape target site

If you do not have a list of URLs, you can scrape them using the Article Downloader together with custom sites.

Scraping a target domain

1.    Insert a new custom site. Go to Article Downloader -> Settings tab -> Edit Custom Sites

2.    Add the site you want to scrape by clicking the + sign at the bottom of the grid. In the Domain column, enter the site's domain. For example, for the page http://www.therapytribe.com/therapy/therapy_topics.html, use www.therapytribe.com

3.    Enter Google for SE (the search engine)
4.    Save and check that site

5.    Click on the Keywords tab and enter a keyword. For the sample page above, use “therapy topics”.
6.    Then click [Scrape URLs]

The above will populate the URL grid.

After that, you can click [Download Article].
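The Domain value in step 2 is just the host part of the page URL. As a small illustration (not an SCM feature), the Python sketch below derives it with the standard library; `domain_of` and `belongs_to` are hypothetical helper names.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host of a URL, or '' if it has none."""
    return urlparse(url).hostname or ""


def belongs_to(url, domain):
    """True if the URL's host matches the given domain exactly."""
    return domain_of(url) == domain.lower()


page = "http://www.therapytribe.com/therapy/therapy_topics.html"
print(domain_of(page))
```

The exact-match comparison in `belongs_to` treats www.therapytribe.com and therapytribe.com as different hosts, so use whichever form the site's own URLs use.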

You can also run the downloaded articles through a spinner.

SCM provides a free spinner, Soft Spin.

admin has written 31 articles

4 thoughts on “How to scrape a target site for articles”

    1. admin says:

      You can scrape all articles from a target custom website.
      To find all of them, however, you need to use many keywords, since SCM scrapes Google SERPs.

      Otherwise, you can use a link-crawling program to find all URLs that belong to the site and use SCM to download the content from them.
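As a rough illustration of the link-crawling approach mentioned in this comment, the Python sketch below collects the links on one page that stay on the same host. It is not part of SCM; `LinkCollector` and `same_site_links` are hypothetical names, and a full crawler would also need to fetch each page, follow the links it finds, and respect the site's crawling rules.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Records the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, page_url):
    """Return sorted, de-duplicated absolute links on the same host as page_url."""
    host = urlparse(page_url).hostname
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    links = set()
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.hostname == host:
            links.add(absolute)
    return sorted(links)
```

The resulting list can be pasted into the URL grid of the Article Downloader as described at the top of this article.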

    1. admin says:

      I'll have to tweak this in SCM; the next version will have better URL filtering.

      The output of Google [none] depends on what you selected in custom sources. By default, none means Google isn’t being used.
