How to use Management-Ware Extract Anywhere to create a script that grabs content from a website or an HTML document
The concept is very simple. The following steps describe the most common scenario, and most website extractions follow this pattern. If you need any help, do not hesitate to contact us!
Click on New Script
To create a new script, click New Script in the menu (Ctrl+N).
Fill in the Script settings
Fill in the Script settings form and click the Save button to apply the changes. Make sure you enter the website page address (URL) where the search will start.
Note: Most of the time, this will be the search page.
Set the search form if applicable
Define the search form fields (you can set a search field as multi-line for automatic search).
Set the submit button if applicable
Define the search form submit button.
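As a rough illustration of what a multi-line search field could mean, the sketch below builds one search URL per line of input. It is not the software's internal code. It assumes that each line triggers its own search and that the form is submitted as a simple GET request, neither of which is confirmed here. The address https://example.com/search and the field name q are hypothetical.

```python
from urllib.parse import urlencode


def build_search_urls(search_page_url, field_name, multiline_value):
    """Return one search URL per non-empty line of a multi-line field value.

    Assumption: each line is submitted as a separate search (GET form).
    """
    terms = [line.strip() for line in multiline_value.splitlines() if line.strip()]
    return [f"{search_page_url}?{urlencode({field_name: term})}" for term in terms]


# Hypothetical search page and field name, two search terms on separate lines.
urls = build_search_urls("https://example.com/search", "q", "plumber\nelectrician\n")
for url in urls:
    print(url)
```

Each resulting URL corresponds to one results page that the later steps would extract from.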
Turn on navigation
Now turn off element capture on click and perform a search to get to the results page. You should set the results page as the default test URL (Use as default... in the top-right navigation menu).
Turn off navigation and set the repeated section
Define the Repeat section that contains all the information you want to extract (on Yellow Pages, each company is a repeat section).
Note: Each repeat section is a row of content (a new contact). Normally your script should contain only ONE Repeat section.
Set the elements to capture
Set each element that you want to extract (e.g., company name, address, telephone). Make sure that each element is inside the repeat section (the repeat section is the parent, like a menu that contains a sub-menu).
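The parent/child relationship between the repeat section and its elements can be pictured with a small sketch. This is only a conceptual illustration, not how the software is implemented. The markup, the class names (listing, name, phone) and the sample companies are made up, and the snippet is assumed to be well-formed so that Python's standard XML parser can read it (real web pages often are not).

```python
import xml.etree.ElementTree as ET

# Hypothetical results page: each <div class="listing"> is one repeat section.
RESULTS_PAGE = """
<html><body>
  <div class="listing">
    <span class="name">Acme Plumbing</span>
    <span class="phone">555-0100</span>
  </div>
  <div class="listing">
    <span class="name">Best Electric</span>
    <span class="phone">555-0101</span>
  </div>
</body></html>
"""


def extract_rows(page):
    """Return one dict (row) per repeat section found in the page."""
    root = ET.fromstring(page.strip())
    rows = []
    # The repeat section is the parent...
    for section in root.findall(".//div[@class='listing']"):
        # ...and each captured element is looked up relative to it (a child).
        rows.append({
            "name": section.findtext("span[@class='name']"),
            "phone": section.findtext("span[@class='phone']"),
        })
    return rows


rows = extract_rows(RESULTS_PAGE)
for row in rows:
    print(row)
```

Because every element is searched for inside its own section, the name and phone number of one company can never be mixed with those of another, which is why each element needs to sit inside the repeat section.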
Set the details page URL
Click on a link on the results page to set the details page URL if applicable. The details URL should be in the repeat section as a child...
Navigate to the details page URL
Now turn off the element capture and navigate to the details page address. Set it as the default test URL.
Set the details page elements to capture
Set each details element. Each element should be a child of the details page.
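To make the idea of a details page concrete, the sketch below follows the details link of each row and merges the details elements into that row. It is a simplified model under stated assumptions, not the software's actual behavior: the URLs, the field names (details_href, email) and the in-memory DETAILS_PAGES dictionary standing in for downloaded pages are all hypothetical.

```python
from urllib.parse import urljoin

RESULTS_URL = "https://example.com/results?page=1"  # hypothetical results page

# Rows as captured from the repeat section, each with its details link.
rows = [
    {"name": "Acme Plumbing", "details_href": "/company/1"},
    {"name": "Best Electric", "details_href": "company/2"},
]

# Stand-in for downloaded details pages: URL -> captured details elements.
DETAILS_PAGES = {
    "https://example.com/company/1": {"email": "info@acme.example"},
    "https://example.com/company/2": {"email": "hello@best.example"},
}


def add_details(rows, results_url, details_pages):
    """Resolve each row's details link and merge the details elements into it."""
    merged = []
    for row in rows:
        # Links may be relative, so resolve them against the results page URL.
        details_url = urljoin(results_url, row["details_href"])
        details = details_pages.get(details_url, {})
        merged.append({**row, "details_url": details_url, **details})
    return merged


full_rows = add_details(rows, RESULTS_URL, DETAILS_PAGES)
for row in full_rows:
    print(row)
```

Each output row combines what was captured on the results page with what was captured on its details page, mirroring the idea that the details URL belongs to the repeat section and the details elements belong to the details page.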
Go back to the results page
Use the Open URL menu to go back to the results page to continue the script editing.
Set the next page navigation URL
Now set the next page if applicable. If the next page is set, the Data Extractor will move through each results page and get the data automatically.
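Following the next page link amounts to a simple loop, sketched below. This is an illustrative model only and makes no claim about the Data Extractor's internals. The PAGES dictionary is a hypothetical stand-in for real page downloads, and the fetch parameter is an assumed helper that returns the rows of a page together with its next page URL.

```python
# Hypothetical stand-in for the website: URL -> rows on that page and next link.
PAGES = {
    "https://example.com/results?page=1": {
        "rows": ["Acme Plumbing", "Best Electric"],
        "next": "https://example.com/results?page=2",
    },
    "https://example.com/results?page=2": {
        "rows": ["City Roofing"],
        "next": None,  # last page: no next page link
    },
}


def extract_all(start_url, fetch):
    """Collect rows from every page, following the next page link until none is left."""
    collected, seen, url = [], set(), start_url
    # The 'seen' set guards against a next link that points back to an earlier page.
    while url and url not in seen:
        seen.add(url)
        page = fetch(url)
        collected.extend(page["rows"])
        url = page["next"]
    return collected


all_rows = extract_all("https://example.com/results?page=1", PAGES.__getitem__)
print(all_rows)
```

If no next page is set, the loop ends after the first page, which matches the "if applicable" wording of this step.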
Video
The video below shows you how to create a script to extract data from the Yellow Pages website. After watching it, you will have a good idea of the concept and of how the software works.