Extraction Rules
Customize your response by adding extraction rules.
WebScrapingAPI allows you to extract specific sections of the webpage. You can do so by using the extract_rules parameter.
This parameter's value can be a string (the CSS selector or XPath) or a stringified object. In the second case, the parameter accepts the following options:
Parameter | Type | Description |
---|---|---|
| | | The CSS selector or the XPath. |
| | | The type of the |
| | | The output format of the selected element. Accepted values are: |
| | | Returns all possible elements. The default value for this parameter is |
| | | Removes leading and trailing white spaces, line terminator characters, and newlines from the result. The default value for this parameter is |
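As a minimal sketch of the two accepted value forms, the snippet below builds a plain string rule and a stringified object rule. The rule name title and the selector option mirror the encoded sample shown further down this page; the h1 selector itself is an illustrative assumption, not a value taken from the documentation.

```python
import json

# Form 1: a plain string holding a CSS selector or XPath.
# "h1" is an assumed, illustrative selector.
rule_as_string = "h1"

# Form 2: a stringified object. The "title" rule name and the "selector"
# option follow the encoded sample on this page; "h1" is an assumption.
rule_as_object = {"title": {"selector": "h1"}}
rule_as_json = json.dumps(rule_as_object)

print(rule_as_json)  # {"title": {"selector": "h1"}}
```

Either value would then be passed as the extract_rules query parameter, after URL encoding.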
A full example of how this parameter would look in production is:
or:
Extraction Rules Integration Examples
Extract Content Based on CSS Rules
GET https://api.webscrapingapi.com/v2
The following example shows how the extract_rules parameter is used to extract specific elements from the targeted website.
Query Parameters
Name | Type | Description |
---|---|---|
api_key* | String | |
url* | String | |
extract_rules | Object | |
The full GET request with the extract_rules parameter should be:
Important! The url and extract_rules parameters have to be encoded
(e.g. &url=https%3A%2F%2Fwww.webscrapingapi.com%2F&extract_rules=%7B%22title%22%3A%20%7B%22selector... ).
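The encoding requirement above can be sketched with Python's standard library. This is only an illustration under stated assumptions: the helper name build_request_url, the YOUR_API_KEY placeholder, and the h1 selector are hypothetical, while the endpoint, the parameter names, and the title / selector keys come from this page.

```python
import json
from urllib.parse import quote, urlencode

API_ENDPOINT = "https://api.webscrapingapi.com/v2"


def build_request_url(api_key: str, target_url: str, extract_rules: dict) -> str:
    """Return the GET URL with url and extract_rules percent-encoded."""
    params = {
        "api_key": api_key,
        "url": target_url,
        # The object is stringified first, then percent-encoded by urlencode.
        "extract_rules": json.dumps(extract_rules),
    }
    # quote_via=quote encodes spaces as %20 rather than "+",
    # which matches the encoded sample on this page.
    return f"{API_ENDPOINT}?{urlencode(params, quote_via=quote)}"


request_url = build_request_url(
    api_key="YOUR_API_KEY",  # placeholder, not a real key
    target_url="https://www.webscrapingapi.com/",
    extract_rules={"title": {"selector": "h1"}},  # "h1" is an assumed selector
)
print(request_url)
```

Building the query string with urlencode, rather than concatenating strings by hand, keeps reserved characters such as :, /, { and " from breaking the request.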
More extract_rules object examples
Here are more examples that should help you better understand what the object passed to the extract_rules parameter should look like:
HTML Sample | Extraction Rule | Rule Description | JSON Output |
---|---|---|---|
| | | Return the text content of the elements having the CSS class | |
| | | Return the | |
| | | Return the | |
| | | Return the JSON format of the first table having the CSS class | |
| | | Return the array format of the first table having the CSS class | |
| | | Return the name and the price of each list item. | |