GET /v1/scrape
curl --request GET \
  --url https://api.screenshotmax.com/v1/scrape

It’s simple to use: you only need to submit your access_key and the url of a webpage. The API returns the content of that webpage.

Getting started

REST

The Scrape API, like all of ScreenshotMAX’s APIs, is organized around REST. It is designed to use predictable, resource-oriented URLs and to use HTTP status codes to indicate errors.

HTTPS

The Scrape API requires all communications to be secured with TLS 1.2 or greater.

API Versions

All of ScreenshotMAX’s APIs are versioned. The Scrape API is currently on Version 1.

Your Access Key

Your access key is your unique authentication key to be used to access ScreenshotMAX APIs. To authenticate your requests, you will need to append your access key to the base URL as a query parameter for GET requests. You can also use the X-Access-Key header to pass your access key. You can find your access key in your account dashboard.
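
As a minimal sketch of the two authentication options described above, the following Python builds the same request either with the access_key query parameter or with the X-Access-Key header. The helper names are hypothetical, and nothing is sent over the network; the functions only construct request objects.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.screenshotmax.com/v1/scrape"


def request_with_query_key(access_key: str, target_url: str) -> Request:
    """Pass the access key as the access_key query parameter."""
    query = urlencode({"access_key": access_key, "url": target_url})
    return Request(f"{BASE_URL}?{query}", method="GET")


def request_with_header_key(access_key: str, target_url: str) -> Request:
    """Pass the access key in the X-Access-Key header instead of the URL."""
    query = urlencode({"url": target_url})
    return Request(
        f"{BASE_URL}?{query}",
        headers={"X-Access-Key": access_key},
        method="GET",
    )
```

The header variant keeps the key out of the URL, which can help when request URLs end up in logs.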

Base URL

https://api.screenshotmax.com/v1/scrape

Validation endpoint

ScreenshotMAX’s Scrape API simply requires your unique access key and url to be passed in the URL. The API will return the content of the webpage.

https://api.screenshotmax.com/v1/scrape
?access_key=YOUR_ACCESS_KEY
&url=https://example.com

This was a successful request, so the API returned a 200 OK response. The content of the webpage is returned in the body of the response.
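
The request above could be issued from Python roughly as follows. This is a sketch under a few assumptions: the function name scrape, the injectable opener argument (included so the call can be stubbed in tests), the 60-second client-side timeout, and UTF-8 decoding of the body are all illustrative choices, not part of the API.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.screenshotmax.com/v1/scrape"


def scrape(access_key: str, target_url: str, opener=urlopen) -> str:
    """Fetch the content of target_url through the Scrape API.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    normal return means the API answered with a success status.
    """
    query = urlencode({"access_key": access_key, "url": target_url})
    # The client-side timeout of 60 seconds is an assumption for this sketch.
    with opener(f"{BASE_URL}?{query}", timeout=60) as response:
        # Assumes the body is UTF-8 text; undecodable bytes are replaced.
        return response.read().decode("utf-8", errors="replace")
```

A call such as scrape("YOUR_ACCESS_KEY", "https://example.com") would return the page content as a string.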

Request parameters

access_key
string
required

Your unique access key. You can find your access key in your account dashboard.

url
string
required

The URL of the webpage you want to scrape. Must be a valid URL and accessible from the internet. If the URL contains a query string, it must be URL-encoded.

For example, https://example.com/test?param=1 should be passed as https%3A%2F%2Fexample.com%2Ftest%3Fparam%3D1.
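
The encoding shown above can be reproduced with Python's standard library. This is a small sketch: quote with safe="" percent-encodes every reserved character, and urlencode applies the same encoding when it builds the full query string.

```python
from urllib.parse import quote, urlencode

target = "https://example.com/test?param=1"

# Percent-encode every reserved character, including ":", "/", "?" and "=".
encoded = quote(target, safe="")
print(encoded)  # https%3A%2F%2Fexample.com%2Ftest%3Fparam%3D1

# urlencode encodes each value the same way while assembling the query string.
query = urlencode({"access_key": "YOUR_ACCESS_KEY", "url": target})
print(query)
```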

format
string
default:"html"

The format of the response. Available formats are html and md. The html format returns the HTML content of the webpage, while the md format returns the content in Markdown format.

js_enabled
bool
default:"true"

Whether to enable JavaScript on the page. If set to false, the API will return the HTML content of the page without executing any JavaScript.

gpu_rendering
bool
default:"false"

Whether to use GPU rendering. Only available on the Scale paid plan.

capture_beyond_viewport
bool
default:"false"

Whether to capture content beyond the viewport.

viewport_device
string

The device type for the viewport.

viewport_width
number
default:"1280"

The width of the viewport in pixels.

viewport_height
number
default:"1080"

The height of the viewport in pixels.

viewport_landscape
bool

Whether the viewport should be in landscape mode.

viewport_has_touch
bool

Whether the viewport has touch capabilities.

viewport_mobile
bool

Whether the viewport is a mobile device.

device_scale_factor
number

The device scale factor for the viewport.

block_annoyance
string
default:"cookies_banner"

The annoyance to block. Options include none, cookies_banner, ads, tracking.

block_ressources
string

The resources to block. Options include document, stylesheet, image, media, font, script, texttrack, xhr, fetch, eventsource, websocket, manifest and other.

media_type
string
default:"screen"

The media type for the rendering. Options include screen and print.

vision_deficiency
string

The vision deficiency for the rendering. Options include reduced_contrast, blurred_vision, deuteranopia, achromatopsia.

dark_mode
bool
default:"false"

Whether to use dark mode for the rendering.

reduced_motion
bool
default:"false"

Whether to reduce motion for the rendering.

geolocation_accuracy
number

The accuracy of the geolocation in meters. Minimum is 0. Maximum is 1000.

geolocation_latitude
number

The latitude of the geolocation. Minimum is -90. Maximum is 90.

geolocation_longitude
number

The longitude of the geolocation. Minimum is -180. Maximum is 180.
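
The documented ranges for the three geolocation parameters can be checked on the client before a request is sent. The helper below is a hypothetical sketch; the default accuracy of 0 is an assumption made for illustration.

```python
def geolocation_params(latitude: float, longitude: float, accuracy: float = 0) -> dict:
    """Validate the documented ranges and return the query parameters."""
    if not -90 <= latitude <= 90:
        raise ValueError("geolocation_latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("geolocation_longitude must be between -180 and 180")
    if not 0 <= accuracy <= 1000:
        raise ValueError("geolocation_accuracy must be between 0 and 1000 meters")
    return {
        "geolocation_latitude": latitude,
        "geolocation_longitude": longitude,
        "geolocation_accuracy": accuracy,
    }
```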

attachment_name
string

The name of the attachment, without the file extension. This is the name that will be used when downloading the response. The extension is added automatically based on the format parameter.

timezone
string

The time zone for the request. This allows you to simulate different time zones. Available time zones are those listed in the IANA Time Zone Database.

authorization
string

The authorization header to use for the request. This should be a base64-encoded string (e.g., for Basic Auth, encode “username:password” using base64). This allows you to authenticate with the webpage before capturing the content.
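
For the Basic Auth case mentioned above, the base64 value can be produced as sketched below. The helper name is hypothetical. Whether the API expects the bare encoded string or a value prefixed with a scheme such as "Basic " is not stated here, so this sketch returns only the encoded credentials.

```python
import base64


def basic_auth_value(username: str, password: str) -> str:
    """Base64-encode "username:password" as used for Basic Auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


print(basic_auth_value("username", "password"))  # dXNlcm5hbWU6cGFzc3dvcmQ=
```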

user_agent
string

The user agent to use for the request. This allows you to simulate different browsers and devices.

cookies
string[]

The cookies to use for the request. This allows you to simulate different sessions and states. Example: cookies=name=value; name2=value2.

headers
string[]

The headers to use for the request. This allows you to simulate different requests and responses. Example: headers=header1:value1; header2:value2.
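
The cookies and headers examples above use "; "-separated pairs, with "=" between a cookie name and its value and ":" between a header name and its value. A minimal sketch for producing those strings from dictionaries follows; join_pairs is a hypothetical helper, and the resulting values still need URL-encoding when placed in the query string (urlencode does this automatically).

```python
def join_pairs(pairs: dict, separator: str) -> str:
    """Join name/value pairs as "name<sep>value; name2<sep>value2"."""
    return "; ".join(f"{name}{separator}{value}" for name, value in pairs.items())


cookies = join_pairs({"name": "value", "name2": "value2"}, "=")
headers = join_pairs({"header1": "value1", "header2": "value2"}, ":")

print(cookies)  # name=value; name2=value2
print(headers)  # header1:value1; header2:value2
```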

ip_location
string

The IP location to use for the request. This allows you to simulate requests from different countries by routing them through proxy servers with corresponding IP addresses. This feature is only available on the Scale paid plan.

Supported locations:

  • United States (us)
  • China (cn)
  • Europe (eu) (random EU country)
  • Canada (ca)
  • Mexico (mx)
  • United Kingdom (gb)
  • Germany (de)
  • France (fr)
  • Switzerland (ch)
  • India (in)
  • Japan (jp)
  • South Korea (kr)
  • Russia (ru)
  • Brazil (br)
  • Australia (au)

proxy
string

The proxy to use for the request. This allows you to route the request through a different IP address. The proxy must be in the format http://username:password@host:port or https://username:password@host:port.
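
A proxy string in the format above can be assembled as sketched below. The helper is hypothetical, and percent-encoding the credentials is an assumption based on general URL rules (so that characters such as "@" or ":" in a password do not break the URL), not something this API documents.

```python
from urllib.parse import quote


def proxy_url(scheme: str, username: str, password: str, host: str, port: int) -> str:
    """Build scheme://username:password@host:port for the proxy parameter."""
    if scheme not in ("http", "https"):
        raise ValueError("proxy scheme must be http or https")
    # Assumption: reserved characters in credentials are percent-encoded.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


print(proxy_url("http", "user", "p@ss", "proxy.example.com", 8080))
# http://user:p%40ss@proxy.example.com:8080
```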

bypass_csp
bool
default:"false"

Whether to bypass the Content Security Policy (CSP) of the webpage. This allows you to capture content of webpages with strict CSPs.

delay
number
default:"0"

The delay in seconds before rendering. This allows you to wait for specific elements to load before capturing the content. Maximum is 30.

timeout
number
default:"30"

The timeout in seconds for the rendering. This allows you to set a maximum time for the request to complete. Maximum is 30.

wait_until
string[]
default:"['domcontentloaded']"

The conditions to wait for before rendering. This allows you to ensure that specific elements are loaded before capturing the content. Available options include:

  • load: Wait for the load event to be fired.
  • domcontentloaded: Wait for the DOMContentLoaded event to be fired.
  • networkidle0: Wait for no network connections for at least 500 ms.
  • networkidle2: Wait for no more than 2 network connections to be active for at least 500 ms.

metadata_icon
bool
default:"false"

Whether to include the metadata icon in the response. This allows you to capture the favicon of the webpage. The link of the icon will be included in the header X-Screenshotmax-Metadata-Icon.

metadata_title
bool
default:"false"

Whether to include the metadata title in the response. This allows you to capture the title of the webpage. The title will be included in the header X-Screenshotmax-Metadata-Title.

metadata_fonts
bool
default:"false"

Whether to include the metadata fonts in the response. This allows you to capture the fonts used on the webpage. The fonts will be included in the header X-Screenshotmax-Metadata-Fonts.

metadata_hash
bool
default:"false"

Whether to include the metadata hash in the response. This allows you to capture the hash of the webpage. The hash will be included in the header X-Screenshotmax-Metadata-Hash.

metadata_status
bool
default:"false"

Whether to include the metadata status in the response. This allows you to capture the HTTP status code of the webpage. The status code will be included in the header X-Screenshotmax-Metadata-Status.

metadata_headers
bool
default:"false"

Whether to include the metadata headers in the response. This allows you to capture the headers of the webpage. The headers will be included in the header X-Screenshotmax-Metadata-Headers.
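
Since every metadata_* option reports its result in an X-Screenshotmax-Metadata-* response header, those values can be collected in one pass. The sketch below assumes the response headers are available as a name-to-value mapping; extract_metadata is a hypothetical helper, and header names are matched case-insensitively.

```python
METADATA_PREFIX = "x-screenshotmax-metadata-"


def extract_metadata(response_headers) -> dict:
    """Collect X-Screenshotmax-Metadata-* headers into a plain dict."""
    metadata = {}
    for name, value in response_headers.items():
        lowered = name.lower()
        if lowered.startswith(METADATA_PREFIX):
            # Keep only the suffix, e.g. "title", "status", "hash".
            metadata[lowered[len(METADATA_PREFIX):]] = value
    return metadata
```

With urllib, the mapping could come from response.headers; values are returned as strings exactly as the API sent them.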

cache
bool
default:"false"

Whether to store the content of the rendering in the cache. This allows you to store the rendered content for a specified time-to-live (TTL) period.

cache_ttl
number
default:"604800"

The time-to-live (TTL) for the cache in seconds. This allows you to set a maximum time for the cached resources to be valid. Maximum is 30 days in seconds (2592000).
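
The default and maximum values above correspond to 7 and 30 days expressed in seconds. The sketch below converts a duration into cache parameters; cache_params is a hypothetical helper, and both the "true" string for the boolean and the requirement that the TTL be positive are assumptions.

```python
from datetime import timedelta

DEFAULT_TTL = int(timedelta(days=7).total_seconds())   # 604800
MAX_TTL = int(timedelta(days=30).total_seconds())      # 2592000


def cache_params(ttl: timedelta) -> dict:
    """Return cache parameters for a TTL within the documented maximum."""
    seconds = int(ttl.total_seconds())
    # Assumption: a TTL must be positive; the maximum is documented.
    if not 0 < seconds <= MAX_TTL:
        raise ValueError(f"cache_ttl must be between 1 and {MAX_TTL} seconds")
    # Assumption: booleans are serialized as the strings "true"/"false".
    return {"cache": "true", "cache_ttl": seconds}
```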

async
bool
default:"false"

Whether to use asynchronous processing for the request. This allows you to capture content without blocking the request.

webhook_url
string

The callback URL for asynchronous processing. This allows you to receive the response via a webhook. The webhook will be triggered when the response is ready. The webhook URL must be a valid URL and must be accessible from the internet. The webhook URL must be HTTPS and must support the POST method. More information about webhooks can be found in the async & webhook documentation.
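
The webhook URL requirements above (a valid, absolute HTTPS URL) can be checked before the request is sent. This is a minimal sketch with a hypothetical helper; serializing async as the string "true" is an assumption, and reachability from the internet cannot be verified locally.

```python
from urllib.parse import urlparse


def async_params(webhook_url: str) -> dict:
    """Return async/webhook parameters after a basic HTTPS check."""
    parsed = urlparse(webhook_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("webhook_url must be an absolute HTTPS URL")
    # Assumption: booleans are serialized as the strings "true"/"false".
    return {"async": "true", "webhook_url": webhook_url}
```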

webhook_signed
bool
default:"true"

Indicates whether the webhook request should be signed. Enabling this option allows you to verify the authenticity of incoming webhook requests. For more details, refer to the async & webhook documentation.

signature
string

The cryptographic signature used to verify the authenticity of the request. For more information on how to compute and validate this signature, see the signed requests documentation.

Response and error codes

Error Codes

Whenever a request fails, the API returns an error in JSON format. Each error includes an error code and a description, both listed below.

Code  Type                   Details
200   OK                     The request was successful.
400   Bad request            The request was malformed or invalid.
401   Unauthorized           The request was rejected due to an invalid access key or missing signature when signed requests are enabled.
402   Payment Required       Access denied due to an unpaid invoice. Applies to paid plans.
403   Forbidden              The signature provided is invalid. Occurs when signed requests are enabled.
423   Locked                 The request was denied due to insufficient quota.
429   Too Many Requests      The rate limit has been exceeded (too many requests per minute).
500   Internal server error  The request failed due to an internal server error.
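
A client can map these status codes to messages and decide which failures are worth retrying. The sketch below is illustrative: the helper names are hypothetical, and treating only 429 and 500 as retryable is an assumption about sensible client behavior, not a rule stated by the API.

```python
STATUS_DETAILS = {
    200: "The request was successful.",
    400: "The request was malformed or invalid.",
    401: "Invalid access key, or missing signature when signed requests are enabled.",
    402: "Access denied due to an unpaid invoice.",
    403: "The signature provided is invalid.",
    423: "The request was denied due to insufficient quota.",
    429: "The rate limit has been exceeded.",
    500: "The request failed due to an internal server error.",
}

# Assumption: only rate limiting and server errors are transient.
RETRYABLE = frozenset({429, 500})


def describe_status(status: int) -> str:
    """Return the documented description for a status code."""
    return STATUS_DETAILS.get(status, "Unexpected status code.")


def should_retry(status: int) -> bool:
    """Tell whether retrying the same request later might succeed."""
    return status in RETRYABLE
```

Errors such as 401 or 423 require a change on the caller's side (a valid key, more quota), so repeating the same request would not help.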