The Get Attributes command (GetAttributesCommand) extracts attribute values from DOM elements matched by a CSS selector. Use it to collect links, image sources, data attributes, and other element attributes from web pages.

Overview

The Get Attributes command provides functionality for:
  • Extracting attributes from DOM elements
  • Using CSS selectors to target elements
  • Extracting multiple attributes at once
  • Filtering matched elements with the locator's optional filter object
  • Returning structured attribute data

Parameters

Parameter        Type    Description                                Required
elementsLocator  object  Element locator configuration              Yes
attributes       array   List of attributes to extract              Yes
sessionId        string  Browser session identifier                 No
key              string  Key name for storing extracted attributes  Yes

Element Locator Configuration

The elementsLocator object configures how elements are targeted:
{
  "elementsLocator": {
    "selector": "a.article-link",
    "timeout": 10000
  }
}

Locator Properties

Property  Type    Description                        Required
selector  string  CSS selector for element targeting  Yes
timeout   number  Maximum wait time in milliseconds   No
filter    object  Filter criteria                     No
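
The required and optional fields in the two tables above can be checked before a command is sent. The following is a minimal Python sketch, not part of the product: the helper name build_get_attributes and its arguments are hypothetical, and it only assembles the JSON shape documented on this page. The filter property is left out because its structure is not described here.

```python
import json
from typing import Any, Dict, List, Optional


def build_get_attributes(
    selector: str,
    attributes: List[str],
    key: str,
    timeout_ms: Optional[int] = None,
    session_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a GetAttributes command payload (hypothetical helper)."""
    # elementsLocator.selector, attributes, and key are marked as required.
    if not selector:
        raise ValueError("elementsLocator.selector is required")
    if not attributes:
        raise ValueError("attributes must list at least one attribute name")
    if not key:
        raise ValueError("key is required")

    # timeout is optional, so it is only included when provided.
    locator: Dict[str, Any] = {"selector": selector}
    if timeout_ms is not None:
        locator["timeout"] = timeout_ms

    params: Dict[str, Any] = {
        "elementsLocator": locator,
        "attributes": list(attributes),
        "key": key,
    }
    # sessionId is optional as well.
    if session_id is not None:
        params["sessionId"] = session_id

    return {"command": "GetAttributes", "params": params}


payload = build_get_attributes(
    "a.article-link", ["href", "title"], "article_links", timeout_ms=10000
)
print(json.dumps(payload, indent=2))
```

Failing early on an empty selector, attribute list, or key keeps malformed commands from reaching the browser session.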

Usage Examples

Extract Link Attributes

{
  "command": "GetAttributes",
  "params": {
    "elementsLocator": {
      "selector": "a.article-link",
      "timeout": 10000
    },
    "attributes": ["href", "title"],
    "key": "article_links"
  }
}

Extract Image Attributes

{
  "command": "GetAttributes",
  "params": {
    "elementsLocator": {
      "selector": "img.article-image",
      "timeout": 10000
    },
    "attributes": ["src", "alt", "width", "height"],
    "key": "article_images"
  }
}

Extract Data Attributes

{
  "command": "GetAttributes",
  "params": {
    "elementsLocator": {
      "selector": "[data-item]",
      "timeout": 10000
    },
    "attributes": ["data-item-id", "data-item-type", "data-item-value"],
    "key": "item_data"
  }
}

Extract with Session

{
  "command": "GetAttributes",
  "params": {
    "sessionId": "#{UUID}",
    "elementsLocator": {
      "selector": "a.link",
      "timeout": 10000
    },
    "attributes": ["href", "title", "class"],
    "key": "links"
  }
}

Common Attributes

Link Attributes

  • href: Link destination URL
  • title: Link title/tooltip
  • target: Link target (_blank, _self, etc.)
  • rel: Link relationship

Image Attributes

  • src: Image source URL
  • alt: Alternative text
  • width: Image width
  • height: Image height
  • title: Image title

Data Attributes

  • data-*: Custom data attributes
  • id: Element ID
  • class: Element classes
  • aria-*: Accessibility attributes

Output Structure

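The command stores the extracted attributes under the specified key. A single matched element produces one object; multiple matched elements produce an array of objects.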

Single Element

{
  "article_link": {
    "href": "https://example.com/article",
    "title": "Article Title"
  }
}

Multiple Elements

{
  "article_links": [
    {
      "href": "https://example.com/article1",
      "title": "Article 1"
    },
    {
      "href": "https://example.com/article2",
      "title": "Article 2"
    }
  ]
}
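
Because the stored value can be either one object or an array of objects, downstream code may want a single shape to iterate over. The sketch below is one way to normalize it in Python. The function name as_element_list is hypothetical, and treating a missing result as an empty list is an assumption, not documented behavior.

```python
from typing import Any, Dict, List, Union

AttributeMap = Dict[str, Any]


def as_element_list(
    value: Union[AttributeMap, List[AttributeMap], None]
) -> List[AttributeMap]:
    """Return the stored value as a list of attribute objects."""
    if value is None:
        # Assumption: nothing stored under the key means no matched elements.
        return []
    if isinstance(value, dict):
        # Single-element shape: wrap the object in a list.
        return [value]
    # Multiple-element shape: already a list of objects.
    return list(value)


single = {
    "article_link": {
        "href": "https://example.com/article",
        "title": "Article Title",
    }
}
multiple = {
    "article_links": [
        {"href": "https://example.com/article1", "title": "Article 1"},
        {"href": "https://example.com/article2", "title": "Article 2"},
    ]
}

for element in as_element_list(multiple["article_links"]):
    print(element["href"], element["title"])
```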

Variable Support

The Get Attributes command supports variable interpolation in:
  • Selector: Use variables in CSS selectors
  • Session ID: Use #{UUID} for session tracking

Best Practices

  • Use Stable Selectors: Prefer class names and IDs over text-based selectors
  • Set Appropriate Timeouts: Configure timeouts for dynamic content loading
  • Extract Relevant Attributes: Only extract attributes you need
  • Handle Missing Attributes: Account for cases where attributes don’t exist
  • Use Descriptive Keys: Use clear, descriptive key names for extracted attributes
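
The advice on missing attributes can be applied when post-processing extracted data. This Python sketch assumes a missing attribute shows up either as an absent key or as a null value; the page does not specify which, so the code tolerates both. The function name collect_links and the sample data are hypothetical. Resolving against a base URL is a defensive step in case href values are relative.

```python
from typing import Any, Dict, Iterable, List
from urllib.parse import urljoin


def collect_links(
    elements: Iterable[Dict[str, Any]], base_url: str
) -> List[Dict[str, str]]:
    """Keep only elements that have an href, tolerating missing attributes."""
    links: List[Dict[str, str]] = []
    for element in elements:
        # .get() covers an absent key; the falsy check covers null or empty.
        href = element.get("href")
        if not href:
            continue
        links.append(
            {
                # urljoin leaves absolute URLs untouched and resolves relative ones.
                "url": urljoin(base_url, href),
                # A missing title falls back to an empty string.
                "title": element.get("title") or "",
            }
        )
    return links


# Hypothetical extracted data with some attributes missing.
extracted = [
    {"href": "/article1", "title": "Article 1"},
    {"href": "https://example.com/article2"},
    {"title": "No link here"},
    {"href": None, "title": "Null href"},
]

links = collect_links(extracted, "https://example.com/")
for link in links:
    print(link["url"], link["title"])
```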

Common Use Cases

  • Link Extraction: Extract links from pages
  • Image Extraction: Extract image sources and metadata
  • Data Attribute Extraction: Extract custom data attributes
  • Metadata Extraction: Extract element metadata