spider

Fast web scraping and crawling. Fetches web pages and extracts content optimized for token efficiency.

Tools: scrape_url, crawl_site, extract_structured_data

Requirements: lxml

Multi-instance: Yes

delay
float. Defaults to 0.1.

Delay between requests in seconds.

concurrent_requests
int. Defaults to 5.

Number of concurrent requests allowed (1–20).

timeout
int. Defaults to 5.

Request timeout in seconds (1–60).

max_pages
int. Defaults to 1.

Maximum number of pages to scrape (1–100).

max_depth
int. Defaults to 0.

Maximum crawl depth (0–5). A value of 0 fetches only the starting page and follows no links.
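
To make the interaction between max_depth and max_pages concrete, here is a minimal sketch of a depth-limited, page-limited crawl over a made-up in-memory link graph. The `LINKS` table and the `crawl` function are hypothetical illustrations, and the breadth-first order is an assumption; the skill's actual traversal strategy is not described on this page.

```python
from collections import deque

# Hypothetical link graph standing in for real pages (not part of the SDK).
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": [],
    "/a/2": [],
    "/b/1": [],
}


def crawl(start, max_depth=0, max_pages=1):
    """Visit pages breadth-first, honoring both limits."""
    visited = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


print(crawl("/"))                             # defaults: starting page only
print(crawl("/", max_depth=1, max_pages=10))  # start page plus its direct links
print(crawl("/", max_depth=2, max_pages=4))   # page limit stops the crawl early
```

With the defaults (max_depth 0, max_pages 1) only the starting page is visited; raising one limit without the other can still cut the crawl short.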

extract_type
str. Defaults to "fast_text".

Content extraction method: "fast_text", "clean_text", "full_text", "html", or "custom".

max_text_length
int. Defaults to 10000.

Maximum text length to return (100–100000).

clean_text
bool. Defaults to True.

Whether to clean extracted text by collapsing whitespace.
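
As a rough illustration of what collapsing whitespace means, the sketch below replaces every run of whitespace with a single space. The `collapse_whitespace` helper is hypothetical, and the skill's exact cleaning rules may differ.

```python
import re


def collapse_whitespace(text):
    """Replace each run of spaces, tabs, and newlines with one space."""
    return re.sub(r"\s+", " ", text).strip()


raw = "  Pricing\n\n\n   starts at\t$10  "
print(collapse_whitespace(raw))
```

Collapsing whitespace this way shortens the text without removing any words, which fits the skill's stated goal of token efficiency.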

selectors
dict. Defaults to {}.

Custom CSS or XPath selectors for structured data extraction. Keys are field names, values are selector strings.
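
The shape of a selectors mapping can be sketched as follows. The skill itself requires lxml; to keep the example dependency-free, this sketch uses the standard library's `xml.etree.ElementTree`, which supports only a limited XPath subset and only well-formed markup. The field names, the sample page, and the `extract_fields` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Keys are field names, values are selector strings (XPath here).
selectors = {
    "title": ".//h1",
    "price": ".//span[@class='price']",
}

# Hypothetical, well-formed sample page.
page = (
    "<html><body>"
    "<h1>Blue Widget</h1>"
    "<span class='price'>$19.99</span>"
    "</body></html>"
)


def extract_fields(root, selectors):
    """Return one value per field, or None when the selector matches nothing."""
    record = {}
    for field, selector in selectors.items():
        node = root.find(selector)
        has_text = node is not None and node.text
        record[field] = node.text.strip() if has_text else None
    return record


record = extract_fields(ET.fromstring(page), selectors)
print(record)
```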

follow_patterns
list. Defaults to [].

URL patterns (regex strings) to follow when crawling. Only links matching at least one pattern are followed.
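
The "at least one pattern" rule can be sketched with Python's `re` module. The `should_follow` helper and the example patterns are hypothetical, and the use of `re.search` (match anywhere in the URL) is an assumption; this page does not say whether patterns are anchored, nor how the empty default list is treated, so the sketch covers only a non-empty list.

```python
import re


def should_follow(url, follow_patterns):
    """Return True if the URL matches at least one regex pattern."""
    return any(re.search(pattern, url) for pattern in follow_patterns)


# Hypothetical patterns: documentation pages and year-stamped blog posts.
patterns = [r"/docs/", r"/blog/\d{4}/"]

print(should_follow("https://example.com/docs/intro", patterns))
print(should_follow("https://example.com/blog/2024/launch", patterns))
print(should_follow("https://example.com/pricing", patterns))
```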

user_agent
str

User agent string sent with each request.

headers
dict. Defaults to {}.

Additional HTTP headers to include with each request.

follow_robots_txt
bool. Defaults to True.

Whether to respect robots.txt rules when crawling.
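
For readers unfamiliar with robots.txt, the sketch below shows the kind of check that respecting it implies, using the standard library's `urllib.robotparser`. The rules and the user agent string are invented for illustration, and how the skill performs this check internally is not documented here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(rules)

agent = "ExampleCrawler/1.0"  # made-up user agent string
blocked = parser.can_fetch(agent, "https://example.com/private/report")
allowed = parser.can_fetch(agent, "https://example.com/public/page")
print(blocked, allowed)
```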

cache_enabled
bool. Defaults to True.

Whether to cache scraped pages in memory to avoid re-fetching.

from signalwire import AgentBase

class MyAgent(AgentBase):
    def __init__(self):
        super().__init__(name="assistant", route="/assistant")
        self.set_prompt_text("You are a helpful assistant.")
        self.add_skill("spider", {
            "timeout": 10,
            "concurrent_requests": 3,
            "max_pages": 5
        })

agent = MyAgent()
agent.serve()
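
Several parameters above have documented ranges, so it can help to check a configuration dict before passing it to `add_skill`. The following is a minimal sketch of such a check; `SPIDER_RANGES`, `EXTRACT_TYPES`, and `check_spider_config` are hypothetical helpers written for this example, not part of the SDK, and the SDK may perform its own validation.

```python
# Ranges and values copied from the parameter descriptions on this page.
SPIDER_RANGES = {
    "concurrent_requests": (1, 20),
    "timeout": (1, 60),
    "max_pages": (1, 100),
    "max_depth": (0, 5),
    "max_text_length": (100, 100000),
}
EXTRACT_TYPES = {"fast_text", "clean_text", "full_text", "html", "custom"}


def check_spider_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, (low, high) in SPIDER_RANGES.items():
        if key in config and not low <= config[key] <= high:
            problems.append(f"{key}={config[key]} is outside {low}-{high}")
    extract_type = config.get("extract_type")
    if extract_type is not None and extract_type not in EXTRACT_TYPES:
        problems.append(f"extract_type={extract_type!r} is not a documented value")
    return problems


good = {"timeout": 10, "concurrent_requests": 3, "max_pages": 5}
bad = {"timeout": 0, "max_depth": 6, "extract_type": "markdown"}
print(check_spider_config(good))
print(check_spider_config(bad))
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range value at once.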