Rendering JavaScript, Handling Redirects, and Mapping Links
Some sites render most of their content in the browser, or sit behind redirects that break a simple HTTP fetch. In those cases, plain curl will not return the final, user-visible content. This post shows two essential techniques with v1/scrape:
- Enable JavaScript rendering to fetch the real, user-visible content.
- Extract and map links from a page for simple site discovery.
When curl is not enough
The Gemini API docs are a good example. A direct curl to the page bounces through Google auth redirects and fails. Supacrawler renders the page in a real browser (via Playwright) before extracting content, so you get what a user would actually see.
Scrape a JS-rendered page (Gemini docs)
curl -G https://api.supacrawler.com/api/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --data-urlencode url="https://ai.google.dev/gemini-api/docs" \
  --data-urlencode format="markdown"
Real output (truncated)
# Gemini Developer API

[Get a Gemini API Key](https://aistudio.google.com/apikey)

Get a Gemini API key and make your first API request in minutes.

### Python

from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain how AI works in a few words",
)
print(response.text)
...

Title: Gemini API | Google AI for Developers
Status: 200
Description: Gemini Developer API Docs and API Reference
This matches what a user sees in the browser because the page was rendered before extraction.
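The same request is easy to make programmatically. Below is a minimal sketch using only the Python standard library, assuming the v1/scrape endpoint and parameters from the curl example above; YOUR_API_KEY is a placeholder for a real key.

```python
# Sketch: call v1/scrape from Python with the standard library only.
# The endpoint and parameters mirror the curl example above.
import json
import urllib.parse
import urllib.request

API_URL = "https://api.supacrawler.com/api/v1/scrape"

def build_scrape_url(url, fmt="markdown"):
    """Build the full GET URL with url-encoded query parameters."""
    query = urllib.parse.urlencode({"url": url, "format": fmt})
    return f"{API_URL}?{query}"

def scrape(url, api_key, fmt="markdown"):
    """Fetch a JS-rendered page and return the parsed JSON response."""
    req = urllib.request.Request(
        build_scrape_url(url, fmt),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

Note that `urlencode` handles the percent-encoding of the target URL for you, just as `--data-urlencode` does in the curl version.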
Extracting all links from a page
Links are always included in the scrape response, making it easy to discover related pages without a separate request.
Get links from a site
curl -G https://api.supacrawler.com/api/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --data-urlencode url="https://supacrawler.com"
Real output
{
  "success": true,
  "markdown": "# Supacrawler\n\n...",
  "links": [
    "https://supacrawler.com/pricing",
    "https://supacrawler.com/about",
    "https://supacrawler.com/contact",
    "https://supacrawler.com/terms-of-service",
    "https://supacrawler.com/work",
    "https://supacrawler.com/dashboard/scrape",
    "https://supacrawler.com/signin",
    "https://supacrawler.com/privacy-policy",
    "https://supacrawler.com/blog/your-first-web-scrape"
  ]
}
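The `links` array makes simple site discovery a one-liner. As a sketch, the hypothetical helper below (not part of the Supacrawler API) filters the response down to links on the same host as the scraped page:

```python
# Sketch: simple site discovery from the `links` array of a scrape response.
# filter_internal_links is a hypothetical helper, not a Supacrawler API call.
from urllib.parse import urlparse

def filter_internal_links(response, base_url):
    """Return links from a scrape response that share the base URL's host."""
    host = urlparse(base_url).netloc
    return [
        link for link in response.get("links", [])
        if urlparse(link).netloc == host
    ]

# Example with a subset of the response shown above:
response = {
    "success": True,
    "links": [
        "https://supacrawler.com/pricing",
        "https://supacrawler.com/about",
        "https://supacrawler.com/blog/your-first-web-scrape",
    ],
}
print(filter_internal_links(response, "https://supacrawler.com"))
```

From here you could feed each internal link back into v1/scrape to walk the site one page at a time.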
Takeaways
- JavaScript rendering is automatic for SPA/redirect-heavy pages.
- Links are always included in the response for easy page discovery.
- Use the scrape endpoint to capture content and discover related pages in one request.
Ready to try it yourself? Grab an API key and run these examples in minutes.
By Supacrawler Team
Published on August 23, 2025
