Is it legal to scrape Amazon search results?

Scraping publicly visible search pages (no login, public data) is generally treated as lawful in the US, but it does violate Amazon's Terms of Service and Amazon actively blocks bots. Keep it to research, throttle your requests, never scrape behind a login, and don't republish copyrighted listing content. When in doubt, get legal advice for your use case.

How many Amazon search result pages can I scrape?

Amazon paginates search up to roughly page 20 publicly. After that, the results stop being unique. Most product research stops at pages 3-5, where the top organic and sponsored positions live.

What is the difference between organic and sponsored Amazon listings?

Sponsored listings are paid placements (Sponsored Products, Sponsored Brands). They are labeled in the HTML with a 'Sponsored' badge and a `data-component-type='sp-sponsored-result'` attribute. Organic listings rank algorithmically. For SEO research, separate them — they tell different stories.

Does Amazon search return the same results for different users?

Not exactly. Logged-in users get personalized results based on browse and purchase history. Public scrapers see the unpersonalized version, which is what you want for SEO baseline measurements.

Can I scrape Amazon search by zip code?

Yes — Amazon respects the postal-code preference set on the homepage. Sending the right session cookie (or query parameter) lets you see local availability. Without it, you get the default US-wide view.

What is the best way to track ranking changes on Amazon?

Scrape the same search keyword daily and store (keyword, position, asin, sponsored, scraped_at) per row. Plot position-over-time for your target ASINs. Daily rank tracking is the foundation of Amazon SEO.


How to Scrape Amazon Search Results & Track Rankings (Python)

May 12, 2026 · 8 min read

You want to know what shows up when an Amazon shopper searches "wireless earbuds." Maybe you are an SEO tracking where your product ranks, or you are researching a new category. The Amazon search results page has organic listings, sponsored placements, and editorial slots — and Amazon does not expose any of this through an official API. You scrape it.

TL;DR: there are two ways to scrape Amazon search results: a DIY requests + BeautifulSoup script (free, you maintain it, Amazon blocks it often) or a managed API (you send a search URL, get back parsed positions). This guide shows both, what is actually in the search HTML, and how to turn it into daily Amazon rank tracking.

One thing up front: scraping public Amazon search pages (no login, public data) is generally treated as lawful in the US, but it does break Amazon's Terms of Service and Amazon aggressively blocks bots, so keep it to research, throttle your requests, and expect captchas. With that out of the way, the guide starts with why search pages are tougher to scrape than product pages, then moves on to the DIY scrape (requests + BS4 on a search URL).

Why Search Pages Are Harder Than Product Pages

Four things ratchet up the difficulty:

Heavier bot detection. Search is where Amazon's most commercially sensitive data lives: sponsored bid signals, ranking patterns, and ad-spend efficiency. The anti-bot stack on /s?k=... URLs is more aggressive than on /dp/<ASIN> URLs.

More dynamic JS. Product pages render most useful data server-side. Search pages lean on client-side rendering for some sponsored carousels and filter panels. A raw requests.get misses about 10-15% of the result blocks; you need to compensate at the parser level.

Sponsored vs organic is signal, not noise. Both occupy positions in the result grid. Treating them as the same row corrupts your ranking data. The DOM marks them separately (data-component-type="sp-sponsored-result" vs s-search-result), but only if you check.

Pagination caps. Amazon paginates search up to roughly page 20, then the results become non-unique. For SEO research, pages 1-5 are what matters; deeper is rarely actionable.

The DIY Approach

import requests
from bs4 import BeautifulSoup
from urllib.parse import quote_plus

HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/127.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
}


def scrape_search(keyword: str, page: int = 1) -> list[dict]:
    url = f"https://www.amazon.com/s?k={quote_plus(keyword)}&page={page}"
    r = requests.get(url, headers=HEADERS, timeout=15)
    if r.status_code != 200 or "validateCaptcha" in r.url:
        return []

    soup = BeautifulSoup(r.text, "html.parser")
    out = []
    position = 1
    selector = (
        "div[data-component-type='s-search-result'], "
        "div[data-component-type='sp-sponsored-result']"
    )
    for card in soup.select(selector):
        asin = card.get("data-asin")
        if not asin:
            continue
        title_el = card.select_one("h2 a span")
        price_el = card.select_one("span.a-price > span.a-offscreen")
        sponsored = card.get("data-component-type") == "sp-sponsored-result"
        out.append({
            "position": position,
            "asin": asin,
            "title": title_el.get_text(strip=True) if title_el else None,
            "price": price_el.get_text(strip=True) if price_el else None,
            "sponsored": sponsored,
        })
        position += 1
    return out


if __name__ == "__main__":
    for entry in scrape_search("wireless earbuds", page=1):
        print(entry)
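
The script above fetches a single page. A minimal sketch of walking several pages with a pause between requests might look like the following. The name scrape_pages, the fetch_page callable, and the 5-second default delay are assumptions for illustration; pass in scrape_search (or any function with the same keyword/page signature) as fetch_page.

```python
import time
from typing import Callable


def scrape_pages(
    fetch_page: Callable[[str, int], list[dict]],
    keyword: str,
    max_pages: int = 3,
    delay_s: float = 5.0,
) -> list[dict]:
    """Fetch pages 1..max_pages for one keyword, pausing between requests."""
    rows: list[dict] = []
    for page in range(1, max_pages + 1):
        results = fetch_page(keyword, page)
        if not results:
            # An empty page can mean a block, a captcha, or the end of results.
            # Stop instead of hammering further pages.
            break
        for row in results:
            # Keep the per-page position and record which page it came from.
            rows.append({**row, "page": page})
        if page < max_pages:
            time.sleep(delay_s)  # throttle: one request every delay_s seconds
    return rows
```

Stopping on the first empty page is a deliberate choice here: once Amazon starts returning captchas, later pages for the same keyword are unlikely to succeed in the same run.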

Real Limitations

Mid-page carousels. "Editor's picks" and "Highly rated" carousels mid-page have their own DOM structures; the selector above skips them.
Position counting. Should you count sponsored slots in the position field, or only organic? For SEO research, separate counters per type tell you more.
Page-1 vs deep pages. Page 1 has more curated content (badges, "Amazon's Choice"). Pages 2+ alternate more uniformly between organic and sponsored results.
Local availability. The default US-wide view differs from a zip-code-specific view. Set the session-token cookie or accept the default.
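
On the position-counting point, one possible way to keep separate counters per type is sketched below. It assumes each result dict carries the sponsored flag produced by the DIY script, in page order. The function name add_typed_positions is hypothetical, and the output field names are borrowed from the position_in_organic / position_in_sponsored columns suggested later in this guide.

```python
def add_typed_positions(results: list[dict]) -> list[dict]:
    """Add per-type rank fields to results that are already in page order."""
    organic = 0
    sponsored = 0
    out = []
    for row in results:
        if row.get("sponsored"):
            sponsored += 1
            typed = {"position_in_sponsored": sponsored, "position_in_organic": None}
        else:
            organic += 1
            typed = {"position_in_organic": organic, "position_in_sponsored": None}
        # Copy the row so the caller's original dicts are left untouched.
        out.append({**row, **typed})
    return out
```

The combined position field stays in place, so you can still reconstruct the on-page order later.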

A failure mode that silently gives you bad data:

# Your scraper saw 0 results because Amazon shipped a new search layout
# where the result wrapper changed from data-component-type="s-search-result"
# to data-csa-c-content-id="s-search-result". Same ASINs, different attribute.
# Your job logged "scraped 0 results for keyword X" — no error, just empty.

Always alert when result count drops below a threshold for a known-stable keyword.
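
A minimal sketch of such a check, assuming you keep the result counts from recent runs of the same keyword. The function name is_suspicious_drop and the 50% ratio are illustrative assumptions, not tuned values.

```python
from statistics import median


def is_suspicious_drop(
    recent_counts: list[int],
    today_count: int,
    min_ratio: float = 0.5,
) -> bool:
    """Return True when today's result count falls well below the recent baseline."""
    if not recent_counts:
        # No history yet: only a completely empty scrape is flagged.
        return today_count == 0
    baseline = median(recent_counts)
    return today_count < baseline * min_ratio
```

Wire the True case to whatever alerting you already use (email, Slack, a failed cron exit code) so an empty scrape never passes as a normal day.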

Scaling Beyond a Single Keyword Script

For SEO tracking across hundreds of keywords daily:

Separate sponsored from organic. Two distinct position columns per row: position_in_organic, position_in_sponsored. Plot them separately over time.

Track multiple pages. First-page rank matters most, but movement on pages 2-3 predicts page-1 entry. Scrape 3-5 pages per keyword.

Watermark by date + locale. A (keyword, locale, scraped_at, page, position, asin, sponsored) schema covers most ranking research.

Capture "Amazon's Choice" as a boolean. It is not a position — it is a marker on a specific result. Store it as a field, not a row.

Compute share-of-voice. For your brand: % of first-page slots, % of sponsored slots, weighted by position. That is the real KPI, not raw rank.
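
As a rough illustration of the share-of-voice idea, the sketch below computes the three percentages for one result page. The weighting scheme (1 / position) is an assumption, since this guide does not prescribe one, and the function name and input shape (dicts with position, asin, sponsored) are hypothetical.

```python
def share_of_voice(results: list[dict], brand_asins: set[str]) -> dict[str, float]:
    """Summarize how much of one result page a brand occupies."""

    def weight(row: dict) -> float:
        # Assumed weighting: slot 1 counts 1.0, slot 2 counts 0.5, and so on.
        return 1.0 / row["position"]

    def ratio(part: float, whole: float) -> float:
        return part / whole if whole else 0.0

    ours = [r for r in results if r["asin"] in brand_asins]
    sponsored = [r for r in results if r.get("sponsored")]
    ours_sponsored = [r for r in sponsored if r["asin"] in brand_asins]
    return {
        "slot_share": ratio(len(ours), len(results)),
        "sponsored_slot_share": ratio(len(ours_sponsored), len(sponsored)),
        "weighted_share": ratio(sum(map(weight, ours)), sum(map(weight, results))),
    }
```

Feed it the first-page rows only if you want a first-page KPI; mixing pages would require a position that keeps counting across pages.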

The LogPose smart endpoint accepts a search URL and a pages parameter (1-10) and returns structured search-result objects with the sponsored flag pre-parsed:

import os
import time
import requests
from urllib.parse import quote_plus

API_KEY = os.environ["LOGPOSE_API_KEY"]
BASE = "https://api.logposervices.com/api/v1"
HEADERS = {"X-API-Key": API_KEY}


def scrape(url: str, pages: int = 1) -> dict:
    submit = requests.get(
        f"{BASE}/ecommerce/amazon/smart",
        params={"url": url, "pages": pages},
        headers=HEADERS, timeout=30,
    ).json()
    job_id = submit["job_id"]
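    # Poll until the job reaches a terminal state.
    while True:
        s = requests.get(f"{BASE}/jobs/{job_id}", headers=HEADERS, timeout=15).json()
        if s["status"] in ("completed", "failed"):
            break
        time.sleep(2)
    # Do not fetch results for a failed job; surface the failure instead.
    if s["status"] == "failed":
        raise RuntimeError(f"job {job_id} failed")
    return requests.get(f"{BASE}/jobs/{job_id}/result", headers=HEADERS, timeout=15).json()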


keywords = ["wireless earbuds", "bluetooth speaker", "smartwatch"]
for k in keywords:
    url = f"https://www.amazon.com/s?k={quote_plus(k)}"
    data = scrape(url, pages=5)
    results = data.get("results", [])
    print(f"{k}: {len(results)} positions across 5 pages")

For sustained tracking, persist (keyword, scraped_at, position, asin, sponsored) to a database and run the scrape on a daily cron.

Common Mistakes

Treating sponsored as position-equivalent. They are different signals; separate them in storage.
Scraping the same keyword 100× a day. Daily is enough; hourly is noise.
Ignoring locale. US, UK, DE rankings are independent. Tag rows with locale.
No keyword normalization. "wireless earbuds" vs "Wireless Earbuds" vs "wireless%20earbuds" — pick one canonical form.
Reading "Amazon's Choice" as a brand signal. It is algorithmic and changes hourly. Use as a binary, not a trust metric.
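
For the keyword-normalization mistake, one possible canonical form is lowercase, URL-decoded, with single spaces. The helper name normalize_keyword is hypothetical, and the choice of lowercase as the canonical form is an assumption.

```python
from urllib.parse import unquote_plus


def normalize_keyword(raw: str) -> str:
    """Map spelling variants of a keyword to one canonical storage form."""
    # Decode %20 and + into spaces, then lowercase and collapse whitespace.
    decoded = unquote_plus(raw)
    return " ".join(decoded.lower().split())
```

Note that unquote_plus turns a literal + into a space, so keywords that genuinely contain a plus sign need separate handling before they reach this function.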

The Landscape

For Amazon search-results tracking:

DataForSEO — has Amazon SERP endpoints; SERP-focused tool with broad search engine coverage.
Helium 10's Cerebro — Amazon-only keyword research; depth on ASIN-to-keyword mapping.
JungleScout — keyword + ranking research bundled with sales estimates.
DIY + residential proxies — full control if you already run scrapers; you own the data and schema.
LogPose — smart endpoint with pages parameter on search URLs; useful when search is one of several Amazon surfaces (product, reviews, BSR) you scrape together.

If your goal is pure Amazon keyword research, a dedicated tool like Helium 10 is usually faster to value. For SEO ops teams that need raw rank data piped into their own warehouse, a managed scraping API is more flexible.

Get Started

Sign up at logposervices.com.
Generate an API key.
Run the snippet above for your top 10 keywords daily.
After two weeks of data you will have actionable rank-over-time charts.

External: Amazon SP-API docs, BeautifulSoup docs.

Related posts

CamelCamelCamel Alternatives for Tracking Amazon Prices at Scale

Helium 10 Alternatives for Sellers Who Want the Raw Search & BSR Data

Jungle Scout Alternatives for Amazon Research on Raw Data