
How to Scrape Zillow New Listings by ZIP Code Every Morning


If you do real-estate deals — wholesale, fix-and-flip, BRRRR, iBuying — being first to a new listing is most of the edge. Zillow shows new inventory in its search UI and can email you alerts, but neither feeds into a workflow you can pipe to Slack, Airtable, or your acquisitions team's CRM. This guide shows how to scrape a Zillow ZIP-code search every morning, diff against yesterday's run, and ship only the genuinely new listings to wherever your team actually works. The code runs as written.

Why Public Listing Data Is Enough for Most Use Cases

You don't need MLS access to build a useful deal-flow pipeline. The three use cases that drive most of this work — investor deal sourcing, neighborhood market reports, and lead-gen prospecting — all run on data Zillow already shows the public.

What MLS gives you that public listings don't: exact listing-agent commission splits, private remarks, internal status changes ("temporarily off-market" vs "withdrawn"), and the precise list-to-sold history. Useful if you're a licensed agent, mostly cosmetic if you're an investor sorting through 200 new listings for the three that look distressed.

What public listings give you, which is plenty: address, list price, beds/baths, sqft, lot size, year built, property type, listing status, days on market, photos, listing-agent name, brokerage, price history, and Zestimate. That's enough to filter, score, and rank.

If your downstream workflow is "show me everything that hit the market today in 78701 under $500K with at least 1500 sqft," public data is the right tool. If your workflow needs commission structure or pocket listings, public scraping isn't going to get you there and you should be talking to an MLS broker.

What Zillow Actually Returns

For a search page, you get an array of listing summaries. Each summary has roughly the same shape:

{
  "zpid": "70905927",
  "address": "1234 East 6th St, Austin, TX 78702",
  "price": 549000,
  "bedrooms": 3,
  "bathrooms": 2,
  "sqft": 1620,
  "lot_size": "5,227 sqft",
  "year_built": 1955,
  "property_type": "Single Family",
  "listing_status": "For Sale",
  "days_on_market": 1,
  "zestimate": 562300,
  "listing_url": "https://www.zillow.com/homedetails/.../70905927_zpid/",
  "photos": ["https://photos.zillowstatic.com/..."]
}

For an individual property (the scape_property endpoint), you also get price_history (every list/relist/reduction), tax_assessment, full photo list, school zone, and listing_agent when public. You do not get the seller's name, off-market data, or any contact info that isn't already on the public listing page.
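The summary fields are enough to express a filter like "under $500K with at least 1,500 sqft, listed today." Here is a minimal sketch, assuming each listing is a dict with the field names from the sample summary above. The function name matches_buy_box and the thresholds are illustrative and not part of the API.

```python
def matches_buy_box(listing: dict, zip_code: str, max_price: int,
                    min_sqft: int, max_days_on_market: int = 1) -> bool:
    """Return True if a listing summary passes a simple buy-box filter.

    Assumes the field names from the sample summary. Listings with
    missing or null fields are rejected rather than raising.
    """
    price = listing.get("price")
    sqft = listing.get("sqft")
    dom = listing.get("days_on_market")
    address = listing.get("address") or ""
    if price is None or sqft is None or dom is None:
        return False
    # The sample address ends with the ZIP code, so match on the suffix.
    if not address.rstrip().endswith(zip_code):
        return False
    return price <= max_price and sqft >= min_sqft and dom <= max_days_on_market


sample = {
    "zpid": "70905927",
    "address": "1234 East 6th St, Austin, TX 78702",
    "price": 549000,
    "sqft": 1620,
    "days_on_market": 1,
}
print(matches_buy_box(sample, "78702", max_price=600_000, min_sqft=1500))
print(matches_buy_box(sample, "78702", max_price=500_000, min_sqft=1500))
```

Rejecting listings with missing fields keeps the filter conservative. If you would rather review incomplete listings by hand, return True for them and flag them separately.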

The Manual Flow, Just Enough to Anchor

Before automating anything, do the search once by hand so you know what your URL looks like:

  1. Open Zillow, search the ZIP (78701).
  2. Set filters: For Sale, your price range, beds/baths minimums.
  3. Sort by "Newest."
  4. Copy the URL from the address bar — it'll look like https://www.zillow.com/austin-tx-78701/?searchQueryState=.... The searchQueryState parameter encodes every filter.

That URL is your input. Save it somewhere; you'll feed the exact same URL to the API every morning.
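If you want to confirm which filters a saved URL carries before scheduling it, you can decode the parameter locally. This sketch assumes searchQueryState is URL-encoded JSON, which is how it typically appears in the address bar. The key names in the example state are made up for illustration, and real URLs carry more keys.

```python
import json
from urllib.parse import urlparse, parse_qs, quote


def decode_search_query_state(url: str) -> dict:
    """Return the decoded searchQueryState filters, or {} if absent."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("searchQueryState")
    if not raw:
        return {}
    # parse_qs already percent-decodes the value; what remains is JSON text.
    return json.loads(raw[0])


# Hypothetical state for illustration only.
example_state = {"filterState": {"price": {"max": 500000}}}
example_url = (
    "https://www.zillow.com/austin-tx-78701/?searchQueryState="
    + quote(json.dumps(example_state))
)
print(decode_search_query_state(example_url))
```

Printing the decoded state once and storing it next to the URL makes it easier to notice later if someone swapped the saved search for one with different filters.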

The API Flow With Working Python

A single Python helper covers submit-and-poll for both Zillow endpoints. Set LOGPOSE_API_KEY in your environment and you can use it as-is.

import os, time, requests

API_KEY = os.environ["LOGPOSE_API_KEY"]
BASE = "https://api.logposervices.com/api/v1"
HEADERS = {"X-API-Key": API_KEY}


def submit_and_wait(path: str, params: dict, timeout_s: int = 120) -> dict:
    """Submit a job, poll until it finishes, and return its result."""
    r = requests.get(f"{BASE}/{path}", params=params, headers=HEADERS, timeout=30)
    r.raise_for_status()
    job_id = r.json()["job_id"]
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        s = requests.get(f"{BASE}/jobs/{job_id}", headers=HEADERS, timeout=15)
        s.raise_for_status()  # surface HTTP errors instead of a JSON decode failure
        status = s.json()
        if status["status"] == "completed":
            break
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "unknown failure"))
        time.sleep(2)
    else:
        # The loop ended without a break, so the deadline passed first.
        raise TimeoutError(f"job {job_id} did not finish in {timeout_s}s")
    result = requests.get(f"{BASE}/jobs/{job_id}/result", headers=HEADERS, timeout=15)
    result.raise_for_status()
    return result.json()


if __name__ == "__main__":
    SEARCH_URL = "https://www.zillow.com/austin-tx-78701/"
    result = submit_and_wait(
        "realestate/zillow/scape_search",
        {"url": SEARCH_URL, "pages": 3},
    )
    print(f"Got {len(result['listings'])} listings")

A few things to notice. The endpoint is scape_search and the property one is scape_property — that's how they're spelled in the API today, not a typo in this article. The pages parameter is required for the search endpoint and accepts 1-10. Three pages on a busy ZIP returns roughly 120 listings, which is more than enough for a daily ZIP-level diff.

The endpoint is asynchronous: the GET returns a job_id, you poll /jobs/{id} until status is completed, then fetch the result. Same pattern across every LogPose endpoint, so if you've read the Amazon scraping guide the shape is familiar.
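The fixed two-second sleep in the helper is fine for a few jobs. If you poll many jobs at once, a capped exponential backoff cuts down on status requests. The following is a sketch of that polling loop on its own, with the status call and the sleep function passed in so it can be exercised without network access. The "running" status name is an assumption. Only "completed" and "failed" are named above, and anything else is treated as still pending.

```python
import time
from typing import Callable


def poll_until_done(get_status: Callable[[], dict], timeout_s: float = 120,
                    first_delay_s: float = 1.0, max_delay_s: float = 10.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call get_status() until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    delay = first_delay_s
    while True:
        status = get_status()
        if status.get("status") == "completed":
            return status
        if status.get("status") == "failed":
            raise RuntimeError(status.get("error", "unknown failure"))
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job did not finish in {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # double the wait, up to the cap


# Stand-in for GET /jobs/{id}: two pending responses, then "completed".
responses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
waits = []
final = poll_until_done(lambda: next(responses), sleep=waits.append)
print(final["status"], waits)
```

In real use, get_status would wrap the requests.get call against /jobs/{id} from the helper above.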

The Real Workflow: Daily Diff to Find New Listings

A single run is a snapshot. The value comes from running it on a cron, comparing today's zpids against yesterday's, and shipping only the new ones.

import json, os, pathlib, datetime as dt
from typing import Iterable

STATE_DIR = pathlib.Path(os.environ.get("ZILLOW_STATE_DIR", "./zillow_state"))
STATE_DIR.mkdir(parents=True, exist_ok=True)  # also works when ZILLOW_STATE_DIR is a nested path


def fetch_listings(search_url: str, pages: int = 3) -> list[dict]:
    result = submit_and_wait(
        "realestate/zillow/scape_search",
        {"url": search_url, "pages": pages},
    )
    return result.get("listings", [])


def load_seen(zip_code: str) -> set[str]:
    f = STATE_DIR / f"{zip_code}.json"
    if not f.exists():
        return set()
    return set(json.loads(f.read_text()))


def save_seen(zip_code: str, zpids: Iterable[str]) -> None:
    (STATE_DIR / f"{zip_code}.json").write_text(json.dumps(sorted(zpids)))


def find_new_listings(zip_code: str, search_url: str) -> list[dict]:
    listings = fetch_listings(search_url, pages=3)
    seen = load_seen(zip_code)
    current = {str(l["zpid"]) for l in listings if l.get("zpid")}
    new_zpids = current - seen
    save_seen(zip_code, current)
    return [l for l in listings if str(l.get("zpid")) in new_zpids]


if __name__ == "__main__":
    TARGETS = {
        "78701": "https://www.zillow.com/austin-tx-78701/",
        "33606": "https://www.zillow.com/tampa-fl-33606/",
    }
    today = dt.date.today().isoformat()
    for zip_code, url in TARGETS.items():
        new = find_new_listings(zip_code, url)
        print(f"[{today}] {zip_code}: {len(new)} new listings")
        for l in new[:10]:
            print(f"  ${l['price']:,} — {l['address']} ({l['bedrooms']}bd/{l['bathrooms']}ba)")

Wire that to cron at 7am local time:

0 7 * * * /usr/bin/python3 /opt/zillow-daily/run.py >> /var/log/zillow-daily.log 2>&1

The first run shows every listing as "new" because the state file is empty. Throw that run away. From day two onward, the diff is meaningful: only listings that weren't in yesterday's result get flagged.
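Instead of discarding the first run by hand, the diff can treat a missing state file as a baseline and report nothing. This sketch shows that variant. The function name diff_with_baseline is hypothetical, and it takes the state file path directly so it can be tried against a temporary directory.

```python
import json
import pathlib
import tempfile


def diff_with_baseline(state_file: pathlib.Path, current: set[str]) -> set[str]:
    """Return zpids not seen on the previous run; empty on the first run."""
    first_run = not state_file.exists()
    seen = set() if first_run else set(json.loads(state_file.read_text()))
    state_file.write_text(json.dumps(sorted(current)))
    # With no prior state there is nothing to diff against, so report nothing.
    return set() if first_run else current - seen


with tempfile.TemporaryDirectory() as tmp:
    state = pathlib.Path(tmp) / "78701.json"
    day_one = diff_with_baseline(state, {"111", "222"})
    day_two = diff_with_baseline(state, {"111", "222", "333"})
    print(day_one, day_two)
```

The trade-off is that a deleted or corrupted state file silently produces a quiet day, so it is worth logging when a baseline run happens.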

To push the new listings somewhere useful, swap the print for a Slack webhook, an Airtable insert, or an email:

import os, requests

SLACK_WEBHOOK = os.environ["SLACK_WEBHOOK_URL"]


def notify_slack(zip_code: str, listings: list[dict]) -> None:
    if not listings:
        return
    lines = [f"*{zip_code} — {len(listings)} new listings today*"]
    for l in listings:
        lines.append(
            f"• ${l['price']:,} — <{l['listing_url']}|{l['address']}> "
            f"({l['bedrooms']}bd/{l['bathrooms']}ba, {l['sqft']} sqft)"
        )
    # Raise on a non-2xx response so a failed post shows up in the cron log.
    requests.post(SLACK_WEBHOOK, json={"text": "\n".join(lines)}, timeout=10).raise_for_status()

Run that inside the cron loop and your acquisitions team wakes up to a single Slack message with everything new in their target ZIPs, ranked however you sorted them.
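One ranking that investors often want is price per square foot, cheapest first. This is a small sketch of sorting the new listings before they go into the message. It assumes the price and sqft fields from the sample summary, and the helper names are illustrative.

```python
from typing import Optional


def price_per_sqft(listing: dict) -> Optional[float]:
    """Return price divided by sqft, or None when either value is missing."""
    price, sqft = listing.get("price"), listing.get("sqft")
    if not price or not sqft:
        return None
    return price / sqft


def rank_by_price_per_sqft(listings: list[dict]) -> list[dict]:
    """Cheapest per square foot first; listings missing data go last."""
    def key(listing: dict):
        ppsf = price_per_sqft(listing)
        return (ppsf is None, ppsf or 0.0)
    return sorted(listings, key=key)


demo = [
    {"zpid": "1", "price": 549000, "sqft": 1620},
    {"zpid": "2", "price": 300000, "sqft": 1500},
    {"zpid": "3", "price": 410000, "sqft": None},
]
print([item["zpid"] for item in rank_by_price_per_sqft(demo)])
```

Passing the ranked list to notify_slack keeps the message order meaningful even when the search URL is sorted by "Newest."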

Scaling Beyond a Single ZIP

A handful of ZIPs on a daily schedule runs fine sequentially. Twenty ZIPs run morning and evening starts to matter: at one three-page job per ZIP and two runs a day, that's 40 jobs covering 120 pages, and you don't want each one waiting on the previous to finish.

The bulk endpoint takes a list of targets in one POST and runs them in parallel:

def submit_zillow_bulk(targets: list[dict]) -> str:
    r = requests.post(
        f"{BASE}/realestate/zillow/scape_search/bulk",
        json={"targets": targets},
        headers=HEADERS, timeout=30,
    )
    r.raise_for_status()
    return r.json()["job_id"]


bulk_job = submit_zillow_bulk([
    {"url": "https://www.zillow.com/austin-tx-78701/", "pages": 3},
    {"url": "https://www.zillow.com/austin-tx-78702/", "pages": 3},
    {"url": "https://www.zillow.com/tampa-fl-33606/", "pages": 3},
    {"url": "https://www.zillow.com/tampa-fl-33611/", "pages": 3},
])
# Poll bulk_job the same way; result contains an array of per-target results.

For price-drop alerts on individual properties you've flagged from the daily diff, the monitor endpoint stores history server-side and fires a webhook when the price moves:

requests.post(
    f"{BASE}/monitors",
    headers=HEADERS,
    json={
        "url": "https://www.zillow.com/homedetails/1234-East-6th-St/70905927_zpid/",
        "name": "1234 E 6th — price watch",
        "metric": "price",
        "condition": "drops_below",
        "threshold": 525000,
        "check_interval_hours": 12,
        "notify_channels": ["email"],
    },
).raise_for_status()

Monitors are per-URL, which is why the new-listing flow stays DIY: there's no native "watch this search for new zpids" monitor. The cron-and-diff pattern is the right tool for that half of the problem.

Legality and Ethics

Scraping public Zillow listings is generally treated as accessing public data under US case law. The hiQ Labs v. LinkedIn ruling (9th Cir. 2022) is the relevant precedent. Zillow's Terms of Use prohibit automated access, but that's a contract dispute, not a CFAA matter, and is mostly relevant if you have a Zillow account or hit their infrastructure hard enough to constitute interference.

What's downstream of scraping matters more than scraping itself. Cold-calling sellers off scraped numbers runs straight into TCPA and the National Do-Not-Call Registry — and most Zillow listings show the listing agent's contact, not the seller's. Using listing photos in marketing materials runs into copyright (the photos are owned by the photographer or brokerage, not Zillow and not you). Read listings, build your shortlist, then approach sellers through licensed channels.

Common Mistakes

  • The pages parameter is required for scape_search. Omitting it returns a 422. Start with 3, increase only if you're missing tail listings.
  • The URL must contain zillow. — the endpoint validates the host. If you pass a Redfin or Realtor URL, you'll get a clean rejection.
  • Don't dedupe by address. Use zpid. Addresses get reformatted between runs ("E 6th St" vs "East 6th Street"), zpids are stable.
  • Cloudflare's 100s edge timeout. api.logposervices.com is fronted by Cloudflare. Any request that stays open past 100 seconds will appear to the client as a 524 even though the job is still executing server-side. Always poll the /jobs/{id} endpoint; never rely on a single blocking GET.
  • First-day baseline. Your first cron run will treat every listing as new. Either discard it, or seed the state file with yesterday's manual run before going live.
  • The endpoints are spelled scape_search and scape_property (no r). That's how they're routed; use them exactly.
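Two of the mistakes above can be caught before a request ever leaves your machine. This sketch mirrors the constraints described in this article (a Zillow host, pages between 1 and 10) as a client-side check, and pulls the zpid out of a listing URL so you have a stable dedupe key even if a summary lacks the field. The function names are hypothetical, and the server's own validation may differ in detail.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Matches the numeric ID in URLs shaped like .../70905927_zpid/
ZPID_RE = re.compile(r"/(\d+)_zpid")


def check_search_request(url: str, pages: int) -> None:
    """Raise ValueError for inputs the article says the endpoint rejects."""
    host = urlparse(url).netloc.lower()
    if "zillow." not in host:
        raise ValueError(f"not a Zillow URL: {url!r}")
    if not 1 <= pages <= 10:
        raise ValueError(f"pages must be between 1 and 10, got {pages}")


def zpid_from_url(listing_url: str) -> Optional[str]:
    """Extract the zpid from a homedetails URL, or None if absent."""
    match = ZPID_RE.search(listing_url)
    return match.group(1) if match else None


check_search_request("https://www.zillow.com/austin-tx-78701/", pages=3)  # passes silently
print(zpid_from_url("https://www.zillow.com/homedetails/1234-East-6th-St/70905927_zpid/"))
```

Calling check_search_request at the top of fetch_listings turns a 422 from the API into an immediate local error with a clearer message.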

Get Started

  1. Sign up at logposervices.com and generate an API key from Tool → API Keys.
  2. export LOGPOSE_API_KEY=lp_xxxxxxx
  3. Run the snippet above against your own Zillow ZIP-search URL.

Related reads: Monitoring Zillow listings for real estate deals covers the price-drop side of the same workflow. Extracting real estate agent contacts from Realtor.com covers the agent-prospecting variant on the other major listing site. For an overview of when a managed scraping API saves time over rolling your own, see the web scraping API guide.

Frequently asked questions

Does Zillow have an API for new listings by ZIP code?
No public one that gives you new-listing alerts by ZIP. The old Zillow Web Services API (ZWSAPI) was retired years ago, and the Bridge Interactive feed requires MLS membership and broker sponsorship. For non-licensed users — investors, wholesalers, analysts — the practical path is scraping the public search page filtered to your ZIP, then diffing daily results to detect what's new. Zillow's own email alerts exist for end-users but cannot be piped to Slack, a database, or your CRM.
What does 'new listing' actually mean on Zillow?
Zillow flags a listing as 'new' for the first few days after it appears in the MLS feed, and the search page can be sorted by 'newest' to surface those. In practice 'new to me' is more useful than 'new to Zillow' — what you want is any listing in your ZIP that wasn't on yesterday's run. Diffing by zpid (Zillow's stable property ID, visible in every listing URL) catches both brand-new and previously-off-market relistings, which is usually what an investor wants to see.
How often can I scrape Zillow without getting blocked?
Zillow uses Cloudflare plus its own behavioural detection. From a clean residential IP, paced 3-5 seconds between requests, you can typically pull 30-100 pages before friction. From a datacenter IP, expect a Press & Hold or Cloudflare challenge within the first few requests. For a once-a-day pull across a handful of ZIPs, even a single residential IP is usually fine. For hourly checks across dozens of ZIPs, you want a rotating residential pool.
Can I just use Zillow's email alerts and skip all this?
If you only care about being notified, yes — Zillow's saved-search alerts are free. They fail the moment you want the data anywhere else: in a Slack channel your team watches, in an Airtable for triage, in a Postgres for analysis over time, or filtered by criteria Zillow doesn't expose in its UI (price-per-sqft below X, or specific listing-agent firms). Scraping gives you structured data; email alerts give you HTML in a person's inbox.
What's the legal status of scraping Zillow listings?
Scraping public listing pages is generally treated as accessing public data under US case law (hiQ Labs v. LinkedIn, 9th Cir. 2022). Zillow's Terms of Use prohibit automated access, which is a contract matter — relevant if you have an account or do anything that affects their service. The bigger downstream issue isn't scraping; it's what you do with the data. Cold-calling sellers off scraped listings runs into TCPA and Do-Not-Call rules, and using listing photos commercially can run into copyright. Read public listings, do not call numbers on the DNC list, and you're in normal territory.
