
The Real-Estate Wholesaler's 9 AM Routine: New Zillow Listings in Your ZIPs Before the Coffee's Cold


If you wholesale houses, your edge is not capital and it is not a bigger marketing budget — it is being the first credible call a motivated seller gets. On-market deals move fast: a mispriced or "as-is" listing that hits Zillow at 7 AM can have three offers by lunch, and the wholesaler who saw it first and dialed first is the one who locks it up. The raw material for that edge is narrow and specific: the day's net-new listings in your target ZIPs, filtered to your buy-box, in your hand before the coffee's cold.

This guide is the full morning-routine pipeline. We will cover why a one-time snapshot of a Zillow search is not enough and why you need a daily diff, how to encode your buy-box (ZIP, price band, beds/baths) directly into a Zillow search URL, how to fire a search-scrape job and poll it, how to diff today's listings against yesterday's by Zillow's stable zpid so you only ever see what's new, and how to push that short list to your phone or a CSV before 9 AM. The example is a $120k–$250k, 3-bed buy-box across a handful of ZIPs, but the same code covers any market by swapping the ZIPs and the price band.

Why a Snapshot Isn't Enough — You Need a Daily Diff

Every Zillow search is encoded entirely in its URL. When you set filters in the Zillow UI — a ZIP, a price range, a minimum bed count, keywords — Zillow packs all of it into the query string, so the URL is the saved search. That is the property the whole pipeline leans on: if you can build the URL, you can reproduce the exact filtered search on demand, every morning, without ever touching the Zillow UI again.

But scraping that search once only ever gives you a snapshot — every active listing that matches your buy-box right now. The first time you run it that is useful; you get your standing inventory. Every run after that, the snapshot is mostly noise: the same listings you saw yesterday, plus a handful of new ones buried in the middle. For a wholesaler the only rows that matter are the net-new ones — the listings that appeared since your last pull. Those are the ones nobody else has called yet.

So the pipeline is a diff problem, not a scrape problem. The scrape is the easy part. The value is in computing the difference between today's result set and yesterday's, and surfacing only the additions. The clean way to do that is Zillow's own zpid — a stable numeric identifier unique to each property. Deduping and diffing on zpid is far more reliable than diffing on address strings (which vary in formatting) or price (which changes when a seller cuts), because the zpid stays constant for the life of the listing.
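To make that concrete, here is a minimal sketch using made-up rows (the zpids, addresses, and prices are invented for illustration, and `new_by` is a helper name introduced here). It diffs the same two days once by zpid and once by address string, with one property whose address was reformatted and whose price was cut overnight.

```python
# Illustrative rows only: zpids, addresses, and prices are invented.
yesterday = [
    {"zpid": "1001", "address": "123 Main St, Austin, TX 78704", "price": 249000},
    {"zpid": "1002", "address": "9 Oak Ave, Austin, TX 78704", "price": 199000},
]
today = [
    # Same property as 1001: address reformatted, price cut overnight.
    {"zpid": "1001", "address": "123 Main Street, Austin TX 78704", "price": 239000},
    {"zpid": "1002", "address": "9 Oak Ave, Austin, TX 78704", "price": 199000},
    {"zpid": "1003", "address": "41 Elm Dr, Austin, TX 78704", "price": 175000},
]


def new_by(key, old_rows, new_rows):
    """Return rows in new_rows whose `key` value is absent from old_rows."""
    old_keys = {r[key] for r in old_rows}
    return [r for r in new_rows if r[key] not in old_keys]


by_zpid = new_by("zpid", yesterday, today)
by_address = new_by("address", yesterday, today)

print([r["zpid"] for r in by_zpid])     # ['1003']
print([r["zpid"] for r in by_address])  # ['1001', '1003']
```

Keyed on address, the reformatted listing shows up as a false "new" row; keyed on zpid, only the genuinely new property does.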

The routine, then, is: build one search URL per buy-box, pull it every morning, diff against the stored zpid set from yesterday, and push only the new rows to your phone.

Step 1: Encode Your Buy-Box in a Zillow Search URL

Open Zillow, type one of your target ZIPs into the search bar, and apply your filters in the UI: the price band, minimum beds and baths, and any keyword that signals motivation — "as-is", "motivated", "price cut", "investor", "TLC". Zillow writes every one of those filters into the URL. Copy that URL out of the address bar; it is your saved search, reproducible forever.

A Zillow search URL looks like this — the ZIP and filters live in the path and the query string:

https://www.zillow.com/austin-tx-78704/houses/?searchQueryState=...

You will usually run several of these, one per ZIP (or per ZIP-plus-criteria combination), because a wholesaler's buy-box is rarely a single ZIP. Keep them in a small dict keyed by a label you will recognize in the morning; a short Python dict is all the "config" this pipeline needs:

# One Zillow search URL per target ZIP, already filtered to your buy-box
# (price band, min beds/baths, and motivation keywords applied in the UI).
SEARCHES = {
    "78704 (3bd, 120-250k)": "https://www.zillow.com/austin-tx-78704/houses/?searchQueryState=...",
    "78745 (3bd, 120-250k)": "https://www.zillow.com/austin-tx-78745/houses/?searchQueryState=...",
    "78744 (3bd, 120-250k)": "https://www.zillow.com/austin-tx-78744/houses/?searchQueryState=...",
}

Two practical notes. First, apply the price and bed filters in the URL rather than scraping wide and filtering in Python — narrowing at the source means fewer rows to pull and a tighter diff. Second, keyword filters like "as-is" are a coarse signal, not a guarantee; treat them as a way to rank the day's new listings, not as a hard gate, because plenty of real deals never say the magic words.
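If you want to sanity-check what a copied URL actually encodes before scheduling it, a small inspection helper can help. This is only a sketch: it assumes the `searchQueryState` parameter holds URL-encoded JSON, which this guide does not confirm, and it falls back to the raw string when that assumption fails. The `describe_search` name and the sample filter payload are invented for the example.

```python
import json
from urllib.parse import urlsplit, parse_qs, urlencode


def describe_search(url):
    """Split a search URL into host, path, and its decoded query-state value.

    Assumption: the searchQueryState parameter holds URL-encoded JSON.
    If it does not parse as JSON, the raw string is returned instead.
    """
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    raw = params.get("searchQueryState", [""])[0]
    try:
        state = json.loads(raw)
    except ValueError:
        state = raw
    return {"host": parts.netloc, "path": parts.path, "state": state}


# Hypothetical filter payload, invented for this example; copy the real
# URL from your browser rather than building one by hand.
sample_state = {"price": {"min": 120000, "max": 250000}, "beds": {"min": 3}}
sample_url = (
    "https://www.zillow.com/austin-tx-78704/houses/?"
    + urlencode({"searchQueryState": json.dumps(sample_state)})
)

info = describe_search(sample_url)
print(info["path"])   # /austin-tx-78704/houses/
print(info["state"])
```

Running this over each entry in your config is a cheap way to catch a URL that was copied before the filters were applied.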

Step 2: Fire a Search-Scrape Job and Poll It

The Zillow search endpoint takes one search URL and a page count, scrapes the matching listings, and returns the structured rows — each with zpid, address, price, beds, baths, status, and days on Zillow. Every call is asynchronous: you submit, get the job id back immediately, then poll.

Confirm one search works with curl before you wire up the loop:

# 1) Submit one search — returns a job id immediately
curl -G "https://api.logposervices.com/api/v1/realestate/zillow/scape_search" \
  -H "X-API-Key: lp_xxxxxxx" \
  --data-urlencode "url=https://www.zillow.com/austin-tx-78704/houses/?searchQueryState=..." \
  --data-urlencode "pages=3"
# → {"job_id": "zl_7c2b...", "status": "pending"}

# 2) Poll the job until status == "completed"
curl -H "X-API-Key: lp_xxxxxxx" \
  https://api.logposervices.com/api/v1/jobs/zl_7c2b

# 3) Fetch the listing rows
curl -H "X-API-Key: lp_xxxxxxx" \
  https://api.logposervices.com/api/v1/jobs/zl_7c2b/result

The async pattern is not optional here. api.logposervices.com sits behind Cloudflare, which kills any single connection at roughly 90 seconds. A multi-page Zillow search can run longer than that, so you must never wait on one inline request — submit the job, let it run server-side, and poll the job id for the result.

pages=3 is plenty for a single ZIP buy-box: a tight price band rarely has more than a few dozen active listings, and the net-new count on any given morning is usually a handful. You are not trying to deep-page the whole market; you are pulling one focused search and diffing it. More pages only matter if a ZIP plus a wide price band genuinely carries hundreds of active listings.

Step 3: Submit Every Search and Collect the Rows

For the morning routine you have a small set of searches — one per ZIP — so the right pattern is fire-all-then-poll: submit every search up front (each returns instantly with a job id), then poll the outstanding job ids until they all finish. The searches run concurrently server-side instead of one-at-a-time.

import os, time, requests

API_KEY = os.environ["LOGPOSE_API_KEY"]
BASE = "https://api.logposervices.com/api/v1"
HEADERS = {"X-API-Key": API_KEY}


def submit(url, pages=3):
    r = requests.get(
        f"{BASE}/realestate/zillow/scape_search",
        params={"url": url, "pages": pages},
        headers=HEADERS, timeout=30,
    )
    r.raise_for_status()
    return r.json()["job_id"]


def collect(job_ids, poll_every=5, timeout_s=600):
    """Poll a batch of job ids; return the merged list of listing rows."""
    pending = set(job_ids)
    rows, deadline = [], time.time() + timeout_s
    while pending and time.time() < deadline:
        for jid in list(pending):
            s = requests.get(f"{BASE}/jobs/{jid}", headers=HEADERS, timeout=15).json()
            status = s.get("status")
            if status == "completed":
                res = requests.get(f"{BASE}/jobs/{jid}/result",
                                   headers=HEADERS, timeout=30).json()
                rows.extend(res.get("listings", []))
                pending.discard(jid)
            elif status == "failed":
                print(f"  search job {jid} failed: {s.get('error')}")
                pending.discard(jid)
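        if pending:
            time.sleep(poll_every)
    if pending:
        # Deadline hit: these searches are missing from today's rows.
        print(f"  gave up on {len(pending)} job(s) after {timeout_s}s: {sorted(pending)}")
    return rows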


# Submit every buy-box search, then poll them all
job_ids = [submit(url, pages=3) for url in SEARCHES.values()]
print(f"submitted {len(job_ids)} searches")
today_rows = collect(job_ids)
print(f"collected {len(today_rows)} active listings across all ZIPs")

Submitting first and polling second is what keeps the whole routine to a couple of minutes of wall-clock time even across several ZIPs — the jobs run in parallel on the server up to your account's concurrency cap, and your script just watches the queue drain.
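One thing the loop above gives up is the label: it submits `SEARCHES.values()`, so the merged rows no longer say which buy-box they came from. If you want that on the call sheet, one option is to remember which job id belongs to which label. A minimal sketch follows; `submit_all`, `tag_rows`, and the `buy_box` field are names made up here, the submit function is passed in so the example runs without network access, and it assumes you adapt `collect` to keep rows grouped by job id.

```python
def submit_all(searches, submit_fn):
    """Submit every search; return {job_id: label} so results stay traceable."""
    return {submit_fn(url): label for label, url in searches.items()}


def tag_rows(rows_by_job, label_by_job):
    """Flatten {job_id: [rows]} into one list, adding a 'buy_box' field."""
    tagged = []
    for job_id, rows in rows_by_job.items():
        for row in rows:
            tagged.append({**row, "buy_box": label_by_job.get(job_id, "")})
    return tagged


# Stand-in for the real submit(): hands out fake job ids, no HTTP involved.
fake_ids = iter(["job-a", "job-b"])
searches = {"78704": "https://example.com/a", "78745": "https://example.com/b"}
label_by_job = submit_all(searches, lambda url: next(fake_ids))

rows_by_job = {"job-a": [{"zpid": "1"}], "job-b": [{"zpid": "2"}, {"zpid": "3"}]}
tagged = tag_rows(rows_by_job, label_by_job)
print(tagged[0])  # {'zpid': '1', 'buy_box': '78704'}
```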

Step 4: Diff Against Yesterday by zpid

This is the step that turns a snapshot into a morning call list. Load the set of zpid values you saw yesterday, compare it against today's pull, and keep only the listings whose zpid is new. Store today's full zpid set back to disk so tomorrow's run has something to diff against.

import json, pathlib

STATE = pathlib.Path("seen_zpids.json")


def load_seen():
    if STATE.exists():
        return set(json.loads(STATE.read_text()))
    return set()


def save_seen(zpids):
    STATE.write_text(json.dumps(sorted(zpids)))


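def net_new(today_rows, seen):
    """Return only the listings whose zpid we have not seen before."""
    fresh, emitted = [], set()
    for r in today_rows:
        zpid = str(r.get("zpid") or "")
        # Skip rows seen on a previous day, and duplicates within today's
        # pull (the same property can match more than one search).
        if zpid and zpid not in seen and zpid not in emitted:
            emitted.add(zpid)
            fresh.append(r)
    return fresh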


seen = load_seen()
new_listings = net_new(today_rows, seen)

# Fold today's full set back into state for tomorrow's diff
all_today = {str(r.get("zpid")) for r in today_rows if r.get("zpid")}
save_seen(seen | all_today)

print(f"{len(new_listings)} net-new listings since yesterday")
# e.g. ~80 active across all ZIPs -> 4 net-new this morning

One subtlety worth getting right: fold today's entire active set into the stored state with a union, rather than overwriting the file with only the new rows. If you overwrote the state with just the net-new zpids, every listing that dropped out of the file would re-appear as "new" on the next run. Merging the full daily set into the state means a zpid is marked seen the first time it shows up and never falsely re-surfaces. The first morning you run this, every active listing counts as "new"; that is the one-time baseline. From day two on, the diff is small and real.
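A quick simulation over three mornings shows the difference. This is a sketch with invented zpids and helper names (`run_morning`, `run_morning_overwrite`); it works on in-memory sets rather than the state file, and contrasts accumulating the state with overwriting it.

```python
def run_morning(today_zpids, seen):
    """Return (net_new, updated_seen), accumulating everything ever seen."""
    fresh = today_zpids - seen
    return fresh, seen | today_zpids


def run_morning_overwrite(today_zpids, seen):
    """Buggy variant: the state keeps only the newest zpids."""
    fresh = today_zpids - seen
    return fresh, fresh


mornings = [
    {"1001", "1002", "1003"},          # day 1: baseline, everything is "new"
    {"1001", "1002", "1003", "1004"},  # day 2: one listing added overnight
    {"1001", "1003", "1004", "1005"},  # day 3: 1002 went off-market, 1005 added
]

good, bad = [], []
seen_good, seen_bad = set(), set()
for today in mornings:
    fresh, seen_good = run_morning(today, seen_good)
    good.append(sorted(fresh))
    fresh, seen_bad = run_morning_overwrite(today, seen_bad)
    bad.append(sorted(fresh))

print(good)  # [['1001', '1002', '1003'], ['1004'], ['1005']]
print(bad)   # [['1001', '1002', '1003'], ['1004'], ['1001', '1003', '1005']]
```

With the overwrite variant, day three re-surfaces two listings you already saw on day one.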

Step 5: Rank and Push the Morning List

The net-new rows are already short, but a wholesaler still wants them ordered by where the deal is most likely to be. Two cheap signals do most of the work: a brand-new listing (low days on Zillow) that you want to call today before anyone else, and motivation keywords in the listing description, when the row includes one. Sort on those, then push the list somewhere you will actually see it at 8:45 AM: a CSV for your call sheet, or a message straight to your phone.

import csv

MOTIVATION = ("as-is", "as is", "motivated", "price cut",
              "investor", "tlc", "handyman", "cash only")


def deal_score(row):
    text = (row.get("description") or "").lower()
    kw_hits = sum(1 for k in MOTIVATION if k in text)
    days = row.get("days_on_zillow")
    freshness = 1 if (days is not None and days <= 1) else 0
    return (kw_hits, freshness)  # sort key: keywords first, then freshness


def write_call_sheet(rows, out_path):
    rows = sorted(rows, key=deal_score, reverse=True)
    fields = ["address", "price", "beds", "baths", "status",
              "days_on_zillow", "zpid", "url"]
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        w.writeheader()
        for r in rows:
            w.writerow({
                "address": r.get("address", ""),
                "price": r.get("price", ""),
                "beds": r.get("beds", ""),
                "baths": r.get("baths", ""),
                "status": r.get("status", ""),
                "days_on_zillow": r.get("days_on_zillow", ""),
                "zpid": r.get("zpid", ""),
                "url": r.get("url", ""),
            })
    return len(rows)


n = write_call_sheet(new_listings, "morning_call_sheet.csv")
print(f"wrote {n} net-new listings, ranked, to morning_call_sheet.csv")

Schedule this whole script with a plain cron entry around 8 AM and your call sheet is waiting before you sit down. The ranking is deliberately simple: keyword hits surface the listings that announced a motivated seller, and the freshness flag surfaces the ones that just went live — exactly the two reasons you would jump a listing to the top of your dial list.
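For the "message straight to your phone" option, one possibility is a Telegram bot. The sketch below is an illustration rather than part of the pipeline above: `format_message` and `send_telegram` are names introduced here, the message layout is arbitrary, and you would need your own bot token and chat id. It uses the Telegram Bot API's `sendMessage` method and only the standard library.

```python
import json
import urllib.parse
import urllib.request


def format_message(rows, limit=10):
    """Build a short plain-text summary of the top rows for a phone alert."""
    if not rows:
        return "No net-new listings this morning."
    lines = [f"{len(rows)} net-new listing(s):"]
    for r in rows[:limit]:
        lines.append(f"{r.get('address', '?')} | {r.get('price', '?')} | {r.get('url', '')}")
    if len(rows) > limit:
        lines.append(f"...and {len(rows) - limit} more in the CSV")
    return "\n".join(lines)


def send_telegram(text, bot_token, chat_id):
    """POST the text to a Telegram chat via the Bot API sendMessage method."""
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    req = urllib.request.Request(
        f"https://api.telegram.org/bot{bot_token}/sendMessage", data=data)
    with urllib.request.urlopen(req, timeout=15) as resp:
        return json.load(resp)


# Invented sample row; in the real script you would pass the ranked rows.
sample = [{"address": "41 Elm Dr", "price": 175000, "url": "https://example.com/1003"}]
msg = format_message(sample)
print(msg)
# send_telegram(msg, bot_token, chat_id)  # needs your own bot token and chat id
```

Keeping the formatting separate from the send call makes it easy to swap Telegram for email or a webhook later.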

Scaling This Into a Standing Pipeline

The script above works, but it leaves you owning two pieces of infrastructure: the cron that fires it every morning, and the seen_zpids.json state file that has to survive between runs. For a single market on your own laptop that is fine. The moment you are covering several markets, or you want the alert to reach you even when your laptop is asleep, hosting your own scheduler-plus-state-store stops being worth it.

LogPose exposes a monitor primitive that collapses both pieces. Instead of scraping a search on a cron and diffing it yourself, you register the saved Zillow search once and let the monitor poll it on a schedule, firing an alert the moment new listings appear — no cron, no state file, no diff code of your own:

curl -X POST "https://api.logposervices.com/api/v1/monitors" \
  -H "X-API-Key: lp_xxxxxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.zillow.com/austin-tx-78704/houses/?searchQueryState=...",
    "name": "78704 buy-box — net-new listings",
    "metric": "new_listings",
    "condition": "increase",
    "threshold": 1,
    "check_interval_hours": 24,
    "notify_channels": ["telegram"]
  }'

Set check_interval_hours to match your routine — a daily check that lands before your morning calls. The notify_channels list takes telegram, email, webhook, slack, or discord, so the new listings can land in the same place you already watch: a Telegram chat on your phone, an email, or a webhook into your CRM. Run one monitor per ZIP buy-box and the morning routine stops being a script you maintain and becomes a notification you act on.
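If you keep your searches in a dict as in Step 1, building one monitor body per buy-box is a short loop. The sketch below only constructs the JSON payloads, mirroring the fields in the curl example above; `monitor_payload` is a name introduced here, and the response shape of the monitors endpoint is not assumed.

```python
import json


def monitor_payload(label, url, channels=("telegram",), interval_hours=24):
    """Build the JSON body for one monitor, mirroring the curl example."""
    return {
        "url": url,
        "name": f"{label} net-new listings",
        "metric": "new_listings",
        "condition": "increase",
        "threshold": 1,
        "check_interval_hours": interval_hours,
        "notify_channels": list(channels),
    }


searches = {
    "78704 (3bd, 120-250k)": "https://www.zillow.com/austin-tx-78704/houses/?searchQueryState=...",
    "78745 (3bd, 120-250k)": "https://www.zillow.com/austin-tx-78745/houses/?searchQueryState=...",
}
payloads = [monitor_payload(label, url) for label, url in searches.items()]
print(json.dumps(payloads[0], indent=2))
# Each payload would then be POSTed to the /monitors endpoint with the
# X-API-Key header, exactly as in the curl call above.
```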

The Honest Fit

This pipeline fits well when your deals come from the on-market side: wholesalers who want first contact on fresh MLS-fed listings, investors watching specific ZIPs for price cuts, and agents doing lead-gen against active inventory. The daily zpid diff and the buy-box-in-the-URL approach are the two primitives that make "what's new in my ZIPs since yesterday" reliable instead of a manual re-scan.

Where it is not the right tool: Zillow shows you what is on the market, so this is not a source of off-market deals. Probate leads, tax-lien and pre-foreclosure lists, absentee-owner and high-equity targeting, and skip-traced owner phone numbers do not live on Zillow — they come from county records and dedicated investor-data vendors like PropStream. If your model is driving for dollars or mailing absentee owners, this pipeline complements that work but does not replace it; it is the on-market half of a wholesaler's pipeline, run on autopilot so you never miss the listing that hit at 7 AM.

Get Started

  1. Sign up at logposervices.com and generate an API key under Tool → API Keys.
  2. export LOGPOSE_API_KEY=lp_xxxxxxx
  3. Build one filtered Zillow search URL per ZIP, then test a single search:
curl -G "https://api.logposervices.com/api/v1/realestate/zillow/scape_search" \
  -H "X-API-Key: lp_xxxxxxx" \
  --data-urlencode "url=https://www.zillow.com/austin-tx-78704/houses/?searchQueryState=..." \
  --data-urlencode "pages=3"

Then run the submit / collect / net_new functions over your ZIP list, write the ranked call sheet, and — once it's working — register a monitor per buy-box so the net-new listings come to you every morning instead of you going to get them.

Related reading: How to scrape new Zillow listings in a ZIP code every day for the daily-diff fundamentals, Working with Zillow real-estate data for the full field set, and PropStream alternatives for Zillow real-estate leads for where on-market and off-market data fit together.

External: Zillow, hiQ Labs v. LinkedIn.

Frequently asked questions

Is it legal to pull public Zillow listing data for a deal pipeline?
The listing fields you act on (address, list price, beds, baths, status, and days on Zillow) are public data displayed without a login to anyone who opens the search page. Scraping public web data is generally not treated as a CFAA violation in the United States: in hiQ Labs v. LinkedIn (9th Cir. 2022), the Ninth Circuit concluded that accessing publicly available information likely does not constitute access "without authorization". What you are doing here is reading the same on-market listings a buyer sees and computing a daily diff for your own internal call list, not republishing a competing listings product or touching anything behind authentication. The genuinely regulated step is downstream: when you contact a seller or agent, cold-call and cold-text rules (TCPA in the US) govern how you reach out, not how you assembled the list, and that is where the real compliance work lives.
Why diff every day instead of just scraping the search once a week?
Wholesaling is a speed game — the first credible offer in front of a motivated seller usually wins, and on-market deals get stale within hours of going live. A weekly snapshot tells you what is on the market, but it buries the only rows that matter (the ones that appeared since you last looked) inside hundreds you already saw. A daily diff against yesterday's set of Zillow zpids isolates exactly the net-new listings, so your morning call list is short, fresh, and entirely actionable instead of a re-read of stale inventory. The diff is the whole point: a snapshot tells you the market; the diff tells you what changed overnight while you were asleep.
