
How to Extract Real Estate Agent Contacts from Realtor.com

· 9 min read

If you sell to real estate agents — mortgage brokers, transaction-coordination services, photography vendors, CRM tools, lead-gen platforms — your highest-intent prospect is an agent who is actively listing properties right now. Realtor.com publishes that exact list on every listing page: the listing agent's name, their brokerage, and frequently a public phone number. This guide shows how to turn a city-wide Realtor.com search into a deduped CSV of active agents, with working Python. It runs as written.

Why Public Listing Data Is Enough for Agent Prospecting

You don't need a NAR membership or an MLS feed to build an active-agent list. Three use cases drive most of this work and all of them run on public data:

Vendor prospecting. You sell something agents buy — sign installation, listing photography, transaction coordination, drone shots, virtual staging, CRM, lead-gen. You want the names of every agent who listed a property in your service area in the last 30 days. That's a publicly observable behaviour.

Lateral recruiting. A brokerage trying to recruit agents in a new market wants a sorted list of who's listing in that market and where they're currently hanging their license. Brokerage name on the listing card answers that directly.

Mortgage / title / inspection cross-sell. You want to introduce yourself to listing agents in your service area because their next listing is going to need a buyer, and their buyer is going to need a loan. Again — listing card data, in public.

What public listings won't give you: the agent's personal email (it's not on the card), their cell unless they chose to publish it, their NMLS or license number, or their MLS-internal preferences. Realtor.com's listing card is a marketing surface — agents publish what they want prospects to see. That's the dataset you're building from.

What Realtor.com Returns

For a city search, the LogPose search endpoint returns a listings array. Each listing carries the property data plus a per-listing agent block:

{
  "listing_url": "https://www.realtor.com/realestateandhomes-detail/...M1234567890",
  "address": "5012 W Bay Ave, Tampa, FL 33611",
  "price": 685000,
  "beds": 3,
  "baths": 2,
  "sqft": 1840,
  "lot_size": "7,250 sqft",
  "property_type": "Single Family Home",
  "days_on_market": 12,
  "mls_id": "T3501234",
  "listing_agent_name": "Sarah Mitchell",
  "listing_agent_phone": "(813) 555-0142",
  "brokerage_name": "Smith & Associates Realty",
  "photo_urls": ["https://ap.rdcpix.com/..."]
}

listing_agent_phone is present when Realtor.com renders it on the public card — which is most active listings, but not all. The property-details endpoint returns a superset of these fields plus the full price history, school zone, neighborhood data, and description text.

The Manual Flow

Before automating, do one search by hand so you know what you're feeding the API:

  1. Open realtor.com, search the city (Tampa, FL).
  2. Adjust filters if you want — price range, property type, beds.
  3. Copy the URL from the address bar: https://www.realtor.com/realestateandhomes-search/Tampa_FL.

That URL is your input. Realtor.com search URLs are stable and human-readable, which makes them pleasant to template across multiple cities.
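
As a rough sketch of that templating, the helper below builds a search URL from a city and state. It assumes the City_ST path shape seen in the Tampa example and, as a further assumption not confirmed anywhere above, that multi-word city names are joined with hyphens. The name realtor_search_url is illustrative; check one generated URL by hand in the browser before relying on it.

```python
SEARCH_BASE = "https://www.realtor.com/realestateandhomes-search"


def realtor_search_url(city: str, state: str) -> str:
    """Build a city-level search URL in the City_ST shape shown above.

    Assumption: spaces inside a city name become hyphens
    ("St Petersburg" -> "St-Petersburg"). Verify this for your own cities.
    """
    city_slug = "-".join(city.split())  # also collapses stray whitespace
    return f"{SEARCH_BASE}/{city_slug}_{state.strip().upper()}"


urls = [realtor_search_url(c, s) for c, s in [("Tampa", "FL"), ("Orlando", "fl")]]
```

If you applied filters in the browser, copy the full URL from the address bar instead; this helper only covers the unfiltered city search.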

The API Flow With Working Python

A small helper handles submit-and-poll for both the search and the property-details endpoint.

import os
import time

import requests

API_KEY = os.environ["LOGPOSE_API_KEY"]
BASE = "https://api.logposervices.com/api/v1"
HEADERS = {"X-API-Key": API_KEY}


def submit_and_wait(path: str, params: dict, timeout_s: int = 120) -> dict:
    """Submit a job, poll until it finishes, and return the parsed result."""
    r = requests.get(f"{BASE}/{path}", params=params, headers=HEADERS, timeout=30)
    r.raise_for_status()
    job_id = r.json()["job_id"]
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        poll = requests.get(f"{BASE}/jobs/{job_id}", headers=HEADERS, timeout=15)
        poll.raise_for_status()  # surface HTTP errors instead of a KeyError below
        s = poll.json()
        if s["status"] == "completed":
            break
        if s["status"] == "failed":
            raise RuntimeError(s.get("error", "unknown failure"))
        time.sleep(2)
    else:
        # The while condition went false without a break: the deadline passed.
        raise TimeoutError(f"job {job_id} did not finish in {timeout_s}s")
    res = requests.get(f"{BASE}/jobs/{job_id}/result", headers=HEADERS, timeout=15)
    res.raise_for_status()
    return res.json()


if __name__ == "__main__":
    SEARCH_URL = "https://www.realtor.com/realestateandhomes-search/Tampa_FL"
    result = submit_and_wait(
        "realestate/realtor/search",
        {"search_url": SEARCH_URL, "max_pages": 5},
    )
    print(f"Pulled {len(result['listings'])} listings")

The endpoints are asynchronous: the GET returns a job_id, you poll /jobs/{id} until status is completed, then fetch the result. Same shape as every other LogPose endpoint, so if you've worked through the Amazon scraping guide the pattern is familiar.

The max_pages parameter is optional and caps how many pages of search results to walk. Five pages on a mid-sized city is roughly 200 listings; ten pages is roughly 400. The right ceiling depends on how deep you want to go before diminishing returns kick in — the back pages tend to be stale listings that have been on market 90+ days.

The Real Workflow: Search → Details → Dedupe → CSV

The search endpoint already gives you listing_agent_name, listing_agent_phone, and brokerage_name for most listings, which is enough for a first-pass list. The property-details endpoint fills in anything the search summary truncated, which helps when agent records come back without a phone in the search results but have one on the detail page.

A practical pipeline looks like this:

import csv

def collect_agents(search_url: str, max_pages: int = 5) -> list[dict]:
    result = submit_and_wait(
        "realestate/realtor/search",
        {"search_url": search_url, "max_pages": max_pages},
    )
    agents = []
    for listing in result.get("listings", []):
        name = (listing.get("listing_agent_name") or "").strip()
        if not name:
            continue
        agents.append({
            "agent_name": name,
            "agent_phone": (listing.get("listing_agent_phone") or "").strip(),
            "brokerage": (listing.get("brokerage_name") or "").strip(),
            "listing_url": listing.get("listing_url", ""),
            "listing_price": listing.get("price"),
            "listing_address": listing.get("address", ""),
        })
    return agents


def dedupe_agents(rows: list[dict]) -> list[dict]:
    """One row per (name, brokerage). Keep a listing count and the first listing seen as a sample."""
    by_key: dict[tuple[str, str], dict] = {}
    for r in rows:
        key = (r["agent_name"].lower(), r["brokerage"].lower())
        if key not in by_key:
            by_key[key] = {
                "agent_name": r["agent_name"],
                "agent_phone": r["agent_phone"],
                "brokerage": r["brokerage"],
                "listing_count": 0,
                "sample_listing_url": r["listing_url"],
                "sample_listing_address": r["listing_address"],
            }
        by_key[key]["listing_count"] += 1
        # Prefer a non-empty phone if we've seen one
        if not by_key[key]["agent_phone"] and r["agent_phone"]:
            by_key[key]["agent_phone"] = r["agent_phone"]
    return sorted(by_key.values(), key=lambda x: x["listing_count"], reverse=True)


def write_csv(rows: list[dict], path: str) -> None:
    if not rows:
        # An empty search would make rows[0] raise IndexError; write nothing instead.
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    raw = collect_agents(
        "https://www.realtor.com/realestateandhomes-search/Tampa_FL",
        max_pages=8,
    )
    unique = dedupe_agents(raw)
    write_csv(unique, "tampa-agents.csv")
    print(f"Wrote {len(unique)} unique agents from {len(raw)} listings")
    print("\nTop 10 by listing count:")
    for row in unique[:10]:
        print(f"  {row['listing_count']:3d}  {row['agent_name']}  ({row['brokerage']})")

Two things this dedupe does that you'll want even if you change everything else: it keys on (name, brokerage) not just name (two "John Smith" agents at different brokerages are different people), and it counts listings per agent. The listing count is your single best signal of how active an agent is — sort descending and the top of the file is the people you actually want to talk to.

Enriching With Property Details

The search summary is usually enough. If you find a meaningful number of rows have missing phones, you can enrich by hitting the property-details endpoint on each listing URL:

def enrich_with_details(listing_url: str) -> dict:
    result = submit_and_wait(
        "realestate/realtor/property-details",
        {"property_url": listing_url},
    )
    return {
        "listing_agent_name": result.get("listing_agent_name"),
        "listing_agent_phone": result.get("listing_agent_phone"),
        "brokerage_name": result.get("brokerage_name"),
    }

Run that only on rows where the search-level phone was empty. Walking property-details for every listing in a major-city search is expensive — both in time and in proxy traffic — and rarely changes more than 10-15% of records.
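
One way to apply that rule is sketched below. The helper name fill_missing_phones is hypothetical; it assumes rows shaped like the dedupe output (agent_phone, sample_listing_url) and takes the detail fetcher as an argument, so enrich_with_details can be passed in directly. A stub stands in for the API here so the example runs offline.

```python
from typing import Callable


def fill_missing_phones(rows: list[dict], fetch_details: Callable[[str], dict]) -> int:
    """Fill empty agent_phone values in place; return how many rows were filled.

    fetch_details maps a listing URL to a dict with a listing_agent_phone key,
    for example the enrich_with_details function above.
    """
    filled = 0
    for row in rows:
        if row.get("agent_phone") or not row.get("sample_listing_url"):
            continue  # already has a phone, or there is no URL to look up
        details = fetch_details(row["sample_listing_url"])
        phone = (details.get("listing_agent_phone") or "").strip()
        if phone:
            row["agent_phone"] = phone
            filled += 1
    return filled


def _stub_details(url: str) -> dict:
    # Stand-in for enrich_with_details; only the URL ending in /a has a phone.
    return {"listing_agent_phone": "(813) 555-0142"} if url.endswith("/a") else {}


demo = [
    {"agent_name": "A", "agent_phone": "", "sample_listing_url": "https://example.com/a"},
    {"agent_name": "B", "agent_phone": "(813) 555-0199", "sample_listing_url": "https://example.com/b"},
    {"agent_name": "C", "agent_phone": "", "sample_listing_url": "https://example.com/c"},
]
filled = fill_missing_phones(demo, _stub_details)
```

Only rows A and C trigger a detail lookup; row B is skipped because it already has a phone.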

Scaling Beyond a Single City

Once you have a city working end-to-end, the only thing in your way is throughput. Sequential GETs across 30 US cities add up to hours of wall time. The bulk endpoint takes a list of targets in one POST:

def submit_realtor_bulk(targets: list[dict]) -> str:
    r = requests.post(
        f"{BASE}/realestate/realtor/search/bulk",
        json={"targets": targets},
        headers=HEADERS, timeout=30,
    )
    r.raise_for_status()
    return r.json()["job_id"]


CITIES = [
    "https://www.realtor.com/realestateandhomes-search/Tampa_FL",
    "https://www.realtor.com/realestateandhomes-search/Orlando_FL",
    "https://www.realtor.com/realestateandhomes-search/Jacksonville_FL",
    "https://www.realtor.com/realestateandhomes-search/Miami_FL",
]
job_id = submit_realtor_bulk(
    [{"search_url": url, "max_pages": 5} for url in CITIES]
)
# Poll /jobs/{job_id} and then fetch /jobs/{job_id}/result, as submit_and_wait does
# after its own submit; the result contains a list of per-target results.

For a recurring pipeline — "give me the new agents who appeared this week in my 20 target cities" — combine the city sweep with a state file keyed on (name, brokerage) and treat anything not in the previous run as net-new. Same pattern as the Zillow new-listing diff workflow, just keyed on agents instead of zpids.
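
A minimal sketch of that state-file diff follows. The function names and the JSON file layout are illustrative choices, not something the API prescribes; the key mirrors the lowercased (name, brokerage) pair used for dedupe.

```python
import json
import os
import tempfile


def agent_key(row: dict) -> str:
    # Same identity as the dedupe step: lowercased name plus brokerage.
    return f"{row['agent_name'].strip().lower()}|{row['brokerage'].strip().lower()}"


def net_new_agents(rows: list[dict], state_path: str) -> list[dict]:
    """Return rows whose key is not in the state file, then update the file."""
    seen: set[str] = set()
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as f:
            seen = set(json.load(f))
    fresh = [r for r in rows if agent_key(r) not in seen]
    # Persist every key seen so far, including this run's.
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump(sorted(seen | {agent_key(r) for r in rows}), f)
    return fresh


# Two simulated weekly runs against a throwaway state file.
state_file = os.path.join(tempfile.mkdtemp(), "seen_agents.json")
week1 = [{"agent_name": "Sarah Mitchell", "brokerage": "Smith & Associates Realty"}]
week2 = week1 + [{"agent_name": "John Smith", "brokerage": "Keller Williams"}]
first = net_new_agents(week1, state_file)
second = net_new_agents(week2, state_file)
```

This version accumulates every key it has ever seen, so an agent who drops out for a week and returns is not reported again. To compare strictly against the previous run only, write just the current run's keys to the file.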

If you genuinely need verified cell numbers and email beyond what Realtor.com publishes, the right play is a two-stage pipeline: scrape Realtor.com for the names + brokerages + listing counts, then push that list to a B2B enrichment service that resolves contact info. Apollo and Hunter both have APIs that take (name, company) and return verified email; ZoomInfo does the same for phone. The scrape gives you the audience, enrichment fills in the channel.

Legality and Ethics

Reading public listing pages is generally treated as accessing public data under US case law (hiQ Labs v. LinkedIn, 9th Cir. 2022). The agent name, brokerage, and any publicly displayed phone are business contact information: the agent published them on a marketing site to attract prospects, and B2B outreach to a published business number is normal commerce.

What requires care is the outreach itself. Auto-dialers, ringless voicemail drops, and SMS-blast tools all run into TCPA and state-level consent rules even when the contact is B2B. The National Do-Not-Call Registry covers B2C; business numbers aren't on it, but several states have stricter rules. Manual outreach, opt-in CRM sequences, and asking before texting are the safe path. If you're not sure, route through a compliance review before you scale.

Common Mistakes

  • The URL must contain "realtor." because the endpoint validates the host. Zillow URLs go to the Zillow endpoints, not this one.
  • Dedupe by (name, brokerage), not by name alone. "John Smith" at Coldwell Banker and "John Smith" at Keller Williams are different agents.
  • One high-volume agent can dominate a search. A top producer with 40 active listings will show up 40 times before dedupe. Always count, then sort.
  • max_pages is a ceiling, not a guarantee. If the city only has 3 pages of listings and you ask for 10, you get 3. Don't treat missing pages as an error.
  • Cloudflare's 100s edge timeout. api.logposervices.com is fronted by Cloudflare. Any request held open past 100 seconds returns 524 to the client even though the job is still running server-side. Always poll the /jobs/{id} endpoint; never wait on a single blocking GET.
  • Phone fields can be empty. A real fraction of listings publish only the brokerage's main line, or omit the phone entirely. Build the pipeline so missing phones are a "needs enrichment" flag, not a hard error.
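
The last point can be handled with a single extra column. The sketch below assumes rows that carry an agent_phone field, as the dedupe output does; the needs_enrichment column name and the flag_missing_phones helper are hypothetical.

```python
def flag_missing_phones(rows: list[dict]) -> list[dict]:
    """Return copies of rows with a boolean needs_enrichment column added."""
    return [
        {**row, "needs_enrichment": not (row.get("agent_phone") or "").strip()}
        for row in rows
    ]


flagged = flag_missing_phones([
    {"agent_name": "A", "agent_phone": "(813) 555-0142"},
    {"agent_name": "B", "agent_phone": ""},
    {"agent_name": "C", "agent_phone": None},
])
```

Run it on the deduped rows just before writing the CSV; the flag is written as True or False, which makes the file easy to filter later.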

Get Started

  1. Sign up at logposervices.com and generate an API key from Tool → API Keys.
  2. export LOGPOSE_API_KEY=lp_xxxxxxx
  3. Run the snippet above against your target city's Realtor.com search URL.

Related reads: Scraping Zillow new listings by ZIP code is the buy-side companion to this post — same listing data, different intent. Monitoring Zillow listings for real estate deals covers the price-watch workflow. For broader B2B lead-gen patterns that pair well with agent prospecting, see building a B2B lead list from Yellow Pages and scraping Google Maps for local business leads. If you're evaluating tools, the Octoparse alternatives for lead generation post lays out the landscape.

Frequently asked questions

Is it legal to scrape real estate agent contact info from Realtor.com?
Reading public listing pages is generally treated as accessing public data under US case law (hiQ Labs v. LinkedIn, 9th Cir. 2022). The agent name, brokerage, and any phone number Realtor.com displays publicly are business contact information, not private data — the agent has chosen to publish them as a marketing channel. What changes the calculus is downstream use: cold-calling agents using auto-dialers, or text-blasting them, runs into TCPA and Do-Not-Call rules. Treat scraped phone numbers as B2B contacts and apply normal business-outreach hygiene.
Will I get cell phone numbers or just office lines?
What Realtor.com shows publicly is what you get — usually a single business number that may be a cell or an office line depending on the agent. You will not get the agent's personal email, MLS-internal contact, or any number they haven't chosen to publish on their listing card. If you need verified cell numbers and email at scale, you pair this scrape with a B2B enrichment service that joins on name plus brokerage. Tools in that adjacent space include Apollo, ZoomInfo, and Hunter — useful as a second-pass enrichment step, not as a replacement for the listing scrape itself.
How is this different from buying a list of agents from a data broker?
Bought lists are static, often stale, and frequently include agents who haven't closed a deal in two years. Scraping active Realtor.com listings gives you agents who have at least one current listing — which is a proxy for being active. Combining listing count, brokerage, and city gives you a much sharper segmentation than a flat list ('agents in Florida') ever does. The tradeoff is you're building the dataset yourself instead of paying for it pre-built.
How many agents can I expect from one city scrape?
A search of a mid-sized US city (Tampa, Charlotte, Nashville) with 5-10 pages of results typically yields 200-400 unique active agents after dedupe. A major metro (Phoenix, Houston, Atlanta) at the same depth produces 800-1500. The dedupe matters: a single high-volume agent can show up 20+ times across one search because their listings dominate the page. Always dedupe by (name, brokerage) before counting.
Can I scrape Realtor.com without getting blocked?
Realtor.com uses Cloudflare and behavioural fingerprinting similar to Zillow. From a clean residential IP paced 3-5 seconds between requests, you can typically pull dozens of pages before friction. A datacenter IP usually hits a Cloudflare challenge inside the first few requests. For a one-time city-level scrape, even modest pacing is fine. For repeated city-by-city runs across the US, you want rotating residential proxies, which is most of what a managed scraping API handles on your behalf.
