🔗 Two-Step (Nested) Scraping with Lutra

Capture a list of items from a primary page, then automatically follow each item’s link to grab detailed info from secondary pages. This “two-step” approach helps you build a richer, more complete dataset.

Written by Lutra
Updated over 2 months ago

Sometimes, scraping data from a primary webpage is only half the job. Each item in a table or list may link to a detail page, such as a product info page, a location page, or a user profile.

With Lutra, you can perform a “two-step” (nested) scrape:

  1. Scrape a list of items (and their links) from a primary source.

  2. Follow each item’s link to gather deeper, more detailed information.

This two-step approach captures more comprehensive data than the listing page alone, and because Lutra can “follow” links for you, there is no need to open each detail page by hand.


Extract the Main Listing

Before you can do multi-layer extraction, you first need to extract the initial list of items and their links. If you’re new to scraping or your listing spans multiple pages, check out our article on Scraping Multi-Page Websites for best practices.

  1. Identify the listing page and the data fields you need (e.g., name, price).

  2. Instruct Lutra to extract each item’s main info plus the link to its detail page.

  3. Save the data in a spreadsheet (Google Sheets, Excel, CSV, etc.) and ensure you have a column with links.
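For readers who want to see what this first step amounts to, here is a minimal sketch in plain Python. It is not how Lutra works internally. The sample HTML, the `item` class name, the `example.com` URLs, and the helper names `ListingParser`, `extract_listing`, and `rows_to_csv` are all assumptions made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class ListingParser(HTMLParser):
    """Collects the text and absolute URL of every <a class="item"> link."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rows = []        # finished rows: {"Name": ..., "Link": ...}
        self._current = None  # row being built while inside a matching <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "item" in (attrs.get("class") or "").split():
            href = attrs.get("href") or ""
            # Resolve relative links so the Link column is usable on its own.
            self._current = {"Name": "", "Link": urljoin(self.base_url, href)}

    def handle_data(self, data):
        if self._current is not None:
            self._current["Name"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["Name"] = self._current["Name"].strip()
            self.rows.append(self._current)
            self._current = None


def extract_listing(html, base_url):
    """Return one row per item link found in the listing HTML."""
    parser = ListingParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a dedicated Link column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Name", "Link"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical listing page used only for this example.
SAMPLE_LISTING = """
<ul>
  <li><a class="item" href="/products/1">Blue Mug</a> $12</li>
  <li><a class="item" href="/products/2">Red Mug</a> $14</li>
</ul>
"""

rows = extract_listing(SAMPLE_LISTING, "https://example.com/catalog")
print(rows_to_csv(rows))
```

The key point is the same as in step 3: every row carries its own link, so the second pass has something to follow.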


Follow Each Link for More Details

Once you have a sheet of items and their detail-page links, you can have Lutra open each link in turn to gather additional data.

  1. Use Your Existing Spreadsheet

    • Instruct Lutra to read the column that contains the detail-page links: “Use the Link column, and read the webpage there to get the item’s information.”

  2. Define the Fields to Extract

    • For example: “Scrape each item’s extended description, user ratings, and shipping details.”

  3. Output to a New Sheet (or the Same One)

    • You can create a fresh spreadsheet or add new columns to your existing one.

    • If you use separate sheets, add a unique identifier (e.g., a product ID) to both so the two datasets can be linked together.
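Conceptually, the second pass is a loop over the Link column. The sketch below illustrates that idea in plain Python and makes no claim about Lutra's implementation. The `fetch` callable, the `description` and `rating` class names, the column names, and the fake pages are assumptions for illustration. In real use, `fetch` would download the page, but here it reads from an in-memory dictionary so the example runs offline.

```python
from html.parser import HTMLParser


class DetailParser(HTMLParser):
    """Captures the text of elements whose class matches a wanted field.

    Simplification: assumes a captured element does not contain a nested
    tag with the same name as itself.
    """

    def __init__(self, fields):
        super().__init__()
        self.fields = set(fields)
        self.values = {}
        self._field = None  # field currently being read, if any
        self._tag = None    # tag name that opened the current field

    def handle_starttag(self, tag, attrs):
        if self._field is not None:
            return
        for cls in (dict(attrs).get("class") or "").split():
            if cls in self.fields:
                self._field, self._tag = cls, tag
                self.values[cls] = ""
                break

    def handle_data(self, data):
        if self._field is not None:
            self.values[self._field] += data

    def handle_endtag(self, tag):
        if self._field is not None and tag == self._tag:
            self.values[self._field] = self.values[self._field].strip()
            self._field = self._tag = None


def follow_links(rows, fetch, fields, id_column="Product ID", link_column="Link"):
    """Visit each row's link and return one detail row per item, keyed by ID."""
    details = []
    for row in rows:
        parser = DetailParser(fields)
        parser.feed(fetch(row[link_column]))
        parser.close()
        # Carry the ID over so the detail row can be matched to its listing row.
        details.append({id_column: row[id_column], **parser.values})
    return details


# Hypothetical detail pages standing in for real HTTP responses.
FAKE_PAGES = {
    "https://example.com/products/1":
        '<p class="description">Holds 350 ml.</p><span class="rating">4.5</span>',
    "https://example.com/products/2":
        '<p class="description">Holds 500 ml.</p><span class="rating">4.1</span>',
}

listing = [
    {"Product ID": "P1", "Name": "Blue Mug", "Link": "https://example.com/products/1"},
    {"Product ID": "P2", "Name": "Red Mug", "Link": "https://example.com/products/2"},
]

details = follow_links(listing, FAKE_PAGES.__getitem__, ["description", "rating"])
print(details)
```

Each detail row keeps the identifier from the listing row, which is what makes a separate output sheet workable.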


Combine or Cross-Reference Data

  • Same Sheet or Separate Sheets: It’s often easiest to keep your main list in one sheet (listing info + link) and detailed info in another.

  • Linking Rows: If you choose separate sheets, ensure both contain a shared ID so you can cross-reference easily.

Tip: Cross-reference data by asking Lutra to use the same ID across different sheets:

“Create a column titled ‘Product ID’ in both the ‘Product Listings’ and ‘Product Details’ sheets to link details to their respective items.”
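To show why a shared ID matters, here is a small sketch of what cross-referencing two sheets by “Product ID” looks like, written in plain Python with each sheet represented as a list of dictionaries. The sample rows and the `join_on_id` helper are hypothetical, and the sketch assumes each ID appears at most once in the details sheet.

```python
def join_on_id(listings, details, key="Product ID"):
    """Merge listing rows with their detail rows using a shared ID column.

    Returns (merged_rows, unmatched_ids). Listing rows with no matching
    detail row are reported instead of being silently dropped.
    """
    details_by_id = {row[key]: row for row in details}
    merged, unmatched = [], []
    for row in listings:
        detail = details_by_id.get(row[key])
        if detail is None:
            unmatched.append(row[key])
            continue
        merged.append({**row, **detail})
    return merged, unmatched


# Hypothetical contents of the two sheets.
product_listings = [
    {"Product ID": "P1", "Name": "Blue Mug", "Price": "$12"},
    {"Product ID": "P2", "Name": "Red Mug", "Price": "$14"},
    {"Product ID": "P3", "Name": "Green Mug", "Price": "$13"},
]
product_details = [
    {"Product ID": "P2", "Rating": "4.1"},
    {"Product ID": "P1", "Rating": "4.5"},
]

merged, unmatched = join_on_id(product_listings, product_details)
print(merged)
print(unmatched)
```

Matching on the ID means row order does not have to agree between the two sheets, and items whose detail page could not be scraped are easy to spot.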


Save as a Playbook

  1. Create Playbook: After you confirm your two-step scraping approach works, click “Create Playbook”.

  2. Name & Document: Give it a clear title (e.g., “Two-Step Product Scrape”) and a short description so you or your teammates can quickly understand it.

  3. Reuse or Automate: Run this Playbook anytime you need a fresh round of data. You can even schedule it if your source content updates regularly.


Tips & Best Practices

  • Test on a Few Items First: Make sure the detail links work and the data looks correct before scaling.

  • Be Explicit: If detail pages have tabs or nested sections, specify which parts you want extracted.
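The “test on a few items first” tip can be made concrete with a quick check over a small sample of rows. The sketch below, in plain Python, flags required fields that came back empty. The field names, the sample rows, and the `find_missing_fields` helper are illustrative assumptions and not part of Lutra.

```python
def find_missing_fields(rows, required_fields, sample_size=3):
    """Return (row_index, field) pairs for empty required fields.

    Only the first `sample_size` rows are inspected, mirroring a small
    trial run before the full scrape.
    """
    problems = []
    for index, row in enumerate(rows[:sample_size]):
        for field in required_fields:
            if not str(row.get(field) or "").strip():
                problems.append((index, field))
    return problems


# Hypothetical output of a trial run.
trial_rows = [
    {"Name": "Blue Mug", "Link": "https://example.com/products/1", "Rating": "4.5"},
    {"Name": "Red Mug", "Link": "", "Rating": "4.1"},
    {"Name": "Green Mug", "Link": "https://example.com/products/3", "Rating": None},
    {"Name": "", "Link": "https://example.com/products/4", "Rating": "3.9"},
]

problems = find_missing_fields(trial_rows, ["Name", "Link", "Rating"])
print(problems)
```

An empty result on the sample is a reasonable signal to scale up. Any reported pair points at a row and column worth inspecting first.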
