Introducing Workflow Builder 2.0
Workflow Builder 2.0 is focused on making large scraping workflows easier to build, debug, and maintain. This release brings significant improvements to how workflows handle branching logic, pagination, and execution performance, all based on feedback from teams running production extraction workflows.
Here is what changed, why it matters, and how to take advantage of the new capabilities.
What changed in 2.0
Conditional branching
The original workflow builder supported linear sequences of steps: navigate, click, extract, repeat. This worked well for simple extractions but became awkward when workflows needed to handle different page layouts, error states, or conditional logic.
Workflow Builder 2.0 introduces conditional branching nodes. You can now define “if/else” paths based on:
- Element presence — Does a specific element exist on the page? If a product is out of stock, take one path. If it is available, take another.
- Text content — Does a text element contain specific keywords? Route the workflow based on what the page actually says.
- URL patterns — Has the page redirected to a login wall or error page? Detect it and handle it gracefully instead of extracting garbage data.
- Extracted values — Is the price above or below a threshold? Branch based on the data you have already collected in the workflow.
Conditional branching means you can build a single workflow that handles multiple page variants instead of maintaining separate workflows for each edge case.
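To make the branching semantics concrete, here is a minimal TypeScript sketch of how a condition node might evaluate each of these four checks. The Condition type, PageSnapshot interface, and evaluate function are illustrative assumptions, not ScrapingLab internals:

```typescript
// Minimal sketch of how a condition node could evaluate. The Condition
// type, PageSnapshot interface, and evaluate() are illustrative
// assumptions, not ScrapingLab's actual internals.
type Condition =
  | { kind: "elementExists"; selector: string }
  | { kind: "textContains"; selector: string; keyword: string }
  | { kind: "urlMatches"; pattern: RegExp }
  | { kind: "valueAbove"; value: number; threshold: number };

interface PageSnapshot {
  url: string;
  // Returns the element's text, or null if the selector matches nothing.
  textOf(selector: string): string | null;
}

function evaluate(cond: Condition, page: PageSnapshot): boolean {
  switch (cond.kind) {
    case "elementExists":
      return page.textOf(cond.selector) !== null;
    case "textContains":
      return (page.textOf(cond.selector) ?? "").includes(cond.keyword);
    case "urlMatches":
      return cond.pattern.test(page.url);
    case "valueAbove":
      return cond.value > cond.threshold;
  }
}
```

Whichever way the condition resolves, the steps on each side are ordinary workflow steps, so the branches can navigate, click, or extract just like the main path.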
Improved loop handling
Pagination is the most common source of workflow complexity. Sites paginate differently — some use “Next” buttons, some use infinite scroll, some use URL parameters, and some use a combination. The original builder handled basic pagination but required workarounds for more complex patterns.
2.0 introduces dedicated loop nodes with built-in support for:
- Click-based pagination — Click a “Next” or “Load More” button and wait for new content
- Scroll-based pagination — Scroll to the bottom of the page to trigger lazy loading
- URL-based pagination — Increment URL parameters (e.g., ?page=1, ?page=2) automatically
- Stop conditions — End the loop when a specific element disappears, when a maximum page count is reached, or when no new data is found
- De-duplication — Automatically skip items that have already been extracted in previous loop iterations
These improvements make it possible to scrape entire catalogs, search results, and directories without manually managing pagination state.
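Conceptually, a click-based loop with stop conditions and de-duplication behaves like the following TypeScript sketch. The extractItems, hasNextButton, and clickNext callbacks are hypothetical stand-ins for what the execution engine does internally:

```typescript
// Rough sketch of a click-based pagination loop with stop conditions and
// de-duplication. extractItems(), hasNextButton(), and clickNext() are
// hypothetical stand-ins for the engine's internals.
async function runLoop(
  extractItems: () => Promise<{ id: string; data: unknown }[]>,
  hasNextButton: () => Promise<boolean>,
  clickNext: () => Promise<void>,
  maxPages = 100,
): Promise<unknown[]> {
  const seen = new Set<string>();
  const results: unknown[] = [];
  for (let page = 0; page < maxPages; page++) {   // stop: max page count
    const items = await extractItems();
    const fresh = items.filter((item) => !seen.has(item.id));
    if (fresh.length === 0) break;                // stop: no new data
    for (const item of fresh) {
      seen.add(item.id);                          // de-duplication
      results.push(item.data);
    }
    if (!(await hasNextButton())) break;          // stop: element disappeared
    await clickNext();
  }
  return results;
}
```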
Faster execution
We rebuilt the execution engine to reduce startup time and memory usage for large workflows. The improvements are most noticeable for workflows with many steps or those that process hundreds of pages in a single run.
Startup time is now 40% faster on average. Workflows that previously took 8-10 seconds to initialize now start in under 5 seconds. For scheduled workflows that run frequently, this adds up.
Memory usage is reduced by approximately 30% for workflows with more than 50 steps. This means fewer out-of-memory failures on complex extraction tasks and better reliability for long-running workflows.
Parallel step execution is now available for independent steps within a workflow. If your workflow extracts data from multiple elements on the same page, those extractions now run concurrently instead of sequentially, reducing total execution time.
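In JavaScript terms, the difference is awaiting independent extractions together rather than one at a time, roughly like this sketch, where extractField is a hypothetical per-element extraction call:

```typescript
// Sketch of running independent extractions on one page concurrently.
// extractField() is a hypothetical per-element extraction call.
async function extractConcurrently(
  extractField: (selector: string) => Promise<string | null>,
  selectors: string[],
): Promise<(string | null)[]> {
  // All extractions are kicked off at once and awaited together,
  // instead of awaiting each one sequentially.
  return Promise.all(selectors.map((selector) => extractField(selector)));
}
```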
Improved error handling
2.0 adds better visibility into what happens when a workflow step fails:
- Step-level error reporting — See exactly which step failed and why, with a screenshot of the page state at the time of failure
- Configurable retry behavior — Set retry counts and delays per step, not just per workflow
- Fallback selectors — Define backup selectors that activate when the primary selector fails to match, reducing breakage when sites make minor HTML changes
- Graceful degradation — Configure steps to skip on failure instead of stopping the entire workflow, so you still get partial data when one element is missing
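Taken together, these behaviors compose roughly like the following TypeScript sketch. The tryExtract call and the parameter names are illustrative assumptions, not the actual step configuration API:

```typescript
// Sketch combining per-step retries, ordered fallback selectors, and
// skip-on-failure. tryExtract() is a hypothetical call that resolves to
// null when a selector matches nothing; all names are illustrative.
async function runStep(
  tryExtract: (selector: string) => Promise<string | null>,
  selectors: string[],          // primary selector first, then fallbacks
  retries = 2,
  delayMs = 1000,
  skipOnFailure = false,
): Promise<string | null> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    for (const selector of selectors) {
      const value = await tryExtract(selector);
      if (value !== null) return value; // first selector that matches wins
    }
    if (attempt < retries) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  if (skipOnFailure) return null;       // graceful degradation: keep going
  throw new Error(
    `All ${selectors.length} selectors failed after ${retries + 1} attempts`,
  );
}
```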
What this unlocks
Production-grade extraction logic
Teams can now model complex, real-world extraction scenarios in a single workflow. Before 2.0, teams often needed multiple workflows and external orchestration to handle sites with varied page layouts, authentication flows, or conditional navigation. Now, a single workflow can handle all of these cases with branching and improved loop controls.
Less maintenance debt
The number one complaint from teams running scrapers in production is that they break when target sites change. 2.0 reduces this friction in two ways: fallback selectors provide automatic resilience against minor HTML changes, and conditional branching lets you build defensive logic that handles unexpected page states instead of failing silently.
Scalable data collection
The performance improvements in 2.0 make it practical to run workflows that collect data across thousands of pages in a single execution. Teams that previously split large extraction jobs into multiple smaller workflows can now consolidate them, reducing complexity and improving data consistency.
How to use the new features
Building conditional branches
- Open the workflow builder and add a new step
- Select “Condition” from the step type menu
- Define the condition (element exists, text contains, URL matches, or extracted value comparison)
- Add steps to the “true” branch and the “false” branch
- Both branches rejoin the main workflow after the condition block
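If it helps to picture the result, a finished condition block can be read as a declarative structure along these lines. The field names here are hypothetical, not the builder's real export format:

```typescript
// Hypothetical illustration of a condition block as a declarative
// structure; field names are assumptions, not the builder's real
// export format.
const stockCheck = {
  type: "condition",
  condition: { kind: "elementExists", selector: ".out-of-stock-badge" },
  whenTrue: [
    { type: "extract", selector: ".restock-date", field: "restockDate" },
  ],
  whenFalse: [
    { type: "extract", selector: ".price", field: "price" },
  ],
  // Execution continues with the main workflow after either branch.
};
```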
Configuring improved loops
- Add a “Loop” step to your workflow
- Choose the pagination type: click, scroll, or URL parameter
- Set your stop condition: maximum iterations, element disappears, or no new data
- Add extraction steps inside the loop body
- Enable de-duplication if the target site may repeat items across pages
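A configured loop step can likewise be read as a declarative structure along these lines; again, the field names are hypothetical:

```typescript
// Hypothetical illustration of a loop step as a declarative structure;
// field names are assumptions, not the builder's real schema.
const catalogLoop = {
  type: "loop",
  pagination: { mode: "click", selector: "button.load-more" },
  stop: {
    maxIterations: 50,          // hard cap on pages
    whenSelectorGone: "button.load-more",
    whenNoNewData: true,
  },
  deduplicate: true,            // skip items seen in earlier iterations
  body: [
    { type: "extract", selector: ".product-card", field: "products" },
  ],
};
```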
Setting up fallback selectors
- Click on any extraction step in your workflow
- Open the “Advanced” panel
- Add one or more fallback selectors
- ScrapingLab tries the primary selector first, then falls back to alternatives in order
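A step with fallbacks configured can be pictured like this hypothetical structure, with the selectors listed in the order they are tried:

```typescript
// Hypothetical illustration of one extraction step with ordered
// fallbacks; field names are assumptions, not the real configuration.
const priceStep = {
  type: "extract",
  field: "price",
  selector: ".price--current",         // primary, tried first
  fallbackSelectors: [
    "[data-testid='product-price']",   // tried if the primary fails
    ".product-info .price",            // tried last
  ],
};
```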
Migration from 1.0
Existing workflows continue to run without any changes. You do not need to rebuild anything. However, new workflows created after this update will use the 2.0 engine by default, which includes all the performance improvements.
If you want to take advantage of conditional branching or improved loops in existing workflows, you can add those nodes to your current workflows at any time. The 2.0 features are additive — they do not require you to restructure workflows that are already working.
What is next
We are working on several features that build on the 2.0 foundation:
- Workflow templates — Pre-built extraction patterns for common use cases like ecommerce monitoring, job board scraping, and directory collection
- Version history — Roll back workflow changes to a previous state when an update introduces problems
- Team-level analytics — Dashboard showing workflow health, success rates, and data volume across your entire team
These features are in active development and will ship over the coming months. If you have feedback or feature requests, reach out to our team at any time.
Frequently asked questions
Do I need to rebuild my existing workflows?
No. All existing workflows continue to run on the updated engine without any changes. The performance improvements (faster startup, lower memory) apply automatically to all workflows. New features like conditional branching and improved loops are opt-in — you add them to existing workflows when you are ready.
Can I mix 1.0 and 2.0 features in the same workflow?
Yes. Conditional branches and improved loop nodes work alongside existing step types. You can add a conditional branch to an existing workflow without restructuring anything else.
How do fallback selectors work in practice?
When you configure multiple selectors for an extraction step, ScrapingLab tries them in order during execution. If the primary selector returns no results, it tries the first fallback, then the second, and so on. This happens automatically on every run — you do not need to monitor selector health manually.
Is there a limit to how many conditions I can nest?
There is no hard limit on nesting depth, but we recommend keeping conditional logic to 2-3 levels deep for maintainability. If your workflow requires deeply nested conditions, consider splitting it into multiple workflows that call each other.
Will there be a migration tool for converting 1.0 loop patterns to 2.0?
We are evaluating this for a future release. For now, we recommend rebuilding pagination logic using the new loop nodes when you next need to modify an existing workflow. The new loop interface is significantly simpler, so rebuilds typically take less time than maintaining workarounds in the old pattern. Most teams report that converting an existing pagination setup to the new loop format takes under 10 minutes.
Getting started
Workflow Builder 2.0 is available now for all ScrapingLab plans. Log in to your account to start building workflows with conditional branching, improved pagination, and faster execution. If you are new to ScrapingLab, you can explore the builder immediately after creating an account.
Related on ScrapingLab:
- How to Scrape Paginated Results — Handle multi-page data
- How to Extract Data from JavaScript Sites — Dynamic page support
- Getting Started With Web Scraping — Beginner tutorial