• 15
name

A PHP Error was encountered

Severity: Notice

Message: Undefined index: userid

Filename: views/question.php

Line Number: 191

Backtrace:

File: /home/prodcxja/public_html/questions/application/views/question.php
Line: 191
Function: _error_handler

File: /home/prodcxja/public_html/questions/application/controllers/Questions.php
Line: 433
Function: view

File: /home/prodcxja/public_html/questions/index.php
Line: 315
Function: require_once

I am currently using selenium webdriver to parse through facebook user friends page and extract all ids from the AJAX script. But I need to scroll down to get all the friends. How can I scroll down in Selenium. I am using python.

You can use

driver.execute_script("window.scrollTo(0, Y)") 

where Y is the height (on a fullhd monitor it's 1080). (Thanks to @lukeis)

You can also use

driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

to scroll to the bottom of the page.

If you want to scroll to a page with infinite loading, like social network ones, facebook etc. (thanks to @Cuong Tran)

SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

another method (thanks to Juanse) is, select an object and

label.sendKeys(Keys.PAGE_DOWN);
  • 260
Reply Report
    • Excellent, can you explain a little bit on scrollHeight, what does it mean and how does it work in general?
      • 1
    • How would you then use the variable "last_height"? I have something similar in my code and the browser is scrolling down. However, when I look at the data I'm scraping it only scrapes the data from the first page k times with "k" being the number of times the browser scrolls down.

If you want to scroll down to bottom of infinite page (like linkedin.com), you can use this code:

SCROLL_PAUSE_TIME = 0.5

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

Reference: https://stackoverflow.com/a/28928684/1316860

  • 72
Reply Report
      • 2
    • This is great. For anyone who is trying to use this on instagram, you may need to first tab to the "Load more" button using ActionChains, then apply Cuong Tran's solution... at least that's what worked for me.
      • 1
    • Thanks for the answer! What I would like to do is scroll for instance in instagram to the bottom of the page, then grab the entire html of the page. Is there a function in selenium where I could give last_height as input and get the entire page html, after I have scrolled to the bottom?
element=find_element_by_xpath("xpath of the li you are trying to access")

element.location_once_scrolled_into_view

this helped when I was trying to access a 'li' that was not visible.

  • 15
Reply Report

For my purpose, I wanted to scroll down more, keeping the windows position in mind. My solution was similar and used window.scrollY

driver.execute_script("window.scrollTo(0, window.scrollY + 200)")

which will go to the current y scroll position + 200

  • 10
Reply Report

None of these answers worked for me, at least not for scrolling down a facebook search result page, but I found after a lot of testing this solution:

while driver.find_element_by_tag_name('div'):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    Divs=driver.find_element_by_tag_name('div').text
    if 'End of Results' in Divs:
        print 'end'
        break
    else:
        continue
  • 6
Reply Report

When working with youtube the floating elements give the value "0" as the scroll height so rather than using "return document.body.scrollHeight" try using this one "return document.documentElement.scrollHeight" adjust the scroll pause time as per your internet speed else it will run for only one time and then breaks after that.

SCROLL_PAUSE_TIME = 1

# Get scroll height
"""last_height = driver.execute_script("return document.body.scrollHeight")

this dowsnt work due to floating web elements on youtube
"""

last_height = driver.execute_script("return document.documentElement.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0,document.documentElement.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
       print("break")
       break
    last_height = new_height
  • 6
Reply Report

I was looking for a way of scrolling through a dynamic webpage, and automatically stopping once the end of the page is reached, and found this thread.

The post by @Cuong Tran, with one main modification, was the answer that I was looking for. I thought that others might find the modification helpful (it has a pronounced effect on how the code works), hence this post.

The modification is to move the statement that captures the last page height inside the loop (so that each check is comparing to the previous page height).

So, the code below:

Continuously scrolls down a dynamic webpage (.scrollTo()), only stopping when, for one iteration, the page height stays the same.

(There is another modification, where the break statement is inside another condition (in case the page 'sticks') which can be removed).

    SCROLL_PAUSE_TIME = 0.5


    while True:

        # Get scroll height
        ### This is the difference. Moving this *inside* the loop
        ### means that it checks if scrollTo is still scrolling 
        last_height = driver.execute_script("return document.body.scrollHeight")

        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:

            # try again (can be removed)
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

            # Wait to load page
            time.sleep(SCROLL_PAUSE_TIME)

            # Calculate new scroll height and compare with last scroll height
            new_height = driver.execute_script("return document.body.scrollHeight")

            # check if the page height has remained the same
            if new_height == last_height:
                # if so, you are done
                break
            # if not, move on to the next loop
            else:
                last_height = new_height
                continue
  • 5
Reply Report

This code scrolls to the bottom but doesn't require that you wait each time. It'll continually scroll, and then stop at the bottom (or timeout)

from selenium import webdriver
import time

driver = webdriver.Chrome(executable_path='chromedriver.exe')
driver.get('https://example.com')

pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
run_time, max_run_time = 0, 1
while True:
    iteration_start = time.time()
    # Scroll webpage, the 100 allows for a more 'aggressive' scroll
    driver.execute_script('window.scrollTo(0, 100*document.body.scrollHeight);')

    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')

    scrolled = post_scroll_height != pre_scroll_height
    timed_out = run_time >= max_run_time

    if scrolled:
        run_time = 0
        pre_scroll_height = post_scroll_height
    elif not scrolled and not timed_out:
        run_time += time.time() - iteration_start
    elif not scrolled and timed_out:
        break

# closing the driver is optional 
driver.close()

This is much faster than waiting 0.5-3 seconds each time for a response, when that response could take 0.1 seconds

  • 5
Reply Report

scroll loading pages. Example: medium, quora,etc

last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight-1000);")
        # Wait to load the page.
        driver.implicitly_wait(30) # seconds
        new_height = driver.execute_script("return document.body.scrollHeight")

        if new_height == last_height:
            break
        last_height = new_height
        # sleep for 30s
        driver.implicitly_wait(30) # seconds
        driver.quit()
  • 3
Reply Report
      • 2
    • should driver.quit() be outside the while block or not? and also the last implicit wait is not required.. someone pls confirm. @ashishmishra

if you want to scroll within a particular view/frame (WebElement), what you only need to do is to replace "body" with a particular element that you intend to scroll within. i get that element via "getElementById" in the example below:

self.driver.execute_script('window.scrollTo(0, document.getElementById("page-manager").scrollHeight);')

this is the case on YouTube, for example...

  • 1
Reply Report

Trending Tags