Search Struggles Solution¶
from selenium import webdriver
from selenium.webdriver.common.by import By
# Instantiate a WebDriver
driver = webdriver.Chrome()
driver.get("https://seleniumplayground.practiceprobs.com/")
# Make fullscreen
driver.fullscreen_window()
# Fetch the search field
search_field = driver.find_element(By.NAME, "query")
# Enter "Breed"
search_field.send_keys("Breed")
# Fetch the results
results = driver.find_elements(By.CLASS_NAME, "md-search-result__item")
html_results = [res.get_attribute("innerHTML") for res in results]
# Close the driver
driver.close()
# Build list of relevant URLs
links = []
for res in html_results:
if "Akita Inu" in res:
split_res = res.split("href=")
link_part = split_res[1]
links.append(link_part.split('"')[1])
print(links)
# ['https://seleniumplayground.practiceprobs.com/dogs/breeds/akita/']
Explanation¶
First, we need to address a tricky issue. If you looked closely at the"Hello World" example, you
might have noticed that the Selenium browser opens as a narrow window by default. The issue here is that the search
field on the playground page is minimized under that circumstance. Therefore, we first have to
make the browser fullscreen via the WebDriver.fullscreen_window()
method.
driver.fullscreen_window()
Then, we find the search field element by its name property (searching those where name = "query") via the
find_element()
method.
# Fetch the search field
search_field = driver.find_element(By.NAME, "query")
How would I know this?
Use Chrome's Inspect tool to find the element in the source code.
Then look for identifying ids, classes, tags, etc. that can be used to target the element.
Next we use the send_keys() method to enter a string into the search field.
search_field.send_keys("Solution")
You might be wondering how else we can interact with this element. Beyond using search_field.click()
to simulate a mouse click, we can als use webdriver.common.keys.Keys
to send keystrokes such as Keys.RETURN
.
Next, we find all the results from our search in a similar fashion.
results = driver.find_elements(By.CLASS_NAME, "md-search-result__item") # (1)!
len(results) # 6
results
is a list of WebElements.
Note that we use find_elements()
and not find_element()
to retrieve multiple items.
HTML of first result
>>> print(results[0].get_attribute('outerHTML'))
<li class="md-search-result__item">
<a href="https://seleniumplayground.netlify.app/dogs/breeds/" class="md-search-result__link" tabindex="-1">
<article class="md-search-result__article md-typeset" data-md-score="1520.29">
<div class="md-search-result__icon md-icon"></div>
<h1>Dog <mark>Breeds</mark></h1>
About Dog <mark>BreedsTable</mark> of Different Dog <mark>Breeds</mark>
<p>Text and images from Wikipedia, the free encyclopedia.</p>
<p>A dog <mark>breed</mark> is a particular strain of dog that was purposefully bred by humans to
perform specific tasks, such as herding, hunting, and guarding. Dogs are the most variable
mammal on earth, with artificial selection producing around 450 globally recognized
<mark>breeds.</mark> These <mark>breeds</mark> possess distinct traits related to morphology,
which include body size, skull shape, tail phenotype, fur type, body shape, and coat colour.
Their behavioural traits include guarding, herding, and hunting, and personality traits such as
hypersocial behavior, boldness, and aggression. Most <mark>breeds</mark> were derived from small
numbers of founders within the last 200 years. As a result, today dogs are the most abundant
carnivore species and are dispersed around the world.
</p>
</article>
</a>
</li>
>>> print(results[0].get_attribute('innerHTML'))
<a href="https://seleniumplayground.netlify.app/dogs/breeds/" class="md-search-result__link" tabindex="-1">
<article class="md-search-result__article md-typeset" data-md-score="1520.29">
<div class="md-search-result__icon md-icon"></div>
<h1>Dog <mark>Breeds</mark></h1>
About Dog <mark>BreedsTable</mark> of Different Dog <mark>Breeds</mark>
<p>Text and images from Wikipedia, the free encyclopedia.</p>
<p>A dog <mark>breed</mark> is a particular strain of dog that was purposefully bred by humans to
perform specific tasks, such as herding, hunting, and guarding. Dogs are the most variable
mammal on earth, with artificial selection producing around 450 globally recognized
<mark>breeds.</mark> These <mark>breeds</mark> possess distinct traits related to morphology,
which include body size, skull shape, tail phenotype, fur type, body shape, and coat colour.
Their behavioural traits include guarding, herding, and hunting, and personality traits such as
hypersocial behavior, boldness, and aggression. Most <mark>breeds</mark> were derived from small
numbers of founders within the last 200 years. As a result, today dogs are the most abundant
carnivore species and are dispersed around the world.
</p>
</article>
</a>
Next, we collect the inner HTML code from each WebElement in results
.
html_results = [res.get_attribute("innerHTML") for res in results]
len(html_results) # 6 (1)
- Same as
len(results)
.
Finally, we extract the actual links using Python string manipulation and list comprehension.
links = []
for res in html_results:
if "Akita Inu" in res:
split_res = res.split("href=")
link_part = split_res[1]
links.append(link_part.split('"')[1])
print(links)
# ['https://seleniumplayground.practiceprobs.com/dogs/breeds/akita/']
Here, we used the fact that the result elements we are looking for has "Akita Inu" as a string in its HTML code. We also
know that all links are generally passed by the href=
property and surrounded by quotation marks ""
. Hence, we can
split the string to extract the first link. Now we can finally find the answers about our favourite dog breed without so
much clicking.