TagUI tutorial 1 - goodreads.com - Part 2

Problem

In TagUI Tutorial 1 - goodreads.com - Part 1, we grabbed the title of the list "New York Times 2017 Ten Best".

In this tutorial, we will grab the title of the first book in the list, as indicated by the red arrow below:

tutorial-01

Solution

//goodreads_v02
https://www.goodreads.com/list/show/118408.New_York_Times_2017_Ten_Best
read //h1[@class="gr-h1 gr-h1--serif"] to list_title
echo "list_title = " + list_title
 
read (//td/a/span[@itemprop="name"])[1] to book_title1
echo "book_title1 = " + book_title1

Explanation

Lines 1-4: Same as TagUI tutorial 1 - goodreads.com - Part 1
Line 6: Read the title of the first book into the variable book_title1
  1. In the Chrome browser, right-click the title of the first book and choose "Inspect". Chrome Developer Tools opens and shows that the HTML of the title is:

     <td width="100%" valign="top">
       <a class="bookTitle" itemprop="url" href="/book/show/28446947-autumn">
         <span itemprop="name" class="xh-highlight">Autumn</span>

  2. Hence we use the following XPath to get the titles of the books:
     //td/a/span[@itemprop="name"]

     Note: if you are new to XPath, please refer to the XPath Tutorial.

  3. However, the XPath above returns the titles of all 10 books on the page. To verify this, continue from step 1. In Chrome Developer Tools, press Ctrl-F (Windows) or Cmd-F (Mac) to start a DOM search, then type in the XPath above as shown below:

     tutorial01-2.03.png

  4. You will see the search result "1 of 10" as shown below:

     tutorial01-2.03.png

  5. This means that the XPath matches 10 elements on this web page. If you click the down arrow beside the search result, each matching element is highlighted in turn, and you will see that the matches correspond to the titles of the 10 books displayed on this page.
  6. To get only the title of the first book, wrap the XPath in parentheses and select the first match with [1]:
     (//td/a/span[@itemprop="name"])[1]
  7. Note that XPath positions start at 1, not 0, so [1] selects the first matching element.
  8. The value read is stored in the variable book_title1.
Line 7: Output the title of the first book to the command window using the echo command
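
If you would like to try the same idea outside the browser, here is a minimal sketch in Python using only the standard library. It is an illustration, not part of the TagUI flow. It makes two assumptions. First, the markup is a simplified, hypothetical stand-in for the Goodreads page: only "Autumn" comes from the HTML above, and the other two titles are placeholders. Second, Python's ElementTree supports only a subset of XPath and does not accept the parenthesized form (...)[1], so taking element 0 of the result list stands in for it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the Goodreads list markup.
# Only "Autumn" is taken from the tutorial; the other titles are placeholders.
SAMPLE = """
<table>
  <tr><td width="100%" valign="top">
    <a class="bookTitle" itemprop="url" href="/book/show/28446947-autumn">
      <span itemprop="name">Autumn</span>
    </a>
  </td></tr>
  <tr><td width="100%" valign="top">
    <a class="bookTitle" itemprop="url" href="#">
      <span itemprop="name">Placeholder Book 2</span>
    </a>
  </td></tr>
  <tr><td width="100%" valign="top">
    <a class="bookTitle" itemprop="url" href="#">
      <span itemprop="name">Placeholder Book 3</span>
    </a>
  </td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)

# Counterpart of //td/a/span[@itemprop="name"]: matches every book title.
matches = root.findall(".//td/a/span[@itemprop='name']")
all_titles = [m.text for m in matches]

# Stand-in for (//td/a/span[@itemprop="name"])[1]: the first node of the
# whole result. XPath counts from 1, Python lists count from 0.
first_title = all_titles[0]

# Without parentheses, [1] is applied per parent ("the first span inside
# each a"), so every row still matches.
per_parent = [m.text for m in root.findall(".//td/a/span[1]")]

print(all_titles)
print(first_title)
print(per_parent)
```

The last query suggests why the parentheses matter in the TagUI step: a position predicate written directly after span is evaluated within each parent element, whereas the parenthesized form picks one element from the combined result.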
