XPath Finder For Selenium

The Problem


You're trying to test a website with Selenium, and the site isn't
exactly built for that sort of thing: nothing has IDs, except the
things that have non-unique IDs. You could just use something like
xpather, but that generates really brittle xpaths
(/body/div[1]/div[2]/div[1]/...) that break the instant the
developers change anything.

So, it's taking you an hour to find each xpath for each element you
want to deal with. What a staggering waste of time!

The Solution: XPath Finder/Generator/Thingy For Selenium


The software is a simple .tar.gz file
with just the code itself, and the HTML I used for the examples
below.

Requirements

  • Perl
  • XML::LibXML
    • which requires libxml2
  • it's only been tested on Linux

How It Works, In Brief


The XPath Finder For Selenium takes a string (which can actually be
a full Perl regular expression) and a file with your XML/HTML/XHTML
in it (you can use wget to grab that, or have your Selenium script
use get_html_source() and print that if you want to be really sure
you're seeing what your Selenium script is seeing).

The string can match absolutely any part of the HTML, including any
attribute values. Do note that the match is case sensitive, though.
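
To make the matching rules concrete, here's a rough sketch in Python. It is not the tool's code (the tool is Perl and uses XML::LibXML); the sample markup and the function name matching_elements are made up for illustration, and it only looks at each element's own text and attribute values.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample markup, loosely modelled on the example further down.
SAMPLE = """<div id="menu">
  <a href="The+Logical+Language+Group" class="separator"><span class="menuText">The LLG</span></a>
  <p>about the llg</p>
</div>"""


def matching_elements(root, pattern):
    """Return elements whose own text or any attribute value matches pattern."""
    regex = re.compile(pattern)  # no IGNORECASE flag, so matching is case sensitive
    hits = []
    for elem in root.iter():
        candidates = [elem.text or ""] + list(elem.attrib.values())
        if any(regex.search(value) for value in candidates):
            hits.append(elem)
    return hits


root = ET.fromstring(SAMPLE)
# "LLG" hits the span text but not the lowercase "llg" in the paragraph.
tags_llg = [e.tag for e in matching_elements(root, r"LLG")]
# A regex can also hit an attribute value, here the href of the link.
tags_attr = [e.tag for e in matching_elements(root, r"Logical\+Language")]
print(tags_llg, tags_attr)
```

The lowercase "llg" paragraph is skipped, which is the case sensitivity mentioned above.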

It then presents 2 possible xpaths for every parent of every
node that contains the string in question. One is the normal
relative-to-the-parent xpath segment, like "div[4]" if it's the 4th
div inside its parent, and the other starts with // and uses the
attributes and some other tricks to find that node and only that
node, even though it's unrooted.
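
The two kinds of xpath can be sketched like this, again in Python and again as an illustration rather than the tool's actual implementation. The function names and the sample markup are assumptions, and attribute values containing a single quote are not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup.
SAMPLE = """<body>
  <div class="separator"><span class="menuText">Home</span></div>
  <div class="separator"><span class="menuText">The LLG</span></div>
  <div id="footer"><span class="note">bye</span></div>
</body>"""


def parent_relative_segment(parent, node):
    """Segment like 'div[2]', or just 'div' when it is the only such child."""
    same_tag = [child for child in parent if child.tag == node.tag]
    if len(same_tag) == 1:
        return node.tag
    position = next(i for i, child in enumerate(same_tag, 1) if child is node)
    return "%s[%d]" % (node.tag, position)


def unrooted_xpath(root, node):
    """Attribute-based xpath starting with //, indexed if it is not unique."""
    conditions = " and ".join("@%s='%s'" % (k, v) for k, v in node.attrib.items())
    base = "//%s[%s]" % (node.tag, conditions) if conditions else "//" + node.tag
    # Everything the attribute test would select, in document order.
    matches = [
        e for e in root.iter(node.tag)
        if all(e.get(k) == v for k, v in node.attrib.items())
    ]
    if len(matches) == 1:
        return base
    position = next(i for i, e in enumerate(matches, 1) if e is node)
    return "(%s)[%d]" % (base, position)


root = ET.fromstring(SAMPLE)
second_div, footer = root[1], root[2]
llg_span = second_div[0]
print(parent_relative_segment(root, second_div))
print(unrooted_xpath(root, llg_span))
print(unrooted_xpath(root, footer))
```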

The basic idea is you go back into the parents looking for the first
thing that you don't expect to change very often (like something
that's a part of the major overall structure of the page; a
nearly-outermost div or something). You use the second type of
xpath for that node to start your xpath, and you use the first type
for the rest.

I Didn't Follow That At All


Totally understandable, as I didn't explain it very well, so here's
an example session, using example.html from the tarball.

The example is from lojban.org, a site I run (not that I run
Selenium against it, but the HTML is nice and complicated and
crappy).

So in the left-hand sidebar is a dropdown called "The LLG". Pretend
that I couldn't make that reasonably unique, for illustration
purposes.

So here's what I run:

./xpath_generator LLG example.html | less


Piping it to a pager like that is very important: this thing
generates a lot of output.

For each matching section it finds, it tells you about it, like so:

***********************************************************************
Matching Block:
    <p>This site is the official repository of materials from <a
title="The Logical Language Group" href="The+Logical+Language+Group"
class="wiki ">The Logical Language Group</a> (LLG), the non-profit
corporation which has led Lojban development since 1987. </p>
=======================================================================


Well, that's not the one I want, so I keep going until I find the
right one:

***********************************************************************
Matching Block:
        <span class="menuText">The LLG</span>
=======================================================================


It happens to be the third one.

Directly below are all the xpath components it found for me, starting with the node itself:

Walking the tree upwards, generating xpaths, starting with the element itself which we'll call 'parent 0' because it makes the code easier.
    parent 0 looks a bit like this: <span class='menuText'>
        Finding an unrooted xpath for this node
            An initial xpath: //span[@class='menuText']
            Which returns 28 results
            Final unrooted XPath for parent 0: (//span[@class='menuText'])[24]
        Finding an xpath for this node based on the parent
            An initial xpath: span
            Which returns 1 results
            Final parent-rooted XPath for parent 0: span
        Final dual-rooted XPath for parent 0: (//span[@class='menuText'])[24]


(Please ignore the "Final dual-rooted XPath" bit for now.)

This informs me that I could refer to this item as "(//span[@class='menuText'])[24]", that is, the 24th thing that matches //span[@class='menuText'].

Obviously, that would be a bad plan.
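
Here's a small sketch of why an indexed xpath like that is fragile. Python's standard ElementTree doesn't support the (//x)[n] form, so the sketch emulates it by indexing the list of all matches; the markup and the "Blog" item are invented for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu with three items spread over two parents.
SAMPLE = """<body>
  <div><span class="menuText">Home</span><span class="menuText">News</span></div>
  <div><span class="menuText">The LLG</span></div>
</body>"""
root = ET.fromstring(SAMPLE)

# Emulates (//span[@class='menuText'])[3]: the third match in the whole document.
third_overall = root.findall(".//span[@class='menuText']")[2].text

# By contrast, //span[1] means "the first span within each parent".
first_in_each_parent = [e.text for e in root.findall(".//span[1]")]

# Someone adds a menu item near the top of the page...
new_item = ET.Element("span", {"class": "menuText"})
new_item.text = "Blog"
root[0].insert(0, new_item)

# ...and the same index now silently points at a different element.
third_after_change = root.findall(".//span[@class='menuText']")[2].text
print(third_overall, first_in_each_parent, third_after_change)
```

The index never errors out after the change; it just lands on the wrong element, which is worse.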

The part that informs me of that is:

            Final unrooted XPath for parent 0: (//span[@class='menuText'])[24]


It also informs me that, if I know the xpath for its parent, call that FOO, I can refer to this as simply "FOO/span".

The part that informs me of that is:

            Final parent-rooted XPath for parent 0: span


Now, let's say that I happen to know that once you get to the div that holds this particular menu, the structure above that is pretty stable, so that div makes a good starting point to anchor my xpath. Things are going to change within it, and things will change around it, but that div itself should remain relatively stable if I can point to it directly. So, here's a bunch more output:

    parent 1 looks a bit like this: <a href='The+Logical+Language+Group' class='separator'>
        Finding an unrooted xpath for this node
            An initial xpath: //a[@href='The+Logical+Language+Group' and @class='separator']
            Which returns 1 results
            Final unrooted XPath for parent 1: //a[@href='The+Logical+Language+Group' and @class='separator']
        Finding an xpath for this node based on the parent
            An initial xpath: a
            Which returns 2 results
            Final parent-rooted XPath for parent 1: a[2]
        Final dual-rooted XPath for parent 1: //a[@href='The+Logical+Language+Group' and @class='separator']/span
    parent 2 looks a bit like this: <div class='separator'>
        Finding an unrooted xpath for this node
            An initial xpath: //div[@class='separator']
            Which returns 6 results
            Final unrooted XPath for parent 2: (//div[@class='separator'])[6]
        Finding an xpath for this node based on the parent
            An initial xpath: div
            Which returns 12 results
            Final parent-rooted XPath for parent 2: div[11]
        Final dual-rooted XPath for parent 2: (//div[@class='separator'])[6]/a[2]/span
    parent 3 looks a bit like this: <div role='navigation'>
        Finding an unrooted xpath for this node
            An initial xpath: //div[@role='navigation']
            Which returns 1 results
            Final unrooted XPath for parent 3: //div[@role='navigation']
        Finding an xpath for this node based on the parent
            An initial xpath: div
            Which returns 1 results
            Final parent-rooted XPath for parent 3: div
        Final dual-rooted XPath for parent 3: //div[@role='navigation']/div[11]/a[2]/span
    parent 4 looks a bit like this: <div id='lojban_org_Menu' style='display:block;'>
        Finding an unrooted xpath for this node
            An initial xpath: //div[@id='lojban_org_Menu' and @style='display:block;']
            Which returns 1 results
            Final unrooted XPath for parent 4: //div[@id='lojban_org_Menu' and @style='display:block;']
        Finding an xpath for this node based on the parent
            An initial xpath: div
            Which returns 1 results
            Final parent-rooted XPath for parent 4: div
        Final dual-rooted XPath for parent 4: //div[@id='lojban_org_Menu' and @style='display:block;']/div/div[11]/a[2]/span


That last bit is the div in question, and, thankfully, the xpath for accessing it directly is quite clean; no (...)[24] this time. It's just "//div[@id='lojban_org_Menu' and @style='display:block;']".

So we start with that, and then we add on to that the parent-rooted XPath segment for each child. First, for the <div role='navigation'>, we have just "div". Then for the <div class='separator'> we have "div[11]", which is not so great, but you take what you can get. Then "a[2]", then "span" as previously mentioned.


So the xpath to actually use is:

//div[@id='lojban_org_Menu' and @style='display:block;']/div/div[11]/a[2]/span


The astute among you will have noticed that this is exactly the value in "Final dual-rooted XPath for parent 4", which is why that line is there: so you don't have to do the compositing yourself.

Just find the first stable parent of your element, use the "Final dual-rooted XPath", and you should be good to go.
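
The compositing itself is simple enough to sketch. The xpath strings below are copied from the output above; the function name dual_rooted and the list layout are assumptions for illustration, not taken from the tool's source.

```python
# One (unrooted xpath, parent-rooted segment) pair per level, copied from the
# output above. The list index is the parent number.
levels = [
    ("(//span[@class='menuText'])[24]", "span"),
    ("//a[@href='The+Logical+Language+Group' and @class='separator']", "a[2]"),
    ("(//div[@class='separator'])[6]", "div[11]"),
    ("//div[@role='navigation']", "div"),
    ("//div[@id='lojban_org_Menu' and @style='display:block;']", "div"),
]


def dual_rooted(levels, anchor):
    """Unrooted xpath of the anchor, then parent-rooted segments down to the node."""
    unrooted = levels[anchor][0]
    # Segments of everything below the anchor, ordered from the anchor downwards.
    segments = [segment for _, segment in levels[:anchor]][::-1]
    return "/".join([unrooted] + segments)


print(dual_rooted(levels, 4))
print(dual_rooted(levels, 2))
```

Picking a different anchor just changes where the unrooted part stops and the parent-rooted segments begin.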

You might want to trim or modify it in some cases, though. Here come some examples.

Since the id alone is probably unique, you could also just do:

//div[@id='lojban_org_Menu']/div/div[11]/a[2]/span


If the ID were variable depending on the data set, say lojban_org_Menu_1234 where the 1234 changes each time, you could do something like this:

//div[contains(@id, 'lojban_org_Menu')]/div/div[11]/a[2]/span
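
As a rough check of that idea, here's a Python sketch that emulates the contains() test with a plain substring check, since the standard ElementTree module has no contains(). The page template is a cut-down, made-up stand-in for the real markup, so the relative path is shorter than the one above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down page; only the id of the outer div varies.
PAGE = """<body>
  <div id="{menu_id}">
    <div role="navigation">
      <a href="Home"><span class="menuText">Home</span></a>
      <a href="The+Logical+Language+Group"><span class="menuText">The LLG</span></a>
    </div>
  </div>
</body>"""


def find_menu_span(root, id_fragment):
    """Emulate //div[contains(@id, FRAGMENT)]/div/a[2]/span (first match only)."""
    for div in root.iter("div"):
        if id_fragment in div.get("id", ""):  # substring test, case sensitive
            return div.find("div/a[2]/span")
    return None


results = {}
for menu_id in ("lojban_org_Menu_1234", "lojban_org_Menu_5678", "lojban_org_menu_1234"):
    root = ET.fromstring(PAGE.format(menu_id=menu_id))
    node = find_menu_span(root, "lojban_org_Menu")
    results[menu_id] = None if node is None else node.text
print(results)
```

The same caveats apply to the real expression: contains() is case sensitive, and it matches any id that merely includes the fragment.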


It's not perfect, but it's going to be way less brittle in the face of changes than the fully-rooted xpath, which by the way is:

/body/div[4]/div/div[2]/div/div[2]/div/div[2]/div[1]/div/div/div[11]/a[2]/span


See what I mean?
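
If you'd rather see it than take my word for it, here's a small sketch that compares the two styles when the page changes. The markup, the banner div and both paths are invented for the demonstration, and the paths are written in the subset of XPath that Python's ElementTree understands (relative to the body element).

```python
import xml.etree.ElementTree as ET

# Hypothetical page: a header div followed by the menu div.
PAGE = """<body>
  <div id="header"><p>Lojban</p></div>
  <div id="lojban_org_Menu">
    <div role="navigation">
      <a href="Home"><span class="menuText">Home</span></a>
      <a href="The+Logical+Language+Group"><span class="menuText">The LLG</span></a>
    </div>
  </div>
</body>"""

ROOTED = "div[2]/div/a[2]/span"  # stands in for /body/div[2]/div/a[2]/span
ANCHORED = ".//div[@id='lojban_org_Menu']/div/a[2]/span"


def text_at(root, path):
    """Text of the first element found at path, or None if nothing matches."""
    node = root.find(path)
    return None if node is None else node.text


root = ET.fromstring(PAGE)
before = (text_at(root, ROOTED), text_at(root, ANCHORED))

# The developers add a banner div at the top of the page.
root.insert(0, ET.Element("div", {"id": "banner"}))
after = (text_at(root, ROOTED), text_at(root, ANCHORED))
print(before, after)
```

After the banner goes in, the fully-rooted path finds nothing, while the anchored one still lands on "The LLG".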