Hacking Competitive Pricing Analysis with Scraping

Last Updated: July 3, 2016

Ever wanted to find out what price your competitors are charging for all of their products? Perhaps you want to monitor changes to their pricing so that you can adapt quickly or just monitor trends.

Well, I’m going to show you a hack that I’ve used in the past and continue to use (especially within e-commerce projects) to monitor all of my competitors’ pricing, and it’s automated.

Not only will you be able to avoid extremely expensive and largely inaccurate ‘competitive intelligence’ software, but you’ll have complete control over what data you want to pull through, when you want it, and there’s absolutely no limit to how often you do it. Sounds good, right?

What You’ll Need

There are a few things that you’re going to need to be able to do this – nothing big or expensive, and all the knowledge you need will be within this post:

  1. Subscription to URL Profiler ($15.95 per month).
  2. Microsoft Excel.
  3. Very basic knowledge of HTML/CSS (I’ll walk you through what you need to know).

That’s it.

Hacking Competitive Pricing Analysis

Before I get into the technical side of running the competitive pricing analysis, I’m going to quickly outline each of the steps. Don’t worry if you don’t quite understand some of these steps at this stage, because I’m going to run through them in detail.

You should be able to follow all of this, even if you’ve never done any coding before – if you have any issues, drop a comment and I’ll do my best to answer them. So here’s the process on a basic level:

Step 1: Gather Your Competitors’ Product Pages

The first step in this process involves you gathering all of the product page URLs that you want to pull pricing information (and more) from. There are a number of ways to do this, and the reasoning for this will become more apparent as we progress.

There are three main ways to get these URLs:

  1. Pull them in from the sitemap of your competitor’s website.
  2. Crawl the site with a tool like Screaming Frog SEO Spider or Deep Crawl.
  3. Scrape them from listing pages on their website (I’m not going to go into this because it’s worth a whole post in itself).

The first method is by far the easiest. The only reason why you’d go with option two or three is if your competitor doesn’t have a sitemap.

To find your competitor’s sitemap, go to Google and type the following query, replacing COMPETITOR-DOMAIN with the domain name of your competitor’s website (for example, amazon.com):

site:COMPETITOR-DOMAIN inurl:sitemap OR filetype:xml

Sometimes you’ll get a few results for different sitemaps here. This is because many websites, especially large e-commerce sites, have multiple sitemaps. You’ll just need to do a bit of manual digging to find the right one.

ASOS sitemap

You can also just add ‘sitemap.xml’ to the end of their domain name and that will sometimes do the trick.

Once you find their sitemap, copy its URL and paste it into this awesome URL extractor tool by Rob Hammond, which will give you a plain list of all the URLs within the sitemap without any of the other data. You can then copy and paste this into a new Excel spreadsheet.

XML sitemap extractor
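If you’d rather skip the web tool, here’s a minimal Python sketch of the same step. It assumes you have the requests library installed, and the sitemap URL below is just a placeholder that you’d swap for your competitor’s:

# Pull every URL out of an XML sitemap (minimal sketch; assumes the
# requests library is installed and a standard sitemap format).
import requests
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://www.example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(SITEMAP_URL, timeout=30).content)

# Each url -> loc entry in the sitemap holds one page URL.
urls = [loc.text for loc in root.findall(".//sm:loc", NS)]

with open("competitor-urls.txt", "w") as f:
    f.write("\n".join(urls))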

If you can’t find their sitemap then you’ll want to crawl their website to pull through all of their URLs.

I’m not going to go into this process in detail because it’s very straightforward once you download the software to do it. I’d recommend using Screaming Frog SEO Spider for this – there’s a free version, too.

All you’ll need to do is add your competitor’s domain and it will pull in all the URLs from their website. I even put together a full tutorial on using Screaming Frog SEO Spider that you can check out.

Once you have the URLs…

Once you’ve pulled in all of the URLs from your competitor’s website, you’ll want to filter down to the product pages. Sometimes this is easy because they have something like /products/ in all of their product page URLs (a lot of Shopify stores do).

If this is the case, just use a filter in Excel to filter down on any URL containing /products/.

If there’s no way to distinguish a product page from its URL, don’t worry: you can just process all the URLs you have, and any that aren’t product pages will simply return blank results. A scripted version of this filter is sketched below.
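If you’re scripting this instead of using Excel, the same filter is a couple of lines of Python (assuming the /products/ pattern and the competitor-urls.txt file from the sitemap sketch earlier):

# Filter the URL list down to product pages (sketch; assumes a
# /products/ URL pattern and the competitor-urls.txt file from above).
with open("competitor-urls.txt") as f:
    urls = f.read().splitlines()

product_urls = [u for u in urls if "/products/" in u]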

Like this Hack?

Download my growth hacking ebook with 25 hacks to implement straight away.

Download for Free

Step 2: Identify the Elements to Scrape

The next step in the process is to identify the elements on the product pages that you want to extract (i.e. the price).

This is where a basic knowledge of HTML and CSS comes in really useful, but as I mentioned previously, I’ll explain this in a way that even someone with no code knowledge will understand.

First of all, navigate to one of your competitor’s product pages. For the purposes of explaining this process, I’m going to use a product page from Ebuyer.com as my example:

Ebuyer.com product page

Here’s the URL of the product page above so that you can follow along: http://www.ebuyer.com/712178-apple-macbook-pro-mjlt2b-a

For the competitive pricing analysis, I might want to pull in the following data points for each product:

  1. The product name.
  2. The price.
  3. The product category.
  4. Whether it’s in stock.
  5. The product description.

It’s worth noting that all of this information has to be visibly present on your competitor’s product page for you to be able to extract it.

Let’s start with the product name to show how you’d identify the HTML element that contains this information:

  1. Open the product page up in Google Chrome.
  2. Identify where on the page the product name is shown.
  3. Right-click on the product name.
  4. Select ‘Inspect’ from the menu.
  5. Look at the line of code highlighted in the Developer Tools box that’s just popped up at the bottom of your browser.
  6. The HTML element that contains the product name is what we’re looking for.

Here’s a visual guide to this process:

finding an HTML element

In the example above, the HTML element was an h1 with the class ‘product-title’.

Now that you have the HTML element, it’s time to build the CSS selector or XPath to extract this data.

Step 3: Write the CSS Selector or XPath Query

This step is all about being able to communicate with software in order to tell it where to find the information you want within a webpage.

To do this, we either use a CSS selector or some XPath. I’m not going to get into the details of what each of these is, because you don’t really need to know that. However, if you want to do some more research on them, check out W3Schools.

This step builds slightly on Step 2, so let’s go back to finding the product name element using the six steps that I outlined above.

All you need to do is right-click on the line of code shown in the Developer Tools and then select Copy > Copy selector. Here’s a visual walkthrough:

copying the CSS selector

This will copy the CSS selector code to your clipboard. Just open up a blank text editor and paste this into it to keep track of it for now. Just make sure you write next to it what data it’s related to (i.e. the product name).

In the example above, the CSS selector I’m given is:

#main-content > div > div:nth-child(1) > div.clearfix > div.product-main > div.product-header > div.product-info > h1

This is what you’ll need for the next step to identify the elements that we want to extract.
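Before relying on a copied selector, it’s worth sanity-checking that it actually matches something. Here’s a quick Python sketch using the requests and beautifulsoup4 libraries (a reasonably recent version of the latter, which supports selectors like :nth-child):

# Quick sanity check of a copied CSS selector (sketch; assumes the
# requests and beautifulsoup4 libraries are installed).
import requests
from bs4 import BeautifulSoup

url = "http://www.ebuyer.com/712178-apple-macbook-pro-mjlt2b-a"
soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")

selector = ("#main-content > div > div:nth-child(1) > div.clearfix > "
            "div.product-main > div.product-header > div.product-info > h1")
element = soup.select_one(selector)
print(element.get_text(strip=True) if element else "Selector matched nothing")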

One word of warning: CSS selectors copied from the Developer Tools like this can sometimes be a little unreliable, because they depend on the exact position of the element within the page structure, which can vary from page to page or change over time. If you want to be more certain that it will work, you can write your own XPath query.

To do this, we just need to know a couple of things about the HTML element that’s holding the data we want. The first is what kind of HTML element it is (for example, is it an h1, div, p, a, span, etc.?). You’ll be able to find this out because it’s the first word after the opening <.

In the case of the product name in my example, the full line of code is:

<h1 class="product-title" itemprop="name">Apple MacBook Pro</h1>

In this case, the HTML element will be h1.

The second piece of information that we need is some kind of unique attribute. Within this line of code there are two attributes, the ‘class’ and the ‘itemprop’.

The unique identifier for the class attribute is ‘product-title’, whilst the unique identifier for the ‘itemprop’ attribute is ‘name’.

This is all we need to identify this specific piece of data. Now we just need to turn this into an XPath query.

Here’s the structure of an XPath query, with placeholders in the areas where we need to add our HTML element, attribute and unique identifier:

//element[@attribute="unique identifier"]

So using this syntax, the XPath for me to pull in the product name from the Ebuyer.com product page would be:

//h1[@class="product-title"]

Or, if we use the itemprop as the attribute for identifying it (instead of the class):

//h1[@itemprop="name"]

It’s completely up to you which attribute you use.

You’ll need to go through this process for each of the elements on the page that you want to extract. Just to give another example, here’s the code for the product price on the Ebuyer.com product page:

<span itemprop="price">1888.97</span>

In this instance, the HTML element is ‘span’, the attribute is ‘itemprop’ and the unique identifier is ‘price’. The XPath for this would be as follows:

//span[@itemprop="price"]

Hopefully you’re starting to follow this now.
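If you want to test your XPath queries before handing them to any tool, here’s a short Python sketch using the requests and lxml libraries (both assumed installed):

# Test the two XPath queries from above against the live product page
# (sketch; assumes the requests and lxml libraries are installed).
import requests
from lxml import html

url = "http://www.ebuyer.com/712178-apple-macbook-pro-mjlt2b-a"
tree = html.fromstring(requests.get(url, timeout=30).content)

name = tree.xpath('//h1[@itemprop="name"]/text()')
price = tree.xpath('//span[@itemprop="price"]/text()')

print(name[0].strip() if name else "No product name found")
print(price[0].strip() if price else "No price found")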

Step 4: Scrape the Data With URL Profiler

This is where it starts to get fun.

Open up URL Profiler and untick any boxes that may be preselected. To just run a test, add only one of the product page URLs from your competitor into the box on the right. You can literally just copy the URL and then paste it into the box.

add a URL to URL Profiler

Now you’ll want to select the ‘Custom Scraper’ option.

Once you click this, a new box will open. This is where you’re going to add in your CSS selectors or XPath queries. You can do up to 10 data points at a time; all you need to ensure is that you select the right data type for each one.

If you’ve gone down the route of writing your own XPath then you will select the XPath data type, as shown below:

Custom Scraper in URL Profiler

If you used CSS selectors instead, just be sure to select the data type as CSS.

Once you’ve added in the relevant CSS selector or XPath for each piece of data you want to extract, click ‘Apply’. Now all you need to do is click the ‘Run Profiler’ button and URL Profiler will start doing its thing.

After a short while, you’ll get a spreadsheet with a few bits of extra data on the URL, and you’ll see all of the extracted values within the columns labelled ‘Data 1’, ‘Data 2’, ‘Data 3’ and so on.

The extracted data

The spreadsheet above shows the two pieces of data that I pulled for the Ebuyer.com product URL (the product name and product price).

All that’s left for you to do is add all of the product page URLs into URL Profiler and run it exactly the same way. Instead of just having the data for one URL, you’ll have it for all of them – and it only takes a few minutes to process!

Here’s what my spreadsheet looked like after processing a larger batch of URLs:

Competitive pricing analysis data

As you can see, I also pulled in data on whether the product was currently in stock and what category the product falls under within the website.
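If you ever want to reproduce the batch run without URL Profiler, here’s a rough Python sketch of the same idea. It loops over the URL list, applies each XPath and writes blanks for pages that don’t match (requests and lxml assumed installed; filenames are placeholders, and error handling is left out for brevity):

# Rough scripted equivalent of a batch scrape (sketch; assumes the
# requests and lxml libraries are installed; filenames are placeholders).
import csv
import requests
from lxml import html

XPATHS = {
    "Product Name": '//h1[@itemprop="name"]/text()',
    "Price": '//span[@itemprop="price"]/text()',
}

with open("competitor-urls.txt") as f:
    urls = f.read().splitlines()

with open("competitor-prices.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["URL"] + list(XPATHS))
    for url in urls:
        tree = html.fromstring(requests.get(url, timeout=30).content)
        row = [url]
        for xpath in XPATHS.values():
            matches = tree.xpath(xpath)
            # Non-product pages simply return a blank result.
            row.append(matches[0].strip() if matches else "")
        writer.writerow(row)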

Now tell me this isn’t pretty cool!

Step 5: Organise the Data

The fifth and final step in this process is to organise all of the data that you’ve extracted.

You won’t need all of the extra data that URL Profiler pulls in by default (e.g. TLD, HTTP Status, Encoding, etc.) so I’d just delete all of these, leaving only the product URL and the data that you’ve extracted.

Next, change the column titles (Data 1, Data 2, Data 3 …) to something more descriptive; for example, ‘Price’.

Finally, you’ll want to label the spreadsheet with the name of your competitor and the date you extracted the data. You can then create a separate sheet that has all of your competitors’ data housed in one place to do full comparative analysis.
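If you’d rather script the tidy-up, here’s one way to do it with pandas (assumed installed; the file and column names below are placeholders for whatever your export actually contains):

# Tidy the export: rename the generic 'Data' columns and tag each row
# with the competitor and extraction date (sketch; pandas assumed
# installed; filenames and column names are placeholders).
import datetime
import pandas as pd

df = pd.read_csv("url-profiler-export.csv")
df = df.rename(columns={"Data 1": "Product Name", "Data 2": "Price"})
df["Competitor"] = "Ebuyer"
df["Extracted"] = datetime.date.today().isoformat()
df.to_csv(f"ebuyer-{df['Extracted'].iloc[0]}.csv", index=False)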

To be honest, you can choose what works best for you to display all of this data because it’ll vary by project.

As always, if you have any questions, feel free to leave them below and I’ll do my best to answer them.

Competitive analysis should be a staple part of any business's weekly tasks. Staying up to date on changes to your competitors' pricing is essential to being able to adapt, especially within e-commerce. The only problem is that it's really difficult to run this kind of analysis.

Typically, you have two options. The first is to pay large sums for software that does this for you - most of the time the results aren't that accurate, and it requires a lot of setup time on your part. The other is to trawl through your competitors' websites and note down all their pricing manually - this doesn't exactly scale well.

Now I'm going to present a third option: hacking the whole process using scraping to instantly pull through all of your competitors' pricing whenever you want, as often as you want, and at next to no cost. Oh, and you don't even need to have done this before. Just follow my step-by-step guide and you'll be able to reap the benefits immediately.

About Matthew Barby

Global Head of Growth & SEO at HubSpot, award-winning blogger, industry speaker and lecturer for the Digital Marketing Institute.

12 Responses


A powerful, yet quite simple, how-to for e-commerce owners willing to do some competitive analysis.
Thanks a lot for sharing Matthew!

Nick Julia

Nice! I’ve never tried using url profiler to scrape data like that.

Using xpath or the css might be the best way to get that data… but have you tried import.io?
(not affiliated)

You can still write custom code. But for people that have trouble with that, they provide a point and click interface.

Also wondering, do you think this would be easy to extend … to get notifications of competitors’ sales (price increases/decreases of a certain percentage)?

Matthew Barby

Hey Nick, yeah I’ve used Import.io and Kimono Labs, et al. but found them a little restricting and a bit slow on large scale projects. In terms of extending this out – there’s tons extra that you can do!


I use kimonolabs; it’s much easier than having to write xpath selectors.


Yea, Kimonolabs is amazing but they’re shutting down their service soon 🙁 You can still use import.io though


Once again a truly great hack! Many thanks for introducing me to URL Profiler, it really is something else 🙂 Just one question: I used Rob Hammond’s URL Extractor on a very large ecommerce site and it returned 4,000-odd URLs with only 900-odd in view. Sorry to sound thick, but is there an upgrade to Rob’s tool – or another tool – that will allow me to place bigger databases of URLs into Excel for use in URL Profiler?

Many thanks,

Matthew Barby

Hey Phil, so I spoke with the guys at URL Profiler and you can actually import a sitemap directly into the tool. Just right-click on the area where you add links in and then select the sitemap option 🙂


I use the Chrome extension Web Scraper to do that; it’s basic but easy to use and free. URL Profiler seems to be more advanced.


Great article! I’ve personally been using scrapy (http://scrapy.org/) a fair amount for doing web scraping and I’ve found it really easy to use for large scale projects. It does require some programming though.


Hi Matt, thanks for your post. I actually did my first scraping project with import.io last week before I saw this, although I think I was inspired by your other content audit articles from last year.

My blog is based around travel and I’ve just done my first story based on scrapes with the tool too – it feels like it’s slightly on the naughty side, but it’s all otherwise accessible data…

What are your thoughts on affiliate blogs scraping websites to create articles?

Matthew Barby

I think scraping is a very powerful technique for any marketer. That said, I’m not a fan of stealing other people’s content to pass off as your own.

James Hughes

Is this really happening? Your article is very interesting. If you can have an automated hack of your competitor’s product price it will give you the upper hand in the market. I want to try this out.