Want to become an expert in VBA? Then this is the right place for you. This blog mainly focuses on teaching how to apply Visual Basic for Applications (VBA) in Microsoft Excel, so you can improve the functionality of your Excel workbooks. You can also ask any questions you have about MS Excel and applying VBA. We are happy to assist you.

When to use the getElementsByClassName method

In the previous post we learnt why getElementById is easier to use than getElementsByClassName. Read that post if you want to learn how to use the getElementById method in web scraping.

getElementsByClassName Vs getElementById

But you can’t use getElementById all the time, because web developers don’t assign ids to every element. Sometimes we have to work with elements that have no id. Here is an example.

There are two submit buttons in the above sample HTML code. This is the code of the first button.

And following is the code for the second button.

Assume that we want to click the second submit button. How do we find that element? Can we use the easiest method, explained in the last post? As you can see, we can’t, because this second submit button has no id. However, it does have a class called “a-button-input”, so we can use the getElementsByClassName method to click the button.

The first thing we need to find is the index number of this element among the elements with that class. To find it, count the number of elements on the webpage that have the same class and appear before this submit button. Let’s assume that there are five elements which belong to the same class before this submit button. As the index starts at 0, the index of this second submit button is 5. Then we can click the second button as follows.

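Dim objIE As Object

' Start a new Internet Explorer window
Set objIE = CreateObject("InternetExplorer.Application")

objIE.Top = 0
objIE.Left = 0
objIE.Width = 800
objIE.Height = 600

objIE.Visible = True

objIE.Navigate "Url here"

' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
Do While objIE.Busy Or objIE.readyState <> 4
    DoEvents
Loop

' Click the sixth element (index 5) with the class "a-button-input"
objIE.document.getElementsByClassName("a-button-input")(5).Click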

Note that you need to change the number in the last line to match the index of your element. If the index number is x, you can write it as follows.

objIE.document.getElementsByClassName("a-button-input")(x).Click
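
Counting elements by hand gets tedious on a large page. As a rough sketch, and not the only way to do it, the index can also be looked up by printing every element of the class to the Immediate window (Ctrl+G in the VBA editor). The procedure name ListElementsByClass and its parameters are hypothetical names chosen for this illustration, and the sketch assumes objIE has already finished loading the page.

```vba
' Hypothetical helper: prints the index and HTML of every element
' that has the given class, so the right index can be read off.
' Assumes objIE is an InternetExplorer.Application object whose page has loaded.
Sub ListElementsByClass(ByVal objIE As Object, ByVal className As String)
    Dim elements As Object
    Dim i As Long

    Set elements = objIE.document.getElementsByClassName(className)

    ' The collection is zero-based, so the last index is Length - 1
    For i = 0 To elements.Length - 1
        Debug.Print i, elements.Item(i).outerHTML
    Next i
End Sub
```

Calling ListElementsByClass objIE, "a-button-input" after the page has loaded would list each matching element next to its index, and the index of the button you want can then be used in the Click line.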

getElementsByClassName Vs getElementById

Programmers use various languages to write programs that collect data from websites, and even programmers using the same language use different techniques to extract the data. If you are new to web scraping, a typical question is when to use which method. getElementsByClassName and getElementById are two methods that often confuse novices, because many beginners are unsure why they should choose one over the other in a given situation.

So in this post I will explain how to use these two methods appropriately. The first thing to understand is that websites are not built with web scraping in mind. Web developers use various techniques to add functionality, make the site pleasing to the eye, increase its speed, and make it easy to change. So when we develop a web scraping program, we have to take the particular structure of that website into account when coding.

Now let’s consider the following HTML code. As you can see there are two input tags of type "submit" in this code. Assume we want to click the first submit button.

If you carefully examine the code, you can see that both buttons belong to the same class, "a-button-input". However, both buttons have unique ids as well. As the first button has a unique id, the easiest way to click that button is the getElementById method. This is how you can do it.

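Dim objIE As Object

' Start a new Internet Explorer window
Set objIE = CreateObject("InternetExplorer.Application")

objIE.Top = 0
objIE.Left = 0
objIE.Width = 800
objIE.Height = 600

objIE.Visible = True

objIE.Navigate "Url here"

' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
Do While objIE.Busy Or objIE.readyState <> 4
    DoEvents
Loop

' Click the element whose id is "button-search"
objIE.document.getElementById("button-search").Click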

If you want to click the same button using the getElementsByClassName method, then you can do it as follows.

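Dim objIE As Object

' Start a new Internet Explorer window
Set objIE = CreateObject("InternetExplorer.Application")

objIE.Top = 0
objIE.Left = 0
objIE.Width = 800
objIE.Height = 600

objIE.Visible = True

objIE.Navigate "Url here"

' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
Do While objIE.Busy Or objIE.readyState <> 4
    DoEvents
Loop

' Click the first element (index 0) with the class "a-button-input"
objIE.document.getElementsByClassName("a-button-input")(0).Click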

Several elements can share the same class name, so getElementsByClassName returns a collection and we have to pick the element by its index number, which starts at 0. You may not see a big difference in the above examples, as there are only two input tags in the HTML code. But real pages are rarely this simple. Some web pages have a large number of elements with the same class name, which makes it difficult to find the index of the element we need. If that element has an id, we can use the getElementById method without thinking about the index at all.
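
The rule of thumb above (use the id when there is one, otherwise fall back to the class name and index) could be sketched as a small helper. This is only an illustration under stated assumptions: FindElement and its parameter names are hypothetical, not part of any library, and doc is assumed to be the document object of a page that has finished loading.

```vba
' Hypothetical helper: returns the element with the given id if it exists,
' otherwise the element at classIndex among those with className.
' Returns Nothing when neither lookup finds an element.
Function FindElement(ByVal doc As Object, ByVal elementId As String, _
                     ByVal className As String, ByVal classIndex As Long) As Object
    Dim found As Object
    Dim matches As Object

    ' Try the id first, since it needs no index
    If Len(elementId) > 0 Then
        Set found = doc.getElementById(elementId)
    End If

    ' Fall back to the class name and a zero-based index
    If found Is Nothing Then
        Set matches = doc.getElementsByClassName(className)
        If classIndex >= 0 And classIndex < matches.Length Then
            Set found = matches.Item(classIndex)
        End If
    End If

    Set FindElement = found
End Function
```

It could then be used like this, checking the result before clicking so that a missing element does not raise a run-time error:

```vba
Dim btn As Object
Set btn = FindElement(objIE.document, "button-search", "a-button-input", 0)
If Not btn Is Nothing Then btn.Click
```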
