Selenium WebDriver Architecture: Components, Functions & Limitations

Testing a system is a challenging task, and a tool that automates it is invaluable. One tool that comes to mind for automation testers is Selenium. If you are eager to learn automation testing with Selenium WebDriver, you have come to the right place. Let’s get started.

What is Selenium?

Selenium is an open-source automation testing tool. The tool only tests web-based applications and is compatible with multiple browsers and operating systems.

Besides WebDriver, the Selenium suite includes three main tools:

  • Selenium RC
  • Selenium IDE
  • Selenium Grid

These tools were introduced separately during the 2000s rather than in a single release.

Selenium WebDriver

Until 2011, Selenium RC was widely used. In mid-2011, the Selenium project released version 2.0, which introduced WebDriver. WebDriver was not an upgrade to RC but a completely different tool with its own set of commands. At the time this article was written, the latest Selenium WebDriver version was 3.14; the 4.x series has since been released.

Features of Selenium WebDriver

  • Capable of making dynamic scripts.
  • Compatible with multiple browsers. 
  • Supports logging; test reports are usually produced through a separate testing framework.
  • Fast, as it drives the browser natively rather than through injected JavaScript.
  • Realistic interaction with page elements.
  • Selenium WebDriver’s API is much simpler and does not contain confusing and redundant commands.
  • Selenium WebDriver can support the headless HtmlUnit browser.

There are five components of Selenium WebDriver Architecture:

  1. Language Binding or Selenium Client Library: Client libraries let you write Selenium test scripts in the programming language of your choice; for Java, they are distributed as JAR files. Selenium scripts can be written in Java, C#, Ruby, Python and Perl.
  2. Selenium Application Programming Interface (API): An API defines the rules and specifications that software programs follow to communicate with each other. In short, the API acts as the interface between software programs and as their channel of communication.
  3. Remote WebDriver: RemoteWebDriver is a class that implements the WebDriver interface. A test script developer uses it to execute test scripts on a remote machine through a WebDriver server.
  4. JavaScript Object Notation (JSON) Wire Protocol: JSON is a lightweight data-interchange format used to transfer data between a client and a server on the web. JSON files have a .json extension. The JSON Wire Protocol sends commands in JSON format. The server parses and executes them, then sends its response back to the client, also in JSON format.
  5. WebDriver: WebDriver is the tool that automates web applications and verifies they work as expected.
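
To make the JSON Wire Protocol component more concrete, here is a minimal sketch of how a "navigate to URL" command could be expressed as an HTTP path plus a JSON body. The endpoint shape (POST /session/{sessionId}/url with a "url" field) follows the wire protocol; the class name WireCommandSketch, its helper methods, and the session id abc123 are hypothetical and are not part of the Selenium client library, which builds these messages for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a wire-protocol command; not part of the Selenium API.
class WireCommandSketch {

    // Minimal JSON string escaping (quotes and backslashes only), enough for this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serialise a flat map of string parameters into a JSON object.
    static String toJson(Map<String, String> params) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            sb.append(quote(e.getKey())).append(":").append(quote(e.getValue()));
            first = false;
        }
        return sb.append("}").toString();
    }

    // Path of the "navigate" command for a given (hypothetical) session id.
    static String navigatePath(String sessionId) {
        return "/session/" + sessionId + "/url";
    }

    // JSON body of the "navigate" command: the target URL under the "url" key.
    static String navigateBody(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("url", url);
        return toJson(params);
    }

    public static void main(String[] args) {
        System.out.println("POST " + navigatePath("abc123"));
        System.out.println(navigateBody("https://www.selenium.dev"));
    }
}
```

Running main prints the request line POST /session/abc123/url followed by the body {"url":"https://www.selenium.dev"}, which is roughly what travels from the client library to the browser driver when a script asks the browser to open a page.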

Selenium WebDriver Architecture

We will now focus on the Selenium WebDriver architecture. The Selenium WebDriver API enables communication between test scripts and browsers by way of browser drivers. The architecture comprises the following four layers:

  • Selenium Client Library
  • JSON Wire Protocol
  • Browser Drivers
  • Browsers

How Does Selenium WebDriver Work Internally?

Selenium WebDriver code can be written in an integrated development environment (IDE) such as Eclipse, using any one of the Selenium client libraries, for example the Java library.

Once the script is ready, click Run to execute the program. For example, a script that targets Chrome and the SeleniumHQ website will launch the Chrome browser and navigate to that site.

Internally, Selenium WebDriver follows these general steps:

1. Click Run.

The Selenium client library communicates with the Selenium API.

2. The Selenium API sends the command from the language binding to the browser driver.

The communication takes place over the JSON Wire Protocol.

3. The browser driver receives the request.

Each browser driver runs an HTTP server that accepts these HTTP requests.

4. The HTTP server determines which commands need to be executed.

The commands in the Selenium script then execute on the browser.

5. The HTTP server sends the response back to the automation test script.
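
The request/response cycle in the steps above can be imitated with nothing but the JDK (Java 11 or later). The sketch below is a simulation under stated assumptions: DriverRoundTripSketch, startFakeDriver, sendCommand and the session id abc123 are hypothetical names, the "driver" is a local HTTP server that only echoes the command instead of controlling a browser, and the reply is loosely modelled on the legacy wire protocol's status/value fields.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Toy stand-in for the client/driver exchange; it does not start a real browser.
class DriverRoundTripSketch {

    // Starts a fake "browser driver" HTTP server on a free local port.
    static HttpServer startFakeDriver() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/session", exchange -> {
                // Steps 3-4: the driver's HTTP server receives and reads the command.
                String command = new String(
                        exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
                // A real driver would now act on the browser; here the command is only echoed.
                byte[] reply = ("{\"status\":0,\"value\":" + command + "}")
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, reply.length);
                exchange.getResponseBody().write(reply);
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Steps 1-2 and 5: the client posts a JSON command and reads the JSON response.
    static String sendCommand(int port, String path, String jsonBody) {
        try {
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://127.0.0.1:" + port + path))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                    .build();
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Runs one full request/response cycle and shuts the fake driver down.
    static String demo() {
        HttpServer server = startFakeDriver();
        try {
            return sendCommand(server.getAddress().getPort(),
                    "/session/abc123/url", "{\"url\":\"https://www.selenium.dev\"}");
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is the division of labour: the test script never touches the browser directly. It only sends HTTP requests carrying JSON and interprets the JSON that comes back.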

Technical Specifications of Selenium WebDriver

  • Operating System (OS) – Windows, Solaris, Linux and Mac OS
  • Supported Browsers – Internet Explorer, Google Chrome 12.0.712.0 and above, Safari, Opera 11.5 and above, Mozilla Firefox, HtmlUnit 2.9, Android and iOS

Best Features of Selenium WebDriver

  • Multiple Browser Support – Supports almost all browsers.
  • Multiple Languages Support – Supports most of the commonly used programming languages.
  • Speed – Selenium WebDriver is faster than the other tools in the Selenium suite.
  • Simple Commands – Common commands are easy to use in Selenium WebDriver. For example, to launch a browser, execute one of the following commands:
    • WebDriver driver = new FirefoxDriver(); (Firefox browser)
    • WebDriver driver = new ChromeDriver(); (Chrome browser)
    • WebDriver driver = new InternetExplorerDriver(); (Internet Explorer browser)
  • Methods and Classes – Selenium WebDriver offers a range of methods and classes for handling common challenges in automation testing.

Limitations of Selenium WebDriver

  • Selenium WebDriver does not automatically support new browsers 

WebDriver operates at the OS level, and each browser communicates with the OS in its own way. A new browser may therefore communicate with the OS differently from the browsers WebDriver already supports, resulting in compatibility issues. The Selenium WebDriver team needs some time to make a new browser compatible with Selenium WebDriver.

  • Selenium WebDriver does not have a built-in command to automatically generate a ‘Test Results’ file

You have to rely on the output window of the integrated development environment (IDE). Alternatively, you can build the results file yourself in your preferred language and save it as HTML or plain text.
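
As a rough illustration of building the results file yourself, the sketch below renders a map of test names and pass/fail outcomes as a small HTML table and writes it to disk. It uses only the JDK; the class name HtmlReportSketch, the file name test-results.html and the sample test names are assumptions made for this example, and in practice the outcomes would come from your own test code or test framework.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hand-rolled results writer; the class and method names are illustrative only.
class HtmlReportSketch {

    // Escape the characters that would otherwise break the HTML markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Render test names and pass/fail outcomes as a small HTML table.
    static String render(Map<String, Boolean> results) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body><h1>Test Results</h1>\n<table>\n");
        html.append("<tr><th>Test</th><th>Result</th></tr>\n");
        for (Map.Entry<String, Boolean> e : results.entrySet()) {
            html.append("<tr><td>").append(escape(e.getKey())).append("</td><td>")
                .append(e.getValue() ? "PASS" : "FAIL").append("</td></tr>\n");
        }
        html.append("</table></body></html>\n");
        return html.toString();
    }

    // Write the rendered report to disk as UTF-8.
    static void write(Path file, Map<String, Boolean> results) {
        try {
            Files.write(file, render(results).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Insertion-ordered map so rows appear in the order the tests ran.
        Map<String, Boolean> results = new LinkedHashMap<>();
        results.put("homepage loads", true);
        results.put("login rejects bad password", false);
        write(Path.of("test-results.html"), results);
    }
}
```

Keeping render separate from write makes the same results easy to emit as plain text instead, should HTML be more than you need.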

Final Thoughts

  • Selenium WebDriver is a tool that tests web applications on different browsers. 
  • It supports several programming languages.
  • Selenium WebDriver superseded Selenium RC, helped by its simpler architecture.
  • Selenium WebDriver has a concise API.

What is Selenium IDE?

Selenium IDE (Integrated Development Environment) is the most basic tool in the Selenium suite. It is a Firefox add-on that makes it simple to create tests through record-and-playback functionality, similar to that of QuickTest Professional (QTP). It is easy to set up and understand. Because of its simplicity, Selenium IDE is best used as a prototyping tool rather than as a complete solution for creating and managing complex test suites. When composing tests, you can use its autocomplete mode, which helps in two ways: it lets the tester enter commands more quickly, and it prevents mistyped commands.

What is Selenium RC?

Selenium Remote Control (RC) was the main Selenium project for a long time before Selenium WebDriver. Because WebDriver provides more powerful features, Selenium RC is no longer widely used, although it can still be used to write scripts. With programming languages such as Python, Java, Perl, C#, and PHP, you can create automated web application UI tests that also read and write files, access a database, and send out test results. Selenium RC works as follows: client libraries communicate with the Selenium RC Server and pass it each Selenium command to be executed. The server then sends the command to the browser using Selenium-Core JavaScript commands.

What is a Selenium Grid?

Selenium Grid is a Selenium suite component that lets you run many tests in parallel across different browsers, operating systems, and machines. It does this by routing commands to remote browser instances through a hub server, which you must configure before running the tests. Selenium Grid has two variants, Grid 1 and Grid 2. Grid 1 has been phased out by the Selenium team, so the focus here is on Grid 2. Selenium Grid uses a hub-node approach: tests are submitted to a single machine, the hub, while execution is handled by multiple nodes.
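
The hub-node idea can be sketched as a toy routing table plus a thread pool. This is only a simulation: GridRoutingSketch, the node names node-1 and node-2, and the in-process "tests" are hypothetical, whereas real Grid nodes are separate processes or machines that the hub reaches over the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model of hub-node routing; real Grid nodes are separate processes or machines.
class GridRoutingSketch {

    // The hub looks up which node offers the requested browser.
    static String route(Map<String, String> nodeByBrowser, String browser) {
        String node = nodeByBrowser.get(browser);
        if (node == null) {
            throw new IllegalArgumentException("No node registered for " + browser);
        }
        return node;
    }

    // Dispatch one job per requested browser and run the jobs in parallel.
    static List<String> runAll(Map<String, String> nodeByBrowser, List<String> browsers) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, browsers.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String browser : browsers) {
                String node = route(nodeByBrowser, browser);
                // Each submitted task stands in for a test executing on a node.
                futures.add(pool.submit(() -> browser + " test ran on " + node));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // collected in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, String> nodes = Map.of("chrome", "node-1", "firefox", "node-2");
        System.out.println(runAll(nodes, List.of("chrome", "firefox")));
        // prints [chrome test ran on node-1, firefox test ran on node-2]
    }
}
```

The caller only ever talks to the routing function, which mirrors how a test script points at the hub and leaves the choice of machine to it.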
