Hacker Rank leaderboard scraping step-by-step

Noé Antona
4 min read · Jun 24, 2019

TL;DR: The fully working JS script is located at the bottom of the page.
That being said, as Guillaume LHOTE put it in his SOSUEE talk, a tool itself is less valuable than an understanding of how the tool works!

Disclaimer: Being able to scrape Hacker Rank data doesn’t mean you should scrape it. Use this at your own risk and in compliance with the GDPR. I will not be responsible for any misuse of this technique.

The question “Do you source for candidates on http://hackerrank.com/?” was asked on Recruiter’s kitchen, the French Slack community of recruiters.

Actually, I didn’t, and I couldn’t think of a single reason why not. The Hacker Rank leaderboard is a great way to discover qualified candidates; all you have to do is gather some pieces of information. Unfortunately, the leaderboard view is not really convenient: only the username, country, and rank are shown. Because of that, Dataminer was not really helpful.

Then I noticed that hovering over any username loaded a snippet with much more interesting data. Unfortunately, it was generated server-side, so no other scraper would have helped me either.

Time to dig into the API!
Thanks to Dev Tools (a native Chrome feature), you can see that requests are sent to URLs of the following kind: https://www.hackerrank.com/rest/contests/master/hackers/USERNAME/profile

For instance, for the “Gennady” profile, it loads the content of the following page: https://www.hackerrank.com/rest/contests/master/hackers/Gennady/profile

If you’re not comfortable with what you read there yet, learn about JSON.
All the data we need is located there.
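
To look at that data from the console rather than the address bar, a minimal sketch could fetch one profile and print it. The helper names `profileUrl` and `fetchProfile` are illustrative only, not part of any Hacker Rank API, and the optional `fetchImpl` parameter is a hypothetical addition that lets you try the function with a stand-in for `fetch`. The `model` key is the one the script further down reads.

```javascript
// Build the profile URL for one username (pattern observed in Dev Tools).
function profileUrl(username) {
  return `https://www.hackerrank.com/rest/contests/master/hackers/${encodeURIComponent(username)}/profile`;
}

// Fetch one profile and return its `model` object.
// `fetchImpl` is a hypothetical parameter; it defaults to the browser's fetch.
async function fetchProfile(username, fetchImpl = fetch) {
  const response = await fetchImpl(profileUrl(username));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.model;
}

// Usage, from the console of a hackerrank.com tab:
// fetchProfile('Gennady').then(model => console.log(model));
```

Printing the object first is a cheap way to see which fields exist before scraping a whole list.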

Time to scrape! At this stage, I first need a JSON list of every username. Chrome’s Inspect function shows that each username sits in an element with the classes “d-flex justify-content-between ellipsis”. We can grab these elements with the JS function “getElementsByClassName”, which leads to the following script:

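// Collect every username shown on the current leaderboard page.
var candidates = [];
var list = document.getElementsByClassName('d-flex justify-content-between ellipsis');
for (let item of list) {
  // Each matching element wraps a link whose data-value attribute holds the username.
  candidates.push(item.getElementsByTagName('a')[0].getAttribute('data-value'));
}
console.log(JSON.stringify(candidates));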

Note: The final “console.log” only shows you the results so that you can check that the script is working.

Now that I have the full list of usernames, let’s find a way to load every URL following this pattern: https://www.hackerrank.com/rest/contests/master/hackers/USERNAME/profile
The “fetch” JS function is the key!
Once loaded, the data needs to be stored somewhere: that’s what our jsonFinal variable is for.

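// Profiles are stored here, keyed by username.
var jsonFinal = {};
candidates.forEach(candidate => {
  var path = `https://www.hackerrank.com/rest/contests/master/hackers/${candidate}/profile`;
  // Each request runs asynchronously and fills jsonFinal when it completes.
  fetch(path).then(result => {
    result.json().then(data => {
      jsonFinal[candidate] = data.model;
    });
  });
});
// Print whatever has been collected after 1000 ms.
setTimeout(() => {
  console.log(JSON.stringify(jsonFinal));
}, 1000);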

Finally, we need a timer to wait for the script to visit every single URL and parse the data into jsonFinal. I’ve set mine to 1 s (i.e. 1000 ms), but the best waiting time depends on the speed of your internet connection and the number of profiles you want to scrape. If the timer fires before every response has arrived, the printed JSON will be missing profiles, so increase the delay if that happens.
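
Guessing a delay can be avoided by waiting on the requests themselves. The following is a sketch of that alternative using `Promise.all`, not the script this article was written around. The function name `fetchAllProfiles` and its `fetchImpl` parameter are hypothetical, and recording an `error` entry for a failed profile is a choice made for this example.

```javascript
// Fetch every profile and resolve once all requests have finished.
// `fetchImpl` is a hypothetical parameter; it defaults to the browser's fetch.
async function fetchAllProfiles(usernames, fetchImpl = fetch) {
  const profiles = {};
  await Promise.all(usernames.map(async (username) => {
    const path = `https://www.hackerrank.com/rest/contests/master/hackers/${username}/profile`;
    try {
      const result = await fetchImpl(path);
      const data = await result.json();
      profiles[username] = data.model;
    } catch (error) {
      // Keep going if one profile fails; record the failure instead.
      profiles[username] = { error: String(error) };
    }
  }));
  return profiles;
}

// Usage, once `candidates` has been filled by the first snippet:
// fetchAllProfiles(candidates).then(p => console.log(JSON.stringify(p)));
```

The output is printed only after the last response has been parsed, whatever the connection speed or the number of profiles.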

The results come as JSON, keyed by username.

Now, copy it and paste it into https://json-csv.com/ .
You’ll get a fully working CSV showing the username, name, country, languages known, university attended, job title, LinkedIn profile, etc.

You can now import it into Excel or another spreadsheet tool. Have fun!
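
If you prefer to stay in the console, the JSON-to-CSV step can also be sketched in a few lines. The function name `toCsv` is hypothetical, and the column names in the usage line are assumptions: check them against the keys that actually appear in your own JSON before relying on them.

```javascript
// Turn { username: profile, ... } into CSV text.
// `columns` lists the profile keys to export (assumed names, see usage).
function toCsv(profiles, columns) {
  // Quote a value if it contains a comma, a double quote, or a line break.
  const escapeCell = (value) => {
    const text = value === null || value === undefined ? '' : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = columns.join(',');
  const rows = Object.values(profiles).map((profile) =>
    // A missing profile becomes a row of empty cells.
    columns.map((column) => escapeCell((profile || {})[column])).join(',')
  );
  return [header, ...rows].join('\n');
}

// Usage: console.log(toCsv(jsonFinal, ['username', 'name', 'country']));
```

Nested values such as arrays are simply turned into strings here; a dedicated converter handles them more gracefully.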

TL;DR:

  • Go to the Hacker Rank leaderboard with Chrome
  • Select the category you’re interested in (Algorithm, Java, Python, etc.)
  • Right-click anywhere on the page, and select “Inspect”
  • Copy and paste the script (the two snippets above, in order) into the console and press Enter

Any feedback? Contact me on Twitter or by email :)
Thanks to Xavier who was really helpful when I reached the end of my JS skills!
