Understand URLs

Every web page has a URL. Checking the URL is one of the best ways to tell whether a page is genuine — so you can avoid phishing, malware and other scams.
The important bit is the domain name. It runs up to the first single slash (not counting the two slashes after https:) or, if there is no such slash, to the end of the URL. Its rightmost part is what matters most.

The role of the URL

You might not know it’s called a URL, but you’ve almost certainly seen one:

https://en.wikipedia.org/wiki/Hamster

A uniform resource locator (URL) identifies a website or a particular page or action within a website. The example above identifies the Wikipedia article on hamsters.

In the early days of the World Wide Web, it was common to request a page by typing its URL into your browser. You’d have learned the URL somewhere ‘offline’ like in a magazine or on the radio.

Nowadays we mostly find pages by typing keywords or the name of a place, company or product into a search engine like Google or Bing. But each result still has a URL that uniquely identifies it.

Depending on your device and browser, you’ll see the URL of the current page either at the top or bottom of the screen.

The protocol

Most web URLs begin with https followed by a colon and two slashes:

https://en.wikipedia.org/wiki/Hamster

This indicates the protocol: the set of rules your browser uses to talk to the website. While historically significant, it’s usually hidden by browsers nowadays.

The domain name

This is the important bit in terms of identifying the legitimacy of a page. The domain name continues until either a slash / or the end of the URL, whichever comes first. In our example, the domain name is en.wikipedia.org:

en.wikipedia.org/wiki/Hamster
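For readers who like to see a rule written as code, here is a minimal Python sketch of the one above, using the standard library's urllib.parse module. The helper name domain_of is made up for illustration. It assumes https:// when the protocol has been left off, because urlsplit only recognises the domain name when the two slashes are present.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the domain name of a URL in lower case.

    Hypothetical helper for illustration only. If the protocol is missing,
    https:// is assumed so urlsplit can separate the domain from the path.
    """
    if "://" not in url:
        url = "https://" + url
    # .hostname is everything between the '//' and the first single slash
    # (minus any port number), already lower-cased.
    return urlsplit(url).hostname or ""


print(domain_of("https://en.wikipedia.org/wiki/Hamster"))  # en.wikipedia.org
print(domain_of("en.wikipedia.org/wiki/Hamster"))          # en.wikipedia.org
```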

Computers actually read domain names from right to left, separating them at the dots. Usually the rightmost part, or the rightmost two parts, indicate a country or type of organisation, for example:

co.uk (UK, commercial)
org (organisation, typically non-profit)
fr (France)

The next part to the left is typically the name of the organisation:

vodafone
unicef
renault

There are exceptions, like diy.com which is the website of do-it-yourself retailer B&Q.

Combining these parts, we get complete domain names:

vodafone.co.uk
unicef.org
renault.fr
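The right-to-left reading described above can be sketched in a couple of lines of Python. The function name read_right_to_left is invented for this example, and it does nothing cleverer than split at the dots and reverse the order.

```python
def read_right_to_left(domain: str) -> list:
    """Split a domain name at the dots and list the parts rightmost first."""
    return domain.lower().split(".")[::-1]


print(read_right_to_left("vodafone.co.uk"))  # ['uk', 'co', 'vodafone']
print(read_right_to_left("unicef.org"))      # ['org', 'unicef']
```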

These are the best indicators of the legitimacy of a web page.

If the domain name matches what you’d expect of the website you’re visiting, that’s a very good indicator that you can proceed with confidence to enter personal information, log in, download software — or anything else where being on the correct website is particularly important.

If it doesn’t match, stop and think: am I about to fall for a scam?

Subdomains

Many organisations use subdomains to more easily manage large websites. In our earlier example, ‘en’ is a subdomain; it identifies the English version of Wikipedia:

en.wikipedia.org/wiki/Hamster

Here’s another example:

blogs.unicef.org

But remember: read right to left. The part that matters is immediately before the first single slash or, if there is no slash, the end of the URL:

blogs.unicef.org

blogs.unicef.org/blog/ukraines-water-heroes/
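As a rough sketch of how software can strip subdomains and keep only the part that matters, the Python below checks the ending of a domain against a small set of known endings. Both the set EXAMPLE_SUFFIXES and the function main_domain are assumptions made for this illustration. Real software generally relies on a far longer, maintained list such as the Public Suffix List.

```python
# A deliberately tiny, illustrative set of endings. It is an assumption for
# this sketch, not a complete list.
EXAMPLE_SUFFIXES = {"co.uk", "org", "fr", "com"}


def main_domain(domain: str) -> str:
    """Return the organisation's name plus its ending, dropping subdomains."""
    labels = domain.lower().split(".")
    # Try the two-part ending first so that 'co.uk' is matched before 'uk'.
    for size in (2, 1):
        ending = ".".join(labels[-size:])
        if ending in EXAMPLE_SUFFIXES and len(labels) > size:
            return ".".join(labels[-(size + 1):])
    raise ValueError("ending not in the example list: " + domain)


print(main_domain("en.wikipedia.org"))  # wikipedia.org
print(main_domain("blogs.unicef.org"))  # unicef.org
print(main_domain("vodafone.co.uk"))    # vodafone.co.uk
```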

Misspelt domains

Fake or malicious sites might use a misspelling of a genuine name:

vodaf0ne.co.uk (number zero where letter ‘o’ should be)
uncief.org (two letters swapped around)
renalt.fr (missing a letter)

Fraudsters operating such sites may not need to actively entice victims through phishing; they may simply rely on people hitting the wrong letters on their keyboard. The practice of holding onto subtly misspelt domains for nefarious ends is called typosquatting.
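One common way to spot near-miss spellings like these is to count how many single-character edits separate two names, a measure known as Levenshtein distance. The sketch below is one possible approach, not how any particular browser or security product works. The function names and the threshold of two edits are assumptions chosen for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the fewest single-character insertions,
    deletions or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free if they match)
            ))
        previous = current
    return previous[-1]


def looks_like_typo(domain: str, genuine: str, max_distance: int = 2) -> bool:
    """True if domain differs from genuine by only a few edits (but is not equal)."""
    distance = edit_distance(domain.lower(), genuine.lower())
    return 0 < distance <= max_distance


print(edit_distance("vodaf0ne.co.uk", "vodafone.co.uk"))  # 1
print(edit_distance("uncief.org", "unicef.org"))          # 2
print(edit_distance("renalt.fr", "renault.fr"))           # 1
```

A swapped pair of letters counts as two edits here, which is why the threshold is set at two. With very short names, unrelated but genuine domains can also fall within that distance, so a match is a prompt to look closer rather than proof of a scam.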

Domain names can now include characters from alphabets other than the Latin A to Z used in English, which makes even subtler misspellings possible. For example, in 2017 a researcher discovered that someone had registered the domain adoḅe.com and was using it to distribute malware disguised as Adobe Flash Player.

In some languages, a dot can be placed above or below a letter to indicate pronunciation, like a long vowel. Other languages’ alphabets include letters with a dot in their own right. The letter ‘b’ with a dot underneath, as in this example, is in the alphabet used by the Kalabari language in Nigeria.

Deceiving people with characters that are visually similar – or in some cases identical, but in different alphabets – is called a homograph attack.
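A simple first check for lookalike characters is to list any character in a domain name that falls outside plain ASCII, together with its official Unicode name. The Python sketch below does this with the standard unicodedata module. The function name non_ascii_characters is invented for this example, and the lookalike letter is written as the escape \u1e05 so that the code reads the same however the page is displayed.

```python
import unicodedata


def non_ascii_characters(domain: str) -> list:
    """Return (character, Unicode name) pairs for every non-ASCII character."""
    return [
        (char, unicodedata.name(char, "UNKNOWN"))
        for char in domain
        if ord(char) > 127
    ]


lookalike = "ado\u1e05e.com"  # 'b' with a dot below, as in the 2017 example
for char, name in non_ascii_characters(lookalike):
    print(repr(char), name)   # the name is LATIN SMALL LETTER B WITH DOT BELOW

print(non_ascii_characters("adobe.com"))  # []
```

An empty list does not prove a domain is genuine, since a misspelling like vodaf0ne.co.uk uses only ordinary characters. Likewise, a non-empty list is not proof of fraud, because plenty of legitimate domains use non-Latin alphabets.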

Bonus knowledge

If you’ve had enough of URLs for one day, do finish here, happy knowing you’re equipped with the knowledge to avoid the vast majority of online dangers.

For completeness, though, it would be wrong of me not to detail four caveats that can make it harder to interpret URLs – or harder to tell the authenticity of a URL – in certain cases.

Strange-looking subdomains

Sometimes you’ll see URLs where the domain name is a bit more obscure, like this:

secure-appldnld.apple.com/itunes12/

Is this the real Apple website? Yes! Check the rightmost part of the domain – remember, just before the first slash – and you’ll see it’s apple.com:

secure-appldnld.apple.com/itunes12/

Apple has simply chosen to name a subdomain ‘secure-appldnld’.

Trick subdomains

On the other hand, a crafty bogus site might use a subdomain in this fashion:

bbc.co.uk-news-health-39217858.martinedwards.co.uk

Is this the real BBC website? No! At a glance, it looks like an article in the News > Health section, but there’s no slash after bbc.co.uk, so the domain name continues, in this case to the end of the address. It’s an elaborate domain name which, if it were real, would probably resolve to my own website:

bbc.co.uk-news-health-39217858.martinedwards.co.uk

I could create a fake page there, mimicking the BBC but with a notice saying you needed to update some software to watch a video. Of course, the ‘update’ would actually be malware!
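The difference between a strange-looking but genuine subdomain and a trick one can be expressed as a short test: the domain name must either equal the expected domain or end with a dot followed by it. The Python sketch below illustrates that idea. The helper belongs_to is hypothetical, and it assumes https:// when the protocol is missing.

```python
from urllib.parse import urlsplit


def belongs_to(url: str, expected_domain: str) -> bool:
    """True if the URL's domain name is expected_domain or a subdomain of it."""
    if "://" not in url:
        url = "https://" + url  # assumed, so urlsplit can find the domain
    hostname = (urlsplit(url).hostname or "").lower()
    expected = expected_domain.lower()
    # The leading dot matters: it stops 'notapple.com' passing as 'apple.com'.
    return hostname == expected or hostname.endswith("." + expected)


trick = "bbc.co.uk-news-health-39217858.martinedwards.co.uk"

print(belongs_to("secure-appldnld.apple.com/itunes12/", "apple.com"))  # True
print(belongs_to(trick, "bbc.co.uk"))                                  # False
print(belongs_to(trick, "martinedwards.co.uk"))                        # True
```

Checking the end of the name rather than the start is the code equivalent of reading right to left: whatever appears at the front of the address has no bearing on who owns it.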

Redirects

A URL that looks suspicious at first might actually be a redirect to a genuine site. For example, if you’re on John Lewis’s mailing list, the emails you receive might contain links to promotions like this:

johnlewis.us13.list-manage.com/track/click?u=eef5926

This is authentic. The domain list-manage.com is used by the email marketing company Mailchimp to track subscribers clicking links, to help its clients monitor the success of their campaigns.

Because you can’t tell where a redirect will go, you need to wait until you arrive at the destination before checking the URL in your browser.

Shorteners

Some companies use URL shorteners to make URLs that are more compact and easier to communicate. These are technically just redirects, but deserve a special mention. This example uses Twitter’s shortener to redirect to Gordon Buchanan’s documentary about wolves on iPlayer:

t.co/Rk466PgT3r

And this one uses Microsoft’s shortener to redirect to the much longer URL of the Windows Update Troubleshooter. It’s much more convenient for their technicians to dictate over the phone:

aka.ms/wudiag
