Understand URLs
- Every web page has a URL. Checking the URL is one of the best ways to
tell whether a page is genuine — so you can avoid phishing, malware and
other scams.
- The important bit is the domain name. Look at the rightmost part before
the first single slash or, if there are no slashes, the rightmost part
overall.
The role of the URL
You might not know it’s called a URL, but you’ve almost certainly seen
one:
https://en.wikipedia.org/wiki/Hamster
A uniform resource locator (URL) identifies a website or a
particular page or action within a website. The example above identifies the
Wikipedia article on hamsters.
In the early days of the World Wide Web, it was common to request a page by
typing its URL into your browser. You’d have learned the URL somewhere
‘offline’ like in a magazine or on the radio.
Nowadays we mostly find pages by typing key words or the name of a place,
company or product into a search engine like Google or Bing. But each result
still has a URL that uniquely identifies it.
Depending on your device and browser, you’ll see the URL of the current
page either at the top or bottom of the screen.
The protocol
Most web URLs begin with https followed by a colon and two slashes:
https://en.wikipedia.org/wiki/Hamster
This indicates the protocol. While historically significant, it’s usually
hidden from view in the address bar nowadays.
The domain name
This is the important bit in terms of identifying the legitimacy of a page.
The domain name continues until either a slash / or the end
of the URL, whichever comes first. In our example, the domain name is
en.wikipedia.org:
en.wikipedia.org/wiki/Hamster
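If you like to see things in code, here is a minimal Python sketch of the same split, using the standard library's urlsplit function. The variable names are my own and purely illustrative.

```python
from urllib.parse import urlsplit

url = "https://en.wikipedia.org/wiki/Hamster"
parts = urlsplit(url)

protocol = parts.scheme       # the part before the colon and two slashes
domain_name = parts.hostname  # runs until the first single slash
path = parts.path             # identifies a page within the website

print(protocol, domain_name, path)
# prints: https en.wikipedia.org /wiki/Hamster
```

Note that urlsplit only recognises the domain name when the protocol is present; given the bare text en.wikipedia.org/wiki/Hamster it treats the whole thing as a path.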
Computers actually read domain names from right to left, separating them at
the dots. Usually the rightmost one or two parts indicate a country or type
of organisation, for example:
- co.uk (UK, commercial)
- org (non-profit organisation)
- fr (France)
The next part to the left is typically the name of the organisation, such as
vodafone, unicef or renault.
There are exceptions, like diy.com which is the website of do-it-yourself
retailer B&Q.
Combining these parts, we get complete domain names:
- vodafone.co.uk
- unicef.org
- renault.fr
These are the best indicators of the legitimacy of a web page.
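Reading from right to left can be sketched in a few lines of Python. This is only an illustration: the KNOWN_ENDINGS set and the main_domain function are names I have made up, and the set holds just the handful of endings mentioned in this article. A real tool would consult the full Public Suffix List instead.

```python
# Illustrative only: a tiny, hand-picked set of endings from this article.
KNOWN_ENDINGS = {"co.uk", "org", "fr", "com"}

def main_domain(domain_name):
    """Return the organisation's name plus its ending, e.g. 'wikipedia.org'."""
    labels = domain_name.lower().split(".")
    # Work from the right, trying a two-part ending ("co.uk") before a
    # one-part ending ("org").
    for size in (2, 1):
        ending = ".".join(labels[-size:])
        if ending in KNOWN_ENDINGS and len(labels) > size:
            return ".".join(labels[-(size + 1):])
    raise ValueError(f"unrecognised ending in {domain_name!r}")

for name in ("vodafone.co.uk", "en.wikipedia.org", "renault.fr"):
    print(name, "->", main_domain(name))
```

For en.wikipedia.org this gives wikipedia.org; the other two names come back unchanged because they contain nothing to the left of the organisation's name.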
If the domain name matches what you’d expect of the website you’re
visiting, that’s a very good indicator that you can proceed with confidence
to enter personal information, log in, download software — or anything else
where being on the correct website is particularly important.
If it doesn’t match, stop and think: am I about to fall for a scam?
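The same check can be expressed as a short Python sketch. The function name is_on_expected_domain is hypothetical, and the second test address uses the reserved .example ending so that it cannot point at a real site.

```python
from urllib.parse import urlsplit

def is_on_expected_domain(url, expected_domain):
    """True if the URL's domain name is expected_domain or ends with it."""
    host = (urlsplit(url).hostname or "").lower()
    expected = expected_domain.lower()
    # The leading dot matters: without it, a name that merely ends in the
    # same letters (with no dot in front) would wrongly pass.
    return host == expected or host.endswith("." + expected)

print(is_on_expected_domain("https://en.wikipedia.org/wiki/Hamster", "wikipedia.org"))
# prints: True
print(is_on_expected_domain("https://wikipedia.example/wiki/Hamster", "wikipedia.org"))
# prints: False
```

A sketch like this only confirms that a name matches one you already trust; deciding which name to expect is still up to you.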
Subdomains
Many organisations use subdomains to more easily manage
large websites. In our earlier example, ‘en’ is a subdomain; it identifies
the English version of Wikipedia:
en.wikipedia.org/wiki/Hamster
Here’s another example:
blogs.unicef.org
But remember: read right to left. The part that matters is immediately
before the first single slash or, if there is no slash, at the end of the
URL. In both of these addresses, that part is unicef.org:
blogs.unicef.org
blogs.unicef.org/blog/ukraines-water-heroes/
Misspelt domains
Fake or malicious sites might use a misspelling of a genuine name:
- vodaf0ne.co.uk (number zero where letter ‘o’ should be)
- uncief.org (two letters swapped around)
- renalt.fr (missing a letter)
Fraudsters operating such sites may not need to actively entice victims
through phishing; they may simply rely on people hitting the wrong letters
on their keyboard. The practice of holding onto subtly misspelt domains for
nefarious ends is called typosquatting.
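One rough way to spot near-misses like these is to measure how similar a name is to the genuine one. The sketch below uses Python's standard difflib module; the GENUINE list, the lookalike_of function and the 0.85 threshold are all my own assumptions, chosen only to fit the three examples above.

```python
from difflib import SequenceMatcher

# Assumption: the genuine names we want to protect, taken from this article.
GENUINE = ["vodafone.co.uk", "unicef.org", "renault.fr"]

def lookalike_of(domain_name, threshold=0.85):
    """Return the genuine name that domain_name closely resembles, if any."""
    if domain_name in GENUINE:
        return None  # an exact match is the real thing, not a lookalike
    for genuine in GENUINE:
        # ratio() is 1.0 for identical strings and falls towards 0.0
        # as they share fewer characters in the same order.
        score = SequenceMatcher(None, domain_name, genuine).ratio()
        if score >= threshold:
            return genuine
    return None

for name in ("vodaf0ne.co.uk", "uncief.org", "renalt.fr", "unicef.org"):
    print(name, "->", lookalike_of(name))
```

The first three names are flagged as lookalikes of their genuine counterparts, while unicef.org itself is not. A fixed threshold is crude, so treat this as a prompt to look more closely rather than a verdict.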
Since progress is being made to facilitate domain names in alphabets other
than the Latin A to Z used in English, it’s now possible to find even
subtler examples of misspelt domains. For example, in 2017 a researcher
discovered that someone had registered the domain adoḅe.com and was using it
to distribute malware disguised as Adobe Flash Player.
In some languages, a dot can be placed above or below a letter to indicate
pronunciation, like a long vowel. Other languages’ alphabets include
letters with a dot in their own right. The letter ‘b’ with a dot
underneath, as in this example, is in the alphabet used for the Kalabari
language in Nigeria.
Deceiving people with characters that are visually similar – or in some
cases identical, but in different alphabets – is called a homograph
attack.
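A dot that the eye might miss is easy for a computer to spot. As a minimal sketch, the hypothetical function below lists the official Unicode name of every character in a domain name that falls outside plain ASCII. I have written the dotted letter as the escape \u1e05 so there is no doubt which character is meant.

```python
import unicodedata

def unusual_characters(domain_name):
    """List the Unicode names of any characters outside plain ASCII."""
    return [
        unicodedata.name(ch, "UNKNOWN")
        for ch in domain_name
        if not ch.isascii()
    ]

# "\u1e05" is the letter b with a dot below it.
print(unusual_characters("ado\u1e05e.com"))
# prints: ['LATIN SMALL LETTER B WITH DOT BELOW']
print(unusual_characters("adobe.com"))
# prints: []
```

Bear in mind that non-ASCII characters are not suspicious in themselves: plenty of genuine domain names use them. The list is only a cue to ask whether the character belongs in the name you expected.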
Bonus knowledge
If you’ve had enough of URLs for one day, feel free to finish here, happy in
the knowledge that you’re equipped to avoid the vast majority of online
dangers.
For completeness, though, it would be wrong of me not to detail four
caveats that can make it harder to interpret a URL, or to judge its
authenticity, in certain cases.
Strange-looking subdomains
Sometimes you’ll see URLs where the domain name is a bit more obscure, like
this:
secure-appldnld.apple.com/itunes12/
Is this the real Apple website? Yes! Check the rightmost part of the domain
– remember, just before the first slash – and you’ll see it’s apple.com:
secure-appldnld.apple.com/itunes12/
Apple has simply chosen to name a subdomain ‘secure-appldnld’.
Trick subdomains
On the other hand, a crafty bogus site might use a subdomain in this
fashion:
bbc.co.uk-news-health-39217858.martinedwards.co.uk
Is this the real BBC website? No! At a glance, it looks like an article in
the News > Health section, but there’s no slash after bbc.co.uk, so the domain
name continues, in this case to the end of the address. Everything before
martinedwards.co.uk is just an elaborate subdomain which, if it were real,
would probably resolve to my own website:
bbc.co.uk-news-health-39217858.martinedwards.co.uk
I could create a fake page there, mimicking the BBC but with a notice
saying you needed to update some software to watch a video. Of course, the
‘update’ would actually be malware!
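To see why this trick works on a hurried reader but not on a careful one, compare a left-to-right check with a right-to-left one. This is a small Python sketch using the made-up address above; the variable names are mine.

```python
host = "bbc.co.uk-news-health-39217858.martinedwards.co.uk"

# Hurried check: reads left to right, as the eye naturally does.
looks_like_bbc = host.startswith("bbc.co.uk")

# Careful check: reads right to left, so only the end of the name counts.
really_bbc = host == "bbc.co.uk" or host.endswith(".bbc.co.uk")

print(looks_like_bbc, really_bbc)
# prints: True False

# Separating at the dots shows how the name really breaks down.
print(host.split("."))
# prints: ['bbc', 'co', 'uk-news-health-39217858', 'martinedwards', 'co', 'uk']
```

The third piece, uk-news-health-39217858, gives the game away: the hyphens glue the fake article path onto what looks like the BBC's name.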
Redirects
A URL that looks suspicious at first might actually be a
redirect to a genuine site. For example, if you’re on John
Lewis’s mailing list, the emails you receive might contain links to
promotions like this:
johnlewis.us13.list-manage.com/track/click?u=eef5926
This is authentic. The domain list-manage.com is used by the email
marketing company Mailchimp to track subscribers clicking links, to help its
clients monitor the success of their campaigns.
Because you can’t tell where a redirect will go, you need to wait until you
arrive at the destination before checking the URL in your browser.
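For the curious, a program can do the same waiting for you. The sketch below uses urlopen from Python's standard library, which follows HTTP redirects automatically, and then reports the domain name it ended up at. The function name destination_domain and its opener parameter are my own inventions; the parameter exists only so the function can be tried out without a network connection. Be aware that calling it really does visit the link.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

def destination_domain(url, opener=urlopen):
    """Follow any redirects and return the domain name of the final page.

    `opener` is a hypothetical hook for testing; by default it is the
    standard library's urlopen, which makes a real request to the URL.
    """
    with opener(url) as response:
        # geturl() gives the address actually retrieved, after redirects.
        return urlsplit(response.geturl()).hostname
```

You would call it with a full address, protocol included, and then compare the result with the domain name you expected, exactly as you would in the browser.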
Shorteners
Some companies use URL shorteners to make URLs that are
more compact and easier to communicate. These are technically just
redirects, but deserve a special mention. This example uses Twitter’s
shortener to redirect to Gordon Buchanan’s documentary about wolves on
iPlayer:
t.co/Rk466PgT3r
And this one uses Microsoft’s shortener to redirect to the much longer URL
of the Windows Update Troubleshooter. It’s much more convenient for their
technicians to dictate over the phone:
aka.ms/wudiag