Understand URLs
- Every web page has a URL. Checking the URL is one of the best ways to
tell whether a page is genuine — so you can avoid phishing, malware and
other scams.
- The important bit is the domain name, and especially its rightmost part:
the part just before the first single slash or, if there are no single
slashes, at the very end of the URL.
The role of the URL
You might not know it’s called a URL, but you’ve almost certainly seen
one:
https://en.wikipedia.org/wiki/Hamster
A uniform resource locator (URL) identifies a website or a
particular page or action within a website. The example above identifies the
Wikipedia article on hamsters.
In the early days of the World Wide Web, it was common to request a page by
typing its URL into your browser. You’d have learned the URL somewhere
‘offline’, such as in a magazine or on the radio.
Nowadays we mostly find pages by typing key words or the name of a place,
company or product into a search engine like Google or Bing. But each result
still has a URL that uniquely identifies it.
Depending on your device and browser, you’ll see the URL of the current
page either at the top or bottom of the screen.
The protocol
Most web URLs begin with https followed by a colon and
two slashes:
https://en.wikipedia.org/wiki/Hamster
This indicates the protocol and, while historically
significant, it’s usually hidden nowadays.
The domain name
This is the important bit in terms of identifying the legitimacy of a page.
The domain name continues until either a slash / or the end
of the URL, whichever comes first. In our example, the domain name is
en.wikipedia.org:
en.wikipedia.org/wiki/Hamster
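If you ever want a script to pick out the domain name rather than doing it by eye, the rule above can be sketched in a few lines of Python using the standard library’s urllib.parse module. The function name domain_of is made up for this illustration, and the sketch assumes the input is a reasonably well-formed web address.

```python
from urllib.parse import urlsplit

def domain_of(url):
    """Return the domain name (hostname) part of a URL, in lowercase."""
    # urlsplit only recognises the domain when the URL has a protocol or
    # starts with two slashes, so add the slashes if they are missing.
    if "://" not in url:
        url = "//" + url
    return urlsplit(url).hostname

# With or without the protocol, the domain name is the same.
print(domain_of("https://en.wikipedia.org/wiki/Hamster"))
print(domain_of("en.wikipedia.org/wiki/Hamster"))
```

Both calls print en.wikipedia.org: everything from the first single slash onwards is ignored.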
Computers actually read domain names from right to left, separating them at
the dots. Usually the rightmost part, or the rightmost two parts together,
indicate a country or type of organisation, for example:
- co.uk (UK, commercial)
- org (non-profit organisation)
- fr (France)
The next part to the left is typically the name of the organisation:
- vodafone
- unicef
- renault
There are exceptions, like diy.com, which is the
website of do-it-yourself retailer B&Q.
Combining these parts, we get complete domain names:
- vodafone.co.uk
- unicef.org
- renault.fr
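To make the right-to-left reading concrete, here is a rough Python sketch that splits a domain name into the organisation part and the country or type part. The KNOWN_SUFFIXES set is a tiny hand-written assumption covering only the examples in this article; real software normally relies on the much longer Public Suffix List instead. The name split_domain is hypothetical too.

```python
# Assumption: a tiny hand-picked set of endings, just for this illustration.
KNOWN_SUFFIXES = {"co.uk", "org", "fr", "com"}

def split_domain(hostname):
    """Read a domain name right to left; return (organisation, suffix)."""
    labels = hostname.lower().split(".")
    # Try the two-part ending first (co.uk), then the one-part ending (org).
    for size in (2, 1):
        suffix = ".".join(labels[-size:])
        if suffix in KNOWN_SUFFIXES and len(labels) > size:
            return labels[-size - 1], suffix
    return None  # ending not in our small set

print(split_domain("vodafone.co.uk"))    # ('vodafone', 'co.uk')
print(split_domain("unicef.org"))        # ('unicef', 'org')
print(split_domain("en.wikipedia.org"))  # ('wikipedia', 'org')
```

Anything further to the left of the organisation name, like ‘en’ in the last example, is simply ignored by this sketch.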
These are the best indicators of the legitimacy of a web page.
If the domain name matches what you’d expect of the website you’re
visiting, that’s a very good indicator that you can proceed with confidence
to enter personal information, log in, download software — or anything else
where being on the correct website is particularly important.
If it doesn’t match, stop and think: am I about to fall for a scam?
Subdomains
Many organisations use subdomains to more easily manage large websites. In our earlier example, ‘en’ may be referred to as a subdomain; it identifies the English version of Wikipedia:
en.wikipedia.org/wiki/Hamster
Here’s another example:
blogs.unicef.org
But remember: read right to left. The part that matters is immediately before the first single slash or, if there is no slash, the end of the URL:
blogs.unicef.org
blogs.unicef.org/blog/ukraines-water-heroes/
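As a small illustration of why the subdomain doesn’t change the verdict, the following Python sketch checks whether a domain name is either exactly the one you expect or a subdomain of it. The helper name belongs_to is invented for this example; it assumes you already know the genuine domain you are comparing against.

```python
def belongs_to(hostname, expected_domain):
    """True if hostname is expected_domain itself or one of its subdomains."""
    hostname = hostname.lower()
    expected_domain = expected_domain.lower()
    # The leading dot matters: without it, "notunicef.org" would pass.
    return hostname == expected_domain or hostname.endswith("." + expected_domain)

print(belongs_to("blogs.unicef.org", "unicef.org"))  # True
print(belongs_to("unicef.org", "unicef.org"))        # True
print(belongs_to("notunicef.org", "unicef.org"))     # False
```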
Misspelt domains
Fake or malicious sites might use a misspelling of a genuine domain name:
- vodaf0ne.co.uk (number zero where letter ‘o’ should be)
- uncief.org (two letters swapped around)
- renalt.fr (missing a letter)
Fraudsters operating such sites may not need to actively entice victims through phishing; they may simply rely on people hitting the wrong letters on their keyboard. The practice of holding onto misspelt domains for nefarious ends is called typosquatting.
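One way a program might flag near-misses like these is to measure how similar a domain is to a genuine one. The sketch below uses Python’s standard difflib module; the function name looks_like_typo and the 0.85 threshold are assumptions chosen for this illustration, not an established standard, and a real checker would need tuning to avoid false alarms.

```python
from difflib import SequenceMatcher

def looks_like_typo(candidate, genuine, threshold=0.85):
    """True if candidate is very similar to genuine without being identical."""
    candidate, genuine = candidate.lower(), genuine.lower()
    if candidate == genuine:
        return False  # an exact match is the real thing, not a typo
    similarity = SequenceMatcher(None, candidate, genuine).ratio()
    return similarity >= threshold

print(looks_like_typo("vodaf0ne.co.uk", "vodafone.co.uk"))  # True
print(looks_like_typo("uncief.org", "unicef.org"))          # True
print(looks_like_typo("renalt.fr", "renault.fr"))           # True
print(looks_like_typo("unicef.org", "unicef.org"))          # False
```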
Now that domain names can include characters from alphabets other than the Latin A to Z used in English, it’s possible to find even subtler examples of misspelt domains. For example, in 2017 a researcher discovered that someone had registered the domain adoḅe.com and was using it to distribute malware disguised as Adobe Flash Player.
In some languages, a dot can be placed above or below a letter to indicate pronunciation, like a long vowel. Other languages’ alphabets include characters with a dot as letters in their own right. The lowercase ‘b’ with a dot underneath found in this example is a letter of the alphabets used by the Kalabari language in Nigeria.
This is just one of numerous examples of what’s known as a homograph attack. It’s not something you’d type by mistake; the malicious link would be spread by phishing or other means.
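Because lookalike letters are hard to spot by eye, one rough check is to ask whether a domain name contains anything beyond plain ASCII characters, and to convert it to the ASCII-only form used behind the scenes. The Python sketch below uses the built-in idna codec; the function names are invented for this illustration, and it’s worth stressing that plenty of perfectly legitimate domains use non-ASCII letters, so this is a prompt to look closer rather than proof of a scam.

```python
def has_non_ascii(hostname):
    """True if the domain name contains characters outside plain ASCII."""
    return not hostname.isascii()

def ascii_form(hostname):
    """Convert a domain name to its ASCII-only form using the idna codec."""
    return hostname.encode("idna").decode("ascii")

# "\u1e05" is the lowercase 'b' with a dot underneath from the example above.
lookalike = "ado\u1e05e.com"

print(has_non_ascii("adobe.com"))  # False
print(has_non_ascii(lookalike))    # True
# The ASCII form of the lookalike starts with "xn--", which gives it away.
print(ascii_form(lookalike))
```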
Bonus knowledge
If you’ve had enough of URLs for one day, do finish here, happy knowing you’re equipped with the knowledge to avoid the vast majority of online scams.
For completeness, though, it would be remiss of me not to detail four caveats that can make it harder to interpret a URL, or to judge its authenticity, in certain cases.
Strange-looking subdomains
Sometimes you’ll see URLs where the domain name is a bit more obscure, like this:
secure-appldnld.apple.com/itunes12/
Is this the real Apple website? Yes! Check the rightmost part of the domain – remember, just before the first slash – and you’ll see it’s apple.com:
secure-appldnld.apple.com/itunes12/
Apple has simply chosen to name a subdomain ‘secure-appldnld’.
Trick subdomains
On the other hand, a crafty bogus site might use a subdomain in this fashion:
bbc.co.uk-news-health-39217858.martinedwards.co.uk
Is this the real BBC website? No! At a glance, it looks like an article in the News > Health section, but there’s no slash after bbc.co.uk, so the domain name continues, in this case to the end of the address. It’s an elaborate subdomain which, if it were real, would probably resolve to my own website:
bbc.co.uk-news-health-39217858.martinedwards.co.uk
I could create a fake page there, mimicking the BBC but with a notice saying you needed to update some software to watch a video. Of course, the ‘update’ would actually be malware!
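The trick works because people read left to right. A naive program would fall for it in the same way, as this Python sketch shows: checking how the domain name starts is fooled, while comparing the dot-separated parts from the right is not. Both function names are hypothetical.

```python
def naive_check(hostname, expected):
    """Misleading: only looks at how the domain name starts."""
    return hostname.lower().startswith(expected.lower())

def right_to_left_check(hostname, expected):
    """Compare the dot-separated parts, counting from the right-hand end."""
    host_parts = hostname.lower().split(".")
    expected_parts = expected.lower().split(".")
    return host_parts[-len(expected_parts):] == expected_parts

trick = "bbc.co.uk-news-health-39217858.martinedwards.co.uk"

print(naive_check(trick, "bbc.co.uk"))                    # True (fooled)
print(right_to_left_check(trick, "bbc.co.uk"))            # False
print(right_to_left_check(trick, "martinedwards.co.uk"))  # True
```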
Redirects
A URL that looks suspicious at first might actually be a redirect to a genuine site. For example, if you’re on John Lewis’s mailing list, the emails you receive might contain links to promotions like this:
johnlewis.us13.list-manage.com/track/click?u=eef5926
This is authentic. The domain list-manage.com is used by the email marketing company Mailchimp to track subscribers clicking links, to help its clients learn about their customers.
Because you can’t tell where a redirect will go, you need to wait until you arrive at the destination before checking the URL in your browser.
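The same ‘wait until you arrive’ idea can be sketched in Python: the standard library’s urlopen follows ordinary HTTP redirects and records the address it ended up at. The function name final_hostname and its opener parameter are assumptions made for this illustration. Note that this really does visit the link, so a tracking link will still register the click, and it won’t follow redirects performed by scripts on a page.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

def final_hostname(url, opener=urlopen):
    """Follow a link and return the domain name of the page it lands on.

    The url must include the protocol, for example "https://...".
    The opener parameter exists only so the function can be tried out
    without a network connection; by default it uses urlopen.
    """
    with opener(url) as response:
        # response.url is the address after any redirects were followed.
        return urlsplit(response.url).hostname
```

You would call it with the full link, protocol included, and then compare the result against the domain name you expect, exactly as you would by eye in the browser.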
Shorteners
Some companies use URL shorteners to make URLs that are more compact and easier to communicate. These are technically just redirects, but deserve a special mention. This example uses Twitter’s shortener to redirect to Gordon Buchanan’s documentary about wolves on iPlayer:
t.co/Rk466PgT3r
And this one uses Microsoft’s shortener to redirect to the much longer URL of the Windows Update Troubleshooter. The short version is much more convenient for Microsoft’s technicians to dictate over the phone:
aka.ms/wudiag