Everyday Security in an Online World

Understand URLs

You might not know it’s called a URL, but you’ve almost certainly seen one:

https://en.wikipedia.org/wiki/Hamster

A uniform resource locator (URL) most commonly identifies a website or a particular page or action within a website. The example above identifies the article on hamsters in the English-language Wikipedia.

In the early days of the World Wide Web, it was common to bring up a desired page by typing its URL into your browser. You’d have learned the URL somewhere ‘offline’ like a magazine or on the radio: “…for all things Ambridge visit BBC dot co dot UK slash Radio 4”.

Nowadays we mostly find the pages we want by typing key words or the name of a place, company or product into a search engine like Google or Bing. But behind every web page is a URL that uniquely identifies it — and it’s that uniqueness that makes checking the URL just about the best way to tell whether a page is genuine.

Depending on your device and browser, you’ll see the URL of the current page either at the top or at the bottom of your screen.

Part 1: http or https

The first part of a URL is either http or https followed by a colon and two slashes:

https://en.wikipedia.org/wiki/Hamster

https means your interaction with the website uses encryption to protect against eavesdropping or interference by your Internet provider, employer, government — or just someone else on the cafe Wi‑Fi.

Most sites now use https, so your browser may no longer show this part. Instead, an encrypted connection will likely be indicated by a padlock, while plain old http will carry a warning like “not secure”.

But the padlock has nothing to do with whether a page is genuine. A bogus site can just as easily use an encrypted connection. As an analogy, it’s no consolation knowing a conversation is private if the person you’re talking to isn’t who you think they are.

In future, browsers may remove the padlock symbol as well as hiding the https:// part of the URL, because encrypted connections will be universal.

Part 2: the domain name

This is the important bit in terms of identifying the legitimacy of a web page. The domain name continues until either a slash / or the end of the URL, whichever comes first. In our example, the domain name is en.wikipedia.org:

en.wikipedia.org/wiki/Hamster
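
If you like to see rules spelt out mechanically, here is a minimal Python sketch of the one above, using the standard library's urllib.parse module. The function name domain_of is my own invention for illustration, and real browsers handle many more edge cases than this does.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the domain name part of a URL, in lower case."""
    # urlsplit only recognises the domain when the URL has a scheme
    # (like https://) or starts with //, so add // to bare addresses.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    return urlsplit(url).hostname or ""


print(domain_of("https://en.wikipedia.org/wiki/Hamster"))  # en.wikipedia.org
print(domain_of("en.wikipedia.org/wiki/Hamster"))          # en.wikipedia.org
```

Both calls give the same answer because everything from the first single slash onwards is ignored.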

Computers actually read domain names from right to left, separating them at the dots. Usually the rightmost part, or the rightmost two parts, indicate a country or type of organisation. For example:

co.uk (UK, commercial)
org (organisation, typically non-profit)
fr (France)

The next part to the left is typically the name of the organisation:

vodafone
unicef
renault

There are exceptions, like diy.com which is the website of do-it-yourself retailer B&Q.

Combining these parts, we get complete domain names:

vodafone.co.uk
unicef.org
renault.fr

These are the best indicators of the legitimacy of a web page.
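
The right-to-left reading can be sketched in a few lines of Python. Treat this as an illustration only: KNOWN_ENDINGS is a tiny hand-picked set I have assumed for the example, whereas real software consults the full Public Suffix List, and main_domain is a made-up name.

```python
# Assumption for illustration: a tiny, hand-picked set of endings.
# Real software uses the Public Suffix List, with thousands of entries.
KNOWN_ENDINGS = {"co.uk", "org", "fr", "com"}


def main_domain(hostname: str) -> str:
    """Return the organisation's name plus its ending, e.g. 'unicef.org'."""
    labels = hostname.lower().rstrip(".").split(".")
    # Work from the right, trying a two-part ending before a one-part one.
    for size in (2, 1):
        ending = ".".join(labels[-size:])
        if ending in KNOWN_ENDINGS and len(labels) > size:
            # Keep the ending plus the single part to its left.
            return ".".join(labels[-(size + 1):])
    raise ValueError(f"unrecognised ending in {hostname!r}")


for host in ("vodafone.co.uk", "unicef.org", "en.wikipedia.org"):
    print(host, "->", main_domain(host))
```

Anything further to the left than the organisation's name, such as the 'en' in en.wikipedia.org, is dropped.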

If the domain name matches what you’d expect of the organisation or individual whose website you’re visiting, that’s a very good indicator that you can proceed with confidence to enter personal information, log in, download software — or anything else where being on the correct website is particularly important.

If it doesn’t match, stop and think: am I about to fall for a scam?

Subdomains

Many organisations use subdomains to more easily manage large websites. In our earlier example, ‘en’ may be referred to as a subdomain; it identifies the English version of Wikipedia:

en.wikipedia.org/wiki/Hamster

Here’s another example:

blogs.unicef.org

But remember: read right to left. The part that matters is immediately before the first single slash or, if there is no slash, the end of the URL:

blogs.unicef.org
blogs.unicef.org/blog/ukraines-water-heroes/

Misspelt domains

Fake or malicious sites might use a misspelling of a genuine domain name:

vodaf0ne.co.uk (number zero where letter ‘o’ should be)
uncief.org (two letters swapped around)
renalt.fr (missing a letter)

Fraudsters operating such sites may not need to actively entice victims through phishing; they may simply rely on people hitting the wrong letters on their keyboard. The practice of holding onto misspelt domains for nefarious ends is called typosquatting.

Because domain names can now include characters from alphabets other than the Latin A to Z used in English, it’s possible to find even subtler examples of misspelt domains. For example, in 2017 a researcher discovered that someone had registered the domain adoḅe.com and was using it to distribute malware disguised as Adobe Flash Player.

In some languages, a dot can be placed above or below a letter to indicate pronunciation, like a long vowel. Other languages’ alphabets include characters with a dot as letters in their own right. The lowercase ‘b’ with a dot underneath found in this example is a letter of the alphabet used by the Kalabari language in Nigeria.

This is just one of numerous examples of what’s known as a homograph attack. It’s not something you’d type by mistake; the malicious link would be spread by phishing or other means.
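
A lookalike character is hard to spot by eye but easy for a program. The Python sketch below lists every character in a domain name that falls outside plain ASCII, then prints the ASCII 'xn--' form in which such names are stored in the domain name system. The function name unusual_characters is my own, and I have written the dotted 'b' as the escape \u1e05 on the assumption that the fake domain used that exact character.

```python
import unicodedata


def unusual_characters(domain: str) -> list[str]:
    """Describe every character in a domain that is not plain ASCII."""
    return [
        f"{ch!r} is {unicodedata.name(ch, 'an unnamed character')}"
        for ch in domain
        if not ch.isascii()
    ]


# Assumption: the lookalike letter is U+1E05, written here as an escape.
lookalike = "ado\u1e05e.com"

print(unusual_characters("adobe.com"))  # nothing unusual to report
print(unusual_characters(lookalike))    # names the dotted letter
# Convert to the ASCII form used behind the scenes.
print(lookalike.encode("idna").decode("ascii"))
```

Seeing 'xn--' in a domain is not proof of foul play, since plenty of genuine sites use non-Latin alphabets, but it is a cue to look more closely.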

Bonus knowledge

If you’ve had enough of URLs for one day, do finish here, happy knowing you’re equipped with the knowledge to avoid the vast majority of online scams.

For completeness, though, it would be foolish of me not to detail four caveats that can make URLs harder to interpret, or their authenticity harder to judge, in certain cases.

Strange-looking subdomains

Sometimes you’ll see URLs where the domain name is a bit more obscure, like this:

secure-appldnld.apple.com/itunes12/

Is this the real Apple website? Yes! Check the rightmost part of the domain – remember, just before the first slash – and you’ll see it’s apple.com:

secure-appldnld.apple.com/itunes12/

Apple has simply chosen to name a subdomain ‘secure-appldnld’.

Trick subdomains

On the other hand, a crafty bogus site might use a subdomain in this fashion:

bbc.co.uk-news-health-39217858.martinedwards.co.uk

Is this the real BBC website? No! At a glance, it looks like an article in the News > Health section, but there’s no slash after bbc.co.uk, so the domain name continues, in this case to the end of the address. The ‘bbc.co.uk-news-health-39217858’ part is just an elaborate subdomain which, if it were real, would probably resolve to my own website:

bbc.co.uk-news-health-39217858.martinedwards.co.uk

I could create a fake page there, mimicking the BBC but with a notice saying you needed to update some software to watch a video. Of course, the ‘update’ would actually be malware!
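
The difference between a strange-looking subdomain and a trick one comes down to which end of the domain name you check. Here is a minimal Python sketch of that test; belongs_to is a hypothetical helper I have named for the example, and it assumes you already know the genuine domain you expect.

```python
def belongs_to(hostname: str, expected: str) -> bool:
    """True if hostname is `expected` itself or one of its subdomains."""
    hostname = hostname.lower().rstrip(".")
    expected = expected.lower()
    # Compare the END of the name, and insist on a dot before it so that
    # 'notapple.com' is not mistaken for a subdomain of 'apple.com'.
    return hostname == expected or hostname.endswith("." + expected)


trick = "bbc.co.uk-news-health-39217858.martinedwards.co.uk"

print(belongs_to("secure-appldnld.apple.com", "apple.com"))  # True
print(belongs_to(trick, "bbc.co.uk"))                        # False
print(trick.startswith("bbc.co.uk"))                         # True
```

The last line shows why the trick works: a check that reads from the left, as our eyes tend to, is fooled, whereas the check from the right is not.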

Redirects

A URL that looks suspicious at first might actually be a redirect to a genuine site. For example, if you’re on John Lewis’s mailing list, the emails you receive might contain links to promotions like this:

johnlewis.us13.list-manage.com/track/click?u=eef5926

This is authentic. The domain list-manage.com is used by the email marketing company Mailchimp to track subscribers clicking links, to help its clients learn about their customers.

Because you can’t tell where a redirect will go, you need to wait until you arrive at the destination before checking the URL in your browser.

Shorteners

Some companies use URL shorteners to make URLs that are more compact and easier to communicate. These are technically just redirects, but deserve a special mention. This example uses Twitter’s shortener to redirect to Gordon Buchanan’s documentary about wolves on iPlayer:

t.co/Rk466PgT3r

And this one uses Microsoft’s shortener to redirect to the much longer URL of the Windows Update Troubleshooter. The short version is much more convenient for their technicians to dictate over the phone:

aka.ms/wudiag
