No link left behind

From October the 17th GOV.UK will replace Directgov and Business Link as the best place to find government services and information, but what will that actually mean for people following links to these sites, or visiting bookmarked pages?

URLs – links and Web addresses – are the ‘strands’ in the Web metaphor. When URLs change, the strands break.

The advantages and implications of preserving links have been nicely explained by the inventor of the Web, Sir Tim Berners-Lee, in his essay “Cool URIs don’t change”. But preserving URLs isn’t just about being a good citizen of the Web; it’s about putting users first.

Links in the wild

Directgov URLs abound on posters in Job Centres, official government stationery, mugs and even calculators handed out over the years. The same is true for Business Link, which has been amassing inbound links to its collection of Web sites since 2004.

It must be Gov ..

All of these are likely to be found out in the wild for some time to come. For example, there’s a Directgov URL direct.gov.uk/workplacepensions in the current Workplace Pensions campaign:

Somewhat ironically, that video has been deleted from YouTube!

When GOV.UK is released we don’t want people to visit these and find broken or abandoned websites. We also want to make sure that departments don’t have to throw away their old stationery or get every leaflet reprinted simply because the link has changed.

So, from October the 17th people following links or bookmarks to Directgov and Business Link will be automatically redirected to an appropriate page on GOV.UK.

It must be GOV

GOV.UK has been live (as a beta) for 10 months, so the launch on the 17th isn’t so much about releasing new software; it’s about removing the beta warnings and decommissioning a number of existing Web sites.

Redirection Day

Many organisations decide that redirecting sites is an onerous task, so they either redirect all the old links to the front page of the new site, or simply switch the site off in its entirety. We’re not doing that. Instead we’ve created individual redirections for each and every page on Directgov and Business Link to an equivalent page on GOV.UK.

This means that someone looking for the Directgov Bank Holidays page will be taken straight to the new GOV.UK Bank Holidays page.

Proxy

To ensure we redirect as many URLs as possible, we’ve collected log files from the many machines which run Directgov and Business Link.

The logs themselves are reasonably large; a single day from one Directgov server contains in the order of 25 million lines, containing 9 million distinct URLs. These URLs have proven to be useful test data: we replay them against our redirection service as a part of our continuous integration environment.
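As a rough illustration of the idea, the sketch below pulls the distinct request paths out of a few log lines and replays them through a lookup function, tallying the status codes that come back. It assumes the logs are in Common Log Format, which the post does not confirm; `distinct_paths`, `replay`, `stub_lookup` and the sample log lines are all hypothetical names and data, and the stub stands in for a real HTTP request to the redirection service.

```python
import re
from collections import Counter

# Matches the request part of a Common Log Format line, e.g. "GET /path HTTP/1.1".
# The log format is an assumption; the post does not describe it.
REQUEST = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def distinct_paths(log_lines):
    """Return the distinct request paths found in the log, in first-seen order."""
    seen = {}
    for line in log_lines:
        match = REQUEST.search(line)
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


def replay(paths, lookup):
    """Call lookup(path) for every path and tally the status codes returned."""
    return Counter(lookup(path) for path in paths)


# Made-up sample data, for illustration only.
sample_log = [
    '10.0.0.1 - - [01/Oct/2012:09:00:00 +0000] "GET /en/index.htm HTTP/1.1" 200 5120',
    '10.0.0.2 - - [01/Oct/2012:09:00:01 +0000] "GET /taxdisc HTTP/1.1" 302 0',
    '10.0.0.3 - - [01/Oct/2012:09:00:02 +0000] "GET /en/index.htm HTTP/1.1" 200 5120',
]


def stub_lookup(path):
    # Stand-in for a real HTTP request to the redirection service.
    return 301 if path == "/taxdisc" else 404


paths = distinct_paths(sample_log)
tally = replay(paths, stub_lookup)
print(paths, dict(tally))
```

In a continuous integration run, a tally like this could be used to fail the build when too many replayed URLs come back without a redirect.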

Map all the things

What we then do is map those URLs: create a table that matches an old URL with a new URL, and assigns an HTTP status code to each so we know how to treat each one.
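A table like that can be pictured as a simple lookup keyed on the old URL. The following is a minimal sketch rather than the real implementation: the `Mapping` type, the `resolve` function and the two example entries are hypothetical, and it assumes 301 is used for pages that have moved and 410 for pages that have gone.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    status: int             # 301 = moved permanently, 410 = gone
    new_url: Optional[str]  # destination for redirects, None otherwise


# Hypothetical entries; the real table and its URLs are not shown in the post.
MAPPINGS = {
    "/en/old-bank-holidays": Mapping(301, "https://www.gov.uk/bank-holidays"),
    "/en/retired-page": Mapping(410, None),
}


def resolve(old_path):
    """Return (status, headers) for an old path; unmapped paths get a 404."""
    mapping = MAPPINGS.get(old_path)
    if mapping is None:
        return 404, {}
    if mapping.status == 301:
        return 301, {"Location": mapping.new_url}
    return mapping.status, {}


print(resolve("/en/old-bank-holidays"))
print(resolve("/en/retired-page"))
```

Keeping the status code alongside each destination means the same table can drive both the redirects and the pages that are deliberately retired.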

Assembling the mappings and ensuring they redirect people to the correct place has been an enormous undertaking, involving hundreds of people across dozens of organisations. For each redirection we have to collectively decide on which user-need is being met by the old page, and how best to satisfy the need on GOV.UK.

Some pages are straightforward to map, but in some cases a mapping may not be obvious. This is particularly true for pages which cover multiple topics, or serve to help people navigate several pieces of content. People might have bookmarked these pages for any number of reasons, so we’ve reached a decision based partly on data about how the sites are used, and partly on the judgement of the team at GDS and beyond.

Departments were given tools to review the mappings side-by-side onscreen, so we could get feedback straight away if the mapping appeared incorrect.

Not every road leads to GOV.UK

As Etienne mentioned earlier this week, there are some pages for which no user need has been identified on GOV.UK.

In those cases we’re going to show users a notice that tells them what has happened to the page, and offers them a link to a copy of that page on the National Archives website. In some cases we might also show users a suggested link to the canonical source – another website that meets the user need represented by the old page.
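One way to picture such a notice is a small function that builds a “gone” response with a link to the archived copy and, optionally, a suggested alternative. This is only a sketch: `gone_page`, the wording of the notice and the `ARCHIVE_PREFIX` URL pattern are assumptions made for illustration, not details taken from the real service.

```python
from html import escape

# Assumed archive URL pattern; the post does not give the real one.
ARCHIVE_PREFIX = "http://webarchive.nationalarchives.gov.uk/"


def gone_page(old_url, suggested_url=None):
    """Return (status, html) for a page with no equivalent on GOV.UK."""
    archive_url = ARCHIVE_PREFIX + old_url
    parts = [
        "<h1>This page is no longer available</h1>",
        '<p><a href="%s">View a copy in the National Archives</a></p>'
        % escape(archive_url, quote=True),
    ]
    if suggested_url:
        # Only some pages get a suggested link to another site.
        parts.append(
            '<p>You may also find <a href="%s">this site</a> useful.</p>'
            % escape(suggested_url, quote=True)
        )
    return 410, "\n".join(parts)


status, body = gone_page("http://www.direct.gov.uk/example")
print(status)
print(body)
```

Returning a 410 rather than a redirect tells browsers and search engines that the page was removed on purpose, while the archive link keeps the old content reachable.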

We’re confident that all the effort, hard work and collective attention to detail put in by so many people will prove to have been worthwhile. Come October the 17th, rather than waking up to a broken Web, you’ll find it just that little bit better.

41 comments

  1. This is a really refreshing attitude – and you are to be commended for it.

    One question – when someone is redirected, will they see a banner saying “You are here because…”?
    I can imagine that someone who has been faithfully following a specific link for the last 10 years may think that the site is broken, or they’re being phished etc.

    Which, I guess, leads to a supplementary question – is the above also true for files as opposed to web pages?

    Keep up the good work.

  2. Does that mean 1st party cookies that are placed by directgov etc. sites will still be there? Have you mentioned that in your cookie policy and will citizens be able to opt-out of the user identifying ones?

      1. Using this code allows us to distinguish between redirections which don’t yet have a destination URI, and missing mappings to our CI environment. Sadly we don’t ship if we’ve 418s left in our configuration, even though it’s a perfectly legitimate use of a fun 4xx status code.

  3. Terence: GOV.UK does have a “formerly DirectGov or Business Link” notice on the page, but we are concerned about people worrying about being phished. We did consider a staging page for the redirection, but that fared poorly in “guerrilla” user-testing and can have an adverse impact on search engines following links. I personally think the shorter, cleaner http://www.gov.uk domain will help that, greatly.

    As for “files”, we’re concerned not to break applications which may rely upon DirectGov or Business Link assets, such as images, JavaScript, CSS and PDFs, so we’ll continue to serve many of those for some time to come.

    DEFRA aren’t a part of the October launch, but many departments are moving to GOV.UK as a part of the Inside Government programme.

    David, dh.gov.uk isn’t a part of the GOV.UK launch. The DirectGov and Business Link sites have been repeatedly indexed by The National Archives, so links to the webarchive will continue to work, and you’ll be able to find old copies of any pages we redirect or remove.

    Mike, your browser won’t pass cookies from direct.gov.uk to http://www.gov.uk thanks to “Same Origin Policy”. A lot of thought lies behind GOV.UK’s implementation of cookie law, making it an exemplar. There’s a fairly clear explanation of our policy here: https://www.gov.uk/help/cookies

    1. Hi Rory,

      I think someone came along and fixed this just as you posted as I can’t see an issue with the link from this end either in the tag or on the click outcome.

      Louise

  4. Is this project the reason that the HMRC site is down for the entire weekend?

    I love the approach you’re taking to creating the new GOV.UK structure and I look forward to pulling information on completing my tax return this coming week. It’s just a shame that the HMRC site was taken down for the entire weekend on what would be the last real weekend before filing deadline for paper returns.

    Hopefully the downtime is worth it and searching the new GOV.UK site for the notes pages that are referenced in the assessment forms will actually return the notes. The current (former?) HMRC search function doesn’t return anything useful.

    1. Guy, no, the HMRC outage was planned, and not related to the DirectGov and Business Link transition to GOV.UK.

      I also noticed the outage from our continuous integration tests which were failing, as we plan to redirect a number of Business Link pages to HMRC.

      1. Thanks for the update Paul. Odd choice of weekend for a planned outage of HMRC.
        It’s great that you’re exposing the work behind the scenes on the renovation of the government portals. I had a poke around in the beta site to see how the tax disc purchasing process was changing. I don’t think the process designer owned a car as the user must click some variation of ‘buy a tax disc’ 5 times before even entering any information. Is your project tackling process improvements such as this?

  5. I take it this doesn’t apply to the Welsh language pages then.
    Just tried three old links

    http://www.direct.gov.uk/cy/index.htm (works, as it goes to the Welsh home page)

    http://www.direct.gov.uk/cy/CaringForSomeone/MoneyMatters/DG_10038111CY (redirects to a page saying ‘Directgov has been replaced by GOV.UK’ and points you to the gov.uk home page)
    http://www.direct.gov.uk/cy/HomeAndCommunity/YourlocalcouncilandCouncilTax/CouncilTax/DG_10037422CY (redirects to a page saying ‘Directgov has been replaced by GOV.UK’ and points you to the gov.uk home page)

        1. Phil, there are suggested links for pages which had a reasonable amount of inbound links or are high-traffic, but adding links to every Gone page would have demanded even more discussion across departments, time better spent on creating good mappings for pages which contain information where government is the canonical source.

          Also, the government citing another source demands great care and attention to detail which will need to be maintained over time as suggested sites come and go.

          Anointing one site when there are many easily discovered alternatives also introduces risks. The reductio ad absurdum of this issue is a page with greener advice for holding a BBQ, where we could have linked to any one of a number of different NGOs, or garden centres, all of whom could legitimately claim to be the best source of such advice.

        1. Well the title is a snowclone on “no marine left behind” which wasn’t strictly true, either, but I think linking to the original in the National Archives for pages where there is no clear user-need on a government site, means it isn’t left behind.

  6. Loving this blog, and being able to follow the story of, and approach taken to, the development of gov.uk – it’s been great to watch. Just to provide a bit of citizen-provided QA on this re-direct topic, the directgov tax disc URL (http://www.direct.gov.uk/taxdisc) shown on the middle image in “Links in the wild” in this post isn’t re-directing to gov.uk currently, and is ending up on a directgov page here: https://www.taxdisc.direct.gov.uk/EvlPortalApp/app/home/intro.

    1. Thanks Chris, that link is now pointing directly at the application for a taxdisc, rather than the old DirectGov page which then linked to the DVLA application.

      Confusingly taxdisc remains on a subdomain of direct.gov.uk — the post originally explained how there would be some orange on transactions for some time to come, but was edited for length. Sorry for the confusion this may have caused.

  7. My only comment could be considered a rather pedantic one, but it is worth making. As the article itself talks about using redirects from the old sites to replacement pages, the overuse of ‘URL’ is incorrect, as these web addresses do not point to where an object can actually be found. Rather, they point to a resource that may or will redirect you to a relevant resource for the topic you are looking for. Therefore the article should really be discussing URIs and not URLs, as the web addresses are not physical locations of a resource. Most servers these days use redirects, URI rewriting modules or directory aliasing to some extent, making the term ‘URL’ incorrect for most web address use, and the IEEE has for some time now been advising the adoption of URI as the default term, although not many actually pay any attention to it.

      1. Hi Paul given that universities rarely get this correct I am happy to forgive you. Just it would be good to see the move towards the right use of terminology as in a lot of topics the common parlance tends to take ownership and turn things into buzzwords.
