
Y2K FAILURE CURVE VINDICATED? - UK passport problems

Westergaard 2000

 

 

Y2K FAILURE CURVE VINDICATED?

 

By Ian Hugo August 25, 1999

 

At the beginning of this year, I attempted a reasoned approach to

predicting what might happen later in the year as a result of the Year

2000 problem. These thoughts and prognostications, a kind of Old Hugo's

Almanac for the Millennium, are published on the Taskforce 2000 website

at www.taskforce2000.co.uk under Articles (Predicting Year 2000

Disruption); you can probably find the paper on various other Y2K sites

too. Since then (the thinking was actually done towards the end of last

year), I've had time to revisit my reasoning; and the passing of time

also has allowed some evidence to be gathered. So, as all would-be

soothsayers should, I'll try to re-assess how my reasoning and

predictions are bearing up.

 

MAJOR ELEMENTS OF THE REASONING

 

This is essentially a short recapitulation of the main points in the

reasoning I produced. It will serve to refresh the key points for

readers who read it then and bring new readers up to date.

 

The first and most obvious key point is that whatever level of

disruption we shall see will not be seen at the change of century

(01.01.2000). What we shall see is a failure curve, of currently unknown

dimensions but possibly predictable shape, operating over some period

before and after the century change.

 

The second key point was to make a distinction between failures of any

kind and disruptive failures. Very many failures are possible but many

also can be recovered from quite quickly. The insight here was that the

cure for the so-called Millennium Bug could be quite as dangerous as the

disease (ripping interconnected IT systems apart and putting them back

together is an inherently risky exercise). So the "implementation" stage

of Y2K projects, far from being a formal sign-off, could be the most

fraught part of the whole program.

 

Thirdly, I tried to separate potential incidents that were inherently

unpredictable, such as an overlooked chip embedded in an important

control system, from what might reasonably be predictable. The second and

third points together led me to suggest that implementation in general,

and installation of new replacement systems in particular, would be the

most likely cause of predictable disruption. Failure in such cases,

based on general experience to date, won't be a matter of hours or days

but of weeks or months.

 

Finally, as regards disruptive possibilities, I made the point that a

single failure at any given point in time would be unlikely to cause

major disruption. Companies can and do occasionally experience major

"hits," such as loss of a data center, but can recover from them within

days (to all external appearances) if sufficiently prepared. Short-term

disruption could occur from a single failure but longer-term (and

possibly terminal) failure would only occur if multiple impacts were

experienced within an overlapping time frame. The latter case I termed

"congestion."
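The idea of congestion can be put in rough code form. What follows is only a minimal sketch, not anything taken from the original paper: it treats each impact as a window of days and reports the largest number of impacts in progress at once. The function names, the threshold of two and the sample dates are all assumptions made for illustration.

```python
from datetime import date

def peak_concurrent_impacts(impacts):
    """Return the largest number of impacts in progress on any single day.

    Each impact is a (first_day, last_day) pair of dates, both inclusive.
    """
    events = []
    for first_day, last_day in impacts:
        events.append((first_day.toordinal(), 1))      # impact begins
        events.append((last_day.toordinal() + 1, -1))  # day after it ends
    active = peak = 0
    # Sorting tuples puts -1 before +1 on the same day, so an impact that
    # ends the day before another begins is not counted as overlapping.
    for _, delta in sorted(events):
        active += delta
        peak = max(peak, active)
    return peak

def is_congested(impacts, threshold=2):
    """Hypothetical rule: congestion means `threshold` or more impacts at once."""
    return peak_concurrent_impacts(impacts) >= threshold

# Illustrative dates only, not taken from any reported incident.
single_hit = [(date(1999, 6, 1), date(1999, 6, 3))]
back_to_back = [(date(1999, 6, 1), date(1999, 6, 30)),
                (date(1999, 7, 1), date(1999, 7, 31))]
overlapping = [(date(1999, 6, 1), date(1999, 7, 15)),
               (date(1999, 7, 1), date(1999, 8, 31))]

print(is_congested(single_hit), is_congested(back_to_back), is_congested(overlapping))
```

On this reading a single hit, or two hits that follow one another without overlap, stay below the threshold; only impacts that share days count as congestion.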

 

That was my analysis of the (predictable) disruptive failure potential;

the questions of where and when remained to be addressed.

 

I proposed that the most likely victims would be large organizations

that were late in starting Y2K programs, if these could be known. The

reasoning behind this assertion was simply that most large programs

comprise multiple large projects which, given sufficient time, can be

scheduled so that their completion dates are staggered within the

ultimate deadline. This allows contingency time between the scheduled

completion time of one project and that of the next. Large and late

programs have to be telescoped into the time available, with all

completion dates falling within a short period. The latter thus have

much greater potential for overlapping failures.
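The difference between a staggered and a telescoped program can be shown with a small calculation. This is a sketch under stated assumptions: the completion dates are invented, and the six-week recovery allowance is simply a stand-in for "weeks or months" rather than a measured figure.

```python
from datetime import date

def contingency_gaps(completion_dates):
    """Days between each scheduled completion date and the next one."""
    ordered = sorted(completion_dates)
    return [(later - earlier).days
            for earlier, later in zip(ordered, ordered[1:])]

def has_contingency(completion_dates, recovery_days):
    """True if every gap is at least as long as the assumed recovery time."""
    return all(gap >= recovery_days
               for gap in contingency_gaps(completion_dates))

# Hypothetical schedules: an early starter versus a large, late program.
staggered = [date(1999, 3, 31), date(1999, 6, 30), date(1999, 9, 30)]
telescoped = [date(1999, 10, 15), date(1999, 10, 29), date(1999, 11, 12)]

RECOVERY_DAYS = 42  # assumed allowance of six weeks to recover from a failed project

print(contingency_gaps(staggered), has_contingency(staggered, RECOVERY_DAYS))
print(contingency_gaps(telescoped), has_contingency(telescoped, RECOVERY_DAYS))
```

With roughly three months between completions, a failed project in the staggered schedule can be recovered before the next one lands. With two-week gaps, a failure in the telescoped schedule is still being recovered when the next project completes.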

 

As to when the beginning of the failure curve might become discernible,

I suggested the middle of this year. That prediction was based upon the

assumption that large and late programs would be in the implementation

phase as from June this year.

 

REASONING REVISITED

 

Since the time I started this line of reasoning, slightly less than a

year ago, I have had some opportunity to reflect upon it. Nothing I have

thought of since has caused me to change the line of reasoning.

 

However, I have had cause to reconsider my thoughts on timing. Just

about every organization I communicate with, even those with very well

managed programs, is late in delivery. To that extent, I believe I may

have misjudged the time at which late implementations would be

attempted and at which congestion would occur. I predicted the build-up

of congestion as beginning from June and now think that may be too

early, by about a quarter. It can occur earlier (in fact has done so;

see below) but congestion for purely Y2K reasons, if it occurs, I now

believe will occur later. Paradoxically, since I was originally arguing

against a focus on 01.01.2000, if my timing prediction is now correct,

the effects will be seen most visibly around the change of century, even

though the causes precede that date.

 

The result of my current revisions is that I think the failure curve I

originally drew should stand as regards its shape but be moved later

in time by about three months.

 

THE EVIDENCE

 

At this point I need to acknowledge a debt of gratitude to UK Government

Departments and the media. The former (and other infrastructure bodies)

seem kindly determined to prove my predictions correct and the latter

have been assiduous in reporting the resultant failures. The evidence

to date comes from reported incidents at the John Radcliffe

Hospital, the Maritime and Coastguard Agency, London Electricity, National

Air Traffic Services (NATS), the Department of Social Security (DSS), the

Inland Revenue and the Passport Agency. There are more but we'll leave

that aside for the moment. All the examples below are from the UK and

there is similar evidence available on the Internet from the USA,

although not (yet) from other countries, which is something I'll also

comment on later.

 

Briefly, John Radcliffe Hospital in Oxford failed in a PABX replacement

(for Y2K reasons) and the failure resulted in loss of telephone

communications for 9 hours as reported in the Daily Express. In fact,

the failure appears to have resulted in loss of full facilities for over

24 hours and caused a neighboring hospital to be put on alert.

 

The Maritime and Coastguard Agency had problems with replacement of its Adas (data

acquisition) system which resulted in some minor disruption over a

couple of weeks.

 

London Electricity attempted to replace some thousands of key-controlled

meters because they wouldn't have been able to record price changes

beyond the end of this year. The new keys didn't work (cut off supply)

resulting initially in some 2000 users being disconnected. The last I

heard was that the replacement program had been temporarily aborted

whilst thumbs were stuck in mouths (or in the air).

 

NATS proposed to resort to manual operation for a 2-hour period in order

to get some compliant replacement equipment installed. This produced

some consternation amongst Members of Parliament because (a) the

replacement was scheduled at a peak traffic period and (b) they had been

led to believe that NATS systems were already compliant. The replacement

reportedly failed, leading to more thumb-sucking no doubt.

 

I've highlighted the Inland Revenue Y2K program as high risk in my last

three assessments of UK central Government readiness (the last is

viewable at www.taskforce2000.co.uk/articles) because of replacement

programs scheduled late this year. The first of these (Infrastructure

2000), previously due for completion in September and now for November,

is already hitting problems. The Bradford Midland Tax Office has

apologized to various companies for threatening to send in bailiffs to

collect amounts supposedly due but which had already been paid. The

problem was that failures attributed to the Infrastructure 2000 project

(to replace 50,000+ desktops, 30,000 of which were classified as

critical) prevented staff from accessing current information.

 

The DSS has failed in implementation of a replacement National Insurance

Contributions system, which currently has some 1500 faults in it (low by

Microsoft standards?) according to a report to the House of Commons

Public Accounts Committee, and has resulted in miscalculation of

benefits to some 350,000 people, of which 70,000 cases remained to be

cleared as at the beginning of August.

 

Finally, the Passport Agency is dealing with a reported backlog of some

hundreds of thousands of applications for passports, in part because of

a failed implementation of a new passport issuing system (PASS) to

replace a previous and non-compliant one (PIMIS).

 

RECONCILIATION OF PREDICTIONS AND EVIDENCE

 

All these cases, all reported in the mass media, tend to support my

earlier conclusion that replacement of systems was likely to be the

most significant cause of disruption in the Year 2000 context. In

effect, that is simply recognition of the fact that, in this context,

the remedy is about as dangerous as the disease.

 

I think there are a few important points to highlight. The first is that

nothing blew up and nobody got killed. Moreover, in all of the above

cases other than that of the Passport Agency (and arguably the Inland

Revenue), there has not been, and is unlikely to be, any long-term mess. These

are cases of local and containable, albeit inconvenient and

embarrassing, administrative "hiccups." That much was in my original

prognostications.

 

The more interesting cases are the Passport Agency and possibly the

Inland Revenue. The Passport Agency is a long-term mess. It results from

failure to adequately implement a new passport issuing system

overlapping with a second impact: new Government legislation on

passports resulting in a large and sudden increase in demand. It is the

coincidence in time of the two impacts that has produced the longer-term

disruption, as predicted in my paper.

 

At the moment, the Inland Revenue case is producing only minor and

locally containable disruption. However, three further replacement

projects are scheduled for completion in October and November and,

should failure in any of these overlap with continuing disruption from the

Infrastructure 2000 project, we could well see longer-term disruption

here also.

 

UNREPORTED CASES

 

Cases of Y2K failures resulting in disruption that get reported in the

media must be the tip of the iceberg. Common sense dictates that that

must be so. I, with limited knowledge of individual organizations, know

of two further cases that I cannot name in which internal disruption is

occurring and has necessitated resort to manual operation of processes.

In both cases, the situation is unrecoverable before financial year-end

and whether the results become public or not will depend very much on

the attitude of the organizations' auditors. I cannot believe these are

isolated cases.

 

Whether these and doubtless other cases will result in major disruption

probably depends less on auditors' attitudes than on whether they

experience another significant "hit" in an overlapping time-frame. But

a good question to ask now is: how many organizations do you

know/suspect are already running some previously automated processes

manually? They must be at serious risk: either of getting into an

unrecoverable situation or of experiencing a second or third and

potentially terminal failure.

 

The other question I would like to pose relates to what is happening in

other countries. It seems certain that the UK recognized and started

work on the Y2K problem earlier than most other countries. The UK should

therefore be more advanced and I would expect that other countries

should thus be experiencing more of the kind of problems described above

than the UK. But there are no reports to confirm this. This leads to

three possible conclusions. Firstly, that countries starting late have

caught up and instituted better quality programs. Secondly, that they

haven't but any failures are not being reported in the national media.

Thirdly, that they have yet to progress to the stage that the UK has

reached and are thus not yet experiencing the failures reported in the

UK.

 

There is a fourth, and rather more alarming, possibility: that other

countries are relying heavily on a "fix on fail" strategy. I won't go

fully into the folly of such a strategy here. Suffice to say that such a

strategy relies on the assumption that you will know when a system fails

(quite unlikely unless you create "traps" to detect failure); and data

corruption, if it occurs undetected for any appreciable length of time,

may well be unrecoverable.
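As an illustration of what a "trap" might look like, here is a minimal sketch of a plausibility check on dates. It is an assumption about one possible form of trap, not a description of any real system: the record layout, the identifiers and the accepted date window are all hypothetical.

```python
from datetime import date

def trap_suspect_dates(records, earliest, latest):
    """Return the ids of records whose date lies outside [earliest, latest]."""
    return [record_id for record_id, when in records
            if not (earliest <= when <= latest)]

# Hypothetical records: (record id, date stored by the system).
payments = [
    ("A-001", date(1999, 12, 30)),
    ("A-002", date(1900, 1, 3)),   # what a two-digit "00" can turn into
    ("A-003", date(2000, 1, 4)),
]

# Assumed window of dates the business would regard as plausible.
suspects = trap_suspect_dates(payments, date(1999, 1, 1), date(2000, 12, 31))
print(suspects)
```

A check of this kind does not fix anything. It only makes a failure visible at the time it happens, which is the precondition that "fix on fail" takes for granted.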

 

FUTURE PROGNOSTICATIONS

 

I believe that we will see increasing numbers of the kinds of failure

described above as this year progresses. Whether they result in anything

more than a few days' local brouhaha and a couple of column inches in the

Press will depend, as I've said before, on whether second (third or

fourth) impacts occur, from any source, in the same time-frame.

 

If that assumption proves true, then the probability of multiple "hits"

occurring within a single organization must grow correspondingly. That

is what I have termed "death by attrition." Also, the probability of

individual hits weakening individual organizations in a single chain of

dependencies must similarly increase: death by a thousand cuts.

 

Thus far, there is no evidence to support "end of the world" scenarios

and we must all hope that no such scenario will be realized. However, it

would be foolish not to recognize that the early seeds of potential

widespread disruption are already sprouting.


Not that they couldn't be coincidental, but I was just visiting an elderly

friend in the hospital and as we were chatting, he mentioned that he had

just gotten an 80 million dollar bill from his bank. Needless to say it was

a mistake.

 

Also, all the correspondence I am getting from one of my gourd customers is

coming in dated like 8-24-00 instead of 99.

 

I use Netscape, so the messages are arranged by date. When I get an

order, I leave the e-mail in my inbox as a reminder. Usually the newer

messages are coming in after those, but this lady's all bubble to the end

because of the 00 date. May have absolutely nothing to do with Y2K
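For anyone curious how a "00" date can jump to the end of a list sorted by date, here is a small sketch. It makes no claim about how Netscape or the sender's machine actually handles dates; it only shows two common ways of reading a two-digit year. The inbox stamps other than 8-24-00 are made up.

```python
from datetime import date, datetime

def parse_mdy(stamp):
    """Read month-day-year using Python's %y rule: 00-68 -> 2000-2068, 69-99 -> 1969-1999."""
    return datetime.strptime(stamp, "%m-%d-%y").date()

def parse_mdy_1900(stamp):
    """Naive reading that always assumes the 1900s."""
    month, day, yy = (int(part) for part in stamp.split("-"))
    return date(1900 + yy, month, day)

inbox = ["8-24-00", "8-20-99", "8-25-99"]  # illustrative stamps

print(sorted(inbox, key=parse_mdy))       # "00" read as 2000 sorts last
print(sorted(inbox, key=parse_mdy_1900))  # "00" read as 1900 sorts first
```

Either way, a message stamped 00 in August 1999 lands at one end of the list rather than among its neighbours.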
