19-07-2024, 02:08 PM
It was bound to happen, and it will be worse in the future.
I want a PNP valve!
Internet crash today 19/7/24
19-07-2024, 02:30 PM
Seems odd that they put it out nationwide and worldwide. You would have thought the design should have allowed gradual installation.
Ironic that a system made to stop the very thing that happened caused it.
Gary
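The "gradual installation" point is the standard mitigation: push an update out in rings and halt automatically if a ring looks unhealthy. A minimal sketch of the idea (the ring sizes, failure budget, and health probe are illustrative assumptions, not CrowdStrike's actual process):

```python
def staged_rollout(hosts, is_healthy, rings=(0.01, 0.10, 0.50, 1.0),
                   max_failure_rate=0.02):
    """Push an update out ring by ring; halt the whole rollout if any
    ring exceeds the failure budget, instead of continuing worldwide."""
    done = 0
    for ring in rings:
        target = max(done + 1, int(len(hosts) * ring))
        batch = hosts[done:target]
        if not batch:
            continue
        failures = sum(not is_healthy(h) for h in batch)
        if failures / len(batch) > max_failure_rate:
            return ("halted", done, failures)  # hosts updated so far, failures seen
        done = target
    return ("complete", done, 0)
```

With a faulty update, the halt fires in the first (1%) ring, so the damage is capped at a small slice of the fleet rather than every machine at once.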
19-07-2024, 03:38 PM
The Register reports give some technical insight: https://www.theregister.com/2024/07/19/a...pdate_mess
www.borinsky.co.uk Jeffrey Borinsky www.becg.tv
19-07-2024, 04:03 PM
Let's be clear: This is NOT an "Internet crash".
None of the internet's infrastructure or performance is affected. What has happened is that CrowdStrike, a supplier of security software mainly to corporates, issued a faulty patch which causes some users' machines to enter a "death loop" of continual reboots. In other words, this is an end-user infrastructure issue, not in any way an internet issue.

Oh, and the only reason that Windows machines are affected has nothing to do with Windows or Microsoft: it's only that the faulty AV product is one that is used on Windows machines and, like all AV products regardless of operating system, it requires very low-level access, hence the damage that "getting it wrong" can cause. It would probably be worse if it were a Unix AV product, as most of the websites in the world run on Unix servers.
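The reported failure mode was a malformed content update consumed by a component running with that low-level access. A user-space sketch of the defensive pattern being described — validate the new file and fall back to the last known-good one rather than crash (the file format and field names here are invented for illustration):

```python
import json

def load_rules(candidate: str, last_known_good: str) -> dict:
    """Parse a freshly delivered rules file defensively; on any error,
    keep running on the previous known-good rules instead of dying."""
    try:
        rules = json.loads(candidate)
        # Reject structurally invalid input before anything privileged sees it.
        if not isinstance(rules, dict) or "signatures" not in rules:
            raise ValueError("missing 'signatures' section")
        if not all(isinstance(s, str) for s in rules["signatures"]):
            raise ValueError("malformed signature entry")
        return rules
    except (json.JSONDecodeError, ValueError):
        return json.loads(last_known_good)  # fall back, don't boot-loop
```

A kernel-mode component obviously can't call `json.loads`, but the principle is the same: never let an unvalidated data update take down the code that consumes it.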
sıʌǝɹq ɐʇıʌ `ɐƃuol sɹɐ
ʞɔıu
19-07-2024, 04:52 PM
(19-07-2024, 02:08 PM)PerdioPal Wrote: It was bound to happen, it will be worse in the future.

Yes. I wrote a book years ago. It was going to be totally apocalyptic, with hundreds of millions dying, but I made it more light-hearted. "No Silver Lining" never blames Windows (or any OS) at all; the problem is a more subtle management issue.

See also: … it's stupid to use GPS as simply an alternative to buying a stable oscillator or clock. It should only be used for navigation, not for mobile base station, fibre head, DAB or DTT timing. It's a single point of failure, whether by jamming (DoS), a solar flare, or mismanagement by the operator. It's a small one-off capital equipment saving that is absolutely stupid.

A cyber security guy writes:
Quote: I think what this has shown is that no enterprise can have a single threat-agent dependency.

But what is also bad is the monoculture, like potatoes in the 19th century: too many people using Cloudflare, or the same Cloud services, or the ISPs and so-called "cloud" providers all using the one kind of thing. Automatic updates are bad.

Written in 2017:
Quote: "Why then do you say the Cloud is like potatoes?"

and later:
Quote: We were doing that anyway before our analysts suggested, at the beginning of September, that the so-called Cloud had reached a tipping point. Failure was deemed to be inevitable, with more severe results than in previous years due to more vital infrastructure and core services being outsourced to it.
19-07-2024, 04:58 PM
(19-07-2024, 04:03 PM)Nick Wrote: Let's be clear: This is NOT an "Internet crash".

Agree with all of that. TOTALLY! It's NEVER really an OS issue; mostly a management issue. But there will be a Friday-evening bodged patch/update due to artificial deadlines. You just need two different ones that affect, say, servers and edge routers, and it WILL fall like a house of cards. Then all sorts of things that shouldn't be affected will fail due to outsourcing. It might be a glitch of several hours, or very bad indeed.
20-07-2024, 05:26 AM
Why didn't CrowdStrike do extensive tests on Windows PCs, the same as used by their customers, to emulate what happens when an update is about to be released?
20-07-2024, 06:24 AM
(20-07-2024, 05:26 AM)Doodlebug Wrote: Why didn't CrowdStrike do extensive tests on Windows PCs?

They would have done, but their change and release control processes were inadequate. It's also mainly servers rather than PCs that were affected.
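A "release control" gate of the kind being described can be sketched in a few lines: boot the candidate update on a matrix of representative test configurations first, and block the release if any of them fails (the configuration names and smoke-test hook below are hypothetical):

```python
def release_gate(update_id, test_configs, run_smoke_test):
    """Run the candidate update through every representative test
    configuration; a single failure blocks the release entirely."""
    results = {cfg: run_smoke_test(update_id, cfg) for cfg in test_configs}
    failed = sorted(cfg for cfg, ok in results.items() if not ok)
    return ("blocked", failed) if failed else ("released", [])
```

The point of Doodlebug's question is exactly this: machines "same as used by their customers" should sit in `test_configs`.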
sıʌǝɹq ɐʇıʌ `ɐƃuol sɹɐ
ʞɔıu
20-07-2024, 06:33 AM
Looking at reports on The Register, it's PCs as much as servers. Since the software provides endpoint security it will, of necessity, be running on ordinary PCs and laptops.
The phrase that comes to mind is "single point of failure".
www.borinsky.co.uk Jeffrey Borinsky www.becg.tv
20-07-2024, 07:00 AM
The critical infrastructure is the servers.
As in all such systems, there are risks: it's up to those responsible to mitigate as many of the risks as possible (within your remit/budget) and then to quantify those remaining. These in turn have to be signed off and accepted by senior management. This process is a fundamental tenet of DR & BC (Disaster Recovery and Business Continuity) planning.

DR is a different situation from BC - they are related but very different things. In this example, DR is how you get your infrastructure back on air; BC is how you keep the business operational in the interim.

I spent a lot of my career as a CTO for financial institutions and hedge funds with DR & BC central to my responsibilities. It's not an easy job, as designing the processes needed requires a deep and fundamental knowledge of how the organisation works, plus continual refinement and testing. These are "living" processes - there is no "end" to a DR & BC project.

There is also a full spectrum of how minor or severe an individual incident may be, from either a DR or a BC perspective: from "the CFO is seriously ill" (BC) or "there's a gas leak next door and we have to evacuate the building" to "the office has burnt down" or "a bomb has gone off". We were in the West End of London and both the gas leak and bomb scenarios happened, as did an 11 kV underground substation fire coupled with a failure of a backup generator.

In the few times over 35 years I've had to execute (part of) such a plan, there have always been unexpected/left-field events that have had to be handled in real time - you can't plan for everything, so you have to be flexible and very quick-thinking. You need a good team, good resources, authority and backing from the executive board, as hard decisions may have to be made. I had to plan for dirty bombs, pandemics, hacking, infrastructure failures, comms failures (roadworks breaking fibres etc.), utility failures, strikes, fire, theft, heatwaves, floods, commercial risk (upstream supplier failures) and so on.

Plans have to be workable and regularly rehearsed so everyone is aware of what to do when called upon. It's very, very hard to do well.
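The "quantify those risks remaining" step is often done as a simple likelihood × impact score per risk, with anything above an agreed tolerance escalated for executive sign-off. An illustrative sketch (the 1-5 scales, tolerance, and example entries are invented, not a real DR & BC standard):

```python
def score_risks(register, tolerance=12):
    """Rank residual risks by likelihood x impact (each scored 1-5);
    anything above the agreed tolerance needs executive sign-off."""
    scored = sorted(((name, likelihood * impact)
                     for name, likelihood, impact in register),
                    key=lambda r: r[1], reverse=True)
    needs_signoff = [name for name, score in scored if score > tolerance]
    return scored, needs_signoff

# Hypothetical register entries: (risk, likelihood 1-5, impact 1-5)
register = [
    ("upstream supplier failure", 3, 5),
    ("fibre break from roadworks", 4, 3),
    ("office fire", 1, 5),
]
```

The scoring itself is trivial; the hard part the post describes is keeping the register honest and the sign-offs current.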
sıʌǝɹq ɐʇıʌ `ɐƃuol sɹɐ
ʞɔıu