“Fake News” considered harmful

“Fake News” became ubiquitous as a term and a concept after Trump used it in September 2016 to dismiss accusations of sexual assault against him.

Unfortunately, the expression stuck and is now used widely, including by disinformation experts in the scientific literature.

This is highly problematic.

The continued common usage of the term “Fake News” is harmful to healthy public discourse and to research into misleading information.

Terminology forces a mindframe

Despite what his political opponents wanted to believe, Trump and his allies (notably Steve Bannon) are far from dumb. Con men they may be, but they have mastered the art of steering public attention away from topics that don’t suit them.

Using a specific vocabulary and terms that “stick” with the public is one of their techniques, and the famous “covfefe” tweet is arguably the most egregious example of it.

While many people still remember the term and the tweet itself, very few realized that it steered the news cycle away from a series of seriously problematic stories for Trump, namely Michael Flynn pleading the Fifth, Trump’s “deal” with Saudi Arabia, and massive cuts to science and education.

And the trick did its job incredibly effectively.

George Lakoff warned us as early as 2003 about conservatives using language to dominate politics and distort perspective.

“Covfefe” is a modern example of such usage. The stories about Michael Flynn pleading the Fifth and the Saudi Arabia “deal” were pretty damaging to the image Trump has been trying to maintain with his followers. Had they gained enough momentum, they could have started eroding his base’s trust in him. A simple “spelling error from a president overtired from making all the hard decisions” can easily be passed off as “liberals nitpicking over nothing” – basically a mirror of Obama’s beige suit or latte salute.

“Fake News” is no different.

“Fake News” transforms public debate into a shouting match

A major issue with “Fake News” is that it stops a public debate and transforms it into a screaming match based on loyalty. Trump says that the accusations of rape against him and the “grab them by the pussy” tape are “Fake News”. Hillary says that his claims of being a billionaire and a great businessman are “Fake News”.

And that’s where the debate stops. Most people will just get offended that the other side is calling obviously true things “Fake News”, or will persist in their belief that obvious “Fake News” is real. And their decision about which side to take is already predetermined by prior beliefs, partisanship, and memes.

Which is exactly what you want when you know in advance that you are in the wrong and will be losing in a proper civil debate.

Which is Trump’s case. Hence the “Fake News”.

And it is exactly the opposite of what you want when you are in the right and know you will win over people’s opinions if you have a chance to get their attention.

Which is the case for investigative journalists, scientists, civil rights activists, and saner politicians.

And yet, for whatever bloody reason, the latter decided to go along and keep using the term “Fake News”.

And by doing so, they surrendered any edge their position might have had from being rooted in reality.

All to appropriate a catchy term – one that is meaningless and can be either a powerful statement or a powerless cry, depending on how you see the person saying it.

Major journals publishing an in-depth investigation? Fake news by corrupt media!

Scientific journal publishing a long-awaited scientific article suggesting climate change is accelerating and is even worse than what we expected? Fake news by climate shills!

Are doctors warning about pandemics? Fake news by the deep state looking to undermine the amazing record of president Trump on job creation and economic growth!

Russian media outlets writing a story about the danger of western vaccines? Fake news! … Oh, wait, no, it’s just leftist cultural Marxist propaganda being countered by a neutral outlet!

Basically, by leaving no space for discussion, disagreement, nuance, or even a clear statement, “Fake News” is a rallying cry that precludes civil public debate and bars any chance to convince the other side.

“Fake News” is harmful to public debate.

“Fake News” is too nebulous for scientific research

Words have definitions. Definitions have meanings. Meanings have implications.

“Fake News” is a catchphrase and has no attached definition. It is useless for the scientific process.

How do you define “Fake” and how do you define “News”?

  • Is an article written by a journalist lost in scientific jargon that has some things wrong “Fake News”?
  • Is an information operation by a state-sponsored APT using puppet social media profiles to create a sentiment “Fake News”?
  • Is yellow press with scandalously over-blown titles “Fake News”?
  • How about a scientific article with egregious errors whose results are currently relevant to an ongoing pandemic – is it “Fake News” too?
  • Are summaries and recommendations by generative language models such as ChatGPT “Fake News”, even though the model is doing its best to give an accurate answer but is limited by its architecture?

None of those questions has an answer – or rather, the answer varies depending on the paper and its authors’ interests. “Fake News” lacks an agreed-upon operational definition and is too nebulous for any useful usage.

“Fake News” is harmful to research on public discourse and inauthentic information spread.

Ironic usage is not a valid excuse

A common excuse for keeping “Fake News” around is that its use is ironic: second-degree humor, riding the virality wave, or calling the king naked.

Not to sound like the fun police, but none of these are valid excuses. While the initial use might not be in the first degree, it still shapes thinking patterns and will stick. The term feels more and more natural with each use, until, in a moment of inattention, it is used in the first degree.

Eventually, “Fake News” becomes a familiar, first-degree concept and starts being used consistently.

The harm of “Fake News” occurs regardless of initial reasons to use it.

A better language to talk about misleading information

To add insult to injury, before “Fake News” took over, we already had a good vocabulary to talk about misleading information.

Specifically, misinformation and disinformation.


Not all misleading information is generated intentionally. In some cases, a topic is complex and can just be plain hard, with the details that make a conclusion valid or invalid a subject of ongoing debate. Is wine good or bad for health? How about coffee? Cheese? Tea? With about 50 000 headlines claiming both, including in scientific, peer-reviewed publications, it makes sense that a blogger without a scientific background gets confused and ends up writing a misleading summary of their research.

That’s misinformation.

While it can be pretty damaging and even lethal (cf. alt-med narrative amplification), it is not intentional, meaning the proper response to it is education and clarification.

Conversely, disinformation is intentional: those generating and spreading it know better but seek to mislead. No amount of education and clarification will change their narrative; it will just wear out those trying to elucidate things.


Logical fallacies are used and called out on the internet to the point of becoming a meme, with the “logical fallacy logical fallacy” becoming a thing of its own.

While they are easy to understand and detect, the reality is that there are many more ways to err and mislead. Several excellent books have been written on the topic, my favorites being Carl Bergstrom’s “Calling Bullshit”, Dave Levitan’s “Not a Scientist”, and Cathy O’Neil’s “Weapons of Math Destruction”.

However, one of the critical differences between calling out misinformation/disinformation and crying “Fake News” is explaining the presumed mechanism. It can be as elaborate as “exploited Simpson’s paradox in clinical trial exclusion criteria” or as simple as “lie by omission” or “falsified data through image manipulation”.

But it has to be there to engage in a discussion or at least to get the questioning and thinking process going.
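To make the “exploited Simpson’s paradox” mechanism concrete, here is a toy numeric sketch. All the numbers are invented for illustration (they mirror the classic kidney-stone textbook example): a drug can look better in the pooled data while being worse in every severity subgroup, if allocation or exclusion criteria skew which patients end up in each arm.

```python
# Simpson's paradox with made-up trial numbers: the drug arm is loaded
# with mild patients, the control arm with severe ones.

def recovery_rate(recovered: int, total: int) -> float:
    return recovered / total

# (recovered, total) per arm, stratified by disease severity - invented data
mild_drug,   mild_ctrl   = (234, 270), (81, 87)
severe_drug, severe_ctrl = (55, 80), (192, 263)

# Within each severity subgroup, the control arm does better...
assert recovery_rate(*mild_drug)   < recovery_rate(*mild_ctrl)
assert recovery_rate(*severe_drug) < recovery_rate(*severe_ctrl)

# ...yet pooled, the drug arm looks better, purely because of how the
# patients were distributed between the arms.
drug_all = (234 + 55, 270 + 80)
ctrl_all = (81 + 192, 87 + 263)
print(recovery_rate(*drug_all), recovery_rate(*ctrl_all))  # 0.826 vs 0.780
```

Naming the mechanism (“the comparison was confounded by severity”) is exactly the kind of statement that invites scrutiny rather than shouting.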


No matter the intention or the mechanism of delivery, some counterfactual information has a stronger impact than the rest. A yellow-press news article and a minor influencer’s TikTok, YouTube, or Twitter post are different in nature but about equally likely to be labeled untrustworthy. Reputable specialized media articles and scientific journal publications sit on the opposite side of the trustworthiness spectrum, even though they need amplification to reach a large enough audience to have an impact. Finally, major cable news segments, or declarations by a notable public official – e.g. a head of state – are a whole other thing, perceived as both trustworthy and reaching a large audience. Calling such a statement out and countering it requires an uphill battle against authority and reputation, and a much more detailed and in-depth explanation of the misleading aspects.

More importantly, it also provides a context for those trying to spread misleading information. Claiming the world is run by reptilians because a TikToker posted a video about it is on the mostly risible side. Claiming that a public official committed fraud because a major newspaper published an investigative piece is on a much less risible one.

Motivation / Causes.

Finally, a critical part of either defeating disinformation or addressing misinformation is to understand the motivation behind the creators of the former and the causes of error of the latter.

This is a bit harder when it comes to civil discourse, given that accusations are easy to make and hard to accept. It is, however, critical for enabling research and investigation, although subject to political considerations – in the same way attribution is in infosec.

Disinformation is pretty clear-cut. The gain can be financial, political, reputational, or military; it is the exact mechanism envisioned by the malicious actor that is harder to identify and address. Both are still important to understand in order to counter it effectively.

Misinformation is less clear-cut. Since there is no intention and the errors are organic, the reasons they emerged are likely to be more convoluted: overlooked primary sources, wrong statistical tests, forgotten survivorship bias or other implicit selection present in the sample, or a lack of expertise to properly evaluate the primary sources. Or, likely, all of the above combined. These causes are still important for understanding how misinformation emerges and spreads, and for stopping it.

A couple of examples of better language.

HCQ for COVID from Didier Raoult’s lab

This is a disinformation scientific article, where misleading conclusions were generated by manipulating the exclusion criteria in medical trials and a phony review process, motivated by reputational, political, and likely financial gain.

Chuck Yeager’s support of presidential candidate Trump

This is a disinformation social media post that operated by impersonating a public figure, motivated by lending credibility to Trump as a presidential candidate for political gain.

DeepFake of Zelensky giving the order to surrender

This is a disinformation video on a national news channel, generated using deep generative learning (a DeepFake) and injected through a cyber-attack, motivated by the immediate military gain of Ukrainian soldiers surrendering or at least leaving their positions.

Vaccine proponents are pharma shills

These are disinformation news pieces, blog posts, and scientific articles, generated by manipulating data and hiding the truth, motivated by financial gains from money given by pharma.

Vaccine opponents make money from the controversy

These are disinformation blog posts, news articles, and videos, generated by cherry-picking results and fabricating data, motivated by financial gains from sold books, films, convention tickets, and alt-medical remedies.

Better language leads to a better discussion

The five examples above probably had you screaming for references for each part of the statement, whether you agree with them or disagree.


That’s the whole point.

The same statements, phrased as “<X> IS FAKE NEWS”, would have led to no additional discussion – just agreement or disagreement, and a slight annoyance.

A single, four-part statement about presumably misleading information is harder to form, especially if references are included, but it is also harder to deny or refute, and simply by being stated it opens a door to hesitation and investigation.

Which is the opposite of “Fake News”. Which is what we want.

“Fake News” should be considered harmful and its use – abandoned.

ELI5 for vaccines and natural immunity

With vaccine hesitancy being as prevalent as it is now, I suspect more than one of you has someone in their immediate environment who is vaccine-skeptical or at least vaccine-hesitant.

Here is an explanation that worked for someone in my family, helping them move from an “I don’t need it – I am healthy enough” mindset to “OK, I am getting the vaccine ASAP”. In this particular case, the fact that they had several friends go to the hospital due to COVID19 – and in some cases be long-term incapacitated by it – helped raise awareness of how bad the disease can be, as did an underlying trust in my competence and good will. Most of the hesitation was focused on “my natural immunity is already good” and “vaccines don’t work that well to start with”. Oh, and the simplifications below are pretty extreme – I can hear immunologists and virologists screaming from here – but this is one of those “sacrifice precision for ease of understanding” cases.

So, with the preamble done, let’s get to the actual explanation:

“Your immune system works by learning what attacked you, and it takes time to react. Specifically, it looks for cells that explode and let their contents out (also called necrosis, as opposed to apoptosis – when a cell dies because the organism programs it to) and starts looking for anything it has not encountered before (or at least doesn’t encounter often in the organism) that is present in or around the contents of the exploded cell. When the immune system detects what is there, it starts generating antibodies and tests all the cells in the organism to see if they contain the new dangerous compound (here, particles of the virus). If the dangerous compound is present in a cell, the immune system orders “cleaner” cells to digest and destroy that cell and everything it contains.

The problem is that this process starts only after the first mass cell deaths (so the infection is well underway) and takes several days to spin up. During that time, the virus continues to propagate in the body.

So when the cleaner cells start digesting cells that have the virus in them, that’s a lot of cells. Some of them are very important and cannot be replaced – such as heart cells. Others can be regenerated, but if too many of them die, you die as well (lung cells are a good example).

In the case of the vaccine, proteins located on the outside of the virus are injected, along with an agent that provokes cells to explode and attracts the immune system’s attention. The immune system then detects the viral proteins “at the crime scene” and learns to find and destroy them on sight. But this time only a few cells are affected, they are very easy to regenerate (muscle cells are among the easiest – you actually regenerate a lot of them every day if you exercise), and they are not essential to life (given it’s a muscle of the arm and not the heart/lungs/brain).

Once your immune system has learnt to detect them, it will remember them for a long time and closely monitor your body for the smallest traces of the particles it learned were dangerous, even before the first infected cell explodes. It will monitor your body even more closely, and for longer, if it finds those particles at a different “crime scene”. That’s why we need a second dose of the vaccine.

So when the virus arrives, it is immediately detected and neutralized, and the cells it does have time to infect are destroyed before they explode and before the virus is able to replicate.

The reason we need to wait after a vaccine is that the immune system has to transition from “I am currently responding to an attack” to “I have responded to an attack and will remember what attacked me, so that next time I can respond faster and better”. That process takes about 10-14 days, but 21 days leave a margin in case the immune system isn’t working all that well (for instance, if the person is stressed).”

Now, as to the need to get vaccinated immediately, the argument went along these lines: with countries opening up, barriers to the virus’s propagation are going down, and we are still very far from herd immunity – meaning your own vaccine is the only thing that will be protecting you, with everyone not yet vaccinated getting swept up in the upcoming wave.

Arguing with Turing is a losing game (or why predicting some things is hard)

FiveThirtyEight’s 2016 election forecast and Nassim Taleb

Back in August 2016, Nassim Nicholas Taleb and Nate Silver had a massive and very public run-in on Twitter. Taleb – the author of “The Black Swan” and “Antifragile” – and Nate – the founder of FiveThirtyEight – were arguably the two most famous statisticians alive at the time, so their spat ended up attracting quite a lot of attention.

The bottom line of the spat was that, for Taleb, the FiveThirtyEight 2016 US presidential election model was overly confident. As events in the race between Clinton and Trump swung voting intentions one way or another (mostly in response to quips from Trump), Nate’s model swung too much in its predictions and confidence intervals to be considered a proper forecast.

Taleb did have a point. Data-based forecasts, whether based on polls alone or on broader considerations, are supposed to be successive estimations of a well-defined variable vector – the voting splits per state, translating into electoral college votes for each candidate on election day. Any subsequent estimation of that vector with more data (such as fresher polls) is supposed to refine the prior estimation. We expect the uncertainty range of a subsequent estimation to be smaller than that of the prior estimation and to be included in the prior estimation’s uncertainty range – at least most of the time, modulo expected errors.

In the vein of the central thesis of “The Black Swan”, Taleb’s shtick with the 2016 election models – and especially FiveThirtyEight’s – was that they were overconfident in their predictions and underestimated the probability of rare events that would throw them off. As such, according to him, the only correct estimation technique would be a very conservative one – giving each candidate a 50/50 chance of winning right until the night of the election, when it would jump to 100%.

Nassim Taleb’s “Rigorous” “forecasting”

The thing is that Nassim Nicholas Taleb is technically correct here. And Nate Silver was actually very much aware of that himself before Taleb showed up.

You see, the whole point of Nate Silver creating the FiveThirtyEight website and going for rigorous statistics to predict game and election outcomes was precisely that mainstream analysts had a hard time with probabilities and would go either for 50/50 (“too close to call”) or for 100% sure, never in between (link). The reality is that there is a whole world in between, and such rounding errors can cost you dearly. As anyone dealing with inherently probabilistic events learns painfully, on an 80% win bet you still lose 20% of the time – and have to pay dearly for it.
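That last point is easy to check with a minimal simulation (the bet probability and seed are of course made up for illustration):

```python
import random

def simulate_bets(p_win: float, n_bets: int, seed: int = 0) -> float:
    """Simulate n_bets independent bets that each win with probability
    p_win; return the observed fraction of losses."""
    rng = random.Random(seed)
    losses = sum(1 for _ in range(n_bets) if rng.random() >= p_win)
    return losses / n_bets

# An "80% sure" forecast still loses about one time in five.
loss_rate = simulate_bets(p_win=0.80, n_bets=100_000)
print(f"observed loss rate: {loss_rate:.3f}")  # close to 0.200
```

A forecast of 80% that loses 20% of the time is a perfectly calibrated forecast, not a failed one – which is the distinction the mainstream analysts kept rounding away.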

Back in 2016, Nate’s approach was to make forecasts based on the current state of the polls and the estimation of how much that could be expected to change if the candidates were doing about the same thing as the candidates in past elections. Assuming the candidates don’t do anything out of the ordinary and the environment doesn’t change drastically, his prediction would do well and behave like an actual forecast.

Nate Silver | Talks at Google | The Signal and the Noise
Nate explains what a forecast is to him

The issue is that you can never step into the same river twice. Candidates will try new things – especially if models show them losing when they don’t do anything out of the ordinary.

Hence Taleb is technically correct. Nate Silver’s model predictions will always be overly confident and overly sensitive to changes resulting from candidates’ actions or previously unseen changes in the environment.

Decidability, Turing, Solomonoff and Gödel

However, Taleb’s technical correctness is completely beside the point. Unless you are God himself (or Solomonoff’s demon – more about that in a second), you can’t make forecasts that are accurately calibrated.

In fact, any forecast you can make is either trivial or epistemologically wrong – provably.

Let’s imagine ourselves physicists for a second and build a “spherical cow in a vacuum”, extremely simplified model of the world. All events are 100% deterministic, everyone obeys well-defined, known rules on which they base their decisions, and the starting conditions are known. Basically, if we had a sufficiently powerful computer, we could run a perfect simulation of the universe and come up with an exact prediction of the state of the world, and hence of the election result.

Forecasting in this situation seems trivial, no? Well, actually, no. As Alan Turing – the father of computing – proved in 1936, unless you actually run the simulation, in general you cannot even predict whether a process (such as a voter making up their mind) will be over by election day. Henry Gordon Rice was even more radical and in 1951 proved a theorem that can be summarized as “no non-trivial property of simulation runs can be predicted”.

In computer science, predicting whether a process will ever finish is called the halting problem, and it is the prime example of an undecidable problem (i.e., one for which no method can predict the outcome in advance). Those of you with an interest in mathematics might have noted a relationship to Gödel’s incompleteness theorem, which states that undecidable statements exist in any set of postulates complex enough to embed basic arithmetic.
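To make the undecidability point tangible: the best any computable procedure can do is run a process with a step budget and answer “halted” or “not yet” – it can never certify “never halts”. A toy sketch (the two “voter” processes are obviously invented):

```python
# A general decider for "does f() halt?" cannot exist (Turing, 1936).
# A step-budget runner is the honest, computable approximation.

def halts_within(f, budget: int) -> bool:
    """Run generator-style process f step by step; True if it finishes
    within `budget` steps, False if the budget runs out ('not yet')."""
    gen = f()
    for _ in range(budget):
        try:
            next(gen)
        except StopIteration:
            return True   # the process halted
    return False          # budget exhausted: could halt later, or never

def decides_quickly():    # a voter who makes up their mind in 3 steps
    for _ in range(3):
        yield

def never_decides():      # a voter who deliberates forever
    while True:
        yield

print(halts_within(decides_quickly, 10))  # True
print(halts_within(never_decides, 10))    # False - but only "not yet"
```

Note that the `False` answer is asymmetric: no budget, however large, distinguishes “will decide on day N+1” from “will never decide” – which is exactly the forecaster’s predicament.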

Things only go downhill once we add some probabilities into the mix.

For those of you who have done some physics: you probably recognized the extremely simplified model of the world above as Laplace’s demon and have been screaming at the screen about quantum mechanics and how that’s impossible. You are absolutely correct.

That was one of the reasons that pushed Ray Solomonoff (also known for Kolmogorov-Solomonoff-Chaitin complexity) to create something he called Algorithmic Probability, along with a probabilistic counterpart to Laplace’s demon – Solomonoff’s demon.

In the Algorithmic Probability framework, any possible theory about the world has some non-zero probability associated with it and is able to predict something about the state of the world. To accurately predict the probability of an event, you need to calculate the probability of that event according to all theories, weighted by the probability of each theory:

P(E) = Σ_H P(E | H) · P(H)

Single event prediction according to Solomonoff (conditional probabilities); E is the event, H is a theory, and the sum runs over all possible theories

Fortunately for us, Solomonoff also provided a (Bayesian) way of learning the probabilities of events according to theories and probabilities of theories themselves, based on the prior events:

P(H | D) = P(D | H) · P(H) / [ P(D | H) · P(H) + Σ_T P(D | T) · P(T) ]

Learning from a single new data point according to Solomonoff; D is the new data, H is a theory, and the sum runs over all possible theories T, excluding the theory H

Solomonoff was nice enough to provide us with two critical results about his learning process. First, it is admissible – i.e., optimal: no other learning mode can do better. Second, it is uncomputable. Given that a single learning or prediction step requires iterating over the entire infinite set of all possible theories, only a being from a thought experiment – Solomonoff’s demon – is actually able to do it.

This means that any computable prediction methodology is necessarily incomplete. In other words, it is guaranteed not to take enough theories into account for its predictions to be accurately calibrated. It’s not a bug or an error – it’s a law.
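To see what the computable-but-incomplete version looks like, here is a minimal sketch of Bayesian model averaging over a deliberately tiny, finite set of “theories” – three possible biases of a coin, all numbers invented. True Solomonoff induction would run the same two formulas over the infinite set of all computable theories, which is exactly the uncomputable part:

```python
# A computable caricature of Solomonoff induction: Bayesian model
# averaging over a finite hypothesis set instead of all possible theories.

def update(priors: dict, likelihoods: dict) -> dict:
    """One Bayesian step: P(H|D) is proportional to P(D|H) * P(H)."""
    posterior = {h: likelihoods[h] * p for h, p in priors.items()}
    z = sum(posterior.values())
    return {h: p / z for h, p in posterior.items()}

def predict_heads(priors: dict) -> float:
    """P(next toss is heads) = sum over theories of P(heads|H) * P(H)."""
    return sum(h * p for h, p in priors.items())

# Three rival theories about the coin's heads bias, equally likely a priori.
theories = {0.3: 1 / 3, 0.5: 1 / 3, 0.8: 1 / 3}

# Observe 8 heads in 10 tosses, updating one toss at a time.
data = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]
for toss in data:
    theories = update(theories, {h: h if toss else 1 - h for h in theories})

print(predict_heads(theories))  # about 0.76: the 0.8-bias theory dominates
```

If the coin were actually controlled by some fourth mechanism outside our three hypotheses, this learner would never find out – and, per Solomonoff, it cannot even bound how wrong it is. That is the incompleteness the paragraph above describes.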

So when Nassim Taleb says that Nate Silver’s model overestimates its confidence, he is technically correct – but also completely trivial and tautological. Worse than that: Solomonoff also proved that, in the general case, we can’t evaluate by how much our predictions are off. We can’t possibly quantify what we are leaving on the table by using only a finite subset of theories about the universe.

The fact that we cannot evaluate in advance by how much our predictions are off also means that Taleb’s own recommendations about investments are basically worthless. Yes, his strategy will do better when huge bad events that the market consensus did not expect occur. It will, however, do much worse when they don’t.

Basically, for any forecast, the best we can do is something along the lines of Nate Silver’s now-cast plus some knowledge gleaned from the most frequent events that occurred in the past. And that “frequent” part is important. What made the FiveThirtyEight model particularly bullish about Trump in 2016 (and, in retrospect, most correct) was its assumption that polling errors are correlated across states. Without it, the model would have given Trump a 0.2% chance to win instead of the 30% it ended up giving him. The modelers were able to infer and calibrate this correlation because it occurred in every prior election cycle.
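The effect of that correlation assumption is easy to reproduce in a toy Monte Carlo. All the numbers below – a 2-point deficit, five swing states, the split of the polling error – are invented for illustration, not FiveThirtyEight’s actual parameters: with the same total error per state, making the errors mostly shared across states multiplies the probability of the trailing candidate sweeping all of them.

```python
import random

def upset_probability(n_sims: int, shared_sd: float, state_sd: float,
                      margin: float = -2.0, n_states: int = 5,
                      need: int = 5, seed: int = 1) -> float:
    """Probability that a candidate trailing by `margin` points in each of
    n_states swing states wins at least `need` of them, when each state's
    polling error = one shared (national) error + an independent error."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        shared = rng.gauss(0, shared_sd)
        carried = sum(
            1 for _ in range(n_states)
            if margin + shared + rng.gauss(0, state_sd) > 0
        )
        if carried >= need:
            wins += 1
    return wins / n_sims

# Roughly the same total error (sd ~3 points per state), split differently:
independent = upset_probability(100_000, shared_sd=0.0, state_sd=3.0)
correlated  = upset_probability(100_000, shared_sd=2.5, state_sd=1.7)
print(f"independent errors: {independent:.4f}")
print(f"correlated errors:  {correlated:.4f}")  # dozens of times larger
```

With independent errors, a sweep requires five separate polling misses to line up by chance; with a shared error, one national-level miss does it – which is why dropping the correlation collapses an upset from a real possibility into a rounding error.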

What they couldn’t have properly calibrated (or included, or even thought of) were one-off events that would radically change everything. Think about it. Among the things that were possible, although exceedingly improbable, in 2016 were:

  • Antifa militia killing Donald Trump
  • Alt-right militia killing Hillary
  • Alt-right terrorist group inspired by Trump’s speeches blowing up the One World Trade Center, causing massive outrage against him
  • Trump making an off-hand remark about grabbing someone’s something that would outrage the majority of his potential electorate
  • It rains in a swing state, and 1000 fewer Democratic voters decide at the last second not to vote because Hillary’s victory is assured
  • The FBI director publicly speaking about Hillary’s email scandal once again, painting her as a corrupt politician and causing massive discouragement among potential Democratic voters
  • Aliens showing up and abducting everyone
  • ….

Some of these were predictable, since they had happened before (rain), and were built into the model’s uncertainty. Other factors were impossible to predict. In fact, if in November 2015 you had claimed that Donald Trump would become the next president thanks to a last-minute FBI intervention, you would have been referred to the nearest mental health specialist for urgent help.

2020 Election

So why write about that spat now, in 2020 – years after the fact?

First, it’s a presidential election year in the US again, but this time around with even more potential for unexpected turns. Just as in 2016, Nate’s models are drawing criticism – except this time for underestimating Trump’s chances to win rather than overestimating them. While Nate and the FiveThirtyEight team are trying to account for the new events and be exceedingly clear about what their predictions are, they are limited in what they can reliably quantify. Their model – provably – cannot predict all the extraordinary circumstances that could still happen in the upcoming weeks.

There is still time for the Trump campaign to discourage enough of the female vote by painting Biden as a rapist – especially if the FBI reports on an ongoing investigation in the week before the election. There is still time for a miracle vaccine to be pushed to the market. This is no ordinary election. Many things can happen, and none of them are exactly predictable. There is still room for new modes of voter suppression to pop up, and for Amy Coney Barrett to be confirmed in time to make sure the Supreme Court judges the suppression legal.

And no prediction can account for them all. Nate and the FiveThirtyEight team are the best applied statisticians in politics since Robert Abelson, but they can’t decide the undecidable.

2020 COVID 19 models

Second – and perhaps most importantly – 2020 was the year of COVID19 and the forecasts associated with it. Back in March and April, you had to be either extremely lazy or very epistemologically humble not to build your own model of COVID19 spread or criticize other people’s models.

Some takes and models were such absolute hot garbage that the fact that they came from people I had respected and admired in the past plunged me into genuine existential dread.

But putting those aside, even excellent models and knowledgeable experts were constantly off – and by quite a lot. Let’s take as an example the expert predictions from March 16th about how many total cases the CDC would have reported by March 29th in the US (from FiveThirtyEight, for a change). Remind yourself: on March 16th, only 3500 cases had been reported in the US, but Italy was already out of ICU beds and was not resuscitating anyone above 65, France was going into lockdown, and China was still mid-lockdown.

Consensus: ~10 000 – 20 000 total cases, with an 80% upper bound at 81 000.
Reality: 139 000 cases.

Yeah. The CDC reported about 19 000 cases on March 29th alone. The total was almost 10 times the predicted range and almost double the upper bound of the 80% interval. But the reason the experts were off wasn’t that they are wrong or their models are bad. Just like the election models, they can’t account for factors that are impossible to predict. In this specific case, what made all the difference was that the CDC’s own COVID19 tests were defective, and the FDA didn’t start issuing emergency use authorizations for other manufacturers’ tests until around March 16th. The failure of the best-financed, most reliable, and most up-to-date public health agency to produce reliable tests for a world-threatening pandemic for two months would have been an insane hypothesis to make, given the CDC’s prior track record.

More generally, all attempts to predict the evolution of the COVID19 pandemic in developed countries ran into the same problem. While we were able to learn rapidly how the virus spreads and what needs to be done to limit its spread, the actions that would be taken by the people and governments of developed countries were a total wild card. Unlike with election forecasts, we didn’t have any data from prior pandemics in developed countries to build and calibrate models on.

Nowhere is this more visible than in the state-of-the-art models. Take Youyang Gu’s COVID19-projections forecasting website, which uses a methodology similar to the one Nate Silver uses to predict election outcomes (parameter fitting based on prior experience, plus some mechanistic insight about the underlying process). It is arguably the best-performing one. And yet it failed spectacularly, even just a month ahead.

Youyang Gu’s forecast for deaths/day in early July
Youyang Gu’s forecast for deaths/day in early August. His July prediction was off by about 50-100%. I also wouldn’t be as optimistic about November.

And once again, it’s not Youyang Gu’s fault. A large part of that failure is attributable to Florida and other Southern states doing everything they could to under-report the number of cases and deaths attributed to COVID19.

The model is doing as well as it could, given what we know about the disease and how people have behaved in the past. It’s just that no model can solve the halting problem of “when will the government decide to do something about the pandemic, what will it be, and what will people decide to do about it”. Basically, the “interventions” part of projections such as those from Richard Neher’s lab COVID19-scenarios is, as of now, undecidable.

Once you know the interventions, predicting the rest reliably is doable.
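That point can be sketched with a toy SIR model: the model and its parameters are held fixed, and the only unknown – the intervention date – dominates the outcome. All numbers below are illustrative assumptions, not fitted to any real epidemic:

```python
# Toy discrete-time SIR model. Once the intervention date and its effect on
# transmission are fixed, the trajectory follows mechanically; the hard,
# undecidable part is predicting *when* the intervention happens.
def sir_peak_infected(intervention_day: int, beta: float = 0.3,
                      beta_after: float = 0.1, gamma: float = 0.1,
                      n: int = 1_000_000, i0: int = 100,
                      days: int = 365) -> float:
    s, i, r = n - i0, float(i0), 0.0
    peak = i
    for day in range(days):
        b = beta if day < intervention_day else beta_after  # intervention cuts beta
        new_inf = b * s * i / n
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return peak

early = sir_peak_infected(intervention_day=20)
late = sir_peak_infected(intervention_day=60)
print(f"peak infected, early (day 20) intervention: {early:,.0f}")
print(f"peak infected, late (day 60) intervention:  {late:,.0f}")
```

With these assumed parameters, intervening on day 20 versus day 60 changes the peak by roughly two orders of magnitude – which is why the “when will they act” wild card swamps everything else in the forecast.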

Another high-profile example of expert model failure is the Global Health Security Index ranking of countries best prepared for a pandemic, from October 2019. The US and the UK received respectively the best and second-best scores for pandemic preparedness. Yet if we look today (end of October 2020) at COVID19 deaths per capita among developed countries, they are close to the second- and third-worst performers. They are not an exception. France ranked better than Germany, yet it has 5 times the number of deaths. China, Vietnam and New Zealand – all ranked around the 50/195 mark – are now the best performers, with a whopping 200, 1,700 and 140 times fewer deaths per capita than the US, the supposed best performer. All because of differences in leadership, and in how decisive and relentless each government’s response to the pandemic was.

The GHS index is not useless – it was just based on the previously implicit expectation that governments of countries affected by outbreaks would aggressively intervene and do everything in their capacity to stop the pandemic. After all, the USA in the past went as far as collaborating with the USSR, at the height of the Cold War, to develop and distribute the polio vaccine. That expectation matched what was observed during the 2009 Swine flu pandemic and during the HIV pandemic (at least once it became clear it was hitting everyone, not just minorities), and it was in line with what the WHO was seeing in developing countries hit by pandemics – at least those that did have a functioning government.

Instead of a conclusion

Forecasting is hard – especially when it is quantitative. In most cases, predicting deterministically what will happen is impossible. Probabilistic predictions can only base themselves on what has happened in the past frequently enough for them to be properly calibrated.

They cannot – provably – account for all the possible events, even if we had a perfectly deterministic and known model of the world. We can’t even estimate what the models are leaving on the table when they try to perform their predictions.

So the next time you criticize an expert or a modeler and point out a shortcoming in their model, think for a second. Are you criticizing them for not having solved the halting problem? For not having bypassed Rice’s theorem? For not having built an axiom set that accounts for a previously un-encountered event? Would the model you want require a Laplace or a Solomonoff demon to actually run? Are you – in the end – betting against Turing?

Because if the answer is yes, you are – provably – going to lose.


A couple of Notes

An epistemologically wrong prediction can be right, but for the wrong reason. A bit like “even a stopped clock shows the correct time twice a day”. You can pull off a one-off, maybe a second one if you are really lucky, but that’s about it. That’s one of the reasons why any good model can only take into account events that have occurred sufficiently frequently in the past: to insert them into the model, we need to quantify their effect magnitude, uncertainty and probability of occurrence. Many pundits who used single-parameter models in the 2016 election to predict a certain and overwhelming win for Hillary learnt this the hard way.
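A quick sketch of why calibration beats bravado: if events actually occur 70% of the time, a forecaster who says 70% scores better on the standard Brier score (lower is better) than one who confidently shouts 99%. The probabilities below are made-up illustrative numbers:

```python
import random

random.seed(0)

# Brier score: mean squared error between the stated probability and the
# binary outcome. A calibrated forecast beats an overconfident one on average.
def brier(forecast: float, outcomes) -> float:
    return sum((forecast - o) ** 2 for o in outcomes) / len(outcomes)

# Simulated events that truly occur with probability 0.7.
outcomes = [1 if random.random() < 0.7 else 0 for _ in range(10_000)]

print(f"calibrated forecaster (p=0.70):   {brier(0.70, outcomes):.3f}")
print(f"overconfident forecaster (p=0.99): {brier(0.99, outcomes):.3f}")
```

The one-off lucky pundit is exactly the forecaster who said 0.99 and happened to draw a 1; averaged over many events, the calibrated forecaster wins.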

Turing’s halting problem and Rice’s theorem have some caveats. It is in fact possible to predict some properties of algorithms and programs, assuming they are well-behaved in some sense. In a lot of cases this involves them not having loops, or having a finite and small number of branches. There is an entire field of computer science dedicated to developing methods for proving things about algorithms, and to writing algorithms whose properties of interest are “trivial” and can be predicted.

Another exception to Turing’s halting problem and Rice’s theorem is given by the law of large numbers. If we have seen a large number of algorithms and the properties of their outcomes, we can make reasonably good predictions about the statistics of those properties in the future. For instance, we cannot compute the trajectories of individual molecules in a volume, but we can have a really good idea of their temperature. Similarly, we can’t predict whether a given person will become infected before a given day, or whether they will die of an infection, but we can predict the average number of infections and the 95% confidence interval for the population as a whole – assuming we have seen enough outcomes and know the epidemiological interventions and adherence to them. That’s the reason, for instance, why we are pretty darn sure about how to limit COVID19 spread.
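A minimal illustration of this law-of-large-numbers point: with an assumed 2% individual infection probability (a made-up number for the sketch), any one person’s outcome is an unpredictable coin flip, but the population count is tightly pinned down:

```python
import random

random.seed(42)

# Each person is independently infected with probability p. The individual
# outcome is unpredictable; the aggregate count concentrates sharply.
p, n = 0.02, 100_000
infected = sum(1 for _ in range(n) if random.random() < p)

expected = p * n
# ~95% interval from the normal approximation to the binomial
half_width = 1.96 * (n * p * (1 - p)) ** 0.5
print(f"observed {infected}, expected {expected:.0f} ± {half_width:.0f}")
```

The standard deviation here is about 44 on an expected 2,000 – a ~2% relative spread – while the individual-level prediction is hopeless by construction.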

Swiss Cheese defense in depth against COVID19 by Ian M. Mackay

To push the point above further, scientists can often make really good predictions from existing data. Basically, that’s how the scientific process works – only theories with confirmed good predictive value are allowed to survive. For instance, back in February we didn’t know for sure how probable the hypothesis “COVID19 spreads less as temperature and humidity increase” was, nor what the effect on spread would be. We had, however, a really good idea that interventions such as masks, hand washing, social distancing, contact tracing, quarantine of at-risk contacts, rapid testing and isolation would be effective – because of everything we have learnt about viral upper respiratory tract diseases over the last century, and the complete carnage among theories that were not good enough at letting us predict what would happen next.
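The Swiss-cheese idea can be sketched numerically: each layer is leaky on its own, but if the layers act roughly independently, the risks that slip through multiply. The per-layer efficacies below are made-up illustrative numbers, not measured values:

```python
# Defense in depth: each intervention blocks only a fraction of
# transmissions, but independent leaky layers compound multiplicatively.
layers = {
    "mask": 0.30,                      # assumed efficacy per layer
    "distancing": 0.50,
    "hand washing": 0.20,
    "rapid testing + isolation": 0.40,
}

residual = 1.0
for name, efficacy in layers.items():
    residual *= 1 - efficacy

print(f"residual transmission risk: {residual:.1%} of baseline")
```

Four individually mediocre layers leave only ~17% of the baseline risk in this sketch – which is why stacking imperfect interventions is the standard playbook, rather than hunting for one perfect barrier.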

There is, however, an important caveat to the “large numbers” approach. Before the law of large numbers kicks in and modelers’ probabilistic models become quantitative, stochastic fluctuations – for which models really can’t account – dominate, and stochastic events – such as super-spreader events – play an outsized role, making forecasts particularly brittle. For instance, in France the early super-spreader event in Mulhouse (mid-February) meant that Alsace-Moselle took the brunt of the ICU deaths in the first COVID19 wave, followed closely by the cramped and cosmopolitan Paris region. One single event, one single case. Predicting it and its effect would have been impossible, even with China-grade mass surveillance.

March COVID19 deaths in France, from The Economist (“Mobilising against a pandemic – France’s Napoleonic approach to covid-19”)
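The outsized role of super-spreader events can be sketched with a small branching-process simulation, using the standard negative-binomial model of secondary cases. The R0 and dispersion k below are assumed, illustrative values, not estimates for SARS-CoV-2:

```python
import math
import random

random.seed(1)

# Overdispersed ("super-spreader") transmission: secondary cases per
# infection drawn from a negative binomial with mean r0 and dispersion k.
# Most introductions fizzle out on their own; a few explode.

def poisson(lam: float) -> int:
    # Knuth's algorithm; adequate for the modest rates drawn here.
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def secondary_cases(r0: float = 2.5, k: float = 0.1) -> int:
    # Negative binomial sampled as a gamma-Poisson mixture.
    return poisson(random.gammavariate(k, r0 / k))

def outbreak_size(cap: int = 5_000) -> int:
    # Total cases from a single introduction, capped to bound runtime.
    cases, active = 1, 1
    while active and cases < cap:
        active = sum(secondary_cases() for _ in range(active))
        cases += active
    return cases

sizes = [outbreak_size() for _ in range(100)]
extinct = sum(1 for s in sizes if s < 10)
print(f"{extinct}/100 introductions died out below 10 cases")
```

Even with a mean of 2.5 secondary cases, most seeded chains go extinct by pure chance in this sketch, while a minority blow up – exactly the regime where one Mulhouse-sized event decides the shape of a whole regional wave.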

If you want a more rigorous and coherent version of this discussion about decidability, information, Laplace-Bayes-Solomonoff, and automated model design (i.e. how ML/AI meshes with all of that), you can check out Lê Nguyên Hoang and El Mahdi El Mhamdi. A lot of what I mention is presented rigorously here, or in French – here.

On masks and SARS-CoV-2

This comment was initially a response to a YouTube video from Tech Ingredients – a channel I have thoroughly enjoyed in the past for its in-depth dives into the scientific and engineering aspects of various engineering-heavy DIY projects. Unfortunately, I am afraid the panic around COVID19 has prevented a lot of people from thinking straight, and I could not but disagree with the section on masks.


Hey there – engineer turned biomedical scientist here. I absolutely love your videos and have been enjoying them a lot, but I believe that in this specific domain I have enough experience to point out something that appears to me to have been overlooked, and that is likely to drastically change your recommendation on masks.

First of all, operating room masks and standard medical masks are extremely different beasts. The capacity of operating room masks to filter out small particles – close in size to the droplets that carry SARS-CoV-2 the farthest – is much closer to that of N95s than to that of standard medical masks:

masks filtration efficiency

Standard medical masks let through about 70% of droplets at the smaller end of the sizes that can carry SARS-CoV-2. A decrease in exposure of that magnitude has not been associated with a statistically significant reduction in contagion rates for any respiratory-transmitted disease.

So why are standard medical masks recommended for sick people? The main reason is that in order to get into the air, viral particles need to be aerosolized by a contaminated person coughing, sneezing or speaking. The mask does not do well at preventing small particles from getting in or out, but it will at least partially prevent aerosolization, especially of the larger droplets – which contain more virus and are hence more dangerous.

Now, that means that if you really want to protect yourself, rather than wearing a mask – even a surgical one – it’s much better to use a full face shield: while useless against aerosolized particles suspended in the air, it will protect you from the largest and most dangerous droplets.

Why do medical personnel need them?
The reality is that without N95 masks, in immediate contact with patients, their risk of getting infected is pretty high even in what are considered “safe” areas – as is the risk of passing the virus on to colleagues and patients in those “safe” areas. If left to spread, given the over-representation of serious cases in the hospital environment, it is not impossible that the virus would evolve towards forms that cause more serious symptoms. Even if we can’t fully protect the medical personnel, preventing those of them who are asymptomatic from spreading the virus is critical for everyone (besides, masks are also for patients – if you look at pictures from China, all patients wear them).

Second, why did the WHO not recommend the use of N95 masks to the general public at the beginning of this outbreak, whereas for the SARS-CoV outbreak of 2002-2004 it did so almost as soon as the virus became known to the West?

Unlike the first SARS-CoV, SARS-CoV-2 does not remain suspended in aerosols for prolonged periods of time: it does not form clouds of aerosolized particles that stay in suspension and can infect someone passing through the cloud hours after the patient who spread it has left. For SARS-CoV-2, the droplets fall to the ground fairly rapidly – within a couple of meters and a couple of minutes (where they can be picked up – hence hand washing and gloves). Because of that, unlike SARS-CoV, SARS-CoV-2 transmission is mostly driven by direct face-to-face contact, with virus-containing droplets landing on the faces of people in close contact.

The situation changes in hospitals and ICU wards – with a number of patients constantly aerosolizing, small particles do not have time to fall, and the medical personnel are within a couple of meters of patients due to space constraints. Even there, however, N95 masks are currently only used for aerosol-generating procedures, such as patient intubation.

Once again, for most people, a face shield, keeping several meters of distance, and keeping your hands clean and away from your face are the absolute best bang for the buck, with everything else having significantly diminishing returns.


PS: since I wrote this post, a number of science journalists have done an excellent job of researching the subject in depth and writing up their findings in an accessible manner:

In addition, a Nature Medicine study has recently been published indicating that while masks are really good at preventing the formation of large droplets (yay), they are not that great at stopping small droplets (the kind that can float for a little while) in the case of influenza. The great news is that for coronaviruses, since few droplets of that size are formed, masks work great at containing any type of viral particle emission: Nature Medicine Study.