“Fake News” considered harmful

“Fake News” as a term and a concept has been ubiquitous ever since Trump used it in September 2016 to dismiss accusations of sexual assault against him.

Unfortunately, the expression has stuck ever since and is now in wide use, including by disinformation experts in their scientific literature.

This is highly problematic.

The ongoing common usage of the term “Fake News” is harmful to healthy public discourse and to research into misleading information.

Terminology forces a mindframe

Despite what his political opponents wanted to believe, Trump and his allies (notably Steve Bannon) are far from being dumb. While they are con men, they have mastered the art of switching public attention away from topics that don’t suit them.

Using a specific vocabulary and terms that “stick” with the public is one of their techniques, and the famous “covfefe” tweet is arguably the most egregious example of it.

While many people still remember the term and the tweet itself, very few realized that it switched the news cycle away from a bunch of seriously problematic stories for Trump, namely Michael Flynn pleading the Fifth, Trump’s “deal” with Saudi Arabia, and massive cuts to science and education.

And the trick did its job incredibly effectively.

George Lakoff warned us as early as 2003 about conservatives using language to dominate politics and distort perspective.

“Covfefe” is a modern example of such usage. The stories about Michael Flynn pleading the Fifth and the Saudi Arabia “deal” were pretty damaging to the image Trump has been trying to maintain with his followers. Had they gained enough momentum, they could have started eroding his base’s trust in him. A simple “spelling error from a president overtired from making all the hard decisions” can definitely be passed off as “liberals nitpicking over nothing” – basically a mirror of Obama’s beige suit or latte salute.

“Fake News” is no different.

“Fake News” transforms public debate into a shouting match

A major issue with “Fake News” is that it stops a public debate and transforms it into a screaming match based on loyalty. Trump says that accusations of rape against him and the “grab them by the pussy” tape are “Fake News”. Hillary says that his claims of being a billionaire and a great businessman are “Fake News”.

And that’s where the debate stops. Most people will just get offended that the other side is calling obviously true things “Fake News” or persisting in their belief that obvious “Fake News” is real. And their decision regarding which side to go with is already predetermined by prior beliefs, partisanship, and memes.

Which is exactly what you want when you know in advance that you are in the wrong and will be losing in a proper civil debate.

Which is Trump’s case. Hence the “Fake News”.

And it is exactly the opposite of what you want when you are in the right and know you will win over people’s opinions if you have a chance to get their attention.

Which is the case for investigative journalists, scientists, civil rights activists, and saner politicians.

And yet, for whatever bloody reason, the latter decided to go along with it and keep using the term “Fake News”.

And by doing so, they surrendered any edge their position might have had from being rooted in reality.

Just to appropriate a catchy term, even though it is meaningless and can be either a powerful statement or a powerless cry depending on how you see the person saying it.

A major newspaper publishing an in-depth investigation? Fake news by corrupt media!

A scientific journal publishing a long-awaited article suggesting climate change is accelerating and is even worse than what we expected? Fake news by climate shills!

Doctors warning about a pandemic? Fake news by the deep state looking to undermine the amazing record of President Trump on job creation and economic growth!

Russian media outlets writing a story about the danger of western vaccines? Fake news! … Oh, wait, no, it’s just leftist cultural Marxist propaganda being countered by a neutral outlet!

Basically, by leaving no space for discussion, disagreement, nuance, or even a clear statement, “Fake News” is a rallying cry that precludes civil public debate and bars any chance to convince the other side.

“Fake News” is harmful to public debate.

“Fake News” is too nebulous for scientific research

Words have definitions. Definitions have meanings. Meanings have implications.

“Fake News” is a catchphrase and has no attached definition. It is useless for the scientific process.

How do you define “Fake” and how do you define “News”?

  • Is an article written by a journalist lost in scientific jargon that has some things wrong “Fake News”?
  • Is an information operation by a state-sponsored APT using puppet social media profiles to create a sentiment “Fake News”?
  • Is yellow press with scandalously over-blown titles “Fake News”?
  • How about a scientific article written with egregious errors that have results currently relevant to an ongoing pandemic – is it “Fake News” too?
  • Are summaries and recommendations by generative language models such as ChatGPT “Fake News”, even though the model is trying its best to give an accurate answer but is limited by its architecture?

None of those questions has an agreed answer; or rather, the answer varies depending on the paper and its authors’ interests. “Fake News” lacks an agreed-upon operational definition and is too nebulous to be of any use.

“Fake News” is harmful to the research on public discourse and the spread of inauthentic information.

Ironic usage is not a valid excuse

An excuse for keeping “Fake News” around is that it’s ironic, used in the second degree, humor, riding the virality wave, or calling the king naked.

Not to sound like the fun police, but none of these are valid excuses. While the initial use might not be in the first degree, it still affects thinking patterns and will stick. Eventually, using the term will feel more and more natural, until in a moment of inattention it is used in the first degree.

Eventually “Fake News” becomes a familiar first-degree concept and starts being used consistently.

The harm of “Fake News” occurs regardless of initial reasons to use it.

A better language to talk about misleading information

To add insult to injury, before “Fake News” took over we already had a good vocabulary to talk about misleading information.

Specifically, misinformation and disinformation.

Intention.

Not all misleading information is generated intentionally. In some cases a topic can just be plain hard, and the details that make a conclusion valid or invalid are a subject of ongoing debate. Is wine good or bad for health? How about coffee? Cheese? Tea? With about 50 000 headlines claiming both, including within scientific, peer-reviewed publications, it makes sense that a blogger without a scientific background gets confused and ends up writing a misleading summary of their research.

That’s misinformation.

While it can be pretty damaging and even lethal (cf. alt-med narrative amplification), it is not intentional, meaning the proper response to it is education and clarification.

Conversely, disinformation is intentional, with those generating and spreading it knowing better, but seeking to mislead. No amount of education and clarification will change their narrative; it will just wear out those trying to elucidate things.

Means

Logical fallacies are used and called out on the internet to the point of becoming a meme, with the “logical fallacy logical fallacy” becoming a thing.

While they are easy to understand and detect, the reality is that there are way more means to err and mislead. Several excellent books have been written on the topic, with my favorites being Carl Bergstrom’s “Calling Bullshit”, Dave Levitan’s “Not a Scientist” and Cathy O’Neil’s “Weapons of Math Destruction”.

However, one of the critical differences between calling out misinformation/disinformation and just shouting “Fake News” is explaining the presumed mechanism. It can be as elaborate as “exploited Simpson’s paradox in clinical trial exclusion criteria” or as simple as “lie by omission” or “falsified data by image manipulation”.

But it has to be there to engage in a discussion or at least to get the questioning and thinking process going.

Vector.

No matter the intention or the mechanism of delivery, some counterfactual information has a stronger impact than other. A yellow press news article or a minor influencer’s TikTok, YouTube, or Twitter post are different in their nature but are about as likely to be labeled as untrustworthy. Reputable specialized media news articles or scientific journal publications are on the opposite side of the spectrum of trustworthiness, even though they need amplification to reach a sufficient audience to have an impact. Finally, major cable news segments or declarations by a notable public official – e.g. a head of state – are a whole other thing, perceived as both trustworthy and reaching a large audience. Calling them out and countering them requires an uphill battle against authority and reputation and a much more detailed and in-depth explanation of the misleading aspects.

More importantly, it also provides a context for those trying to spread misleading information. Claiming the world is run by reptilians because a TikToker posted a video about it is on the mostly risible side. Claiming that a public official committed fraud because a major newspaper published an investigative piece is on a much less risible one.

Motivation / Causes.

Finally, a critical part of either defeating disinformation or addressing misinformation is to understand the motivation behind the creators of the former and the causes of error of the latter.

This is a bit harder when it comes to the civil discourse, given that accusations are easy to make and hard to accept. It is however critical to enable research and investigation, although subject to political considerations – in the same way as attribution is in infosec.

Disinformation is pretty clear-cut. Gains can be financial, political, reputational, or military; it’s the exact mechanism envisioned by the malicious actor that’s harder to identify and address. Both are still important to understand to counter it effectively.

Misinformation is less clear-cut. Since there is no intention and errors are organic, the reasons they emerged are likely to be more convoluted. It can be overlooked primary sources, wrong statistical tests, forgetting about survivorship bias or other implicit selection present in the sample, or a lack of expertise to properly evaluate the primary sources. Or, likely, all of the above combined. These causes are still important to understand in order to see how misinformation emerges and spreads, and to stop it.

A couple of examples of better language.

HCQ for COVID from Didier Raoult’s lab

This is a disinformation scientific article, where misleading conclusions were generated by manipulating the exclusion criteria in medical trials and a phony review, motivated by a reputational, political, and likely financial gain.

Chuck Yeager’s support of presidential candidate Trump

This is a disinformation social media post that operated by impersonating a public figure, motivated by adding credibility to Trump as a presidential candidate for political gain.

DeepFake of Zelensky giving the order to surrender

This is a disinformation video on a national news channel that was generated using deep generative learning (DeepFake) and injected using a cyber-attack, motivated by immediate military gain resulting from UA soldiers surrendering or at least leaving their positions.

Vaccine proponents are pharma shills

These are disinformation news articles, blog posts, and scientific articles that were generated by manipulating data and hiding the truth, motivated by financial gain from money given by pharma.

Vaccine opponents make money from the controversy

These are disinformation blog posts, news articles, and videos that were generated by selecting results and fabricating data, motivated by financial gain from sales of books, films, convention tickets, and alt-medical remedies.

Better language leads to a better discussion

The five examples above got you screaming for references for each part of the statement, whether you disagree with them or agree with them.

Good.

That’s the whole point.

The same statements phrased as “<X> IS FAKE NEWS” would have led to no additional discussion, just agreement or disagreement and a slight annoyance.

A single, four-part statement regarding presumably misleading information is harder to form, especially if references are included, but it is also hard to deny or refute, and just by being stated it opens a door to hesitation and investigation.
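For those who think in data structures, the four parts (intention, vector, means, motivation) can be written down as a small record. This is only a minimal sketch in Python: the names `Claim` and `Intention` are hypothetical and do not come from any existing tool or standard.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Intention(Enum):
    MISINFORMATION = "misinformation"   # unintentional error
    DISINFORMATION = "disinformation"   # deliberate attempt to mislead


@dataclass
class Claim:
    """Hypothetical four-part description of presumably misleading information."""
    subject: str
    intention: Intention
    vector: str       # where it was published and by whom
    means: str        # presumed mechanism used to mislead
    motivation: str   # presumed gain or cause of error
    references: List[str] = field(default_factory=list)

    def statement(self) -> str:
        # Assemble the parts into a single sentence that can be discussed.
        return (
            f"{self.subject}: this is a {self.intention.value} {self.vector}, "
            f"produced by {self.means}, motivated by {self.motivation}."
        )


example = Claim(
    subject="DeepFake of Zelensky giving the order to surrender",
    intention=Intention.DISINFORMATION,
    vector="video on a national news channel",
    means="deep generative learning, injected using a cyber-attack",
    motivation="immediate military gain",
)
print(example.statement())
```

Every field is something a reader can ask a reference for, and the `references` list starts out empty on purpose: filling it in is the hard part.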

Which is the opposite of “Fake News”. Which is what we want.

“Fake News” should be considered harmful and its use – abandoned.

What is missing on Mastodon

After 6 years, Mastodon is finally in the news. And for all the right reasons.

And a lot of people in tech have written about what they think about Mastodon and its future, including ones that I think very highly of, notably “MalwareTech” Marcus Hutchins (here and here) and Armin Ronacher (here). Go give them a read, they offer great insight.

Unlike them, I’ve been on Mastodon on and off (but mostly off) since early 2017, in large part because the people I was interested in interacting with were missing there.

Back in 2017 I checked out Mastodon as a potential destination for the then-nascent science Twitter, in case we were kicked off the platform by Trump and his friends for going against their agenda.

It might sound a bit paranoid now, but at the time, NOAA, NSF and EPA archives of climate and pollution data were taken down and a certain Andrew Wakefield was invited as a guest of honor at Trump’s inauguration ceremony. Twitter hadn’t taken any stance on fact-checking, harassment of scientists or highly vocal antivax propaganda (the Vaxxed movie was, for instance, advertised openly and widely there). Facebook was going to roll with whatever those in power wanted it to do anyway, and for pretty busy MDs and scientists, figuring out yet another forum/social network was going to be a hassle. And IRC for sure was out of the question.

In the save-the-scientific-data, anti-scientific new world order we were going to fight against, Mastodon instances would be indexes of torrent magnet links for scientific datasets and of their hashes, to prevent tampering and flooding. Complete with VPN, GPG hash re-signing, and hash registration on the Bitcoin blockchain (I couldn’t believe NFTs were not actually that when I first heard of them).

In hindsight, we might have gone full cyberpunk. And yet, we might have just been ahead of our time.

Five years later, an aggressively anti-science billionaire has taken over Twitter and is kicking anyone he doesn’t like off his platform.

The Great Mastodon Migration(s)

With a lot of people among my contacts now seeing the writing on the wall for Twitter, the issue of Mastodon being empty is now solved for me. Most of the people I followed and interacted with on a day-to-day basis on Twitter are now on Mastodon too, mostly interacting in the same way.

Most. And mostly.

Like many tech people and early adopters, I am a bit different from the average user. I am no Linus Torvalds, Steve Wozniak, Dennis Ritchie, or even your middle-of-the-road hacker. Bit-by-bit inspection of compiled code to debug or find an exploit is not my idea of a good time, but I am still able to jump on a new tech or platform and mostly make heads or tails of it, even if it looks off and I need to crawl Stack Overflow, Reddit and man pages to figure out how to do what I want with it.

Needless to say, I am in a minority. A lot of Twitter’s features were getting in the way for me. For instance, my timeline has been almost exclusively in chronological order, and I had add-ons to unfold Twitter thread storms into trees to be able to follow them and track the people I talked to, because Twitter’s prioritization failed miserably.

But that also meant that Mastodon was a frictionless transition for me. It already worked the way I wanted Twitter to work, except better – more granular control over post visibility, no surprise sponsored posts, and a simpler verification process (rel="me" FTW!).

It was anything but frictionless for a lot of my contacts, who struggled to figure out what it was and how it worked.

Roadblocks in the fediverse

Trust in instances.

Twitter is simple. You say to people you are on Twitter, you give them your handle (@JohnDoe) and they can go on Twitter and follow you. @JohnDoe1, @JohnDoe87, @JohnD0e will be other people and you will not be following and interacting with them. With a bit of luck, you are visible enough to earn a blue checkmark and people looking for you will be able to distinguish your account from all the others.

On Mastodon, things are much less clear if you think like a Twitter user.

There is a @JohnDoe@mastodon.social, there is @JohnDoe@infosec.exchange, there is @JohnDoe@SanFrancisco.city. Are they all the same person? If not, which one is the one you need?
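One way to see why these are three different accounts: a fediverse handle identifies someone only as a (username, instance) pair, much like an email address. A minimal sketch, where `Handle` and `parse_handle` are hypothetical helpers written for this illustration and not part of Mastodon’s API:

```python
from typing import NamedTuple


class Handle(NamedTuple):
    username: str
    instance: str


def parse_handle(handle: str) -> Handle:
    """Split '@user@instance' into its two parts (hypothetical helper)."""
    username, sep, instance = handle.lstrip("@").partition("@")
    if not username or not sep or not instance:
        raise ValueError(f"not a full fediverse handle: {handle!r}")
    # Domain names are case-insensitive, so normalize the instance part.
    return Handle(username, instance.lower())


a = parse_handle("@JohnDoe@mastodon.social")
b = parse_handle("@JohnDoe@infosec.exchange")
print(a.username == b.username)  # same username...
print(a == b)                    # ...but not the same account
```

A bare Twitter-style “@JohnDoe” is rejected here because, without the instance, it does not say which account is meant.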

If you want to start your account, what does choosing an instance change?

  • Which ones will protect your private information, and which ones will get hacked?
  • Which ones will allow you to follow your friends and which ones will get de-federated? Or block instances your friends are on?
  • What are the rules of your instance, and who is enforcing them?
  • Can you criticize an oppressive government? Does your instance admin have enough legal protection to keep you safe against a small dictatorship?
  • Can you post gore from a recently bombed city to raise awareness, or will it get you kicked or your instance – defederated?
  • How reliable will the service be? Will you be able to connect when you want to? Or is it going to be a FailPachyderm 24/7?

Over the years, Twitter built themselves a reputation, which made clear what users could expect. At least until Musk nuked it in about 7 days, re-nuking it about once a day since, just in case.

For Mastodon, things are more complicated – basically every instance is its own “mini-Twitter”, and when push comes to shove it’s not clear if they will stand together or fall one by one. There is some trust towards the largest instance, “mastodon.social”, run by the Mastodon non-profit itself, but it has no means to scale at the speed new users arrive, even less to moderate them all. That’s not how Mastodon or the Fediverse are supposed to work to start with.

And the problem is that those questions are questions of life and death for opponents of oppressive regimes, citizens trying to survive them, or soldiers on the battlefield. The lives of protesters in Iran depend on whether the authorities can get to their real names – by injunction or by hacking. The same goes for women seeking an abortion in red states in the US. The safety of a Ukrainian soldier’s unit depends on whether the instance removes the metadata from images and reminds them about OPSEC and blurring.

Those people specifically are missing from Mastodon and are still on Twitter, or are mistakenly moving to Telegram, which a lot of oppressive regimes can easily track.

Part of it is educating users and changing the mentality from “it’s Twitter, but for hipster hackers” to “it looks like Twitter, but it’s more like email”.

Part of it is actually addressing the structural issues with instances right now. A lot of people I’ve heard touch on the topic believe it’s not possible without corporate instances making enough profit to protect their users.

I disagree.

Non-profits like EFF, La Quadrature du Net and ACLU have been set up specifically to help small organizations with user interests in mind stand up to and fight large organizations with the opposite of user interests in mind. The Mozilla, Apache, Linux and Python foundations have been able to provide real-time critical maintenance and support, making their products safe to use and deployed with an excellent safety record.

There is no reason those players and foundations couldn’t bring the Mastodon foundation up to speed and provide it with an instance vetting/certification process and umbrella coverage for the vetted instances. It won’t be pre-Musk Twitter, but it might be for the better.

Basically, EFF, ACLU and someone like Mozilla need to combine their powers to create trusted instances, and someone like Wikipedia will need to give them a primer on moderation.

Search.

Mastodon does not allow full-text search. Period. That’s by design. You can search for users or hashtags, and you can do full-text searches of your own posts.

You can’t however do a full-text search of your own instance and much less of the fediverse. Once again, that’s by design.

And if you remember the neo-Nazi harassment campaign on Twitter against “Jewish” accounts back in 2016, with their (((@<account>))), you will agree it makes sense. Besides, fuzzy full-text indexing and searching are rather expensive operations. In the absence of personal data scouring for info that could be used to target ads, it makes no business sense to have it.

However, it also makes Mastodon useless for a lot of people who relied on Twitter to do their job – journalists, scientists, malware analysts, or even critical emergency responders.

Journalists were able to zero in on an event or find new information – from images posted by some of the January 6, 2021 Capitol rioters to finding the people combining in-depth, engaging writing with deep expertise in topics such as COVID. Or they could find images of war crimes such as the Bucha massacre posted in real time and validate them to make a timely story.

Scientists were able to find people talking about their latest paper or preprint and either address the shortcomings or get a better idea of what to do in the future. Or, alternatively, look for valid criticism of papers they were going to ask their PhD students to base their work on. With a number of scientific fraud sleuths on Twitter, there was a good chance that a search like that could lead to a project adjustment and save years for the student and hundreds of thousands in funding.

Similarly, malware analysts could do a deep dive into mentions of CVEs or breach numbers to find ideas others had about patching the system or reconfiguring the network to decrease their vulnerability to attacks.

But perhaps the most critical is emergency incident response. With Twitter, people tweeted about hurricanes hitting them and destroying their houses, cutting water and power, earthquakes they felt, tornadoes that removed their neighbor’s house, smell of gas, symptoms of a disease – you name it. Those tweets became essential to the assessment of the situation and decision-making for search and rescue operations. And given that no one in their right mind would be adding hashtags to their tweets while overwhelmed with feelings and fear, and surely not spelling them right, full-text search was essential to the responders’ work.

All of those applications are an unquestionable social good and were made possible by Twitter’s full-text search. Mastodon needs to find a way to replicate such search for those applications, even if based on clearances or specific terms.

Commercial content

Another social good that Twitter unintentionally brought with it was corporate accountability. Thanks to the search and open APIs (at least for a relatively moderate cost), commercial companies could track their customers’ sentiment and feedback and jump on anything that could be a request for support.

While it led to several comical interactions and to abuse from MLM and web 3.0 Ponzi schemes, the public visibility of their reaction definitely led companies to get moving, lest they lose customers due to rumors of bad service and bad customer service.

Moreover, a lot of consultants, authors and freelancers lived and died by their Twitter reputation and engagement. LinkedIn is for making pompous announcements in corporatespeak. Twitter is about the “here’s how you do it kids, and here’s the reasons why doing it in that other way is a bad idea”. It was a place to show and prove competence in the domain and get visibility to people who would provide them with contracts and ultimately income. Twitter allowed people more independence and let them better put forward their expertise.

However, the reason they could do it is that the decision-makers with budgets in their domains were already on Twitter, even if it was for hot takes about the latest sportsball event or to follow a celebrity in hopes of interacting with them.

The reality is that commercial content is part of everyone’s life and the way the overwhelming majority of people make money. Mastodon cannot stand on its own if it doesn’t provide a space to talk about it, a space for commercial players to engage in at least some form, and a way for people to reach commercial content in at least some form.

Rules by which commercial companies operate are radically different from FOSS. They need predictability, reliability and protection against impersonation (the whole reason behind Twitter blue checkmarks). There are companies that can do both, but they are few and far between. And tbh, it’s mostly Red Hat.

Mastodon needs its own Red Hat to emerge and will need to figure out the conditions under which the federated instances will let commercial entities come into the fediverse if it is to stand on its own as a social network.

Context

The Mastodon home timeline is confusing. At least in the base web interface.

You are greeted with tidbits of unrelated conversations without the ability to immediately identify which threads they fit in, who that person your friends are boosting is, or what is behind that link they are sharing. The timeline just lacks the context needed to fully leverage it.

The <First Tweet> … <Tweets from your friends you haven’t seen yet> feature on Twitter sucked big time – in part because it was impossible to find what the … actually were about unless you got lucky. But it at least provided some context to understand what you were jumping back into.

Similarly, the expansion of URLs on Twitter into a square that at least includes the header and a short excerpt of the abstract was not without its downsides, but it provided enough context for you to understand what the link was about and decide whether to click on it.

Finally, the show-info-on-hover for accounts was quite vital to figure out how trustworthy/competent the person behind the post was. Especially once combined with checkmarks (no matter how problematic they were), which made it possible to tell whether the person really was who they said they were.

Speaking of validation, while validation works on Mastodon with the rel="me" link attribute, and could be improved with dedicated instances, both are prone to look-alike domain squatting. De facto trust in Web 2.0 is provided by platforms (e.g. Google search), making sure you are landing on your bank’s site rather than a lookalike built for phishing. Mastodon will need to figure something out, simply because the name- or even domain-based trust scheme of Web 1.0 is too unsafe for the vast majority of users, no matter how tech-savvy.

Similarly, a blue checkmark is not sufficient context. Yes, an MD with a degree in pandemic preparedness, virology and contagious disease epidemiology is a good person to listen to regarding COVID news and opinions. An MD with a gastroenterology degree and obesity epidemic expertise is probably not. Yet neither a blue checkmark nor an instance name is enough to distinguish them. There is a need to provide more context about people and their expertise that’s visible at a glance.

Mastodon will need to figure out how to provide enough context while not making editorial decisions and leaving the fediverse free, be it with regards to just helping the user or ensuring safety in high-stakes applications.

It might mean that dedicated high-visibility validated instances (EFF/Quadrature/ACLU/Mozilla) will take on an outsized importance in the fediverse. It might mean commercial instances. It might mean rules about name proximity. But it will need to be figured out.
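To make the mechanism and its weak spot concrete: the idea behind rel="me" is that a page you control links back to your profile, and that link-back is what gets checked. Below is a minimal offline sketch of that check using only Python’s standard library. `RelMeParser` and `links_back` are hypothetical names, and the details of Mastodon’s real verification (fetching the page, URL normalization and so on) are deliberately not reproduced.

```python
from html.parser import HTMLParser


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> tags whose rel includes 'me'."""

    def __init__(self):
        super().__init__()
        self.rel_me_links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        # rel is a space-separated list of tokens; a missing value becomes "".
        rel_values = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "me" in rel_values and href:
            self.rel_me_links.append(href)


def links_back(page_html: str, profile_url: str) -> bool:
    """True if the page contains a rel="me" link to exactly profile_url."""
    parser = RelMeParser()
    parser.feed(page_html)
    return profile_url in parser.rel_me_links


page = '<a rel="me" href="https://mastodon.social/@JohnDoe">Mastodon</a>'
print(links_back(page, "https://mastodon.social/@JohnDoe"))
# Look-alike instance name: capital "I" in place of the final lowercase "l".
print(links_back(page, "https://mastodon.sociaI/@JohnDoe"))
```

The check only proves that whoever controls the page also controls the profile. It says nothing about whether the page’s own domain is the real one or a look-alike, which is exactly the gap described above.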

Algorithm(s)

The Algorithm is seen as **evil** by the better part of civil liberties advocates, including Mastodon developers and community. The whole promise of Mastodon is to remove the algorithmic censorship of free speech large corporations inevitably put in place and just let you see everything your friends are posting, in the order they are posting.

And I understand that stance. I’ve run my Twitter feed sorted by newest first most of the time, even after they introduced algorithmic prioritization. It mostly worked for me.

Mostly.

As long as I was following people that were tweeting approximately the same amount of equally important information, it worked well.

It went out of the window as soon as some of them went on a tweeting spree and basically flooded the timeline with retweets of tweets announcing their next event (book signing), or reactions to their latest blog post, or support for their side in their latest flame war. Good for them, but while scrolling past the tweets of no interest to you, chances were you were going to skip something important or critical – like an announcement from a close friend of their wedding, promotion, or a prize.

The important signal got drowned in chaff, which meant that even I, with my measly 800 accounts followed, had to regularly switch to the algorithmic Twitter timeline to catch up on anything I could have missed over the last 4-5 days. For people following the same number of accounts in a more professional setting, the whole chronological timeline becomes an insanity – a non-stop Niagara of new posts rapidly appearing and disappearing as other posts come to take their place. No human can process it all, especially if they spend only 15-30 minutes a day on that social media.

That’s why prioritization algorithms became popular with users in the first place.

However, algorithms developed by Twitter/Facebook/Google/YouTube/… don’t serve the interests of their users. In the web of ads, attention, engagement and retention are everything. Algorithms built by companies are there to serve the interest of companies first and foremost, user well-being or even safety be damned.

The web of gifting is free of that pressure, but also is lacking resources to develop, train and deploy SotA ML solutions. However SotA is usually not necessary and the benefit of even basic recommendation algorithms is so high that on several occasions I considered writing an independent Twitter client just to have a prioritization algorithm that worked the way I wanted it to.

On Mastodon, it would at least be masking boosts of posts I’ve seen before, or earlier posts in a thread I’ve already read by starting from one of the most recent posts. Yes, you can implement your own client doing it the way you want, but that’s just not realistic for the vast majority of users – even highly tech-savvy ones. We need a solution about as easy as an “algorithm store”, be it more like pip, apt or an App Store. Algorithms can easily be developed by users and shared among them, either for personal use or for distributed privacy-preserving learning.

Mastodon just needs support for personalized algorithms and a way to distribute them and let users choose which ones they want to run.
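To show how small such a user-side algorithm can be, here is a minimal sketch of the masking described above: hide boosts of posts already seen, and hide earlier posts of a thread already read from its latest post. The `Post` record and both functions are hypothetical and not tied to the Mastodon API; a real client would map API responses onto something similar.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Post:
    id: str
    author: str
    boost_of: Optional[str] = None   # id of the original post, if this is a boost
    reply_to: Optional[str] = None   # id of the parent post in a thread


def mark_thread_read(post_id: str, posts_by_id: Dict[str, Post], seen_ids: Set[str]) -> None:
    """Mark a post and all of its known ancestors in the thread as seen."""
    visited: Set[str] = set()
    current_id: Optional[str] = post_id
    while current_id and current_id not in visited:  # guard against cycles
        visited.add(current_id)
        post = posts_by_id.get(current_id)
        current_id = post.reply_to if post else None
    seen_ids.update(visited)


def hide_already_seen(timeline: Iterable[Post], seen_ids: Set[str]) -> List[Post]:
    """Keep only posts whose underlying content has not been seen yet."""
    seen = set(seen_ids)  # work on a copy, the caller's set is left untouched
    visible = []
    for post in timeline:
        original_id = post.boost_of or post.id
        if original_id in seen:
            continue
        seen.add(original_id)  # also hides repeated boosts of the same post
        visible.append(post)
    return visible


posts = [
    Post("1", "alice"),
    Post("2", "alice", reply_to="1"),
    Post("3", "alice", reply_to="2"),
    Post("4", "bob", boost_of="3"),
    Post("5", "carol"),
]
by_id = {p.id: p for p in posts}
seen: Set[str] = set()
mark_thread_read("3", by_id, seen)  # the thread was read from its latest post
print([p.id for p in hide_already_seen(posts, seen)])
```

Nothing here needs training data or server-side support, which is the point: a function of this shape is the kind of thing an “algorithm store” could distribute and a client could run locally.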

Moderation

Scaling moderation is a hard task. As of now, the fediverse has managed to do it with de-federation and within-instance moderation of federated instances. And with a lack of valuable targets for harassment.

It is a good solution for a world of small instances and a fediverse with a reasonable number of instances. With mega-instances, such as mastodon.social, now at 90 000 users, moderation becomes unreasonable and smaller instances are already de-federating.

As users keep migrating into the fediverse (and I do believe they will), there likely won’t be enough separation of users by interests and communities to avoid mega-instances, especially in contexts where the likely moderators would impede free speech (eg academic supervisors in academic instances preventing students from warning one another about the behavior of some top-level academics).

The moderation of large instances will become a big problem.

Twitter was walking a thin line between moderating to the point of editorializing and letting abuse run rampant. Arab Spring protesters were on Twitter in 2011 because it didn’t budge on its dedication to protecting users against abusive regimes – something on which Microsoft, Google and Facebook didn’t hesitate to budge. That’s why the Arab Spring happened on Twitter. In 2013, people who were vocal about their hatred of companies advertising on Twitter were once again allowed to be vocal rather than de-amplified or outright silenced. That’s how Twitter became the platform to go to with complaints about service or experience, be heard and get priority support. With rampant abuse by state actors seeking to manipulate public opinion in 2015 and 2016, anti-science disinformation campaigns against vaccines, climate change and pollution in 2016-2019, then on COVID starting from 2020, and finally Trump’s call to insurrection, moderation became more and more difficult and politicized, until fact-checking billionaires brought about the demise of Twitter.

However, the most important reason Twitter was credible in its actions was that it opened itself to external supervision. While Facebook and Google fought against anyone willing to have a look into what was going on on their platforms (starting with their own employees), even if it meant being complicit in a genocide, Twitter opened its platform to researchers basically for free, providing top-tier data access usually reserved for internal use or trusted partners. Even if it meant a deluge of reports highlighting hate speech, narrative twisting and information operations, Twitter allowing it and sometimes acting on it preserved public trust in Twitter as a platform, while that trust was eroding for pretty much every one of its competitors. It all led people to believe Twitter was a hellsite. In reality, it was better than the others; it just wasn’t censoring reports about the problems it might have had.

The mechanisms driving those issues haven’t gone anywhere. They are still here and will start impacting Mastodon and fediverse as it grows.

FOSS doesn’t have billions in ad, premium or VC money to throw at the problem like online giants do.

However, it is able to leverage the goodwill and the gift of time from its users as volunteers to achieve the same thing.

Wikipedia, above all, achieved ubiquitous status as the final arbiter of truth on the internet, in large part thanks to the moderation model it is running, letting people argue with facts, publicly available information and academic writing until they get to a stalemate that’s pretty representative of the scientific consensus or public knowledge about facts.

It has pretty big issues, notably with the representation of women and minorities, or with the coverage of minority narratives. It still did a better job than most large platforms, to the point that they started using Wikipedia information in their own moderation decisions.

But Mastodon is not Wikipedia. The fediverse will need to keep figuring out its moderation rules, especially as the stakes keep rising as more and more users join it and larger and larger instances emerge.

Hitting the main street.

I am optimistic about Mastodon and the fediverse overall, in large part because it’s a protocol and not a walled garden.

And also perhaps because I really want it to work out.

For all its shortcomings, pre-Musk Twitter was a great tool and in a lot of ways made transformations in this world possible, ranging from democratic revolutions to people just getting better customer service.

And for all its past greatness, Twitter to me is now dead, because what made it so unique – a trusted moderation team – is now gone and will not be coming back.

For me personally, the mix of pretty much everyone on there – ranging from scientific colleagues to infosec and disinformation experts, to OPSEC experts, to journalists and columnists – allowed me to keep a finger on the pulse of the world, professionally, and to have the best insight possible not only into the current state, but also into the future, often further than most other news outlets would have allowed me to.

Unlike a lot of my colleagues, I don’t think that the lack of average Joes on Mastodon is such a big issue for scientific outreach. After all, most people don’t listen to some random dude on the internet. They listen to their local opinion leader, someone they know to be knowledgeable in their domain, and whose opinion they think extends to other domains. If those people are connected, the outreach still works out.

I also don’t see people coming back to Twitter if the management has a sudden change of heart or a new management comes in. Trust is really slow to build and is easily lost. There are no guarantees the new Twitter won’t go deleting users’ critical comments, banning accounts on a whim or performing algorithmic manipulations. People in tech are painfully aware of it; journalists are becoming more and more aware; and the general public, who don’t care about it, also don’t see any advantages to Twitter over Facebook, TikTok, Instagram or a myriad of other, more engaging and less serious social media.

Similarly, unlike a lot of my contacts over on Twitter, I don’t think that social media being commercial is inevitable. I see Mastodon evolving and becoming a healthier alternative to existing social media, keeping the good things about them and dropping the less good ones. FOSS worked out in the past; there is no reason it has to fail now.

In the end, thanks to Musk, Mastodon is now alone in the field, ready to grow and provide people with a better alternative. The only thing that could undermine its growth is itself – its users and its developers.

But that’s a part of the deal. I certainly hope they will find a way forwards and be willing to accept change, no matter how scary.

Web 3.0 is dead. Why was it even alive?

(This post is part 1 of a series trying to understand the evolution of the web and where it could be heading)

For most of the people in tech, the blockchain/crypto/NFTs/Web 3.0 have been an annoying buzz in the background for the last few years.

The whole idea of spending a couple of hours to (probably) perform an atomic commit into a database while burning a couple of millions of trees seemed quite stupid to start with. Even stupider when you realize it’s not even a commit, but a hash of hashes of hashes of hashes confirming an insertion into a database somewhere.
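For readers who have never looked under the hood, the “hash of hashes” can be sketched in a few lines of Python. This is a toy, assuming a made-up block layout (`prev`, `record`, `nonce`), a made-up `GENESIS` constant and a ridiculously low difficulty; it is not how any real blockchain serializes or validates its blocks.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the very first block


def block_hash(block):
    """SHA-256 over a canonical JSON dump of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(prev_hash, record, difficulty=2):
    """Brute-force a nonce until the hash starts with `difficulty` zeros."""
    nonce = 0
    while True:
        block = {"prev": prev_hash, "record": record, "nonce": nonce}
        digest = block_hash(block)
        if digest.startswith("0" * difficulty):
            return block, digest
        nonce += 1  # this loop is the "work" that burns the electricity


def verify(chain, difficulty=2):
    """Check that each block points at the previous hash and was mined."""
    prev = GENESIS
    for block, digest in chain:
        if block["prev"] != prev:
            return False
        if block_hash(block) != digest:
            return False
        if not digest.startswith("0" * difficulty):
            return False
        prev = digest
    return True
```

Each block only stores the hash of the previous one, so changing an old record invalidates every hash after it. That is the whole trick; everything else is the cost of making the brute-force loop expensive enough.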

Financially speaking, the only business model of the giants of this new web seemed to be either slapping that database on anything it could be slapped on and expecting immediate adoption, or a straight-out Ponzi scheme (get it now, it’s going to the moon, tomorrow would be too late!!1!1!).

Oh, and the cherry on top is that the whole hype machine around Web 3.0 proceeded to blissfully ignore the fact that Web 3.0 had already been popularized – and by none other than Tim Berners-Lee almost a decade ago, referring to the semantic web.

Now that the whole house of cards is coming crumbling down, the sheer stupidity of crypto Web 3.0 seems self-evident.

And yet, VCs who should have known better have poured tens of billions into it, and developers with stellar reputations have also jumped on the hype train. People who were around for Web 1.0, the .com bust, Web 2.0, and actually funded and built the giants for all of them seemingly drank the Kool-Aid.

Why?

Abridged opinionated history of the InterWebs

Web 1.0

The initial World Wide Web was all about hyperlinks. You logged into your always-on machine, wrote something in the www folder of your TCP/IP server directory, about something you cared about, while hyper-linking to other people’s writing on the topic, and at the end hit “save”. If your writing was interesting and someone found it, they would read it. If your references were well-curated, readers could dive deep into the subject and in turn hyperlink to your writing and to hyperlinks you cited.

If that sounds academic, it is because it is. Tim Berners-Lee invented Web 1.0 while he was working at CERN as a researcher. It’s an amazing system for researchers, engineers, students, or in general a community of people looking to aggregate and cross-reference knowledge.

Where it starts to fall apart, is when you start having trolls, hackers, or otherwise misbehaving users (hello.jpg).

However, what really broke Web 1.0 was when it got popular enough to hit the main street and businesses tried to use it. It made all the sense in the world to sell products that were too niche for a physical store – eg. rare books (hello early Amazon). However, it was an absolute pain for consumers to find businesses on the internet and evaluate their reputation.

Basically, as a customer looking for a business, your best bet was to type what you wanted into the address bar followed by a .com (eg VacationRentals.com) and hope you would get what you wanted or at least a hyperlink to what you wanted. As a business looking for customers, your best bet was in turn to try to get the urls people would likely be trying to type looking for the products you were selling.

Aaaand that’s how we got a .com bubble. The VacationRentals.com mentioned above sold for 35M at the time, despite the absence of a business plan, except for the certainty that someone will come along with a business plan and be willing to pay even more, just to get the discoverability. After all, good .com domains were in a limited supply and would certainly only get more expensive, right?

That definitely doesn’t sound like anything familiar…

Google single-handedly nuked that magnificent business plan.

While competing search engines (eg AltaVista) struggled either to parse natural language queries, to figure out the best responses to them, or to keep up to date with the rapidly evolving Web, Google’s combination of PageRank to figure out site reputability and bag-of-words matching to find the sites the user was looking for Just Worked(TM). As a customer, instead of typing a url into an address bar you could just google it, and almost always one of the first 3 results would be what you were looking for, no matter the url insanity (eg AirBnB.com). Oh, and on top of it, the polishing of their algorithms made sure that you could trust your results to be free of malware, trolls, or accidental porn (thanks to Google’s company porn-squashing Fridays).
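A rough idea of why that combination worked can be given with a toy Python sketch: a textbook power-iteration PageRank over a tiny link graph, multiplied by a naive bag-of-words overlap. The function names, the example sites and the way the two scores are combined are my assumptions for the illustration, not Google’s actual ranking code.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank; `links` maps a page to its outlinks."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outlinks spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


def bag_of_words_match(query, text):
    """Fraction of query words present in the text; word order is ignored."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def search(query, documents, links):
    """Rank matching pages by (word overlap) x (link-based reputation)."""
    rank = pagerank(links)
    scored = [
        (bag_of_words_match(query, text) * rank.get(url, 0.0), url)
        for url, text in documents.items()
    ]
    return [url for score, url in sorted(scored, reverse=True) if score > 0]
```

In this sketch the url itself plays no role in the ranking: a page that other pages link to outranks a page with the perfect domain name that nobody links to, which is exactly what made premium .com domains lose their value.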

As a business, your amazing .com domain was all of a sudden all but worthless and what really mattered was the top spot in Google search results. The switch of where discoverability budgets were going led to the .com bubble bust, and Google’s AdWords becoming synonymous with marketing on the internet. If you ever wondered how Google got a quasi-monopoly on ads online, you now know.

While Google solved the issue of finding businesses (and incidentally knowledge) online, the problem of trust remained. Even if you got customers to land on your website, it was far from guaranteed that they would trust you enough to share their credit card number or just wire money to an account they’d never seen before.

A Web of Trusted platforms emerged around and in parallel to Google but really became mainstream once the latter allowed the products on them to be more easily discovered. Amazon with customer reviews, eBay, PayPal, … – they all provided the assurance that your money wouldn’t get stolen, and that if the product didn’t arrive or wasn’t as advertised, they would take care of returns and reimbursement.

At this point, the two big issues with Web 1.0 were solved, but something was still missing. The thing is that for most people it’s the interaction with other people that matters. And with Web 1.0 having been built for knowledge and businesses, that part was missing. Not only was it missing for users, but it was also a business opportunity, given that word-of-mouth recommendations are still more trusted than ones from an authority – no matter how reputable – and that users reveal much more about their interests while talking to friends than with their searches.

Web 2.0

If Web 1.0 was about connecting and finding information, Web 2.0 was all about building walled gardens providing a “meeting space” while collecting the chatter and allowing businesses to slide into the chatter relevant to them.

While a few businesses preceded it with the idea (MySpace), it was really Facebook that hit the nail on the head and managed to attract enough of the right demographic to hit the jackpot with brands and companies wanting access to it. Perhaps part of the success of the virality was the fact that they started with the two most socially anxious and least self-censoring communities: college and high-school students. Add the ability to post pictures of the parties, tag people in them, an instant chat that was miles ahead of any competition when it came to reliability (cough MSN messenger cough), sprinkle some machine learning on top and you had hundreds of millions of pairs of attentive, easy-to-influence eyeballs, ready to sell to advertisers with little to no concern about users’ privacy or feelings (Are you missing ads for dating websites on FB when you changed your status to “it’s complicated”? Or ads for engagement rings if you changed it to “in a relationship”?).

However, the very nature of Facebook also burnt out its users and made them leave for other platforms. You ain’t going to post your drunken evenings for your grandma to see. Nor will you want to keep in touch with that girl from high school who went full MLM. Or deal with the real-life drama that was moving onto that new platform: nightly posts from your alcoholic uncle about how much he hated his wife, your racist aunt commenting on your cousin’s pictures that included a black guy in a group, …

And at that point Facebook switched from the cool hip kid to the do-all weirdo with a lot of cash, that kept buying out newer, cooler platforms – Gowalla, Instagram, WhatsApp, VR, … At least until anti-trust regulation stepped in when they tried to get to Snapchat.

Specialized social media popped up everywhere Facebook couldn’t or wasn’t yet moving into. LinkedIn allowed people to connect for their corporate needs. Yelp, Google Maps, and Foursquare allowed people to connect over which places to be. Tumblr, Pinterest, 9gag, and Imgur allowed people to share images – from memes to pictures of cats and dream houses. Or just generally be weird and yet engaging – such as Reddit and Twitter.

However, most of the social networks after Facebook had a massive problem – generating revenue. Or more specifically attracting businesses into advertising on them.

The main reason for that is that to attract enough advertisers, they need to prove they have enough reach and that their advertisement model is good enough to drive conversions. And for that, you need a maximum outreach and a maximum amount of information about the people your advertisement goes to. So unless you are going with Facebook and Google, you either need to have a really good idea of how a specific social media works or you pass through ad exchanges.

In other terms, you and your users are caught in a web of ads and as a social media, you are screwed both ways. Even if you are rather massive, there might not be enough advertisers to keep you afloat with your service still remaining usable. Despite its size and notoriety, Twitter failed to generate enough revenue for profitability, even 16 years in. YouTube, despite being part of the Google ecosystem is basically AdTube for most users. If you pass through ad exchange networks, you are getting a tiny sliver of ad revenue that’s not sufficient to differentiate yourself from others or drive any development. Oh, and nothing guarantees the ads that will get assigned to you will be acceptable to your userbase and not loaded with malware. Or conversely that the advertisers’ content won’t show up next to content they don’t want to be associated with their brand, at all.

The problem is not only that this model had razor-thin margins, but that it was also increasingly under threat from the rising awareness of privacy issues surrounding the ad industry, and of the use of information collected for ads against the users themselves. Not only among reclusive German hackers concerned about the NSA in the wake of Snowden’s revelations but also among anyone paying attention, especially after the Cambridge Analytica scandal.

Web of Taxed Donations (or world of microtransactions) was the next iteration to try to break out of it.

A lot of social media companies noticed through their internal analytics that not all users are equal. Some attract much more attention and amass a fandom (PewdiePie). Some of those fans are ready to pay some good money to get the attention of their idol (Bathwater). Why add the middleman of ads they don’t want to see anyway rather than allow them to do direct transfers and tax them? Especially if you are already providing them a kickback from ad revenue (because if you aren’t, they are leaving and taking their followers with them)?

Basically, it’s like Uber, but for people already doing creative work in cyber-space (content generators). Or, for people more familiar with gaming – microtransactions. Except instead of a questionable hat on top of your avatar, you are getting a favor from your favorite star. Or remove an annoyance in the way between you and them, such as an ad. Or get a real-world perk, such as a certificate or a diploma.

EdX started the move by creating a platform to match providers and consumers for paid online classes and degrees. YouTube followed by adding Premium and Super Chats, Twitch – subscriptions and cheers, and Patreon decided to roll in and provide a general way to make direct donations. However, it was perhaps OnlyFans, completely abandoning any ads in favor of direct transactions, that really drove the awareness of that income model (and accelerated Twitter’s downfall), while Substack did the same for high-quality writing.

The nice thing about that latter model is that it generated money. Like A LOT of money.

The less nice thing is that you have to convince creators to come to your platform, competing with platforms offering a better share of revenue, and then you need to convince people to sign up / purchase tokens from your platform. All while competing with sources funded by ad revenue and the money people spend on housing, food, transportation, …

Basically, as of now web 2.0 is built around convincing the user to generate value on your platform and then taxing it – be it by them watching ad spots you can sell to businesses, or directly loading their hard-earned cash onto your platform, to stop annoyance, show they are cooler or donate it to creators.

Now, if we see the web as a purely transactional environment and strip it of everything else, you basically get a “bank” with “fees”. Or a “distributed ledger” with “gas fees”.

Crypto Web 3.0

And that’s the promise of Crypto Web 3.0. The “big idea” is to make any internet transaction a monetary one. A microtransaction. Oh, you want to access a web page? Sure, after you pay a “gas fee” to its owner. You want to watch a video? Sure, after you transfer a “platform gas fee” for us to host it and a “copyright gas fee” to its creator.

Using cryptographic PoW-based blockchain for transactions when the “creators” were providing highly illegal things was far from stupid. You can’t have a bank account linked to your identity in case someone will rat you out, or the bank realizes what you are using money for. Remember Silk Road? Waiting a couple of hours for the transaction to clear and paying a fee on it that would make MasterCard, Visa and PayPal salivate was an acceptable price to get drugs delivered to your doorstep a couple of days later. Or get a password to a cache of 0-day exploits.

However, even in this configuration, the crypto had a weak point – exchanges. Hackers providing a cache of exploits have to eat and pay rent and electricity bills. Drug dealers have to pay the supplier, but also eat and live somewhere.

So you end up with exchanges – places that you trust would give you bitcoins someone is selling for real cash and when you want to cash out they will exchange them for real money. Basically Banks. But with none of the safety a bank comes with. No way to recover stolen funds, no way to discover the identity of people who stole them, no way to call Interpol on them, no insurance on the deposits – nothing.

You basically trust your exchange to not be the next Mt. Gox (good luck with that) and you accept it as a price of doing shady business. It’s questionable, and probably a dangerous convenience, but it is not entirely dumb.

What is entirely dumb, however, is trying to push that model mainstream. Those hours to clear a transaction? Forget about it – people want their coffee and bagel to go delivered in seconds. Irreversible transactions? Forget it – people send funds to the wrong address and get scammed all the time, they need a way to contest charges and reverse them. Paying 35$ in transaction fees to order a 15$ pizza? You better be kidding, right? Remembering seed phrases, tracking cold wallet compatibility, and typing 128 char hexadecimal wallet addresses? Forget it, people use 1234546! as a password for a reason and can’t even correctly type an unfamiliar name. Losing their deposited money? Now that’s absolutely out of the question, especially if that’s any kind of more or less serious account.

So basically you end up with “reputable” major corporations that do de-facto centralized banking and promise that in the background they do blockchain. At least once a day. Maybe. Maybe not, it’s not like there are any regulations to punish them if they don’t. Nor do they come with any of the guarantees of standard banks – after all, the transactions are irreversible, frauds and “hacks” happen and it’s not like they are covered by the FDIC 250k insurance. Oh, and they are now unsuited for any illegal activity because regulators can totally find, reach and nuke them (cf Tornado with Russian money after the start of war in Ukraine).

Oh, and on top of all of that, every small transaction still burns about a million trees worth of CO2 emissions.

Pretty dumb.

But then it got dumber.

The NFTs.

Heralded as the poster child for everything good there was going to be about Web 3.0, they were little more than url pointers to jpeg images visible to anyone, giving no enforceable rights of ownership. Just an association of a url to a wallet saved somewhere on a blockchain. Not like it was benefitting the artist either – basically the money goes to whoever mints them and puts them up for auction, artist rights be damned.

And yet somehow it worked. At the hype peak, NFTs were selling for tens of millions of dollars. Not even 2 years later, it’s hundreds of USD, at best. Not very different from beanie babies. But probably people getting them weren’t around for the beanie babies craze (made possible by eBay btw).

And VCs in all of that?

Silicon Valley invests in a lot of stupid things. It’s part of the VC mentality over there. And I don’t mean it in a bad way. They start out on the premise that great ideas look very stupid at first, right until they don’t. Very few people would have invested in a startup run by a 20-year-old nerd as a tool for college bro creeps to figure out which of their targets was single, or just got single, or was feeling vulnerable and was ripe to be approached. And yet when Facebook started gaining traction on campuses, VCs dumped a bunch of money into it to create the second biggest walled garden of Web 2.0 (and enable state-sponsored APTs to figure out ripe targets for influence operations).

However, Silicon Valley VCs are also aware that a lot of ideas that look very stupid at first are actually just pretty stupid in the end. That is why they readily invest a little, and not infrequently a lot, expecting 95-99% of their portfolio to flop.

And yet despite the shitshow FTX was, it still got 1B from VCs (if I am to trust Crunchbase, at a 32B valuation).

Perhaps it’s Wall Street golden boys turned VCs? Having fled investment banking after the 2008 collapse, they had the money to find a new home in Silicon Valley, make money with fintech and move on to being VCs themselves?

That would make sense – seeing crypto hit the main street during its 2017 peak, and seeing no regulations in place after it all crashed, the temptation to pull every single trick in the book that was made illegal by SEC in traditional finance would be pretty high. With households looking for investment vehicles for all the cash they amassed during lockdowns, crypto could be presented well as it was being pumped, at least as long as it could be made to look like it was going to the moon.

There is some credibility to that theory – massive crypto dumps were synchronized with FED base rate hikes to a day, in a way highly suggestive of someone with a direct borrowing line to FED to get leverage.

And yet it can’t be just it. The oldest and most reputable Silicon Valley VCs were onboard with the crypto hype train, either raising billions to form Web 3.0 investment funds or suggesting buying one coin over others.

But why?

While I am not inside the head of the investors with billions at their fingertips, my best guess is that they see the emerging taxed transactions Web 3.0 as ready to go and the first giant in it ready to displace major established markets – in the same way Google did for .com domains with its search.

Under the assumption that the blockchain was indeed going to be the backbone of that transactional web, and given the need for reputable actors to build a bridge with real cash, it made sense that no price was too high to be on board the next monopoly, and a monopoly big enough to disrupt any major player on the market – be they ad-dependent or direct-donation dependent.

But given the limitations of the blockchain, that latter assumption seemed… A bold bet to say the least. So maybe they also saw a .com bubble 2.0 coming up and decided to ride it the proper way this time around.

The curious case of hackers and makers.

While it makes sense that VCs wanted a crypto Web 3.0, it also attracted a number of actual notable makers – developers and hackers whose reputation was already made. For instance, Moxie Marlinspike of Signal fame at some point considered adding blockchain capability to it. While he and a lot of other creators later bailed out, pointing out the stupidity of Web 3.0, for a while their presence lent the whole Web 3.0 credibility. The one crypto-fans still keep clinging to.

But why?

Why would makers be interested in a transactional web? Why would they give any thought to using blockchain, despite having all the background to understand what it was? They are here to build cool things, even if it means that little to no revenue gets to them, with the Linux kernel being the poster child for the whole stance.

Well.

The issue is that most makers and creators eventually realize that you can’t live on exposure alone and once a bunch of people start using your new cool project, you need to find money to pay your salary to work on it full time, salaries of people helping you and eventually servers running it, if it’s not self-hosted.

And the web of untaxed donations (or web of gifting) will only take you so far.

Wikipedia manages to raise what it needs to function yearly from small donations. But that’s Wikipedia. And Jimmy has to beg people 2-3 times a year for donations by mail.

Signal mostly manages to survive off donations, but on several occasions, the new users inflow overwhelmed the servers, before the volume of donations caught up with the demand, with the biggest crunches coming right around the moments when people need the service the most – such as at the start of the Russian invasion of Ukraine early this year.

Mastodon experienced it first hand itself this year, as waves after waves of Twitter refugees brought most instances to their knees, despite this specific case having been the original reason for Mastodon’s existence.

That scaling failure is kinda what happened to Twitter too. After Musk’s takeover and purges, Jack Dorsey went on record saying that his biggest regret was to make a company out of Twitter. He is not wrong. His cool side project outgrew his initial plan, without ever generating enough revenue to fund the expenses on lawyers when they decided to protect activists on their platform during the 2011 Arab Spring, then moderation teams in the wake of 2016 social platform manipulation for the US presidential election, content moderation in 2020-2022 to counter disinformation around COVID pandemic, or sexual content moderation in late 2021 and 2022. A non-corporate Twitter would have never had the funds to pull all those expenses through. (And even a lot of corporations try to hide the issue by banning any research on their platforms).

It kinda would make sense in a way that he now is trying to build a new social media, but this time powered by crypto and in web 3.0.

Not that the donations don’t work to fund free-as-in-freedom projects. There is just so much more that could be funded and so much more reliably if there was a better mechanism for a kickback to fund exciting projects. One that would perhaps have avoided projects shutting down or going corporate despite neither the developers, maintainers, or users really wanting it.

From that point of view, it also makes sense that the payment system underlying it is distributed and resilient to censorship. Signal is fighting off censorship and law enforcement, and going after its payment processor is one of the fastest ways to get it.

So it kinda makes sense, and yet…

Into the web of gifting with resilient coordination?

Despite the existence of Web 3.0 alternatives, when Twitter came crashing down, people didn’t flock to them, they flocked to Mastodon, caring little about its bulky early 2010s interface, intermittent crashes, and rate limits as servers struggled to absorb millions after millions after millions of new arrivals.

In the same way, despite not having the backing of a major publisher or professional writers, Wikipedia not only outlasted Encyclopedia Britannica, it became synonymous with current, up-to-date and reliable knowledge. Arguably, Wikipedia was even a Web 2.0 organization way before Web 2.0 was a thing, despite being unapologetically Web 1.0 and non-commercial. After all, it just set the rules for nerds to have passionate debates about who was right while throwing citations and arguments at each other, with the reward being to have their words chiseled for the world to see and refer to.

And for both of them, it was the frail web of gifting that choked the giants in the end. In the same way Firefox prevailed over Explorer, Linux prevailed over Unix, Python prevailed over Matlab and R prevailed over SAS.

Perhaps because the most valuable aspect of the web of giving is not the monetary gifts, it’s the gift of time and knowledge by the developers.

Signal could have never survived waiting for a decade before it got wide-spread adoption, if it had to pay salaries of the developers of the caliber that contributed and supported it. Wikipedia could have never paid the salaries of all the contributors, moderators and admins for all the voluntary work they did. Linux is being sponsored by a number of companies to add features and security patches, but the show is still ran by Linus Torvalds and the devs mailing list. Same for Python and R.

It’s not only the voluntary work that’s a non-monetary gift. Until not too long ago people were donating their own resources to maintain projects online without a single central server – torrenting and Peer-to-Peer. Despite a public turn towards centralized solutions with regulations and ease of use, it’s still the mechanism by which Microsoft distributes Windows updates and it’s the mechanism by which PeerTube works.

Perhaps more interestingly, we now have pretty good algorithms to build large byzantine-resilient systems, meaning a lot of shortcomings with trust Peer-to-Peer had could now be addressed.

It’s just that blockchain ain’t it.

We might get a decentralized Web 3.0.

It just won’t be a crypto one.

ELI5 for vaccines and natural immunity

With vaccine hesitancy being as prevalent as it is now, I think more than one of you has had someone in your immediate environment be vaccine-skeptical or at least vaccine-hesitant.

Here is an explanation that worked for someone from my family, helping them to transition from an “I don’t need it – I am healthy enough” mindset to “Ok, I am getting the vaccine ASAP”. In this particular case, the fact that they had several friends go to hospital due to COVID, and in some cases be incapacitated long-term by COVID19, helped to raise awareness of how bad the disease can be. So did their underlying trust in my competence and good will. Most of the hesitation was focused around “my natural immunity is already good” and “vaccines don’t work that well to start with”. Oh, and also the simplifications are pretty insane and I can hear immunologists and virologists scream from here, but it’s one of those “sacrifice precision for ease of understanding” cases.

So, the preamble being done, let’s get to the actual explanation:

“Your immune system works by learning what attacked you and takes time to react. Specifically, it looks for cells that explode and let their contents out (also called necrosis, as opposed to apoptosis, when the cell dies because it is programmed to by the organism) and starts looking for anything it has not encountered before (or at least that it doesn’t encounter in the organism often) that is present in or around the contents of the exploded cell. When the immune system detects what is there, it starts generating antibodies and tests all the cells in the organism to see if they contain the new dangerous compound (here it is particles of the virus). If the dangerous compound is present in the cells, the immune system orders “cleaner” cells to digest those cells and destroy them and everything they contain.

The problem is that the process starts only once there are the first mass cell deaths (so the infection is well started) and takes several days to start and spin up. During that time, the virus continues to propagate in the body.

So when the cleaning cells start digesting cells that have the virus in them, that’s a lot of cells. And some of them are very important and cannot be replaced – such as heart cells. Others can be regenerated, but if too many of them die, you die as well (lung cells are a good example).

In the case of the vaccine, proteins that are located on the outside of the virus are injected, along with an agent that will provoke cells to explode and attract the immune system’s attention. So the immune system will detect the viral proteins “on the crime scene” and learn to find and destroy them on sight. But not that many cells are affected, and the ones that are affected are very easy to regenerate (muscle cells are among the easiest to regenerate – you actually regenerate a lot of them every day if you are exercising) and are not essential to life (given it’s a muscle of a hand and not heart/lungs/brain), so the damage is minor.

Once your immune system has learnt to detect them, it will remember them for a long time and closely monitor your body to see if it can find the smallest traces of the particles it learned were dangerous, even before the first infected cell explodes. Your immune system will monitor your body even more closely and for longer if it finds those particles on a different “scene of crime”. That’s why we need a second dose of the vaccines.

So when the virus arrives, it is immediately detected and neutralized, and the cells it has had time to infect are destroyed before they explode and before the virus is able to replicate itself.

The reason we need to wait after a vaccine is that the immune system transitions from an “I am currently responding to an attack” state to an “I have responded to an attack and will remember what attacked me, so that the next time I can respond faster and better” state. That process takes about 10-14 days, but 21 days leave a margin if the immune system doesn’t work all that great (for instance if the person is stressed).”

Now, as to the need to get vaccinated immediately, the argument went along the lines that with countries opening up, barriers to the virus propagation are going down and we are still very far from herd immunity, meaning your own vaccine is the only thing that will be protecting you, with everyone not yet vaccinated getting swept up in the upcoming wave.

Arguing with Turing is a losing game (or why predicting some things is hard)

FiveThirtyEight’s 2016 election forecast and Nassim Taleb

Back in August 2016, Nassim Nicholas Taleb and Nate Silver had a massive and very public run-in on Twitter. Taleb – the author of “Black Swan” and “Antifragile” – and Nate – the founder of FiveThirtyEight – were arguably the two most famous statisticians alive at the time, so their spat ended up attracting quite a lot of attention.

The bottom line of the spat was that for Taleb, the FiveThirtyEight 2016 US presidential election model was overly confident. As the events in the race between Clinton and Trump swung the vote intentions one way or another (mostly in response to the quips from Trump), Nate’s model’s forecasts had too large swings in its predictions and confidence intervals to be considered a proper forecast.

Taleb did have a point. Data-based forecasts, be they based on the polls alone or on more considerations, are supposed to be successive estimations of a well-defined variable vector – aka voting splits per state, translating into electoral college votes for each candidate on the day of the election. Any subsequent estimation of that variable vector with more data (such as fresher polls) is supposed to be a refinement of the prior estimation. We expect the uncertainty range of the subsequent estimations to be smaller than that of the prior estimation and to be included in the prior prediction’s uncertainty range – at least most of the time, modulo expected errors.

In the vein of the central thesis of “Black Swan”, Taleb’s shtick for 2016 election models – and especially FiveThirtyEight’s one – was that they were prone to overconfidence in their predictions and underestimated the probability of rare events that would throw them off. As such, according to him, the only correct estimation technique would be a very conservative one – giving every candidate a 50/50 chance of winning right until the night of the election, when it would jump to 100%.

Nassim Taleb’s “Rigorous” “forecasting”

The thing is that Nassim Nicholas Taleb is technically correct here. And Nate Silver was actually very much aware of that himself before Taleb showed up.

You see, the whole point of Nate Silver creating the FiveThirtyEight website and going for rigorous statistics to predict games and election outcomes was specifically the fact that mainstream analysts had a hard time with probabilities and would go either for a 50/50 (too close to call) or a 100% sure, never in-between. The reality is that there is a whole world in between and that such rounding errors would cost you dearly. As anyone dealing with inherently probabilistic events would painfully learn, for an 80% win bet, you still lose 20% of the time – and have to pay dearly for it.

Back in 2016, Nate’s approach was to make forecasts based on the current state of the polls and the estimation of how much that could be expected to change if the candidates were doing about the same thing as the candidates in past elections. Assuming the candidates don’t do anything out of the ordinary and the environment doesn’t change drastically, his prediction would do well and behave like an actual forecast.

Nate Silver| Talks at Google| The Signal and The Noise
Nate explains what a forecast is to him

The issue is that you can never enter the same river twice. Candidates will try new things – especially if models show them as losing if they don’t do anything out of the ordinary.

Hence Taleb is technically correct. Nate Silver’s model predictions will always be overly confident and overly sensitive to changes resulting from candidates’ actions or previously unseen changes in the environment.

Decidability, Turing, Solomonoff and Gödel

However, Taleb’s technical correctness is completely beside the point. Unless you are God himself (or Solomonoff’s Demon – more about that in a second), you can’t make forecasts that are accurately calibrated.

In fact, any forecast you can make is either trivial or epistemologically wrong – provably.

Let’s for a second imagine ourselves physicists and get a “spherical cow in vacuum” extremely simplified model of the world. All the events are 100% deterministic, everyone obeys well-defined known rules, based on which they will make their decisions, and the starting conditions are known. Basically, if we had a sufficiently powerful computer, we could run the perfect simulation of the universe and come up with the exact prediction of the state of the world and hence of the election result.

Forecast in this situation seems trivial, no? Well, actually no. As Alan Turing – the father of computing – proved in 1936, unless you run the simulation, in general you cannot even predict whether a process (such as a voter making up their mind) will be over by election day. Henry Gordon Rice was even more radical and in 1951 proved a theorem that can be summarized as “No non-trivial property of simulation runs can be predicted”.

In computer science, forecasting whether a process will ever be over is called the halting problem and is known as a prime example of a problem that is undecidable (aka, there is no general method to predict the outcome in advance). Those of you with an interest in mathematics might have noted a relationship to Gödel’s incompleteness theorem – stating that undecidable statements will exist for any set of postulates that is complex enough to embed basic arithmetic.
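To make “you have to run it to know” a bit more tangible, here is a minimal Python sketch. The function name `collatz_steps` and the choice of the Collatz iteration are my own illustration, not part of Turing’s proof: it is a three-line loop for which nobody has managed to prove that it stops for every starting value, so in practice the only way to learn how long it runs is to run it. An unproven conjecture is not the same thing as a formally undecidable problem, so take it as a flavor of the difficulty rather than an instance of the theorem.

```python
def collatz_steps(n: int) -> int:
    """Count iterations until the Collatz process started at n reaches 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        # Halve even numbers, map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


if __name__ == "__main__":
    # Neighbouring starting values can take wildly different numbers of steps.
    for start in (6, 7, 27, 28):
        print(start, collatz_steps(start))
```

Nothing in the starting value hints at how many steps it will take; the loop itself is the shortest known way to find out.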

Things only go downhill once we add some probabilities into the mix.

Those of you who have done some physics have probably recognized the extremely simplified model of the world above as Laplace’s demon and have been screaming at the screen about quantum mechanics and how that’s impossible. You are absolutely correct.

That was one of the reasons that pushed Ray Solomonoff (also known for Kolmogorov-Solomonoff-Chaitin complexity) to create something he called Algorithmic Probability and a probabilistic counterpart to Laplace’s demon – Solomonoff’s demon.

In the Algorithmic Probability framework, any possible theory about the world has some non-zero probability associated with it and is able to predict something about the state of the world. To perform an accurate prediction of the probability of an event, you need to calculate the probability of that event according to all theories, weighted by the probability of each theory:

Single event prediction according to Solomonoff (conditional probabilities)
E is event, H is a theory, sum is over all possible theories
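
Written out, with P(H) the current probability of theory H and P(E | H) the probability that theory H assigns to event E (my notation, reconstructed from the caption above), this is the law of total probability over theories:

```latex
P(E) = \sum_{H} P(E \mid H) \, P(H)
```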

Fortunately for us, Solomonoff also provided a (Bayesian) way of learning the probabilities of events according to theories and probabilities of theories themselves, based on the prior events:

Learning from a single new data point according to Solomonoff
D is new data, H is a theory, sum is over all possible theories T, but excluding theory H
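
In the same assumed notation, the update described by the caption above is Bayes’ rule, with the normalizing sum split into the term for H itself and the terms for every other theory T:

```latex
P(H \mid D) = \frac{P(D \mid H) \, P(H)}{P(D \mid H) \, P(H) + \sum_{T \neq H} P(D \mid T) \, P(T)}
```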

Solomonoff was nice enough to provide us with two critical results about his learning. First, that it was admissible – aka optimal. In other terms, there is no other learning mode that could do better. The second – that it was uncomputable. Given that a single learning or prediction step requires an iteration over the entire infinite set of all possible theories, only a being from a thought experiment – Solomonoff’s demon – is actually able to do it.

This means that any computable prediction methodology is necessarily incomplete. In other terms, it is guaranteed not to take into account enough theories for its predictions to be accurate. It’s not a bug or an error – it’s a law.

So when Nassim Taleb says that Nate Silver’s model overestimates its confidence, he is technically correct, but his point is also completely trivial and tautological. Worse than that, Solomonoff also proved that in the general case we can’t evaluate by how much our predictions are off. We can’t possibly quantify what we are leaving on the table by using only a finite subset of theories about the universe.

The fact that we cannot evaluate in advance by how much we are off in our predictions means that all of Taleb’s own recommendations about investments are basically worthless. Yes, his strategy will do better when there are huge bad events that market consensus did not expect. It will however do much worse in case they don’t happen.

Basically, for any forecast, the best we can do is something along the lines of Nate Silver’s now-cast + some knowledge gleaned from the most frequent events that occurred in the past. And that “frequent” part is important. What made the FiveThirtyEight model particularly bullish about Trump in 2016 (and in retrospect most correct) was its assumption of correlation of polling errors across the states. Without it, it would have given Trump a 0.2% chance to win instead of the 30% it ended up giving him. Modelers were able to infer and calibrate this correlation because it occurred in every prior election cycle.

What they couldn’t have properly calibrated (or included or thought of for that matter) were one-off events that would radically change everything. Think about it. Among things that were possible, although exceedingly improbable in 2016 were:
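
To see why correlated polling errors matter so much, here is a toy Monte Carlo sketch in Python. Everything in it is an assumption made for illustration: 11 equally weighted states, a 3-point deficit in each, normally distributed errors with a 4-point standard deviation, and a “win” defined as carrying a majority of states. It is not FiveThirtyEight’s model and is not meant to reproduce its numbers; it only shows the direction and rough size of the effect.

```python
import math
import random


def underdog_win_probability(shared_sd, state_sd, n_states=11, deficit=3.0,
                             n_sims=20000, seed=0):
    """Estimate how often the trailing candidate carries a majority of states.

    Every state shows the underdog `deficit` points behind in the polls.
    The polling error in a state is a shared (national) error plus an
    independent state-level error, both drawn from normal distributions.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        shared_error = rng.gauss(0.0, shared_sd)
        states_won = sum(
            1 for _ in range(n_states)
            if shared_error + rng.gauss(0.0, state_sd) > deficit
        )
        if states_won > n_states // 2:
            wins += 1
    return wins / n_sims


# Same total error per state (standard deviation of 4 points) in both cases;
# only the split between shared and state-level error differs.
independent = underdog_win_probability(shared_sd=0.0, state_sd=4.0)
correlated = underdog_win_probability(shared_sd=math.sqrt(8), state_sd=math.sqrt(8))

if __name__ == "__main__":
    print(f"independent errors: {independent:.3f}")
    print(f"correlated errors:  {correlated:.3f}")
```

With independent errors the underdog needs many separate lucky breaks at once; with a shared error a single polling miss moves all the states together, which is why the second number comes out several times larger.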

  • Antifa militia killing Donald Trump
  • Alt-right militia killing Hillary
  • Alt-right terrorist group inspired by Trump’s speeches blowing up the One World Trade Center, causing massive outrage against him
  • Trump making an off-hand remark about grabbing someone’s something that would outrage the majority of his potential electorate
  • It rains in a swing state, meaning 1000 fewer Democratic voters show up, having decided at the last second not to vote because Hillary’s victory seems assured
  • FBI director publicly speaking about Hillary’s emails scandal once again, painting her as a corrupt politician and causing massive discouragement among potential Democratic voters
  • Aliens showing up and abducting everyone
  • ….

Some of them were predictable, since they happened before (rain) and were built into the model’s uncertainty. Other factors were impossible to predict. In fact, if in November 2015 you had claimed that Donald Trump would become the next president due to a last-minute FBI intervention, you would have been referred to the nearest mental health specialist for urgent help.

2020 Election

So why write about that spat now, in 2020 – years after the fact?

First, it’s a presidential election year in the US, again, but this time around with even more potential for unexpected turns. Just as in 2016, Nate’s model is drawing criticism, except this time for underestimating Trump’s chances to win instead of overestimating them. While Nate and the FiveThirtyEight team are trying to account for the new events and be exceedingly clear about what their predictions are, they are limited in what they can reliably quantify. His model – provably – cannot predict all the extraordinary circumstances that can still happen in the upcoming weeks.

There is still time for the Trump campaign to discourage enough of the female vote by painting Biden as a rapist – especially if the FBI reports an ongoing investigation in the week before the election. There is still time for a miracle vaccine to be pushed to the market. This is not an ordinary election. Many things can happen and none of them is exactly predictable. There is still room for new modes of voter suppression to pop up and for Amy Barrett to be nominated to make sure the Supreme Court will judge the suppression legal.

And no prediction can account for them all. Nate and the FiveThirtyEight team are the best applied statisticians in politics since Robert Abelson, but they can’t decide the undecidable.

2020 COVID 19 models

Second – and perhaps most importantly – 2020 was a year of COVID 19 and of the forecasts associated with it. Back in March and April, you had to be either extremely lazy or very epistemologically humble not to try to build your own models of COVID19 spread or not to criticize others’ models.

Some takes and models were such absolute hot garbage that the fact they could come from people I have respected and admired in the past plunged me into a genuine existential dread.

But putting those aside, even excellent models and knowledgeable experts were constantly off – and by quite a lot as well. Let’s take an example of expert predictions from March 16th about how many total cases they thought the CDC would have reported on March 29th in the US (from FiveThirtyEight for a change). Remind yourself – March 16th was the time when only 3500 cases were reported in the US, but Italy was already out of ICU beds and was not resuscitating anyone above 65, France was going into confinement and China was still mid-lockdown.

Consensus: ~ 10 000 – 20 000 cases total with 80% upper bound at 81 000.
Reality: 139 000 cases

Yeah. The CDC reported about 19 000 cases on March 29th alone. The total was almost 10 times higher than the predicted range and almost double the 80% upper bound. But the reason they were off wasn’t that experts are wrong and their models are bad. Just like the election models, they can’t account for factors that are impossible to predict. In this specific case, what made all the difference was that the CDC’s own COVID 19 tests were defective, and the FDA didn’t start issuing emergency use authorizations for other manufacturers’ tests until around the 16th of March. The failure of the best-financed, most reliable and most up-to-date public health agency to produce reliable tests for a world-threatening pandemic for two months would have been an insane hypothesis to make given the prior track record of the CDC.

More generally, all the attempts to predict the evolution of the COVID19 pandemic in the developed countries ran into the same problem. While we were able to learn rapidly how the virus was spreading and what needed to be done to limit its spread, the actions that would be taken by people and governments of developed countries were a total wild card. Unlike with election forecasts, we don’t have any data from prior pandemics in developed countries to build and calibrate models upon.

Nowhere is it more visible than in the state-of-the-art models. Here is Youyang Gu’s COVID19 projections forecasting website, using a methodology similar to the one Nate Silver uses to predict the election outcomes (parameters fitting based on prior experience + some mechanistic insight about the underlying process). It is arguably the best performing one. And yet, it failed spectacularly, even a month in advance.

Youyang Gu’s forecast for deaths/day in early July
Youyang Gu’s forecast for deaths/day in early August. His July prediction was off by about 50-100%. I also wouldn’t be as optimistic about November.

And once again, it’s not the fault of Youyang Gu. A large part of that failure is attributable to Florida and other Southern states doing everything to under-report the number of cases and deaths attributed to COVID19.

The model is doing as well as it could, given what we know about the disease and how people have behaved themselves in the past. It’s just that no model can solve the halting problem of “when will the government decide to do something about the pandemic, what it will be and what people will decide to do about it”. Basically the “interventions” part of projections from Richard Neher’s lab COVID19-scenarios is as of now undecidable.

Once you know the interventions, predicting the rest reliably is doable.

Another high-profile example of expert model failure is the Global Health Security Index ranking of the countries best prepared for a pandemic from October 2019. The US and the UK received respectively the best and second best scores for pandemic preparedness. Yet, if we look today (end of October 2020) at the COVID19 deaths per capita among developed countries, they are a close second and third worst performers. They are not an exception. France ranked better than Germany. Yet it has 5 times the number of deaths. China, Vietnam and New Zealand – all thought to be around the 50/195 mark – are now the best performers, with a whopping 200, 1700 and 140 times fewer deaths per capita than the US – the supposed best performer. All because of the difference in the leadership approach and in how decisive and relentless the government’s response to the pandemic was.

The GHS index is not useless – it was just based on the previously implicit expectation that governments of countries affected by outbreaks would aggressively intervene and do all in their capacity to stop the pandemic. After all, the USA in the past went as far as to collaborate with the USSR, at the height of the Cold War, to develop and distribute the polio vaccine. It was after all what was observed during the 2009-2010 Swine flu pandemic and during the HIV pandemic (at least once it became clear it was hitting everyone, not just the minorities), and it was in line with what the WHO was seeing in developing countries hit by pandemics that did have a government.

Instead of a conclusion

Forecasting is hard – especially when it is quantitative. In most cases, predicting what will happen deterministically is impossible. Probabilistic predictions can only base themselves on what happened in the past frequently enough to allow them to be properly calibrated.

They cannot – provably – account for all the possible events, even if we had a perfectly deterministic and known model of the world. We can’t even estimate what the models are leaving on the table when they try to perform their predictions.

So the next time you criticize an expert or modeler and point out a shortcoming in their model, think for a second. Are you criticizing them for not having solved the halting problem? Not having bypassed Rice’s theorem? Not having built an axiom set that accounts for a previously unencountered event? Would the model you want require a Laplace or a Solomonoff Demon to actually run it? Are you – in the end – betting against Turing?

Because if the answer is yes, you are – provably – going to lose.

===========================

A couple of Notes

An epistemologically wrong prediction can be right, but for the wrong reason. A bit like “even the stopped clock shows the correct time twice a day”. You can pull a one-off, maybe a second one if you are really lucky, but that’s about it. That’s one of the reasons for which any good model can only take into account events that have occurred sufficiently frequently in the past. To insert them into the model, we need to quantify their effect magnitude, uncertainty and the probability of occurrence. Many pundits using single-parameter models in the 2016 election to predict a certain and overwhelming win for Hillary learnt it the hard way.

Turing’s halting problem and Rice’s theorem have some caveats. It is in fact possible to predict some properties of algorithms and programs, assuming they are well-behaved in some sense. In a lot of cases it involves them not having loops or having a finite and small number of branches. There is an entire field of computer science dedicated to developing methods to prove things about algorithms and to writing algorithms in which properties of interest are “trivial” and can be predicted.

Another exception to Turing’s halting problem and Rice’s theorem is given by the law of large numbers. If we have seen a large number of algorithms and seen the properties of their outcomes, we can make reasonably good predictions about the statistics of those properties in the future. For instance we cannot compute the trajectories of individual molecules in a volume, but we can have a really good idea about their temperature. Similarly, we can’t predict if a given person will become infected before a given day or if they will die of an infection, but we can predict the average number of infections and the 95% confidence interval for a population as a whole – assuming we did see enough outcomes and know the epidemiological interventions and adherence. That’s the reason, for instance, why we are pretty darn sure about how to limit COVID19 spread.
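
A small Python sketch of that effect, under heavily simplified assumptions of my own: every person gets infected independently with the same made-up probability of 10%, which real epidemics certainly do not respect. The function names are hypothetical. The point is only that the infected fraction is all over the place for a group of 10 and very stable for a group of 10 000.

```python
import random
import statistics


def simulate_infected_fraction(population, p_infection, rng):
    """Fraction of people infected when each one is infected independently."""
    infected = sum(1 for _ in range(population) if rng.random() < p_infection)
    return infected / population


def spread_of_fraction(population, p_infection=0.1, n_runs=200, seed=0):
    """Mean and standard deviation of the infected fraction over repeated runs."""
    rng = random.Random(seed)
    fractions = [
        simulate_infected_fraction(population, p_infection, rng)
        for _ in range(n_runs)
    ]
    return statistics.mean(fractions), statistics.pstdev(fractions)


if __name__ == "__main__":
    for population in (10, 100, 10_000):
        mean, std = spread_of_fraction(population)
        print(f"population={population:>6}: mean={mean:.3f}, std={std:.4f}")
```

The individual outcome stays a coin toss in every run; it is only the aggregate that becomes predictable as the population grows.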

Swiss Cheese defense in depth against COVID19 by Ian M. Mackay

To push the point above further, scientists can often make really good predictions from existing data. Basically, that’s how the scientific process works – only theories with confirmed good predictive value are allowed to survive. For instance, back in February we didn’t know for sure what was the probability of the hypothesis “COVID19 spreads less as temperature and humidity increase”, nor what would be the effect on spread. We had however a really good idea that interventions such as masks, hand washing, social distancing, contact tracing, at-risk contacts quarantine, rapid testing and isolation would be effective – because of all we learnt about viral upper respiratory tract diseases over the last century and the complete carnage of the theories that were not good enough at allowing us to predict what would happen next.

There is however an important caveat to the “large number” approach. Before the laws of large numbers kick in and modelers’ probabilistic models become quantitative, stochastic fluctuations – for which models really can’t account – dominate, and stochastic events – such as super spreader events – play an outsized role, making forecasts particularly brittle. For instance, in France the early super spreader event in Mulhouse (mid-February) meant that Alsace-Moselle took the brunt of the ICU deaths in the first COVID19 wave, followed closely by the cramped-up and cosmopolitan Paris region. One single event, one single case. Predicting it and its effect would have been impossible, even with China-grade mass surveillance.

Mobilising against a pandemic - France's Napoleonic approach to covid-19 |  Europe | The Economist
March COVID19 deaths in France from The Economist

If you want to hear a more rigorous and coherent version of the discussion about decidability, information, Laplace-Bayes-Solomonoff and about automated model design (aka how ML/AI meshes with all of that), you can check out Lê Nguyên Hoang and El Mahdi El Mhamdi. A lot of what I mention is presented rigorously here, or in French – here.

Problems with a major programming language version bump (Python 2>3)

About 10 years after the initial Python 3 release and about six months after the end of Python 2 support, I have finally bumped my largest and longest-running project to Python 3. Or at least I think so. Until I find some other bug in a rare execution path.

BioFlow is a python project of mine that I have been on-and-off running and maintaining since 2013 – by now almost 7 years. It is heavily dependent on high-performance scientific computing libraries and python libraries providing bindings to them (cough scikits.sparse cough), and despite Python 3 being out for a couple of years by the time I started working on it, none of the libraries I depended on supported it yet. So I got on with Python 2 and rolled with it for a number of years. By now, with several refactors, feature creep and optimization, with about 6.5 k LOC, 2.5 k Lines of comments, 665 commits over 7 years and a solid 30% of test coverage, it is a middle-of-the road python work-horse library.

Like many other people running scientific computing libraries, I did see a number of things impacting the code: bugs being introduced into the libraries I depended on (hello library version pinning), performance degradation due to Spectre mitigations on Intel CPUs, libraries disappearing for good (RIP bulbs), databases discontinuing support for the means of accessing them I was using (why, oh why neo4j did you drop REST) or the host system just crapping itself trying to install the old Fortran libraries that have not yet been properly packaged for it (hello Docker).

Overall, it taught me a number of things about programming craftsmanship, writing quality code and debugging code I forgot the details about. But that’s a topic for another post – back to Python 2 to 3 transition.

Python 2 was working just fine for me, but with its end of life coming near, and proper async support and type hinting being added to Python 3, a switch to Python 3 seemed like a logical thing to do to ensure long-term support.

After several attempts to keep a codebase in Python 2 consistent with Python 3, I forked off a 2to3 branch and ran the 2to3 script on the main library. At first it seemed that it should have solved most of the issues:

  • print xyz was turned into print(xyz)
  • dict.iteritems() was turned into dict.items()
  • izip became zip
  • dict.keys() when fed to an enumerator was turned into list(dict.keys())
  • reader.next() was turned into next(reader)
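
As a quick reference, the rewrites in the list above look roughly like this once they land in Python 3. This is a sketch with made-up variable names (`scores`, `reader`), not code from BioFlow, with the Python 2 spelling kept in the comments.

```python
import csv
import io

scores = {"a": 1, "b": 2}

# Python 2: print "scores", scores
print("scores", scores)

# Python 2: for key, value in scores.iteritems()
pairs = [(key, value) for key, value in scores.items()]

# Python 2: itertools.izip(xs, ys); the built-in zip is lazy in Python 3
zipped = list(zip([1, 2, 3], "abc"))

# Python 2: enumerate(scores.keys()); 2to3 conservatively wraps keys() in list()
indexed_keys = list(enumerate(list(scores.keys())))

# Python 2: reader.next()
reader = csv.reader(io.StringIO("x,y\n1,2\n"))
header = next(reader)
```

All of these are mechanical, one-to-one substitutions, which is exactly why 2to3 handles them well.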

So I gladly tried to start running my test suite, only to discover that it was completely broken:

  • string.lower("XYZ") now was "XYZ".lower()
  • file("fname", 'w') was now an open("fname", 'w')
  • but sometimes also open("fname", 'wr')
  • and sometimes open("fname", 'rt') or open("fname", 'rb') or open("fname", 'wb'), depending purely on the ingesting library
  • AssertDictEqual or assertItemsEqual (a staple in my unit test suite) disappeared into thin air (guess assertCountEqual will now have to do…)
  • wtf is even with pickle dumps ????
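
Most of the items above come down to Python 3 drawing a hard line between text (`str`) and bytes. Here is a minimal sketch of the forms that do work, using throwaway file names in a temporary directory (all names are illustrative, none come from BioFlow): text goes through text-mode files, pickles go through binary-mode files, and `assertCountEqual` is the Python 3 name of `assertItemsEqual`.

```python
import os
import pickle
import tempfile
import unittest

payload = {"nodes": [1, 2, 3], "name": "demo"}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Text mode: the file object accepts and returns str.
    text_path = os.path.join(tmp_dir, "notes.txt")
    with open(text_path, "wt", encoding="utf-8") as handle:
        handle.write("XYZ".lower())  # Python 2: string.lower("XYZ")
    with open(text_path, "rt", encoding="utf-8") as handle:
        text_back = handle.read()

    # Binary mode: pickle writes and reads bytes, so it needs "wb" / "rb".
    pickle_path = os.path.join(tmp_dir, "payload.pkl")
    with open(pickle_path, "wb") as handle:
        pickle.dump(payload, handle)
    with open(pickle_path, "rb") as handle:
        payload_back = pickle.load(handle)


class OrderInsensitiveTest(unittest.TestCase):
    def test_same_items(self):
        # Python 2 spelling: self.assertItemsEqual(...)
        self.assertCountEqual([3, 1, 2], [1, 2, 3])
```

Pickles written by Python 2 are a separate headache: `pickle.load` takes an `encoding` argument for them, and which value works depends on what was pickled, so I would not assume a single setting covers every file.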

Not to be forgotten that to switch to Python 3 I had to unfreeze dependencies for the libraries I was building on top, which came with its own cans of worms:

  • object.properties[property] now became an object._properties[property] in one of the libraries I heavily depended on (god bless whoever invented Ctrl-F and PyCharm for its context-aware class/object usage/definition search)
  • json dumps all of a sudden now require an explicit encoding, just as hashlib digests
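
The encoding item is again the `str` versus bytes split. A minimal sketch with a made-up record: `json.dumps` returns a `str` in Python 3, while `hashlib` only accepts bytes, so the string has to be encoded explicitly before hashing (or before being written to a file opened in binary mode). I am using `sha256` here purely as an example of a hashlib digest.

```python
import hashlib
import json

record = {"gene": "TP53", "weight": 0.5}

# json.dumps returns text (str) in Python 3.
serialized = json.dumps(record, sort_keys=True)

# hashlib wants bytes, so the text must be encoded explicitly.
encoded = serialized.encode("utf-8")
digest = hashlib.sha256(encoded).hexdigest()

# Passing the str directly, as Python 2 code often did, now fails.
try:
    hashlib.sha256(serialized)
except TypeError as error:
    unencoded_error = error
else:
    unencoded_error = None

if __name__ == "__main__":
    print(digest)
    print("hashing a str raised:", unencoded_error)
```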

And finally, after running my library for a couple of weeks, some previously un-executed branch triggered a bunch of exceptions arising from the fact that in Python 2 / meant an integer division, unless a float was involved, whereas for Python 3 / is always a float division and a // is needed to trigger an integer division.

I can be in part blamed for those issues. Code with complete unit test coverage would have caught all of the exceptions in the unit-test phase and the integration tests would have caught problems in rare codepaths.

The problem is that no real-life library has total unit-test or integration-test coverage. The Python 3 transition trench warfare hell has killed a number of popular python projects – for instance the Gourmet recipe manager (which I used to use myself). For hell’s sake, even Dropbox, which employed Guido himself and runs a multi-billion business on an almost pure Python stack, waited until the end of 2018 and took about a year to roll over.
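
A minimal illustration of how that bites, with a made-up list: the division result is typically used as an index or a count somewhere far from where it was computed, and the failure only shows up when that particular line finally runs.

```python
values = [10, 20, 30, 40, 50]

true_quotient = len(values) / 2    # 2.5 in Python 3 (was 2 in Python 2)
floor_quotient = len(values) // 2  # 2 in both Python 2 and Python 3

# Using the float as an index only fails when this line is actually executed.
try:
    middle = values[true_quotient]
except TypeError as error:
    index_error = error
else:
    index_error = None

middle = values[floor_quotient]

if __name__ == "__main__":
    print(true_quotient, floor_quotient, middle)
    print("float index raised:", index_error)
```

Note that // floors rather than truncates, so -7 // 2 is -4, which is the same behaviour Python 2 had for integer /.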

The reality is that the debugging of a major language version bump is **really** different from anything a codebase encounters in its lifetime.

When you write a new feature, you test it out as you develop. Bugs appear as you add lines of code and you can track them down. When a dependency craps out, the bugs that appear are related to it. It is possible to wrap it and isolate the difference in its response to calls. Debugging is localized and traceable. When you refactor, you change the model of the problem and the code organization in your head. The bugs that appear are once again directly triggered by your code modifications.

When the underlying language changes, the bugs appear **everywhere**. You don’t know which line of code could be the bugged one, and you miss bugs because some bugs obscure other bugs. So you have to do pass after pass after pass over your entire codebase, spending weeks and months tracking exceptions as they pop up, never sure whether you have corrected all the bugs yet. It is hard, because you need to load the entire codebase in your head to search for bugs and be aware of the corner cases. It is demoralizing, because you are just trying to get to the point where your code already was, without improving it in any way.

It is pretty much a trench warfare hell – stuck in the same place, without clear advantage gained by debugging excursions at the limit of your mental capacities. It is unsurprising that a number of projects never made it to Python 3, especially niche ones made by non-developers and for non-developers – the kind of projects that made Python 2 a loved, universal language that surely would have a library that could solve your niche problem. The problem is so severe in the scientific community that there is a serious conversation in Nature about starting to use Python 2.7 to maximise the reproducibility of projects, given that it is guaranteed it will never change.

What could have been improved? As a rank-and-file (non-professional) developer of a niche, moderately complex library, here are a couple of things that would have made my life **a lot** easier while bumping the Python version:

  • Provide a tool akin to 2to3 and make it the default path. It was far from perfect – sure. But it hammered out the bulk of the differences and allowed the code to at least start executing, and me to start catching bugs.
  • Unlike 2to3, it needs to annotate potential problems in the code it could not resolve. 'rt' vs 'rb' was a direct consequence of the text vs byte separation in Python 3 and it was clear that problems would arise with that. Same thing for / vs //. 2to3 should at least have highlighted the potential for problems. For my workflow, adding a # TODO: potential conflict that needs resolution would have gone a loooooong way.
  • Better even, roll out a syntax change in the old language version that will allow the developer to explicitly resolve the ambiguity so that the automated upgrade tools can get more out of the library
  • Don’t touch the unittest functions. They are the lifeblood of the debugging of the library after the language bump. If they bail out, getting them to work would require figuring out how the code they are covering works once again and defeats their purpose.
  • Make sure that the most wide-spread libraries in your ecosystem have performed a roll-over before pushing others to do the same.
  • Those libraries need to provide a “bump” version: aka with exactly the same call syntax from the user’s code, they would return exactly the same results in both the previous and the new version of the language. Aka the libraries should not be bumping their own major version at the same time as they bump the supported language version.
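The annotation wish from the list above can be sketched in a few lines. This is not how 2to3 works and not a real tool: annotate_ambiguities is a hypothetical helper that uses the standard ast module to flag every / and every open() call with a TODO comment. It assumes the source already parses under Python 3 and that a flagged line does not end in a backslash continuation.

```python
import ast


def annotate_ambiguities(source):
    """Append a TODO comment to every line that uses `/` or calls open()."""
    notes = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div):
            notes.setdefault(node.lineno, set()).add(
                "`/` is float division in Python 3")
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Name)
              and node.func.id == "open"):
            notes.setdefault(node.lineno, set()).add(
                "check text vs binary mode")

    annotated = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number in notes:
            reasons = "; ".join(sorted(notes[number]))
            line += "  # TODO: potential conflict: " + reasons
        annotated.append(line)
    return "\n".join(annotated)


# Made-up legacy snippet, for illustration only.
legacy = "\n".join([
    "half = total / 2",
    "handle = open('fname', 'w')",
    "name = 'XYZ'.lower()",
])
print(annotate_ambiguities(legacy))
```

A tool like this cannot decide which division or which file mode is right, but it turns "bugs everywhere" into a finite list of places to look at.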

On masks and SARS-CoV-2

This comment was initially a response to a YouTube video from Tech Ingredients – a channel I have thoroughly enjoyed in the past for its in-depth dives into the scientific and engineering aspects of various engineering-heavy DIY projects. Unfortunately, I am afraid that panic around COVID19 has prevented a lot of people from thinking straight, and I could not but disagree with the section on masks.

==

Hey there – Engineer turned biomedical scientist here. I absolutely love your videos and have been enjoying them a lot, but I believe that in this specific domain I should have enough experience to point out what appears to me to have been overlooked and is likely to change your recommendation on masks drastically.

First of all, operating room masks and standard medical masks are extremely different beasts – if anything, their capacity to filter out small particles, close in size to the droplets transporting COVID19 over the longest distance, is much closer to that of N95s than to that of standard medical masks:

[Figure: mask filtration efficiency]

The standard medical masks let through about 70% of droplets on the smaller end of those that can carry SARS-CoV-2. A decrease in exposure of such magnitude has not been associated with a statistically significant reduction in contagion rates in any respiratory transmitted disease.

So why are standard medical masks recommended for sick people? The main reason for that is that in order to get into the air, the viral particles need to be aerosolized by coughing/sneezing/speaking by a contaminated person. The mask does not do well at preventing small particles from getting in and out, but it will prevent, at least partially the aerosolization, especially for larger droplets – that will contain more viruses and hence be more dangerous.

Now, that means that if you really want to protect yourself, rather than using a mask, even surgical, it’s much better to use a full face shield – while useless against aerosolized particles suspended in the air, it will protect you from the largest and most dangerous droplets.

Why do medical people need them?
The reality is that without the N95 masks and in immediate contact with the patients, the risk of them getting infected is pretty high even in what are considered “safe” areas – as is the risk of passing the virus to their colleagues and patients in those “safe” areas. If left to spread, due to the over-representation of serious cases in the hospital environment, it is not impossible that the virus will evolve into forms that lead to more serious symptoms. Even if we can’t protect the medical personnel, preventing those of them who are asymptomatic from spreading the virus is critical for everyone (besides – masks are also for patients – if you look at pictures from China, all patients wear them).

Second, why did WHO not recommend the use of N95 masks to the general public at the beginning of this outbreak, whereas they did that for SARS-CoV in 2002-2004 outbreak almost as soon as it became known to the West?

Unlike the first SARS-CoV, SARS-CoV-2 does not remain suspended in aerosols for prolonged periods of time: it does not form clouds of aerosolized particles that remain in suspension and can infect someone who is passing through the cloud hours after the patient who spread it left. For SARS-CoV-2, the droplets fall to the ground fairly rapidly – within a couple of meters and a couple of minutes (where they can be picked up – hence hand washing and gloves). Due to that, unlike SARS-CoV, SARS-CoV-2 transmission is mostly driven by direct face-to-face contact, with virus-containing droplets landing on the faces of people in direct contact.

The situation changes in hospitals and ICU wards – with a number of patients constantly aerosolizing, small particles do not have the time to fall and the medical personnel are less than a couple of meters from patients due to space constraints. However, even in the current conditions, the N95 masks are only used in aerosol-generating procedures, such as patient intubation.

Once again, for most people, a face shield, keeping several meters of distance and keeping your hands clean and away from your face are the absolute best bang-for-buck there is, with everything else having significantly decreasing returns.

==

PS: since I wrote this comment, a number of science journalists have done an excellent job of researching the subject in depth and writing up their findings in an accessible manner:

In addition to that, a Nature study has recently been published, indicating that while masks are really good at preventing the formation of large droplets (yay), when it comes to the formation of small droplets (the type that can float for a little bit), they are not that great for Influenza. The great news is that for Coronavirus, since few droplets of that size are formed, they work great at containing any type of viral particle emission: Nature Medicine Study.

Scale-Free networks nonsense or Science vs Pseudo-Science

(this article’s title is a nod to Lior Pachter’s vitriolic arc of 3 articles with a similar title)

Over the last couple of days I was engaged in a debate with Lê from Science4All about what exactly science was, that spun off from his interview with an evolutionary psychologist and my own vision of evolutionary psychology in its current state as a pseudo-science.

While not always easy and at times quite heated, this conversation was quite enlightening and led me to try to lay down

Following the recent paper about scale-free networks not being that widespread in actual environments (which I first got as a gist from Lior Pachter’s blog back in 2015) helped me formalize a little better what I believe a pseudo-science is.

Just as with the models and theories within the scientific method itself, something being a scientific approach is not defined or proved. Instead, similarly to the NIST definition of random numbers through a series of tests that all need to be passed, a scientific approach is often defined by what it is not, whereas pseudo-science is defined as something that tries to pass itself off as a scientific method but fails one or several tests.

Here are some of my rules of thumb for the criteria defining pseudo-science:

The model is significantly more complicated than what the existing data and prior knowledge would warrant. This is particularly true for generative models that do not build on deep pre-existing knowledge of the components.

The theory is a transplant from another domain where it worked well, without all the correlated complexity and without justifying that the transposition is still valid. Evolutionary psychology is a transplant from molecular evolutionary theory,

The success in another domain is advanced as the main argument for the applicability/correctness of the theory in the new domain.

The model claims are non-falsifiable.

The model is not incremental/emergent from a prior model.

There are no closely related, competing models that are considered upon application to choices.

The cases where the model fails are not defined and are not acknowledged. Evo psy – modification of the environment by humans. Scale-Free networks.

Back-tracking on the claims, without changing the final conclusion. This is different from refining the model, where the change in the model gets propagated to the final conclusion and that conclusion is then re-compared with reality. Sometimes amendments are made to the model so that it aligns with reality again, but at least for a period, the model is still considered false.

Support by a cloud of plausible but refuted claims, rather than by a couple of strong claims that are currently hard to attack.

The defining feature of pseudo-science, however, especially compared to faulty science, is its refusal to accept criticism of and limitations to the theory and to change its predictions accordingly. It always needs to fit the final maxim, no matter the data.

Synergy from the boot on Ubuntu

This one seemed to be quite trivial per official blog, but the whole pipeline gets a bit more complicated once the SSL enters into the game. Here is how I made it work with synergy and Ubuntu 14.04

  • Configure the server and the client with the GUI application
  • Make sure SSL server certificate fingerprint was stored in the ~/.synergy/SSL/Fingerprints/TrustedServers.txt
  • Run sudo -su myself /usr/bin/synergyc -f --enable-crypto my.server.ip.address
  • After that check everything was working with sudo /usr/bin/synergyc -d DEBUG2 -f --enable-crypto my.server.ip.address
  • Finally add the greeter-setup-script=sudo /usr/bin/synergyc --enable-crypto my.server.ip.address line into the /etc/lightdm/lightdm.conf file under the [SeatDefaults] section

Why you shouldn’t do it?

Despite the convenience, there seemed to be a bit of interference with keyboard commands and command interpretation on my side, so since my two computers are side by side and since I have a USB button switch from before I got synergy, I’ve decided to manually start synergy every time I log in.

Writing a research paper with ReStructuredText

Why?

As a part of my life as a Ph.D. student, I have to read a large number of scientific papers and I have seen a couple of them written. What struck me was that these papers have a great deal of internal structure (bibliographic references, references to figure, definitions, adaptation to a particular public, journal, etc…). However, they are written all at once, usually in a word document, in a way that all that structure lives and can only be verified in the writer’s head.

As a programmer, I am used to organizing complex structured text as well – my source code. However, experience has shown me that relying on what is inside your brain to keep the structure of your project together works only for a couple of lines of code. Beyond that, I have no choice but to rely on compilers, linters, static analysis tools and IDE tools to help me deal with the logical structure of my program and prevent me from planting logical bombs and destroying one aspect of my work while I am focusing on another. An even bigger problem is keeping myself motivated while writing a program over several months and learning the new tools I need to do it efficiently.

From that perspective, writing the papers in a word file is very similar to trying to implement high-level abstraction with very low-level code. In fact, so low-level, you are not allowed to import from other files (automatically declare common abbreviations in your paper), define functions and import upon execution. Basically, the only thing you can rely on is the manuscript reviews by the writers and the editors/reviewers of journals where the paper is submitted. Well, unless they get stuck in an infinite loop.

So I decided to give it a try and write my paper in the same way I would write a program: using modules, declarations, import, compilation. And also throwing in some comments, version control and ways to organize revisions.

Why ReStructuredText?

I loved working with it when I was documenting my projects in Python with Sphinx. It allows quite a lot of operations I was looking for, such as .. include:: and .. replace:: directives. It is well supported in the PyCharm editor I am using, allowing me to fire up my favorite dark theme and go fullscreen to remain distraction-free. It also can be translated with pandoc to both .docx for my non-techy Professor and LaTeX files for my mathy collaborator.

It also allowed me to type the formulas with raw mathematics in LaTeX notation quite easily by using .. math:: directive.

How did it go?

Not too bad so far. I had some problems with pandoc’s ability to convert my .rst file tree into .docx, especially when it came to failing on the .. include:: directive and on citation formatting. (link corresponding issues) There was also some issue with rendering .png images in the .docx format (link issue). In the end, I had to translate the .rst files into HTML with the rst2html tool and then HTML to docx. For now, I am still trying to see how much of an advantage it is giving me.

After some writing, I’ve noticed that I am now being saved from dangling references. For instance, at some point I wrote a reference [KimKishnoy2013]_ in my text, and while writing the bibliography I realized the paper came out in 2012, so I defined the paper there as .. [KimKishnoy2012]. And the rst compilation engine threw an error about Unknown target name: "kimkishnoy2013". Yay, no more dead references! The same thing is true for references defined in the bibliography but not used in the text.
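To show what the compiler is checking there, here is a rough stand-alone sketch of the same idea. It is not how the rst tooling is implemented: check_citations and the two regular expressions are hypothetical and only cover simple alphanumeric citation labels.

```python
import re

# [Label]_ in running text is a citation reference.
CITATION_USE = re.compile(r"\[([A-Za-z0-9]+)\]_")
# ".. [Label]" at the start of a line is a citation definition.
CITATION_DEF = re.compile(r"^\.\. \[([A-Za-z0-9]+)\]", re.MULTILINE)


def check_citations(text):
    """Return (used but never defined, defined but never used) labels."""
    used = {label.lower() for label in CITATION_USE.findall(text)}
    defined = {label.lower() for label in CITATION_DEF.findall(text)}
    return sorted(used - defined), sorted(defined - used)


# Made-up two-line document reproducing the 2013-vs-2012 mismatch.
document = "\n".join([
    "Some claim backed by a paper [KimKishnoy2013]_.",
    "",
    ".. [KimKishnoy2012] placeholder bibliography entry",
])

dangling, unused = check_citations(document)
print(dangling, unused)
```

Labels are compared in lowercase, which matches the lowercased name in the error message quoted above.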

Now, a shortcoming of this method of writing is that inter-part transitions do not come naturally. This can be easily mitigated: once the writer’s block has been overcome by writing all the parts separately, open a compiled HTML or .docx document and edit the elements so that they align properly.

An additional difference with the tools that have been developed to review code is that the information density and consistency of programming languages is closer to mathematical notation than to human-readable text, with all the redundancy necessary for proper understanding. A word change or a line change is a big deal in programming. It isn’t so important in writing, and all the annotation and diff tools built for code are not very useful there.

On the other hand, it is still related to the fact that human language is still a low-level language. Git is not as useful for reviewing binaries as it is for reviewing the source code that matters.

Over time, two significant problems emerged with that approach. First – incorporating the revisions. Since the other people in the revision pipeline are using the MS Word built-in review tools, in order to address every single revision I have to find the location in the text tree file where the revision needs to be made, then correct it. Doing it once is pretty straightforward. Doing it hundreds upon hundreds of times across tens of revisions by different collaborators is another thing altogether and is pretty tiresome.

The second problem is related to the first. When the revisions are more major and require re-writing and re-organizing entire parts, I need to go and edit the parts one by one, then figure out which part’s contents go into which new part. Which is a lot of side work for not a lot of added value in the end.

What is still missing?

  • Conditional rendering rules. There are some tags that I would want to see when I am rendering my document for proofreading and correction (like parts name, my comments, reviewer comments), but not in the final “build”.

  • Linters. I would really like to run something like the Hemingway app on my text, as well as some kind of Clonedigger to figure out where I tend to repeat myself over and over again and make sure I only repeat myself when I want to. In other terms, automated tools for proof-reading the text and figuring out how well it is understood. It seems that I am not the only one to have had the idea: the Proselint creators implemented a linter for prose in Python. It is something I am quite excited about, even though it is still in its infancy because of how liberal spoken language is compared to a programming language. We will likely see a lot of improvements in the future, with the developments in NLP and machine learning. Here are a couple of other linter functions I could see being really useful:

    • Check that the first sentence of each paragraph describes what the paragraph will be about
    • Check that all the sentences can be parsed (subject-verb-object-qualifiers)
    • Check that there is no abrupt interruption in the words used between close sentences.
    • Check for the digressions and build a narration tree.
  • Words outside common vocabulary that are undefined. I tend to write for several very different audiences, about topics they are sometimes very knowledgeable about and sometimes not really. The catch is that since I am writing about these topics, I am knowledgeable about them, to the point that sometimes I fail to realize that some words might need a definition. If I had an app that showed me the rare words that I introduce and don’t define, I could rapidly adapt my paper to a different audience or, as the reviewers like to ask, “unpack a bit”.

  • Chaining with other apps. There are already applications that do a great job of structuring citations and references in the desired format. I need to find a way to pipe the results of the .rst text compilation into them so that they can adapt the citation framework in a way that is consistent with the publication we are writing for.

  • Skeptic module. I am writing scientific papers. Ideally, my every assertion in the introduction has to be backed by an appropriate citation and every paragraph in the methods and results section has to be backed by the data, either in the figures or in supplementary data.

  • Proper management of mathematical formulas. They tend to get really broken. LaTeX is the answer, but it would be nice if the renderings of formulae could also be translated into HTML or docx, which has its own set of mathematical notation (MS Office always had to do it differently from open source standards).

  • Way to write from the requirements. In software we have something we refer to as unit tests: pre-defined behaviors we want our code to implement. As a rule of thumb, accumulating unit tests is a tedious process that is nonetheless critical for building a system and validating that, upon change, our software still behaves the way we expect it to. In writing we want to transmit a certain set of concepts to our readers, but because human time is so valuable, we regularly fail at that task, especially when we fail to see that the 100th+ revision leaves a concept referred to in the paper no longer defined. In software it is a little bit like acting as a computer and executing the software in your head. It works well for small pieces, but there are edge cases, and what you know about what the program should do really gets in the way.
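As an illustration of how small a first version of the “words outside common vocabulary” check from the list above could be, here is a naive sketch. Everything in it is hypothetical: the function name, the tiny word list standing in for a real frequency dictionary, and the assumption that the terms the paper defines are already known as a set.

```python
import re


def undefined_rare_words(text, common_words, defined_terms):
    """List words that are neither common nor defined, in order of appearance."""
    flagged = []
    for word in re.findall(r"[A-Za-z][A-Za-z-]*", text):
        lowered = word.lower()
        if (lowered not in common_words
                and lowered not in defined_terms
                and lowered not in flagged):
            flagged.append(lowered)
    return flagged


# Toy stand-ins: a real run would load a frequency list for the target
# audience and collect the terms the paper actually defines.
common_words = {"the", "model", "uses", "a", "graph", "with", "of", "nodes"}
defined_terms = {"scale-free"}
sentence = "The model uses a scale-free graph with hubs"

flagged = undefined_rare_words(sentence, common_words, defined_terms)
print(flagged)
```

Swapping the word list for a different audience is then a one-line change, which is the whole point of the wish.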