Rebekah Jones’s Lies about Florida COVID Data Keep Piling Up

Link: https://www.nationalreview.com/2021/06/rebekah-joness-lies-about-florida-covid-data-keep-piling-up/

Excerpt:

One of the most persistent falsehoods of the COVID pandemic has been the claim that Florida has been “hiding” data. This idea has been advanced primarily by Rebekah Jones, a former Florida Department of Health employee, who, having at first expressed only some modest political disagreements with the way in which Florida responded to COVID, has over time become a fountain of misinformation.

…..

To understand what is happening here, one needs to go back to the beginning. Over the past 15 months, Florida has published a truly remarkable amount of COVID-related data. At the heart of this trove has been a well-maintained list of literally every documented case of COVID — listed by county, age, and gender, and replete with information about whether the patient had recently traveled, had visited the ER, had been hospitalized, and had had any known contact with other Floridians. To my knowledge, Florida has been the only state in the union that has published this kind of data.

…..

To this day, you can download Florida’s case-line data and see 21 cases of COVID that, despite having been identified between March 2020 and December 2020, feature a December 2019 “Event Date.” To anyone who understands data, these results are clearly the product of the system having assigned a non-null default value when no data has been entered. To the Miami Herald, however, these results hinted at scandal. Even now, when its reporters know beyond any doubt that their initial instincts were wrong, the Herald continues to tell its readers that these entries serve as “evidence of community spread potentially months earlier than previously reported.” This is not true.
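
The default-value pattern described in this excerpt can be checked for mechanically. The following is a minimal sketch, not Florida's actual schema or data: the rows, the `(case_date, event_date)` layout, and the 60-day threshold are all assumptions made for illustration. It flags event dates that fall implausibly long before the case was identified and counts how often each one occurs; a pile-up on a single date is what a placeholder value would tend to look like.

```python
from collections import Counter
from datetime import date

# Hypothetical rows: (case_date, event_date). Not real Florida data.
rows = [
    (date(2020, 3, 15), date(2020, 3, 10)),
    (date(2020, 7, 2), date(2019, 12, 31)),
    (date(2020, 11, 20), date(2019, 12, 31)),
    (date(2020, 12, 1), date(2020, 11, 28)),
]

def suspicious_event_dates(rows, max_gap_days=60):
    """Count event dates that precede the case date by more than max_gap_days."""
    counts = Counter()
    for case_date, event_date in rows:
        if (case_date - event_date).days > max_gap_days:
            counts[event_date] += 1
    return counts

flagged = suspicious_event_dates(rows)
for event_date, n in flagged.items():
    print(event_date.isoformat(), n)
```

A check like this only surfaces candidates; whether a flagged date is a default or a genuine early case still has to be settled from how the source system fills the field.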

Author(s): Matt Shapiro

Publication Date: 8 June 2021

Publication Site: National Review

Rebekah Jones, the COVID Whistleblower Who Wasn’t

Link: https://www.nationalreview.com/2021/05/rebekah-jones-the-covid-whistleblower-who-wasnt/

Excerpt:

There is an extremely good reason that nobody in the Florida Department of Health has sided with Jones. It’s the same reason that there has been no devastating New York Times exposé about Florida’s “real” numbers. That reason? There is simply no story here. By all accounts, Rebekah Jones is a talented developer of GIS dashboards. But that’s all she is. She’s not a data scientist. She’s not an epidemiologist. She’s not a doctor. She didn’t “build” the “data system,” as she now claims, nor is she a “data manager.” Her role at the FDOH was to serve as one of the people who export other people’s work—from sets over which she had no control—and to present it nicely on the state’s dashboard. To understand just how far removed Jones really is from the actual data, consider that even now—even as she rakes in cash from the gullible to support her own independent dashboard—she is using precisely the same FDOH data used by everyone else in the world. Yes, you read that right: Jones’s “rebel” dashboard is hooked up directly to the same FDOH that she pretends daily is engaged in a conspiracy. As Jones herself confirmed on Twitter: “I use DOH’s data. If you access the data from both sources, you’ll see that it is identical.” She just displays them differently.

Or, to put it more bluntly, she displays them badly. When you get past all of the nonsense, what Jones is ultimately saying is that the State of Florida—and, by extension, the Centers for Disease Control and Prevention—has not processed its data in the same way that she would if she were in charge. But, frankly, why would it? Again, Jones isn’t an epidemiologist, and her objections, while compelling to the sort of low-information political obsessive she is so good at attracting, betray a considerable ignorance of the material issues. In order to increase the numbers in Florida’s case count, Jones counts positive antibody tests as cases. But that’s unsound, given that (a) those positives include people who have already had COVID-19 or who have had the vaccine, and (b) Jones is unable to avoid double-counting people who have taken both an antibody test and a COVID test that came back positive, because the state correctly refuses to publish the names of the people who have taken those tests. Likewise, Jones claims that Florida is hiding deaths because it does not include nonresidents in its headline numbers. But Florida does report nonresident deaths; it just reports them separately, as every state does, and as the CDC’s guidelines demand. Jones’s most recent claim is that Florida’s “excess death” number is suspicious. But that, too, has been rigorously debunked by pretty much everyone who understands what “excess deaths” means in an epidemiological context—including by the CDC; by Daniel Weinberger, an epidemiologist at the Yale School of Public Health; by Lauren Rossen, a statistician at the CDC’s National Center for Health Statistics; and, most notably, by Jason Salemi, an epidemiologist at the University of South Florida, who, having gone to the trouble of making a video explaining calmly why the talking point was false, was then bullied off Twitter by Jones and her followers.

Author(s): Charles C. W. Cooke

Publication Date: 13 May 2021

Publication Site: National Review

COMIC: How I Cope With Pandemic Numbness

Link: https://www.npr.org/sections/goatsandsoda/2021/04/25/987208356/comic-how-i-cope-with-pandemic-numbness

Excerpt:

Each week I check the latest deaths from COVID-19 for NPR. After a while, I didn’t feel any sorrow at the numbers. I just felt numb. I wanted to understand why — and how to overcome that numbness.

Author(s): CONNIE HANZHANG JIN

Publication Date: 25 April 2021

Publication Site: Goats and Soda at NPR

An Alternative to the Correlation Coefficient That Works For Numeric and Categorical Variables

Link: https://rviews.rstudio.com/2021/04/15/an-alternative-to-the-correlation-coefficient-that-works-for-numeric-and-categorical-variables/

Excerpt:

Using an insight from Information Theory, we devised a new metric – the x2y metric – that quantifies the strength of the association between pairs of variables.

The x2y metric has several advantages:

It works for all types of variable pairs (continuous-continuous, continuous-categorical, categorical-continuous and categorical-categorical)

It captures linear and non-linear relationships

Perhaps best of all, it is easy to understand and use.

I hope you give it a try in your work.

Author(s): Rama Ramakrishnan

Publication Date: 15 April 2021

Publication Site: R Views

Error-riddled data sets are warping our sense of how good AI really is

Link: https://www.technologyreview.com/2021/04/01/1021619/ai-data-errors-warp-machine-learning-progress/

Paper link: https://arxiv.org/pdf/2103.14749.pdf

Excerpt:

Yes, but: In recent years, studies have found that these data sets can contain serious flaws. ImageNet, for example, contains racist and sexist labels as well as photos of people’s faces obtained without consent. The latest study now looks at another problem: many of the labels are just flat-out wrong. A mushroom is labeled a spoon, a frog is labeled a cat, and a high note from Ariana Grande is labeled a whistle. The ImageNet test set has an estimated label error rate of 5.8%. Meanwhile, the test set for QuickDraw, a compilation of hand drawings, has an estimated error rate of 10.1%.

Author(s): Karen Hao

Publication Date: 1 April 2021

Publication Site: MIT Tech Review

How a Software Error Made Spain’s Child COVID-19 Mortality Rate Skyrocket

Link: https://slate.com/technology/2021/03/excel-error-spain-child-covid-death-rate.html

Excerpt:

“Even though I didn’t know what the problem was, I knew it wasn’t the right data,” Soler realized once he got his hands on the Lancet paper. “Our data is not worse than other countries. I would say it is even better,” he says. Pediatricians across the nation contacted Spain’s main research institutes, as well as hospitals and regional governments. Eventually, they discovered that the national government somehow misreported the data. It’s hard to pinpoint exactly what went wrong, but Soler says the main issue is that patient deaths for those over 100 were recorded as children. He believes that the system couldn’t record three-digit numbers, and so instead registered them as one-digit. For example, a 102-year-old was registered as a 2-year-old in the system. Soler notes that not all centenarian deaths were misreported as children, but at least 47 were. This inflated the child mortality rate so much, Soler explains, because the number of children who had died was so small. Any tiny mistake causes a huge change in the data.
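
The failure mode Soler describes can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the two-digit field width, the `record_age` helper, and the sample ages are hypothetical, since the excerpt reports only what Soler believes happened and not how the Spanish system is actually built.

```python
def record_age(age, width=2):
    """Store an age in a fixed-width field, keeping only the last `width` digits.

    The width of 2 is an assumption for illustration.
    """
    return int(str(age)[-width:])

# Hypothetical ages at death; one genuine child death among them.
ages_at_death = [102, 101, 87, 94, 3, 100]
recorded = [record_age(a) for a in ages_at_death]

true_child_deaths = sum(a < 10 for a in ages_at_death)
recorded_child_deaths = sum(a < 10 for a in recorded)

print(recorded)
print(true_child_deaths, recorded_child_deaths)
```

Because the true count of child deaths is so small, three misfiled centenarians are enough to quadruple it in this toy sample, which is the sensitivity the excerpt points to.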

Author(s): ELENA DEBRÉ

Publication Date: 25 March 2021

Publication Site: Slate

America’s Coronavirus Catastrophe Began With Data

Link: https://www.defenseone.com/ideas/2021/03/americas-coronavirus-catastrophe-began-data/172686/

Excerpt:

The consequences of this testing shortage, we realized, could be cataclysmic. A few days later, we founded the COVID Tracking Project at The Atlantic with Erin Kissane, an editor, and Jeff Hammerbacher, a data scientist. Every day last spring, the project’s volunteers collected coronavirus data for every U.S. state and territory. We assumed that the government had these data, and we hoped a small amount of reporting might prod it into publishing them.

Not until early May, when the CDC published its own deeply inadequate data dashboard, did we realize the depth of its ignorance. And when the White House reproduced one of our charts, it confirmed our fears: The government was using our data. For months, the American government had no idea how many people were sick with COVID-19, how many were lying in hospitals, or how many had died. And the COVID Tracking Project at The Atlantic, started as a temporary volunteer effort, had become a de facto source of pandemic data for the United States.

Author(s): ROBINSON MEYER and ALEXIS C. MADRIGAL, THE ATLANTIC

Publication Date: 15 March 2021

Publication Site: Defense One

Finding ‘Anomalies’ Illustrates 2020 Census Quality Checks Are Working

Link: https://www.census.gov/newsroom/blogs/random-samplings/2021/03/finding_anomalies.html?utm_campaign=20210309msc20s1ccpuprs&utm_medium=email&utm_source=govdelivery

Excerpt:

So far in 2020 Census processing, 27 of the 33 anomalies we’ve found are of this type. Let me give a couple of examples.

Miscalculating age for missing birthdays. We found that our system was miscalculating ages for people who included their year of birth but left their birthday and month blank. We fixed this with a simple code correction. Making sure ages calculate correctly helps us with other data processing steps for matching and removing duplicate responses.
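
To make the missing-birthday case concrete, here is a minimal sketch of an age calculation that handles a blank month or day explicitly. The excerpt does not describe the Bureau's actual rule or its fix, so the fallbacks here (mid-year when the month is blank, the first of the month when only the day is blank) and the `age_on` helper are assumptions for illustration only.

```python
from datetime import date

CENSUS_DAY = date(2020, 4, 1)  # reference date for the 2020 Census

def age_on(reference, year, month=None, day=None):
    """Age in whole years on `reference`.

    Missing month/day are filled with assumed fallbacks, not the Bureau's rule.
    """
    if month is None:
        # Assumption: treat the birthday as falling mid-year.
        month, day = 7, 1
    elif day is None:
        # Assumption: treat the birthday as the first of the known month.
        day = 1
    had_birthday = (reference.month, reference.day) >= (month, day)
    return reference.year - year - (0 if had_birthday else 1)

print(age_on(CENSUS_DAY, 1980, 3, 15))  # full date of birth
print(age_on(CENSUS_DAY, 1980))         # year of birth only
```

Whatever fallback is chosen, the point is that it is chosen deliberately; an unhandled blank can shift the computed age by a year and weaken the matching and de-duplication steps that rely on it.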

Incorrectly sorting out self-responses from group quarters residents. The 2020 Census allowed people to respond online or by phone without using the pre-assigned Census ID that links their response to their address. As a result, some people who live in group quarters facilities, such as nursing homes, were able to respond on their own even though they were also counted through the separate Group Quarters Enumeration operation. This also makes their address show up as a duplicate — as both a group quarters facility and a housing unit. Our business rules sort out these duplicate responses and addresses by accepting the response coming from the group quarters operation and removing the response and address appearing as a housing unit. We found an error in how this rule was being carried out. The code was correctly removing the duplicate address but wasn’t removing the duplicate response. We fixed this with another code correction, which enables us to avoid overcounting these residents. 
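
The business rule in this example can be sketched as follows. Everything here is hypothetical: the record layout, the `person` key used to match responses, and the sample data are assumptions, and the excerpt does not say how the Bureau actually links a self-response to a group quarters record. The sketch shows the intended behavior, in which both the duplicate housing-unit response and the duplicate housing-unit address are removed.

```python
def resolve_duplicates(responses, addresses):
    """Keep group quarters responses; drop the duplicate housing-unit
    response and the duplicate housing-unit address."""
    gq_people = {r["person"] for r in responses if r["source"] == "group_quarters"}
    duplicates = [
        r for r in responses
        if r["source"] == "housing_unit" and r["person"] in gq_people
    ]
    dup_addresses = {r["address"] for r in duplicates}

    # Remove the duplicate responses (the step the buggy code skipped).
    kept_responses = [r for r in responses if r not in duplicates]
    # Remove the housing-unit copy of each duplicated address.
    kept_addresses = [
        (addr, kind) for addr, kind in addresses
        if not (kind == "housing_unit" and addr in dup_addresses)
    ]
    return kept_responses, kept_addresses

# Hypothetical data: person A was counted twice at the same facility.
responses = [
    {"person": "A", "source": "group_quarters", "address": "10 Elm St"},
    {"person": "A", "source": "housing_unit", "address": "10 Elm St"},
    {"person": "B", "source": "housing_unit", "address": "22 Oak Ave"},
]
addresses = [
    ("10 Elm St", "group_quarters"),
    ("10 Elm St", "housing_unit"),
    ("22 Oak Ave", "housing_unit"),
]

kept_responses, kept_addresses = resolve_duplicates(responses, addresses)
print(len(kept_responses), kept_addresses)
```

Dropping the address while keeping the response, as the excerpt describes, would leave person A counted twice in this sample even though the address list looks clean.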

Author(s): MICHAEL THIEME, ASSISTANT DIRECTOR FOR DECENNIAL CENSUS PROGRAMS, SYSTEMS AND CONTRACTS

Publication Date: 9 March 2021

Publication Site: U.S. Census Bureau

The Importance of How we See the Numbers During the Pandemic

Link: https://idsc.miami.edu/the-importance-of-how-we-see-the-numbers-during-the-pandemic/

Excerpt:

Speaking to Al Jazeera English for a piece entitled “The Power and Politics of Data Visualisation,” three contributors looked at how data is often presented as objective truth, but the way it is presented, interpreted, and contextualized can distort its original purpose. Turning data into graphics people can understand is increasingly important, but viewers should also be better informed and more careful in recognizing the nature of uncertainty in these visualizations.

The piece looks at how important it is to be able to trust the data, yet it’s equally important that viewers understand that the visualization of the data can be influenced by human decisions on the collection, interpretation, and depiction of the data. Dr. Cairo says, “Data visualizations are some of the best tools that we have to understand the world if we use them well and we interpret them well, but that doesn’t mean that those numbers are the whole story. We also need to use logic and scientific reasoning.”

Publication Date: 26 February 2021

Publication Site: Institute for Data Science & Computing at University of Miami

Reflecting on a decade of data science and the future of visualization tools

Link: https://www.tableau.com/about/blog/2021/2/data-science-and-future-visualization-tools-reflection

Excerpt:

Importantly, a working definition of data science narrows the scope of research. Instead of considering all possible types of data analysis that one may wish to conduct, we look closely at the types of analyses data scientists carry out. This distinction is important as the specific steps that, say, an experimental physicist takes to analyze data are different, even though they share commonalities, than the analytic steps a data scientist may take. Which leads to an important follow on: what exactly is data science work?

There have been several industry standards for breaking down data science work. The first was the KDD (Knowledge Discovery in Databases) method, which over time was modified and expanded upon by others. From these derivations, as well as studies that interview data scientists, we created a framework that has four higher order processes (preparation, analysis, deployment, and communication) and 14 lower order processes. Using the red stroke outline we also highlighted the specific areas where data visualization already plays a prominent role in data science work. In our research article we provide detailed definitions and examples of these processes.

Author(s): ANA CRISAN

Publication Date: 24 February 2021

Publication Site: Tableau

CVS, Walgreens Look for Big Data Reward From Covid-19 Vaccinations

Link: https://www.wsj.com/articles/cvs-walgreens-look-for-big-data-reward-from-covid-19-vaccinations-11614681180?mod=djemwhatsnews

Excerpt:

Administering Covid-19 vaccines comes with a valuable perk for retail pharmacies: access to troves of consumer data.

Chains such as CVS Health Corp., Walmart Inc. and Walgreens-Boots Alliance, Inc. are collecting data from millions of customers as they sign up for shots, enrolling them in patient systems and having recipients register customer profiles.

The retailers say they are using the information to promote their stores and services, tailor marketing and keep in touch with consumers. The companies also say the information is critical in streamlining vaccinations and improving record-keeping, while ensuring only qualified people are receiving shots.

Author(s): Sharon Terlep

Publication Date: 2 March 2021

Publication Site: Wall Street Journal