data quality – Actuarial News

Government Financial Reporting – Data Standards and the Financial Data Transparency Act

Date and Time of upcoming event: 3:00 PM ET Tuesday, January 24, 2023 (60 Minutes)

Description:

The U.S. Congress passed legislation on December 15, 2022 that includes requirements for the Securities and Exchange Commission to adopt data standards related to municipal securities. The Financial Data Transparency Act (FDTA) aims to improve transparency in government reporting, while minimizing disruptive changes and requiring no new disclosures. The University of Michigan’s Center for Local, State and Urban Policy (CLOSUP) has partnered with XBRL US to develop open, nonproprietary financial data standards that represent government financial reporting and could be freely leveraged to support the FDTA. The Annual Comprehensive Financial Reporting (ACFR) Taxonomy today represents general purpose governments, as well as some special districts, and can be expanded to address all types of governments that issue debt securities. CLOSUP has also conducted pilots with local entities including the City of Flint.

Attend this 60-minute session to explore government data standards, find out how governments can create their own machine-readable financial statements, and discover what impact this legislation could have on government entities. Most importantly, discover how machine-readable data standards can benefit state and local government entities by reducing costs and increasing access to time-sensitive information for policy making.

Presenters:

Marc Joffe, Public Policy Analyst, Public Sector Credit
Stephanie Leiser, Fiscal Health Project Lead, Center for Local, State and Urban Policy (CLOSUP), University of Michigan’s Ford School of Public Policy
Campbell Pryde, President and CEO, XBRL US
Robert Widigan, Chief Financial Officer, City of Flint

Publication Site: XBRL.us

Underdispersion in the reported Covid-19 case and death numbers may suggest data manipulations

February 26, 2022 Mary Pat Campbell

Link: https://www.medrxiv.org/content/10.1101/2022.02.11.22270841v1

doi: https://doi.org/10.1101/2022.02.11.22270841

Abstract:

We suggest a statistical test for underdispersion in the reported Covid-19 case and death numbers, compared to the variance expected under the Poisson distribution. Screening all countries in the World Health Organization (WHO) dataset for evidence of underdispersion yields 21 countries with statistically significant underdispersion. Most of the countries in this list are known, based on the excess mortality data, to strongly undercount Covid deaths. We argue that Poisson underdispersion provides a simple and useful test to detect reporting anomalies and highlight unreliable data.
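
As a rough illustration of what "underdispersion relative to Poisson" means, the sketch below applies the classical index-of-dispersion test to a short run of counts. It is not necessarily the test used in the paper; the function names and the sample counts are hypothetical. Under a Poisson model the variance of the counts should be close to their mean, so a variance-to-mean ratio far below 1, with a small lower-tail p-value, flags counts that are "too smooth".

```python
import math
from statistics import mean, variance


def chi2_cdf(x, df):
    """Lower-tail chi-square CDF, computed with the power series for the
    regularized lower incomplete gamma function P(df/2, x/2).
    Intended for small illustrative inputs, not extreme statistics."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total and k < 10_000:
        term *= z / (a + k)
        total += term
        k += 1
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    return min(1.0, total * math.exp(log_prefactor))


def underdispersion_test(counts):
    """Return (variance-to-mean ratio, lower-tail p-value).

    Under a Poisson model, (n - 1) * variance / mean is approximately
    chi-square distributed with n - 1 degrees of freedom, so a small
    lower-tail probability suggests underdispersion."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    m = mean(counts)
    if m <= 0:
        raise ValueError("mean count must be positive")
    ratio = variance(counts) / m
    p_value = chi2_cdf((n - 1) * ratio, n - 1)
    return ratio, p_value


# Hypothetical daily counts, invented for illustration only.
smooth_counts = [100, 101, 99, 100, 102, 98, 100, 101]
ratio, p_value = underdispersion_test(smooth_counts)
print(f"variance/mean = {ratio:.3f}, lower-tail p = {p_value:.2e}")
```

Real reported series trend over time, so a test like this would normally be applied to short windows or to detrended counts rather than to a whole series at once.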

Author(s): Dmitry Kobak

Publication Date: 13 Feb 2022

Publication Site: medRxiv

Coffee Chat – “Data & Science”

October 19, 2021 Mary Pat Campbell

Link: https://www.youtube.com/watch?v=S5GHsjgSl1o&ab_channel=DataScienceWithSam

Excerpt:

The inaugural coffee chat of my YouTube channel features two research scholars from the scientific community, who share their perspectives on the crucial role data plays in research.
By watching this video you will gather information on the following topics:
a) the importance of data in scientific research,
b) valuable insights about the data handling practices in research areas related to molecular biology, genetics, organic chemistry, radiology and biomedical imaging,
c) future of AI and machine learning in scientific research.

Author(s): Efrosini Tsouko, PhD from Baylor College of Medicine; Mausam Kalita, PhD from Stanford University; Soumava Dey

Publication Date: 26 Sept 2021

Publication Site: Data Science with Sam at YouTube

Predictably inaccurate: The prevalence and perils of bad big data

September 11, 2021 Mary Pat Campbell

Link: https://www2.deloitte.com/us/en/insights/deloitte-review/issue-21/analytics-bad-data-quality.html

Excerpt:

More than two-thirds of survey respondents stated that the third-party data about them was only 0 to 50 percent correct as a whole. One-third of respondents perceived the information to be 0 to 25 percent correct.
Whether individuals were born in the United States tended to determine whether they were able to locate their data within the data broker’s portal. Of those not born in the United States, 33 percent could not locate their data; conversely, of those born in the United States, only 5 percent had missing information. Further, no respondents born outside the United States and residing in the country for less than three years could locate their data.
The type of data on individuals that was most available was demographic information; the least available was home data. However, even if demographic information was available, it was not all that accurate and was often incomplete, with 59 percent of respondents judging their demographic data to be only 0 to 50 percent correct. Even seemingly easily available data types (such as date of birth, marital status, and number of adults in the household) had wide variances in accuracy.

Author(s): John Lucker, Susan K. Hogan, Trevor Bischoff

Publication Date: 31 July 2017

Publication Site: Deloitte

Excel autocorrect errors still plague genetic research

August 28, 2021 Mary Pat Campbell

Link: https://cosmosmagazine.com/science/biology/excel-autocorrect-errors-still-plague-genetic-research/

Excerpt:

Earlier this year we repeated our analysis. This time we expanded it to cover a wider selection of open access journals, anticipating researchers and journals would be taking steps to prevent such errors appearing in their supplementary data files.
We were shocked to find in the period 2014 to 2020 that 3,436 articles, around 31% of our sample, contained gene name errors. It seems the problem has not gone away, and is actually getting worse.
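
The gene name errors discussed here arise when a spreadsheet silently converts symbols such as SEPT2 or MARCH1 into dates like "2-Sep" or "1-Mar". A minimal sketch of how a supplementary gene column could be screened for such entries follows; the function name, the pattern, and the sample column are assumptions made for illustration, and this is not the authors' screening pipeline.

```python
import re

# Month abbreviations as they appear in spreadsheet-rendered dates, e.g. "2-Sep".
_MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Matches "2-Sep" or "Sep-2" style cells; real gene symbols contain no such pattern.
_DATE_LIKE = re.compile(
    rf"^(?:\d{{1,2}}-(?:{_MONTHS})|(?:{_MONTHS})-\d{{1,2}})$",
    re.IGNORECASE,
)


def find_date_like_entries(gene_column):
    """Return (row_index, value) pairs that look like dates rather than gene symbols."""
    return [
        (i, value)
        for i, value in enumerate(gene_column)
        if _DATE_LIKE.match(value.strip())
    ]


# Hypothetical column from a supplementary file, invented for illustration.
column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "DEC1"]
flagged = find_date_like_entries(column)
print(flagged)
```

A check like this only catches the date-style conversions; other spreadsheet conversions, such as identifiers turned into floating-point numbers, would need separate patterns.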

Author(s): Mark Ziemann, Deakin University and Mandhri Abeysooriya, Deakin University

Publication Date: 27 August 2021

Publication Site: Cosmos magazine

Rebekah Jones’s Lies about Florida COVID Data Keep Piling Up

June 9, 2021 Mary Pat Campbell

Link: https://www.nationalreview.com/2021/06/rebekah-joness-lies-about-florida-covid-data-keep-piling-up/

Excerpt:

One of the most persistent falsehoods of the COVID pandemic has been the claim that Florida has been “hiding” data. This idea has been advanced primarily by Rebekah Jones, a former Florida Department of Health employee, who, having at first expressed only some modest political disagreements with the way in which Florida responded to COVID, has over time become a fountain of misinformation.
…..
To understand what is happening here, one needs to go back to the beginning. Over the past 15 months, Florida has published a truly remarkable amount of COVID-related data. At the heart of this trove has been a well-maintained list of literally every documented case of COVID — listed by county, age, and gender, and replete with information about whether the patient had recently traveled, had visited the ER, had been hospitalized, and had had any known contact with other Floridians. To my knowledge, Florida has been the only state in the union that has published this kind of data.
…..
To this day, you can download Florida’s case-line data and see 21 cases of COVID that, despite having been identified between March 2020 and December 2020, feature a December 2019 “Event Date.” To anyone who understands data, these results are clearly the product of the system having assigned a non-null default value when no data has been entered. To the Miami Herald, however, these results hinted at scandal. Even now, when its reporters know beyond any doubt that their initial instincts were wrong, the Herald continues to tell its readers that these entries serve as “evidence of community spread potentially months earlier than previously reported.” This is not true.

Author(s): Matt Shapiro

Publication Date: 8 June 2021

Publication Site: National Review

Alameda County Updates COVID-19 Death Calculation to Align with State Definitions

June 7, 2021 Mary Pat Campbell

Link: https://covid-19.acgov.org/covid19-assets/docs/press/press-release-2021.06.04.pdf

Excerpt:

Today, June 4, Alameda County’s COVID-19 dashboard will be updated to reflect the total number of COVID-19 deaths using the State’s death reporting definition. Alameda County previously included any person who died while infected with the virus in the total COVID-19 deaths for the County. Aligning with the State’s definition will require Alameda County to report as COVID-19 deaths only those people who died as a direct result of COVID-19, with COVID-19 as a contributing cause of death, or in whom death caused by COVID-19 could not be ruled out. Based on data available as of May 23, 2021, this update will decrease the overall number of deaths from 1,634 to 1,223.
….
This update does not disproportionally impact reported deaths for any specific race or ethnic group or zip code.
Close observers of Alameda County’s dashboard may have noticed a substantial increase in the COVID-19 death totals prior to this update, during the week of May 17. This increase was due to a separate quality assurance process intended to correct previously incomplete data; adjustments were made based on additional information that became available regarding date of death and county of residence. These corrections are unrelated to the current alignment with the State’s definition of death due to COVID-19, and some of the deaths will be removed from the updated totals because COVID-19 was not a contributing cause.

Author(s): Neetu Balram

Publication Date: 4 June 2021

Publication Site: Alameda County Health Care Services Agency

Tag: data quality

Government Financial Reporting – Data Standards and the Financial Data Transparency Act