PSI 2023 Conference

We are pleased to announce that we have submitted three abstracts to the PSI 2023 Conference in London. We are curious to see which ones you find interesting. Please visit our LinkedIn page and join the conversation about these abstracts.

Data Visualization: A Useful Tool in the Assessment of Multiple Endpoints in Ultrarare Diseases

by Hanna-Liza Jooste

It has been proposed that a disease be considered ultrarare when it affects fewer than 20 patients per 1 million population. Conducting a clinical trial in such a disease raises several challenges, one of them being the choice of a single primary outcome. Ultrarare disease impairments are sometimes multidimensional, and there is still a limited understanding of disease pathology, disease progression, and variability in disease presentation, making it difficult to prove efficacy. There is also still a lack of established endpoints, so early-stage trials must explore every possible biomarker, gene marker, or laboratory marker. Facing all these challenges, how does a biostatistician best present results based on a small number of patients but many endpoints?

This poster will show how data visualization can be an effective tool when multiple quantitative variables must be assessed at a glance. Radar graphing is widely known but, in my opinion, under-utilized. It is a useful way to present many independent variables in a single graphical display, even when these variables have different measurement scales. This poster will show how radar graphing can be used to illustrate the change in patient profiles over time.
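As a rough illustration of how variables on different measurement scales can share one radar display, the sketch below rescales each endpoint to the 0 to 1 range and places it on its own axis. The endpoint names, display ranges, and patient values are hypothetical and are not taken from the poster; the functions `rescale` and `radar_vertices` are likewise illustrative helpers, not part of any library.

```python
import math

def rescale(value, low, high):
    """Map a value from its own measurement scale onto the 0-1 range."""
    return (value - low) / (high - low)

def radar_vertices(profile, ranges):
    """Return (name, x, y) for each endpoint, one axis per endpoint.

    Axes are spread evenly around the circle; the distance from the
    centre is the rescaled value, so all endpoints are comparable.
    """
    names = list(ranges)
    points = []
    for i, name in enumerate(names):
        low, high = ranges[name]
        radius = rescale(profile[name], low, high)
        angle = 2 * math.pi * i / len(names)
        points.append((name, radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Hypothetical endpoints and display ranges (assumptions for illustration only).
ranges = {
    "walk_distance_m": (0, 600),
    "biomarker_ng_ml": (0, 50),
    "qol_score": (0, 100),
    "lung_function_pct": (0, 120),
}
# One hypothetical patient at two visits.
baseline = {"walk_distance_m": 300, "biomarker_ng_ml": 40,
            "qol_score": 50, "lung_function_pct": 60}
month_12 = {"walk_distance_m": 420, "biomarker_ng_ml": 25,
            "qol_score": 65, "lung_function_pct": 72}

baseline_polygon = radar_vertices(baseline, ranges)
month_12_polygon = radar_vertices(month_12, ranges)
```

Drawing the two polygons on the same axes with any plotting tool would show the change in this patient's profile between visits. In practice, the choice of display range for each endpoint, and whether higher values mean improvement, would need to be settled per endpoint.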

Achieving Medical Precision: Calibrating Machine Learning Models for Spot-on Disease Prediction

by Javier Rodriguez Soto

The accurate prediction of disease probabilities for individuals is crucial in medical practice. Such predictions can aid in clinical decision making and improve patient communication. However, models may produce consistently high or low probability estimates regardless of whether the event occurs or not.

Probability calibration addresses this issue by aligning the distribution and behavior of predicted probabilities with those of observed probabilities. Despite its importance, calibration is often overlooked in favor of model discrimination. While a calibrated model may have a lower Area Under the Receiver Operating Characteristic Curve (AUC-ROC), it can provide more accurate risk estimates.

Many models are not calibrated by default, leading to over- or under-confidence in predictions. Even accurate classifiers may produce poor-quality probability estimates. A well-calibrated model should not only assign higher probabilities to patients with the event than to those without, but also produce well-estimated probabilities overall.
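The distinction between discrimination and calibration can be made concrete with a small sketch. The two toy "models" below are invented for illustration, and the expected calibration error (a binned, weighted gap between mean predicted and observed event rates) is only one possible summary of calibration, not necessarily the one used in the poster. The bin edges are also an assumption.

```python
from bisect import bisect_right

def auc(y_true, y_prob):
    """Pairwise AUC: share of (event, non-event) pairs ranked correctly, ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

def expected_calibration_error(y_true, y_prob, edges=(0.2, 0.4, 0.6, 0.8)):
    """Weighted mean absolute gap between predicted and observed rates per bin."""
    bins = {}
    for y, p in zip(y_true, y_prob):
        bins.setdefault(bisect_right(edges, p), []).append((y, p))
    total = len(y_true)
    ece = 0.0
    for members in bins.values():
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += abs(predicted - observed) * len(members) / total
    return ece

# Toy model A: ranks every event above every non-event, but all risks are too high.
y_a = [1] * 10 + [0] * 10
p_a = [0.9] * 10 + [0.6] * 10

# Toy model B: ranks less sharply, but predicted risks match observed rates
# (7 of 10 events at a predicted 0.7, 3 of 10 events at a predicted 0.3).
y_b = [1] * 7 + [0] * 3 + [1] * 3 + [0] * 7
p_b = [0.7] * 10 + [0.3] * 10

auc_a, ece_a = auc(y_a, p_a), expected_calibration_error(y_a, p_a)
auc_b, ece_b = auc(y_b, p_b), expected_calibration_error(y_b, p_b)
```

In this toy setting, model A has the higher AUC while model B has the smaller calibration error, which is the trade-off described above.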

This poster will discuss the concept of model calibration, its significance, and methods for assessing calibration (both visually and quantitatively). Additionally, the two most commonly used techniques for calibrating machine learning models will be presented.

Man vs Machine: Why the “Subjective” Human Brain is not Obsolete in Statistical Analysis

by Mariska Burger

The “machine”, in the form of ChatGPT, launched by OpenAI at the end of November 2022, has taken over! “Techies” everywhere are exploring ways of using ChatGPT, from explaining complex terminology to a child to debugging or translating code. ChatGPT might be able to draw graphs, but when it comes to interpreting graphs, “man” triumphs over “machine”!

While developing an application to perform repeated measures analysis of variance (ANOVA), we realized that the “subjective” human brain is NOT obsolete in statistical analysis. We experienced multiple challenges while developing this “black box” application and had to find alternative ways of dealing with them.

One of the challenges was evaluating the normality assumption of the model. Since this is a “black box” application, we were forced to make use of the objective null hypothesis significance testing (NHST) framework. Usually, statisticians evaluate this assumption by visually inspecting the scaled residual plots, something that can’t be built into the application. If you use this strict, objective NHST framework, conclude that the residuals are not normally distributed, and choose to transform your data, you could even introduce skewness into otherwise “approximately” normally distributed data. And if you choose non-parametric methods instead, you lose power and ease of interpretation.
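One way to see why a strict NHST verdict on normality can mislead is that the verdict depends on sample size as well as on the shape of the residuals. The sketch below is an assumption-laden illustration, not the test used in the application: it uses the Jarque-Bera statistic with its asymptotic chi-square (2 degrees of freedom) p-value, which is only a rough approximation in small samples, and the "residuals" are synthetic values built from normal quantiles with a small hypothetical skewing term `c`.

```python
import math
from statistics import NormalDist

def jarque_bera(sample):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    stat = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    p_value = math.exp(-stat / 2.0)  # survival function of chi-square with 2 df
    return stat, p_value

def mildly_skewed_residuals(n, c=0.05):
    """Synthetic residuals: evenly spaced normal quantiles plus a small quadratic skew."""
    std_normal = NormalDist()
    z = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return [v + c * v * v for v in z]

# Same mild departure from normality, two different sample sizes.
stat_small, p_small = jarque_bera(mildly_skewed_residuals(30))
stat_large, p_large = jarque_bera(mildly_skewed_residuals(3000))

print(f"n=30:   p-value above 0.05? {p_small > 0.05}")
print(f"n=3000: p-value above 0.05? {p_large > 0.05}")
```

With the same mild skew, the test does not reject normality at n = 30 but rejects it at n = 3000, even though a residual plot would look "approximately" normal in both cases. An automated rule that transforms the data whenever the test rejects would therefore behave very differently from a statistician looking at the plot.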

The poster will present some challenges we’ve experienced while developing the application and why we feel that the “subjective” human brain is not obsolete in statistical analysis.