In light of the current coronavirus pandemic, we’re reminded day by day that timely, accurate, in-context data is a key component of delivering effective analytics. We realize now more than ever that fostering collaboration from those on the frontline all the way to the data architects in the backroom is not just a nice-to-have; it’s imperative for survival.
We sat down with Stephanie Bruno, Senior Data Architecture Manager for the Elizabeth Glaser Pediatric AIDS Foundation, and Shannon Lindsay, a leader in BI Consulting and Training. Both of these healthcare analysts have a tremendous amount of experience in the public health sector, particularly in the HIV research field.
In our discussion, we learned more about their perspective on how data modeling techniques and collaborative processes may help us answer the many questions we all have about COVID-19 and beyond.
There is a lot of misinformation surrounding COVID-19. How do you know if analytics are good and responsible?
What we’re seeing with the coronavirus epidemic is that information is not well understood by even the experts, so it’s definitely not going to be well understood by the general public. Presenting incomplete information can lead to fear.
That’s something that we need to be really careful about as data practitioners. You have to understand who you are creating these data products for. Who is your audience?
The second thing is that there are no consistent data collection tools. We’re often stitching data together across different countries. Standards differ in every country, every state, and every county.
A friend forwarded me an article the other day and said, “Great news, the death rate is going down.” The reality of the coronavirus situation is that we do not have enough data to understand what is happening with the death rate.
We must be careful when we’re presenting an analytics product to someone. If you’re presenting data to the public, make sure you frame it properly. Especially with the coronavirus pandemic, let people know the source, that new data is still coming, and that we don’t have a complete picture yet.
Why is it essential for data to be timely, accurate, and in context?
Health care providers must respond to the needs on the ground and know what they’re dealing with. No one can deal with the unknown. So the best that we can do, especially in the case of COVID-19, is make sure that we gather data as it happens and be careful about immediately presenting incomplete data to the public.
In other words, the data must be put in context, and it must be complete and consistent. If we don’t check all of these important boxes, the data is weak, inaccurate, and certainly not meaningful to anyone.
The people who are expected to provide care must have the information they need. Disease doesn’t happen in a vacuum, so just looking at the clinical data isn’t enough. To effectively combat an epidemic, we need to provide other types of data to clinicians and decision-makers.
A diverse data set is useful, covering everything from the number of COVID-19 cases to the supply chain. What is available? Down the line…is there personal protective gear? Are there enough ventilators?
It’s data about people, but it’s also data about everything that’s around those people.
What does data modeling look like in the public health sector?
My experience has been with HIV. I’ve worked at the Elizabeth Glaser Pediatric AIDS Foundation for about 11 years. In that time, I’ve found no consistent data collection mechanisms. Every country we work in, and every donor that we work with, has different collection systems and different requirements.
We’ve spent a lot of our time figuring out how to clean up the data, organize it, and put it together. And that’s where data modeling comes in. We work really hard with our in-country counterparts to come up with consistent rules…the naming convention. Otherwise, we just spend all of our time stitching data together.
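To make the idea of a shared naming convention concrete, here is a minimal sketch in plain Python. It maps differently named columns from two hypothetical country exports onto one agreed schema. The source names, column names, mapping table, and sample rows are all invented for illustration and are not the foundation's actual conventions.

```python
# Hypothetical mappings from each source's local column names to a shared schema.
COLUMN_MAP = {
    "country_a": {"PatientID": "patient_id", "VisitDate": "visit_date", "Site": "facility"},
    "country_b": {"pid": "patient_id", "date_of_visit": "visit_date", "clinic_name": "facility"},
}


def harmonize(source, rows):
    """Rename each row's keys to the shared schema; fail loudly on unmapped columns."""
    mapping = COLUMN_MAP[source]
    harmonized = []
    for row in rows:
        unknown = set(row) - set(mapping)
        if unknown:
            raise ValueError(f"{source}: unmapped columns {sorted(unknown)}")
        harmonized.append({mapping[key]: value for key, value in row.items()})
    return harmonized


# Made-up sample rows, one per source.
rows_a = [{"PatientID": "A-001", "VisitDate": "2020-03-01", "Site": "Clinic 1"}]
rows_b = [{"pid": "B-417", "date_of_visit": "2020-03-02", "clinic_name": "Clinic 9"}]

combined = harmonize("country_a", rows_a) + harmonize("country_b", rows_b)
for record in combined:
    print(record)
```

Raising an error on an unmapped column is a deliberate choice in this sketch: a new or renamed field gets noticed when the data arrives instead of being silently dropped during stitching.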
Is it even possible to make accurate predictions about the coronavirus pandemic?
With the HIV work we’ve done in Africa, poor quality data was our top challenge. For people who are HIV positive, if they miss doses of their medication, they can get sick and become able to infect other people again.
We looked at patients’ health clinic visit records and tried to see if we could build a predictive model on who we think might not come back to get their next dose. It was really exciting. We thought that we had enough data to come up with a good model.
But what we found is we didn’t have enough demographic information to get a complete picture. It was really deflating. We thought we could build a tool to help healthcare providers flag people who are likely to not come back. But with poor quality data and not enough information, it’s pretty limited.
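One way to catch this kind of gap early is to measure how complete each field is before any modeling starts. The sketch below is only an illustration under assumed conditions: the field names and visit records are made up, and it is not the analysis the team actually ran.

```python
def field_completeness(records, fields):
    """Return, for each field, the share of records that have a non-empty value."""
    total = len(records)
    report = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = filled / total if total else 0.0
    return report


# Made-up visit records with gaps in the demographic fields.
visits = [
    {"patient_id": "001", "age": 34, "sex": "F", "district": ""},
    {"patient_id": "002", "age": None, "sex": "M", "district": "North"},
    {"patient_id": "003", "age": 29, "sex": "", "district": ""},
    {"patient_id": "004", "age": None, "sex": "F", "district": ""},
]

report = field_completeness(visits, ["age", "sex", "district"])
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```

A field that is mostly empty cannot contribute much to a predictive model, so a report like this gives an early signal about whether the data can support the question being asked.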
I’m not sure where we are with coronavirus. It might be a similar boat in terms of poor quality data.
How can organizations improve data quality?
My number one tip is: Build a decent data collection system.
A lot of times you assume, “If I collect the data, then it will all work out just fine.” But there aren’t enough resources put into building a robust data collection system. As a result, you get garbage. And that’s what we’ve experienced over the past decades, dealing with data that comes from poor quality collection tools.
There are so many good low-code tools that exist now to help you build a quality data collection system.
It’s going to take someone a very long time to stitch together the details of what has happened here in the U.S. as a result of this pandemic. I foresee big challenges coming down the line with all of the disparate data systems that exist here.
We need to ensure that these analytic tools allow us to tell a story with the data that we have—that we’re answering questions and that we’re presenting things in a way that makes sense.
What are some low-code tools you recommend?
Microsoft Power Apps is coming out pretty strong, and you don’t need to be a developer. Power Apps gives you much more control over how data is collected, so you end up with clean data.
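The underlying idea, controlling what can be entered at the moment of collection, can be sketched in a few lines. The example below is plain Python and not Power Apps code. The field names, the allowed values, and the rules are hypothetical stand-ins for whatever a real collection form would enforce.

```python
from datetime import date

ALLOWED_SEX = {"F", "M", "unknown"}  # hypothetical controlled vocabulary


def validate_entry(entry, today=None):
    """Return a list of problems with a submitted record; an empty list means accepted."""
    today = today or date.today()
    problems = []
    if not str(entry.get("patient_id", "")).strip():
        problems.append("patient_id is required")
    if entry.get("sex") not in ALLOWED_SEX:
        problems.append("sex must be one of " + ", ".join(sorted(ALLOWED_SEX)))
    try:
        visit = date.fromisoformat(str(entry.get("visit_date", "")))
        if visit > today:
            problems.append("visit_date cannot be in the future")
    except ValueError:
        problems.append("visit_date must be in YYYY-MM-DD format")
    return problems


# Made-up submissions: one clean, one with three typical entry problems.
good = {"patient_id": "A-001", "sex": "F", "visit_date": "2020-03-01"}
bad = {"patient_id": " ", "sex": "female", "visit_date": "03/01/2020"}

print(validate_entry(good, today=date(2020, 4, 1)))
print(validate_entry(bad, today=date(2020, 4, 1)))
```

Rejecting a record at entry, with a message the person entering it can act on, is usually cheaper than cleaning the same mistake out of the data months later.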
What does it take to create an effective analytics process?
Oftentimes as data practitioners, others dump a pile of data on our desks and say, “Here is this treasure trove of data. Find some insights for me!”
But if we’re doing effective analytics—where we’re making use of timely, high-quality data—we have to understand what questions the end-users, the researchers, or the clinicians are trying to ask or trying to answer.
You also want to ask: What kind of analytics product am I building, and what are you going to do with it? Think of analytics as an iterative process. These kinds of analytic products are never done.
How important is collaboration when building an analytics-driven organization?
For many years I was a software developer, hiding in the back room and working by myself. I thought that was totally fine. But technology is changing so fast and it’s impossible to keep up with everything by yourself.
Things really changed for me when I got involved in the Power BI community. You can’t keep up with everything in isolation. But working with other people who are doing the same thing as you? It’s life-changing.
Collaboration is critical at every level, from gathering information to developing the product itself and then deploying it.
This is not the first time humanity has experienced a pandemic. We can continue to learn from what we’ve done in the past and build on that. But we won’t succeed if we are working in isolation—whether that’s one company or one country.
We’re all trying to improve outcomes. That’s the end goal. Hopefully, we can all contribute to that together.
From public health to B2B, an analytics-driven organization lifts teams up to achieve unparalleled results. When you prioritize the need for effective analytics, you gain more clarity and understanding.
The Collectiv team is here to help you build an analytics center of excellence. Prepare your organization now so you’re ready for tomorrow’s challenges.