Bricks & Bedside: Correlating Health Facility Availability with Mortality Rates Across Philippine Regions

A data-driven exploration of healthcare infrastructure and its potential impact on mortality rates in the Philippines

Overview

Despite advances in technology and medicine, access to proper healthcare remains limited for many Filipinos, possibly contributing to the country's persistently high mortality rate.

With this study, we intend to examine the factors influencing healthcare accessibility and to determine whether accessibility correlates significantly with Filipino mortality rates, shedding light on the disparities in healthcare availability across regions.

Background

Over the years, access to healthcare and basic health services has remained scarce and uneven in the Philippines.

In April 2024, the World Health Organization stated that almost 800 million people in the Western Pacific region, which includes the Philippines, have insufficient access to basic healthcare services. This is further reflected in the continued rise in household expenditures, from ₱686.49 billion in 2023 to ₱754.54 billion in 2024. Although the number of facilities has increased, totaling 40,108 health facilities nationwide as of January 2024, their distribution remains uneven across regions: CALABARZON has 4,415 facilities, while NCR and CAR have 2,784 and 1,278, respectively.

Sources: [1] [2]

Problem

The inaccessibility of proper healthcare for many Filipinos has been exacerbated by the shortage of facilities and professionals in the industry, as well as the rising cost of healthcare itself.

Solution

Our solution is to use data science methods and tools to analyze the factors that affect healthcare accessibility among Filipinos, gaining insight into the current state of our healthcare system and paving the way for feasible solutions.

We aim to answer the following questions:

  1. How many health facilities are currently operational in each region?
  2. How does the availability of health facilities correlate with the mortality rate in each region?

From these questions, we formulated the following hypotheses:

Null Hypothesis

There is little to no correlation between the number of operational health facilities in a region and the region's mortality rate.

Alternative Hypothesis

There is a significant correlation between the number of operational health facilities in a region and the region's mortality rate.

Data Collection

Where we got our data:

Vital Statistics Report on Deaths in the Philippines 2023 - Philippine Statistics Authority

The 2023 yearly report on deaths in the Philippines. The dataset has a total of 698,421 observations.

Active Facilities List - National Health Facility Registry

The master list of all active health facilities and health-related facilities. The dataset has a total of 118 observations.

Field Health Services Information System 2023 - Department of Health

The report on the national facility-based routine health information system used by the Department of Health to collect and manage data on health service delivery and public health programs. However, we only used data on the population in each province and region. The dataset has a total of 122 observations.

View our dataset here!

Data Exploration

How did we explore our data?

01

How many health facilities are currently operational in each region?

Bar chart showing total health facilities per region in the Philippines

What we found:

We began our exploration by taking the master list of active health facilities, cleaning and normalizing each "Region" label, then computing a new "Total Facilities" column (summing hospitals, RHUs, barangay stations, birthing homes, etc.). Next, we grouped that table by region and plotted a simple vertical bar chart of total facilities per region. That visualization let us immediately spot which regions (e.g. Region IVA, Region V) had the largest networks of clinics and which (e.g. BARMM, Region IX) had the fewest, guiding our subsequent per-capita and mortality-rate analyses.
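The cleaning and grouping steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual notebook: the column names, the sample rows, and the helper names `normalize_region` and `total_facilities_by_region` are all hypothetical, and a text bar stands in for the plotted chart.

```python
from collections import defaultdict

# Hypothetical rows standing in for the facility master list. The column
# names and counts are illustrative only, not taken from the real dataset.
rows = [
    {"Region": " region iv-a ", "Hospitals": 3, "RHUs": 5, "Barangay Stations": 12},
    {"Region": "REGION IV-A", "Hospitals": 1, "RHUs": 2, "Barangay Stations": 7},
    {"Region": "barmm", "Hospitals": 1, "RHUs": 1, "Barangay Stations": 2},
]


def normalize_region(label):
    """Trim, collapse internal whitespace, and upper-case a region label."""
    return " ".join(label.split()).upper()


def total_facilities_by_region(rows):
    """Sum every facility-type count per normalized region label."""
    totals = defaultdict(int)
    for row in rows:
        region = normalize_region(row["Region"])
        # "Total Facilities" is the sum of all non-label columns in the row.
        totals[region] += sum(v for k, v in row.items() if k != "Region")
    return dict(totals)


totals = total_facilities_by_region(rows)

# Crude text bar chart, largest region first.
for region, count in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region:12s} {'#' * count} {count}")
```

Normalizing the labels before grouping matters here: without it, differently cased or padded spellings of the same region would be counted as separate regions.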

💡

Some regions have significantly more health facilities than others, highlighting infrastructure disparities.

02

How does the availability of health facilities correlate to the mortality rate in each region?

Scatter plot showing correlation between health facility density and mortality rate by region

What we found:

The scatter plot gives each region its own data point. A point's x-value is the region's health facility density, (number of health facilities / total population of the region) × 10,000, and its y-value is the region's mortality rate, (number of deaths / total population of the region) × 10,000. A best-fit line obtained by least squares regression shows a positive relationship between a region's health facility density and its mortality rate.
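To make the two per-10,000 rates and the best-fit line concrete, here is a small sketch using only the Python standard library. The three regions and their figures are made up for illustration, and the function names `rate_per_10k` and `least_squares_line` are our own; the real analysis may have used a plotting or statistics library instead.

```python
def rate_per_10k(count, population):
    """Scale a raw count to a rate per 10,000 people."""
    return count / population * 10_000


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical regions: (facilities, deaths, population). Not real data.
regions = {
    "Region A": (500, 6_000, 1_000_000),
    "Region B": (300, 10_000, 2_000_000),
    "Region C": (800, 28_000, 4_000_000),
}

density = [rate_per_10k(f, p) for f, _, p in regions.values()]
mortality = [rate_per_10k(d, p) for _, d, p in regions.values()]

slope, intercept = least_squares_line(density, mortality)
print(f"mortality ~ {slope:.3f} * density + {intercept:.3f}")
```

A positive slope corresponds to the upward-tilting best-fit line described above; it says nothing by itself about whether the relationship is statistically significant.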

💡

Contrary to initial expectations, regions with higher health facility density sometimes show higher mortality rates, suggesting complex socioeconomic factors at play.

Hypothesis Testing Results

Pearson's Correlation Coefficient:

r = 0.169
p-value = 0.518

Indicating a positive but statistically insignificant correlation

We used Pearson's r to test the correlation between the independent variable of our study, a region's health facility density (per 10,000 people), and the dependent variable, the region's mortality rate (per 10,000 people). We ran the test in Python on a CSV built from our dataset, with columns for Region Name, Deaths, Population, Mortality Rate (MR), and Health Facility Density (HFD); the same values were used to create the graph for Research Question 2. The test returned a correlation coefficient of 0.169, which suggests a weak positive correlation between HFD and MR. However, the p-value of 0.518 is well above the standard 0.05 threshold. The 0.169 finding is therefore statistically insignificant: a result like this could plausibly arise by chance and does not demonstrate any relationship between the two variables. We have thus failed to reject the null hypothesis.
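For readers who want to see what the test computes, the sketch below implements Pearson's r from its definition and converts a correlation into the t statistic used for the significance test. It is an illustration under our own assumptions, not the code the team ran: the function names are hypothetical, and the sample size of 18 is taken from the Study Limitations section.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Reported coefficient, with the sample size assumed to be 18 regions.
r_reported, n_regions = 0.169, 18
t = t_statistic(r_reported, n_regions)
print(f"t = {t:.3f} with {n_regions - 2} degrees of freedom")
```

Under that assumption the t statistic comes out at roughly 0.69, far below the two-sided 5% critical value of about 2.12 for 16 degrees of freedom, which is consistent with the decision not to reject the null hypothesis.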

View how we did our hypothesis testing here

Discussion

Study Limitations

  • Small sample size (18 regions) limiting statistical power
  • Lack of data on other contributing factors (poverty, geography, etc.)
  • Region-level analysis restricts granularity of insights
  • No access to 2024 data for more current analysis

Due to the limited metrics available in our current dataset, our sample of 18 regions was not "sufficiently large" for the data to be treated as normally distributed, and a sample this small gives the test little statistical power. With a p-value of 0.518, we failed to reject the null hypothesis set in this study: that there is little to no correlation between healthcare facility density and mortality rate in a region.

Be that as it may, we believe our exploration and analysis of the dataset remain relevant to the problems we set out to investigate. The scatter plot shows that, even though the correlation between the two variables was statistically insignificant, there may still be a positive relationship: regions with a higher density of healthcare facilities tend to show higher mortality rates.

💡

One possible interpretation is that an increase in healthcare facilities does not directly raise mortality rates, but instead leads to more accurate and complete recording of deaths. In other words, improved healthcare infrastructure might enhance logistical capabilities, such as data collection and record-keeping, rather than affect mortality rates directly.

What conclusions or assumptions can be made from this? We believe that due to the lack of data on other possible factors influencing mortality rates—such as poverty levels, unemployment, geographical conditions, and more—we were unable to establish meaningful connections to explain this relationship.

Conclusion

While our current findings do not support a statistically significant correlation, they underscore the importance of a multidimensional approach in health systems research in a country as complex as the Philippines. The lack of strong correlation does not necessarily imply the absence of a relationship, but rather points to the limitations of analyzing health outcomes without considering other crucial variables.

Recommendations for Future Research

  • Include a broader range of variables (socioeconomic indicators, specific health outcomes, etc.)
  • Analyze at city/municipality level rather than regional level
  • Incorporate more recent and comprehensive datasets
  • Investigate specific causes of death in relation to healthcare facility types

These findings should serve as a foundation for more comprehensive and inclusive studies. Future research can better identify the root causes of regional disparities in health outcomes and contribute to evidence-based solutions. In this way, data science can become a powerful tool not just for understanding the state of public health, but for enacting substantial, lasting change across the country.

"Our results suggest that improved healthcare infrastructure may lead to better data reporting rather than higher mortality, raising questions that require deeper investigation."

Meet the Team!

Muhamad Khaled Pau-tin

II - BS Computer Science | CS 132 WFV

Sean Kenji Tolentino

III - BS Computer Science | CS 132 WFV

Nathaniel Silas Zuñiga

III - BS Computer Science | CS 132 WFV