Blog

Redefining Possibilities Through Data Science, Inspired by Purpose

By Michael Ochola and Christine Ger Ochola
Can you tell us a bit about yourself and your journey leading up to your involvement with the Data Science Without Borders (DSWB) initiative?
My name is Michael Ochola; my friends and family like to call me Mike. I grew up in Kisumu, in the western part of Kenya, on the outskirts of the city, with a blend of rural and urban experiences. As a young boy, I enjoyed playing soccer, watching movies, and cycling, and I was fascinated by planes. We made paper planes and enjoyed seeing them fly. During my primary and high school years, I developed a keen interest in sciences and mathematics, which eventually led me to pursue a course in computing.
Fast forward to my undergraduate years: data science was in its formative stages, and most of the tools we enjoy today were probably still being conceptualized by the founding scholars of data science. I remember taking a fourth-year course in Artificial Intelligence (AI), though what I learned then was centered on programming logic and the application of heuristics to complex computational challenges. An AI class today would be very interesting, with buzzwords like machine learning, deep learning, and even federated learning.
I later started working as a computer analyst. My first assignment was to develop a web application to manage data for a longitudinal population dynamics study, under the guidance of great minds in population dynamics research in Africa. This is where my fascination with data science truly took shape. I remember an occasion when we were doing some data exploration, and my supervisor at the time could tell that ‘something was a mess’ with the data. I thought he possessed some ‘magical powers’ to be able to tell ‘good’ data from ‘bad’, much like being able, these days, to tell AI-generated content from the real thing. I wanted to have this ‘power’ too.
In hindsight, I learned that it was all about understanding probability distributions; most of the data we collected then was tested against specific probability distributions. My passion for using data to solve complex challenges and drive impactful decisions led me to strategic roles that honed my skills in data standardization, harmonization, integration, machine learning, model development, and deployment. This led to my natural progression to DSWB, where I collaborate with like-minded professionals across Africa, fostering innovation and knowledge-sharing to address some of the continent’s most pressing health and population challenges.
How has being part of the Data Science Without Borders (DSWB) initiative contributed to your growth?
Being part of DSWB has significantly broadened my professional horizons. It has provided opportunities to work on high-impact projects, collaborate with a diverse network of experts, and access cutting-edge tools and technologies. The initiative has also enhanced my leadership skills, particularly in managing cross-functional teams and delivering training workshops that empower others to harness the power of data science.
How has data science transformed your work?
Data science has fundamentally reshaped how I approach problem-solving and decision-making. It has given me the skills to develop tools and methodologies that extract meaningful insights from different data formats, enabling evidence-based strategies. With a vibrant data community, open-source tools, and cloud technology, we have used data science to streamline our research processes, enhance data quality, and ensure real-time collaboration among global partners. This applies not only to work but also to personal life: for instance, one can manage finances prudently by developing or customizing tools that visualize monthly or yearly spending from mobile money transactions or monthly bank statements.
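As a rough illustration of the personal-finance idea above, here is a minimal Python sketch that totals spending per month and prints a simple text bar chart. The transaction records, categories, and function names are hypothetical; a real mobile money or bank statement export would first need parsing into this shape.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (date, category, amount spent).
transactions = [
    (date(2024, 1, 5), "food", 1200.0),
    (date(2024, 1, 19), "transport", 450.0),
    (date(2024, 2, 2), "food", 980.0),
    (date(2024, 2, 14), "airtime", 200.0),
    (date(2024, 2, 27), "transport", 520.0),
]


def monthly_totals(rows):
    """Sum the amounts per calendar month, keyed as 'YYYY-MM'."""
    totals = defaultdict(float)
    for day, _category, amount in rows:
        totals[day.strftime("%Y-%m")] += amount
    return dict(sorted(totals.items()))


def text_bar_chart(totals, width=40):
    """Render one line per month, scaled so the largest month fills `width`."""
    if not totals:
        return ""
    largest = max(totals.values())
    lines = []
    for month, total in totals.items():
        bar = "#" * round(width * total / largest)
        lines.append(f"{month} {bar} {total:.2f}")
    return "\n".join(lines)


totals = monthly_totals(transactions)
print(text_bar_chart(totals))
```

The same grouping step could key on category instead of month to show where the money goes rather than when.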
Therefore, transformation is limitless, bounded only by the scope of one’s imagination and willingness to try out new approaches.
Can you share a specific example of a challenge you faced and how data science helped you overcome it?
One notable challenge was managing disparate longitudinal mental health datasets collected across multiple sites from different mental health studies in Africa. These datasets had varying formats and standards, making it difficult to derive actionable insights to inform mental health policy recommendations. By leveraging data science techniques, we developed a central data warehouse and implemented an Extract, Transform, and Load (ETL) pipeline to standardize and integrate the datasets into a unified schema and format. This solution not only improved data accessibility but also enabled seamless analysis and visualization, significantly enhancing the research outcomes. We believe in innovation: the data warehouse schema is designed dynamically, so it can ingest not only mental health data but also demographic surveillance system data. DSWB allows us to re-use the technology with some of the pathfinders’ longitudinal data.
What do you consider your biggest accomplishment in applying data science, and how has it impacted others around you?
Data science has empowered me to develop solutions that transcend the limitations of conventional programming. Consider tasks like categorizing emails as spam or determining the likelihood of a loan applicant defaulting: these are challenges where traditional software approaches often fall short. Leveraging data science techniques, I successfully implemented a predictive model for a digital lending application, which not only reduced loan processing times from hours to mere seconds but also significantly lowered the default risk from nearly 40% to just 10%.
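To make the ETL idea from the mental health example more concrete, here is a minimal Python sketch of harmonizing records from two sites into one unified schema and loading them into a small warehouse table. It is an illustration under stated assumptions, not the pipeline the team built: the site names, field names, code lists, and the `visits` table are all hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical per-site settings: how each site's columns map onto the
# unified schema, and which date format that site uses.
SITE_CONFIG = {
    "site_a": {
        "fields": {"pid": "participant_id", "sex": "sex",
                   "score": "score", "visit": "visit_date"},
        "date_format": "%Y-%m-%d",
    },
    "site_b": {
        "fields": {"participant": "participant_id", "gender": "sex",
                   "total": "score", "visit_date": "visit_date"},
        "date_format": "%d/%m/%Y",
    },
}
# Hypothetical code list used to standardize how sex is recorded.
SEX_CODES = {"f": "female", "female": "female", "m": "male", "male": "male"}


def transform(site, record):
    """Rename fields and standardize codes, types, and dates for one record."""
    config = SITE_CONFIG[site]
    row = {target: record[source] for source, target in config["fields"].items()}
    row["site"] = site
    row["sex"] = SEX_CODES[str(row["sex"]).strip().lower()]
    row["score"] = int(row["score"])
    parsed = datetime.strptime(row["visit_date"], config["date_format"])
    row["visit_date"] = parsed.date().isoformat()
    return row


def load(connection, rows):
    """Write harmonized rows into the unified table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS visits ("
        "site TEXT, participant_id TEXT, sex TEXT, score INTEGER, visit_date TEXT)"
    )
    connection.executemany(
        "INSERT INTO visits VALUES (:site, :participant_id, :sex, :score, :visit_date)",
        rows,
    )
    connection.commit()


# Extract step: hypothetical raw records as each site might export them.
extracted = {
    "site_a": [{"pid": "A-001", "sex": "F", "score": "12", "visit": "2021-03-04"}],
    "site_b": [{"participant": "B-017", "gender": "male", "total": 7,
                "visit_date": "04/03/2021"}],
}

warehouse = sqlite3.connect(":memory:")
unified = [transform(site, rec) for site, recs in extracted.items() for rec in recs]
load(warehouse, unified)
result = warehouse.execute(
    "SELECT site, participant_id, sex, score, visit_date FROM visits ORDER BY site"
).fetchall()
print(result)
```

Keeping the site-specific differences in a configuration mapping, rather than in the transformation code, is one way a schema can take in new sources without rewriting the pipeline.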
At the African Population and Health Research Center (APHRC), we leverage data science to transform the discovery and accessibility of African research data, striving to ensure that future AI products are developed free from bias occasioned by lack of data or metadata discoverability.
What inspires you to keep pushing boundaries in data science, and what message would you share with others who are just starting out?
What inspires me is the transformative potential of data science to address critical issues and improve lives across Africa. The ability to turn raw data into actionable insights that drive policy and innovation keeps me motivated. To those starting out, I would say: embrace curiosity, seek out strategic opportunities to learn, and don’t shy away from challenges. Data science is a journey of continuous discovery, and the impact you can make

The Power of Data Science for Personalized and Predictive Health

Written by Joseph Mutura Kuria, with contributions from Christine Ger Ochola
Picture this! It was a very warm night, and you struggled to sleep. You wake up feeling unusually irritable, skip your morning run, and instead spend extra time scrolling through your phone. Your fitness app records your inactivity, while your WhatsApp status hints at a dip in your mood. Later in the day, a notification pops up from your health app: “How are you feeling? It’s been a long time since your last therapy session. Would you like to schedule one?” Or perhaps it’s even smart enough to have already booked a session for you.
Data science is revolutionizing personal health management by integrating diverse data points: physical activity, mental health indicators, and social habits. This approach allows individuals and healthcare providers to anticipate and address health issues before they escalate. The ability to collect, harmonize, and analyze large and diverse datasets is driving a paradigm shift in how we approach healthcare delivery, research, and public health policy formulation.
Health is more than just clinical metrics; it’s a product of physical activity, diet, mental well-being, socioeconomic conditions, and environmental and genetic factors. These are mirrored by mobile devices, wearable technologies, social media platforms, genomic data, climate data, and pandemic response data, among other dimensions. By combining wearable health data with genomic insights, socioeconomic indicators, climate data, and pandemic response data, we can identify at-risk populations, design targeted interventions, and optimize resource allocation.
Automating data flow for analysis and prediction is essential to unlock the full potential of these datasets. Automated pipelines enable real-time data ingestion, cleaning, and transformation for advanced analysis and predictive modeling.
Machine learning algorithms can then identify patterns, forecast disease outbreaks, and personalize care recommendations, accelerating insights and reducing the time between data collection and actionable interventions.
However, Africa faces significant gaps in health data availability, with many healthcare systems still relying on paper-based records and limited digital infrastructure. Data sharing remains a challenge due to fragmented systems, lack of standardization, and concerns over data ownership and privacy. Robust data governance frameworks are essential for ensuring data security, privacy, and ethical use, but many African countries lack clear policies and regulations, making it difficult to manage data effectively while fostering trust among stakeholders. Political instability in some regions can also exacerbate these challenges.
The integration of data science into African healthcare systems has the potential to revolutionize the continent’s approach to health. By overcoming current challenges, Africa can:
Develop precision public health: Tailored interventions for specific populations based on real-time data.
Enhance disease surveillance: Use predictive analytics to forecast and mitigate outbreaks like malaria or cholera.
Improve resource allocation: Optimize the distribution of medical supplies and personnel to underserved regions.
Foster collaboration: Create centralized data platforms to enable cross-country research and innovation.
Strengthen pandemic response: Leverage data science to predict, monitor, and respond to outbreaks effectively, ensuring timely interventions and resource allocation.
Imagine a future where healthcare is truly personalized, with diagnoses and prescriptions informed by every aspect of your life. This data-driven approach not only benefits individuals but also strengthens community, national, and continental healthcare systems.
By harmonizing diverse datasets and integrating social determinants of health, we can build a future where health is equitable, proactive, and deeply informed by the richness of human experiences.  

Advancing Data Science for Health in Africa

By Christine Ger Ochola, with contributions from Miranda Barasa.
Refine, Reflect, Reinvent: a perfect theme for the Data Science Without Borders (DSWB) Annual General Meeting (AGM) 2025, which recently took place in Dakar, Senegal. The meeting brought together stakeholders from across the continent and beyond for three days of collaboration, innovation, and strategic planning. The event not only showcased Africa’s commitment to leveraging data science to achieve revolutionary health outcomes but also promoted discussions about progress, challenges, and prospects for a data-driven society.
(Video) Highlights from the DSWB Annual General Meeting 2025.
Pathfinder institutions from Ethiopia, Cameroon, and Senegal shared progress on data exploration and mapping, capacity building, and digitization, while technical partners showcased innovations in data harmonization, open science, and responsible AI. The meeting emphasized the importance of good data governance and infrastructure to maximize the potential of digital health systems across Africa and beyond. Other major considerations for developing health data ecosystems included standardized vocabularies, data-sharing agreements, and strong privacy safeguards. Expanding data science projects across the continent, particularly in Francophone countries, was identified as a top priority for ensuring inclusive growth.
The role of data in driving self-sustainability was also discussed, with emphasis placed on domestic financing, stronger privacy protections, and enhanced local capacity to manage and use data effectively. It was also noted that infrastructural gaps persist, particularly for institutions that are shifting from paper-based to digital systems. The DSWB Pathfinder and Partner institutions shared their experiences in developing strong data ecosystems in Africa and identified key areas for improvement.
DSWB’s focus continues to be the development of a skilled workforce through targeted capacity building. A needs assessment grouped institutional data capabilities into three categories (proactive, stable, and reactive) and identified existing gaps in data science concepts that require additional training. DSWB has played a critical role in training researchers at the three African institutions in data standardization and harmonization, AI, machine learning, and data governance. A harmonized approach to training and mentorship across sites was identified as a priority for scaling impact.
Discussions about Africa-specific AI models stressed the necessity of tackling biases in healthcare algorithms and ensuring openness and explainability in AI-powered systems. A major shift in attitude was also observed, with data being treated as a renewable asset that can drive continual innovation and improved healthcare outcomes rather than an unlimited resource.
The Health Information Exchange (HIE) was recognized as a crucial tool for integrating health systems and enabling real-time data sharing. Efforts by the Africa CDC to promote interoperability through standardized guidelines, assessment frameworks, and national data strategies are gaining momentum. The success of national platforms demonstrated the potential for cross-sectoral collaboration and strengthened health data management.
Additionally, leveraging insights from the needs assessment conducted during the past year was identified as critical for designing future interventions. The assessment provided valuable insights into infrastructure and capacity gaps, ensuring that upcoming initiatives align with institutional needs. There are also opportunities to scale this assessment across more institutions to broaden its impact.
Supporting MSc and PhD students within the project emerged as another key priority.
Their research presentations at the AGM highlighted the potential for cutting-edge research within the DSWB network. Moving forward, structured mentorship, funding, and collaboration opportunities will be developed to ensure their success and meaningful contributions to data science innovation in Africa. Strengthening partnerships was emphasized as a strategy to enhance impact and sustainability. As DSWB moves forward, developing and strengthening strategic relationships with key stakeholders—including governments, academic institutions, funding agencies, and the private sector—will be crucial. There was also a call to identify new joint grant opportunities that leverage multi-country diversity and community research priorities, with upcoming funding calls being explored for potential applications. The AGM also reinforced the importance of aligning DSWB with similar continental projects and initiatives. Discussions focused on increasing visibility and deepening partnerships with organizations such as CODATA, DSI Africa, and Deep Learning Indaba to further expand the reach and impact of the initiative. A final major takeaway was the need to develop tangible DSWB-led innovations and products. This will be a priority in the coming year, with innovations supported through student projects. Pathfinder institutions were urged to support DSWB fellows in their data requests to ensure meaningful research and product development. As the second year begins, the DSWB community is committed to tackling challenges in data sharing, infrastructure, and capacity building. This includes strengthening governance frameworks, investing in digital infrastructure, enhancing AI and machine learning training, fostering cross-institutional collaboration, developing innovative data products, and exploring funding opportunities to support long-term sustainability. The DSWB Annual General Meeting reaffirmed Africa’s potential to lead in data science and health innovation. 
By embracing collaboration, ethical data use, and cutting-edge technology, Africa is well-positioned to transform its healthcare landscape. The journey continues with a shared commitment to building a stronger, data-driven future.

Building Data Science Capacity Across Africa

Written by Christine Ger Ochola, with contributions from Agnes Kiragga
The Data Science Without Borders (DSWB) project team recently embarked on a pathfinder tour to Ethiopia, Cameroon, and Senegal. This tour marked a significant milestone in enhancing data science capacity for health across Africa. Launched in February 2024, the DSWB project will operate for the next three years. It is a collaborative initiative receiving technical oversight from the Africa Centers for Disease Control and Prevention (Africa CDC). DSWB will be implemented in three African institutions: the Armauer Hansen Research Institute (AHRI) in Ethiopia, Douala General Hospital (DGH) in Cameroon, and the Institute for Health Research, Epidemiological Surveillance and Training (IRESSEF) in Senegal, with leadership from the project lead, Dr. Agnes Kiragga of the African Population and Health Research Center, Kenya. The project works with key technical partners such as the London School of Hygiene and Tropical Medicine (LSHTM) in the UK, the Committee on Data (CODATA) in France, the Makerere Artificial Intelligence Lab (Mak AI Lab) in Uganda, and the Open Science Program Office (OSPO Now) in the UK. Its primary goal is to co-design strategies that apply advanced data science tools, including machine learning (ML) and artificial intelligence (AI), to locally generated datasets in order to address locally derived research questions aimed at improving African health outcomes.
In Ethiopia, the team was hosted at the Armauer Hansen Research Institute (AHRI), led by the Director General, Prof. Afework Kassi, and the site principal investigator, Dr. Alemseged Abdissa. The team met with delegates from the regional health bureaus (Oromia and Amhara), Health and Demographic Surveillance Sites, and the leadership of ALERT Hospital, one of six national referral hospitals.
The discussions focused on the reconstruction and digitalization reforms in Ethiopia, which have evolved over the 53 years of AHRI’s existence. Additionally, developing data-sharing frameworks and policies has improved AHRI’s access to data from other government systems, enhancing its surveillance efforts. AHRI is currently building a data center with three different data storage systems, with plans to digitize all of them. The visit underscored opportunities for leveraging local datasets, set up systems for prioritizing research questions, and identified key training opportunities that will be covered through four PhD scholarships at AHRI.
The DSWB project receives technical oversight from the Africa Centers for Disease Control and Prevention (Africa CDC), and the team visited its Directorate of Science and Innovation, meeting Dr. Musoka Papa Fallah, Dr. Elvis Temfack, and Dr. Nebiyu Derebe. Discussions focused on aligning the project with the Africa CDC’s broader goals and strategies for building data science capacity, and with continental plans for health information exchange and workforce development for data professionals across Africa.
The DSWB team then visited Douala General Hospital (DGH) in Cameroon, where the team, led by Dr. Bertrand Hugo Mbathchuot, highlighted the potential of hospital records to drive informed clinical decisions. The discussions emphasized the need for robust electronic health systems and AI training to enhance healthcare in Cameroon. Several datasets were identified for AI and machine-learning model development to improve patient care. The visit concluded with a grand tour of the hospital and a special session on data science for medical students and hospital workers. The project will also train several master’s students to support data usage and deploy AI and ML models on local datasets.
Last but not least, the tour ended in Senegal, where the team was hosted by Prof.
Souluman Mboup, the Executive Director of the Institute for Health Research, Epidemiological Surveillance and Training (IRESSEF), and the site principal investigator, Dr. Mousa Sarr. Key discussions included the use of available research and program datasets generated at the Institute, as well as more unique datasets, such as electronic health datasets from the Hospital Military De Ouakam, a key partner of IRESSEF. Several research questions were prioritized; they will be led by local researchers and students and supported by the project’s technical partners. We learned that Senegal is advancing in data science: it has established a data center and supercomputer in Diamniadio and developed a national data and AI strategy in 2023. Early-stage AI projects are emerging in both the public and private sectors, including the use of a robot for TB screening at the IRESSEF biomedical laboratory.
As the DSWB team reflects on the insights and progress made during these visits, it is committed to building on this momentum over the next three years so that African researchers lead data-science-driven research projects, develop data science capacity, and identify cross-cutting themes to be addressed through multi-country data-sharing platforms. This African-led partnership reflects the current drive to enhance local collaborations to solve challenges that cut across the continent, and it speaks to achieving Africa’s Agenda 2063: improving public health and working towards the Africa we want. We are grateful for the support and funding from Wellcome and other partners.