DataDrill: A Public Dataset for Formation Pressure Prediction and Kick Detection in Drilling Operations
By Bruno Silva Carneiro Mapurunga
Offshore Drilling Technical Consultant | SPE & IADC Member
Advances in artificial intelligence (AI) have made real-time formation pressure prediction and kick detection increasingly feasible in offshore drilling. However, the absence of publicly available, high-quality datasets remains a major barrier to accelerating innovation and academic-industry collaboration in this domain.
To help bridge this gap, I’m proud to introduce DataDrill, a publicly accessible dataset generated through realistic digital twin simulations. DataDrill is specifically designed to support the development and validation of intelligent algorithms for formation pressure prediction and kick detection in drilling rigs.
What’s Inside DataDrill?
- 2 engineered scenarios:
  • Formation Pressure Prediction
  • Kick Detection with gas influx
- Over 2,000 samples per scenario
- 28 drilling parameters, including weight on bit, hook load, ROP, wellbore pressure, pump flow rates, torque, and more
- Technical validation using:
  • Principal Component Regression (PCR): R² = 0.78
  • Principal Component Analysis (PCA): RPD = 0.92
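For readers unfamiliar with PCR, the sketch below shows the general technique: project standardized inputs onto a few principal components, then fit least squares on those scores and report R² on held-out rows. It is an illustration only, not the validation procedure used for DataDrill. The data are synthetic stand-ins shaped like one scenario (2,000 rows, 28 columns), and the function names, the 3-component choice, and the train/test split are all assumptions.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Principal component regression: PCA on standardized X, then least squares on the scores."""
    x_mean = X.mean(axis=0)
    x_std = X.std(axis=0, ddof=1)
    Xs = (X - x_mean) / x_std
    y_mean = y.mean()
    # SVD of the standardized matrix; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    V = Vt[:n_components].T  # shape: (n_features, n_components)
    scores = Xs @ V
    # Ordinary least squares of the centered target on the component scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    return {"x_mean": x_mean, "x_std": x_std, "y_mean": y_mean, "V": V, "gamma": gamma}


def predict_pcr(model, X):
    """Apply the training-set scaling and projection, then the fitted regression."""
    Xs = (X - model["x_mean"]) / model["x_std"]
    return model["y_mean"] + (Xs @ model["V"]) @ model["gamma"]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (NOT DataDrill): 28 correlated "parameters" driven by 3 latent factors.
rng = np.random.default_rng(0)
n_samples, n_features = 2000, 28
latent = rng.normal(size=(n_samples, 3))
mixing = rng.normal(size=(3, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))
y = latent @ np.array([1.5, -2.0, 0.5]) + 0.2 * rng.normal(size=n_samples)

# Hypothetical split: first 1,500 rows for fitting, the rest held out.
train, test = slice(0, 1500), slice(1500, None)
model = fit_pcr(X[train], y[train], n_components=3)
r2 = r_squared(y[test], predict_pcr(model, X[test]))
print(f"held-out R^2 on synthetic data: {r2:.3f}")
```

The score printed here reflects the synthetic data only and says nothing about the R² reported for DataDrill. To try the same steps on the real dataset, replace `X` and `y` with the drilling parameters and the pressure target from the Zenodo files.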
Applications and Relevance
DataDrill provides a much-needed foundation for:
- Training and benchmarking machine learning models in drilling automation
- Developing virtual sensors and digital twins for pressure prediction
- Building educational tools for well control, simulation, and safety response
The dataset structure allows for easy integration with common data science platforms and can be used for teaching, testing, or publishing predictive workflows.
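As a rough illustration of that integration, the sketch below reads a table of drilling parameters and summarizes each column using only the Python standard library. It assumes the data can be exported as CSV with a header row and numeric values. The actual file format and column names on Zenodo may differ, so the names used here (`weight_on_bit`, `hook_load`, `rop`) and the in-memory sample rows are hypothetical placeholders.

```python
import csv
import io
import statistics


def load_numeric_csv(handle):
    """Read a CSV with a header row into a dict of column name -> list of floats."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


# Tiny in-memory stand-in for a DataDrill file (hypothetical columns and values).
sample = io.StringIO(
    "weight_on_bit,hook_load,rop\n"
    "10.0,150.0,20.0\n"
    "12.0,148.0,22.0\n"
    "11.0,149.0,21.0\n"
)

columns = load_numeric_csv(sample)

# Per-column mean and sample standard deviation as a quick sanity check.
summary = {
    name: (statistics.mean(values), statistics.stdev(values))
    for name, values in columns.items()
}
for name, (mean, std) in summary.items():
    print(f"{name}: mean={mean:.2f}, std={std:.2f}")
```

For a file on disk, the same function can be called as `load_numeric_csv(open(path, newline=""))`, where `path` points at a downloaded file.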
Download and Cite
DataDrill is hosted on Zenodo and is free for academic and industrial use:
👉 https://doi.org/10.5281/zenodo.12759014
About the Author
I am a petroleum engineer and offshore drilling consultant with over 15 years of experience in deepwater rig operations, well control, and drilling optimization. As a member of SPE and IADC, I am committed to knowledge sharing and the advancement of digital technologies in our industry.
Join the Discussion
I invite fellow engineers, researchers, and students to explore the dataset, provide feedback, and collaborate on next steps. If you're developing algorithms, simulators, or training programs, DataDrill is yours to build upon.
For inquiries or collaboration:
📧 brunoscmapurunga@gmail.com