Estimating birth time based on contraction length
Replicating Steve Mould's birth prediction
After watching Steve Mould's video in which he meassured his wife's contraction length over time and used this data to predict the "exact" time of birth of their son, I tried to replicate his results:
Dataset
The first step is creating a dataset similar to the one on the video:
- Create column "A" with numbers increasing from 0 to 33 in increments of 1/3
=A1+(1/3)
. - Create column "B" rows with a random number between 30 + value of A and 60:
=RANDBETWEEN(30+A1,60)
in the first 50 rows. - From row 51 to 70 the random number should be between 30 + value of A and 65:
=RANDBETWEEN(30+A65,65)
. - From row 71 to 100 the random number should be between 30 + value of A and 70:
=RANDBETWEEN(30+A70,70)
. This mimics with upward trend of Steve Mould's data and also assured that the random values would converge. - The third column "C" contains the hour, minute and second in which each contraction started. I start at 5 AM and use a 10 minute increment for simplification (you could also use a random number between 3 and 15 minutes)
=C1+TIME(0,10,0)
. Add values until you get to 24 hours after the initial time. This dataset contains 100 random contraction duration at specific times. The contraction duration increases and converges, which resemble Steve Mould's dataset.
Estimating Data Dispersion
The next step was to calculate the upper and lower bounds that encompass most of the data:
- Paste the values of Time and Contraction Duration into another sheet to avoid the recalculation of the random numbers. Column "A" is Time and Column "B" is Duration.
- To estimate the upper and lower bound of the dispersion of the data using Mean and Standard Deviation of the previous 10 values using the following formulas: Column "C"
=AVERAGE(B1:B10)+(2*STDEV.S(B1:B10))
and Column "D"=AVERAGE(B1:B10)-(2*STDEV.S(B1:B10))
. Assuming a normal distribution, this assures that each set of values will encompass 95.45% of the data.
Estimating Lower and Upper Trendlines
With these bounds in place we can now calculate the curve that best fits the Column "C" and Column "D" . The curve drawn by Steve Mould looks like an exponential curve, thus we will try to approximate that:
The first thing that we need is to create a new column "E" which calculates the minutes since the first contraction. Column "E"
=(A2-$A$1)*1440
. This will be important for the calculation of the exponential curves.The next step is to use the Logarithmic Estimator from Excel to approximate the parameters m and b of each of the exponential curves. We will use Column "E" because the
LOGEST
function works bet with integer values and not times. (Remember to select 2 cells for each estimation and useCtrl
+Shift
+Enter
to get the result for both m and b.)- Upper bound
=LOGEST(C11:C100,E11:E100,TRUE,FALSE)
- Lower bound
=LOGEST(D11:D100,E11:E100,TRUE,FALSE)
- Upper bound
Next step is not very clear to me, but it worked. We take the Natural Logarithm of b to get *b** for both curves.
We create two new columns "F" and "G" in which we calculate the values of the exponential curves based on the parameters b** and *m.
- Upper bound
=$J$5*EXP($I$5*$E1)
where$J$5
is b** and$I$5
is *m - Lower bound
=$J$6*EXP($I$6*$E1)
where$J$6
is b** and$I$6
is *m
- Upper bound
Calculating the Birth Time
The last step is to calculate the intersection of the Upper and Lower bounds, and thus calculate the time of birth πΆπ» π₯³:
- To calculate the intersection we simply compare the values of Column "F" and Column "G".
=IF(AND($F1<=$G2,$F2>=$G2),"Birth","-")
This will output Birth where the two trend lines intersect and - if the do not. - Add titles to each of the columns for better understanding.
!!!!!! 4:00 AM birth !!!!!!
Final Thoughts and Ideas
I will definitely try this with my wife in the next month, when our πΆπ» will be born. Let's see what happens.
You can download the excel sheet I used from here.
Replicating an idea base only on a graph and minimal information is doable, but you will need to make many assumptions which could be incorrect.
Some work that can be done around this is:
- There is possibility to play with the estimation of the data dispersion by multiplying the Standar Deviation by a large number
=AVERAGE(B1:B10)+(3*STDEV.S(B1:B10))
thus getting a larger percentage of the values within the range. What I noticed was that the final intersection point does not change much for values from 2 to 4 times de STDEV. - Another idea is to use
LOOKUP
to get the time of birth into a cell. I tried it but do not seem to understand why it does not work. - Use random time increments to get time values for each contraction duration. (Contractions happen in intervals of 3 to 5 minutes with 30 to 60 second duration during active labor, as birth is closer the contractions get longer on average).