 ### Joeri Jongbloets

Almost MSc // Modeller in the lab // Fluent in Java, Python, and R

# Evolution in prolonged turbidostat cultivation

### Overview

Combined data of turbidostats 2015-05-28 and 2015-07-20


Time plot of the OD during the `2015-05-28` experiment

Time plot of the OD during the `2015-07-20` experiment

Clean data using the interquartile range (IQR) rule and limit the time span to clean data regions
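The IQR cleaning step can be sketched as follows. This is a minimal illustration, not the pipeline used for these experiments: the function name `iqr_filter`, the multiplier `k = 1.5` (the conventional choice, not stated in this document) and the example OD values are all assumptions.

```python
import statistics

def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional multiplier and an assumption here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Made-up OD readings with one obvious spike.
od = [0.50, 0.51, 0.52, 0.50, 0.49, 0.51, 1.90, 0.50]
cleaned_od = iqr_filter(od)
```

In this toy series only the spike at 1.90 falls outside the fences, so the other seven readings are kept.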

The cleaned optical density (720 nm) measurements for experiment `2015-05-28`

The cleaned optical density (720 nm) measurements for experiment `2015-07-20`

### Combining the two experiments

The optical density measurements of the two experiments combined

The number of decisions per 5 hours

# Growth rate calculation

Several methods are available to calculate growth rates. I use only the methods I trust most, for which I also have a reliable pipeline.

Reliable methods (used in this document):

• `slopes`: fits linear models to each decision cycle.
• `dxdt`: calculates change in OD over time

Unreliable methods (not used in this document):

• `dilution`: calculates dilution rate from dispense volume and estimated vessel volume.

This method is considered unreliable because the estimate of the vessel volume varies widely, which in turn makes the calculated dilution rate vary a lot.

## Calculating growth rate with slopes method

The method fits one linear model per dilution cycle, using the equation below. For each fit the following are recorded: the estimated coefficient (`a`), the standard error of the estimate, the p-value of the estimate (testing `a != 0`), the R squared of the fit, the number of points in the decision cycle and the number of points used to fit the model. Each fit is attributed to the time point of the decision (`t(D)`) that ends the cycle.

`log(OD) = a * t + c`
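A minimal sketch of such a per-cycle fit, using ordinary least squares on `log(OD)` against time. The function name `fit_cycle` and the returned tuple are hypothetical, and the p-value is left out because it needs a t-distribution from a statistics library. The actual pipeline may differ.

```python
import math

def fit_cycle(times, ods):
    """Least-squares fit of log(OD) = a * t + c for one dilution cycle.

    Returns (a, c, std_error_of_a, r_squared, fit_size).
    Needs at least three points with distinct times.
    """
    ys = [math.log(od) for od in ods]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    a = sxy / sxx                      # slope = growth rate estimate
    c = y_mean - a * t_mean            # intercept
    ss_res = sum((y - (a * t + c)) ** 2 for t, y in zip(times, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    std_error = math.sqrt(ss_res / (n - 2) / sxx)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return a, c, std_error, r_squared, n

# Synthetic cycle: pure exponential growth at 0.05 per hour.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ods = [0.5 * math.exp(0.05 * t) for t in times]
a, c, std_error, r_squared, fit_size = fit_cycle(times, ods)
```

On noise-free exponential data the slope `a` recovers the growth rate and R squared is essentially 1. Real cycles will show lower R squared, which is what the filters in the next section act on.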

### Filter statistics

The plots below show the data coloured by a selection metric. The goal is to determine selection criteria that reduce the spread of growth rates at each light regime. The median is indicated as the true value. Data are shown per experiment (columns) and channel (rows). Points with R squared below 0.5 have already been removed.

Coloured by R squared

Coloured by `fit_size`

Coloured by slope standard error

Coloured by standard deviation in OD values used to fit the data

### Cleaned data with filters

Remove points with R squared < 0.85 and fit size < 6.
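As a small illustration of this filter step, the sketch below keeps only fits that pass both thresholds. It reads the rule as "drop a point if it fails either criterion", which is an assumption about the intended logic. The record layout and the values are made up for the example.

```python
# Hypothetical fit records; field names and values are illustrative only.
fits = [
    {"t": 10.0, "mu": 0.051, "r_squared": 0.97, "fit_size": 8},
    {"t": 15.0, "mu": 0.020, "r_squared": 0.60, "fit_size": 9},
    {"t": 20.0, "mu": 0.049, "r_squared": 0.92, "fit_size": 4},
]

def passes_filters(fit, min_r_squared=0.85, min_fit_size=6):
    """True when the fit is good enough on both criteria."""
    return fit["r_squared"] >= min_r_squared and fit["fit_size"] >= min_fit_size

cleaned = [fit for fit in fits if passes_filters(fit)]
```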

Coloured by R squared (scale changed!)

## Calculating growth rate with dxdt method

The dxdt method calculates the growth rate using the formula given below.

`mu = (dx) / (dt) = (log(ODmax) - log(ODmin)) / (tmax - tmin)`

At least three growth rates are calculated from the data using combinations of points. The median and standard deviation of the collected slopes are returned. The following combinations are calculated (if enough data is available):

1. t(min) vs t(max) (`fit_size > 3`)
2. t(min+1) vs t(max) (`fit_size > 4`)
3. t(min) vs t(max-1) (`fit_size > 4`)
4. t(min+1) vs t(max-1) (`fit_size > 5`)
5. t(min+2) vs t(max) (`fit_size > 5`)
6. t(min) vs t(max-2) (`fit_size > 5`)
7. t(min+2) vs t(max-1) (`fit_size > 6`)
8. t(min+1) vs t(max-2) (`fit_size > 6`)
9. t(min+2) vs t(max-2) (`fit_size > 7`)
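The combination scheme above can be sketched as below. This assumes the points of a cycle are sorted by time, that `t(min)` and `t(max)` are the first and last point, and that a result is only returned when at least three rates could be computed. The names `COMBINATIONS` and `dxdt` are hypothetical.

```python
import math
import statistics

# (points skipped at start, points skipped at end, fit_size must exceed this)
COMBINATIONS = [
    (0, 0, 3), (1, 0, 4), (0, 1, 4),
    (1, 1, 5), (2, 0, 5), (0, 2, 5),
    (2, 1, 6), (1, 2, 6), (2, 2, 7),
]

def dxdt(times, ods):
    """Median and std. dev. of growth rates from endpoint combinations.

    Returns (median, std_dev, fit_size), or None when fewer than three
    rates are available (assumed minimum).
    """
    fit_size = len(times)
    rates = []
    for skip_start, skip_end, min_size in COMBINATIONS:
        if fit_size > min_size:
            i = skip_start
            j = fit_size - 1 - skip_end
            rate = (math.log(ods[j]) - math.log(ods[i])) / (times[j] - times[i])
            rates.append(rate)
    if len(rates) < 3:
        return None
    return statistics.median(rates), statistics.stdev(rates), fit_size

# Synthetic cycle of eight points growing at 0.05 per hour.
times = [float(t) for t in range(8)]
ods = [0.5 * math.exp(0.05 * t) for t in times]
result = dxdt(times, ods)
too_short = dxdt(times[:4], ods[:4])
```

With eight points all nine combinations are used. With four points only the first combination qualifies, so the sketch returns `None`.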

### Filter statistics

The plots below show the data coloured by a selection metric. The goal is to determine selection criteria that reduce the spread of growth rates at each light regime. The median is indicated as the true value. Data are shown per experiment (columns) and channel (rows). Points with a standard deviation above 0.75 have already been removed.

Coloured by std. dev. of slopes

Coloured by fit size

Coloured by std. dev. of OD

### Cleaned data with filters

Remove points with a standard deviation of the obtained slopes > 0.75 and a fit size < 6.

# Analysis of growth data
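The growth rates in this part are aggregated with sliding windows of several sizes. A minimal sketch of such a function follows. It assumes the window size counts data points (not hours) and that only full windows are evaluated. The name `sliding_window` and the toy series are hypothetical.

```python
import statistics

WINDOW_SIZES = (5, 10, 20, 50, 100)

def sliding_window(values, size, func=statistics.mean):
    """Apply func to every full window of `size` consecutive values."""
    return [func(values[i:i + size]) for i in range(len(values) - size + 1)]

# Toy series of growth rates, ordered by time.
mu = [0.040, 0.042, 0.041, 0.043, 0.044, 0.045, 0.043]
means = sliding_window(mu, 5)
medians = sliding_window(mu, 5, func=statistics.median)
```

Passing `statistics.median` as `func` gives the median variant. A series shorter than the window size yields an empty list.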

Sliding window function

Sliding windows of different sizes (5, 10, 20, 50, 100) over the data to calculate the mean

# Compare growth rates at the beginning and end

The following significance labels are used:

• `ns`: P > 0.05
• `*`: P <= 0.05
• `**`: P <= 0.01
• `***`: P <= 0.001
• `****`: P <= 0.0001
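The label scheme above maps directly onto a small lookup. The function name `significance_label` is hypothetical.

```python
def significance_label(p):
    """Return the significance label for a p-value, per the list above."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"
```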

## Using t-tests

T-test of the first 200 hours against the last 200 hours (i.e. 50 to 250 hours vs. 650 to 850 hours).
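A rough sketch of this comparison is given below. It assumes Welch's unequal-variance form of the t-test, which this document does not specify, and it stops at the t statistic and degrees of freedom, since the p-value needs a t-distribution from a statistics library. The names `select_window` and `welch_t` and the example records are hypothetical.

```python
import math
import statistics

def select_window(records, start, end):
    """Growth rates from (time_h, mu) records with start <= time_h <= end."""
    return [mu for time_h, mu in records if start <= time_h <= end]

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Made-up (time in hours, growth rate) records for illustration.
records = [(60, 1.0), (100, 2.0), (150, 3.0), (200, 4.0), (240, 5.0),
           (660, 2.0), (700, 3.0), (750, 4.0), (800, 5.0), (840, 6.0)]
first = select_window(records, 50, 250)
last = select_window(records, 650, 850)
t_stat, df = welch_t(first, last)
```

A negative `t_stat` here means the mean growth rate in the last window is higher than in the first.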

Using unaggregated data.

Using data from all window sizes per sliding window function (`mean` and `median`).

Using data per window size of the `mean` sliding window function.

Using data per window size of the `median` sliding window function.

### Calculate relative growth rate

## Using linear models

Fitting linear model to all data (not aggregated using sliding window)

Fitting linear models using aggregated data from all sliding window sizes

Fitting linear models to means of sliding window

Fitting linear models to medians of sliding window

### Calculate relative growth rate

# Thesis pictures