Please send questions to wambaugh.john@epa.gov or aemeade7@gmail.com
from “Development and analysis of high throughput physiologically based pharmacokinetic/toxicokinetic (PBPK/TK) dermal exposure model”
Annabel E. Meade, Celia M.Schacht, Marina V Evans, Alex George, Rachael Cogbill, Risa R. Sayre, and John F. Wambaugh
Dermal absorption of chemicals represents an important route of exposure in pharmaceutical, occupational, and environmental settings. There are thousands of chemicals in use with little or no toxicity or toxicokinetic data, and dermal absorption data are lacking, unsurprisingly, for many of these chemicals. One alternative is to estimate human toxicokinetics using high-throughput methods. Accordingly, the aim of this study was to develop a generalized physiologically based pharmacokinetic/toxicokinetic (PBPK/TK) dermal exposure model for the R package httk. For dermal exposures this model can be used in a high-throughput manner to estimate human blood and tissue concentrations and estimate risk for many chemicals. The structure of the dermal PBPK/TK model was based on Campbell, Clewell, Gentry, Anderson, and Clewell, Computational Toxicology, 929, 439 (2012). Chemical-specific metabolism and protein binding data were obtained from the literature as collected by R package httk. The physiochemical property data needed for the model were obtained from the EPA CompTox Chemicals Dashboard. Here we have compared two different in vitro methods for estimating chemical-specific and vehicle-specific permeability in the skin: 1) Potts-Guy [Potts and Guy, Pharm Res, 9, 663 (1992)] and 2) Surrey [Chen, Han, Saib, and Lian, Pharm Res, 32, 1779 (2015)]. The model was constructed to allow comparison of dermal, oral, and intravenous exposures. Over 26 exposure scenarios across 13 chemicals were modeled and compared to published concentration-time in vivo data from the EPA CvT database [Sayre, Wambaugh, and Grulke, Scientific Data, 7, 122 (2020)]. Of these 13 chemicals, four are pharmaceuticals while the other 9 are occupationally or environmentally relevant, 9 of the 13 are lipophilic (log of the octanol to water partition coefficient, logP > 2), three of the 13 have low water solubility (< 10 mg/L), four of the 13 have high water solubility (> 1,000 mg/L), and eight of the 13 are considered volatile. The Root Mean Squared Log10 Error (RMSLE) between log-transformed simulated and observed concentrations was calculated to determine how well the model captures the behavior of this data when using the methods for dermal permeability. Based on these results, the Potts-Guy methods slightly out-perform the Surrey method of calculating dermal permeability for more volatile chemicals and the vehicle is a large factor in method success, where Potts-Guy/octanol and Surrey/water combinations generated the best fits. We also found that exhalation is an important route of excretion for volatiles, in particular. The views expressed in this abstract are those of the authors and do not necessarily represent the views or the policies of the U.S. Environmental Protection Agency.
This vignette was created with httk v2.3.3. Although we attempt to maintain backward compatibility, if you encounter issues with the latest release of httk and cannot easily address the changes, historical versions of httk are available from: https://cran.r-project.org/src/contrib/Archive/httk/
Prepare for session Result 1: Model Verification Usint In Vivo Data Step One: Load Data Display which chemicals are being looked at Step Two: Clean and Format Data
If you are using the RMarkdown version of this vignette (has the file
extension .RMD) you will be able to see that several chunks of code in
this vignette have the statement eval = FALSE
. The next
chunk of code, by default, sets run.simulations = FALSE
.
This means that the code is included (and necessary) but was not run
when the vignette was built. We do this because some steps require
extensive computing time and CRAN has limits how long we can spend
building the package. If you want this vignette to work, you must run
all code, either by cutting and pasting it into R. Or, open the .RMD
file in RStudio and either change run.simulations to TRUE or press
“play” (the green arrow) on each chunk in RStudio.
We set a variable run.simulations
. This will run
simulations and regenerate data files. If this is set to
TRUE
, make sure you have the following data file:
and if run.simulations = FALSE
, make sure to have the
following files in your working directory:
# Set whether or not the following chunks will be executed (run):
<- FALSE # Change FALSE to TRUE here to make the vignette work
run.simulations <- FALSE # change to TRUE to make the plots
make.plots .2 <- FALSE #for second part of vignette
run.simulations.2 <- FALSE # change to TRUE to make the plots make.plots
Here, we set our working directory (the location of this file and the
necessary data files) and load necessary packages. Please reset
your working directory below to loc.wd
or the code will
produce an error.
We also set the location of the R package httk
using the
name loc.httk
so that we can run the functions
modelinfo_dermal.R
and
modelinfo_dermal_1subcomp.R
file later in the code. Note
that if you have run.simulations = FALSE
, the variable
loc.httk
is not used, so you do not need to provide or
change it.
# Set working directory
<- "C:/Users/jwambaug/OneDrive - Environmental Protection Agency (EPA)/Profile/Documents/Research Projects/DermalTK/vingette"
loc.wd #loc.wd <- "/Users/ameade/Library/CloudStorage/GoogleDrive-aemeade7@gmail.com/My Drive/EPA/HTTK-Dermal/HTTK-Dermal-Vignette"
#loc.wd = "C:/Users/cschacht/OneDrive - Environmental Protection Agency (EPA)/Profile/Documents/CCTE_2023_2024/dermal/thisone"
# Set location of httk package
#loc.httk <- "C:/Users/jwambaug/git/httk-dev/httk"
# Load packages
<- c("httk","dplyr",
packages "tidyverse","xlsx","Metrics", #for data manipulation
"ggplot2","ggforce","ggpubr","ggrepel","viridis", #for plotting
"knitr", "ggpubr","grid","ggh4x","readr","ggforce","knitr","tidyr",
"stringr") #to make pretty visual tables
sapply(packages, require, character.only=TRUE)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#> Loading required package: tidyverse
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ forcats 1.0.0 ✔ stringr 1.5.1
#> ✔ lubridate 1.9.4 ✔ tibble 3.2.1
#> ✔ purrr 1.0.4 ✔ tidyr 1.3.1
#> ✔ readr 2.1.5
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
#> Loading required package: xlsx
#>
#> Loading required package: Metrics
#>
#> Loading required package: ggforce
#>
#> Loading required package: ggpubr
#>
#> Loading required package: ggrepel
#>
#> Loading required package: viridis
#>
#> Loading required package: viridisLite
#>
#> Loading required package: knitr
#>
#> Loading required package: grid
#>
#> Loading required package: ggh4x
#> httk dplyr tidyverse xlsx Metrics ggplot2 ggforce ggpubr
#> TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> ggrepel viridis knitr ggpubr grid ggh4x readr ggforce
#> TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> knitr tidyr stringr
#> TRUE TRUE TRUE
# Make sure we're using a clean version of the httk chemical library:
#reset_httk()
First, we want to verify our model by testing it against in vivo data from literature.
We have some data which is taken from the CvT Database (Sayre, Wambaugh, and
Grulke, 2020) and some data which has been prepared for the database
but is not yet added to the Database, denoted as noSQL
since it is not yet in the SQL space. The CvT Database contains
structured data in the form of separate datasets linked with foreign
keys (“fk”). Below is the entity-relationship for each of the
databases.
To learn more about where these datasets come from, please see
R_CvT_Database
folder, which contains files used to
generate the data below.
# Load CvT table database
load(paste0(loc.wd,"/R_CvT_Database/CvT_all_tables_2022-08-17.Rdata")) #Necessary file that is not included yet
# Load CvT data that is not yet in CvT database (Annabel did this)
load(paste0(loc.wd,"/R_CvT_Database/CvT_all_tables_2023-07-05_noSQL.Rdata")) #Necessary file that is not included yet
#Add cumulative_amount column in SQL DB_series (we don't use any cumulative amounts from SQL data in final results)
<- DB_series %>% mutate(cumulative_amount=NA)
DB_series
# Bind data sets from database and noSQL collection.
<-DB_documents_noSQL[colnames(DB_documents_noSQL) %in% colnames(DB_documents)] #Keep only columns from DB_documents_noSQL that are also found in DB_documents
DB_documents_noSQL<- rbind(DB_documents,DB_documents_noSQL)
DB_documents <-DB_studies_noSQL[colnames(DB_studies_noSQL) %in% colnames(DB_studies)]
DB_studies_noSQL<- rbind(DB_studies,DB_studies_noSQL)
DB_studies <- rbind(DB_chemicals,DB_chemicals_noSQL)
DB_chemicals <- rbind(DB_subjects,DB_subjects_noSQL)
DB_subjects <- rbind(DB_series,DB_series_noSQL)
DB_series <-DB_conc_time_values_noSQL[colnames(DB_conc_time_values_noSQL) %in% colnames(DB_conc_time_values)]
DB_conc_time_values_noSQL<- rbind(DB_conc_time_values,DB_conc_time_values_noSQL) DB_conc_time_values
Next, we need to clean the data and filter out NA
results. First, we join the needed data sets using their foreign keys. *
DB_conc_time_values
* DB_series
*
DB_studies
* DB_subjects
*
DB_documents
: Note that only the PubMed ID (PMID) of each
extraction article is saved from this data set.
Next, we remove columns that are duplicated in each data set and are
not used here. These columns in each data set, listed below, tell the
user who created and updated the SQL files and additional comments by
the data curator or article author. * created_by
*
updated_by
* rec_update_dt
*
rec_created_dt
* curator_comment
*
author_comment
( not in DB_conc_time_values
or
DB_subjects
) We also remove any data that: * does not have
a defined DTXSID for the chemical * does not have a recorded
concentration * does not have a defined recorded medium (e.g., plasma,
blood, urine, etc.) * where the analyte measured does not match the test
substance used to dose the individual
Finally, after cleaning this data set, called df.full
,
we filter so that we only have data for chemicals that have dermal data.
Our final data set is called df
.
# Join necessary DBs: conc_time_values, series, studies, subjects
<- DB_conc_time_values %>%
df.full left_join(DB_series,by=c("fk_series_id"="id")) %>%
left_join(DB_studies,by=c("fk_study_id"="id")) %>%
left_join(DB_subjects,by=c("fk_subject_id"="id")) %>%
left_join(DB_documents[,c("id","pmid")],by=c("fk_extraction_document_id" = "id")) %>%
::rename(extraction_document_pmid=pmid) %>%
dplyrselect(!(ends_with(".x") | ends_with(".y"))) %>%
filter(!is.na(test_substance_dtxsid)) %>% #remove series w/o defined chemicals (Changed from study to series AG)
filter( !is.na(conc)) %>% #filter out NA data
filter(!is.na(conc_medium_normalized)) %>% #remove series w/o defined medium (i.e., plasma, blood, urine, etc.)
filter(test_substance_dtxsid==analyte_dtxsid | is.na(analyte_dtxsid)) #analyte (measured substance) and test substance must be the same
<- distinct(df.full)
df.full
# Remove empty cells
<- type_convert(df.full)
df.full <- df.full %>% filter(!is.na(conc))
df.full
# Get list of chemicals that have dermal data
<- df.full %>% filter(administration_route_normalized=="dermal")
df.dermal <- unique(df.dermal$test_substance_dtxsid)
dermal_dtxsid
# Only keep data for chemicals that have dermal data
<- df.full %>% filter(test_substance_dtxsid %in% dermal_dtxsid) df
<- df %>% mutate(Chemical = get_chem_id(dtxsid=test_substance_dtxsid)$chem.name) %>% dplyr::count(Chemical,test_substance_dtxsid,extraction_document_pmid,sort=F) %>% dplyr::rename(Number.of.Data.Points = n)
table row.names(table) <- 1:nrow(table)
::kable(table, row.names=TRUE,caption="List of chemicals with dermal data before cleaning data.")
knitrwrite.table(table,
file=paste0(loc.wd,"/tables/ChemicalswDermalData.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
Next, We look at several chemical factors with respect to our data. Thus, we ran a batch search of the 13 chemicals in our study in the CompTox Dashboard and now load information about these chemicals (Boiling point, Volatility, etc.) and save this as ‘supptab1_meade2023’.
# Load chemical info
<- read.xlsx(file=paste0(loc.wd,"/CCD-Batch-Search_2022-11-29_07_06_35.xlsx"),
supptab1_meade2023 sheetName="Main Data") #A separate download was needed for this file AG 12/6/23
# Rename chemical properties and change units
<- supptab1_meade2023 %>% mutate(Water.Solubility.mg.L = WATER_SOLUBILITY_MOL.L_OPERA_PRED*AVERAGE_MASS*1000,
supptab1_meade2023 Vapor.Pressure.mmHg = VAPOR_PRESSURE_MMHG_OPERA_PRED,
Boiling.Point.C = BOILING_POINT_DEGC_OPERA_PRED,
logP = OCTANOL_WATER_PARTITION_LOGP_OPERA_PRED,
MW = AVERAGE_MASS,
%>%
) select(c(PREFERRED_NAME,DTXSID,Water.Solubility.mg.L,Vapor.Pressure.mmHg,Boiling.Point.C,logP,MW))
# Add column for "Volatility
<- supptab1_meade2023 %>% mutate(Volatility = ifelse(Boiling.Point.C<=75,"Very Volatile",
supptab1_meade2023 ifelse(Boiling.Point.C>400,"Not Volatile",
ifelse((Boiling.Point.C>=250)&(Boiling.Point.C<=400),"Semi-Volatile","Volatile"))),
# Rename long chemicals to nick-names
PREFERRED_NAME = ifelse(PREFERRED_NAME=="Methyl tert-butyl ether","MBTE",PREFERRED_NAME),
PREFERRED_NAME = ifelse(PREFERRED_NAME=="3,3',5,5'-Tetrabromobisphenol A","Bromdian",PREFERRED_NAME),
PREFERRED_NAME = ifelse(PREFERRED_NAME=="4,4'-Sulfonyldiphenol","Bisphenol S",PREFERRED_NAME),
PREFERRED_NAME = ifelse(PREFERRED_NAME=="Dichloromethane","DCM",PREFERRED_NAME),
PREFERRED_NAME = ifelse(PREFERRED_NAME=="Tetrachloroethylene","PERC/TCE",PREFERRED_NAME),
PREFERRED_NAME = ifelse(PREFERRED_NAME=="1-Methylbenzene","Toluene",PREFERRED_NAME))
$Volatility <- factor(supptab1_meade2023$Volatility, levels=c("Not Volatile","Semi-Volatile","Volatile","Very Volatile"))
supptab1_meade2023#save chemical dataframe
save(supptab1_meade2023,file=paste0("meade2023_",format(Sys.time(), "%b_%d_%Y"),".Rdata"))
c(3:7)]=signif(supptab1_meade2023[,c(3:7)],3)
supptab1_meade2023[,::kable(supptab1_meade2023,
knitr#row.names=TRUE,
caption="Chemical properties",
floating.environment="sidewaystable",
align="c")
write.table(supptab1_meade2023,
file=paste0(loc.wd,"/tables/ChemicalProperties.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
Next we clean the data by looking only at data that can be simulated
using the httk dermal model. First, we load the most recent QSAR in
silico data (Dawson et al., 2021)
from httk
. #Add a statement about why QSAR values are
needed - there are many chemicals for which httk does not have
experimentally derived Clint and Fup. - AG 01/01/2024
load_dawson2021() #get QSAR data, overwrite for p-values greater than 0.05
Next, we only load data for routes iv
,
oral
, and of course dermal
. We also must
remove the chemicals: dibromomethane, halothane, and isoflurane, because
these chemicals do not have in silico or in vitro data
to inform necessary parameters for the httk dermal model.
# Select route (dermal/iv/oral)
= c("iv","dermal","oral")
route.ls
# Reduce data based on available in vitro data
# df.reduced <- df %>% filter(administration_route_normalized %in% route.ls) %>% #only iv, oral, dermal routes
# filter(test_substance_dtxsid %!in% c("DTXSID4021557","DTXSID4025371","DTXSID3020752")) #no metabolism data for: dibromomethane, halothane, isoflurane
<- df %>% filter(administration_route_normalized %in% route.ls) %>% #only iv, oral, dermal routes
df.reduced filter(!(test_substance_dtxsid %in% c("DTXSID4021557","DTXSID4025371","DTXSID3020752"))) #no metabolism data for: dibromomethane, halothane, isoflurane
<- type_convert(df.reduced) #change dose_level_normalized to numeric
df.reduced
#Join df.reduced with the Volatility/chemical info from supptab1_meade2023
= df.reduced %>% left_join(supptab1_meade2023,by=c("test_substance_dtxsid"="DTXSID"))
df.reduced
=subset(df.reduced, !(conc_medium_normalized %in%c("feces","urine")))
df.reduced# Display which chemicals are being looked at
<- df.reduced %>%
table mutate(Chemical = get_chem_id(dtxsid=test_substance_dtxsid)$chem.name) %>%
::count(Chemical,test_substance_dtxsid,extraction_document_pmid,conc_medium_normalized,sort=F) %>%
dplyr::rename("Number.of.Data.Points" = n)
dplyrrow.names(table) <- 1:nrow(table)
::kable(table,
knitrrow.names=TRUE,
caption="List of chemicals with dermal data after cleaning data.")
write.table(table,
file=paste0(loc.wd,"/tables/ChemicalswDermalDataCleaned.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
Next, we clean up the data so we can match experimental conditions to simulation conditions. We start by normalizing concentration and concentration units (where concentration refers to the experimental concentration vs. time data)
# Set dummy column for Concentration units
$conc_normalized = NA
df.reduced$conc_units_normalized = df.reduced$conc_units_original
df.reduced
= which(df.reduced$conc_units_original %in% c("ug/mL","ug/g","mg/kg", "mcg/mL","mg/L"))
allset $conc_units_normalized[allset]="mg/L"; df.reduced$conc_normalized[allset]=df.reduced$conc_original[allset]
df.reduced
= which(df.reduced$conc_units_original=="ng/ml");df.reduced$conc_normalized[ngml]= df.reduced$conc_original[ngml]*0.001
ngml $conc_units_normalized[ngml]="mg/L"
df.reduced
= which(df.reduced$conc_units_original=="ng/l");df.reduced$conc_normalized[ngl]= df.reduced$conc_original[ngl]*1e-06
ngl $conc_units_normalized[ngl]="mg/L"
df.reduced
= which(df.reduced$conc_units_original %in% c("uM","nmol/L","umol/L"))
molunts =unique(df.reduced[molunts,c("test_substance_dtxsid","conc_units_original")])
dtx$mw = get_physchem_param(param="MW", dtxsid=dtx$test_substance_dtxsid)
dtx
for(z in 1:dim(dtx)[1]){
=which(df.reduced$test_substance_dtxsid==dtx$test_substance_dtxsid[z] & df.reduced$conc_units_original==dtx$conc_units_original[z])
these.rows= convert_units(input.units = dtx$conc_units_original[z],output.units = "mg/L",dtxsid=dtx$test_substance_dtxsid[z])
conversion_val "conc_normalized"] = df.reduced[these.rows,"conc_original"] * conversion_val
df.reduced[these.rows,$conc_units_normalized[these.rows] = "mg/L"
df.reduced
}
=which(df.reduced$conc_units_original %in%c("% dose", "% dose/h"))
perc.dose"conc_normalized"]=df.reduced[perc.dose, "conc_original"]*df.reduced[perc.dose, "dose_level_original"]
df.reduced[perc.dose, #df.reduced[perc.dose,"conc_units_normalized"]=df.reduced[perc.dose,"dose_level_original_units"]
"conc_units_normalized"]="mg/L"
df.reduced[perc.dose,# ppbv
= which(df.reduced$conc_units_original=="ppbv")
ppbv "conc_normalized"] = convert_units(input.units = "ppbv",output.units = "mg/L",dtxsid=unique(df.reduced$analyte_dtxsid[ppbv]),state="gas") * df.reduced[ppbv, "conc_original"]
df.reduced[ppbv, $conc_units_normalized[ppbv] = "mg/L" df.reduced
We also convert the dose units from the database to be in amounts (not concentrations) using the vehicle volumes. After doing this, all unique dosing scenarios are listed in a table.
# Convert ug/mL to mg
<- df.reduced %>%
df.reduced separate(dose_volume,
into=c("dose_volume","dose_volume_units"),
sep=" ") %>% #separate dose volume into number and units, units either NA or ml/mL
mutate(dose_volume=as.numeric(dose_volume)) %>%
mutate(dose_level_new =
ifelse(tolower(dose_level_original_units)=="ug/ml",
*dose_volume/1e+3, # ug >- mg, vol in mL
dose_level_original#convert ug/ml to mg
dose_level_original), dose_level_new_units =
ifelse(tolower(dose_level_original_units)=="ug/ml",
"mg",
dose_level_original_units))
# Convert ppm units to mg/L (assuming ppm is in air) - add MW to dataframe to do this
<- df.reduced %>%
df.reduced left_join(chem.physical_and_invitro.data[,c("DTXSID","MW")],
by=c("test_substance_dtxsid"="DTXSID"))
$MW.y=NULL
df.reducedcolnames(df.reduced)=gsub("MW.x","MW",colnames(df.reduced)) #fix MW
= df.reduced %>% #
df.reduced mutate(MW = as.numeric(MW)) %>%
group_by(test_substance_dtxsid) %>%
mutate(dose_level_new =
ifelse(dose_level_original_units=="ppm" ,
*
dose_level_new convert_units(input.units="ppmw", #?? whatever
output.units = "mg/l",
MW = unique(MW)),
dose_level_new),# dose_level_new =
# ifelse(dose_level_original_units=="ppm" & administration_route_original!="dermal vapor",
# dose_level_new *
# convert_units(input.units="ppmw", #?? whatever
# output.units = "mg/l",
# MW = unique(MW), state="liquid"),
# dose_level_new),
dose_level_new_units =
ifelse(tolower(dose_level_original_units)=="ppm",
"mg/L",
dose_level_new_units))$dose_level_new_units[which(df.reduced$dose_level_new_units=="ppm")]="mg/L"
df.reduced
# Set normalized vehicle
<- df.reduced %>%
df.reduced mutate(dose_vehicle_normalized =
ifelse(grepl("corn oil",dose_vehicle) |
grepl("ethanol",dose_vehicle),"octanol",
ifelse(!is.na(dose_vehicle) &
!grepl("Cookie",dose_vehicle),
"water",
NA)))
# Remove added chemical MW column for now
<- df.reduced %>% select(!MW)
df.reduced
# Display dosing scenarios
<- df.reduced %>%
table mutate(dose_level_original=signif(dose_level_original,3),
dose_level_new = signif(dose_level_new,3),
Chemical = get_chem_id(dtxsid=test_substance_dtxsid)$chem.name) %>%
count(Chemical,dose_level_original,
dose_level_original_units,
dose_level_new,
dose_level_new_units,
dose_volume,
dose_volume_units,
dose_vehicle,
dose_vehicle_normalized,
dermal_applied_area,sort=F) %>%
dermal_applied_area_units,rename("Number.of.Data.Points" = n)
row.names(table) <- 1:nrow(table)
::kable(table,
knitr#row.names=TRUE,
caption="Dosing Scenarios in CvT Data",
floating.environment="sidewaystable",
align="c")
<- df %>% mutate(Chemical = get_chem_id(dtxsid=test_substance_dtxsid)$chem.name) %>% dplyr::count(Chemical,test_substance_dtxsid,extraction_document_pmid,sort=F) %>% dplyr::rename(Number.of.Data.Points = n)
table row.names(table) <- 1:nrow(table)
::kable(table, row.names=TRUE,caption="List of chemicals with dermal data before cleaning data.") knitr
Now, update certain experimental parameters. If body weight is not reported, set it as the population mean, calculate skin area, and specify the model output name that corresponds to each medium. Also, we need to extract dose duration information for explicit values for the length of duration and the length of duration units (in hours, making conversions when necessary).
=df.reduced
df.new
$species = tolower(df.new$species)
df.new# get dose duration in numbers and time units
$dose_duration_num = as.numeric(str_extract(df.new$dose_duration, "[0-9]+"))
df.new$dose_duration_units = str_extract(df.new$dose_duration, "[aA-zZ]+")
df.new<- df.new %>% mutate(BW = ifelse(!is.na(weight_kg),weight_kg,
df.cases case_when(
=="rat" ~ 0.4,
species=="human" ~ 70,
species=="mouse" ~ 0.025,
species=="rabbit" ~ 1.6
species
)),# Set human default height to 175 cm if not listed
height_cm = ifelse(species=="human" & is.na(height_cm),175,height_cm),
totalSA = case_when(
=="rat" ~ 9.83*(BW*1000)^(2/3), # formula from Gouma et al., 2012, 195-240grams
species=="human" ~ sqrt(height_cm * BW/3600) * 100^2,
species=="mouse" ~ 9.83*(BW*1000)^(2/3),
species=="rabbit" ~ 100*11*(BW)^(2/3) # formula from Tadashi et al., 2018
species
),#Assume Fskin_exposed = 10% if not stated
Fskin_exposed = ifelse(!is.na(dermal_applied_area),dermal_applied_area/totalSA,0.1),
#Infinite Dose if dosing units are concentration mg/l AJG 4/13/24: Double check that mg/l are listed in the correct places in the previous chunk.
InfiniteDose = ifelse(dose_level_new_units=="mg/L",1,0), # omg
Vvehicle = ifelse(tolower(dose_volume_units)=="ml",dose_volume/1e3,dose_volume),
#Standardize time units for dose duration AJG 5/7/24
dose_duration_units = case_when(
%in% c("hr","h","hour","hrs","hours") ~ "hours",
dose_duration_units %in% c("min","m","minute","minutes","mins") ~ "mins",
dose_duration_units %in% c("day","d","days") ~ "days"
dose_duration_units
),dose_duration_num2 = ifelse(dose_duration_units=="mins", dose_duration_num/60, dose_duration_num),
dose_duration_units2 = ifelse(dose_duration_units=="mins", "hours", dose_duration_units)
)# remove experiments where conc_medium_normalized = feces and set observation to collect
= subset(df.cases, !(conc_medium_normalized %in%c("urine","feces")))
df.cases = df.cases%>%
df.cases mutate(conc_medium_obs = recode(conc_medium_normalized,
"blood"="Cven",
"liver"="Cliver",
"lung"="Clung",
"kidney"="Ckidney",
"plasma"="Cplasma",
"serum"="Cplasma",
"expired air" = "Aexhaled"
))
# Display dosing scenarios
<- df.cases %>%
table mutate(dose_level_original=signif(dose_level_original,3),
dose_level_new = signif(dose_level_new,3),
Chemical = get_chem_id(dtxsid=test_substance_dtxsid)$chem.name) %>%
count(Chemical, species,dose_level_original,
dose_level_original_units,
dose_level_new,
dose_level_new_units,
dose_volume,
dose_volume_units,
dose_vehicle,
dose_vehicle_normalized,
dose_duration_num2,
dermal_applied_area,
dermal_applied_area_units,sort=F)
extraction_document_pmid,row.names(table) <- 1:nrow(table)
::kable(table,
knitr#row.names=TRUE,
caption="Dosing Scenarios in CvT Data",
floating.environment="sidewaystable",
align="c")
write.table(table,
file=paste0(loc.wd,"/tables/CvTdbDosingScenarios.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
Finally, we clarify exhalation (depending on chemical volatility) and set vehicle volume parameters and save the entire cleaned datasheet as concentration_data_meade2023.
= df.cases
df.cases_new $Kvehicle2water = df.cases_new$dose_vehicle_normalized # set this parameter
df.cases_new= df.cases_new %>% mutate(
df.cases_new administration_route_normalized = case_when(
== "oral" ~ "Oral",
administration_route_normalized == "iv" ~ "IV",
administration_route_normalized == "dermal" ~ "Dermal"
administration_route_normalized
),#Add exhalation for volatile compounds
Exhalation = case_when(
levels(Volatility)[Volatility] == "Very Volatile" ~ TRUE,
levels(Volatility)[Volatility] == "Volatile" ~ TRUE,
levels(Volatility)[Volatility] == "Semi-Volatile" ~ TRUE,
levels(Volatility)[Volatility] == "Not Volatile" ~ FALSE
),# Set default vehicle volume when it is unknown. Assume a uniform 1 mm layer of vehicle on skin. Then there is 0.0001 L vehicle per cm^2 of exposed skin. For cases with non-dermal routes of exposure, these values are NA.
Vvehicle = case_when(
=="Dermal" ~ ifelse(!is.na(Vvehicle), Vvehicle, Fskin_exposed*totalSA*0.0001),
administration_route_normalized=="Oral" ~ NA,
administration_route_normalized=="IV" ~ NA)
administration_route_normalized
)
# save what we want
= df.cases_new
concentration_data_meade2023 ::resave(concentration_data_meade2023,file=paste0("meade2023_",format(Sys.time(), "%b_%d_%Y"),".Rdata")) cgwtools
Now, we will simulate the in vivo concentration vs time data
using the httk dermal model. Using
concentration_data_meade2023
, which is clean and filtered
above, we create a new data set called df.simulations
which
will contain simulation data. We simulate 6 different dermal scenarios:
* Potts-Guy, vehicle=octanol * Potts-Guy, vehicle=water * Surrey,
vehicle=octanol * Surrey, vehicle=water * 2 Compartment, vehicle=octanol
* 2 Compartment, vehicle=water
Specify the units we want returned and set up simulation data.
# Set output units for dermal model to be same as data
<- c( "Agut"="mg", "Agutlumen"="mg", "Aliver"="mg", "Aven"="mg", "Alung"="mg", "Arest"="mg", "Akidney"="mg", "Atubules"="mg", "Ametabolized"="mg", "Avehicle"="mg", "Ain"="mg",
output.units.original "Cgut"="mg/L", "Cliver"="mg/L", "Cven"="mg/L", "Clung"="mg/L", "Cart"="mg/L", "Crest"="mg/L", "Ckidney"="mg/L", "Cplasma"="mg/L", "Cvehicle"="mg/L", "Cvehicle_infinite"="mg/L")
= concentration_data_meade2023
df.simulation
# Find experiments with unique dosing regimes/species/etc and average those that had multiple time points
# for the same type of experiment. Rename each experiment with new_id.
=c("PREFERRED_NAME","test_substance_dtxsid", "conc_medium_obs", "conc_medium_normalized", "administration_route_normalized", "species", "BW", "dose_level_new", "dose_level_new_units","totalSA","Vvehicle","InfiniteDose","dose_duration_num2","dose_duration_units2","Fskin_exposed","Exhalation","dose_vehicle_normalized","Water.Solubility.mg.L", "Vapor.Pressure.mmHg","Boiling.Point.C","logP", "Volatility", "Kvehicle2water")
keep.cols=data.frame(df.simulation%>%group_by(across(all_of(c("time_hr",keep.cols) )))%>%
df.simulation_redsummarize(conc_normalized=mean(conc_normalized),.groups="drop")%>%group_by(across(all_of(keep.cols)))%>% arrange(time_hr)%>% mutate(new_id=cur_group_id())%>%
ungroup())
=df.simulation_red%>%mutate(across(c(dose_duration_num2,dose_duration_units2,Vvehicle),~ifelse(is.na(.x),list(NULL),.x))) df.simulation_red2
We have nested for loops to account for each aspect of the data that needs to be simulated. Each ID number (or “new_id”) in df.simulation_red2 corresponds to a single experiment (or experiment with averaged concentration data if the subject species, dose, route, etc. was the same), where we have defined the necessary variables: species, chemical, dose, dose duration (when applicable), whether or not an Infinite dose is applicable, route of exposure, the collected medium/tissue being sampled (i.e., plasma, blood, liver) and subject parameters such as body mass (kg) and surface area of skin. Thus, we run through each unique ID and first parameterize the model using parameterize_dermal_pbtk() and then solve the model using solve_dermal_pbtk(). For each ID, we simulate each permeability method (Surrey and Potts-Guy); for each ID/permeability method, we simulate either a water or octanol vehicle; then for each combination, we simulate the model both with and without exhalation. Note that some experimental data indicate the vehicle used, but we still test both vehicles for simulation to compare. Likewise, we calculated volatility for each chemical (which indicate whether or not exhalation actually occurs), but we still generate for all options. Note that solve_dermal_pbtk() can handle IV or oral exposure by setting the “route” input variable. Each model output is saved to a list called conc_predictions_meade2023LIST. Then this list is converted to a data frame.
= df.simulation_red2
df.run.sims = unique(df.run.sims$new_id)
fkID = c("Potts-Guy", "UK-Surrey")# for
method.perm =c("octanol","water") # for
Kvehicle2water= c("TRUE","FALSE")
exhalation_for_simulation
=list()
conc_predictions_meade2023LIST
for (include.e in exhalation_for_simulation){
for(i in 1:length(fkID)){
for(mp in 1:length(method.perm)){
for(k in 1:length(Kvehicle2water)){
= subset(df.run.sims, new_id == fkID[i])
sub.df = unique(sub.df[,c("test_substance_dtxsid","conc_medium_obs","administration_route_normalized",
specs "species","BW","dose_level_new","dose_level_new_units",
"totalSA","Vvehicle","InfiniteDose","dose_duration_num2","dose_duration_units2",
"Fskin_exposed","Exhalation")])
# check for experiments with only 1 timepoint and reset sub.df
if(length(sub.df$time_hr)==1){
=rbind(sub.df, sub.df[rep(1,1),])
sub.df1,c("conc_normalized","time_hr")]=0
sub.df[
}
= parameterize_dermal_pbtk(dtxsid = specs$test_substance_dtxsid,
parms species = specs$species,
totalSA = specs$totalSA,
default.to.human=T,
Kvehicle2water = Kvehicle2water[k],
InfiniteDose = specs$InfiniteDose,
method.permeability = method.perm[mp],
model.type="dermal_1subcomp",
clint.pvalue.threshold = Inf,
suppress.messages = T
)$BW = specs$BW
parms$Fskin_exposed = specs$Fskin_exposed
parms
# Remove exhalation or keep exhalation (except for collected air which is )
if(include.e == FALSE) {parms$Qalvc = 0}
if(unique(sub.df$PREFERRED_NAME!="DCM"))
# tolerance is bad for DCM, so set tolerance to default values for all other compounds.
{ =1e-5; atol=1e-5
rtolelse{
}=1e-4; atol=1e-4
rtol
}
=sub.df[order(sub.df$time_hr),]
sub.df= unique(c(0,sub.df$time_hr)) #time in hours
timey gc() # garbage collection (memory)
=data.frame(solve_dermal_pbtk(parameters = parms,
oktimes = timey/24, #convert to days
route = tolower(specs$administration_route_normalized),
model.type="dermal_1subcomp",
plot = FALSE,
output.units = output.units.original,
washoff = TRUE,
input.units = specs$dose_level_new_units,
initial.dose=specs$dose_level_new,
dose.duration = unlist(specs$dose_duration_num2),
dose.duration.units = unlist(specs$dose_duration_units2),
Vvehicle = unlist(specs$Vvehicle),
#dosing.dermal = dosing.dermal,
Kvehicle2water = Kvehicle2water[k],
suppress.messages = TRUE,
atol = atol, rtol= rtol,
method="lsoda"
))# extract only times we need
= subset(ok, time%in%c(ok$time[!(is.na(match(ok$time, sub.df$time_hr/24)))]))
ok.time $Prediction = ok.time[,c(specs$conc_medium_obs)]
sub.df$exhalation_for_simulation = include.e
sub.df
# specify vehicle and perm. method and vehicle if dermal
if(specs$administration_route_normalized != "Dermal"){
$method.perm = NA
sub.df$Kvehicle2water = NA
sub.df$method.name = specs$administration_route_normalized
sub.dfelse{
}$method.perm = method.perm[mp]
sub.df$Kvehicle2water = Kvehicle2water[k]
sub.df$method.name = paste(method.perm[mp], "- vehicle = ", Kvehicle2water[k])
sub.df
}
paste0(fkID[i],".",Kvehicle2water[k],".",method.perm[mp],".exh",include.e)]]= sub.df
conc_predictions_meade2023LIST[[
}
}
}
}#Bind data together to create a dataframe and organize some columns
= do.call(rbind, conc_predictions_meade2023LIST)
conc_predictions_meade2023DF rownames(conc_predictions_meade2023DF)=NULL
::resave(conc_predictions_meade2023DF, conc_predictions_meade2023LIST, file=paste0("meade2023_",format(Sys.time(), "%b_%d_%Y"),".Rdata")) cgwtools
Now, if run.simulations = FALSE, we must load data. This load supptab1_meade2023, concentration_data_meade2023, conc_predictions_meade2023DF, conc_predictions_meade2023LIST.
load(paste0(loc.wd, "/meade2023_Jun_13_2025.Rdata"))
Now, we want to get the RMSLE for each chemical, comparing the model to the data. Let us group by chemical, method, and whether or not we include exhalation.
$method.perm = gsub( "UK-Surrey","Surrey", conc_predictions_meade2023DF$method.perm)
conc_predictions_meade2023DF$method.name = gsub( "UK-Surrey","Surrey", conc_predictions_meade2023DF$method.name)
conc_predictions_meade2023DF= conc_predictions_meade2023DF
df.sim # Convert aexhaled air from mg to mg/l (assume a room of x liters, for humans - humans
# the only species in these exhaled experiments)
which(df.sim$conc_medium_obs=="Aexhaled"),"Prediction"]=df.sim[which(df.sim$conc_medium_obs=="Aexhaled"),"Prediction"]/10000
df.sim[
$conc_normalized[which(df.sim$conc_normalized==0)]=1e-10
df.sim$Prediction[which(df.sim$Prediction==0)]=1e-10
df.sim=df.sim %>% group_by(PREFERRED_NAME, method.name, exhalation_for_simulation)%>%
df.sim.rmslemutate(cmax_obs = max(conc_normalized), cmax_pred=max(Prediction)) %>%
mutate(RMSLE = rmse(log10(conc_normalized),log10(Prediction)),
Cmax_RMSLE = rmse(log10(cmax_obs), log10(cmax_pred)))
# Linear fit
<- df.sim.rmsle %>% group_by(method.name) %>% do(
ddf.sim.rmsle
{= summary(lm(log10(Prediction) ~ log10(conc_normalized),data=.))$adj.r.squared
Adjusted.R2 = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$coefficients[2]
Coefficient = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$coefficients[1]
Intercept = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$residuals
Residuals = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$fitted.values
Fitted.Values data.frame(.,Adjusted.R2,Coefficient,Intercept,Residuals,Fitted.Values)
} )
Let us compare the RMSLE and the volatility (i.e., boiling point) for each compound (assuming exhalation).
# Plot RMSLE vs Boiling Point
= subset(df.sim.rmsle, exhalation_for_simulation==TRUE)
data
=data%>%group_by(PREFERRED_NAME, Boiling.Point.C,method.name)%>%
datasummarize(Chemical.cmax.RMSLE = rmse(log10(cmax_obs),log10(cmax_pred)),
Chemical.RMSLE=rmse(log10(conc_normalized),log10(Prediction)))
= data.frame(xmin=c(0,75,250,400),
data.rect xmax=c(75,250,400,Inf),
ymin=rep(-Inf,4),
ymax=rep(Inf,4),
Volatility=c("Very Volatile",
"Volatile",
"Semi-Volatile",
"Not Volatile"))
$Volatility <- factor(data.rect$Volatility, levels=data.rect$Volatility)
data.rect<- viridis(4)
viridis.colors <- ggplot(data=data,aes(x=`Boiling.Point.C`,y=Chemical.RMSLE)) +
plot geom_point()+
geom_smooth(method="lm",formula='y~x') +
geom_text_repel(data=data,aes(x=`Boiling.Point.C`,y=Chemical.RMSLE,label=PREFERRED_NAME),
max.overlaps=Inf)+
facet_wrap(~method.name,scales="free") +
#scale_x_log10() +
geom_rect(data=data.rect,
aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax,fill=Volatility),
alpha=0.5,
inherit.aes=FALSE) +
scale_fill_manual(values=c(`Not Volatile`=viridis.colors[1],
`Semi-Volatile`=viridis.colors[2],
`Volatile`=viridis.colors[3],
`Very Volatile`=viridis.colors[4]))+
labs(y="RMSLE",title="Boiling Point vs. RMSLE",x="Boiling Point (C)")+
theme_bw() +
theme(text=element_text(size=33),
plot.title=element_text(hjust=0.5),
legend.position="bottom")
plotggsave(paste(getwd(), "/Figures_May1/BPvsRMSLE_suppl.png",sep=""),
width = 20, height = 10, units = "in")
<- ggplot(data=data,aes(x=`Boiling.Point.C`,y=Chemical.cmax.RMSLE)) +
plot2 geom_point()+
geom_smooth(method="lm",formula='y~x') +
geom_text_repel(data=data,aes(x=`Boiling.Point.C`,y=Chemical.cmax.RMSLE,label=PREFERRED_NAME),
max.overlaps=Inf)+
facet_wrap(~method.name,scales="free") +
#scale_x_log10() +
geom_rect(data=data.rect,
aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax,fill=Volatility),
alpha=0.5,
inherit.aes=FALSE) +
scale_fill_manual(values=c(`Not Volatile`=viridis.colors[1],
`Semi-Volatile`=viridis.colors[2],
`Volatile`=viridis.colors[3],
`Very Volatile`=viridis.colors[4]))+
labs(y="RMSLE",title="Boiling Point vs. Cmax RMSLE",x="Boiling Point (C)")+
theme_bw() +
theme(text=element_text(size=33),
plot.title=element_text(hjust=0.5),
legend.position="bottom")
#annotate_figure(plot2,top=text_grob("Boiling Point vs RMSLE",size=40,face="bold"))
ggsave(paste(getwd(), "/Figures_May1/BPvsRMSLE_cmax_suppl.png",sep=""),
width = 20, height = 10, units = "in")
Now, we compare simulations with exhalation to those without exhalation (as a function of boiling point).
=df.sim.rmsle
df.exh.compare
= data.frame(df.exh.compare %>% group_by(PREFERRED_NAME,method.name, Boiling.Point.C)%>%
Exh arrange(exhalation_for_simulation) %>%
summarize(Cmax=Cmax_RMSLE[exhalation_for_simulation==FALSE]-Cmax_RMSLE[exhalation_for_simulation==TRUE], Full_time_course = RMSLE[exhalation_for_simulation==FALSE]-RMSLE[exhalation_for_simulation==TRUE]))
$method.name= sub(" - ", " -\n", Exh$method.name)
Exh
#stack data
= data.frame(Exh %>%
stackExh pivot_longer(cols = c("Cmax", "Full_time_course"),
names_to = "ind",
values_to = "values"))
= unique(stackExh[,c("Boiling.Point.C","values","PREFERRED_NAME","ind","method.name")])
data.text = data.frame(xmin=c(0,75,250,400),
data.rect xmax=c(75,250,400,Inf),
ymin=rep(-Inf,4),
ymax=rep(Inf,4),
Volatility=c("Very Volatile",
"Volatile",
"Semi-Volatile",
"Not Volatile"))
$Volatility <- factor(data.rect$Volatility,
data.rectlevels=data.rect$Volatility)
<- viridis(4)
viridis.colors
# rename labels for facets
# labeller=labeller(ind = c(Cmax="Cmax",
# Full_time_course = "Full time-course"))
<- ggplot(data=stackExh,aes(x=Boiling.Point.C,y=values)) +
plot geom_point()+ #geom_line()+
theme_bw()+
geom_smooth(method='lm') +
facet_grid(ind~method.name, scales="free", labeller=labeller(ind = c(Cmax="Cmax",
Full_time_course = "Full time-course")))+
geom_rect(data=data.rect,
aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax,fill=Volatility),
alpha=0.5,
inherit.aes=FALSE) +
scale_fill_manual(values=c(`Not Volatile`=viridis.colors[1],
`Semi-Volatile`=viridis.colors[2],
`Volatile`=viridis.colors[3],
`Very Volatile`=viridis.colors[4]))+
geom_text_repel(data=data.text,aes(x=Boiling.Point.C,y=values,label=PREFERRED_NAME),
max.overlaps=Inf)+
#facet_grid(cols=vars(Route.or.Method.if.Dermal), rows=vars(RMSLE.Type),scales="free_y")+
#scale_x_log10() +
geom_hline(aes(yintercept=0)) +
labs(y=expression(Delta~"RMSLE"),x="Boiling Point (C)",title="Change in RMSLE when Exhalation is Added")+
scale_color_brewer(type="qual",palette="Paired") +
theme(axis.text.x = element_text(angle=45,vjust=1,hjust=1),
plot.title=element_text(hjust=0.5),
text=element_text(size=30),
legend.position="bottom")
plot
ggsave(paste(getwd(), "/Figures_May1/BP_RMSLE_Exhalation_main.png",sep=""),
width = 20, height = 10, units = "in")
# P-values for regression lines
= stackExh%>% group_by(ind, method.name) %>%
stackExhStats summarize(model = list(lm(values~Boiling.Point.C, data=cur_data())),
.groups="drop")%>%
rowwise()%>%
mutate(msummary=list(summary(model)),
rsquared=msummary$r.squared,
p_val= pf(msummary$fstatistic[1], msummary$fstatistic[2], msummary$fstatistic[3], lower.tail=FALSE)
%>%select(ind,method.name,rsquared,p_val) )
Now, let’s generate CvT plots for each chemical and color by whether or not exhalation was included.
= df.sim.rmsle
df.sim2
= unique(df.sim2$PREFERRED_NAME)
chem
for(chx in 1:length(chem)){
= subset(df.sim2, PREFERRED_NAME==chem[chx])
each.chem #each.chem$dose = paste(each.chem$dose_level_new, each.chem$dose_level_new_units)
$dose=paste(each.chem$dose_level_new, each.chem$dose_level_new_units)
each.chem# each.chem$dose = factor(each.chem$dose,
# levels=unique(each.chem$dose)[order(unique(each.chem$dose_level_new))])
$method.name= sub(" - ", " -\n", each.chem$method.name)
each.chem$subject=paste(each.chem$species, "BW",each.chem$BW, "kg")
each.chem
ggplot(each.chem)+
geom_point(data=each.chem, aes(x=time_hr, y=conc_normalized,colour=dose))+
geom_line(data=each.chem, aes(x=time_hr, y=Prediction,colour=dose,linetype=exhalation_for_simulation),linewidth=1)+
labs(title=paste(chem[chx], "\n Log-transformed Concentrations v time"), x=" Time (hr)", y="Log Concentrations (mg/L)", colour="")+
facet_nested(~conc_medium_normalized+method.name,scales="free")+
scale_y_log10() +
theme(legend.position = "bottom")
ggsave(paste(getwd(), "/Figures_May1/cvt_eachchem_by_exhalation_suppl", unique(each.chem$test_substance_dtxsid),".png",sep=""),
width = 8, height = 6, units = "in")
}
Now plot the data in Predicted vs Observed (PvO) plots for each of the 13 chemicals. First, we extract only simulated results that match the correct exhalation conditions by finding simulations where exhalation_for_simulations = T if Exhalation (which was calculated from volatility parameters), is TRUE and FALSE if FALSE.
= subset(df.sim.rmsle, exhalation_for_simulation==Exhalation)
df.new
= unique(df.new$PREFERRED_NAME)
chem
for(chx in 1:length(chem)){
= subset(df.new, PREFERRED_NAME==chem[chx])
each.chem #each.chem$dose = paste(each.chem$dose_level_new, each.chem$dose_level_new_units)
$dose=paste(each.chem$dose_level_new, each.chem$dose_level_new_units)
each.chem$dose = factor(each.chem$dose,
each.chemlevels=unique(each.chem$dose)[order(unique(each.chem$dose_level_new))])
$method.name= sub(" - ", " -\n", each.chem$method.name)
each.chem$subject=paste(each.chem$species, "BW",each.chem$BW, "kg")
each.chem
ggplot(each.chem, aes(x=conc_normalized,y=Prediction))+
geom_point(aes(colour=dose,shape=conc_medium_normalized))+
labs(title=paste(chem[chx], "\n Log-transformed Concentrations"), x="Observed Concentrations (mg/L)", x="Predicted Concentrations (mg/L)", colour="", shape="")+
#facet_wrap(~dose,scales="free")+
facet_wrap(~method.name,scales="free")+
geom_smooth(method='lm',formula='y~x',colour="black")+
geom_abline(linetype=2)+
#geom_smooth(aes(color=method.name),method='lm',formula='y~x')+
scale_x_log10() +
scale_y_log10() +
theme_bw() + theme(text=element_text(size=11),
strip.text=element_text(size=13))
#theme(legend.position = "bottom")
ggsave(paste(getwd(), "/Figures_May1/pvo_bydose_suppl", unique(each.chem$test_substance_dtxsid),".png",sep=""),
width = 6, height = 6, units = "in")
}
# Linear fit for these.
<- df.new %>% group_by(method.name) %>% do(
linear.fit.new
{= summary(lm(log10(Prediction) ~ log10(conc_normalized),data=.))$adj.r.squared
Adjusted.R2 = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$coefficients[2]
Coefficient = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$coefficients[1]
Intercept = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$residuals
Residuals = lm(log10(Prediction) ~ log10(conc_normalized),data=.)$fitted.values
Fitted.Values data.frame(.,Adjusted.R2,Coefficient,Intercept,Residuals,Fitted.Values)
}
)# RMSLE grouped by method type
= df.new%>%group_by(method.name)%>%
rmsle.all summarize( Adjusted.R2 = summary(lm(log10(Prediction) ~ log10(conc_normalized)))$adj.r.squared,
RMSLE_total =rmse(log10(Prediction),log10(conc_normalized)),
CMAX_RMSLE_total =rmse(log10(cmax_pred) ,log10(cmax_obs) )
)
now plot log-transformed predicted vs. observed values and group by permeability method, vehicle (or route of exposure), and tested tissue.
= subset(df.sim, exhalation_for_simulation==Exhalation)
df.new =df.new %>% group_by(PREFERRED_NAME, method.name)%>%
df.sim.rmsle2mutate(cmax_obs = max(conc_normalized), cmax_pred=max(Prediction)) %>%
mutate(RMSLE = rmse(log10(conc_normalized),log10(Prediction)),
Cmax_RMSLE = rmse(log10(cmax_obs), log10(cmax_pred)))
=c("firebrick", "indianred", "blue4","steelblue1","#2ca25f", "#addd8e")
col=ggplot(df.sim.rmsle2, aes(x=conc_normalized, y=Prediction))+
plot.pvogeom_point(aes( colour=method.name,shape=PREFERRED_NAME), alpha=1,size=2)+
labs(title=paste0("(A) Concentrations"),
y = "Simulated (mg/L)", x="Observed (mg/L)",
color="Route or\n Method",shape="Chemical")+
scale_shape_manual(values=1: length(unique(df.sim.rmsle2$PREFERRED_NAME)))+
geom_smooth(aes(colour=method.name),method="lm",se=F)+
#scale_color_brewer(type="qual",palette="Paired",direction=-1) +
scale_colour_manual(values=col)+
theme_bw() +
scale_x_log10() + scale_y_log10() +
geom_abline(lty="dashed") +
#coord_fixed(xlim=lims,ylim=lims) +
theme(plot.title=element_text(hjust=0.5),
text=element_text(size=30))
= ggplot(df.sim.rmsle2, aes(x=cmax_obs, y=cmax_pred))+
plot.cmax geom_point(aes( colour=method.name,shape=PREFERRED_NAME), alpha=1,size=2)+
labs(title=paste0("(B) Max Concentrations"),
y = "Simulated (mg/L)", x="Observed (mg/L)",
color="Route or\n Method", shape="Chemical")+
#scale_color_brewer(type="qual",palette="Paired",direction=-1) +
scale_colour_manual(values=col)+
scale_shape_manual(values=1: length(unique(df.sim.rmsle2$PREFERRED_NAME)))+
theme_bw() +
scale_x_log10() + scale_y_log10() +
geom_abline(lty="dashed") +
#coord_fixed(xlim=lims,ylim=lims) +
theme(plot.title=element_text(hjust=0.5),
text=element_text(size=30))
ggarrange(plot.pvo, plot.cmax, common.legend=T,legend="right")
ggsave(paste(getwd(), "/Figures_May1/PvO_main.png",sep=""),
width = 20, height = 10, units = "in",dpi=300)
=df.new %>% group_by(PREFERRED_NAME, method.name)%>%
method.rmsle.onlymutate(cmax_obs = max(conc_normalized), cmax_pred=max(Prediction)) %>%
summarize(RMSLE = rmse(log10(conc_normalized),log10(Prediction)),
Cmax_RMSLE = rmse(log10(cmax_obs), log10(cmax_pred))) %>%group_by(method.name)%>%summarize(mean(Cmax_RMSLE),
mean(RMSLE))
Now let’s plot a heat map of the RMSLE and Cmax RMSLE. We will subset the dataframe to only plot the simulations appropriate for exhalation (that is, for a volatile chemical where Exhalation = TRUE, we will extract the solutions when exhalation_for_simulation = TRUE).We will also make note of the experiments for which a dosing vehicle was specified (water or octanol). For dermal, we will plot the results using both permeability methods and both vehicle methods, but will mark boxes over the cases where the vehicle matched to the simulated vehicle. We also want to plot the chemicals in order of volatility (based on boiling point).
# Extract correct exhalation
# df = conc_predictions_meade2023DF
# df$conc_normalized[which(df$conc_normalized==0)]=1e-10
# df$Prediction[which(df$Prediction==0)]=1e-10
=df.sim
df= subset(df, exhalation_for_simulation==Exhalation)
dfmatchE # order chemicals by volatility
= unique(dfmatchE[,c("PREFERRED_NAME","Boiling.Point.C")])
vol.chem =vol.chem[order(vol.chem$Boiling.Point.C,decreasing=F),]
vol.chem2
#also want to extract table to identify experiments that actually did list the vehicle as either water or octanol (or similar)
= data.frame(dfmatchE%>%group_by(PREFERRED_NAME,method.name,dose_vehicle_normalized,Kvehicle2water,administration_route_normalized)%>%
dfmatchE2 mutate(cmax_obs = max(conc_normalized), cmax_pred=max(Prediction)) %>%
summarize(Cmax_RMSLE = rmse(log10(cmax_obs),log10(cmax_pred)),
RMSLE = rmse(log10(conc_normalized),log10(Prediction))))
# Reorganize the columns so RMSLE values can be grouped by Cmax or total RMSLE and then take average RMSLE values of oral or iv values so we have one per chemical
= data.frame(dfmatchE2 %>%
dfmatchE3 pivot_longer(cols = c("Cmax_RMSLE", "RMSLE"),
names_to = "ind",
values_to = "value") %>% group_by(PREFERRED_NAME,ind,administration_route_normalized) %>%
mutate(value = if_else(administration_route_normalized !="Dermal", mean(value[administration_route_normalized !="Dermal"]), value)
))
# Make PREFERRED_NAME a factor and set the levels to match those from the volatility order.
$PREFERRED_NAME = factor(dfmatchE3$PREFERRED_NAME,
dfmatchE3levels=vol.chem2$PREFERRED_NAME)
Generate the plot.
ggplot(dfmatchE3, aes(y=PREFERRED_NAME, x=method.name))+
geom_tile(aes(fill=value))+
#geom_tile(aes(fill=Chemical.Cmax.RMSLE.NE))+
facet_wrap(~ind)+
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
scale_fill_viridis()+
geom_text(aes(label=round(value,2)),colour="white",size=4)+
labs(title="",
x="Route or\n Method",
fill="RMSLE",y="Chemical")+
theme_bw() +
theme(axis.text.x=element_text(angle=45,vjust=1,hjust=1),
axis.text.y=element_text(angle=45,vjust=1.05,hjust=0.9),
plot.title=element_text(hjust=0.5),
text=element_text(size=20))+
geom_tile(data=subset(dfmatchE3, dose_vehicle_normalized==Kvehicle2water & administration_route_normalized=="Dermal"),aes(y=PREFERRED_NAME, x=method.name),
size=1.5,fill=NA,colour="red" )
ggsave(paste(getwd(),"/Figures_May1/RMSLE_compare_main.png",sep=""),width=11,height=8,units="in")
Now, we perform in vitro to in vivo extrapolation using the dermal model to obtain some occupational hazardous levels.
First we load ToxCast Data data from the EPA-ORD-CCTE Clowder. Some of the variables in this dataset of importance include:
hitc
: if hitc=1
there was a statistically
significant systematic conc-response observed, and if
hitc=0
there was no statistically significantmc6_flags
: features of the hit that might be of
interestmodl
type of conc-response model selected (using AIC)
"cnst"
: no concentration response (constant)"hill"
: Hill equation +"gnls"
: Gain-loss
model, product of an increasing and decreasing Hill equations, often
interpreted as being caused by cytotoxicitymodel_ga
: log10 uM concentration at which the chemical
caused 50% activity (i.e., chemical-assay “potency” or “AC50”)spid
: sample id, since each chemical might have
multiple assays with hits or have been run multiple times (experimental
replicates)# Load in vitro data
load("invitrodb_3_5_mc5.Rdata")
# Get chemicals of interest
load_sipes2017(overwrite=F)
# Using list, download chem info from CompTox Dashboard for chem.list chemicals, and load.
<- read.xlsx(file="CCD-Batch-Search_2023-03-08_12_23_25.xlsx",
df.chem sheetName="Main Data")# %>%filter(!is.na(BOILING_POINT_DEGC_OPERA_PRED))
<- type_convert(df.chem,na="N/A")
df.chem <- type_convert(df.chem,na="N/A") df.chem
#Get list of chemicals that can run in dermal model:
<- get_cheminfo(info="DTXSID",species="Human",model="dermal_1subcomp")
chem.list write.table(chem.list,
file=paste0(loc.wd,"/tables/ChemicalList.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
# Take out volatile chemicals
<- c("DTXSID0020232","DTXSID6022923","DTXSID8024151","DTXSID3022409","DTXSID1026081")
chems.from.invivo <- df.chem %>% filter(!is.na(BOILING_POINT_DEGC_OPERA_PRED))%>%
df.chem filter((BOILING_POINT_DEGC_OPERA_PRED >= 400) | (DTXSID %in% chems.from.invivo))
<- df.chem$DTXSID
chem.list
#get sample to test code with
set.seed(100)
#chem.list <- sample(chem.list,50)
<- mc5 %>% subset(dsstox_substance_id %in% chem.list) toxcast.dermal
# CONCENTRATIONS OF INTEREST - https://doi.org/10.1093/bioinformatics/btw680 ----------------
<- NULL
toxcast.table <- Sys.time()
old.time <- length(chem.list)
num.chem for(k.chem in 1:num.chem){ #for each chemical
<- chem.list[k.chem]
this.id # Modulus operator
if(k.chem %% 100==0){
<- difftime(Sys.time(), old.time,units="secs")
new.time cat(paste0("Run ", k.chem,":",round(new.time)," secs \n"))
}<- subset(toxcast.dermal, dsstox_substance_id == this.id) #subset over chemical
toxcast.dermal.chem <- subset(toxcast.dermal.chem, hitc==1) #only look at hits
toxcast.dermal.chem.hits if(dim(toxcast.dermal.chem.hits)[1]>0){ #if there were any hits
<- data.frame(Chemical = as.character(toxcast.dermal.chem.hits[1,"chnm"]),
this.row DTXSID = this.id,
Total.Assays = dim(toxcast.dermal.chem)[1],
Unique.Assays = length(unique(toxcast.dermal.chem$aeid)),
Total.Hits = dim(toxcast.dermal.chem.hits)[1],
Unique.Hits = length(unique(toxcast.dermal.chem.hits$aeid)),
Low.AC50 = min(toxcast.dermal.chem.hits$modl_ga),
Low.AC10 = min(toxcast.dermal.chem.hits$modl_ac10),
Low.ACC = min(toxcast.dermal.chem.hits$modl_acc),
Q10.AC50 = quantile(toxcast.dermal.chem.hits$modl_ga,probs=0.1),
Q10.AC10 = quantile(toxcast.dermal.chem.hits$modl_ac10,probs=0.1),
Q10.ACC = quantile(toxcast.dermal.chem.hits$modl_acc,probs=0.1),
Med.AC50 = quantile(toxcast.dermal.chem.hits$modl_ga,probs=0.5),
Med.AC10 = quantile(toxcast.dermal.chem.hits$modl_ac10,probs=0.5),
Med.ACC = quantile(toxcast.dermal.chem.hits$modl_acc,probs=0.5),
Q90.AC50 = quantile(toxcast.dermal.chem.hits$modl_ga,probs=0.9),
Q90.AC10 = quantile(toxcast.dermal.chem.hits$modl_ac10,probs=0.9),
Q90.ACC = quantile(toxcast.dermal.chem.hits$modl_acc,probs=0.9)
)<- rbind(toxcast.table, this.row)
toxcast.table
}
}rownames(toxcast.table) <- seq(1,dim(toxcast.table)[1]) # set rownames to be sequential numbers
#View table
::kable(head(toxcast.table[,1:6]), caption = "Summarized ToxCast Data",
knitrfloating.environment="sidewaystable")
write.table(toxcast.table,
file=paste0(loc.wd,"/tables/ToxCastSummaryTable.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
# Get Cmax = max(Cplasma) from dermal model for each chemical in toxcast table
<- Sys.time()
old.time =which(toxcast.table$DTXSID %in% get_cheminfo(info="dtxsid", suppress.messages=T))
include.these= subset(toxcast.table,DTXSID %in% toxcast.table$DTXSID[include.these])
toxcast.table2
<- rtol <- 1e-5
atol
<- length(toxcast.table2$DTXSID)
num.chem =F
plot.each# Set Dermal Solutions
<- c("Potts-Guy","UK-Surrey")
method.ls =list()
toxcast.lsfor(k.chem in 1:num.chem){
= toxcast.table2$DTXSID[k.chem]
this.id = TRUE
suppress.messages
for(k.method in 1:2){
<- method.ls[k.method];
this.method # if (this.method=="2-Compartment") this.method.input <- "NULL"
<-"dermal_1subcomp"
this.model.type =0
end.time=parameterize_dermal_pbtk(dtxsid=this.id,
parmsmodel.type = this.model.type,
method.permeability = this.method,
clint.pvalue.threshold=Inf,
suppress.messages=TRUE,
Kvehicle2water = 1,
species="Human",
default.to.human = T)
$InfiniteDose=1
parms#if(end.time<2){ #make sure solver finishes
=try( solve_dermal_pbtk(parameters=parms,
outmodel.type=this.model.type,
method.permeability=this.method,
#Kvehicle2water=1, #vehicle is water
days=5,
initial.dose = 1,
input.units = "ppm",
dose.duration=8,
dose.duration.units="hr",
washoff=T,
# InfiniteDose=T,
atol = atol, rtol= rtol,
method="lsoda",
suppress.messages=suppress.messages) )#put SA for hands!!!
if (is(out,"try-error")){
=0
Cmaxbrowser()
}
else{end.time <- out[nrow(out),"time"]
<- max(out[,"Cplasma"])
Cmax
}paste0(k.chem,".",this.method)]]=cbind(toxcast.table2[toxcast.table2$DTXSID==this.id,],Cmax=Cmax,Permeability = this.method)
toxcast.ls[[
}
}
=do.call(rbind,toxcast.ls)
toxcast2.all
# # Calculate the EquivDose in units same as input.units (ppm)
$EquivDose = signif(10^toxcast2.all$Q10.AC50 /
toxcast2.all$Cmax,
toxcast2.all4)
# Edit data
=toxcast2.all%>%
toxcast3rename(`Permeability Type` = Permeability,
`Equivalent Dose`=EquivDose)%>%
mutate(Chemical = ifelse(Chemical=="Methyl tert-butyl ether","MTBE",Chemical),
Chemical = ifelse(Chemical=="3,3',5,5'-Tetrabromobisphenol A","Bromdian",Chemical),
Chemical = ifelse(Chemical=="4,4'-Sulfonyldiphenol","Bisphenol S",Chemical),
Chemical = ifelse(Chemical=="Dichloromethane","DCM",Chemical),
Chemical = ifelse(Chemical=="Tetrachloroethylene","PERC/TCE",Chemical),
Chemical = ifelse(Chemical=="1-Methylbenzene","Toluene",Chemical))
# Remove unused columns
<- toxcast3 %>% select(Chemical,DTXSID,`Permeability Type`,`Equivalent Dose`,Cmax) ivive_meade2023
<- subset(chem.physical_and_invitro.data,
chem.props %in% ivive_meade2023$DTXSID)[
DTXSID c("DTXSID","MW","logWSol")]
,
<- merge(ivive_meade2023,
ivive_meade2023 all.x = TRUE,
chem.props,by="DTXSID")
<- within(ivive_meade2023,
ivive_meade2023 <-
WS.ppm signif(MW*10^logWSol*1.001142303,4))
# Save data
::resave(ivive_meade2023,
cgwtoolsfile=paste0(
loc.wd,"/meade2023_",
format(Sys.time(), "%b_%d_%Y"),
".Rdata"))
# Modify for plotting
$`Permeability Type` = gsub("UK-Surrey","Surrey",ivive_meade2023$`Permeability Type`)
ivive_meade2023==Inf] = 1e15
ivive_meade2023[ivive_meade2023write.table(subset(ivive_meade2023,
"Equivalent Dose"] <
ivive_meade2023[,"WS.ppm"]),
ivive_meade2023[,file=paste0(loc.wd,"/tables/IVIVEAchievable.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
$`Equivalent Dose` <- log10(ivive_meade2023$`Equivalent Dose`)
ivive_meade2023$`Permeability Type` = paste0("Permeability Method is ",ivive_meade2023$`Permeability Type`)
ivive_meade2023write.table(ivive_meade2023,
file=paste0(loc.wd,"/tables/IVIVEResults.txt"),
col.names=TRUE,
row.names=FALSE,
quote=FALSE,
sep="\t")
# Plot Histogram
= ivive_meade2023 %>% group_by(`Permeability Type`) %>% mutate(Mean = mean(`Equivalent Dose`,na.rm=TRUE)) %>% mutate(Median = median(`Equivalent Dose`,na.rm=TRUE))
data <- c("Needs Protection (AED caused by <1ppm)",
AED.labels "AED Possibly Acheivable",
"\"Safe\" (Unachievable AED)",
"No Dermal Penetration")
# Shaded rectangles with cutoff at 10,000 ppm:
.1e4ppm = data.frame(xmin=c(-Inf,0,4,12),
data.rectxmax=c(0,4,12,Inf),
ymin=rep(-Inf,4),
ymax=rep(Inf,4),
Bioactivity=AED.labels)
# Shaded rectangles with cutoff at 1,000,000 ppm:
.1e6ppm = data.frame(xmin=c(-Inf,0,6,12),
data.rectxmax=c(0,6,12,Inf),
ymin=rep(-Inf,4),
ymax=rep(Inf,4),
Bioactivity=AED.labels)
.1e4ppm$Bioactivity <- factor(data.rect.1e4ppm$Bioactivity,
data.rectlevels=data.rect.1e4ppm$Bioactivity)
.1e6ppm$Bioactivity <- factor(data.rect.1e6ppm$Bioactivity,
data.rectlevels=data.rect.1e6ppm$Bioactivity)
= unique(data[,c("Permeability Type","Mean","Median")]) %>% pivot_longer(cols=c("Mean","Median"),names_to="Statistic",values_to="Statistic.Value")
data.med <- viridisLite::viridis(4)
viridis.colors <- ggplot(data,aes(x=`Equivalent Dose`)) +
plot geom_histogram(na.rm=TRUE,alpha=0.5,position="identity") +
geom_freqpoly(na.rm=TRUE) +
geom_vline(data = data.med, aes(xintercept=Statistic.Value,linetype=Statistic)) +
geom_rect(data=data.rect.1e4ppm,
aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax,fill=Bioactivity),
alpha=0.5,
inherit.aes=FALSE) +
#scale_x_log10() +
scale_fill_manual(values=setNames(viridis.colors[4:1], AED.labels))+
labs(title=paste0("Distribution of concentrations causing AED\nfor hands submerged for 8 hours a day\nfor five days"),
y = "Number of Chemicals", x="Log10 of Administrered Equivalent Doses (AEDs) ppm in water") +
facet_col(vars(`Permeability Type`),scales="fixed") +
theme_bw() +
theme(plot.title=element_text(hjust=0.5))
plot ggsave(paste(getwd(), "/Figures_May1/histogram2.png",sep=""),
width = 20, height = 12, units = "cm")
# WATER SOLUBILITY
<-
ivive_meade2023 %>%
ivive_meade2023 mutate(WSol = case_when( ((MW*10^logWSol)>=10) ~ "Water Soluble (WSol>10g/L)",
*10^logWSol)<10) ~ "Not Soluble"
((MW
))#ivive_meade2023ALL <- ivive_meade2023WS; ivive_meade2023ALL$WSol = "Non-Water Soluble"
<- ivive_meade2023 %>% filter(WSol=="Water Soluble (WSol>10g/L)")
ivive_meade2023WS <- ivive_meade2023 %>% filter(WSol=="Not Soluble")
ivive_meade2023NonWS # Different bands for soluble/non-soluble:
.1e4ppm$WSol <- "Not Soluble"
data.rect.1e6ppm$WSol <- "Water Soluble (WSol>10g/L)"
data.rect<- rbind(data.rect.1e4ppm,data.rect.1e6ppm)
data.rect "Permeability Type"] <- "Permeability Method is Potts-Guy"
data.rect[,<- data.rect
data.rect2 "Permeability Type"] <- "Permeability Method is Surrey"
data.rect2[,<- rbind(data.rect,data.rect2)
data.rect
<- rbind(ivive_meade2023NonWS,ivive_meade2023WS)
ivive_meade2023 = ivive_meade2023 %>% group_by(`Permeability Type`,`WSol`) %>% mutate(Mean = mean(`Equivalent Dose`,na.rm=TRUE)) %>% mutate(Median = median(`Equivalent Dose`,na.rm=TRUE))
data = unique(data[,c("Permeability Type","WSol","Mean","Median")]) %>% pivot_longer(cols=c("Mean","Median"),names_to="Statistic",values_to="Statistic.Value") data.med
<- c("DTXSID0020232",
chem.to.label # "DTXSID6022923",
# "DTXSID8024151",
"DTXSID3022409",
"DTXSID1026081", "DTXSID9048699",
"DTXSID0022858","DTXSID1037","DTXSID0023163",
"DTXSID5024845" )
= data %>% filter(DTXSID %in% chem.to.label)
data.text #PLOT WS
<- ggplot(data) +
plot geom_rect(data=data.rect,
aes(xmin=xmin,xmax=xmax,ymin=ymin,ymax=ymax,fill=Bioactivity),
alpha=0.5#,
#inherit.aes=FALSE
+
) geom_histogram(na.rm=TRUE,alpha=0.5,position="identity",aes(x=`Equivalent Dose`)) +
geom_freqpoly(na.rm=TRUE,aes(x=`Equivalent Dose`)) +
geom_vline(data = data.med, aes(xintercept=Statistic.Value,linetype=Statistic)) +
# geom_point(data=data.text,aes(x=`Equivalent Dose`,y=0),size=2) +
# geom_text_repel(data=data.text,aes(x=`Equivalent Dose`,y=0,label=Chemical),
# max.overlaps=14,
# hjust = "right",nudge_x=-1,nudge_y=0.2)+
scale_fill_manual(values=setNames(viridis.colors[4:1], AED.labels))+
#ylim(0,135)+
#geom_text(x=) +
labs(title=paste0("Distribution of concentrations causing AED\nfor hands submerged for 8 hours a day\nfor five days"),
y = "Number of Chemicals", x="Log10 of Needed Concentration (ppm) in water") +
facet_wrap(vars(`Permeability Type`,`WSol`),scales="free_y",ncol=2) +
theme_bw() +
theme(plot.title=element_text(hjust=0.5),
text=element_text(size=20))
plot ggsave(paste(getwd(), "/Figures_May1/histogram_all_main.png",sep=""),
width = 15, height = 10, units = "in")
Include a table with the equivalent doses, grouped by permeability method
=data[order(data$`Equivalent Dose`),]
data2$`Equivalent Dose`= exp(data2$`Equivalent Dose`)
data2=data.frame(data2%>%group_by(Chemical,DTXSID, `Permeability Type`,WSol,logWSol)%>%
data3mutate(row=row_number())%>%
ungroup() %>%
pivot_wider(id_cols=c(Chemical, DTXSID, row, WSol,logWSol),
names_from = `Permeability Type`,
values_from = c(`Equivalent Dose`),
names_sep=""))
=data3[,-3]
data4=data4[order(data4$Permeability.Method.is.Surrey),] data4
Also, calculate permeability for each method for humans
=list()
pmat=c("UK-Surrey","Potts-Guy")
pm=c("water","octanol")
kwfor(i in 1:length(supptab1_meade2023$DTXSID)){
for(m in 1:2){ #each method
for(k in 1:2){ # each vehicle
=parameterize_dermal_pbtk(dtxsid=supptab1_meade2023$DTXSID[i],
pmethod.permeability =pm[m] ,species="Human",
Kvehicle2water = kw[k])
paste(i,".",m,".",k)]]=cbind(p$P, pm[m], kw[k],p$MW, supptab1_meade2023$DTXSID[i])
pmat[[
}
}
}
=do.call(rbind,pmat)
pmat2colnames(pmat2)=c("Perm", "Method","Vehicle","MW","DTXSID")
=data.frame(pmat2)
pmat2$Method=gsub("UK-Surrey","Surrey",pmat2$Method)
pmat2$Vehicle=paste("Vehicle is", pmat2$Vehicle)
pmat2
=merge(pmat2,supptab1_meade2023, by="DTXSID")
pmat3
ggplot(pmat3,aes(x=as.numeric(MW.x),y=as.numeric(Perm)))+
geom_point()+
facet_grid(Method~Vehicle,scales="free")+
labs(x="MW",y="Kp - Permeability")
ggsave(paste(getwd(), "/Figures_May1/MWvPerm.png",sep=""),
width = 4, height = 3.5, units = "in")
ggplot(pmat3,aes(x=as.numeric(logP),y=as.numeric(Perm)))+
geom_point()+
facet_grid(Method~Vehicle,scales="free")+
labs(x="log(Kow)",y="Kp- Permeability")
ggsave(paste(getwd(), "/Figures_May1/logKowvPerm.png",sep=""),
width = 4, height = 3.5, units = "in")
save.image(paste0(loc.wd,
"MeadeWorkspace",
Sys.Date(),
".RData"))