In this vignette we will see how we can summarise the use of measurement concepts in our dataset as a whole. For our example we’re going to be interested in measurement concepts related to respiratory function and will use the Eunomia synthetic dataset.
First we will connect to the database and create a cdm reference.
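library(duckdb)        # also attaches DBI, which provides dbConnect()
library(CDMConnector)  # provides eunomiaDir() and cdmFromCon()
con <- dbConnect(duckdb(), dbdir = eunomiaDir())
cdm <- cdmFromCon(
  con = con, cdmSchema = "main", writeSchema = "main", cdmName = "Eunomia"
)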
cdm
#>
#> ── # OMOP CDM reference (duckdb) of Eunomia ────────────────────────────────────
#> • omop tables: person, observation_period, visit_occurrence, visit_detail,
#> condition_occurrence, drug_exposure, procedure_occurrence, device_exposure,
#> measurement, observation, death, note, note_nlp, specimen, fact_relationship,
#> location, care_site, provider, payer_plan_period, cost, drug_era, dose_era,
#> condition_era, metadata, cdm_source, concept, vocabulary, domain,
#> concept_class, concept_relationship, relationship, concept_synonym,
#> concept_ancestor, source_to_concept_map, drug_strength
#> • cohort tables: -
#> • achilles tables: -
#> • other tables: -
Now we’ll create a codelist with measurement concepts.
library(omopgenerics)  # provides newCodelist()
repiratory_function_codes <- newCodelist(list("respiratory function" = c(4052083, 4133840, 3011505)))
repiratory_function_codes
#>
#> - respiratory function (3 codes)
For a general summary of the use of these codes in our dataset we can use summariseCodeUse from the CodelistGenerator R package.
library(CodelistGenerator)
code_use <- summariseCodeUse(repiratory_function_codes, cdm)
tableCodeUse(code_use)
Database name: Eunomia
Codelist name | Standard concept name | Standard concept ID | Source concept name | Source concept ID | Source concept value | Domain ID | Record count | Person count |
---|---|---|---|---|---|---|---|---|
respiratory function | overall | - | NA | NA | NA | NA | 8,728 | 2,096 |
 | FEV1/FVC | 3011505 | FEV1/FVC | 3011505 | 19926-5 | measurement | 2,320 | 125 |
 | Spirometry | 4133840 | Spirometry | 4133840 | 127783003 | measurement | 2,320 | 125 |
 | Measurement of respiratory function | 4052083 | Measurement of respiratory function | 4052083 | 23426006 | measurement | 4,088 | 2,072 |
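As a quick sanity check on the table above, the overall record count should equal the sum of the per-concept record counts, whereas person counts are not additive because one person can have records for several concepts. The following is a minimal base R sketch with the counts copied by hand from the table; the `code_counts` data frame is purely illustrative and is not an object returned by CodelistGenerator.

```r
# Counts copied by hand from the table above (illustrative data frame,
# not an object returned by summariseCodeUse()).
code_counts <- data.frame(
  concept_id   = c(3011505, 4133840, 4052083),
  record_count = c(2320, 2320, 4088),
  person_count = c(125, 125, 2072)
)
overall_records <- 8728
overall_persons <- 2096

# Record counts add up across concepts to the overall record count.
records_match <- sum(code_counts$record_count) == overall_records
records_match

# Person counts do not add up: the per-concept sum exceeds the overall count,
# so some individuals have records for more than one concept.
person_overlap <- sum(code_counts$person_count) - overall_persons
person_overlap
```

A positive `person_overlap` is expected here and is not a sign of a problem; it simply reflects individuals who appear under more than one concept.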
Although we now have a general summary of the use of our measurement codes, we may well want more information on these measurements to inform study feasibility and design.
MeasurementDiagnostics helps us perform additional, measurement-specific diagnostic checks. For this we simply call the summariseMeasurementUse() function, which runs a series of checks.
library(MeasurementDiagnostics)
repiratory_function_measurements <- summariseMeasurementUse(cdm, repiratory_function_codes)
As with similar packages, our results are returned in the summarised_result format as defined by the omopgenerics package.
library(dplyr)  # provides glimpse() and pull()
repiratory_function_measurements |>
  glimpse()
#> Rows: 47
#> Columns: 13
#> $ result_id <int> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,…
#> $ cdm_name <chr> "Eunomia", "Eunomia", "Eunomia", "Eunomia", "Eunomia"…
#> $ group_name <chr> "codelist_name", "codelist_name", "codelist_name", "c…
#> $ group_level <chr> "respiratory function", "respiratory function", "resp…
#> $ strata_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ strata_level <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ variable_name <chr> "number records", "number subjects", "time", "time", …
#> $ variable_level <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ estimate_name <chr> "count", "count", "min", "q25", "median", "q75", "max…
#> $ estimate_type <chr> "integer", "integer", "numeric", "numeric", "numeric"…
#> $ estimate_value <chr> "8728", "2096", "0", "0", "371", "1726.25", "33541", …
#> $ additional_name <chr> "overall", "overall", "overall", "overall", "overall"…
#> $ additional_level <chr> "overall", "overall", "overall", "overall", "overall"…
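Note that estimate_value is stored as character, with estimate_type recording the intended type, so values need converting before any arithmetic. The sketch below rebuilds, by hand, the first seven rows shown in the glimpse() output and computes the interquartile range of the time estimate. The `timings` data frame is illustrative only, and the entries truncated in the output above (the later variable_name and estimate_type values) are assumed to continue the visible pattern.

```r
# First seven rows of the result, typed in by hand from the glimpse() output.
# ASSUMPTION: values truncated in that output (variable_name and estimate_type
# for the last rows) follow the pattern of the visible ones.
timings <- data.frame(
  variable_name  = c("number records", "number subjects", rep("time", 5)),
  estimate_name  = c("count", "count", "min", "q25", "median", "q75", "max"),
  estimate_type  = c("integer", "integer", rep("numeric", 5)),
  estimate_value = c("8728", "2096", "0", "0", "371", "1726.25", "33541"),
  stringsAsFactors = FALSE
)

# estimate_value is character, so convert it before doing arithmetic.
timings$value <- as.numeric(timings$estimate_value)

# Interquartile range of the time estimate: q75 minus q25.
iqr_time <- timings$value[timings$estimate_name == "q75"] -
  timings$value[timings$estimate_name == "q25"]
iqr_time
```

Keeping the original character column alongside the converted one makes it easy to check that nothing was coerced to NA.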
We can see each of the checks performed.
settings(repiratory_function_measurements) |>
  pull("result_type") |>
  unique()
#> [1] "measurement_timings" "measurement_value_as_numeric"
#> [3] "measurement_value_as_concept"
One of the checks summarises the numeric values associated with tests. We can quickly create a table summarising these results.
CDM name | Concept name | Concept ID | Domain ID | Unit concept name | Unit concept ID | Estimate name | Estimate value |
---|---|---|---|---|---|---|---|
respiratory function | | | | | | | |
Eunomia | overall | overall | overall | No matching concept | 0 | N | 8,728 |
 | | | | | | Median [Q25 - Q75] | - |
 | | | | | | Range | - |
 | | | | | | Missing value, N (%) | 8,728 (100.00%) |
 | Measurement of respiratory function | 4052083 | Measurement | No matching concept | 0 | N | 4,088 |
 | | | | | | Median [Q25 - Q75] | - |
 | | | | | | Range | - |
 | | | | | | Missing value, N (%) | 4,088 (100.00%) |
 | FEV1/FVC | 3011505 | Measurement | No matching concept | 0 | N | 2,320 |
 | | | | | | Median [Q25 - Q75] | - |
 | | | | | | Range | - |
 | | | | | | Missing value, N (%) | 2,320 (100.00%) |
 | Spirometry | 4133840 | Measurement | No matching concept | 0 | N | 2,320 |
 | | | | | | Median [Q25 - Q75] | - |
 | | | | | | Range | - |
 | | | | | | Missing value, N (%) | 2,320 (100.00%) |
Similarly, we can see a summary of the concept values associated with measurements. The table below shows that our respiratory function measurements have no concept value results: every record maps to "No matching concept". As the table above shows, they have no numeric values recorded either, since the numeric value is missing for 100% of records.
CDM name | Concept name | Concept ID | Domain ID | Variable name | Value as concept name | Value as concept ID | Estimate name | Estimate value |
---|---|---|---|---|---|---|---|---|
respiratory function | | | | | | | | |
Eunomia | overall | overall | overall | Value as concept name | No matching concept | 0 | N (%) | 8,728 (100.00%) |
 | FEV1/FVC | 3011505 | Measurement | Value as concept name | No matching concept | 0 | N (%) | 2,320 (100.00%) |
 | Spirometry | 4133840 | Measurement | Value as concept name | No matching concept | 0 | N (%) | 2,320 (100.00%) |
 | Measurement of respiratory function | 4052083 | Measurement | Value as concept name | No matching concept | 0 | N (%) | 4,088 (100.00%) |
As well as an overview of measurement values, we can also see a summary of the time between measurements for individuals in the dataset.
CDM name | Variable name | Estimate name | Estimate value |
---|---|---|---|
respiratory function | | | |
Eunomia | Number records | N | 8,728 |
 | Number subjects | N | 2,096 |
 | Time | Median [Q25 - Q75] | 371.00 [0.00 - 1,726.25] |
 | | Range | 0.00 to 33,541.00 |
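To make the timing estimates easier to read, they can be rescaled. The sketch below converts the values from the table above into years, under the assumption that time is reported in days; the unit is not stated in the text above, so treat the conversion as illustrative. The names `time_between` and `days_per_year` are introduced here for the example only.

```r
# Timing estimates copied by hand from the table above.
time_between <- c(q25 = 0, median = 371, q75 = 1726.25, max = 33541)

# ASSUMPTION: the estimates are in days (the unit is not stated above).
days_per_year <- 365.25

# Rescale to years, rounded to one decimal place.
years_between <- round(time_between / days_per_year, 1)
years_between
```

Whatever the unit, a lower quartile of zero means that at least a quarter of the gaps between consecutive measurements are zero, which is worth keeping in mind when deciding how to handle repeated measurements in a study.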