---
title: 'Supplement to: "HaDeX: Analysis and Visualisation of Hydrogen/Deuterium Exchange Mass Spectrometry Data"'
author: "Weronika Puchała, Michał Burdukiewicz, Michał Kistowski, Katarzyna A. Dąbrowska, Aleksandra E. Badaczewska-Dawid, Dominik Cysewski, Michał Dadlez"
date: "20.06.2019"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
always_allow_html: yes
bibliography: HDX.bib
vignette: >
  %\VignetteIndexEntry{HaDeX: Analysis and Visualisation of Hydrogen/Deuterium Exchange Mass Spectrometry Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, echo = FALSE, message = FALSE, warning = FALSE, results='asis'}
library(HaDeX)
library(ggplot2)
library(knitr)
library(DT)
library(dplyr)
opts_chunk$set(fig.width = 7, fig.height = 5)
knitr::opts_chunk$set(dev = "png", dev.args = list(type = "cairo-png"))
```

# About HaDeX

HaDeX is a novel tool for processing, analysis, and visualization of HDX-MS experiments. HaDeX covers the final parts of the analytic process, including a comparison of experiments, quality control and generation of publication-quality figures. To make the HaDeX **R** package available to the less **R**-fluent users, we enhanced it with a comprehensive Graphical User Interface available as a HaDeX GUI. The reproducibility of the whole procedure is ensured with advanced reporting functions.

The GUI is available online: http://mslab-ibb.pl/shiny/HaDeX/ or can be installed locally on Windows systems: https://sourceforge.net/projects/hadex/files/HaDeX_setup.exe/download. Alternatively, R-fluent users can access the GUI through the [HaDeX_gui()](https://hadexversum.github.io/HaDeX/reference/HaDeX_gui.html) function.

This document covers the main functionalities of both the **R** package and the GUI.

## Comparison of the existing HDX-MS software

To show the novelty of HaDeX, we compare its functionalities with other relatively new software for analysis of HDX-MS data: MEMHDX [@HourdelMEMHDXinteractivetool2016] and Deuteros [@LauDeuterossoftwarerapid2019].

```{r echo=FALSE,results='asis'}
read.csv2("comparison.csv") %>%
  datatable(options = list(dom = "t", ordering = FALSE, paging = FALSE), rownames = FALSE, style = "bootstrap") %>%
  formatStyle(c("MSTools", "MEMHDX", "Deuteros", "HaDeX"), backgroundColor = styleEqual(c("Yes", "No"), c("#00BFFF", "#FF8C91")))
```

We have not considered HDX Workbench [@PascalHDXWorkbenchSoftware2012] as it deals with the preliminary steps of the analysis. This comparison also does not cover a versatile structural visualisation tool for HDX-MS results, HDX-Viewer [@BouyssieHDXViewerinteractive3D2019], as its scope is different from the tools mentioned above.

**Web server:** the software is available as a web server.

**Programmatic access:** analytic functionalities are documented and available from a command line.

**Desktop software:** the software can be installed locally.

**Multi-state analysis:** the software supports comparisons of more than two states.

**ISO-based uncertainty:** analytic functions produce ISO-compatible uncertainty intervals.

**Coverage and peptide overlap:** an overview of the experimental sequence coverage is available in a user-friendly way.

**Quality control:** additional information about the course of the experiment.

**N- and C-terminal length corrections:** manual correction of the sequence length.

**Global visualization of deuterium uptake:** deuterium uptake for different states is shown together for comparison.

**Woods plot:** the deuteration difference between chosen states is shown in the format of a Woods plot.

**Zooming of the Woods plot:** the Woods plot can be zoomed in.

**Customizable label names and colors:** labels and colors on the plot can be changed by the user.

**Peptide kinetics chart:** kinetics plots (deuteration change in time) are available for each peptide.

**3D structure visualization:** the structure of the protein is visualized in 3D.

**Downloadable charts:** charts are downloadable, preferably in a vector format (.eps, .svg or .pdf).

**Deuterium uptake download:** data shown on the Woods plot are downloadable (e.g. as a CSV file).

**Downloadable results of intermediate computations:** results of intermediate computations (e.g. pure deuteration data) are downloadable.

**Report generation:** generates a report (e.g. in HTML format) with the results of the analysis and its parametrization.

**PyMol export:** exports data to the [PyMol](https://pymol.org/2/) format.

**HDX Data Summary:** summary of the experimental data [@MassonRecommendationsperforminginterpreting2019].

# HaDeX functionalities

## Data import

The HaDeX web server works only on data in the DynamX\textsuperscript{TM} datafile format (Waters Corp.). Data from other sources can also be adapted to the format accepted by HaDeX, provided they contain the following columns:

```{r warning=FALSE, message=FALSE, echo = FALSE}
datatable(
  data = data.frame("Column Name" = c("Protein", "Start", "End", "Sequence", "Modification", "Fragment", "MaxUptake", "MHP", "State", "Exposure", "File", "z", "RT", "Inten", "Center"),
                    "Column Type" = c("Character", "Integer", "Integer", "Character", "Logical", "Logical", "Numeric", "Numeric", "Character", "Numeric", "Character", "Integer", "Numeric", "Numeric", "Numeric")),
  rownames = FALSE,
  style = "bootstrap",
  options = list(dom = "t", ordering = FALSE, paging = FALSE, autoWidth = TRUE))
```

Although data can be imported into R using other tools, we strongly advise relying on the [read_hdx()](https://hadexversum.github.io/HaDeX/reference/read_hdx.html) function:

```{r warning=FALSE}
dat <- read_hdx(system.file(package = "HaDeX", "HaDeX/data/KD_190304_Nucb2_EDTA_CaCl2_test02_clusterdata.csv"))
```

Currently,
[read_hdx()](https://hadexversum.github.io/HaDeX/reference/read_hdx.html) supports .csv, .tsv and .xls files that follow the data structure described above. Files with data from multiple proteins are supported. The user needs to indicate which protein is of interest for calculations.

## Computation of deuteration levels

The computation of the level of deuteration involves several pre-processing steps, all of which are described in this section. These steps are performed automatically in the GUI or by the [prepare_dataset()](https://hadexversum.github.io/HaDeX/reference/prepare_dataset.html) function in the console.

### Measured data into overall peptide mass

The results of HDX-MS measurements in the DynamX data files are reported as the mass-to-charge ratio of the peptide ion ($Center$), that is, the measured peptide mass plus the mass of $z$ protons, divided by the charge $z$. For later use, this value has to be transformed into the overall mass of a peptide measured after a specific time point from a protein in a specific state, as shown in equation 1:

$$pepMass = z \times (Center-protonMass)\tag{1}$$

where:

- $pepMass$ - expected mass of the peptide after incubation (Da),
- $protonMass$ - the mass of the proton (Da),
- $z$ - charge of the peptide,
- $Center$ - experimentally measured peptide mass plus proton mass to charge ratio $\left(\frac{m}{z}\right)$.

HDX-MS experiments are often repeated (as a rule of thumb, at least three times). Thus, we aggregate the results of replicates as a weighted mean mass into a single result per peptide using equation 2:

$$aggMass = \sum_{k = 1}^{N}\frac{Inten_k}{N}{\times}pepMass_k\tag{2}$$

where:

- $aggMass$ - weighted mean mass of the peptide (Da),
- $k$ - replicate index,
- $Inten$ - intensity,
- $N$ - number of replicates.

This aggregation is required by the structure of the original data. Each repetition of the measurement reports a given peptide at a given time point once per observed value of $z$, as shown in the example below.
We need to use the value of $pepMass$ as shown in equation 1 but still keep the information about the original measurement, which is why we use the weighted mass.

```{r warning=FALSE, message=FALSE, echo=FALSE}
dat_temp <- read.csv(system.file(package = "HaDeX", "HaDeX/data/KD_190304_Nucb2_EDTA_CaCl2_test02_clusterdata.csv"))
dat_temp %>%
  filter(File == "KD_190119_gg_Nucb2_CaCl2_10s_01", Sequence == "KQFEHLNHQNPDTFEPKDLDML", Exposure == 0.167) %>%
  select(Sequence, File, z, RT, Inten, Center)
```

The uncertainty of a measurement is the variability associated with the precision of the measuring instrumentation. We present here a novel derivation of uncertainty formulas for HDX-MS data according to the ISO guidelines [@jcgm:2008:EMDG].

Input files always encompass results of more than one measurement. We assume that replicates are uncorrelated, as they come from different samples. Therefore, we average measurements of replicates for each time point and for all protein states. Thus, we compute the peptide mass uncertainty $u$ as the uncertainty of the aggregate estimate using the formula for the standard deviation of the mean:

$$u(x) = \sqrt{\frac{ \sum_{i=1}^n \left( x_{i} - \overline{x} \right)^2}{n(n-1)}}\tag{3}$$

where:

- $x_{i}$ - measurement,
- $\overline{x}$ - mean value,
- $n$ - number of measurements.

After obtaining the mass of the peptide, we can compute the deuteration level depending on the chosen maximum deuteration level. The maximum deuteration can be computed in two different ways: either as *theoretical* (where the maximum deuteration depends on the theoretical deuteration levels) or as *experimental* (where the maximum deuteration is assumed to be equal to the deuteration measured at the last time point).

### Experimental deuteration level

The experimental deuteration level is computed as the deuteration level of the peptide from a protein in a specific state and after incubation time $t$ compared to the deuteration level measured at the start of the incubation ($t_0$).
It yields a value for the chosen state and chosen time $t$.

$$D = D_{t} - D_{t_0}\tag{4}$$

where:

- $D$ - deuteration level (Da),
- $D_{t}$ - deuteration measured at the chosen time point $t$ (Da),
- $D_{t_0}$ - experimentally measured deuteration at the beginning of the incubation (0 or close to 0).

Equation 4 yields only absolute deuteration levels. The computation of relative deuteration levels follows a similar logic and is normalized by the difference of deuteration between the start ($t_0$) and the end of the experiment ($t_{out}$), as shown in equation 4a:

$$D = \frac{D_{t} - D_{t_0}}{D_{t_{out}} - D_{t_0}}\tag{4a}$$

All functions in the HaDeX package contain the logical parameter `relative` to determine if they should return absolute or relative deuteration levels.

#### Uncertainty calculations

We describe the methodology of the uncertainty calculations for relative deuteration levels. The uncertainty for absolute deuteration levels is computed similarly, but without scaling.

To calculate the uncertainty of a function of more than one variable (e.g., equation 4), we use the law of propagation of uncertainty, defined by equation 5:

$$u_{c}(y) = \sqrt{\sum_{k} \left[ \frac{\partial y}{\partial x_{k}} u(x_{k}) \right]^2}\tag{5}$$

As the variable of interest is $D$, we apply the general formula to the deuteration level $D$ (equation 6):

$$u_{c}(D) = \sqrt{\sum_{k} \left[ \frac{\partial D}{\partial D_{k}} u(D_{k}) \right]^2 }\tag{6}$$

where:

- $k \in \{t_0, t, t_{out}\}$,
- $D_{k}$ - deuteration at time $k$ (Da),
- $u(D_{k})$ - the uncertainty associated with $D_{k}$, expressed as the standard deviation of the mean value.

Then, expanding equation 6 for the relative deuteration level from equation 4a:

$$u_{c}(D) = \sqrt{ \left[ \frac{1}{D_{t_{out}}-D_{t_0}} u(D_{t}) \right]^2 + \left[ \frac{D_{t} - D_{t_{out}}}{(D_{t_{out}}-D_{t_0})^2} u(D_{t_0}) \right]^2 + \left[ \frac{D_{t_0} - D_{t}}{(D_{t_{out}}-D_{t_0})^2} u (D_{t_{out}}) \right]^2}\tag{7}$$

As expected, the uncertainty associated with $D_{t}$ has the biggest impact on $u_{c}(D)$.

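Equations 4a and 7 can be checked numerically with a few lines of base R. The chunk below is only a sketch: `relative_deuteration_sketch()` is a hypothetical helper, not a function of the HaDeX package, and the masses and uncertainties are invented values, not measurements.

```{r}
# Hypothetical helper (not part of HaDeX): relative experimental deuteration
# (equation 4a) and its combined uncertainty (equation 7).
relative_deuteration_sketch <- function(d_t, d_0, d_out, u_t, u_0, u_out) {
  span <- d_out - d_0
  # equation 4a
  d <- (d_t - d_0) / span
  # equation 7: one squared term per partial derivative of D
  u <- sqrt((u_t / span)^2 +
            ((d_t - d_out) / span^2 * u_0)^2 +
            ((d_0 - d_t) / span^2 * u_out)^2)
  c(D = d, u_D = u)
}

# invented example values (Da), chosen only for illustration
sketch_res <- relative_deuteration_sketch(d_t = 1003, d_0 = 1000, d_out = 1006,
                                          u_t = 0.02, u_0 = 0.01, u_out = 0.03)
sketch_res
```

Keeping the three squared terms separate mirrors the structure of equation 7 and makes it easy to inspect how much each measurement contributes for a given set of inputs.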
### Theoretical deuteration level

As opposed to the experimental deuteration level, the theoretical deuteration level depends only partially on the experimental data. Here, the maximum deuteration level is based on a hypothetical peptide where all hydrogens were replaced by deuterons, as shown in equation 8:

$$D = \frac{D_{t}-MHP}{MaxUptake \times protonMass}\tag{8}$$

where:

- $D_{t}$ - deuteration measured in a chosen time point (Da),
- $MHP$ - theoretical mass of the peptide (constant) (Da),
- $MaxUptake$ - the maximum proton uptake for the peptide (theoretical constant) (Da),
- $protonMass$ - mass of a proton (constant) (Da).

The absolute deuteration level is calculated as in equation 8 but without scaling (equation 8a):

$$D = D_{t} - MHP\tag{8a}$$

#### Uncertainty calculations

For a function of one variable, the law of propagation of uncertainty (equation 5) reduces to:

$$u(y) = \left| \frac{dy}{dx} u(x) \right|.\tag{9}$$

Substituting $D$ from equation 8, we have

$$u(D) = \left|\frac{1}{MaxUptake \times protonMass} u(D_{t}) \right|\tag{10}$$

For the absolute values, $u(D)$ is identical to $u(D_{t})$, based on equations 8a and 9.

## Difference of deuteration levels between two states

The differences of deuteration levels between two states are associated with a different level of protection of hydrogens. Therefore, we are especially interested in the differential analysis of the deuteration levels.
Thus, the deuteration level in one state $(D_{2})$ is subtracted from the deuteration level in the other state $(D_{1})$:

$$diff = D_{1} - D_{2}\tag{11}$$

and the uncertainty is a function of two variables (based on equations 5 and 11):

$$u_{c}(diff) = \sqrt{u(D_{1})^2 + u(D_{2})^2}\tag{12}$$

```{r warning=FALSE}
calc_dat <- prepare_dataset(dat,
                            in_state_first = "gg_Nucb2_EDTA_0.001",
                            chosen_state_first = "gg_Nucb2_EDTA_25",
                            out_state_first = "gg_Nucb2_EDTA_1440",
                            in_state_second = "gg_Nucb2_CaCl2_0.001",
                            chosen_state_second = "gg_Nucb2_CaCl2_25",
                            out_state_second = "gg_Nucb2_CaCl2_1440")
```

## Visual data analysis

### Comparison of states

Comparison plots show the deuteration level of all peptides in selected states at a given time point. The x-axis represents positions of amino acids in the sequence. The y-axis shows the deuteration level, expressed either as a *relative* or *absolute* deuteration. The chart below shows peptide deuteration after 25 minutes of incubation along with the confidence intervals.

-- theoretical:

- relative values:

```{r warning=FALSE}
comparison_plot(calc_dat = calc_dat,
                theoretical = TRUE,
                relative = TRUE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Theoretical fraction exchanged in state comparison in 25 min time")
```

- absolute values:

```{r warning=FALSE}
comparison_plot(calc_dat = calc_dat,
                theoretical = TRUE,
                relative = FALSE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Theoretical fraction exchanged in state comparison in 25 min time")
```

-- experimental:

- relative values:

```{r warning=FALSE}
comparison_plot(calc_dat = calc_dat,
                theoretical = FALSE,
                relative = TRUE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Fraction exchanged in state comparison in 25 min time")
```

- absolute values:

```{r warning=FALSE}
comparison_plot(calc_dat = calc_dat,
                theoretical = FALSE,
                relative = FALSE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Fraction exchanged in state comparison in 25 min time")
```

## Woods plot

Woods plots show the difference between the deuteration of all peptides in two different states at a specific time point, as described by equation 11.

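The value drawn for each peptide on a Woods plot is the difference from equation 11, with the combined uncertainty from equation 12. A minimal base-R sketch of that arithmetic follows; `deuteration_difference_sketch()` is a hypothetical helper, not a HaDeX function, and the input values are invented for illustration.

```{r}
# Hypothetical helper (not part of HaDeX): difference of deuteration levels
# between two states (equation 11) and its uncertainty (equation 12).
deuteration_difference_sketch <- function(d_1, u_1, d_2, u_2) {
  c(diff = d_1 - d_2,
    u_diff = sqrt(u_1^2 + u_2^2))
}

# invented relative deuteration levels and uncertainties for one peptide
sketch_diff <- deuteration_difference_sketch(d_1 = 0.50, u_1 = 0.003,
                                             d_2 = 0.42, u_2 = 0.004)
sketch_diff
```

Because the uncertainties add in quadrature, the uncertainty of the difference is never smaller than the larger of the two input uncertainties.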
Similarly to the comparison plot, HaDeX provides both experimental and theoretical deuteration levels using either relative or absolute values:

-- theoretical:

- relative values:

```{r warning=FALSE}
woods_plot(calc_dat = calc_dat,
           theoretical = TRUE,
           relative = TRUE) +
  labs(title = "Theoretical fraction exchanged between states in 25 min time")
```

- absolute values:

```{r warning=FALSE}
woods_plot(calc_dat = calc_dat,
           theoretical = TRUE,
           relative = FALSE) +
  labs(title = "Theoretical fraction exchanged between states in 25 min time")
```

-- experimental:

- relative values:

```{r warning=FALSE }
woods_plot(calc_dat = calc_dat,
           theoretical = FALSE,
           relative = TRUE) +
  labs(title = "Fraction exchanged between states in 25 min time")
```

- absolute values:

```{r warning=FALSE }
woods_plot(calc_dat = calc_dat,
           theoretical = FALSE,
           relative = FALSE) +
  labs(title = "Fraction exchanged between states in 25 min time")
```

### Confidence limit in Woods plot

The function [calculate_confidence_limit_values()](https://hadexversum.github.io/HaDeX/reference/calculate_confidence_limit_values.html) calculates confidence limit values as described elsewhere [@HoudeUtilityHydrogenDeuterium2011].

```{r}
calculate_confidence_limit_values(calc_dat = calc_dat,
                                  confidence_limit = 0.99,
                                  theoretical = FALSE,
                                  relative = TRUE)
```

The function [add_stat_dependency()](https://hadexversum.github.io/HaDeX/reference/add_stat_dependency.html) returns the data extended by a column describing the relation of each peptide to the confidence limit.

```{r}
add_stat_dependency(calc_dat,
                    confidence_limit = 0.98,
                    theoretical = FALSE,
                    relative = TRUE)
```

## Kinetic plots

By the term *kinetics* we understand the change of deuteration in time. Deuteration values per peptide for different time points are calculated with the [calculate_kinetics()](https://hadexversum.github.io/HaDeX/reference/calculate_kinetics.html) function.
This function uses the [calculate_state_deuteration()](https://hadexversum.github.io/HaDeX/reference/calculate_state_deuteration.html) function.

```{r warning = FALSE}
(kin_YYDEYL_gg_Nucb2_CaCl2 <- calculate_kinetics(dat = dat,
                                                 protein = "db_Nucb2",
                                                 sequence = "YYDEYL",
                                                 state = "gg_Nucb2_CaCl2",
                                                 start = 45,
                                                 end = 50,
                                                 time_in = 0.001,
                                                 time_out = 1440))
```

```{r warning = FALSE}
(kin_YYDEYL_gg_Nucb2_EDTA <- calculate_kinetics(dat = dat,
                                                protein = "db_Nucb2",
                                                sequence = "YYDEYL",
                                                state = "gg_Nucb2_EDTA",
                                                start = 45,
                                                end = 50,
                                                time_in = 0.001,
                                                time_out = 1440))
```

Calculated data can be shown next to each other on the plot for comparison. To visualize kinetic data we recommend the [plot_kinetics()](https://hadexversum.github.io/HaDeX/reference/plot_kinetics.html) function:

-- theoretical:

- relative values:

```{r warning = FALSE}
bind_rows(kin_YYDEYL_gg_Nucb2_CaCl2, kin_YYDEYL_gg_Nucb2_EDTA) %>%
  plot_kinetics(theoretical = TRUE,
                relative = TRUE)
```

- absolute values:

```{r warning = FALSE}
bind_rows(kin_YYDEYL_gg_Nucb2_CaCl2, kin_YYDEYL_gg_Nucb2_EDTA) %>%
  plot_kinetics(theoretical = TRUE,
                relative = FALSE)
```

-- experimental:

- relative values:

```{r warning = FALSE}
bind_rows(kin_YYDEYL_gg_Nucb2_CaCl2, kin_YYDEYL_gg_Nucb2_EDTA) %>%
  plot_kinetics(theoretical = FALSE,
                relative = TRUE)
```

- absolute values:

```{r warning = FALSE}
bind_rows(kin_YYDEYL_gg_Nucb2_CaCl2, kin_YYDEYL_gg_Nucb2_EDTA) %>%
  plot_kinetics(theoretical = FALSE,
                relative = FALSE)
```

## Additional tools

HaDeX provides additional tools for assessment of experiments.

### Peptide coverage

The sequence of the protein(s) is reconstructed from the peptides from the input file. Thus, amino acids not covered by peptides are marked as X according to the IUPAC convention. The sequence is reconstructed using the [reconstruct_sequence()](https://hadexversum.github.io/HaDeX/reference/reconstruct_sequence.html) function.

```{r warning=FALSE}
reconstruct_sequence(dat)
```

Additionally, the coverage of peptides can be presented on a chart using the [plot_coverage()](https://hadexversum.github.io/HaDeX/reference/plot_coverage.html) and [plot_position_frequency()](https://hadexversum.github.io/HaDeX/reference/plot_position_frequency.html) functions.

```{r warning=FALSE}
plot_coverage(dat, chosen_state = "gg_Nucb2_CaCl2")
plot_position_frequency(dat, chosen_state = "gg_Nucb2_CaCl2")
```

The user can choose which state (or states) should be included in these plots. If this parameter is not provided, the first possible state is chosen. If a given peptide is available in more than one state, it is shown only once.

### Quality control

The function [quality_control()](https://hadexversum.github.io/HaDeX/reference/quality_control.html) plots the change in the uncertainty of deuteration levels as a function of incubation time. The uncertainty is averaged over all peptides available at a given time point in a selected state. This chart serves two purposes: first, it allows checking whether the measurement uncertainty decreases over time (which is the expected behavior); second, it helps to plan an appropriate incubation length for the tested protein (that is, whether the desired data reliability is reached).

In HDX-MS experiments, the average uncertainty of measurements decreases over time as the sample becomes more and more deuterated (even accounting for the back-exchange). Therefore, if the level of uncertainty is stable, the exchange is still ongoing and it would be advisable to prolong the reaction. Moreover, the user can detect the time point after which the decrease of the deuteration uncertainty becomes too marginal to justify prolonging the measurements.

This function is most useful in the case of multiple measurements of the same or very similar proteins because it helps to optimize the duration of the incubation.

The result of this function can be easily visualized.

Example:

```{r warning=FALSE}
result <- quality_control(dat = dat,
                          state_first = "gg_Nucb2_EDTA",
                          state_second = "gg_Nucb2_CaCl2",
                          chosen_time = 1,
                          in_time = 0.001)
```

```{r warning=FALSE}
ggplot(result) +
  geom_line(aes(x = out_time, y = avg_err_state_first, color = "Average error (first state)")) +
  geom_line(aes(x = out_time, y = avg_err_state_second, color = "Average error (second state)")) +
  scale_x_log10() +
  labs(x = "log(time) [min]", y = "Average uncertainty", title = "Uncertainty change") +
  theme_bw(base_size = 11) +
  theme(legend.position = "bottom",
        legend.title = element_blank())
```

The chart below shows how we interpret the three most common results of quality control.

```{r warning=FALSE,echo=FALSE}
example_qc <- rbind(data.frame(x = c(10, 25, 60, 1440),
                               y = c(0.008, 0.0075, 0.007, 0.0065),
                               type = "Uncertainty decreases too slowly\nExperiment should be prolonged",
                               Assessment = "Alter experimental settings"),
                    data.frame(x = c(10, 25, 60, 1440),
                               y = c(0.008, 0.001, 0.001, 0.001),
                               type = "Uncertainty decreases too quickly\nExperiment should have more early timepoints",
                               Assessment = "Alter experimental settings"),
                    data.frame(x = c(10, 25, 60, 1440),
                               y = c(0.008, 0.004, 0.003, 0.001),
                               type = "Uncertainty decreases properly",
                               Assessment = "Experiment conducted properly"))

ggplot(example_qc, aes(x = x, y = y, color = Assessment)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ type, ncol = 1) +
  theme_bw() +
  theme(legend.position = "bottom")
```

# HaDeX Graphical User Interface

The HaDeX Shiny app is launched by the [HaDeX_gui()](https://hadexversum.github.io/HaDeX/reference/HaDeX_gui.html) function or is available at the MS Lab website: http://mslab-ibb.pl/shiny/HaDeX/.

In the Input Data tab the user can adjust the parameters that are propagated through the whole analysis.

# Examples

## Example 1: CD160-HVEM

The interaction between HVEM and the [CD160](https://www.uniprot.org/uniprot/O95971) receptor was measured with HDX-MS.
Firstly, we read the input data, exactly as provided by DynamX\textsuperscript{TM} 3.0 (Waters Corp.).

```{r warning=FALSE}
library(HaDeX)

# file import
dat_1 <- read_hdx(system.file(package = "HaDeX", "HaDeX/data/KD_180110_CD160_HVEM.csv"))
```

Then, we reconstruct the protein sequence from the peptides measured during the experiment. We observe that the region from amino acid 107 to amino acid 124 is not covered by any peptide.

```{r warning=FALSE}
reconstruct_sequence(dat_1)
plot_position_frequency(dat_1, chosen_state = "CD160")
```

The theoretical plot reveals regions that exchange quickly (the N-terminus, regions between amino acids 30 and 70) and regions that exchange slowly (peptides 15-24 and 95-110). We can also see the differences in the exchange between the two states, indicating regions that changed upon binding to the other protein.

```{r warning=FALSE}
# calculate data
calc_dat_1 <- prepare_dataset(dat = dat_1,
                              in_state_first = "CD160_0.001",
                              chosen_state_first = "CD160_1",
                              out_state_first = "CD160_1440",
                              in_state_second = "CD160_HVEM_0.001",
                              chosen_state_second = "CD160_HVEM_1",
                              out_state_second = "CD160_HVEM_1440")

# theoretical comparison plot - relative values
comparison_plot(calc_dat = calc_dat_1,
                theoretical = TRUE,
                relative = TRUE,
                state_first = "CD160",
                state_second = "CD160_HVEM")
```

The experimental plot shows regions that exchange quickly (the N-terminal part, regions between amino acids 30 and 70) and regions that exchange slowly (peptides 15-24, 30-35 and 95-110).

```{r warning=FALSE}
# experimental comparison plot - relative values
comparison_plot(calc_dat = calc_dat_1,
                theoretical = FALSE,
                relative = TRUE,
                state_first = "CD160",
                state_second = "CD160_HVEM")
```

The plots below are equivalent to the plots above but use absolute values, for users who prefer them.

```{r warning=FALSE}
# theoretical comparison plot - absolute values
comparison_plot(calc_dat = calc_dat_1,
                theoretical = TRUE,
                relative = FALSE,
                state_first = "CD160",
                state_second = "CD160_HVEM")

# experimental comparison plot - absolute values
comparison_plot(calc_dat = calc_dat_1,
                theoretical = FALSE,
                relative = FALSE,
                state_first = "CD160",
                state_second = "CD160_HVEM")
```

The plot below shows peptides for which levels of deuteration were significantly lower upon binding to the other protein (red) and peptides for which levels of exchange were significantly higher upon binding to the other protein (blue). The plot also shows peptides in which no significant changes in deuteration between states are visible (grey). The biggest changes on the theoretical plot can be observed in 3 peptides (15-24, 30-35, 75-90).

```{r warning=FALSE}
# theoretical Woods plot - relative values
woods_plot(calc_dat = calc_dat_1,
           theoretical = TRUE,
           relative = TRUE) +
  coord_cartesian(ylim = c(-.2, .2))
```

The biggest changes on the experimental plot can be observed in 3 peptides (15-24, 110-115).

```{r warning=FALSE}
# experimental Woods plot - relative values
woods_plot(calc_dat = calc_dat_1,
           theoretical = FALSE,
           relative = TRUE) +
  coord_cartesian(ylim = c(-.2, .2))
```

The plots below show the results in absolute values.

```{r warning=FALSE}
# theoretical Woods plot - absolute values
woods_plot(calc_dat = calc_dat_1,
           theoretical = TRUE,
           relative = FALSE) +
  labs(title = "Theoretical fraction exchanged between states in 1 min time")

# experimental Woods plot - absolute values
woods_plot(calc_dat = calc_dat_1,
           theoretical = FALSE,
           relative = FALSE) +
  labs(title = "Fraction exchanged between states in 1 min time")

# quality control - relative values
(result <- quality_control(dat = dat_1,
                           state_first = "CD160",
                           state_second = "CD160_HVEM",
                           chosen_time = 1,
                           in_time = 0.001))

# example quality control visualisation
library(ggplot2)
ggplot(result) +
  geom_line(aes(x = out_time, y = avg_err_state_first, color = "Average error (first state)")) +
  geom_line(aes(x = out_time, y = avg_err_state_second, color = "Average error (second state)")) +
  scale_x_log10() +
  ylim(0, 0.05) +
  labs(x = "log(time) [min]", y = "Average uncertainty", title = "Uncertainty change in out time") +
  theme(legend.title = element_blank(),
        legend.position = "bottom")
```

## Example workflow 2

```{r warning=FALSE}
library(HaDeX)

# file import
dat_2 <- read_hdx(system.file(package = "HaDeX", "HaDeX/data/KD_190304_Nucb2_EDTA_CaCl2_test02_clusterdata.csv"))

# protein sequence reconstruction
reconstruct_sequence(dat_2)

# calculate data
calc_dat_2 <- prepare_dataset(dat = dat_2,
                              in_state_first = "gg_Nucb2_EDTA_0.001",
                              chosen_state_first = "gg_Nucb2_EDTA_25",
                              out_state_first = "gg_Nucb2_EDTA_1440",
                              in_state_second = "gg_Nucb2_CaCl2_0.001",
                              chosen_state_second = "gg_Nucb2_CaCl2_25",
                              out_state_second = "gg_Nucb2_CaCl2_1440")

# theoretical comparison plot - relative values
comparison_plot(calc_dat = calc_dat_2,
                theoretical = TRUE,
                relative = TRUE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Theoretical fraction exchanged in \nstate comparison in 25 min time")

# experimental comparison plot - relative values
comparison_plot(calc_dat = calc_dat_2,
                theoretical = FALSE,
                relative = TRUE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Fraction exchanged in \nstate comparison in 25 min time")

# theoretical comparison plot - absolute values
comparison_plot(calc_dat = calc_dat_2,
                theoretical = TRUE,
                relative = FALSE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Theoretical fraction exchanged in \nstate comparison in 25 min time")

# experimental comparison plot - absolute values
comparison_plot(calc_dat = calc_dat_2,
                theoretical = FALSE,
                relative = FALSE,
                state_first = "Nucb2 Factor 1",
                state_second = "Nucb2 Factor 2") +
  labs(title = "Fraction exchanged in \nstate comparison in 25 min time")

# theoretical Woods plot - relative values
woods_plot(calc_dat = calc_dat_2,
           theoretical = TRUE,
           relative = TRUE) +
  labs(title = "Theoretical fraction exchanged between states in 25 min time") +
  coord_cartesian(ylim = c(-.5, .7))

# experimental Woods plot - relative values
woods_plot(calc_dat = calc_dat_2,
           theoretical = FALSE,
           relative = TRUE) +
  labs(title = "Fraction exchanged between states in 25 min time") +
  coord_cartesian(ylim = c(-.5, .7))

# theoretical Woods plot - absolute values
woods_plot(calc_dat = calc_dat_2,
           theoretical = TRUE,
           relative = FALSE) +
  labs(title = "Theoretical fraction exchanged between states in 25 min time")

# experimental Woods plot - absolute values
woods_plot(calc_dat = calc_dat_2,
           theoretical = FALSE,
           relative = FALSE) +
  labs(title = "Fraction exchanged between states in 25 min time")

# quality control
(result <- quality_control(dat = dat_2,
                           state_first = "gg_Nucb2_EDTA",
                           state_second = "gg_Nucb2_CaCl2",
                           chosen_time = 25,
                           in_time = 0.001))

# example quality control visualisation - relative values
library(ggplot2)
ggplot(result[result["out_time"]>=1,]) +
  geom_line(aes(x = out_time, y = avg_err_state_first, color = "Average error (first state)")) +
  geom_line(aes(x = out_time, y = avg_err_state_second, color = "Average error (second state)")) +
  scale_x_log10() +
  labs(x = "log(time) [min]", y = "Average uncertainty", title = "Uncertainty change") +
  theme(legend.position = "bottom",
        legend.title = element_blank())
```

# References