Exploring Cheminformatic Toolsets for Predicting the Dermal Toxicity of Furanocoumarins

© 2021 The Authors. This article is licensed under a Creative Commons Attribution 4.0 License.

Abstract. Linear furanocoumarins are skin sensitizers and anticancer agents whose appeal in skincare therapeutics is widely exploited. Owing to the need to predict the biological activities of medicines, this work aimed to investigate the predicted dermal toxicity of linear furanocoumarins through chemoinformatic approaches. To this end, eight major linear furanocoumarins of interest in medicine were selected, and their pharmacophores/toxicophores were modelled and submitted to several databases and cheminformatic toolsets previously described in the literature. Moreover, Principal Component Analysis was performed to allow multivariate comparisons. Results showcased that the first two PCs accounted for 95.48% of all variance in the model, and that molecular weight and polar surface were positively correlated with Log P and Log Kp, which may be involved in skin penetration. Moreover, the pharmacophore modelling evidenced superimposition between linear furanocoumarins, ethidium bromide and acridine orange, thereby suggesting that these compounds share similar biological effects, which is supported by their acknowledged DNA intercalating activities. Therefore, this work showcased the application of various cheminformatic tools to screen the dermal toxicity of chemicals.


INTRODUCTION
Linear furanocoumarins (LF), or psoralens, are secondary plant metabolites whose appeal in therapeutics is widely exploited in folk and standard medicine practices [1,2,3]. These compounds are biosynthesized from intermediates of the polyketide and mevalonate pathways, and their chemical structures are highly variable, the only constant feature being the furo[3,2-g]chromen-7-one core moiety. Although psoralens may vary in side-chain composition and conformation, they exhibit similar physicochemical features owing to the substantial electron-donating and electron-accepting properties of their central aromatic system. This system not only gives them electroactivity similar to that of several natural and synthetic aromatic compounds [4][5][6][7][8][9][10] but also makes them photoreactive under radiation at ≈ 340-365 nm (i.e. near the long-wavelength end of the ultraviolet spectrum) [1].
Considering the biological activities of LF, the furo[3,2-g]chromen-7-one core moiety is acknowledged to bind covalently to some therapeutic targets such as DNA bases [11][12][13]. This process is regarded as one of the main factors underlying these compounds' therapeutic and toxic effects [3]. In addition, several authors reported that psoralens interact with many macromolecules in the human organism, which may be involved in their medicinal uses against cancer, psoriasis, vitiligo and other health conditions [2,14,15].
According to standard healthcare protocols, LF are administered by either the enteral or the parenteral route, with the route chosen according to the diagnosis and overall state of the patient [1]. In general terms, LF therapeutics rely on controlled patient exposure to ultraviolet radiation, given the dependence of the clinical outcome on this approach [2]. This exposure is supported by several authors, who described the remarkable effects of radiation on the thermodynamic feasibility of psoralens binding to DNA and other biomolecules, suggesting that photoactivation is involved in the pharmacodynamics of these compounds [2,13]. Furthermore, some authors reported the benefits of applying LF-containing formulations directly to the skin, followed by ultraviolet exposure, to treat an affected area in situ [1,16,17].
Although the topical use of LF in medicine is well reported, the extent of their dermal toxicity is still an issue [18,19]. Despite not being considered outright toxic, these compounds are highly photoreactive, and the still unknown consequences of their phototoxicity under daily exposure to ultraviolet radiation are a significant concern for long-term treatments [14]. In this sense, the comprehensive investigation of the dermal effects of LF is of particular importance to promote patient safety and ensure treatment efficacy.
Several in vitro and in vivo assays are commonly used to assess whether new formulations meet patient safety standards regarding dermal drug toxicity. Amongst these are the murine local lymph node assay (LLNA) [20], the direct peptide reactivity assay (DPRA) [21], the human cell line activation test (h-CLAT) [22] and KeratinoSens® [23], which provide essential information regarding skin sensitization and dermal toxicity. Albeit reliable and effective in providing information on possible toxic effects and their mechanisms, some of these tests are expensive and require adequate infrastructure [24]. Moreover, most in vitro approaches are time- and reagent-consuming, which hinders their application to large-scale tasks such as toxicological screenings. In this sense, alternative methods such as cheminformatic investigations could provide complementary information regarding the dermal toxicity of compounds bearing similar chemical structures, such as LF.
Cheminformatics is a relatively new field of research that uses computational resources to investigate chemical phenomena. This approach relies on the physicochemical features of compounds, converted to usable data through mono- or multidimensional molecular descriptors [25][26][27][28]. This information can be correlated with databases or processed with data mining and machine learning algorithms to establish predictive models of biological activity, pharmacophores [30], docking models [27,28,31,32], and other applications. Moreover, for most scientific applications these studies can be performed with free software, i.e., freeware. This further increases their appeal as low-cost alternatives for the preliminary investigation of drug biological activities or as complementary techniques to guide in vitro and in vivo assays.
Therefore, given the therapeutic relevance of LF for skin conditions and the importance of gathering information regarding their toxicity upon topical use, this work investigates the predicted dermal toxicity of LF through chemoinformatic approaches. To this end, several databases and toolsets previously described in the literature were used, and their results were compared.
Study design and data pre-treatment. This work initially focused on gathering information regarding the physicochemical and biological properties of the selected LF. To this end, the PubChem database was used to retrieve the isomeric (when available) or canonical simplified molecular-input line-entry system (SMILES) string of each compound [33]. This information was used without further treatment in the Molinspiration [34], pkCSM [35], SwissADME [36,37] and Pred-Skin [38,39] cheminformatic tools.
Thereafter, the SMILES string of each compound was converted to a three-dimensional rendering of its chemical structure, whereupon charges were assigned using Biovia Discovery Studio® software. Manual corrections regarding aromatic bonds were also conducted, and all structures were thoroughly reviewed before further experiments.
All treated structures were added to a single file and submitted to the PharmaGist webserver [40,41]. The run settings were: 5 output pharmacophores, PSO as key-molecule and a minimum of 3 features in the predicted model. Furthermore, we employed the following feature weightings in the proposed models: 3.0 for aromatic rings, 1.0 for charge (anion/cation), 1.5 for hydrogen bond (donor/acceptor) and 0.3 for hydrophobic features. The higher weighting for aromatic rings was selected considering the abundance of π-electrons in the core moiety common to all LF, namely furo[3,2-g]chromen-7-one.
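As a rough illustration of how the feature weightings listed above could influence the ranking of candidate pharmacophores, the sketch below scores a candidate as the weighted sum of its matched features. Only the four weights and the minimum of 3 features come from the text; the scoring function, the names `FEATURE_WEIGHTS` and `score_candidate`, and the example feature counts are assumptions made for illustration and do not reproduce the internal scoring of PharmaGist.

```python
# Feature weightings as stated in the text; all other names are hypothetical.
FEATURE_WEIGHTS = {
    "aromatic": 3.0,     # aromatic rings
    "charge": 1.0,       # anion/cation
    "hbond": 1.5,        # hydrogen bond donor/acceptor
    "hydrophobic": 0.3,  # hydrophobic features
}

MIN_FEATURES = 3  # minimum number of features required in a model


def score_candidate(feature_counts):
    """Return a weighted-sum score, or None if the candidate has too few features.

    This is an illustrative scoring rule, not the PharmaGist algorithm.
    """
    if sum(feature_counts.values()) < MIN_FEATURES:
        return None
    return sum(FEATURE_WEIGHTS[name] * count
               for name, count in feature_counts.items())


# Hypothetical candidate: three aromatic rings and two hydrogen-bond features.
example = {"aromatic": 3, "hbond": 2}
print(score_candidate(example))  # 3 * 3.0 + 2 * 1.5 = 12.0
```

Under such a rule, a candidate built around the tricyclic aromatic core outranks one with the same number of hydrophobic features by a factor of ten per feature, which is the intended effect of the higher aromatic weighting.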
All figures were rendered and treated using Biovia Discovery Studio ® software.
Statistical analysis. To investigate any possible trend in the physicochemical data of the compounds, Principal Component Analysis (PCA) was used [42]. This approach was selected to reduce the dimensionality of the dataset based on the variance/correlation matrix. Statistical significance was attributed to p < 0.05. All statistical analyses were performed using the Origin Pro 9b® software package (Northampton, MA, USA).
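The correlation-matrix PCA described above can be sketched in a few lines of standard-library Python. The example below is not the Origin Pro procedure used in this work: it is a minimal sketch that standardizes each descriptor, builds the correlation matrix, and extracts the leading eigenvalues by power iteration with deflation (a technique chosen here only to avoid external dependencies). The descriptor values in `demo_rows` are invented placeholders, not data from Tables 1 or 2.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def top_eigenpairs(matrix, k, iters=1000):
    """Leading k eigenpairs of a symmetric matrix (power iteration, deflation)."""
    p = len(matrix)
    m = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-zero start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(p))
                  for i in range(p))
        pairs.append((lam, v))
        for i in range(p):  # deflate: remove the component just found
            for j in range(p):
                m[i][j] -= lam * v[i] * v[j]
    return pairs


def explained_variance_ratio(rows, k=2):
    """Fraction of total variance carried by each of the first k PCs."""
    corr = correlation_matrix(standardize(rows))
    p = len(corr)  # trace of a correlation matrix equals p
    return [lam / p for lam, _ in top_eigenpairs(corr, k)]


# Placeholder descriptors (MW, Log P, PSA) for five hypothetical compounds.
demo_rows = [
    [200.0, 1.8, 50.0],
    [215.0, 2.1, 58.0],
    [230.0, 2.0, 65.0],
    [260.0, 3.1, 52.0],
    [280.0, 3.3, 60.0],
]
print(explained_variance_ratio(demo_rows))
```

Summing the first two ratios gives the share of variance captured by a two-component biplot, which is the quantity reported for the model in this work.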

RESULTS AND DISCUSSION
The first step of this investigation involved the assessment of the predicted physicochemical properties of each compound. To this end, the pkCSM and Molinspiration databases were used, and the resulting data are showcased in Table 1. To investigate the pharmacokinetic properties of LF upon non-invasive cutaneous administration, their predicted skin permeability profiles were assessed by gathering Log Kp data from different databases. Moreover, other predictive pharmacokinetic information was also gathered, namely: blood-brain barrier permeability (Log BB), one-way flux level corrected with the brain flux value representing the central nervous system permeability (Log PS), and intestinal absorption (IA). Results are showcased in Table 2.
Results show that predicted Log Kp values differ according to the consulted source and are also expressed in different units. For example, Log Kp from source A (pkCSM) ranged from -2.216 to -2.830 cm/h, while from source B (SwissADME) the values ranged from -4.61 to -6.40 cm/s. Log BB and Log PS, which correlate with the distribution of the drugs through cerebrospinal fluid, ranged from 0.085 to 0.447 and from -1.639 to -2.834, respectively. Furthermore, predicted intestinal absorption for all drugs was above 95%, the highest being 98.386% for ISO (Table 2).

Notes: *All data gathered from the pkCSM and SwissADME databases upon inputting the SMILES string of each compound. **Calculated properties at pkCSM. Log Kp A: skin permeability (cm/h); Log BB: blood-brain barrier permeability; Log PS: constant of one-way flux level corrected with the brain flux value, representing the central nervous system permeability; IA: intestinal absorption (%).
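Because the two Log Kp scales differ in their time unit, part of the gap between sources is a constant offset. Assuming that the pkCSM values are log10 of Kp in cm/h and the SwissADME values are log10 of Kp in cm/s, the conversion is log10(Kp in cm/s) = log10(Kp in cm/h) - log10(3600). The helper below is a minimal sketch of this arithmetic; the function name is hypothetical.

```python
import math

SECONDS_PER_HOUR = 3600.0


def log_kp_cm_per_h_to_cm_per_s(log_kp_h):
    """Convert log10(Kp in cm/h) to log10(Kp in cm/s).

    Assumes the input is a base-10 logarithm of a permeability in cm/h.
    """
    return log_kp_h - math.log10(SECONDS_PER_HOUR)


# End points of the pkCSM range quoted in the text.
for value in (-2.216, -2.830):
    print(value, "->", round(log_kp_cm_per_h_to_cm_per_s(value), 3))
```

Under this assumption, the pkCSM range of -2.216 to -2.830 corresponds to roughly -5.77 to -6.39 on the cm/s scale, which falls within the -4.61 to -6.40 range reported by SwissADME. Any comparison between the two sources should therefore be made after placing both on the same unit.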
After gathering data regarding each compound's predicted physicochemical and pharmacokinetic features, molecular weight, Log P, surface area and polar surface area were investigated as independent, continuous variables through PCA. This was performed to analyze whether the reported values showcased any trend in the skin absorption behaviour of LF, which is represented by Log Kp.
Results are showcased as PCA biplot in Figure 1.

Figure 1
Notes: PCA biplot showcasing the scatter plot of LF (red) and the eigenvector orientation and dimension according to the first two principal components (PC). MW stands for molecular weight; SA stands for surface area; PSA stands for polar surface area; and Log Kp stands for skin permeability coefficients gathered from different databases (A - pkCSM and B - SwissADME).

Results showcased that the first two PCs accounted for 95.48% of all variance in the model (Figure 1). Furthermore, the eigenvectors representing molecular weight, surface area, Log Kp B and Log P converged, which was further supported by the correlation matrix of the model. This suggests a positive influence of these factors on skin penetration. In addition, the polar surface area and Log Kp A eigenvectors showcased little convergence with the other descriptive vectors. However, their alignment suggests that their datasets may be inversely proportional, which is further supported by the correlation matrix (Figure 1).
The second step of this study involved the investigation of the predicted skin sensitization effects and toxicity of LF. Therefore, pkCSM and Pred-Skin databases were used, and the results are displayed in Table 3.
Results showcased that pkCSM predicted that none of the submitted LF would cause skin sensitization, while Pred-Skin predicted that most compounds promote this effect. Although Pred-Skin results are disclosed as categorical, a percentage score is given to support the reliability of the findings. In this sense, most LF were predicted as skin sensitizing agents, with a modal score of 80%. Moreover, Pred-Skin also provides predictive information about specific tests, namely LLNA, DPRA, h-CLAT and KeratinoSens®, as well as a consensus regarding the skin sensitization model. Most LF showcased positive results in these predicted assays, albeit with statistical reliability ranging from 50 to 60%. Nevertheless, the consensus model classified most LF as skin sensitizers, whereas BTN and ISO showcased negative predicted outcomes (Table 3).
The third step of this investigation was the comparison of all LF chemical structures between themselves. Therefore, a pharmacophore model was rendered using PSO as a key-molecule. Results are depicted in Figure 2.
Results showcased that the furo[3,2-g]chromen-7-one core moiety was superimposed in the model (Figure 2.A), with a distance of 2.146 Å between the nuclei of the furan and central benzene moieties. In comparison, the distance between the nuclei of the central benzene and the aromatic δ-valerolactone ring was 2.401 Å (Figure 2.B). The distance between the oxygens of the furan and the aromatic δ-valerolactone ring was 4.781 Å, and the distance between the two oxygens in the carboxyl unit of this lactone was 2.306 Å (Figure 2.C). In addition, the aromatic nuclei were arranged at an obtuse angle of 177.86° (Figure 2.D).
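Distances and angles such as those reported above follow directly from the Cartesian coordinates of the pharmacophore features. The sketch below, which is not the Discovery Studio workflow used in this work, shows one way to compute a ring centroid, the distance between two points and the angle formed by three points. The function names and the example coordinates are hypothetical and are not taken from the model.

```python
import math


def centroid(points):
    """Mean position of a set of 3D points, e.g. the atoms of one ring."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def distance(a, b):
    """Euclidean distance between two 3D points (same unit as the input)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    u = [x - y for x, y in zip(a, vertex)]
    v = [x - y for x, y in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    cos_angle = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Made-up centers of three fused rings lying almost on a straight line.
ring_a = (-2.1, 0.1, 0.0)
ring_b = (0.0, 0.0, 0.0)
ring_c = (2.4, 0.0, 0.0)
print(distance(ring_a, ring_b), distance(ring_b, ring_c))
print(angle_deg(ring_a, ring_b, ring_c))
```

An angle close to 180° between three ring centers, as in the LF model, indicates a nearly linear arrangement of the fused rings.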
Considering that DNA intercalators such as ethidium bromide and acridine orange have tricyclic structures similar to those of LF, a pharmacophore model was rendered using these compounds, with PSO as the key-molecule. Results are depicted in Figure 3. Results showcased that the structures did not superimpose as closely as in the model comprising only LF, an expected finding (Figure 3.A). The main model contributors were aromatic and positioned at three distinct points, forming an angle of 160.56° (Figure 3.B). Moreover, the external aromatic contributors were at similar distances from the central one, about 2.261 Å (Figure 3.C). The contributors were almost planar, showcasing negligible torsion (Figure 3.D).
Many authors discuss the photoreactivity of LF and its effects upon dermal administration [1,2,16]. Although the exact mechanisms underlying these effects are still unclear, the literature attributes pro-oxidant properties to the condensed aromatic system of psoralens [43,44]. Some reports support this interpretation on the basis of the increased formation of reactive oxygen species in the presence of LF and similar compounds [19,32,44,45,46]. In addition, it is thermodynamically feasible for the furan ring of LF to bind covalently to several biological receptors, which could explain both the therapeutic and toxic effects of these chemicals [9,10,47,48].
In any case, most authors agree that the physicochemical features of LF are of utmost importance to their effects, and structural characteristics such as the condensed aromatic system are often suggested as pharmacophores in both natural and synthetic compounds [29,49,50,51,52], which accounts for their particular physicochemical and pharmacological attributes [53][54][55]. Moreover, this core moiety confers enough lipophilicity to these chemicals to allow their diffusion through the skin and across other biological barriers such as the blood-brain barrier [56]. In this work, several databases provided similar physicochemical and pharmacokinetic information about LF (Tables 1 and 2), and these data are also supported by empirical research found in the literature [57][58][59][60]. Moreover, correlating the findings through PCA revealed trends also described in other reports, such as the convergence of the molecular weight, polar surface, Log P and Log Kp B eigenvectors (Figure 1). The correlation between these vectors suggests that molecular weight and electron distribution are related to the lipophilicity and skin permeability of these compounds. Likewise, many authors describe how the chemical and electronic features of similar compounds may affect their pharmacokinetics in in silico and in vivo investigations [57-59, 61, 62].
Considering the similar core structure shared by LF, the superimposition of their structures in the pharmacophore modelling was no surprise (Figure 2). The overall sp² hybridization of all carbons in the furo[3,2-g]chromen-7-one moiety confers a planar orientation to the core of all LF [63]. This is further supported by energy minimization approaches based on "Assisted Model Building with Energy Refinement" toolkits [64], as well as by ab initio investigations of similar compounds using density functional theory calculations [65]. In any case, the LF core structure is regarded as a strong electron acceptor and donor, which supports their pro-oxidant behaviour and photoreactivity, since it allows electron transitions that may be involved in the biological effects of these compounds [13].
Previous reports evidenced that exposure to UV radiation is of utmost importance to the covalent binding of LF such as PSO to DNA bases, which further supports the involvement of the core moiety in the therapeutic or toxic effects of these chemicals [1,66]. Moreover, when compared to the well-known DNA intercalators ethidium bromide and acridine orange, the furo[3,2-g]chromen-7-one moiety showcased similar aromatic contributors, as well as similar geometry, considering the planar orientation of these molecules (Figure 3). An important point must be addressed, though, concerning the toxicodynamics of DNA intercalators such as the ones used herein for the pharmacophore modelling. Ethidium bromide and acridine orange link to DNA through covalent bonds and intermolecular interactions, and the kinetics and thermodynamics of this phenomenon have been the subject of several investigations [12,13]. PSO showcases similar behaviour regarding the involvement of both covalent bonds and intermolecular interactions. Considering the structural similarities discussed herein, the furo[3,2-g]chromen-7-one moiety may be an essential contributor to dermal toxicity.
Regarding the predicted skin sensitization, pkCSM results differed from the ones provided by Pred-Skin (Table 3). This difference may be attributed to the distinct algorithms and weightings employed by each tool to process the input information [35,38,39]. In any case, Pred-Skin results were consistent with empirical investigations, which showcased the skin sensitizing effects of LF by different in vitro and in vivo approaches [1,3,59,60]. Care must be taken, however, because these results by no means imply that one database is more reliable than the other, considering the narrow scope of our investigation. Given the plethora of information found in all databases consulted herein, which was consistent with empirical data, we suggest that all the tools be used complementarily.

CONCLUSIONS
This work investigated the predicted dermal toxicity of LF through chemoinformatic approaches. Results showcased that molecular descriptor data can be used to predict the dermal toxicity of LF through in silico toolsets, which sheds light on the use of computational methods to predict the clinical outcomes of skin exposure to these compounds.