Journal of Chemometrics

Current research reports and chronological list of recent articles.


The international scientific Journal of Chemometrics is devoted to the rapid publication of original scientific papers, reviews and short communications on fundamental and applied aspects of chemometrics.

The publisher is Wiley, which holds the copyright and publishing rights for the articles listed below and is responsible for the content shown.

To search this web page for specific words, press "Ctrl" + "F" on your keyboard ("Command" + "F" on a Mac), then type the word you are searching for in the window that pops up.

For additional research articles, see Current Chemistry Research Articles. See also the information resources on chemometrics.



Journal of Chemometrics - Abstracts



Introducing special issue on chemical image analysis


Date: 15.09.2017


Issue Information

No abstract is available for this article.
Date: 05.09.2017


Post-modified non-negative matrix factorization for deconvoluting the gene expression profiles of specific cell types from heterogeneous clinical samples based on RNA-sequencing data

The application of supervised algorithms in clinical practice has been limited by the lack of information on pure cell types. Several supervised algorithms have been proposed to estimate the gene expression patterns of specific cell types from heterogeneous samples. Post-modified non-negative matrix factorization (NMF), the unsupervised algorithm proposed here, is capable of estimating the gene expression profiles and contents of the major cell types in cancer samples without any prior reference knowledge. Post-modified NMF was first evaluated using simulation data sets and then applied to deconvolution of the gene expression profiles of cancer samples. It exhibited satisfactory performance with both the validation and application data. For application in 3 types of cancer, the differentially expressed genes (DEGs) identified from the deconvoluted gene expression profiles of tumor cells were highly associated with the cancer-related gene sets. Moreover, the estimated proportions of tumor cells showed a significant difference between the 2 compared patient groups in clinical endpoints. Our results indicate that post-modified NMF can efficiently extract the gene expression patterns of specific cell types from heterogeneous samples for subsequent analysis and prediction, which will greatly benefit clinical prognosis.
Date: 31.08.2017
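
The deconvolution idea in the abstract above, splitting a mixed expression matrix into cell-type profiles and per-sample proportions, can be illustrated with plain NMF. The following is only a minimal sketch using the standard Lee-Seung multiplicative updates; it does not reproduce the post-modification step of the paper, and the toy matrices, function names, and parameters (`rank`, `n_iter`, `eps`) are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, rank, n_iter=500, seed=0, eps=1e-12):
    """Plain NMF, V ~ W H, with multiplicative updates (not the paper's variant)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the updates stay non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical toy data: 4 genes, 2 cell types, 5 mixed samples.
true_profiles = [[5.0, 0.5], [0.2, 4.0], [3.0, 3.0], [1.0, 0.1]]
true_proportions = [[0.9, 0.7, 0.5, 0.3, 0.1],
                    [0.1, 0.3, 0.5, 0.7, 0.9]]
v = matmul(true_profiles, true_proportions)

w, h = nmf(v, rank=2)
print("reconstruction error:", frobenius_error(v, w, h))
```

Here the columns of `w` play the role of cell-type expression profiles and the columns of `h` the role of unnormalized cell-type contents per sample. NMF factors are only determined up to scaling and permutation, so they would need normalization before being read as proportions.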


FTIR-ATR adulteration study of hempseed oil of different geographic origins

Adulteration of hempseed (H) oil, a well-known health-beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as rapeseed (R), sesame (Se), and sunflower (Su) oils. Many samples of different geographic origins were taken into account. Binary mixture sets of hempseed oil with these 3 oils (HR, HSe, and HSu) were considered. FTIR spectra of pure oils and their mixtures were recorded, and quantitative analyses were performed using partial least squares regression (PLS) and first-break forward interval PLS methods (FB-FiPLS). The obtained results show that each particular oil can be very successfully quantified (R2(val) > 0.995, RMSECV 0.9%–2.9%, RMSEP 1.0%–3.2%). This means that FTIR coupled with multivariate methods can rapidly and effectively determine the level of adulteration in hempseed oil for the studied, frequently used adulterant oils. Also, the relevant variables selected by FB-FiPLS could be used for verification of hempseed oil adulteration.
Date: 31.08.2017


Prediction of pitting corrosion status of EN 1.4404 stainless steel by using a 2-stage procedure based on support vector machines

The excellent properties of EN 1.4404 have made this material one of the most popular types of austenitic stainless steel used for many applications. However, in aggressive environments, this alloy may suffer corrosion. Electrochemical analyses have been extensively used to evaluate the pitting corrosion behaviour of stainless steel. These techniques may be followed by microscopic analysis to determine the resistance of the passive layer. This step requires human interpretation and may therefore introduce subjectivity into the results. This work aims to overcome this drawback by developing an automatic model capable of predicting the pitting corrosion status of this material. A combined model based on support vector machines (SVMs) is presented in this work. To improve prediction performance, the model uses the breakdown potential values that it estimates in a first stage. The performance is evaluated based on receiver operating characteristic (ROC) curves. The area under the curve (AUC) and accuracy results (0.998 and 0.952, respectively) demonstrate the utility of the proposed model as an efficient and accurate tool to predict pitting behaviour of EN 1.4404 automatically.
Date: 29.08.2017
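
The two evaluation figures quoted in the abstract above, AUC and accuracy, can be computed from classifier scores as in the following minimal sketch. The labels and scores are made-up illustrative values, not data from the paper, and the 0.5 decision threshold is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, l in zip(predictions, labels) if p == l)
    return correct / len(labels)


# Hypothetical example: 1 = pitting, 0 = no pitting.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(roc_auc(labels, scores))   # 0.75
print(accuracy(labels, scores))  # 0.75
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be preferable for large data sets.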


Quinolone carboxylic acid derivatives as HIV-1 integrase inhibitors: Docking-based HQSAR and topomer CoMFA analyses

Quinolone carboxylic acid derivatives as inhibitors of HIV-1 integrase were investigated as a potential class of drugs for the treatment of acquired immunodeficiency syndrome (AIDS). Hologram quantitative structure-activity relationships (HQSAR) and translocation comparative molecular field vector analysis (topomer CoMFA) were applied to a series of 48 quinolone carboxylic acid derivatives. The most effective HQSAR model was obtained using atoms and bonds as fragment distinctions: cross-validation q2 = 0.796, standard error of prediction SDCV = 0.36, the non-cross-validated r2 = 0.967, non-cross-validated standard error SD = 0.17, the correlation coefficient of external validation Qext2 = 0.955, and the best hologram length HL = 180. Topomer CoMFA models were built based on different fragment cutting models, with the most effective model of q2 = 0.775, SDCV = 0.37, r2 = 0.967, SD = 0.15, Qext2 = 0.915, and F = 163.255. These results show that the models generated from HQSAR and topomer CoMFA were able to effectively predict the inhibitory potency of this class of compounds. The molecular docking method was also used to study the interactions of these drugs by docking the ligands into the HIV-1 integrase active site, which revealed the likely bioactive conformations. This study showed that there are extensive interactions between the quinolone carboxylic acid derivatives and THR80, VAL82, GLY27, ASP29, and ARG8 residues in the active site of HIV-1 integrase. These results provide useful insights for the design of potent new inhibitors of HIV-1 integrase.
Date: 29.08.2017


Chemometrics optimization for simultaneous adsorptive removal of ternary mixture of Cu(II), Cd(II), and Pb(II) by Fraxinus tree leaves

Fraxinus tree leaves were successfully used to remove a ternary mixture of Cu(II), Cd(II), and Pb(II) from an aqueous solution in a batch system. The simplex-centroid mixture design was used for optimization of the biosorption process. The factors affecting the biosorption process, such as pH, s (amount of biosorbent), and Ci (initial concentrations of metal ions), were considered via a crossed mixture–process design. Optimal conditions were found to be as follows: pH = 5, s (sorbent mass) = 0.05 g, CCu (initial Cu(II) concentration) = 100.0 mg/L, CCd (initial Cd(II) concentration) = 129.1 mg/L, and CPb (initial Pb(II) concentration) = 70.9 mg/L. The results clearly show competitive effects between mixture ingredients in favor of Pb(II), and an interaction between process and mixture variables was also observed. It was found that, with increasing Pb(II) contribution, the removal efficiency increases to its highest value. The pH has a positive effect and sorbent mass a negative effect on the response. To characterize the biosorption, Fourier transform infrared analysis was performed and, according to the results, the main functional groups of the sorbent were involved in the biosorption process.
Date: 29.08.2017


Fault detection based on weighted difference principal component analysis

Recently, multivariate statistical methods, such as principal component analysis (PCA), have drawn increasing attention for fault detection applications in industrial processes. However, industrial processes typically have complex multimodal and nonlinear characteristics. In these situations, the traditional PCA method performs poorly due to its assumption that the process data are linear and unimodal. To improve fault detection performance in nonlinear and multimode industrial processes, this paper proposes a new fault detection method based on weighted difference principal component analysis (WDPCA). Weighted difference principal component analysis first eliminates the multimodal and nonlinear characteristics of the original data by using the weighted difference method. Then, PCA is applied to the preprocessed data, neglecting the influences of multimodality and nonlinearity. Two numerical examples and an industrial application in a semiconductor manufacturing process are used to verify the effectiveness of WDPCA. The simulation results demonstrate that WDPCA shows better fault detection performance than the PCA, kernel principal component analysis (KPCA), independent component analysis (ICA), k-nearest neighbor rule (kNN), and local outlier factor (LOF) methods.
Date: 24.08.2017


Independent component analysis based on data-driven reconstruction of multi-fault diagnosis

Independent component analysis based on data-driven reconstruction has been widely used in online fault diagnosis for industrial processes. As an alternative to conventional contribution plots, the reconstruction-based fault diagnosis method has been drawing special attention. The method detects fault information with a specific reconstruction model based on historical fault data. In this paper, a novel method was proposed that focuses on handling multiple fault cases in abnormal processes. First, reconstruction-based fault subspaces were extracted based on monitoring statistics in 2 different monitoring subspaces to enclose the major fault effects. Independent component analysis was then used to recover the main fault features from the selected fault subspaces, which represent the joint effects from multiple faults for online diagnosis. The simulation results showed the feasibility and performance of the proposed method with simulated multi-fault cases in the Tennessee Eastman (TE) benchmark process.
Date: 24.08.2017


Identification of hit compounds for squalene synthase: Three-dimensional quantitative structure-activity relationship pharmacophore modeling, virtual screening, molecular docking, binding free energy calculation, and molecular dynamic simulation

Squalene synthase (SQS) is the key precursor in the synthesis of cholesterol. Its location downstream of hydroxymethylglutaryl coenzyme A reductase and its lack of influence on the formation of biologically necessary isoprenoids make it an interesting target for the development of cholesterol-lowering drugs with fewer side effects. To discover novel SQS inhibitors, three-dimensional quantitative structure-activity relationship pharmacophore models were built and further validated by cost function analysis, test set validation, and decoy set validation to obtain a reliable model for virtual screening against a database that contains 5.5 million compounds. The interactions between SQS and the ligands were predicted by an integrated protocol that contains molecular docking, molecular mechanics/generalized Born surface area, and molecular dynamic simulation. After that, five compounds with the best binding affinities and binding modes were obtained as potential hits for further study, and three of them showed inhibitory effects against SQS.
Date: 24.08.2017


Calculation of topological indices from molecular structures and applications

This mini review presents a brief description of the research efforts for new topological indices of organic molecular structures undertaken in the authors' laboratory at Changchun Institute of Applied Chemistry, Chinese Academy of Sciences. They were used for the processing of chemical information, as highly selective topological indices for uniqueness determination, as highly selective atomic chiral indices for chiral center recognition, in the exhaustive generation of isomers, in a stereo code for the exhaustive generation of stereoisomers, in the prediction of C-13 nuclear magnetic resonance spectra, and in studies on rare earth extractions. The topological indices Ami, 3D descriptors, and chiral descriptors are described, as well as their applications in quantitative structure activity/property relationship studies.
Date: 24.08.2017


Sampling error profile analysis (SEPA) for model optimization and model evaluation in multivariate calibration

A novel method called sampling error profile analysis (SEPA) based on Monte Carlo sampling and error profile analysis is proposed for outlier detection, cross validation, pretreatment method and wavelength selection, and model evaluation in multivariate calibration. With the Monte Carlo sampling in SEPA, a number of submodels are prepared and the subsequent error profile analysis yields a median and a standard deviation of the root-mean-square error (RMSE) for the submodels. The median coupled with the standard deviation is an estimation of the RMSE that is more predictive and robust because it uses representative submodels produced by Monte Carlo sampling, unlike the normal method, which uses only 1 model. The error profile analysis also calculates skewness and kurtosis for an auxiliary judgment of the estimated RMSE, which is useful for model optimization and model evaluation. The proposed method is evaluated with 3 near-infrared datasets for wheat, corn, and tobacco. The results show that SEPA can diagnose outliers with more parameters, select more reasonable pretreatment method and wavelength points, and evaluate the model more accurately and precisely. Compared with the results reported in published papers, a better model could be obtained with SEPA concerning RMSECV, RMSEC, and RMSEP estimated with an independent prediction set.
Date: 24.08.2017
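
The error-profile idea in the SEPA abstract above (many Monte Carlo submodels, then median, standard deviation, skewness, and kurtosis of their RMSE values) can be sketched as follows. This is an illustrative sketch only: a univariate least squares line stands in for the multivariate calibration model, the data are simulated, and the function names and parameters (`n_runs`, `train_fraction`) are assumptions rather than the authors' implementation.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least squares fit of y = a + b*x (stand-in for a calibration model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(a, b, xs, ys):
    """Root-mean-square error of the fitted line on the given samples."""
    return (sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def error_profile(xs, ys, n_runs=200, train_fraction=0.8, seed=0):
    """Monte Carlo splits -> one RMSE per submodel -> summary statistics."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_train = int(train_fraction * len(xs))
    errors = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(rmse(a, b, [xs[i] for i in test], [ys[i] for i in test]))
    mean = statistics.fmean(errors)
    sd = statistics.pstdev(errors)
    skew = sum((e - mean) ** 3 for e in errors) / (len(errors) * sd ** 3)
    kurt = sum((e - mean) ** 4 for e in errors) / (len(errors) * sd ** 4) - 3.0
    return {
        "n": len(errors),
        "median": statistics.median(errors),
        "std": statistics.stdev(errors),
        "skewness": skew,
        "kurtosis": kurt,  # excess kurtosis (0 for a normal distribution)
    }


# Simulated calibration data: y = 1 + 2x plus Gaussian noise (sigma = 0.1).
data_rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]

profile = error_profile(xs, ys)
print(profile)
```

Reporting the median together with the spread, rather than the RMSE of a single split, is what makes the estimate less sensitive to one lucky or unlucky partition of the samples.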


Selectivity-relaxed classical and inverse least squares calibration and selectivity measures with a unified selectivity coefficient

Two popular calibration strategies are classical least squares (CLS) and inverse least squares (ILS). Underlying CLS is that the net analyte signal used for quantitation is orthogonal to signal from other components (interferents). The CLS orthogonality avoids analyte prediction bias from modeled interferents. Although this orthogonality condition ensures full analyte selectivity, it may increase the mean squared error of prediction. Under certain circumstances, it can be beneficial to relax the CLS orthogonality requisite, allowing a small interferent bias if, in return, there is a mean squared error of prediction reduction. The bias magnitude introduced by an interferent for a relaxed model depends on analyte and interferent concentrations in conjunction with analyte and interferent model sensitivities. Presented in this paper is relaxed CLS (rCLS), allowing flexibility in the CLS orthogonality constraints. While ILS models do not inherently maintain orthogonality, also presented is relaxed ILS. From development of rCLS, presented is a significant expansion of the univariate selectivity coefficient definition broadly used in analytical chemistry. The defined selectivity coefficient is applicable to univariate and multivariate CLS and ILS calibrations. As with the univariate selectivity coefficient, the multivariate expression characterizes the bias introduced in a particular sample prediction because of interferent concentrations relative to model sensitivities. Specifically, it answers the question of when a prediction can be made for a sample even though the analyte selectivity is poor. Also introduced are new component-wise selectivity and sensitivity measures. Trends in several rCLS figures of merit are characterized for a near infrared data set.
Date: 17.08.2017


Fault diagnosis of output-related processes with multi-block MOPLS

For fault diagnosis of output-related processes, a relatively high false alarm rate (FAR) of output-irrelevant faults exists because the output-irrelevant variables are not removed completely by conventional approaches. A relatively high computational load is thus required. Therefore, in this paper, a new fault diagnosis approach based on multiblock modified orthogonal projections to latent structures is proposed to complete fault diagnosis for complex chemical processes, particularly for the penicillin fermentation process. The main contributions are as follows: (1) Multiblock orthogonal projections to latent structures are applied to remove the noncorrelated variables of input blocks, which requires a relatively low computational load. In addition, block scores are obtained by block weights rather than super weights to better describe the character of each subblock. (2) Complete orthogonal decomposition between input block variables and output is explored to completely separate output-related and output-irrelevant variables to improve the accuracy of diagnosis. (3) A hierarchical diagnosis scheme, which is composed of block monitoring statistics and subblock monitoring statistics, is proposed to monitor and localize faults. The penicillin fermentation process is considered in this study, and the penicillin concentration-relevant fault in each block and subblock is analyzed. The results of this study show that the proposed method has steadier performance on output-related fault diagnoses and diagnoses faults more accurately.
Date: 28.07.2017


Quantitative structure-selectivity relationship (QSSR)-based molecular insight into the cross-reactivity and specificity of chemotherapeutic inhibitors between PI3Kα and PI3Kβ

Selective inhibition of phosphoinositide 3-kinase (PI3K) isoforms α and β with small-molecule inhibitors can result in distinct biological effects on anticancer chemotherapy. However, many existing PI3K inhibitors have moderate or high promiscuity and cross-reactivity between the 2 kinase isoforms. Here, a quantitative structure-selectivity relationship–based statistical modeling scheme was used to characterize the relative contribution of independent kinase residues to inhibitor selectivity and to predict the selectivity and specificity for existing PI3K inhibitors. It is found that the residue type and distribution of the kinase's active site play an important role in inhibitor selectivity, while the rest of the kinase may contribute to the selectivity through long-range chemical interactions and indirect allosteric effects. The selectivity is also determined by the configuration difference between PI3Kα and PI3Kβ kinase domains. Larger inhibitor compounds have lower binding potency to PI3Kβ than PI3Kα and thus possess higher selectivity for PI3Kα over PI3Kβ.
Date: 28.07.2017


Why orthogonal rotations might be not so orthogonal as you think


Date: 19.07.2017


A novel molecular descriptor selection method in QSAR classification model based on weighted penalized logistic regression

Molecular descriptor selection is a pivotal tool for quantitative structure–activity relationship modeling. This paper proposes a novel molecular descriptor selection method that takes into account the group type to which each descriptor belongs. The method combines penalized logistic regression with the 2-sample t test and can perform filtering and weighting simultaneously. Specifically, the 2-sample t test is employed as a filter method, removing descriptors that do not show a statistically significant difference. In addition, a weighted penalized logistic regression is used, assigning each descriptor a weight that depends on its 2-sample t test value within the descriptor type block. The proposed method is experimentally tested and compared with state-of-the-art selection methods. The results show that our proposed method is simpler and faster, with efficient classification performance.
Date: 19.07.2017
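
The filtering step described in the abstract above can be sketched as below. The abstract does not say which form of the 2-sample t test is used, so this sketch assumes Welch's unequal-variance statistic and a fixed cutoff on |t| in place of a formal significance test. The descriptor names, values, and the `t_threshold` parameter are hypothetical, and the weighted penalized logistic regression stage is not shown.

```python
import statistics


def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances assumed)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / (va / len(a) + vb / len(b)) ** 0.5


def filter_descriptors(active, inactive, t_threshold=2.0):
    """Keep descriptors whose |t| between the two classes reaches the cutoff.

    active / inactive map a descriptor name to its values in each class.
    Returns a dict of the kept descriptor names and their t statistics.
    """
    kept = {}
    for name in active:
        t = welch_t(active[name], inactive[name])
        if abs(t) >= t_threshold:
            kept[name] = t
    return kept


# Hypothetical descriptor values for active and inactive compounds.
active = {"logP": [2.1, 2.4, 2.2, 2.6, 2.3], "n_rings": [3, 2, 3, 2, 3]}
inactive = {"logP": [0.9, 1.1, 1.0, 1.3, 0.8], "n_rings": [2, 3, 3, 2, 3]}

kept = filter_descriptors(active, inactive)
print(kept)  # only "logP" separates the two classes
```

The retained t statistics could then be turned into per-descriptor penalty weights for the regression stage; how the paper maps t values to weights is not stated in the abstract, so that step is left out here.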


Ensemble partial least squares regression for descriptor selection, outlier detection, applicability domain assessment, and ensemble modeling in QSAR/QSPR modeling

In QSAR/QSPR modeling, building an accurate partial least squares (PLS) model usually involves descriptor selection, outlier detection, applicability domain assessment, nonlinear relationship, and model stability problems. In the present study, we present an ensemble PLS (EnPLS) method for solving these modeling tasks under a unified methodology framework. EnPLS aims at developing a consistent algorithmic framework by means of the idea of ensemble learning and statistical distribution. The approach exploits the fact that the distribution of PLS model coefficients provides a mechanism for ranking and interpreting the effects of variables, whereas the distribution of prediction errors provides a mechanism for differentiating the outliers from normal samples and assessing the applicability domain of models. The use of statistics of these distributions, namely, mean/median value and standard deviation, inherently provides a feasible way to effectively describe the information contained by the original samples. Furthermore, ensemble modeling and prediction based on several cross-predictive PLS models could effectively improve the model prediction performance and increase the model stability to a certain extent. The aqueous solubility data are used to demonstrate the ability of our proposed EnPLS method in solving various modeling tasks such as descriptor selection, outlier detection, applicability domain assessment, performance improvement, and model stability. Finally, a freely available R package implementing EnPLS was developed to facilitate its use by chemists and pharmacologists. The R package is available at https://github.com/wind22zhu/enpls1.2.
Date: 19.07.2017


Prediction of anti-HIV activity on the basis of stacked auto-encoder

The prediction of biologically active compounds plays a very important role for high-throughput screening approaches in drug discovery. Most computational models in this area concentrate on measuring structural similarities between chemical elements. Various methods exist to predict anti-HIV activity, such as artificial neural networks and support vector machines, but they generally use shallow machine learning with low accuracy and few samples. In this work, a deep learning method, the stacked auto-encoder (SAE), is proposed to predict the anti-HIV activity of a broad group of compounds for the first time. In comparative experiments with artificial neural network, support vector machine, and SAE under the same conditions, the accuracy after descriptor screening is higher than with raw descriptors, and SAE performs better than the other two methods, achieving the perfect forecast of anti-HIV activity. This is of great significance for promoting anti-HIV drug design, and can therefore reduce research and development costs and improve the efficiency of anti-HIV drug discovery.
Date: 19.07.2017


Robust variable selection based on bagging classification tree for support vector machine in metabonomic data analysis

In metabonomics, metabolic profiles of high complexity pose tremendous challenges to existing chemometric methods. Variable selection (ie, biomarker discovery) and pattern recognition (ie, classification) are two important tasks of chemometrics in metabonomics, especially biomarker discovery, which can potentially be used for disease diagnosis and pathology discovery. Typically, the informative variables are elicited from a single classifier; however, this is often unreliable in practice. To rectify this, in the current study, bagging and classification tree (CT) were combined to form a general framework (ie, BAGCT) for robustly selecting the informative variables, based on the advantages of CT in automatically carrying out variable selection as well as measuring variable importance and the properties of bagging in improving the reliability and robustness of a single model. In BAGCT, a set of parallel CT models was established based on the idea of bagging, each CT providing information such as the splitting variables and their corresponding importance values. The informative variables can be identified by inspecting the variable importance values over all CTs in BAGCT. Taking the promising properties of support vector machine (SVM) into account, we used the informative variables identified by BAGCT as the inputs of SVM, forming a new classification tool abbreviated as BAGCT-SVM. A metabonomic dataset acquired by hydrogen-1 nuclear magnetic resonance from patients with lung cancer and healthy controls was used to validate BAGCT-SVM, with CT and SVM as comparisons. Results showed that BAGCT-SVM with fewer variables can give better predictive ability than CT and SVM.
Date: 19.07.2017


The EIS-based Kohonen neural network for high strength steel coating degradation assessment

Electrochemical impedance spectroscopy (EIS) is used for a long-term and in-depth study of the failure of polymer coatings. With the assistance of neural networks, the changing states of corrosion under certain exposure conditions were investigated in greater depth by applying Kohonen intelligent learning networks. The Kohonen artificial network was trained on 4 sets of samples (sample 1# to sample 4#) with unsupervised competitive learning methods. Each sample includes up to 14 cycles of EIS data. The trained network was tested using sample 0# impedance data at 0.1 Hz. All the sample data were collected during exposure to accelerated corrosion environments, and the rate of change of impedance in each cycle was used as an input training sample. Compared with traditional classification, the Kohonen artificial network method classifies the corrosion process into 5 subprocesses, which is a refinement of the 3 typical corrosion processes. Two newly defined subprocesses of corrosion, namely, the premiddle stage and the postmiddle stage, were introduced. The EIS data and macro-morphology for both subprocesses were analyzed through accelerated experiments that considered general atmospheric environmental factors such as UV radiation, thermal shock, and salt fog. The classification results of the Kohonen artificial network are highly consistent with the predictions based on impedance magnitude at low frequency, which illustrates that Kohonen network classification is an effective method to predict the failure cycles of polymer coatings.
Date: 13.07.2017


Determining the number of pure chemical components in the mixed spectral data based on eigenvalue sequences transform

Determining the number of pure chemical components is an important step for various chemical data analysis methods such as cluster analysis, principal component analysis, and spectral unmixing. In this paper, an eigenvalue sequences transform method is proposed to improve performance in determining the number of chemical components in a spectral matrix. The proposed method first converts the spectral data cube to eigenvalue sequences by applying singular value decomposition. Then, the method transforms the normalized eigenvalue sequences into a redefined coordinate system and detects the number of chemical components by searching for the sequence position of the highest point. Because the proposed method identifies the number of chemical components geometrically, none of its steps involves time-consuming iterations, extensive calibration tables, or pseudostatistical hypotheses. This paper also evaluates the proposed method with simulated and real-world spectral data. The evaluation results show that the method has stronger robustness, better accuracy, and higher automaticity in estimating the number of chemical components compared with some calibration methods.
Date: 13.07.2017


Intelligent tools to model photocatalytic degradation of beta-naphthol by titanium dioxide nanoparticles

The feasibility of applying intelligent tools to the prediction and optimization of photocatalytic degradation of beta-naphthol using titanium dioxide (TiO2) nanoparticles was investigated in this study. Biphasic TiO2 nanoparticles were synthesized using the controlled hydrolysis of TiCl4, and their properties were studied using X-ray diffraction and transmission electron microscopy. Factors affecting photocatalytic degradation of beta-naphthol, including impurity concentration, catalyst content, acidity, and aeration rate, were monitored and controlled. The laboratory data showed that the degradation rate of beta-naphthol is a complicated nonlinear function of the monitored variables. Two models, an artificial neural network trained with particle swarm optimization (ANN-PSO) and an adaptive neuro-fuzzy inference system trained with particle swarm optimization (ANFIS-PSO), were used for prediction of this system. The results showed a significant relation between the real and predicted data for these 2 models. However, ANFIS-PSO can be applied more efficiently than ANN-PSO for prediction and optimization of the photocatalytic behavior of TiO2 nanoparticles in the degradation of beta-naphthol. As an advantage, ANFIS eliminates problems of fuzzy logic, such as the creation of membership functions, and of ANN design, such as local minima; combined with the PSO algorithm, it could be a very powerful tool for simulating various kinds of processes.
Date: 07.07.2017


Uncorrelated component analysis on manifold for statistical process monitoring

A novel process modeling approach referred to as locally invariant uncorrelated component analysis (LIUCA) is proposed for the purpose of statistical process monitoring. The contributions are as follows: (1) LIUCA intends to find a part-based representation subspace in which two data points are close to each other if they are close in the k-nearest neighbor graph; (2) LIUCA can exploit the geometrical structure of the data space, which will improve the algorithm's modeling performance in real-world applications; (3) a LIUCA-based multivariate statistical process monitoring scheme is proposed; (4) in contrast to traditional process modeling algorithms such as principal component analysis, LIUCA imposes no restriction on data distribution. In addition, both a multivariate numerical example and a hot galvanizing pickling waste liquor treatment process are used to evaluate the feasibility of the proposed process monitoring scheme. Experimental results demonstrate the effectiveness of the proposed method.
Date: 23.06.2017


Modeling of HILIC retention behavior with theoretical models and new spline interpolation technique

Because hydrophilic interaction liquid chromatography (HILIC) is a relatively young analytical method compared with other techniques, retention modeling can still bring scientifically valuable data to the field. Therefore, in this paper, olanzapine and its 8 impurities were selected as a test mixture, considering that they have never been analyzed in HILIC before. They were investigated on 4 different HILIC columns (bare silica, cyanopropyl, diol, and zwitterionic). The mixture of 9 structurally similar substances allows the examination of complex HILIC retention behavior depending on the chemical properties of the analytes, as well as of the stationary phase. To describe the nature of the relationship between the retention and the stronger eluent content in the mobile phase, we fitted experimentally obtained data to several theoretical (localized adsorption, nonlocalized partition, quadratic, and mixed) models. Results show that the best fit is the quadratic model, with the highest R2 and cross-validated coefficient of determination (Q2) values, but its usage has some drawbacks. To improve the ability to predict retention behavior in HILIC, a new empirical model was proposed. For that purpose, a spline interpolation technique was applied by dividing the experimental range into several subdivisions. This type of interpolation was performed for the first time in the chromatographic field. The polynomial equations were evaluated using Q2 values. The obtained Q2 values indicated the goodness of fit of the model, as well as its good predictive capabilities. Finally, the prediction capabilities were experimentally verified under randomly chosen conditions from the experimental range. The errors in prediction were all under 10%, which is satisfactory for HILIC.
Datum: 20.06.2017
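
The comparison of retention models by R² and Q² described in the HILIC abstract above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it covers only the two straight-line forms commonly written as log k = a + b·φ (partition) and log k = a + b·log φ (adsorption), it assumes Q² is computed by leave-one-out cross-validation, and the data are synthetic. The names `fit_line` and `r2_and_q2` are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r2_and_q2(x, y):
    """Return (R2, Q2) for a straight-line fit.

    R2 uses the residuals of the full fit. Q2 uses leave-one-out prediction
    errors (PRESS). Both are taken relative to the total sum of squares.
    """
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    a, b = fit_line(x, y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a_i + b_i * x[i])) ** 2
    return 1 - ss_res / ss_tot, 1 - press / ss_tot


# Synthetic data (assumption): fraction of the stronger eluent and log10 of
# the retention factor, generated from an adsorption-type relationship.
phi = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
log_k = [1.00 - 1.2 * math.log10(p / 0.05) for p in phi]

partition = r2_and_q2(phi, log_k)                            # log k vs. phi
adsorption = r2_and_q2([math.log10(p) for p in phi], log_k)  # log k vs. log phi
```

Because the synthetic data follow the adsorption form exactly, that model reaches R² and Q² of essentially 1, while the partition form scores lower on both. Q² never exceeds R² for a given model, since each held-out point is predicted without its own influence on the fit.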


Resolution enhancement of overlapped peaks of ion mobility spectrometry based on stationary wavelet transform

Insufficient resolving power is a common problem in quantitative determination by ion mobility spectrometry (IMS). This paper proposes a new application of the stationary wavelet transform (SWT) to resolve overlapped peaks in IMS. First, the overlapped product ion peaks are denoised by wavelet transform; second, the overlapping peaks are resolved by examining the SWT coefficients. One- and two-component overlapping ion peaks of IMS are processed with the developed SWT method, which yields a significant improvement in resolution.
Date: 20.06.2017
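
To make the term concrete, the sketch below computes a stationary (undecimated) wavelet transform with the Haar wavelet. The abstract above does not state which wavelet, decomposition depth, or boundary handling the authors used, so the Haar filter, the periodic boundary, the function name `haar_swt`, and the synthetic two-peak spectrum are all assumptions. How peaks are then resolved from the coefficients is not described in the abstract and is not attempted here.

```python
import math


def haar_swt(signal, levels):
    """Undecimated (stationary) Haar transform via the 'a trous' scheme.

    Returns a list of (approximation, detail) pairs, one per level. Every
    coefficient list has the same length as the input because nothing is
    downsampled; the filter taps are spread by 2**level instead.
    Assumption: periodic boundary handling.
    """
    n = len(signal)
    approx = list(signal)
    coefficients = []
    for level in range(levels):
        step = 2 ** level
        next_approx = [
            (approx[i] + approx[(i + step) % n]) / math.sqrt(2) for i in range(n)
        ]
        detail = [
            (approx[i] - approx[(i + step) % n]) / math.sqrt(2) for i in range(n)
        ]
        coefficients.append((next_approx, detail))
        approx = next_approx
    return coefficients


def gaussian(t, center, width):
    """Unit-height Gaussian peak evaluated at position t."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)


# Hypothetical drift-time axis with two overlapping ion peaks.
spectrum = [gaussian(t, 28, 4) + 0.8 * gaussian(t, 36, 4) for t in range(64)]
coeffs = haar_swt(spectrum, levels=3)
```

Unlike the decimated discrete wavelet transform, every level keeps one coefficient per sample, so features in the coefficients stay aligned with positions on the drift-time axis.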


Weighted time series fault diagnosis based on a stacked sparse autoencoder

Most statistical analysis technologies use detection thresholds for fault diagnosis, which often cannot effectively characterize some specific faults in a statistical manner. However, the details and small changes in the faults can be exploited by deep learning-based feature representation. In this paper, we present a weighted time series fault diagnosis method to learn the deep correlations of faults and reduce the loss of fault information. Our model has 2 key novel properties: (1) it can learn high-level abstract features of faults and the underlying fault patterns, which is particularly efficient for detecting incipient faults; (2) it provides a mathematical framework for stacked sparse autoencoder-based fault diagnosis, with capabilities of multiple nonlinear mapping and complex function approximation. The monitoring performance was compared with multivariate statistical methods and conventional artificial intelligence methods on the Tennessee Eastman process data set, which is a well-known chemical industrial benchmark. The experimental results showed a performance gain over existing methods, especially for incipient faults that are difficult to detect with traditional technologies.
Date: 20.06.2017


Quantitative analysis based on spectral shape deformation: A review of the theory and its applications

Most commonly used calibration methods in quantitative spectroscopic analysis are based on or derived from the assumption of a linear relationship between the concentrations of the analytes of interest and the corresponding absolute spectral intensities. They are not applicable to heterogeneous samples, where potential uncontrolled variations in optical path length caused by changes in the samples' physical properties undermine this basic assumption. About a decade ago, a unique calibration strategy was proposed to extract chemical information from spectral data contaminated with multiplicative light scattering effects. Since then, this calibration strategy has been carefully examined, modified, and used by its developers. After more than 10 years of development, some important features of the calibration strategy have been identified. It has been shown that the calibration strategy can solve many complex problems in quantitative spectroscopic analysis. However, because of the relatively low awareness of the calibration strategy within the chemometrics community, its potential has not yet been fully exploited. This paper reviews the theory of the calibration strategy and its applications, with a view to introducing this unique and powerful calibration strategy to a wider audience.
Date: 15.06.2017


Design matrices and modelling


Date: 02.06.2017


Formulating an experimental design mathematically


Date: 02.06.2017


The O-PLS methodology for orthogonal signal correction—is it correcting or confusing?

The separation of predictive and nonpredictive (or orthogonal) information in linear regression problems is considered to be an important issue in chemometrics. Approaches including net analyte preprocessing methods and various orthogonal signal correction (OSC) methods have been studied in a considerable number of publications. In the present paper, we focus on the simplest single response versions of some of the early OSC approaches, including Fearn's OSC, the orthogonal projections to latent structures (O-PLS), target projection (TP), and projections to latent structures (PLS) postprocessing by similarity transformation. These methods are claimed to yield improved model building and interpretation alternatives compared with ordinary PLS, by filtering “off” the response-orthogonal parts of the samples in a dataset. We point out some fundamental misconceptions that were made in the justification of the PLS-related OSC algorithms and explain the key properties of the resulting modelling.
Date: 11.04.2017
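
For readers unfamiliar with what filtering off a response-orthogonal part means, the sketch below removes one orthogonal component in the single-response O-PLS style as that step is commonly presented. The abstract above does not give the algorithm, so this formulation, the helper names (`dot`, `mat_vec`, `mat_t_vec`, `opls_filter`), and the small centered data set are assumptions for illustration only. The sketch takes no position on the paper's critique.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vec):
    """Rows of `matrix` dotted with `vec` (X @ v)."""
    return [dot(row, vec) for row in matrix]


def mat_t_vec(matrix, vec):
    """Columns of `matrix` dotted with `vec` (X.T @ v)."""
    return [dot(col, vec) for col in zip(*matrix)]


def opls_filter(X, y):
    """Remove one y-orthogonal component from column-centered X.

    Returns (X_filtered, w, w_orth, t_orth).
    """
    w = mat_t_vec(X, y)
    norm = math.sqrt(dot(w, w))
    w = [wi / norm for wi in w]                    # unit-length PLS weight
    t = mat_vec(X, w)                              # predictive scores
    tt = dot(t, t)
    p = [pi / tt for pi in mat_t_vec(X, t)]        # predictive loadings
    proj = dot(w, p)
    w_orth = [pi - proj * wi for pi, wi in zip(p, w)]  # part of p orthogonal to w
    norm = math.sqrt(dot(w_orth, w_orth))
    w_orth = [wi / norm for wi in w_orth]
    t_orth = mat_vec(X, w_orth)                    # orthogonal scores
    tt_orth = dot(t_orth, t_orth)
    p_orth = [pi / tt_orth for pi in mat_t_vec(X, t_orth)]
    # Subtract the rank-one orthogonal component t_orth * p_orth^T from X.
    X_filtered = [
        [xij - ti * pj for xij, pj in zip(row, p_orth)]
        for row, ti in zip(X, t_orth)
    ]
    return X_filtered, w, w_orth, t_orth


# Hypothetical column-centered data: 4 samples, 3 variables, centered response.
X = [[1.0, 2.0, 0.5], [-1.0, 0.0, 1.5], [2.0, -1.0, -1.0], [-2.0, -1.0, -1.0]]
y = [1.0, -0.5, 1.5, -2.0]
X_filtered, w, w_orth, t_orth = opls_filter(X, y)
```

In this construction the removed scores `t_orth` are orthogonal to `y`, so the filtered matrix has the same covariance with the response as the original matrix. That property is a useful reference point when reading the paper's discussion of what the filtering does and does not change.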






Information about this site:

Last update: 08.02.2016

The author- or copyrights of the listed Internet pages are held by the respective authors or site operators, who are also responsible for the content of the presentations.

(C) 1996 - 2017 Internetchemistry