Introduction

Since the seminal work by Samuel1, who coined the term Machine Learning (ML), this field has impacted virtually all areas of science, including engineering2,3, the exact and social sciences4,5,6, and even art7. The ML process can be described in the following way: let \({\mathbb{C}}\) be the set of possible states for the observed system. Let O(X) be a set of observations, each described by a set of features X and associated a priori with a specific state \(\mathcalligra{c} \in {\mathbb{C}}\); i.e., their states are known. O(X) is then used to train a machine and to establish a set of rules for associating an observation O(X) with a system state \(\mathcalligra{c}\). When these different system states are viewed as different classes, this set of rules is referred to as a classifier. Using this classifier, the machine can classify new observations according to their feature values8. The classification accuracy depends on the nature of the training set, since it must adequately account for diversity in the test data. If X is a vector of the physical properties of an object that were measured experimentally, \(O\left( X \right)\) must be chosen in such a way that the measurement noise of X is represented adequately. Take, for example, a situation in which each feature measurement is associated with additive random noise, such that \(x \in X\) is of the form \(x = \bar{x} + RN_{x}\), where \(\bar{x}\) is the noise-free value and \(RN_{x}\) is the typical random noise of feature x. Hence, training should use observations that have similar noise characteristics, and the resulting classifier applies to observations in the same domain; i.e., observations that have these same noise characteristics9,10. Let us now consider the same problem as described above, but with multiplicative rather than additive noise; i.e., \(x = \bar{x} \cdot RN_{x}\). If the classifier that was trained on observations with an additive noise model is used as is to classify observations with multiplicative noise, the classification accuracy is likely to decrease. In this case, a new set of classified observations needs to be obtained and a new classifier trained, or at least attempts should be made to transfer the existing classifier to the new domain using a smaller set of labeled observations in the new domain11,12.
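A minimal sketch of this domain-shift effect, assuming a toy two-class problem with synthetic noise (the class means, noise levels, and use of scikit-learn are illustrative choices, not part of the original study):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n, d = 2000, 2
means = np.array([1.0, 3.0])                       # per-class noise-free feature value
y = rng.integers(0, 2, n)
x_true = np.tile(means[y][:, None], (1, d))

# Training domain: additive noise, x = x_true + RN_x.
clf = RandomForestClassifier(n_estimators=12, random_state=0)
clf.fit(x_true + rng.normal(0.0, 0.5, (n, d)), y)

# Test on the same noise model vs. a shifted one (multiplicative noise).
x_add = x_true + rng.normal(0.0, 0.5, (n, d))
x_mul = x_true * rng.normal(1.0, 0.5, (n, d))
print("additive-noise accuracy:      ", clf.score(x_add, y))
print("multiplicative-noise accuracy:", clf.score(x_mul, y))   # typically lower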

Despite these shortcomings, ML continues to have enormous appeal since a machine excels where humans struggle. Machines can handle vast amounts of high-dimensional data. In fact, ML improves with growth in the size of the database. Computer hardware and algorithms are constantly evolving and can handle large amounts of data and complicated problems13. On the other hand, humans can adapt to new situations, retrieve relevant knowledge, and outperform machines if the data and dimensionality are relatively small14,15,16.

The lack of large datasets limits the applicability of ML and Deep Learning (DL) and has driven several studies that tried to overcome this shortcoming. Raissi et al.17 introduced physics-informed neural networks (PINNs), in which physical laws are used to restrict the possible solutions during the training stage of a neural network (NN). Introducing such a regularization mechanism results in a robust NN with a relatively small training set. Manome et al.18 described a method to automatically adjust the learning rate by incorporating human cognitive biases into the training process. Such biases mimic the human use of causal relationships between events in the learning process. Adding these biases results in a more accurate classification even with a relatively small dataset and eliminates the need for parameter tuning. Fong et al.19 described a paradigm for neurally-weighted machine learning, which took functional magnetic resonance imaging (fMRI) measurements of human brain activity from subjects viewing images and used these data as part of the training process of an object recognition algorithm to benefit from the capabilities of the human brain. After training, image classification did not require fMRI input. This approach improved the classification accuracy by 10–30% when using traditional machine vision features and by 3–5% when using convolutional neural network features. While these human-inspired ML methodologies expedite the training phase, allow for smaller training sets, and often improve accuracy, they do not provide better generalization. To cope with this, Lake et al.15 noted that humans can generalize successfully from a small set of examples, whereas machine learning algorithms require considerably more examples. They developed a computational model that successfully mimics these capabilities using Bayesian program learning (BPL) to learn a large class of visual concepts from just a single example.

The success of artificial intelligence in so many areas of our lives leads to the question of whether machines can be endowed with new human-like capabilities. Such machines will need to go beyond current engineering trends in their learning techniques and capabilities; namely, they will also have to produce new knowledge and models, and display superior generalization and adaptation capabilities16. A recent example of this type of machine was presented by Tenenbaum et al., who designed a machine that guides a marble through a circular maze14. The machine was trained using a physics engine that mimics the real system. The residuals between the actual observations and the physical simulations were corrected using a statistical model (Gaussian process regression). Then, the movement of the marble in the maze was controlled using model-predictive feedback based on the combination of the physics engine and the statistical model.

This type of combination goes beyond the normal machine learning process and may be considered in human terms as machine education. According to Skinner, humans can forget facts, skills, and knowledge. Nevertheless, education enables people to recover these capabilities when facing a challenge. Skinner noted: "…education is what survives when what has been learned has been forgotten…"11. This suggests that human education goes beyond structured learning, since it provides the seeds to regrow knowledge and capabilities when called upon to cope with a problem. These seeds exist in the form of small amounts of information invariant to the problem domain.

Machine education is defined here as the use of domain-invariant physical information and models to develop a machine that can solve problems governed by these physical models, in any domain to which this physical information applies.

The educated machine described here uses a physical model of nonlinear reflectance-spectra mixing, together with target-material attributes, to compute a training set from non-labeled data. Since the physical model and the target-material attributes are invariant to the problem's domain, the method overcomes a fundamental problem of supervised learning: obtaining a training set that adequately represents the problem at hand. This approach is inspired by the human education process, which is based on acquiring a small set of tools to deal with problems from different domains. The machine's ability to acquire human characteristics, such as the flexibility to operate in different domains through generalization and abstract thinking, is studied.

Data and methods

Data

Target material identification in hyperspectral imaging

Hyperspectral imaging (HSI) is used in numerous applications, such as geophysical mapping20,21,22,23, cultural heritage material analysis24, process control25,26,27, and many others. The resulting image is a three-dimensional data cube consisting of the spatial axes (X, Y) and the spectral information (λ). Computers are vital in HSI data analysis28,29,30; target materials are identified by comparing the reflected light's spectral signature to a reference spectrum. In many cases, these algorithms are effective even when the target material only occupies a small portion of the pixel, resulting in a linear mixing of the target and background materials' spectral signatures21. Another possible scenario is nonlinear mixing, which occurs when the photons are subjected to multipath effects. Nonlinear mixing results in a reflectance spectrum that is a product of the background material's spectral signature and that of the target material31. This nonlinear situation is described in the following way. Let us assume that a portion of the scene contains a single target material \(\mathcalligra{m}\) and a background material \(\mathcalligra{b}\). Let \(R_{\mathcalligra{m}} \left( \lambda \right)\) be the reflectance spectrum of \(\mathcalligra{m}\), and \(R_{\mathcalligra{b}} \left( \lambda \right)\) be the reflectance spectrum of the background. This portion of the scene is sampled by pixel i. Let \(\alpha_{i}\) be the abundance of \(\mathcalligra{m}\) in this portion of the scene, \(I_{i}^{0} \left( \lambda \right)\) the incident radiation intensity illuminating pixel i, and \(I_{i} \left( \lambda \right)\) the reflected light of pixel i. In this case \(I_{i} \left( \lambda \right)\) combines the contributions of \(\mathcalligra{m}\) and \(\mathcalligra{b}\) through element-wise products, denoted by \(\odot\):

$$I_{i} \left( \lambda \right) = I_{i}^{0} \left( \lambda \right) \odot \left( {R_{\mathcalligra{b}} \left( \lambda \right) \odot \alpha_{i} \cdot R_{\mathcalligra{m}} \left( \lambda \right) + \left( {1 - \alpha_{i} } \right) \cdot R_{\mathcalligra{b}} \left( \lambda \right)} \right)$$
(1)

Figure 1 illustrates this type of nonlinear mixing situation compared to a clean pixel for the simple case in which \(\alpha_{i} = 1.\) Thus, Eq. (1) may be simplified to:

$$I_{i} \left( \lambda \right) = I_{i}^{0} \left( \lambda \right) \odot R_{\mathcalligra{b}} \left( \lambda \right) \odot R_{\mathcalligra{m}} \left( \lambda \right)$$
(2)
Figure 1

Left: An illustration of the nonlinear mixing between the reflectance spectrum from a thin (sub-millimeter) layer of the target material (red rectangle) and the spectrum arising from the background material (blue rectangle) placed behind the target material. Right: same without the target material layer.
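As a concrete illustration of Eqs. (1)–(2), the following sketch evaluates the mixing model on synthetic spectra (the wavelength grid, illumination, and reflectance curves are invented for illustration only):

import numpy as np

wavelengths = np.linspace(1000, 2500, 288)        # nm, matching the SWIR range used below
I0 = np.ones_like(wavelengths)                    # incident illumination (flat, for simplicity)
R_b = 0.5 + 0.1 * np.sin(wavelengths / 200.0)     # synthetic background reflectance
R_m = 0.4 + 0.2 * np.cos(wavelengths / 150.0)     # synthetic target reflectance
alpha = 0.7                                       # abundance of the target in the pixel

# Eq. (1): part of the light interacts with both target and background
# (element-wise product, the nonlinear term), the rest only with the background.
I_mixed = I0 * (R_b * (alpha * R_m) + (1 - alpha) * R_b)

# Eq. (2): the special case alpha = 1 (the thin target layer covers the pixel).
I_full = I0 * R_b * R_m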

An example of the nonlinear mixing effect on the reflectance spectra of thin layers of organic materials deposited on environmental surfaces is provided in part 1 of the supplementary section. Although nonlinear mixing is a well-known phenomenon, it has received far less attention than linear mixing, for which many unmixing algorithms have been developed32. Chen et al. developed a method for HSI nonlinear unmixing that implements kernel-based learning theory: the end-member components at each band are mapped implicitly into a high-dimensional feature space to address the photons' nonlinear interaction. Halimi et al.33 suggested a bilinear model combined with a hierarchical Bayesian algorithm for unmixing hyperspectral images. This model was successfully applied to real data acquired over the Cuprite mining site (Nevada, USA) in 1997 by the airborne visible-infrared imaging spectrometer (AVIRIS). Dobigeon et al. compared several algorithms using computer-simulated data and real images of vegetated areas34. They found that the Polynomial Post Nonlinear Mixing model (PPNM)35 outperformed other standard models. Kendler et al. developed an algorithm capable of automatically resolving nonlinear mixing situations between a thin (sub-millimeter) layer of organic material and various background materials common in the environment36,37. The only input to this algorithm was \({\text{R}}_{\mathcalligra{m}} \left( \lambda \right)\), which is invariant to the measurement conditions and therefore was measured in advance in the lab and served as the reference spectrum.

Given \(R_{\mathcalligra{b}} \left( \lambda \right)\) and Eq. (2), the reflectance spectrum of an unknown material x in pixel i, \(R_{x, i} \left( \lambda \right)\), can be expressed as follows:

$$R_{x, i} \left( \lambda \right) = I_{i} \left( \lambda \right) \odot \left( {I_{i}^{0} \left( \lambda \right) \odot R_{\mathcalligra{b}} \left( \lambda \right)} \right)^{ - 1}$$
(3)

However, \(R_{\mathcalligra{b}} \left( \lambda \right)\) is unknown and has to be estimated as accurately as possible from the scene. Furthermore, the illumination cannot be assumed to be uniform across the scene. Hence, the key issue in the algorithms described in references36,37 is to locate, without prior knowledge, a pixel k in the scene that satisfies Eq. (4):

$$R_{{\mathcalligra{b}_{k} }} \left( \lambda \right) = R_{{\mathcalligra{b}_{i} }} \left( \lambda \right)\quad {\text{and}}\quad I_{k}^{0} \left( \lambda \right) = I_{i}^{0} \left( \lambda \right)$$
(4)

To find k, three exhaustive search methods were described, resulting in a detection rate of up to 90% and a false alarm rate of less than 1% for three materials from a distance of 30 m. They also reported that the likelihood of satisfying Eq. (4) increased as the distance between pixels i and k decreased36.
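Under Eq. (4), the product \(I_{i}^{0} \left( \lambda \right) \odot R_{\mathcalligra{b}} \left( \lambda \right)\) in Eq. (3) can be replaced by the measured intensity of the clean pixel k, so the unknown material's spectrum reduces to a ratio of two measured spectra. A minimal sketch, assuming the clean pixel has already been located (function and variable names are illustrative):

import numpy as np

def residual_spectrum(I_pixel, I_clean, eps=1e-12):
    # Eq. (3): the measured intensity of a clean pixel k satisfying Eq. (4)
    # stands in for I0 * R_b; element-wise division recovers R_x.
    return I_pixel / (I_clean + eps)

def spectral_angle(r1, r2):
    # One common way to compare the recovered spectrum with a lab
    # reference R_m (seed); small angles indicate a likely match.
    cos = np.dot(r1, r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    return np.arccos(np.clip(cos, -1.0, 1.0))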

This article presents a new, effective algorithm to resolve a nonlinear mixing situation and identify target materials. This algorithm is inspired by the human education process, in which \({\text{R}}_{\mathcalligra{m}} \left( \lambda \right)\) are the seeds {Se} for learning and Eqs. (1)–(3) are the physical model \(PMo\). As in the education process, {Se} and PMo are used to acquire knowledge about various scenes to generate a training set for a classifier (in this example, a Random Forest classifier, RF)38, which is subsequently used to classify a new set of observations. The effects of the parameters implemented to derive the training set (education) on the classification performance are discussed.

Experimental setup

The measurement was based on a set of test targets prepared in a controlled environment36. For the sake of completeness, a short description of the HSI and the test target containing the three model materials (sugar, silicone oil, and polystyrene) is provided. This set contained three model target materials at three concentrations on three different surfaces (ceramic tile, plywood, and cardboard). The target material spots measured 10 × 10 cm and contained 0.5, 1, and 1.5 g of the pure target material. The three materials were prepared as follows: (1) sugar was dissolved in hot water, resulting in a viscous (~ 50%) solution; a known volume of this solution was placed on the surface, and after a few hours the water evaporated, leaving a known amount of dry sugar film on the surface; (2) polystyrene was similarly deposited from a methyl ethyl ketone solution; and (3) silicone oil (polydimethylsiloxane, PDMS) was placed directly on the surface. The resulting film thicknesses were approximately 50, 100, and 150 µm. There were 33 target spots in total, as shown in Fig. 2.

Figure 2

The model set of test targets. Green numbers indicate silicone oil targets; sugar targets are blue, and polystyrene targets are red. Targets 1, 4, 13, 14, 15, 25, 26, 27, 28 and 31 contained 0.5 g of the target materials; targets 2, 5, 10, 11, 12, 22, 23, 24, 29 and 32 contained 1 g of the target materials. The remainder of the samples contained 1.5 g of the various target materials.

The reference reflectance spectra of the target materials were measured in advance using a non-imaging spectrometer (FieldSpec4TM from ASD) with a fiber-optic probe. The spectral range covered the VIS-SWIR (350–2500 nm); the sampling interval/resolution was 1.4 nm/3 nm in the VIS and 1.1 nm/10 nm in the SWIR. Since the HSI operates at 1000–2500 nm, the data in the visible range were omitted.

HSI measurements were obtained using a line-scanner type HSI (SWIR-CL-400-N25E from Specim, Finland). The spectral range was 1000–2500 nm, and the spectral sampling interval/resolution was 5.6 nm/12 nm, using the sun as the light source. The HSI was mounted on a rotating stage and equipped with a 56 mm, F/2 lens with a 9.6° field of view. Exposure time varied from 10 to 25 ms, at a frame rate of 30 fps. Before each measurement, the HSI measured the dark current signal. Both radiometric calibration and dark current subtraction were performed automatically using the supplied control software and calibration parameters from Specim. The HSI is equipped with a Mercury-Cadmium-Telluride (MCT) focal plane array sensor with 384 by 288 pixels. Light enters the sensor through a 30 µm slit and is dispersed by the spectrometer to 288 wavelengths (channels), creating a single stripe of the scene. The data cube was constructed step by step by rotating the system n times during the measurement to scan n stripes, creating a data cube of \({\Omega } = m \times n \times l\) elements. For the sensor used in this study, m = 384 and l = 288; the number of samples was set to n = 2859, resulting in \(\sim 1.1 \times 10^{6}\) spatial pixels. Figure 3 schematically depicts the HSI data cube. The cube was cropped down to P = 18,266 pixels and labeled according to its content. The distance between the sensor and the target was ~ 10 m. The analysis used a collection of cubes that needed to be perfectly aligned. For simplicity, the targets and the imager position were kept constant during the measurement of this cube collection. As the sun (the light source for these measurements) moves across the sky, \(I_{0} \left( \lambda \right)\) is not constant; additionally, the exposure time is not fixed during the measurements. As shown below, although the same set of targets and measuring device were used, these changes pose a significant challenge to a classical ML model since it lacks the ability to generalize.

Figure 3

An illustration of the HSI output data. On the left are images of a scene sampled by recording the reflected light at different wavelengths, resulting in a data cube. One can sample a single pixel from this data cube and obtain the spectrum of the light reflected from that pixel, as shown on the right-hand side.

Educating the machine

Let us now define the education process using the notation above. To this end, let \(\left\{ {Se} \right\}\) be a set of seeds and \(PMo\) a physical model. Given a specific real-life situation with a set of real-life unclassified HSI observations \(O\left( X \right)\) that behave according to PMo, the machine generates a set of labeled observations using \(\left\{ {Se} \right\}\) and \(PMo\). Using this set of labeled observations, a classifier is computed and used to classify observations. These seeds and physical models should apply to any domain; in other words, they are general tools that enable learning for a specific mission. This process is suggested as an analogy to human education, which provides tools that enable humans to keep acquiring knowledge in a changing environment.

The analysis was based on prior knowledge of the reflectance spectrum of a pure target material x, \({\text{R}}_{{{\text{ref}},{\text{ x}}}}\), which serves as the seeds for education, \(\left\{ {Se} \right\}\), and on several measurements of the scene (the cube collection). In this work, the cube collection contained 41 cubes. Each cube contained 18,266 pixels that could contain a target material; note that at this stage, it was assumed that each pixel could contain only a single target material and a background material, whereas the entire scene could contain several target materials simultaneously (three in this work). For a given cube, {S} is the set of pixels with no target material, and {F} is the set of pixels that may contain one of the target materials. The algorithm assumes that although the cube may contain no contaminated pixels, i.e., \(\left\{ F \right\}\) may be an empty set, it always contains a set of clean pixels; thus, \(\left\{ S \right\}\) is never an empty set.

For a specific data cube, a fraction, p, of the clean pixels were randomly selected (in this case, p ranged from 0.01 to 0.6). \(\left\{ {Se} \right\}\) were superimposed on the \(p \cdot \left| S \right|\) clean pixels using Eq. (2), resulting in a synthetic set composed of the nonlinear mixture of the target materials' spectral signatures and clean pixels containing various background materials. The set contained |S| observations for each target material and |S| observations of clean pixels. Since the clean pixels were taken from the raw data, the resulting set of labeled observations accounted for the variation in illumination intensities and other noise sources.
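The following sketch shows how such a labeled set can be assembled from clean pixels and the seed spectra via Eq. (2); the function name and data layout are assumptions for illustration:

import numpy as np

def build_training_set(clean_pixels, seeds, p, rng):
    # clean_pixels: (|S|, L) measured spectra of pixels with no target material
    # seeds: dict mapping material name -> (L,) reference reflectance spectrum {Se}
    # p: fraction of clean pixels on which the seeds are superimposed
    n = max(1, int(p * len(clean_pixels)))
    chosen = clean_pixels[rng.choice(len(clean_pixels), size=n, replace=False)]

    X_parts, y = [chosen], ["clean"] * n
    for name, R_m in seeds.items():
        X_parts.append(chosen * R_m)   # Eq. (2): element-wise (nonlinear) mixing
        y += [name] * n
    return np.vstack(X_parts), np.array(y)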

This set of labeled observations, obtained in a process inspired by human education, was used for the most critical ML stage, namely, obtaining a training set, in this case for a Random Forest classifier. The random forest consisted of 12 trees, since the out-of-bag classification error37 was low and stable for this value. The training set size and the impact of the physical model's quality on the quality of the educational process and the overall performance were studied. The remaining \(\left[ {\left| F \right| + \left( {1 - p} \right) \cdot \left| S \right|} \right]\) pixels in the cube were classified using the computed classifier. Since the data were noisy in this case36, additional steps were taken to improve accuracy.

To this end, the algorithm was tuned to minimize false-positive identification; a pixel was therefore labeled as background if its maximum combined score was lower than a predetermined threshold. This thresholding rejected some positive (true and false) identifications of the target material. The training/classification procedure was repeated iteratively, which accounted for the between-pixel variability and ensured that the results were not affected by a random choice of the subset of {S} used to generate the training set. The number of iterations ranged from 1 to 30. After completing the desired number of iterations, a combined score was computed for each pixel. Several metrics for computing the combined score for the presence of target material were tested, including the arithmetic mean, median, geometric mean, and the most probable score. Using the arithmetic mean resulted in slightly improved performance; hence, it was used for this study. The effect of p on accuracy was also estimated. Lower p values reduce computation time but may not adequately represent the variability between observations, thus reducing the algorithm's ability to learn proper classification patterns.

Once all the cubes of a specific scene had been classified, each pixel's final label was determined as the most probable label across the entire cube collection. It will be shown that voting between results obtained under different conditions increases the classification reliability.
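A sketch of this per-pixel voting step (the array layout is an assumption):

import numpy as np

def vote(labels_per_cube):
    # labels_per_cube: (n_cubes, n_pixels) array of per-cube labels;
    # the final label of each pixel is the most frequent one.
    final = np.empty(labels_per_cube.shape[1], dtype=labels_per_cube.dtype)
    for j in range(labels_per_cube.shape[1]):
        values, counts = np.unique(labels_per_cube[:, j], return_counts=True)
        final[j] = values[np.argmax(counts)]
    return final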

Choosing the appropriate values of p and the number of iterations with and without voting is an essential step in machine education since these parameters affect the quality of the training set. Subsections 2 and 3 in the supplementary material provide more details on the effects of these parameters on the overall performance.

The method described here is aimed at computing a training set; hence, it might be compatible with other classifiers that undergo supervised training. In the case presented here, the Random Forest model was chosen since it was found effective for datasets of reflectance spectra39. Additional study is required to fine-tune the method for different classifiers and datasets.

The main procedures of machine education are described in the following pseudocode:

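A Python-style sketch of these procedures, reconstructed from the description above (the function names, the parameter defaults, and the availability of the clean-pixel index set {S} are assumptions; build_training_set and vote are the sketches given earlier):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def educate_and_classify(cube, clean_idx, seeds, p=0.15, n_iter=7,
                         threshold=0.4, random_state=0):
    # cube: (P, L) pixel spectra of one data cube; clean_idx: indices of {S}
    # seeds: {material: (L,) reference spectrum}, the seeds {Se}
    rng = np.random.default_rng(random_state)
    scores = []
    for _ in range(n_iter):
        # Education: synthesize a labeled training set from clean pixels (Eq. 2).
        X, y = build_training_set(cube[clean_idx], seeds, p, rng)
        rf = RandomForestClassifier(n_estimators=12, random_state=random_state)
        rf.fit(X, y)
        scores.append(rf.predict_proba(cube))
    combined = np.mean(scores, axis=0)                  # arithmetic-mean combined score
    labels = rf.classes_[np.argmax(combined, axis=1)]
    labels[combined.max(axis=1) < threshold] = "clean"  # reject low-confidence pixels
    return labels

# Repeating this for every cube in the collection and voting on the
# per-pixel labels (see the voting sketch above) yields the final result.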

Classifier performance evaluation

The performance of the classifier was evaluated in two stages. In the first stage, the combined-score threshold for the training–testing iterations on a single cube was tuned to detect at least 90% of the targets for p = 0.15 with seven iterations. The resulting threshold (0.4) was used for all other computations. In the second stage, the classification quality was evaluated for various parameters, including the number of iterations (NI), p, and the number of cubes used for voting. These parameters affect the quality of the learning process and hence are informative as to the quality of the education.

Note that target identification was considered successful when at least one pixel in the target had been correctly identified. Conversely, even a single misclassified pixel was considered a classification error.

A standard ML procedure was also carried out by labeling each pixel in a specific cube. Then, a training set containing 70% of the pixels was randomly chosen, and the remaining 30% were used for testing. The use of the same cube for training and testing is denoted here as 'same cube classification' (SCC). The classifier was also challenged with data from other cubes, denoted here as 'different cube classification' (DCC). For a cube collection with n cubes, there were \(\frac{{\left( {n - 1} \right)n}}{2}\) possible DCC results and n SCC results. Hence, this technique's performance (accuracy and generalizability) was evaluated using the distribution of the classification results.
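A minimal sketch of these two baseline protocols, assuming fully labeled cubes (the Random Forest settings mirror those above; all names are illustrative):

from itertools import combinations
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def scc_score(cube, labels):
    # Same-cube classification: 70/30 split within a single labeled cube.
    Xtr, Xte, ytr, yte = train_test_split(cube, labels, train_size=0.7,
                                          random_state=0)
    rf = RandomForestClassifier(n_estimators=12, random_state=0).fit(Xtr, ytr)
    return rf.score(Xte, yte)

def dcc_scores(cubes, labels):
    # Different-cube classification: train on cube i, test on cube j,
    # over all (n-1)n/2 unordered pairs.
    out = []
    for i, j in combinations(range(len(cubes)), 2):
        rf = RandomForestClassifier(n_estimators=12, random_state=0)
        rf.fit(cubes[i], labels[i])
        out.append(rf.score(cubes[j], labels[j]))
    return out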

Results

One of the educated machine's main advantages is that it does not require a labeled dataset, which is often the bottleneck in machine learning applications. This feature is beneficial for classifying data cubes from different domains, where domain differences result from variations in illumination intensity and data acquisition parameters. Since such variations are likely to occur in outdoor measurements, an educated machine is particularly suited for such applications. Furthermore, the flexibility to move between domains makes it possible to use voting between different cubes, further enhancing classification accuracy. Figure 4 shows that voting improves accuracy. A detection probability of 97 ± 1% with 30–70 (0.2–0.4%) misclassified pixels was obtained with 7 iterations, p = 0.1, and 5 data cubes. Table 1 shows the classification performance for the three materials with and without applying a voting mechanism. It shows that voting on data obtained under different conditions improves accuracy, as the number of wrongly identified pixels (NWIP) is reduced. It also points to certain differences in classification accuracy for different materials, caused by variability in the signal-to-noise ratio of the reflectance spectra (see Fig. S1 in the supplementary material).

Figure 4

The average NWIP as a function of cube-collection size (p = 0.15) for various Numbers of Iterations (NI). Error bars are the standard deviation of each data point. More information on parametrization is provided in the supplementary material.

Table 1 The educated machine algorithm's performance (run parameters: 7 iterations, p = 0.15, voting between 5 data cubes).

For comparison, Fig. 5 shows the classification results (NWIP distribution for SCC and DCC) obtained using conventional machine learning techniques (see “Classifier performance evaluation” section). There was a significant difference between the SCC and DCC. For the SCC, a target detection rate exceeding 90% was obtained for 84% of the data cubes. In all the other cases, the target detection probability exceeded 85%. The NWIP for the SCC approach was 10–60. The DCC approach resulted in significantly inferior results. A target detection rate above 90% was only achieved in 49% of the cases and dropped as low as 65–80% in other cases. A more dramatic effect was obtained in this case for the NWIP, which was two to three orders of magnitude higher than in the SCC approach. Table 2 compares the results obtained using the educated machine, operating at optimal parameters, and the classical machine learning using the SCC and the DCC methods.

Figure 5

Top: target detection rate distribution using the SCC and DCC approaches in conventional ML. Bottom: the same for NWIP, using a logarithmic scale.

Table 2 A comparison between the classical and educated machine.

This comparison shows that the classical machine, using the SCC approach, performs similarly to the educated machine. The probability of detection with the educated machine is slightly better than with the classical machine, and the difference in the NWIP values is within the error range. However, the conventional machine failed to generalize. This failure was evident in the DCC approach, which resulted in an average probability of detection of 90 ± 5% and an average misclassification of 3400 ± 1800 pixels. It should be noted that the SCC approach is impractical, since a classical machine would have to memorize the data prior to operation; hence, the DCC must be considered the more realistic approach. The education process created a machine with superior flexibility, which is crucial in realistic situations. This flexibility results from combining a nonlinear mixing physical model with the pure target materials' spectra, which are invariant to the scene's characteristics. This combination is analogous to human education, enabling adaptation to new situations without compromising performance.

Conclusions

This work presents a new approach for increasing machines' flexibility using human-inspired training in which the machine is educated instead of being trained to memorize data. This mechanized education process is analogous to the human education process as it results in high adaptivity by obtaining relevant knowledge using a physical model and a small amount of information (seeds) invariant to the problem domain. When confronted with a new problem, the machine used these seeds and the physical model to generate the knowledge needed to create a classifier. This process was illustrated here for chemical identification using HSI utilizing the RF classification model, which is simple and effective for classifying data resulting from reflectance measurements39. Testing the applicability of the human-inspired method to other problems using different classifiers is part of our ongoing study.

The educated machine used a physical model for the nonlinear mixing between the uninformative background and the reference spectral signatures (seeds) to compute a classifier that detects the target materials in a realistic scenario where the signals are nonlinearly mixed. The physical model and the seeds were obtained in advance, prior to the operation of the machine. Such a process is analogous to human education in that both promote learning by providing tools rather than by memorizing data. Educational quality was evaluated by tuning parameters such as the fraction of clean pixels used (p) and the number of iterations during classifier computation.

The findings presented in this manuscript suggest that machine and human education have several features in common, since both improve the generalization capability through their inherent flexibility. This capability is apparent in the realistic case in which a classical machine is trained prior to operation, i.e., the DCC approach. In this case, the NWIP with the educated machine was two orders of magnitude lower than with the classical machine. The probability of detection with the educated machine was 96%, compared to 90% with the classical machine.

This capability to move between domains has long been considered the main difference between humans and machines. This work suggests that future machines can gain from education just as humans do. Hence, data scientists should focus their future efforts on educating machines rather than training them. This strategy may pave the way for a new approach to using machines in situations often considered too cumbersome for humans and too complicated for machines.