Abstract
The use of face recognition based on multiple sketches is crucial for both law enforcement and digital entertainment. Because of the limited information provided by the victim, it is difficult to identify faces from such sketches. In this study, we provide a unique technique based on SIFT (Scale Invariant Feature Transform), a 6-point facial landmark detector, Intuitionistic Fuzzy (IF) \(\:{m}_{X}^{*}\) oscillation and Fuzzy \(\:{m}_{X}^{*}\) oscillation. To improve the accuracy of face recognition, we use fuzzy-based similarity measuring techniques, namely IF and Fuzzy \(\:{m}_{X}^{*}\) oscillation. First, SIFT is applied to the sketches and digital pictures to find keypoints, and these keypoints are selected for feature extraction. After feature extraction, the values are classified using IF and Fuzzy \(\:{m}_{X}^{*}\) oscillation. Since more than one image may be returned as output, we apply a 6-point facial landmark detector (a modified version of the 68-point facial landmark detector) to both the sketch and the resulting digital images. Using these 6 points, we draw two different regions for further verification. Two possibilities, presented at the beginning of Sect. 6, were considered for the experiment. As a result, matching a facial sketch to a photo with the proposed method makes it straightforward to identify the correct image. Experimental results show that our approach achieved 95.1% accuracy for Fuzzy \(\:{m}_{X}^{*}\) oscillation in the SIFT domain and 97.3% for IF \(\:{m}_{X}^{*}\) oscillation in the SIFT domain. Our algorithm's accuracy suggests that IF and Fuzzy \(\:{m}_{X}^{*}\) oscillation can be applied to sketch-based face recognition.
Introduction
Distinguishing faces across different modalities, for example faces in different poses, photographs, and face sketches, is exceedingly challenging in computer vision1,2,3,4,5,6,7. With poor-quality video surveillance it is very difficult to identify a suspect correctly, and since it is not possible to monitor every part of the world with cameras, a good sketch (as described by the victim) to photo matching algorithm is needed. Reconstructing transparent objects with limited constraints has long been considered a highly challenging problem8,9. Many authors have proposed and shown how fuzzy systems are useful in the field of face recognition10,11,12. Until now, recognizing faces from sketches has remained quite challenging because of the stark disparities between mug shot photographs and face sketches.
This study presents a primary inquiry into the problem of face identification from sketches. The authors in13 proposed a thorough investigation of face identification based on sketches of several styles. They considered three different kinds of sketches, and their experimental findings show that hand-drawn sketches yield more accurate outcomes than the other two scenarios. Their collection of sketches was assembled from publicly available databases, including those from IIIT, CUHK, and e-PRIP (details are given in Sect. 2). Fuzzy logic has an excellent ability to handle ambiguity, approximate reasoning, and partial truth. Classes become more difficult to separate when they overlap and are linearly inseparable. If two classes overlap, then a pattern may belong to both of them, whatever its degrees of belongingness. In this situation, fuzzy logic plays a significant role in recognizing class boundaries by assigning different degrees of belongingness of the same pattern to multiple classes. So, along with artificial neural networks, fuzzy logic plays a very significant role in solving real-life complex problems. Through this study, we attempt to show how fuzzy sets can be used in the field of face recognition, and the experimental results are encouraging.
The following five areas best describe our contributions:
- From the sketches and digital images, keypoint regions are found using SIFT.
- These keypoints are then chosen as the feature coordinates.
- After feature extraction, IF \(\:{m}_{X}^{*}\) and Fuzzy \(\:{m}_{X}^{*}\) oscillation-based classification is invoked for the recognition stage.
- To determine similarity, four different scenarios are taken into account for each pixel (the four cases are listed in Sect. 3).
- Finally, residual errors are removed using the 6-point facial landmark detector.
Our paper is structured as follows: Sect. 2 reviews pertinent work. The concepts and characteristics of the IF and Fuzzy minimal structure and of SIFT are briefly described in Sect. 3, where we also examine the digital photos and sketch images employed in our experiment. The algorithm and a thorough explanation of our suggested approach are provided in Sect. 4. Section 5 introduces the landmark detector and discusses the error-removal method. Section 6 presents the experimental outcomes of face identification based on SIFT and fuzzy \(\:{m}_{X}^{*}\) oscillation, as well as those based on SIFT and IF \(\:{m}_{X}^{*}\) oscillation; this section also compares our developed algorithms with other previously defined algorithms. Section 7 concludes the paper.
Related works
Face matching between two facial photos is a fascinating research subject in biometrics and computer vision14. Many composite sketches have been used in the face recognition area over the past few years; some of these can be found in15,16,17,18. Today, fuzzy-based approaches have a significant impact on how face recognition problems are solved19. Since the IF set uses both membership and non-membership values, we may anticipate that it will provide results with a high degree of precision. Using neural networks together with fuzzy techniques gives impressive output in the face recognition domain for familiar and unfamiliar faces20. Since executing such an algorithm first requires training, a large amount of face data is needed. Face recognition based on a neuro-fuzzy system is beautifully described by the authors in10,11,12,21,22, which requires precise training. Since that procedure needs a trained neural network for better precision, it consumes a large amount of storage and incurs a high computational cost.
Recently, the authors in13 proposed a detailed investigation of face recognition based on numerous artistic sketches. For the identification of the various stylistic sketches, three scenarios were established: (i) face identification from several hand-drawn sketches; (ii) face identification from hand-drawn sketches and composite sketches; and (iii) face identification from numerous composite sketches. The authors' goal was to cover every facial recognition scenario that could possibly occur. Finally, they concluded that hand-drawn sketches produce more accurate results than the other scenarios.
In23, Zhang et al. proposed an algorithm based on principal component analysis and were the first to consider sketches made by multiple artists. A saliency and attribute feedback-based technique was proposed by Mittal et al.24, in which the composite sketch was created using software to enhance the matching ability. Gao et al.25 created a face sketch and photo synthesis technique using sparse coding and multiple style sketches, and they used this dataset as the training set; however, during the testing phase only one sketch was considered. Since in the real world multiple stylistic sketches are very helpful for recognizing a suspect, we used many multiple-style sketches for testing. Klare et al. employed multi-resolution LBPs and SIFT as feature descriptors in26 to compare forensic sketches to mug shot photos, and used multiple discriminant projections for minimum-distance matching between the photos. In this paper we use IF \(\:{m}_{X}^{*}\) oscillation and fuzzy \(\:{m}_{X}^{*}\) oscillation to measure the minimum distance between images. Using the proposed algorithm, we illustrate a hybrid technique for face recognition from sketches.
Preliminaries
This section explains the notations that serve as the foundation for our suggested method. The symbols and definitions of the fuzzy \(\:{m}_{X}^{*}\) open and closed oscillatory operators, fuzzy oscillation, the fuzzy minimum structure, etc. are described in the paragraphs that follow.
Definition 2.1
A collection M of fuzzy sets in a universal set X is referred to as a fuzzy minimum structure on \(\:X\) if \(\:\alpha\:{1}_{X}\:\in\:\:M\), where \(\:\alpha\:\in\:\left[\text{0,1}\right]\)27,28.
Definition 2.2
In order to define fuzzy \(\:{m}_{X}^{*}\) oscillation, the following operators from \(I^{X}\to I^{X}\) must be defined19.
(ii) \(Int_{a_k}\left(X\right)=\sup\left\{\mu_{a_k}\left(X_i\right):\mu_{a_k}\left(X_i\right)\le \mu_{a_k}\left(X\right),\ X_i\in G,\ \text{the set } G \text{ is open},\ i=1,2,\dots,n\right\}\)
(iii) \(Cl_{a_k}\left(X\right)=\inf\left\{\mu_{a_k}\left(X_i\right):\mu_{a_k}\left(X_i\right)\ge \mu_{a_k}\left(X\right),\ X_i\in G,\ \text{the set } G \text{ is closed},\ i=1,2,\dots,n\right\}\)
Here, for any object \(X_i\) and any feature \(a_k\), \(\mu_{a_k}(X_i)\) represents the membership value.
Definition 2.3
The symbol \(O^{o}\) denotes the fuzzy \(\:{m}_{X}^{*}\) open oscillatory operator from \(I^{X}\to I^{X}\), defined by \(O^{o}_{a_k}\left(X\right)={\Lambda}_{a_k}\left(X\right)-{Int}_{a_k}\left(X\right)\). Similarly, the fuzzy \(\:{m}_{X}^{*}\) closed oscillatory operator from \(I^{X}\to I^{X}\) is denoted by \(O^{c}\) and defined by \(O^{c}_{a_k}\left(X\right)={Cl}_{a_k}\left(X\right)-{V}_{a_k}\left(X\right)\)19,29.
Definition 2.4
In the context of intuitionistic fuzzy \(\:{m}_{X}^{*}\) oscillation (IF \(\:{m}_{X}^{*}\) oscillation), a member of the IF set family in X is said to form an IF \(\:{m}_{X}^{*}\) structure if \(\:{1}_{\sim}\) or \(\:{0}_{\sim}\) is a member of \(\:{m}_{X}^{*}\) and \(\:\alpha\:\) is an IF set with \(\:\alpha\:\in\:{m}_{X}^{*}\).
Fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation based face recognition
The authors30 presented the idea of fuzzy \(\:{m}_{X}^{*}\) oscillation. The authors also discuss how choosing between various oscillatory operator settings can help one determine whether two things are different or similar.
Images that vary due to brightness, darkness, or different poses form an image database in which each pixel has a grey-level membership value between 0 and 1. The constituents of this collection are \(\:{m}_{X}^{*}\) open set structures. Let us say the gallery picture dataset includes the images \(\:I={I}_{1},{I}_{2},\dots\:,{I}_{M}\). With the aid of fuzzy \(\:{m}_{X}^{*}\) oscillation, we must now discriminate between an unknown image and a recognized image. Using the oscillatory operator, the following four situations must be considered for \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\).
(1) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)=1\:\text{o}\text{r}\:0\)
(2) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\in\:\left(\text{0,1}\right)\)
(3) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-\varphi\:\)
(4) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)=\widehat{I}-{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\)
Situation (1)
(i) Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)=0.\)
\(\:\iff\:\) Component \(\:(i,\:j)\) intensity of the training image equals component \(\:(i,\:j)\) intensity of the unknown image.
\(\:\iff\:\) The component \(\:(i,\:j)\) in the unfamiliar (flipped) picture and the training picture are identical.
(ii) Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)=1\).
\(\:\iff\:\) The intensity of the unfamiliar picture at component \(\:(i,j)\) does not fall within the predetermined range.
As a result, this pixel cannot be compared with component (i, j) of a familiar picture, and this component is treated as indeterminate.
Situation (2)
Let, \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\in\:\left(\text{0,1}\right)\).
(i) There may be a likeness between the images at pixel \(\:(i,\:j)\) if \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\le\:0.1\).
(ii) If \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\ge\:0.1\), the difference between the intensity of the unfamiliar picture at component \(\:(i,j)\) and \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\) or \(\:{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\) must be checked. We select this pixel if the difference is less than 0.1.
Situation (3)
Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-\varphi\:\).
In this instance, we must determine whether the intensity of the unfamiliar picture at pixel \(\:(i,j)\) differs from \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\) or \(\:{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\). If the difference is less than 0.1, we use this pixel; otherwise, we use another.
Situation (4)
Let, if possible \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)=\widehat{I}-{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\).
In this instance, we must determine whether the intensity of the unfamiliar picture at pixel \(\:(i,j)\) differs from \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\). The picture may be regarded as familiar at component \(\:(i,j)\) if this difference is less than 0.1, and as distinct otherwise.
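To make the four situations concrete, the following Python sketch (not the authors' implementation) decides similarity at a single feature coordinate. The 0.1 tolerance follows the situations above; the interpretation of Λ and Int as the largest known grey level not below, respectively not above, the unknown value is an assumption based on Definition 2.2 and the worked example in the next section.

```python
def open_oscillation(known_vals, unknown_val):
    """Lambda and Int at one feature coordinate (Definition 2.2); a sketch that
    takes Lambda as the largest known value not below the unknown value and Int
    as the largest known value not above it. None stands for the empty set phi."""
    ge = [v for v in known_vals if v >= unknown_val]
    le = [v for v in known_vals if v <= unknown_val]
    lam = max(ge) if ge else None
    intr = max(le) if le else None
    return lam, intr

def pixel_similar(known_vals, unknown_val, tol=0.1):
    """Per-pixel similarity decision following Situations (1)-(4) above."""
    lam, intr = open_oscillation(known_vals, unknown_val)
    if lam is None:          # interpreted here as Situation (4): no known value above the probe
        return abs(max(known_vals) - unknown_val) < tol
    if intr is None:         # Situation (3): Int is phi, compare against Lambda only
        return abs(lam - unknown_val) < tol
    osc = lam - intr         # open oscillatory value O^o
    if osc == 0:             # Situation (1)(i): intensities coincide
        return True
    if osc == 1:             # Situation (1)(ii): indeterminate pixel
        return False
    if osc <= tol:           # Situation (2)(i)
        return True
    # Situation (2)(ii): check the distance from Lambda or Int instead
    return min(abs(lam - unknown_val), abs(intr - unknown_val)) < tol

# Known grey levels at one coordinate (normalised to [0, 1]) and a probe value.
print(pixel_similar([0.786, 0.663, 0.373, 0.466, 0.513], 0.543))   # -> True
```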
Face identification using intuitionistic fuzzy (IF) \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
In31, the authors described an algorithm for intuitionistic fuzzy \(\:{m}_{X}^{*}\) oscillation-based face recognition, and in19,32 the authors described how fuzzy \(\:{m}_{X}^{*}\) oscillation may be useful in face recognition. The main foundation of fuzzy \(\:{m}_{X}^{*}\) oscillation is the idea that a fuzzy set oscillates between a closed and an open fuzzy set. To compare the training and tested photos, we use IF set \(\:{m}_{X}^{*}\) oscillation and treat the non-membership image as a closed set. Since non-membership images are treated as a closed set, each non-membership image's pixel values satisfy \(\{0\le \mu_{a_j}(x_i)+\gamma_{a_j}(x_i)\le 1,\ j=1,2,3,\dots\}\). Similarly, for the unknown image (i.e., the tested image), both the membership and non-membership images are formed. Oscillatory operators are then computed from these membership and non-membership images; the IF open and IF closed operators are used to create an oscillatory operator matrix.
Forming the non-membership image of each image (both unknown and known) is crucial for the similarity measurement algorithm. First, some keypoints are calculated using the SIFT algorithm (discussed in the next section), and these keypoints are treated as feature coordinates. The non-membership values are then calculated from these feature coordinates, subject to \(\{0\le \mu_{a_j}(x_i)+\gamma_{a_j}(x_i)\le 1,\ j=1,2,3,\dots\}\). The largest pixel value among these feature coordinates is used to create the non-membership values, which form the closure set: the highest pixel value difference, \(1-\mu_{a_{max}}\), is used as the base pixel value, and the IF condition \((\mu_{a_j}+\gamma_{a_j})\le 1\) is enforced to generate the remaining pixel values.
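As a small illustration of this construction, the following sketch assumes that the non-membership value at each feature coordinate is the complement of its membership value, as in the worked example of the next subsection; it is not the authors' exact code.

```python
import numpy as np

def build_non_membership(memberships):
    """Illustrative sketch: derive non-membership values gamma_i from the
    membership values mu_i sampled at the SIFT feature coordinates, keeping the
    intuitionistic constraint mu_i + gamma_i <= 1. Here gamma_i is simply the
    complement 1 - mu_i, so the base value at the brightest coordinate is
    1 - max(mu)."""
    mu = np.asarray(memberships, dtype=float)
    gamma = 1.0 - mu                              # complement as base construction
    assert np.all(mu + gamma <= 1.0 + 1e-9)       # IF condition mu + gamma <= 1
    return gamma

mu = np.array([0.786, 0.335, 0.435, 0.512])       # membership values at 4 feature coordinates
print(build_non_membership(mu))                   # -> [0.214 0.665 0.565 0.488]
```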
Each case’s explanation with an example
Let us see how the tuple at each fixed coordinate is used to determine the supremum and infimum values. Comparing the pixel value of a known image at a coordinate with the unknown image's pixel value at the same coordinate, we extracted the following coordinate pixel values using Python.
- I1: (x(21, 30), 〈0.786, 0.214〉), (x(27, 23), 〈0.335, 0.665〉), (x(15, 19), 〈0.435, 0.565〉), (x(15, 27), 〈0.512, 0.488〉)
- I2: (x(21, 30), 〈0.663, 0.337〉), (x(27, 23), 〈0.292, 0.708〉), (x(15, 19), 〈0.341, 0.659〉), (x(15, 27), 〈0.685, 0.315〉)
- I3: (x(21, 30), 〈0.373, 0.627〉), (x(27, 23), 〈0.275, 0.305〉), (x(15, 19), 〈0.241, 0.759〉), (x(15, 27), 〈0.631, 0.369〉)
- I4: (x(21, 30), 〈0.466, 0.534〉), (x(27, 23), 〈0.308, 0.692〉), (x(15, 19), 〈0.531, 0.469〉), (x(15, 27), 〈0.527, 0.473〉)
- I5: (x(21, 30), 〈0.513, 0.487〉), (x(27, 23), 〈0.541, 0.459〉), (x(15, 19), 〈0.499, 0.801〉), (x(15, 27), 〈0.735, 0.265〉)
The membership and non-membership values of those pictures at the corresponding pixel positions are represented by I1–I5. The membership values are categorized as an open set in this regard, while the non-membership values are categorized as a closed set. Let J be an unknown object with the following pixel values.
J: (x(21, 30), 〈0.543, 0.457〉), (x(27, 23), 〈0.435, 0.565〉), (x(15, 19), 〈0.413, 0.587〉), (x(15, 27), 〈0.369, 0.331〉).
For the pixel point (21, 30), image J's Λ operator will be 〈0.786, 0.214〉. When the open-set (membership) values are similar, the value with the smaller closed value (i.e., the smaller non-membership value) is chosen, and the operator will be 〈0.663, 0.337〉. Each feature coordinate's Λ and Int values are determined in this manner. Image J's open oscillation operators \(\:{O}_{J}^{o}\) are:
Since \(\:\mu\:\:<\:1\) and \(\:\gamma\:\:<\:0.1\), situation 2 will be applied. The pixels at this point are similar based on the criteria of situation 2. In the same way the remaining coordinates will also be computed.
Since \(\:\mu\:<1\) and \(\:\gamma\:<0.1\), situation 2 will be applied. In accordance with the situation 2 requirement for \(\:\mu\:>0.1\), the difference between Λ and the unknown pixel value must be computed, i.e., (0.435 − 0.266) = 0.169 > 0.1. This indicates that the pixels are different at this location. The remaining coordinates are computed in the same way.
Since \(\:\gamma\:<0.9\), this will be regarded as situation 4. With the unknown image pixel, we must next determine the difference of \(\:\gamma\:\), which is (0.801 − 0.587) = 0.214 < 0.9. Next, we must determine the difference between the membership value and the unknown image pixel, i.e., (0.499 − 0.413) = 0.086 < 0.1; thus, the pixels at this pixel location are comparable. For the coordinate (15, 27), \(O_{J}^{o}\left[15,27\right]=\langle{\Lambda}_{J_{\mu}}-{Int}_{J_{\mu}},\ {\Lambda}_{J_{\gamma}}-{Int}_{J_{\gamma}}\rangle=\langle(0.512-\widehat{I}),\ 0.488\rangle\). Since \(\:\gamma\:<0.9\), this will be regarded as situation 3. Finding the difference with the unknown image pixel is the next step, which is (0.488 − 0.331) = 0.157 < 0.9. With the unknown picture pixel, we must then determine the difference of \(\:\mu\:\), which is (0.512 − 0.369) = 0.143 > 0.1. At this pixel position, the pixels are therefore different.
SIFT or scale invariant feature transform
David G. Lowe described SIFT33 as a strategy for extracting keypoints from an image collection. SIFT has been utilised extensively in the field of face identification for the past 14 years, where it is mostly employed for finding keypoint characteristics. We refer to it as scale invariant since it supplies almost the same number of keypoints after the image is rescaled (in this proposed work, keypoints are treated as feature coordinates); it also provides approximately the same number of keypoints for rotated images. Scale-space extrema detection, discarding of unreliable keypoints, orientation assignment, and keypoint descriptor calculation are the required steps of this technique. Figure 1 represents the image gradient and keypoint descriptor, Fig. 2 shows the results of the SIFT algorithm on gallery images, and Fig. 3 shows the keypoint regions following the SIFT application.
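For reference, keypoint detection of this kind can be sketched with OpenCV's SIFT implementation as follows; the file names are placeholders and the snippet only illustrates how the feature coordinates used in later sections can be obtained, not the authors' pipeline.

```python
import cv2

# Detect SIFT keypoints on a sketch image and a gallery photo and keep the
# keypoint coordinates as "feature coordinates" for oscillation-based matching.
sketch = cv2.imread("probe_sketch.png", cv2.IMREAD_GRAYSCALE)
photo = cv2.imread("gallery_photo.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp_sketch, desc_sketch = sift.detectAndCompute(sketch, None)
kp_photo, desc_photo = sift.detectAndCompute(photo, None)

# Integer pixel coordinates of the keypoints; the grey levels at these positions
# (normalised to [0, 1]) serve as membership values in later sections.
coords_sketch = [(int(k.pt[0]), int(k.pt[1])) for k in kp_sketch]
coords_photo = [(int(k.pt[0]), int(k.pt[1])) for k in kp_photo]
print(len(coords_sketch), len(coords_photo))
```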
Databases of face sketches
We have gathered publicly accessible sketch datasets, such as the CUHK1,34 sketch database, which consists of three distinct databases: the CUHK student database, the AR database, and the XM2VTS database. The CUHK student database has 188 sketch-photo pairs, the AR database35 has 123 pairs, and the XM2VTS database36 has 295 pairs, giving a total of 606 pairs. Because the XM2VTS database is not freely available, we employed a total of 311 sketch pairs. Next, we compared the results with the e-PRIP dataset's composite sketches. The e-PRIP15,16 dataset was created by enhancing the dataset of Han et al.37 and is the sole dataset provided for composite sketches; it includes 123 composite and digital pictures from the AR dataset35. The dataset includes four sets of composite sketches made by American, Asian, and Indian artists with the aid of the Identi-Kit38 and FACES39 tools. The CUHK sketch databases contain images with fixed backgrounds and extremely well controlled lighting, so we cannot claim with certainty that facial recognition under realistic conditions is achievable using these well-known databases alone. We therefore gathered sketch-digital images created by experts, containing a variety of sketches and images from different sources; this database has 231 image pairs. Among them are the IIIT-Delhi databases, which have 568 sketch-digital image pairs40, the FG-NET ageing databases, which have 67 sketch-digital picture pairs41, and the Labeled Faces in the Wild databases, which have 92 drawings and their original paired photos42. The IIIT-Delhi databases are more aesthetically pleasing and intellectually fascinating than the CUHK databases, and they furthermore contain sketch-digital picture pairs made by their students and instructors. In addition, we have gathered some images with diverse facial expressions, motions, and lighting. Several of these images are highlighted in Fig. 4.
SIFT, fuzzy and IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation based face identification
Fuzzy and IF \(\:{m}_{X}^{*}\) oscillation were both used in our suggested approach. The procedure for matching sketches to images utilising IF \(\:{m}_{X}^{*}\) oscillation is introduced after the concepts of SIFT and fuzzy \(\:{m}_{X}^{*}\) oscillation. In the experimental section, the outputs of the fuzzy and IF minimal structure oscillations are contrasted with those of other methods. To distinguish faces utilizing fuzzy \(\:{m}_{X}^{*}\) oscillation, we first collected probe sketches and digital photos from a gallery. Then, SIFT is utilized to highlight important areas in both the sketch and the photos; in Fig. 5 the green plus marks indicate the feature coordinates. The matrix of unfamiliar photos at each feature coordinate must then be compared with the pixel values at the corresponding positions in the known image matrix. For that, we first have to find the negative image of the given image; in Fig. 6, column (b) represents the negative image of the image in column (a). The Λ operator (the Cl operator for negative photos) and the Int operator (the V operator for negative photos) are measured first. Roughly, the difference between the Λ and Int operators is the open oscillatory operator (or the closed oscillatory operator for negative images). Depending on the circumstances19 considered for determining similarity, the closed and open oscillatory operators are the targeted matrix values from both images.
Face identification process based on SIFT and fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
Step 1: Assume that there are \(\:N\) photographs in our image database. SIFT is applied to each database photo, and the pixel values at each training image's keypoint regions are computed. Feature coordinates are simultaneously extracted from the unfamiliar photos using SIFT.
Step 2: We compute the supremum value with the operator Λ and the Int operator (INT OPTR).
Step 3: The complementary image's maximum \(\:CL\_OPTR\) and \(\:V\_OPTR\) values should be determined. Then, using the normal picture together with the complementary images, the open oscillatory value \(\:{O}^{o}\) and the closed oscillatory value \(\:{O}^{cl}\) are determined.
Step 4: After computing the closed and open oscillatory operators in accordance with19, the four cases of the similarity measure are taken into consideration.
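A minimal end-to-end sketch of Steps 1-4 is given below. It assumes grey levels normalised to [0, 1] as membership values and replaces the full four-situation test with a simplified absolute-difference check at each feature coordinate (the per-coordinate decision sketched after Situation (4) can be substituted for it); the file names are placeholders and this is not the authors' MATLAB implementation.

```python
import cv2

def feature_coords(img, n=100):
    """Step 1: SIFT keypoints of one image, kept as (row, col) feature coordinates."""
    kp = cv2.SIFT_create(nfeatures=n).detect(img, None)
    return [(int(k.pt[1]), int(k.pt[0])) for k in kp]

def match_score(gallery, probe, tol=0.1):
    """Steps 2-4: fraction of the probe's feature coordinates whose (simplified)
    oscillation against the gallery image falls within the tolerance."""
    g = gallery.astype(float) / 255.0
    p = probe.astype(float) / 255.0
    coords = feature_coords(probe)
    hits = 0
    for r, c in coords:
        if r >= g.shape[0] or c >= g.shape[1]:
            continue                       # coordinate falls outside the gallery image
        if abs(g[r, c] - p[r, c]) < tol:   # simplified |known - unknown| test
            hits += 1
    return hits / max(len(coords), 1)

# Rank the gallery photos against one probe sketch; file names are placeholders.
gallery_files = ["gallery_01.png", "gallery_02.png", "gallery_03.png"]
probe = cv2.imread("probe_sketch.png", cv2.IMREAD_GRAYSCALE)
scores = {f: match_score(cv2.imread(f, cv2.IMREAD_GRAYSCALE), probe)
          for f in gallery_files}
print(max(scores, key=scores.get))         # best-matching gallery photo
```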
Pseudocode of fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
The Pseudocode mentioned below was developed in this study to recognize faces utilizing SIFT and fuzzy \(\:{m}_{X}^{*}\) oscillation.


Block diagram for face identification based on SIFT with fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
This flowchart (Fig. 7) displays the enhanced face identification algorithm using fuzzy \(\:{m}_{X}^{*}\) oscillation and SIFT: SIFT is used to identify the most promising keypoints before attempting to distinguish an unfamiliar face photo from a collection of familiar sketch face photos, and several cases are considered for measuring similarity at each pixel.
Face identification based on SIFT and IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
Step 1: \(\:\text{M}\)images from an image database may be utilised as training examples or as input images. Specific feature values that are derived from predefined coordinates must be present in every database image. These feature values are used to quantify membership and non-membership pixels, resulting in the production of a non-membership image. The unknown image’s feature values are simultaneously acquired while taking the same coordinate value into account.
Step 2: The Λ and Int operators are now computed. The infimum-value operator is determined by comparing the feature-value matrix of the training pictures with that of the unfamiliar image (choosing the entry with the smaller γ); the Int operator (INT_OPTR) with the supremum value is calculated in the same manner.
Step 3: The closed oscillatory values, i.e., \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\) from the non-membership images, and the open oscillatory values, i.e., \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\) from the normal image, are evaluated.
Step 4: The four cases of \(\:{\text{O}}^{\text{o}}\) and \(\:{\text{O}}^{\text{c}\text{l}}\) should be taken into account for similarity measurement. To determine similarity and difference, a component-by-component comparison is carried out.
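A compact sketch of the IF variant is shown below. It assumes that the membership image is compared with the 0.1 tolerance and the non-membership image with the looser 0.9 tolerance used in the worked example, and that Λ and Int follow Definition 2.2; it is not the authors' pseudocode.

```python
def similar(known_vals, unknown_val, tol):
    """Per-coordinate comparison: small oscillation, or small distance from
    Lambda or Int (the Situation (2)(ii) fallback)."""
    ge = [v for v in known_vals if v >= unknown_val]
    le = [v for v in known_vals if v <= unknown_val]
    lam = max(ge) if ge else unknown_val        # Lambda operator
    intr = max(le) if le else unknown_val       # Int operator
    if lam - intr <= tol:                       # oscillation within tolerance
        return True
    return min(abs(lam - unknown_val), abs(intr - unknown_val)) < tol

def if_similar(known_mu, known_gamma, probe_mu, probe_gamma):
    """A coordinate counts as similar when both the membership (open) and the
    non-membership (closed) comparisons succeed."""
    return similar(known_mu, probe_mu, 0.1) and similar(known_gamma, probe_gamma, 0.9)

# Gallery values I1-I5 and probe J at coordinate (21, 30) from the worked example.
mu_gallery = [0.786, 0.663, 0.373, 0.466, 0.513]
gamma_gallery = [0.214, 0.337, 0.627, 0.534, 0.487]
print(if_similar(mu_gallery, gamma_gallery, 0.543, 0.457))   # -> True (similar)
```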
Pseudocode of IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
The technique utilised in the proposed algorithm for face identification using SIFT and IF \(\:{m}_{X}^{*}\) oscillation follows the stages noted above:


Block- diagram of face identification based on SIFT and IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation
The suggested technique can be employed for face identification using IF \(\:{m}_{X}^{*}\:\) oscillation and SIFT, as shown in this flow chart (Fig. 8).
Removing errors using a 6-point facial landmark detector
To find the facial landmarks on frontal faces, a shape estimator implemented in the dlib library was applied, based on43,44. This shape estimator gives 68 facial landmarks covering the corners of the eyes, the lips, the eyebrows, the nose tip, etc. (Fig. 9(a)). In the proposed approach we reduced the 68 facial landmarks to 6 facial landmark points: the outer corners of the left and right eyes, the nose tip, and 3 points on the mouth (Fig. 9(b)). We use 6 facial landmark points because we want to create two regions on the face and use these regions in our method; only these 6 points are needed to create the regions, which is why the 68 facial landmark points were reduced to 6.
Now, let us denote these points as \(\:A({x}_{1},{y}_{1}),\:B\left({x}_{2},{y}_{2}\right),\:O\left({x}_{3},{y}_{3}\right),\:C\left({x}_{4},{y}_{4}\right),\:D\left({x}_{5},{y}_{5}\right)\) and \(\:E({x}_{6},{y}_{6})\), where \(\:({x}_{i},{y}_{i}),\:i=1,\dots\:,6\) denote the pixel positions of the 6 landmark points. Using the landmark points 37, 46, 31, 65, 58 and 61 we can easily find the pixel positions of A, B, O, C, D and E. Then, using the Euclidean distance formula, the ratio of the perimeters of the triangle AOB and the quadrilateral OCDE is measured (Fig. 10).
This ratio is measured for both the sketch and the output images (when there is more than one output image). As our aim is to obtain one correct output image from two or more output images, we take the image with the smallest ratio difference as the output after comparing the perimeter ratios. The working procedure of the proposed method is described below with the help of Fig. 11.
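The verification step can be sketched with dlib's 68-point shape predictor as follows. The predictor file name is a placeholder, and the 0-based indices 36, 45, 30, 64, 57 and 60 are assumed to correspond to the paper's 1-based landmark numbers 37, 46, 31, 65, 58 and 61; this is an illustrative sketch, not the authors' code.

```python
import math
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def perimeter_ratio(image_path):
    """Ratio of the perimeter of triangle AOB to that of quadrilateral OCDE."""
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    face = detector(gray)[0]
    shape = predictor(gray, face)
    # A, B, O, C, D, E mapped to the assumed 0-based landmark indices.
    pts = {name: (shape.part(i).x, shape.part(i).y)
           for name, i in zip("ABOCDE", [36, 45, 30, 64, 57, 60])}
    d = lambda p, q: math.dist(pts[p], pts[q])
    tri = d("A", "O") + d("O", "B") + d("B", "A")                  # triangle AOB
    quad = d("O", "C") + d("C", "D") + d("D", "E") + d("E", "O")   # quadrilateral OCDE
    return tri / quad

# Keep the candidate output whose ratio is closest to the sketch's ratio.
sketch_ratio = perimeter_ratio("probe_sketch.png")
candidates = ["output_1.png", "output_2.png"]
best = min(candidates, key=lambda f: abs(perimeter_ratio(f) - sketch_ratio))
print(best)
```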
Discussion of experimental outcomes
This section presents the experimental findings of our suggested method in detail. We separate the findings of our analysis into three parts: the first part includes the outcomes of fuzzy \(\:{m}_{X}^{*}\) oscillation in the SIFT domain and its comparison with other existing algorithms, the second part includes the outcomes of IF \(\:{m}_{X}^{*}\) oscillation in the SIFT domain and its comparison with other existing algorithms, and the last part compares the performances of IF set \(\:{m}_{X}^{*}\) oscillation, fuzzy \(\:{m}_{X}^{*}\) oscillation and our algorithm. To conduct the comparison study, we used rank-10 and rank-50 recognition accuracy together with Cumulative Match Characteristic (CMC) curves45.
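For completeness, rank-k accuracy and the CMC curve can be computed as in the following sketch, which uses the standard definition (the fraction of probes whose correct gallery identity appears among the top k matches) rather than the authors' evaluation scripts; the scores and labels here are synthetic placeholders.

```python
import numpy as np

def cmc(scores, labels, max_rank=50):
    """scores[i, j]: similarity of probe i to gallery subject j;
    labels[i]: true gallery index of probe i."""
    order = np.argsort(-scores, axis=1)                 # gallery sorted by similarity
    ranks = np.array([np.where(order[i] == labels[i])[0][0]
                      for i in range(len(labels))])     # position of the true match
    return np.array([(ranks < k).mean() for k in range(1, max_rank + 1)])

rng = np.random.default_rng(0)
scores = rng.random((20, 100))                          # 20 probes, 100 gallery subjects
labels = rng.integers(0, 100, size=20)
curve = cmc(scores, labels)
print("rank-10 accuracy:", curve[9], "rank-50 accuracy:", curve[49])
```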
Experimental results of proposed algorithm (fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) Oscillation in SIFT domain) and its comparison with other existing algorithms
We made use of four important databases, including the CUHK Face Sketch dataset (CUFS) and the IIIT-D sketch dataset. The results of the suggested approach were then compared with those of the existing algorithms using the e-PRIP database and the IIIT-D semi-forensic sketch database, as given in Figs. 12, 13 and 14. According to the authors of17, they were able to reach rank-10 accuracies of 76.4% and 72.3% with attributes and 67.6% and 69.1% without attributes for the sketches made by Asian and Indian artists, respectively. Additionally, it is evident from their experimental section that, for the sketches made by Asian and Indian artists respectively, they attained rank-50 accuracies of 82.2% and 80% with attributes and 68.1% and 68% without attributes.
Experimental results of proposed algorithm (IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) Oscillation in SIFT domain) and its comparison with other existing algorithms
Utilizing the CUHK Face Sketch dataset (CUFS) and the IIIT-D sketch dataset, we assessed the findings of our experimental work. We then compared our proposed algorithm using the e-PRIP database and the IIIT-D semi-forensic sketch database, as given in Fig. 15. First, we used the Scale Invariant Feature Transform, which produced the keypoints depicted in Fig. 3. Then, from each facial image and sketch image, the pixel values of the keypoints were collected. We use these key locations as feature coordinates and compare them using the IF \(\:{m}_{X}^{*}\) oscillation to determine how similar they are. After these initial steps, when the image of a different person is compared, the algorithm generates the desired outcome; on the other hand, when images of the same person are compared, the faces are declared identical.
Tables 1, 2 and 3 display the results of various existing algorithms15,16,17,24,18 on the e-PRIP database and the IIIT-D semi-forensic sketch database at ranks 10 and 50. We conducted the comparison after determining the suggested algorithm's accuracy at rank-10 and rank-50 for the e-PRIP and IIIT-D semi-forensic sketch databases. At rank-10, we obtained 78.1% and 75% on the FACES (Indian user) and Identi-Kit (Asian user) databases, respectively, while at rank-50 we obtained 84% and 78.6% on the same databases. Additionally, at rank-50 on the IIIT-D semi-forensic sketch dataset, our suggested method achieved 93.5%.
Comparison between the performances of IF set \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation, fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation and our algorithm
In this paper, two algorithms are developed; the first is fuzzy \(\:{m}_{X}^{*}\) oscillation with the SIFT operator, and the second is IF set \(\:{m}_{X}^{*}\) oscillation with the SIFT operator. It is evident from the experimental results of Sect. 5(A) and (B) that both proposed algorithms produce satisfactory results. IF set \(\:{m}_{X}^{*}\) oscillation with the SIFT operator provides \(\:97.1\%\) for sketch faces, and fuzzy set \(\:{m}_{X}^{*}\) oscillation with the SIFT operator provides \(\:97.1\%\) for sketch faces, as shown in Fig. 16. Figure 17 makes it evident that our suggested algorithm produces better results when compared with the other algorithms currently in use30,31. The accuracy comparison is shown in Table 4.
In recent years, deep learning approaches have grown in popularity and effectiveness. Deep learning does, however, have several drawbacks: it needs a huge amount of data in order to perform better than previous techniques, training is very expensive due to sophisticated data models, and it requires many workstations and costly GPUs, which increases the cost to users. We conducted a comparative analysis of our method with the deep learning technique used by Mittal et al.15, who employ a facial representation based on a deep learning architecture. The performance chart for fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation, IF set oscillation, and our algorithm for face identification is shown in Fig. 16, where the x-axis shows the number of feature coordinates and the y-axis the accuracy. It may be argued that, as the training set and the number of feature coordinates increase, fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation in the SIFT domain generates superior accuracy to other face recognition methods such as fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation30 and IF set oscillation31.
Limitation and future scope
The face recognition performance of the proposed fuzzy SIFT-domain method is promising, but it has some limitations. For large-scale datasets, the method requires considerable computational power, and it may be affected by images that are noisy, blurry, or of low resolution. Additionally, the selection of fuzzy membership functions and SIFT-domain parameters, which may differ between datasets, has some impact on overall accuracy.
This work can be extended in multiple directions. To reduce reliance on manual selection of fuzzy functions, adaptive or automated parameter optimization methods can be developed. Pose, illumination, and occlusion variations may be mitigated by integration with deep learning models. Improving the method's ability to handle noisy and low-resolution images46 will further strengthen its applicability in real-world scenarios. Multi-stage Siamese neural networks47 and multi-task joint learning48 provide methodologies for combining diverse data sources, which can enhance performance. Furthermore, hyperrectangle embedding networks49 offer advanced techniques for modeling complex relationships and incorporating additional information, which could improve the consistency of feature representation, while multi-scale spatial–temporal interaction fusion50 addresses related fusion challenges. These techniques could be adapted to improve accuracy in future work. For surveillance and mobile authentication systems, real-time implementation on GPU/FPGA-based platforms may also be considered. In addition, the approach's scalability can be evaluated on extremely large databases, and its use can be expanded beyond face recognition to other biometric and image analysis fields such as gesture recognition and medical imaging.
Conclusion
This work presented a sketch-to-photo face recognition framework that integrates the Scale-Invariant Feature Transform (SIFT) with Intuitionistic Fuzzy (IF) and fuzzy minimal structure oscillation. SIFT is used to locate the keypoints in sketch faces and gallery photos from which features are extracted for the new algorithm, and IF \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation and fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation-based classification is applied to these values. For a better outcome, we ultimately used a modified 6-point facial landmark detector. We validated our algorithm by implementing it as a MATLAB application and testing it against several face datasets. Experimental evaluations across multiple benchmark datasets, including CUHK, IIIT-D, and e-PRIP, demonstrate that the proposed approach consistently outperforms several existing methods, achieving recognition accuracies of up to 97.3%. The findings affirm that the fuzzy-based oscillation operators successfully deal with uncertainty and partial similarity between the heterogeneous modalities of sketches and photos. Compared with deep learning approaches, the proposed method is highly accurate without needing massive data or extensive computation, and it is thus applicable in forensic and surveillance applications where training data can be limited.
Overall, the work demonstrates the effectiveness of using classical feature extraction in conjunction with fuzzy logic concepts to enhance cross-domain face recognition. Future research can focus on adaptive parameter selection, integration with lightweight deep learning models, and real-time deployment on GPU/FPGA platforms to further expand the applicability of this approach in law enforcement, digital security, and biometric systems.
Data availability
The datasets used and/or analysed during the current study are cited in the manuscript and publicly available.
References
Wang, X. & Tang, X. Face photo-sketch synthesis and recognition. IEEE Trans. Pat- Tern Anal. Mach. Intell. 31 (11), 1955–1967 (2009).
Wang, N., Gao, X. & Li, J. Random sampling for fast face sketch synthesis. Pattern Recognit. 76, 215–227 (2018).
Jiao, L., Zhang, S., Li, L., Liu, F. & Ma, W. A modified convolutional neural network for face sketch synthesis. Pattern Recognit. 76, 125–136 (2018).
Sánchez, L. et al. Information system for image classification based on frequency curve proximity. Inform. Syst. 64, 12–21 (2017).
Lee, K. & Min, K. An interactive image clipping system using hand motion recognition. Inform. Syst. 48, 296–300 (2015).
Khan, M. F., Khan, E., Nofal, M. M. & Mursaleen, M. Fuzzy mapped histogram equalization method for contrast enhancement of remotely sensed images, in. IEEE Access 8, 112454–112461 (2020).
Khan, M. F., Dannoun, E. M. A., Nofal, M. M. & Mursaleen, M. Significance of camera pixel error in the calibration process of a robotic vision system. Appl. Sci. 12, 6406 (2022).
Sha, X., Si, X., Zhu, Y., Wang, S. & Zhao, Y. Automatic three-dimensional reconstruction of transparent objects with multiple optimization strategies under limited constraints. Image Vis. Comput. 105580 (2025).
Sha, X., Zhu, Y., Sha, X., Guan, Z. & Wang, S. ZHPO-LightXboost an integrated prediction model based on small samples for pesticide residues in crops Vol. 188, 106440 (Elsevier, 2025).
Mehta, S., Gupta, S., Bhushan, B. & Nagpal, C. K. Face recognition using neuro-fuzzy inference system. Int. J. Signal. Process. Image Process. Pattern Recogn. 7 (1), 331–334 (2014).
Deng, H., Sun, X., Liu, M., Ye, C. & Zhou, X. Image enhancement based on intuitionistic fuzzy sets theory. IET Image Proc. 10 (10), 701–709 (2016).
Starostenko, O. et al. A fuzzy reasoning model for recognition of facial expressions. Comput. Sist. 15 (2), 1405–5546 (2011).
Peng, C., Gao, X. & Wang, N. Face recognition from multiple stylistic sketches: Scenarios, datasets, and evaluation. Pattern Recogn. 64, 262–272 (2018).
Li, S. & Jain, A. (eds) Handbook of Face Recognition 2nd edn (Springer, 2011).
Mittal, P., Vatsa, M. & Singh, R. Composite sketch recognition via deep network: a transfer learning approach. In Proc. 2015 International Conference on Biometrics (ICB), 251–256 (IEEE, 2015).
Mittal, P., Jain, A., Goswami, G., Singh, R. & Vatsa, M. Recognizing composite sketches with digital face images via SSD dictionary. In Proc. 2014 IEEE International Joint Conference on Biometrics (IJCB), 1–6 (IEEE, 2014).
Iranmanesh, S. M. et al. Deep sketch-photo face recognition assisted by facial attributes. In Proc. IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS), 1–10 (2018).
Peng, P. et al. Sparse graphical representation based discriminant analysis for heterogeneous face recognition. Sig. Process. 156, 46–61 (2019).
Bhattacharya (Halder), S. & Majumder, B. Face recognition using fuzzy minimal structure oscillation in the wavelet domain. Pattern Recognit. Image Anal. 29(1), 174–180 (2019).
Li Liu, S., Chen, X., Chen, T., Wang & Zhang, L. Fuzzy weighted sparse reconstruction error–steered semi–supervised learning for face recognition. Visual Comput. 36, 1521–1534 (2020).
Chacon, M., Mario, I., Rivas, P. & Ramirez, A. A fuzzy clustering approach for face recognition based on face feature lines and eigenvectors. Eng. Lett. 15 (1), 35–44 (2007).
Abbasi, N., Khan, M. F., Khan, E., Alruzaiqi, A. & Al-Hmouz, R. Fuzzy histogram equalization of hazy images: a concept using a type-2-guided type-1 fuzzy membership function. Granul. Comput. 8 (4), 731–745 (2023).
Zhang, Y., McCullough, C., Sullins, J. & Ross, C. Hand-drawn face sketch recognition by humans and a PCA-based algorithm for forensic applications. IEEE Trans. Syst. Man Cybern. A 40, 475–485 (2010).
Mittal, P. et al. Composite sketch recognition using saliency and attribute feedback. Inform. Fusion. 33, 86–99 (2017).
Gao, W. N., Tao, X. D. & Li, X. Face sketch-photo synthesis and retrieval using sparse representation. IEEE Trans. Circuits Syst. Video Technol. 22, 1213–1226 (2012).
Klare, B., Li, Z. & Jain, A. Matching forensic sketches to mug shot photos. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 639–646 (2011).
Alimohammady, M. R. Transfer closed and transfer open multimaps in minimal space. Chaos,Solitons and Fractals 40(3), 1162–1168 (2009).
Miry, A. H. Face detection based on multi facial feature using fuzzy logic. Al-Mansour J. 21, 15–30 (2014).
Mukherjee, A. & (2007) Simulation, S. H. CIT, Coimbatore, p-665-670 .
Bhattacharya (Halder), S. & Roy, S. On fuzzy oscillation and its application in image processing. Ann. Fuzzy Math. Inform. 7(2), 319–329 (2014).
Bhattacharya (Halder), S. Application of IF set oscillation in the field of face recognition. Pattern Recognit. Image Anal. 27(3), 625–636 (2017).
Bhattacharya (Halder), S., Barman Roy, S. & Saha, S. Application of fuzzy oscillation in the field of face recognition. In Proc. International Symposium on Advanced Computing and Communication (ISACC), 192–197 (2015).
Lowe, D. G. Distinctive image features from Scale-Invariant keypoints. Int. J. Comput. Vision. 60, 91–110 (2004).
Zhang, W., Wang, X. & Tang, X. Coupled information-theoretic encoding for face photo-sketch recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2011).
Martinez, A. & Benavente, R. The AR Face Database. Technical Report, CVC, Barcelona, Spain (1998).
Messer, K., Matas, J., Kittler, J., Luettin, J. & Maitre, G. XM2VTSDB: the extended M2VTS database. In Proc. International Conference on Audio- and Video-Based Person Authentication, 72–77 (1999).
Han, H., Klare, B., Bonnen, K. & Jain, A. Matching composite sketches to face photos: a component-based approach. IEEE Trans. Inf. Forensics Secur. 8, 191–204 (2013).
Identi-kit solutions. (2011). http://www.identikit.net/.
Faces 4.0, IQ Biometrix. (2011). http://www.iqbiometrix.com.
Bhatt, H., Bharadwaj, S., Singh, R. & Vatsa, M. Memetically optimized MCWLD for matching sketches with digital face images. IEEE Trans. Inf. Forensics Secur. 7 (5), 1522–1535 (2012).
FG-NET aging database. http://www.fgnet.rsunit.com/
Huang, G. B., Ramesh, M., Berg, T. & Learned-Miller, E. Labeled faces in the wild: a database for studying face recognition in unconstrained environments. Technical Report, University of Massachusetts, Amherst (2007).
Kazemi, V. & Sullivan, J. One millisecond face alignment with an ensemble of regression trees. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1867–1874 (IEEE Computer Society, 2014).
Sagonas, C., Tzimiropoulos, G., Zafeiriou, S. & Pantic, M. 300 faces in-the-wild challenge: the first facial landmark localization challenge. In Proc. IEEE International Conference on Computer Vision Workshops (ICCVW), 397–403 (IEEE, 2013).
Jain, A. K. & Li, S. Z. Handbook of Face Recognition (Springer, 2005).
Liu, K. et al. Pixel-level noise mining for weakly supervised salient object detection. IEEE Trans. Neural Netw. Learn. Syst.
Lu, J. et al. Multi-Stage-Based Siamese neural network for seal image recognition. CMES-Computer Model. Eng. Sci., 142(1). (2025).
Sha, X. et al. SSC-Net: A multi-task joint learning network for tongue image segmentation and multi-label classification. Digit. Health. 11, 20552076251343696 (2025).
Feng, M. et al. Hyperrectangle embedding for debiased 3D scene graph prediction from RGB sequences. IEEE Trans. Pattern Anal. Mach. Intell. (2025).
Ma, C. et al. A multi-scale spatial–temporal interaction fusion network for digital twin-based thermal error compensation in precision machine tools. Expert Syst. Appl. 127812 (2025).
Acknowledgements
We sincerely thank the editor and the anonymous reviewers for their valuable comments and constructive suggestions, which have greatly improved the quality of this manuscript.
Author information
Authors and Affiliations
Contributions
Writing Original Draft: B M & C K; Writing, reviewing, and editing: BM, CK, K H, JLS, and MP; Experiments: BM; Supervision: CK.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Informed consent
Well-known public datasets were used in the experiments of this study: the CUHK1,34, AR35, XM2VTS36, e-PRIP15,16, IIIT-Delhi40, FG-NET ageing41 and Labeled Faces in the Wild42 databases. Apart from these, one author (Bibek Majumder) used his own image and has given consent for the publication of his image in an online open-access publication.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Majumder, B., Kumar, C., Hajarathaiah, K. et al. Sketch to photo recognition using IF and Fuzzy minimal structure oscillation in the sift domain. Sci Rep 15, 37907 (2025). https://doi.org/10.1038/s41598-025-23417-w

















