Introduction

It is exceedingly challenging to distinguish faces across different modalities in computer vision, for example faces in different poses, photos and face sketches1,2,3,4,5,6,7. With poor-quality video surveillance, it is very difficult to identify a suspect correctly. Since it is not possible to monitor every part of the world with a camera, a good sketch-to-photo matching algorithm (with the sketch drawn from the victim's description) is needed. Reconstructing transparent objects with limited constraints has long been considered a highly challenging problem8,9. Many authors have proposed and shown how fuzzy systems are useful in the field of face recognition10,11,12. Until now, recognizing faces from sketches has remained challenging because of the stark disparities between mug-shot photos and face sketches.

This study presents a primary inquiry into the problem of face identification from sketches. The authors in13 carried out a thorough investigation of face identification based on sketches of several styles. They considered three different kinds of sketches, and their experimental findings show that hand-drawn sketches yield more accurate outcomes than the other two scenarios. This collection of sketches was assembled from publicly available databases, including those from IIIT, CUHK, and e-PRIP (details are given in Sect. 2). Fuzzy logic has an excellent ability to handle ambiguity, approximate reasoning and partial truth. Classes become harder to separate when they overlap and are linearly inseparable. If two classes overlap, then whatever the degree of belongingness may be, some patterns certainly lie between the two classes. In this situation fuzzy logic plays a significant role in recognizing class boundaries by assigning different degrees of belongingness of the same pattern to multiple classes. So, along with artificial neural networks, fuzzy logic plays a very significant role in solving real-life complex problems. Through this study we attempt to show how fuzzy sets can be used in the field of face recognition; the experimental results were encouraging.

The following five areas best describe our contributions:

  • Keypoint regions are found in sketches and digital images using SIFT.

  • These keypoints are then chosen as feature coordinates.

  • After feature extraction, IF \(\:{m}_{X}^{*}\) and fuzzy \(\:{m}_{X}^{*}\) oscillation-based classification are invoked for the recognition stage.

  • To determine similarity, four different scenarios are considered for each pixel (the four cases are listed in Sect. 3).

  • Finally, residual errors are removed using a landmark detector.

Our paper is structured as follows: Sect. 2 reviews pertinent work. The concepts and characteristics of the IF and fuzzy minimum structures and of SIFT are briefly described in Sect. 3; the digital photos and sketch images employed in our experiments are also examined in that section. The algorithm and a thorough explanation of our suggested approach are provided in Sect. 4. Sect. 5 then introduces the landmark detector and discusses error-removal methods. Section 6 presents the experimental outcomes of face identification based on SIFT and fuzzy \(\:{m}_{X}^{*}\) oscillation, as well as those based on SIFT and IF \(\:{m}_{X}^{*}\) oscillation; this section also compares our algorithms with previously defined ones. Section 7 concludes the paper.

Related works

A fascinating study subject in biometrics and computer vision is face matching between two facial photos14. Many composite sketches have been used in the face recognition area during the past few years; some of these can be found in15,16,17,18. Today, fuzzy-based approaches have a significant impact on how face recognition problems are resolved19. Since the IF set uses both membership and non-membership values, we may anticipate that it will provide results with a high degree of precision. Using neural networks together with fuzzy techniques gives impressive output in the face recognition domain for familiar and unfamiliar faces20; however, executing such an algorithm first requires training, and hence a large number of face datasets. Face recognition based on a neuro-fuzzy system is beautifully described by the authors in10,11,12,21,22, which requires precise training techniques. Since that procedure needs a trained neural network for better precision, it consumes a great deal of storage and has a high computational price.

Recently, authors in13 proposed a detailed investigation of face recognition based on numerous artistic sketches. For the identification of the various stylistic sketches, three criteria have been established: (i) face identification from several hand-drawn drawings; (ii) face identification from numerous hand-drawn drawings and composite drawings; and (iii) face identification from numerous composite drawings. The goal of authors is to cover every facial recognition scenario that could possibly occur. Finally, authors concluded that hand-drawn sketches produce results that are more accurate than those produced by other situations.

In23, Zhang et al. proposed an algorithm based on principal component analysis, and they were the first to consider sketches made by multiple artists. A saliency and attribute feedback-based technique is proposed by Mittal et al.24, where the composite sketch was created using software to enhance matching ability. Gao et al.25 created a face sketch and photo synthesis technique using sparse coding and multiple-style sketches, and they used this dataset as the training set; however, during the testing phase this approach considered only one sketch. Since, in the real world, multiple stylistic sketches are very helpful for recognizing a suspect, we took many such sketches for testing. Klare et al. employed multiscale local binary patterns and SIFT feature descriptors in26 to compare forensic sketches to mug-shot photos; they used multiple discriminant projections for minimum-distance matching between the photos. In this paper we use IF \(\:{m}_{X}^{*}\) oscillation and fuzzy \(\:{m}_{X}^{*}\) oscillation to measure the minimum distance between the images. With the proposed algorithm we illustrate a hybrid face recognition technique from sketches.

Preliminaries

This section explains the notations that serve as the foundation for our suggested method. The symbols and definitions of the fuzzy \(\:{m}_{X}^{*}\) open and closed oscillatory operators, fuzzy oscillation, the fuzzy minimum structure, etc. are described in the paragraphs that follow.

Definition 2.1

A collection M of fuzzy sets in a universal set X is referred to as a fuzzy minimum structure on \(\:X\) if \(\:\alpha\:{1}_{X}\:\in\:\:M\), where \(\:\alpha\:\in\:\left[\text{0,1}\right]\)27,28.

Definition 2.2

The following operators from \(\:{I}^{X}\) to \(\:{I}^{X}\) must be defined in order to define fuzzy \(\:{m}_{X}^{*}\) oscillation19:

$$\:\left(\text{i}\right)\:{{\Lambda\:}}_{{a}_{k}}\left(X\right)=inf\left\{{\mu\:}_{{a}_{k}}({X}_{i}):{\mu\:}_{{a}_{k}}({X}_{i})\ge\:{\mu\:}_{{a}_{k}}(X),\:{X}_{i}\in\:G,\:\text{the set}\:G\:\text{is open},\:i=1,2,\dots\:,n\right\}\:=\:\widehat{I},\:\text{else};$$
$$\:\left(\text{ii}\right)\:{Int}_{{a}_{k}}\left(X\right)=sup\left\{{\mu\:}_{{a}_{k}}({X}_{i}):{\mu\:}_{{a}_{k}}({X}_{i})\le\:{\mu\:}_{{a}_{k}}(X),\:{X}_{i}\in\:G,\:\text{the set}\:G\:\text{is open},\:i=1,2,\dots\:,n\right\}\:=\:\varphi\:,\:\text{else};$$
$$\:\left(\text{iii}\right)\:{Cl}_{{a}_{k}}\left(X\right)=inf\left\{{\mu\:}_{{a}_{k}}({X}_{i}):{\mu\:}_{{a}_{k}}({X}_{i})\ge\:{\mu\:}_{{a}_{k}}(X),\:{X}_{i}\in\:G,\:\text{the set}\:G\:\text{is closed},\:i=1,2,\dots\:,n\right\}\:=\:\widehat{I},\:\text{else};$$
$$\:\left(\text{iv}\right)\:{\text{V}}_{{a}_{k}}\left(X\right)=sup\left\{{\mu\:}_{{a}_{k}}({X}_{i}):{\mu\:}_{{a}_{k}}({X}_{i})\le\:{\mu\:}_{{a}_{k}}(X),\:{X}_{i}\in\:G,\:\text{the set}\:G\:\text{is closed},\:i=1,2,\dots\:,n\right\}\:=\:\varphi\:,\:\text{else};$$

Here, for any characteristic \(\:{X}_{i}\) and any feature \(\:{a}_{k}\), \(\:{{\mu\:}_{a}}_{k}\left({X}_{i}\right)\) represents the membership value.

Definition 2.3

The symbol \(\:{O}^{o}\) denotes a fuzzy \(\:{m}_{X}^{*}\) open oscillatory operator from \(\:{I}^{X}\to\:{I}^{X}\), defined by \(\:{O}_{{a}_{k}}^{o}\left(X\right)={{\Lambda\:}}_{{a}_{k}}\left(X\right)-{Int}_{{a}_{k}}\left(X\right)\). Similarly, the fuzzy \(\:{m}_{X}^{*}\) closed oscillatory operator from \(\:{I}^{X}\to\:{I}^{X}\) is denoted by \(\:{O}^{c}\) and defined by \(\:{O}_{{a}_{k}}^{c}\left(X\right)={\text{C}\text{l}}_{{a}_{k}}\left(X\right)-{\text{V}}_{{a}_{k}}\left(X\right)\)19,29.
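As an illustration, the operators of Definitions 2.2 and 2.3 can be sketched in Python for a single feature \(\:{a}_{k}\); this is a minimal reading of the definitions, in which `None` stands in for the undefined values \(\:\widehat{I}\) and \(\:\varphi\:\), and all function names are ours.

```python
# Illustrative sketch of Definitions 2.2-2.3 for one feature a_k.
# open_mus / closed_mus hold membership values of open / closed set members;
# None plays the role of the undefined values (I-hat and phi in the text).

def lam(open_mus, mu_x):
    """Lambda_{a_k}(X): infimum of open-set memberships that dominate mu_x."""
    cands = [m for m in open_mus if m >= mu_x]
    return min(cands) if cands else None

def intr(open_mus, mu_x):
    """Int_{a_k}(X): supremum of open-set memberships dominated by mu_x."""
    cands = [m for m in open_mus if m <= mu_x]
    return max(cands) if cands else None

def cl(closed_mus, mu_x):
    """Cl_{a_k}(X): infimum of closed-set memberships that dominate mu_x."""
    cands = [m for m in closed_mus if m >= mu_x]
    return min(cands) if cands else None

def vee(closed_mus, mu_x):
    """V_{a_k}(X): supremum of closed-set memberships dominated by mu_x."""
    cands = [m for m in closed_mus if m <= mu_x]
    return max(cands) if cands else None

def open_oscillation(open_mus, mu_x):
    """O^o_{a_k}(X) = Lambda - Int (Definition 2.3); None if either is undefined."""
    a, b = lam(open_mus, mu_x), intr(open_mus, mu_x)
    return a - b if a is not None and b is not None else None
```

The closed oscillatory operator \(\:{O}^{c}\) follows the same pattern with `cl` and `vee` over the closed-set memberships.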

Definition 2.4

In the context of intuitionistic fuzzy \(\:{m}_{X}^{*}\:\)oscillation (IF \(\:{m}_{X}^{*}\:\)oscillation), a member of the IF set family in X is said to be an IF \(\:{m}_{X}^{0*}\) structure if \(\:{1}_{\sim}\) or \(\:{0}_{\sim}\) is a member of \(\:{m}_{X}^{*}\) and \(\:\alpha\:\) is an IF set with \(\:\alpha\:\in\:{m}_{X}^{*}\).

Fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation based face recognition

The authors30 presented the idea of fuzzy \(\:{m}_{X}^{*}\) oscillation. The authors also discuss how choosing between various oscillatory operator settings can help one determine whether two things are different or similar.

Due to brightness, darkness, or varied poses, various images form an image database in which every pixel, with a grey-level membership value between 0 and 1, must be present in the collection. The constituents of this collection are \(\:{m}_{X}^{*}\) open set structures. Let us say the gallery picture dataset includes the images \(\:I=\:{I}_{1}\:,\:\:{I}_{2}\:,\:\dots\:,\:{I}_{M}\). With the aid of fuzzy \(\:{m}_{X}^{*}\) oscillation, we must now discriminate between an unknown image and a recognized image. For that, using the oscillatory operator, we must discuss the four situations that follow. The following circumstances may arise for \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\):

(1) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)=1\:\text{o}\text{r}\:0\)

(2) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\in\:\left(\text{0,1}\right)\)

(3) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-\varphi\:\)

(4) \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)=\widehat{I}-{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\)

Situation (1)

(i) Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)=0.\)

$$\:\iff\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)={\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right).$$


\(\:\iff\:\) Component \(\:(i,\:j)\) intensity of the training image equals component \(\:(i,\:j)\) intensity of the unknown image.

\(\:\iff\:\) The component \(\:(i,\:j)\) in the unfamiliar (probe) picture and the training picture are identical.

(ii) Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)=1\).

$$\:\iff\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)=1\:and\:{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)=0.$$

\(\:\iff\:\) That is, the intensity of the unfamiliar picture at component \(\:(i,j)\) does not lie within the predetermined range.

Consequently, this pixel cannot be compared with component (i, j) of a familiar picture, and the component is indeterminate.

Situation (2)

Let, \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\in\:\left(\text{0,1}\right)\).

(i) There might be a likeness between the photos at the pixel \(\:(i,\:j)\) if \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\le\:0.1\).

(ii) If \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)\ge\:0.1\), the variance between the intensity of the unfamiliar picture at component \(\:(i,j)\) and \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\) or \(\:{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\) must be checked. We pick this pixel if the variation is less than 0.1.

Situation (3)

Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)={{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-\varphi\:\).

In this instance, we must determine whether the intensity of the unfamiliar picture at pixel \(\:(i,j)\:\)differs from \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\) or \(\:{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\). If the difference is less than 0.1, we use this pixel; otherwise, we use another.

Situation (4)

Assume \(\:{\text{O}}_{{I}_{K}}^{o}\left({y}_{ij}\right)=\widehat{I}-{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\).

In this instance, we must determine whether the intensity of the unfamiliar picture at pixel \(\:(i,j)\:\)differs from \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)\). The picture may be regarded as familiar at component \(\:(i,j)\) if this variation is less than 0.1, otherwise distinct.
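One way to read situations (1)-(4) as a per-pixel decision rule is the following Python sketch. The function name, the use of `None` as a marker for the undefined values \(\:\widehat{I}\) and \(\:\varphi\:\), and the uniform fallback handling of situations (3)-(4) are our assumptions, not part of the original formulation.

```python
# Hypothetical per-pixel decision rule for situations (1)-(4).
# lam / intv are the Lambda and Int operator values at one pixel;
# None marks an undefined operator (I-hat or phi in the text).

def pixel_similar(lam, intv, unknown, thresh=0.1):
    if lam is None or intv is None:
        # situations (3)/(4): compare the unknown intensity with whichever
        # operator is defined; similar when the gap is below the threshold
        defined = intv if lam is None else lam
        return abs(defined - unknown) < thresh
    osc = lam - intv                     # open oscillation O^o
    if osc == 0:                         # situation (1): identical pixel
        return True
    if osc == 1:                         # situation (1): indeterminate pixel
        return None
    if osc <= thresh:                    # situation (2)(i): likely similar
        return True
    # situation (2)(ii): fall back to a direct comparison with Lambda or Int
    return min(abs(lam - unknown), abs(intv - unknown)) < thresh
```

A pixel is then counted toward the similarity score only when the rule returns `True`; indeterminate pixels (situation 1 with oscillation 1) are skipped.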

Face identification using intuitionistic fuzzy (IF) \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

In31, the authors described the algorithm of intuitionistic fuzzy \(\:{m}_{X}^{*}\) oscillation-based face recognition. In19,32, the authors described how fuzzy \(\:{m}_{X}^{*}\) oscillation may be useful in face recognition. The hypothesis that fuzzy sets oscillate between two closed and open fuzzy sets is the main foundation of fuzzy \(\:{m}_{X}^{*}\) oscillation. To compare the training and tested photos, we use the IF set \(\:{m}_{X}^{*}\) oscillation, and we treat the non-membership image as a closed set for this purpose. Since non-membership images are treated as closed sets, each non-membership image's pixel values satisfy \(\:\{0\:\le\:\:{{\upmu\:}}_{{a}_{j}}({x}_{i})\:+\:{{\upgamma\:}}_{{a}_{j}}({x}_{i})\:\le\:\:1,\:\:j=\:1,\:2,\:3\dots\:.\}.\) Similarly, for the unknown (i.e. tested) image, both the membership and non-membership images are formed. Oscillatory operators are then computed from these membership and non-membership images; the IF open and IF closed operators are used to create an oscillatory operator matrix.

Forming the non-membership image for each image (i.e. unknown and known) is crucial for the similarity measurement algorithm. First, some keypoints are calculated using the SIFT algorithm (discussed in the next section). These keypoints are treated as feature coordinates. The non-membership values are then calculated from these feature coordinates, with \(\:\{0\:\le\:\:{{\upmu\:}}_{{a}_{j}}({x}_{i})\:+\:{{\upgamma\:}}_{{a}_{j}}({x}_{i})\:\le\:\:1,\:\:j=\:1,\:2,\:3\dots\:.\}\). The largest pixel value among these feature coordinates is used to create the non-membership values, which form a closed set. The highest pixel value difference, i.e. \(\:1\:-{{\upmu\:}}_{{a}_{max}}\), is used to construct the base pixel value, and the IF condition \(\:\left[\right(\:\:{{\upmu\:}}_{{a}_{j}}+\:{{\upgamma\:}}_{{a}_{j}}\:)\le\:\:1]\) is imposed to generate the remaining pixel values.
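Under one reading of this construction (our assumption: every pixel's non-membership is capped at the base value \(\:1-{{\upmu\:}}_{{a}_{max}}\), which automatically satisfies \(\:{\upmu\:}+{\upgamma\:}\le\:1\)), the non-membership values can be sketched as:

```python
# Hypothetical sketch of the non-membership construction described above:
# the base value is 1 - mu_max, and each pixel's gamma is capped so that
# the IF condition mu + gamma <= 1 always holds.

def non_membership(mus):
    base = 1.0 - max(mus)                 # "highest pixel value difference"
    return [min(1.0 - m, base) for m in mus]
```

Since \(\:1-{\upmu\:}_{i}\ge\:1-{{\upmu\:}}_{{a}_{max}}\) for every pixel, the cap always yields the base value, and \(\:{\upmu\:}_{i}+{\upgamma\:}_{i}\le\:{{\upmu\:}}_{{a}_{max}}+(1-{{\upmu\:}}_{{a}_{max}})=1\).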

Each case’s explanation with an example

Let us see how the supremum and infimum values are determined from the tuple at each fixed coordinate. Comparing each coordinate of a distinct image with the unknown image's pixel value at the same coordinate, Python allows us to extract some coordinate pixel values in the manner shown below.

  • I1: (x(21, 30), 〈0.786, 0.214〉), (x(27, 23), 〈0.335, 0.665〉), (x(15, 19), 〈0.435, 0.565〉), (x(15, 27), 〈0.512, 0.488〉)

  • I2: (x(21, 30), 〈0.663, 0.337〉), (x(27, 23), 〈0.292, 0.708〉), (x(15, 19), 〈0.341, 0.659〉), (x(15, 27), 〈0.685, 0.315〉).

  • I3: (x(21, 30), 〈0.373, 0.627〉), (x(27, 23), 〈0.275, 0.305〉), (x(15, 19), 〈0.241, 0.759〉), (x(15, 27), 〈0.631, 0.369〉)

  • I4: (x(21, 30), 〈0.466, 0.534〉), (x(27, 23), 〈0.308, 0.692〉), (x(15, 19), 〈0.531, 0.469〉), (x(15, 27), 〈0.527, 0.473〉)

  • I5: (x(21, 30), 〈0.513, 0.487〉), (x(27, 23), 〈0.541, 0.459〉), (x(15, 19), 〈0.499, 0.801〉), (x(15, 27), 〈0.735, 0.265〉).

The membership and non-membership values of those pictures at the corresponding pixel positions are represented by I1–I5. The membership values are categorized as an open set in this regard, while the non-membership values are categorized as a closed set. Let J be an unknown object with the following pixel values.

J: (x(21, 30), 〈0.543, 0.457〉), (x(27, 23), 〈0.435, 0.565〉), (x(15, 19), 〈0.413, 0.587〉), (x(15, 27), 〈0.369, 0.331〉).

For the pixel point (21, 30), the candidates for picture J's Λ operator are 〈0.786, 0.214〉 and 〈0.663, 0.337〉. Among the qualifying open-set values the infimum is taken (ties broken by the smaller non-membership value), so the operator will be 〈0.663, 0.337〉. Each ‘feature coordinate’ has its Λ and Int value determined in this manner. Image J’s open oscillation operators \(\:{O}_{J}^{o}\) are:

$$\:{O}_{J}^{o}\left[\text{21,30}\right]=<{{\Lambda\:}}_{{J}_{\mu\:}}-{Int}_{{J}_{\mu\:}},{{\Lambda\:}}_{{J}_{\gamma\:}}-{Int}_{{J}_{\gamma\:}}>\:=\:<\left(0.786-0.663\right),\left(0.335-0.292\right)>\:=\:<\text{0.123, 0.043}>$$

Since \(\:\mu\:\:<\:1\) and \(\:\gamma\:\:<\:0.1\), situation 2 will be applied. The pixels at this point are similar based on the criteria of situation 2. In the same way the remaining coordinates will also be computed.

$$\:{O}_{J}^{o}\left[\text{27,23}\right]=<{{\Lambda\:}}_{{J}_{\mu\:}}-{Int}_{{J}_{\mu\:}},{{\Lambda\:}}_{{J}_{\gamma\:}}-{Int}_{{J}_{\gamma\:}}>\:=\:<\left(0.541-0.275\right),\left(0.459-0.305\right)>\:=\:<\text{0.266, 0.154}>$$

Since \(\:\mu\:\:<\:1\) and \(\:\gamma\:\:<\:0.1\), situation 2 will be applied. The difference between Λ and the unknown pixel point must be computed, i.e., (0.435 − 0.266) = 0.169 > 0.1; in accordance with the situation-2 requirement \(\:\mu\:\:>\:0.1\), this indicates that the pixels are different at this location. The remaining coordinates are computed in the same way.

$$\:{O}_{J}^{o}\left[\text{15,19}\right]=<{{\Lambda\:}}_{{J}_{\mu\:}}-{Int}_{{J}_{\mu\:}},{{\Lambda\:}}_{{J}_{\gamma\:}}-{Int}_{{J}_{\gamma\:}}>\:=\:<\left({\upvarphi\:}-0.199\right),\:0.801>$$

Since \(\:\gamma\:\:<\:0.9\), this will be regarded as situation 4. With the unknown image pixel, we must next determine the difference of \(\:\gamma\:\), which is (0.801 − 0.587) = 0.214 < 0.9. Next, we must determine the difference between the membership value and the unknown image pixel, i.e., (0.499 − 0.413) = 0.086 < 0.1. Thus, the pixels at this pixel location are comparable. Next, \(\:{O}_{J}^{o}\left[\text{15,27}\right]=<{{\Lambda\:}}_{{J}_{\mu\:}}-{Int}_{{J}_{\mu\:}},{{\Lambda\:}}_{{J}_{\gamma\:}}-{Int}_{{J}_{\gamma\:}}>\:=\:<\left(0.512-\widehat{I}\right),\:0.488>\). Since \(\:\gamma\:\:<\:0.9\), this will be regarded as situation 3. Finding the difference with the unknown image pixel is the next step, which is \(\:(0.488\:-\:0.331)\:=\:0.157\:<\:0.9\). With the unknown picture pixel, we must then determine the difference of \(\:\mu\:\), which is \(\:(0.512\:-\:0.369)\:=\:0.143\:>\:0.1\). At this pixel position, the pixels are therefore different.
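The selection of the Λ and Int operators from IF pairs in the example above can be sketched as follows; the function names and the tie-breaking rule by smaller non-membership are our reading of the text, and the gallery values reuse the paper's pixel (21, 30) from images I1-I5.

```python
# Hedged sketch: picking Lambda and Int for one feature coordinate from IF
# pairs <mu, gamma>. Among qualifying memberships take the inf (resp. sup)
# of mu, breaking ties by the smaller non-membership value.

def lam_pair(pairs, mu_unknown):
    cands = [p for p in pairs if p[0] >= mu_unknown]
    if not cands:
        return None                          # I-hat: Lambda undefined
    m = min(p[0] for p in cands)
    return min((p for p in cands if p[0] == m), key=lambda p: p[1])

def int_pair(pairs, mu_unknown):
    cands = [p for p in pairs if p[0] <= mu_unknown]
    if not cands:
        return None                          # phi: Int undefined
    m = max(p[0] for p in cands)
    return min((p for p in cands if p[0] == m), key=lambda p: p[1])

# Gallery IF values at pixel (21, 30) from images I1-I5; probe mu = 0.543:
gallery = [(0.786, 0.214), (0.663, 0.337), (0.373, 0.627),
           (0.466, 0.534), (0.513, 0.487)]
```

With this reading, `lam_pair(gallery, 0.543)` selects 〈0.663, 0.337〉 and `int_pair(gallery, 0.543)` selects 〈0.513, 0.487〉, matching the Λ choice in the worked example.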

SIFT or scale invariant feature transform

David G. Lowe described his research strategy, known as SIFT33, for extracting keypoints from an image collection. SIFT has been utilised extensively in the field of face identification for the past 14 years, mostly for finding the characteristics of keypoints. We call it scale invariant because it supplies almost the same keypoints after scaling the image (in this proposed research work, keypoints are treated as feature coordinates); it also provides approximately the same keypoints for rotated images. Scale-space extrema detection, discarding of unreliable keypoints, orientation assignment, and keypoint-descriptor calculation are the required steps of this technique. For example, Fig. 1 represents the image gradients and keypoint descriptor, Fig. 2 shows the results of the SIFT algorithm on gallery images, and Fig. 3 shows the keypoint region following SIFT application.
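To make the first stage concrete, the following pure-Python sketch detects difference-of-Gaussians extrema on a tiny synthetic image. It is illustrative only: real SIFT adds keypoint refinement, orientation assignment and 128-D descriptors, and all names and parameter values here are our own choices.

```python
import math

def gaussian_kernel(sigma):
    r = max(1, int(3 * sigma))
    ks = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-r, r + 1)]
    s = sum(ks)
    return [k / s for k in ks]

def blur(img, sigma):
    """Separable Gaussian blur with edge clamping."""
    h, w = len(img), len(img[0])
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    tmp = [[sum(k[j + r] * img[y][min(max(x + j, 0), w - 1)]
                for j in range(-r, r + 1)) for x in range(w)] for y in range(h)]
    return [[sum(k[j + r] * tmp[min(max(y + j, 0), h - 1)][x]
                 for j in range(-r, r + 1)) for x in range(w)] for y in range(h)]

def dog_extrema(img, sigmas=(1.0, 1.6, 2.56, 4.1)):
    """Points that are extrema of the difference-of-Gaussians stack across
    space and scale, as in SIFT's scale-space extrema detection stage."""
    h, w = len(img), len(img[0])
    blurred = [blur(img, s) for s in sigmas]
    dogs = [[[b2[y][x] - b1[y][x] for x in range(w)] for y in range(h)]
            for b1, b2 in zip(blurred, blurred[1:])]
    pts = []
    for s in range(1, len(dogs) - 1):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = dogs[s][y][x]
                nbrs = [dogs[s + ds][y + dy][x + dx]
                        for ds in (-1, 0, 1) for dy in (-1, 0, 1)
                        for dx in (-1, 0, 1) if (ds, dy, dx) != (0, 0, 0)]
                if v > max(nbrs) or v < min(nbrs):
                    pts.append((x, y))
    return pts

# a bright blob on a dark background as a toy test image
img = [[1.0 if (x - 8) ** 2 + (y - 8) ** 2 <= 4 else 0.0
        for x in range(17)] for y in range(17)]
extrema = dog_extrema(img)
```

In practice one would use an existing SIFT implementation rather than this sketch; the point is only to show what "scale-space extrema" means operationally.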

Fig. 1
figure 1

Here 2 × 2 sub-regions are derived from an 8 × 8 region, while throughout the trials we used a 16 × 16 neighborhood and 4 × 4 sub-regions.

Fig. 2
figure 2

Images in (a) are gallery images, while the results of the SIFT algorithm show the images in (b).

Fig. 3
figure 3

Keypoint region following SIFT application.

Databases of face sketches

We have gathered publicly accessible sketch datasets, such as the CUHK1,34 sketch database, which consists of three distinct databases: the CUHK student database, the AR database, and the XM2VTS database. The CUHK student database has 188 sketch-photo pairs, the AR database35 has 123 pairs, and the XM2VTS database36 has 295 pairs, for a total of 606 pairs. Because the XM2VTS database is not free, we employed a total of 311 sketch pairs. Next, we compared the result with the e-PRIP dataset's composite sketches. The dataset of Han et al.37 was extended to create the e-PRIP15,16 dataset, which is the sole dataset provided for composite sketches; 123 composite and digital pictures from the AR dataset35 are included in this collection. The dataset includes four sets of composite sketches made by American, Asian, and Indian artists with the aid of the Identi-Kit38 and FACES39 tools. The CUHK sketch databases contain images with fixed backgrounds and extremely well-controlled lighting, so despite what we observed, we cannot claim for certain that facial recognition is achievable using these well-known databases alone. We therefore gathered sketch-digital images created by experts, containing a variety of sketches and images from different sources; this database has 231 image pairs. Among them are the IIIT-Delhi databases, which have 568 sketch-digital image pairings40, the FG-NET ageing databases, which have 67 sketch-digital picture pairings41, and the Labeled Faces in the Wild databases, which have 92 drawings and their original photo pairs42. The IIIT-Delhi databases are more aesthetically pleasing and intellectually fascinating than the CUHK databases, and they furthermore contain sketch-digital picture pairings made by their students and instructors. In addition, we have gathered some images with diverse facial expressions, motions, and lighting.
Several of these images are highlighted in Fig. 4.

Fig. 4
figure 4

A few drawings (collected from different sources).

Sift, fuzzy and IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation based face identification

Fuzzy and IF \(\:{m}_{X}^{*}\) oscillation were both used in our suggested approach. The procedure and method for matching sketches to images utilising IF \(\:{m}_{X}^{*}\) oscillation is introduced after the concepts of SIFT and fuzzy \(\:{m}_{X}^{*}\) oscillation. In the experimental section, the outputs of the fuzzy and IF minimal-structure oscillations are contrasted with those of other methods. To distinguish faces utilizing fuzzy \(\:{m}_{X}^{*}\) oscillation, we first collected probe sketches and digital photos from a gallery. Then SIFT is utilized to highlight important areas in both the sketch and the photos; in Fig. 5 the green plus marks indicate the feature coordinates. The matrix of unfamiliar photos at each feature coordinate must then be compared with the pixel values at significant positions in the known image matrix. For that, we first have to find the negative image of the given image; in Fig. 6, column (b) represents the negative image of the image in column (a). The first measurements are performed with the operators Λ (the Cl operator for negative photos) and Int (the V operator for negative photos). The open oscillatory operator (or closed oscillatory operator for negative images) is, roughly, the difference between the Λ and Int operators. Depending on the circumstances19 that are taken into consideration for determining similarity, the closed and open oscillatory operators are the targeted matrix values from both images.

Fig. 5
figure 5

Green plus points indicate the feature coordinates.

Fig. 6
figure 6

The images in column (b) are the negative images of those in column (a).

Face identification process based on SIFT and fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

Step 1: Assume that there are \(\:N\) photographs in our image database. SIFT is applied to each database photo, and the pixel values at each training image's crucial areas are computed. Feature coordinates of the unfamiliar photos are simultaneously extracted using SIFT.

Step 2: We compute the infimum value with the Λ operator and the supremum value with the Int operator (INT_OPTR).

Step 3: The complementary image's \(\:CL\_OPTR\) and \(\:V\_OPTR\) values should be determined. Then, using the normal picture and the complementary images, the open oscillatory value \(\:{O}^{o}\) and the closed oscillatory value \(\:{O}^{cl}\) are determined.

Step 4: After computing the closed and open oscillatory operators in accordance with19, the four cases of the similarity measure are taken into consideration.
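Per feature coordinate, Steps 2-3 can be sketched as follows. The plain-list representation and helper names are our own, and the complement values play the role of the negative image from Step 3.

```python
# Hypothetical sketch of Steps 2-3 at one feature coordinate: Lambda/Int on
# the normal intensities and Cl/V on the complement (negative) intensities,
# then the open and closed oscillatory values.

def oscillations(gallery_vals, unknown_val):
    ge = [v for v in gallery_vals if v >= unknown_val]
    le = [v for v in gallery_vals if v <= unknown_val]
    lam = min(ge) if ge else None                  # Lambda operator
    intv = max(le) if le else None                 # Int operator (INT_OPTR)
    comp = [1 - v for v in gallery_vals]           # complement image values
    cu = 1 - unknown_val
    cl_ = min([v for v in comp if v >= cu], default=None)   # CL_OPTR
    v_ = max([v for v in comp if v <= cu], default=None)    # V_OPTR
    open_osc = lam - intv if None not in (lam, intv) else None    # O^o
    closed_osc = cl_ - v_ if None not in (cl_, v_) else None      # O^cl
    return open_osc, closed_osc
```

Step 4 then feeds these oscillatory values into the four similarity cases of Sect. 3 for a component-by-component decision.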

Pseudocode of fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

The pseudocode below was developed in this study to recognize faces utilizing SIFT and fuzzy \(\:{m}_{X}^{*}\) oscillation.

figure a
figure b

Block diagram for face identification based on SIFT with fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

This flowchart (Fig. 7) displays the enhanced face identification algorithm using fuzzy \(\:{m}_{X}^{*}\) and SIFT. SIFT is used to identify the most promising keypoints before attempting to distinguish unfamiliar face shots from a collection of familiar sketch face photos, and several cases are then considered for measuring similarity at each pixel.

Fig. 7
figure 7

The flowchart for face identification using SIFT and fuzzy \(m_X^*\) oscillations.

Face identification based on SIFT and IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

Step 1: \(\:\text{M}\) images from an image database are utilised as training examples or input images. Specific feature values derived from predefined coordinates must be present in every database image. These feature values quantify the membership and non-membership pixels, producing a non-membership image. The unknown image's feature values are simultaneously acquired at the same coordinates.

Step 2: The Λ and Int operators will now be computed. The infimum-value operator Λ is determined by comparing the “feature value matrix” of the set of training pictures with the unfamiliar image (choosing, among ties, the entry with the smaller γ). The supremum-value Int operator (INT_OPTR) is calculated in the same manner.

Step 3: The closed oscillatory values, i.e., \(\:{\text{C}\text{l}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{V}}_{{I}_{K}}\left({y}_{ij}\right)\) from the non-membership images, and the open oscillatory values, i.e., \(\:{{\Lambda\:}}_{{I}_{K}}\left({y}_{ij}\right)-{\text{I}\text{n}\text{t}}_{{I}_{K}}\left({y}_{ij}\right)\) from the normal image, are evaluated.

Step 4: Four cases of \(\:{\text{O}}^{\text{o}}\) and \(\:{\text{O}}^{\text{c}\text{l}}\) should be taken into account for similarity measurement. To determine similarity and difference, a component-by-component comparison is performed.

Pseudocode of IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

The technique utilised in the proposed algorithm for face identification using SIFT and IF \(\:{m}_{X}^{*}\) oscillation follows these steps:

figure c
figure d

Block- diagram of face identification based on SIFT and IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation

The suggested technique can be employed for face identification using IF \(\:{m}_{X}^{*}\:\) and SIFT, as shown in this flowchart (Fig. 8).

Fig. 8
figure 8

The flowchart of SIFT and IF \(m_X^*\) oscillation-based face recognition.

Removing error using a 6-point facial landmark detector

To find the facial landmarks on frontal faces, a shape estimator implemented in the dlib library was applied, based on43,44. This shape estimator gives 68 facial landmarks, including the corners of the eyes, lips, eyebrows, nose tip, etc. (Fig. 9(a)). In the proposed approach we reduced the 68 facial landmarks to 6 facial landmark points: the left and right corner points of the left and right eyes, the nose tip, and 3 points on the mouth (Fig. 9(b)). We use 6 facial landmark points because we want to create two regions on the face and use these regions in our future work; creating these regions requires only 6 landmark points, which is why we reduced the 68 points to 6.

Fig. 9
figure 9

(a) is the 68-point facial landmarks and (b) is the 6-point facial landmarks (After modification).

Now, let us take these points as \(\:A({x}_{1},{y}_{1}),\:B\left({x}_{2},{y}_{2}\right),\:O\left({x}_{3},{y}_{3}\right),\:C\left({x}_{4},{y}_{4}\right),\:D\left({x}_{5},{y}_{5}\right)\:\) and \(\:E({x}_{6},{y}_{6})\), where \(\:({x}_{i},{y}_{i}),\:i=1,\dots\:,6\), denote the pixel positions of the 6 landmark points. Using the landmark points 37, 46, 31, 65, 58 and 61, we can easily find the pixel positions of A, B, O, C, D and E. Then, using the following distance formulas, we measure the ratio of perimeters between the triangle AOB and the quadrilateral OCDE (Fig. 10).

$$\:\text{P}\text{e}\text{r}\text{i}\text{m}\text{e}\text{t}\text{e}\text{r}\:\text{o}\text{f}\:\text{t}\text{h}\text{e}\:\text{t}\text{r}\text{i}\text{a}\text{n}\text{g}\text{l}\text{e}\:AOB=\:\sum\:_{i=1}^{2}\sqrt{{({x}_{i+1}-{x}_{i})}^{2}+{({y}_{i+1}-{y}_{i})}^{2}}+\sqrt{{({x}_{3}-{x}_{1})}^{2}+{({y}_{3}-{y}_{1})}^{2}}$$
(1)
$$\:\text{P}\text{e}\text{r}\text{i}\text{m}\text{e}\text{t}\text{e}\text{r}\:\text{o}\text{f}\:\text{t}\text{h}\text{e}\:\text{q}\text{u}\text{a}\text{d}\text{r}\text{i}\text{l}\text{a}\text{t}\text{e}\text{r}\text{a}\text{l}\:OCDE=\sum\:_{i=3}^{5}\sqrt{{({x}_{i+1}-{x}_{i})}^{2}+{({y}_{i+1}-{y}_{i})}^{2}}+\sqrt{{({x}_{6}-{x}_{3})}^{2}+{({y}_{6}-{y}_{3})}^{2}}$$
(2)

This ratio is measured for both the sketch and the output images (when there is more than one output image). As our aim is to obtain one correct output image from two or more candidates, we take the image with the smallest difference in perimeter ratio as the output. The working procedure of the proposed method is described with the help of Fig. 11.
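Equations (1)-(2) and the selection rule can be sketched as follows; the coordinates in the example are made-up test values, not actual landmark positions, and the helper names are ours.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def perimeter_ratio(A, B, O, C, D, E):
    """Ratio of the triangle AOB perimeter (Eq. 1) to the quadrilateral
    OCDE perimeter (Eq. 2)."""
    tri = dist(A, B) + dist(B, O) + dist(O, A)
    quad = dist(O, C) + dist(C, D) + dist(D, E) + dist(E, O)
    return tri / quad

def best_match(sketch_pts, candidate_pts):
    """Index of the candidate output image whose perimeter ratio differs
    least from the sketch's ratio (the error-removal rule above)."""
    rs = perimeter_ratio(*sketch_pts)
    return min(range(len(candidate_pts)),
               key=lambda i: abs(perimeter_ratio(*candidate_pts[i]) - rs))
```

For instance, with landmarks A(0, 0), B(4, 0), O(0, 3), C(0, 7), D(4, 7), E(4, 3) the triangle perimeter is 12 and the quadrilateral perimeter is 16, giving a ratio of 0.75.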

Fig. 10
figure 10

A, B, O, C, D, and E represent the landmark points; AOB forms a triangle and OCDE a quadrilateral.

Fig. 11
figure 11

Face sketch to photo recognition proposed method.

Discussion of experimental outcomes

This section presents the experimental findings of our suggested method in detail. We separate the findings into three parts: the first covers the results of fuzzy \(\:{m}_{X}^{*}\) oscillation in the SIFT domain and its comparison with other existing algorithms; the second covers the results of IF \(\:{m}_{X}^{*}\) oscillation in the SIFT domain and its comparison with other existing algorithms; and the last compares the performances of IF set \(\:{m}_{X}^{*}\) oscillation, fuzzy \(\:{m}_{X}^{*}\) oscillation and our algorithm. For the comparative study, we used rank-10 and rank-50 recognition accuracy together with Cumulative Match Characteristic (CMC) curves45.
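Rank-k accuracy and the CMC curve can be computed from a probe-to-gallery similarity matrix; a small illustrative Python helper (not the paper's code) is:

```python
import numpy as np

def cmc_curve(similarity, max_rank=50):
    """Cumulative Match Characteristic: entry k-1 is the fraction of probes
    whose true gallery mate appears within the top-k ranked matches.

    similarity[i, j] is the similarity of probe i to gallery item j; the
    true match for probe i is assumed to sit at gallery index i.
    """
    n = similarity.shape[0]
    order = np.argsort(-similarity, axis=1)  # best match first in each row
    ranks = np.array([int(np.where(order[i] == i)[0][0]) for i in range(n)])
    return np.array([(ranks < k).mean() for k in range(1, max_rank + 1)])

# Rank-10 and rank-50 accuracy are then curve[9] and curve[49].
```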

Experimental results of proposed algorithm (fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) Oscillation in SIFT domain) and its comparison with other existing algorithms

We made use of four important databases: the CUHK Face Sketch dataset (CUFS), the IIIT-D sketch dataset, the e-PRIP database and the IIIT-D semi-forensic sketch database. The results of the suggested approach were then compared with those of existing algorithms on the e-PRIP and IIIT-D semi-forensic sketch databases, as shown in Figs. 12, 13 and 14. The authors of17 reached rank-10 accuracy of 76.4% and 72.3% with attributes, and 67.6% and 69.1% without attributes, for the drawings made by Asian and Indian artists, respectively. Additionally, it is evident from their experiments that, for the drawings made by Asian and Indian artists respectively, they attained rank-50 accuracy of 82.2% and 80% with attributes and 68.1% and 68% without attributes.

Fig. 12
figure 12

Rank-10 cumulative match score results of proposed (with Fuzzy) and other algorithms. (a) FACES (Indian User), (b) Identi-kit (Asian User).

Fig. 13
figure 13

Rank-50 cumulative match score results of proposed (with Fuzzy) and other algorithms. (a) FACES (Indian user), (b) Identi-kit (Asian user).

Fig. 14
figure 14

Rank-50 CMC match score results on IIIT-D sets.

Table 1 Rank-10 recognition accuracy (%) on the e-PRIP composite sketch database.
Table 2 Rank-50 recognition accuracy (%) on the e-PRIP composite drawing image set.
Table 3 Rank-50 recognition accuracy (%) on the IIIT-D Semi-forensic drawing image set.

Experimental results of proposed algorithm (IF \(\:{\varvec{m}}_{\varvec{X}}^{*}\) Oscillation in SIFT domain) and its comparison with other existing algorithms

We assessed our experimental results on the CUHK Face Sketch dataset (CUFS) and the IIIT-D sketch dataset. We then compared our proposed algorithm with others on the e-PRIP database and the IIIT-D semi-forensic sketch database, as shown in Fig. 15. First, we applied the Scale Invariant Feature Transform, which produced the keypoints depicted in Fig. 3. Then, from each facial image and sketch image, the pixel values at the keypoints are collected. We use these keypoints as feature coordinates and compare them using the IF \(\:{m}_{X}^{*}\) oscillation to determine how similar they are. After these initial steps, when images of different persons are compared, the algorithm reports that the faces differ; when images of the same person are compared, the faces are declared identical.
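The descriptor comparison step can be illustrated as follows. This Python sketch uses a plain Euclidean nearest-neighbour comparison with Lowe's ratio criterion as a stand-in similarity test; the paper's actual measure is the IF \(\:{m}_{X}^{*}\) oscillation, which is not reproduced here, and the descriptors themselves would come from a SIFT implementation such as OpenCV's `cv2.SIFT_create()`:

```python
import numpy as np

def match_score(desc_sketch, desc_photo, ratio=0.75):
    """Fraction of sketch descriptors that have a confident nearest
    neighbour among the photo descriptors (Lowe's ratio test)."""
    desc_sketch = np.asarray(desc_sketch, dtype=float)
    desc_photo = np.asarray(desc_photo, dtype=float)
    # pairwise Euclidean distances between the two descriptor sets
    d = np.linalg.norm(desc_sketch[:, None, :] - desc_photo[None, :, :], axis=2)
    good = 0
    for row in d:
        nn1, nn2 = np.sort(row)[:2]   # distances to the two closest photo descriptors
        if nn1 < ratio * nn2:         # unambiguous match
            good += 1
    return good / len(desc_sketch)
```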

Fig. 15
figure 15

(a) and (b) represent rank-10, and (c) and (d) represent rank-50, CMC score results of the proposed (with IF) and other algorithms. (e) represents rank-50 CMC score results.

Fig. 16
figure 16

(a) represents a rank-10 accuracy bar chart and (b) a rank-50 accuracy bar chart of the proposed (with IF) and other algorithms.

Tables 1, 2 and 3 already display the results of various existing algorithms15,16,17,24, and18 on the e-PRIP database and the IIIT-D semi-forensic sketch database at ranks 10 and 50. We conducted the comparison after determining the suggested algorithm’s accuracy at rank-10 and rank-50 for the e-PRIP and IIIT-D semi-forensic sketch databases. At rank-10, we obtained 78.1% and 75% on the Faces (Indian User) and Identi-kit (Asian User) databases, respectively, while at rank-50 we obtained 84% and 78.6% on the same databases. Additionally, at rank-50 on the IIIT-D semi-forensic sketch dataset, our suggested method achieved 93.5%.

Comparison between the performances of IF set \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation, fuzzy \(\:{\varvec{m}}_{\varvec{X}}^{*}\) oscillation and our algorithm

In this paper, two algorithms are developed; the first is called fuzzy \(\:{m}_{X}^{*}\) oscillation with the SIFT operator, and the second is called IF set \(\:{m}_{X}^{*}\) oscillation with the SIFT operator. It is evident from the experimental results of Sect. 5(A) and (B) that both proposed algorithms produce satisfactory results. IF set \(\:{m}_{X}^{*}\) oscillation with the SIFT operator provides \(\:97.1\%\) for sketch faces, and fuzzy set \(\:{m}_{X}^{*}\) oscillation with the SIFT operator provides \(\:97.1\%\) for sketch faces, as shown in Fig. 16. Figure 17 makes it evident that our suggested algorithm produces better results when compared with the other algorithms currently in use30,31. The accuracy comparison is shown in Table 4.

Table 4 Accuracy comparison between Fuzzy \(\:{m}_{X}^{*}\) Oscillation30, IF set Oscillation31, Fuzzy \(\:{m}_{X}^{*}\) Oscillation with SIFT and IF set \(\:{m}_{X}^{*}\) Oscillation with SIFT.
Fig. 17
figure 17

Performance of previously defined algorithms and IF set oscillation vs. our algorithm.

In recent years, deep learning approaches have grown in popularity and effectiveness. Deep learning does, however, have several drawbacks. It needs a huge amount of data in order to outperform previous techniques, and training is very expensive due to the complexity of the data models. In addition, deep learning often requires many workstations and costly GPUs, which raises the cost for users. We conducted a comparative analysis of our method against the deep learning technique of Mittal et al.15, who employ a facial representation based on a deep learning architecture. The performance chart for fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation, the IF set oscillation, and our algorithm for face identification is shown in Fig. 16. The x-axes show the number of feature coordinates, while the y-axes display precision. As the training sets and the number of feature coordinates increase, fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation in the SIFT domain yields better accuracy than other face recognition methods such as fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation30 and IF set oscillation31.

Limitation and future scope

The face recognition performance of the proposed fuzzy SIFT-domain method is promising, but it has some limitations. For large-scale datasets, the method requires substantial computational power, and it may be affected by images that are noisy, blurry, or of low resolution. Additionally, the selection of fuzzy membership functions and SIFT-domain parameters, which may differ between datasets, has a slight impact on overall accuracy.

This work can be extended in multiple directions. To reduce reliance on manual selection of fuzzy functions, adaptive or automated parameter optimization methods can be developed. Pose, illumination, and occlusion variations may be mitigated by integration with deep learning models. Improving the method’s ability to handle noisy and low-resolution images46 will further strengthen its applicability in real-world scenarios. Multi-stage Siamese neural networks47 and multi-task joint learning48 provide methodologies for combining diverse data sources, which can enhance performance. Furthermore, hyperrectangle embedding networks49 offer advanced techniques for modeling complex relationships and incorporating additional information, which could improve the consistency of the feature representation. Moreover, multi-scale spatial–temporal interaction fusion50 addresses related challenges, and these techniques could be adapted to improve accuracy in future work. For surveillance and mobile authentication systems, real-time implementation on GPU/FPGA-based platforms may also be considered. In addition, the approach’s scalability can be evaluated on very large databases, and its use can be expanded beyond face recognition to other biometric and image analysis fields such as gesture recognition and medical imaging.

Conclusion

This work presented a sketch-to-photo face recognition framework that integrates the Scale-Invariant Feature Transform (SIFT) with Intuitionistic Fuzzy (IF) and fuzzy minimal structure oscillation. We used SIFT to locate the keypoints in sketch faces and gallery photos from which features are extracted for our new algorithm. IF \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation and fuzzy \(\:{\text{m}}_{\text{X}}^{\text{*}}\) oscillation-based clustering are then applied to these values. For a better outcome, we ultimately used a modified 6-point facial landmark detector. We validated our algorithm by implementing it as a Matlab application and testing it against several face datasets. Experimental evaluations across multiple benchmark datasets, including CUHK, IIIT-D, and e-PRIP, demonstrate that the proposed approach consistently outperforms several existing methods, achieving recognition accuracies of up to 97.3%. The findings affirm that the fuzzy-based oscillation operators successfully deal with uncertainty and partial similarity between the heterogeneous modalities of sketches and photos. Compared with deep learning approaches, the proposed method is highly accurate while needing neither massive data nor extensive computation, and is thus applicable in forensic and surveillance settings where training data can be limited.

Overall, the work demonstrates the effectiveness of using classical feature extraction in conjunction with fuzzy logic concepts to enhance cross-domain face recognition. Future research can focus on adaptive parameter selection, integration with lightweight deep learning models, and real-time deployment on GPU/FPGA platforms to further expand the applicability of this approach in law enforcement, digital security, and biometric systems.