This repository was archived by the owner on Nov 12, 2019. It is now read-only.

Conversation

@mandykoh
Owner

Currently, Simian fingerprints are simple resamplings of the pixel values in an image. We want to move to a Discrete Cosine Transform based fingerprint, which will provide the following advantages:

  • Better similarity comparison that weighs important visual details more.
  • Simpler derivation of lower-level fingerprints from a higher-level one (basically a crop of the DCT coefficients).
  • No need to keep thumbnails in the index for generating lower-level fingerprints.
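
To illustrate the "crop of the DCT coefficients" idea: the low-frequency coefficients sit in the top-left corner of the coefficient matrix, so a coarser fingerprint could be taken by keeping only that corner. The following is a minimal sketch, not Simian's actual code; the function name `cropCoefficients` and the row-major `[]int16` layout are assumptions made for the example.

```go
package main

import "fmt"

// cropCoefficients returns the top-left size×size block of a row-major
// width×width coefficient matrix. The name and the layout are hypothetical,
// chosen only to illustrate deriving a lower-level fingerprint.
func cropCoefficients(coeffs []int16, width, size int) []int16 {
	if size > width {
		size = width
	}
	out := make([]int16, 0, size*size)
	for u := 0; u < size; u++ {
		// Copy the first `size` entries of row u.
		out = append(out, coeffs[u*width:u*width+size]...)
	}
	return out
}

func main() {
	// A 4×4 stand-in for a matrix of DCT coefficients.
	full := []int16{
		1, 2, 3, 4,
		5, 6, 7, 8,
		9, 10, 11, 12,
		13, 14, 15, 16,
	}
	fmt.Println(cropCoefficients(full, 4, 2)) // [1 2 5 6]
}
```

Because the crop only reads from the stored coefficients, no thumbnail would need to be kept around to produce it.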

@mandykoh mandykoh added the wip label Jun 11, 2017
@mandykoh mandykoh mentioned this pull request Jun 11, 2017
@mandykoh mandykoh force-pushed the dct-based-fingerprinting branch 2 times, most recently from d66124e to b1e5daf Compare August 17, 2017 12:30
@mandykoh mandykoh force-pushed the dct-based-fingerprinting branch from b1e5daf to fada4ef Compare August 18, 2017 08:25
}

- result[u*width+v] = float32(sum)
+ result[u*width+v] = int16(sum)
Collaborator

Why is the result no longer a float? I am looking at the example of DCT-II values shown below the DCT table picture in this IDCT example. The values in the matrix look like floats... and I would think that converting them to int16 would remove most of the information. 🤔 What am I missing?

Owner Author

@mandykoh mandykoh Aug 19, 2017

I’m not sure how they’re representing the colour values in that example. They might be using 0.0–1.0, or they might be doing some scaling (there’s a lot of leeway for scaling a DCT to make it easier to handle for different purposes, and it doesn’t affect the result).

Here are the greyscale values I get for the same 8x8 'A' character image they use, expressed as 8-bit signed integers (I think this will be nicest for fingerprinting purposes, for reasons we can discuss):

  127  127  127  127  127  127  127  127
  127  127   93 -128   42  127  127  127
  127  127    8  -94  -60  127  127  127
  127  127 -111   42 -111   93  127  127
  127   42 -128 -128 -128    8  127  127
  127  -60    8  127   59 -111  127  127
   93 -128  110  127  127  -94   42  127
  127  127  127  127  127  127  127  127

Here’s the resulting DCT using the current function (which may benefit from some scaling, e.g. to pack the coefficients into 8 or even 4 bits; I haven’t tested that yet):

  4439  -491  1791   215   228   395  -104    80
   318    21   459   402  -800  -447   102  -260
  1503   225 -1021  -277    80  -199   285   480
  -337   -39  -266  -292   647   357  -146   362
   396    23   125   222  -263   -75  -208  -602
    94    43  -481  -296   483   293   -29  -133
   457    55  -105   -22    -5   103  -168  -153
  -428   -63   199    65  -115  -105   192   147

Code used: https://gist.github.com/mandykoh/e90b72ac22c5668dcddba61d690749be
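
For readers without access to the gist, here is a minimal sketch of what such a transform might look like. It assumes an unnormalised 2D DCT-II over signed 8-bit pixels with the sum truncated to int16; the gist's actual code may differ, and the names `dct2D` and `letterA` are made up for this example. Under these assumptions the DC term is just the plain sum of all pixel values, which for the 'A' image above is 4439.

```go
package main

import (
	"fmt"
	"math"
)

// letterA holds the 8x8 greyscale values listed above, row-major.
var letterA = []int8{
	127, 127, 127, 127, 127, 127, 127, 127,
	127, 127, 93, -128, 42, 127, 127, 127,
	127, 127, 8, -94, -60, 127, 127, 127,
	127, 127, -111, 42, -111, 93, 127, 127,
	127, 42, -128, -128, -128, 8, 127, 127,
	127, -60, 8, 127, 59, -111, 127, 127,
	93, -128, 110, 127, 127, -94, 42, 127,
	127, 127, 127, 127, 127, 127, 127, 127,
}

// dct2D computes an unnormalised 2D DCT-II of a square, row-major block.
// u indexes vertical frequency and v horizontal frequency (an assumption).
// Each sum is truncated toward zero when converted to int16.
func dct2D(pixels []int8, width int) []int16 {
	result := make([]int16, width*width)
	n := float64(width)
	for u := 0; u < width; u++ {
		for v := 0; v < width; v++ {
			var sum float64
			for y := 0; y < width; y++ {
				cy := math.Cos(math.Pi * (2*float64(y) + 1) * float64(u) / (2 * n))
				for x := 0; x < width; x++ {
					cx := math.Cos(math.Pi * (2*float64(x) + 1) * float64(v) / (2 * n))
					sum += float64(pixels[y*width+x]) * cy * cx
				}
			}
			result[u*width+v] = int16(sum)
		}
	}
	return result
}

func main() {
	coeffs := dct2D(letterA, 8)
	// Print the DC term and its two nearest neighbours.
	fmt.Println(coeffs[0], coeffs[1], coeffs[8])
}
```

With an 8x8 block of 8-bit signed input, the magnitude of any unnormalised coefficient is at most 64 × 128 = 8192, so the values fit comfortably in an int16.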

Owner Author

(Side discussion: this example isn’t doing gamma correction. Today, x/image/draw doesn’t handle gamma, but I think there are some efforts to make it do so… we may end up having to handle it ourselves here before the DCT.)
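
If gamma did have to be handled by hand before the DCT, one option would be to decode each 8-bit sample to linear light first. The sketch below assumes the input is sRGB-encoded and uses the standard sRGB transfer function; whether that is the right curve for this project is an open assumption, and `srgbToLinear` is a name invented for the example.

```go
package main

import (
	"fmt"
	"math"
)

// srgbToLinear decodes one 8-bit sRGB-encoded sample to linear light in
// the range 0.0 to 1.0, using the piecewise sRGB transfer function.
func srgbToLinear(c uint8) float64 {
	v := float64(c) / 255
	if v <= 0.04045 {
		// Linear segment near black.
		return v / 12.92
	}
	// Power-law segment for the rest of the range.
	return math.Pow((v+0.055)/1.055, 2.4)
}

func main() {
	for _, c := range []uint8{0, 64, 128, 192, 255} {
		fmt.Printf("%3d -> %.4f\n", c, srgbToLinear(c))
	}
}
```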

Collaborator

@gonzalo-bulnes gonzalo-bulnes Aug 19, 2017

Got the relevance of the color representation 👍 and that the results are indeed ints when using the example code (thanks for the example code!). I'm playing with it to understand how the coefficients in the DCT weight the basis functions (the images in the matrix).

I'm not familiar with gamma correction. I saw an example and I think I got the main idea of what the correction does, but maybe not what we would use it for.

Would that act as a sort of normalization, with a potentially different factor for each image? Would we use it as a way to make sure that we take into account as many details in the image as the "luminance? range" allows us to encode? (I'm not sure if the luminance is what we're extracting when converting the original image to a gray scale.)

I observe that two plain images, RGB #888820 and RGB #638888, have the same DCT (the zero matrix, using the example code) despite being at different "distances" from #888888 in terms of how much color was changed to go from #888888 to each of them (88 to 20 and 88 to 63, respectively). (Does that make sense?) And I imagine that different colors do not contribute in the same way to the image luminance. (I'll just keep saying luminance, though it may not be the right term.)

That makes me think that, depending on the color range of the image, part of the gray scale (I think of it as the encoding space we have for the luminance) could end up unused. If I understand correctly, a carefully chosen gamma correction would expand the pixel values to occupy all that space. Am I close?

It will probably be easier to talk anyway, but I wanted to write the idea down before I forget something :P

(I'm picturing it as being like near-IR illumination for security cameras at night: the details are there, we just don't see them. And it happens that the camera has some unused sensitivity / "encoding space" in the near IR that can be leveraged if only we expand the image color range a little bit... (?) I might be getting both things wrong ^^)

Owner Author

Oh sorry, I think I tricked you. If you look closely, you’ll see the example code is only using the R component. :P
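
For comparison, a greyscale conversion that takes all three channels into account could go through Go's standard `image/color` package, whose `GrayModel` weights red, green and blue differently. This is only a sketch of one possible approach, not what the example code does; `lumaRGB` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"image/color"
)

// lumaRGB converts an opaque 8-bit RGB triple to a single grey value using
// the standard library's GrayModel, which weights the channels unequally.
func lumaRGB(r, g, b uint8) uint8 {
	c := color.RGBA{R: r, G: g, B: b, A: 0xff}
	return color.GrayModel.Convert(c).(color.Gray).Y
}

func main() {
	// The red channel is identical in the first two colours, so a
	// red-only conversion cannot tell them apart; a weighted one can.
	fmt.Println(lumaRGB(0x88, 0x88, 0x88))
	fmt.Println(lumaRGB(0x88, 0x88, 0x20))
	fmt.Println(lumaRGB(0x63, 0x88, 0x88))
}
```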

Re gamma: I think it will help with those cases where there are near-white images being compared, since it might help to separate out the detail in that range. Needs testing. I don’t think it’s as valuable for doing any sort of range compression for this kind of application, but I haven’t thought too deeply about it.


3 participants