87 points by yantrams 2 months ago
It knows what a face looks like! https://i.imgur.com/wA1iQ7T.png
It didn't do so well on my most used Slack emoji http://imgs.fyi/img/78s9.png
Ah yes. This has to do with the thresholding 'bug' I discovered some time ago. I will update the algorithm soon. It happens in the preprocessing stage, when a colour query image is converted to a binary image, especially for images with flat/palette colours.
I eventually came up with a contrived set of heuristics to tackle this problem, as you can see in the example below. With the right set of weights, I managed to get accurate thresholding more than 90% of the time for pathological cases like these. --- https://imgur.com/a/XMhdnjH
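For anyone following along: the author's heuristics and weights are not shown in this thread. As a neutral baseline, here is a minimal sketch of Otsu's method, a standard global thresholding technique, applied after a luma conversion. It is not the author's algorithm, and the function names `to_gray`, `otsu_threshold` and `binarize` are made up for illustration.

```python
def to_gray(rgb):
    """Integer luma from an (r, g, b) tuple using Rec. 601 weights."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_lo, sum_lo = 0, 0  # pixels at or below t, and their grey-level sum
    for t in range(256):
        count_lo += hist[t]
        sum_lo += t * hist[t]
        count_hi = total - count_lo
        if count_lo == 0 or count_hi == 0:
            continue  # one class is empty, so no split at this level
        mean_lo = sum_lo / count_lo
        mean_hi = (sum_all - sum_lo) / count_hi
        between = count_lo * count_hi * (mean_lo - mean_hi) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels, t):
    """1 for pixels brighter than t, 0 otherwise."""
    return [1 if p > t else 0 for p in pixels]
```

A single global threshold like this only sees brightness, so two flat palette colours with similar luma can end up on the same side of the cut. That may be part of why extra heuristics are needed for such images.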
Well it did suggest 2 icons of presumably corporate scrum meetings...
Good one. Frameworthy this! Thank you.
Author here. Happy to answer any questions, hear feedback.
I spent some time confused by the animated running horse and how it was supposed to be related with the other icons. Until I found out that it was the loading gif.
I would change it for something more standard.
I found the running horse to be an adorable touch. The author's site is also worth checking out: http://linkdot.link
I believe it would be both adorable and more easily understandable if it were a smaller image, giving a visual hint that it fulfills the role of an icon, not a featured image.
It was the other way for me, I linked it to the duckduckgo icon svg and the horse started. I thought it was a loading animation (and took more than a few seconds) so I threw it to another monitor and continued reading HN.
...30 minutes later the horse is still running and I'm like 'wtf? what does a horse have to do with the DDG logo?' close tab. read comments...
It turns out the app doesn't handle svg (it is actually in the to do list) and returned a 500, but the failure was never presented to the user.
Ouch. Sorry to have put you through that :/ I will fix it asap. And I agree with the parent too. Maybe some text below the horse that says fetching results... would make things clearer.
haha, no bother, we all know what's up here; I'm just busy bikeshedding.
Massive kudos for delivering; I like it a lot :D
Thank you! Fixed it :)
It would be cool if you could draw an icon instead of uploading/linking one, like how Google Translate lets you draw characters
Great idea. I'll definitely look into it. Thanks for the suggestion. Cheers
Nice idea! I did a similar but different project for 32x32 icons.
Please help yourself to my icon data - I spent a while collecting it and hope it can be useful to someone!
Thanks for the share! These icons are golden! I did something similar for logos that could be of interest to you - http://compute.vision/brands/colorpicker.html
Hi, how did you source the whole noun project icon collection?
I scraped it from their website and then asked for their permission by sharing this link with them. They appreciated that I linked all the icons to their website and gave their consent to make this public.
Ah, I see. I was hoping there was an archive to download somewhere. Thanks!
How do/would you measure the accuracy of the algorithm?
The MPEG-7 dataset is what most researchers use to benchmark shape similarity algorithms. There are a couple of other datasets that I used that I can't recollect now. These datasets are relatively simple, with a single shape per image, as opposed to logos and icons that comprise multiple elements in different configurations.
I would test on the MPEG-7 dataset to begin with and, once the precision and recall values are good enough, go ahead with testing on logos and icons. I must've manually tested the algorithm more than 100,000 times, probably because that was the only way to do it with untagged datasets. Quite tedious indeed. This version gives pretty decent results about 7-8 out of 10 times, I'd say.
 - http://www.dabi.temple.edu/~shape/MPEG7/dataset.html
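To make the precision/recall idea concrete, here is a minimal sketch of scoring one query against a labelled dataset. It assumes every image carries a class label and that the search returns a ranked list of labels. The function name and the toy labels are hypothetical, not taken from the author's test setup.

```python
def precision_recall_at_k(query_label, ranked_labels, k, relevant_total):
    """Score the top-k results of one query.

    ranked_labels  -- class labels of the results, best match first
    relevant_total -- how many images of the query's class exist in the dataset
    """
    top = ranked_labels[:k]
    hits = sum(1 for label in top if label == query_label)
    precision = hits / len(top) if top else 0.0
    recall = hits / relevant_total if relevant_total else 0.0
    return precision, recall


# Toy example: the query is a "bat" shape and the dataset holds 3 bats.
ranked = ["bat", "bat", "apple", "bat", "cup"]
print(precision_recall_at_k("bat", ranked, k=4, relevant_total=3))
```

Averaging these two numbers over many queries gives a single score per algorithm version, which is not possible with untagged icon sets.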
Traditional algorithms or ML? Image vectors?
Traditional algorithms and image vectors. I used a concoction of existing region and contour based techniques and threw in some original ideas as well.
Elaborate on some of your ideas or do a blog post?
Been meaning to do a blog post on this forever now :/ I will do it at some point for sure at http://linkdot.link
This is a great starting point in case you are interested in knowing more -- http://www.staff.science.uu.nl/~kreve101/asci/smi2001.pdf More recently, I've been exploring some ideas of Tversky.
One way is to scale the images down and threshold to black and white. At smaller scale they should be identical or much closer.
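A rough sketch of that suggestion, assuming greyscale images stored as nested lists of 0-255 values and an integer scale factor. All names here are illustrative. It block-averages the image down, thresholds to black and white, and measures the fraction of cells that differ.

```python
def downscale(img, factor):
    """Shrink by averaging factor x factor blocks (edge remainder is dropped)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % factor, factor):
        row = []
        for x in range(0, w - w % factor, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def to_binary(img, threshold=128):
    """1 where the pixel is dark (ink), 0 where it is light."""
    return [[1 if v < threshold else 0 for v in row] for row in img]


def signature(img, factor):
    """Small black-and-white fingerprint of the image."""
    return to_binary(downscale(img, factor))


def mismatch(a, b):
    """Fraction of cells that differ between two same-sized binary grids."""
    cells = [(p, q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(p != q for p, q in cells) / len(cells)
```

Two images that differ only by a stray pixel get the same small signature, while comparing them at full resolution still reports a difference.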
I tried with 2 icons I custom created. https://imgur.com/a/ZtseO8j
The first one (the arrow into the door) seems to have worked well for the first three 'similar icons'.
The second one (remove user) didn't work at all. Maybe because it is circled.
In both cases, half the similar icons are 'download' icons, and I can kind of see why for the first case but not at all for the second case.
I suspect it has to do with the lower resolution. I'm using nearest neighbors interpolation for resizing images and have noticed similar behaviour before. It would be great if you could try with higher resolution versions (preferably > 200px) of the same images and let me know the results.
A closer inspection of the results actually shows some of the results aren't that bad a match. Results ordered 1, 4, 5, 7 and 7 in particular vaguely have the same outline as that of the query image. If I have to score this result, I wouldn't give it more than a 3 out of 10 for sure.
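As an illustration of why nearest-neighbour resizing can be fragile, here is a minimal sketch. It is an assumption about the general technique, not the site's actual resize code. Each output pixel copies exactly one source pixel with no blending, so a one-pixel stroke survives or vanishes depending on where it falls on the sampling grid.

```python
def resize_nearest(img, new_w, new_h):
    """Nearest-neighbour resize of a 2D list: each output pixel copies one source pixel."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


# 8x8 image with a one-pixel-wide vertical stroke in column 3.
stroke = [[1 if x == 3 else 0 for x in range(8)] for y in range(8)]

# Halving the size samples columns 0, 2, 4 and 6, so column 3 is never read.
for row in resize_nearest(stroke, 4, 4):
    print(row)
```

Interpolation that averages neighbouring pixels (area or bilinear) would keep a faint trace of the stroke instead of dropping it entirely.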
I just realised the "download" icons aren't meant to be "similar icons"... they allow you to download the one above. Doh!
I've re-tried the "remove user" one but uploaded an SVG instead of a PNG (so technically the resolution is unlimited). Uploaded it both circled and not circled.
Here are the results: https://imgur.com/a/OT8Spjt
:) Please feel free to share the SVGs. I will convert them to PNGs and test them out. I will add SVG support real soon. Right now I've put an exception handler that passes an empty array as query if an image format that can't be decoded is thrown at it :|
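One alternative to passing an empty query is to identify the upload from its first bytes and return an explicit message. This is only a sketch: the `SUPPORTED` set and the function names are assumptions, not the site's code. The PNG, JPEG and GIF signatures are the standard ones.

```python
SUPPORTED = {"png", "jpeg", "gif"}  # assumed list of decodable formats


def sniff_image_format(data: bytes) -> str:
    """Guess the format from leading bytes; returns 'unknown' if nothing matches."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    head = data[:1024].lstrip().lower()
    if head.startswith((b"<?xml", b"<svg")) and b"<svg" in head:
        return "svg"
    return "unknown"


def check_upload(data: bytes) -> dict:
    """Return a result the front end can show instead of failing silently."""
    fmt = sniff_image_format(data)
    if fmt in SUPPORTED:
        return {"ok": True, "format": fmt}
    return {"ok": False, "error": f"Unsupported image format: {fmt}"}
```

With a check like this, an SVG upload would produce a visible "unsupported format" message rather than an empty result set.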
Sure - is your email firstname.lastname@example.org? If you don't want to post your email publicly, can you email email@example.com and I'll reply with both SVGs + PNGs.
I actually uploaded SVGs so I think you might already (unintentionally) support SVGs?
Also tried this "in-out" icon. I can kinda see how the algorithm got the results. Happy to send you a bunch of SVGs if you want for testing BTW.
Tried to get the icon recognized and discovered it works best with black on white: https://imgur.com/a/EyuJhQA
Thanks a ton for testing it quite exhaustively! Would really appreciate it if you can share the second image.
I spent a lot of time working on a hack for 'normalizing' white on black and black on white backgrounds, and also on choosing dynamically between adaptive and gaussian thresholds during preprocessing.
Here is an example with white on black and black on white variations of Nike logo that works as intended.
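The author's hack is not described in the thread. One simple way to normalise polarity, shown here purely as an assumed illustration, is to treat the image border as background and invert the binary image whenever most border pixels are bright. The function names are made up.

```python
def border_pixels(img):
    """Collect the pixels on the outer edge of a 2D binary image."""
    h, w = len(img), len(img[0])
    top_bottom = img[0] + (img[-1] if h > 1 else [])
    sides = [img[y][x] for y in range(1, h - 1) for x in {0, w - 1}]
    return top_bottom + sides


def normalize_polarity(img):
    """Return a copy where the background is 0 and the shape is 1.

    Assumes the background touches most of the border, which fails for
    shapes that fill the frame.
    """
    border = border_pixels(img)
    if sum(border) * 2 > len(border):  # border is mostly bright: invert
        return [[1 - v for v in row] for row in img]
    return [row[:] for row in img]
```

After this step, a white-on-black logo and its black-on-white variant produce the same binary grid, so they can be matched by the same downstream comparison.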
Improved results using icons with higher resolution, for those who are following this conversation and waiting with bated breath :)
> In both cases, half the similar icons are 'download' icons
Those seem to be actual download buttons where you can download the found similar icon.
No results were shown after: reddit, google, ycombinator.
Your site is not secure. (SSL)
There is a broken-image sign after search and no warning.
If you are referring to entering those words in the searchbox, yes I should've put in some warnings/checks there to enter a valid image URL. Will fix it soon. And yes I should make the site secure too. Thanks for letting me know.
PS: You can explore company logos here http://compute.vision/brands/index.html . It's implemented using an older iteration of the algorithm and performance isn't that great compared to the one used with the icons database.
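For the searchbox check mentioned above, a minimal sketch using only the standard library could look like this. The function name is hypothetical. It only checks that the text is shaped like an http(s) URL; it does not confirm that the URL points at an image.

```python
from urllib.parse import urlparse


def looks_like_http_url(text: str) -> bool:
    """True if the text parses as an http(s) URL with a host part."""
    parsed = urlparse(text.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


for query in ["reddit", "google", "https://i.imgur.com/wA1iQ7T.png"]:
    print(query, looks_like_http_url(query))
```

Plain words such as "reddit" fail the check, so the page could show a hint asking for an image URL instead of a broken-image sign.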
super cool man... great use of telugu and other iconography.. you are a cultural icon in the digital space...
Glad you liked it, and wow, are you sure you are not confusing me with someone else? :) I have a suspicion you are mistaking me for Anil Battula from http://sovietbooksintelugu.blogspot.com/ maybe.
Speaking of Telugu, I recently got hold of a treasure trove (about 700GB) of scanned copies of Telugu magazines and newspapers, some of them as old as 1880! Gonna upload them on archive.org very soon.