We Are Making Our AI Dataset Open Source

I am back with a new blog post, in which I want to share why we opened up our Machine Learning dataset to the community.

With InForIntelligence, we created several unique algorithms that can uncover, map, and eventually visualize partnerships between companies, and thereby reveal the Open Innovation activity of corporates, markets, and whole ecosystems. We see this as innovation 3.0: not R&D innovation, not Open Innovation, but Accessible Open Innovation.

We wanted to bring out what is under the radar and difficult to track. Our algorithms look for partnerships everywhere on the net: social media, newspapers, articles, and web pages. We believe in the power of making data accessible and open source, especially when it comes to innovation.

However, these algorithms did not simply drop from the sky. They took time to create, especially the one that uses Image Recognition to recognize logos on web pages.

When it comes to Image Recognition, you need to train your algorithms (Neural Networks) to recognize particular images. For example, when your algorithm has to classify dogs versus cats, you first train it by ‘showing’ it hundreds (or more) of pictures of cats and dogs. From these it learns certain patterns, so the next time you show the algorithm a picture of a cat or a dog, it can classify it and tell you which of the two it is.
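To make the idea of "training by showing examples" a bit more concrete, here is a minimal sketch in Python. It is not our logo recognition algorithm and it does not use real images: it trains a single artificial neuron (logistic regression) on made-up examples, where each "picture" is reduced to two hypothetical numbers (ear pointiness and snout length). All names and values in it are assumptions chosen for illustration.

```python
import math
import random


def train_neuron(examples, epochs=50, lr=0.5):
    """Fit weights w and bias b so that sigmoid(w.x + b) approximates the label."""
    n_features = len(examples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, label in examples:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of "dog"
            err = p - label                 # gradient of the log loss w.r.t. z
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(model, x):
    """Return "dog" if the neuron's score is positive, otherwise "cat"."""
    w, b = model
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "dog" if z > 0 else "cat"


def make_examples(rng, n=100):
    """Generate toy labeled examples: (ear pointiness, snout length), both in [0, 1]."""
    examples = []
    for _ in range(n):
        # Assumption: toy cats have pointy ears and short snouts (label 0).
        examples.append(([rng.uniform(0.6, 1.0), rng.uniform(0.0, 0.4)], 0))
        # Assumption: toy dogs have less pointy ears and long snouts (label 1).
        examples.append(([rng.uniform(0.0, 0.4), rng.uniform(0.6, 1.0)], 1))
    return examples


rng = random.Random(0)
model = train_neuron(make_examples(rng))
print(predict(model, [0.9, 0.1]))  # a cat-like example
print(predict(model, [0.1, 0.9]))  # a dog-like example
```

A real image classifier works on pixels and uses many layers of such neurons, but the loop is the same in spirit: show a labeled example, measure the error, and nudge the weights.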

If you want to know more about AI and Machine Learning, I would advise you to start with this article: https://becominghuman.ai/machine-learning-for-dummies-explained-in-2-mins-e83fbc55ac6d, followed by this one: https://medium.com/machine-learning-for-humans/why-machine-learning-matters-6164faf1df12.

Did you know that you are unconsciously helping Google to build a self-driving AI? I am sure you recognize this picture:


Google uses this as input data for their Image Recognition software. More information can be found in this article: https://hackernoon.com/you-are-building-a-self-driving-ai-without-even-knowing-about-it-62fadbfa5fdf

In July this year, Machine Learning algorithms beat doctors at diagnosing brain tumors. Again, Image Recognition was used.
“The AI correctly diagnosed 87 percent of 225 cases in just 15 minutes, while a team of 15 senior physicians diagnosed 66 percent of the cases accurately.”

Back to my company, InForIntelligence, where we use Image Recognition to recognize logos and eventually use this data to find partnerships between companies. For us (Niek Hermus, Co-Founder), it took an immense amount of time to learn how to create these kinds of algorithms. We used many papers to gather more knowledge about Image Recognition, and Logo Recognition in particular. Because of this, the accuracy of our ‘IFI Logo Recognition’ algorithm increased drastically, to around 96%.

We spent most of our time finding suitable training data for our neural network, because this was done manually. It took us more than 100 hours in total to create this unique dataset of logos from 29 different companies, with on average around 250 images per company (roughly 7,250 images in total). We scoured the internet to find as many logos as possible for each of these companies.

Because the research community helped us enormously with their papers, and we strive to make data more accessible, we want to give something back, namely a unique dataset to use for research.

The dataset can be found here: www.inforintelligence.com/research.

By making this research set accessible, more research can be done on Image Recognition, which in turn will accelerate innovation in Machine Learning.

This will be one of the most extensive (if not the biggest) open-source datasets in the field of Logo Recognition.

This is also a plea for commercial companies to give something back to the academic world. Make it a two-way street, and innovation will be accelerated.