The book by Prince, recommended by @seanv507, is indeed an excellent book on the topic (+1). And while it is not exactly compact, it has a very logical structure, a generous refresher chapter on probability, and a strong focus on machine learning in the context of computer vision.
However, I'd like to recommend another excellent (and also freely downloadable) book on the topic which, while focusing more on computer vision per se, IMHO contains enough machine learning material to qualify as an answer. The book I'm talking about is "Computer Vision: Algorithms and Applications" by Richard Szeliski (Microsoft Research). One advantage of this book over the one by Prince is... narrower margins, which allow for a larger font size and, thus, better readability. The book by Szeliski is also very practical. Since the two books share significant content but have somewhat different focuses, in my opinion they complement each other very well. All this, among other advantages, makes it very easy for me to highly recommend Szeliski's book.
This task of depth estimation is part of a hard and fundamental problem in computer vision called 3D reconstruction. Recovering metric information from images is sometimes called photogrammetry. It's hard because when you move from the real world to an image you lose information.
Specifically, the projective transformation $T$ that takes your 3D point $p$ to your 2D point $x$ via $x = Tp$ does not preserve distance. Since $T$ is a $3\times 4$ matrix (acting on homogeneous coordinates), it has no inverse: solving $Tp = x$ for $p$ is an underdetermined inverse problem, because every point along the same ray of sight projects to the same pixel. A consequence of this is that pixel lengths are generally not meaningful in terms of real-world distances. You can see a simple example of why 3D reconstruction is tricky by considering the forced perspective of the Ames room optical illusion:
(Image: the Ames room illusion. Source: Ian Stannard, https://flic.kr/p/8Pw5Rd)
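To see the depth ambiguity in the algebra itself, here is a minimal numpy sketch (the focal length and principal point are made-up illustrative values): two 3D points at different depths along the same ray of sight land on exactly the same pixel, so $p$ cannot be recovered from $x$ alone.

```python
import numpy as np

# A simple pinhole camera: world points are projected through a 3x4
# matrix in homogeneous coordinates (illustrative values, f = 800 px).
f = 800.0
T = np.array([[f, 0, 320, 0],
              [0, f, 240, 0],
              [0, 0,   1, 0]])

def project(p):
    """Project a 3D point (x, y, z) to pixel coordinates."""
    ph = np.append(p, 1.0)          # homogeneous 3D point
    xh = T @ ph                     # homogeneous 2D point
    return xh[:2] / xh[2]           # divide out the depth

# Two points at different depths along the same ray of sight...
print(project(np.array([0.5, 0.2, 2.0])))   # [520. 320.]
print(project(np.array([1.0, 0.4, 4.0])))   # [520. 320.] -- same pixel
```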
Your visual processing system, like many algorithms, uses cues such as shading and parallel lines to estimate depth, but these cues can be tricked. Generally you need to know the camera location and have something of known size observable in the image. If you want really accurate length measurements from photography, you have to plan for it in the data collection process (it is very helpful to include a calibration target, such as a chessboard pattern, in the camera's field of view).
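As a concrete illustration of planning for measurement, here is a minimal calibration sketch using OpenCV, assuming a hypothetical folder calib/ of photos of a chessboard (the pattern size and square size are illustrative assumptions):

```python
import cv2
import glob
import numpy as np

# Calibration sketch: ./calib/*.jpg are assumed to contain views of a
# 9x6 inner-corner chessboard with 25 mm squares (illustrative values).
pattern = (9, 6)
square_mm = 25.0
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square_mm

obj_points, img_points = [], []
for path in glob.glob("calib/*.jpg"):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)   # known board geometry
        img_points.append(corners)  # where it lands in the image

# Recovers the intrinsic matrix K plus the pose of each board view.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("Reprojection error:", ret)
print("Camera matrix:\n", K)
```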
Here are a number of well-studied subproblems:
- If you have one image you need to estimate everything from the image cues mentioned before. This is called monocular reconstruction or depth estimation.
- If you have two overlapping images taken at the same time from different cameras, then you can estimate the disparity between the images and triangulate using it (see the sketch after this list). This is called stereo reconstruction.
- If you have multiple images taken from a single moving camera, you can estimate the camera locations and then triangulate. This is called monocular simultaneous localization and mapping (monoSLAM).
- If you have many overlapping images, then you can identify common points, estimate the camera locations, and triangulate as with stereo reconstruction or monoSLAM, but with an extra step called bundle adjustment to correct for error propagation. This is called 3D reconstruction from multiple images.
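Here is a minimal sketch of the stereo case using OpenCV block matching, assuming hypothetical rectified image files left.png / right.png and an illustrative focal length and baseline:

```python
import cv2
import numpy as np

# Stereo depth sketch, assuming rectified left/right grayscale images and
# a known focal length f (pixels) and baseline B (metres) -- illustrative.
f_px, baseline_m = 800.0, 0.12

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching estimates, per pixel, how far a patch shifts between views.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point

# Triangulation: depth is inversely proportional to disparity.
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = f_px * baseline_m / disparity[valid]
```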
These methods also vary in whether they recover the scene geometry up to a projective transformation, up to an affine transformation, or up to a Euclidean transformation.
A nice list of papers and software covering the whole topic of 3D reconstruction can be found online. A classic reference textbook is:
Hartley, Richard, and Andrew Zisserman. *Multiple View Geometry in Computer Vision*. Cambridge University Press, 2003.
This paper gives an example of depth estimation from a single RGB image using a CNN (the code is also available):
Laina, Iro, et al. "Deeper Depth Prediction with Fully Convolutional Residual Networks." *2016 Fourth International Conference on 3D Vision (3DV)*. IEEE, 2016.
It depends a little on the exact problem.
If you are interested only in text, then two fields come to mind:
Optical character recognition (OCR), which is specifically about recognizing text in images. However, these methods tend to focus on "nice" documents and may not generalize to harder cases. For instance, one could first determine what each character is and then search through different fonts for the best match; see the OCR sketch below.
Text detection in natural images. If you have harder images that standard OCR struggles with, you can first detect and extract the text using ML-based computer-vision text-detection algorithms.
Again, in this case one would simply detect the text, classify it, and then replace it with a vector version (discarding the rest of the image, or, e.g., blending it in).
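For the OCR route, here is a minimal sketch using the pytesseract wrapper around Tesseract, assuming a hypothetical input file scan.png; the word-level boxes are what you would use to paste a vector (font-rendered) version of each word back in place:

```python
import pytesseract
from PIL import Image

# Minimal OCR sketch, assuming the Tesseract binary and pytesseract are
# installed and that scan.png is a reasonably clean document image.
image = Image.open("scan.png")
text = pytesseract.image_to_string(image)
print(text)

# Word-level bounding boxes, useful for replacing each detected word
# with a vector version at the same position.
boxes = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
for word, x, y, w, h in zip(boxes["text"], boxes["left"], boxes["top"],
                            boxes["width"], boxes["height"]):
    if word.strip():
        print(word, (x, y, w, h))
```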
Things become tougher when you have general vocabularies of discrete objects. For instance, I noticed the box in your example becomes a nice straight box. How should this be done? Should it detect there is a box, and then figure out what size it should be? Or should it detect four lines, and separately compute their lengths? This is a non-trivial problem, but there are numerous ways to approach it (some rather effective):
There is some work on directly generating vector images from raster ones: see Sbai et al, Vector Image Generation by Learning Parametric Layer Decomposition. This is not "object-centered" if you will, however.
An approach based on generative modelling could conceivably be used (see Lee et al, Context-Aware Synthesis and Placement of Object Instances). The idea would be to adapt the method from the aforementioned paper to "replace" everything in the input by placing objects around the image such that it reconstructs the image. How to define and parametrize the vocabulary would still be hard though.
The most general approach is using a discrete vocabulary of primitives. One paper doing pretty much exactly what you want is Ellis et al, Learning to Infer Graphics Programs from Hand-Drawn Images. This approach is somewhat complicated, but extremely general.
Overall, because deep learning requires differentiability, handling discreteness is challenging. One can use techniques from reinforcement learning to circumvent this, since the likelihood-ratio method (i.e., the REINFORCE estimator) can compute gradient estimates in very general scenarios. In other words, you can set up your problem as a deep RL problem, where the agent is rewarded for reproducing your target image using choices from a vector vocabulary. Papers like Tucker et al., REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models (as well as the papers cited by/citing it) might be a good place to start learning about that area.
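To make the score-function idea concrete, here is a toy numpy sketch of REINFORCE with a made-up reward: the policy picks one of four discrete "primitives" and learns to favour the rewarded one, even though the reward itself is not differentiable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy REINFORCE sketch: pick one of 4 discrete "primitives"; the
# (made-up) reward is 1 for primitive 2 and 0 otherwise. The gradient
# of E[R] w.r.t. the logits is estimated as R * d(log p(a))/d(logits).
logits = np.zeros(4)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(2000):
    p = softmax(logits)
    a = rng.choice(4, p=p)              # sample a discrete action
    reward = 1.0 if a == 2 else 0.0     # non-differentiable reward
    grad_logp = -p                      # d log p(a) / d logits ...
    grad_logp[a] += 1.0                 # ... is one-hot(a) - p
    logits += 0.1 * reward * grad_logp  # ascend the score-function estimate

print(softmax(logits))  # probability mass concentrates on primitive 2
```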
Hopefully that's a useful starting point :)