Medical data is horrible to work with. In medical imaging, data stores (archives) operate on clinical assumptions.

Unfortunately, this means that when you want to extract an image (say, a frontal chest x-ray), you will often get a folder full of other images with no easy way to tell them apart. You can try to code some elegant solution, like: "There are black borders at the sides of many chest x-rays (since most chests are taller than they are wide), so if there are more than 50 black pixel rows along the bottom, the image is probably rotated 90 degrees." But as always, we run into failure modes.
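To make that kind of rule concrete, here is a minimal sketch of the "more than 50 black rows along the bottom" check. It assumes a grayscale image stored as a list of pixel rows; the function names, the `black_threshold` of 10, and the toy image helpers are all hypothetical choices for illustration, not anything taken from a real pipeline.

```python
def count_black_rows_at_bottom(image, black_threshold=10):
    """Count consecutive all-black rows, starting from the bottom edge.

    `image` is a list of rows, each row a list of grayscale pixel values.
    A pixel counts as black when it is <= black_threshold (an assumed cutoff).
    """
    count = 0
    for row in reversed(image):
        if any(pixel > black_threshold for pixel in row):
            break  # first row from the bottom that contains non-black pixels
        count += 1
    return count


def probably_rotated(image, min_black_rows=50, black_threshold=10):
    """Apply the hand-written rule: many black rows at the bottom => rotated."""
    return count_black_rows_at_bottom(image, black_threshold) > min_black_rows


def make_upright(size=256, border=60, gray=128):
    """Toy 'chest x-ray': a gray square with black borders on the left and right."""
    return [
        [0 if (col < border or col >= size - border) else gray for col in range(size)]
        for _ in range(size)
    ]


def rotate_90(image):
    """Rotate a list-of-rows image by 90 degrees (transpose, then flip row order)."""
    return [list(row) for row in zip(*image)][::-1]


upright = make_upright()
rotated = rotate_90(upright)

# An upright toy image that happens to have a wide black band at the bottom.
black_bottom = [row[:] for row in upright]
for row in black_bottom[-60:]:
    for col in range(len(row)):
        row[col] = 0
```

On these toy images, `probably_rotated(rotated)` is true and `probably_rotated(upright)` is false, which is what the rule intends. But `probably_rotated(black_bottom)` is also true even though that image was never rotated, which is one simple way a rule like this can break.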

These brittle rules cannot solve these problems for us. Enter software 2.0, where we use machine learning to build the solutions that we cannot put into code ourselves.

Problems like rotated images are embarrassingly learnable. This means that, like humans, machines can very easily become (almost) perfect at these tasks.

So, the obvious answer is to use deep learning to fix our datasets for us. In this blog post I will show you where these techniques can work, how to do this with minimal effort, and present some examples of the methods in use. Read more from lukeoakdenrayner.wordpress.com…
