Finding a way to "de-halftone" an image

My grandfather used to own a lithography business with his father back in the day, where they used lithographic limestones to do halftone printing.

Eventually modern technology drove them out of business; they sold their press and closed up shop. In the process he disposed of their film archive containing decades of originals, color separation films, and so on. He smashed the limestones and used them as filler for the foundation of his garage, as he was "sick of it all" in the end. In his later years he regretted destroying his life's work, but it is what it is.

Anyways, over some time I've been able to track down and get a hold of a lot of their work. I've scanned it in very high resolution, and now have a nice data set of halftone images.

The problem now is that all descreening algorithms on the market are pure shit, and common Photoshop tricks like applying Gaussian blur and downsampling are even shittier.

So I've been thinking: It seems to me that all the information needed to produce a raster image from a halftone image is there. The only things that are lost when making halftone images are 1) resolution and 2) color accuracy.

It seems to me that the centers of the dots in a halftone image sit on a fixed grid, so the edge-to-edge distance between any two dots depends only on their radii.

The angle of each of the CMYK separations usually follows one of only a few different standards; pic related was the most common one used.

So it seems to me that with this information, any halftone image could be converted to a digital image.

I'm trying to write an algorithm for this as we speak, but any comments or insights would be appreciated. I can provide some test data for anyone who would be interested in helping me out. I plan on releasing it as free (as in freedom) software if I can get it to work.


Sorry, actual tech questions belong on /diy/. Sup Forums is for arguing about why one multi million dollar corporation is better than another

it's easy, just [spoiler]install gentoo[/spoiler]

This is the unfortunate truth.

I want every AMD, Intel and Nvidia shill, along with every Sup Forums poster to fucking shoot themselves.

Sorry about all the typos btw.

Pic related is not completely representative of the problem, because the dot size is constant.

Pic related mirrors what I'm working with: halftone dots of varying size, with a different angle for each color separation.

...

Sorry, you will find only trolls here.
also
>Please use Rust

Let's be honest, nobody in there knows enough math or programming to solve this type of task. This is more of a Sup Forums-project kind of thing, just like Shillsaver if anyone remembers...

>I can provide some test data for anyone who would be interested in helping me out.

I think a sample of the messed up images and an example of the result you want would be great.

op can you post a sample of the actual thing? i want to try doing this

Examples of typical CMYK halftone screen angles.

By having the user align the image, it should be feasible to have the program do a color separation of the scanned image and then determine the angle of each of the CMYK separations.

When the correct angles are known, the next step would be to isolate which points correspond to each pixel.

The pixel size of the final image could be determined by the total number of halftone points, no? So count all the points, divide by the number of halftone points per pixel, then map the halftone points that constitute a pixel to that pixel, and determine the color value for the pixel from the radius of each of the CMYK halftone points.
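To make that last step concrete, here's a minimal Python sketch of the radius-to-tone mapping I have in mind (the function name is mine, and it assumes round dots on a square grid):

[code]
import numpy as np

def tone_from_dot_radius(radius_px, cell_pitch_px):
    """Ink coverage (0..1) of one halftone cell, from its dot radius.
    Coverage is dot area over cell area; clipped because neighbouring
    dots start to overlap above ~78% coverage."""
    coverage = np.pi * radius_px**2 / cell_pitch_px**2
    return float(np.clip(coverage, 0.0, 1.0))

# e.g. a dot of radius 3 px on a 10 px pitch is ~28% ink, i.e. a light tone
print(tone_from_dot_radius(3.0, 10.0))  # ~0.283
[/code]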

i looked into this a few years ago, while doing scans of console game disc covers (and discs)
i tried a bunch of methods, including just simple selective gaussian blur

i didn't find a truly good method, blur + downscale did the job for making previews or simply displaying them on computer monitors, but leaving them alone was best for printouts

The problem with printouts is that inkjet printers print dots of uniform size but varying spacing. So, even ignoring moire, you lose a lot of resolution, both spatial and color, when printing scanned halftone images on an inkjet printer.

To add to this, if you try to combat moire by blurring and downscaling the scan first, you lose additional information in that process.

If we could convert the halftone images to color accurate digital images, it would be the same as restoring the original, but with less resolution. This would yield much better images for digital viewing, and a better starting point for printouts. I think it can be done.

no sample?
what is it cp?

>I'm trying to write an algorithm for this as we speak, but any comments or insights would be apreciated.
i have a few ideas on ways to do this
one fairly simple way would be something like:
1. decompose the image into separate C,M,Y,K images
2. try to orient a grid over each channel image, such that each dot in the original image falls square in the middle of a grid point
3. calculate the average value of each grid element, making single-color pixels out of them (like a normal point-scaled raster image, only rotated)
4. apply linear or other filtering on those
5. re-composite these grids back into a single image

the hard part is making an algorithm that can line up a grid

i'll draw up a diagram to better illustrate what i'm saying in a moment
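in the meantime, here's roughly what steps 2-3 could look like in python, assuming the angle and pitch have somehow been found already (names and parameters are placeholders, grid alignment itself not included):

[code]
import numpy as np
from scipy import ndimage

def channel_to_pixels(channel, angle_deg, pitch_px):
    """Steps 2-3 for one single-colour channel: rotate so the screen grid
    is axis-aligned, then average each pitch-sized cell into one pixel."""
    upright = ndimage.rotate(channel, -angle_deg, reshape=True, order=1)
    h, w = upright.shape
    p = int(round(pitch_px))
    cells = upright[: h - h % p, : w - w % p]
    # average every p x p block into a single value
    return cells.reshape(h // p, p, w // p, p).mean(axis=(1, 3))
[/code]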

I think part of the reason this hasn't been done yet is that until recently it wasn't feasible, or even technically possible, to scan images at the resolution needed to clearly resolve the individual halftone dots. Especially if you want to determine the diameter of the points, you need enough resolution that adjacent dot sizes differ by at least 1 pixel. That means very high resolution, and each scan will be hundreds of MB, if not into the GBs.

copy and paste this in /diy/, it's where the mature people go. Sup Forums is full of hands-on AIDS and children not even age 25.

Did you try fucking around with layers and layer transparency?

...

Why don't you post a fucking image so we can see what you're working with retard. This is why I would kill myself if I worked in IT. Everyone tries to sound like they know what they're doing and rambles on about their problem while you have to figure out what's actually wrong.

The best way I know of, and have tried myself, is editing an inverse FFT transform of the image. It's a good if time-consuming method to remove any kind of pattern noise from an image.

Start by downloading Fiji and loading up the image (this works best on grayscale images, but color is fine too).
Then you go into the filters and select FFT, which turns the image into a big square with a white star shape in the middle and spots all around it. Running the inverse FFT turns it into a regular image again.

While in FFT view, select the brush tool with a brush size large enough to cover the small white spots, and using black, erase them by putting a dot over each one (and there will be like a hundred). Avoid erasing any part of the big white star, because it's a representation of the actual image data.

Once you've blotted them all out, run the inverse FFT and the resulting image should have greatly reduced patterning. Unlike an overlay pattern or depressed paper texture, offset is impossible to fully remove, but it should look much better.
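If you'd rather script it than paint the dots by hand, a rough numpy equivalent of the same notch idea looks like this (the peak list is assumed to come from a peak finder, or from eyeballing the spectrum):

[code]
import numpy as np

def notch_filter(gray, peaks, notch_radius=4):
    """gray: 2-D float image. peaks: list of (row, col) spot positions in
    the centred magnitude spectrum. Zeroing each spot and its mirror is the
    scripted equivalent of painting black dots in the FFT view."""
    spec = np.fft.fftshift(np.fft.fft2(gray))
    yy, xx = np.mgrid[: gray.shape[0], : gray.shape[1]]
    cy, cx = gray.shape[0] // 2, gray.shape[1] // 2
    for r, c in peaks:
        for pr, pc in ((r, c), (2 * cy - r, 2 * cx - c)):  # spot + its mirror
            spec[(yy - pr) ** 2 + (xx - pc) ** 2 <= notch_radius ** 2] = 0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec)))
[/code]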

I'm working on it, but the shit laptop I'm on ATM struggles with opening and cropping out a detail from my 17446x23511 pixel 64-bit RGBI RAWs.

mega.co.nz

link is broken

I was thinking about this too
the problem, I believe, is the CMYK decomposition, since neither the printer nor the scanner is 100% color accurate and the paper isn't perfectly white

I think you can still get a decomposition with good enough accuracy. Maybe you can adjust the colors and measure the decomposition's quality somehow. This is doable because halftone dots have a predictable pattern, and a bad decomposition would have more noise and irregularities. A genetic algorithm could work well for this.

After you get the decomposition in CMYK, you analyze each channel and look at peaks. You just have to determine their position and size by doing region-growing segmentation around local maxima, and then interpolate between peaks to get a digital image.
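A quick sketch of the peak-finding part (a maximum filter stands in for full region growing here, and the parameters are guesses):

[code]
import numpy as np
from scipy import ndimage

def find_dot_peaks(channel, min_sep=5, threshold=0.1):
    """Dot centres in one separated channel as local maxima.
    channel: 2-D float array where higher = more ink; min_sep: minimum
    peak spacing in pixels. Positions only; sizes would come from a
    region-growing pass around each returned coordinate."""
    local_max = ndimage.maximum_filter(channel, size=min_sep) == channel
    peaks = local_max & (channel > threshold)
    return np.argwhere(peaks)  # one (row, col) per dot
[/code]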

It's actually more complex than that. To illustrate the problem, I've digitally converted the attached image to a halftone pattern with really big dots. This represents a best-case scenario. I'm gonna upload them as JPGs, which will cause slight color degradation compared to the ground truth, but the same would have to be dealt with for any scanned image as well.

Original

I assume blurring doesn't work because it will just smudge all the colors.

But can't you simply:
1) separate the colors into 4 different layers.
2) blur each layer individually, using an appropriate radius.
3) combine the 4 layers to form the final image.
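In code, that would be something like this (a sketch; the per-channel radii would need tuning to each separation's pitch):

[code]
import numpy as np
from scipy import ndimage

def blur_per_channel(cmyk, radii):
    """Steps 1-3 in one go: blur each CMYK layer with its own radius,
    then restack. cmyk: (H, W, 4) float array; radii: 4 sigmas, roughly
    matched to each separation's dot pitch."""
    return np.stack(
        [ndimage.gaussian_filter(cmyk[..., i], sigma=radii[i]) for i in range(4)],
        axis=-1,
    )
[/code]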

Machine learning. Conv nets.

Roleplaying general

I was shitposting here, but I just realized it might actually work.

1. Convert tons of images to halftone
2. Train a DEEP LEARNING CONVNET to generate the original image from the halftone image
2a. Compare the output of the net to the original image and backprop errors
3. When the network converges, run your grandfather's images through it to get the originals (or something hopefully close)
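A minimal PyTorch sketch of that loop (architecture and hyperparameters are arbitrary, and the training pair below is a dummy placeholder):

[code]
import torch
import torch.nn as nn

# Step 2 as a tiny fully-convolutional net that maps a halftoned RGB crop
# back to the original. The training pair below is a random placeholder;
# real pairs would come from halftoning a big pile of images (step 1).
net = nn.Sequential(
    nn.Conv2d(3, 64, 5, padding=2), nn.ReLU(),
    nn.Conv2d(64, 64, 5, padding=2), nn.ReLU(),
    nn.Conv2d(64, 3, 5, padding=2),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
pairs = [(torch.rand(8, 3, 64, 64), torch.rand(8, 3, 64, 64))]  # dummy batch

for halftoned, original in pairs:
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(halftoned), original)
    loss.backward()  # step 2a: backprop the reconstruction error
    opt.step()
[/code]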

it would help if op had some originals of the same type to train from

It would help but if he can convert random images from the web to be (almost) exactly the same as the halftones he possesses, then he should get similar results.

I'd do this in a few steps (not an expert though).

1. Split the image into 3 color channels + black.
2. Un-rotate each channel, so that it becomes a square grid of dots.
3. Now that you're on a square grid, you should be able to interpolate each channel to a pixel grid of whatever resolution by interpreting each dot as a pixel that varies between white and the channel color, linearly based on dot size.
4. Re-rotate and combine the interpolated channel images, creating your final image.

There are a few things I'm not confident about though. Interpolating on a square grid using dot size should give much better fidelity than blurring (though it should be identical to a specialized blurring algorithm that only averages the square region around each dot), but you'd still introduce some artifacts when rotating the channels back. And measuring dot size might be the wrong approach: it might be better to split the channels, un-rotate, segment the image into a square grid with one dot in each square, and average the colors in each segment to produce a single pixel, then interpolate normally and rotate as described above.
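Per channel, steps 2-4 might look roughly like this (a sketch that ignores the border bookkeeping rotation entails; names and parameters are mine):

[code]
import numpy as np
from scipy import ndimage

def resample_channel(channel, angle_deg, pitch_px, out_scale):
    """Un-rotate so the screen grid is axis-aligned, average one dot per
    grid square into a pixel, interpolate back up, and rotate to the
    original orientation (steps 2-4). Border bookkeeping is ignored."""
    upright = ndimage.rotate(channel, -angle_deg, reshape=False, order=1)
    p = int(round(pitch_px))
    h, w = upright.shape
    grid = upright[: h - h % p, : w - w % p]
    grid = grid.reshape(h // p, p, w // p, p).mean(axis=(1, 3))  # 1 px per dot
    up = ndimage.zoom(grid, out_scale, order=1)  # bilinear upscale
    return ndimage.rotate(up, angle_deg, reshape=False, order=1)
[/code]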

Halftone

Yellow color separation 0 degrees

Magenta color separation 75 degrees

Try FFT descreening methods. GIMP has a plugin for that.

Cyan color separation 15 degrees

And finally,

black color separation 45 degrees

If I spent enough time, I could probably work out every color separation completely, except for the yellow one.

>the hard part is making an algorithm that can line up a grid
Here's an idea. It's kind of complex though, so I'd keep looking for simpler solutions first.

Search the image line-by-line for colored pixels from the bottom or top. When you reach a colored pixel, find every adjacent colored pixel and average their coordinates to get the center coordinates of the dot that the original colored pixel belonged to. Then perform a flood-fill on the original pixel to convert the dot to the background color. Ignore colored pixel clusters discovered below a certain minimum size (noise).

Pick up the line-by-line search where you left off, and repeat the above process until you've found every colored dot in the image. Should be O(n) for n pixels.

Once you have the coordinates of every dot, pick an arbitrary dot (for best results, close to the center of the image) and draw a line segment extending to the edges of the image at an arbitrary angle.

Set a maximum acceptable distance from a dot center to be considered on the line segment. Check the distance of every dot from the line segment. Count the dots which are close enough to the line according to your maximum acceptable distance. Should be O(m) for m dots.

Now do this for q angles from 0 to 90 degrees from your starting angle. Then you have complexity O(mq). Pick the angle which gives you the largest number of "intersected" points, and use this to determine the angle of the grid.

Once you have the angle of the grid, you can position the grid by centering an arbitrary dot in an arbitrary cell, then rotate it according to that angle. If you're not given it, you should be able to get a good enough approximation of the grid cell size arithmetically using the image size and dot density.

Tweak q and the maximum acceptable distance from a dot as a compute time/accuracy tradeoff.

Compute time could be dominated by the O(n) term or the O(mq) term.
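Sketches of the two main pieces; I've swapped the manual scan-and-flood-fill for scipy's connected-component labelling (same result, still O(n)), and the angle search is the brute-force O(mq) version described:

[code]
import numpy as np
from scipy import ndimage

def dot_centres(mask, min_size=3):
    """Dot discovery. mask: boolean image, True where the channel has ink.
    Connected-component labelling finds each dot; clusters below min_size
    pixels are ignored as noise."""
    labels, n = ndimage.label(mask)
    idx = np.arange(1, n + 1)
    sizes = ndimage.sum(mask, labels, index=idx)
    centres = ndimage.center_of_mass(mask, labels, index=idx)
    return np.array([c for c, s in zip(centres, sizes) if s >= min_size])

def best_angle(centres, origin, q=180, max_dist=1.5):
    """Count dots within max_dist of a line through `origin` for q candidate
    angles over 0-90 degrees; return the best-scoring angle in degrees."""
    angles = np.linspace(0.0, np.pi / 2.0, q, endpoint=False)
    scores = []
    for a in angles:
        n = np.array([-np.sin(a), np.cos(a)])  # unit normal of the line
        d = np.abs((centres - origin) @ n)     # perpendicular distances
        scores.append(np.sum(d <= max_dist))
    return np.degrees(angles[int(np.argmax(scores))])
[/code]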

Oh. If the dots of one color can touch each other, then this won't work as stated. You'd need some other method for dot discovery.

>It seems to me that all the information
>It seems to me that the distance
>it seems to me that with this information, any halftone

it seems to me that you're literally fucking gay

>line up a grid

Find clusters first:
en.wikipedia.org/wiki/OPTICS_algorithm

Each cluster gets a centering dot, minimize the total distance between that dot and everything in its cluster to find the center of each cluster.

Something like that?
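A scikit-learn sketch of that (note the per-cluster mean minimises total *squared* distance, which should land in the same place for round dots):

[code]
import numpy as np
from sklearn.cluster import OPTICS

def cluster_centres(points, min_samples=5):
    """Group ink pixels into per-dot clusters with OPTICS, then take each
    cluster's mean as its centre.
    points: (N, 2) array of ink-pixel coordinates."""
    labels = OPTICS(min_samples=min_samples).fit_predict(points)
    return np.array([points[labels == k].mean(axis=0)
                     for k in np.unique(labels) if k != -1])  # -1 = noise
[/code]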

Do you figure you can actually do better than simply averaging the color values inside pixels whose side is chosen to be slightly larger than the halftone dot spacing?

...

...

So what was the original resolution of your source?

That's not bad. Maybe just go with blur, tune the blur radius based on the dot pitch, and skip all the more complicated channel separation, interpolation, etc.

You could search for a pure color tone in the picture and calculate the curvature of a dot edge that isn't mixed with other colors. Then you can fit a circle to that. Do it again for the same color, and you can line up the grid perfectly. Repeat for every color you need.

This is a good question for Hacker News, OP. They've got a lot of people who do image processing with neural networks. Try asking there.
news.ycombinator.com

This seems like the kind of thing one could train a neural network for if they cared enough.
That way nobody needs to understand what is being done, but it will be done well.

It'd be reasonably easy to train too, because you could just halftone a bunch of images and give them to the network in reverse.

This is the best thread on Sup Forums right now.

This

I'm glad it's not:

>Haha lol shuld I get intel i7 hk7928jew or amd da2782 for my gayming rig xDD

or:

>Why do nrmies use laptops when theyres a supereor alternative called a desktop LOL LAPTOPFAGS BTFO

>Waifu pic not related

> linus is an faggot amrite ??

>how do i learn linux hacking

Yeah. Maybe we need a software /diy/ board, like a more focused /prog/.

I'd be down for that. Sup Forums could be for vidya shit and /prog/ could be more compsci and less focused on gaming/hacking

There is simply no way you will get anything close to the original image; the information loss is too great.

That's true if you only use the information from the halftone image. But if you have multiple images, you can guess pretty well at what the original values were. Hence the talk of neural networks ITT.

>figure out where the grid is and extract values
>interpolate values out
>merge the channels

Or just FFT it and nerf the high frequency values.
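A rough numpy sketch of that (with a Gaussian rolloff rather than a hard cutoff, since hard cutoffs ring):

[code]
import numpy as np

def fft_lowpass(gray, cutoff_frac=0.15):
    """Transform, attenuate high frequencies, transform back. A Gaussian
    rolloff is used instead of a hard cutoff, since hard cutoffs ring."""
    h, w = gray.shape
    spec = np.fft.fftshift(np.fft.fft2(gray))
    yy, xx = np.mgrid[:h, :w]
    r2 = (yy - h / 2) ** 2 + (xx - w / 2) ** 2
    sigma2 = (cutoff_frac * min(h, w)) ** 2
    spec *= np.exp(-r2 / (2.0 * sigma2))
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec)))
[/code]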

Won't you get ringing this way?

"neural networks" is the "moar cores" of software

why are you making this so complicated, this is a simple problem with modern computers. just oversample at 16x or whatever your scanning hardware will allow and do on-the-fly box downsampling to your desired master resolution

Can you use a Hough transform to detect dots that form a line, and use that to determine your grid?
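Possibly something like this OpenCV sketch, assuming dot centres have already been extracted (note HoughLines returns the angle of the line's normal, so the screen angle is 90 degrees off from theta):

[code]
import cv2
import numpy as np

def screen_angle_via_hough(centres, shape):
    """Render dot centres into a binary image and let the standard Hough
    line transform vote on the dominant screen angle. centres: (N, 2)
    array of (row, col) dot positions from an earlier detection step."""
    centres = np.asarray(centres, dtype=int)
    img = np.zeros(shape, dtype=np.uint8)
    img[centres[:, 0], centres[:, 1]] = 255
    # threshold=20 means a line needs ~20 roughly collinear dots to count
    lines = cv2.HoughLines(img, 1, np.pi / 360, 20)
    if lines is None:
        return None
    return float(np.degrees(lines[0][0][1]))  # theta of the strongest line
[/code]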

so basically you need to input a 4-channel CMYK colour image, split it into 4 different channels (matrices), rotate each, and then add them back into the original image. And an option to convert to RGB?

Super interesting thread desu
Wish I could help

>So I've been thinking: It seems to me that all the information needed to produce a raster image from a halftone image is there. The only things that are lost when making halftone images are 1) resolution and 2) color accuracy.
>So it seems to me that with this information, any halftone image coud be converted to a digital image.
So I worked in this industry as a production artist.
In order to get what you want automated is going to be hard.
I had to "de-halftone-ize" some images as well. Which yes, was some combination of working at very high resolution, blurring, other filters, and HAND TOUCH UP.

Like someone else mentioned, the information loss is just too great to automatically get a blended color image from halftones.
For spot colors, it's not too bad. For CMYK it's really awful.

No. You don't un-rotate.
The rotation is there because if you have the halftones aligned the same, you end up with an optical illusion due to very minor variations in the film output size and not being able to line them up exactly, print after print.
It's hard to describe what that looks like.

Use imageworsener, it does downsampling and blurring right.

desu i would look into pixel scattering.

One thing that hasn't been considered is that there's no guarantee the pattern will align to a grid, because 1. the printout could have drifted while being pressed (this is a known factor and is usually compensated for), and 2. the scanned source may not have been perfectly flat; after many years the paper could have distorted slightly.
This definitely seems like a job for a "smart" algorithm/neural net, because a straight grid-alignment technique would only work for part of the image.

Scanning at a higher res will just make the half-tone more detailed, just like scanning film past a certain point will just make the film grains more prominent. Ultimately you still have to get rid of them somehow.

I've seen the idea of using interpolation after separating the layers to guess what the intended tones and colors were, but that alone is a bit simplistic. It would be better paired with a "fractal" system like the ones used in software for blowing up images to large sizes while retaining edge detail; that's the only way you can maintain some semblance of detail after the fact, otherwise it's probably not much better than blurring.

Of course the resolution is limited to the box size of any CMYK point group; the point of oversampling is to get more accurate weights for the individual points in any grouping. Each point group is essentially one pixel; no neuro-bullshit or anything else is needed to get this pixel mapping. You can apply whatever interpolation method after this, but you are not getting more information out of the original image than the point-group resolution, so it is very important to measure individual points accurately. The easiest way to do that is oversampling.

So with such theoretical software we would be looking at the following process flow:

1. Sample area and detect offset type
2. Determine range values for each color
3. Separate into layers (color deconvolution?)
4. Run flexible (non-grid) dot detection scheme to compensate for drift
5. "Connect the dots" with a fractal or weight-based system, perhaps with adjustable parameters to trade between blurriness and ringing.
6. Collapse the image back and save

Also, at some point there should be a provision for dust and dirt detection to avoid errors in that respect. Treating the colors separately, while a good start, would work better if the system behaved more like a RAW image processor and interpolated image data from all color channels simultaneously. Here we have the advantage that luminance doesn't have to be guessed, because it's already partially there.

>Each point group is essentially one pixel
That's an overly simplistic way of thinking about it, because the original wasn't separated that way; extra information can be gained if each point, its size, and its relation to the group and other groups are all considered.
Imagine if RAW conversion software worked this way for digital cameras, every camera would now have 1/4th its current resolution and still wouldn't look as good.

except this is nothing like a Bayer filter process, where there is significant spatial distance between sampling points and reconstruction via interpolating neighboring samples is required. in this case each point group on the lithograph has already been reconstructed and downsampled from higher-resolution film. what you are suggesting would only apply if OP was using the original negative, though if that were the case the grain would be so fine one could just do a simple scan

what if i only antishill whatever is being shilled, i know im still a shill, but i feel like it balances out

Steps I would do:
1. Separate color layers.
2. On each layer mark dot position and size.
3. Create a rasterized semi-transparent color layer based on dot sizes and positions. I would do something like bilinear interpolation between dot positions.
4. Merge layers in the correct order.

Make sure you use premultiplied/associated alpha and a linear colorspace.
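A sketch of step 3 under those constraints (the names and the area-to-coverage mapping are my assumptions):

[code]
import numpy as np
from scipy.interpolate import griddata

def render_channel(dot_xy, dot_radius, pitch_px, shape):
    """Step 3: turn (position, size) per dot into a smooth coverage field
    by interpolating between dot centres. dot_xy: (N, 2) of (row, col);
    coverage = dot area / cell area, kept in linear space. Merging (step 4)
    would multiply this by the ink colour with premultiplied alpha."""
    coverage = np.clip(np.pi * dot_radius ** 2 / pitch_px ** 2, 0.0, 1.0)
    yy, xx = np.mgrid[: shape[0], : shape[1]]
    return griddata(dot_xy, coverage, (yy, xx), method="linear", fill_value=0.0)
[/code]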

considering the resolution, you did really well

used a paint.net fragment filter
a simple box filter could do better

frosted glass filter worked better

also closer to the real image size, given the radius of the points

what are the images about anyways? 30's ads?
this sounds really nice, you should set a chat up if you want help with development

imageworsener's output; it's much lighter than the paint.net guy's images. iw uses a linear colorspace.

...

...

I see everyone remembers user's story about relative's CP mag printing business.

I think dumb-proof commercial software that is hand-tuned for most popular technological combinations has already been provided, with instructions as simple as:
1. Scan the page at resolution N. Note: we only officially support HP NotSoCheap9000 mode ABC, and Kodak CorpGizmo mode XYZ.
2. Press the magic button.
3. Processing image...
4. Your licence has expired. Pay for another month to see the results.

My uneducated advice would be:

0. Scan the page at the highest resolution. This should be the native sensor resolution, without interpolation and/or introduced artifacts.
1. Find the printing screen resolution (lpi at the proper angle).
2. Divide the image into squares at 2xlpi. Average the color of each square.
3. As per Whittaker–Nyquist–Kotelnikov–Shannon theorem, you'll have enough data to reconstruct the continuous signal at original rate with DFT.
4. Generate a smoothed higher/lower resolution image from that with IDFT.

I think you'll need to find a suitable window function and sample size for the DFT empirically.
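In numpy, steps 2-4 might look something like this (window function left out; the cell size in pixels is assumed known from the lpi and the scan dpi):

[code]
import numpy as np

def average_and_resample(gray, cell_px, up=4):
    """Box-average the scan into cells of `cell_px` (chosen to give ~2
    samples per screen period), then resample `up` times larger by
    zero-padding the DFT (ideal sinc interpolation, no window)."""
    p = int(cell_px)
    h, w = gray.shape
    g = gray[: h - h % p, : w - w % p]
    small = g.reshape(g.shape[0] // p, p, g.shape[1] // p, p).mean(axis=(1, 3))
    spec = np.fft.fftshift(np.fft.fft2(small))
    sh, sw = small.shape
    big = np.zeros((sh * up, sw * up), dtype=complex)
    big[(sh * up - sh) // 2 : (sh * up + sh) // 2,
        (sw * up - sw) // 2 : (sw * up + sw) // 2] = spec
    # scale compensates for the size change in the inverse transform
    return np.real(np.fft.ifft2(np.fft.ifftshift(big))) * up * up
[/code]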

I don't think working in CMYK gives you any benefits (if you work with 16-bit-per-channel RGB data). Sure, you can try to recover the original color separation by filtering junk that bled into different CMYK planes of a scanned image, and average them individually afterwards, but then you have to match the color balance manually.

Tricky parts:

— I am not sure the sampling theorem works this way for 2D data. See en.wikipedia.org/wiki/Multidimensional_sampling
— Different screens have different angles, and that increases the effective data resolution. How do you account for that? I guess sampling at 8x lpi and adding a lowpass filter might do the trick.

Ackchually, you can JUST drop your lazy-ass high-res scan into a DFT, filter the high frequencies, and do an inverse transform. I can't see why it should be different from sampling at matching resolutions. In case of extra-high-resolution scans, you'll probably need a DFT supporting big set lengths, and that's all.

Audio processing software has filters that let you do just that and preview the result while you change the window function and lowpass frequency. I'm sure some image editors have plugins like that, too.

Crazy neckbeard way: convert image data to audio data and process it in an audio editor. Hint: what's the difference between a raw single-channel audio file and a raw single-channel image file? (There isn't any!) As usual, some imagemagick magic will do the trick.
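Scripted, the trick is only a few lines of numpy/scipy (filenames and dimensions are placeholders; one pass per axis, since the "audio" view is one-dimensional):

[code]
import numpy as np
from scipy.signal import butter, filtfilt

# Dump the scan to raw grayscale first, e.g.:
#   convert scan.png -colorspace gray -depth 8 gray:scan.raw
# W, H and the filenames are placeholders for your actual scan.
W, H = 4096, 4096
img = np.fromfile("scan.raw", dtype=np.uint8).reshape(H, W).astype(float)
b, a = butter(4, 0.2)  # 4th-order lowpass, cutoff as a fraction of Nyquist
flat = filtfilt(b, a, img.reshape(-1))                   # horizontal pass
flat = filtfilt(b, a, flat.reshape(H, W).T.reshape(-1))  # vertical pass
out = flat.reshape(W, H).T
np.clip(out, 0, 255).astype(np.uint8).tofile("scan_filtered.raw")
[/code]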