Friday 3 June 2016

Megapixel race unnecessary?


Have we reached the point where the megapixel race is more about the race itself, about having more than the other guy, than about image quality?


Only a few years ago 6MP was touted as the optimum number of MP that you needed to take really good pictures.


But lately, like most technology, megapixel counts have been growing by leaps and bounds.


Nikon recently released the D800 with an (in my opinion) insane 36.3MP. But fine, the D800 is a pretty high-end camera, an easy way to drop a few grand. But then they also just released the D3200, which is aimed at being an entry-level 'learner' DSLR, with 24.2MP. That's twice as many as the D5000 I bought two years ago.


I know that more MP is good. Higher MP = sharper image. But at what point do these increases in sharpness become negligible at best, and increases in MP count serve nothing more than bragging rights?



When you consider that people have been taking gorgeous photographs for decades, and that some amazing pictures were taken on early DSLRs with less than 10MP, how often is 36MP really going to be useful?



Answer



Megapixels are Necessary!


The megapixel race is certainly not "unnecessary". Throughout the last decade, progress has been made on the megapixel front while image quality has consistently increased. The anecdotal adages would have you think that was impossible, but there are quite a few technological and fabrication improvements that have made lower noise, greater signal-to-noise ratio, and increased dynamic range possible despite shrinking pixel areas.


I think the advent of the 36.3mp Sony Exmor sensor currently used in the Nikon D800 is an exquisite example of what low-level technological improvements can do to lower noise and increase dynamic range while still allowing significant increases in image resolution. As such, I think the D800 is a superb example of why the megapixel race is most definitely not over by any means.


As for whether it is just bragging rights? I doubt it. Better tools can always be used effectively in the hands of a skilled artisan. Higher resolution and more low-ISO dynamic range have some specific high-value use cases, namely landscape photography and some forms of studio photography. The D800 is in a unique spot, offering near-medium-format image quality in a package approximately 1/10th the cost. For some studios, there is no substitute for the best, and they will use $40,000 digital medium format cameras as a matter of providing the right perception to their customers. For many other studios, however, and for many landscape photographers, the D800 is a dream come true: loads of megapixels AND high dynamic range.


No, the megapixel race is most definitely not over, and it is certainly not unnecessary. Competition on all fronts produces progress on all fronts, and that is only ever a good thing for the consumer.




Potential for Improvement


To go a little deeper than my conclusions above, there is more to the story than simply that competition on all fronts is good. Technologically, physically, and practically, there are limitations that will indeed restrict the potential gains as we continue to increase sensor pixel counts. Once we have reached those limits, useful gains at reasonable cost will have to be made elsewhere. Two areas where that can occur would be optics and software.



Technological Limitations


Technologically, there are distinct limits to how much you can improve IQ. The primary source of image degradation in sensors is noise, and there are a variety of electronically introduced forms of noise that can be controlled. I think Sony, with their Exmor sensors, is very near to reaching technological limits, if they have not already. They have utilized a variety of patents to reduce sources of noise production at a hardware level directly in their sensors. Key sources of controllable noise are dark current noise, read noise, pattern noise, non-uniformity noise, conversion (or quantization) noise, and thermal noise.


Both Sony and Canon use CDS, or correlated double-sampling, to reduce dark current noise. Sony's implementation is a touch more efficient, but both use essentially the same approach. Read noise is a byproduct of amplification, caused by fluctuations in current through the circuit. There are a variety of patented and experimental approaches to detecting voltage variation in a circuit, and correcting it during amplification, to produce a "more pure, accurate" read result. Sony uses a patented approach of their own in Exmor sensors, including the 36.3mp one used in the D800. The other two types of pre-conversion electronic noise are pattern noise and non-uniformity noise. These are the result of discontinuities in circuit response and efficiency.
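To make the idea behind CDS concrete, here is a minimal sketch, not any manufacturer's actual implementation, with entirely invented numbers: each pixel's reset level is sampled before its signal level, and differencing the two cancels the offset that is common to both samples.

```python
import numpy as np

# Conceptual illustration of correlated double-sampling (CDS): sample each
# pixel's reset (dark) level, then its signal level after exposure, and take
# the difference. The per-pixel offset common to both samples cancels out.
# All values below are invented purely for illustration.
rng = np.random.default_rng(42)

true_signal = np.array([100.0, 250.0, 50.0, 800.0])   # electrons, hypothetical
pixel_offset = rng.normal(20.0, 5.0, size=4)          # per-pixel fixed offset

reset_sample = pixel_offset + rng.normal(0, 1.0, 4)   # read before exposure
signal_sample = pixel_offset + true_signal + rng.normal(0, 1.0, 4)

cds_result = signal_sample - reset_sample             # offset cancels
print(np.round(cds_result, 1))   # close to true_signal, offsets removed
```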


Pattern noise is a fixed aspect of each of the transistors used to construct a single sensor pixel and of the electronic gates used to initiate read and signal flush. At a quantum level it is near impossible to make every transistor exactly identical, and this produces a fixed pattern of horizontal and vertical lines in sensor noise. Generally speaking, pattern noise is a minor contributor to overall noise, and is only really a problem in very low SNR regions or during very long exposures. Pattern noise can be relatively easy to remove provided you approach the problem correctly. A "dark frame" can be constructed by averaging multiple samples together to create a pattern-noise template that can be differenced with a color frame to remove pattern noise. This is essentially how long-exposure noise removal works, and it is also how one can manually remove fixed pattern noise from long exposures. At a hardware level, fixed pattern noise can be mitigated by burning in a template that reverses the effects of FPN such that the differences can be added/subtracted at read time, similar to CDS, thereby improving the "purity" of pixel reads. A variety of experimental approaches to burning in FPN templates, as well as more abstract approaches, exist today.
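As a rough sketch of the dark-frame technique described above, the following simulates a fixed pattern, averages several synthetic dark frames into a template, and differences that template from a light frame. The frame sizes and noise levels are made up purely for illustration.

```python
import numpy as np

# Sketch of fixed-pattern-noise removal with a master dark frame. The dark
# frames simulate exposures taken with the shutter closed; averaging them
# suppresses random noise and leaves mostly the fixed pattern.
rng = np.random.default_rng(0)
height, width = 480, 640

fixed_pattern = rng.normal(0, 2.0, (height, width))        # simulated FPN

def expose(signal):
    # signal + fixed pattern + fresh random read noise
    return signal + fixed_pattern + rng.normal(0, 1.0, (height, width))

dark_frames = [expose(np.zeros((height, width))) for _ in range(16)]
master_dark = np.mean(dark_frames, axis=0)                  # pattern template

light_frame = expose(np.full((height, width), 100.0))       # simulated exposure
corrected = light_frame - master_dark                       # pattern differenced out

print(np.std(light_frame - 100))   # pattern + read noise
print(np.std(corrected - 100))     # pattern removed; mostly read noise remains
```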


Non-uniformity noise, often called PRNU or Pixel Response Non-Uniformity, is the result of slight variations in the quantum efficiency (QE) of each pixel. QE refers to a pixel's ability to capture photons, and is usually rated as a percentage. The Canon 5D III, for example, has a QE of 47%, which indicates it regularly captures about 47% of the photons that reach each pixel. Actual per-pixel QE may vary by +/- a couple percent, which produces another source of noise, as each pixel may not capture the same number of photons as its neighbors despite receiving the same amount of incident light. PRNU changes with sensitivity as well, and this form of noise can become exacerbated as ISO is increased. PRNU can be mitigated by normalizing the quantum efficiency of each pixel, minimizing variation between neighbors and across the entire sensor area. Improvements to QE can be achieved by reducing the gap between photodiodes in each pixel, by introducing one or more layers of microlenses above each pixel to refract non-photodiode incident light onto the photodiode, and by using backlit (back-illuminated) sensor technology, which moves most or all of the read wiring and transistors behind the photodiode, eliminating the chance that they might obstruct incident photons and either reflect them or convert them to heat energy.
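A tiny simulation shows how PRNU turns perfectly even illumination into pixel-to-pixel variation. The ~47% QE figure is the one quoted above; the +/- 1% spread and the photon count are hypothetical values chosen for illustration.

```python
import numpy as np

# Pixel response non-uniformity (PRNU): every pixel receives the same incident
# light, but small per-pixel differences in quantum efficiency mean they do not
# all record the same signal. Numbers are illustrative only.
rng = np.random.default_rng(1)

incident_photons = 10_000                      # identical flat illumination
qe = rng.normal(0.47, 0.01, size=100_000)      # per-pixel QE, ~47% +/- 1%
captured = incident_photons * qe               # electrons recorded per pixel

print(captured.mean())   # ~4700 electrons on average
print(captured.std())    # ~100 electrons of spread caused purely by PRNU
```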


Thermal noise is noise introduced by heat. Heat is essentially just another form of energy, and it can excite the generation of electrons in a photodiode much like a photon can. Thermal noise is caused directly by the application of heat, often via hot electronic components such as an image processor or ADC. It can be mitigated by thermally isolating such components from the sensor, or by actively cooling the sensor.


Finally there is conversion noise, or quantization noise. This type of noise is generated due to inherent inaccuracies during ADC, or analog-to-digital conversion. A non-integral gain (a gain with a whole and a fractional part) is usually applied to the analog image signal read from the sensor when digitizing an image. Since the analog signal and the gain are real numbers, the digital (integral) result of conversion is often inconsistent. A gain of 1 would produce one ADU for every electron captured by a pixel; a more realistic gain might be 1.46, in which case you might get 1 ADU per electron in some cases and 2 ADU per electron in other cases. This inconsistency can introduce conversion/quantization noise in the digital output post-ADC. This contribution to noise is pretty low, and produces a fairly fine deviation of noise from pixel to pixel. It is often fairly easy to remove with software noise reduction.
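Here is a worked example of the gain-of-1.46 case described above. The rounding step in the conversion is where the quantization error comes from; the electron counts themselves are arbitrary.

```python
import numpy as np

# Quantization during ADC with the non-integral gain of 1.46 mentioned above.
# The analog value (electrons * gain) is a real number, but the ADC output
# (ADU) must be an integer, so pixels with similar charge can land on
# different digital values.
gain = 1.46
electrons = np.array([1, 2, 3, 4, 5, 6, 7])

analog = electrons * gain             # 1.46, 2.92, 4.38, ...
adu = np.round(analog).astype(int)    # 1, 3, 4, 6, 7, 9, 10
quantization_error = analog - adu     # per-pixel residue that shows up as noise

for e, a, q in zip(electrons, adu, quantization_error):
    print(f"{e} e-  ->  {a} ADU  (error {q:+.2f})")
```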


The removal of electronic forms of noise has the potential to improve the black point and black purity of an image. The more forms of electronic noise you can eliminate or mitigate, the better your signal-to-noise ratio will be, even for very low signal levels. This is the major front on which Sony has made significant progress with their Exmor sensors, which has opened up the possibility of true 14-stop dynamic range with truly stunning shadow recovery. This is also the primary area where many competing sensor fabrication technologies are lagging behind, particularly Canon and medium format sensors. Canon sensors in particular have very high read noise levels, lower levels of QE normalization, lower QE overall, and only use CDS to mitigate dark current noise in their sensors. This results in much lower overall dynamic range, and particularly poor shadow SNR and shadow DR.
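For context on the 14-stop figure, a common engineering definition of dynamic range is the ratio of full-well capacity to the read-noise floor, expressed in stops. The numbers below are hypothetical, chosen only to show how a very low noise floor gets you near 14 stops; they are not Sony's published specifications.

```python
import math

# Engineering dynamic range: largest recordable signal (full-well capacity)
# divided by the read-noise floor, expressed in stops (powers of two).
# Hypothetical numbers chosen to illustrate the ~14-stop figure.
full_well_electrons = 45_000     # hypothetical full-well capacity
read_noise_electrons = 2.7       # hypothetical very low read noise

dynamic_range_stops = math.log2(full_well_electrons / read_noise_electrons)
print(f"{dynamic_range_stops:.1f} stops")   # ~14.0 stops
```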


Once all forms of electronic noise are mitigated to levels where they no longer matter, there will be little manufacturers can do to improve within the sensors themselves. Once this point is reached, the only thing that will really matter from a per-pixel quantum efficiency standpoint is pixel area...and with near-perfect electronic characteristics, we could probably stand pixel sizes considerably smaller than the highest-density DSLR sensors today (the Nikon D800 with its roughly 4.9 micron pixels, the Canon 7D with its 4.3 micron pixels, and eventually the Nikon D3200 with 3.8 micron pixels). Cell phone sensors use pixels in the 1-2 micron range, and have demonstrated that such pixels are viable and can produce pretty decent IQ. The same technology in a DSLR could go even farther with maximal noise reduction, so we really do have a long way to go.


Physical Limitations



Beyond technological limitations to the perfection of image quality, there are a few physical limitations. The two primary limitations are photon noise and spatial resolution. These are aspects of physical reality, and are things we really don't have much control over. They cannot be mitigated with technological enhancements, and are (and have been) present regardless of the quality of our equipment.


Photon noise, or photon shot noise, is a form of noise due to the inherently unpredictable nature of light. At a quantum level we cannot exactly predict which pixel a photon might strike, or how frequently photons might strike one pixel and not another. We can roughly fit photon strikes to a probability curve, but we can never make the fit perfect, so photons from an even light source will never perfectly and evenly distribute over the area of a sensor. This physical aspect of reality produces the bulk of the noise we encounter in our photographs, and amplification of this form of noise by the sensor's amplifiers is the primary reason photos get noisier at higher ISO settings. A lower signal-to-noise ratio means there is less clean signal to work with when amplifying, so a higher SNR can help mitigate the effects of photon noise and help us achieve higher ISO settings...however, photon noise itself cannot be eliminated, and will always be a limitation on digital camera IQ. Software can play a role in minimizing photon shot noise, and since there is some predictability in light, advanced mathematical algorithms can eliminate much of this form of noise after a photo has been taken and imported in a RAW format. The only real limitation here would be the quality, accuracy, and precision of the noise reduction software.
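Photon shot noise follows Poisson statistics: for a mean signal of N photons the noise is roughly the square root of N, so the SNR is roughly sqrt(N) as well. A quick simulation makes the point that brighter exposures are intrinsically cleaner:

```python
import numpy as np

# Photon shot noise obeys Poisson statistics, so SNR ~ sqrt(mean photon count).
# Brighter exposures have intrinsically higher SNR, but the noise itself can
# never be engineered away.
rng = np.random.default_rng(2)

for mean_photons in (100, 1_000, 10_000):
    samples = rng.poisson(mean_photons, size=200_000)
    snr = samples.mean() / samples.std()
    print(f"{mean_photons:>6} photons -> SNR ~ {snr:.1f} (sqrt(N) = {mean_photons ** 0.5:.1f})")
```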


Spatial resolution is another physical aspect of two dimensional images that we have to work with. Spatial frequencies, or two dimensional waveforms of varying luminosity, are a way of conceptualizing the image projected by a lens and recorded by a sensor. Spatial resolution describes the scale of these frequencies, and is a fixed attribute of an optical system. When it comes to sensors, spatial resolution is a direct consequence of sensor size and pixel density.


Spatial resolution is often measured in line pairs per millimeter (lp/mm) or cycles per millimeter. The D800, with its roughly 4.9 micron pixels, or 4912 rows of pixels in 24mm of sensor height, is capable of 102.33 lp/mm. Intriguingly, the Canon 7D, with its 3456 rows of pixels in 14.9mm of sensor height, is capable of 115.97 lp/mm...a higher spatial resolution than the D800. Similarly, the Nikon D3200, with 4000 rows of pixels in 15.4mm of sensor height, is capable of 129.87 lp/mm. Both the 7D and D3200 use APS-C, or crop-frame, sensors...smaller in physical dimensions than the full-frame sensor of the D800. If we were to keep increasing the number of megapixels in a full-frame sensor until it had the same pixel size as the D3200 (about 3.8 microns), we could produce a 9351x6234 pixel sensor, or 58.3mp. We could take this thought to the extreme and assume it is possible to produce a full-frame DSLR sensor with the same pixel size as the sensor in the iPhone 4 (which is well known to take some very good photos with IQ that, while not as good as from a DSLR, is more than acceptable): 1.75 microns. That would translate into a 20571x13714 pixel sensor, or 282.1mp! Such a sensor would be capable of 285.7 lp/mm spatial resolution, a number that, as you'll see shortly, has limited applicability.
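The arithmetic behind those figures is simple to reproduce: one line pair spans two pixel rows (the Nyquist limit), so lp/mm is just rows divided by sensor height divided by two. The sensor row counts and heights are the ones quoted above; the 3.85 micron pitch used for the first hypothetical full-frame sensor is what the 9351x6234 figure implies.

```python
# Sensor spatial resolution: one line pair spans two pixel rows (Nyquist),
# so lp/mm = rows / height_mm / 2.

def lp_per_mm(rows, height_mm):
    return rows / height_mm / 2

print(round(lp_per_mm(4912, 24.0), 2))   # Nikon D800  -> 102.33 lp/mm
print(round(lp_per_mm(3456, 14.9), 2))   # Canon 7D    -> 115.97 lp/mm
print(round(lp_per_mm(4000, 15.4), 2))   # Nikon D3200 -> 129.87 lp/mm

def full_frame_sensor(pitch_um, width_mm=36.0, height_mm=24.0):
    # Megapixels and lp/mm for a hypothetical 36x24mm sensor at a given pitch.
    cols = width_mm * 1000 / pitch_um
    rows = height_mm * 1000 / pitch_um
    return round(cols * rows / 1e6, 1), round(lp_per_mm(rows, height_mm), 1)

print(full_frame_sensor(3.85))   # D3200-class pitch -> ~58.3 MP, ~129.9 lp/mm
print(full_frame_sensor(1.75))   # iPhone 4 pitch    -> ~282.1 MP, ~285.7 lp/mm
```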


The real question is whether such resolution in a DSLR form factor would be beneficial. The answer is: potentially. The spatial resolution of a sensor represents an upper limit on what the entire camera could be capable of, assuming you had a corresponding lens capable of producing enough resolution to maximize the sensor's potential. Lenses have their own inherent physical limitations on the spatial resolution of the images they project, and those limitations are not constant...they vary with aperture, glass quality, and aberration correction. Diffraction is another physical attribute of light that reduces the maximum potential resolution as light passes through an increasingly narrow opening (in the case of a lens, that opening is the aperture). Optical aberrations, or imperfections in the refraction of light by a lens, are another physical factor that reduces the maximum potential resolution. Unlike diffraction, optical aberrations increase as the aperture is widened. Most lenses have a "sweet spot" at which the effects of optical aberrations and diffraction are roughly equivalent, and the lens reaches its maximum potential. A "perfect" lens is a lens that does not have any optical aberrations of any kind, and is therefore diffraction limited. Lenses often become diffraction limited around roughly f/4.


The spatial resolution of a lens is limited by diffraction and aberrations, and since diffraction increases as the aperture is stopped down, spatial resolution shrinks with the size of the entrance pupil. At f/4, the maximum spatial resolution of a perfect lens is about 173 lp/mm. At f/8, a diffraction-limited lens is capable of 83 lp/mm, which is about the same as most full-frame DSLRs (excluding the D800), which range from about 70-85 lp/mm. At f/16, a diffraction-limited lens is capable of a mere 43 lp/mm, half the resolution of most full-frame cameras and less than half the resolution of most APS-C cameras. Wider than f/4, for a lens that is still affected by optical aberrations, resolution can quickly drop to 60 lp/mm or less, and as low as 25-30 lp/mm for ultra-fast wide-angle f/1.8 or faster primes. Going back to our theoretical 1.75 micron pixel, 282mp FF sensor...it would be capable of about 285 lp/mm spatial resolution. You would need a perfect, diffraction-limited f/2.4 lens to achieve that much spatial resolution. Such a lens would require extreme aberration correction, greatly increasing cost. Some lenses do exist that can achieve nearly perfect characteristics at even wider apertures (a specialized lens from Zeiss comes to mind that is purportedly capable of about 400 lp/mm, which would require an aperture of about f/1.5-f/1.6), however they are rare, highly specialized, and extremely expensive. It's a lot easier to achieve perfection around f/4 (if the last several decades of lens production are any hint), which indicates that the maximum viable, cost-effective resolution for a lens is about 173 lp/mm or a touch less.
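For the curious, the diffraction figures quoted above (roughly 173 lp/mm at f/4, 83 at f/8, 43 at f/16) are consistent with an MTF50-style rule of thumb for a diffraction-limited lens of about 0.38 divided by wavelength times f-number. The 0.38 coefficient and the ~550nm green wavelength are my assumptions, chosen because they roughly reproduce the quoted values; other criteria (such as Rayleigh) give different absolute numbers.

```python
# Sketch of the diffraction-limited resolution figures quoted above, using an
# MTF50-style rule of thumb: lp/mm ~ 0.38 / (wavelength * f_number). The 0.38
# coefficient and the ~550 nm (green) wavelength are assumptions chosen to
# match the quoted numbers, not values taken from the original text.
WAVELENGTH_MM = 0.00055  # ~550 nm green light, expressed in millimetres

def diffraction_limited_lp_per_mm(f_number):
    return 0.38 / (WAVELENGTH_MM * f_number)

for f in (2.4, 4, 8, 16):
    print(f"f/{f}: ~{diffraction_limited_lp_per_mm(f):.0f} lp/mm")
# Prints roughly: f/2.4 ~288, f/4 ~173, f/8 ~86, f/16 ~43 lp/mm
```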


When we factor physical limitations into the question of when the megapixel race will be over, we find that (assuming near technological perfection) the highest cost-effective resolution is about 173 lp/mm. That's about a 103mp full-frame or 40mp APS-C sensor. It should be noted that pushing sensor resolution that high will only deliver its benefits in an increasingly narrow band of apertures around f/4, where lens performance is optimal. If the correction of optical aberrations becomes easier, we may be able to achieve higher resolutions, pushing 200 lp/mm, but again, such resolutions would only be possible at or near maximum aperture, whereas at all other apertures the overall resolution of your camera will be lower, potentially far lower, than what the sensor itself is capable of. Significantly outresolving the lens leads to perceptual issues, namely the perception that photographs taken at anything other than the ideal aperture appear soft, lacking sharpness.
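Converting that 173 lp/mm ceiling back into pixel counts works the same way in reverse: two pixel rows per line pair, multiplied out over assumed sensor dimensions of 36x24mm (full frame) and 22.3x14.9mm (Canon-style APS-C).

```python
# Converting the ~173 lp/mm ceiling into sensor pixel counts, as in the
# paragraph above. Two pixel rows per line pair (Nyquist); sensor dimensions
# assumed to be 36 x 24 mm (full frame) and 22.3 x 14.9 mm (APS-C).

def sensor_megapixels(lp_per_mm, width_mm, height_mm):
    pixels_per_mm = lp_per_mm * 2
    return width_mm * pixels_per_mm * height_mm * pixels_per_mm / 1e6

print(round(sensor_megapixels(173, 36.0, 24.0)))    # ~103 MP full frame
print(round(sensor_megapixels(173, 22.3, 14.9)))    # ~40 MP APS-C
```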




So when does the megapixel race end?


This is not really a question I believe anyone is qualified to answer. Ultimately, it's a personal choice, and will depend on a variety of factors. Some photographers may always want the potential that higher-resolution sensors can offer at the ideal aperture, so long as they are photographing scenes with increasingly fine detail that necessitates such resolution. Other photographers may prefer the improved perception of sharpness that comes from improving the characteristics of lower-resolution sensors. For many photographers, I believe the megapixel race has already ended, with around 20mp in a FF DSLR package being more than enough. Further still, many photographers see image quality in an entirely different light, considering frame rate and the ability to capture more frames continuously at a lower resolution paramount to their success. In such cases, many Nikon fans have indicated that around 12mp is more than enough so long as they can capture 10 frames a second in sharp clarity.



Technologically and physically, there is still a tremendous amount of room to grow and continue making gains in terms of megapixels and resolution. Where the race ends is up to you. The diversity of options on the table has never been greater than it is today, and you are free to choose the combination of resolution, sensor size, and camera capabilities like AF, ISO, and DR that fits your needs.

