Two rather speculative considerations on this:

Smoothing the input is a double-edged sword for JPEG XL. If there are smooth areas in an image, the current encoder of JPEG XL considers that there is no visual masking there and can end up putting more bits there than without smoothing. Even a small smoothed area (such as three nearby 4x4 pixel areas that are smooth) within the same integral transform can reduce quantization opportunities significantly in the current encoder. Likely, when and if we would like to introduce smoothing as a way to achieve higher compression ratios, we should reduce the effect of visual masking in the respective areas.

AVIF's filtering is more compatible with producing smooth geometry (such as photos of plastic or metallic objects and outline drawings), while JPEG XL's filtering is more compatible with preserving low contrast textures (photography of real-life objects, textures like clouds, skin, marble, wood). 1 supports this hypothesis: JPEG XL is not benefiting from the smoothing the same way AVIF is in compression ratio, and the most likely culprit is the modeling of visual masking.

In video you don't necessarily need to get the low contrast textures right in the keyframe: if the scenery is static, subsequent P or B frames can fill in the low contrast texturing rather quickly and maintain a more stable use of bandwidth.

What's the status of VVC? Is it better than HEVC and AV1?

I think any new image format needs to be smartly designed for modern and anticipated future cameras. It needs to be a good acquisition format, meaning the format that a camera creates in situ when the photo is taken. That's been JPEG for a long, long time, and these new formats from Google never seem to consider acquisition or do any serious testing.

At the very least, a new image format needs to be implementable in a manner similar to JPEG from the perspective of hardware encoder designers or firmware programmers. Flagship smartphone SoCs seem to have an image processing chip or similar on the die, separate from the GPU, display controller, and the more versatile "deep learning" or "neural" cores. There are obviously specialized chips that do it, but I'm not sure whether they're technically ASICs or something more general. Someone needs to take a very rigorous and systematic look at how image files are created by cameras in real time.

And then there's all the computational photography and deep learning stuff that flagship smartphones are using. I would look for opportunities for improvement and value-added in the design of an image format re: hardware implementation. It might be as straightforward as whether a new image codec can be efficiently implemented in hardware, or as an ASIC, or in the same sort of image processor cores as currently used ("hardware", "ASIC", and "image processor cores" might turn out to be the same thing, or not – I'm not sure). The designers of the image format need to thoroughly dig into how cameras encode JPEGs and what the implications are for the new format. In all cases, I think standalone cameras and smartphones are using a hardware JPEG encoder, but this hardware and industry is kind of obscured.

Is there something that we could do in designing an image format that would aid computational photography and future AI-ish applications? Something that would make it easier for software to process images and make inferences about them (features, face recognition, etc.)? Is it possible to design a format that would be faster for cameras to encode? Could it help with photo speed, latency, burst modes, etc.? Is it possible to create a format that takes significantly less energy/battery to encode? That would be handy.
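The visual-masking point about smoothing can be made concrete with a toy sketch. The idea is that texture masks quantization error, so a masking-aware encoder can use a coarser quantization step in busy areas, while a smooth block gets a fine step (more bits). This is only an illustration of the principle under invented parameters (`base_step`, `masking_gain`, and the activity measure are all made up here); it is not the actual libjxl heuristic.

```python
# Toy sketch of visual masking in adaptive quantization.
# NOT JPEG XL's real algorithm: the activity proxy and constants
# are invented purely to illustrate the masking principle.
import numpy as np

def block_activity(block: np.ndarray) -> float:
    """Mean absolute deviation from the block mean: a crude texture proxy."""
    return float(np.mean(np.abs(block - block.mean())))

def quant_step(block: np.ndarray, base_step: float = 8.0,
               masking_gain: float = 0.5) -> float:
    """Scale the quantization step up with local activity (masking)."""
    return base_step * (1.0 + masking_gain * block_activity(block))

rng = np.random.default_rng(0)
smooth = np.full((8, 8), 128.0)                       # flat block: no masking
textured = 128.0 + 40.0 * rng.standard_normal((8, 8))  # noisy block: strong masking

print(quant_step(smooth))    # stays at the fine base step
print(quant_step(textured))  # much coarser step: texture hides the error
```

Under this model, smoothing a previously textured region drops its activity toward zero, which forces the step back down to `base_step` and spends more bits there, which is exactly the double-edged-sword behavior described above.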