Data compression provides a way to represent information in compact form by exploiting structure present in the data itself. This is especially important nowadays, as most of the information generated and used is in digital form.
There are several techniques available; however, all of them are fundamentally composed of two processes: compression and reconstruction.

In the context of communication systems, these processes are referred to as source coding, which transforms the signal so that it can be delivered to the receiver more efficiently.
First, the encoding process maps the source signal into a bitstream. This bitstream is then transmitted over the error-control channel, which, after processing, outputs a bitstream that is decoded to reconstruct the original signal. The reconstructed signal is finally delivered to the destination, commonly referred to as the sink (Fig. 1).

Depending on the requirements for the reconstructed data, compression schemes can be divided into two types: lossless (the reconstructed data is identical to the original) and lossy (higher compression at the cost of some loss in fidelity).
Lossless compression is necessary for applications that must recover the data without any alteration. For example, in radiology, if an image is compressed with a lossy technique and later reconstructed, the degradation may not be noticeable at first glance; however, if the image needs to be enhanced for further study, artifacts may appear that could lead to a wrong diagnosis.
Lossy compression, on the other hand, is useful for applications where the reconstructed information remains usable despite some loss. A very common example is the MP3 file: music is widely distributed in this format rather than, for example, WAV, because MP3 files are much smaller and, even though some quality is lost, the loss in detail has little impact on the listening experience for the general public.
This project aims to study different source coding schemes, both lossless and lossy, and analyse their performance. For this purpose, three different coding schemes were chosen: Non-uniform quantization (lossy), Huffman coding (lossless) and Arithmetic coding (lossless).
A MATLAB program will be created for each of these methods, and these will then be integrated into a main program for final testing.
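As a first idea of what one of these programs could look like, below is a minimal sketch of the lossy method, non-uniform quantization, implemented here through mu-law companding. The parameter values (mu = 255, 4 bits) and the sine-wave test signal are illustrative assumptions, not details taken from the project itself.

% Minimal sketch of non-uniform quantization via mu-law companding.
% mu, nBits and the test signal are assumed for illustration only.
mu    = 255;                 % companding parameter (assumed)
nBits = 4;                   % quantizer resolution in bits (assumed)
L     = 2^nBits;             % number of quantization levels

x    = sin(2*pi*(0:0.001:1));        % example source signal
xMax = max(abs(x));
xn   = x / xMax;                     % normalize to [-1, 1]

% Compress the dynamic range, quantize uniformly, then expand.
y  = sign(xn) .* log(1 + mu*abs(xn)) / log(1 + mu);  % mu-law compressor
q  = round((y + 1) * (L - 1) / 2);                   % uniform quantizer indices 0..L-1
yr = 2*q / (L - 1) - 1;                              % reconstruction levels in [-1, 1]
xr = sign(yr) .* ((1 + mu).^abs(yr) - 1) / mu;       % mu-law expander
xr = xr * xMax;                                      % undo normalization

% Distortion introduced by the lossy scheme
fprintf('MSE after non-uniform quantization: %g\n', mean((x - xr).^2));

The companding step compresses the dynamic range before the uniform quantizer is applied, so small-amplitude samples are effectively quantized with finer steps than large ones.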
The testing will allow a better understanding of the benefits of using one scheme or another, depending on the data that the software needs to process.
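As a rough sketch of the kind of comparison this testing could involve, the snippet below encodes and decodes a test sequence with Huffman coding using the Communications Toolbox functions huffmandict, huffmanenco and huffmandeco, and compares the average code length with the source entropy. The five-symbol alphabet and its probabilities are assumed for illustration; the project code may implement the coder from scratch rather than rely on the toolbox.

% Rough sketch of a Huffman coding round trip and a simple performance check.
% The alphabet and probabilities below are assumptions for illustration.
symbols = 1:5;                          % assumed source alphabet
prob    = [0.4 0.2 0.2 0.1 0.1];        % assumed symbol probabilities

[dict, avglen] = huffmandict(symbols, prob);   % build the Huffman code

% Draw a test sequence i.i.d. from the assumed distribution.
cp  = cumsum(prob);
sig = arrayfun(@(u) find(u <= cp, 1, 'first'), rand(1000, 1));

code    = huffmanenco(sig, dict);       % encode into a bitstream
decoded = huffmandeco(code, dict);      % decode (lossless: identical to sig)

H = -sum(prob .* log2(prob));           % source entropy (lower bound on avglen)
fixedBits = numel(sig) * ceil(log2(numel(symbols)));

fprintf('Lossless round trip: %d\n', isequal(sig(:), decoded(:)));
fprintf('Entropy = %.3f bits, average code length = %.3f bits\n', H, avglen);
fprintf('Compression ratio vs. fixed-length coding: %.2f\n', fixedBits / numel(code));

Since Huffman coding is lossless, the decoded sequence is identical to the original, so the interesting figure of merit is how close the average code length gets to the entropy of the source.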
The following blog posts will be organized according to these three schemes, describing separately the progress made in developing and testing the MATLAB code for each one.