Arithmetic coding

emgallardo
Mar 12, 2021
6 min read

Updated: Mar 12, 2021

Arithmetic coding is a coding method based in stream-based encoding. It depends on performing arithmetic operations on the probabilities of an alphabet which is often extracted by the source message itself.

By the nature of its process, it works better for sources with low entropy, small alphabets, and skewed probabilities.

Week1:

Objectives:

Study materials provided by Academic Advisor about communication systems and data compression.
Study MATLAB to refresh how to it works and some syntax.

Progress:

The first week, after allocation of the tasks, the material provided by our academic advisor was the main focus of study. These lectures covered some basic aspects of information theory, portraying the essential layout of communication systems:

Following this, some context about the importance of probability and the introduction of the term Entropy acting as the “uncertainty” of the information being transmitted.

Further study of entropy can give more insight on how effective is really performing the coding scheme. However, a superficial understanding of it can tell us that compressed data should offer higher “quantity of information” per character, as is transmitting the same overall “quantity of information” of the original source message but in less characters, which also results in less redundancy.

For this project we will limit to assure, through the process of decodification, that the information recovered is the same as the original (so, we’ll make sure the process is lossless, transmitting all the information) and check the compression ratio (which will indicate how well the compression is being performed and deduce how the entropy would be affected).

Week 2:

Objectives:

Further study of MATLAB syntax and tools
Study of Arithmetic coding materials
Elaboration of functional prototype program to encode and decode a simple user input string.

Progress:

This week, the materials studied were centered in how Arithmetic coding works.

Regarding media consumption, a couple of videos were especially useful:

“Arithmetic Compression (Ep 5, Compressor Head) Google”

https://www.youtube.com/watch?v=FdMoL3PzmSA

“Arithmetic coding Procedure”

https://www.youtube.com/watch?v=4yYgRAHtDLk

Written sources that were especially useful:

“Practical Implementations of Arithmetic Coding”

https://www.cc.gatech.edu/~jarek/courses/7491/Arithmetic2.pdf

“An Introduction to Arithmetic Coding”

https://www.cs.cmu.edu/~aarti/Class/10704/Intro_Arith_coding.pdf

“Introduction to Data Compression”, by Khalid Sayood, 2006.

ISB: 978-0-12-620862-7 (Available at University’s online library).

This last resource was the main reference for later work as it comprises one of the most comprehensive explanations for how arithmetic coding works, and was the material used as reference to build the main MATLAB tools used for the program.

Regarding MATLAB coding, once the basic syntax and knowledge about how the program works was done, the focus was on the study of tools available related to arithmetic coding.

MATLAB counts with different toolboxes, and one of them is the Communications Toolbox, which contains the arithenco and arithdeco functions.

Arithenco function:

The arithenco function performs arithmetic encoding using two double type positive numeric row vectors as parameters: seq and counts.

“seq” is a sequence of double values which represent the original sequence of values from the source. Is important to notice that the highest value the vector holds cannot be greater than the length of the vector itself. This would prompt an error.

“counts” is the representation of the probabilities of the elements of the alphabet used in the source. Again, if the specified format is not correct, it will prompt an error.

It gives as output a double type binary vector.

Arithdeco function:

The arithdeco function performs arithmetic decodification using three parameters:

“code”: Is a nonnegative binary row vector which contains the encoded source message.

“counts”: Same as in arithenco.

“len”: Is a positive scalar containing the length of the original source message.

Using these tools, a simple program was made to test the codification of a string introduced by the user:

In order for this to work, it was needed to adapt the input to the parameters required for the functions.

Once, the input is acquired in line 3, it is converted to double in the next line. From there, the alphabet is created with the “unique” function. Here, also is created a sequence equivalent to the original input but based on the alphabet created.

This sequence is created as the arithenco function will not take a sequence that have values higher than the size of the counts. For example, if we have a message 'abc', the double characters are 97, 98 and 99, but the size of the sequence is 3 (a, b, and c). if we create this sequence based on the alphabet, it will be just 1,2 and 3 instead.

After that, with “histc” we can create a vector “counts” containing the number of times the elements in the alphabet occur in the source message. Although “histcounts” is recommended over “histc”, it would require additional modifications that were considered not necessary as the advantages that “hiscounts” present over “histc” are not particularly useful for this case.

Once the code is generated the arithdeco function performs the decodification giving an output similar to “seq”.

This however needs to be translated back in line 11, using the alphabet, in order to recover the original message.

Finally, the message is displayed.

Week 3:

Objectives:

Implementation of data obtention from external files.
Implementation of image and audio processing.

Progress:

This week the two main challenges were to manage the correct reading of images and sound, and make sure the formats were acceptable for later processing.

Looking through some documentation from MATLAB, the chosen functions to read these three types of files were “fileread”, “imread” and “audioread”.

Using these tools and working through taking input files, a successful prototype was achieved:

Text processing

A menu to allow the user to choose was made in order to improve the quality of interaction with the user. From the option chosen, the switch statement will select what type of file to open choosing from the templates.

The input is assigned to an initial variable “data” to then be manipulated for later processing. The processing is very similar to the one made for the initial sequence obtained from the user input.

Image processing:

In the case of images, we obtain first a matrix, so we convert it to a vector and assign it to “data”. The size of the original matrix is also stored to use in the decoding process later.

An important step when reaching the reconstruction phase is to consider the format of the current vector.

Once the decoded sequence is obtained, it is translated with the alphabet and reshaped into a matrix converting the elements into “uint8” as that is the default format for the jpg images, which is the format selected for testing. Otherwise, the image could not be reconstructed.

Audio processing:

For the audio process, the important factors to consider are the reading of the audio into a vector of samples (Au), and the sample rate(Fs) in hertz.

“Au” hence will be treated as the source sequence, and “Fs” will be stored for later reconstruction.

Also, the signal is plotted and the sound reproduced with “soundsc” in order to verify the reconstruction is the same as the original.

Week 4:

Objectives:

Separation of program into modules creating different .m files for encoding and decoding, so they can be called from the main program function comprising all three methods.
Implementation of file creation for encoding and decoding process.

Progress:

This week, the last steps into a more modular program were made. A main arithmetic menu was made in a separated file, so the inclusion of this part in the general program can be made easily just calling this file.

From that menu, the user is prompted to choose either the encoding or decoding, and then the appropriate file is called.

Encoding:

The user is first prompted to choose what type of files to encode, which will determine the class of filter that will be applied when the gui is showed to the user to choose a file.

Then the image is shown and encoded. In the example portrayed the warning of an existent directory is due to previous tests.

The code and the necessary data for the decoding process is then saved in a new folder:

Decoding:

The decoding process is very similar in terms of how the user interacts with the program. The user chooses through a gui the file to decode (the gui uses a filter in order to show only bin files), and then process the file reading also the “arithmetic_setup.txt” document” which will contain the necessary information for reconstruction, such as the sample rate for audio.

An output file is generated also in the same folder where the binary file was read from:

Final testing and conclusions:

As the early testing went on, there was already some insight on what the best conditions were. It effectively was showing a better performance on data that contained less variety of values and a more skewed probability distribution.

Finally, some samples were selected for the conclusive testing:

For the text, an optimum sample and a non-optimum sample were tried. The optimum sample was a string of a, b and c characters with a much more prominent presence of “c”:

“abcccccccbcccccaccccccbccccccccccccacccccccccccccccccccc”

The non-optimum sample had just a normal phrase:

“This is a trial for non-optimal piece of text for arithmetic coding”

Also, a lorem ipsum sample text was included:

“Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Maecenas porttitor congue massa. Fusce posuere, magna sed pulvinar ultricies, purus lectus malesuada libero, sit amet commodo magna eros quis urna. Nunc viverra imperdiet enim. Fusce est. Vivamus a tellus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Proin pharetra nonummy pede. Mauris et orci. Aenean nec lorem. In porttitor. Donec laoreet nonummy augue. Suspendisse dui purus, scelerisque at, vulputate vitae, pretium mattis, nunc. Mauris eget neque at sem venenatis eleifend. Ut nonummy.”

In the case of images, an image of a green circle in a black background was chosen as “optimal”, due the small variation and the predominance of the black color:

For the “non-optimal” image, a “doge” meme image was chosen:

As for sound, no further comparation with another sample was made as it would be enough with the other two study cases.

The results obtained are the following:

As expected, there are great variations in the compression ratios obtained according to the characteristics of the sample alphabet and the probability distribution, however, it was also observed that when the size of data processed is greater, the differences in performance are even more accentuated.

Arithmetic coding

Recent Posts

Comments