Text Compression Using Ambigrams

Arun Prasad R., Gowtham S., Iyshwarya G., Kaushik Veluru, Tamarai Selvi A., Vasudha J.

Amrita School of Engineering, Coimbatore.
{arun837, gowtham035, iysh16, kaushikveluru, tamarai1990, vasudha.1990}@gmail.com

Abstract

The networking field seeks improved and efficient methods of channel utilization. For some text, data recovery is indispensable because of the importance of the data it holds. A lossless compression algorithm that is independent of the nature and pattern of the text is therefore a pressing concern. The efficiency of the algorithms used today varies greatly with the nature of the text: they need some characters to appear frequently, and randomness in the characters degrades their performance considerably. This paper introduces the idea of using an art form called the ambigram to compress text with consistent compression efficiency.

Keywords

Ambigrams, lossless compression, steganography, stego key, embedded algorithms, encryption.

1. Introduction

Although many algorithms are available for compressing text, they hamper the readability of the text once it is compressed. Compressing text using ambigrams reduces it to nearly 50% of its size, since each ambigram symbol stands for two letters. Whereas most other compression techniques depend on the nature of the text to be compressed, this technique is independent of the type of text and requires only the corresponding data set. Hence, the method is consistent in performance. Ambigrams, though known to many, have not found major applications in the field of technology. This paper probes this little-explored art form and shows that ambigrams can be used to compress text.

1.1 Ambigram: Definition

The word ambigram was coined by Douglas R. Hofstadter, a computer scientist best known as the Pulitzer Prize-winning author of the book Gödel, Escher, Bach. In [1], Hofstadter defines what he means by an ambigram:

“An ambigram is a visual pun of a special kind: a calligraphic design having two or more (clear) interpretations as written words. One can voluntarily jump back and forth between the rival readings usually by shifting one’s physical point of view (moving the design in some way) but sometimes by simply altering one’s perceptual bias towards a design (clicking an internal mental switch, so to speak). Sometimes the readings will say identical things; sometimes they will say different things.”

1.2 Types of Ambigrams

1.2.1 Half Turn Ambigrams

Half-turn ambigrams have two different readings; to switch from one to the other, you simply rotate the ambigram 180 degrees in its own plane, as shown in Figure 1.

[pic]
Figure 1: Half Turn Ambigram of the word “ambigram”

1.2.2 Quarter Turn Ambigrams

Quarter turn ambigrams have three different readings; to switch from one to another, you simply rotate the design 90 degrees in the clockwise or anti-clockwise direction, as shown in Figure 2.

[pic]
Figure 2: Quarter Turn Ambigram of the word “OHIO”.

1.2.3 Wall Reflection Ambigrams

Wall reflection ambigrams have two different readings; to switch from one to the other, you simply reflect the design through a vertical line in the plane, as shown in Figure 3. See more examples in [2].

[pic]

Figure 3: Wall Reflection Ambigram of the word “Geometry”

1.2.4 Lake Reflection Ambigrams

Lake reflection ambigrams have two different readings and to switch from one to the other you simply have to reflect through a horizontal line in the plane. See more examples in [3] and [4].

1.2.5 Dissection Ambigrams

Another type of ambigram, which does not fall into any of the above categories, is the dissection ambigram. The example in Figure 4 illustrates that the circle can be squared after all.

[pic]
Figure 4: A dissection ambigram of Squaring the Circle.

2. Working Model

In this model, the text to be compressed is obtained from the user and stored in temporary memory. The first step is to find the positions of the white spaces in the entered text and store them in a file. Similarly, the positions of special characters are stored in a separate file, after which the white spaces and special characters are removed from the original text. The number of letters in the text is then counted and the text is divided into two equal parts. For each letter in the first part, a symbol is chosen from the font file such that, when the text is rotated by 180 degrees, the second part of the text can be read. In this way the text is compressed to about 50% of its length.
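The preliminary steps described above can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name strip_non_letters is hypothetical, in-memory lists stand in for the two position files, and lowercase a to z are assumed to be the only "letters" (any other non-space character, including capitals, is treated as a special character so that nothing is lost). Each removed character is recorded together with its position.

```python
def strip_non_letters(text):
    """Split text into its letters and the removed characters with positions.

    Assumption: only lowercase a-z count as letters; white space and all
    other characters are recorded as (position, character) pairs.
    """
    letters = []
    spaces = []    # stands in for the white-space position file
    specials = []  # stands in for the special-character position file
    for pos, ch in enumerate(text):
        if "a" <= ch <= "z":
            letters.append(ch)
        elif ch.isspace():
            spaces.append((pos, ch))
        else:
            specials.append((pos, ch))
    return "".join(letters), spaces, specials


letters, spaces, specials = strip_non_letters("hi, all!")
print(letters, spaces, specials)
```

The letter string returned here is what the later compression step would operate on; the two lists are what would be needed to restore the original layout.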

3. Implementation

3.1 Creating the Font File

A font file for ambigrams requires 26 symbols for every letter. For example, ‘a’ alone requires 26 symbols, because it has to look like each possible letter of the alphabet when rotated. A TrueType font file containing 26 × 26 = 676 ambigram symbols is created, and each symbol is given a code as follows:

• Each letter of the English alphabet is given an index from 0 to 25; for example, the letter ‘a’ is given index 0. Under each letter index, a set of 26 symbols is created. For example, under ‘a’ (index 0), 26 ambigram symbols are created by combining ‘a’ with each of the 26 letters in such a way that, when the symbol is rotated 180 degrees, each letter from a to z can be read. The code for each symbol is then assigned as

code = (first letter’s index × 26) + second letter’s index    (1)

For example, the code of the symbol which represents ‘ca’ is calculated as (2*26) + 0 = 52
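As a small illustration of (1), the code assignment can be written as below. The name symbol_code is hypothetical, and the sketch assumes lowercase letters indexed a = 0 through z = 25 as described above.

```python
import string

ALPHABET = string.ascii_lowercase  # 'a' has index 0, 'z' has index 25


def symbol_code(first, second):
    """Code of the ambigram symbol for the letter pair, following (1)."""
    return ALPHABET.index(first) * 26 + ALPHABET.index(second)


print(symbol_code("c", "a"))  # the 'ca' example from the text
```

With this numbering the codes run from 0 (for ‘aa’) to 675 (for ‘zz’), one per symbol in the font file.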

Figure 5: Compression mechanism

Thus a font file with 676 ambigram symbols, each mapped to a user-defined code, is created.

3.2 Text Compression

During compression, the first letter of the first part, on rotation, should read as the last letter of the second part, and the second letter of the first part should read as the last-but-one letter of the second part. Thus the first letter of the first part and the last letter of the second part are taken, and their indices are found and assigned to i and j respectively. The code of the symbol representing these two letters is then computed using (1). The corresponding symbol is fetched from the font file and stored in a file, and the two letters are removed from the original file. The process is repeated until no more letters are left. If the total number of letters in the original file is odd, a single letter is left over; it is copied as it is, without replacement, to the end of the compressed file containing the symbols.
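The pairing procedure can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the function name compress_letters is hypothetical, the input is assumed to contain only lowercase a to z, and the output is a list of integer codes from (1) standing in for the glyphs that would be fetched from the font file.

```python
def compress_letters(letters):
    """Pair letters from both ends inward and return (codes, middle).

    codes  : one integer per pair, computed as i * 26 + j as in (1)
    middle : the single unpaired letter if the length is odd, else ""
    """
    codes = []
    left, right = 0, len(letters) - 1
    while left < right:
        i = ord(letters[left]) - ord("a")   # letter from the first part
        j = ord(letters[right]) - ord("a")  # matching letter from the second part
        codes.append(i * 26 + j)
        left += 1
        right -= 1
    middle = letters[left] if left == right else ""
    return codes, middle


print(compress_letters("cat"))
print(compress_letters("abcd"))
```

For n letters the sketch produces n // 2 codes plus at most one leftover letter, which is where the figure of roughly 50% in symbol count comes from.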

3.3 Decompressing Text

While decompressing, the compressed text with ambigram symbols is read from the end of the file. If the end contains a letter, it is copied as it is to a new file. Whenever a symbol is encountered, its code is obtained by comparison with the font file. The indices of the two letters are calculated from the code as follows:

• Perform the integer division code / 26.

• The quotient gives the index of the first character (i) and the remainder gives the index of the second character (j).
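The two steps above amount to a single quotient-and-remainder operation. A short sketch, with the hypothetical name decode_symbol and the same a = 0 through z = 25 indexing assumed:

```python
def decode_symbol(code):
    """Recover the letter pair from a symbol code by inverting (1)."""
    i, j = divmod(code, 26)  # quotient -> first letter, remainder -> second
    return chr(ord("a") + i), chr(ord("a") + j)


print(decode_symbol(52))  # inverse of the 'ca' example
```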

Once the indices are calculated, the ambigram symbol is replaced by the corresponding pair of letters in the new file: the letter with index i is added at the beginning of the new file and the letter with index j at the end. After all symbols have been decompressed into their respective characters, the files holding the positions of white spaces and special characters are read, and the white spaces and special characters are inserted accordingly in the new file, recovering the original text.
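A rough sketch of this reconstruction is given below. The names decompress_letters and restore_text are hypothetical, integer codes stand in for the ambigram glyphs, and the removed white spaces and special characters are assumed to be available as (position, character) pairs referring to positions in the original text.

```python
from collections import deque


def decompress_letters(codes, middle=""):
    """Rebuild the letter string by reading the codes from last to first."""
    out = deque(middle)  # start from the leftover letter, if any
    for code in reversed(codes):
        i, j = divmod(code, 26)
        out.appendleft(chr(ord("a") + i))  # index i goes to the beginning
        out.append(chr(ord("a") + j))      # index j goes to the end
    return "".join(out)


def restore_text(letters, removed):
    """Re-insert removed characters at their original positions."""
    chars = list(letters)
    for pos, ch in sorted(removed):  # ascending order keeps positions valid
        chars.insert(pos, ch)
    return "".join(chars)


print(restore_text(decompress_letters([71], "a"), []))
```

Inserting in ascending order of position matters: by the time a character is re-inserted, everything that preceded it in the original text is already back in place, so the recorded position is again correct.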

4. Application of Steganography

The purpose of steganography is to hide the very presence of communication by embedding messages into innocuous-looking cover objects, such as digital images [5]. To accommodate a secret message, the original cover image is slightly modified by the embedding algorithm to obtain the stego image. Using ambigrams, the secret text that the user wants to send over the network is compressed and encrypted in a single step, which is a major advantage. The compression method can be modified by dividing the text into parts of variable length and compressing each part with the ambigram technique. The receiver is then informed of the decryption logic. This enhances security and undetectability considerably, since the original font set is made available only to the receiver; the secret message therefore cannot be tracked by any external agent. The output of the compression is then hidden in the cover image using the embedding algorithm, suitably encrypted, and sent to the receiver with the corresponding stego key.
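The paper does not specify how the variable part lengths are chosen. One possible reading, shown purely as an assumption, is that sender and receiver share a short list of lengths that is applied cyclically; the name split_variable and the part_lengths parameter are hypothetical. Each resulting part would then be compressed separately with the ambigram technique.

```python
from itertools import cycle


def split_variable(letters, part_lengths):
    """Cut letters into consecutive parts whose lengths cycle through part_lengths."""
    if not part_lengths or min(part_lengths) < 1:
        raise ValueError("part_lengths must contain positive integers")
    parts, start = [], 0
    for n in cycle(part_lengths):
        if start >= len(letters):
            break
        parts.append(letters[start:start + n])  # last part may be shorter
        start += n
    return parts


print(split_variable("abcdefgh", [3, 2]))
```

Because the parts are consecutive slices, joining them in order gives back the original letter string, so the receiver only needs the same list of lengths to undo the split.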

5. Conclusions

We have presented a technique whereby text can be compressed by around 50%, which is comparable to other methods in existence. Moreover, unlike many other algorithms, this method does not restrict the user to specific types of input. It is also a lossless technique: no data is lost on decompression. The idea can be extended by combining this technique with any other compression technique, further increasing the overall compression efficiency. An ambigram is by itself encrypted to an extent, as a layperson cannot easily perceive it. Text compression is in great demand in networking, where bulk data must be compressed and sent while also addressing data security. In such cases this technique can provide data security by dividing the text into parts of variable length, compressing each part with the given technique, and applying steganographic methods on top.

6. References

[1] Douglas R. Hofstadter, Ambigrammi (in Italian), Hopefulmonster Editore, Firenze, 1987.

[2] Burkard Polster, Les Ambigrammes: l’art de symétriser les mots, Éditions Écritextes, 2004.

[3] John Langdon, Wordplay: Ambigrams and Reflections on the Art of Ambigrams, Harcourt Brace Jovanovich, 1992.

[4] Scott Kim, Inversions: A Catalog of Calligraphic Cartwheels, Byte Books, McGraw-Hill, 1981.

[5] Faisal Alturki and Russell Mersereau, “A Novel Approach for Increasing Security and Data Embedding Capacity in Images for Data Hiding Applications”, International Conference on Information Technology: Coding and Computing, 2011, pp. 228-233.

[Figure 5 flowchart labels: File containing the position of white spaces and special characters; User Input (text to be compressed); Ambigram ttf file; Intermediate output with spaces and special characters removed; Preliminary compression steps; Compression mechanism; Final compressed text]
