Teoría de la información y métodos de codificación

jueves, 9 de mayo de 2013

Homework 5 - Error Correction

This week, it was made a code to correct errors, implementing the Hamming code.

Functionality

The hamming code allows us to find errors in a received text.

The technique used is applied the hamming code

This indicates that will be a string of 4 bits to represent numbers, letters, etc., and will be accompanied by "parity bits".

Then, with 4 bits are limited to using only 16 combinations, this means that we can only apply this technique only to 16 numbers.

	Code	Code + Paraty bit
0	0000	0000000
1	0001	0001111
2	0010	0010110
3	0011	0011001
4	0100	0100101
5	0101	0101010
6	0110	0110011
7	0111	0111100
8	1000	1000011

Now, we take the bits paraty:

P1 = d2 + d3 + d4
P2 = d1 + d3 + d4
P3 = d1 + d2 + d4

Example:

We took the code + parity bits of "0111" would be as follows:

0 = D1, 1 = D2, 1 = D3, 1 = D4

We do the operation "exor":

B1 = 1 +1 +1 = 1 (1 +1) = 0 (0 +1) = 1
B2 = 0 +1 +1 = 0 (0 +1) = 1 (1 +1) = 0
B3 = 0 +1 +1 = 0 (0 +1) = 1 (1 +1) = 0

Have "0111" + "100" and is equal to "01111000".

Now, we generate our matrix "G":

D1 = 1000 -> 011
D2 = 0100 -> 101
D3 = 0010 -> 110
D4 = 0001- > 111

Now, the code

***This code, I took a page I found the internet in order to do my job, but I could not finish it

Code:

Now, the results

Results:

:(

Analysis:

:(

Repository:

References:

jueves, 25 de abril de 2013

Activity Point Extra - Lempel-Ziv-Welch

Lempel–Ziv–Welch (LZW) is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch.

The algorithm is simple to implement, and has the potential for very high throughput in hardware implementations. It was the algorithm of the widely used Unix file compression utility compress, and is used in the GIF image format.

Encoding

A high level view of the encoding algorithm is shown here:

Initialize the dictionary to contain all strings of length one.
Find the longest string W in the dictionary that matches the current input.
Emit the dictionary index for W to output and remove W from the input.
Add W followed by the next symbol in the input to the dictionary.
Go to Step 2.

A dictionary is initialized to contain the single-character strings corresponding to all the possible input characters (and nothing else except the clear and stop codes if they're being used).

The algorithm works by scanning through the input string for successively longer substrings until it finds one that is not in the dictionary.

Algorithm Compress:

Symbols and variables

Algorithm Decompress

Symbols an variables

Example Code in Python

Results

Funcionality

Compress

The string is "HolaHola".

Compressed program sequence of 8-bit symbols.

First, the word "Hello", it's the same, and not added the same.

Then, the string "Ho" is added to the 256:
Ho = 256

After the next, the character "la" one adds the 258:
la = 258

They met the characters "Ho" and "la".

Decompress

256,258 = Hola

The values are replaced by the characters

References

Homework 4 - Adaptative Huffman

For this week, as was the task at adaptive Huffman algorithm.

In this task, use the code that I had already used in the previous post.

The functionality is explained in the previous post, and this time add the piece of code to be something dynamic and not is adaptative.

Functionality

The program compresses the text, in this occasion not enter text in the terminal, as it had done earlier in the program, the text is generated within the program, and it gives us the compressed data results.

Now the code:

Code:

Now the results:

Results:

Test 1:

Test 2:

Analysis of Results:

Testing time:

Test 1:

Test 2:

One problem with the program is that, when calculating the time, really does not take long to process the program, perhaps would be the size of characters produced, maybe that's the point at which this happens.

Repository:

Theory-of-the-information-and-methods-of-coding

References:

jueves, 11 de abril de 2013

Homework 3 - Text Compression

This week, we deliver a program on "compress text", with the help of Huffman coding.

But first, a little introduction:

Huffman Coding.

Huffman coding is an algorithm used for data compression.

The term refers to the use of a table of variable length codes to encode a given symbol (such as a character in a file) where the table has been filled in a specific manner based on the estimated probability of occurrence for each possible value of the symbol.

Technical

Is the creation of a binary tree in which the leaf nodes are labeled with characters, along with their frequencies, and consecutively are joining each pair of nodes that join less frequently, from creating a new intermediate node labeled with that sum.

We proceed to do this until there are no leaf nodes join any top node, and has formed the binary tree.

Subsequently, the label edges joining each of the nodes with zeros and ones (left and right child, respectively, for example. Resultant code for each character is read following the branch, from the root to each character (or vice versa ) of each of the labels of the edges.

Operation.

First, we give the text to be processed, and then run the program so it shows the table with their coding and we decode the text also in binary numbers.

** For clarification, seeing a code example, were used heapq module functions, such as: heapify, heappop and heappush, once a little explanation:

heapq.heapify(): Transform list x into a heap, in-place, in linear time

heapq.heappop(): Pop and return the smallest item from the heap, maintaining the heap invariant.

heapq.heappush(): Push item on the heap, then pop and return the smallest item from the heap. The combined action runs more efficiently than heappush() followed by a separate call to heappop().

Now, the code

Code:

Now, run the program and their results

Results:

Test one

Test two

First, use a short text, after a long text to detect if the program worked well

Analysis

Experiment the typical-case

The first analysis was monitoring time, which results are as follows:

Results of one test

Results of two test

Experiment the worst-case

The second, was to monitor the weight of the uncompressed file and of the compressed file

First text: 25 bytes
Compressed: 89 bytes

Second text 340 bytes
Compressed: 1.28 KB

In the test to check the weight of the archives, the weight was higher, the space can be consumed in the file with the data entered

Repository:

Teoria-de-la-informacion-y-metodos-de-codificacion

References:

jueves, 21 de febrero de 2013

Homework 2 - String matching

For this week, was commissioned a "string matching" with the implementation of the algorithms of Boyer-Moore and Knuth-Morris-Pratt.

Note*: Only apply the Boyer-Moore algorithm and basic algorithm

Operation:

The program runs in a terminal, as the first argument we give the number of letters for the random text in the second argument and give the number of letters for the random pattern.

Code:

Program Running:

Results:

Now, the simulation is as follows:

Simulation:

....

No Results :(

Note*: Code issues

Repository

Teoria-de-la-informacion-y-metodos-de-codificacion

jueves, 14 de febrero de 2013

Homework 1 - Noisy Channel

For this week, was commissioned a program to simulate a noisy channel, built in python to generate a conversion of zeros and ones given the input, for data transmission on the channel, then analyze the data.

Operation:

The program works by terminal, we first as argument the length of the word, then put the following argument amount to generate zeros, then the probability of receiving argument zeros and then the receiving one, finally the argument many reps per word

Code:
Program running:

Image:

Image 2:

Repository

Teoria-de-la-informacion-y-metodos-de-codificacion

viernes, 8 de febrero de 2013

Activity Points Extras - Probability Exercise

Perform the following exercise:

Exercise 1.3.2

(Exercise taken from the PDF:)

Traduccion:

1.3.2 Una urna contiene 6 bolas rojas, 5 bolas verdes y 3 bolas amarillas, 3 bolas sin reemplazo. ¿Cual es la probabilidad de que una bola sea amarilla?.

Solution:

First, we have the color sequence and represent with letters, for example:

Red = R

Green = G

Yellow = Y

We add the balls in total, it is 14, now, we get the chance, we have 3 yellow balls out of 14 balls in total, is as follows:

P ( Y ) = 3/14

We describe the most important events that can happen, to get as many yellow probability on each play we get

P = {YYY, YYR, YRG .....}

We have that on the first play I can get 3 yellow, the second two yellow and one red, one yellow in the third, one red and one green.

We select the three most important events and multimplicamos by probabilities, we have:

And the likelihood is:

We have that the probability of drawing a yellow ball is 3/364 opportunities

Referencias:

http://www.amazon.com/exec/obidos/ISBN=0849339855