Surviving Information Theory
by Michael Prior-Jones (MRP-J)
This guide is based on the 8-hour lecture course on Information Theory given in the summer term by Tasos Papatsoris.
Information theory concerns the measurement of information, and the use of coding techniques to reduce data rate or improve the robustness of a communications system.
This course concerns digital transmission only, for the most part in binary form.
A given information source must have some random element to it in order to convey information. Something that is 100% predictable contains no information!
Information sources generate a number of symbols which carry the information. The source has a finite number of permitted symbols, known as an alphabet.
To understand random sources, we need to use probability theory. Probabilities are usually expressed as decimal numbers between 0 and 1. A probability of 0 means something will never happen. A probability of 1 is totally certain.
To find probabilities, we count the number of occurrences N(S) of our symbol S in a random stream of N symbols. Dividing N(S) by N gives the probability. For example, if you see 25 occurrences of S in 100 symbols, then the probability is 0.25.
Of course, the more symbols you observe, the more accurate your probability figure.
Note that if you have a set of mutually exclusive symbols, i.e. the source always transmits exactly one of these symbols at any given time, then their probabilities add up to 1.
For example, if we have two mutually exclusive symbols, S and T, and S has a probability of occurrence of 0.25, then T must have a probability of occurrence of 0.75.
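As a minimal Python sketch of the counting procedure (the symbol stream below is made up to match the 25-in-100 example above):

```python
from collections import Counter

def estimate_probabilities(stream):
    """Estimate P(S) for each symbol S as N(S)/N over an observed stream."""
    counts = Counter(stream)
    n = len(stream)
    return {symbol: count / n for symbol, count in counts.items()}

# 25 occurrences of S in 100 symbols gives P(S) = 0.25.
probs = estimate_probabilities("STTT" * 25)
assert probs["S"] == 0.25
# Mutually exclusive symbols: the probabilities always sum to 1.
assert abs(sum(probs.values()) - 1.0) < 1e-9
```

The longer the stream you feed in, the closer these estimates get to the true probabilities.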
Probability arithmetic
If you want to know the probability of symbol A or symbol B occurring then you add their individual probabilities:
For example, the probability of throwing a 2 or a 3 on a single die is P(2 or 3) = 1/6 + 1/6 = 2/6 = 1/3.
If you want to know the probability of symbol A and symbol B occurring then you multiply their individual probabilities:
For example, the probability of throwing double 6 on two dice is P(6 and 6) = 1/6 × 1/6 = 1/36.
These two rules only work where the occurrence of symbols is independent, i.e. where the occurrence of one symbol does not affect the occurrence of another.
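Both rules can be checked in a few lines of Python using exact fractions; the dice are the worked examples above:

```python
from fractions import Fraction
from itertools import product

one_die = {face: Fraction(1, 6) for face in range(1, 7)}

# OR rule for mutually exclusive outcomes: add the probabilities.
p_2_or_3 = one_die[2] + one_die[3]
assert p_2_or_3 == Fraction(1, 3)

# AND rule for independent events: multiply the probabilities.
p_double_6 = one_die[6] * one_die[6]
assert p_double_6 == Fraction(1, 36)

# Cross-check the AND rule by enumerating all 36 outcomes of two dice.
outcomes = list(product(range(1, 7), repeat=2))
sixes = sum(1 for a, b in outcomes if a == b == 6)
assert Fraction(sixes, len(outcomes)) == p_double_6
```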
Conditional Probabilities (Bayes Theorem)
When two symbols are not independent, the following formula must be used:
P(A) means the probability of symbol A occurring.
P(B) means the probability of symbol B occurring.
P(B|A) means the probability of B occurring, on the condition that A has already occurred. This is read as "the probability of B, given A".
P(A and B) means the probability of A occurring and then B occurring.
P(A and B) = P(A) × P(B|A)
Bear in mind that P(B and A) is not the same as P(A and B)
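A short Python illustration of the formula, using a made-up card-drawing scenario (not from the lectures) in which the two events are clearly dependent:

```python
from fractions import Fraction

# Drawing two cards from a full deck without replacement:
# the second draw depends on the first, so the events are not independent.
p_first_ace = Fraction(4, 52)                # P(A)
p_second_ace_given_first = Fraction(3, 51)   # P(B|A): one ace already gone
p_both_aces = p_first_ace * p_second_ace_given_first  # P(A and B) = P(A)P(B|A)
assert p_both_aces == Fraction(1, 221)
```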
Information Theory
This was dreamed up by Claude Shannon, a researcher at Bell Labs in the USA, in 1948.
Information is the resolution of uncertainty
Since uncertainty is measured using probability we can define information mathematically:
I(A) = -log(P(A)) = log(1/P(A))
Conventionally, when dealing with binary systems, we take the logarithm to base 2, and define the unit of information as the bit. NOTE: this is not the same bit as the binary digit.
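A quick sketch of the definition, assuming base-2 logarithms as the notes do:

```python
from math import log2

def information_bits(p):
    """Information conveyed by a symbol of probability p, in bits (log base 2)."""
    return -log2(p)

assert information_bits(0.5) == 1.0   # a fair coin flip resolves 1 bit
assert information_bits(0.25) == 2.0  # rarer symbols carry more information
assert information_bits(1.0) == 0.0   # a certain symbol carries no information
```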
Entropy
The entropy of a source is its average information per symbol. To find it, take the probability of each symbol and multiply by the information carried by that symbol. Then add up all of the products.
For a binary source, entropy (H) is defined thus:
H = -p×log2(p) - (1-p)×log2(1-p)
H is maximum when p is 0.5, i.e. when all symbols are equally probable.
HMAX = log2(n), where n is the number of symbols generated by the source.
Redundancy
Redundancy is the difference between the maximum entropy and the actual entropy:
R = HMAX - H
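The entropy, maximum entropy and redundancy definitions can be checked together; the four-symbol source below is illustrative, not from the lectures:

```python
from math import log2

def entropy(probs):
    """H: sum of p * (-log2 p) over all symbols, in bits per symbol."""
    return sum(p * -log2(p) for p in probs if p > 0)

# A binary source peaks at 1 bit/symbol when p = 0.5.
assert entropy([0.5, 0.5]) == 1.0

probs = [0.5, 0.25, 0.125, 0.125]   # an illustrative four-symbol source
h = entropy(probs)
h_max = log2(len(probs))            # maximum entropy: equally probable symbols
redundancy = h_max - h
assert h == 1.75
assert h_max == 2.0
assert redundancy == 0.25
```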
Source encoding
Coding techniques can be used to reduce the redundancy of the source, leading to data compression.
Average codeword length:
L0 = Σ pi × li (summed over all codewords i)
li is the length in binary digits of codeword i
pi is the probability of codeword i occurring
Basically, measure the length of your codeword, multiply by its probability and then repeat for the other codewords. Add them together to give the average codeword length.
Encoded source efficiency:
E = H / L0
Input length in output symbols:
Li = log2(n)
where n is the number of symbols transmitted by the source.
Compression ratio:
CR = L0 / Li
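A sketch tying the source-coding quantities together, using an illustrative four-symbol source and a hypothetical prefix code (neither taken from the lectures):

```python
from math import log2

symbols = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
codewords = {"A": "0", "B": "10", "C": "110", "D": "111"}

# Average codeword length: sum of p_i * l_i over all codewords.
L0 = sum(p * len(codewords[s]) for s, p in symbols.items())
H = sum(p * -log2(p) for p in symbols.values())   # source entropy
E = H / L0                     # encoded source efficiency
Li = log2(len(symbols))        # input length in output (binary) symbols
CR = L0 / Li                   # compression ratio

assert L0 == 1.75
assert E == 1.0                # this code happens to be optimal for this source
assert CR == 0.875             # 12.5% shorter than fixed-length coding
```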
Desirable properties of codes:
uniquely decodable (i.e. any sequence of codewords can be decoded in only one way)
instantaneous (i.e. no codeword is a prefix of any other, so the end of a codeword can be recognised without looking at the start of the next)
compact (i.e. codewords should be as short as possible)
Fano-Shannon Coding Method:
Write the symbols down in order of decreasing probability.
Insert a dividing line between two symbols so that you have a probability close to 0.5 on each side of the line. Then take each half and try to split the probabilities in half again.
Repeat until you've separated all of the symbols. Then read off the codewords down the tree from right to left.
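The steps above can be sketched as follows; the rule used here for placing the dividing line (take the first split at or past half the group's probability) is one reasonable choice, not necessarily the one used in the lectures:

```python
def fano_shannon(symbols):
    """symbols: list of (symbol, probability), sorted by decreasing probability.
    Returns symbol -> codeword, splitting each group as near to half its
    total probability as this simple rule allows."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        running, cut = 0.0, 1
        for i, (_, p) in enumerate(group[:-1], start=1):
            running += p
            if running >= total / 2:
                cut = i
                break
        for s, _ in group[:cut]:    # one side of the line gets a 0...
            codes[s] += "0"
        for s, _ in group[cut:]:    # ...the other side gets a 1
            codes[s] += "1"
        split(group[:cut])
        split(group[cut:])

    split(symbols)
    return codes

codes = fano_shannon([("A", 0.5), ("B", 0.25), ("C", 0.125), ("D", 0.125)])
assert codes == {"A": "0", "B": "10", "C": "110", "D": "111"}
```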
Huffman coding method:
List the symbols in order of decreasing probability.
Join the two symbols with the lowest probability to form a node. Label the upper branch with a 0 and the lower with a 1. Label the node with the sum of the two probabilities.
Repeat until all leaves and nodes are joined into one tree.
Read the codewords from right to left down the tree.
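A compact Python sketch of the merging procedure using a heap. Which branch gets the 0 and which the 1 is only a convention, so the test checks the codeword lengths rather than the exact digits:

```python
import heapq
from itertools import count

def huffman(probabilities):
    """probabilities: dict of symbol -> probability. Returns symbol -> codeword.
    Repeatedly joins the two least probable nodes into one, prefixing a 0 on
    one branch and a 1 on the other, until a single tree remains."""
    tie = count()  # tie-breaker so heapq never has to compare the dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, lo = heapq.heappop(heap)   # two least probable nodes
        p2, _, hi = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lo.items()}
        merged.update({s: "1" + c for s, c in hi.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

codes = huffman({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125})
lengths = {s: len(c) for s, c in codes.items()}
assert lengths == {"A": 1, "B": 2, "C": 3, "D": 3}
```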
Information Transmission
Transmitting data over a noisy channel causes errors.
The information content of the data stream before it is corrupted by noise is known as a priori information. The erroneous information content you actually receive is known as a posteriori information.
If a 1 is transmitted then:
Ipri = -log2(P(T1))
where P(T1) is the probability that a 1 is transmitted.
If a 1 is received:
Ipost(R1|T1) = -log2(P(R1|T1))
If this is smaller than Ipri then information has been gained at the receiver. Otherwise, information is lost in transmission.
The information received is I = Ipri - Ipost
Mutual information and equivocation
Mutual information is the average information transferred:
I(T;R) is the notation for mutual information:
I(T;R) = H(T) - H(T|R) = H(R) - H(R|T)
Equivocation is the information lost in transmission:
H(T|R) is the notation for equivocation.
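For a binary symmetric channel these quantities can be computed numerically from the joint distribution; the equiprobable input and error probability of 0.1 below are illustrative choices:

```python
from math import log2

def H(dist):
    """Entropy of a probability distribution, in bits."""
    return sum(p * -log2(p) for p in dist if p > 0)

# Binary symmetric channel: equiprobable input, error probability p = 0.1.
p = 0.1
joint = {(t, r): 0.5 * (1 - p if t == r else p) for t in (0, 1) for r in (0, 1)}

# Marginal distribution of the transmitted symbol T.
H_T = H([sum(pr for (t, _), pr in joint.items() if t == b) for b in (0, 1)])
# Equivocation H(T|R) = H(T,R) - H(R); here H(R) = H(T) = 1 by symmetry.
H_TR = H(joint.values())
equivocation = H_TR - H_T
mutual_info = H_T - equivocation        # I(T;R) = H(T) - H(T|R)

assert H_T == 1.0
assert round(equivocation, 3) == 0.469  # information lost to the noise
assert round(mutual_info, 3) == 0.531   # information actually transferred
```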
Channel Capacity:
Capacity is the maximum possible mutual information of a channel.
For a binary symmetric channel, with error probability p:
C = 1 - (p×log2(1/p) + (1-p)×log2(1/(1-p)))
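A sketch of the capacity formula; note that it collapses to zero at p = 0.5, where the output is pure noise:

```python
from math import log2

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with error probability p,
    C = 1 - (p*log2(1/p) + (1-p)*log2(1/(1-p))), in bits per binary digit."""
    if p in (0.0, 1.0):
        return 1.0  # no uncertainty about the output given the input
    return 1 - (p * log2(1 / p) + (1 - p) * log2(1 / (1 - p)))

assert bsc_capacity(0.0) == 1.0                         # noiseless channel
assert abs(bsc_capacity(0.5)) < 1e-12                   # pure noise: C = 0
assert abs(bsc_capacity(0.25) - bsc_capacity(0.75)) < 1e-12  # symmetric in p
```

A channel with p = 1 still has full capacity: every digit is inverted, so the receiver can simply flip them back.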
Shannon's Theorem states that:
It is possible to transmit information over a channel at any rate less than or equal to its capacity with arbitrarily small error probability.
C = B×log2(1 + S/N)
where:
C = channel capacity
B = channel bandwidth
S/N = signal-to-noise ratio
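A sketch of the formula; the 3 kHz bandwidth and 30 dB signal-to-noise ratio below are illustrative values, and note that S/N must be supplied as a linear power ratio, not in decibels:

```python
from math import log2

def channel_capacity(bandwidth_hz, snr_ratio):
    """Shannon-Hartley: C = B * log2(1 + S/N), in bits per second.
    snr_ratio is a linear power ratio (30 dB corresponds to a ratio of 1000)."""
    return bandwidth_hz * log2(1 + snr_ratio)

# An illustrative 3 kHz telephone-style channel at 30 dB SNR.
c = channel_capacity(3000, 1000)
assert 29000 < c < 30000   # roughly 29.9 kbit/s
```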
Channel Codes:
In a noisy channel, redundancy is added to give protection against errors.
To reduce error probabilities, codewords should be as dissimilar as possible.
The number of digits by which two codewords differ is known as the Hamming Distance.
If you evaluate this for a whole coding scheme, then the limiting factor is the minimum Hamming distance (dmin), i.e. the amount by which the two most similar codewords differ.
A code can guarantee to detect up to and including dmin-1 errors.
A code can guarantee to correct up to and including (dmin - 1)/2 errors (rounded down).
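These definitions translate directly into Python; the two-codeword repetition code below is illustrative:

```python
from itertools import combinations

def hamming_distance(a, b):
    """Number of digit positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    """d_min: the distance between the two most similar codewords in the scheme."""
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))

# A simple two-codeword repetition code with d_min = 3:
code = ["000", "111"]
d_min = minimum_distance(code)
assert d_min == 3
assert d_min - 1 == 2           # errors it can guarantee to detect
assert (d_min - 1) // 2 == 1    # errors it can guarantee to correct
```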
Parity codes:
In a parity code, an extra digit is added, known as the parity digit. A decision is made whether to use odd or even parity by the code designer. If the parity is odd, the parity digit is set so as to give an odd number of 1s in the codeword. If the parity is even, the parity digit is set so as to give an even number of 1s in the codeword.
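A minimal sketch of parity generation (the helper name is made up for illustration):

```python
def add_parity(bits, odd=False):
    """Append a parity digit so the codeword has an even (or odd) count of 1s."""
    ones = bits.count("1")
    parity = (ones + (1 if odd else 0)) % 2
    return bits + str(parity)

assert add_parity("1101") == "11011"            # even parity: four 1s in total
assert add_parity("1101", odd=True) == "11010"  # odd parity: three 1s in total
```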
Classical Hamming Code:
This comprises the information digits interspersed with parity digits. If a single error occurs, some of the parity checks will fail. By looking at the states of the erroneous parity bits you can find the bit that is in error.
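The classical (7,4) version of this scheme can be sketched as follows. The layout here (even-parity digits at positions 1, 2 and 4, each covering the positions whose binary representation contains that position's bit) is the standard textbook arrangement, which may differ in detail from the lecture's worked example:

```python
def hamming74_encode(data):
    """Encode 4 data bits as a classical Hamming (7,4) codeword.
    Positions 1, 2 and 4 (counting from 1) hold even-parity check digits."""
    d = [int(b) for b in data]
    p1 = d[0] ^ d[1] ^ d[3]   # covers positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]   # covers positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]   # covers positions 5, 6, 7
    return "".join(str(b) for b in [p1, p2, d[0], p4, d[1], d[2], d[3]])

def hamming74_correct(word):
    """Fix a single-bit error: the failing parity checks, read as a binary
    number, give the position of the erroneous digit."""
    bits = [int(b) for b in word]
    c1 = bits[0] ^ bits[2] ^ bits[4] ^ bits[6]
    c2 = bits[1] ^ bits[2] ^ bits[5] ^ bits[6]
    c4 = bits[3] ^ bits[4] ^ bits[5] ^ bits[6]
    syndrome = c1 + 2 * c2 + 4 * c4
    if syndrome:
        bits[syndrome - 1] ^= 1   # flip the digit the checks point at
    return "".join(str(b) for b in bits)

codeword = hamming74_encode("1011")
assert hamming74_correct(codeword) == codeword   # clean word passes unchanged
# Corrupt one digit and check that the decoder repairs it.
corrupted = codeword[:2] + ("1" if codeword[2] == "0" else "0") + codeword[3:]
assert hamming74_correct(corrupted) == codeword
```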
E
N
Q
h
o
'>g~fq/FG^Michael Prior-Jones/C:\Electronics\Surviving Information Theory.docMichael Prior-Jones/C:\Electronics\Surviving Information Theory.docMichael Prior-Jones/C:\Electronics\Surviving Information Theory.docMichael Prior-Jones`C:\WINDOWS\Application Data\Microsoft\Word\AutoRecovery save of Surviving Information Theory.asdMichael Prior-Jones/C:\Electronics\Surviving Information Theory.docMichael Prior-Jones`C:\WINDOWS\Application Data\Microsoft\Word\AutoRecovery save of Surviving Information Theory.asdMichael Prior-Jones/C:\Electronics\Surviving Information Theory.docMichael Prior-Jones`C:\WINDOWS\Application Data\Microsoft\Word\AutoRecovery save of Surviving Information Theory.asdMichael Prior-Jones`C:\WINDOWS\Application Data\Microsoft\Word\AutoRecovery save of Surviving Information Theory.asdMichael Prior-Jones/C:\Electronics\Surviving Information Theory.doct@zh^`OJQJo(h^`OJQJo(ohpp^p`OJQJo(h@@^@`OJQJo(h^`OJQJo(oh^`OJQJo(h^`OJQJo(h^`OJQJo(ohPP^P`OJQJo(t@ @$%@P@UnknownMichael Prior-JonesG:Times New Roman5Symbol3&:ArialA&Gill Sans MT9Garamond?5 :Courier New;Wingdings"1hQE&UFUFzr[6!0d#2Q?C:\WINDOWS\Application Data\Microsoft\Templates\electronics.dot!Analogue Electronics Recipe CardsMichael Prior-JonesMichael Prior-JonesrdDocWord.Document.89q*