How to calculate the mutual information for continuous data (as a matrix) in R


Suppose I have the following data:

mydata <- structure(c(4.72457077562954, 7.32894135282556, 6.95011739471982, 
10.8300240696743, 6.39856613705841, 0.172228289296637, 0.844120336629789, 
0.562443278169962, 1.63326403093408, 0.16239767405664, 0.382171889691901, 
0.810277454839471, 0.554783718855198, 0.984595540919338, 0.411404693691339, 
5.73870719543878, 9.01409973262736, 7.72381458300557, 6.37830282117131, 
9.47132398634276, 0.15482730111734, 1.03922178569039, 1.23589542664667, 
0.715001115507959, 1.39842251707572, 0.391191005095164, 0.952033657767654, 
0.834109579354131, 0.913091389777743, 0.96332445990317, 11.1303412174241, 
4.22756468513327, 5.23723049739091, 7.99103832413648, 8.96884038257952, 
1.36840764709506, 0.230539354189812, 0.269840740760912, 0.580582533466914, 
1.37660280446994, 0.9923629561887, 0.298903147137495, 0.124580618664885, 
0.637977373342453, 0.985680485237176, 6.71106736523235, 8.99128954870899, 
8.63360185300458, 7.36149017631701, 7.08867456999427, 0.557225307438651, 
1.35221655870466, 1.03051952595894, 0.783264858049435, 1.10737584174096, 
0.613340461129714, 0.920112522808632, 0.843916510900149, 0.873510522729128, 
0.650948894579644, 2.76390988506319, 7.57870936069703, 8.87523033565843, 
7.41311877346505, 9.48210392055385, 0.372707166990599, 0.848147234379902, 
0.743066514018534, 0.702178365125204, 1.16207510006004, 0.148978802075519, 
0.776008309000174, 0.829054592676146, 0.678470198306072, 0.929969250893989, 
9.94675474575133, 6.11614429085342, 3.81882067604256, 8.63204254144698, 
6.44840440140108, 0.702506156623805, 0.573327530386587, 0.331057613898793, 
1.13183172898237, 0.549734692917344, 0.82202942610082, 0.537400868536421, 
0.26537926409777, 0.962289247776827, 0.645694392155912, 8.84982387462716, 
7.73353886692751, 3.84722224587904, 9.26567096436445, 1.93759809098197, 
0.74723479305391, 0.732168886777685, 0.188149144567337, 1.49030693944634, 
0.202694580856647, 0.774881009588518, 0.74218555275509, 0.137733938095486, 
0.96383454189271, 0.185646072816786, 6.97742589140545, 8.92353398393719, 
5.98754266767129, 6.29926666403901, 6.82540354418578, 0.603749258836118, 
1.20416956220779, 0.466975392638043, 0.830645783503278, 0.795872414797791, 
0.706058817783705, 0.861981874102617, 0.423897414070843, 0.617275633899248, 
0.942287712517571, 6.97987444768975, 7.43839817177457, 7.2780403754511, 
8.79261638474757, 5.75160173569962, 1.25649885742083, 1.0839021399685, 
0.641942164672113, 0.911591609252122, 0.294465576434456, 0.742010663850341, 
0.767567404457631, 0.744295545274559, 0.834933474594128, 0.536439346102231, 
8.12175760661722, 12.2995463339431, 6.85115203581506, 7.88799981848227, 
6.22667437885265, 0.836658394885195, 1.29306889844364, 0.534850939698971, 
0.735479048732013, 0.608789667144013, 0.886827418155642, 0.913481166679054, 
0.769082102902062, 0.775522950801252, 0.644051430346751, 9.08384586427308, 
7.9015301614423, 5.56917716558971, 6.33267835773375, 8.9951760327848, 
0.844954474137962, 0.910883713196837, 0.574094417843773, 0.534107880161307, 
0.905555155073914, 0.840834639100438, 0.755805878158685, 0.576605481875742, 
0.65272781224219, 0.868398308964556, 7.49738042991982, 4.07099454379948, 
7.24691193454969, 3.71738767187575, 5.42260108152064, 0.57356312425008, 
0.305353623110512, 1.69615648444821, 0.264927168692647, 0.431646332470873, 
0.926584522654845, 0.353822083240478, 0.907790568459807, 0.463017509833345, 
0.66180304314081, 9.08358816337313, 9.48624735065088, 5.27607609138432, 
8.39227840581828, 9.15324441664443, 0.804937764692451, 0.703900772005047, 
0.327301037454378, 2.3192944304264, 0.822376389849826, 0.736691901924554, 
0.732405980285683, 0.594887043293648, 0.844991150621333, 0.865763083466299, 
4.73783118352711, 9.11761385953671, 2.46637354604484, 8.62058132225964, 
8.49158535833496, 0.109105887891648, 0.788416375263538, 0.204897788010416, 
1.34523853382844, 1.04034720902708, 0.185449657956587, 0.876155235276036, 
0.318843268936829, 0.826366209584705, 0.863586743394383, 7.31425470998536, 
6.4989871940062, 8.21592257049613, 7.84734951265805, 5.79018837897879, 
0.580415490488174, 0.684058171030335, 1.02164870066231, 1.15447664674484, 
0.464389656341906, 0.796224929103418, 0.734346598078264, 0.856339918497313, 
0.819434831134987, 0.591404368588854, 6.60913486742217, 6.10049818919972, 
5.12567495427164, 5.80914331323161, 8.71064979239817, 0.493860615269147, 
0.585318519286361, 0.403731956734902, 0.533231892713865, 0.994128445180692, 
0.522643628333466, 0.325313411055224, 0.48594576177647, 0.456594697463537, 
0.902261710675099, 4.10808936208925, 6.73895243499404, 5.20784798590633, 
7.22554527931516, 6.79499209605229, 0.329373606428687, 0.393620978096304, 
0.452217855246701, 0.572490093758077, 0.452842053246849, 0.381239875165474, 
0.571062552632277, 0.584035212200433, 0.872945090144076, 0.624881951098868, 
7.02899497225506, 6.35483464545515, 5.27579878626006, 6.17611306107641, 
3.48705043221044, 0.564055219840898, 0.69856769345164, 0.415454261418235, 
0.414161693925461, 0.271720412018482, 0.675502022998783, 0.576097889726207, 
0.442105794773672, 0.532934154784836, 0.368688689127041, 4.92612615073866, 
4.87701676382899, 6.31314829321634, 5.11475071309432, 10.7863274123984, 
0.478058673374829, 0.591481910961738, 0.575569979167631, 0.4234470931441, 
1.37130725218495, 0.377307895190169, 0.692475913659366, 0.710505684717724, 
0.447702696803841, 0.806389393217168, 4.05016966195857, 8.93864955075016, 
3.9801876227727, 9.3772564690746, 5.2512394217308, 0.275664246907636, 
0.698718331632714, 0.179110813958808, 1.51871516504579, 0.275715561489644, 
0.241305374970582, 0.875017001616101, 0.0623775016829812, 0.919183335352403, 
0.498484080250377, 5.55492369731472, 5.15438475284941, 4.3470471187058, 
3.50218085467535, 10.7977120757034, 0.582518907022327, 0.526784273783535, 
0.161704966514967, 0.428499492242044, 1.00762793037014, 0.550993731387902, 
0.496376744275591, 0.442772361340726, 0.284283001601334, 0.910352245708
), dim = c(15L, 21L))

I can find the Spearman correlation between the variables using the following code:

SpearMatrix <- abs(cor(t(mydata), method = "spearman"))

My question is:

How can I compute mutual information for my data instead of the Spearman correlation? I expect the code to look something like the line below (it is not valid code, but I need the result saved as a matrix).

MiMatrix <- abs(mutual information(t(mydata)))

Solution

  • This should do it. Note that there are choices to make for the settings (the discretization method and the entropy estimator); the values shown here are the defaults.

    > library(infotheo)
    > mydata.mi <- mutinformation(discretize(mydata, disc="equalfreq"), method="emp")
    > mydata.mi
               V1        V2             V20       V21
    V1  0.6909233 0.1118188   ... 0.6909233 0.1118188
    V2  0.1118188 0.6909233   ... 0.1118188 0.2985916
    V3  0.1118188 0.2985916   ... 0.1118188 0.2985916
    V4  0.1118188 0.2985916   ... 0.1118188 0.2985916
    V5  0.1118188 0.2985916   ... 0.1118188 0.6909233
              ...       ...   ...       ...       ...
    V17 0.2985916 0.1118188   ... 0.2985916 0.2985916
    V18 0.1118188 0.1118188   ... 0.1118188 0.1118188
    V19 0.1118188 0.2985916   ... 0.1118188 0.6909233
    V20 0.6909233 0.1118188   ... 0.6909233 0.1118188
    V21 0.1118188 0.2985916   ... 0.1118188 0.6909233
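
One thing to watch: the call above treats the 21 columns of mydata as the variables, which is why the result is 21 x 21, whereas the Spearman code in the question transposes first and so compares the 15 rows. If the row-wise matrix is what is wanted, a minimal sketch might look like the following. It assumes the infotheo package is installed, and it uses randomly generated placeholder data of the same shape so that the snippet runs on its own; swap in the real mydata.

```r
library(infotheo)

# Placeholder data with the same shape as the question's matrix (15 x 21).
# This is NOT the original data; replace it with the real `mydata`.
set.seed(1)
mydata <- matrix(rexp(15 * 21), nrow = 15, ncol = 21)

# Transpose so that each original row becomes a column (a variable),
# mirroring cor(t(mydata), method = "spearman") from the question.
disc <- discretize(t(mydata), disc = "equalfreq")

# Pairwise empirical mutual information, in nats: one row/column per variable.
MiMatrix <- mutinformation(disc, method = "emp")

# Optional: the same matrix expressed in bits instead of nats.
MiMatrixBits <- natstobits(MiMatrix)

dim(MiMatrix)
```

Mutual information is non-negative by definition, so the abs() used for the Spearman matrix should not be needed here. With so few observations per variable the default number of bins is very small, so it may be worth experimenting with the nbins argument of discretize().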