statistics wolfram-mathematica design-patterns numerical

Avoid and count non-numerical values when computing basic statistics in Mathematica


Please consider:

dalist = {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
          {2.88`, 2.04`, 4.64`, 0.56`, 4.92`, 2.06`, 3.46`, 2.68`, 2.72`, 0.820},
          {"Laura1", "Laura1", "Laura1", "Laura1", "Laura1",
           "Laura1", "Laura1", "Laura1", "Laura1", "Laura1"},
          {"RIGHT", 0, 1, 15.1`, 0.36`, 505, 20.059375`, 15.178125`, ".", "."}}

The actual dataset has about 6,000 rows and 147 columns, but the sample above reflects its content. I would like to compute some basic statistics, such as the mean. My attempt:

Table[Mean@dalist[[colNO]], {colNO, 1, 4}]

How could I create a function that would:

  • Avoid non-numerical values and

  • Count the number of non-numerical values found in each list.

I have not succeeded in finding the right pattern mechanism yet.


Solution

  • First observation: you can use Mean /@ dalist to average across rows. You don't need Table here.

    To keep only the numeric values, try Cases (see its documentation), e.g. Mean /@ (Cases[#,_?NumericQ] & /@ dalist)

    If you also want to eliminate the rows of your data that have no numeric elements at all (e.g. your third row, which contains only strings), try the following. It first picks only the rows that have at least one numeric element, and then takes only the numeric elements from those rows.

    Mean /@ (Cases[#,_?NumericQ] & /@ (Cases[dalist, {___,_?NumericQ,___}]))
    

    To count the non-numeric elements, you would use a similar approach:

    Length /@ (Cases[#,Except[_?NumericQ]] & /@ dalist)
    

    Caveat: I typed this answer without access to a Mathematica installation to check the syntax, so some typos could remain.
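
The two Cases-based steps above could be combined into a single function, as the question asks. The following is only a minimal sketch: the name numericSummary and the choice of Missing["NoNumericData"] as a placeholder are assumptions made for illustration, not part of the original answer. It returns a placeholder instead of calling Mean on an empty list, since a row without any numeric elements has no mean to compute.

```mathematica
(* Hypothetical helper (name chosen for illustration only).
   For one list it returns {mean of the numeric elements,
   number of non-numeric elements}. *)
numericSummary[list_List] :=
 Module[{nums = Cases[list, _?NumericQ]},
  {
   (* Avoid calling Mean on an empty list; the placeholder is an assumption *)
   If[nums === {}, Missing["NoNumericData"], Mean[nums]],
   (* Everything that was not kept by Cases is non-numeric *)
   Length[list] - Length[nums]
   }]

(* Apply it to every row of the sample data from the question *)
numericSummary /@ dalist
```

For the sample dalist, the second element of each result (the non-numeric count) should be 0, 0, 10 and 3 respectively. Returning a pair keeps the sketch independent of newer language features; the same idea could equally return rules or any other structure you prefer.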