Search code examples
javajasper-reports

How to get average value ignoring some values of a field


I know how to get the average between numbers, but when I try to do the same approach (by setting up the variable's properties) to get the average from mixed values of numbers and strings it always returns an exception (and I know why).

How to get the average value for a column_1 with mixed values of numbers and string?

column_1
  3
  4
  N/A
  3
  N/A

Meaning to say, in column 1, the average must be 3.33. The (3+4+3)/3 formula should be applied, the two N/A values must be ignored/excluded from the computation. How to do that?


Solution

  • This task can be decomposed into several simple subtasks:

    1. Getting numeric value from mixed type. We should check the field for numeric value.
    2. Calculating the average value for all numeric values (skipping non-numeric values).

    For solving first subtask we can use Apache commons-lang library. The StringUtils.isNumeric(java.lang.CharSequence) method help us to check if the value is numeric.

    For solving second subtask we can use two variables - the first for storing numeric value and the second - for calculating average value.

    Working example

    Datasource

    I used the csv datasource in this sample:

    numValue
    3
    4
    N/A
    3
    N/A
    

    The name of data adapter for this datasource in the example below is str_num.csv. The first line from the file is skipped - it is contains the column's name.

    The report's template

    I used Jaspersoft Studio (JSS) for designing a report template. The Detail band is using for showing data and the Summary band for showing average value.

    The field with values (numeric and non-numeric) will be of the java.lang.String type.

    The first variable is for storing only numeric values. We are initializing the variable with null if the value

    The expression will be:

    StringUtils.isNumeric($F{numValue}) ? Integer.valueOf($F{numValue}) : null
    

    The Java library is using in this expression, this is why we need to set report's language as java and add import for class StringUtils.

    The second variable will of Average calculation type:

    <variable name="averageValue" class="java.lang.Double" calculation="Average">
        <variableExpression><![CDATA[$V{number}]]></variableExpression>
        <initialValueExpression><![CDATA[0]]></initialValueExpression>
    </variable>
    

    The full template:

    <?xml version="1.0" encoding="UTF-8"?>
    <jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports http://jasperreports.sourceforge.net/xsd/jasperreport.xsd" name="Average value" pageWidth="595" pageHeight="842" columnWidth="555" leftMargin="20" rightMargin="20" topMargin="20" bottomMargin="20" uuid="bee405b0-c35a-442a-841d-f51423c1c797">
        <property name="com.jaspersoft.studio.data.defaultdataadapter" value="str_num.csv"/>
        <import value="org.apache.commons.lang3.StringUtils"/>
        <queryString>
            <![CDATA[]]>
        </queryString>
        <field name="numValue" class="java.lang.String"/>
        <variable name="number" class="java.lang.Integer">
            <variableExpression><![CDATA[StringUtils.isNumeric($F{numValue}) ? Integer.valueOf($F{numValue}) : null]]></variableExpression>
        </variable>
        <variable name="averageValue" class="java.lang.Double" calculation="Average">
            <variableExpression><![CDATA[$V{number}]]></variableExpression>
            <initialValueExpression><![CDATA[0]]></initialValueExpression>
        </variable>
        <detail>
            <band height="30" splitType="Stretch">
                <textField>
                    <reportElement x="0" y="0" width="230" height="30" uuid="9d00d7e3-b9bc-404d-8842-728326a5d2df"/>
                    <textFieldExpression><![CDATA["Original value: " + $F{numValue}]]></textFieldExpression>
                </textField>
                <textField>
                    <reportElement x="230" y="0" width="250" height="30" uuid="4509cae9-3b89-4860-838a-36aa4466fb37"/>
                    <textFieldExpression><![CDATA["Only numeric value: " + $V{number}]]></textFieldExpression>
                </textField>
            </band>
        </detail>
        <summary>
            <band height="53">
                <staticText>
                    <reportElement x="0" y="0" width="230" height="30" uuid="937af139-259c-49da-a0d9-191bcf801cca"/>
                    <text><![CDATA[Average value (non numeric values are skipped):]]></text>
                </staticText>
                <textField pattern="0.00">
                    <reportElement x="230" y="0" width="140" height="30" uuid="9abd786b-32ee-4ab1-91dc-851e8bfef0db"/>
                    <textFieldExpression><![CDATA[$V{averageValue}]]></textFieldExpression>
                </textField>
            </band>
        </summary>
    </jasperReport>
    

    Output result

    The generated resut in JSS looks like this:

    The result in JSS. Average of 3 records

    If we will change the expression of first variable (for storing numeric values) to this one:

    StringUtils.isNumeric($F{numValue}) ? Integer.valueOf($F{numValue}) : 0
    

    the result will be:

    The result in JSS. Average of 5 records

    - It is the average value for 5 records, not for 3 as in the first case above.