We run statistics and similar computations on large sets of data. Right now it is all done on one machine. We're studying the feasibility of moving to a map-reduce paradigm, where we decompose the data into subsets, run some operations on each subset, then combine the results.
Is there any sort of mathematical test that can be applied to a set of operations to determine whether the data they operate on can be decomposed?
Or is there a list somewhere of which operations can and cannot be decomposed?
For instance, I didn't think there was a way to decompose standard deviation, but there is (see the sketch below).
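Here is a minimal sketch of how that works for standard deviation, assuming each subset is summarized by a (count, mean, sum of squared deviations) triple and summaries are merged with the standard pairwise update (Chan et al. style); the function names are just for illustration:

```python
import math

def summarize(xs):
    """Per-subset summary: (count, mean, sum of squared deviations from the mean)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs)
    return n, mean, m2

def merge(a, b):
    """Combine two summaries without revisiting the raw data (pairwise update)."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n, mean, m2 = merge(summarize(data[:4]), summarize(data[4:]))
print(math.sqrt(m2 / n))  # population std dev: 2.0, same as over the whole set at once
```

Because `merge` is associative, it works as a map-reduce combiner: summarize each subset in the map phase, then fold the summaries together in the reduce phase.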
Take a look at this paper: http://www.janinebennett.org/index_files/ParallelStatisticsAlgorithms.pdf. It gives algorithms for many common statistical problems, and there is open source code available.
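The same pairwise-update idea extends to other statistics covered by that line of work, for example covariance. A sketch under the same assumptions as above (my own function names and summary layout, not the paper's API):

```python
def summarize_cov(pairs):
    """Per-subset summary for covariance: (n, mean_x, mean_y, co-moment)."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    c = sum((x - mx) * (y - my) for x, y in pairs)
    return n, mx, my, c

def merge_cov(a, b):
    """Combine two covariance summaries with the pairwise update."""
    n_a, mx_a, my_a, c_a = a
    n_b, mx_b, my_b, c_b = b
    n = n_a + n_b
    dx = mx_b - mx_a
    dy = my_b - my_a
    c = c_a + c_b + dx * dy * n_a * n_b / n
    return n, mx_a + dx * n_b / n, my_a + dy * n_b / n, c

pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 5.0), (4.0, 9.0)]
n, mx, my, c = merge_cov(summarize_cov(pairs[:2]), summarize_cov(pairs[2:]))
print(c / n)  # population covariance: 2.75, same as computing over all pairs at once
```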