Machine learning algorithm with fitness score

I'm not sure if this is for StackOverflow or Programmers but since it's more leaning towards implementation, I'm asking it here.

I'm looking for an algorithm that could take n inputs (all floats) and produce m (all floats; m < n) outputs. This system could then be trained using a sort of fitness score to learn the correlation between the inputs and the outputs.

What would be the best algorithm to use for such a purpose?

A little bit of context: I want to use machine learning instead of a self invented algorithm because I don't know the (full) correlation between the data, I do know if the outcome of the machine learning algorithm will be any good or not and train it from there.

I have a couple of variables to pass in like:

Information that only I know (confidence 0-1)
Information about me that is known by all (resources and previous accomplishments 0-1)
The risk profile of the person I'm looking into (respectively, based on other players 0-1)
The behavior profile of the person I'm looking into (respectively, based on other players 0-1)
The resources that the player I'm looking into has (respectively 0-1)
The amount of players in total (based on max players allowed 0-1)
Prediction of the outcome (bias 0-1)

The output should be:

Action to take (from "do nothing" to "act quickly" 0-1)
Amount of action to undertake (from "not a lot" to "the most you can do" 0-1)

I have very large data sets that can be processed, so ideally the algorithm suggested can also be persisted.

I have seen algorithms like Artificial Neural Networks but those don't allow for a fitness score as they need input and output coupled together. I can't provide that, I can only calculate the chance that those numbers would be correct (the fitness score – by design never >= 1)

Solution

From the description it looks like a classical problem of reinforcement learning where you do have some agent performing actions (here defined as action+strength, but this is still an action) which changes some internal state of the agent and gets (at some point at least) a reward.

There are many methods to learn a good policy (rule selecting a particular action) from your environment, including (but not limited to):

Q-learning
MDP (Markov decision process)
Monte carlo methods