Tags: statistics, neural-network, algorithmic-trading, fann, forex

Neural network to produce 1 output (FOREX stop loss decision) from 7 inputs


I have a mix of floating-point and integer values that play a part in generating one floating-point value. Some correlate and some may not. Using FANN, I want to see if neural networks are the answer. Given 6 or 7 input numbers that determine a single output number, what network types and layouts should I use in FANN?

With the help of an expert in FOREX trading, I developed a system that finds potential entry points (using back-testing); these positions have a high percentage of wins when properly played. The problem is choosing the "stop loss" that secures the win. The back-tester has shown these are winning trades, but the "stop loss" settings are not easy to pick. I am currently picking them based on the best historical outcome, and I have tried deterministic solutions without success.

I would like a neural network to take the many inputs and produce the correct "stop loss" and the expected result. I cannot account for market events, so some losses are expected. Inputs are moving averages, the trading range, deltas in closing value, the perfect "stop loss" (known from historical results), and some others.

I have a list of inputs together with the perfect stop loss and the result for each. I want the black-box magic to output the "stop loss" and "expected result" as close to perfect as possible. The inputs are double-precision values, as are the outputs; some are integers but can be represented as double precision. One boolean flags trading short or long; if that is a problem, I can train shorts separately from longs.

Some values may play little or no part in the determination, but I want to find out which ones matter. Sometimes the "result" is negative, meaning money was lost.
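
For concreteness, here is a minimal sketch of what I have in mind, using FANN's fann2 Python bindings ( the layer sizes, file names and training constants below are illustrative guesses, not settled choices ):

    from fann2 import libfann

    ann = libfann.neural_net()
    ann.create_standard_array( [ 7, 12, 2 ] )                   # 7 inputs -> 12 hidden -> 2 outputs
    ann.set_activation_function_hidden( libfann.SIGMOID_SYMMETRIC )
    ann.set_activation_function_output( libfann.LINEAR )        # unbounded regression: stop loss + expected result
    ann.set_training_algorithm( libfann.TRAIN_RPROP )

    # "train.data" uses FANN's file format: a header line "<num_pairs> 7 2",
    # then alternating lines of 7 inputs and 2 target outputs
    ann.train_on_file( "train.data", 5000, 100, 0.0005 )        # max_epochs, report-every, desired MSE
    ann.save( "stoploss.net" )
    # stop_loss, expected_result = ann.run( [ ma1, ma2, trading_range, d_close, ..., is_long ] )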


Solution

  • As a marvellous intro into this domain, you could try Stuart Reid's "10 Misconceptions about Neural Networks".

    While this is a highly general question, these would be my points:

    • fast learning curve

      ( time having been spent with "products" does not get justified easily. A product may quote speed, but beware: it speaks about saving milliseconds / seconds in the production phase, when doing predictions on a well-pre-trained ANN. Your time will mostly be allocated to other activities than this -- first learning and in-depth understanding of the product, its capabilities and its weaknesses, then ANN-model prototyping and, most of all, ANN-Feature-Engineering, then hyperparametrisation grid-searches over the design-candidate ANNs, tuning their best generalisation and CrossValidation properties )

    • support and tools for rapid prototyping

      ( once going beyond a scholarly-elaborated ANN-mimicking-XOR, the very prototyping phase is the innovative playground and, as such, it is very costly, both in time and in CPU resources )

    • support for smart automated feature scaling

      ( this is inevitable for evolutionary ( genetic et al. ) search-processing towards robust networks, reduced to a just-enough scale ( so as to achieve computability within an acceptable time-frame ) under the given precision targets )

    • support for automated hyperparametric controls for high-bias / overfitting tuning ( see the scikit-learn sketch right after this list )

    • support for both fully-meshed orthogonal, gradient-driven and random processing of "covering" the (hyper-) parametrisation-space

    • support for local vectorised processing and means for fast distributed processing ( not a marketing-motivated babble, but a fair & reasonable architecture ). GPGPU I/O latencies do not help much on trained networks: prediction is, in the end, a low-computing-intensity task, nothing more than a set of sumproduct decisions, so the high GPU IO-bound latency-masking does not cover the immense delays, and a GPU "help" can even become disastrous [quantitative citations available], compared to a plain, well-configured, CPU-based ANN-computing
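
    For illustration, the scaling + hyperparameter points above take only a few lines to prototype in scikit-learn ( a sketch only; the MLPRegressor stand-in, the grid values and the scoring choice are my assumptions, not a FANN recipe ):

        from sklearn.pipeline        import Pipeline
        from sklearn.preprocessing   import StandardScaler
        from sklearn.neural_network  import MLPRegressor
        from sklearn.model_selection import GridSearchCV

        pipe = Pipeline( [ ( "scale", StandardScaler() ),           # automated feature scaling
                           ( "ann",   MLPRegressor( max_iter = 2000 ) ) ] )

        grid = GridSearchCV( pipe,                                  # hyperparametrisation grid-search
                             param_grid = { "ann__hidden_layer_sizes": [ (8,), (16,), (16, 8) ],
                                            "ann__alpha":              [ 1e-4, 1e-3, 1e-2 ] },  # L2 ~ overfitting control
                             cv      = 5,
                             scoring = "neg_mean_squared_error" )
        # grid.fit( X, y )   # X : feature matrix, y : "perfect" stop-loss targets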

    AI/ML NN-Feature Engineering! ...forget about 6:?[:?[:?]]:1 architectures

    This is The Key.

    Whatever AI/ML-Predictor you choose,
    be it an ANN or SVM or even an ensemble-based "weak"-learner, the major issue is not the engine, but the driver -- the predictive powers of the set of Features.

    Do not forget what complexity the FOREX multi-instrument Marketplace exhibits in real-time. It is definitely many orders of magnitude more complex than 6:1. And you aspire to create a Predictor capable of predicting what happens.

    How to make it within a reasonable computability cost?

    smart tools exist:
    
        from sklearn.feature_selection import RFECV    # Feature ranking with recursive feature elimination
                                                       # and cross-validated selection of the best number of features

        selector = RFECV( estimator,              #  loc_PREDICTOR ( any estimator exposing .fit() and feature importances / coefs )
                          step    = 1,            #  remove 1 FEATURE at a time
                          cv      = None,         #  None -> default K-FOLD CrossValidation ( 3-FOLD in older scikit-learn )
                          scoring = None,         #  <-opt. aScoreFUN with call-signature aScoreFUN( estimator, X, y )
                          verbose = 0
                          )                       #  n.b. estimator_params was removed from newer scikit-learn;
                                                  #       pass such params on the estimator itself or via GridSearchCV
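    A hypothetical end-to-end wiring on synthetic data ( the estimator choice, shapes and the toy target are my placeholders, not the original FEATURE set ):

        import numpy as np
        from sklearn.ensemble          import ExtraTreesRegressor
        from sklearn.feature_selection import RFECV

        rng = np.random.default_rng( 0 )
        X = rng.normal( size = ( 500, 292 ) )                 # 292 candidate FEATUREs, as in the map below
        y = 0.34 * X[:, 216] + rng.normal( scale = 0.1,       # toy target dominated by FEATURE [216]
                                           size  = 500 )

        selector = RFECV( ExtraTreesRegressor( n_estimators = 50,
                                               random_state = 0 ),
                          step = 10 )                         # drop 10 FEATUREs per elimination round
        selector.fit( X, y )
        print( selector.n_features_ )                         # cross-validated best number of FEATUREs
        print( np.flatnonzero( selector.support_ ) )          # indices of the surviving FEATUREs
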

     |>>> aFeatureImportancesMAP_v4( loc_PREDICTOR, X_mmap )              
      0.  0.3380673 _ _ _____________________f_O.............RICE_: [216]
      1.  0.0147430 _ _ __________________________________f_A...._: [251]
      2.  0.0114801 _ _ ___________________f_............ul_5:3_8_: [252]
      3.  0.0114482 _ _ ______________________________H......GE_1_: [140]
      4.  0.0099676 _ _ ______________________________f_V....m7_4_: [197]
      5.  0.0083556 _ _ ______________________________f.......7_3_: [198]
      6.  0.0081931 _ _ ________________________f_C...........n_0_: [215]
      7.  0.0077556 _ _ ______________________f_Tr..........sm5_4_: [113]
      8.  0.0073360 _ _ _____________________________f_R.......an_: [217]
      9.  0.0072734 _ _ ______________________f_T............m5_3_: [114]
     10.  0.0069267 _ _ ______________________d_M.............0_4_: [ 12]
     11.  0.0068423 _ _ ______________________________f_......._1_: [200]
     12.  0.0058133 _ _ ______________________________f_......._4_: [201]
     13.  0.0054673 _ _ ______________________________f_......._2_: [199]
     14.  0.0054481 _ _ ______________________f_................2_: [115]
     15.  0.0053673 _ _ _____________________f_.................4_: [129]
     16.  0.0050523 _ _ ______________________f_................1_: [116]
     17.  0.0048710 _ _ ________________________f_..............1_: [108]
     18.  0.0048606 _ _ _____________________f_.................3_: [130]
     19.  0.0048357 _ _ ________________________________d_......1_: [211]
     20.  0.0048018 _ _ _________________________pc.............1_: [ 86]
     21.  0.0047817 _ _ ________________________________d.......3_: [212]
     22.  0.0045846 _ _ ___________________f_K..................8_: [260]
     23.  0.0045753 _ _ _____________________f_.................2_: [131]
    
     1st.[292]-elements account for 100% Importance Score ________________
     1st. [50]-elements account for  60%
     1st. [40]-elements account for  56%
     1st. [30]-elements account for  53% . . . . . . . . . . . . . . . . . 
     1st. [20]-elements account for  48% 
     1st. [10]-elements account for  43%
    
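    The cumulative percentages above follow directly from a fitted predictor's feature_importances_ ( a small helper sketch; the function name is mine, not a part of the original toolchain ):

        import numpy as np

        def cumulative_importance_share( importances, top_k ):
            """Fraction of the total Importance Score carried by the top_k FEATUREs."""
            ranked = np.sort( np.asarray( importances ) )[::-1]      # descending order
            return ranked[:top_k].sum() / ranked.sum()

        # e.g. cumulative_importance_share( loc_PREDICTOR.feature_importances_, 10 )  ->  ~0.43, as above
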

    Precision?

    Assembler guys and C gurus will object at first sight; however, let me state that numerical (im)precision does not make an issue in FX/ANN solutions.

    The dimensionality curse does... O(N^2) & O(N^3)-class problems are not rare.

    Yet this remains doable, with a smart / efficient ( read: fast ... ) representation, even for nanosecond-resolution, time-stamped HFT data-stream I/O hoses.

    Sometimes there is even a need to reduce the numerical "precision" of the inputs ( by sub-sampling and blurring ) so as to avoid the adverse effects of an ( unreasonably computationally expensive ) high dimensionality and also to avoid a tendency towards overfitting, benefiting from the better generalisation abilities of a well-adjusted AI/ML-Predictor.
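
    A trivial numpy illustration of such deliberate coarsening ( the stream, factors and decimals are made up ):

        import numpy as np

        raw  = np.cumsum( np.random.randn( 1_000_000 ) )      # stand-in for a raw float64 tick-stream
        sub  = raw[::10]                                      # sub-sampling : keep every 10th tick
        blur = np.round( sub, 3 ).astype( np.float32 )        # blurring     : 3 decimals, single precision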

            (im)PRECISION JUST-RIGHT FOR UNCERTAINTY LEVELs MET ON .predict()-s

         max integer mantissa |   dtype    | layout
        ----------------------+------------+----------------------------------------------------------------
                         1023 | np.float16 | Half   precision float: 10 bits mantissa + sign bit |  5 bits exp
                      8388607 | np.float32 | Single precision float: 23 bits mantissa + sign bit |  8 bits exp
             4503599627370495 | np.float64 | Double precision float: 52 bits mantissa + sign bit | 11 bits exp

        e.g. an FX-major quote ~ 1.02? already sits at the ~3-digit edge of np.float16
        ( resolution ~ 0.001 at that magnitude ), while a DAX-like quote ~ 12345.6?
        spends all of np.float32's ~6-7 significant decimal digits.
    
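    The representable step at a given magnitude is a one-liner to check in numpy ( the DAX-like figure is just an example value ):

        import numpy as np

        for dtype in ( np.float16, np.float32, np.float64 ):
            x = dtype( 12345.6 )                              # a DAX-like quote
            print( dtype.__name__,
                   np.finfo( dtype ).precision,               # ~3 / 6 / 15 significant decimal digits
                   np.spacing( x ) )                          # representable step at this magnitude:
                                                              # -> 8.0 / ~0.001 / ~1.8e-12
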

    Anyway, a charming FX project; let me know if onboarding:

    dMM()


    To get another, unbiased view of a top-down similar situation to yours, reported by another person, one might want to read and think a bit about this experience, and just count the weeks and months estimated for mastering the list presented there, top-down and in full, before making one's final decision.