I have data in a csv file in the following format
Name Power Money
Jon Red 30
George blue 20
Tom Red 40
Bob purple 10
I consider values like "jon", "red" and "30 as inputs. Each input as a label. For instance inputs [jon,george,tom,bob] have label "name". Inputs [red,blue,purple] have label "power". This is basically how I have training data. I have bunch of values that are each mapped to a label.
Now I want to use svm to train a model based on my training data to accurately identify given a new input what is its correct label. so for instance if the input provided is "444" , the model should be smart enough to categorize it as a "Money" label.
I have installed py and also installed sklearn. I have completed the following tutorial as well. I am just not sure on how to prepare input data to train the model.
Also I am new to machine learning if i have said something that sounds wrong or odd please point it out as I will be happy to learn the correct.
With how your current question is formulated, you are not dealing with a typical machine learning problem. Currently, you have column-wise data:
Name Power Money
Jon Red 30
George blue 20
Tom Red 40
Bob purple 10
If a user now inputs "Jon", you know it is going to be type "Name", by a simple hash-map look up, e.g.,:
hashmap["Jon"] -> "Name"
The main reason people are saying it is not a machine-learning problem is your "categorisation" or "prediction" is being defined by your column names. Machine learning problems, instead (typically), will be predicting some response variable. For example, imagine instead you had asked this:
Name Power Money Bought_item
Jon Red 30 yes
George blue 20 no
Tom Red 40 no
Bob purple 10 yes
We could build a model to predict Bought_item using the features Name, Power, and Money using SVM.
Your problem would have to look more like:
Feature1 Feature2 Feature3 Category
1.0 foo bar Name
3.1 bar foo Name
23.4 abc def Money
22.22 afb dad Power
223.1 dad vxv Money
You then use Feature1, Feature2, and Feature3 to predict Category. At the moment your question does not give enough information for anyone to really understand what you need or what you have to reformulate it this way, or consider an unsupervised approach.
Edit:
So frame it this way:
Name Power Money Label
Jon Red 30 Foo
George blue 20 Bar
Tom Red 40 Foo
Bob purple 10 Bar
OneHotEncode Name and Power, so you now have a variable for each name that can be 0/1.
Standardise Money so that it has a range between, approximately, -1/1.
LabelEncode your labels so that they are 0,1,2,3,4,5,6 and so on.
Use a One vs. All classifier, http://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html.