r function unique records

How to assign a value to a variable according to conditions


I hope someone can help me, or at least give me some good advice. I have a large data frame that stores scientific papers (classified by Author/Year/Journal). Most papers contribute more than one record, so I am trying to write a function (so far without success) that returns a unique value (named n) identifying the paper to which each record belongs.


Solution

  • For calculating unique values, you could use the digest function from the digest package. For example,

    library(digest)
    digest(c("Granger", "1987", "Econometrica"))
    

    returns an MD5 hash string that is effectively unique for a publication. digest is not vectorized, i.e. you have to use sapply or similar to calculate the id for each row of your data frame.
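
A minimal sketch of the row-wise approach, assuming the data frame has columns named Author, Year and Journal. The `papers` data and the `key`, `id` and `n` names are made up for illustration and are not from the question:

```r
library(digest)

# Hypothetical example data: several records can come from the same paper.
papers <- data.frame(
  Author  = c("Granger", "Granger", "Engle"),
  Year    = c(1987, 1987, 1982),
  Journal = c("Econometrica", "Econometrica", "Econometrica"),
  stringsAsFactors = FALSE
)

# Build one key string per row, then hash each key separately,
# because digest() hashes its whole argument as a single object.
key <- paste(papers$Author, papers$Year, papers$Journal, sep = "|")
papers$id <- vapply(key, digest, character(1), USE.NAMES = FALSE)

# Optional: map the hashes to small integers, numbered by first appearance.
papers$n <- match(papers$id, unique(papers$id))
```

Here the first two rows get the same id (and the same n), while the third row gets a different one. If only the integer n is needed, `match` on the pasted key would probably work without hashing at all.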