Detecting and removing multibyte strings in R

I have a multibyte string, "UCA1\xa6\xc1", in a large vector of RNA names; cat() prints it as UCA1��. I am trying to screen the vector for such strings and either rename them or, if all else fails, remove them from the vector, because I cannot capitalize such strings with functions like toupper().

I'm not sure what '\xa6' and '\xc1' encode, so I don't know how to screen for them with a regex. Could anybody help me with this?

Solution

This is probably an encoding issue, so try changing the encoding when you load the file. Something like this:

df <- read.csv(file_path,                    # file_path: path to your CSV file
               fileEncoding = "ISO-8859-1",  # re-encode on read; try other encodings if this one does not match
               header = TRUE,
               stringsAsFactors = FALSE)
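
If the vector is already in memory and re-reading the file is not an option, the offending elements can probably be found without a regex. The following is a minimal sketch, assuming the rest of the data is ASCII or UTF-8; rna_names is a hypothetical stand-in for the real vector. It uses base R's validUTF8() to flag the bad elements and iconv() with sub = "" to strip the bytes that cannot be converted (the exact behaviour of sub can vary slightly between platforms).

```r
# Hypothetical example vector; the second element carries two raw non-UTF-8 bytes.
rna_names <- c("malat1", "UCA1\xa6\xc1", "neat1")

# validUTF8() returns FALSE for elements that are not valid UTF-8.
bad <- !validUTF8(rna_names)

# Option 1: drop the offending elements from the vector.
kept <- rna_names[!bad]

# Option 2: keep every element but strip the bytes that cannot be converted.
cleaned <- iconv(rna_names, from = "UTF-8", to = "UTF-8", sub = "")

# After cleaning, toupper() should work on the whole vector.
upper_names <- toupper(cleaned)
```

Option 1 matches the "remove them" fallback from the question; Option 2 keeps the readable part of each name, which may or may not be the right identifier, so it is worth inspecting rna_names[bad] before choosing.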
