Tries to detect the character encoding of each element of a character vector.
detect_str_enc(x)
| Argument | Description |
|---|---|
| x | Character vector. |
A character vector of the same length as x, containing the guessed iconv-compatible encoding names.
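Because the returned names are iconv-compatible, they can be passed to base R's `iconv()` to normalise text to UTF-8. The sketch below is one possible way to do that, not part of the package: `to_utf8()` is a hypothetical helper, and leaving elements unchanged when the guess is `NA` is an assumption about how you might want to handle undetected input.

```r
# Hypothetical helper (not part of uchardet): convert each element of `x`
# to UTF-8 using a per-element encoding name, such as the vector returned
# by detect_str_enc().
to_utf8 <- function(x, enc) {
  stopifnot(is.character(x), is.character(enc), length(x) == length(enc))
  vapply(seq_along(x), function(i) {
    # Assumption: keep the element as-is when no encoding was guessed.
    if (is.na(enc[i])) return(x[i])
    # iconv() accepts a single `from` value, hence the element-wise loop.
    iconv(x[i], from = enc[i], to = "UTF-8")
  }, character(1))
}

x <- c("I can eat glass and it doesn't hurt me.", "\u4e0b\u5348\u597d")

# Use the detector when the package is installed; otherwise fall back to
# the encodings shown in the examples on this page.
enc <- if (requireNamespace("uchardet", quietly = TRUE)) {
  uchardet::detect_str_enc(x)
} else {
  c("ASCII", "UTF-8")
}

converted <- to_utf8(x, enc)
```

The loop is needed because `detect_str_enc()` returns one guess per element, while `iconv()` converts a whole vector from a single source encoding.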
# detect character vector with ASCII strings
ascii <- "I can eat glass and it doesn't hurt me."
detect_str_enc(ascii)
#> [1] "ASCII"
#> [1] "下午好"
detect_str_enc(utf8)
#> [1] "UTF-8"
# function to read ASCII or UTF-8 files
read_file <- function(x) readChar(x, file.size(x))
# path to examples
ex_path <- system.file("examples", package = "uchardet")
# russian text
ru_utf8 <- read_file(file.path(ex_path, "ru.txt"))
print(ru_utf8)
#> [1] "Я могу есть стекло, оно мне не вредит.\n"
#> [1] "IBM866"
#> [1] "KOI8-R"
#> [1] "WINDOWS-1251"
#> [1] "我能吞下玻璃而不傷身體。\n"
#> [1] "BIG5"
#> [1] "GB18030"
#> [1] "나는 유리를 먹을 수 있어요. 그래도 아프지 않아요\n"
#> [1] "UHC"
#> [1] "ISO-2022-KR"