read_delim() fails when sample of parsing problems contains non-ASCII characters #1136
Labels
bug
an unexpected problem or unintended behavior
Comments
I have a similar issue with Shift JIS encoding.
readr::read_delim("https://www.dropbox.com/s/aji9zb5qqu76hn8/13_2018.csv?dl=1", ",",
  quote = "\"", skip = 0, col_names = TRUE, na = c('', 'NA'),
  locale = readr::locale(encoding = "SHIFT_JIS",
    decimal_mark = ".", tz = "America/Los_Angeles", grouping_mark = ","),
  trim_ws = TRUE, progress = TRUE)
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> 都道府県名 = col_character(),
#> 市区町村名 = col_character(),
#> `大字・丁目名` = col_character(),
#> `小字・通称名` = col_logical(),
#> `街区符号・地番` = col_double(),
#> 座標系番号 = col_double(),
#> X座標 = col_double(),
#> Y座標 = col_double(),
#> 緯度 = col_double(),
#> 経度 = col_double(),
#> 住居表示フラグ = col_double(),
#> 代表フラグ = col_double(),
#> 更新前履歴フラグ = col_double(),
#> 更新後履歴フラグ = col_double()
#> )
#> Error in nchar(x): invalid multibyte string, element 3
Created on 2020-11-16 by the reprex package (v0.3.0)
This works if I set col_names = FALSE:
df <- readr::read_delim("https://www.dropbox.com/s/aji9zb5qqu76hn8/13_2018.csv?dl=1", ",",
  quote = "\"", skip = 0, col_names = FALSE, na = c('', 'NA'),
  locale = readr::locale(encoding = "SHIFT_JIS",
    decimal_mark = ".", tz = "America/Los_Angeles", grouping_mark = ","),
  trim_ws = TRUE, progress = TRUE)
#>
#> ── Column specification ────────────────────────────────────────────────────────
#> cols(
#> X1 = col_character(),
#> X2 = col_character(),
#> X3 = col_character(),
#> X4 = col_character(),
#> X5 = col_character(),
#> X6 = col_character(),
#> X7 = col_character(),
#> X8 = col_character(),
#> X9 = col_character(),
#> X10 = col_character(),
#> X11 = col_character(),
#> X12 = col_character(),
#> X13 = col_character(),
#> X14 = col_character()
#> )
Created on 2020-11-16 by the reprex package (v0.3.0)
When a non-ASCII file is read using read_delim() or one of its friends, there are parsing failures, and the text used by warn_problems() contains non-ASCII characters, the following error is returned:

This is quite hard to debug because the error makes it seem as if the encoding was set incorrectly or the file itself has problems. (In my case it occurred with only 2 of about 6000 CSV files read using map_dfr(), even though all of them contained non-ASCII text.)

In fact the error occurs in warn_problems():

readr/R/problems.R
Line 83 in 05890c3

where nchar() throws an error if one of the problems "sampled" for display contains a non-UTF-8 string. The data is in fact read correctly inside read_delimited(); the call only fails once warn_problems() is called.

Once I supplied the column specification and the non-ASCII text no longer appeared in the list of reported problems, the error no longer occurred (see reprex).
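To illustrate the nchar() behaviour described above in isolation, here is a minimal base-R sketch. It does not use readr and is not the code from problems.R; the string `bad` is a made-up example (Latin-1 bytes wrongly marked as UTF-8), chosen so that the result should not depend on the session locale.

```r
# Hypothetical input: the bytes of "café" in Latin-1 (0xE9 = é),
# deliberately mis-marked as UTF-8 to mimic a wrongly decoded field.
bad <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
Encoding(bad) <- "UTF-8"

validUTF8(bad)                 # FALSE: a lone 0xE9 byte is not valid UTF-8

# Counting characters fails; capture the error message instead of stopping.
res <- tryCatch(nchar(bad), error = function(e) conditionMessage(e))
res                            # an "invalid multibyte string" message

# Two defensive alternatives that do not throw:
nchar(bad, type = "bytes")     # counts bytes rather than characters
nchar(bad, allowNA = TRUE)     # returns NA instead of raising an error

# Re-encoding from the real source encoding gives a countable string.
good <- iconv(bad, from = "latin1", to = "UTF-8")
nchar(good)
```

Whether any of these alternatives would be an appropriate fix inside warn_problems() is a separate question; the sketch only shows that the failure can be reproduced without any file parsing.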
Possibly related to #1111, where the same error is thrown by parse_number(); this may have a similar root cause, though I haven't investigated specifically.

Reprex:
Inspecting this in the debugger, the probs_f variable used inside warn_problems() does indeed contain non-ASCII text.

This works as expected:
The file referenced in the reprex above was read correctly using read.csv2, also with the "WINDOWS-1250" encoding. Likewise, it can be read in Python using the pandas CSV reader with the same encoding.

System:
Mac OS
R 4.0.2
readr 1.4.0 from CRAN