R smbinning package: why 'Too many categories' for some variables?
I have a dataset in R containing many variables of different types, and I am attempting to use the smbinning package to calculate information value (IV).
I am using the following code:
smbinning.sumiv(sample,y="flag")
This code produces the IV for most of the variables, but for some of them the Process column states 'Too many categories', as shown in the output below:
   Char                     IV  Process
12 relationship             NA  Too many categories
15 nationality              NA  Too many categories
22 business_activity        NA  Too many categories
23 business_activity_group  NA  Too many categories
25 local_authority          NA  Too many categories
26 neighbourhood            NA  Too many categories
If I take a look at the values of business_activity_group, for instance, I can see that there are not many possible values it can take:
affordable rent combined          2546
commercial community combined        4
freeholders combined                23
garages                              6
general needs combined           57140
keyworker                          340
leasehold combined                  88
market rented combined            1463
older persons combined            4774
rent homebuy                        76
shared ownership combined          167
staff acommodation combined          5
supported combined                2892
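For anyone trying to reproduce this, here is a minimal base R sketch of how the number of distinct levels in each factor column can be counted, in case the message depends on that count (an assumption; I have not confirmed what triggers it). The data frame `toy` and its values are made up for illustration and are not my real `sample` data.

```r
# Hypothetical toy data standing in for the real `sample` data frame.
toy <- data.frame(
  flag = c(1, 0, 1, 0, 0, 1),
  business_activity_group = factor(c(
    "general needs combined", "garages", "supported combined",
    "garages", "leasehold combined", "general needs combined"
  )),
  neighbourhood = factor(c("A", "B", "C", "D", "E", "F"))
)

# Identify which columns are factors.
is_fac <- vapply(toy, is.factor, logical(1))

# Count the distinct levels in each factor column.
level_counts <- vapply(toy[is_fac], nlevels, integer(1))
print(level_counts)
```

Running the same two `vapply` lines on the real data frame would show how many levels each of the flagged variables has.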
I thought this might be due to low volumes in some of the categories, so I tried banding some of the groups together. This did not change the result.
Can anyone please explain why 'Too many categories' occurs, and what I can do with these variables in order to produce the IV using the smbinning package?