Solved – Split data into N equal groups

distributions, r

I have a data frame that contains values across 4 columns, for example:

ID, price, click count, rating

What I would like to do is "split" this data frame into N different groups, where each group has an equal number of rows and the same distribution of the price, click count and rating attributes.

Any advice is greatly appreciated, as I don't have the slightest idea how to tackle this!

Best Answer

If I understand the question correctly, this will get you what you want. Assuming your data frame is called df and you have N defined, you can do this:

split(df, sample(1:N, nrow(df), replace = TRUE))

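This returns a list of data frames, each consisting of randomly selected rows from df. By default, sample() assigns each row to any of the N groups with equal probability, so the group sizes are equal in expectation but usually not exactly equal. Because the assignment is random, the distributions of price, click count and rating will be similar across groups on average, though not identical.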
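If the groups need to have exactly the same number of rows (or sizes that differ by at most one when nrow(df) is not a multiple of N), one possible variation is to build a balanced vector of group labels and shuffle it, rather than drawing labels with replacement. The sketch below is only an illustration under assumptions: the example data, the column name click_count and the value N = 4 are made up, not taken from the question.

```r
set.seed(42)  # only to make the illustration reproducible

# Hypothetical example data with the four columns described in the question
df <- data.frame(
  ID          = 1:100,
  price       = runif(100, 5, 50),
  click_count = rpois(100, 20),
  rating      = sample(1:5, 100, replace = TRUE)
)
N <- 4  # assumed number of groups

# Repeat the labels 1..N until there is one per row, so every label occurs
# equally often, then shuffle so rows are assigned to groups at random.
labels <- sample(rep_len(seq_len(N), nrow(df)))
groups <- split(df, labels)

# Each group now has 25 rows
group_sizes <- sapply(groups, nrow)

# Rough check that the attributes look similar across groups
group_means <- sapply(groups, function(g) {
  colMeans(g[c("price", "click_count", "rating")])
})
```

Comparing group_means (or fuller summaries such as quantiles) across groups gives a quick sense of how close the distributions are. Random assignment only balances them on average, so for small data sets a stratified split may be worth considering.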