dataframe - How to collapse a data frame in R while getting min and max of some columns
I have the following data frame in R. It represents a protein structure, simplified for ease of explanation.

Uniprots Chain ResSeq Serial
P68871   D     23     3446
P68871   D     24     3453
P68871   D     25     3457
P68871   D     26     3461
P68871   D     27     3470
P69011   A     38     3561
P69011   A     39     3568
P69011   A     40     3577
P69011   A     41     3588
P69011   A     42     3599
P69011   A     43     3610
P69011   A     44     3625
P69011   A     46     3636
P0116    B     2      4239
P0116    B     4      4242
P0116    B     5      4268
P0116    B     6      4279
P0116    B     7      4285
P0116    B     8      4299
P0116    B     9      5015
P0116    C     15     5055
P0116    C     30     5199
P0116    C     42     5239

What I need to do is collapse it down so that it looks like this:

Uniprots Chain ResSeq_start ResSeq_end Serial_start Serial_end
P68871   D     23           27         3446         3470
P69011   A     38           46         3561         3636
P0116    B     2            9          4239         5015
P0116    C     15           42         5055         5239

Basically, I want to collapse on columns 1, 2 and 3 (grouping by the first two and taking the range of the third). I can use the fourth column as a check that it worked. I thought I could do this with aggregate, but that does not work. I can of course do it with some hacky for loops (walking over a vector until I reach a new uniprot/chain), but that is ugly. The notable thing is that uniprot/chain combinations are not always unique; in particular, one uniprot can have multiple chains (as in my example).

Thank you for your help!
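For anyone who wants to reproduce this without copying from the clipboard, the table above can be rebuilt roughly as follows. This is only a sketch: the object name dat is an assumption, and the Serial value for the row "P0116 B 9" is taken from the expected output (5015).

```r
# Rebuild the question's data frame from plain text.
# Assumption: the object is called `dat`; Serial for "P0116 B 9" is 5015,
# as implied by the expected output.
dat <- read.table(text = "
Uniprots Chain ResSeq Serial
P68871 D 23 3446
P68871 D 24 3453
P68871 D 25 3457
P68871 D 26 3461
P68871 D 27 3470
P69011 A 38 3561
P69011 A 39 3568
P69011 A 40 3577
P69011 A 41 3588
P69011 A 42 3599
P69011 A 43 3610
P69011 A 44 3625
P69011 A 46 3636
P0116 B 2 4239
P0116 B 4 4242
P0116 B 5 4268
P0116 B 6 4279
P0116 B 7 4285
P0116 B 8 4299
P0116 B 9 5015
P0116 C 15 5055
P0116 C 30 5199
P0116 C 42 5239
", header = TRUE, stringsAsFactors = FALSE)

str(dat)  # four columns: two character, two integer
```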
aggregate: a base solution (which I like) provided by @user20650. The do.call() wrapper is important because aggregate() on its own will return a data frame, but with matrix elements.
# dat is the data frame from the question
do.call(data.frame,
        aggregate(cbind(ResSeq, Serial) ~ Uniprots + Chain, data = dat,
                  function(x) c(start = min(x), end = max(x))))
#   Uniprots Chain ResSeq.start ResSeq.end Serial.start Serial.end
# 1   P69011     A           38         46         3561       3636
# 2    P0116     B            2          9         4239       5015
# 3    P0116     C           15         42         5055       5239
# 4   P68871     D           23         27         3446       3470
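To see why the do.call() step matters, here is a minimal sketch on a four-row subset of the question's data. The names agg and flat are just for illustration.

```r
# Four rows taken from the question's data (chains B and C only)
dat <- data.frame(
  Uniprots = c("P0116", "P0116", "P0116", "P0116"),
  Chain    = c("B", "B", "C", "C"),
  ResSeq   = c(2, 9, 15, 42),
  Serial   = c(4239, 5015, 5055, 5239)
)

# Without do.call(): FUN returns two values per group, so each summarised
# column comes back as a two-column matrix inside the data frame
agg <- aggregate(cbind(ResSeq, Serial) ~ Uniprots + Chain, data = dat,
                 FUN = function(x) c(start = min(x), end = max(x)))
ncol(agg)              # 4 columns, not 6
is.matrix(agg$ResSeq)  # TRUE

# do.call(data.frame, ...) spreads the matrix columns into ordinary columns
flat <- do.call(data.frame, agg)
names(flat)
```

After flattening, the columns are named ResSeq.start, ResSeq.end, Serial.start and Serial.end, which is the shape the question asks for (with dots instead of underscores).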
plyr
dat <- psych::read.clipboard()  # reads the table copied from the question
library(plyr)
ddply(dat, .(Uniprots, Chain), summarise,
      ResSeq_start = min(ResSeq),
      ResSeq_end   = max(ResSeq),
      Serial_start = Serial[which.min(ResSeq)],
      Serial_end   = Serial[which.max(ResSeq)])
#   Uniprots Chain ResSeq_start ResSeq_end Serial_start Serial_end
# 1    P0116     B            2          9         4239       5015
# 2    P0116     C           15         42         5055       5239
# 3   P68871     D           23         27         3446       3470
# 4   P69011     A           38         46         3561       3636
(The which.min / which.max lookups are probably not needed.)
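A quick way to check that remark, sketched in base R on a subset of the question's data: when Serial rises together with ResSeq inside each chain, looking up Serial at the largest ResSeq gives the same value as simply taking max(Serial). The names by_lookup and by_max are hypothetical.

```r
# Subset of the question's data (chains B and C)
dat <- data.frame(
  Chain  = c("B", "B", "B", "C", "C", "C"),
  ResSeq = c(2, 4, 9, 15, 30, 42),
  Serial = c(4239, 4242, 5015, 5055, 5199, 5239)
)

groups <- split(dat, dat$Chain)

# Serial at the row with the largest ResSeq, per chain
by_lookup <- sapply(groups, function(g) g$Serial[which.max(g$ResSeq)])

# Plain maximum of Serial, per chain
by_max <- sapply(groups, function(g) max(g$Serial))

identical(by_lookup, by_max)
```

If Serial were not ordered the same way as ResSeq, the two would differ, and the lookup version would be the one that matches "the Serial belonging to the last residue".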