I have binned data in intervals of 100000 using two different frames: one starting at 0 (0 to 100000, and onwards) and one starting at 50000 (50000 to 150000, and onwards). I then joined both data frames, using one column as the identifier of the frames (represented in the column "x100kb").
For my purpose, if two rows (edit: they don't need to be adjacent to each other, since the data is not ordered by "chr" and "x100kb" right now) differ in "x100kb" by 0.5 (preferably comparing a whole number with that number +0.5, e.g. 60 and 60.5, 65 and 65.5, etc.) and have the same values in "chr", "occurrences_norm" and "occurrences_tum", I consider them equal and want to remove one of them. The only thing coming to mind is loops, which is obviously not productive...
Data example:

          chr x100kb occurrences_norm occurrences_tum     fold
    19064 chr17   61.5               17               0 14.05333
    38799  chr5  526.0               16               0 13.96587
    38800  chr5  526.5               16               0 13.96587
    39946  chr5 1113.5               16               0 13.96587
    2377   chr1 1426.0               15               0 13.87277
    21859 chr18  733.5               15               0 13.87277
    20538 chr18   24.0               14               0 13.77324
    21863 chr18  735.5               14               0 13.77324
    37699  chr4 1835.5               14               0 13.77324
    39924  chr5 1102.5               14               0 13.77324
    21506 chr18  550.5               13               0 13.66633
    21862 chr18  735.0               13               0 13.66633
    22258 chr19  151.5               13               0 13.66633
    38972  chr5  613.0               13               0 13.66633
    41707  chr6  194.5               13               0 13.66633
    2380   chr1 1427.5               12               0 13.55087
    20541 chr18   25.5               12               0 13.55087
    21252 chr18  421.0               12               0 13.55087
    27384  chr2 2243.0               12               0 13.55087
    39990  chr5 1135.5               12               0 13.55087
In this example, the 3rd row (chr5, 526.5) would be removed.
I may have read your question in a different way. I thought you need to compare two adjacent rows: for example, you check rows 1 & 2, rows 2 & 3, and so on. I also thought your condition is that the difference in x100kb is exactly 0.5, not up to 0.5. Under that reading, running four logical checks using shift() is one way to achieve your goal.
    library(data.table)

    # Drop a row when it differs from the previous row by exactly 0.5 in x100kb
    # and matches it in chr, occurrences_norm and occurrences_tum.
    # fill = -Inf makes the first row's difference infinite, so it is never flagged.
    setDT(df1)[!((abs(x100kb - shift(x100kb, type = "lag", fill = -Inf)) == 0.5) &
                 (chr == shift(chr, type = "lag")) &
                 (occurrences_norm == shift(occurrences_norm, type = "lag")) &
                 (occurrences_tum == shift(occurrences_tum, type = "lag")))]

    #       chr x100kb occurrences_norm occurrences_tum     fold
    #  1: chr17   61.5               17               0 14.05333
    #  2:  chr5  526.0               16               0 13.96587
    #  3:  chr5 1113.5               16               0 13.96587
    #  4:  chr1 1426.0               15               0 13.87277
    #  5: chr18  733.5               15               0 13.87277
    #  6: chr18   24.0               14               0 13.77324
    #  7: chr18  735.5               14               0 13.77324
    #  8:  chr4 1835.5               14               0 13.77324
    #  9:  chr5 1102.5               14               0 13.77324
    # 10: chr18  550.5               13               0 13.66633
    # 11: chr18  735.0               13               0 13.66633
    # 12: chr19  151.5               13               0 13.66633
    # 13:  chr5  613.0               13               0 13.66633
    # 14:  chr6  194.5               13               0 13.66633
    # 15:  chr1 1427.5               12               0 13.55087
    # 16: chr18   25.5               12               0 13.55087
    # 17: chr18  421.0               12               0 13.55087
    # 18:  chr2 2243.0               12               0 13.55087
    # 19:  chr5 1135.5               12               0 13.55087
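Since the edit in the question says the matching rows need not be adjacent, a variation that does not depend on row order might look like the sketch below. It is only one possible reading of the requirement: it assumes the half-step bin (x.5) is the one to drop whenever the whole-number bin (x.0) with identical "chr", "occurrences_norm" and "occurrences_tum" exists anywhere in the table. The toy data and the names `dt`, `key_cols`, `is_dup` and `deduped` are made up for illustration.

```r
library(data.table)

# Toy data (illustrative only): the matching pair 526.0 / 526.5 is not adjacent
dt <- data.table(
  chr              = c("chr5", "chr18", "chr5", "chr18", "chr5"),
  x100kb           = c(526.5, 735.5, 1113.5, 735.0, 526.0),
  occurrences_norm = c(16, 14, 16, 13, 16),
  occurrences_tum  = c(0, 0, 0, 0, 0)
)

# Columns that must be identical for two rows to count as duplicates
key_cols <- c("chr", "occurrences_norm", "occurrences_tum")

# Within each group of identical key values, flag a half-step bin (x.5)
# whose whole-number neighbour (x.5 - 0.5) is present in the same group.
# Multiples of 0.5 are exactly representable, so == and %in% are safe here.
dt[, is_dup := (x100kb %% 1 == 0.5) & ((x100kb - 0.5) %in% x100kb),
   by = key_cols]

# Keep the unflagged rows (original row order is preserved) and drop the helper
deduped <- dt[is_dup == FALSE]
deduped[, is_dup := NULL]
```

In this toy data only the chr5 row at 526.5 is dropped, because a chr5 row at 526.0 with the same counts exists; the chr18 rows at 735.0 and 735.5 are both kept since their "occurrences_norm" values differ.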