bash - Check if two lines start with the same character; if so, output the average; if not, print the actual value


I'd like to check whether two rows start with the same number in the first column; if they do, the average of the second column should be displayed. Example file:

    01  21    6    10%       93.3333%
    01  22    50   83.3333%  93.3333%
    02  20.5  23   18.1102%  96.8504%
    02  21.5  100  78.7402%  96.8504%
    03  22.2  0    0%        100%
    03  21.2  29   100%      100%
    04  22.5  1    5.55556%  100%
    04  23.5  17   94.4444%  100%
    05  22.7  9    7.82609%  100%
    05  21.7  106  92.1739%  100%
    06  23    11   17.4603%  96.8254%
    06  22    50   79.3651%  96.8254%
    07  20.5  14   18.6667%  96%
    07  21.5  58   77.3333%  96%
    08  21.8  4    100%      100%
    09  22.6  0    0%        100%
    09  21.6  22   100%      100%

For instance, the first two lines start with 01, while only one line starts with 08 (the 15th line). Therefore, the output for these two cases should be:

    01 21.5 ... ... ...
    08 21.8 ... ... ...

I ended up with the following awk line, which works great when every key in the file has exactly two lines, but fails on the file shown above (because of the 15th line):

    awk '{sum+=$2} (NR%2)==0{print sum/2; sum=0;}'

Any hint is welcome.

This awk should work:

    awk 'function dump(){if (n>0) printf "%s%s%.1f\n", p, OFS, sum/n}
         NR>1 && $1 != p{dump(); sum=n=0} {p=$1; sum+=$2; n++} END{dump()}' file
    01 21.5
    02 21.0
    03 21.7
    04 23.0
    05 22.2
    06 22.5
    07 21.0
    08 21.8
    09 22.1

Explanation: it uses three variables:

p   -> holds the previous row's $1 value
n   -> counts the rows that share the same $1 value
sum -> sums the $2 values of the rows that share the same $1

How it works:

    NR>1 && $1 != p     # true when the row number is > 1 and the previous $1 differs from the current $1
    dump()              # function that prints the formatted value of $1 and the average
    p=$1; sum+=$2; n++  # sets p to $1, adds the current $2 to sum, and increments n
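The steps above can also be laid out as a commented, multi-line sketch. The wrapper name avg_by_key is hypothetical and only there for illustration. The sketch assumes that rows with the same key are adjacent, as in the sample file, and that one decimal place (%.1f) is the wanted precision. The three input rows are taken from the question's file.

```shell
#!/bin/sh
# avg_by_key is a hypothetical wrapper name, not part of the original answer.
# Assumption: rows sharing the same first column are adjacent in the input.
avg_by_key() {
    awk '
    # Print the key and the average of the group collected so far.
    function dump() { if (n > 0) printf "%s%s%.1f\n", p, OFS, sum / n }

    # A new key starts: flush the previous group and reset the counters.
    NR > 1 && $1 != p { dump(); sum = n = 0 }

    # Every row: remember the key, accumulate column 2, count the row.
    { p = $1; sum += $2; n++ }

    # Flush the last group once the input is exhausted.
    END { dump() }
    ' "$@"
}

# Two rows for key 01 and a single row for key 08, copied from the question.
printf '%s\n' \
    '01  21    6    10%       93.3333%' \
    '01  22    50   83.3333%  93.3333%' \
    '08  21.8  4    100%      100%' | avg_by_key
```

Because the average is divided by n rather than by a fixed 2, a key that appears only once (such as 08) is printed with its own value, which is the case the NR%2 approach cannot handle.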
